Entries tagged with “xml” from Tools of Change for Publishing

Some Tasty Bits from the StartWithXML UK Survey

We've got some raw results from the StartWithXML survey in the UK, and they are very different in some respects from the US survey we did. Some salient points:

  • 48.7% of the respondents were in the STM market, followed by trade (24.4%) and college (16%).
  • The bulk of respondents were from large houses - 50.4% - and the rest were evenly divided between midsized and small presses.
  • Nearly 55% of the respondents considered themselves "tech-proficient." As most of them were from production or management, this was not surprising. We did have a significant number of editorial respondents, however - 19.3%.
  • To 40.6% of our respondents, digital publishing is "very important - it informs all we do." Meanwhile, 59.4% of respondents are grappling with its impact in their companies. Only 17.8% of respondents say that they do not focus on the downstream uses of their book content, but on the print volume alone.
  • As far as expanded editions are concerned, 53.5% of publishers say they don't offer these. And 69.3% do not offer more than the basic ONIX marketing content (cover image, description, first chapter, table of contents) in their digital marketing efforts.
  • Over 73% of publishers do not have a formalized (formalised, if you're in the UK) DAM system.
  • And over 50% do not maintain files in an XML format.
  • Nearly 69% of respondents have problems retrieving files from storage, and have to institute workarounds. But over 56% look at XML as a way of complementing CMS and DAM tools they have already invested in.


CSS in an XML Workflow

At the StartWithXML Forum in New York in January, Rebecca Goldthwaite of Cengage gave a great demonstration of how Cengage uses CSS in their XML workflow. Many publishers regard style sheets as an invitation to create cookie-cutter book production, with the fear that all their books will look the same. This is emphatically a myth. Have a look at her seventh slide for examples of how one stylesheet can actually create many different looks.

CSS Zen Garden has been up for a while (Liza Daly used this model to create the EPUB Zen Garden a few months ago). It's a sort of CSS sandbox where graphic designers can play with style sheets and render the same content in very different forms. Clicking on the four links below will demonstrate what CSS can do:

It's well worth checking out and maybe having some graphic designers play around with it.

StartWithXML is Going to London

StartWithXML will be continuing in London! On September 2nd, at the British Library, we'll be conducting a one-day forum similar to the one we held in New York last January, but with a British publishing focus. Our sponsors for this event include Klopotek, MarkLogic, PLS, BIC, Publishers' Association, and of course O'Reilly.

We're still in the process of firming up our speakers, but we do have information posted here. Additionally, if you are a British publisher or service provider, there's a survey for you here.

As we get more news, we'll add it here - meanwhile, we're continuing to research and gather information about where publishers are in the StartWithXML process.

New on O'Reilly Labs: Open Feedback Publishing System

O'Reilly engineer Keith Fahlgren has formally launched our new Open Feedback Publishing System over on O'Reilly Labs:

Over the last few years, traditional publishing has been moving closer to the web and learning a lot of lessons from blogs and wikis, in particular. Today we're happy to announce another small step in that direction: our first manuscript (Programming Scala) is now available for public reading and feedback as part of our Open Feedback Publishing System. The idea is simple: improve in-progress books by engaging the community in a collaborative dialog with the authors out in the open. To do this, we followed the model of the Django Book, Real World Haskell, and Mercurial: The Definitive Guide (among others) and built a system to regularly publish the whole manuscript online as HTML with a comment box under every paragraph, sidebar, figure, and table.

You can see the system in action at the site for our upcoming book Programming Scala.

Open XML API for O'Reilly Metadata

In addition to Bookworm, O'Reilly Labs now includes an RDF-based API into all of O'Reilly's books:

Most publishers are familiar with the ONIX standard for exchanging metadata about books among trading partners. Anyone who's actually spent time working with ONIX knows that its syntax is abstruse at best. While ONIX does use XML, there are more modern, more general, and more immediately comprehensible standards out there, particularly for the basic details like "author," "title," and "edition." One of those standards is RDF, or "Resource Description Framework." This experimental O'Reilly Product Metadata Interface (OPMI) exposes RDF for all of O'Reilly's titles, organized by ISBN.

If anyone onsite (or otherwise) puts anything interesting together with the data, we'll be happy to feature it here on the TOC Blog, just let us know in the comments.
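For readers who have not worked with RDF, here is a minimal sketch of pulling "author" and "title" details out of an RDF/XML record using only the Python standard library. The sample record, the placeholder ISBN, and the choice of Dublin Core elements (`dc:title`, `dc:creator`) are assumptions made for illustration; they are not copied from OPMI output, and the real service may use a different vocabulary.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC_NS = "http://purl.org/dc/elements/1.1/"

# Hypothetical record for illustration only; not real OPMI output.
SAMPLE = f"""<rdf:RDF xmlns:rdf="{RDF_NS}" xmlns:dc="{DC_NS}">
  <rdf:Description rdf:about="urn:isbn:0000000000000">
    <dc:title>Example Title</dc:title>
    <dc:creator>First Author</dc:creator>
    <dc:creator>Second Author</dc:creator>
  </rdf:Description>
</rdf:RDF>"""


def read_records(rdf_xml):
    """Map each described resource to its title and list of creators."""
    records = {}
    root = ET.fromstring(rdf_xml)
    for description in root.findall(f"{{{RDF_NS}}}Description"):
        # rdf:about names the resource (here, an ISBN-based identifier).
        subject = description.get(f"{{{RDF_NS}}}about")
        records[subject] = {
            "title": description.findtext(f"{{{DC_NS}}}title"),
            "creators": [c.text for c in description.findall(f"{{{DC_NS}}}creator")],
        }
    return records


if __name__ == "__main__":
    for isbn, details in read_records(SAMPLE).items():
        print(isbn, details["title"], details["creators"])
```

A dedicated RDF library would handle the full range of RDF syntax; the point of the sketch is only that the basic fields are readable with very little code.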

At TOC: Bookworm Online EPUB Reader Now Part of O'Reilly Labs

Update: There are now 400+ shiny DRM-free EPUB books from O'Reilly if you want to give Bookworm a test drive. Much of what's on our complete list with a green "E" next to it is available in EPUB and is Bookworm-friendly (the rest is just PDF for now, but you'll get the EPUB as a free update when it's available). (And get an extra 20% off through Feb. 20 with code EBKDSC, which is 40% off the print price.) More about our ebook bundles (free lifetime updates! No DRM! Kindle-compatible!) over here.


Regular readers know we're big fans of the Bookworm online EPUB reader. With Bookworm, you upload and organize your ebooks, and can read them online as well as on a variety of mobile devices (iPhone shown below). It's open source, and built on top of well-documented and supported frameworks and standards:






You can even pick up where you left off reading as you move across devices.


As more content becomes available in EPUB format, tools like Bookworm encourage standards compliance (by rejecting invalid EPUB) and offer an alternative to proprietary ebook reading/management systems like Digital Editions or Sony's eBook Library Software. (There's also Calibre, an open-source desktop ebook management system, which like Bookworm is built with Python.)
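To make "rejecting invalid EPUB" concrete, here is a rough sketch of two structural checks from the EPUB container format: the ZIP archive's first entry must be a `mimetype` file containing `application/epub+zip`, and a `META-INF/container.xml` file must be present. This is not Bookworm's actual validation code, and a real validator checks far more; the function and helper names are hypothetical.

```python
import io
import zipfile
from typing import List


def basic_epub_problems(data: bytes) -> List[str]:
    """Return structural problems found in an EPUB container (empty if none)."""
    problems = []
    try:
        archive = zipfile.ZipFile(io.BytesIO(data))
    except zipfile.BadZipFile:
        return ["not a ZIP archive"]
    with archive:
        names = archive.namelist()
        # The container format requires 'mimetype' as the first entry.
        if not names or names[0] != "mimetype":
            problems.append("'mimetype' is not the first entry")
        elif archive.read("mimetype") != b"application/epub+zip":
            problems.append("unexpected mimetype content")
        # container.xml points the reading system at the package file.
        if "META-INF/container.xml" not in names:
            problems.append("missing META-INF/container.xml")
    return problems


def make_sample(include_container: bool) -> bytes:
    """Build a tiny in-memory archive for trying out the checks."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("mimetype", "application/epub+zip")
        if include_container:
            archive.writestr("META-INF/container.xml", "<container/>")
    return buffer.getvalue()


if __name__ == "__main__":
    print(basic_epub_problems(make_sample(include_container=False)))
```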


We liked Bookworm so much that we invited principal developer (and TOC speaker) Liza Daly to bring it into O'Reilly Labs, the R&D space that we're re-launching at this year's TOC Conference. From her post on the Labs blog:



From the beginning, O'Reilly has been an enthusiastic supporter of the project. Uniting the two under the Labs banner is a natural fit.



What does this mean for Bookworm's future?



Most importantly, core Bookworm code will remain open-source. If you would like to use Bookworm code, even commercially, you're encouraged to do so.



As part of the Labs project, we may add some features that won't be part of the core open-source package. Most other changes will be free and BSD-licensed. We're just beginning to think about where we can take this project.



I'll remain as the primary developer of Bookworm, but I hope that the added exposure O'Reilly brings to the project will encourage wider participation, not just of code but of ideas. I'm looking forward to taking ebook innovation to new places in 2009.



In addition to Bookworm, we've also opened up an RDF-based view of the public metadata for our books. Nearly all of this data was already available in a scattershot way from our catalog pages, the book's copyright page, Safari Books Online, and other sources -- our new "O'Reilly Product Metadata Interface" brings it all together in a standard, computer-friendly format.


This is just the beginning of a variety of experiments and pilot projects we have planned for the months ahead.

StartWithXML Research Report Now Available for Sale

If you weren't able to attend the StartWithXML Forum last month in New York, the accompanying research report is available for sale. The report covers topics like:

  1. Where am I and where do I want to end up?
  2. How much benefit do I want to obtain from content reuse and repurposing?
  3. How much work do I want to do myself?
  4. How much time and money will this take?

StartWithXML: Making the Case for Applying XML to a Publishing Workflow

When you purchase the report, you get it as our full eBook Bundle, including PDF, EPUB, and Kindle-compatible Mobipocket formats.

If you're ready for a deeper dive into XML, there are two very complementary tutorials lined up during next week's TOC Conference:

And if that's still not enough angle brackets for you, check out the Introduction to XML course from the O'Reilly School of Technology, which earns you four CEUs (Continuing Education Units) and a CEU letter from the University of Illinois Office of Continuing Education. Save $50 with discount code SWXML09.

Webcast Video: Essential Tools of an XML Workflow

Below you'll find the full recording from the TOC webcast, "Essential Tools of an XML Workflow," with Laura Dawson.


New York Times Opens "Best Sellers API"

The New York Times on Tuesday opened up its "Best Sellers API," offering programmatic access to best-seller data from the Times (with lists going back to 1930 promised in the coming months):

The Times Best Sellers API gives you quick access to current and past best-seller lists in 11 different categories, such as Hardcover Nonfiction and Paperback Mass-Market Fiction. The initial launch offers every weekly list since June 2008, and in the coming months, we plan to add data going back to 1930 (thanks to the hard work of our Books staff). The API also offers details about specific best sellers, including historical rank information and links to New York Times reviews and excerpts. And these aren't just canned responses; they're searchable and sortable, with even more robust options coming in the next release.

I'm a huge fan of what the Times has done to embrace open architecture and data formats (and Nick Bilton, from the Times' R&D Lab, will be a keynote speaker at next month's TOC Conference), and this is a great example of what content creators and curators (i.e., publishers) can do to give customers the opportunity to create new value on top of that content. We've offered an API for our Safari Books Online product for several years now, and have some very interesting internal projects percolating to take things a step further.

Presentations from the StartWithXML Forum

The following slides accompanied many of the presentations during the StartWithXML forum, held Jan. 13, 2009 in New York City.



XML--Why Bother?

David Young, Hachette Book Group USA

As Chairman and CEO of one of America's leading trade publishers, David Young presents the executive perspective on the role of XML technologies in the increasingly complex business of creating and selling books.




An Introduction to StartWithXML

Michael Healy, Book Industry Study Group

Introduction to some of the key terms and concepts needed to understand the day's program.



ROI Drivers for a StartWithXML Production Process

Brian O'Leary, Magellan Media Consulting Partners

Overview of the key components that provide the return on investment in an XML workflow.



Saving Money by Adopting an XML-Based Meta Data Workflow

Werner Fischer, Klopotek North America

Presented as part of the "StartWithXML ROI: Savings" panel.



Starting with XML: The Benefits of Automating Composition with Standard Stylesheets

Rebecca Goldthwaite, Cengage Learning

Presented as part of the "StartWithXML ROI: Savings" panel.



Leveraging XML for IP Rights

Steve Kotrch, Simon & Schuster

Presented as part of the "StartWithXML ROI: Savings" panel.



Marketing Books In A World Of Discoverability

Evan Schnittman, Oxford University Press

Presented as part of the "StartWithXML ROI: Revenues" panel.



Supporting Multi-Format Publishing

Leslie Hulse, HarperCollins Publishers

Presented as part of the "StartWithXML ROI: Revenues" panel.



Online Licensing Strategies: The Path to Digital Revenue

Bill O'Brien, Copyright Clearance Center

Presented as part of the "StartWithXML ROI: Revenues" panel.



Digital Book Printing: The New Economics Of Print-On-Demand

David Taylor, Lightning Source Inc.

Presented as part of the "StartWithXML ROI: Revenues" panel.



The View from the Front Lines

Ken Brooks, Cengage Learning

As a publishing technology pioneer and SVP, Global Production and Manufacturing at one of America's largest educational publishers, Ken Brooks presents lessons for the publishing industry at large based on his experiences implementing successful, large-scale XML production processes.



StartWithXML Solutions Overview

Brian O'Leary, Magellan Media Consulting Partners

Overview of the many publishing technology solutions providers and how their offerings support an XML workflow.



XML Workflow Foundations: Efficient Title Management Practices

Doug Lessing, Firebrand Technologies

Presented as part of the "StartWithXML Solutions: Tools" panel.



Building an XML workflow: Tools and Key Considerations

Steve Waldron, Klopotek North America

Presented as part of the "StartWithXML Solutions: Tools" panel.



DAM for Production vs DAM for Distribution

Scott Cook, codeMantra

Presented as part of the "StartWithXML Solutions: Tools" panel.



O'Reilly XML Toolchain

Andrew Savikas, O'Reilly Media

Presented as part of the "StartWithXML Solutions: Tools" panel.




StartWithXML Readiness Checklist

Brian O'Leary, Magellan Media Consulting Partners

Checklist of the key issues publishers should consider before implementing an XML production process.




Tagging and Chunking Best Practices

Laura Dawson, LJNDawson

Presented as part of the "StartWithXML Solutions: Methods" panel.



The Evolving Role of Authors and Editors

Phil Madans, Hachette Book Group

Presented as part of the "StartWithXML Solutions: Methods" panel.



How Wiley Uses Word to Invite Authors, Engage Editors, Improve Production, and Put XML at the Source of Its Content

Frank Grazioli, John Wiley & Sons

Presented as part of the "StartWithXML Solutions: Methods" panel.

Coverage of StartWithXML

Turns out I was not the only one on Twitter for the StartWithXML Forum on January 13th. Joe Bachana was tweeting as well. Kind of interesting to see the posts side-by-side. David Rothman of Teleread also has some great things to say, as does Richard Curtis over at e-reads.

We also got nice coverage from PW, as well as Publishers Lunch.

Slides will be up soon!

BeyondPrint Offers Helpful Review of StartWithXML

George Alexander, who attended the StartWithXML forum in New York on Tuesday and made quick work of reading the research paper (thank you!), offers a helpful review of both.

In his review, George also offers a view he shared with the StartWithXML team the day after the forum: the current tools are not yet ready for widespread use, and the forum and the research paper were largely silent on his concerns.

I think that George makes an important point about the tools for authoring and editing. I responded yesterday to say that what may have felt like a "middling" position at the forum reflects a range of opinion within the project team.

At the forum, O'Reilly's Andrew Savikas, for example, advocated use of XML authoring tools in his afternoon remarks, showing some examples of what worked. In contrast, Laura Dawson, who co-wrote the research paper, is more critical of the tools, something she made clear in her comments. I'm somewhat in the middle, feeling that the tools are not necessarily ready for widespread deployment, but that balanced changes in processes, technology/tools and organizational structures can provide a path to moving the tagging work upstream.

One thing less evident at the forum or in the paper is the healthy discussion that took place within the team about this issue. At one point in the e-mail exchanges, I wrote (paraphrasing) that waiting until the tools are "ready" isn't the right answer; people developing the tools will improve them when publishers in adequate numbers use the tools and advocate for better and more features.

When I presented the "solutions" grid in the afternoon, I pointed out that the bulk of the most developed software and systems are in the production editorial and operational areas, but that upstream options were becoming more available. I stopped short of saying "not ready," in part because I don't want publishers to hear me and walk out saying "we'll wait until the tools come on line" and let production worry about tagging until then. Changing workflows is painful, and people are prone to avoiding pain. That's smart in the short term and potentially disastrous in the mid-term, so I stuck with the recommendation to push upstream as much and as fast as you can.

We view the research paper as a living document, and we expect to revise it based on feedback from the forum as well as an evolving understanding of the number of case studies that the paper and forum started to capture. Look for a subsequent draft to articulate a position on XML tools that may not match what George sees but more clearly captures the project team's thinking.

Slides from "Essential Tools of an XML Workflow" Webcast

Laura Dawson has made her slides available from the recent TOC Webcast, "Essential Tools of an XML Workflow." A complete recording of the event will be posted here soon.



[TOC Webcast] Essential Tools of an XML Workflow

Tools of Change for Publishing, in conjunction with StartWithXML, will host "Essential Tools of an XML Workflow," a free webcast with presenter Laura Dawson, on Thursday, Dec. 11 at 1 p.m. eastern (10 a.m. pacific).

Webcast Overview

This webcast is for those publishers who have made the decision to pursue digital channels for their content. What tools are out there? What do all those acronyms mean? How can publishers implement new strategies without disrupting current workflows? Here we'll explore the alphabet soup of digital publishing, sort out the tools that are most useful, and help publishers find some solid ground.

Register for free.

Webcast Video: What Publishers Need to Know about Digitization

Below you'll find the full recording from the recent TOC Webcast, "What Publishers Need to Know about Digitization," with Liza Daly.


A Correction!

Frank Grazioli, of Wiley, writes in to correct my last post about taxonomies:

Wiley has been exploring taxonomies for its travel content business; the cooking/psych/accounting spaces might be our next logical opportunities because the disciplines are well developed, specific, etc., that content is authored or edited in fairly controlled templates that map to our own XML content models and our belief in content models and XML has evolved that "lighter" and "more agile" are better than taggy and dense. As you so aptly point to the contextuality and "rigor" of taxonomies, these tools would allow our XML to "slip on the right jacket" for the occasion. I apologize if we led you to believe that we already have firm taxonomies in place for the three areas you specify--I wouldn't want readers/event guests to get that impression anyway.

Slides from "What Publishers Need to Know about Digitization" Webcast

TOC will be posting a complete recording of the presentation, but in the meantime I've posted the slides from yesterday's webcast, "What Publishers Need to Know about Digitization," on SlideShare.

Thanks to everyone who attended and especially to those who asked so many excellent questions.


[TOC Webcast] Tomorrow: What Publishers Need to Know About Digitization

Tools of Change for Publishing will host a free webcast tomorrow at 1 p.m. eastern (10 a.m. pacific). Digitization expert Liza Daly will discuss "What Publishers Need to Know About Digitization."

No prior experience is assumed in this overview of the conversion process. Topics will include:

  • What's XML and do you need it?
  • What's the cost-benefit analysis versus PDF or other formats?
  • What should you consider when selecting a vendor?
  • Should you use a centralized platform or go on your own?
  • How can you monetize your digital offerings?

Slots are limited, so register for free today.

Beyond the Tag Cloud

This is an excerpt from our research paper, which will publish in concert with the StartWithXML Forum on January 13th at the McGraw-Hill Auditorium in New York. Early bird discounting for BISG members is ending soon!

A good taxonomy is the backbone of your business -- it's how you sort your content. It allows for effective merchandising, effective marketing -- you can aim your content with the precision of a pool cue. It allows for inventorying your content -- so you know what you have ... and what you need. With your content tagged and organized, you know where everything is and how to deploy it.

Taxonomies are contextually sensitive and rigorous -- and in establishing your own, it helps to look at what other industries are doing. Wiley has adopted accounting and cooking and psychology taxonomies from those industries to organize information in its professional development titles. Educational publishers are increasingly arranging their textbooks around "learning objects" -- taxonomized pedagogical goals developed by educators themselves. Even the BISAC codes -- which are part of the ONIX system of organizing book information and therefore an XML-based taxonomy -- are developed very carefully and consensually among book industry professionals in monthly meetings.

An important aspect of taxonomy development is scope notes. Terms need definition and clarity around how they're going to be used. Documenting your taxonomy -- what you mean when you say "porcelain" (collectible china, dental work, household fixtures?), parent-child relationships between categories, and why you choose certain terms over others -- is important for the long term. Future editors and authors will need to know why your taxonomy has developed as it has.

Consistency in application is also crucial. Drop-down menus (as opposed to free-text fields) enforce structure and ensure that users don't come up with their own terms that pollute your taxonomy with duplicates or irrelevancies (or misspellings).
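As a rough sketch of how scope notes, parent-child relationships, and a closed list of terms might fit together in practice, consider the following. The terms, scope notes, and function names are hypothetical and invented for illustration; a production taxonomy would live in a database or a dedicated tool rather than a dictionary.

```python
# Hypothetical taxonomy: each term carries a scope note and an optional parent.
TAXONOMY = {
    "ceramics": {
        "parent": None,
        "scope": "Fired-clay objects of any kind.",
    },
    "porcelain": {
        "parent": "ceramics",
        "scope": "Collectible china only; not dental work or household fixtures.",
    },
}


def validate_tags(tags):
    """Split tags into accepted terms and rejected free-text entries."""
    accepted = [tag for tag in tags if tag in TAXONOMY]
    rejected = [tag for tag in tags if tag not in TAXONOMY]
    return accepted, rejected


def ancestors(term):
    """Walk parent links from a term up to the top of the taxonomy."""
    chain = []
    parent = TAXONOMY[term]["parent"]
    while parent is not None:
        chain.append(parent)
        parent = TAXONOMY[parent]["parent"]
    return chain


if __name__ == "__main__":
    accepted, rejected = validate_tags(["porcelain", "porcelin", "Porcelain"])
    print("accepted:", accepted)
    print("rejected:", rejected)
    print("porcelain sits under:", ancestors("porcelain"))
```

The exact-match check plays the role of the drop-down menu: a misspelling or an alternate capitalization is rejected instead of quietly becoming a new term.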

An advantage to using XML is that you don't have to accomplish everything at once, perfectly, from the outset. You will not be able to tag your documents thoroughly right off the bat -- who can know everything in advance? The act of tagging is recursive, and depends on market and company needs. XML allows for this flexibility. Depending on how you envision chunking and re-use, you'll tag your documents differently with each iteration. Unlike the "fire and forget" model, iterative tagging means that your books are living documents.

Another Position: XML Alone is Not Enough

George Lossius, the CEO of Publishing Technology PLC, wrote a very thoughtful post about our StartWithXML project for the new UK blog, BookBrunch. He comments after a report on the presentation I did at Frankfurt about our project.

George's point is that XML "is not enough." Books will live in a larger world that also uses XML, and highly internal standards and procedures for XML use, whether internal to a company or internal to the book business, do not necessarily equip a publisher to live in the larger world of the semantic web.

We don't disagree with George's premise that XML can be used to position publishers better for the semantic web. The question for all publishers will be how much they can take on how fast, particularly in pursuit of models and opportunities that haven't really emerged yet. But the most forward-thinking always lead the target a bit, and George's post enumerates one aspect of that.

We urge our readers to check out George's post. And we encourage George to put his XML commentary right here on this blog; we're delighted to receive it.
