Entries tagged with “tagging” from Tools of Change for Publishing
Taxonomies and Starting With XML
This is an excerpt from a blog post I wrote last week on taxonomies and chunking.
Last October, the StartWithXML team wrote a post called "To Chunk or Not To Chunk," where we discussed tagging and infrastructure issues, and a discussion ensued about what happens when you don't know what you'll be using chunks for. How do you tag those?
Later, in our StartWithXML One-Day Forum, we included a presentation on tagging and chunking best practices, where it was pointed out that no taxonomy for chunk-level content currently exists.
We have taxonomies for book-level content. These include formalized code sets such as the Library of Congress subject codes, the BISAC codes, and the Dewey Decimal System, among others. There are also informal code sets, like the tag sets on Shelfari or LibraryThing. There are proprietary taxonomies at Amazon and B&N.com that enable effective browsing.
But nothing like this exists for sub-book-level content. It's never been traded before. We've never really needed a taxonomy for it before.
Other industries that traditionally distribute "chunks" have their own taxonomies that might prove useful in building a book-chunk schema. These include the IPTC news codes, which identify the content of a particular news story -- that's the closest analogy I can find for small gobbets of content that require organization.
Industries have proprietary taxonomies to identify certain concepts -- culinary arts, music, agriculture, engineering, the sciences, literature and criticism, education, and on and on and on. But these do not necessarily identify concepts within a book.
Some might argue that we don't necessarily need taxonomies -- why can't we use natural-language search and the semantic Web to "bubble up" the "right" concepts? I'd argue that words don't always mean what we think they mean. A classic example from my library days is the term "mercury." That could mean the planet, the car or the element. Proponents of semantic search would say that the context in which "mercury" is mentioned should take care of defining that term. I'd say that's true in about 50 percent of all cases, which falls well short of the 75 to 100 percent you would need before relying on it.
My original post gets into more detail about why taxonomies are important search tools, and how the digitization of books requires a good taxonomy ... and who should do it.
A Correction!
Frank Grazioli, of Wiley, writes in to correct my last post about taxonomies:
Wiley has been exploring taxonomies for its travel content business; the cooking/psych/accounting spaces might be our next logical opportunities because the disciplines are well developed, specific, etc., that content is authored or edited in fairly controlled templates that map to our own XML content models and our belief in content models and XML has evolved that "lighter" and "more agile" are better than taggy and dense. As you so aptly point to the contextuality and "rigor" of taxonomies, these tools would allow our XML to "slip on the right jacket" for the occasion. I apologize if we led you to believe that we already have firm taxonomies in place for the three areas you specify--I wouldn't want readers/event guests to get that impression anyway.
Tagging the Real World through Barcode Apps
Earlier this week, Peter Brantley noted an interesting barcode application for Android phones that connects the ISBN data on a physical book with Google Book Search listings. This merging of the physical and digital worlds isn't novel -- other companies offer similar applications -- but the discussion surrounding these apps tends to focus on retail threats and opportunities rather than broader uses.
Speaking as an unabashed content geek, I find the information curation possibilities from this digital-physical merge particularly interesting. The Web has provided an assortment of organization tools -- RSS feeds, readers, tags, categories, etc. -- that help me find and synthesize a vast amount of information. But the same can't be said for the real world. If something pops onto my radar while I'm sitting in front of the TV or shopping at a store, I need to open a browser (assuming I have a computer or phone), punch in the information and save it for later retrieval. This isn't an arduous task, but it lacks the elegance of scanning and tagging Web-based data.
My online efficiency increased exponentially a few years ago when I incorporated RSS feeds and readers into my daily routine. Instead of tediously visiting particular sites or running open-ended search queries, I could now gather useful sources in one application and sort that data into segments geared toward my own needs. Not to get too syrupy here, but it was an eye-opening experience that revealed a new depth to the Web. These barcode apps offer similar possibilities for seamlessly accessing the physical world's stored information. Armed with a cell phone and a data plan, those of us who are curation minded can expand the boundaries of discoverability into an untapped region.
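The apps above are not described in technical detail, but one small piece of the scanning step can be sketched safely: the barcode on a book encodes an ISBN-13, and its final digit is a checksum. The following is a minimal sketch, assuming the scanner has already handed back the digits as a string. The function names are hypothetical, and the lookup against any book listing service is deliberately left out.

```python
def isbn13_check_digit(first12: str) -> int:
    """Compute the ISBN-13 check digit from the first 12 digits.

    Digits are weighted 1, 3, 1, 3, ... from left to right; the check
    digit brings the weighted sum up to a multiple of 10.
    """
    total = sum(int(d) * (1 if i % 2 == 0 else 3) for i, d in enumerate(first12))
    return (10 - total % 10) % 10


def is_valid_isbn13(raw: str) -> bool:
    """Return True if raw (hyphens and spaces allowed) is a valid ISBN-13."""
    digits = raw.replace("-", "").replace(" ", "")
    if len(digits) != 13 or not digits.isdigit():
        return False
    return isbn13_check_digit(digits[:12]) == int(digits[12])
```

A curation tool might run a check like this before saving a scanned code, so that a misread barcode is rejected instead of being tagged and filed.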
Beyond the Tag Cloud
This is an excerpt from our research paper, which will publish in concert with the StartWithXML Forum on January 13th at the McGraw-Hill Auditorium in New York. Early bird discounting for BISG members is ending soon!
A good taxonomy is the backbone of your business -- it's how you sort your content. It allows for effective merchandising, effective marketing -- you can aim your content with the precision of a pool cue. It allows for inventorying your content -- so you know what you have ... and what you need. With your content tagged and organized, you know where everything is and how to deploy it.
Taxonomies are contextually sensitive and rigorous -- and in establishing your own, it helps to look at what other industries are doing. Wiley has adopted accounting and cooking and psychology taxonomies from those industries to organize information in its professional development titles. Educational publishers are increasingly arranging their textbooks around "learning objects" -- taxonomized pedagogical goals developed by educators themselves. Even the BISAC codes -- which are part of the ONIX system of organizing book information and therefore an XML-based taxonomy -- are developed very carefully and consensually among book industry professionals in monthly meetings.
An important aspect of taxonomy development is scope notes. Terms need definition and clarity around how they're going to be used. Documenting your taxonomy -- what you mean when you say "porcelain" (collectible china, dental work, household fixtures?), parent-child relationships between categories, and why you choose certain terms over others -- is important for the long term. Future editors and authors will need to know why your taxonomy has developed as it has.
Consistency in application is also crucial. Drop-down menus (as opposed to free-text fields) enforce structure and ensure that users don't come up with their own terms that pollute your taxonomy with duplicates or irrelevancies (or misspellings).
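The scope notes and controlled entry described above can be sketched in a few lines. This is a minimal illustration, not a real taxonomy: the `Term` class, the sample terms, and the function names are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Term:
    """One taxonomy entry: a label, its scope note, and an optional parent."""
    label: str
    scope_note: str
    parent: Optional[str] = None


# Hypothetical terms, for illustration only.
TAXONOMY: Dict[str, Term] = {
    t.label: t
    for t in [
        Term("ceramics", "Objects made of fired clay."),
        Term("porcelain (collectibles)", "Collectible china.", parent="ceramics"),
        Term("porcelain (dental)", "Crowns, veneers, and other dental work."),
    ]
}


def rejected_tags(tags: List[str]) -> List[str]:
    """Return the tags that are not in the controlled vocabulary."""
    return [t for t in tags if t not in TAXONOMY]


def ancestors(label: str) -> List[str]:
    """Follow parent links upward from a term, nearest parent first."""
    chain: List[str] = []
    current = TAXONOMY[label].parent
    while current is not None:
        chain.append(current)
        current = TAXONOMY[current].parent
    return chain
```

Checking input against a fixed set of labels plays the same role as a drop-down menu: a misspelling such as "porcelan" is rejected rather than quietly becoming a new term.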
An advantage to using XML is that you don't have to accomplish everything at once, perfectly, from the outset. You will not be able to tag your documents thoroughly right off the bat -- who can know everything in advance? The act of tagging is recursive, and depends on market and company needs. XML allows for this flexibility. Depending on how you envision chunking and re-use, you'll tag your documents differently with each iteration. Unlike the "fire and forget" model, iterative tagging means that your books are living documents.
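Iterative tagging might look something like the sketch below, which uses Python's standard `xml.etree.ElementTree` module. The `chunk` element, the `subjects` attribute, and the `add_tags` helper are hypothetical names chosen for the example, not part of any publishing schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical chunk markup; element and attribute names are illustrative.
SOURCE = """<book>
  <chunk id="c1"><title>Roasting a chicken</title></chunk>
  <chunk id="c2"><title>Knife skills</title></chunk>
</book>"""


def add_tags(root, chunk_id, new_tags):
    """Merge new subject tags into a chunk without duplicating old ones."""
    for chunk in root.iter("chunk"):
        if chunk.get("id") == chunk_id:
            merged = chunk.get("subjects", "").split()
            for tag in new_tags:
                if tag not in merged:
                    merged.append(tag)
            chunk.set("subjects", " ".join(merged))
            return chunk
    raise KeyError(chunk_id)


root = ET.fromstring(SOURCE)
add_tags(root, "c1", ["poultry"])              # first tagging pass
add_tags(root, "c1", ["poultry", "roasting"])  # a later pass adds detail
```

Each pass only adds what is newly known, and untouched chunks such as `c2` stay valid as they are, which is the flexibility the paragraph above describes.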
Why You Should Care About XML
Since we began talking about the StartWithXML project, a few offline comments have come in suggesting that imposing XML on authors (and editors for that matter) won't work.
When framed that way, I'm in violent agreement. I would never argue that authors and editors should or will become fluent in XML or be expected to manually mark up their content. I naively tried fighting that battle before and was soundly defeated every time. It is simply too much "extra" work that gets in the way of the writing process.
But there are several reasons why it's really really important for publishers to start paying attention to XML right now, and across their entire workflow:
- XML is here to stay, for the reasonably foreseeable future. While it's always dangerous to attempt to predict expiration dates on technology, I think it's fair to assume XML will have a shelf life at least as long as ASCII, which has been with us for more than 40 years, and isn't going anywhere soon.
- Web publishing and print publishing are converging, and writing and production for print will be much more influenced by the Web than vice-versa. It will only get harder to succeed in publishing without putting the Web on par with (or ahead of) print as the primary target. The longer you wait to get that content into Web-friendly and re-usable XML, the worse.
Many in publishing balk at bringing XML "up the stack" to the production, editing, or even the authoring stage. And with good reason; XML isn't really meant to be created or edited by hand (though a nice feature is that in a pinch it easily can be). There are two places to look for useful clues about how XML will actually fit into a publisher's workflow: Web publishing and the "alpha geeks."
News Roundup: Customizable Magazine Service Launches, French E-Reader Includes Subscriptions, Library Tags Online-Offline Recommendations
Maghound Customizable Magazine Service Launches
Maghound, a customizable magazine service from Time Inc., is now available. From Folio:
The membership pricing is tiered-- three titles for $4.95 a month, five titles for $7.95, seven titles for $9.95, and $1 per title for eight titles or more. Memberships can be entirely managed online, as well as by email and phone, from changing magazine title selections to updating personal information and placing magazine delivery on hold for a temporary period. (Continue reading)
France Telecom E-Reader Includes Subscriptions
France Telecom's Read & Go trial service bundles e-reader hardware with a subscription to mobile content. From BusinessWeek:
The trial of the prototype will wrap up this month, and by 2009, France Telecom aims to start distributing the Read & Go in conjunction with a subscription-based news service of the same name. For a monthly charge similar to a mobile service plan, customers will receive an over-the-air stream of aggregated content from a wide assortment of information sources. Alongside the articles will be ads that help defray the cost of the service. (Continue reading)
Library Uses Tags to Link Online-Offline Recommendations
LibraryTechNZ mentions an interesting engagement of a European library with its community, something that bookstores could also do:
The library at the Hague in the Netherlands has introduced a simple form of tagging in real life. They now have two returns drop-boxes. One is for all items, and the other is for amazing books. Staff take the 'amazing' books and put them in the 'amazing books' display for visitors to browse. But they also tag them 'amazing' in the Library's collection database.
Simplifying Semantic Tagging
Adaptive Blue has released a header/meta tag scheme to simplify semantic tagging of content items. From ReadWriteWeb:
Semantic web company Adaptive Blue has published what it hopes will become a standard for publishers who want to signal in their header tags when a webpage is primarily about a particular book, film, wine or other type of objects ... Called AB Meta, the format was developed in concert with a number of other web companies and is aimed to be part of a larger effort to pick up where existing Semantic Web and microformats markup leaves off. It's simple and extensible.