Entries tagged with “ebook formats” from Tools of Change for Publishing
O'Reilly iPhone App Tips and Tricks
As Andrew has discussed in some detail recently on this blog, O'Reilly has started publishing many books as iPhone/iPod Touch apps. Over the past couple of months, we've received a considerable amount of feedback from customers who have purchased the apps. To address some of the most common questions we get, I recently added a page on oreilly.com. I cover three main topics:
- "Hidden" features -- handy things you can do that aren't always obvious in the UI
- Long code lines -- my attempt to help users deal with the question we get most often on the support queues
- Extracting the EPUB files -- yes, there is an EPUB file in that app, and you can get to it quite easily
Ebook Piracy is Up Because Ebook Demand is Up
My email, Twitter, and "real-world" information stream is abuzz today with references to a New York Times story about the increase in piracy of ebooks:
“It’s exponentially up,” said David Young, chief executive of Hachette Book Group, whose Little, Brown division publishes the “Twilight” series by Stephenie Meyer, a favorite among digital pirates. “Our legal department is spending an ever-increasing time policing sites where copyrighted material is being presented.”
John Wiley & Sons, a textbook publisher that also issues the “Dummies” series, employs three full-time staff members to trawl for unauthorized copies. Gary M. Rinck, general counsel, said that in the last month, the company had sent notices on more than 5,000 titles — five times more than a year ago — asking various sites to take down digital versions of Wiley’s books.
The reason there's an "exponential" increase in piracy of ebooks is that there's an exponential increase in demand for ebooks:
That's not a bad thing! It's an indicator of unmet demand (and in particular for non-DRM encrypted content). I know I have no interest in buying an ebook that's locked to a single vendor or device, and I'm sure many of these "pirates" feel the same. This is a good time to revisit Tim O'Reilly's seminal Piracy is Progressive Taxation, which includes the following lessons:
- Obscurity is a far greater threat to authors and creative artists than piracy.
- Piracy is progressive taxation.
- Customers want to do the right thing, if they can.
- Shoplifting is a bigger threat than piracy.
- File sharing networks don't threaten book, music, or film publishing. They threaten existing publishers.
- "Free" is eventually replaced by a higher-quality paid service.
- "There's more than one way to do it."
I'm not suggesting publishers stop sending those DMCA notices; but 3 full-time staffers? Putting those resources toward building new ways to meet that demand is a much better investment.
Coincidentally, our research report Impact of P2P and Free Distribution on Book Sales is now available.
Pragmatic Programmers Now Doing "Ebook Bundles"
It's great to see other publishers picking up on the "ebook bundle" concept and including multiple formats -- the Pragmatic Programmers are now selling a combo of EPUB, PDF, and Kindle-compatible Mobipocket files for their ebooks. I especially like the way they've phrased it:
You’ve bought a license to the content, not to a particular file format, so you are free to enjoy that content on whatever device, using whatever display technology you choose.
Well said.
Format Comparison: PDF, EPUB, and Mobi Downloads from Ebook Bundles
We've been selling PDFs of our books on oreilly.com for several years, but this summer began selling "ebook bundles" of many titles, which include PDF, EPUB, and Mobipocket versions. Here's some weekly data (I can't share the vertical scale) on the relative breakdown of actual downloads from those bundles (PDF, Mobi, and EPUB are Light, Medium, and Dark respectively). PDF is still the format of choice for most people, though EPUB is getting respectable usage, with Mobi in third:
The numbers at the bottom are weeks (200901 is the first week of 2009). This is only among titles offered in all three formats -- the majority of our ebooks are currently still only available as PDF, though we expect to release several hundred more in bundle form over the next few months (not that you should wait to buy, of course -- you'll get all the formats as they become available ...).
An important point to note, via Allen Noren, our VP who runs oreilly.com, is that a substantial portion of our electronic sales come from overseas, where getting a print version is often difficult or cost-prohibitive:
I know you've heard me say it before, but we became an international publisher, in a way we were not previously, when we started selling books in digital format. We're in a unique position vs most publishers, who only have US or NA rights, but it's worth noting.
Duly noted.
Q&A with Hadrien Gardeur, Co-Founder of Feedbooks
Feedbooks is a Web-based service that converts, catalogs and distributes ebooks in a variety of formats. Co-founder Hadrien Gardeur discusses Feedbooks' system and future services in the following Q&A.
How would you define your company? Is Feedbooks a distributor? A digitizing service? A social network? Something else?
Probably all three. We already distribute a massive number of ebooks and most of our users currently use Feedbooks to discover and download public domain or Creative Commons licensed ebooks. But we're also working on various tools for authors and small publishers to create ebooks. We'd like to turn our readers into potential authors, and create a service where new authors can distribute their creations to a large user base.
Who is your typical user?
Do we really have a typical user? We probably used to have typical users when we mostly provided ebooks for dedicated reading devices: heavy readers. But that's not the case anymore, now that we've extended the service to the iPhone, too.
Why did you start Feedbooks?
We've seen a lot of very exciting services for music and video these last few years and I really believe that there's a huge potential for ebooks too, thanks to E Ink-based devices and multipurpose platforms such as the iPhone and Android. I love reading and I'd like to create a great service where anyone can discover new books, and where authors can easily connect with readers.
Your Web site lists support for the Kindle, the Sony Reader, the iRex iLiad, the Cybook Gen3, the iPhone and other smartphones. How are you able to support all of these devices?
We use an abstract representation, somewhat similar to DTBook, to store all of our books. We can generate a file on the fly based on this representation. Adding new formats is fairly easy thanks to this technology. We were the first service to distribute books in EPUB for this reason.
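To make the idea of one stored representation feeding many output formats concrete, here is a minimal Python sketch. It is not Feedbooks' actual schema or code: the Book and Chapter classes, the two renderers, and the generate() helper are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field
from html import escape
from typing import List


@dataclass
class Chapter:
    title: str
    paragraphs: List[str] = field(default_factory=list)


@dataclass
class Book:
    """Format-neutral book model (hypothetical, for illustration only)."""
    title: str
    author: str
    chapters: List[Chapter] = field(default_factory=list)


def render_xhtml(book):
    # Markup output: escape text so characters like & stay valid.
    parts = [f"<h1>{escape(book.title)}</h1>", f"<p>{escape(book.author)}</p>"]
    for chapter in book.chapters:
        parts.append(f"<h2>{escape(chapter.title)}</h2>")
        parts.extend(f"<p>{escape(p)}</p>" for p in chapter.paragraphs)
    return "\n".join(parts)


def render_text(book):
    # Plain-text output built from the same model.
    parts = [book.title.upper(), f"by {book.author}", ""]
    for chapter in book.chapters:
        parts.extend([chapter.title, "-" * len(chapter.title)])
        parts.extend(chapter.paragraphs)
        parts.append("")
    return "\n".join(parts)


# Supporting a new format means registering one more renderer here.
RENDERERS = {"xhtml": render_xhtml, "txt": render_text}


def generate(book, fmt):
    """Produce the requested format on the fly from the stored model."""
    if fmt not in RENDERERS:
        raise ValueError(f"unsupported format: {fmt}")
    return RENDERERS[fmt](book)
```

Because the book is stored once, adding a format touches only the renderer table, which is presumably why adding formats is described as fairly easy.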
Which ebook format is most popular with your users? Which e-reader is most popular?
EPUB and the iPhone are probably the most popular right now thanks to our seamless integration into Stanza. The most popular dedicated device is the Kindle.
Have established book publishers used your service to create ebook editions?
No, we're still working on those features. I expect major publishers to use XML+XSLT or Adobe InDesign rather than a dedicated service. We're creating our publishing feature with the end-user or small publishers in mind rather than major publishers.
Do you plan to sell ebooks?
We do. I believe that free content and user-generated content in general shouldn't be in a different environment than the rest of ebooks. It makes a lot more sense to have both in the same environment and create an optimal experience for the user.
When will sales begin?
No specific date yet, we'd rather focus on building a good service first and then add this component.
Print on demand (POD) services seem like a logical extension for Feedbooks. Is this something you're planning?
Sure, I consider POD as another potential format for our platform. It's a lot easier to turn an ebook into a POD book than the other way around.
The Feedbooks RSS tool appears to be targeted at Kindle users who want to receive updated news and information from RSS feeds. Do you anticipate other uses for this tool, such as a blog-to-book service?
It's not targeted at Kindle users only. I use it every day on a Sony Reader, and it's actually quite popular with the iPhone, too. I've been experimenting with blog-to-book, there's a lot of such "blooks" (blog+book, serialized novels using blogs) out there. I created a catalog entry for Stanza to test how the readers react to these serialized novels. Such a tool could probably be very interesting for publishers, too.
Feedbooks and Lexcycle, the company behind the Stanza e-reader, have a close working relationship. How did this come together?
Lexcycle launched the iPhone version of Stanza a few days before we decided to release the first version of our new API. Marc [Prud'hommeaux, principal developer at Lexcycle] contacted me: they were looking for content that could be directly integrated into Stanza's online catalog. We exchanged a lot of e-mails with various information, and did a lot of work together to make sure that this would work from day one. There's still a lot of new features that I'd like to introduce and we'll continue improving both the API and Stanza in the future, to create an optimal experience.
How are publishers and others using the Feedbooks API?
I would describe our API as read-mostly for the moment. It's mostly useful for reading systems such as Stanza. Once we turn it into something that's read/write, the situation will be quite different and I can imagine various innovative publishing techniques based on this.
What publishing techniques do you foresee?
Publishing should be more of a seamless experience. We already use a lot of publishing tools (blogs, social networks etc...) and we shouldn't have such a gap between these tools and ebooks.
What are the biggest issues with digital conversion?
There's a lot of formats, and you can expect standards such as EPUB to evolve in the near future. But I believe that the biggest issue for publishers is to find the right balance between what users are allowed to do and the ability to preserve the layout and design of a book. The holy grail for publishers is probably something as powerful as PDF, but reflowable. Ebooks allow users to customize a lot of things and preserving the design of a book shouldn't be at the cost of this flexibility.
Digging Around Amazon's Topaz File Format
Late Night Code is popping the hood on Topaz, that mysterious "other" file format used on the Kindle:
Mobipocket files purchased from Amazon have an AZW extension (which presumably stands for Amazon Whispernet - the name of the Kindle wireless download service). Mobipocket files from other sources will have a MOBI or PRC extension. Topaz files will have an AZW1 extension if downloaded directly to the Kindle, and a TPZ extension if downloaded from Your Media Library on Amazon.com.
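The extension rules in that excerpt are easy to capture in a lookup table. The following Python sketch assumes only what the quoted passage states. It looks at the file name alone and does not inspect the file's contents, and the function name is made up for this example.

```python
from pathlib import Path

# Extension-to-format mapping taken from the quoted passage.
KINDLE_EXTENSIONS = {
    ".azw": "Mobipocket (purchased from Amazon)",
    ".mobi": "Mobipocket",
    ".prc": "Mobipocket",
    ".azw1": "Topaz (downloaded directly to the Kindle)",
    ".tpz": "Topaz (downloaded from Your Media Library)",
}


def guess_kindle_format(filename):
    """Guess the ebook format from the file extension, case-insensitively."""
    suffix = Path(filename).suffix.lower()
    return KINDLE_EXTENSIONS.get(suffix, "unknown")
```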
Optimizing Web Content for the Kindle Browser
Amazon's Kindle store is convenient, easy-to-use and stocked with thousands of titles.
But what about publishers and content distributors who want to reach the estimated 240,000 Kindle users without going through Amazon's program? And what about content formats that the Kindle does not directly support?
One selling point of the device is its free, ubiquitous Internet service and Web browser. Amazon has filed the browser under "Experimental" but it's quite usable as-is. With a few simple changes to a Web site's HTML code, it's even possible to specially cater to Kindle users.
The screenshots used in this article are from the mobile version of Bookworm, my Web application for reading ebooks in the EPUB format. Although what's being displayed is ebook content, it's being delivered by the Kindle's browser, not the Kindle ebook technology, which does not yet support EPUB.
Because the mobile Web version is already heavily optimized for small devices, the layout is simpler than a traditional Web site. What works for an iPhone or other wireless device will also be a good starting point for the Kindle, although we'll see there are some special considerations that don't apply to any other device.
Default or Advanced Mode?
When the Kindle ships, its Web browser is in "default mode." It will not load images or CSS styles, but it does render basic HTML tags like the italic tag <i>. Personally, I prefer "advanced mode," which displays Web pages more like a traditional browser, but some sites can be unreadable in this mode.
When optimizing for the Kindle it's best to consider that most users will not change from "default mode," or even realize that the option exists.
How different are these modes? Here is a comparison shot of the same screen from Bookworm in both modes:
| My list of books in Advanced mode, showing tabular layout and more advanced font styles | My list of books in Default mode |
In Default mode, all the information about the books runs together. It would be better to present this as a simple vertical list, the way the Amazon Kindle store does, rather than as a table.
Font Size Considerations
You can choose from six font sizes in the Kindle browser. As a content creator, you can provide a wider range of font sizes in your Kindle-formatted Web page, but take care that they aren't too small. The device doesn't clearly display fonts that are smaller than its default six sizes.
In this screenshot, the table of contents for a Bookworm book is not readable, even though this page has already been tailored for the small display of mobile phones:
This problem is only likely to occur in Advanced mode where stylesheets are activated.
Usability
The Kindle's method of selecting and traversing hyperlinks is unique. The user activates links by selecting along the vertical, or Y-axis, using the scroll wheel. When multiple links fall on the same line, the Kindle will open a dialog box so the user can clarify which link is the target.
In Bookworm, users move to the next or previous chapter by selecting navigation links lined up horizontally (see the top row of the first image). In the Kindle, this presentation forces the user to click a second time to select the appropriate one:
For commonly-used navigational items like this, line up the links in a vertical row:
- Next
- Contents
- Previous
Now no second click (and accompanying page refresh) is necessary.
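If your navigation is generated by code, the vertical layout can be produced by emitting one link per list item. This is a rough Python sketch, not Bookworm's actual template code. The function name and the URLs in the usage example are placeholders.

```python
from html import escape


def vertical_nav(links):
    """Render (label, href) pairs as a list with one link per line."""
    items = "\n".join(
        f'  <li><a href="{escape(href, quote=True)}">{escape(label)}</a></li>'
        for label, href in links
    )
    return f"<ul>\n{items}\n</ul>"


# Hypothetical URLs, for illustration only.
print(vertical_nav([
    ("Next", "/chapter/3"),
    ("Contents", "/toc"),
    ("Previous", "/chapter/1"),
]))
```

Each link sits on its own row, so the Kindle's line-based selection should never have to ask which link was meant.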
It's also important to remember that the Kindle is a black-and-white device. If your site uses text color to convey any useful information (such as what is or is not a hyperlink), re-work the design to accommodate a grayscale display.
Finally, keep pages short. The Kindle cannot scroll; long Web pages are paginated like books. Pagination with E Ink devices is slow relative to scrolling on a computer screen. If possible, keep all your content on the first Kindle "page" when viewed at the default font size.
Targeting the Kindle
Web browsers are identified using their "user-agent" string. The current version of the Kindle is broadcasting this user-agent:
Mozilla/4.0 (compatible; Linux 2.6.10) NetFront/3.3 Kindle/1.0 (screen 600x800)
It's beyond the scope of this article to describe how to set up your Web site to deliver different kinds of content to different browsers, a process that varies considerably with your site's technology.
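Whatever your site's technology, the detection step itself is small. Here is a minimal Python sketch that looks for the Kindle token in a user-agent string like the one quoted above. The function name is an assumption for this example, and future Kindle browsers may identify themselves differently.

```python
import re

# Matches the "Kindle/1.0" token seen in the user-agent quoted above.
KINDLE_UA_PATTERN = re.compile(r"\bKindle/(\d+(?:\.\d+)*)")


def kindle_version(user_agent):
    """Return the Kindle version string, or None for other browsers."""
    match = KINDLE_UA_PATTERN.search(user_agent or "")
    return match.group(1) if match else None
```

A request handler could call kindle_version() on the incoming User-Agent header and choose a simplified template whenever the result is not None.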
How do you test your layout if you don't have a Kindle? There's no substitute for having the real device (tell your boss it's for "research"), and currently Amazon does not offer any kind of browser emulator. Some possibilities:
- Disable stylesheets on your browser and look at the output. Does it still make sense? (Instructions for disabling stylesheets; Firefox users should install the Web Developer add-on)
- Use a text-only browser like Lynx
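A third option is to approximate Default mode yourself by stripping stylesheets and images from a page before viewing it in a desktop browser. The Python sketch below uses only the standard library and is a crude stand-in, not an emulator: it assumes Default mode simply ignores CSS and images, as described earlier, and it drops comments and doctypes along the way. The class and function names are made up for this example.

```python
from html import escape
from html.parser import HTMLParser


class DefaultModePreview(HTMLParser):
    """Re-emit HTML without stylesheets, inline styles, or images."""

    def __init__(self):
        # Keep entities such as &amp; as written instead of decoding them.
        super().__init__(convert_charrefs=False)
        self.out = []
        self._in_style = False

    def _dropped(self, tag, attrs):
        if tag == "img":
            return True
        if tag == "link":
            rel = (dict(attrs).get("rel") or "").lower().split()
            return "stylesheet" in rel
        return False

    def handle_starttag(self, tag, attrs):
        if tag == "style":
            self._in_style = True
            return
        if self._dropped(tag, attrs):
            return
        kept = "".join(
            f" {name}" if value is None else f' {name}="{escape(value, quote=True)}"'
            for name, value in attrs
            if name != "style"
        )
        self.out.append(f"<{tag}{kept}>")

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags such as <br/> are emitted as plain start tags.
        if tag != "style":
            self.handle_starttag(tag, attrs)

    def handle_endtag(self, tag):
        if tag == "style":
            self._in_style = False
        elif tag not in ("img", "link"):
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if not self._in_style:
            self.out.append(data)

    def handle_entityref(self, name):
        self.out.append(f"&{name};")

    def handle_charref(self, name):
        self.out.append(f"&#{name};")


def preview(html):
    """Return the page roughly as a browser without CSS or images sees it."""
    parser = DefaultModePreview()
    parser.feed(html)
    parser.close()
    return "".join(parser.out)
```

Save the output of preview() to a file and open it in any browser to check whether the page still makes sense.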
Some Last Advice
Don't spend too much time on this process. The next version of the Kindle is expected soon, no doubt with an improved browser. Indeed, Amazon could offer a new version of the existing browser at any time. Most of the changes recommended above should take little time and money to implement, and can make a great difference in user experience.
In addition, optimizing your site for small-screen browsers can have other benefits: it gives an increasing number of mobile users quick access to your content, and it aids accessibility for screen readers and other non-standard browser types.
Looking at EPUB's Flexibility and Fidelity
Jon Noring at TeleRead discusses the fundamental importance of the AAP's endorsement of the EPUB specification and format:
The following two points in AAP’s letter are germane to this article:
1. AAP sees retailers selling EPUB directly to consumers ... as well as selling derivative formats converted from EPUB. Publishers understand the great flexibility that EPUB provides.
2. AAP uses the phrase “high-fidelity” to describe EPUB. This mention means presentation quality is important to AAP, and thus should be important to everyone else in the ebook industry. It also acknowledges that indeed EPUB is “high-fidelity."
It is clear that publishers consider “flexibility” and “high-fidelity” in ebook formats important, for themselves, for the rest of the industry, and for consumers. And EPUB is a format that meets these requirements.
EPUB Creation Just Got Simpler
BookGlutton announced last week that it had developed a Web-based (X)HTML to EPUB conversion form (and API). The form itself accepts HTML or XHTML documents and returns an .epub file (in a couple of seconds) for download. While it doesn't yet support images or CSS stylesheets, it sounds like these features are coming. My handful of tests of the tool have all "just worked." I grabbed HTML files I found on the Web and an HTML version of a recent O'Reilly title and all were happily accepted. The resulting .epub file opened fine in Adobe Digital Editions and was readable.
The impact of this sort of easy-to-use form is huge, as so many content creation tools already support (X)HTML output in some way, from Word to OpenOffice.org to DocBook to Dreamweaver. It should be the first step in lowering the barrier to entry to creating EPUB documents. Bob DuCharme had already shown technical experts how to create .epub files with nothing but free tools and I'm hopeful that the Save as DAISY output from Word will help create more accessible documents, but there's nothing like a simple Web form to bring a complicated standard to the masses.
That said, the lack of CSS and image support really makes this more of a proof-of-concept than a real tool today, unless you're only interested in reading narrative text. With that in mind, let's give it a shot (in Firefox, on my Mac):
- Find Wikipedia's article on E-book.
- Save As: Web Page, HTML only (so you don't bother with the images or CSS).
- Now take that HTML file from your computer and feed it right back to the BookGlutton form.
- Hit convert, then open the resulting .epub in Adobe Digital Editions.
Here's the resulting .epub, for the lazy: wikipedia_on_E-book.epub. I also tried two other samples: the 3rd chapter from Word Hacks (word_hacks_chapter_3.epub) and the Ebook Format Primer from the TOC blog (ebook-format-primer.epub).
So, given our three samples, what are the current drawbacks? Well, as I mentioned before, the missing image and CSS support are the two obvious ones, especially for the book content (which had images, unlike the blog post). There's also the all-too-common drawback of HTML from the wild-wild Web being rather funky. You can see an example of that sort of oddness on the first page of the Wikipedia sample in Digital Editions (which is including some JavaScript code meant to be executed by the browser):
// document.writeln("\x3cp\x3e\x3ca href=\"http://wikimania2008.wikimedia.org/wiki/Registration\"
blah blah blah
...but that stuff is ignorable and could be removed from the HTML if one cared. Another concern is that while the internal linking (from the Contents, for example) works, some of the external links back to other parts of Wikipedia don't. Linking is a major advantage of ebooks, so this is a sad one, though this is a common web problem and not really BookGlutton's fault. My final complaint has to do with special characters (n spaces), which seem to have gotten messed up in the book content (look around the "Figure" references). That said, the blog post looks pretty nice, once you find it a little later in the document.
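For anyone who does care about the stray JavaScript, stripping script elements before uploading is a one-liner. This Python sketch uses a regular expression, which is a rough approach that is adequate for a one-off cleanup of a saved page but is not a general HTML sanitizer. The function name is made up for this example.

```python
import re

# Matches a <script> element and everything up to its closing tag.
SCRIPT_RE = re.compile(r"<script\b[^>]*>.*?</script\s*>", re.IGNORECASE | re.DOTALL)


def strip_scripts(html):
    """Remove script elements from a saved HTML page."""
    return SCRIPT_RE.sub("", html)
```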
Although at this stage it's just a prototype, BookGlutton's work might encourage the re-use of existing content published on the Web packaged as an ebook. This type of thing should significantly increase the number of .epub files ready to go into (format-friendly) ebook devices and create more pressure on ebook device manufacturers to support EPUB.
It's time for the "regular" folks to step out of the woodwork and give this EPUB thing a try!
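For the curious, here is roughly what any (X)HTML-to-EPUB converter has to produce. This Python sketch is not BookGlutton's implementation or API. It assumes the EPUB 2 layout (a stored mimetype entry first, a container file, an OPF package, and an NCX table of contents), a single XHTML file that is already well-formed, and a placeholder identifier. The function name and file names inside the archive are choices made for this example.

```python
import io
import zipfile
from xml.sax.saxutils import escape

CONTAINER_XML = """<?xml version="1.0" encoding="UTF-8"?>
<container version="1.0" xmlns="urn:oasis:names:tc:opendocument:xmlns:container">
  <rootfiles>
    <rootfile full-path="content.opf" media-type="application/oebps-package+xml"/>
  </rootfiles>
</container>
"""

OPF_TEMPLATE = """<?xml version="1.0" encoding="UTF-8"?>
<package xmlns="http://www.idpf.org/2007/opf" version="2.0" unique-identifier="bookid">
  <metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
    <dc:title>{title}</dc:title>
    <dc:language>en</dc:language>
    <dc:identifier id="bookid">{uid}</dc:identifier>
  </metadata>
  <manifest>
    <item id="ncx" href="toc.ncx" media-type="application/x-dtbncx+xml"/>
    <item id="body" href="body.xhtml" media-type="application/xhtml+xml"/>
  </manifest>
  <spine toc="ncx">
    <itemref idref="body"/>
  </spine>
</package>
"""

NCX_TEMPLATE = """<?xml version="1.0" encoding="UTF-8"?>
<ncx xmlns="http://www.daisy.org/z3986/2005/ncx/" version="2005-1">
  <head><meta name="dtb:uid" content="{uid}"/></head>
  <docTitle><text>{title}</text></docTitle>
  <navMap>
    <navPoint id="body" playOrder="1">
      <navLabel><text>{title}</text></navLabel>
      <content src="body.xhtml"/>
    </navPoint>
  </navMap>
</ncx>
"""

# Placeholder identifier; a real book needs its own unique ID.
PLACEHOLDER_UID = "urn:uuid:00000000-0000-0000-0000-000000000000"


def build_epub(title, xhtml, uid=PLACEHOLDER_UID):
    """Package one XHTML document as a minimal EPUB 2 file (bytes)."""
    fields = {
        "title": escape(title, {'"': "&quot;"}),
        "uid": escape(uid, {'"': "&quot;"}),
    }
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        # The mimetype entry must come first and must not be compressed.
        archive.writestr("mimetype", "application/epub+zip",
                         compress_type=zipfile.ZIP_STORED)
        entries = [
            ("META-INF/container.xml", CONTAINER_XML),
            ("content.opf", OPF_TEMPLATE.format(**fields)),
            ("toc.ncx", NCX_TEMPLATE.format(**fields)),
            ("body.xhtml", xhtml),
        ]
        for name, data in entries:
            archive.writestr(name, data, compress_type=zipfile.ZIP_DEFLATED)
    return buffer.getvalue()
```

Writing the returned bytes to a file ending in .epub should give something a reader like Adobe Digital Editions can open, provided the input really is valid XHTML.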
"Last Lecture" Success Inspires Kindle Marketing
Perhaps catalyzed by the Kindle's new availability, Amazon recently associated the surprise success of The Last Lecture with happy Kindle owners. From an Amazon press release:
"One of the advantages for readers is that Kindle titles never go out of stock," said Steve Kessel, Senior Vice President, World Wide Digital Media at Amazon.com. "That's good for readers, and it's good for publishers too."
Ebook editions of The Last Lecture are available in Kindle, Secure eReader, Secure Mobipocket and Secure Microsoft Reader formats.
Keep Your Eye on the epub Ball (But Do Play Nice)
On Peter Brantley's Reading 2.0 email list, former IDPF director Nick Bogaty offered a great argument for dialing down some of the pressure aimed at device makers for not yet fully supporting the .epub ebook standard. Nick has kindly given permission to have his comments reprinted here:
While companies like the one I work for have broadly implemented .epub support in its eBook products (in Adobe's case in InDesign CS3 for making .epub and Digital Editions for consuming it), I think it is too early to question vendor support for the .epub format. The final piece of the .epub specification (which is really composed of three specs) was only approved in September 2007 and it takes time for big companies to digest the implications of .epub on their businesses, integrate it into their products etc.
When we were making the .epub format when I was at the IDPF, we envisioned that eBook hardware and software would handle .epub in one of two ways. The first way is to simply render (in the case of eBooks that means "read") .epub files natively. Personally this is what I think makes most sense and it is what Adobe thinks makes most sense. You get an .epub file, open it in a piece of software or on an ebook reading device, and you're reading an .epub book. This scenario uses .epub as a consumer format.
The second way is for software or a device to take an .epub file and automatically convert it to a proprietary format. A publisher creates an .epub file, sends it to a vendor or through a channel that they want to sell, and that vendor or channel builds some sort of automatic conversion of the .epub to a proprietary format. There are many reasons for doing this, and all reasons generally have to do with companies thinking the .epub format doesn't meet the requirements of their hardware or software. This scenario uses .epub as a distribution format.
Either way, the advantage for publishers is very clear. Until now publishers had to convert to X numbers of formats if they wanted to take advantage of X numbers of channels. This significantly raised costs for publishers and forced publishers to make a strategic decision on what parts of their inventory they wanted to convert to an eBook in order to recoup their investment in conversion. And this had depressing consequences for consumers. Imagine going into a Barnes & Noble store with only 10,000 titles available.
What .epub really gives publishers is leverage. They can say to their vendors and channels, "ok, I'm now only giving you .epub and you better either provide software that reads .epub or provides an automatic conversion from .epub to Y format." This tremendously lowers costs and aggravation for publishers and, I strongly suspect, will increase inventory through the channels quite dramatically. The decision to create an eBook is just so much easier to make. And, if a hot eBook startup (or existing non-compliant eBook device/software) comes along to a publisher and says, "I've got this great device or software, give me your books in my format," a publisher can say, "you get .epub if you want my books." I strongly suspect that in the coming months, this above scenario of ".epub only" will start to happen more and more as publishers begin to produce .epub and understand its tremendous benefits to their digital businesses. And, publishers can use this leverage to get their software and ebook device partners to implement .epub a little faster.
While people don't seem quite in the mood these days to do so, I'd give Amazon (and others) the benefit of the doubt and a little time on .epub. The format is clearly in everyone's best interest.
[Links and emphasis added]
Nick's comments are reasoned and rational (not unexpected from someone who's spent time on both the standards side and the vendor side). And while publishers need to be realistic in their expectations for adoption of such a new standard (here at O'Reilly we're still working ourselves to efficiently retrofit content for .epub), publishers still need to keep the .epub goal in sight, and make sure that our actions continue progress toward that endpoint. As Nick suggests, saying "ok, I'm now only giving you .epub and you better either provide software that reads .epub or provides an automatic conversion from .epub to Y format" is the right way to go, for publishers and consumers.
Commentary on Penguin's Missed Ebook Opportunity
(Updated with excerpt/link instead of repost)
On Penguin's latest e-Book move via the O'Reilly Radar blog:
What's most galling, of course, is that Penguin isn't attempting to increase interest in ebooks as a medium by making these classics, long past copyright, available in free, un-DRM-encumbered formats. In an old-meets-new mashup, publishers could use free distribution of still-in-demand classics to generate interest in a form, ebooks, that is still only in the earliest days of its potential public acceptance. Wouldn't you be more likely to try something new if it was free?