Entries tagged with “publishing” from O'Reilly Radar

Tue, Oct 20, 2009

Four short links: 20 October 2009

Politics in The Age of Social Software, Ethernet Patents, Free Book Fear, Programming Exercises

by Nat Torkington (@gnat), comments: 7

  1. Poles, Politeness, and Politics in the Age of Twitter (Stephen Fry) -- begins with a discussion of a UK storm but rapidly turns into a discussion of fame in the age of Twitter, modern political discourse, the "deadwood press", and The Commons in Twitter Assembled. There is an energy abroad in the kingdom, one that yearns for a new openness in our rule making, our justice system and our administration. Do not imagine for a minute that I am saying Twitter is it. Its very name is the clue to its foundation and meaning. It is not, as I have pointed out before, called Ponder or Debate. It is called Twitter. But there again some of the most influential publications of the eighteenth century had titles like Tatler, Rambler, Idler and Spectator. Hardly suggestive of earnest political intent either. History has a habit of choosing the least prepossessing vessels to be agents of change.
  2. Apple and Others Hit With Lawsuit Over 90s Ethernet Patents -- unclear whether the plaintiff is 3Com (who filed the patents) or a troll who bought them. "We strongly believe that 3Com’s Ethernet technologies are being regularly infringed by foreign and some US companies," said David A. Kennedy, Chief Executive Officer of U.S. Ethernet Innovations. "We believe that the continued aggressive enforcement of the fundamental Ethernet technologies developed by 3Com against the waves of cheap, knock-off, foreign manufactured equipment is a necessary step in protecting the competitiveness of this American technology and American companies in general." (via Slashdot)
  3. The Point -- someone's publishing Mark Pilgrim's "Dive into Python", which was published by APress under an open content license. Naturally this freaked out APress (it's easy to imagine many eyelids would tic nervously should such a thing happen with one of O'Reilly's open-licensed books). Mark's response is fantastic. Part of choosing a Free license for your own work is accepting that people may use it in ways you disapprove of. There are no “field of use” restrictions, and there are no “commercial use” restrictions either. In fact, those are two of the fundamental tenets of the “Free” in Free Software. If “others profiting from my work” is something you seek to avoid, then Free Software is not for you. Opt for a Creative Commons “Non-Commercial” license, or a “personal use only” freeware license, or a traditional End User License Agreement. Free Software doesn’t have “end users.” That’s kind of the point.
  4. Programming Praxis -- programming exercises to keep your skills razor-sharp, with solutions.

tags: free, patent, politics, programming, publishing, social software, twitter
comments: 7

Thu, Sep 24, 2009

Microsoft Press Enters Strategic Alliance with O'Reilly

by Tim O'Reilly (@timoreilly), comments: 32

Today, Microsoft and O'Reilly Media announced an agreement to support and expand Microsoft Press. Under the terms of the strategic alliance, O'Reilly will become the exclusive distributor of Microsoft Press titles, and co-publisher of all Microsoft Press titles, on Nov. 30, 2009. We'll be working with Microsoft to develop new books, as well as distributing both existing and new co-published books to bookstores and, perhaps most importantly, to the emerging digital book channels that represent the future of book publishing. Microsoft could have chosen to partner with any of the major computer book publishers. That they chose to work with us is a testament to three advantages we bring to the business:

  1. O'Reilly is more than a book publisher. We are an advocate, a connector, and a community builder. We help developers and users make the most of technology, with a focus on what they need to know. Microsoft has a history of building great developer communities, but in today's world, those communities need to be connected with other communities outside Microsoft. Especially in technology, "the world is flat."
  2. O'Reilly plays a unique role in the technology ecosystem: from our earliest days, we provided the documentation for important technologies for which there was no "vendor." The internet, the World Wide Web, Linux and other open source software, and Web 2.0 all were documented and given mainstream awareness by O'Reilly books and events. We identify and evangelize the disruptive technologies that reinvigorate the industry.
  3. O'Reilly has been a pioneer in the new world of ebooks. In the early 1990s, we co-developed DocBook, one of the first standardized formats for ebooks, and the progenitor of future XML-based ebook formats. In 2001, in partnership with the Pearson Technology Group, we launched Safari Books Online, the largest and most comprehensive electronic subscription library of computer books and videos. We've built a successful direct business with DRM-free downloads of ebook bundles that work on any device. We're an early leader in publishing books for the iPhone and other portable reading devices, and understanding how to use ebook channels to reach new customers. And of course, our Tools of Change for Publishing Conference (TOC) has become the place to share knowledge about the changes sweeping through publishing.

On this last point, I'm particularly excited that as part of this agreement, Microsoft has committed to make its ebooks DRM-free and device-independent. One of our goals at O'Reilly has been to make sure that ebook customers can read their ebooks on any device, and have the ability to keep using them even if they change their preferred device. Having Microsoft Press join us in this commitment is a big step forward towards an open ebook market.

tags: drm, microsoft, oreilly media, publishing
comments: 32

Wed, Sep 23, 2009

Worldwide Lexicon: matching up technologies and culture to end the language barrier

by Andy Oram (@praxagora), comments: 5

I've reported before on the Worldwide Lexicon, the brainchild of my friend Brian McConnell. His most recent breakthrough, which I blogged about in August, was an impressive Firefox plugin that exploits both human and machine translations on the Web to provide pages you can read in your primary language.

As attractive as the Firefox plug-in can be, it's only the first stage in four that Brian plans toward a computing environment that encourages and leverages human translation. On the browser side, the next logical project is to reproduce the Firefox experience for IE users. Ultimately, he hopes the functionality becomes a standard part of every browser. Even better, he's working on a way to include the functionality on the server side so that it's browser-independent (although that technology would require support in the server software, of course).

And there's even more to come. He lays out his vision in an essay boldly titled The End Of The Language Barrier. The bottom of the article points to an equally important statement written for the World Economic Forum by Ethan Zuckerman, founder of the Global Voices site that extends the reach of weblogs to people in many countries who previously lacked access to such forums.

(continue reading)

tags: Brian McConnell, community, crowdsourcing, documentation, Ethan Zuckerman, Firefox add-on, Global Voices, language, peer production, polyglot, publishing, translation, wealth of networks, wisdom of crowds, World Wide Lexicon, WWL
comments: 5

Tue, Aug 25, 2009

World Wide Lexicon Toolbar changes the reading experience for the other 99% of web pages

by Andy Oram (@praxagora), comments: 8

Brian McConnell's latest coding effort, World Wide Lexicon Toolbar, meets my criterion for a piece of critical infrastructure: after two days with it I can't get along without it, and I plan to avoid any browser that doesn't have it installed.

Brian is a highly adaptive programmer. With roots in the telecom industry and several start-ups on his resume, he also wrote Beyond Contact: A Guide to SETI and Communicating with Alien Civilizations for O'Reilly. The World Wide Lexicon project he's been working on for the past several years is again something totally different.

Install the add-on (currently experimental) in Firefox 3.5 or higher and visit a page in some language other than your default. Before your eyes, headings and text change into your native language. You can get similar effects by submitting the page to a popular translator such as Google (which is one of the tools used behind the scenes by the WWL toolbar), but the instantaneous effect of the toolbar makes you feel closer to the people whose sites you visit around the world.

There are several languages that I know well enough to get the gist of a page, but where I miss some of the details and get frustrated by gaps in my vocabulary. Therefore, I set the WWL toolbar to "Bilingual view," so each block element of the original text is shown together with its translation. The bilingual view is considerably less attractive, because it swells the size of each block element, but I can tell already that it will improve my language skills quickly.

WWL is designed for volunteer translations. If it becomes more popular, people will submit translations that are much more accurate than the machine-generated ones the WWL must fall back on currently.

What's the process behind this new dimension to web browsing? McConnell let me in on some of the magic.

Volunteer translations

McConnell invented WWL several years ago with the core notion of encouraging people to translate web pages they thought should get a wider audience. When he first told me about the idea, I was skeptical that he would get many volunteers. But then I heard of other volunteer translation efforts. For instance, there's a whole subculture of people who write subtitles for popular Hollywood films. These subtitles run afoul of copyright law, of course (and so, probably, do the copies of movies they're attached to), but they show the lengths to which crowdsourcing has progressed in the translation area.

FLOSS Manuals, a project I do volunteer work for, also finds dozens of people willing to translate its open source documentation.

McConnell's first set of tools was designed to facilitate on-the-fly translations. Web designers could enhance their web sites by downloading from the WWL site some JavaScript that made each text element on the page editable. (I blogged about this in December 2007.) The paste-in displayed a little pencil icon, signaling to viewers that they could do instant translations. All they would have to do was click on an element, and a text box would pop up where they could enter their translation. The web site would then register the translation with the central WWL site.

World Wide Lexicon API

The WWL API covers the entire life cycle of a translation: registering a translation, rating translations for quality, searching for a translation of a particular page into a particular language, and retrieving a translation. Queries can specify a minimum rating.
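
The life cycle described above can be modeled in a few lines. What follows is a minimal in-memory sketch, not the actual WWL API: the class `TranslationStore`, its methods `register`, `rate`, and `search`, and the idea of averaging numeric scores into a rating are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass, field
from statistics import mean


@dataclass
class Translation:
    url: str
    lang: str
    text: str
    ratings: list = field(default_factory=list)

    @property
    def rating(self):
        # Unrated translations score 0, so any positive minimum excludes them.
        return mean(self.ratings) if self.ratings else 0


class TranslationStore:
    """Hypothetical in-memory stand-in for a translation registry."""

    def __init__(self):
        self._items = []

    def register(self, url, lang, text):
        """Record a new translation of a page into a language."""
        item = Translation(url, lang, text)
        self._items.append(item)
        return item

    def rate(self, item, score):
        """Attach one quality score to an existing translation."""
        item.ratings.append(score)

    def search(self, url, lang, min_rating=0):
        """Return matching translations, best rated first."""
        hits = [t for t in self._items
                if t.url == url and t.lang == lang and t.rating >= min_rating]
        return sorted(hits, key=lambda t: t.rating, reverse=True)
```

Retrieval here is simply reading `text` from a search result; a real service would presumably separate searching from fetching.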

Toolbar

The latest achievement of the WWL project is the toolbar officially released yesterday. It determines the user's native language through settings in the browser. When each page is visited, the toolbar uses the domain name and various tests on the text to make a guess about its language.

The toolbar then issues an API query to see whether any human translations exist. If so, it displays the translations with a light yellow or green background.

If no one has made a human translation (which is usually the case so far) the toolbar resorts to well-known machine translation services. It can make use of Google Translate, Apertium, and Moses, each of which offers an API, and will also query Babelfish when its API is ready. Machine translations are displayed with a light blue or grey background.
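
The lookup order in the last two paragraphs (human translation first, machine translation as a fallback) might be sketched like this. The function name, the callables passed in, and the specific color names are assumptions for illustration; the toolbar's real code is not shown in this post, and skipping a failing engine is my own guess at sensible behavior.

```python
def choose_translation(text, human_lookup, machine_engines):
    """Return (translated_text, background) for one block of text.

    human_lookup and every entry in machine_engines are hypothetical
    callables that take the source text and return a translation or None.
    """
    human = human_lookup(text)
    if human is not None:
        return human, "lightyellow"  # human translations get a warm tint

    for engine in machine_engines:
        try:
            machine = engine(text)
        except Exception:
            continue  # assumption: a failing service should not block the rest
        if machine is not None:
            return machine, "lightblue"  # machine output gets a cool tint

    return text, None  # nothing available: leave the original untouched
```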

The progressive translation used by the toolbar is also interesting. It starts with the first 10 or 20 elements, then translates heading tags (<H1>, etc.), then the larger texts, and ultimately every element on a page. (I displayed one page that embedded a Google ad, and the translator recognized and translated that text too.) McConnell is working on making the various translations run in parallel. Because translation changes the sizes of elements, the toolbar makes various accommodations to display the page as attractively as it can.
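
One way to picture that progressive ordering is as four passes over the page's elements. This is only a sketch under stated assumptions: the function name, the `(tag, text)` representation, and the `first_n` and `large_chars` thresholds are hypothetical and not taken from the toolbar.

```python
import re

HEADING = re.compile(r"^h[1-6]$", re.IGNORECASE)


def translation_order(elements, first_n=10, large_chars=200):
    """Yield indexes of (tag, text) pairs in the order they would be translated.

    first_n and large_chars are illustrative thresholds, not values taken
    from the toolbar.
    """
    seen = set()

    def emit(indexes):
        # Skip anything an earlier pass already scheduled.
        for i in indexes:
            if i not in seen:
                seen.add(i)
                yield i

    everything = range(len(elements))
    # Pass 1: the first few elements, in document order.
    yield from emit(everything[:first_n])
    # Pass 2: heading tags.
    yield from emit(i for i in everything if HEADING.match(elements[i][0]))
    # Pass 3: the larger blocks of text.
    yield from emit(i for i in everything if len(elements[i][1]) >= large_chars)
    # Pass 4: whatever is left.
    yield from emit(everything)
```

Running the passes in parallel, as McConnell plans, would change the scheduling but not the priority order shown here.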

In short, WWL is a cool combination of mash-ups, existing services, crowdsourcing, and Ajax. I'm sure that in a year's time I'll think back to its appearance today and be shocked at how primitive it was. But it will remain a transformative tool for me.

tags: Brian McConnell, community, crowdsourcing, documentation, Firefox add-on, peer production, publishing, wealth of networks, wisdom of crowds, World Wide Lexicon, WWL
comments: 8

Tue, Aug 25, 2009

Four Short Links: 25 August 2009

Reverse Search, PDF Stripping, Flash Visualization, Failure

by Nat Torkington (@gnat), comments: 1

  1. Tineye -- reverse search engine; you upload an image and they find you similar images so you know where else it's used. Check out their cool searches.
  2. PDF Pirate -- upload a PDF and this web site will give it back to you minus the restrictions on copying/printing/etc.
  3. Flare -- an ActionScript library for creating visualizations that run in the Adobe Flash Player. BSD-licensed, modelled on Prefuse. When there's a visualisation library for every platform, will we start to get people who know how to make them?
  4. The Importance of Failure (Marco Tabini) -- This is a point that I don't often hear made when people talk about failure; the moral behind a failure-related story is usually about preventing it, or dealing with the aftermath, but not about the fact that sometimes things go bad despite your best efforts, and all the careful risk management and contingency planning won't keep you from going down in flames. This is important, because it forces every person to establish a risk threshold that they are willing to accept in every one of their life efforts.

tags: drm, failure, failure happens, flash, publishing, search, visualization
comments: 1

Fri, Aug 14, 2009

Four short links: 14 August 2009

EPub FTW, SQL Horror, Computer Vision Explained, and A Massive Dump of Twitter Stats

by Nat Torkington (@gnat), comments: 1

  1. Page2Pub -- harvest wiki content and turn it into EPub and PDF. See also Sony dropping its proprietary format and moving to EPub. Open standards rock. (via oreillylabs on Twitter)
  2. SQL Pie Chart -- an ASCII pie chart, drawn by SQL code. Horrifying and yet inspiring. Compare to PostgreSQL code to produce ASCII Mandelbrot set. (via jdub on Twitter and Simon Willison)
  3. How SudokuGrab Works -- the computer vision techniques behind an iPhone app that solves Sudoku puzzles that you take a photo of. Well explained! These CV techniques are an essential part of the sensor web. (via blackbeltjones on Delicious)
  4. Twitter by the Numbers -- massive dump of charts and stats on Twitter. I love that there's a section devoted to social media marketers, the Internet's head lice. (via Kevin Marks on Twitter)

tags: book related, computer vision, ebooks, fun, iphone app, publishing, sql, statistics, twitter
comments: 1

Mon, Aug 10, 2009

Four short links: 10 August 2009

Propaganda, Computer Science, Web Science, CS History

by Nat Torkington (@gnat), comments: 0

  1. The Propaganda Newspapers -- London councils are increasingly providing their own newspapers, masquerading as mass-market popular appeal newspapers but without anything critical of the councils that produce them. This is an evolutionary dead-end for reinventing newspapers, and is why the non-profit/trust structure works so well.
  2. Time for Computer Science to Grow Up -- publish in journals so conferences can be community events. I've seen academics at Sci Foo look around at the unconference structure, or lightning talks, and say "why can't my normal conferences be like this?!", and not just in computer science. Science conferences need a heart transplant. (via David Pennock)
  3. Science Online 2010 -- conference on science and the Web. Our goal is to bring together scientists, physicians, patients, educators, students, publishers, editors, bloggers, journalists, writers, web developers, programmers and others to discuss, demonstrate and debate online strategies and tools for doing science, publishing science, teaching science, and promoting the public understanding of science. (via kubke on Twitter)
  4. E.W. Dijkstra Archive -- a collection of over 1,000 manuscripts that EWD sent around during his career. EWD 1036, "On the cruelty of really teaching computing science". "From a bit to a few hundred megabytes, from a microsecond to a half an hour of computing confronts us with completely baffling ratio of 10^9" (via S. Lott)

tags: education, events, history, newspapers, people, publishing, science, web
comments: 0

Tue, Aug 4, 2009

Four short links: 4 August 2009

NASA Cloudware, btrfs, eBook Editing, Exponential Death

by Nat Torkington (@gnat), comments: 1

  1. NASA Nebula Services/Platform Stack -- The NEBULA platform offers a turnkey Software-as-a-Service experience that can rapidly address the requirements of a large number of projects. However, each component of the NEBULA platform is also available individually; thus, NEBULA can also serve in Platform-as-a-Service or Infrastructure-as-a-Service capacities. Bundles RabbitMQ, Eucalyptus, LUSTRE storage, Fabric deployment, Varnish front-end, MySQL and more. (via Jim Stogdill)
  2. A Short History of btrfs -- Now for some personal predictions (based purely on public information - I don't have any insider knowledge). Btrfs will be the default file system on Linux within two years. Btrfs as a project won't (and can't, at this point) be canceled by Oracle. If all the intellectual property issues are worked out (a big if), ZFS will be ported to Linux, but it will have less than a few percent of the installed base of btrfs. Check back in two years and see if I got any of these predictions right!
  3. Sigil -- open source WYSIWYG eBook editor. (via liza on Twitter)
  4. Exponential Decay of Life -- This startling fact was first noticed by the British actuary Benjamin Gompertz in 1825 and is now called the “Gompertz Law of human mortality.” Your probability of dying during a given year doubles every 8 years. For me, a 25-year-old American, the probability of dying during the next year is a fairly miniscule 0.03% — about 1 in 3,000. When I’m 33 it will be about 1 in 1,500, when I’m 42 it will be about 1 in 750, and so on. (via Hacker News)
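
The doubling rule quoted in item 4 is easy to check with a little arithmetic. This is a minimal sketch that takes the quoted figures at face value (about 1 in 3,000 at age 25, doubling every 8 years); the function name and parameters are my own, and real mortality tables will differ.

```python
def yearly_death_odds(age, base_age=25, base_odds=3000, doubling_years=8):
    """Return N in "1 in N" odds of dying within a year at the given age.

    Defaults follow the quoted figures: about 1 in 3,000 at age 25,
    with the probability doubling every 8 years.
    """
    return base_odds / 2 ** ((age - base_age) / doubling_years)


for age in (25, 33, 41, 42):
    print(age, round(yearly_death_odds(age)))
```

Under these assumptions two doublings land on age 41 (1 in 750); at 42 the same formula gives roughly 1 in 690, so the quoted "about 1 in 750" is a rounding.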

tags: bio, cloud computing, data, ebooks, math, publishing, storage
comments: 1

Mon, Aug 3, 2009

Four short links: 3 August 2009

Mathematics Collaboration, Risk, Visualisation, and SemWeb

by Nat Torkington (@gnat), comments: 0

  1. Enabling Massively Parallel Mathematics Collaboration -- Jon Udell writes about Mike Adams whose WordPress plugin to grok LaTeX formatting of math has enabled a new scale of mathematics collaboration.
  2. 2845 Ways to Spin The Risk -- introduction to the ways in which our perception of risk (and numbers in general) can be distorted by how it is presented. (via titine on Twitter)
  3. Logstalgia -- OpenGL app to visualize Apache log files.
  4. 4Store -- "scalable RDF storage". 4store was designed by Steve Harris and developed at Garlik to underpin their Semantic Web applications. It has been providing the base platform for around 3 years. At times holding and running queries over databases of 15GT, supporting a Web application used by thousands of people. (via joshua on Delicious)

tags: brain, collaboration, crowdsourcing, database, math, publishing, semantic web, visualization
comments: 0

Wed, Jul 15, 2009

Bantamweight Publishing in an Easily Plagiarised World

by Mark Drapeau (@cheeky_geeky), comments: 10

Even professional writers are prone to infrequent accidental plagiarism. But in the world of novels, newspapers, and college exams, there are rules about bootlegging others’ work that are well-established - most everyone agrees on what behaviors are unacceptable and what the consequences are. In bantamweight publishing, however, the rules are not so clear.

In order for the British Army to raise more units during the First World War, it created battalions of otherwise healthy men with lowered minimum height requirements. In this way, short, powerful miners and similarly swarthy individuals were able to contribute to the war effort. These soldiers were called bantams (a term now heard most commonly in boxing, bantamweight). Similarly, in a Web 2.0 environment, the short powerful bursts of searchable, findable, and sharable data emitted from personal electronic devices are a form of bantamweight publishing in which persons outside the regulated publishing industry can contribute to the information sharing effort.

Bantamweight publishing comes in many forms. Twitter is certainly in this category, but there is a steadily increasing number of ways to share small bits of information with the world. From updating your Facebook Wall to Yammering inside your enterprise to updating your LinkedIn status to commenting on people’s BrightKite locations, everyone is doing it. But in an easily plagiarized world, who owns your sentences once you publish them? It’s not really clear. And in a murky environment where someone might get a macropublishing book deal by popularizing someone else’s creative hashtag, bantamweight publishing runs the risk of serious future problems.

Oh, bantamweight publishing has its customs. Self-policing crowds ensure that most people who lift someone else’s excellent quote or funny picture or news link give credit to the originator using the “retweet” (RT) convention followed by a username. But there is little downside to cheating relative to being expelled from college or fired from your newspaper. As is well known in animal behavior circles, it can be temporarily advantageous for cheaters to infiltrate a system like this.

To be sure, quoting someone’s original haiku verbatim and making it appear as if it were your own is an infraction of bantamweight publishing customs. But what if someone tweets an Abraham Lincoln quotation - must the re-tweeter cite the originator? The custom seems less pressing in this case, mainly because of a lack of intent to deceive and arguable "fair use" of a well-known statement by a famous person. One can imagine altruistic plagiarism as well, where people repeat memes to raise money for charity, or virally make people aware of an immediate Amber alert. Further, who could fault someone for copying information about a charity onto their Facebook Wall without citing the originator? In the bantamweight publishing world, information sharing can easily supersede attribution. There are gradations of citations.

Bantamweight publishing is popular among those who feel brevity is a virtue. But when an entire work of art is bounded in 140 characters, even brevity has its limits. Sometimes, squeezing in a proper attribution through editing content can change the original meaning, when the edits unwillingly shift from cosmetic to substantive. And what happens when you run out of space when attempting to retweet someone who retweeted someone who tweeted an important quotation from the Washington Post? To a large degree, a work of bantamweight publishing is like a painting with an upper weight limit, where the novelty is the canvas and the attribution is the frame; most viewers would choose to appreciate the canvas without the frame if given the hard choice.

Another major difference between regular publishing and bantamweight publishing is the lack of research and editing standards. Sometimes people attribute flawed information properly. It is obvious that excellent curators of information like NYU professor Jay Rosen and publisher Tim O’Reilly are exceptions to the rule, based simply on the phenomena of Rick Rolling, #moonfruit, and celebrity death hoaxes. To many, bantamweight publishing is not a micro-investigatory piece to be researched, sourced, edited, and spread, but rather a form of enhanced social chatter and gossip spreading. And according to the rules of gossip, it doesn’t really matter where it comes from; gossip is fun.

Few would argue that the British bantam units were a bad idea, and likewise bantamweight publishing has many virtues. But there are also pitfalls to this in an easily plagiarized world, particularly when money comes into play. Who’s looking out for the intellectual property of a winning hashtag that becomes a book, or a stream of haikus that becomes a blog that companies advertise on? At some point, bantamweight publishing will no longer be a lawless frontier territory; what will it look like next?

tags: emerging tech, publishing, twitter, web 2.0
comments: 10

Wed, Jul 1, 2009

Four short links: 1 July 2009

Web Awards, Speed Thrills, Magazines in the Cloud, Augmented Reality

by Nat Torkington (@gnat), comments: 0

  1. The Onyas -- New Zealand web design awards launch, from the people behind Webstock and Full Code Press. The name comes from "good on ya", the highest praise that traditionally taciturn New Zealanders are allowed by law to give.
  2. The Year of Business Metrics: Don't make your users run away! -- wrapup of the Velocity conference. AOL: Users who had a slower experience view far fewer pages. Some interesting notes on performance from a Google-Bing study: Notice that as the delays get longer the Time To Click increases at a more extreme rate (1000ms increases by 1900ms). The theory is that the user gets distracted and unengaged in the page. In other words, they've lost the user's full attention and have to get it back. [...] As much as five weeks later, some users, especially those who saw delays greater than 400ms, were still searching less than before. (via timoreilly on Twitter)
  3. Printcasting -- very simple content management system for print magazines that lets anyone start a magazine, add content, sign up contributors, sell ads, and go. Clever!
  4. Pachube Augmented Reality Hack -- sexy hack that pushes all my buttons: computer vision, Arduino, sensor network, ubiquitous computing, pervasive alternate reality cyborg villains with chalk designs hellbent on world domination and the enslavement of the human race to use as meatsack AA batteries for their sex toys. Okay, four out of five ain't bad. (via bruces on Twitter)

Pachube Augmented Reality Demo

tags: award, computer vision, hacks, performance, print on demand, publishing, sensor networks, velocity09, web
comments: 0

Wed, Jun 24, 2009

My 140conf Talk: Twitter as Publishing

by Tim O'Reilly (@timoreilly), comments: 6

I spoke at Jeff Pulver's 140conf a few weeks ago. My subject was the continuity of what I do, from publishing through conferences through my presence on Twitter. I tried to draw the connections, and to explain how "social media" means drawing from, curating, and amplifying the voices of a community. I suggest that the role of an editor and publisher is analogous to the role of a point guard in basketball, handing out "assists" and improving the performance of his or her teammates. After all, I point out, I couldn't possibly tweet enough to cover all the topics I am interested in. But by using my retweets to build the visibility of others, I can create and foster a community that cares about the ideas, trends, and people that I care about.

My talk starts about 1:40 into the video, after a few comments from Jeff Pulver, the conference organizer. I've provided a lightly edited and linkified transcript below, for those of you who don't have time to watch the entire 15 minute video. If you do have the time, you can watch the video from the entire two-day conference at http://www.140conf.com/watchit.

What I learned from Twitter

Hi. I want to talk to you a little bit about Twitter and media. I'm a publisher. I'm a publisher in print. And it turns out I'm also a publisher on Twitter. I want to explain the roots of media and how that connects with what we're doing in this newest form of media.

When you think about the original use case of Twitter, which @Leisa described so wonderfully as “ambient intimacy,” it's really news from your close friends. But it's news nonetheless. And sometimes the news from individuals becomes news that matters to a whole lot more people. When someone in Tehran today is reporting their personal news, it's news that matters to all of us. And so you can see the continuum between the personal and the international in those moments.

But that continuum exists all the time, and it's existed always in media.

(continue reading)

tags: 140conf, publishing, twitter
comments: 6

Sun, May 17, 2009

Scribd Store a Welcome Addition to Ebook Market (and 650 O'Reilly Titles Included)

by Andrew Savikas (@andrewsavikas), comments: 7

The document-sharing site Scribd has launched a new "Scribd Store" selling view and download access to documents and books. As part of the launch, there are now more than 650 O'Reilly ebooks available for preview and sale in the Scribd store, and all include DRM-free PDF downloads with purchase. (Scribd will soon be adding EPUB as a format, and we'll make that available as soon as possible.)


Many publishers (including O'Reilly) have kept Scribd at arm's length because the service was often used by people posting copyrighted material without permission. Though Scribd was reasonably responsive to takedown requests, that puts the onus for monitoring on the publisher, a whack-a-mole scenario that will consume as many resources as you throw at it if you let it. But Scribd has implemented a new system that uses the ebooks provided for sale to identify (and remove) any other unauthorized versions of that material, as well as prevent future unauthorized uploads. Like any technology it's far from perfect (for example, I suspect scanned images are more difficult to test than standard PDFs), but it's good enough for us to be comfortable participating, and is as good an example as any of turning lemons into lemonade.

For a publisher (and I use the term loosely) the terms for the Scribd store are impressive -- publishers set the sale price directly, and keep 80% of the revenue (compare that to Amazon's DTP program, where the standard terms are that Amazon gets to set the actual price, and the publisher only gets 35% of their "suggested" price). There's also an interesting "automated pricing" option in Scribd, which uses an (unspecified) algorithm to set the sale price. But the pieces of the Scribd store I'm most excited about are the real-time reporting (compared with a lag of a month or more with most ebook resellers, including Amazon), the option to easily provide free updates to existing content, and the variety of adjustable display options -- like preview amount, refreshingly optional DRM, and purchase-link images. Administering and understanding your sales in Scribd is downright delightful compared with the same for Kindle.

A service like Scribd further reduces the barriers to content creators interested in self-publishing digital material (and again offers much better terms than Amazon's DTP program for Kindle), so in some ways it is absolutely a threat to existing publishers. But we also view it as an opportunity to get our books in front of interested readers, and a promising sign that the market for ebooks is large enough to continue attracting startups like Scribd who bring needed diversity and competition among resellers.

tags: ebooks, media, new media, newspapers, publishing
comments: 7

Fri

May 15
2009

Nat Torkington

Four short links: 15 May 2009

Life After socket(), Imminent Death of Web 2.0, Breathalyzer Lameness, and Open Source Science Publishing

by Nat Torkington (@gnat), comments: 4

  1. Whither Sockets? -- ACM Queue article on how sockets as a model for network programming have become an obstacle to where networking is going. All of these calls have one thing in common: the calling program must repeatedly ask for data to be delivered. In the world of client/server computing these constant requests make perfect sense, because the server cannot do anything without a request from the client. It makes little sense for a print server to call a client unless the client has something it wishes to print. What, however, if the service being provided is music or video distribution? In a media distribution service there may be one or more sources of data and many listeners. For as long as the user is listening to or viewing the media, the most likely case is that the application will want whatever data has arrived. Specifically requesting new data is a waste of time and resources for the application. The sockets API does not provide the programmer a way in which to say, "Whenever there is data for me, call me to process it directly." (via Slashdot)
  2. Game Web 2.Over? (Meg Pickard) -- update of the classic "wall o' Web 2.0 logos" showing which have folded or been bought. I'm glad to see how many have folded; many were the inevitable "me too"ing of initial successes, and many were simply bad ideas. Death is a natural part of the Darwinian marketplace, painful as it is to those who are naturally selected out of the meme pool. I'm glad to see how many were acquired, showing they had something someone wanted. The diagram's incomplete now, of course: it doesn't show the companies launched after the wall o'logos was made. (via Waxy)
  3. Breathalyzer Source Code Sucks -- 2. Readings are Not Averaged Correctly: When the software takes a series of readings, it first averages the first two readings. Then, it averages the third reading with the average just computed. Then the fourth reading is averaged with the new average, and so on. There is no comment or note detailing a reason for this calculation, which would cause the first reading to have more weight than successive readings. Nonetheless, the comments say that the values should be averaged, and they are not... I periodically worry that I've been so long out of hardcore coding that my skills are rusty and I'd never survive at the coal face again. Then I see something like this and I punch the air and wheeze "I still got it!" as I reach for my cane. (via BoingBoing)
  4. Bloomsbury Science Free Online -- Sir John Sulston, Nobel prize winner and one of the architects of the Human Genome Project, has teamed up with Bloomsbury to edit a new series of books that will look at topics including the ethics of genetics and the cyber enhancement of humans. The series will be the first from Bloomsbury's new venture, Bloomsbury Academic, launched late last year as part of the publisher's post-Harry Potter reinvention. Using Creative Commons licences, the intention is for titles in the imprint to be available for free online for non-commercial use, with revenue to be generated from the hard copies that will be printed via print-on-demand and short-run printing technologies. (via Glyn Moody)
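
The averaging scheme described in item 3 is easy to check with a few lines of Python. This is only a sketch of the procedure as the quoted report describes it, not the device's actual code; the function names and the sample readings are made up for illustration.

```python
from fractions import Fraction

def successive_average(readings):
    """Average the first two readings, then average each later
    reading with the running result, as the report describes."""
    it = iter(readings)
    result = Fraction(next(it))
    for reading in it:
        result = (result + Fraction(reading)) / 2
    return result

def true_mean(readings):
    """Ordinary arithmetic mean: every reading has equal weight."""
    return sum(Fraction(r) for r in readings) / len(readings)

def successive_weights(n):
    """Weight each of n readings ends up with under successive averaging."""
    weights = [Fraction(1)]
    for _ in range(n - 1):
        # Each new reading halves all earlier weights and takes 1/2 itself.
        weights = [w / 2 for w in weights] + [Fraction(1, 2)]
    return weights

# Hypothetical readings in arbitrary integer units.
readings = [80, 80, 80, 120]
print(successive_average(readings))  # 100
print(true_mean(readings))           # 90
print(successive_weights(4))         # weights 1/8, 1/8, 1/4, 1/2
```

For four readings the weights work out to 1/8, 1/8, 1/4 and 1/2. Taken literally, then, the procedure gives the most recent reading the most weight, which looks like the reverse of what the quoted passage says. Either way, the result is not a plain average.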

tags: open source, programming, publishing, science, startups, web
comments: 4

Fri

Apr 17
2009

Pamela Samuelson

Legally Speaking: The Dead Souls of the Google Booksearch Settlement

by Pamela Samuelson, comments: 59

Guest blogger Pamela Samuelson is the Richard M. Sherman Distinguished Professor of Law and Information at the University of California, Berkeley, as well as a Director of the Berkeley Center for Law & Technology and an advisor to the Samuelson High Technology Law & Public Policy Clinic at Boalt Hall. She has written and spoken extensively about the challenges that new information technologies pose for traditional legal regimes, especially for intellectual property law.

This piece will appear in the July 2009 issue of Communications of the ACM. Readers may also be interested in the slides from Pam's recent presentation, "Reflections on the Google Book Search Settlement."

Google has scanned the texts of more than seven million books from major university research libraries for its Book Search initiative and processed the digitized copies to index their contents. Google allows users to download the entirety of these books if they are in the public domain (about 1 million of them are), but at this point makes available only “snippets” of relevant texts when the books are still in copyright unless the copyright owner has agreed to allow more to be displayed.

In the fall of 2005, the Authors Guild, which then had about 8000 members, and five publishers sued Google for copyright infringement. Google argued that its scanning, indexing, and snippet-providing was a fair and non-infringing use because it promoted wider public access to books and because Google would take out of the Book Search corpus any digitized books whose rights holders objected to their inclusion. Many copyright professionals expected the Authors Guild v. Google case to be the most important fair use case of the 21st century.

This column argues that the proposed settlement of this lawsuit is a privately negotiated compulsory license primarily designed to monetize millions of orphan works. It will benefit Google and certain authors and publishers, but it is questionable whether the authors of most books in the corpus (the “dead souls” to which the title refers) would agree that the settling authors and publishers will truly represent their interests when setting terms for access to the Book Search corpus.

Orphan Works

An estimated 70 per cent of the books in the Book Search repository are in-copyright, but out of print. Most of them are, for all practical purposes, “orphan works,” that is, works for which it is virtually impossible to locate the appropriate rights holders to ask for permission to digitize them.

A broad consensus exists about the desirability of making orphan works more widely available. Yet, without a safe harbor against possible infringement lawsuits, digitization projects pose significant copyright risks. Congress is considering legislation to lessen the risks of using orphan works, but it has yet to pass.

The proposed Book Search settlement agreement will solve the orphan works problem for books—at least for Google. Under this agreement, which must be approved by a federal court judge to become final, Google would get, among other things, a license to display up to 20 per cent of the contents of in-copyright out-of-print books, to run ads alongside these displays, and to sell access to the full texts of these books to institutional subscribers and to individual purchasers.

The Book Rights Registry

Approval of this settlement would establish a new collecting society, the Book Rights Registry (BRR), initially funded by Google with $34.5 million. The BRR will be responsible for allocating $45 million in settlement funds that Google is providing to compensate copyright owners for past uses of their books.

More important is Google’s commitment to pay the BRR 63 per cent of the revenues it makes from Book Search that are subject to sharing provisions. The revenue streams will come from ads appearing next to displays of in-copyright books in response to user queries and from individual purchases of and institutional subscriptions to some or all of the books in the corpus. Google and the BRR may also develop new business models over time that will be subject to similar sharing.

One of the main jobs of the BRR will be to distribute the settlement revenues. The money will go, less BRR’s costs, to authors and publishers who have registered their copyright claims with BRR. Although the settlement agreement extends only to books published prior to January 5, 2009, BRR is expected to attract authors and publishers of later-published books to participate in the revenue sharing arrangement that Google has negotiated with BRR.

(continue reading)

tags: copyright, google, policy, publishing
comments: 59

Tue

Mar 31
2009

Tim O'Reilly

What Publishers Need to Learn from Software Developers

by Tim O'Reilly (@timoreilly), comments: 29

There was a great exchange on the O'Reilly editors' backchannel the other day, so illuminating that I thought I should share it with the rest of you. We've been discussing the fast-track development we're using to produce The Twitter Book. (We're basically authoring the book as a presentation, after I realized how much more quickly I am able to put together a slide deck to make my points than I am a normal book. Twitter is also such a fast-moving topic that we need to be able to update the book every time we reprint it.)

Sarah Milstein wrote:

Apropos of everything, the NYT on publishers' speeding up the production process, especially with eBooks:
“If this book had gone through the normal publishing procedures,” Mr. Kiyosaki said, “it wouldn’t be worth writing.”
Andrew Savikas replied:
The more I think about it the more obvious it's becoming to me that the next generation of authoring/production tools will have much more in common with today's software development tools than with today's word processors.

Software developers spend enormous amounts of time creatively writing with text, editing, revising, refining multiple interconnected textual works -- and often doing so in a highly distributed way with many collaborators. Few writers or editors spend as much time as developers with text, and it only makes sense to apply the lessons developers have learned about managing collaborative writing and editing projects at scale.

'Nuff said. I await said next generation of authoring/production tools.

tags: publishing, tools, twitter
comments: 29

Mon

Feb 23
2009

Mike Shatzkin

Managing monopolies and dominance in the Net age

by Mike Shatzkin (@MikeShatzkin), comments: 11

Guest blogger Mike Shatzkin is Founder and CEO of The Idea Logical Company, where he has focused on supply chain and digital change issues since 1979. Mike has spoken at and organized publishing industry conferences all over the world. He recently launched The Shatzkin Files blog. One of Mike's several books, The Ballplayers, forms the core of BaseballLibrary.com.

Our thinking about "monopoly" may need to be recast in the Internet age. This is a complicated question to consider and we need to start gathering some good minds around it.

Network effects were noticed before there was an Internet. Both the phone company and the electric company were networks, and it became clear about a century ago that everything worked better for everybody if they WERE monopolies and everybody was hooked up to the same network, not competing ones. So phones and electricity became regulated monopolies, with prices and other behavior, including mandated service levels, controlled. Whether because of a changing ethos or because things became more complicated, or both, "competition" has been introduced in both spheres over the past two or three decades. With debatable results.

Amazon's dominance -- which is not a monopoly but which certainly looks like unassailable hegemony in the world of online bookselling -- can be largely attributed to brilliant execution and maintaining a tight focus on serving the customer. But part of their success at eliminating meaningful competition for online book sales has to do with the nature of the Internet. Online likes one winner in many spaces because it serves the users better NOT to fragment aggregations. If Amazon's reader reviews were spread over 1000 web sites, they wouldn't be as useful to the consumers. And their recommendation engine thrives on data; fewer customers would mean less helpful recommendations for those customers remaining, and the concentration at Amazon means less useful recommendations come from all their retailing competitors. This is an edge that may not stay with the retailer forever, though, because the playing field for information about books is being leveled by social networking sites. That's why Amazon is investing in them.

This tendency to concentration makes it urgent for publishers to get into niches and start trying to own them while they have legacy advantages. If the history of the Net so far is any guide, each information and interest niche will end up being owned by a very small number of players; often it will boil down to one. We seem to have been pretty fortunate with the dominant players (perhaps we should call them "monopoly threats") that have emerged so far, among them: Amazon, Google, eBay, Craigslist, Wikipedia, and a now-emerging Facebook. They've executed well and kept their eye on the stakeholders they serve. They, so far, have been more benign dominators than were Microsoft and AOL, two big winners on the previous go-round.

(continue reading)

tags: business, network effects, publishing, web 2.0
comments: 11

Mon

Feb 16
2009

Joshua-Michéle Ross

Radar Interview with Clay Shirky

by Joshua-Michéle Ross (@jmichele), comments: 3

Clay Shirky is one of the most incisive thinkers on technology and its effects on business and society. I had the pleasure to sit down with him after his keynote at the FASTForward '09 conference last week in Las Vegas.
In this interview Clay talks about:

  • The effects of low cost coordination and group action.
  • Where to find the next layer of value when many professions are being disrupted by the Internet.
  • The necessary role of low cost experimentation in finding new business models.


A big thanks to the FASTForward Blog team for hosting me there.

tags: clay shirky, future at work, innovation, journalism, publishing, social media
comments: 3

Mon

Feb 9
2009

Jim Stogdill

The Kindle and the End of the End of History

by Jim Stogdill (@jstogdill), comments: 24

This morning I was absentmindedly checking out the New York Times' bits blog coverage of the Kindle 2 launch and saw this:

“Our vision is every book, ever printed, in any language, all available in less than 60 seconds.”

It wasn't the main story for sure. It was buried in the piece like an afterthought, but it was the big news to me. It certainly falls into the category of big hairy audacious goal, and I think it's a lot more interesting than the device Bezos was there to launch (which still can't flatten a colorful maple leaf). I mean, he didn't say "every book in our inventory" or "every book in the catalogues of the major publishers that we work with." Or even, "every book that has already been digitized." He said "every book ever printed."

When I'm working I tend to write random notes to myself on 3x5 cards. Sometimes they get transcribed into Evernote, but all too often they just end up in piles. I read that quote and immediately started digging into the closest pile looking for a card I had just scribbled about an hour earlier.

I had been doing some research this morning and was reading a book published in 1915. It's long out of print, and may have only had one printing, but I know from contemporary news clippings found tucked in its pages that the author had been well known and somewhat controversial back in his day. Yet, Google had barely a hint that he ever existed. I fared even worse looking for other people referenced in the text. Frustrated, I grabbed a 3x5 card and scribbled:

"Google and the end of history... History is no longer a continuum. The pre-digital past doesn't exist, at least not unless I walk away from this computer, get all old school, and find an actual library."

My house is filled with books, it's ridiculous really. They are piled up everywhere. I buy a lot of old used books because I like to see how people lived and how they thought in other eras, and I guess I figure someday I'll find time to read them all. For me, it's often less about the facts they contain and more about peeking into alternative world views. Which is how I originally came upon the book I mentioned a moment ago.

The problem is that old books reference people and other stuff that a contemporary reader would have known immediately, but that are a mystery to me today - a mystery that needs solving if I want to understand what the author is trying to say, and to get that sense of how they saw the world. If you want to see what I mean, try reading Winston Churchill's Second World War series.

Churchill speaks conversationally about people, events, and publications that a London resident in 1950 would have been familiar with. However, without a ready reference to all those minutiae you'll have no idea what he's talking about. Unfortunately, a lot of the stuff he references is really obscure today and today's search engines are hit and miss with it - they only know what a modern Wikipedia editor or some other recent writer thinks is relevant today. Google is brilliant for things that have been invented or written about in the digital age, or that made enough of a splash in their day to still get digitized now, but the rest of it just doesn't exist. It's B.G. (before Google) or P.D. (pre digital) or something like that.

To cut to the chase, if you read old books you get a sense for how thin the searchable veneer of the web is on our world. The web's view of our world is temporally compressed, biased toward the recent, and even when it does look back through time to events memorable enough to have been digitally remembered, it sees them through our digital-age lens. They are being digitally remembered with our world view overlaid on top.

I posted some of these thoughts to the Radar backchannel list and Nat responded with his usual insight. He pointed out that cultural artifacts have always been divided into popular culture (on the tips of our tongues), cached culture (readily available in an encyclopedia or at the local library) and archived culture (gotta put on your researcher hat and dig, but you can find it in a research library somewhere). The implication is that it's no worse now because of the web.

I like that trichotomy, and of course Nat's right. It's not like the web is burying the archive any deeper. It's right there in the research library where it has always been. Besides, history never really operates as a continuum anyway. It's always been lumpy for a bunch of reasons. But as habit and convenience make us more and more reliant on the web, the off-the-web archive doesn't just seem hard to find, it becomes effectively invisible. In the A.G. era, the deep archive is looking more and more like those charts used by early explorers, with whole blank regions labeled "there be dragons".

So, back to Bezos's big goal... I'd love it to come true, because a comprehensive archive that is accessible in 60 seconds is an archive that is still part of history.

tags: big hairy audacious goals, emerging tech, publishing
comments: 24

Sat

Feb 7
2009

Michael Jon Jensen

For-Profit, Non-Profit, and Scary Humor

by Michael Jon Jensen, comments: 6

Guest blogger Michael Jon Jensen, Director of Strategic Web Communications for the Office of Communications of the National Academies and National Academies Press, has been at the interface between digital technologies and scholarly/academic publishing since the late 1980s.

Tim was kind enough to suggest that I expand on a longish comment I made on his recent post Stuff That Matters: Non-profit to For-profit.

Two threads wove my argument: first, I pushed back at his conventional framing of the non-profit vs. for-profit sectors. But what I think caught his attention most was my description of a project that's trying to "find the funny" in the grinding, slo-motion collapse of our natural world.

An easy knee-slapper, eh?

I'll get back to that second theme after some musings on non-profit vs. for-profit:

Tim: The heart of my message is that work on stuff that matters is a great hedge in down times: even if there isn't a huge monetary payoff, you've done something that needs doing. And it's certainly true that non-profit enterprises are often a good way to tackle hard problems that the marketplace doesn't seem to be addressing.

But I want to make clear that I'm not just talking about charity work. I'm talking about the creation of real economic value. There are huge opportunities for entrepreneurs in solving hard problems, and in so doing creating new markets that can be exploited not just by themselves but by those that follow in their footsteps.

I certainly can't disagree with most of that statement -- but we need to do better at clarifying the roles and mission-driven goals underlying the nonprofit and the for-profit worlds, especially on "stuff that matters."


Non-profits vs. For-profits

Tim comes to his benign perspective on the for-profit sector honestly: O'Reilly has historically been a responsible for-profit, building immense social value at the same time that it profits from its actions. But O'Reilly Media is a somewhat exceptional company.

In the main, the for-profit world has a different "maturation goal" than the non-profit world has, and it affects nearly every decision made in either kind of enterprise.

I heard my favorite summation of the distinction from Peter Likins at an Online Computer Library Center conference years ago. He was then President of the University of Arizona; I first used this quote more than a decade ago, in a presentation I gave entitled "Entrepreneurs of Social Value":

"A for-profit's mission is to create as much value for its stockholders as possible, within the constraints of society. The non-profit's mission is to create as much value for society as possible, within the constraints of its money."

Of course there are, as Tim mentions, great overlaps betwixt the two, and the more that the for-profit world addresses the "stuff that matters," the better. But quite frequently -- at least in publishing, and online, and in the "public good" sector -- when a for-profit takes advantage of that overlap, the pattern has been to decrease the public good.

Take a look at, for example, scientific publishing: in the post-WWII economy, most non-profit scientific journals were bought up by a handful of smart for-profit publishers who, over the following decades, began to ratchet up the prices far beyond what university libraries could afford, producing a dramatic shift in library resource use: an increasing share of nonprofit money went to for-profit scholarly publishing. One could argue that $50,000 a year is a fair price for a really important specialty journal, but it's not an argument that fits into the "stuff that matters" or "social value" meme.

In that instance, smart, rapacious for-profit cherry-picking decreased the means that nonprofit publishers had to fund their other, less profitable work in the humanities, the social sciences, or even the sciences themselves.

A for-profit takeover of formerly nonprofit work could also describe what has happened with Blackwater, and the privatization of the military in general -- higher costs, less accountability, and unintended consequences.

I've worked in nonprofit publishing for more than 20 years, and while I recognize the need for a risk-reward economy, some care needs to be taken to acknowledge that the "public good" rarely is profit-making. It can be sustainable, but is rarely super-profitable.


That said, over those 20+ years, I've always had side projects of some kind -- "stuff that matters" projects that I hoped would end up being profitable, or potentially commercial ones that might be fabulously so.

My hoped-for goals for those projects have changed over time, and recently shifted drastically. For the last 18 months my side project has been with my oldest, bestest friend -- a project which has changed my entire thinking on "what *really* matters," and what "breakthroughs" we need in the next decade -- from the Web 2.0 community, from myself, and from the world at large.

Yeah, it's time for phase II of this guest blog: about trying to turn the onrushing apocalypses into laughter -- or at least a knowing grin.

(continue reading)

tags: publishing, web 2.0
comments: 6