Entries tagged with “open source” from O'Reilly Radar

Fri

Nov 20
2009

Carl Malamud

Robots.Txt and the .Gov TLD

by Carl Malamud (@CarlMalamud) | comments: 5

I'm on the board of CommonCrawl.Org, a nonprofit corporation that is attempting to provide a web crawl for use by all. An interesting report just got sent to us about how robots.txt files, the mechanism defined by the Robots Exclusion Standard, are used within the .Gov Top Level Domain.

In examining about 32,000 subdomains in .gov, it turns out that at least 1,188 of these have a robots.txt file with a "global disallow," meaning robots are excluded from indexing this content. Even more curious, on 175 of these sites, while there is a global disallow, there is a specific bypass that allows the Googlebot to index the data. You can look at the raw data on Factual.
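To make the pattern concrete, here is a minimal sketch of how one might spot a "global disallow" with a Googlebot bypass using Python's standard `urllib.robotparser`. The sample file, the `classify` helper, and the "SomeOtherBot" user agent are made up for illustration; this is not the method used in the report.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt showing the pattern described above:
# every crawler is shut out, but Googlebot is let through.
SAMPLE_ROBOTS_TXT = """\
User-agent: Googlebot
Disallow:

User-agent: *
Disallow: /
"""


def classify(robots_txt, probe_url="/"):
    """Report whether a robots.txt blocks generic crawlers but admits Googlebot."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    # "SomeOtherBot" stands in for any crawler that is not named in the file.
    generic_allowed = parser.can_fetch("SomeOtherBot", probe_url)
    googlebot_allowed = parser.can_fetch("Googlebot", probe_url)
    return {
        "global_disallow": not generic_allowed,
        "googlebot_bypass": (not generic_allowed) and googlebot_allowed,
    }


print(classify(SAMPLE_ROBOTS_TXT))
```

Probing only the root URL is a simplification; a fuller survey would fetch each site's real robots.txt and test more than one path.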

At Public.Resource.Org, we've always felt that the government should use a robots.txt file only for purposes of security and integrity of the site, not because some webmaster arbitrarily decides they don't want to be indexed. Indeed, on several occasions we have deliberately ignored government-imposed robots.txt files because we felt this was an arbitrary and illegal attempt to keep the public out.

And, needless to say, it doesn't make any sense at all to let in some webcrawlers and not let in others. If this is a reaction to a security/integrity issue, such as limited capacity, the proper thing to do is include in the robots.txt file a comment that explains to other bots what is going on. For example, it could be perfectly reasonable for a government group faced with limited capacity to ask a robot to limit crawls to a certain number of queries per second and only whitelist crawlers that agree to that condition.
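As a rough illustration of the non-discriminatory approach, the sketch below shows a robots.txt that explains itself in a comment and asks every crawler to slow down, plus a check a well-behaved crawler might run before fetching. The file contents and the `crawl_policy` helper are hypothetical. Note also that `Crawl-delay` is a widely recognized but non-standard extension, and it expresses seconds between requests rather than queries per second.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: the comment tells bot operators why the limit exists,
# and the same rule applies to every crawler.
POLITE_ROBOTS_TXT = """\
# This server has limited capacity.
# All crawlers are welcome, but please wait between requests.
User-agent: *
Crawl-delay: 10
Disallow: /cgi-bin/
"""


def crawl_policy(robots_txt, user_agent, url):
    """Return (allowed, delay_in_seconds) for one crawler and one URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url), parser.crawl_delay(user_agent)


allowed, delay = crawl_policy(POLITE_ROBOTS_TXT, "AnyBot", "/reports/index.html")
print(allowed, delay)
```

A crawler that honors the result would sleep for `delay` seconds between requests and skip any URL for which `allowed` is false.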

Government webmasters should use the robots.txt file sparingly, and should do so in a non-discriminatory fashion.

tags: gov2.0, open source, search | comments: 5

Fri

Nov 20
2009

Nat Torkington

Four short links: 20 November 2009

Social Network Search for Morons, Bulking Up Bio Data, Better E-Mail, Better Standards

by Nat Torkington (@gnat) | comments: 1

  1. Spokeo -- abysmal indictment of society, first prize in mankind's race to the bottom. Uncover personal photos, videos, and secrets ... GUARANTEED! Spokeo deep searches within 48 major social networks to find truly mouth-watering news about friends and coworkers. PS, anybody who gives their gmail username and password to a site that specializes in dishing dirt can only be described as a fucking idiot. (via Jim Stogdill, who was equally disappointed in our species)
  2. Biologists rally to sequence 'neglected' microbes (Nature) -- The Genomic Encyclopedia of Bacteria and Archaea is a project to sequence genomes from more branches of the evolutionary tree of life. Eisen's team selected and sequenced more than 100 'neglected' species that lacked close relatives among the 1,000 genomes already in GenBank. The researchers reported earlier this year at the JGI's Fourth Annual User Meeting that even mapping the first 56 of these microbes' genomes increased the rate of discovery of new gene and protein families with new biological properties. It also improved the researchers' ability to predict the role of genes with unknown functions in already sequenced organisms. (via Jonathan Eisen)
  3. Mail Learning: The What and the How (Simon Cozens) -- a few things that a really good mail analysis tool needs to do. I hope that my mail client and server do these out of the box in the next five years.
  4. Introducing the Open Web Foundation Agreement -- The Open Web Foundation Agreement itself establishes the copyright and patent rights for a specification, ensuring that downstream consumers may freely implement and reuse the licensed specification without seeking further permission. In addition to the agreement itself, we also created an easy-to-read "Deed" that provides a high level overview of the agreement. Applying the open source approach to better standards.

tags: bio, data, email, genomics, idiots, opensource, search, social graph, social software, standards | comments: 1

Thu

Nov 19
2009

Nat Torkington

Four short links: 19 November 2009

Chumby One, Gorgeous IE Debugger, Freer Than Free, and Phone-a-Friend for Government IT

by Nat Torkington (@gnat) | comments: 0

  1. Chumby One (Bunnie Huang) -- new Chumby product released. In addition to being about half the price of the original chumby, the new device added some features: it has an FM radio, and it has support for a rechargeable lithium ion battery (although it’s not included with the device, you have to buy one and install it yourself). There’s also a knob so you can easily/quickly adjust the volume. But I don’t think those are really the significant new features. What really gets me excited about this one is that it’s much more hackable.
  2. Deep Tracing of Internet Explorer (John Resig) -- very sexy deep inspection of Internet Explorer. Finally, something IE does better than Firefox (other than exploits). dynaTrace Ajax works by sticking low-level instrumentation into Internet Explorer when it launches, capturing any activity that occurs - and I mean virtually any activity that you can imagine. (via Simon Willison)
  3. Less Than Free -- begins by talking about Google giving away turn-by-turn directions on Android, and then analyses Google's "less than free" business model: Additionally, because Google has created an open source version of Android, carriers believe they have an “out” if they part ways with Google in the future. I then asked my friend, “so why would they ever use the Google (non open source) license version.” Here was the big punch line - because Google will give you ad splits on search if you use that version! That’s right; Google will pay you to use their mobile OS. I like to call this the “less than free” business model. This is a remarkable card to play. Because of its dominance in search, Google has ad rates that blow away the competition. To compete at an equally “less than free” price point, Symbian or windows mobile would need to subsidize. Double ouch!!
  4. Expert Labs -- a new independent initiative to help policy makers in our government take advantage of the expertise of their fellow citizens. How does it work? Simple: 1. We ask policy makers what questions they need answered to make better decisions. 2. We help the technology community create the tools that will get those answers. 3. We prompt the scientific & research communities to provide the answers that will make our country run better. New non-profit from Anil Dash.

tags: android, business, free, google, gov2.0, hardware, idiots, opensource | comments: 0

Mon

Nov 16
2009

Nat Torkington

Turning Predictions into Opportunities

by Nat Torkington (@gnat) | comments: 7

The view from the eye of a recession isn't great. When companies are going bust, unemployment growing, and everyone's scouring their budgets for costs to cut, it can be hard to see opportunities. However, when Tim pointed to Stephen O'Grady's fine set of 2010 predictions I found myself popping with "oh, so naturally this will happen next ..." thoughts. Think of this as a glimpse of the blue sky after the economic funnelspout that's demolished our economy. (Continuations of the tornado metaphor with "being sucked into the cloud", or "trailer park economics", or "we're not in Kansas any more, Tantek" left as an exercise to the reader)

  1. As every cloud provider creates their own "open API" (itself a fraught term), look to see the rise of brokers who can migrate you from one cloud to another. Deltacloud is an early free project here from Red Hat, but there are many business opportunities waiting. It's possible that companies will pay for assurance (you've tested your migration tool, you know it works on corner cases), service vs product (they don't want tools to run, they want to pay you to install and maintain the tools accessible through a web console), or premium services so that you're a partner helping them get the most from the cloud and not simply a vendor.
  2. We're a long way from sated in the world of collaboration tools. The current rage is mail learning, applying machine learning techniques to email so as to better understand social networks and prioritise incoming email messages. These are largely server-based solutions because it's so hard to get access to the desktop/web clients. Should Google Mail create an app store environment with hooks into the backend, the game could be on for consumer plays around email analytics, prediction, and simply smarter behaviour (why does my email client still not tell me when I say "see attached" yet don't have an attachment in the message?).
  3. Beyond email, many interesting tools have sprung up around the Gov 2.0 space that have applicability within organisations. Yammer has done well to bring Twitter to large companies, but there are still opportunities around simple document markup and suggestion gathering and filtering. Solve a real problem and there's money waiting.
  4. Google's low overhead management is made possible by its automated intranet and the visibility into projects from public code repositories, public smoke builds, and public status blogs. The opportunities to sell this into large companies looking to be "more like Google" are huge.
  5. If Stephen's right that datasets are increasingly viewed as "serious, balance sheet-worthy assets" then the world is going to need some serious balance sheet-worthy help in valuing those assets.
  6. Big data is being democratised, but there's a lot of unmet need in businesses around data warehousing. The typical solution is to build a data warehouse team around a product like Oracle, but I've heard plenty of business people grizzling about the result. They want answers, they don't want the headaches and lag that a data warehouse involves. Big Data (or Cloud Analytics or whatever) may be the opportunity to figure out a new minimum viable product for these folks, and offer it without the "data warehouse" baggage. This might be back end, might be UIs, might be visualisation, but all of these have a lot of room for improvement.
  7. The proliferation of developer targets immediately makes me think of the early PC era. It makes sense to proliferate: let the most useful ("successful") bubble to the top and survive naturally. At this point in the evolution of the scaleout of massively multiplayer online programming languages, we don't know exactly what winning looks like: it's a big feedback loop between the people who build the programming languages and the people with problems to solve (there are always more of the latter than former) and each time we go around it we know more about what is and isn't useful in this brave new world of coding for other people's data centres. Opportunity? Join the mob and write your own programming language, or simply take your commercial opportunity for a spin around the many different languages out there and be the first in your niche to find a good fit between problem space and solution tool.
  8. Stephen's throwaway comment "I’ve never subscribed to the idea that only what can be measured can be managed - open source, in particular, belies that claim" seems like a thrown gauntlet on open source analytics. In particular, I suspect there's a tools opportunity around the nebulous "community manager" role that every company seems to need. It's part CRM, it's part developer tool, it's part tech support, and part camp mother. Usefully quantify aspects of open source development and help companies that are doing it to know how they're doing and what they could do better.
  9. Marketplaces are big in mobile, but I look to other areas as ripe for the picking. For example, if Google Apps are catching on in many companies then a plugin marketplace is a natural extension. It would build out the Apps suite faster than Google can, would enable the tight loop between demand and supply that will drive the product along, and make Google's offering very different from other parties. This is also true of Microsoft and others, but I feel like momentum is more with Google's product than the others. (A feature can push a leader further in front, but rarely helps a laggard leapfrog to the lead)
  10. Every marketplace thus far has been flawed. Apple's famously annoys many developers and blocks huge categories of product (the "don't be better than we are" rule, which is hard to justify as being in the customer's interest), but don't forget Palm's impedance mismatch with jwz's open source code. I think the final chapter on how marketplaces work is far from written.
  11. NoSQL tools remain in their infancy and so there are huge opportunities here. Identify a niche ("fast accurate and timely web metrics for decision-making"), a tool that can solve it (MongoDB), and build the deployment, scaling, administration, reporting tools so you can sell a complete package into that niche. Rinse, lather, repeat.
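The "see attached" complaint in item 2 is a small example of the smarter behaviour being asked for. A minimal sketch of such a check, using Python's standard `email` package, might look like the following. The phrase list and the `missing_attachment` helper are assumptions made for illustration, not how any particular mail client works.

```python
import re
from email.message import EmailMessage

# Hypothetical list of phrases that suggest the sender meant to attach something.
ATTACHMENT_HINT = re.compile(
    r"\b(?:see attached|attached is|attachments?)\b", re.IGNORECASE
)


def missing_attachment(msg):
    """Return True if the plain-text body mentions an attachment but none is present."""
    body = msg.get_body(preferencelist=("plain",))
    text = body.get_content() if body is not None else ""
    has_attachment = any(True for _ in msg.iter_attachments())
    return bool(ATTACHMENT_HINT.search(text)) and not has_attachment


draft = EmailMessage()
draft.set_content("Hi all, please see attached for the Q4 numbers.")
print(missing_attachment(draft))
```

A real client would run a check like this at send time and prompt the user; phrase matching is crude, which is part of why a learning approach is attractive here.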

tags: business, cloud computing, nosql, opensource | comments: 7

Fri

Nov 13
2009

Nat Torkington

Four short links: 13 November 2009

Open Source Design, Interesting NoSQL Use, Copyright Documentary, Location Intelligence

by Nat Torkington (@gnat) | comments: 1

  1. Open Source Enters The World of Atoms -- an academic statistical analysis of open design. We indicated that, in open design communities, tangible objects can be developed in very similar fashion to software; one could even say that people treat a design as source code to a physical object and change the object via changing the source.
  2. Why I Like Redis (Simon Willison) -- coherent explanation of why Simon likes and uses a particular nosql system. I can run a long running batch job in one Python interpreter (say loading a few million lines of CSV in to a Redis key/value lookup table) and run another interpreter to play with the data that’s already been collected, even as the first process is streaming data in. I can quit and restart my interpreters without losing any data. And because Redis semantics map closely to Python native data types, I don’t have to think for more than a few seconds about how I’m going to represent my data.
  3. © kiwiright (Vimeo) -- short documentary about copyright, made to raise awareness of the issues in New Zealand. (just as applicable to the rest of the world)
  4. Your Movements Speak For Themselves (Jeff Jonas) -- Mobile devices in America are generating something like 600 billion geo-spatially tagged transactions per day. Every call, text message, email and data transfer handled by your mobile device creates a transaction with your space-time coordinate (to roughly 60 meters accuracy if there are three cell towers in range), whether you have GPS or not. Got a Blackberry? Every few minutes, it sends a heartbeat, creating a transaction whether you are using the phone or not. If the device is GPS-enabled and you’re using a location-based service your location is accurate to somewhere between 10 and 30 meters. Using Wi-Fi? It is accurate below 10 meters. A thought-provoking roundup of the information leakage with modern locative systems. (via TomC on Twitter)

tags: collective intelligence, copyright, data mining, design, geo, location, nosql, open source | comments: 1

Thu

Nov 12
2009

Nat Torkington

Four short links: 12 November 2009

CRM on Rails, Data Mining on Hadoop, Disappointing Keynotes, The Teapot Effect

by Nat Torkington (@gnat) | comments: 1

  1. Fat Free CRM -- open source (Affero GPL) Ruby on Rails CRM system.
  2. Bixo -- open source data mining toolkit that runs as a series of pipes on top of Hadoop. Built on the Cascading workflow system for Hadoop, which hides MapReduce. (via kdnuggets)
  3. Andy Kessler's Keynote at Defrag Stank (Pete Warden) -- I'm sorry to hear it, because I loved Andy's book How We Got Here about the intersecting histories of economics, finance, and technology. Read the book instead of reading about the disappointing keynote.
  4. The Teapot Effect -- the thing I love about geeks is how their passion causes them to explore, ruthlessly and quantitatively, the everyday phenomena that the rest of us take for granted. Such as dribbling teapots: “Previous studies have shown that dribbling is the result of flow separation where the layer of fluid closest to the boundary becomes detached from it. When that happens, the fluid flows smoothly over the lip. But as the flow rate decreases, the boundary layer re-attaches to the surface causing dribbling.” Read the post and the research it talks about to learn how to prevent Dribbling Teapot Syndrome ....

tags: CRM, data mining, economics, finance, hadoop, history, open source, rails, research, science | comments: 1

Wed

Nov 11
2009

Nat Torkington

Four short links: 11 November 2009

Participation Tools, Open Data Requests, Go Programming Language, Why Open Source is Better

by Nat Torkington (@gnat) | comments: 0

  1. ParticipateDB -- database of online tools for public participation. Closed alpha now, with 32 tools and 15 projects in the database. (via Sara Winge)
  2. DataTO -- like data.gov, but it's where users request data sets. (In this case, from the Toronto municipal government)
  3. Go -- new language from Bell Labs and Unix central figures Rob Pike and Ken Thompson, who now work at Google. Bits of C, bits of Google, it compiles to native binaries and runs nearly as fast as C. Built with concurrency and memory management as central features. Not used in production at Google yet, but grew from a 20% project to something worthy of public release.
  4. On Commit Bits (Jacob Kaplan-Moss) -- that day-one-commit-bit is one of the starkest differences between the corporate and the open source development model. [...] Granted, Django’s very conservative when it comes to granting that commit bit, but I’m not aware of a single open source project under the sun that’d give out a commit bit on a contributor’s first day. I’ve seen developers who’ve been hired to work full time on open source work for months without commit access to the project they’re paid to develop! One of several posts that Jacob's made about why open source makes for (on average) better software.

tags: gov2.0, language, multicore, open data, open source, programming, social software | comments: 0

Sun

Nov 8
2009

Carl Malamud

Unlikely Group Working Happily Together To Solve Patent Problem

by Carl Malamud (@CarlMalamud) | comments: 4

People following the issue of open sourcing the U.S. Patent Database might have been surprised to read an announcement in the official business opportunities web site of the U.S. Government: Synopsis for Public Data Dissemination Sole Source Contract to Google, Inc.

While the first reaction of many might be "OMG, WTF, how could they," this is actually good news, with an unlikely cast of characters working together including Google, Intellectual Ventures, and the Internet Archive.

In September, the Patent Office announced a rather strange "Request for Information" (RFI). Under this proposed scheme, the Patent Office would receive a substantial (upwards of $10 million!) donation of equipment from a vendor. In return, the vendor would get to be the official distributor of the patent database to the public, and would get to sell "value-added products." Among other things, the vendor would get access to the patents before the public does, allowing them to mine the database, and would be allowed to sell a variety of bulk products.

While the RFI makes a nod to public access, like all these Zero-Dollar deals the government cuts, there would be a lot of limits on what is "public" data as the vendor tries to recoup their investment by selling the so-called "value-added" products. Readers may remember a similar fiasco with the Government Accountability Office where the Federal Legislative Histories were given away to Thomson West and now even the U.S. Congress has to pay to access this material.

The patent database is no ordinary database. This is the only database specifically called out in the U.S. Constitution as being the responsibility of the U.S. Executive Branch to run! A lot of people think this Zero-Dollar deal the Patent Office is contemplating kind of stinks, and I'm really pleased to announce that a broad coalition has come together to make this data more broadly available immediately:

  • Intellectual Ventures, the IP group founded by Nathan Myhrvold, is donating several terabytes of the back file to Public.Resource.Org, the Internet Archive, and a variety of other groups to make available to everybody.
  • Google asked for permission to crawl the public application system (known as "PAIR"). The announcement by the Patent Office of a "sole source contract to Google" was the government's way of saying we have permission to crawl their system and bypass the CAPTCHAs. This is good news, because the PAIR system contains the "binders," which is all the material that supplements the basic applications and grants.
  • The Internet Archive has set aside a boatload of disk drives to serve this data. In addition, Public.Resource.Org will provide the usual rsync and FTP, and we expect a variety of other groups to provide mirrors both for bulk access and end-user systems.

It goes without saying that Google, the Internet Archive, and Intellectual Ventures are 3 groups that don't often work together, and I think this illustrates the compelling public interest in making the patent database more broadly available. We announced this Section 8 Task Force in a letter to Congressman Mike Honda. And, we also sent in a FOIA request to the Patent Office, putting them on notice that we expect any responses to their RFI $0 boondoggle to be made available to the public, as required by law.

In the long term, the Patent Office just needs to fix their system instead of resorting to silly $0 deals. They have 600 staff in Information Technology and spend hundreds of millions of dollars. Surely, they can find a way to serve the public as part of that? Putting a lien on the Patent database in return for $10 million in hardware instead of fixing their '70s-era mainframes just doesn't make sense.

In the meantime, we should have the first 8 terabytes of data up pretty soon. Those interested in learning more about the issue are urged to consult the paper trail on our PTO page which includes letters to and from Congress, and pointers to the Patent Office procurement docs.

tags: gov2.0, open data, open source | comments: 4

Wed

Nov 4
2009

Nat Torkington

Four short links: 4 November 2009

Electronics Hacking FAQs, Speech-To-Text Democracy, Open Source Column Database, Massive Online Analysis

by Nat Torkington (@gnat) | comments: 1

  1. ChipHacker -- collaborative FAQ site for electronics hacking. Based on the same StackExchange software as RedMonk's FOSS FAQ for open source software.
  2. Democracy Live -- BBC launch searchable coverage of parliamentary discussion, using speech-to-text. One aspect we're particularly proud of is that we've managed to deliver good results for speech-to-text in Welsh, which, we're told, is unique. I think of this as the start of a They Work For You for video coverage. I'd love to be able to scale this to local government coverage, which is disappearing as local newspapers turn into delivery mechanisms for real estate advertisements.
  3. InfiniDB: Open Source Column Database -- hooks into MySQL, uses MySQL for SQL parsing, security, etc. The commercial enterprise version has multi-server support (parallel scale-out). (via Brian Aker)
  4. Massive Online Analysis -- MOA is a framework for data stream mining. Includes tools for evaluation and a collection of machine learning algorithms. Related to the WEKA project, also written in Java, while scaling to more demanding problems. (via joshua on Delicious)

tags: big data, collective intelligence, databases, democracy, gov2.0, hardware, maker, open source | comments: 1

Tue

Oct 27
2009

Jim Stogdill

Defense Department Releases Open Source Memo

by Jim Stogdill (@jstogdill) | comments: 11

I've been holding my breath for so long waiting for this memo that I may not remember how to start breathing again, but here it is. The Department of Defense Deputy CIO Dave Wennergren has signed and released "Clarifying Guidance on Open Source Software."


Written primarily by my friend Dan Risacher at the Office of the Secretary of Defense, the memo is intended to clear up common misconceptions and make it easier for DoD program managers to include OSS in their programs. Its goals are to improve agility, eliminate lock-in, and reduce cost.

One of the memo's key points comes from Dave Wheeler at IDA - OSS is considered "Commercial Off the Shelf" software as far as DoD acquisition rules are concerned and therefore OSS must be considered on an equal footing by law whenever a program is doing market research prior to technology selection.

Some will argue that it doesn't go far enough by only encouraging and not demanding the use of OSS on government programs (I certainly have some sympathy for that point of view) but my hope is that this will at least provide some counter to the FUD machine - you know who you are - and keep moving OSS in defense toward a tipping point of acceptance.

By the way, if you are interested in open source in government and are in or near DC, make sure you check out GOSCON next Thursday, Nov 5. Dave Wennergren will be giving the breakfast keynote and you can ask him all about this memo.

tags: defense, opensource | comments: 11

Tue

Oct 27
2009

Nat Torkington

Four short links: 27 October 2009

Digital Art Programming, DIY Construction Set, Open Source Pedant, Design Principles

by Nat Torkington (@gnat) | comments: 1

  1. Field -- a development environment for "experimental code" and digital art. We think that, for many uses, Field is a better Processing than Processing. Includes Python and Java bridges; the goal is to connect to as many different programming systems as possible. OS X only at the moment.
  2. Contraptor -- a DIY open source construction set for experimental personal fabrication, desktop manufacturing, prototyping and bootstrapping. (via Hacker News)
  3. After The Deadline -- open source contextual spelling and grammar checker. (via Hacker News)
  4. Design Principles to Choose the Right Ideas -- Often people ask me how we know which ideas to choose from all the hundreds of ideas we’ve generated during brainstorm sessions. Apart from our gut feelings and experience there’s a method that could help us decide: define design principles. Interesting for the different sets of design principles used by Google and Microsoft teams. (via egoodman on Delicious)

tags: art, design, diy, hardware, language, open source, processing, programming | comments: 1

Mon

Oct 26
2009

Nat Torkington

Four short links: 26 October 2009

Data Exploration, Evidence-Based Coding, API to the English Language, Dual Licensing

by Nat Torkington (@gnat) | comments: 4

  1. Toiling in the Data Mines -- Tom Armitage describes the process that Berg calls "material exploration". Programmers very rarely talk about what their work feels like to do, and that's a shame. Material explorations are something I've really only done since I've joined BERG, and both times have felt very similar - in that they were very, very different to writing production code for an understood product. They demand code to be used as a sculpting tool, rather than as an engineering material, and I wanted to explain the knock-on effects of that: not just in terms of what I do, and the kind of code that's appropriate for that, but also in terms of how I feel as I work on these explorations. Even if the section on the code itself feels foreign, I hope that the explanation of what it feels like is understandable.
  2. Bits of Evidence -- Slides for a talk, "What we actually know about software development and why we believe it is true". (via Simon Willison)
  3. Wordnik API -- definitions, frequencies, examples APIs. See the announcement from the Web 2.0 Summit.
  4. The Peculiar Institution of Dual Licensing -- Brian Aker eloquently describes why he feels that dual licensing is anti-open source. Brian obviously has considerable experience informing this opinion--his years as Director of Technology for MySQL.

tags: apis, business, data mining, language, mysql, open source, programming, science | comments: 4

Sun

Oct 25
2009

Tim O'Reilly

Thoughts on the Whitehouse.gov switch to Drupal

by Tim O'Reilly (@timoreilly) | comments: 43

Yesterday, the new media team at the White House announced via the Associated Press that whitehouse.gov is now running on Drupal, the open source content management system. That Drupal implementation is in turn running on a Red Hat Linux system with Apache, MySQL and the rest of the LAMP stack. Apache Solr is the new White House search engine.

This move is obviously a big win for open source. As John Scott of Open Source for America (a group advocating open source adoption by government, to which I am an advisor) noted in an email to me: "This is great news not only for the use of open source software, but the validation of the open source development model. The White House's adoption of community-based software provides a great example for the rest of the government to follow."

John is right. While open source is already widespread throughout the government, its adoption by the White House will almost certainly give permission for much wider uptake.

Particularly telling are the reasons that the White House made the switch. According to the AP article:

White House officials described the change as similar to rebuilding the foundation of a building without changing the street-level appearance of the facade. It was expected to make the White House site more secure - and the same could be true for other administration sites in the future....

Having the public write code may seem like a security risk, but it's just the opposite, experts inside and outside the government argued. Because programmers collaborate to find errors or opportunities to exploit Web code, the final product is therefore more secure.

More than just security, though, the White House saw the opportunity to increase their flexibility. Drupal has a huge library of user-contributed modules that will provide functionality the White House can use to expand its social media capabilities, with everything from super-scalable live chats to multi-lingual support. In many ways, this is the complement to the Government as Platform mantra I've been chanting in Washington. When you build a vibrant, extensible platform, others add value to the foundation you establish; when you join such a platform, you get the benefit of all those features you didn't have to develop yourself.

Of course, it's easy to imagine that the use of open source software will slash the government's IT budget. After all, this software is freely downloadable. I have a feeling it's quite a bit more complicated than that.

First off, government has a huge number of special requirements (remember the flap over President Obama's BlackBerry?). Second, don't underestimate the difficulty of doing business in Washington. Procurement is done through a complex ballet understood by few open source companies. Third, a big IT deployment like this requires coordination between many companies, each providing a piece of the puzzle. According to techpresident.com, no fewer than five firms were involved in the switch: prime contractor General Dynamics Information Systems, Drupal specialists Phase 2 and Acquia, hosting provider Terremark, and CDN-supplier Akamai. (Disclosure: O'Reilly AlphaTech Ventures is an investor in Acquia.)

The special nature of the government marketplace is one of the reasons why I launched the Gov 2.0 Expo, which will be held in Washington DC next May. There are huge opportunities for open source, web 2.0, and new media companies in government, but there are also challenges reaching that market. One of my goals for the event is to increase the visibility of cutting edge technology firms not just to government agencies, but also to the prime contractors who are putting together these complex procurements.

The net-net is that I suspect that simply using open source software won't slash government IT budgets, at least not right away. What it will do is increase the amount of value we get for our money and the speed with which new technology can be adopted. Features that would have cost millions of dollars and years of development to add will now be rolled into the scope of current contracts.

It's also important to realize that using open source is very different from contributing to open source. Despite the exaggerated claims in the AP story, that "the programming language is written in public view, available for public use and able for people to edit", the White House has not yet released any of the modifications they made to Drupal or its operating environment back to the open source community. The source code for Drupal (and the rest of the LAMP stack) is indeed available, but the modifications that were made to meet government security, scalability, and hosting requirements have not yet been shared. In my conversations with the new media team at the White House, it is clear that they are exploring this option.

Giving modifications back to the Drupal community is the next breakthrough announcement that I'll be looking for.

Releasing code is more than just being a good open source community citizen, though. Code sharing is a major cost-saving opportunity for government. There are countless government agencies at the federal level, not to mention at the state and local level, that perform similar functions. Yet each of them does its own development, driving up costs. Federal CIO Vivek Kundra has made a great step forward in web services by creating data.gov. I'm eager to see an analogous code.gov portal for government agencies to share their open source software code.

tags: drupal, gov2.0, opensource, whitehouse | comments: 43

 

Fri

Oct 23
2009

Nat Torkington

Four short links: 23 October 2009

Beautiful Information, Teen Game Designer, Creative Science Writing, Open Source Schools

by Nat Torkington (@gnat) | comments: 0

  1. Information is Beautiful -- gorgeous descriptions of the design of infographics. For once, a design discussion that might be useful to mere mortals like me.
  2. Australian Teen Crafts "Sneaky" Games -- video interview with a 16-year-old winner of the IFTF, Sun, and BoingBoing Digital Open. Great to see game design, a topic we've followed on Radar, getting uptake by the people about to enter the workforce. "I love index cards," says Harry, "And I was thinking -- hmm, how can I incorporate them into a project?" So he designed and printed these game cards, and "spread the seeds of sneakiness and espionage" into the unsuspecting pockets, math books, binders and bags and jackets of his schoolmates. (via BoingBoing)
  3. Science Writing Shortlist -- the Manhire Prize is New Zealand's most prestigious award for creative science writing. The shortlisted entries are available via this link, and make for enlightening reading. Interestingly, there are two prizes awarded: one for fiction and another for non-fiction; New Zealand has a tradition of encouraging interaction between the arts and sciences.
  4. Fedena -- an open source school management system, built in India, using Ruby on Rails. (via Brenda Wallace)

tags: design, education, games, open source, science, visualization | comments: 0

 

Tue

Oct 13
2009

Nat Torkington

Four short links: 13 October 2009

Open Source, Gov 2.0, Gaming, Education

by Nat Torkington (@gnat) | comments: 0

  1. Our Open Source School -- blog of Albany Senior High School in New Zealand, which only runs open source software.
  2. Behind The Scenes at What Do They Know -- interesting post showing details behind the What Do They Know web site. "In the last year there have been only seven significant cases where requests have been hidden from public view on the site due to concerns relating to potential libel and defamation. Three of those cases have involved groups of twenty or so requests made by the same one or two users. While actual number of requests we have had to hide is around 70 (0.4% of the total) even this small fraction overstates the situation due to the repetition of the same potentially libelous accusations comments in different requests. In all cases we have kept as much information up on the site as possible. Our policy with respect to all requests to remove information from the site is that we only take down information in exceptional circumstances; generally only when the law requires us to do so."
  3. The Complete History of Lemmings -- a must-read for videogamers from the early 90s. "There's been much debate over the choice of colours as well, but the colours were selected, not because they were the easiest to choose, but because of the PC EGA palette. With the limited choice, it was decided the green hair was nicer than blue, and with that, the final Lemming was born. I was actually the next person to code up a demo on the Commodore 64, but I only got so far as having a single Lemming walking over the landscape before Dave put me onto another project."
  4. Google Replaces TeleAtlas -- "Tele Atlas confirms that Google has decided to stop using Tele Atlas map data for the U.S. Google will now use its own map data. Our relationship with Google for map coverage continues outside of the U.S. in dozens of geographies."

tags: education, gaming, geo, google, gov2.0, mapping, opensource, retro, teleatlas | comments: 0

 

Fri

Oct 9
2009

Nat Torkington

Four short links: 9 October 2009

Negative Karma, Wal-Mart TQI, Idiot Airlines, and Native iPhone Apps in Lua

by Nat Torkington (@gnat) | comments: 2

  1. Don't Display Negative Karma -- A fascinating insight for those building social software, whether for collective intelligence or otherwise: "There can be no negative public karma--at least for establishing the trustworthiness of active users. A bad enough public score will simply lead to that user's abandoning the account and starting a new one, a process we call karma bankruptcy. This setup defeats the primary goal of karma--to publicly identify bad actors. Assuming that a karma starts at zero for a brand-new user that an application has no information about, it can never go below zero, since karma bankruptcy resets it. Just look at the record of eBay sellers with more than three red stars--you'll see that most haven't sold anything in months or years, either because the sellers quit or they're now doing business under different account names." (I love finding articles like this, thinking "they should write a book for us!" and then realizing "oh, they already are!") (via Hacker News)
  2. Information Wants to be Free, Even At Wal-Mart (Pete Warden) -- an interesting piece on the value of opening up data, sharing information in negotiations so the best outcome can be reached. I'd argue that this trust argument is usually a cop-out, hiding worries about turf and control. In most cases it's clear that it's not in the other party's best interest to screw you over, and if it is, why are you dealing with them at all? The worst cases I saw were between departments within the same company, often we shared more information with competitors than the guys down the hall. The other reason I see people not sharing is shame: many companies (and individuals) work hard to present a facade of competence and quality that facts belie.
  3. The Forest, The Trees, and the Bag Fees -- "The bean counters can't track the revenue dilution of all these new fees. They don't want to. We miss the forest for the goddamned trees all the time. And the CEO acts as if fees are found cash. Meanwhile, no one asks why our overall revenue is plunging and we're losing money quarter after quarter. Everyone acts as if one thing has nothing to do with the other." A reminder to watch the important numbers, e.g. cash in bank, profit, customer satisfaction. (via Bryan O'Sullivan)
  4. Native iPhone Apps Written in Lua -- open source port of Lua with Cocoa bindings for the iPhone. This is a tutorial showing you how to install and get past Hello, World. Apple have already approved one app written using it.
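
The karma point in item 1 is easy to see in a few lines of code. This is only a rough sketch, not anything from the linked article: the `KarmaAccount` class, its field names, and the idea of keeping a signed private tally alongside the displayed score are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class KarmaAccount:
    """Hypothetical account record; every name here is illustrative."""

    # Assumption: the application keeps the full signed tally internally.
    private_score: int = 0

    def record(self, delta: int) -> None:
        """Apply a positive or negative rating to the internal tally."""
        self.private_score += delta

    @property
    def public_karma(self) -> int:
        # The displayed value is floored at zero: a score lower than a
        # brand-new account's would only invite "karma bankruptcy".
        return max(0, self.private_score)


veteran = KarmaAccount()
veteran.record(-5)          # badly rated user
newcomer = KarmaAccount()   # what that user gets by starting over

# Publicly, the two are indistinguishable, so there is nothing to escape.
same_public_score = veteran.public_karma == newcomer.public_karma
```

Flooring the displayed value at the new-account baseline is the whole trick: abandoning the account gains the user nothing that the display had not already given them.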

tags: business, collective intelligence, iphone app, lua, open data, opensource, programming, social software | comments: 2

 

Wed

Oct 7
2009

Nat Torkington

Four short links: 7 October 2009

Ongoing Palm Fail, YouTube Numbers, Plugin Patent Pain, Bivalve-Oriented Architecture

by Nat Torkington (@gnat) | comments: 1

  1. Followup to jwz's Palm App Store Fiasco -- redux: still nothing concrete from Palm, but they're saying they'll create a second-rate app store into which open source apps will go (along with apps that Palm hasn't reviewed).
  2. Schmidt on YouTube -- the interesting bit for me was: "Every minute, more than 10 hours of video is uploaded to the site."
  3. Company that won $585M from Microsoft sues Apple, Google -- "The infamous '906 patent granted to Eolas and the University of California was one of the first patents to get the young online tech scene going in 1998. The patent addresses third-party browser plug-ins to run various forms of media as an 'embedded program object', essentially a program that runs within another program. Eolas promptly sued Microsoft for its implementation of ActiveX in Internet Explorer, which set in motion a years-long legal battle between the two companies." Eolas won $585M, and now they're suing many large Internet companies. (via Hacker News)
  4. IBM Uses Mussels as Sensor Network -- "Concerned with the environmental and revenue impacts of leaks during oil drilling, StatOil sought an innovative and automated way to detect leaks. They wanted to replace a manual process that included deep sea divers. StatOil’s innovation: they attached RFID tags to the shells of blue mussels. When the blue mussels sense an oil leak, they close, which prompts the RFID tags to emit closure events. In response to the events, the drilling line is automatically stopped. And, in case you are wondering, this is of no harm to the blue mussels." (via monkchips on Twitter)
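
The mussel story in item 4 is, underneath, a plain event-driven shutdown: a sensor emits an event, a handler stops the line. Here is a toy sketch of that shape. Nothing in it comes from IBM or StatOil; `DrillingLine`, `MusselMonitor`, the tag IDs, and especially the `closures_to_stop` threshold are invented for illustration.

```python
from dataclasses import dataclass, field
from typing import Set


@dataclass
class DrillingLine:
    """Hypothetical stand-in for whatever actually controls the line."""

    running: bool = True

    def stop(self) -> None:
        self.running = False


@dataclass
class MusselMonitor:
    """Hypothetical handler for RFID closure events."""

    line: DrillingLine
    # Assumption: require several distinct mussels to close before
    # stopping, so one jumpy mussel does not halt production.
    closures_to_stop: int = 1
    closed_tags: Set[str] = field(default_factory=set)

    def on_closure_event(self, tag_id: str) -> None:
        # Count each tag once, however many events it emits.
        self.closed_tags.add(tag_id)
        if len(self.closed_tags) >= self.closures_to_stop:
            self.line.stop()


line = DrillingLine()
monitor = MusselMonitor(line, closures_to_stop=2)
monitor.on_closure_event("mussel-a")
monitor.on_closure_event("mussel-a")   # repeat from the same tag
still_running_after_one_mussel = line.running
monitor.on_closure_event("mussel-b")   # second distinct mussel
stopped_after_two_mussels = not line.running
```

Tracking distinct tag IDs rather than raw event counts is a design guess on my part; the quoted description says only that closure events stop the line.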

tags: app store, google, open source, palm, patent, sensor networks, web, youtube | comments: 1

 

Tue

Oct 6
2009

Carl Malamud

Questions (and Answers!) About the Federal Register

by Carl Malamud (@CarlMalamud) | comments: 3

When the White House retweets Cory Doctorow, you know something unusual has happened. As many of you saw, the Office of the Federal Register announced that source code for the Federal Register is now available in bulk—for free—and has been converted to XML. Ed Felten's shop at Princeton created a site called fedthread.org to see what you can do with the data and Public.Resource.Org helped the Government Printing Office in testing early stages of the XML work.

All in all, a nice piece of public-private cooperation and an important step towards open source America's operating system, and I figured that was the end of that. So, imagine my surprise when I got a call from the White House saying they were making Raymond Mosley, Director of the Office of the Federal Register (OFR), and Michael L. Wash, the Chief Information Officer of the Government Printing Office (GPO), available just in case there were any technical questions from the net.

I gathered questions from a variety of sources, including online discussion groups and Twitter, and have been doing email back and forth with both Ray and Mike. Hope this is useful (it certainly has been fun to do)!

(continue reading)

tags: gov20, open government, open source | comments: 3

 

Mon

Oct 5
2009

Nat Torkington

Four short links: 5 October 2009

Bozo Cloud Talk, Annotation Fail(ish), Python MySQL Slash, and Infinite Books

by Nat Torkington (@gnat) | comments: 2

  1. Brown Cloud Marketing -- advertorial "interviewing" GM of a company offering "DNS in the cloud". This might be a worthwhile service, but the way he markets it (by saying open source is "freeware" and the market leader is "legacy") reveals a rich vein of bozo. "Freeware legacy DNS is the internet's dirty little secret" (actually, it's the reason we have a functioning DNS), "Nominum software was written 100 percent from the ground up, and by having software with source code that is not open for everybody to look at, it is inherently more secure" (security through obscurity is equating clothing with being naked yet blind). The Internet kindly did the poor man's homework: screenshot of a cross-site scripting vulnerability in their customer portal, a Nominum security advisory from 2008, and the Nominum web server is running Linux, Apache, and PHP (all legacy freeware yet apparently not the Internet's dirty little secret). (via Bert Hubert and Securosis)
  2. Public Annotations on Healthcare Bill -- using technology from SharedBook, Congressman Culberson hoped to get citizens marking up the healthcare bill. They're using the software but many are just commenting on page 1--turning the hosted annotation platform into a forum with an odd user interface. It's a UI challenge: designing a way to let focused people comment on specific things, while also permitting impatient unfocused people to comment on the general topic. It's like asking for a SmartCar that seats 80. See also OpenCongress and their annotation system which also has hundreds of comments on the first few lines of the bill (including 39 on the one line "111th Congress"--apparently more contentious than you'd think!).
  3. MyConnPy -- pure-Python MySQL client library, useful because it requires no C compilation to install (and thus can work on systems without C compilers installed, e.g. mobile). (via Simon Willison)
  4. The Infinite Book -- design concept for an ebook reader (not a product you can buy yet). Sexy. (via Gizmodo)

tags: cloud, dns, ebooks, gov2.0, marketing, mysql, open source, python, social software | comments: 2

 

Tue

Sep 29
2009

Nat Torkington

Four short links: 29 September 2009

Bletchley Park No Longer Blech, Contest Mania, Palm Process Fails For Free Software, Open Source Web Analytics

by Nat Torkington (@gnat) | comments: 0

  1. Bletchley Park May Have a Future -- the UK birthplace of modern computing, where Alan Turing worked during WW II breaking German codes, is dilapidated and in need of major repair. They appear to have a supporter in the UK National Lottery, who have given them a grant to begin work and prepare for further grants. It should be secured for the future as a place of significant historical merit in the development of computing. (See also The Geek Atlas)
  2. Google Opens Voting on Ideas to Change the World -- there are a lot of contests at the moment: Project 10^100, Apps for Democracy, Apps for America, a plethora of X Prizes, the Netflix prize, and more. I wonder whether contests are like communities: you need a manager to cultivate and boost interest, or else your contest withers on the vine.
  3. My ongoing Kafka-esque nightmare of dealing with Palm and their App Catalog submission process (jwz) -- "This is my story about attempting to simply distribute this free software that I have written, and how Palm has so far completely prevented me from doing so." Epic Palm fail. (via Hacker News)
  4. Piwik -- Piwik aims to be an open source alternative to Google Analytics. GPL-licensed.

tags: analytics, collective intelligence, history, open source, palm, uk, web | comments: 0