Entries tagged with “data” from O'Reilly Radar

Fri

Nov 20
2009

Nat Torkington

Four short links: 20 November 2009

Social Network Search for Morons, Bulking Up Bio Data, Better E-Mail, Better Standards

by Nat Torkington | @gnat | comments: 1

  1. Spokeo -- abysmal indictment of society, first prize in mankind's race to the bottom. Uncover personal photos, videos, and secrets ... GUARANTEED! Spokeo deep searches within 48 major social networks to find truly mouth-watering news about friends and coworkers. PS, anybody who gives their gmail username and password to a site that specializes in dishing dirt can only be described as a fucking idiot. (via Jim Stogdill, who was equally disappointed in our species)
  2. Biologists rally to sequence 'neglected' microbes (Nature) -- The Genomic Encyclopedia of Bacteria and Archaea is a project to sequence genomes from more branches of the evolutionary tree of life. Eisen's team selected and sequenced more than 100 'neglected' species that lacked close relatives among the 1,000 genomes already in GenBank. The researchers reported earlier this year at the JGI's Fourth Annual User Meeting that even mapping the first 56 of these microbes' genomes increased the rate of discovery of new gene and protein families with new biological properties. It also improved the researchers' ability to predict the role of genes with unknown functions in already sequenced organisms. (via Jonathan Eisen)
  3. Mail Learning: The What and the How (Simon Cozens) -- a few things that a really good mail analysis tool needs to do. I hope that my mail client and server do these out of the box in the next five years.
  4. Introducing the Open Web Foundation Agreement -- The Open Web Foundation Agreement itself establishes the copyright and patent rights for a specification, ensuring that downstream consumers may freely implement and reuse the licensed specification without seeking further permission. In addition to the agreement itself, we also created an easy-to-read "Deed" that provides a high level overview of the agreement. Applying the open source approach to better standards.

tags: bio, data, email, genomics, idiots, opensource, search, social graph, social software, standards

Thu

Oct 15
2009

Mike Loukides

Wolfram|Alpha API to be released later today

by Mike Loukides | @mikeloukides | comments: 4

We've just been told that the public API for Wolfram Alpha will be made available later today. The API documentation will be available at http://products.wolframalpha.com/api. As of noon PDT, that page only redirects to the Alpha home page, but they've promised it will be available sometime this afternoon.

(continue reading)

tags: alpha, alpha API, API, data, twitter, wolfram, wolfram|alpha

Tue

Oct 6
2009

Nat Torkington

Four short links: 6 October 2009

Birdwatching Technology, Transportation Data, Multitouch in Python, and Face Detection on the iPhone

by Nat Torkington | @gnat | comments: 0

  1. Bird-watching Turns To Technology (BBC) -- CCTV-esque automated bird watching. Sensor networks + computer vision for an ecological purpose. In a bid to track the guillemots' behaviour, Dr Dickinson is refining established work that involves modelling the visual structure of an area around a nest. The computer system will be able to use this model to identify changing elements in the scene, and determine if they correspond to movement by a guillemot. "That is the typical way of doing surveillance," said Dr Dickinson, "work out what's moving, that gives you an idea about what is interesting in a scene."
  2. The Case for Open MTA Data -- If you live in Portland, there are dozens of mobile applications that help fill gaps in transit information. You can check your phone to see when the next bus is supposed to come. You can plan a trip from one unfamiliar part of town to another. You can even have your mobile device buzz if you fall asleep before reaching your destination. For the basic stuff, there's no iPhone necessary (although that certainly helps for information luxuries). Anyone who has a plain old cell phone with text messaging can ride the train or the bus with greater ease thanks to these apps. (via Making Light)
  3. PyMT -- a python module for developing multi-touch enabled media rich applications. Currently the aim is to allow for quick and easy interaction design and rapid prototype development. There is also a focus on logging tasks or sessions of user interaction to quantitative data and the analysis/visualization of such data.
  4. Near Realtime Face Detection on the iPhone with OpenCV Port -- we're probably only one or two revisions of iPhone hardware away from being able to do some serious computer vision tasks on the handset. Proof of concept adds a tie to the face you're pointing the camera at.
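
The guillemot item above describes the usual surveillance recipe: keep a model of the static scene, then flag whatever differs from it. Here is a minimal Python sketch of that idea, assuming grayscale frames stored as nested lists of pixel intensities. The function names, the running-average background model, and the threshold value are illustrative assumptions, not details of Dr Dickinson's system.

```python
def update_background(background, frame, learning_rate=0.05):
    """Blend the new frame into the background model (a running average)."""
    return [
        [(1 - learning_rate) * b + learning_rate * f for b, f in zip(b_row, f_row)]
        for b_row, f_row in zip(background, frame)
    ]


def moving_pixels(background, frame, threshold=30):
    """Return (row, col) positions that differ from the model by more than threshold."""
    return [
        (r, c)
        for r, (b_row, f_row) in enumerate(zip(background, frame))
        for c, (b, f) in enumerate(zip(b_row, f_row))
        if abs(f - b) > threshold
    ]


# Hypothetical 2x3 scene: one pixel brightens sharply, another flickers slightly.
background = [[10, 10, 10], [10, 10, 10]]
frame = [[10, 200, 10], [10, 10, 12]]

changed = moving_pixels(background, frame)          # only the sharp change is flagged
background = update_background(background, frame)   # the model slowly absorbs the scene
```

A real system would still have to decide whether a flagged region is a bird rather than a wave or a shadow; this sketch covers only the "work out what's moving" step.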

tags: computer vision, data, gov2.0, iphone, multitouch, programming, python

Wed

Sep 30
2009

Nat Torkington

Four short links: 30 September 2009

Smart Materials, Google OCR API, Teaching Webinar, HistEx

by Nat Torkington | @gnat | comments: 2

  1. Smart Materials in Architecture -- Using thermal bimetals can allow architects to experiment with shape-changing buildings, Ritter said. Thermal bimetals include a combination of materials with different expansion coefficients that can cause a change in [shape]. Under changing temperatures this can lead one side of a compound to bend more than the other side, potentially creating an entirely different shape, he said. A little impractical at the moment, but think of it as hackers experimenting with what's possible, iterating to find the fit between materials possibility and customer need. (via Liminal Existence)
  2. Google OCR API -- The server will attempt to extract the text from the images; creating a new Google Doc for each image. Experimental at this stage, and early users report periodic crashes. Still, it's a useful service. I wonder whether they're seeing how people correct the scan text and using that to train the OCR algorithms. (via Waxy)
  3. My O'Reilly Podcast: Dan Meyer -- I'm not pimping this because it's O'Reilly (O'R do heaps of stuff I don't mention) but because it's the astonishingly brilliant Dan Meyer. For everything it does well, the US model of math education conditions students to anticipate narrowly defined problems with narrowly prescribed solutions. This puts them in no place to anticipate the ambiguous, broadly defined, problems they'll need to solve after graduation, as citizens. This webcast will define two contributing factors to this intellectual impatience and then suggest a solution.
  4. Inflation Conversion Factors for Dollars 1774 to Estimated 2019 -- in PDF and Excel format. I've wanted such a table in the past for answering those inevitable "... in today's dollars?" historical business questions. (via Schuyler on Delicious)

tags: architecture, data, education, google, history, materials science, money

Thu

Aug 27
2009

Nat Torkington

Four short links: 27 August 2009

Copycrime, Die Music Industry Die, Open Government Data, Augmented Reality

by Nat Torkington | @gnat | comments: 0

  1. Second Degree Murder and Six Other Crimes Cheaper Than Pirating Music -- I'm outraged that the Obama administration is supporting the RIAA on the case against Jammie Thomas, a single mother of four who has to pay them $1.92 million for downloading songs. That's more expensive than murder and six other crimes... (via Br3nda)
  2. Bill Drummond Talk (MP3) -- cofounder of the KLF gives 130 years of music industry history and explains why music's future might depend on not recording it. (via Br3nda)
  3. NZ Government Recommends CC-BY -- NZ all-of-Government licensing framework recommends CC. So far as copyright works are concerned, NZGOAL proposes that agencies apply the most liberal of the New Zealand Creative Commons law licences to those of their copyright works that are appropriate for release, unless there is a restriction which would prevent this. The most liberal Creative Commons licence is the Attribution (BY) licence. So far as non-copyright information is concerned, NZGOAL recommends the use of clear “no-known rights” statements, to provide certainty for people wishing to re-use that information.
  4. Augmented Reality: 5 Barriers to a Web That's Everywhere (ReadWriteWeb) -- great post with five areas that need to be addressed before we can move from "wow" to commonplace. Interoperability: Right now you cannot see information from the Wikitude AR environment if you're looking through the Layar AR browser. This could be the coming of a new browser war just like that of the 1990s. It may not be obvious and it may not even be true that users have a right to view any layer of Augmented Reality through any Augmented Reality browser. Interoperability, standards and openness have been what has let the Web scale and flourish beyond the suffocating walled gardens of its early days. The same is true of telephones, railroads and countless other networked technologies. Logically then, a lack of interoperability between AR environments would be a tragedy of the same type as if the web had remained defined by the islands of AOL and Compuserve or Internet Explorer, forever. (A lack of data portability when it comes to Augmented Reality could cause substantial psychological distress!)

tags: augmented reality, business, copyright, data, gov2.0, law, music, open

Wed

Aug 26
2009

Nat Torkington

Four short links: 26 August 2009

Food, NoSQL, Brain Power, Social Data

by Nat Torkington | @gnat | comments: 0

  1. Better BBQ Through Chemistry -- food is the perfect ground for geek training: there are measurements, there's science, it's easy to know whether you've succeeded, and you can eat all but the worst of your failures. (via BoingBoing)
  2. NoSQL (East) -- conference on the East Coast for relationless databases.
  3. Human Brain Processing Speed -- clocked at 60 bits/second, according to this MIT Technology Review article. Their approach eventually led to Hick's Law, one of the few laws of experimental psychology. It states that the time it takes to make a choice is linearly related to the entropy of the possible alternatives. The results from various reaction-time experiments seem to show that this is the case. Although one byproduct of this approach is that the results are intimately linked to the type of experiment used to measure the reaction time. And that makes each study peculiarly vulnerable to the idiosyncrasies of the experimental approach. Today, Fermín Moscoso del Prado Martín from the Université de Provence in France proposes a new way to study reaction times by analyzing the entropy of their distribution, rather in the manner of thermodynamics. (via Hacker News)
  4. Truly Social Data -- Data will only be truly social when you can work with it in the kinds of ways we work with information in the real, non-computational, world. In the real world we don’t ask for permission to have an opinion on something, to add to the ball of information surrounding a concept. Our needs don’t have to be anticipated by programmers. We can share information as we please. For example, nobody owns the concept of Barcelona. If I want to essentially “tag” Barcelona as being hot, or noisy, or beautiful, I just do it. I can keep my opinion private, I can share it with certain others, I can hold conflicting opinions, I can organize things in multiple ways at the same time and give things many names.
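
Hick's Law, as quoted in item 3, says choice time grows linearly with the entropy of the alternatives. A rough Python sketch of that relationship follows, using Shannon entropy in bits. The intercept and slope are made-up placeholder constants, not measured values from the article, and the function names are illustrative.

```python
import math


def entropy_bits(probabilities):
    """Shannon entropy (in bits) of a distribution over the possible choices."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)


def hick_reaction_time(probabilities, intercept_s=0.2, slope_s_per_bit=0.15):
    """Predicted choice time: a linear function of the entropy of the alternatives.

    intercept_s and slope_s_per_bit are hypothetical constants; in practice
    they would be fitted to data from a specific reaction-time experiment.
    """
    return intercept_s + slope_s_per_bit * entropy_bits(probabilities)


four_equal_choices = [0.25, 0.25, 0.25, 0.25]   # 2 bits of uncertainty
one_likely_choice = [0.85, 0.05, 0.05, 0.05]    # less uncertainty, faster choice

slow = hick_reaction_time(four_equal_choices)
fast = hick_reaction_time(one_likely_choice)
```

Under these assumptions a lopsided set of options yields a shorter predicted time than four equally likely ones, which is the qualitative behaviour the law describes.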

tags: brain, data, fluiddb, food, nosql, research, science, social

Mon

Aug 17
2009

Brady Forrest

Data Is Journalism: MSNBC.com Acquires Everyblock

by Brady Forrest | @brady | comments: 3

everyblock logo

Everyblock, Adrian Holovaty's local data aggregator, has been acquired by MSNBC.com. Many are hailing it as a local news acquisition. For 15 major US cities Everyblock aggregates crime data, restaurant reviews, health inspections, local news and more. This is data that is only of interest to people within a certain area. I care much less about crime ten blocks away than I do about crime two blocks away. Everyblock lets me know what is happening within three blocks of my home and filters everything else out (on the web and iPhone). So Everyblock is a hyperlocal news acquisition, but that is only half of the story (maybe less).

The future of news is data and Everyblock is the premier startup in this area. As Adrian phrased it on his site this past May in a post entitled The definitive, two-part answer to "is data journalism?":

It's a hot topic among journalists right now: Is data journalism? Is it journalism to publish a raw database? Here, at last, is the definitive, two-part answer:
1. Who cares?
2. I hope my competitors waste their time arguing about this as long as possible.

MSNBC.com stopped wasting time just in time.

everyblock data snapshot

There is a coming deluge of data from the new administration. Sites like Data.gov, USASpending.gov and Recovery.gov are hopefully just the beginning of new data sources. It's already too much for many organizations to make sense of. Without the proper tools many stories will never be covered. People will not get the info they need. Everyblock has proven that by taking free local government data sources and making them readily available to interested citizens you can create value. Now it's time to turn those tools and thinking onto a problem of a national scale. (If you'd like to learn more about the Obama administration's efforts to release data, check out Anil Dash's latest piece, The Most Interesting New Tech Startup of 2009.)

It's important to note that Everyblock recently open-sourced the code to their site and, as TechCrunch pointed out, their traffic is not that high. So MSNBC.com could have easily duplicated Everyblock and just turned their traffic hose at the new property. Instead MSNBC.com realized that they are facing a new problem and they needed a new team to tackle it head on. Enter Adrian and Everyblock.

Of course, many people know Adrian as one of the co-creators of Django. In his acquisition blog post he states that he will have more time to work on Django, that Everyblock will stay Python (and presumably continue to roll their own maps) and that this does not affect ebcode, the open-sourced version of Everyblock (Radar post).

Congrats, Adrian. It looks like you solved the dilemma (Radar site) of what to do once you've open-sourced your site: you tackle a bigger problem.

Post updated to reflect that it was MSNBC.com, not MSNBC, that bought Everyblock.

tags: data, geo, journalism

Thu

Aug 13
2009

Nat Torkington

Four short links: 13 August 2009

by Nat Torkington | @gnat | comments: 1

  1. Under the Hood of App Inventor for Android -- regular readers know I'm a big fan of visual programming language Scratch, and apparently Google are too. They've got twelve university classes testing App Inventor for Android, a visual connect-the-bits programming environment for Android. University classes probably because one of the co-creators is Hal Abelson, coauthor of the definitive programming textbook. Also found online: the PR-type announcement, a Professor using it, and @AppInv (nothing juicy on Twitter--it looks like it might be a channel for tech support for the students). (via Hacker News)
  2. Google Web Optimizer Case Study (Four Hour Work Week) -- GWO manages A/B tests for you, with a lot of statistical analysis. It's a fascinating read to see how these should be done. Every equation may halve the readership of a book, but every table of numbers and relevancy analysis doubles the value of a post like this. (via Hacker News)
  3. Opening Up The BBC's Natural History Archive -- the BBC are releasing programme segments and a whole lot of metadata around their programming. Audio and video segmented, tagged with DBpedia terms, and aggregated into a URI structure based on natural history concepts: species, habitats, adaptations, etc. Gorgeous!
  4. Yahoo! Term Extraction API to Close -- Internally, both services share a backend data source that is closing down, so the publicly-facing YDN services will be closing as well. I think it's the most significant casualty of Y! outsourcing search to MSFT, as this API was used by a lot of projects. (via Simon Willison)

tags: android, apis, bbc, data, google, history, programming, semantic web, statistics, web, yahoo

Wed

Aug 12
2009

Nat Torkington

Four short links: 12 August 2009

Health Data, Python Term Extraction, Network Neutrality, New Database

by Nat Torkington | @gnat | comments: 0

  1. Improving Health Care -- Adam Bosworth's speech to the Aspen Health Forum. It starts strong and just gets better: There is a lot of talk about improving health care. And there is a lot to improve. Inadequate Evidence: We don’t know enough about what works. We should require sharing of population statistics across practices and hospitals in order to better determine what works for whom. We should reward practices and hospitals that are delivering the best most cost-effective long-term outcomes and penalize those that deliver the worst.
  2. topia.termextract -- Python library for term extraction, so you can get a list of the nouns and noun phrases used in a piece of text. (via Simon Willison)
  3. Key to Understanding Network Neutrality -- David Pennock neatly identifies the crucial issue, that service quality and price levels be uniformly applied and not arbitrarily based on how much the service provider thinks they can gouge from the customer. The key to understanding this debate is recognizing the difference between anonymity and egalitarianism. A mechanism is anonymous if the outcome does not depend on the identity of the players: two players who bid the same are treated equally. It doesn’t matter what their name, age, or wealth is, what company they represent, or how they plan to use the item — all that matters is what they bid. This is a good property for almost any public marketplace that ensures fair treatment, and one worth fighting for on the Internet.
  4. (the item I linked to releases in a week's time, I will link again when it's live--sorry for the inconvenience. In the meantime, please enjoy this video of a monkey washing a cat)

tags: data, database, fluiddb, healthcare, network neutrality, opensource, python

Tue

Aug 4
2009

Nat Torkington

Four short links: 4 August 2009

NASA Cloudware, btrfs, eBook Editing, Exponential Death

by Nat Torkington | @gnat | comments: 1

  1. NASA Nebula Services/Platform Stack -- The NEBULA platform offers a turnkey Software-as-a-Service experience that can rapidly address the requirements of a large number of projects. However, each component of the NEBULA platform is also available individually; thus, NEBULA can also serve in Platform-as-a-Service or Infrastructure-as-a-Service capacities. Bundles RabbitMQ, Eucalyptus, LUSTRE storage, Fabric deployment, Varnish front-end, MySQL and more. (via Jim Stogdill)
  2. A Short History of btrfs -- Now for some personal predictions (based purely on public information - I don't have any insider knowledge). Btrfs will be the default file system on Linux within two years. Btrfs as a project won't (and can't, at this point) be canceled by Oracle. If all the intellectual property issues are worked out (a big if), ZFS will be ported to Linux, but it will have less than a few percent of the installed base of btrfs. Check back in two years and see if I got any of these predictions right!
  3. Sigil -- open source WYSIWYG eBook editor. (via liza on Twitter)
  4. Exponential Decay of Life -- This startling fact was first noticed by the British actuary Benjamin Gompertz in 1825 and is now called the “Gompertz Law of human mortality.” Your probability of dying during a given year doubles every 8 years. For me, a 25-year-old American, the probability of dying during the next year is a fairly minuscule 0.03% — about 1 in 3,000. When I’m 33 it will be about 1 in 1,500, when I’m 42 it will be about 1 in 750, and so on. (via Hacker News)
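
The doubling rule quoted in item 4 is easy to turn into quick arithmetic. This is a minimal Python sketch, assuming the quoted baseline of roughly 1 in 3,000 at age 25 and a doubling time of exactly 8 years. The function and parameter names are illustrative and do not come from the linked post.

```python
def annual_death_probability(age, base_age=25, base_probability=1 / 3000,
                             doubling_years=8):
    """Gompertz-style estimate: the yearly risk doubles every `doubling_years`."""
    return base_probability * 2 ** ((age - base_age) / doubling_years)


def one_in(probability):
    """Express a probability as rounded '1 in N' odds."""
    return round(1 / probability)


for age in (25, 33, 41):
    print(age, "-> about 1 in", one_in(annual_death_probability(age)))
```

With an exact 8-year doubling the 1-in-750 mark lands at age 41, so the "42" in the quote reads as a loose rounding rather than a different rule.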

tags: bio, cloud computing, data, ebooks, math, publishing, storage

Fri

Jul 17
2009

Nat Torkington

Four short links: 17 July 2009

by Nat Torkington | @gnat | comments: 0

  1. NodeXL: Network Overview, Discovery and Exploration in Excel -- Excel plugin for analysing graph data within Excel. Visualization and data wizardry come to the corporates who live in Excel.
  2. Managing the Environmental Crisis -- a comment by Edwin Winge: "Public involvement does offer long-range benefits, the most pragmatic of which is that it results in better decisions. Park Service managers have discovered through experience that when they are willing to modify their professional judgements by considering ideas and opinions (values) of concerned citizens, the final decision that results is not only more acceptable to the public, it is also more satisfying to the Service." A banner quote for Gov 2.0, from the father of O'Reilly's Sara Winge. (via timoreilly on Twitter)
  3. Dopplr Social Atlas for iPhone -- an iPhone app that gives you the recommendations by Dopplr users for places to eat, things to do, places to stay around the world.
  4. Microformats Dev Camp -- July 25-6 (weekend following OSCON), in San Francisco at the Automattic offices. (via Tantek)

tags: data, dopplr, events, gov2.0, iphone app, microformats, visualization

Thu

Jun 25
2009

Nat Torkington

Four short links: 25 June 2009

Twitter Bucks, Nike Numbers, Map Apps, and Digi Shiz

by Nat Torkington | @gnat | comments: 2

  1. How an Indie Musician Can Make $19,000 in 10 Hours Using Twitter -- as Zoe Keating pointed out: "cash made by @amandapalmer in one month on Twitter = $19,000; cash made by @amandapalmer from 30,000 record sales = $0".
  2. The Nike Experiment: How the Shoe Giant Unleashed the Power of Personal Metrics (Wired) -- And not only can we collect that data, we can analyze it as well, looking for patterns, information that might help us change both the quality and the length of our lives. We can live longer and better by applying, on a personal scale, the same quantitative mindset that powers Google and medical research. Call it Living by Numbers—the ability to gather and analyze data about yourself, setting up a feedback loop that we can use to upgrade our lives, from better health to better habits to better performance. Collective intelligence + sensor networks can = happiness. (Mathematics gets by with just an "equals" operator. The rest of us need a "can equal" operator ...)
  3. Old Map App -- iPhone app with old maps. Reminds me of David Rumsey's keynote at OSCON 2004.
  4. Make It Digital -- Digital NZ site that helps organisations wanting to produce digital content, by offering them guidance on formats, metadata, and other issues they'll have to tackle. Includes a voting system to promote the (NZ) content you want to have digitised.

tags: business, collective intelligence, data, iphone, music, sensor networks, twitter

Wed

Jun 17
2009

Nat Torkington

Four short links: 17 June 2009

Word Mining, Open Ideas, Power Meter BotNet, and Realtime Web Traffic Tracking

by Nat Torkington | @gnat | comments: 0

  1. NY Times Mines Its Data To Identify Words That Readers Find Abstruse -- the feature that lets you highlight a word on a NY Times web page and get more information about it is something that irritates me. I'm fascinated by the analysis of their data: boggling that sumptuary is less perplexing than solipsistic. Louche (#3 on the list) has been my favourite word for two years, by the way, since I heard Dylan Moran toss it out in that uniquely facile way the Irish have with words. I think Irish citizens get this incredible competence with the English language for free, along with staggering house prices and beer you can walk on.
  2. Open Ideas -- Alex Payne's blog of Concepts in the public domain, awaiting collaboration and appropriation.
  3. Buggy 'smart meters' open door to power-grid botnet (The Register) -- Paul Graham said that we've found what we get when we cross a television with a computer: a computer. Similarly, intelligent power meters are computers, computers that apparently haven't been well-secured. To prove his point, Davis and his IOActive colleagues designed a worm that self-propagates across a large number of one manufacturer's smart meter. Once infected, the device is under the control of the malware developers in much the way infected PCs are under the spell of bot herders. Attackers can then send instructions that cause its software to turn power on or off and reveal power usage or sensitive system configuration settings.
  4. Chartbeat -- the sexiest web analytics ever. It gives realtime count of users, whether they're reading or writing (based on whether focus is in a form element), where they're from, mentions on Twitter, and more and more and more. This is a different form of analytics than Google Analytics, which tells you trends and historical access. Love this for the pure sex appeal of a heads-up dashboard that can tell you exactly how many people are on your site and exactly what they're doing. (via Artur)

tags: analytics, crowdsourcing, data, energy, innovation, lazyweb, mining, security

Tue

Jun 16
2009

Nat Torkington

Four short links: 16 June 2009

by Nat Torkington | @gnat | comments: 5

  1. Dealing with Election Results Data -- taking the raw UK European election data into Google's Fusion Tables to try and make sense of it. More cloud-based tools for the data scientist within. (via Simon Willison)
  2. Time for an Open 311 API -- "311" is the US number to call for non-emergency municipal services. There have been a lot of individual projects to hack together web sites that provide the single coherent view of government services that the government itself is unable to offer, but the individual projects have all built their own APIs. SeeClickFix suggest these be unified so tools can be written (e.g., iPhone apps) that run across multiple municipalities. (via timoreilly on Twitter)
  3. Shoppers Cars Soon Able to Power Supermarkets (Daily Mail) -- At the Sainsbury's store in Gloucester, kinetic plates, which were embedded in the road yesterday, are pushed down every time a vehicle passes over them. A pumping action is then initiated through a series of hydraulic pipes that drive a generator. The plates are able to produce 30kw of green energy an hour - more than enough to power the store's checkouts. (via Freaklabs)
  4. Humans Prefer Cockiness to Expertise (New Scientist) -- the blogosphere explained in one paper. (via Mind Hacks)

tags: apis, brain, data, energy, google, gov2.0, visualization

Tue

May 19
2009

Mike Loukides

Wolfram Alpha a Google Killer? Not... Supposed... To... Be

by Mike Loukides | @mikeloukides | comments: 11

I'm getting tired of reading about whether Alpha is a Google-killer. I've seen Stephen Wolfram's presentations a couple of times; he's quite careful to say that it isn't. There's a fundamental difference that many people out there are just missing. Google is a search engine. Alpha looks like a search engine, but it isn't; it's all about curated data, and the analysis of that data.

(continue reading)

tags: analysis, curated data, data, mathematica, wolfram alpha

Tue

May 12
2009

Nat Torkington

Four short links: 12 May 2009

Storage Superfluity, Data-Driven Design, Twit-Mapping, and DIY Biohacking

by Nat Torkington | @gnat | comments: 1

  1. Lacie 10TB Storage -- for what used to be the price of a good computer, you can now buy 10TB of storage. Storage on sale goes for less than $100 a terabyte. This obviously promotes collecting, hoarding, packratting, and the search technology necessary to find what you've stashed away. Analogies to be drawn between McMansions full of Chinese-made crap and terabyte drives full of downloaded crap. Do we need to keep it? Are there psychological consequences to clutter? (via gizmodo)
  2. In Defense of Data-Driven Design -- a thoughtful response to the "Google hates design!" hashmob formed around designer Douglas Bowman's departure from Google. When you’ve got the enormous traffic necessary to work out if minuscule changes have some minor, statistically significant effect, then sure, if you can do it quickly, why wouldn’t you? But that’s optimization that should happen at the very end of the design cycle. The cart goes after the horse. Put it the other way ‘round and you have a broken setup. It doesn’t mean horses suck. It doesn’t mean carts suck. Carts are not the enemy of horses. Optimization is not the enemy of design. Get them in the right order and you have something really useful. Get them the wrong way around and you have something broken.
  3. Just Landed: Processing + Twitter + Metacarta + Hidden Data -- Jer searched Twitter for "just landed in", used Metacarta to extract the locations mentioned, and then used Processing to build visualizations.
  4. Do It Yourself Genetic Sleuthing -- MIT is starting a hotbed of DIY biologists. The 23-year-old MIT graduate uses tools that fit neatly next to her shoe rack. There is a vintage thermal cycler she uses to alternately heat and cool snippets of DNA, a high-voltage power supply scored on eBay, and chemicals stored in the freezer in a box that had once held vegan "bacon" strips. Aull is on a quirky journey of self-discovery for the genetics age, seeking the footprint of a disease that can be fatal but is easily treated if identified. But her quest also raises a broader question: If hobbyists working on computers in their garages can create companies such as Apple, could genetics follow suit? It's unclear what those DIY-started "genetics" companies would look like--the potential is there, but it's yet to meet the right problem. (via Andy Oram)

Just Landed - 36 Hours from blprnt on Vimeo.

tags: attention, biology, data, design, diy, google, make, map, media, metacarta, news, processing, twitter

Fri

May 8
2009

Jesse Robbins

Velocity 2009 - Big Ideas (early registration deadline)

by Jesse Robbins | @jesserobbins | comments: 7

what-is-velocityconf.png

(tag cloud created from Velocity session & speaker information using wordle.net)

My favorite interview question to ask candidates is: "What happens when you type www.(amazon|google|yahoo).com in your browser and press return?"

While the actual process of serving and rendering a page takes seconds to complete, describing it in real detail can take an hour. A good answer spans every part of the Internet from the client browser & operating system, DNS, through the network, to load balancers, servers, services, storage, down to the operating system & hardware, and all the way back again to the browser. It requires an understanding of TCP/IP, HTTP, & SSL deep enough to describe how connections are managed, how load-balancers work, and how certificates are exchanged and validated... and that's just the first request!

Web Performance & Operations is an emerging discipline which requires incredible breadth, focusing less on specific technologies and more on how the entire system works together. While people often specialize in particular components, great engineers always think of that component in relation to the whole. The best engineers are able to fly to the 50,000 foot view and see the entire system in motion and then zoom in to microscopic levels and examine the tiny movements of an individual part.

John Allspaw recently described this interconnectedness on his blog:

With websites, the introduction of change (for example, a bad database query) can affect (in a bad way) the entire system, not just the component(s) that saw the change. Adding handfuls of milliseconds to a query that’s made often, and you’re now holding page requests up longer. The same thing applies to optimizations as well. Break that [bad] query into two small fast ones, and watch how usage can change all over the system pretty quickly. Databases respond a bit faster, pages get built quicker, which means users click on more links, etc. This second-order effect of optimization is probably pretty familiar to those of us running sites of decent scale.

Working with these systems requires an understanding not only of the way technology interacts, but the way that people do as well. The structure, operation, and development of a website mirrors the organization that creates it, which is why so many people in WebOps focus on understanding and improving management culture & process.

Organizing a conference like Velocity is a wonderful challenge because it requires the same sort of thinking. We focus on the big concepts that everyone needs to know and then go deep into the technologies that change our understanding of the system. We find ways to share the unique experience that can only be gained by operating at scale. We make it safe to share as much of the "Secret Sauce" as we can.

Please join us at Velocity this year; we have an amazing lineup of speakers & participants. Early registration ends on Monday, May 11th at 11:59 PM Pacific. (Radar readers can use "vel09cmb" for an additional 15% discount.)

Velocity, the Web Performance and Operations Conference 2009

tags: cloud, data, infrastructure, operations, scale, velocity, velocity09, velocityconf, web, web2.0 | comments: 7

Thu

Apr 30
2009

Nat Torkington

Four short links: 30 Apr 2009

Youth, Government, Tween Arduino Hackers, and Table Slurpage

by Nat Torkington (@gnat) | comments: 0

  1. Ypulse Conference -- conference on marketing to youth with technology, from the very savvy Anastasia Goodstein who runs the interesting Ypulse blog on youth culture that I've raved about before. Register with the code RADAR for a 10% discount (thanks, Anastasia!).
  2. Government in the Global Village -- departing post by the NZ CIO (and Kiwi Foo Camper) Laurence Millar. The principles here are applicable to almost every nation. We need to recognise the network effects of opening up government data in a form that means others can access it. Economic value is created by businesses building innovative new services using government data. Public value is created by enabling a richer and deeper understanding and dialogue among interested individuals about what the data tells us about our lives.[...] The legal, policy, and moral position is clear - New Zealanders own the data, having paid for its collection through taxes. These “problems” will all be solved by the community, and our role as government is to give priority to this. These efforts are stuff that matters. See also Google adds search to public data.
  3. Children's Arduino Workshop (Makezine) -- video of three eleven-year-old girls working on an Arduino project, which should be an inspiration to anyone who has ever wanted to work on hardware projects with kids. Whoever did it succeeded in making it fun! (via followr on Twitter)
  4. With YQL Execute, The Internet Becomes Your Database -- YQL is a query language for Yahoo! data sources, and now they've added a server-side JavaScript way to import your own web page's tables into YQL. YQL and Pipes are turning into very interesting pieces of infrastructure (e.g., Museum Pipes blog). (via Simon Willison and straup on delicious)

tags: data, databases, democracy, education, government, hardware, make, marketing, transparency, web as platform | comments: 0

Mon

Apr 27
2009

John Geraci

Trying to Track Swine Flu Across Cities in Realtime

by John Geraci (@johngeraci) | comments: 15

John Geraci is a guest blogger and heads up the DIY City movement. He will be speaking about DIY City at Where 2.0 in San Jose on 5/20.

Since early last Friday, when I got a tip about swine flu in Mexico City from a health researcher, the team that does SickCity has been working to make the system something that can (or could) detect swine flu outbreaks in cities around the world.

It hasn't been easy.

SickCity is "realtime disease detection for your city", a service created by people at DIYcity. Launched last month, it works by monitoring Twitter for local mentions of various terms that mean "I'm getting sick" and plotting them by location. Up until Friday, SickCity seemed to work reasonably well for the very rough beta tool that it is. It showed incidences of people reporting they had flu, or chicken pox, or other illnesses, broken down by city. You could look at a graph of the past 30 days for your city and see days when mentions of certain diseases and symptoms were higher or when they were lower. You could sometimes see trends. No one claimed that SickCity was ready for prime time, but those working on it felt that there was a very worthwhile idea in it that, with a bit of refinement, would be of huge value to communities.
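To make the idea concrete, here is a minimal sketch in Python of counting "I'm getting sick" mentions per city per day. It is not SickCity's actual code: the term list, the `daily_mentions` function, and the `(city, day, text)` input shape are all assumptions for illustration, and working out which city a tweet belongs to is left out entirely.

```python
import re
from collections import Counter

# Hypothetical term list; the real service's search terms are not given here.
SICK_TERMS = ("flu", "chicken pox", "fever")

# Match whole words or phrases only, so "flu" does not match "influence".
SICK_PATTERN = re.compile(
    r"\b(?:" + "|".join(re.escape(term) for term in SICK_TERMS) + r")\b",
    re.IGNORECASE,
)


def daily_mentions(tweets):
    """Count matching tweets per (city, day).

    `tweets` is an iterable of (city, day, text) tuples. How each tweet
    is assigned to a city is outside the scope of this sketch.
    """
    counts = Counter()
    for city, day, text in tweets:
        if SICK_PATTERN.search(text):
            counts[(city, day)] += 1
    return counts
```

Plotting `counts` over the last 30 days for one city would give the kind of graph described above. Note that a counter like this tallies any mention of a term, whether or not the person writing is actually ill.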

On Friday, all of that got turned upside down.

Going to SickCity's Mexico City page early in the day, I saw a sudden, several-hundred percent increase in mentions of flu. The problem was, not a single one of them was about actually having the flu - all were about the gigantic swine flu media event that was just beginning. Our disease detection tool had turned into a media event detection tool overnight.

Since then, we've been in a constant struggle to filter out the media effect from the data. The problem is, as the story grows and changes, the terms we have to filter for keep growing and changing. On Saturday we made a series of changes to the filters and search terms, and thought we were fine. By Sunday, those had become totally insufficient in the face of the growing Twitter storm surrounding swine flu (70 more results in the time it took me to write that sentence). We made more changes Sunday. Today, those additional filters seemed puny and insufficient. People are now calling swine flu "piggy flu", "pork flu", "bacon flu", "wine flu". They're talking about Obama having flu. They're talking about bird flu. The list of tweeting topics grows at an exponential rate. The topic of swine flu is incredibly viral.

So how do you get down below this huge cloud of noise, to the relatively tiny (but very important) signal down beneath? There are probably several thousand tweets happening right now about the idea of flu for every one that is about actually having the flu. The number of people actually coming down with flu right now in fact seems very low (let's hope it stays that way).

Tracking other terms related to flu seems more promising - the term "fever" seems like a good one to look for, and once you get rid of the tweets mentioning spring fever, cabin fever and Doctor Johnny Fever, you've got a pretty good data set to use. But how representative of the flu population is that term?
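A rough sketch of that kind of filter, in Python. The three excluded phrases are the ones mentioned above; the function name and the approach (blank out the known false positives, then look for the word on its own) are assumptions for illustration, not a description of what SickCity does.

```python
import re

# Phrases that contain "fever" but do not describe an illness.
# A real list would need constant tuning as new phrases turn up.
NON_ILLNESS_PHRASES = ("spring fever", "cabin fever", "johnny fever")

# "fever" as a whole word, so "feverish" is not matched by this sketch.
FEVER = re.compile(r"\bfever\b", re.IGNORECASE)


def mentions_real_fever(tweet: str) -> bool:
    """Return True if the tweet mentions "fever" outside the excluded phrases."""
    text = tweet.lower()
    for phrase in NON_ILLNESS_PHRASES:
        # Remove the false-positive phrase, leaving any other mention intact.
        text = text.replace(phrase, " ")
    return FEVER.search(text) is not None
```

Blanking out the excluded phrases rather than rejecting the whole tweet means a message such as "cabin fever, and now an actual fever too" still counts.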

Maybe tracking actual flu tweets in this situation isn't really possible?

Still, the payoff in terms of value to communities and health organizations is huge if the developers can get something that can be demonstrated to work. As a public health researcher following SickCity told me, realtime outbreak detection is currently terrible at best. To improve on what's there, you just have to give people a reliable signal that *something* is happening in a city. You don't need to have exact numbers. You don't even need to know whether what's happening is actually flu, or food poisoning, or plague, really - the health officials can figure that out for themselves pretty quickly with all of the other tools at their disposal, once they know to be on the lookout. You just need to be able to reliably say "there is a sickness event happening right now in this city", and that's enough. You just need a canary in the coal mine.
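One simple way to turn daily counts into that kind of "something is happening" signal is to compare today's count with the recent baseline. The sketch below is an assumption on my part, not SickCity's method: the `is_spike` name, the z-score approach, and the threshold of 3 standard deviations are all illustrative choices.

```python
from statistics import mean, stdev


def is_spike(history, today, threshold=3.0):
    """Flag today's count if it sits far above the recent baseline.

    `history` is a list of daily mention counts for earlier days and
    `today` is the current day's count. The threshold is measured in
    standard deviations and is an arbitrary illustrative value.
    """
    if len(history) < 2:
        return False  # not enough data to estimate a baseline
    baseline = mean(history)
    spread = stdev(history)
    if spread == 0:
        # Perfectly flat history: any increase at all stands out.
        return today > baseline
    return (today - baseline) / spread > threshold
```

A check like this only says that mentions jumped. It cannot tell an outbreak from a media storm, so it is only as good as the filtering applied to the tweets before they are counted.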

So the developers behind SickCity, volunteers from DIYcity (mainly Paul Watson and Dan Greenblatt at this time, plus a few others), keep working on making it that. And right now they're working round the clock. (It's a public project. If you want to pitch in, by all means do so; you can get more info here.)

Even if SickCity fails to detect swine flu in cities around the world, it will have become a much more robust tool in the process of failing. If it doesn't succeed in catching this pandemic, maybe it will be better prepared to catch the next one?

tags: data, diy, swine flu, twitter | comments: 15

Thu

Apr 23
2009

Nat Torkington

Four short links: 23 Apr 2009

by Nat Torkington (@gnat) | comments: 3

Multitouch, visualizations, body hacks, and ubicomp:

  1. Dell Demos Multitouch on the Studio One 19 (Engadget) -- the multitouch software on this baby is Fingertapps from the New Zealand company Unlimited Realities, whose founder was at Kiwi Foo Camp this year. Multitouch hits consumer PCs in a very mainstream way.
  2. Circos -- open source Perl library to produce beautiful circular data displays. (via flowing data)
  3. Brain Gain: The Underground World of “Neuroenhancing” Drugs (New Yorker) -- more on the body hacks theme of radical and literal self-improvement, as originally documented by Quinn Norton. What I found interesting was that when BoingBoing linked to it, they quoted the "Provigil might make us smarter" bit, and when Mind Hacks linked to it, they quoted the negative effects of amphetamine-based drugs.
  4. Towards the Web of Things: Web Mashups for Embedded Devices -- slides and notes for a presentation given at MEM 2009. Basically saying that the Internet of Things should be built on JSON and REST, with demo. (via Freaklabs)

tags: biology, data, medicine, multitouch, sensors, ubicomp, visualization | comments: 3