Entries tagged with “databases” from O'Reilly Radar
Four short links: 4 November 2009
Electronics Hacking FAQs, Speech-To-Text Democracy, Open Source Column Database, Massive Online Analysis
by Nat Torkington | @gnat | comments: 1
- ChipHacker -- collaborative FAQ site for electronics hacking. Based on the same StackExchange software as RedMonk's FOSS FAQ for open source software.
- Democracy Live -- BBC launch searchable coverage of parliamentary discussion, using speech-to-text. One aspect we're particularly proud of is that we've managed to deliver good results for speech-to-text in Welsh, which, we're told, is unique. I think of this as the start of a They Work For You for video coverage. I'd love to be able to scale this to local government coverage, which is disappearing as local newspapers turn into delivery mechanisms for real estate advertisements.
- InfiniDB: Open Source Column Database -- hooks into MySQL, uses MySQL for SQL parsing, security, etc. The commercial enterprise version has multi-server support (parallel scale-out). (via Brian Aker)
- Massive Online Analysis -- MOA is a framework for data stream mining. Includes tools for evaluation and a collection of machine learning algorithms. Related to the WEKA project, also written in Java, while scaling to more demanding problems. (via joshua on Delicious)
tags: big data, collective intelligence, databases, democracy, gov2.0, hardware, maker, open source
Four short links: 18 August 2009
iPhone App Backstory, Cookie Resurrection, The Entrepreneurialism Lickmus Test, and An Interesting Database
by Nat Torkington | @gnat | comments: 2
- The Making of the NPR News iPhone App -- interesting behind-the-scenes look, with sketches and all. Station streams, however, presented a larger challenge. To begin with, NPR didn't have direct stream links for any of its stations, so we built a Web spider that identified and captured more than 300 iPhone-compatible station streams. After that first pass, we worked with our station representatives to manually test each stream. In the process they found enough new streams to double our database. All of these streams are delivered to the app from NPR's Station Finder API. (via mattb on Twitter)
- You Deleted Your Cookies? Think Again (Wired) -- Flash keeps its own cookies, which are harder to delete. Several services even use the surreptitious data storage to reinstate traditional cookies that a user deleted, which is called ‘re-spawning’ in homage to video games where zombies come back to life even after being “killed,” the report found. So even if a user gets rid of a website’s tracking cookie, that cookie’s unique ID will be assigned back to a new cookie again using the Flash data as the “backup.” (via Simon Willison)
- Would You Lick It? (Rowan Simpson) -- clever example of what it takes to be an entrepreneur.
- FluidDB -- a shared "in the cloud" database built around tags: an object is a container for a set of tags, which are name:value pairs; tag names have simple namespaces (e.g., "gnat/review" is the "review" tag in my namespace); all objects are world readable and writable, but there are ACLs for tags; values can be any type (string, number, URL, Excel spreadsheet); and there's a simple query language. I'm curious to see what applications spring up around shared data. They're in limited alpha, controlling the number of users, so register now to play before everyone else.
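To make the FluidDB data model above a little more concrete, here is a minimal in-memory sketch of objects carrying namespaced tags. Everything in it is hypothetical: `TagStore`, `set_tag`, `grant`, and `query` are illustrative names, not FluidDB's API, and the ACL rule (a namespace owner can always write their own tags, everyone else needs a grant) is an assumption made for the example.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List, Set


@dataclass
class TagStore:
    """Toy model of shared objects with namespaced tags (all names hypothetical)."""

    # object id -> {"namespace/tag": value}; any user may attach tags to any object
    objects: Dict[str, Dict[str, Any]] = field(default_factory=dict)
    # "namespace/tag" -> extra users allowed to write that tag
    writers: Dict[str, Set[str]] = field(default_factory=dict)

    def grant(self, owner: str, tag: str, user: str) -> None:
        """Let `user` write `tag`; only the namespace owner may grant (assumed rule)."""
        if tag.split("/", 1)[0] != owner:
            raise PermissionError(f"{owner} does not own {tag}")
        self.writers.setdefault(tag, set()).add(user)

    def set_tag(self, user: str, object_id: str, tag: str, value: Any) -> None:
        """Attach a tag value to an object, enforcing the per-tag ACL."""
        namespace = tag.split("/", 1)[0]
        allowed = self.writers.get(tag, set()) | {namespace}
        if user not in allowed:
            raise PermissionError(f"{user} may not write {tag}")
        self.objects.setdefault(object_id, {})[tag] = value

    def query(self, tag: str, predicate: Callable[[Any], bool] = lambda v: True) -> List[str]:
        """Return ids of objects that carry `tag` with a value matching `predicate`."""
        return sorted(
            oid for oid, tags in self.objects.items()
            if tag in tags and predicate(tags[tag])
        )


store = TagStore()
store.set_tag("gnat", "book:visualizing-data", "gnat/review", "worth reading")
store.set_tag("gnat", "book:visualizing-data", "gnat/rating", 4)
store.set_tag("alice", "book:visualizing-data", "alice/rating", 5)
print(store.query("gnat/rating", lambda v: v >= 4))
```

The point of the sketch is that two users can annotate the same object without coordinating, because each writes into their own namespace.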
Four short links: 6 August 2009
Ancient Language, NoSQL, Molecular Gastronomy, SQL Weirdness
by Nat Torkington | @gnat | comments: 0
- Computers Unlock More Secrets of the Indus Valley Script -- Four-thousand years ago, an urban civilization lived and traded on what is now the border between Pakistan and India. During the past century, thousands of artifacts bearing hieroglyphics left by this prehistoric people have been discovered. Today, a team of Indian and American researchers are using mathematics and computer science to try to piece together information about the still-unknown script. The team led by a University of Washington researcher has used computers to extract patterns in ancient Indus symbols. The study, published this week in the Proceedings of the National Academy of Sciences, shows distinct patterns in the symbols' placement in sequences and creates a statistical model for the unknown language. (via ACM TechNews)
- NoSQL: If Only It Was That Easy -- war stories about the problems of using NoSQL systems to handle high throughput. We liked Tokyo Tyrant so much, we put it in production. In fact, every request to AboutUs.org hits Tokyo. One of the uses is as a persistent memcached replacement for caching 10 million+ wiki pages (as a json document of all the pieces of our page, which comes out to around 51gb(edited) of data), and it works great. It runs on a single server, it serves up a single type of data, very quickly, and has been a pleasure to use. We keep other ancillary data sets on some other servers too, and it’s great for this. Tokyo Tyrant is a great example of very performant software, but it doesn’t scale. (via straup on Delicious)
- WillPowder -- Specialty Powders and Spices from Chef Will Goldfarb -- molecular gastronomy products from "the golden boy of pastry". (via joshua on Delicious)
- What is the Deal with NULLs? -- In the past, I’ve criticized NULL semantics, but in this post I’d just like to explain some corner cases that I think you’ll find interesting, and try to straighten out some myths and misconceptions. [...] I believe the above shows, beyond a reasonable doubt, that NULL semantics are unintuitive, and if viewed according to most of the “standard explanations,” highly inconsistent. (via bos on Delicious)
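For a taste of the kind of NULL behavior that surprises people, here is a small sketch using Python's built-in sqlite3 module. These are generic illustrations of SQL's three-valued logic, not the specific corner cases from the linked post, and the table `t` and helper `scalar` are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (x INTEGER)")
conn.executemany("INSERT INTO t VALUES (?)", [(1,), (2,), (None,)])


def scalar(sql):
    """Run a query and return the first column of the first row."""
    return conn.execute(sql).fetchone()[0]


# NULL = NULL is not true; it evaluates to NULL (None in Python).
null_equals_null = scalar("SELECT NULL = NULL")

# Comparing with = NULL never matches a row; IS NULL does.
rows_eq_null = scalar("SELECT COUNT(*) FROM t WHERE x = NULL")
rows_is_null = scalar("SELECT COUNT(*) FROM t WHERE x IS NULL")

# A NULL in a NOT IN list makes the predicate unknown for every non-matching row.
not_in_with_null = scalar("SELECT COUNT(*) FROM t WHERE x NOT IN (1, NULL)")

# COUNT(*) counts rows, COUNT(x) skips NULLs, and SUM over no rows is NULL, not 0.
count_star, count_x = conn.execute("SELECT COUNT(*), COUNT(x) FROM t").fetchone()
sum_empty = scalar("SELECT SUM(x) FROM t WHERE x > 100")

print(null_equals_null, rows_eq_null, rows_is_null, not_in_with_null)
print(count_star, count_x, sum_empty)
```

The NOT IN case is the one that tends to bite in practice: the row with x = 2 looks like it should qualify, but the comparison against NULL leaves the answer unknown, so the row is filtered out.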
Four short links: 29 May 2009
Meatware Hacks, iPhone Web Stats, Distributed Hash Tables, Richard Feynman Fun
by Nat Torkington | @gnat | comments: 5
- Freedom for OS X -- Mac app that disables networking for up to eight hours so you can get work done without Internet distractions. Technology workarounds for meatware bugs. (via Joshua-Michèle Ross).
- iPhone Casts a Giant Shadow on the Web -- 43% of mobile web traffic is from iPhone users, as measured by "the world's largest purveyor of ads on mobile apps and websites". As I was told today, "more people are spending more time looking at the web through one of these. For how much longer can you afford to ignore it?" (via timoreilly on Twitter)
- Why you won't be building your killer app on a distributed hash table (Jonathan Ellis) -- locking and sophisticated queries. I'm still trying to figure out where we'll end up with these "let's do something simple in a way that lets us scale horizontally, and then build on top of that" approaches to solving the big data/graph theory problems behind many modern apps.
- Richard Feynman Interviews at Microsoft -- a bit of fun to start the weekend on. (new URL 20090601)
Four short links: 26 May 2009
Databases, Sensors, Visualization, and Patents
by Nat Torkington | @gnat | comments: 0
- Flare -- dynamically partitioning and reconstructing key-value server. Currently built on Tokyo Cabinet, but backend is theoretically pluggable. (via joshua on delicious)
- Implantable Device Offers Continuous Cancer Monitoring -- the sensor network begins to extend into our bodies. The cylindrical, 5-millimeter implant contains magnetic nanoparticles coated with antibodies specific to the target molecules. Target molecules enter the implant through a semipermeable membrane, bind to the particles and cause them to clump together. That clumping can be detected by MRI (magnetic resonance imaging). The device is made of a polymer called polyethylene, which is commonly used in orthopedic implants. The semipermeable membrane, which allows target molecules to enter but keeps the magnetic nanoparticles trapped inside, is made of polycarbonate, a compound used in many plastics. (via FreakLabs)
- Visualizing Data source -- the source code to examples in Visualizing Data.
- The First Software Patent (Wired) -- was issued on this day in 1981, for a complex full-text storage and retrieval system. Tellingly, the business strategy of the owner of the first software patent was ... to become a patent lawyer. A day that will linger in irritation, if not live in infamy. (via glynmoody on Twitter)
tags: big data, book related, databases, history, law, medicine, patent, sensors, visualization
Four short links: 30 Apr 2009
Youth, Government, Tween Arduino Hackers, and Table Slurpage
by Nat Torkington | @gnat | comments: 0
- Ypulse Conference -- conference on marketing to youth with technology, from the very savvy Anastasia Goodstein who runs the interesting Ypulse blog on youth culture that I've raved about before. Register with the code RADAR for a 10% discount (thanks, Anastasia!).
- Government in the Global Village -- departing post by the NZ CIO (and Kiwi Foo Camper) Laurence Millar. The principles here are applicable to almost every nation. We need to recognise the network effects of opening up government data in a form that means others can access it. Economic value is created by businesses building innovative new services using government data. Public value is created by enabling a richer and deeper understanding and dialogue among interested individuals about what the data tells us about our lives.[...] The legal, policy, and moral position is clear - New Zealanders own the data, having paid for its collection through taxes. These “problems” will all be solved by the community, and our role as government is to give priority to this. These efforts are stuff that matters. See also Google adds search to public data.
- Children's Arduino Workshop (Makezine) -- video of three eleven-year-old girls working on an Arduino project, which should be an inspiration to anyone who has ever wanted to work on hardware projects with kids. Whoever did it succeeded in making it fun! (via followr on Twitter)
- With YQL Execute, The Internet Becomes Your Database -- YQL is a query language for Yahoo! data sources, and now they've added a server-side JavaScript way to import your own web page's tables into YQL. YQL and Pipes are turning into very interesting pieces of infrastructure (e.g., Museum Pipes blog). (via Simon Willison and straup on delicious)
tags: data, databases, democracy, education, government, hardware, make, marketing, transparency, web as platform
Four short links: 22 Apr 2009
by Nat Torkington | @gnat | comments: 0
Government, Bayes, SMS, and distributed keystores:
- Government Projects the Agile Way -- Can It Be Done? (NZ Government) -- notes and audio from a workshop at the New Zealand State Services Commission looking to merge agile and government. The pullquotes are mostly generic about agile, but the important thing is that there are agile projects within government and their numbers are growing. Having witnessed the incredibly slow, cautious, and non-agile development processes of government, I know how good this shift can be for budgets and delivery.
- DivMod Reverend -- general purpose open source Bayesian classifier in Python (the Ruby port is Bishop). Bayes' theorem lies behind the 2000-era spam filters, and there have been plenty of open source libraries to do Bayesian classification, but this one caught my eye because it's from the very good DivMod folks who are behind the very good Twisted framework. (via noahgift's delicious stream)
- RapidSMS -- a free and open source messaging framework for building SMS applications. Integrates with Django. (via straup's delicious stream)
- Some Notes on Distributed Key Stores (Leonard Lin) -- he had to install and test distributed keystores for a client's project, and posted his notes. Distributed keystores are part of the recent spate of database-like tools intended to solve some of the problems of big data applications. The distributed stores out there is currently pretty half-baked at best right now. [...] Don’t believe the hype. There’s a lot of talk, but I didn’t find any public project that came close to the (implied?) promise of tossing nodes in and having it figure things out. [...] Based on the maturity of projects out there, you could write your own in less than a day. It’ll perform as well and at least when it breaks, you’ll be more fond of it. Alternatively, you could go on the conference circuit and talk about how awesome your half-baked distributed keystore is. (via straup's delicious stream)
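For readers wondering what the Bayesian classification mentioned in the DivMod Reverend item looks like in practice, here is a minimal naive Bayes sketch. It is not Reverend's API: the `NaiveBayes` class, its `train` and `classify` methods, whitespace tokenization, and add-one smoothing are all assumptions chosen to keep the example short.

```python
import math
from collections import Counter, defaultdict


class NaiveBayes:
    """Tiny multinomial naive Bayes text classifier (hypothetical names throughout)."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # label -> word -> count
        self.doc_counts = Counter()              # label -> number of training docs
        self.vocab = set()                       # every word seen in training

    def train(self, label, text):
        words = text.lower().split()
        self.word_counts[label].update(words)
        self.doc_counts[label] += 1
        self.vocab.update(words)

    def classify(self, text):
        words = text.lower().split()
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, -math.inf
        for label in self.doc_counts:
            # log P(label) + sum of log P(word | label), with add-one smoothing
            score = math.log(self.doc_counts[label] / total_docs)
            total_words = sum(self.word_counts[label].values())
            for word in words:
                count = self.word_counts[label][word]
                score += math.log((count + 1) / (total_words + len(self.vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


nb = NaiveBayes()
nb.train("spam", "cheap pills buy now")
nb.train("spam", "buy cheap watches now")
nb.train("ham", "meeting notes attached")
nb.train("ham", "lunch meeting tomorrow")

print(nb.classify("buy cheap pills"))   # spam
print(nb.classify("meeting tomorrow"))  # ham
```

Working in log space avoids multiplying many small probabilities together, and the smoothing term keeps a single unseen word from zeroing out a label.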
tags: collective intelligence, data, databases, django, mobile, open source, programming, sms
Four short links: 16 Apr 2009
by Nat Torkington | @gnat | comments: 1
China, databases, storage, and git:
- China's Complicated Internet Culture (Ethan Zuckerman) -- summary of Rebecca MacKinnon's talk at the Berkman Internet Center. Democracy is complex and hard to transition to, online democracy doubly so. Rebecca questions the widespread but unjustified belief that the Great Firewall of China is all that separates Chinese citizens from the empowered liberty of the West, and lays out the tangled state of affairs in China's political Internet. Despite the rise of web video, “no one has managed to organized an opposition party on the web,” Rebecca points out. “There’s no Lech Walenza, no religious movement - Falun Gong has been squished pretty thoroughly.” (via cshirky's delicious stream)
- Drop ACID and Think About Data -- Bob Ippolito's talk from PyCon about the things you can do easily when you forsake the promises of ACID. More in the ongoing reinvention of databases for the needs of modern web systems. (via cesther's Twitter stream)
- The Pogoplug -- The Pogoplug connects your external hard drive to the Internet so you can easily share and access your files from anywhere. We're accumulating terabytes of storage at home, where it's very useful to all the computers in the home. This offers an easy way for non-technical civilians to make these drives useful outside the home as well. There are many possibilities for Interesting Things in the massive storage we're accumulating. (via joshua's delicious stream)
- Gitorious -- open source (AGPLv3) clone of github. (via edd's delicious stream)
tags: big data, china, databases, democracy, hardware, open source, politics, programming
Four short links: 15 Apr 2009
by Nat Torkington | @gnat | comments: 0
Computer archaeology, Unix, mad science, and data mining:
- NASA Images Saved By Volunteers -- Pictures from the mid-1960s Lunar Orbiter program lay forgotten for decades. But one woman was determined to see them restored. One woman and some keen hardware hackers who built Frankenstein's tape reader to recover the images. Not just a reminder of how ephemeral our media are, but also of the huge amount of useful work that falls outside the interest of Official Groups to fund. (via Tim's twitter stream)
- The Art of Unix Programming and The UNIX-HATERS Handbook (PDF) -- one loves Unix, the other ... not so much. It's interesting to read both books consecutively and realize the vast gulf that existed between Good Enough and Perfect, and how Perfect has been well and truly vanquished by Good Enough. The original Unix solved a problem and solved it well, as did the Roman numeral system, the mercury treatment for syphilis, and carbon paper. And like those technologies, Unix, too, rightfully belongs to history. (TAoUP via bengebre's delicious bookmarks)
- Theo Gray's Mad Science -- a book full of Make-like charismatic megascience that you could theoretically do if you were sufficiently patient, provisioned, and safe. Projects include making your own nylon, turning beach sand to steel, and making salt by spectacularly combining sodium and chlorine. (via BoingBoing)
- Microsoft Offers Data Mining Tools in the Cloud (Byteonic) -- Microsoft offers some of the data mining functionality of SQL Server 2008 in the cloud, with no local Analysis Services server required. The service comes in two flavors: a cloud service and a plug-in for Excel. The tools are forecasting, prediction, and "analyze key influencers". Interesting to see Microsoft offering a higher-level service than the simple Spreadsheet-in-the-Sky offered by Google.
tags: cloud computing, data, databases, history, science
Four short links: 17 Mar 2009
by Nat Torkington | @gnat | comments: 1
Startups, databases, iPhone app marketplace, and how to launch:
- Weary of Looking for Work, Some Create Their Own (NY Times) -- a story about a new tide of entrepreneurs forced into it by the economic times. The goal for many entrepreneurs nowadays is not to create a company that will someday make billions but to come up with an idea that will produce revenue quickly, said Jerome S. Engel, director for the center for entrepreneurship at the Berkeley Haas School of Business. Mr. Engel said many people will focus on serving immediate needs for individuals and businesses.
- Redis -- another key/value pair database, but this time with atomic operations to push and pop. The reinvention of databases continues apace ....
- Gaming on the iPhone--Natural Selection in Real Time -- as the number of games has risen, the price has dropped. But that's where things have begun to settle, just a short time after the App Store started featuring games for the iPhone and iPod Touch. Five bucks is to the iPhone what sixty bucks is to the PC: the high end of the price scale. And the expectation is that, if you're gonna tempt someone to fork over a Lincoln for your hard work, it had better be something special [...] The iPhone is a relatively easy platform for developing games, where you can generally create a game with a small budget and short development time, and be looking at potentially large returns. But the market has become so crowded with casual games that it has become incredibly hard to get your game noticed.
- Don't Launch -- an eminently reasonable answer to the question I've often been asked. Don't chicken out and do a closed beta; get real customers in through real renewable channels. Start with a five-dollar-a-day SEM campaign. Iterate as fast and for as long as you can. Don't scale. Don't marketing launch. I love everything this guy writes. If he ever publishes a collection of his laundry lists and telephone doodles, I'll preorder it on Amazon.

