Entries tagged with “interviews” from O'Reilly Radar

Tue, Nov 17, 2009

The iPhone: Tricorder Version 1.0?

by James Turner
comments: 4

The iPhone, in addition to revolutionizing how people thought about mobile phone user interfaces, also was one of the first devices to offer a suite of sensors measuring everything from the visual environment to position to acceleration, all in a package that could fit in your shirt pocket.

On December 3rd, O'Reilly will be offering a one-day online edition of the Where 2.0 conference, focusing on the iPhone sensors, and what you can do with them. Alasdair Allan (the University of Exeter and Babilim Light Industries) and Jeffrey Powers (Occipital) will be among the speakers, and I recently spoke with each of them about how the iPhone has evolved as a sensing platform and the new and interesting things being done with the device.

Occipital is probably best known for Red Laser, the iPhone scanning application that lets you point the camera at a UPC code and get shopping information about the product. With recent iPhone OS releases, applications can now overlay data on top of a real time camera display, which has led to the new augmented reality applications. But according to Powers, the ability to process the camera data is still not fully supported, which has left Red Laser in a bit of a limbo state. "What happened with the most recent update is that the APIs for changing the way the camera screen looks were opened up pretty much completely. So you can customize it to make it look any way you want. You can also programmatically engage photo capture, which is something you couldn't do before either. You could only send the UI up and the user would have to use the normal built-in iPhone UI to capture. So you can do this programmatic data capturing, and you can process those images that come in. But as it turns out, at the same time, shortly after 3.1, the method that a lot of people were using to get the raw data while it was streaming in became a blacklisted function for the review team. So we've actually had a lot of trouble as of late getting technology updates through the App Store because the function we're using is now on a blacklist. Whereas it wasn't on a blacklist for the last year."

Powers is hopeful that the next release of the OS will bring official support for the API calls that Red Laser uses, based on the fact that the App Store screeners aren't taking down existing apps that use the banned APIs. Issues with the iPhone camera sensors pose more of a problem for him. "In terms of science, it's definitely a really bad sensor, especially if you look at the older iPhone sensor, because it has what's called a rolling shutter. A rolling shutter means that as you press capture or rather as the camera is capturing video frames or as you capture a frame, the camera then begins to take an image. And it takes a finite number of milliseconds, maybe 50 or so, before it is actually exposed to the entire frame and stored that off into a sensor. Because it's doing something that's more like a serial data transfer instead of this all at once parallel capture of the entire frame, what that causes is weird tearing and odd effects like that. For photography, as long as it's not too dramatic, it's not a huge deal. For vision processing, it's a huge deal because it breaks a lot of assumptions that we typically make about the camera. That has gotten better in the 3GS camera, but it's still not perfect. It is getting better, especially when the camera's turned on the video mode."
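The rolling-shutter effect Powers describes can be illustrated with a small simulation. This is only a sketch under simplifying assumptions, not how the iPhone camera actually works: the function name, the 10-row frame, and the bar speed are made up for illustration, and the 50 ms readout is taken from his "maybe 50 or so" estimate. It records where a vertical bar moving sideways would land in each row when rows are sampled one after another rather than all at once.

```python
def bar_columns(height, bar_x0, speed_px_per_ms, readout_ms):
    """Column at which a moving vertical bar is recorded in each image row.

    Hypothetical model: rows are read out one after another, spread evenly
    over readout_ms. readout_ms=0 models an all-at-once (global) capture.
    """
    columns = []
    for row in range(height):
        # Time offset (whole ms) at which this row is sampled.
        row_time_ms = (readout_ms * row) // height
        # The bar has moved this far by the time the row is read.
        columns.append(bar_x0 + speed_px_per_ms * row_time_ms)
    return columns


# Same scene, captured all at once and with a 50 ms row-by-row readout.
global_capture = bar_columns(height=10, bar_x0=5, speed_px_per_ms=1, readout_ms=0)
rolling_capture = bar_columns(height=10, bar_x0=5, speed_px_per_ms=1, readout_ms=50)

# Horizontal skew between the top and bottom rows, in pixels.
global_skew = global_capture[-1] - global_capture[0]
rolling_skew = rolling_capture[-1] - rolling_capture[0]
print(global_skew, rolling_skew)  # prints: 0 45
```

In the all-at-once capture the bar stays vertical; with the serial readout it comes out slanted, which is the kind of distortion that breaks the geometric assumptions vision algorithms usually make about a frame.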

(continue reading)

tags: augmented reality, image recognition, interviews, iphone, science, sensors, webcast, where 2.0

Mon, Nov 9, 2009

The Minds Behind Some of the Most Addictive Games Around

If you've wasted half your life playing Peggle, Bejeweled, Zuma or Plants vs. Zombies, blame these guys!

by James Turner
comments: 5

Podcast running time: 38:21

Subscribe to this podcast series via iTunes. Or, visit the O'Reilly Media area at iTunes to find other podcasts from O'Reilly.

The gaming industry tends to focus on the high-end products: first-person shooters that crank out a bazillion polygons a second, and RPGs that spend more time developing the plot in cut scenes than in actual gameplay. But for every person playing Borderlands, there are scores playing casual games like Bejeweled and Zuma. PopCap Games has been at the forefront of casual game development, with a catalog that includes bestselling titles like Peggle and Plants vs. Zombies, in addition to the two previously mentioned. I recently had a chance to talk to Jason Kapalka, one of the founders and the creative director of PopCap. We discussed the evolution of PopCap, how the casual gaming industry differs from mainstream gaming, and the challenges of creating games that can be engaging without being frustrating.

James Turner: Could you start by talking a little bit about your background and how you came to PopCap and what you did before then?

Jason Kapalka: My career in computer games started back in the early '90s, when I was writing for the magazine, Computer Gaming World, doing various reviews and articles. In '95, one of the editors from the magazine left to join an internet dotcom start-up in San Francisco called TEN, the Total Entertainment Network. He invited me to come down there and work there, which I did. And TEN evolved over the dotcom boom and bust cycle, from a very hardcore gaming service into what eventually turned into Pogo.com around 1999. I worked there initially on hardcore games. One day, I was working on Total Annihilation tournaments, and then the next day, someone said, "Hey, design bingo." And I was sort of like, "Oh. Bingo? Okay."

That was the beginning of my casual game design career, I guess. And yes, I was there at Pogo. I helped design a lot of the structure for their casual games until around 2000 when I left, and Pogo eventually went on to get bought by Electronic Arts, of course. I left in 2000 and started PopCap with two other guys, Brian Fiete and John Vechey who are these guys from Indiana that I'd met earlier, around '97. They had made an internet action game called ARC that we'd produced on TEN, and we stayed in touch. In 2000, we all thought we wanted to try something different. So we all left our respective companies to start PopCap. As you might remember, 2000 was not the best year for internet companies. So we didn't really realize that the entire industry was collapsing. We had an interesting time initially. Luckily, our ignorance protected us, I guess.

PopCap started from there, just the three of us working out of our apartments. And over time, we'd say, "Well, I guess we need to hire an artist." And I'd say, "Well, I guess we need to hire maybe another guy here to program this stuff." And then eventually, maybe someone should look at the books or whatever, so we'll hire someone to take care of the bookkeeping. And it kept going like that until eventually we thought that maybe we needed an office. And from there, suddenly, we've got nearly 300 employees now in 2009. So it's been an interesting kind of experience. We never really intended PopCap to get anywhere near as big as it has today.

James Turner: How would you describe PopCap's place in the market today?

Jason Kapalka: I guess it's a bit odd. Casual game companies exist in these strange spaces where they're often the developer and the publisher at the same time. And then they also publish stuff with other guys, where they're sort of rivals, but also they're partners. There's a lot of this co-opetition thing going on. PopCap is obviously a developer, and we develop a lot of games. We used to publish other people's games. And we still do indirectly, in that we have SpinTop Games, which is a company we bought a couple of years back. They distribute a lot of other people's games through their site. But primarily, I think we develop and then publish titles. But we primarily focus on publishing our own titles. So we're kind of a self-publisher, I suppose.

James Turner: That's actually something I wanted to ask you about because one of your distribution channels now is Steam, which is another company's portal for their games and others. How do you see that relationship?

Jason Kapalka: Steam's been really good. We work with lots of different portals. Steam is one of many that our typical game would go out on. On Steam, on Real Arcade, Big Fish Games, Yahoo Games, MSN, WildTangent, a whole bunch of smaller channels. So Steam was just one of several. It's been interesting in that it was developed differently than a lot of those other ones. Steam is definitely much more of a hardcore game distribution channel than something like Real Arcade. So initially, when we started on Steam, it was uncertain whether our games were going to really fit in. Initially, a lot of the ones we tried on Steam didn't really work too well for their audience. Hidden object games don't do especially well with Steam users, for example.

The turning point for Steam was probably when we did Peggle Extreme with Valve. I don't know if you remember that. Peggle had just come out, and the guys at Valve really liked it. We were talking and we had some weird ideas. Someone had the odd suggestion to do sort of a miniature-themed version of Peggle that featured all of the Orange Box's characters, the Half-Life, MT Team Fortress guys. It was a really strange idea, because that was a fairly mature violent kind of franchise. And certainly, it didn't seem like the obvious fit for Peggle. But, on the other hand, we thought, "Well, what the heck? We can try it and it's only going to go on Steam anyway so it's not like it'll offend the soccer moms necessarily." So we tried that out, and it went up. And we were all kinds of nervous because we didn't know -- it had launched initially as a free download with the Orange Box. And even though it didn't cost people anything, we were still kind of wondering if there was going to be this big backlash from the hardcore community about, "What the hell is this cheap little pinball thing doing in the middle of my Orange Box product."

But actually, the response was really good. I mean, the Orange Box guys all really liked Peggle a lot. And ultimately, that led them to go and seek out and buy the regular versions of Peggle which made Peggle suddenly this fairly big success on Steam. Which a month or two ago, before that, didn't seem very likely that this game with unicorns and rainbows would be selling well on Steam. So after that, that sort of seemed to kind of be -- it sort of opened the floodgates a little bit. And now a variety of our games do very well on Steam. Obviously, Plants Vs Zombies was the last one that had quite a hit there. Not everything. There's still some of our games that are clearly more casual and that don't particularly work well on Steam. But the ones that do work there seem to really work well.

James Turner: There seems to be a fairly different expectation level for casual games in terms of graphics and such. Do you think that's a natural result of how they're produced and what they're intended for? Or could you see something like Plants Vs Zombies but with the graphics levels of a Half-Life?

Jason Kapalka: It's certainly possible. I mean in some cases, we're not intentionally trying to make the games low fidelity. We try to do the best art direction we can. Although the usual contradiction, or decision to be made, there is we also want to make games as accessible as possible. So we want Plants Vs Zombies to play on every crummy netbook and seven-year-old computer your mom has and all of these types of things. And so that tends to mean that we try to work and have good art, but usually make the technical requirements very modest. We've been working at making things that can scale well so that on a good computer, you'll get a really nice experience and it'll still scale down to play on a lower-end computer. But that can be challenging in itself. So usually, we err on the side of not worrying about the graphics being too high-end because our experience is showing that a good game with not very fancy graphics can sell very well, like Plants Vs Zombies. And I think that game has good graphics, but it's definitely limited. It's only got 800X600 resolution and so forth. But on the other hand, we've seen plenty of games in the casual space that have really good graphics and they sell very poorly if they're not a fun game. So accessibility and fun definitely, for us, end up being a first priority over graphics. And especially 3-D or technically impressive graphics versus just good art direction.

James Turner: You would think Nethack and Rogue would be the ultimate proof that you can have good game play without good graphics.

Jason Kapalka: Sure, I love Roguelike games. We have lots of Nethack fans over at PopCap, which seems a bit weird in that they're obviously not very casual in many regards. But yeah, they're good exemplars of that principle that graphics are not as important as game play.

(continue reading)

tags: development, flash, games, gaming, interviews, iphone, popcap, software, steam

Tue, Sep 29, 2009

David Hoover's Top 5 Tips for Apprentices

Finding a Good Mentor is Key

by James Turner
comments: 1

If you're a senior developer with years of experience under your belt, it may be hard to remember what it was like coming out of college with a newly minted CS degree, and entering the workplace. But as David Hoover argues, helping these newcomers to the workforce to succeed can be the difference between effective, motivated developers and confused, discouraged ones. Hoover is the author of the new O'Reilly book Apprenticeship Patterns, and he says that people coming right out of college may, in fact, be less motivated than someone who has been working for a while. "One of my theories is computer science education is really hard, and it's expensive. And so when you're done with it, you're ready to cash in and sit back for a little while. 'Hey, I just spent a lot of money. I spent a ton of time and effort and pain on four years of getting this certificate and okay, now it's time to make that pay off.' You're definitely going to be less incentivized to start a new job, and now realize that you've got so much more to learn still. As opposed to someone who's just coming up, who's going to be at a big disadvantage knowledge-wise, but is probably actually going to be at a big advantage motivation-wise because they're going to be hungry, and just assume that they have to learn everything on their own. Whereas, like I said, some computer science people are going to be disincentivized. They're going to be surprised that they've come into their first job and, geez, they have to learn source control and they have to learn unit testing and they have to learn about these different processes that we use. And some programs prepare you for that stuff; some programs are very theoretical and very outdated. And you just have a ton to learn in your first gig."

According to Hoover, one way to ease the transition into real life development is to use an apprenticeship model. His book draws on his own experience moving from being a psychologist to a developer, and the lessons he's learned running an apprenticeship program at a company called Obtiva. "We have an apprenticeship program that takes in fairly newcomers to software development, and we have a fairly loose, fairly unstructured program that gets them up to speed pretty quickly. And we try to find people that are high-potential, low credential people, that are passionate and excited about software development and that works out pretty well."

Hoover says that most developers have benefited from one or two key people in their career that helped them move along. "For people that had had successful careers, they only point back to one or two people that mentored them for a certain amount of time, a significant amount of time, a month, two months, a year in their careers." He also points out that finding that person may mean looking outside your company. "For me personally, I wasn't able to find a mentor at my company. I was in a company that didn't really have that many people who were actually passionate about technology and that was hard for me. So what I did is I went to a user group, a local Agile user group or you could go to a Ruby user group or a .NET user group, whatever it is and find people that are passionate about it and have been doing it for a long time. I've heard several instances of people seeking out to be mentored by the leader, for me that was the case. One of our prospective apprentices right now was mentored by the leader of a local Ruby user group. And that doesn't necessarily mean you're working for the person, but you're seeking them out and maybe you're just, 'Hey, can you have lunch with me every week or breakfast with me every other week?' Even maybe just talking, maybe not even pairing. But just getting exposure to people that have been far on the path ahead of you, to just glean off their insights."

(continue reading)

tags: agile, apprenticeship, interviews, mentorship, peer programming

Thu, Jul 16, 2009

How NPR is Embracing Open Source and Open APIs

Daniel Jacobson Will Talk About the NPR Open API at OSCON

by James Turner
comments: 7

Podcast running time: 14:14

News providers, like most content providers, are interested in having their content seen by as many people as possible. But unlike many news organizations, whose primary concern may be monetizing their content, National Public Radio is interested in turning it into a resource for people to use in new and novel ways as well. Daniel Jacobson is in charge of making that content available to developers and end users in a wide variety of formats, and has been doing so using an Open API that NPR developed specifically for that purpose. Daniel will talk about how the project is going at OSCON, the O'Reilly Open Source Convention. Here's a preview of what he'll be talking about.

James Turner: Can you start by explaining what NPR Digital Media is and what your role with it involves?

Daniel Jacobson: Sure. NPR is a radio organization, of course, and the Digital Media Group, of which I'm a part, handles, essentially as I describe it, everything that is publishable by NPR that does not go to a radio. So that includes the website, podcasts, API, mobile sites, HD radios, anything that has some sort of visual component to it. So Digital Media as a group is responsible for producing that content, producing all of those distribution channels, managing all of those relationships.

James Turner: And what is your particular role there?

Daniel Jacobson: I manage the application development team that is responsible for all the functional aspects of all of the systems, which includes our CMS, all of the templating engines for the website, for the API, for the podcasts, all of the engines that drive that.

James Turner: Now NPR is an organization that consists of a lot of member stations kind of flying in close formation. What's your relationship with the content producers? To what extent do they have their own stuff, and to what extent do you work together?

Daniel Jacobson: Those member stations are really exactly that; they are members of NPR. They essentially buy NPR programming. They're distinct organizations from us. NPR is a content producer and distributor. They buy our programming and broadcast it out to the world. They also have their own corresponding web teams that can take NPR content and also produce their own content and create their own websites. So in the Digital Media Team, we take a lot of pride and effort in providing services that help those member stations better serve their communities and their listeners and audiences, using NPR content and using their own content. We work with them to try and satisfy their missions. And to the extent that they need NPR services or content, we work hard to try and provide those. The API is one massive step, I think, in making it much easier for them to do what they need to do without a whole lot of intervention from us, where previously they would have to pull in content in much more arduous ways. So the API, I think, is a step in the right direction to make it more of a self-service model.

James Turner: Since you've mentioned the API, that's what you're going to be talking about at OSCON. We've already talked to the New York Times and the way they're opening up their content through APIs. What are you doing with yours?

Daniel Jacobson: Well, we launched ours formally at OSCON last year. And at that time, we essentially opened up our entire archive. So anything that you can get on npr.org is available through the API, to the extent that we have the rights to distribute it. There are some rights restrictions, for example, for receiving photos or stories from sources that we have not cleared rights to redistribute. Those are getting suppressed through a rights filtering engine on our API. Everything else that you can get on npr.org, you can get through the API. That includes full text. It includes images, audio, video, everything like that. Throughout the last year, we have added more features. We included the layer of "mix your own podcast", for example, which allows people to not only get the content in audio form, but also to download it as a podcast-type item. And all of that is available through search terms or totally customized queries. So what the API really does is it enables people to take the content, make widgets, or do whatever they want with essentially everything that is on npr.org and get to audiences that we are not getting to.

(continue reading)

tags: interviews, news, npr, open apis, opensouce, oscon

Tue, Jul 14, 2009

Making Government Transparent Using R

Danese Cooper thinks it will be an important tool in Open Gov

by James Turner
comments: 7

Podcast running time: 26:58

With Open Source now considered an accepted part of the software industry, some people are starting to wonder if we can't bring the same degree of openness and innovation into government. Danese Cooper, who is actively involved in the open source community through her work with the Open Source Initiative and Apache, as well as working as an R wonk for Revolution Computing, would love to see the government become more open. Part of that openness is being able to access and interpret the mass of data that the government collects, something Cooper thinks R would be a great tool for. She'll be talking about R and Open Government at OSCON, the O'Reilly Open Source Convention.

James Turner: Why don't you start by describing where you came from, what you're involved in, and what your interests are?

Danese Cooper: Okay. I'm Danese Cooper. I serve on the board of the Open Source Initiative. I have been serving for the last eight years. And I'm also currently employed by Revolution Computing, which is a start-up focusing on an open source language called R, as in the letter R, that is very useful for analytics and statistical analysis. I'm also an Apache member. And I also serve on an advisory board for Mozilla.

James Turner: One of the two panels you're going to be speaking on at OSCON is on open source and open government. If you could talk a little bit about what interests you about open government and also what open government means to you.

Danese Cooper: Sure. Well, along with a lot of open source people, I got interested in the Obama campaign and in helping President Obama get elected. And part of why he was so compelling was that the vision of how Washington needed to change is pretty close to the way that we think about working collaboratively in open source. The night that he was elected, there was a great little clip on CNET of a Republican commentator actually explaining open source as exactly what I just said. It was a really brilliant little two-minute clip. He pointed at The Cathedral and the Bazaar, that canonical document about how open source works. And he said, "Microsoft is the cathedral. It's their way or the highway. And the bazaar is a bunch of people working together grassroots to collaboratively build the things that they need. And so Obama's basically asking for the government to become open source, and the problem is Washington isn't really like that right now."

So anyway, that's the transformation that has to happen in order for government to really be transparent. To me, open source government is transparent government. There's been an awful lot of shenanigans in recent political history, like the last decade has been pretty crazy in terms of things happening that couldn't be traced back to any source. Even just the way we vote and the way that voting is managed, and the fact that the software that runs the machines that we vote on is not open source so it can't be inspected. And nobody knows quite what it does. There are all of these stories of weird updates to the software that happened right before major elections in states where there are strange results. Transparency, in the same way that it helped the software industry transform, could really help the government transform. So that's what I'm talking about. There's a bunch of other people on that panel. My good friend, Brian Behlendorf, and I co-proposed it. And he's actually taken the next step. He helped found Apache. And he's run off to Washington to work on projects that are interesting to the Obama government to try to figure out how to help them to more open source solutions. And he'll be talking about his progress on that panel. So I think it's a pretty exciting panel.

(continue reading)

tags: interviews, open government, open source, oscon, r, statistics

Mon, Jul 13, 2009

Sequencing a Genome a Week

Radar Talks to OSCON Speaker David Dooling

by James Turner
comments: 3

Podcast running time: 34:51

The Human Genome Project took 13 years to fully sequence a single human's genetic information. At Washington University's Genome Center, they can now do one in a week. But when you're generating that much data, just keeping track of it can become a major challenge in itself. David Dooling is in charge of managing the massive output of the Center's herd of gene sequencing machines, and making it available to researchers inside the Center and around the world. He'll be speaking at OSCON, the O'Reilly Open Source Convention. His talk, titled The Freedom to Cure Cancer: Open Source Software in Genomics, will be about how he uses open source tools to keep things under control, and he agreed to talk about how the field of genomics is evolving.

James Turner: Can you start by describing what it is you do and how you came to be doing it?

David Dooling: Sure. I work at the Genome Center at Washington University in St. Louis. We are one of the handful or so of large scale genome sequencing centers around the world. What that means is essentially we participate in large genome sequencing projects that some people may have heard of, like the Human Genome Project, Thousand Genomes Project, things like that. And involved in that is a lot of data processing, laboratory processing, tracking and all sorts of things, so it's a rather large enterprise.

There are about 300 or so people that work here. And how I came to work here was that about eight years ago, I decided that I wanted to get more into programming and more into open science. So I took a job as a programmer here at the Genome Center and gradually worked my way around to where I am now, where I oversee all of the software development and IT infrastructure here at the Genome Center. And it's a fairly large IT infrastructure.

We have somewhere around three petabytes of storage online, and somewhere north of 3,000 cores in our computational cluster. And we're generating terabytes, tens of terabytes of data, per day with our current sequencing instruments. The sorts of things that we're doing now as we transition from more fundamental evolutionary types of projects, such as the Human Genome Project and subsequent projects like the Mouse Genome Project, we've done things like corn and things of that nature, now we're doing more and more sequencing projects related to medicine and medical sequencing.

Last year, we published the full cancer genome sequence. In doing both the cancer and the normal, we were able to determine the differences between those two genomes and begin to identify what might've possibly caused cancer in that individual. So projects like that. We're also doing projects with metabolic syndromes, like diabetes, and several other cancer projects as well. That's essentially what we're doing and how we're doing it and how I got here.

James Turner: Genomics is an area that seems to be on the steep part of the hockey stick curve right now. In just a decade, we've gone from sequencing one genome over a period of years to doing them routinely. Can you talk a bit about what's enabled this acceleration?

David Dooling: Well, a whole host of things. But I think really at the core was the changing fundamentals of sequencing itself. For a long time, DNA sequencing was based on a process invented by Sanger, sometimes called Sanger Sequencing, sometimes called capillary electrophoresis now because of the last revision of the instruments that were generated. But essentially with that approach, you did reactions in 96 plate wells. You processed sequence in these 96 plate well chunks. And you did reactions in there. You loaded them on the readers, and the readers read out sequence for each of those 96 wells. So that's sort of how you processed it. And at the height of that sort of sequencing, which was only a few years ago, we had about 130 or so of those instruments each churning about 15 to 20 runs per day. Each run gave you 100 pieces of sequences. You had 100 or so machines. And so you got on the order of a few thousand sequence reads, that's what we called them, because of the way the instrument read the information.

Now, since that time, 454 was first [of the new generation of sequencers] and then Solexa came, which was later bought-out by Illumina, and the ABI SOLiD has a platform. There's one from Helicos as well. And then several other third generation, those first being the second generation, sequencers have come out. And what those do is greatly increase the parallelism with which you're able to process DNA and sequence it. So instead of a few thousand runs per day, or a few thousand reads per day, you may get a few million reads per run. And these runs, for some of the platforms, do take a little bit longer. But the parallelism of it increases your throughput tremendously. And so now we have about 35 to 40 of these highly parallel instruments in-house. And with that, we're able to sequence the human genome to complete coverage in less than a week.

So the main driver has been this change in the sequencing technology and the parallelism of it. It's a fundamentally different chemistry, different physics. The flipside of it is that we talked about the hockey stick, and so that hockey stick is the sequencing hockey stick, but it's brought several other hockey sticks along with it, mainly the amount of data that these things generate. And the amount of processing power that is required to process that data has increased greatly as well. Much faster than Moore's Law over the last two years or so. Whereas with those original instruments, you would generate on the order of megabytes per day, now we're doing tens of terabytes per day with these new instruments. And then processing that, instead of taking a single processor a few minutes, it can take a small cluster a few days to actually analyze the data from each of these runs.
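To put the "much faster than Moore's Law" remark in rough numbers, here is a back-of-the-envelope sketch. The specific figures are assumptions chosen for illustration, not values from the interview: 1 MB/day stands in for "on the order of megabytes per day", 10 TB/day for "tens of terabytes per day", and Moore's Law is modeled as one doubling every two years over the decade mentioned in the question.

```python
import math

MB = 10**6   # bytes, decimal megabyte
TB = 10**12  # bytes, decimal terabyte


def growth_factor(old_bytes_per_day, new_bytes_per_day):
    """How many times larger the new daily output is than the old one."""
    return new_bytes_per_day / old_bytes_per_day


def moore_factor(years, doubling_period_years=2.0):
    """Growth over `years` if capacity doubles every `doubling_period_years`."""
    return 2 ** (years / doubling_period_years)


# Assumed representative values (see the paragraph above).
data_growth = growth_factor(1 * MB, 10 * TB)
data_doublings = math.log2(data_growth)

# Assumed time span: ten years of Moore's Law at one doubling per two years.
moore_growth = moore_factor(10)

print(f"data output grew about {data_growth:.0e}x ({data_doublings:.1f} doublings)")
print(f"Moore's Law over 10 years gives about {moore_growth:.0f}x")
```

Under these assumptions the data output grows by roughly seven orders of magnitude, or about 23 doublings, against about five doublings for processing power, which is why storage and compute had to be scaled out rather than simply waited for.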

Those are the main things. The enabling technology was the change in the sequencing chemistry itself. And then what had to come along with that was building these infrastructures to be able to track these things and process these things and store all of this data as the instruments increased in their abilities.

(continue reading)

tags: genomics, informatics, interviews, open source, oscon
comments: 3

 

Thu

Jul 2
2009

James Turner

Patrick Collison Puts the Squeeze on Wikipedia

How to Cram the Wikipedia onto an 8GB iPhone

by James Turner

comments: 9

You may also download this file. Running time: 15:13

Subscribe to this podcast series via iTunes. Or, visit the O'Reilly Media area at iTunes to find other podcasts from O'Reilly.

Think about Wikipedia, what some consider the most complete general survey of human knowledge we have at the moment. Now imagine squeezing it down to fit comfortably on an 8GB iPhone. Sound daunting? Well, that's just what Patrick Collison's Encyclopedia iPhone application does. App Store purchasers of Collison's open source application can browse and search the full text of Wikipedia when stuck in a plane, or trapped in the middle of nowhere (or, as defined by AT&T coverage...) Collison will be presenting a talk on how he did it at OSCON, O'Reilly's Open Source Convention at the end of July, and he spent some time talking to me about it recently.

James Turner: Why don't you start by talking about your background a bit and how you got involved with working with the Wikipedia?


Patrick Collison: I guess I've always been pretty interested in Wikipedia, and I ran my own MediaWiki installations back when I was in school in Ireland. We had our own personal ones and all of the rest. Then in November of 2007, I went to visit my friend in Japan for a month. And in Japan they have all of this incredibly advanced cellular technology and all of the rest. And so because of that, they had very few wireless networks, and my phone didn't work. As a result, I actually had very little access to the Internet. I sort of realized without Wikipedia how little I really knew. And I had just got an iPhone, so I decided to try basically putting a copy of Wikipedia on the phone, so that I'd have it as I was walking around in Japan. Then basically, I spent a significant fraction of my time there in Japan, again, in 2007 writing those applications, say maybe two or three weeks, just firstly trying to decide if it was possible and putting it all together. And then it was released, I think, January of 2008.

James Turner: Now you've also worked on getting it onto the OLPC I understand. How did that occur?

Patrick Collison: I actually didn't do much of the work for this. It was actually a project led by Chris Ball who works both with FreeBSD and with the OLPC project. But I released the code to this application; it was open source from the very start. So it was pretty easy for them to take it and to port it to the OLPC. I mean there are already some applications that allowed you to put a copy of Wikipedia on your computer or something like that, but none had really been optimized for embedded or low power devices or anything like that, which obviously Wikipedia for the iPhone had to be. I think it took about two or three weeks to take the code that ran on the iPhone and then to bring it to the point where it'd run on the OLPC.

James Turner: There are obvious benefits to having Wikipedia on the OLPC, because connectivity is very important in some of those areas. So you'd want to have it local, but outside of the experience that you were just describing, isn't the point of the iPhone that you can just access Wikipedia? What are kind of the advantages of having it locally?

Patrick Collison: I actually find that you spend, or I certainly spend a surprising amount of my time without access to the internet, even with the iPhone. Say for start if you were abroad, I mean everyone knows the horror stories of the data charges AT&T will issue you with if you're roaming. But also just stuff like personally, I find that on a plane or something you have eight hours to not do much. And so I actually end up doing a lot of my Wikipedia browsing there. But even aside from connectivity issues, it actually turns out to be quite a bit faster to use the built-in, cached Wikipedia application as opposed to the website. I mean you can search in real-time with the applications. You just type a couple of characters and tap into your article, rather than firing up Safari or searching for the article in Google; then zooming in so you can tap in, et cetera, et cetera. I and most of the people I know who use the application actually end up using it even when they have internet connectivity. And maybe 20 percent of the time it's pretty useful because it's the only choice.

James Turner: Now just as a point of interest, is this an App Store app or do you have to have a jail-broken phone for it?

Patrick Collison: It was released back when only the jail-broken SDK existed. It was in that initial sort of surge of early applications. I guess the first jail-broken iPhone app, I think, happened in August, and so this was released just under six months later. And then when Apple announced the SDK, I actually originally did not intend to port it to the App Store, just because I was just working on other things at the time and my company had just been bought and so it seemed like a lot of work. But then over the summer, I started getting a huge amount of email from people who had upgraded to the new version of the iPhone OS, and were now missing Wikipedia. And I started getting 20 or so emails from people per day saying they love this application and they were really missing it. Or even people saying they were continuing to use the old version of the OS just for this application. And they really hoped that I would port it so they could eventually upgrade. After receiving these emails for a while, I eventually felt too bad about not porting it. So I spent a couple of days porting it and then released it in the App Store. I wrote it and finished the port in August. And then it took about three months to wade through Apple's approval process. Around the end of October, it was released in the App Store.

(continue reading)

tags: interviews, iphone, open source, oscon, wikipedia
comments: 9

 

Tue

Jun 16
2009

James Turner

Walking the Censorship Tightrope with Google's Marissa Mayer

by James Turner

comments: 4

You may also download this file. Running time: 18:36

Google sometimes finds itself at a difficult crossroad of wanting to make as much information available to as many people as possible, while still trying to obey the laws of the countries they operate in. I recently had a chance to talk to Marissa Mayer, who started at Google as their first female engineer, and has now risen to the ranks of vice president in charge of some of Google's most critical product areas, such as search, maps, and Chrome. We talked about some of Google's future product directions, and also about how Google makes the decision as to when information has to be withheld from the users. Marissa will be delivering a keynote address at the O'Reilly Velocity conference next week.

James Turner: As VP of Search Products and User Experience, you're responsible for a vast swath of the Google product line, from search to maps to Google Labs. You were also the first female engineer at Google. Can you talk a little about how you came to Google and what brought you to where you are today?

Marissa Mayer: Sure. My background is when I was at Stanford, I was doing a symbolic systems degree in artificial intelligence. And I was always somewhat interested in search. I ended up getting an email [from Google] towards the very end of my job interview process. I came to Google and did the interviews. And I came here because I really wanted to put my AI background to use. For about the first year or so, I did. I did a lot of work on categorization, some work on search quality. And then interestingly, we sort of had a void around how the site looked and felt and how it worked. And we tried very hard to hire someone in UI. We thought we needed someone to do UI like one day a week, and do systems engineering the rest of the time. After a few months of failing to hire such a person, Urs Hölzle, our VP of Engineering, pulled me in and said, "Marissa, we've looked through all of the resumes and you have this background in your undergrad on cognitive psychology and philosophy and things. And would you mind dedicating one day a week to UI?" So I did.

I pulled together a volunteer team to help out one day a week while we all still worked on our various AI and systems work. And then, of course, one day became two days which became three days or four days or five days. And I was also programming at that point. I switched over from a lot of the AI work I was doing to programming the front end for Google, working on the Google web server because it was nice for me to be able to not only make decisions about the UI but also to implement them.

And then because I was implementing the changes to the front end, I would go and meet with Larry and Sergey. And they would say, "What's happening on the site this week?" And I would say, "Well, I coded a change that looked like this. And I coded a change that looked like that. And translated this page and it'll go here. And there'll be a pull-down over there for the number of results." And without even realizing it, I did project management before Google had project management, by specifying how things were and looked and communicating to the rest of the company about how these changes would manifest. And then when we got to about 200 or 300 people, Larry and Sergey discovered that most companies have this function, product management, which I didn't know about; they didn't know about prior [to this]. And we decided we should have such a department. They realized there were a few of us around the company that were doing product management even though that wasn't our title. So they started the product management group and got it started earlier. And so I became a PM.

First, I was the PM on Google.com, really broadly across the whole site, because there were only three of us. One of us did the site, which was me. Salar Kamangar did the ads. And Susan [Wojcicki] did the partners. And then as our teams grew, I became the director of consumer web properties, where I still did all of the consumer facing work of the website including branching into Gmail and tool bars and some of these other areas. And then as we progressed, eventually, we restructured so we had search, ads, and apps, because Gmail and the related space of calendar and docs became large enough that it made sense to spin it out. So then I kept the search piece and the properties we have that are more related to search.

James Turner: There's always been other companies trying to take a piece of Google's dominant position in search; how does Google plan to stay a step ahead in search, especially in light of new players like Wolfram and Microsoft's new emphasis on Bing?

Marissa Mayer: Well, we are very focused on search. We have a large team here that's really focused on it and working hard on it. And we're constantly trying to forge in new directions. So we were really excited about the launch of our search options page because we think that allows us to try a lot of new ways to slice and dice and filter results. We were also really excited about Google Squared, which attempts to do automated text extraction from the web and present comparison tables for different entities in response to queries. They're both new ways to search. How do you generate a timeline from a web search? How do you generate a comparison table? And some of our competitors are also looking at those same issues. So I think on the whole, right now, our search is a very healthy ecosystem. There's a lot of interest. There's a lot of activity, and there's a lot of new ground being forged.

James Turner: Google users want the most useful results, but content providers want to get their pages seen, sometimes it seems at any cost. How will Google continue to provide the most useful results in the world of increasingly sophisticated SEO gaming?

Marissa Mayer: Well, we generally -- we really want to be fair in these issues as well as be good to users. We do think that spam is very detrimental to the user experience. We do have an incentive to find spam and remove it from our results. But we want to do something in a way that's very scalable. The web is scaling at an incredible rate. And we don't think it's really viable to try and fight spam in a manual way. So we're always looking for new algorithmic ways to understand new spam techniques, to be able to detect them in an automated way and remove them from our results. And the nice side benefit that scalability has is it's also reasonably objective and fair.

(continue reading)

tags: censorship, google, interviews, maps, news, seo
comments: 4

 

Mon

May 18
2009

James Turner

Velocity Preview - The Greatest Good for the Greatest Number at Microsoft

by James Turner

comments: 4

You may also download this file. Running time: 00:20:26

The psychology of engineering user experiences on the web can be difficult. How much rich content can you place up on a page before the load time drives away your visitors? Get the answer wrong, and you can end up with a ghost town; get it right and you're a star. Eric Schurman knows this well, since he is responsible for just those kinds of trade-off decisions on some of Microsoft's highest traffic pages. He'll be speaking at O'Reilly's Velocity Conference in June, and he recently talked with us about how Microsoft tests different user experiences on small groups of visitors.

James Turner: Why don't you start by describing what your gig at Microsoft is now and what your career path has been there?

Eric Schurman: I'm a principal dev lead for Live Search, what used to be MSN Search. And I started at Microsoft back in the late 90s working in Microsoft's Press organization, where we actually were developing training software that would emulate new Microsoft products, but didn't require those products to be on a user's machine. So, for example, if you had an organization that was running Windows 95, we would have a training system for Windows 98 that would emulate a bunch of the functionality of Windows 98 so that you could deploy it to your people. They could train their people on how to use Windows 98 before they actually deployed it.

I then moved on to the Microsoft Press website, where I became the dev lead for it. I made a few other moves and ended up going to Microsoft.com, where I ran the download center, the Microsoft.com homepage, the product catalog, and a bunch of other places from a dev perspective.

I then moved to what was then MSN Search, back in about 2005, and was there through the MSN to Live transition. At the time, I wasn't working on performance; I was just working on the Live Search application. And it became very obvious that we had some major performance problems. Performance has always been one of my really strong interests, so I took on addressing a lot of those. And when we addressed them, we had very significant improvements in our business metrics. That really surfaced how important performance was to the organization, and I moved into a role where I was really focusing just on performance. I've been in that role now for about two years.

JT: You've worked on at least three very different parts of the Microsoft website. The homepage has lots of hits, fairly static. The download page is a lot of data for long periods of time. Live Search is high volume, but there's also a lot of backend on that. In what ways do you need to architect them differently? And where can you reuse the same lessons?

ES: That's a great question. On the web, you've got different concerns than what you have for client apps. The main things that tend to impact end-user perceived performance on the web are often things about how you've designed your application from a network perspective. So how many different HTTP GET requests are you making? How are those GET requests structured? So, for example, are they serialized? Did you have a JavaScript file that then gets returned to the browser that requests another JavaScript file and another JavaScript file and then some content and then it finally gets rendered? So the number of assets that you request, that's going to be something that's important no matter what product you're doing.

There are other things, like how much script do you have on the page, how much CSS you have on the page, how much actual content are you rendering to the page, et cetera. There are tricks that you can use like combining many different graphics into a single tiled image and sending that down to the browser. It's much faster to send one image to the browser than, say, 20 images. Even if you end up sending the same overall graphics, but combined into one, it's still much faster to send it as one request.
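The trade-off between 20 image requests and one tiled image can be sketched with a toy cost model. This is only an illustration: the 100 ms per-request overhead, the 1 KB/ms bandwidth, the 5 KB icon size, and the function name `load_time_ms` are all assumed values chosen to make the arithmetic easy, and the model treats requests as fully serialized, which is a worst case rather than how a real browser schedules downloads.

```python
def load_time_ms(asset_sizes_kb, per_request_overhead_ms=100.0, bandwidth_kb_per_ms=1.0):
    """Estimate the time to fetch assets one after another.

    Each request pays a fixed overhead (hypothetical round-trip cost)
    plus a transfer time proportional to its size.
    """
    return sum(
        per_request_overhead_ms + size_kb / bandwidth_kb_per_ms
        for size_kb in asset_sizes_kb
    )


# Twenty small icons fetched separately (sizes are made up).
icons_kb = [5.0] * 20
separate_ms = load_time_ms(icons_kb)

# The same bytes combined into one tiled image: one request, one overhead.
sprite_ms = load_time_ms([sum(icons_kb)])

print(f"separate: {separate_ms:.0f} ms, combined: {sprite_ms:.0f} ms")
```

Under these assumptions the bytes transferred are identical in both cases; the difference comes entirely from paying the per-request overhead once instead of twenty times.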

There are also different data volume concerns. They're also different from a business perspective. A lot of what we were sending out from the download center was extremely time critical. We would have an update go out, and we needed to make sure that update was going to be available anywhere in the world within a certain time frame, which required us to handle very high bandwidth, and a very high volume of requests coming into the site that were transferring lots of bits. So that required something totally different than something like the Microsoft.com homepage.

It's also interesting looking at the volume of traffic and how that traffic reflects real users. So, for example, one of the problems that you end up with on both the Microsoft homepage and Live Search is that we have a huge number of bots that are trying to hit the system, lots of people trying to do SEO work are trying to hit search engines to gather information about their site, about competitor sites, about all sorts of things. On the Microsoft.com homepage, it's always under distributed denial of service attacks. It's not a question of how frequently does it happen; it's just what is the rate right now? Also, the Microsoft.com homepage has historically had such a high up-time rate that it's actually hit by a lot of hardware devices simply to check for connectivity to the internet. And so you'd want to treat a request from that kind of "user" very differently from a request that's coming from a real user.

So that's kind of a long, rambling answer to your question. Do you have any areas that you want me to drill in or maybe talk about something else?

(continue reading)

tags: interviews, microsoft, operations, velocity09, velocityconf, web2.0, webops
comments: 4

 

Tue

May 12
2009

James Turner

Google Engineering Explains Microformat Support in Searches

by James Turner

comments: 8

You may also download this file. Running time: 18:24

Today, Google is releasing support for parsing and display of microformat data in their search results. While the initial launch will be limited to a specific set of partners (including LinkedIn, Yelp and CNet reviews), the intent is that very quickly, anyone who marks their pages up with the appropriate microformat data will be able to make their information understandable by Google. This technology would allow you to explicitly search, for example, for only printers that had an average customer review of 3 stars or higher. Initial support will include things such as:

  • Review Ratings
  • Product Prices
  • Personal Details

We talked this morning with Othar Hansson and RV Guha, two of the Google engineers responsible for the new functionality, and you can listen to them discuss it in this exclusive O'Reilly interview.


JAMES TURNER: Why don't you guys start by introducing yourselves?

OTHAR HANSSON: Sure. I'm Othar Hansson, and I'm a tech lead on this project. And I'm in Google's Search UI Group.

RV GUHA: My name is Guha. I'm an engineer at Google and I do stuff across the board.

JT: So can you describe briefly, to start off, exactly what it is you're releasing today?

RVG: Okay. We are asking webmasters who have pieces of data like reviews or people profiles, and in an experimental form, things like information about organizations and products, to put the structured data representing the content on the webpage in a machine-understandable form on the webpage. Typically, what happens is that if you take a website and having created opinions, I can talk about the context of opinions. You would typically have a database in the back-end which has lots of information about products. People write reviews about them. And you get information such as the number of reviews, the average rating of the reviews, the price of the product, who sells it, et cetera, et cetera, et cetera. It's stored in a structured database in your back-end. You then use some scripts to format it into HTML as per the site's design. Now going from the structured data to the HTML is quite straightforward. But going from the HTML back to the structured data in a fashion which works across sites is very, very, very hard. Now our search engine doesn't -- it's very difficult for a search engine to understand -- to sort of get back the structured data for all of the sites. Now if it were to understand that, if it were to understand that this is a review site where the product being reviewed is such and such and it has 30 reviews with an average rating of 3.2 and so on and so forth, we could do a better job of the search. In particular, we could do a better job of presenting the two or three lines of text that appeared as part of the search result so that the user has a better idea of what to expect on that page. And from our experiments, it seemed that giving the user a better idea of what to expect on the page increases the click-through rate on the search results. So if the webmasters do this, it's really good for them. They get more traffic. It's good for users because they have a better idea of what to expect on the page. And, overall, it's good for the web.
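The round trip Guha describes, from a structured record to HTML and back, becomes mechanical once the markup labels each field. The sketch below is an assumption-laden illustration, not Google's parser: the class names (`hreview-aggregate`, `fn`, `average`, `count`) are modeled loosely on review microformats and should be treated as illustrative, the `render` and `extract` helpers are hypothetical, and the values are not HTML-escaped because the sample data contains no special characters.

```python
from html.parser import HTMLParser

# Field names we expect to find as class attributes (illustrative choice).
FIELDS = ("fn", "average", "count")


def render(product):
    """Format a structured record as HTML, labeling each field with a class."""
    return (
        '<div class="hreview-aggregate">'
        '<span class="fn">{fn}</span> rated '
        '<span class="average">{average}</span> from '
        '<span class="count">{count}</span> reviews'
        "</div>"
    ).format(**product)


class ReviewParser(HTMLParser):
    """Collect the text of any element whose class names one of FIELDS."""

    def __init__(self):
        super().__init__()
        self.current = None  # field currently being read, if any
        self.data = {}

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        for name in FIELDS:
            if name in classes:
                self.current = name

    def handle_data(self, text):
        if self.current:
            self.data[self.current] = text.strip()

    def handle_endtag(self, tag):
        self.current = None


def extract(html):
    """Recover the structured record from labeled HTML."""
    parser = ReviewParser()
    parser.feed(html)
    return parser.data


# Sample record using the figures from the interview (30 reviews, 3.2 average).
product = {"fn": "Example Printer", "average": "3.2", "count": "30"}
html = render(product)
recovered = extract(html)
print(recovered)
```

Without the class labels, the extractor would have to guess which number is the rating and which is the review count from each site's particular layout, which is the cross-site problem described above.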

JT: So in some ways, that's in the same way that right now for certain sites, you'll give the internal structure of the site as part of the search result or for shopping results, you'll give price ranges and things like this. This is just, again, enriching and providing more structured -- more than just a snippet, giving more of a structured display of the information on that page?

RVG: Yes. If we have a structured data, we can do lots of things. We're starting off by improving the snippets. It's an absolute no-brainer. It seems to be helping everybody. And, as you know us, we keep playing it on with different ideas and different things. As structured data becomes more prevalent, there's a ton of ideas, both inside Google and outside Google, on how you might improve search.

(continue reading)

tags: google, interviews, microformats, search, seo
comments: 8

 

Thu

May 7
2009

James Turner

Velocity Preview - Keeping Twitter Tweeting

by James Turner

comments: 3

You may also download this file. Running time: 00:10:46

If there's a site that exemplifies explosive growth, it has to be Twitter. It seems like everywhere you look, someone is Tweeting, or talking about Tweeting, or Tweeting about Tweeting. Keeping the site responsive under that type of increase is no easy job, but it's one that John Adams has to deal with every day, working in Twitter Operations. He'll be talking about that work at O'Reilly's Velocity Conference, in a session entitled Fixing Twitter: Improving the Performance and Scalability of the World's Most Popular Micro-blogging Site, and he spent some time with us to talk about what is involved in keeping the site alive.

James Turner: Can you start by describing the platforms and technologies that make Twitter run today?

John Adams: Twitter currently runs on Ruby on Rails. And we also use a combination of Java and Scala, and a number of homegrown scripts that run the site. We also use a lot of open-source tools like Apache, MySQL, memcached.

JT: What type of hardware are you running on?

JA: It's all Linux, so a lot of x86 hardware. I can't tell you the brands or how many.

JT: Do you make any kind of attempt to stay homogeneous in that?

JA: Yes, we do. All of our hardware is very consistent. It makes deployment of new software very easy. And we also use a number of configuration management tools like Puppet to deliver software to those machines.

JT: As anyone can see, Twitter has had a pretty explosive growth, especially recently. Were you prepared for this kind of ramp up?

JA: I don't think so. I mean we're growing week over week in enormous numbers. And we spend a lot of time calculating the growth and scalability of the site to make sure that we can handle the upcoming load.

JT: I mean obviously there are events like Oprah decides she's going to Tweet that are going to be spikes. Do you try to get warning of that stuff?

JA: Yeah. And frequently we know of major events happening. Major events are very predictable like Macworld, even any massive amount of media interaction, we have some fair warning beforehand.

(continue reading)

tags: interviews, operations, twitter, velocity, velocity09, velocityconf, web2.0, webops
comments: 3

 

Tue

Apr 21
2009

James Turner

Where 2.0 Preview - DARPA's TIGR Project Helps Platoons Stay Alive

by James Turner

comments: 12

You may also download this file. Running time: 00:24:14

A modern soldier depends as much on good intel as a reliable rifle. Gone are the days when decision-making happened at the highest levels of command and the non-coms just did what they were told. In a modern world of insurgencies and roadside bombs, the soldier on the ground needs to have as much data as they can, as quickly as they can. And when DARPA decided to try and solve the problem, their solution was TIGR, the Tactical Ground Reporting System. Sam Earp, President of Multisensor Science, works as a consultant to DARPA and Mari Maeda is the program manager at DARPA. Both will be speaking about TIGR at the O'Reilly Where 2.0 Conference in May.

James Turner: Why don't you start by describing the problem that soldiers on the ground face today and how TIGR tries to help?

Mari Maeda: Okay. Well, just as you described, the problem is that in the past, the military has focused on feeding the information up the chain-of-command. The decision-makers are the colonels and generals, and so the soldiers on the ground are just collecting information so they can make big decisions. Now in Afghanistan and Iraq, really it's the patrol leaders, soldiers on the ground, lower echelon soldiers, captains, lieutenants who need to make decisions. Are they going to take this route or the other route? Should they knock on this door or that door? Has this person ever been seen before or cited before? Does he have useful information? All of those day-to-day decisions are being made at the lowest echelon and we really needed a tool to serve those low-level soldiers. And that's why TIGR was created.

JT: Can you describe a little bit about exactly what TIGR gives to the platoon level?

MM: Yes. TIGR has a map-based user interface. And so instead of having a folder full of reports telling you what happened here and who they met with, here's a patrol debrief, instead of having Word files or PowerPoint slides, TIGR's a map-based application where you can go and do searches by defining an area. It could be a rectangle, a circle, a polygon or a route even. And it'll pull back all of the events and people and places, information along that route or in that region. And it ranges from census collection that was done in the location, names of all of the schools, pictures of schools, videos of an attack that might've taken place. Very rich multimedia information will be returned to you for the area that you defined.

And so instead of just writing a patrol report that says this happened and hoping someone might read it, you're just really looking for geospatially relevant information for the mission at hand. If you're going to take this route and you're not familiar with this route that you're thinking of taking, you can look and see how many attacks have taken place; what kind of attacks have taken place; who's been there before. So all of that information is at your fingertips. Sam, do you have anything to add to that?
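The idea of searching reports by a drawn area rather than by folder can be sketched as a filter over geotagged records. Nothing here reflects how TIGR is actually built: the `Event` type, the helper names, and the coordinates are all hypothetical, only the rectangle and circle cases are covered, and the circle test uses the standard haversine great-circle distance with an assumed Earth radius of 6371 km.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    kind: str
    lat: float  # degrees
    lon: float  # degrees


def in_rectangle(event, south, west, north, east):
    """True if the event lies inside a latitude/longitude bounding box."""
    return south <= event.lat <= north and west <= event.lon <= east


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two points."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


def in_circle(event, lat, lon, radius_km):
    """True if the event lies within radius_km of the given center."""
    return haversine_km(event.lat, event.lon, lat, lon) <= radius_km


def search(events, predicate):
    """Return every event accepted by the area predicate."""
    return [event for event in events if predicate(event)]


# Made-up records at made-up coordinates.
EVENTS = [
    Event("school photo", 10.00, 20.00),
    Event("patrol debrief", 10.02, 20.01),
    Event("census record", 11.50, 21.50),
]

nearby = search(EVENTS, lambda e: in_circle(e, 10.0, 20.0, 5.0))
boxed = search(EVENTS, lambda e: in_rectangle(e, 9.9, 19.9, 10.1, 20.1))
print([e.kind for e in nearby], [e.kind for e in boxed])
```

Polygon and route searches would follow the same pattern with a different predicate, for example a point-in-polygon test or a distance-to-path threshold.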

(continue reading)

tags: geo, interviews, military
comments: 12

 

Thu

Apr 16
2009

James Turner

Where 2.0 Preview - Building the SENSEable City

by James Turner

comments: 2

You may also download this file. Running time: 00:23:55

Much of the information we have about how cities work (or don't) comes through direct, intentional observation and study--but could we learn as much or more by mining the data that citizens generate in their day-to-day lives, through cell phone traffic and internet usage? That's one of the questions that Andrea Vaccari, a research associate at the MIT SENSEable City Lab, is trying to answer. Andrea will be speaking on the research that the SENSEable City Project is doing at the O'Reilly Where 2.0 Conference in May.

James Turner: So why don't you start a little bit by talking about what the charter of the SENSEable City Lab is?

Andrea Vaccari: Sure. The SENSEable City Lab is a recent initiative; a new initiative of the Massachusetts Institute of Technology which focuses on studying how digital technologies are revolutionizing the way we live in cities. And, therefore, how we can leverage these technologies; how we can make use of it through understanding how cities are using it; how we can design better cities. And then we can create cities that are more sustainable, more livable and automatically more efficient.

JT: A lot of data that governments gather about cities -- the example I think of is the little things they put across the roads to find out traffic going over a road, but that's almost like just a point source data. Can you compare that to the kind of data that you're able to extract through the records you can get access to?

AV: Sure. The problem with past data in all aspects of the urban planning and social studies is that the data is usually punctual, so it refers to very specific points in space and also in time. And that's because the methods that were used to gather this information were very expensive. They required either to deploy infrastructures or to employ people to count manually cars, people, vehicles. And, therefore, it was impossible to have a real-time flow of information. What we are trying to do is to leverage the pervasive systems that enhance our cities today. And I'm referring to telecommunication networks, wireless networks, transportation systems or any other sort of digital system that interacts on a daily basis -- on a real-time basis -- with the citizens. What happens is that with these systems, interactions between the user and the system creates logs of their activity. And these logs can be used to understand the urban dynamics, to understand how people move in living cities and how cities themselves evolve in time.

JT: Now, you showed me some of the examples of the datasets that you've been playing with, and it seems like largely it's cell phone data and wifi data and then secondarily, things that are more voluntary like Flickr uploads.

AV: Yes.

JT: Wifi data you can pretty much get to a hotspot. And as Google has demonstrated with cell phone data, you can get fairly good positioning. But what kind of resolution do you get out of say cell phone data?

AV: Sure. The resolutions that we get for the cell phone is aggregated at the antenna level. So we don't get information about the individuals because we strongly respect privacy. And what we basically know is how many calls, how many text messages, how much traffic is served by each antenna in a city. And, of course, we know the position of the antenna and we can estimate the coverage of these antennas. So we can fairly understand what are the dynamics going on in the area of coverage. But, again, we don't get information about individuals.
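As a rough illustration of the antenna-level aggregation Vaccari describes, the sketch below counts calls, text messages, and data traffic per antenna and keeps no subscriber identifiers. The record layout, field names, antenna positions, and sample values are all hypothetical; none of them come from the SENSEable City Lab's data.

```python
from collections import defaultdict

# Hypothetical activity records: (antenna_id, kind, amount).
# No subscriber identifier is stored, so only per-antenna totals are recoverable.
RECORDS = [
    ("A1", "call", 1),
    ("A1", "sms", 1),
    ("A2", "call", 1),
    ("A1", "data_kb", 250),
    ("A2", "data_kb", 100),
    ("A1", "call", 1),
]

# Hypothetical antenna positions as (latitude, longitude).
ANTENNAS = {"A1": (41.9028, 12.4964), "A2": (41.8902, 12.4922)}


def aggregate_by_antenna(records):
    """Sum activity per antenna and per kind of activity."""
    totals = defaultdict(lambda: defaultdict(int))
    for antenna_id, kind, amount in records:
        totals[antenna_id][kind] += amount
    # Convert the nested defaultdicts to plain dicts for easier inspection.
    return {antenna: dict(kinds) for antenna, kinds in totals.items()}


if __name__ == "__main__":
    for antenna, kinds in sorted(aggregate_by_antenna(RECORDS).items()):
        print(antenna, ANTENNAS[antenna], kinds)
```

Pairing each total with the antenna's known position, as the loop at the end does, is what would let an analyst place the activity on a map without ever seeing an individual user.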

(continue reading)

tags: cities, geo, interviews, sensors, where 2.0
comments: 2

Wed

Apr 15
2009

James Turner

Where 2.0 Preview - Tyler Bell on Yahoo's Open Location Project

by James Turner
comments: 2

You may also download this file. Running time: 00:28:07

Subscribe to this podcast series via iTunes. Or, visit the O'Reilly Media area at iTunes to find other podcasts from O'Reilly.

Location can be a vague concept to pin down. To a surveyor, location means latitude and longitude accurate to a few millimeters, while to a cab driver, a street address would be much more useful. If you're German, I can tell you that I live in the United States. To a Californian, I live in New Hampshire. And to someone from Manchester, I live in Derry. Unfortunately, the way that location is currently stored and presented online is both non-uniform and frequently at a level of precision inappropriate for the end-user. That's part of what Open Location is trying to fix. Tyler Bell, who took his doctorate from Oxford to Yahoo, is currently the product lead for the Yahoo Geo Technology Group. At O'Reilly's Where 2.0 Conference, he'll be discussing Open Location.

James Turner: So first off, can you describe what the Geo Technologies Group does?

Tyler Bell: The Geo Technologies Group at Yahoo oversees all technologies that relate to geography and geographic information. So it's largely self-evident. But this is what I mean by that: it's really we own and oversee the maps and mapping technologies. So the visualizations and placements of geographically informed data. We also own user location technologies. So here, we're dealing with different methods of detecting user location, managing user location, and ensuring that users receive geo-relevant results whenever they log onto Yahoo or use a Yahoo service. And then lastly, we have something which is slightly more esoteric. It's called the Geoinformatics Group. And that's the organization which uses geography to inform data. And we do this without ever showing a map. So it's really how we add value and power to information wholly based upon where things are and where our users are.

JT: That's like returning relevant search information to what you know about the user's location.

TB: That's correct. That's the end product of search groups consuming the geo technologies services on the back-end. But what we also need to do is actually organize the geographic information. So instead of searches, they're the specialists at Yahoo about matching user intent to the results that are returned; it's our job on the Geoinformatics Group, for example, to say that when a user queries against Springfield or they're searching for Springfield, which of the countless Springfields in the United States, in the world do you mean? So we need to be able to recognize that this is a place. We need to identify all of the places of a particular place name. And then we need to be able to do a so-called geo-geo disambiguation to ensure that when you mean Springfield, when you mean Campbell, when you give us a city name, which is otherwise nonspecific, we are very likely to return the most direct and accurate results.

(continue reading)

tags: geo, interviews, where 2.0, yahoo
comments: 2

Thu

Apr 9
2009

James Turner

Where 2.0 Preview - Pelago's Jeff Holden on Creating Stories Out of Your Life

by James Turner
comments: 1

You may also download this file. Running time: 00:23:05

Tools like Twitter and Facebook have let people share in near real-time what they are doing. Now with a new generation of location-aware mobile devices, you can tell your friends or the entire world where you're doing it. Jeff Holden's company, Pelago, is one of many trying to come up with a killer application that blends location, images, text, and social networking to create a new kind of group awareness. Before starting Pelago, Jeff had a long career as the Senior Vice President of Consumer Websites for Amazon and before that, the Director of Supply Chain Optimization Systems. He'll be speaking at O'Reilly's Where 2.0 Conference on "Footstreams: Clickstreams for the Physical World."

James Turner: Pelago's first product is Whrrl. Can you start by describing what Whrrl is and what the experience to date has been?

Jeff Holden: Yeah. Sure. So Whrrl actually, there's a little complexity there because we just launched Whrrl V. 2.0, which is the prize we're focused on. And Whrrl V. 2.0 is a real-time storytelling product for people's daily lives.

JT: When you say storytelling, I've seen a lot of people talk about storytelling with these new social network things. What concretely does that mean to you?

JH: The most important aspect of what we mean by that is the organization of the content as the story unit. So the unit of content inside Whrrl is the story. And a story for us is something that has a beginning and an end. It can have multiple people involved in the story who can all share and contribute to a single story together. It has a location associated with it. And then people basically inject into those containers, those story containers, photos and text. As they're doing that, that's actually being shared out to any number of friends that they choose. And those friends can then jump in and actually comment on the story which then becomes part of the story as well. And so that's what we mean by it is we're focused on this -- I think some people use that term generically. We're using it very specifically to refer to the core unit of content in Whrrl.

JT: From a practical standpoint, apart from people who are chronic Twitterers and would just use it every moment of their life, what would you see a typical story being?

JH: What we're seeing right now is a lot of the families are using the product to share stories. And, in fact, just this morning Alison Sweeney, she's the host of the Biggest Loser and she was on Days of Our Lives for years. She's a really famous soap opera actress. She just started using Whrrl today. And she visited the set of Days of Our Lives with her family. And so it's actually entitled, "Family Visits Days." And we feature that story because it's such a cool -- and she did it publicly. And it's a really cute story about her kids and the visit with the cast of Days of Our Lives. So we're seeing a lot of that kind of thing. We're seeing people at a more general level are viewing kind of very, very funny things like Melissa Pierce, who's a really very successful video blogger and just general blogger; she's done a number of very, very funny stories. She did one called "Lonely Bear" about this gummy bear lost in the world. And through a sequence of photos and text updates, she told the story of Lonely Bear and kind of left it dangling and was going to have a follow-up segment. And is actually going to be collaborating with people to build the next story.

So people are using it in different ways. And it's really kind of unleashing a lot of creativity.

(continue reading)

tags: geo, interviews, where 2.0
comments: 1

Thu

Apr 2
2009

James Turner

Where 2.0 Preview: Eric Gunderson of Development Seed on the Promise of Open Data

by James Turner
comments: 2

You may also download this file. Running time: 00:16:54

When we think about how government uses geographic information, we tend to think about USGS maps or census data, very centralized and preplanned projects meant to produce a very specific set of products. But Development Seed believes that there is a lot more that could be done if these types of data could be mashed up easily with each other as well as with alternate sources such as social networks. Eric Gunderson, President of Development Seed, will be speaking at the O'Reilly Where 2.0 Conference in May, and he recently took some time to speak to us about the potential benefits that open access to government data brings.

James Turner: Can you start by talking a bit about Development Seed and how you came to be involved with it?

Eric Gunderson: We're a strategy organization in Washington, D.C., and what sets us apart from a lot of other strategy organizations in town is the fact that we do a lot of the building. And we build [it] all on open source tools. We particularly work with international development organizations, and the knowledge silos there are pretty fierce. For the last couple of years, we've worked on a lot of projects where you have really good data and bad technology's slowing it down. So we work on a host of projects whether they're internal intranets or external mapping sites.

JT: If we focus, first of all, on our government, what are the problems with how the government manages data today?

EG: Right. Well, first, a lot of times it's not even released. I mean people aren't putting it out there in any kind of way where we can access it. But even when it is, for example, like a mandate by an agency to report on food prices or a certain statistic, sometimes it's baked into PDFs. And it's put out in a way that you can't really do much with it, you know, interact with it, parse it out, discover what's there. So that said, that's starting to change. I mean there's been some folks that are saying, "Wait a minute. We've already collected this data, and if we spend a little extra time packaging it, we can put it out there. And it will essentially have a whole new lifecycle and start adding value back to the community--the taxpayers that paid for it."

(continue reading)

tags: eric gunderson, geo, interviews, where 2.0 conference
comments: 2

Thu

Mar 26
2009

Kurt Cagle

Web 2.0 Expo Preview: Will Wright, Sims and Simulations

by Kurt Cagle
comments: 7

You may also download this file. Running time: 00:24:27

Will Wright has been the foundational genius behind a thirty-year string of blockbuster games, from the early Raid on Bungeling Bay in 1984 to the first truly fun urban simulation, SimCity, a game "universe" that let players create and manage their own cities, dealing with everything from balancing budgets and battling crime to dealing with the aftermath of alien attacks. This game was later expanded to SimCity Societies to better explore the larger social factors that shape society.

From there he delved deeper into the lives of the individual inhabitants of those cities with the Sims, a virtual "dollhouse" that gives players the ability to shape the eponymous sim-people, their houses, careers and relationships (and in subsequent installments, let them start businesses, party, go to college, have pets, and take vacations, among many other activities).

In 2008, Wright produced Spore, where the players can "play gods" - raising new life from Sim-ooze to intergalactic civilizations in a freeform multiplayer environment that's evolving nearly as fast as the spores themselves. In June 2009, Wright is scheduled to release the much-awaited Sims 3, in which, for the first time, the Sims world comes together in a fully immersive environment, perhaps the full merger of The Sims and SimCity.

Wright will be speaking on the Sims and games in general at the Web 2.0 Expo in San Francisco. O'Reilly editor Kurt Cagle caught up with Will Wright to ask a few questions.

Kurt Cagle: You've been doing this a long time. I can remember distinctly playing Raid on Bungeling Bay back in the old Commodore days in the late '80s. The thing I find fascinating is every game that you've done in the last 25 years or so would more accurately be considered a simulation rather than a game. What have you found most fascinating about simulations as games?

Will Wright: Well, as a kid, I spent a lot of time getting models and an inordinate amount of time dealing with the plastic and with models. That kind of got me into robotics which was kind of a different form of modeling. I bought my first computer, which was an Apple 2, to connect to my robots to control the programs on that. It wasn't too long before I started doing little simulations of the robots I was working on on the computer, and I started realizing this was kind of a new way to make the models that I'd kind of grown up making, except these models had dynamics underneath them rather than just static structure.

(continue reading)

tags: gaming, interviews, modeling, sims, simulations, web 2.0, will wright
comments: 7

Tue

Mar 3
2009

James Turner

Marc Bohlen: Finding the Intersection of Art and Technology

by James Turner
comments: 0

You may also download this file. Running time: 00:17:44

Artist-Engineer Marc Bohlen uses some fairly advanced technology to express his artistic visions. It's not often you find an artist with a degree from CMU in robotics, or an engineer with a Masters in Art History. Bohlen's projects explore how people and technology interact, ranging from the bickering robots Amy and Klara, to his latest project, the Glass Bottom Float. In advance of his appearance at the Emerging Technology Conference in March, Bohlen talked to us about how he approaches art, and just what art is.

James Turner: This is James Turner for O'Reilly Media. I am speaking today with Marc Bohlen, who seems to collect degrees like some people collect comic books. He has a Bachelors in Electrical Engineering from the University of Colorado, a Masters in Art History from the University of Zürich, a Masters in Robotics from CMU, and an MFA, also from CMU. He's been a visiting professor in universities from Zürich to California. His work explores the boundaries between Machine Intelligence, technology, art and society. He will be speaking at O'Reilly's Emerging Technology Conference in March. Thank you for taking the time to talk to us.

Marc Bohlen: My pleasure.

JT: So let me begin by asking: do you consider yourself an artist, an engineer, a social commentator or a melange of all of them?

MB: A melange of all of them, but I think artist-engineer is quite precise actually.

JT: What led you to that fusion of art and technology?

MB: Well, I was working in Art History, on Marcel Duchamp and Joseph Beuys at the time, trying to figure out how the materials that they used in their work generated meaning. So the traditional art historian methodology just didn't work anymore. I was forced to start to look into domains of knowledge that were not part of artist textbooks or repertoire. So I wandered off into engineering, trying to solve those problems, and in the process of doing that I jumped into this field which, at the time of the late 80's and early 90's, started to formulate itself as an art technology complex, art technology endeavors, and I never looked back since then.

(continue reading)

tags: art, emerging telephony, engineering, interviews, technology
comments: 0

Mon

Feb 23
2009

James Turner

ETech Preview: On The Front Lines of the Next Pandemic

by James Turner
comments: 0

You may also download this file. Running time: 00:23:20

With all of the stress and anxiety that humanity deals with on a daily basis--confronting the dangers of global warming, the perils of a financial system in meltdown and the ever-present threat of terrorism--it may be easy to forget that there's yet another danger lurking out there ready to destroy mankind: the threat of a global pandemic. But although you and I may have driven thoughts of Ebola and the like from our minds, Dr. Nathan Wolfe worries about them every day. Dr. Wolfe founded and directs the Global Viral Forecasting Initiative, which monitors the transfer of new diseases from animals to humans.

He received his Bachelor's degree at Stanford in 1993 and his Doctorate in Immunology and Infectious Diseases from Harvard in 1998. Dr. Wolfe was awarded the National Institutes of Health International Research Scientist Development Award in 1999 and a prestigious NIH Director's Pioneer Award in 2005. He'll be speaking at the O'Reilly Emerging Technology Conference in March. His session is entitled, "Viral Forecasting." Thank you for joining us.

Dr. Nathan Wolfe: My pleasure.

James Turner: So why don't we start by talking about the Global Viral Forecasting Initiative. How is it different from the work that the CDC and the WHO and similar organizations do monitoring disease spread?

NW: Well, what we do is we actually focus on the interface between humans and animal populations. When we looked back and investigated the ways in which disease got started, the ways that pandemics really originated, what we found is that really the vast majority of these things are animal diseases. So rather than monitoring for illness, at which point it could potentially be too late, we've taken it one step backward. We actually focus on people who have high levels of contact with animals. And we set up large groups of these individuals and monitor the diseases that they have, as well as the diseases in the animal population. So the idea is to be able to catch these things a little bit earlier.

JT: The last disease that really made a big splash with the media was Ebola, earlier in this decade. But we really haven't heard much recently. Have things calmed down as far as new and novel diseases? Or are we just hearing less about outbreaks these days?

NW: Well, I mean I think we've had really substantive important pandemics. If you take a look at SARS, for example. SARS really only infected about 1,200 individuals, but its impact was tremendous. It was billions of dollars of economic impact all throughout the world. Even in a place like Singapore, where you had a small number of cases, you had an incredibly substantive financial impact. And then, of course, right now we have H5N1 which is -- they call it the bird flu. Actually, most influenzas are bird influenzas, so it's a little bit of a misnomer. But H5N1 is a virus which is spreading around the world in birds. And if it does make a transition into humans, which some bird flu will over the next 20 to 30 years, it could be incredibly devastating. So I think that these are kind of constant and present dangers. They're things that are increasing over time simply because of the way that we're connected as a human population.

(continue reading)

tags: emerging tech, interviews, pandemics, viruses
comments: 0

Thu

Feb 19
2009

James Turner

ETech Preview: Science Commons Wants Data to Be Free

by James Turner
comments: 4

You may also download this file. Running time: 00:31:04

John Wilbanks has a passion for lowering the barrier between scientists who want to share information. A graduate of Tulane University, Mr. Wilbanks started his career working as a legislative aide, before moving on to pursue work in bioinformatics, which included the founding of Incellico, a company which built semantic graph networks for use in pharmaceutical research and development. Mr. Wilbanks now serves as the Vice President of Science at Creative Commons, and runs the Science Commons project. He will be speaking at The O'Reilly Emerging Technology Conference in March, on the challenges and accomplishments of Science Commons, and he's joining us today to talk a bit about it. Good day.

John Wilbanks: Hi, James.

JT: So science is supposed to be a discipline where knowledge is shared openly, so that ideas can be tested and confirmed or rejected. What gets in the way of that process?

This photograph is licensed to the public under the Creative Commons Attribution-Share Alike 3.0 license by Fred Benenson

JW: Well, most of the systems that scientists have evolved to do that sharing, confirming and rejecting evolved before we had the network. And they're very stable systems, unlike a lot of the systems that we have online now, like Facebook. For science to get on the Internet, it has to really disrupt a lot of existing systems. Facebook didn't have to disrupt an existing physical Facebook model. And the scientific and scholarly communication model is locked up by a lot of interlocking controls. One of them is the law. The copyright systems that we have tend to lock up the facts inside scientific papers and databases, which prevents a lot of the movement of scientific information that we take for granted with cultural information.

Frequently, contracts get layered on top of those copyright licenses, which prevent things like indexing and hyperlinking of scholarly articles. There's also a lot of incentive problems. Scientists and scholars tend to have an incentive to write very formally. And the Internet, blogging, email, these are all very informal modalities of communication.

(continue reading)

tags: creative commons, data, interviews, science
comments: 4