Entries tagged with “privacy” from O'Reilly Radar

Thu

Nov 19
2009

Brian Ahier

Health gets personal in the cloud

Google Health Beta and Microsoft's My Health Info

by Brian Ahier | @ahier | comments: 13

Healthcare is one of the biggest industries in the world. The United States spends over 17% of its GDP on healthcare and the issue of the industry's future is being hotly debated in Congress. Whatever happens to other elements of health reform, health information technology will play a key role in moving us towards the goal of bending the cost curve and improving quality and clinical outcomes. A Personal Health Record (PHR) is one way that patients can have some control of their own health data, while providing an interoperable platform for sharing relevant clinical data between providers. Healthcare is changing rapidly and there are some important trends worth watching.

Healthcare in the near future will be quite different than it is today. Web enabled technology is already changing the way medicine is practiced. As the digital nation comes of age we will see new opportunities, and new challenges, bringing healthcare in America into the 21st century. Health consumers will come to expect they will have control over their own health data. Having secure, interoperable access to clinical data will allow patients to partner with their care providers in new ways incorporating Web 2.0 principles.

For example, Google announced at the Health 2.0 conference that they have entered into a partnership to provide telehealth services through their Google Health platform using MDLiveCare. With the integration of MDLiveCare technology, Google can provide a service that offers patients access to doctors from remote locations, via webcam or telephone, into its personal health record offering. This will be particularly valuable for those who are caring for their loved ones from far away. My family is scattered around the country and caring for our mother with advanced stage Alzheimer's was quite a challenge that would have benefited from this type of service. Here is a screenshot of Google Health: google-health.jpg

"Patients remember less than 25% of what they're told when they consult with a doctor," said Bob Smoley, CEO, MDLiveCare, in the statement. "By directly synchronizing the information that's shared…we're able to provide patients with a convenient solution to review their physician or therapist encounters."

(continue reading)

tags: data portability, electronic medical records, health 2.0, health care, healthcare, phr, privacy | comments: 13

Tue

Nov 10
2009

Andy Oram

Converting to Electronic Health Records: fits and starts

by Andy Oram | @praxagora | comments: 26

The people of the United States are finally pulling together around the goals of reducing health care costs (by far the highest per capita in the world) and improving outcomes (we have the worst health of any developed country). Everyone seems to recognize the critical importance of data and communications in these efforts. So several of us at O'Reilly Media, having been involved with information technologies for some time, are tracking the issues that come up in deploying computer technology in health care--not just to streamline payments, not just to facilitate access by doctors to records, but actually to create new ways to deliver and track health care.

I recently attended a forum on how my state, Massachusetts, is facilitating the move to Electronic Health Records, a prerequisite for many things doctors, patients, and insurance companies can do to improve health. It's notable that the chief sponsor of the event, the Massachusetts Health Data Consortium, derives a lot of its support from insurance companies. Lots of invective has been thrown at these companies recently, but the questions of technology can pull together the insurers, providers, and patients in a common quest. [AO: My original blog said that insurance companies set up MHDC, but this was incorrect.]

My own understanding of the progress and frustrations in deploying health care technology was enhanced by the conversations I had that day and the statistics bandied about.

(continue reading)

tags: data portability, electronic medical records, health care, privacy | comments: 26

Mon

Nov 2
2009

Nat Torkington

Four short links: 2 November 2009

Inside Botnets, Creating Choropleths, Privacy Simplified, Massively Machiavellian Online Social Gaming

by Nat Torkington | @gnat | comments: 1

  1. Your Botnet is My Botnet (PDF) -- 2008 USENIX Security paper analysing >70G of data gathered when security researchers hijacked the Torpig botnet. A major limitation of analyzing a botnet from the inside is the limited view. Most current botnets use stripped-down IRC or HTTP servers as their command and control channels, and it is not possible to make reliable statements about other bots. In particular, it is difficult to determine the size of the botnet or the amount and nature of the sensitive data that is stolen. One way to overcome this limitation is to “hijack” the entire botnet, typically by seizing control of the C&C channel. [...] As a result, whenever a bot resolves a domain (or URL) to connect to its C&C server, the connection is redirected or sinkholed. This provides the defender with a complete view of all IPs that attempt to connect to the C&C server as well as interesting information that the bots might send.
  2. cartographer.js -- build thematic maps using Google Maps. To be precise, you can build a choropleth, which is my word of the day. (via Simon Willison)
  3. Making Privacy Policies Not Suck (Aza Raskin) -- interested in developing a standard set of privacy policy components the way that Creative Commons has created a standard set of copyright license components.
  4. Scamville: The Social Gaming Ecosystem of Hell (TechCrunch) -- many of those games on Facebook that your friends play are evil. To get in-game money or objects, they'll let you take a survey but at the end you're signed up for crap you never wanted. Related: this article on monetizing social networks which talks about social gaming's business model.

tags: creative commons, gaming, google maps, mapping, privacy, research, security, social | comments: 1

Mon

Oct 26
2009

Andy Oram

What sociologist Erving Goffman could tell us about social networking and Internet identity

by Andy Oram | @praxagora | comments: 4

I just finished Erving Goffman's classic sociological text, The Presentation of Self in Everyday Life. A friend told me to read this for an exploration into what "identity" means online, and I did find that the book offers some useful frameworks.

I have to admit, to start with, that it's a rather distasteful work: personally, I don't see my entire life as a performance and everyone around me as an audience. That seems to be just what Goffman wants me to do. (He calls this attitude his "dramaturgical perspective.")

Furthermore, the book was published in 1959, just before the social revolution of the 1960s exploded the expectations of formality it documents--all the assumptions about proper behavior, social distinctions, making a good impression, and so forth. These distinctions remain, of course, but people tend to behave in ways that consciously disavow differences in class and status instead of highlighting them (at least in the United States).

Goffman's underlying framework is still valid, though, and it casts a useful light on some of the dilemmas of going online.

(continue reading)

tags: Erving Goffman, identity, privacy, reputation, The Presentation of Self in Everyday Life, trust | comments: 4

Wed

Oct 14
2009

Andy Oram

Vendor Relationship Management workshop

by Andy Oram | @praxagora | comments: 4

Nobody knows you as well as you do. Or do they? Let's run a test. Do you know what percentage of your food bill went to processed products? Or what type of coupons (store coupons, newspaper coupons, etc.) is most likely to get you to switch brands? I bet someone out there knows.

This kind of data mining is the modern companion to Customer Relationship Management, which is the science of understanding customers and trying to get repeat business. CRM can offer many valuable benefits, but ultimately the control lies with the vendor, not the customer.

This bothers long-time marketing maverick and Cluetrain Manifesto coauthor Doc Searls. Several years ago he thought up an alternative that would put the data and the control back in the hands of customers, and called it Vendor Relationship Management. He's been pursuing that dream for two years as a Berkman Center fellow at Harvard, and this week he ran the second workshop hosted by Harvard on the topic.

I dropped in and out for a few hours and picked up some ideas, annotating them (as always) with ideas of my own.

(continue reading)

tags: Berkman Center, commerce, CRM, customers, Doc Searls, economics, identity, P2P, peer-to-peer, privacy, reputation, trust, Vendor Relationship Management, VRM | comments: 4

Tue

Aug 18
2009

Nat Torkington

Four short links: 18 August 2009

iPhone App Backstory, Cookie Resurrection, The Entrepreneuralism Lickmus test, and An Interesting Database

by Nat Torkington | @gnat | comments: 2

  1. The Making of the NPR News iPhone App -- interesting behind-the-scenes look, with sketches and all. Station streams, however, presented a larger challenge. To begin with, NPR didn't have direct stream links for any of its stations, so we built a Web spider that identified and captured more than 300 iPhone-compatible station streams. After that first pass, we worked with our station representatives to manually test each stream. In the process they found enough new streams to double our database. All of these streams are delivered to the app from NPR's Station Finder API. (via mattb on Twitter)
  2. You Deleted Your Cookies? Think Again (Wired) -- Flash keeps its own cookies, which are harder to delete. Several services even use the surreptitious data storage to reinstate traditional cookies that a user deleted, which is called ‘re-spawning’ in homage to video games where zombies come back to life even after being “killed,” the report found. So even if a user gets rid of a website’s tracking cookie, that cookie’s unique ID will be assigned back to a new cookie again using the Flash data as the “backup.” (via Simon Willison)
  3. Would You Lick It? (Rowan Simpson) -- clever example of what it takes to be an entrepreneur.
  4. FluidDB -- a shared "in the cloud" database built around tags: an object is a container for a set of tags which are name:value pairs, tag names have simple namespaces (e.g., "gnat/review" is the "review" tag in my namespace), all objects are world readable and writable but there are ACLs for tags, values can be any type (string, number, URL, Excel spreadsheet), and there's a simple query language. I'm curious to see what applications spring up around shared data. They're in limited alpha, controlling the # of users, so register now to play before everyone else.
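
The cookie "re-spawning" described in item 2 is easy to picture as a tiny state machine. What follows is a minimal sketch, not code from any real tracker: the function name `visit`, the `uid` key, and the two dictionaries standing in for the browser's cookie jar and Flash storage are all hypothetical.

```python
# Toy model of cookie "re-spawning". All names here are hypothetical
# illustrations; no real tracking library or Flash API is being used.

def visit(http_cookies: dict, flash_store: dict, new_id: str) -> str:
    """Return the tracking ID a site would see on this visit.

    http_cookies -- stands in for the browser's ordinary cookie jar
    flash_store  -- stands in for the separate Flash storage
    new_id       -- ID to assign if the visitor really looks brand new
    """
    tracking_id = http_cookies.get("uid")
    if tracking_id is None:
        # Cookie missing (first visit, or the user deleted it):
        # fall back to the copy kept in Flash storage, if there is one.
        tracking_id = flash_store.get("uid", new_id)
        http_cookies["uid"] = tracking_id   # the cookie is "re-spawned"
    flash_store["uid"] = tracking_id        # keep the backup in sync
    return tracking_id


cookies, flash = {}, {}
first = visit(cookies, flash, new_id="abc123")
cookies.clear()                             # the user deletes their cookies
second = visit(cookies, flash, new_id="zzz999")
print(first, second)                        # the same ID both times
```

Deleting the ordinary cookie changes nothing as long as the second store survives, which is the point the Wired piece makes.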

tags: big data, databases, design, flash, iphone app, news, npr, privacy, security, startups | comments: 2

Tue

Aug 11
2009

Brady Forrest

Locational Privacy: The EFF Weighs in on Safeguarding Your Location

by Brady Forrest | @brady | comments: 4

Increasingly our devices know where we are and are able to share that information. This is a trend that will enable many new services, but at the same time puts the consumer and the service provider at risk. The consumer is at risk of their "future self" forgetting that they are being tracked and then having their location recorded unintentionally. The company is put at risk just by having this data stored. If they have user data then it is subject to subpoena or unintentional releases.

The EFF has weighed in on this trend with a timely whitepaper: On Location Privacy and How to Avoid Losing It Forever. The paper includes a number of scenarios with actionable solutions and a number of reasons why companies should care. The scenarios are:

Anonymous payment and credentials - Many toll roads use electronic transponders to extract payment from drivers. These systems are not necessarily designed with the driver's privacy in mind. This Boston Globe article from 2007(!) talks of EZ-Pass records being subpoenaed in a case (there are many other articles on Google -- some going back as far as 1997). The EFF suggests letting users pay with an anonymous and encrypted form of electronic cash (ecash). This will still allow the service provider to track flow and estimate revenue on a realtime basis, while protecting their users. For those working on mobile payments or with sensors this is a scenario (and potential solution) to pay attention to. If you need to make sure that only certain people can gain access then you may need to use anonymous credentials to preserve locational privacy.
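
To make "anonymous electronic cash" a little more concrete, here is a minimal sketch of one classic building block, a Chaum-style RSA blind signature. The post does not say which ecash scheme is meant, so treat the choice as an assumption. The primes are toy-sized, there is no padding, and every name and value is made up for illustration; nothing here is safe for real payments. It needs Python 3.8 or later for the modular inverse via `pow`.

```python
# Toy RSA blind signature (Chaum-style). Illustration only: tiny primes,
# no padding, hypothetical names. Requires Python 3.8+ for pow(x, -1, n).
from math import gcd

# Toll authority's RSA key pair (toy values).
p, q = 61, 53
n = p * q
phi = (p - 1) * (q - 1)
e = 17                      # public exponent
d = pow(e, -1, phi)         # private exponent, known only to the authority


def blind(token: int, r: int) -> int:
    """Driver hides the token before sending it to be signed."""
    return (token * pow(r, e, n)) % n


def sign_blinded(blinded: int) -> int:
    """Authority signs (after taking payment) without seeing the token."""
    return pow(blinded, d, n)


def unblind(blinded_sig: int, r: int) -> int:
    """Driver strips the blinding factor, leaving a signature on the token."""
    return (blinded_sig * pow(r, -1, n)) % n


def verify(token: int, sig: int) -> bool:
    """Anyone with the public key can check the token is genuine."""
    return pow(sig, e, n) == token % n


token = 1234                # driver's secret token, smaller than n
r = 7                       # blinding factor, must be coprime to n
assert gcd(r, n) == 1

blinded = blind(token, r)
sig = unblind(sign_blinded(blinded), r)
print(verify(token, sig))   # the toll booth accepts the token
```

The authority can still count tokens and revenue, but it never sees `token` at signing time, so it cannot link the token presented at the toll booth to the account that paid for it.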

Location-based search - Often when a user does a search from their cellphone they are identified along with their location and their query. The user "needs" to be identified so that any personal information can be shared. The EFF correctly depicts this interaction as such:

"This is Frank's Nokia here. I see the following five WiFi networks with the following five signal strengths". The service replies "okay, that means you're at the corner of 5th and Main in Springfield". Then your device replies, "What burger joints are nearby? Are any of Frank's friends hanging out nearby?".

This is something that all of us with smartphones (and who use Loopt, Brightkite, Twitter, or use Find My iPhone to update Latitude) are doing multiple times a day. An alternative method would be to have the phone send its location and query anonymously. The service can return that data along with a set of encrypted data for that location. If any of it is aimed at the user they will be able to decrypt it. The EFF depicts this interaction as such:

"Hi, this is a mobile device here. Here is a cryptographic proof that I have an account on your service and I'm not a spammer. I see the following five wireless networks." The service replies "okay, that means you're at the corner of 5th and Main in Springfield. Here is a big list of encrypted information about things that are nearby". If any of that encrypted information is a note from one of Frank's friends, saying "hey, I'm here", then his Nokia will be able to read it. If he likes, he can also say "hey, here's an encrypted note to post for other people who are nearby". If any of them are his friends, they'll be able to read it.

The company still gets anonymized location data and the query, while delivering the same features. The problem with this scenario is that the web (and mobile in particular) favors speed. If a mobile service added several seconds to send down an encrypted payload of data that is much larger than needed then that service will lose users (or never gain them). The mobile handsets and networks that most of us are using now are too limited to handle anything more than the bare minimum.
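
The anonymous exchange the EFF describes, and the payload cost that worries me, can both be seen in a small sketch. Everything below is an assumption made for illustration: the helper names `seal` and `try_open` are hypothetical, the shared keys are generated on the spot, the "cryptographic proof of account" step is left out entirely, and the cipher is a toy built from the standard library's SHA-256 and HMAC rather than a vetted encryption scheme.

```python
# Toy sketch of "download every encrypted note for this location and keep
# the ones you can open". Not real cryptography: a vetted authenticated
# cipher should be used in practice. All names are hypothetical.
import hashlib
import hmac
import os


def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Derive a pseudo-random byte stream from key and nonce."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:length]


def seal(key: bytes, message: bytes) -> bytes:
    """Encrypt a note so only holders of `key` can read or recognize it."""
    nonce = os.urandom(16)
    cipher = bytes(a ^ b for a, b in zip(message, _keystream(key, nonce, len(message))))
    tag = hmac.new(key, nonce + cipher, hashlib.sha256).digest()
    return nonce + tag + cipher


def try_open(key: bytes, blob: bytes):
    """Return the plaintext if the blob was sealed with `key`, else None."""
    nonce, tag, cipher = blob[:16], blob[16:48], blob[48:]
    expected = hmac.new(key, nonce + cipher, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        return None
    return bytes(a ^ b for a, b in zip(cipher, _keystream(key, nonce, len(cipher))))


frank_and_friend = os.urandom(32)   # key Frank shares with one friend
strangers = os.urandom(32)          # key used by people Frank doesn't know

# What the service holds for "5th and Main": opaque blobs it cannot read.
blobs_at_location = [
    seal(strangers, b"meet at the bar"),
    seal(frank_and_friend, b"hey, I'm here"),
]

# Frank's phone downloads everything for the location and keeps what opens.
opened = (try_open(frank_and_friend, blob) for blob in blobs_at_location)
readable = [note for note in opened if note is not None]
print(readable)
print(sum(len(blob) for blob in blobs_at_location), "bytes downloaded")
```

Note that the phone has to pull down every blob for the location, including the ones it cannot open. With two notes that is trivial; at a busy intersection it is exactly the oversized payload described above.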

What's the value of locational privacy to a service?
If a service provider does not ever receive location data, the EFF points out, that company is potentially giving itself a competitive advantage. If you don't log it then you can't be subpoenaed to provide that user data and you (probably) won't ever inadvertently reveal someone's location incorrectly. The EFF is correct: not having to answer subpoenas can save costs for companies and not having a well-publicized privacy debacle is priceless.

The paper also points out that people are becoming increasingly cognizant of privacy issues and that you can champion privacy as a selling point. I am not sure that I buy this argument completely. I think that quite often people don't realize their location and identity are being recorded. So though there may be increasing awareness it's not a selling point that will get a company much right now. Based on the adoption of social location services, I think that people are more concerned with how their location is shared with other people on and off the service than whether it is logged at all.

When considering these solutions, we need to take into account what the impact on the user experience will be. If it requires too much extra work or is not very transparent on the user's part then the solutions may end up killing the product before it starts. I fear that the encrypted payload used to anonymize local search would hamper any mobile service that tried it -- given our current set of handsets and networks (at least in the US).

Personally, I am a fan of sharing (and in some cases storing) my location data with a limited set of third-party services. However, the services that exist right now are lacking. They do not necessarily make it clear how long they will keep the data or how it will be shared with others. I often do not have the ability to delete my data from a service. I want to share my location (within bounds) and I pay attention to when I do so, but I do fear that my future self will forget--and I think that service providers have a responsibility to protect their users from themselves.

(Disclosure: I am a member of the EFF and Tim is a former Board Member)

tags: eff, geo, privacy | comments: 4

Fri

Aug 7
2009

Nat Torkington

Four short links: 7 August 2009

Recovery.gov, Meme tracking, RFID Scans, Open Source Search Engines

by Nat Torkington | @gnat | comments: 1

  1. Defragging the Stimulus -- each [recovery] site has its own silo of data, and no site is complete. What we need is a unified point of access to all sources of information: firsthand reports from Recovery.gov and state portals, commentary from StimulusWatch and MetaCarta, and more. Suggests that Recovery.gov should be the hub for this presently-decentralised pile of recovery data.
  2. Memetracker -- site accompanying the research written up by the New York Times as Researchers at Cornell, using powerful computers and clever algorithms, studied the news cycle by looking for repeated phrases and tracking their appearances on 1.6 million mainstream media sites and blogs [...] For the most part, the traditional news outlets lead and the blogs follow, typically by 2.5 hours [...] a relative handful of blog sites are the quickest to pick up on things that later gain wide attention on the Web. Confirming that blogs and traditional media have a symbiotic relationship, not a parasitic one. (via Stats article in NY Times)
  3. Feds at DefCon Alarmed After RFIDs Scanned (Wired) -- RFID badges make for convenient security, and for convenient attack. Black hats can read your security cards from 2 or 3 feet away, and few in government are aware of the attack vector. To help prevent surreptitious readers from siphoning RFID data, a company named DIFRWear was doing brisk business at DefCon selling leather Faraday-shielded wallets and passport holders lined with material that prevents readers from sniffing RFID chips in proximity cards.
  4. A Comparison of Open Source Search Engines and Indexing Twitter -- Detailed write-up of the open source search options and how they stack up on a pile of Tweets. While researching for the Software section, I was quite surprised by the number of open source vertical search solutions I found: Lucene (Nutch, Solr, Hounder), Sphinx, zettair, Terrier, Galago, Minnion, MG4J, Wumpus, RDBMS (mysql, sqlite), Indri, Xapian, grep … And I was even more surprised by the lack of comparisons between these solutions. Many of these platforms advertise their performance benchmarks, but they are in isolation, use different data sets, and seem to be more focused on speed as opposed to say relevance. (via joshua on Delicious)
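
The phrase tracking mentioned in item 2 can be sketched very roughly as follows. This is a simplification for illustration, not the researchers' method: the posts are invented, the 2.5-hour gap is planted in the toy data to mirror the figure quoted above, and matching on exact four-word phrases is an assumption.

```python
# Rough sketch of phrase tracking across news sites and blogs.
# The data and the exact-match n-gram approach are illustrative assumptions.
from collections import defaultdict

# Hypothetical posts: (kind of site, hours since midnight, text)
posts = [
    ("news", 9.0, "officials said the plan would put lipstick on a pig"),
    ("blog", 11.5, "so they want to put lipstick on a pig, apparently"),
    ("blog", 12.0, "lipstick on a pig again"),
    ("news", 13.0, "unrelated story about the weather"),
]


def ngrams(text, n=4):
    """Return the set of n-word phrases in a text, ignoring case and commas."""
    words = text.lower().replace(",", "").split()
    return {" ".join(words[i:i + n]) for i in range(len(words) - n + 1)}


# phrase -> {kind of site: earliest time the phrase was seen there}
first_seen = defaultdict(dict)
for kind, hour, text in posts:
    for phrase in ngrams(text):
        earliest = first_seen[phrase].get(kind)
        if earliest is None or hour < earliest:
            first_seen[phrase][kind] = hour

# For phrases that appear in both worlds, how far do blogs trail the news?
lags = {
    phrase: times["blog"] - times["news"]
    for phrase, times in first_seen.items()
    if "news" in times and "blog" in times
}
print(lags)
```

A positive lag means the news outlet had the phrase first; a negative one would flag a blog that got there ahead of the mainstream.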

tags: big data, gov2.0, meme wars, open source, privacy, rfid, search, security, transparency, twitter, visualization | comments: 1

Mon

Aug 3
2009

Andy Oram

Privacy and open government: conversations with EPIC and others about OpenID

by Andy Oram | @praxagora | comments: 2

A few days ago I proposed a way to offer more privacy to people visiting government web sites. This blog builds on that proposal, which was largely technical, by examining the policy and organizational issues that swirl around it.

My ideas are informed by a discussion I had with Lillie Coney, Associate Director of the Electronic Privacy Information Center. The blog is also inspired by two comments on the earlier blog and brief email I exchanged with one commenter, which intertwine with Coney's in intriguing ways.

As I said in the first blog, my proposal focused on a very narrow question driven by the Obama Administration's interest in revising a memorandum from 2000 concerning the use of cookies in web browsers. The proposal suggested a way to better approach anonymity, but didn't look at the related social and political issues:

  • The kinds of privacy and the degree of privacy people want
  • When it's appropriate to make visitors identify themselves, or at least to provide some persistent identity
  • Whom people trust to maintain identity information

This blog offers a number of points about those issues. The sections are:

(continue reading)

tags: democracy, EPIC, governance, Government 2.0, identity, OMB, open government, OpenID, privacy, transparency | comments: 2

Thu

Jul 23
2009

Nat Torkington

Four short links: 23 July 2009

Wave Fed, Fake Steve, Vanish and Reconnoiter

by Nat Torkington | @gnat | comments: 0

  1. Google Wave Federation Protocol -- the interesting part of Wave for me is the system for keeping databases coherent. There's a reference implementation.
  2. I shouldn't have yelled at that Chinese guy so much -- the post that redeemed Fake Steve Jobs in my eyes. We all know that there's no fucking way in the world we should have microwave ovens and refrigerators and TV sets and everything else at the prices we're paying for them. There's no way we get all this stuff and everything is done fair and square and everyone gets treated right. No way. And don't be confused -- what we're talking about here is our way of life. Our standard of living. You want to "fix things in China," well, it's gonna cost you. Because everything you own, it's all done on the backs of millions of poor people whose lives are so awful you can't even begin to imagine them, people who will do anything to get a life that is a tiny bit better than the shitty one they were born into, people who get exploited and treated like shit and, in the worst of all cases, pay with their lives.
  3. Vanish -- time-limited encryption in a Firefox plugin.
  4. Reconnoiter -- holy cow web console and analytics for data centers, from the magic Theo Schlossnagle. He built the screenshots for his OSCON presentation, graphing streams of live performance data from dozens of data centers, while on a Virgin America flight.

tags: analytics, china, data center, encryption, google wave, opensource, privacy | comments: 0

Fri

Jun 19
2009

Timothy M. O'Brien

Dramatic Increase in Number of Tor Clients from Iran: Interview with Tor Project and the EFF

by Timothy M. O'Brien | comments: 2

You may also download this file. Running time: 00:06:15

Anonymous proxies are in the news this week as Iranians are using proxies outside of Iran to communicate information about ongoing protests to others within the country. I've received several queries this week from non-technical colleagues about proxy servers. Is it legal to run a proxy server? Does running a proxy server violate my agreement with my broadband provider? I decided to track down some experts and get some perspective on different proxy servers and the laws surrounding them. In this entry, I speak with Andrew Lewman, the Executive Director of the Tor Project, about Tor and I also get some legal guidance from Peter Eckersley of the Electronic Frontier Foundation.

In this interview I ask Andrew to briefly introduce Tor and talk about some interesting usage statistics that show adoption of this anti-surveillance technology from within Iran. He answers a question about whether Tor is "unstoppable" and comments on the legality of running a Tor node. For the full interview, listen here.

The Tor Project

First, what is Tor? From The Tor Project:

Tor is free software and an open network that helps you defend against a form of network surveillance that threatens personal freedom and privacy, confidential business activities and relationships, and state security known as traffic analysis. Tor protects you by bouncing your communications around a distributed network of relays run by volunteers all around the world: it prevents somebody watching your Internet connection from learning what sites you visit, and it prevents the sites you visit from learning your physical location. Tor works with many of your existing applications, including web browsers, instant messaging clients, remote login, and other applications based on the TCP protocol.

When you run a Tor node, you are adding another node to a grid of computers that are used to establish random encrypted paths between each node to satisfy any given request. Law enforcement, military agencies, intelligence networks, journalists, and dissidents frequently use Tor to bypass restrictions and avoid surveillance. Andrew Lewman, Tor's Executive Director, wanted to be very clear that the Tor Project itself does not take positions on conflicts, and does not involve itself in resisting oppressive regimes. In response to a question about traffic from Iran, Andrew Lewman produced the following data:

New client connections from within Iran have increased nearly 10x over the past 5 days. Overall, Tor client usage seems to have increased 3x over the past 5 days. There are a lot of rough numbers in these statements, and they are very conservative. However, the source data we're reviewing continues to show these results.

For more information, see Andrew's blog post from last night: "Measuring Tor and Iran". Here's a graph from Andrew Lewman of Tor client count over the past few days. It appears that Tor is becoming an increasingly popular way for people in Iran to use the network to avoid surveillance.

new_tor_clients_from_iranian_ip_space.png

But is it legal? The Legality of Running a Proxy Server

Peter Eckersley, Staff Technologist at the EFF, took some time to answer some very simple questions about EULAs, Tor, and the legality of running a proxy server.

Q: Various broadband providers state in EULAs that a customer must secure the equipment used to provide access to the Internet. What is the position of the EFF with regard to the legality of these EULAs? Are people breaking the law by providing an open access router?

Peter Eckersley: It's impossible to comment on broadband EULAs in general; each of them has different specific language and ISPs deploy them in different ways. We aren't aware of any case in which a broadband subscriber was sued for running an open wireless router, a proxy, or similar technology for sharing their connection with others.

Q: The last update to the Tor FAQ from the EFF on the Tor site was from 2005. Have there been any developments with the EFF in relation to Tor? Since 2005 is there more clarity as to the legality of running an Exit Node in a Tor network?

Peter Eckersley: The EFF Tor FAQ still reflects our opinions about the legality of Tor. It hasn't changed since 2005 because there haven't been any published cases or other events that have changed our views.

Q: What advice would the EFF have for anyone new to setting up a proxy server this week (as many have done to support protestors in Iran)? Is it legal? What issues do people need to be aware of?

Peter Eckersley: EFF's advice at this point is that people should consider setting up Tor bridge nodes or Tor routers instead of proxy servers. Several thousand new proxy servers have appeared in the past week, but we fear that unencrypted proxies leave Iranians vulnerable to surveillance and continued censorship by the Iranian government. SSL encrypted proxies are better in this respect, but they are harder to set up than Tor routers, and there are some reports that the Iranian government has succeeded in blocking access to at least some encrypted proxies.

Fixed Typo @ 3:23 PM Central Saturday: One of my questions for the EFF had a rather important typo - I had typed Iraq instead of Iran. Fixed.

tags: encryption, government, privacy, security | comments: 2

Fri

Jun 12
2009

Nat Torkington

Four short links: 12 June 2009

by Nat Torkington | @gnat | comments: 2

  1. New Media Challenges: Legal and Policy Considerations for Federal Use of Web 2.0 Technology (Center for American Progress) -- report on the issues around Web 2.0 use in Government, which include privacy, security, Public Records Act, advertising, etc. See also It's Not the Campaign Anymore: How the White House Is Using Web 2.0 Technology So Far from the same group.
  2. Government Data and the Invisible Hand -- Ed Felten has written a fantastic piece on the relationship between data, presentations of the data, and the government departments that produce the data. It is full of powerful recommendations: The best way to ensure that the government allows private parties to compete on equal terms in the provision of government data is to require that federal websites themselves use the same open systems for accessing the underlying data as they make available to the public at large. (via timoreilly on Twitter)
  3. Fast Modularity Community Structure Inference Algorithm -- This algorithm is being widely used in the community of complex network researchers, and was originally designed for the express purpose of analyzing the community structure of extremely large networks (i.e., hundreds of thousand or millions of vertices). The original version worked only with unweighted, undirected networks. I've recently posted a version that works on weighted, undirected networks. (via mattb on Delicious)
  4. First Driver for USB 3.0 -- After a year-and-a-half's worth of work, Intel hacker Sarah Sharp announced that Linux will be the first operating system supporting USB 3.0. (via deusx on Delicious)
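
For readers wondering what "modularity" measures in item 3, here is a small sketch that scores a given split of an unweighted, undirected network. It only computes the modularity value Q, the quantity that community-finding algorithms of this kind try to maximize; it does not implement the fast algorithm itself. The example graph and the function name are made up.

```python
# Modularity Q of a partition of an unweighted, undirected graph:
#   Q = sum over communities c of  (e_c / m) - (d_c / (2m))^2
# where m is the number of edges, e_c the edges inside community c,
# and d_c the total degree of the nodes in c. Example data is invented.
from collections import defaultdict


def modularity(edges, community):
    """edges: list of (u, v) pairs; community: dict mapping node -> label."""
    m = len(edges)
    internal = defaultdict(int)   # edges with both ends in the same community
    degree = defaultdict(int)     # summed degree of each community's nodes
    for u, v in edges:
        degree[community[u]] += 1
        degree[community[v]] += 1
        if community[u] == community[v]:
            internal[community[u]] += 1
    return sum(internal[c] / m - (degree[c] / (2 * m)) ** 2 for c in degree)


# Two triangles joined by a single edge.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
split = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
lumped = {node: "a" for node in range(6)}

print(modularity(edges, split))    # clearly positive: a good split
print(modularity(edges, lumped))   # zero: one big blob tells us nothing
```

Splitting the graph at the bridge scores higher than lumping everything together, which matches the intuition that the two triangles are separate communities.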

tags: gov 2.0, government, graphing social patterns, linux, open source, privacy, social software, web 2.0 | comments: 2

Thu

May 14
2009

Andy Oram

Credit card company data mining makes us all instances of a type

by Andy Oram | @praxagora | comments: 19

The New York Times has recently published one of their in-depth, riveting descriptions of how credit card companies use everything they can learn about us. Any detail can be meaningful: what time of day you buy things, or the quality of the objects you choose.

The way credit collectors use psychology reminds me of CIA interrogators (without the physical aspects of pressure). In fact, they're probably more effective than CIA interrogators because they stick to the basic insight that kindness elicits more cooperation than threats.

So who gave them permission to use our purchase information against us? What law could possibly address this kind of power play?

There's another disturbing aspect to the data mining: it treats us all as examples of a pattern rather than as individuals. Almost eleven years ago I wrote an article criticizing this trend. The New York Times article shows how much we've lost from what we consider essential to our identity--our individuality.

Update

This article drew six comments in a few hours--thoughtful and valid comments, which have made me set down attitudes into words. Now we can put the attitudes under a light and see what makes sense, or doesn't, to readers.

The article contained two levels of criticism: a criticism of data mining to build up composite pictures of individuals, and a criticism of the use of data accumulated from routine transactions to manipulate those individuals.

Building up a composite picture

Of course, a company that reaches out and does any marketing has to target people. Someone who bought the O'Reilly book Even Faster Web Sites (sorry about the plug) might appreciate a notification about our upcoming Velocity conference, which was founded by the book's author and covers the same topics. Someone who bought a book on a totally different subject wouldn't want or respond to the notification. O'Reilly does this kind of targeting, like most companies, and until everybody participates in truly frictionless information exchanges, companies will have to continue doing it.

Aggregated information is useful too. Organizations that mine public data for evidence of health epidemics can identify likely sites and investigate them further. The data mining is understood to provide an approximation of the truth.

Where I see a problem is when the increasing quantity of constant information refinement shades over into a qualitative change. There's a difference between a campaign targeted to 500 likely customers and a campaign targeted to one.

At some point the composite portrait starts to look so much like a person that corporate decision makers can begin to believe it is the person. The portrait becomes like a replicant, or like the statues that came to life in myths from Pygmalion to Pinocchio.

Joseph Weizenbaum, creator of the classic Eliza program, was shocked to see that people treated his "doctor" program like a human interviewer. There were plenty of computer programs that prompted the user with questions and gave varied responses based on the answers, but none had imitated a person so realistically.

Nowadays, nobody would be drawn in by Eliza. And perhaps companies and customers alike will get used to composite portraits. Perhaps the companies will send their composite to each of us and we can update it to make it more accurate. That will be a very different world, though.

Now we can turn to the next level, manipulation.

Manipulation

I've read numerous accounts in biographies and articles about interrogations, and talked to a couple people who have undergone interrogations. I haven't been on either side of an interrogation, but I've been deposed for a court case. All these situations remind me vividly of the exchanges reported in the New York Times article.

In these exchanges, a well-armed caller is laying, like a silkscreen, a composite over the real person and trying to manipulate the result. It's not exactly a case of asymmetric knowledge (because at least in theory, a customer could also learn a lot about a company and use that knowledge to manipulate it). It's more insidious: an employee carrying out a precise initiative on behalf of a company--a machine in the service of a goal--approaching the targeted customer in an informal manner that brings out a natural, human, empathetic reaction in the customer.

Interrogation always takes place in the context of an open or implied threat--there would be no reason for making the contact otherwise--but as I mentioned in the article, the interrogation goes best when the threat is raised only rarely and strategically. A feigned sympathy and heart-to-heart engagement is the path to the most desired outcome.

In a sense, now, the employee has become the replicant. He or she is using a careful counterfeit of human responses to induce the behavior he or she is paid to induce. This is ethical when dealing with a criminal, although even then US law limits (based on the Fourth Amendment) the gathering of relevant information by the interrogator beforehand. I question how ethical it is in a business situation, especially when exploiting information given by the customer for entirely different purposes.

tags: bill collectors, credit cards, data mining, data retention, mining, privacy
comments: 19

 

Tue

Apr 21
2009

Nat Torkington

Four short links: 21 Apr 2009

by Nat Torkington @gnat
comments: 0

Space arrays, mobile hell, book scanners, and open source brains:

  1. Great Brazilian Sat-Hack Crackdown (Wired) -- Satellite hackers in Brazil are bouncing ham signals off a disused US military satellite array. Truck drivers love the birds because they provide better range and sound than ham radios. Rogue loggers in the Amazon use the satellites to transmit coded warnings when authorities threaten to close in. Drug dealers and organized criminal factions use them to coordinate operations. [...] "Nearly illiterate men rigged a radio in less than one minute, rolling wire on a coil." As William Gibson said, "the street finds its own uses for things." One man's space junk is another man's Make project. (via BoingBoing)
  2. My Students, My Cellphone, My Ordeal -- there's probably a market selling lightweight forensic tools to schools, specifically to avoid scenarios like this poor man's.
  3. DIY High Speed Book Scanner From Trash and Cheap Cameras (Instructables) -- $300 of parts gets you a reasonably high-quality scanner. It doesn't have an automatic page turner, but it's still a step up on "open the scanner lid, change the page, close the lid, hit scan, wait, [repeat until braindead]". We have a huge legacy of analog, and we're going to need consumer-grade consumer-priced systems if we are to rip-mix-burn our cultural legacy. What would the Google Books settlement look like if we all had high-speed scanners to do to our bookshelves what iTunes did to our CD shelves? (via BoingBoing)
  4. OpenCog Brainwave Projects in Google's Summer of Code -- in case you think GSoC is all about GNOME apps getting alternate shortcuts for DVORAK keyboards, there's some esoteric stuff being approved. I wish that when I was a college student someone had paid me to work on an Application of Pleasure Algorithm Project.

tags: book search, brain, google, hardware, make, mobile, open source, privacy
comments: 0

 

Thu

Apr 9
2009

Nat Torkington

Four short links: 9 Apr 2009

by Nat Torkington @gnat
comments: 1

Scifi, audiences, transparency, and the peril of public life. No links tomorrow, as I'll be preparing for our village fete:

  1. The Fantastic That Denies It's Fantastic: Science Fiction Talk at the Royal Institution -- Matt Jones's fascinating notes from this talk by two academics make thought-provoking reading. “SF is a response to the cultural shock of discovering our marginal place in an alien universe” ... “an attempt to put the stamp of humanity back on the universe”
  2. Visualize Your Audience (Rowan Simpson) -- If you don’t think it’s a big deal for your site to be broken or off line while you make changes … think of all of the people who happen to be visiting at that point and imagine what it would feel like to have them all in the room with you while you flick the switch. No matter how small the number it would probably feel like a lot of people. And, you might be motivated to get the site back up more quickly if they were all standing behind you impatiently looking over your shoulder.
  3. Attribution and Affiliation on All Things Digital (Waxy) -- this reminds me how rare it is to see someone write about an Internet blowup after actually talking to the parties involved.
  4. We Live in Public (Caterina Fake) -- Caterina watched "We Live in Public" by Ondi Timoner and concurs with Jason Calacanis's musings about the Internet's ability to promote the worst behaviour: This kind of sociopathic behavior -- treating people like things -- is one of the most horrifying aspects of online interactions, and something that its very nature promotes.

tags: journalism, privacy, science, usability
comments: 1

 

Fri

Mar 13
2009

Nat Torkington

Four short links: 13 Mar 2009

by Nat Torkington @gnat
comments: 0

Museums, Labs, Businesses, and Hash--all in today's four short links:

  1. Shelley Bernstein Talks About the Brooklyn Museum at the National Library of New Zealand (Paul Reynolds) -- I've written about Shelley's work before. Brooklyn [Museum] is not about using social media as just another marketing and visitor experience tool-set. Rather, as Bernstein said last night, Brooklyn Museum itself is now a social network - that is its job - to be a center for the community to have a conversation. Wonderful to see New Zealand continuing to learn from the best.
  2. Google Labs India -- interesting projects, including Digital Noticeboard and SMS Channels (Google ID required to view the latter). Interesting to see the projects worked on in different countries. The latter is like Mozes.
  3. Privacy and Free Speech, It's Good for Business (PDF!) -- Northern California ACLU have produced a book aimed at businesses that frames free speech issues as a business good: The practical tips and real-life business case studies in this Guide will help you to avoid having millions read about your privacy and free speech mistakes later. The advice is straightforward and specific, not of the vague and "don't be evil" variety. Give users an opportunity to defend their anonymity. Provide notice, within no more than seven days of receipt of a subpoena, to each user whose personal information is sought, and inform the user of her right to file a motion to quash (fight) the subpoena. Give the user at least thirty days from the time notice is received to file a motion to quash the subpoena. (via BoingBoing)
  4. pHash, The Open Source Perceptual Hash Library -- a perceptual hash is a signature for a file, built in such a way that two files that represent similar things (e.g., two photographs of the same poster) get similar signatures. (via Joshua's delicious stream)
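
To make the perceptual-hash idea in item 4 concrete, here is a minimal sketch of the simplest member of the family, an "average hash", compared with Hamming distance. It is not pHash's algorithm or API. The function names are hypothetical, and the "image" is assumed to be a tiny grid of grayscale values already scaled down from a real picture.

```python
def average_hash(pixels):
    """Return a bit string: '1' where a pixel is brighter than the image mean."""
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    return "".join("1" if p > mean else "0" for p in flat)


def hamming_distance(a, b):
    """Count the positions at which two equal-length bit strings differ."""
    if len(a) != len(b):
        raise ValueError("hashes must have the same length")
    return sum(x != y for x, y in zip(a, b))


# A 4x4 grayscale "poster", a slightly brighter photo of it, and its negative.
poster = [
    [10, 200, 10, 200],
    [200, 10, 200, 10],
    [10, 200, 10, 200],
    [200, 10, 200, 10],
]
brighter = [[min(255, p + 20) for p in row] for row in poster]
negative = [[255 - p for p in row] for row in poster]

print(hamming_distance(average_hash(poster), average_hash(brighter)))  # 0
print(hamming_distance(average_hash(poster), average_hash(negative)))  # 16
```

A cryptographic hash would change completely under the brightness shift. The perceptual signature does not change at all, while the negative image differs in every bit.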

tags: copyright, google, mobile, open source, privacy, social web
comments: 0

 

Thu

Nov 20
2008

Nat Torkington

Web Meets World: Privacy and the Future of the Cloud

by Nat Torkington @gnat
comments: 4

Yesterday I gave a talk to the Privacy Forum in Auckland, New Zealand, titled Web Meets World: Privacy and the Future of the Cloud. The talk was intended as a scene setter for a discussion with the audience, about 70 lawyers, technologists, consultants, and public policy wonks. They responded well to the challenge, and we talked about the nature of privacy, how expectations change over time, trust.salesforce.com, and more. The presentation is embedded below, and can be downloaded (CC-Attribution-ShareAlike) from Slideshare (I recommend expanding the preso to full-screen so you can read the notes, which contain the text of the talk).

tags: presentations, privacy, ubicomp, web as platform, web services
comments: 4

 

Fri

Jul 11
2008

David Recordon

Is SocialMedia Overstepping Facebook's Privacy Line?

by David Recordon @daveman692
comments: 6

SocialMedia is an advertising network which places ads within social applications such as those on Facebook and MySpace. SocialMedia claims to be more effective in this type of advertising, due to a patent-pending technology they've developed named FriendRank. SocialMedia CEO Seth Goldstein claims that SocialMedia ads can pay up to 2.5 times more than traditional ads within social networks and that early tests show people are 200 times more likely to respond to a "social ad". See CNET's coverage of SocialMedia's FriendRank launch for more detailed information.

This sounds very compelling, but immediately raises serious questions around privacy and whether a Facebook user knows that SocialMedia is using their profile information in this way. While technology certainly makes this possible, are user expectations being set correctly? Facebook's Beacon functionality faced an uproar at its launch earlier this year not for the technology it provided, but rather for upsetting expectations around privacy, information sharing, and ultimately ad targeting. So how is SocialMedia getting access to the type of information required to create such a compelling social advertising network?

Facebook provides a customizable public profile page for each user (mine is here) which by default makes your name, picture, and a few friends publicly available. SocialMedia could be, and most likely is, using this public information, just as anyone else could. Multiple sources including ValleyWag and others familiar with the ad platform say that SocialMedia doesn't stop there. Rather, they're very quietly also accessing information from Facebook Platform applications directly. This story was originally broken by The Social Times a few weeks ago:

So how does SocialMedia display these targeted ads outside of Facebook? Through a collection of data via applications in combination with images obtained via user public profiles and unique cookies they can piece together who you are and who some of your friends are. This is off of Facebook.

The question then is, are social applications properly disclosing the fact that they give your information to SocialMedia, and is that action covered by a clear privacy policy? This is not about the technology behind how SocialMedia might access this information, but rather making sure that the technological implementation matches user expectations. We can start by looking at the process of adding an application on Facebook which does not appear to use SocialMedia for advertising:

If you've ever installed a Facebook application, you're familiar with this screen, which prompts you to grant explicit permissions to each and every application you choose to install. It should be noted that Facebook Platform does not have any affordance for allowing an application to share your data with a third party. Facebook's Developer Terms of Service explicitly prohibit such sharing, and the technological implementation of the Facebook Platform API makes it extremely likely that sharing such data would sometimes involve sharing a developer's secret key with SocialMedia as well. All of this is explicitly and strictly prohibited by Facebook's Developer Terms of Service (emphasis is mine for readability):

"Facebook Platform" means a set of APIs and services provided by Facebook that enable websites and applications (collectively, "Applications") to retrieve data relating to Facebook Users made available by Facebook and/or retrieve authorized data from other Applications. The term "Facebook Platform" includes any data, images, text, content, code, APIs, tools or other information or materials provided by Facebook through or in connection with such APIs and services (collectively, the "Facebook Properties").

...

5) You may not sell, resell, lease, redistribute, license, sublicense or transfer all or any portion of the Facebook Properties, or use or store any Facebook Properties for any purpose other than as specifically authorized herein.

The bottom line is that though this might seem like an obscure technical issue, sharing user activity and profile information with SocialMedia would be as objectionable as the worst behaviors ascribed to Facebook Beacon. With Beacon people were surprised that their actions from around the web were starting to be shared with their friends on Facebook. It wasn't that everyone objected to this happening, but rather that it was implemented as opt-out, which led to information being shared in ways that normal people didn't expect. This in turn completely killed Beacon, with participating brands such as Coke dropping support. A few weeks ago Facebook shut off access to Slide's Top Friends application for "allowing access to non-friends' personal information" as reported by Inside Facebook. The following week Facebook responded with a blog post, Building Trust and Protecting User Privacy, which started by saying:

Privacy is at the core of Facebook.

Because we provide users with rich privacy controls and respect their choices, users feel safe using Facebook to share their information with their friends. By opening up Facebook through Platform, developers have the opportunity to innovate on top of this information. In exchange, developers commit to treating user information with the same respect that users expect of Facebook. Our Developer Terms of Service strictly limit use of user data and serves as guidelines to these expectations.

But I truly believe that Facebook does try to protect user privacy and by doing so creates an environment promoting the creation of rich profiles tied to real offline identities and sharing more content between friends. Facebook has shown a history of learning from these situations over time. If Facebook learned so much about the dangers of surprise with Beacon, shut off Top Friends for sharing profile information, and continues to block access to Google's Friend Connect for redistribution of profile information then why are they still permitting Platform applications to possibly share this data with SocialMedia? As technologists we must be extremely careful in making sure that our implementations match a normal person's expectations. If we forget to do this, we could collectively end up killing something that might someday become great.

I've tried contacting SocialMedia to ask about how their advertising network works, though unfortunately, while I've received replies, my questions have not been answered. As full disclosure, I work for Six Apart which launched an advertising network for bloggers earlier this year, and has a privacy policy here. I'll be at O'Reilly's Open Source conference in Portland at the end of the month.

tags: advertising, facebook, privacy, social media, the social network
comments: 6

 

Wed

May 14
2008

Andy Oram

Google Friend Connect and limits to sharing

by Andy Oram @praxagora
comments: 3

We're all tired of acquaintances tugging on us to sign up for new social networks, and of the torque we feel bouncing between the networks we're on if we can't resist the herding instinct that brings us to join them. But we wouldn't want to have just one big social network, either. That would inhibit innovation and prevent people from enjoying a site's special features and cultural uniqueness.

Google's Friend Connect, which was announced on Monday and covered by Radar as well as other sites, represents a small step toward a middle ground. It could be considered the natural succession to Google's OpenSocial, also discussed extensively on Radar. The OpenSocial API forms the basis for communications between Friend Connect widgets and the site hosting them, using lightweight Ajax and JSON protocols. Friend Connect uses the APIs provided by other sites for communication with them.

I had a little tour of Friend Connect last night at the party celebrating the opening of Google's new Cambridge office, covered in another blog.

(continue reading)

tags: google, privacy, social networking, the social network, web 2.0
comments: 3