Entries tagged with “velocity” from O'Reilly Radar

Wed

Oct 28
2009

Nat Torkington

Four short links: 28 October 2009

Great Mail Feature, Speed Talks, Virtualisation History, Science Literacy

by Nat Torkington (@gnat) | comments: 2

  1. GMail Labs: Got The Wrong Bob? -- When's the last time you got an email from a stranger asking, "Are you sure you meant to send this to me?" and promptly realized that you didn't? Looks at the clusters of CCs you send and, if you normally send to Bob X but are trying to send it to Bob Y, asks you "did you mean Bob X?". This might be the best thing to happen to email since webmail and full-text search--it's ridiculous how little innovation is happening in email given how widely and heavily it is used.
  2. Speedgeeks LA at Shopzilla -- eight talks about making websites faster. Latency Improvements for PicasaWeb - Gavin Doughtie (Google) - Great tips from a web guru about what makes PicasaWeb fast. Watch for when the slides to more talks become available.
  3. 10 Years of Virtual Machine Performance Semi-Demystified -- fascinating history of virtualisation from someone who worked for VMware. Since 2005, VMware and Xen have gradually reduced the performance overheads of virtualization, aided by the Moore’s law doubling in transistor count, which inexorably shrinks overheads over time. AMD’s Rapid Virtualization Indexing (RVI - 2007) and Intel’s Extended Page Tables (EPT - 2009) substantially improved performance for a class of recalcitrant workloads by offloading the mapping of machine-level pages to Guest OS “physical” memory pages, from software to silicon. In the case of operations that stress the MMU—like an Apache compile with lots of short lived processes and intensive memory access—performance doubled with RVI/EPT. (Xen showed similar challenges prior to RVI/EPT on compilation benchmarks.)
  4. Pew Research Science Quiz -- To test your knowledge of scientific concepts and recent scientific findings and events, we invite you to take this 12-question science knowledge quiz. Then see how you did in comparison with the 1,005 randomly sampled adults asked the same questions.

tags: email, google, science, science education, velocity, virtualization

 

Thu

Oct 1
2009

Jesse Robbins

More on how web performance impacts revenue...

by Jesse Robbins (@jesserobbins) | comments: 9

At Velocity this year Microsoft, Google and Shopzilla each presented data on how web performance directly impacts revenue.

Their data showed that slow sites get fewer search queries per user, less revenue per visitor, fewer clicks, fewer searches, and lower search engine rankings. They found that in some cases even after site performance was improved users continued to interact as if it was slow. Bad experiences have a lasting influence on customer behavior.

What about smaller websites that aren't yet at this scale?

Alistair Croll and Sean Power, the authors of the new book Complete Web Monitoring, have continued this research for sites at smaller scale.

They used a Strangeloop Networks web acceleration appliance to optimize half the sessions to a smaller production website, tagging optimized and unoptimized visitors so they could be analyzed in Google Analytics. The Strangeloop device applies many of Steve Souders' performance rules to an existing site automatically (a kind of "Steve-in-a-Box" ;-).

The results of their analysis show how significant a reduction in page latency can be. In addition to reducing bounce rates, and increasing pages per visit & time on site, they found a 16.07% increase in conversion rates and a 5.50% increase in average order value.

[Figure: conversion rate and average order value]
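As a rough illustration of how the two reported lifts might combine into revenue per visit, here is a minimal sketch. It assumes the two effects are independent and simply multiply, which the study itself does not claim, and the baseline conversion rate and order value are invented for the example.

```python
# Lifts reported in the post above.
CONVERSION_LIFT = 0.1607    # +16.07% conversion rate
ORDER_VALUE_LIFT = 0.0550   # +5.50% average order value


def revenue_per_visit(conversion_rate: float, avg_order_value: float) -> float:
    """Expected revenue from one visit: chance of buying times order size."""
    return conversion_rate * avg_order_value


def combined_lift(conversion_lift: float, order_value_lift: float) -> float:
    """Relative change in revenue per visit if both lifts apply multiplicatively."""
    return (1 + conversion_lift) * (1 + order_value_lift) - 1


# Hypothetical baseline (not from the study): 2% of visits convert, $50 average order.
baseline = revenue_per_visit(0.02, 50.00)
optimized = revenue_per_visit(0.02 * (1 + CONVERSION_LIFT),
                              50.00 * (1 + ORDER_VALUE_LIFT))

print(f"baseline:      ${baseline:.4f} per visit")
print(f"optimized:     ${optimized:.4f} per visit")
print(f"combined lift: {combined_lift(CONVERSION_LIFT, ORDER_VALUE_LIFT):.2%}")
```

Under those assumptions the combined effect is a bit over 22% more revenue per visit, larger than either number on its own.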

Check out the full post on the Watching Websites blog.

tags: alistair croll, book related, operations, performance, velocity, velocityconf, watching websites, web monitoring

 

Thu

Aug 6
2009

Jesse Robbins

John Adams on Fixing Twitter: Improving the Performance and Scalability of the World's Most Popular Micro-blogging Site

by Jesse Robbins (@jesserobbins) | comments: 2

Twitter is suffering outages today as they fend off a Denial of Service attack, and so I thought it would be helpful to post John Adams’ exceptional Velocity session about Operations at Twitter.

Good luck today John & team… I know it’s going to be a long day!

Update: Apparently Facebook & Livejournal have had similar attacks today. Rich Miller from Data Center Knowledge reminds us that this is just the latest in a series of major attacks.

tags: attacks, critical infrastructure, infrastructure, operations, performance, security, twitter, velocity, velocity09, velocityconf, video, web2.0, webops

 

Thu

Aug 6
2009

Nat Torkington

Four short links: 6 August 2009

Ancient Language, NoSQL, Molecular Gastronomy, SQL Weirdness

by Nat Torkington (@gnat) | comments: 0

  1. Computers Unlock More Secrets of the Indus Valley Script -- Four-thousand years ago, an urban civilization lived and traded on what is now the border between Pakistan and India. During the past century, thousands of artifacts bearing hieroglyphics left by this prehistoric people have been discovered. Today, a team of Indian and American researchers are using mathematics and computer science to try to piece together information about the still-unknown script. The team led by a University of Washington researcher has used computers to extract patterns in ancient Indus symbols. The study, published this week in the Proceedings of the National Academy of Sciences, shows distinct patterns in the symbols' placement in sequences and creates a statistical model for the unknown language. (via ACM TechNews)
  2. NoSQL: If Only It Was That Easy -- war stories about the problems of using NoSQL systems to handle high throughput. We liked Tokyo Tyrant so much, we put it in production. In fact, every request to AboutUs.org hits Tokyo. One of the uses is as a persistent memcached replacement for caching 10 million+ wiki pages (as a json document of all the pieces of our page, which comes out to around 51gb(edited) of data), and it works great. It runs on a single server, it serves up a single type of data, very quickly, and has been a pleasure to use. We keep other ancillary data sets on some other servers too, and it’s great for this. Tokyo Tyrant is a great example of very performant software, but it doesn’t scale. (via straup on Delicious)
  3. WillPowder -- Specialty Powders and Spices from Chef Will Goldfarb -- molecular gastronomy products from "the golden boy of pastry". (via joshua on Delicious)
  4. What is the Deal with NULLs? -- In the past, I’ve criticized NULL semantics, but in this post I’d just like to explain some corner cases that I think you’ll find interesting, and try to straighten out some myths and misconceptions. [...] I believe the above shows, beyond a reasonable doubt, that NULL semantics are unintuitive, and if viewed according to most of the “standard explanations,” highly inconsistent. (via bos on Delicious)
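
For readers who have not run into the NULL corner cases mentioned in the last link, here is a small sketch of a few well-known ones. It uses Python's built-in sqlite3 module purely for convenience; the linked post may target a different database, and these particular queries are illustrative choices rather than examples taken from it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# NULL = NULL is not TRUE: the comparison itself evaluates to NULL (None in Python).
null_equals_null = cur.execute("SELECT NULL = NULL").fetchone()[0]

# IS NULL is the test that actually answers "is this value missing?".
null_is_null = cur.execute("SELECT NULL IS NULL").fetchone()[0]

# NOT IN against a list containing NULL is never TRUE, so the WHERE clause
# filters the row out even though 1 is plainly not 2.
not_in_rows = cur.execute("SELECT 1 WHERE 1 NOT IN (2, NULL)").fetchall()

# Aggregates skip NULLs: COUNT(*) counts rows, COUNT(x) counts non-NULL values.
cur.execute("CREATE TABLE t (x INTEGER)")
cur.executemany("INSERT INTO t (x) VALUES (?)", [(1,), (None,), (3,)])
count_all, count_x, sum_x = cur.execute(
    "SELECT COUNT(*), COUNT(x), SUM(x) FROM t"
).fetchone()

print("NULL = NULL       ->", null_equals_null)
print("NULL IS NULL      ->", null_is_null)
print("1 NOT IN (2,NULL) ->", not_in_rows)
print("COUNT(*), COUNT(x), SUM(x) ->", count_all, count_x, sum_x)

conn.close()
```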

tags: databases, food, history, language, nosql, velocity

 

Wed

Jun 24
2009

Nat Torkington

Four short links: 24 June 2009

Open Source Kids, Crowdsourcing Lessons, Flickr Secrets, Hadoop Spatial Joins

by Nat Torkington (@gnat) | comments: 0

  1. The Digital Open -- The Digital Open is an online technology community and competition for youth around the world, age 17 and under. Building a community of young open source hackers.
  2. Four Crowdsourcing Lessons from the Guardian's Spectacular Expenses Scandal Experiment -- Your workers are unpaid, so make it fun. How to lure them? By making it feel like a game. "Any time that you’re trying to get people to give you stuff, to do stuff for you, the most important thing is that people know that what they’re doing is having an effect," Willison said. "It’s kind of a fundamental tenet of social software. … If you’re not giving people the ‘I rock’ vibe, you’re not getting people to stick around." (via migurski on delicious)
  3. 10+ Deploys/Day: Dev & Ops Cooperation at Flickr -- John Allspaw and Paul Hammond's talk from Velocity. You tell any mainstream company in the world "10 deploys/day" and you'll be met with disbelief.
  4. Reproducing Spatial Joins using Hadoop and EC2 -- bit by bit the techniques for emulating important operations from trad databases are being discovered and shared in the new database scene. (via straup on delicious)

tags: crowdsourcing, django, ec2, flickr, geo, geodata, hadoop, journalism, opensource, velocity

 

Wed

Jun 24
2009

Jesse Robbins

Jonathan Heiliger on Web Performance, Operations, and Culture

by Jesse Robbins (@jesserobbins) | comments: 0

We were honored to have Jonathan Heiliger, Facebook’s VP of Technology Operations, as our opening keynote speaker at Velocity. Jonathan is one of the most accomplished leaders in our field, and is a master of the craft.

Here is his keynote in its entirety:

Note: Other videos from Velocity are being posted to VelocityConference.blip.tv

tags: development, executive, facebook, jonathan heiliger, leadership, operations, performance, velocity, velocityconf, web2.0, webops

 

Fri

Jun 19
2009

Scott Ruthfield

Announcing: Spike Night at Velocity

by Scott Ruthfield (@scottru) | comments: 5

Guest blogger Scott Ruthfield is a Program Committee member of the O'Reilly Velocity: Web Performance & Operations Conference. 


Web Operations is not for the casual observer: it's for a particular kind of adrenaline junkie that's motivated by graphs and servers spinning out of control.  Jumping in, on-your-feet analysis, and experience-based-experimentation are all part of solving new problems caused by unexpected user and machine behavior, and keeping a clear head when service owners and executives are panicking is part of the job. 

A core part of operations leadership is spike management - what you do when you see a significantly larger amount of load than you've had before. Sometimes this is predictable months out (Amazon knows, for example, that the first or second Monday of December will be their biggest day each year), sometimes days out (Twitter knew Oprah was coming), and sometimes not at all (what we still call the Slashdot Effect). Every web ops professional deals with some kind of spike - even intranets manage paydays and employee review days - and if you're into it, well, spikes can be fun. Of course, maybe you use EC2 Auto-Scaling, and so (in theory) don't have to worry about it, although of course bottlenecks come in many forms.

So at Velocity this year, we're trying out something new: Spike Night.

Spike Night is a chance to see and learn about how real, high-traffic websites deal with massive increases in load, either expected or unexpected. We'll see real-world management of traffic increases - graphs, tools, the whole shebang.

Now, it turns out that when I called up lots of people on the phone and said "can we throw massive load at your website so you can stand on stage and brag about it," many web ops folks were excited, but then they started worrying about little things like "what if something goes wrong and everyone blogs about it" or "do I have to ask somebody in a PR department" and then calls went unreturned.

Fortunately, two parties have stepped up, and I can't wait to see what they have to show:
  • Chris Bissell, Chief Software Architect at MySpace, and members of the MySpace team will demonstrate a massive, real increase in traffic, and will manage it on-stage. MySpace already deals with tens of thousands of hits each second - we can't throw enough traffic at them to cause any harm - so they'll cause their own harm and then show how they work through it.
  • Ryan Nelson, Operations Director for MLB Advanced Media and MLB.com, will walk us through a combination of war stories and live traffic management to show what happens when millions of baseball fans all want to see what's happened after the commercial break at the exact same time. Between their very popular desktop apps and their newly-announced iPhone game streaming, the MLB is a true leader in technology innovation with a rabid fan base that goes well beyond the Web 2.0 echo chamber.
Spike Night is meant to be a fun event, taking place Tuesday June 23rd @ 7:30PM at Velocity, and open to the larger web community - a Velocity conference pass is not required to attend. I'm looking forward to hosting interesting demos and a fun Q&A, and hope to see all of you there!

tags: cloud, infrastructure, operations, performance, scalability, scale, spikenight, velocity, velocity09, velocityconf, web2.0, webops

 

Mon

Jun 8
2009

Jesse Robbins

Ignite! comes to San Jose June 22nd - Submit your talks now!

by Jesse Robbins (@jesserobbins) | comments: 0

Ignite! is coming to San Jose on Monday June 22, 2009 at 8:00 pm, attached to the Velocity Conference. Admission is free, open to all, and there will be a cash bar.

The deadline for talks is May 11th, so submit your talks now!

As with all Ignites each speaker will only get 20 slides that each auto-advance every 15 seconds for a total of five minutes. We'll be looking for fun geek topics like hacks, how-to's, and insights. (Talks don't have to be Velocity-related!) If you're not sure what an Ignite talk looks like check out the Ignite Show.

You can RSVP for the event on Upcoming or Facebook.

tags: events, ignite, operations, san jose, velocity, velocityconf, web2.0, webops

 

Fri

May 8
2009

Jesse Robbins

Velocity 2009 - Big Ideas (early registration deadline)

by Jesse Robbins (@jesserobbins) | comments: 7

[Image: tag cloud created from Velocity session & speaker information using wordle.net]

My favorite interview question to ask candidates is: "What happens when you type www.(amazon|google|yahoo).com in your browser and press return?"

While the actual process of serving and rendering a page takes seconds to complete, describing it in real detail can take an hour. A good answer spans every part of the Internet from the client browser & operating system, DNS, through the network, to load balancers, servers, services, storage, down to the operating system & hardware, and all the way back again to the browser. It requires an understanding of TCP/IP, HTTP, & SSL deep enough to describe how connections are managed, how load-balancers work, and how certificates are exchanged and validated... and that's just the first request!

Web Performance & Operations is an emerging discipline which requires incredible breadth, focusing less on specific technologies and more on how the entire system works together. While people often specialize in particular components, great engineers always think of that component in relation to the whole. The best engineers are able to fly to the 50,000 foot view and see the entire system in motion and then zoom in to microscopic levels and examine the tiny movements of an individual part.

John Allspaw recently described this interconnectedness on his blog:

With websites, the introduction of change (for example, a bad database query) can affect (in a bad way) the entire system, not just the component(s) that saw the change. Adding handfuls of milliseconds to a query that’s made often, and you’re now holding page requests up longer. The same thing applies to optimizations as well. Break that [bad] query into two small fast ones, and watch how usage can change all over the system pretty quickly. Databases respond a bit faster, pages get built quicker, which means users click on more links, etc. This second-order effect of optimization is probably pretty familiar to those of us running sites of decent scale.

Working with these systems requires an understanding not only of the way technology interacts, but the way that people do as well. The structure, operation, and development of a website mirrors the organization that creates it, which is why so many people in WebOps focus on understanding and improving management culture & process.

Organizing a conference like Velocity is a wonderful challenge because it requires the same sort of thinking. We focus on the big concepts that everyone needs to know and then go deep into the technologies that change our understanding of the system. We find ways to share the unique experience that can only be gained by operating at scale. We make it safe to share as much of the "Secret Sauce" as we can.

Please join us at Velocity this year; we have an amazing lineup of speakers & participants. Early registration ends on Monday, May 11th at 11:59 PM Pacific. (Radar readers can use "vel09cmb" for an additional 15% discount.)

Velocity, the Web Performance and Operations Conference 2009

tags: cloud, data, infrastructure, operations, scale, velocity, velocity09, velocityconf, web, web2.0

 

Thu

May 7
2009

James Turner

Velocity Preview - Keeping Twitter Tweeting

by James Turner | comments: 3

You may also download this file. Running time: 00:10:46

Subscribe to this podcast series via iTunes. Or, visit the O'Reilly Media area at iTunes to find other podcasts from O'Reilly.

If there's a site that exemplifies explosive growth, it has to be Twitter. It seems like everywhere you look, someone is Tweeting, or talking about Tweeting, or Tweeting about Tweeting. Keeping the site responsive under that type of increase is no easy job, but it's one that John Adams has to deal with every day, working in Twitter Operations. He'll be talking about that work at O'Reilly's Velocity Conference, in a session entitled Fixing Twitter: Improving the Performance and Scalability of the World's Most Popular Micro-blogging Site, and he spent some time with us to talk about what is involved in keeping the site alive.

James Turner: Can you start by describing the platforms and technologies that make Twitter run today?

John Adams: Twitter currently runs on Ruby on Rails. And we also use a combination of Java and Scala, and a number of homegrown scripts that run the site. We also use a lot of open-source tools like Apache, MySQL, memcached.

JT: What type of hardware are you running on?

JA: It's all Linux, so a lot of x86 hardware. I can't tell you the brands or how many.

JT: Do you make any kind of attempt to stay homogeneous in that?

JA: Yes, we do. All of our hardware is very consistent. It makes deployment of new software very easy. And we also use a number of configuration management tools like Puppet to deliver software to those machines.

JT: As anyone can see, Twitter has had a pretty explosive growth, especially recently. Were you prepared for this kind of ramp up?

JA: I don't think so. I mean we're growing week over week in enormous numbers. And we spend a lot of time calculating the growth and scalability of the site to make sure that we can handle the upcoming load.

JT: I mean obviously there are events like Oprah decides she's going to Tweet that are going to be spikes. Do you try to get warning of that stuff?

JA: Yeah. And frequently we know of major events happening. Major events are very predictable like Macworld, even any massive amount of media interaction, we have some fair warning beforehand.

(continue reading)

tags: interviews, operations, twitter, velocity, velocity09, velocityconf, web2.0, webops

 

Fri

Apr 10
2009

Jesse Robbins

AT&T Fiber cuts remind us: Location is a Basket too!

by Jesse Robbins (@jesserobbins) | comments: 3

The fiber cuts affecting much of the San Francisco Bay Area this week are similar to the outages in the Middle East last year (radar post), although far more limited in scope and impact.   What I said last year still holds true and is repeated below: 

From an operations perspective these kinds of outages are nothing new, and underscore why having "many eggs in few baskets" is such a problem. I believe we will see similar incidents when we have the first multi-datacenter failures where multiple providers lose significant parts of their infrastructure in a single geographic area.

Remember: Don't put all your eggs in one basket... and Location is a basket too!

To really understand the issue, I recommend Neal Stephenson's incredible (and lengthy) Wired article from 1996 entitled "Mother Earth Mother Board":

[...] It sometimes seems as though every force of nature, every flaw in the human character, and every biological organism on the planet is engaged in a competition to see which can sever the most cables. The Museum of Submarine Telegraphy in Porthcurno, England, has a display of wrecked cables bracketed to a slab of wood. Each is labeled with its cause of failure, some of which sound dramatic, some cryptic, some both: trawler maul, spewed core, intermittent disconnection, strained core, teredo worms, crab's nest, perished core, fish bite, even "spliced by Italians." The teredo worm is like a science fiction creature, a bivalve with a rasp-edged shell that it uses like a buzz saw to cut through wood - or through submarine cables. Cable companies learned the hard way, early on, that it likes to eat gutta-percha, and subsequent cables received a helical wrapping of copper tape to stop it.

[...] There is also the obvious threat of sabotage by a hostile government, but, surprisingly, this almost never happens. When cypherpunk Doug Barnes was researching his Caribbean project, he spent some time looking into this, because it was exactly the kind of threat he was worried about in the case of a data haven. Somewhat to his own surprise and relief, he concluded that it simply wasn't going to happen. "Cutting a submarine cable," Barnes says, "is like starting a nuclear war. It's easy to do, the results are devastating, and as soon as one country does it, all of the others will retaliate."

As the capacity of optical fibers climbs, so does the economic damage caused when the cable is severed. FLAG makes its money by selling capacity to long-distance carriers, who turn around and resell it to end users at rates that are increasingly determined by what the market will bear. If FLAG gets chopped, no calls get through. The carriers' phone calls get routed to FLAG's competitors (other cables or satellites), and FLAG loses the revenue represented by those calls until the cable is repaired. The amount of revenue it loses is a function of how many calls the cable is physically capable of carrying, how close to capacity the cable is running, and what prices the market will bear for calls on the broken cable segment. In other words, a break between Dubai and Bombay might cost FLAG more in revenue loss than a break between Korea and Japan if calls between Dubai and Bombay cost more.

The rule of thumb for calculating revenue loss works like this: for every penny per minute that the long distance market will bear on a particular route, the loss of revenue, should FLAG be severed on that route, is about $3,000 a minute. So if calls on that route are a dime a minute, the damage is $30,000 a minute, and if calls are a dollar a minute, the damage is almost a third of a million dollars for every minute the cable is down. Upcoming advances in fiber bandwidth may push this figure, for some cables, past the million-dollar-a-minute mark. [Link]
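
The rule of thumb in that last paragraph is easy to turn into a quick calculation. The sketch below just encodes the $3,000-per-penny figure from the article; the function names and the one-hour outage are illustrative assumptions, not anything from Stephenson's piece.

```python
# Rule of thumb quoted above: for every cent per minute the market will bear
# on a route, a severed cable loses about $3,000 of revenue per minute.
LOSS_PER_CENT_PER_MINUTE = 3_000  # dollars per minute of outage, per cent/minute of call price


def revenue_loss_per_minute(price_cents_per_minute: int) -> int:
    """Dollars of revenue lost for each minute the cable is down."""
    return price_cents_per_minute * LOSS_PER_CENT_PER_MINUTE


def outage_cost(price_cents_per_minute: int, outage_minutes: int) -> int:
    """Total revenue lost over an outage of the given length (hypothetical helper)."""
    return revenue_loss_per_minute(price_cents_per_minute) * outage_minutes


dime_route = revenue_loss_per_minute(10)     # calls at a dime a minute
dollar_route = revenue_loss_per_minute(100)  # calls at a dollar a minute
one_hour_on_dime_route = outage_cost(10, 60)

print(f"dime-a-minute route:   ${dime_route:,} per minute")
print(f"dollar-a-minute route: ${dollar_route:,} per minute")
print(f"one hour down on a dime-a-minute route: ${one_hour_on_dime_route:,}")
```

The dollar-a-minute case comes out at $300,000 per minute, which is the "almost a third of a million dollars" in the quote.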

It's also worth mentioning the outages to multiple service providers hosted in a single colocation facility when the FBI seized all the equipment in the facility, the big outage at 365 Main from two years ago, and many others (see: Radar posts & comprehensive coverage at Data Center Knowledge).

(If Web Operations & Infrastructure is your interest or passion, you should attend Velocity 2009.  You can use the code "vel09cmb" for a 15% discount)

(Image source: http://www.flickr.com/photos/mundane_joy/2301368102/)

tags: at&t, cloud, failure, failure happens, fiber, infrastructure, operations, outages, velocity, velocity09, web infrastructure, web operations, web2.0, webops, worries

 

Tue

Apr 7
2009

Jesse Robbins

It's Really Just a Series of Tubes

by Jesse Robbins (@jesserobbins) | comments: 12

Molly Wright Steenson hit the Ignite jackpot at Etech this year with her explanation of the steam powered network of pneumatic tubes of the 1800s. If you're someone that, like me, has a somewhat obsessive relationship with Internet Infrastructure, you must watch this talk.

tags: etech, ignite, ignite show, infrastructure, internet, steam, steampunk, tubes, velocity, velocity09, velocityconf, web2.0

 

Thu

Feb 5
2009

Jesse Robbins

Understanding Web Operations Culture - the Graph & Data Obsession

by Jesse Robbins (@jesserobbins) | comments: 8

We’re quite addicted to data pr0n here at Flickr. We’ve got graphs for pretty much everything, and add graphs all of the time.

-John Allspaw, Operations Engineering Manager at Flickr & author of The Art of Capacity Planning

One of the most interesting parts of running a large website is watching the effects of unrelated events affecting user traffic in aggregate. Web traffic is something that companies typically keep very secret, and often the only time engineers can talk about it is late at night, at a bar, and very much off the record.

There are many good reasons for keeping this kind of information confidential, particularly for publicly traded companies with complicated disclosure requirements. There are also downsides, the biggest being that it is difficult for peers to learn from each other and compare notes.

John Allspaw recently created a WebOps Visualizations group on Flickr for sharing these kinds of graphs with the confidential information removed. Here’s an example of a traffic drop seen both by Flickr & by Last.FM that coincided with President Obama’s inauguration.

[Figure: drop in web traffic to Flickr during the Obama inauguration, shared by John Allspaw]

Similar traffic drop on Last.FM, seen on the right:

[Figure: traffic drop at Last.FM during the Obama inauguration]

Google saw a similar drop as well:

[Figure: traffic drop at Google during the Obama inauguration]

Was it because everybody went to Twitter?

[Figure: traffic spike on Twitter during the Obama inauguration]

Besides being an interesting story, sharing these kinds of graphs helps people build better monitoring tools and processes. As just one example: How should the WebOps team respond to this dip in traffic? Is it an outage? The inauguration was a very well known event and so it’s easy to explain the drop in traffic… what happens when a similar drop in traffic occurs? Should the WebOps team be looking at CNN (or trends in twitter) along with everything else?

How do you tell when that unexpected 10% drop in traffic is really just people with something more important to do than browse your site?
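
One simple starting point, sketched below, is to compare current traffic against a baseline for the same time slot in previous weeks and flag drops above a threshold. Everything here is an assumption for illustration: the function names, the median baseline, the 10% threshold and the sample numbers are not taken from any of the sites mentioned. The check can only say that traffic is unusually low; deciding whether the cause is an outage or an inauguration still takes a human (or a look at the news).

```python
from statistics import median


def relative_drop(current: float, history: list[float]) -> float:
    """Fractional drop of `current` below the median of past values for the same time slot."""
    baseline = median(history)
    if baseline <= 0:
        raise ValueError("baseline traffic must be positive")
    return (baseline - current) / baseline


def classify(current: float, history: list[float], threshold: float = 0.10) -> str:
    """Return 'investigate' when traffic is at least `threshold` below baseline, else 'normal'."""
    return "investigate" if relative_drop(current, history) >= threshold else "normal"


# Hypothetical requests per minute for the same weekday and hour over the last four weeks.
same_slot_history = [1000, 1040, 980, 1010]

print(classify(900, same_slot_history))  # roughly 10.4% below the median of 1005
print(classify(960, same_slot_history))  # roughly 4.5% below the median
```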

(Note: Updated since original posting to add Google & Twitter graphs and annotations, and to switch the Last.FM graphic with an annotated one after I got permission.)

tags: big data, culture, enterprise 2.0, flickr, infovis, john allspaw, last.fm, metrics, monitoring, operations, velocity, velocity09, web2.0, webops

 

Fri

Jan 30
2009

Nat Torkington

Four short links: 30 Jan 2009

by Nat Torkington (@gnat) | comments: 1

Two serious links and two fun today, thanks to Waxy and BoingBoing:

  1. EveryBlock Business Model Brainstorming -- Adrian Holovaty's project was funded by a Knight Foundation grant that's about to run out. The software will be open sourced but he's inviting suggestions of business models that would enable the project team to continue working on it full-time. Having used and created open source to show newspaper companies how to do journalism online, will he now work on an open source way for them to make money?
  2. Infrastructure for Modern Web Sites -- Leonard Lin lays out what's required in systems and platforms for modern web sites. Perl succeeded in part because its data types were the things you had to deal with (files, text, sockets). Will the next gen of tools (the 'Rails killer' if you will) offer users, taggable objects, social objects, etc. as primitives?
  3. Academic Earth -- takes open courseware from different universities and integrates them into a coherent UI. Transcripts. Slurp.
  4. Love2D -- a Lua-based 2D game engine. I'm looking at it to see whether it works for me as the next step for 9 year-old kids interested in programming games in my computer club.

tags: adrianholovaty, education, games, infrastructure, journalism, lua, open source, programming, velocity

 

Sat

Nov 29
2008

Jesse Robbins

Data Center Power Efficiency

by Jesse Robbins (@jesserobbins) | comments: 8

James Hamilton is one of the smartest and most accomplished engineers I know. He now leads Microsoft's Data Center Futures Team, and has been pushing the opportunities in data center efficiency and internet scale services both inside & outside Microsoft. His most recent post explores misconceptions about the Cost of Power in Large-Scale Data Centers:

[Image: James Hamilton]

I’m not sure how many times I’ve read or been told that power is the number one cost in a modern mega-data center, but it has been a frequent refrain. And, like many stories that get told and retold, there is an element of truth to it. Power is absolutely the fastest growing operational cost of a high-scale service. Except for server hardware costs, power and costs functionally related to power usually do dominate.

However, it turns out that power alone isn’t anywhere close to the most significant cost. Let’s look at this more deeply. If you amortize power distribution and cooling systems infrastructure over 15 years and amortize server costs over 3 years, you can get a fair comparative picture of how server costs compare to infrastructure (power distribution and cooling). But how to compare the capital costs of server, and power and cooling infrastructure with that monthly bill for power?

The approach I took is to convert everything into a monthly charge. [...]

James Hamilton explains Datacenter Costs

[link]
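
To make the "convert everything into a monthly charge" idea concrete, here is a minimal sketch. The amortization periods (3 years for servers, 15 years for power distribution and cooling) come from the quote above; every dollar figure is invented for illustration, and the straight-line division ignores the cost of money, which a fuller model would include. The output says nothing about real data center economics.

```python
def monthly_charge(capital_cost: float, amortization_years: int) -> float:
    """Straight-line monthly charge: spread a capital cost evenly over its amortization period."""
    return capital_cost / (amortization_years * 12)


# Hypothetical inputs, chosen only to make the arithmetic easy to follow.
SERVER_CAPEX = 36_000_000     # dollars, amortized over 3 years
INFRA_CAPEX = 90_000_000      # dollars of power distribution and cooling, over 15 years
MONTHLY_POWER_BILL = 400_000  # dollars, already a monthly figure

monthly_costs = {
    "servers": monthly_charge(SERVER_CAPEX, 3),
    "power distribution and cooling": monthly_charge(INFRA_CAPEX, 15),
    "power": float(MONTHLY_POWER_BILL),
}
total = sum(monthly_costs.values())

# With everything expressed per month, the categories can be compared directly.
for name, cost in monthly_costs.items():
    print(f"{name:32s} ${cost:>12,.0f}  {cost / total:6.1%}")
```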

tags: cloud computing, energy, james hamilton, microsoft, operations, performance, platforms, utilities, utility computing, velocity, velocity09, web2.0

 

Thu

Nov 20
2008

Jesse Robbins

Velocity 2009: Themes, ideas, and call for participation...

by Jesse Robbins (@jesserobbins) | comments: 0

Last year's Velocity conference was an incredible success. We expected around 400 people and we ended up maxing out the facility with over 600. This year we're moving the conference to a bigger space and extending it to 3 days to accommodate workshops and longer sessions. Velocity 2009 will be on June 22-24th, 2009 at the Fairmont Hotel in San Jose, CA.

This year's conference will be especially important. I've said many times that Web Performance and Operations is critical to the success of every company that depends on the web. In the current economic situation, it's becoming a matter of survival. The competitive advantage comes from the ability to do two things:

  1. Generate more revenue with fewer resources
  2. Respond quickly to change
Our Velocity 2009 mantra is "Fast, Scalable, Efficient, Available", a slight change from last year. (We've replaced "Resilient" with "Efficient" to make the focus clear.)

I'm excited to announce that joining Steve Souders & me on this year's program committee are John Allspaw, Artur Bergman, Scott Ruthfield, Eric Schurman, and Mandi Walls. We've already started working on the program, and have just opened the Call for Participation.

(continue reading)

tags: artur bergman, conferences, Eric Schurman, John Allspaw, mandi walls, operations, performance, scott ruthfield, steve souders, velocity, velocity09, web2.0, webops
comments: 0

Thu

Aug 7
2008

Jesse Robbins

Kaminsky DNS Patch Visualization

by Jesse Robbins (@jesserobbins), comments: 4

Dan Kaminsky has posted the details of the widespread DNS vulnerability. Clarified Networks created this visualization of DNS patch deployment over the past month:

Red = Unpatched
Yellow = Patched, "but NAT is screwing things up"
Green = OK

tags: internet policy, operations, platform plays, velocity, worries
comments: 4

Thu

Jul 31
2008

Jim Stogdill

Energy Savings, Strange Attractors, ...

by Jim Stogdill (@jstogdill), comments: 4

... the Intrinsic Cost of State Change, Orbiting Alien Voyeurs, and 200 Square Kilometers of Solar Panels Somewhere in Texas

The Silicon Valley Leadership Group and Berkeley National Labs recently published the results of their first Data Center Demonstration Project (pdf). (Disclosure: My colleague Teresa Tung of Accenture R+D labs was the report's principal author.) The study follows up on last year's publication of the EPA's report to Congress (pdf) on data center energy consumption. That report, among other things, estimated the range of savings that data center operators could achieve with varying degrees of technology and practice improvement. This more recent report is based on real world studies and was intended to validate the estimates in the EPA report.

Both reports are good reads if you are interested in reducing the megawatts being consumed in your organization's silicon (though the EPA report has been criticized as being a bit toothless). However, I should warn you that they are fairly long and detailed so the bedside table might not be the best home for them if you want to get through them, at least until the manga versions are released.

The EPA study estimated that "state of the art" technology and processes in the data center might cut energy usage by 55%; the more readily achievable "best practices" come in at 45% savings. State of the art covers a range of approaches, including better server utilization through virtualization, better cooling techniques, improved power distribution, sensor networks, etc.

[Figure: electricity usage graph]

The more recent study, testing those techniques in working data centers, validates the EPA's estimates but also offers the initially surprising conclusion that legacy data centers can be retrofitted to achieve efficiencies close to those of new builds. That conclusion follows from the less surprising finding that the most bang for the buck comes from improvements on the "IT" side of the energy draw (energy efficient servers, virtualization, etc.) rather than from the harder to retrofit "site" side (cooling systems etc.). The dog wags the tail after all, and if you can reduce the direct power consumption of the IT equipment, you will simultaneously reduce associated cooling costs, whether in an old building with relatively inefficient HVAC or a shiny new one.

The last finding that I'll mention here is that it doesn't look like the time is right yet for widespread adoption of more advanced load management techniques outside of niche applications. The demonstration project had facilities that experimented with them, but the risk aversion that stems from high reliability requirements in production data centers has these experiments mostly restricted to centers that serve R+D rather than production functions.

Maybe one of the most interesting things about the report is what it doesn't (can't) say.

(continue reading)

tags: datacenter, energy, epa, thought provoking, trends, velocity
comments: 4

Sat

Jun 28
2008

Jesse Robbins

The new internet traffic spikes

by Jesse Robbins (@jesserobbins), comments: 5

Theo Schlossnagle, author of Scalable Internet Architectures, gave a great explanation of how internet traffic spikes are shifting:

Lately, I see more sudden eyeballs and what used to be an established trend seems to fall into a more chaotic pattern that is the aggregate of different spike signatures around a smooth curve. This graph is from two consecutive days where we have a beautiful comparison of a relatively uneventful day followed by a long-exposure spike (nytimes.com) compounded by a short-exposure spike (digg.com):

The disturbing part is that this occurs even on larger sites now due to the sheer magnitude of eyeballs looking at today's already popular sites. Long story short, this makes planning a real bitch.

[...] What isn't entirely obvious in the above graphs? These spikes happen inside 60 seconds. The idea of provisioning more servers (virtual or not) is unrealistic. Even in a cloud computing system, getting new system images up and integrated in 60 seconds is pushing the envelope and that would assume a zero second response time. This means it is about time to adjust what our systems architecture should support. The old rule of 70% utilization accommodating an unexpected 40% increase in traffic is unraveling. At least eight times in the past month, we've experienced from 100% to 1000% sudden increases in traffic across many of our clients.

[Link]
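
The arithmetic behind that "old rule" can be sketched in a few lines of Python. This assumes utilization scales linearly with traffic, which is a simplification of mine and not a claim from Theo's post. The function names are hypothetical.

```python
def utilization_after_spike(baseline: float, increase: float) -> float:
    """Utilization after traffic grows by `increase` (0.40 means +40%).

    Assumes load scales linearly with traffic (a simplifying assumption).
    """
    return baseline * (1.0 + increase)


def max_safe_baseline(increase: float) -> float:
    """Highest baseline utilization that stays at or below 100% after the spike."""
    return 1.0 / (1.0 + increase)


if __name__ == "__main__":
    # The old rule: 70% utilization plus an unexpected 40% increase.
    after = utilization_after_spike(0.70, 0.40)
    print(f"70% baseline, +40% traffic: {after:.0%} utilization")

    # The spikes described above: +100% and +1000%.
    for increase in (1.0, 10.0):
        limit = max_safe_baseline(increase)
        print(f"+{increase:.0%} traffic needs a baseline at or below {limit:.0%}")
```

Under that assumption, 70% utilization with a 40% increase lands at about 98%, just under capacity. A 100% increase would need a baseline at or below 50%, and a 1000% increase a baseline of roughly 9%, which is why the old rule stops being useful for spikes of this size.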

tags: operations, trends, velocity, web 2.0, worries
comments: 5

Tue

Jun 24
2008

Jesse Robbins

Video of Rich Wolski's EUCALYPTUS talk at Velocity

by Jesse Robbins (@jesserobbins), comments: 1

Rich Wolski gave a truly impressive talk at Velocity about an open-source software infrastructure for cloud computing called EUCALYPTUS. The API is compatible with Amazon's EC2 interface, and the underlying infrastructure is designed to support multiple client-side interfaces. EUCALYPTUS is implemented using commonly available Linux tools and basic Web-service technologies, making it easy to install and maintain. Watch and learn...

You can see more videos from Velocity on Blip.tv.

tags: cloud computing, ec2, movers and shakers, open source, operations, platform plays, science, utility computing, velocity, velocity08, videos, web 2.0
comments: 1