Scraping AppStore Reviews


I know I promised last week to post about intra-app notification, and I'm still planning to, but after chatting with Øivind Kjellnø over email I decided to change direction for a little while. I'll get back to notifications, but for today I'm going to show you how to access your app reviews from stores throughout the world.

As you know, iTunes currently exists in 60-odd countries: 62, if I've done my counting right. Each store has its own storefront code, which I have laboriously collected for you below. Seriously, this took forever! These storefront codes let you access AppStore for each country and retrieve the review data you're looking for.

AppStore's primary user review URL is

http://ax.phobos.apple.com.edgesuite.net/WebObjects/MZStore.woa/wa/viewContentsUserReviews?id=%d&pageNumber=0&sortOrdering=2&type=Purple+Software

where the %d ID can be taken from the application URL itself. To get that number, right-click any of your applications in iTunes, copy the URL, and examine the 'id=' field in that link. You can set the page offset (pageNumber) to any page you want.
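
As a minimal sketch of the URL handling just described, the two shell helpers below pull the numeric ID out of a copied application link and build the review URL for a given page. The function names (app_id_from_url, review_url) and the example.com link are made up for illustration; only the review URL template comes from this post.

```shell
#!/bin/sh
# Hypothetical helpers; the names are illustrative, not part of iTunes or curl.

# Print the digits that follow "id=" in an application URL.
app_id_from_url() {
    printf '%s\n' "$1" | sed -n 's/.*[?&]id=\([0-9][0-9]*\).*/\1/p'
}

# Print the user review URL for an application ID and page number (default 0).
review_url() {
    printf 'http://ax.phobos.apple.com.edgesuite.net/WebObjects/MZStore.woa/wa/viewContentsUserReviews?id=%d&pageNumber=%d&sortOrdering=2&type=Purple+Software\n' "$1" "${2:-0}"
}
```

Calling review_url "$(app_id_from_url 'http://example.com/viewSoftware?id=284222807&mt=8')" 1 prints the review URL with id=284222807 and pageNumber=1.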

The user review URL shown here returns the first page of the most recent (sortOrdering=2) reviews. You can retrieve those reviews directly from the Terminal command-line.

To talk to iTunes from curl, spoof the user agent so the request appears to come from iTunes, and set your storefront to one of the valid codes. Here, I set the store by passing it as a header field using curl's -H switch.

curl -A "iTunes/4.2 (Macintosh; U; PPC Mac OS X 10.2)" \
 -H "X-Apple-Store-Front: 143441-1" \
 'http://ax.phobos.apple.com.edgesuite.net/WebObjects/MZStore.woa/wa/viewContentsUserReviews?id=284222807&pageNumber=0&sortOrdering=2&type=Purple+Software' \
 | gunzip
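
If you run this command often, it may help to wrap it in a function that takes the store code and application ID as arguments. The sketch below is one way to do that; the function names (store_front_header, fetch_reviews) are my own invention, and it assumes the server still sends gzip-compressed data, as the gunzip step above does.

```shell
#!/bin/sh
# Hypothetical wrapper around the curl command above; function names are illustrative.

# Build the storefront header for a store code such as 143441.
store_front_header() {
    printf 'X-Apple-Store-Front: %s-1\n' "$1"
}

# Fetch the first page of the most recent reviews.
# Usage: fetch_reviews STORE_CODE APP_ID
fetch_reviews() {
    curl -s -A "iTunes/4.2 (Macintosh; U; PPC Mac OS X 10.2)" \
        -H "$(store_front_header "$1")" \
        "http://ax.phobos.apple.com.edgesuite.net/WebObjects/MZStore.woa/wa/viewContentsUserReviews?id=${2}&pageNumber=0&sortOrdering=2&type=Purple+Software" \
        | gunzip
}
```

For example, fetch_reviews 143441 284222807 requests the same page as the command above.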

The results are, to say the least, not very readable, so you'll probably want to filter them. Here's one way to do so; piping the results through the following really helps:

xmllint --format - | grep normalStyle | grep -v '">$' \
 | grep -v "Sort by:" | grep -v "GotoURL" | grep -v "by.*<b>" \
 | grep -v "Copyright" | sed "s/.*\">//" | sed "s/<.*//"
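
The grep and sed stages can also be bundled into a small function so the filter is easier to reuse. This is only a sketch: the name extract_review_text is made up, and the patterns are simply the ones from the pipeline above, so they depend on the markup iTunes happens to return.

```shell
#!/bin/sh
# Hypothetical helper: the grep/sed stages of the filter above, reading XML on stdin.
extract_review_text() {
    # Keep only normalStyle lines, then drop lines that carry no review text.
    grep 'normalStyle' \
        | grep -v '">$' \
        | grep -v 'Sort by:' \
        | grep -v 'GotoURL' \
        | grep -v 'by.*<b>' \
        | grep -v 'Copyright' \
        | sed -e 's/.*">//' -e 's/<.*//'   # strip the markup on either side of the text
}
```

Use it as the last stage of the pipeline: curl ... | gunzip | xmllint --format - | extract_review_text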

The last step involves looping this over all available stores. It's easy to throw together a Perl script that does exactly that, opening the results in TextEdit. The source for that script follows at the bottom of this post.
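
If you'd rather stay in the shell, the same loop can be sketched as a table of stores plus a function that runs a command once per store. Everything here is illustrative: the function names are hypothetical, and the table is a four-store excerpt of the full list of store codes in this post.

```shell
#!/bin/sh
# Hypothetical sketch: a four-store excerpt of the storefront table, one "code name" pair per line.
stores() {
    cat <<'EOF'
143441 United States
143444 United Kingdom
143457 Norway
143462 Japan
EOF
}

# Run a command once per store, appending the store code and country name as its last two arguments.
for_each_store() {
    stores | while read -r code name; do
        # Redirect stdin so the command cannot swallow the rest of the table.
        "$@" "$code" "$name" </dev/null
    done
}

# Example callback: print a heading for each store.
print_heading() {
    printf 'COUNTRY: %s (%s)\n' "$2" "$1"
}
```

Running for_each_store print_heading prints one heading per store; swapping in a callback that runs the curl command would fetch reviews store by store.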

Hopefully these steps will help readers who are looking to find out how their application is performing around the world. Let me know whether you find the techniques and script useful in the comments.

International Store Codes

United States 143441
Argentina 143505
Australia 143460
Belgium 143446
Brazil 143503
Canada 143455
Chile 143483
China 143465
Colombia 143501
Costa Rica 143495
Croatia 143494
Czech Republic 143489
Denmark 143458
Deutschland 143443
El Salvador 143506
Espana 143454
Finland 143447
France 143442
Greece 143448
Guatemala 143504
Hong Kong 143463
Hungary 143482
India 143467
Indonesia 143476
Ireland 143449
Israel 143491
Italia 143450
Korea 143466
Kuwait 143493
Lebanon 143497
Luxembourg 143451
Malaysia 143473
Mexico 143468
Nederland 143452
New Zealand 143461
Norway 143457
Osterreich 143445
Pakistan 143477
Panama 143485
Peru 143507
Philippines 143474
Poland 143478
Portugal 143453
Qatar 143498
Romania 143487
Russia 143469
Saudi Arabia 143479
Schweiz/Suisse 143459
Singapore 143464
Slovakia 143496
Slovenia 143499
South Africa 143472
Sri Lanka 143486
Sweden 143456
Taiwan 143470
Thailand 143475
Turkey 143480
United Arab Emirates 143481
United Kingdom 143444
Venezuela 143502
Vietnam 143471
Japan 143462

Scraping Perl Script

#! /usr/bin/perl
# Autofetch reviews

# iPocket
# print "iPocket Reviews:\n";
# $currentSoftware = 285898097;
# getAllReviews();

# ToDo
print "To Do Reviews:\n";
$currentSoftware = 284222001;
getAllReviews();

# Light -- removed

# Moo
print "Moo Reviews\n";
$currentSoftware = 284222807;
getAllReviews();

# Ad Hoc Helper
print "Ad Hoc Helper Reviews\n";
$currentSoftware = 285691333;
getAllReviews();

sub getAllReviews()
{

$country="\nCOUNTRY: United States";
$store =  143441;
print $country, "\n";
fetchReviews();

$country="\nCOUNTRY: Argentina";
$store =  143505;
print $country, "\n";
fetchReviews();

$country="\nCOUNTRY: Australia";
$store =  143460;
print $country, "\n";
fetchReviews();

$country="\nCOUNTRY: Belgium";
$store =  143446;
print $country, "\n";
fetchReviews();

$country="\nCOUNTRY: Brazil";
$store =  143503;
print $country, "\n";
fetchReviews();

$country="\nCOUNTRY: Canada";
$store =  143455;
print $country, "\n";
fetchReviews();

$country="\nCOUNTRY: Chile";
$store =  143483;
print $country, "\n";
fetchReviews();

$country="\nCOUNTRY: China";
$store =  143465;
print $country, "\n";
fetchReviews();

$country="\nCOUNTRY: Colombia";
$store =  143501;
print $country, "\n";
fetchReviews();

$country="\nCOUNTRY: Costa Rica";
$store =  143495;
print $country, "\n";
fetchReviews();

$country="\nCOUNTRY: Croatia";
$store =  143494;
print $country, "\n";
fetchReviews();

$country="\nCOUNTRY: Czech Republic";
$store =  143489;
print $country, "\n";
fetchReviews();

$country="\nCOUNTRY: Denmark";
$store =  143458;
print $country, "\n";
fetchReviews();

$country="\nCOUNTRY: Deutschland";
$store =  143443;
print $country, "\n";
fetchReviews();

$country="\nCOUNTRY: El Salvador";
$store =  143506;
print $country, "\n";
fetchReviews();

$country="\nCOUNTRY: Espana";
$store =  143454;
print $country, "\n";
fetchReviews();

$country="\nCOUNTRY: Finland";
$store =  143447;
print $country, "\n";
fetchReviews();

$country="\nCOUNTRY: France";
$store =  143442;
print $country, "\n";
fetchReviews();

$country="\nCOUNTRY: Greece";
$store =  143448;
print $country, "\n";
fetchReviews();

$country="\nCOUNTRY: Guatemala";
$store =  143504;
print $country, "\n";
fetchReviews();

$country="\nCOUNTRY: Hong Kong";
$store =  143463;
print $country, "\n";
fetchReviews();

$country="\nCOUNTRY: Hungary";
$store =  143482;
print $country, "\n";
fetchReviews();

$country="\nCOUNTRY: India";
$store =  143467;
print $country, "\n";
fetchReviews();

$country="\nCOUNTRY: Indonesia";
$store =  143476;
print $country, "\n";
fetchReviews();

$country="\nCOUNTRY: Ireland";
$store =  143449;
print $country, "\n";
fetchReviews();

$country="\nCOUNTRY: Israel";
$store =  143491;
print $country, "\n";
fetchReviews();

$country="\nCOUNTRY: Italia";
$store =  143450;
print $country, "\n";
fetchReviews();

$country="\nCOUNTRY: Korea";
$store =  143466;
print $country, "\n";
fetchReviews();

$country="\nCOUNTRY: Kuwait";
$store =  143493;
print $country, "\n";
fetchReviews();

$country="\nCOUNTRY: Lebanon";
$store =  143497;
print $country, "\n";
fetchReviews();

$country="\nCOUNTRY: Luxembourg";
$store =  143451;
print $country, "\n";
fetchReviews();


$country="\nCOUNTRY: Malaysia";
$store =  143473;
print $country, "\n";
fetchReviews();


$country="\nCOUNTRY: Mexico";
$store =  143468;
print $country, "\n";
fetchReviews();


$country="\nCOUNTRY: Nederland";
$store =  143452;
print $country, "\n";
fetchReviews();


$country="\nCOUNTRY: New Zealand";
$store =  143461;
print $country, "\n";
fetchReviews();


$country="\nCOUNTRY: Norway";
$store =  143457;
print $country, "\n";
fetchReviews();


$country="\nCOUNTRY: Osterreich";
$store =  143445;
print $country, "\n";
fetchReviews();


$country="\nCOUNTRY: Pakistan";
$store =  143477;
print $country, "\n";
fetchReviews();


$country="\nCOUNTRY: Panama";
$store =  143485;
print $country, "\n";
fetchReviews();


$country="\nCOUNTRY: Peru";
$store =  143507;
print $country, "\n";
fetchReviews();


$country="\nCOUNTRY: Philippines";
$store =  143474;
print $country, "\n";
fetchReviews();


$country="\nCOUNTRY: Poland";
$store =  143478;
print $country, "\n";
fetchReviews();


$country="\nCOUNTRY: Portugal";
$store =  143453;
print $country, "\n";
fetchReviews();


$country="\nCOUNTRY: Qatar";
$store =  143498;
print $country, "\n";
fetchReviews();


$country="\nCOUNTRY: Romania";
$store =  143487;
print $country, "\n";
fetchReviews();


$country="\nCOUNTRY: Russia";
$store =  143469;
print $country, "\n";
fetchReviews();


$country="\nCOUNTRY: Saudi Arabia";
$store =  143479;
print $country, "\n";
fetchReviews();


$country="\nCOUNTRY: Schweiz/Suisse";
$store =  143459;
print $country, "\n";
fetchReviews();


$country="\nCOUNTRY: Singapore";
$store =  143464;
print $country, "\n";
fetchReviews();


$country="\nCOUNTRY: Slovakia";
$store =  143496;
print $country, "\n";
fetchReviews();


$country="\nCOUNTRY: Slovenia";
$store =  143499;
print $country, "\n";
fetchReviews();


$country="\nCOUNTRY: South Africa";
$store =  143472;
print $country, "\n";
fetchReviews();


$country="\nCOUNTRY: Sri Lanka";
$store =  143486;
print $country, "\n";
fetchReviews();


$country="\nCOUNTRY: Sweden";
$store =  143456;
print $country, "\n";
fetchReviews();


$country="\nCOUNTRY: Taiwan";
$store =  143470;
print $country, "\n";
fetchReviews();


$country="\nCOUNTRY: Thailand";
$store =  143475;
print $country, "\n";
fetchReviews();


$country="\nCOUNTRY: Turkey";
$store =  143480;
print $country, "\n";
fetchReviews();


$country="\nCOUNTRY: United Arab Emirates";
$store =  143481;
print $country, "\n";
fetchReviews();


$country="\nCOUNTRY: United Kingdom";
$store =  143444;
print $country, "\n";
fetchReviews();


$country="\nCOUNTRY: Venezuela";
$store =  143502;
print $country, "\n";
fetchReviews();


$country="\nCOUNTRY: Vietnam";
$store =  143471;
print $country, "\n";
fetchReviews();


$country="\nCOUNTRY: Japan";
$store =  143462;
print $country, "\n";
fetchReviews();

}

sub fetchReviews()
{
    # Build the shell pipeline: spoof the iTunes user agent, select the storefront,
    # fetch the first page of recent reviews, decompress, and pretty-print the XML.
    my $doit = qq{curl -s -A "iTunes/4.2 (Macintosh; U; PPC Mac OS X 10.2)" -H "X-Apple-Store-Front: $store-1" 'http://ax.phobos.apple.com.edgesuite.net/WebObjects/MZStore.woa/wa/viewContentsUserReviews?id=$currentSoftware&pageNumber=0&sortOrdering=2&type=Purple+Software' | gunzip | xmllint --format -};

   my $riz = `$doit`;

   # Keep only the lines that carry review text; drop navigation and markup-only lines.
   my @rizray = split(/\n/, $riz);
   @rizray = grep(/normalStyle/, @rizray);
   @rizray = grep(!/GotoURL/, @rizray);
   @rizray = grep(!/Sort by:/, @rizray);
   @rizray = grep(!/by.*<b>/, @rizray);
   @rizray = grep(!/">$/, @rizray);
   @rizray = grep(!/Copyright/, @rizray);
   @rizray = grep(!/> \/</, @rizray);
   @rizray = grep(!/>..</, @rizray);

   foreach my $item (@rizray)
   {
      # print $item, ": ";
      $item =~ s/.*">//;
      $item =~ s/<.*//;
      print "* ", $item, "\n";
   }
}

31 Comments

FatMax said:

Hi Erica!

Your script has one typo. Norway isnt spelled Normway, last time I checked ;)

- FatMax

Russell said:

Thanks for the script. It works great! I was just about to write something like this when I saw your version. 5 stars!

Perfect!!! Thanks a lot Erica!
That's what I was looking for since day one!

Joe said:

Nice! Thanks!

Øivind Kjellnø said:

Thanks :)

Let's just hope Apple will let us choose which language/review source we want in the App Store in the near future without turning to scripts, but I'll use this for now :)

Øivind Kjellnø said:

For us non-developers from small countries where reviews are hard to come by, the script would probably have been better if it prompted the user for the application ID. I don't know Perl scripting at all, but after a little bit of Google magic I found some example code and modified the script to prompt for an application ID and only show results in languages I understand. So now it's just perfect, all thanks to you Erica. No more empty review sections on applications I'm interested in buying.

Excellent! Thank you very much.

Steph said:

Thanks for that, this is very useful! Do you think it would be possible to also get how many stars each person gives and the average rating per country?

And would it be possible to do a similar script to count which position your app is in by popularity? Basically it would iterate over all the applications sorted by popularity and count how many apps are displayed before yours. Do you think this is possible?

Paul Hargreaves said:

Is there a list somewhere or way of obtaining all the ids? With that it should be trivial to break away from having to find applications via the awfully slow iTunes browser and just have lists of them automatically generated...

That is a great script Erica. You are wonderful.

Jeremy Wohl said:

Erica,

I've hacked the XML a bit more to pull rating, subject, author and body.

See http://github.com/jeremywohl/iphone-scripts/tree/master

Thanks for pulling all of this together.

Hey thanks for the cool script.

I have made a cool Dashboard widget from this script that seems to run really well.

http://www.hurl.ws/7db

Will look at the new script on github from Jeremy Wohl now.

John Ballinger

jb said:

thanks,

but do you know how from an iphone launch (with uiapplication openurl) the appstore on the search page??

jb

Justin Noel said:

Erica,

We are scraping content from the app store for use on AppBeacon.com .

Unfortunately, I just haven't been able to resolve some issues with the data we get. The downloads from the app store are supposedly UTF-8. However, we see all kinds of character sequences to represent trademark, copyright, apostrophe, bullets and other characters.

Here is an example:
http://appbeacon.com/apps/011430/legends-swords-of-raemllyn-3-blood-fountain . Look at the title on our site.

Here is the xml data for that :


Legends: Swords of Raemllyn #3 ###1m~@~S Blood Fountain

Do you have any suggestions for dealing with this? Am I simply missing something obvious?

Thanks,
Justin Noel


Peter said:

Great post, I was looking for something like this -- for those interested, I made an AIR application using this technique so you have a desktop application to check worldwide reviews on your apps.

http://www.peterelst.com/blog/2008/12/25/iphone-app-reviews-air-application/

Karun said:

Hey everyone,

I am trying to run this script on windows. Till now I havent been able to do it. Can anyone please please help me out in doing the same scraping on windows.

Please...

Regards,
Karun

Benjamin said:

Just wanted to share that I fixed a Perl script to extract the app rankings for all top 100 apps per category from all stores worldwide. Find my blog post here:

http://www.pearcomp.com/2009/07/03/app-ranks-for-all-categories-and-all-app-stores

Iphone Application Developers said:

Hey great script. Very useful, Thanks for sharing

Michael said:

Checkout APPlyzer.com for Top1000 Appstore Rankings WorldWide

Hunnenkoenig said:

I would need this desperately, but I am a total idiot with perl and other programming languages.

I would need to scrape user names, given stars, and the review itself only from the us store.

How do I use this?

As I said, I don't know anything. I just have a server, an FTP access and some PHP programming skills.

Can somebody write a step by step walkthrough, how to set it up ready for showing it on my page?

What is the actual script? Where to put it? How to call it from PHP etc.

I would really really appreciate it!

Thomas said:

All you need is: http://www.iphoneappsplus.com/
It’s all free.

Igor said:

Thanks for this information. It helped us to build AppComments: http://appcomments.com/
Makes RSS feed of user reviews.
Supports sharing, translate, emoji icons and more!

Hunnenkoenig said:

Apple changed their links in iTunes again.

Old links to the xml files don't work anymore.

Does somebody know, how to change them?

TiG said:

We are currently looking at scraping the app store for game reviews etc... for http://www.touchiphonegames.com thanks for the script. I think it may need looking at again since Apple changed the url format. But it is an awesome start!

Ilene said:

An updated working modification of the script can be found here:
http://blog.weatherangel.com/2009/07/app-store-adds-new-countries/

I was looking for the format for getting hot new games categories, etc since we have an app in there now...

Karl said:

The reviews are working great thanks for all the help.

Does anyone know the URL for the application information? (Name, Description, Screen shots, etc)

Thanks!

I was getting an error until I removed the 'gunzip' portion in the request URL. Perhaps Apple no longer uses compression?

Thanks Erica! You've been a real asset to the iPhone developer community.

Also had a problem with $store-1 .... I took out the -1 part and got good reviews... with the -1 I just got totals.
