Free Statistics Package for Your Mac: The R Project for Statistical Computing


If you search for the letter "R" in Google, the first hit that comes up is the R Project for Statistical Computing. R is a full statistical platform that provides a statistical programming language, a computational window, and programmable graphics output. It is an Open Source project that provides ready-to-run installable binaries for Mac OS X as well as Windows. If your spreadsheet's basic statistical functions fall short of your needs, you might want to take a look at this powerhouse statistical package.

The R Project for Statistical Computing

R is an implementation of S (An Interactive Environment for Data Analysis and Graphics) from Bell Labs. I recall getting S on a reel of tape for installation on a DEC VAX running Ultrix in the ancient times (1980s). Installing R from a DMG file is much simpler and faster.

r-install.jpg
Some of you might be amused to note that the R installation suite includes GNU Fortran. The installation process is completely automated, simple, and fast. By the way, the project just released R 2.7.1 on June 23.


r-mac-console.jpg

Although you can run the old text shell interpreter from a Terminal window or the X11-based graphical interpreter, most Mac users would probably prefer the Mac-friendly R Console.


r-console-cmdcompletion.jpg

If you, like me, keep a Terminal shell window open all the time when using your Mac, you will probably feel very comfortable with R Console. Using the up and down arrow keys moves you through your keyboard history. You can also press the ESC key for command completion. As you can see from the screen capture near this paragraph, the Mac R Console displays a list of possible commands in a drop-down menu.


r-histogram.jpg

Although you can manually enter data from the R Console, I prefer bringing data in from other sources. I tend to enter a lot of data into spreadsheets since my Windows Mobile phone has Microsoft Excel Mobile built in. Although R is capable of reading Excel XLS files directly, the R documentation recommends exporting data from Excel to a CSV file and importing the CSV file into R. I've been keeping a log of my automobile gasoline fill-ups for years now, covering two consecutive cars. My spreadsheet consists of the following columns: Date, Miles (driven), Gallons (of gas), Price (in US$), MPG (miles per gallon), and Station (the gas station name). Here's how I imported this data from a CSV file named GasMileage.csv:


gas = read.csv("GasMileage.csv", header=TRUE)

The header=TRUE parameter tells the read.csv function that the first row in the sheet contains the column labels. Note that R is case sensitive, so the value must be written TRUE, not True.
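
After an import like this, it is worth confirming that R read the columns the way you expected. The following is only a sketch: it writes a small temporary CSV with made-up rows (the real GasMileage.csv is not reproduced here) using the column names described above, reads it back, and inspects the result.

```r
# Hypothetical sample rows; the real GasMileage.csv is not shown in the post.
csv_path <- tempfile(fileext = ".csv")
writeLines(c(
  "Date,Miles,Gallons,Price,MPG,Station",
  "2008-05-03,310.2,10.5,41.90,29.5,Station A",
  "2008-05-17,295.8,10.2,42.75,29.0,Station B",
  "2008-06-01,322.4,10.8,46.10,29.9,Station A"
), csv_path)

# Read the file, treating the first row as column labels.
gas <- read.csv(csv_path, header = TRUE)

str(gas)           # column names and the type R assigned to each column
head(gas)          # the first few rows
summary(gas$MPG)   # quick descriptive statistics for a single column
```

If a numeric column such as MPG shows up as text in the str() output, that usually points to a stray non-numeric entry somewhere in the spreadsheet.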

The histogram of the MPG data column displayed in the graphic here was generated using the command:

hist(gas$MPG)

The gas$MPG tells the histogram (hist) function to use only the data from the MPG column. You can save all your work in a workspace when leaving R. This workspace (you can have more than one) can be loaded the next time so you can pick your work up where you left it.
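
R Console offers to save the workspace when you quit, but the same thing can be done by hand with save.image() and load(). Here is a minimal sketch, assuming a made-up gas data frame and a temporary file name chosen purely for illustration:

```r
# Hypothetical object standing in for the imported data.
gas <- data.frame(MPG = c(29.5, 29.0, 29.9))

# Save every object in the current workspace to a file.
# A temporary file is used here; by default R writes to ".RData".
ws_file <- tempfile(fileext = ".RData")
save.image(file = ws_file)

# Simulate a fresh session by removing the object, then restore it.
rm(gas)
load(ws_file)
mean(gas$MPG)
```

Using a different file name per project is one way to keep more than one workspace around.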

r-simpleplot.jpg
I also have electricity usage data for a building with a data center in it for fiscal years 2000 through 2006. I imported the table of information from a CSV file using:

kal = read.csv("buildingpower.csv", header = TRUE)

I created a plot of the kilowatt hours used for each fiscal year and connected the dots with lines using these two commands:

plot(kal$Year, kal$KWH)
lines(kal$Year, kal$KWH)
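
The same picture can probably be produced with a single call by passing a type argument to plot(). This is a sketch with invented numbers (the real buildingpower.csv is not reproduced here); only the Year and KWH column names come from the commands above.

```r
# Hypothetical figures; the real buildingpower.csv is not shown in the post.
kal <- data.frame(
  Year = 2000:2006,
  KWH  = c(510000, 470000, 480000, 505000, 540000, 585000, 630000)
)

# type = "o" draws the points and overplots the connecting lines in one call.
plot(kal$Year, kal$KWH, type = "o",
     xlab = "Fiscal year", ylab = "Kilowatt hours",
     main = "Building electricity use")
```

The xlab, ylab, and main arguments replace the default axis labels (kal$Year and kal$KWH) with something more readable.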

The dip after the year 2000 can be attributed to replacing hundreds of light fixtures and lights. But, as you can see, the ever increasing number of servers eventually took its toll over the years.


r-demo-persp.jpg

Don't let my simple examples make you think that R is limited to simple calculations and plots though. Type demo() in the R Console to see a list of R demonstrations. The shaded perspective graph here is part of the examples called by demo(persp).

The statistical features start with simple descriptive ones, move on to t-tests, ANOVA, and linear regression, and then let you move on to more complex multivariate statistical functions. If none of the packaged functions meets your needs, remember that R is a statistical programming language. So, write what you need.
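
To give a feel for that progression, here is a short sketch using mtcars, a data set that ships with R, as a stand-in for your own data. The cv function at the end is a made-up example of writing what you need; it is not part of R.

```r
# mtcars is a data set that ships with R; it stands in for your own data.
summary(mtcars$mpg)   # descriptive statistics

# Two-sample t-test: mpg for automatic (am = 0) vs. manual (am = 1) cars.
tt <- t.test(mpg ~ am, data = mtcars)
tt

# One-way ANOVA: mpg grouped by number of cylinders.
fit_aov <- aov(mpg ~ factor(cyl), data = mtcars)
summary(fit_aov)

# Simple linear regression: mpg as a function of weight.
fit_lm <- lm(mpg ~ wt, data = mtcars)
coef(fit_lm)

# A user-written function (hypothetical name): coefficient of variation.
cv <- function(x) sd(x) / mean(x)
cv(mtcars$mpg)
```

The formula notation (mpg ~ am, mpg ~ wt) is shared by all three modeling functions, so moving from one technique to the next mostly means swapping the function name.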

Read More Entries by Todd Ogasawara.

3 Comments

Todd

My friend and I have worked in I.T. customer service for 25 years. I think you will find it both readable and useful.

Michael: As an old SAS and SPSS user (with a little BMDP thrown in for good measure), thanks for the pointer! I sure hope Amazon discounts that book a bit when it gets released though :-) Stats and academic-type books always tend to be a bit pricier than mass market books (for obvious reasons). But, the amazing thing is that at $59.95, your friend's book has a way lower retail price than "The R Book" which lists for $110! :-)

Todd

You might be interested in:

R for SAS and SPSS Users by Robert A. Muenchen
http://www.amazon.com/SAS-SPSS-Users-Statistics-Computing/dp/0387094172/ref=pd_bbs_sr_1?ie=UTF8&s=books&qid=1214399348&sr=8-1

Bob is a friend and works here at UT.

Michael, :-)
