## R in a Nutshell

joel.neely writes

"R is a statistical computing environment that is fully compliant with state-of-the-art buzzwords: free, open-source, cross-platform, interactive, graphics, objects, closures, higher-order functions, and more. It is supported by an impressive collection of user-supplied modules through CRAN, the 'Comprehensive R Archive Network.' And now it has its own O'Reilly Nutshell book, *R in a Nutshell*, written by Joseph Adler. I am pleased to report that Adler has risen to the challenge of the highly regarded 'Nutshell' franchise. As is traditional for the series, this title mixes introduction, tutorial, and reference material in a style that is well suited to a reader who already has a background in programming, but is a new or occasional user of R."

| R in a Nutshell | |
| --- | --- |
| author | Joseph Adler |
| pages | 672 |
| publisher | O'Reilly |
| rating | 9/10 |
| reviewer | Joel Neely |
| ISBN | 978-0-596-80170-0 |
| summary | A practical and engaging introduction to the R statistical system and its usage |

As a curious newcomer to R who wanted to get going quickly, I was well-served by Part 1, which provided an R kickstart. Chapter 1 covers the process of getting and installing R. It is short, to the point, and just works, addressing Windows, Mac OS X, and Linux/Unix with equal attention. Chapter 2, on the R user interface, introduces the range of options for interacting with R: the GUI (both the standard version and some enhanced alternatives), the interactive console, batch mode, and the RExcel package (which supports R inside a certain well-known spreadsheet). Chapter 3 uses a set of interactive examples to provide a quick tour of the R language and environment, establishing a task-oriented theme that carries through the rest of the book. The last chapter of Part 1 covers R packages. It summarizes the standard pre-loaded packages, introduces the tools to explore repositories and install additional packages, and concludes by explaining how to create new packages.

As a polyglot programmer who is always interested in seeing how a new language approaches programs and their construction, I enjoyed Part 2, which describes the R language. This section begins with an overview in chapter 5, and then devotes a chapter each to R syntax, R objects, symbols and environments (central to understanding the dynamic nature of R), functions (including higher-order functions), and R's own approach to object-oriented programming. The section closes with chapter 11, a discussion of techniques and tips for improving performance.

As a busy professional with data sitting on my hard drive that I'd like to understand better, I appreciated Part 3, with its practical emphasis on using R to load, transform, and visualize data. Chapter 12 presents alternatives for loading, editing, and saving data, from the built-in data editor, through file I/O in a variety of formats, to a mature set of database access options. Chapter 13 illustrates a range of techniques for manipulating, organizing, cleaning, and sorting data, in preparation for presentation or more detailed analysis. Chapter 14 introduces the reader to the wealth of graphical presentation options built into the R environment. There are so many charting types and details that this chapter could have been overwhelming, but Adler keeps the interest high and the mood light by drawing on an engaging variety of data: toxic chemical levels, baseball statistics, the topography of Yosemite Valley, demographic data, and even turkey prices. Chapter 15 is devoted to lattice graphics, the R implementation of the "trellis graphics" technique for data visualization developed at Bell Labs. This chapter illustrates the power of lattice graphics by exploring the question of why more babies are born on weekdays than weekends.

As a non-statistician who still occasionally needs to do some number-crunching, I'm sure I'll be returning to Part 4, with its detailed explanations and illustrations of analysis tools and techniques: almost two hundred pages' worth. In chapters 16 through 20, Adler surveys topics in data analysis, probability, statistics, power tests, and regression modeling. As someone who has been offered too many medications and lost fortunes, I found much to enjoy in chapter 21, which uses a variety of spam-detection techniques to illustrate the concepts of classification. Chapter 22, on machine learning, discusses several of the data mining techniques that R supports. Chapter 23 covers time series analysis, which may be used to identify trends or periodic patterns in data. Finally, chapter 24 offers an overview of Bioconductor, an open-source project focused on genomic data.

The book closes with a detailed reference to the standard R packages.

This is an impressive piece of work. In a volume of this size (about 650 pages), navigation is crucial, and I found both the organization of the chapters and the index up to the task. I was able to follow the instructions and examples through the first several chapters of the book essentially without a hitch, and in the later chapters the variety of illustrations and data sources added interest to what could have been very dull going.

I won't claim perfection for this book. There were a couple of explanations that could have been clearer, and one or two odd turns of phrase or rough edits. Out of all the code examples that I tried, I found exactly one that didn't seem to work without a minor correction. For a work of this size, that's actually pretty amazing!

As a long-time O'Reilly reader, I see Joseph Adler's *R in a Nutshell* as a welcome addition to the menagerie.

You can purchase *R in a Nutshell: A Desktop Quick Reference* from amazon.com.

## Emphatic Agreement (Score:5, Interesting)

In a volume of this size (about 650 pages)

Not to criticize the reviewer, but there's not enough written above to do this book justice. From the author's emphasis on preprocessing the data in another language (Perl, I think, in the Chapter 3 tutorial) so that it can be effortlessly ingested by R, to the very last pages on machine learning in R, it's a good book. I actively lament that in college I was relegated to Matlab, without the R of today and the many packages available on CRAN.

I too would give this book a 9/10. It sometimes tries to inject tutorials into what should probably stick to being a reference, and it might have too large a scope for a single volume (I've read sets of books on machine learning and classification models), but this book is great for R beginners and R intermediates and as an R reference.

Seriously, if you know a statistician who codes or a developer who values statistics, then this is their book. Given the nature of the subject matter and the GPL'd beauty of R, you'll undoubtedly have a hard time finding a negative review of this book anywhere.

## Fantastic book for a fantastic language (Score:3, Interesting)

This book always sits right on my desk.

R is a language that more people should really learn. The statistics community has definitely gravitated strongly to it. These days, with the thousands of packages on CRAN, it's far superior in functionality to other packages like Stata or SAS (I won't even go into people who use Matlab for statistics), not to mention that it's open source.

It is still a bit slower than Matlab for some matrix operations, but hopefully that will improve in the future.