A Complete Guide to Pivot Tables 261
A Complete Guide to Pivot Tables: A Visual Approach | |
author | Paul Cornell |
pages | 368 |
publisher | Apress |
rating | 10/10 |
reviewer | Raymond Lodato (rlodato AT yahoo DOT com) |
ISBN | 1590594320 |
summary | A well-researched step-by-step tutorial on the use and programming of Microsoft Excel PivotTables for data analysis. |
If you have any need to analyze data in Excel, you must read this book. I learned so much more about PivotTables after I read it that they are now actually useful, rather than just being something I would occasionally try out. Every chapter is packed with excellent information in an easy-to-follow format. A beginning to intermediate user can understand most of the book; only the chapter on programming PivotTables requires intermediate to advanced knowledge to fully comprehend.
Cornell takes a tutorial approach to explaining what PivotTables are, what they are capable of doing, and how you can apply them to your needs. Each chapter in the Complete Guide gives you an overview of a single topic, a series of step-by-step examples, a Try-It section for more practice, and a summary of next steps. The book was written for Excel 2003, but most of the techniques can be applied to Excel 2002 and even Excel 2000.
When you read this book, I would recommend that you sit at your computer to try these techniques as you read them. I tried to just read the book at first, but you really get itchy to try each feature out. Take it in sequence, as there is a definite building from one chapter to the next.
Chapter 1 gives you an overview of the PivotTable feature, what it's meant to do, and why you would use it. Chapter 2 starts the in-depth training of building basic PivotTables from Excel Lists, external data sources, other PivotTables, etc. It also includes tips on formatting the information and tweaking the fields and table to your liking. Chapter 3 goes even deeper, with information on advanced settings, filters, calculated fields, and other little gems that make analysis easier. These three chapters complete your basic training and lead to chapter 4, "Using PivotTables in the Real World." Paul proceeds to give not just one, but three examples of how PivotTables could be used to provide insight into company operations.
While everything I'd known about PivotTables before picking up this book was covered in the first three chapters, the last three chapters explain additional capabilities that really make Excel valuable for data analysis. Chapter 5 explains PivotCharts, which are simply a graphical representation of the information shown in a PivotTable. Paul goes into detail on the different chart types and how they interact with the underlying PivotTable.
If you need to distill and analyze multidimensional, relational data, PivotTables are up to the task, as chapter 6 will demonstrate. The book describes how you use cube files, OLAP databases, and Microsoft Query to get the data and manipulate it. If you're a really capable programmer, comfortable with VBA, chapter 7 shows you how to work with PivotTable programmatically. There is program after program showing you exactly how to get at the data, massage it, and create the resulting PivotTable.
Finally, there is one appendix that describes the differences between Excel 2000, Excel 2002, and Excel 2003. I was originally reviewing this book while using a computer with Excel 2000. I was delighted to find out that most of the information in the book works exactly as Paul described (although the screen shots didn't match). I did recheck some of the examples on my other machine running Excel 2002, and had no problems at all.
Granted, this book is specifically written for Microsoft Excel. However, OpenOffice, the free competitor from Sun MicroSystems, mimics most of the Microsoft Office suite. How does it compare, you ask? Well, OpenOffice has a similar facility to PivotTables, called DataPilot; however, DataPilot is primitive in comparison. For example, you must select the data to summarize, choose (from the menu bar) Data --> DataPilot --> Start, then drag the fields to the appropriate place in the diagram and click OK. Like Excel, you can freely move the fields between row, column, and data areas, and change the data operation from Sum to Min, Max, or a number of others. Unlike Excel, there isn't much more you can do. You don't have Page fields; you can't sort fields on their data; PivotCharts aren't represented; and there's no programming. If you only want to do simple data analysis in OpenOffice, you can get the basics from chapters 1 and 2 of this book.
You can purchase A Complete Guide to PivotTables: A Visual Approach from bn.com. Slashdot welcomes readers' book reviews -- to see your own review here, read the book review guidelines, then visit the submission page.
Re:What IS a pivot table anyway? (Score:5, Informative)
Re:OK, so what is it? (Score:5, Informative)
Pivot Table History (Score:5, Informative)
Pivot tables were originally developed in Lotus Improv [wikipedia.org], Lotus's incredibly advanced word processor for the NeXT machine. Lotus attempted to develop a new paradigm (can't believe I used that word) for spreadsheet interaction, something more sophisticated that +A1+@sum(B1..B3). Pivot tables were a component of this formulaless spreadsheet.
Improv was, is friggin' amazing.
I suppose this is another example of Microsoft getting credit for company's innovations? [apple.com]
Summary of what a Pivot Table is (Score:5, Informative)
Name Age Height (cm)
---------------------
Jane 22 174
Dick 22 212
Mary 24 150
Greg 24 198
Dave 23 244
You can use pivot tables to come up with tables like this (but with correctly calculated values
Age Avg Height
--------------
22 190.5
23 244
24 174.5
For the most simple case, take any N-dimensional data, choose one of the dimensions to be your rows, choose one of the dimensions to be your columns, and choose a function of a dimension to be your field values. The more dimensions and data points you have, the more useful this becomes.
Re:What IS a pivot table anyway? (Score:3, Informative)
First, you've got to have some data to describe. Here's one I'm looking at right now: I've got a table in my Oracle database that lists services doctors performed on given dates, and how much they charged for them.
A conventional report would describe that data one-dimensionally. Total amount of money each doctor charged or total amount of money charged in each date range.
A pivot table (or matrix report, as Oracle Reports calls them) shows the same data two-dimensionally. Down the left is a list of doctors. Across the top is a list of months. Cells in the middle show how much each doctor billed in a given month. There are monthly totals at the bottom, doctor totals on the right, and a grand total in the lower right.
So essentially, a pivot table is just a tool for putting data from a RDBMS into the spreadsheet form you know and love.
Re:What IS a pivot table anyway? (Score:3, Informative)
You can use a pivot table turn this data points (all data for that point is contained on a row, with the top row giving the header for each column quantity, color, PO, date,) into a table. This way you can have a table for the total number sock delivered each week by color. Or just the number of shipments of socks of a particular color in a month.
Might not sound really useful, but if you get a lot of data on a day to day basis out of some SQL query and need to reorganize it into a more readable format (or make some nice charts for the PHB,) it is a life saver. I've also found other quick uses for when you need all your data in a certain format and either lack the skill or time to program a custom SQL query to get the data out in the right format. Pivot Tables, Text-to-Columns, and the CONCATENATE function in excel have been life savers for me before on projects once I get to the number crunching point.
Excel Feature? (Score:2, Informative)
A simple example of pivot tables (Score:5, Informative)
Using a pivot table I could end up with a matrix that looks like: Now I have the data sorted by continent. If I decide I'm not interested in population but instead in GDP, it is a simple drag and drop operation to get a table like: Basically pivot tables let you explore lists of data very quickly and efficiently. If you deal with lists of data regularly like I do, they are one of the most indespensible features in a spreadsheet. Excel has the best ones I've used but most modern spreadsheets have some version of them.
PIvot tables? Yawn (Score:4, Informative)
Anybody reading this article who actually thinks these pivot tables sound "powerful" should look into some of the real row-based data mining tools out there. For starters I suggest looking at Weka [waikato.ac.nz] and Orange [uni-lj.si].
Weka in particular is extremely easy to use and you don't have to be a researcher to figure it out.
Re:What IS a pivot table anyway? (Score:1, Informative)
Description of a Pivot Table (Score:3, Informative)
A pivot table:
or...
But in matrix form so that you end you with one cell for each color+location instead of one record. So the size of the matrix grows approximately by a power of 0.5 with respect to the number of results.
Re:What IS a pivot table anyway? (Score:3, Informative)
Okay, the format isn't magical, it's just a table. The magic-ish is the interface for changing it, basically, you can drag and drop ("pivot") your data categories (columns) and quickly say: okay, lets sum by widgets. Now by gadgets. Now by widgets that are gadgets.
All this can be done on the SQL command line. But this is one place where drag and drop works better than the command line (and I'm a sworn CLI addict for most things).
The un-magic about Excel pivot tables is that your're limited to 64k rows of data. In using these from Oracle, much of the trick is writing the base query to get a result down to 64K lines, then pivoting the results.
Re:What IS a pivot table anyway? (Score:3, Informative)
Pivot table info and resource (Score:4, Informative)
There seems to be some natural and/or knee-jerk confusion that pivot tables are some Microsoft-only creation. For some folks that naturally means that they'll never want to use them. That's too bad, a little bit like concluding that referential integrity or relational table structures are "so Microsoft" just because they're discussed in SQL Server documentation and literature.
Best discussion I've read of pivot tables as a generic tool for managing data appears in O'reilly's thin "Transact SQL Cookbook". Excellent book. The first few chapters are devoted to pivot tables alone. According to the authors, PTs are fundamental to solving many data storage and display problems. Have a read, then post more. :)
Hey Cool! (Score:1, Informative)
Quantrix does it better (Score:3, Informative)
Disclaimer: I'm not a shill--not even a customer, in fact. But I'm a friend of the author, and was a contributor to Quantrix's NeXTstep based predecessor.
Re:Pivot Table History (Score:3, Informative)
I'm getting tired of just about every discussion about Microsoft being used as an opportunity by Apple fans to promote their favorite company. Keep that sort of stuff to the Apple groups, please. Whether or not Microsoft copied a feature from Lotus Improv has nothing to do with Apple.
Furthermore, it is stupid for Apple fans to point fingers when it comes to copying: without copying other companies' innovations, Apple wouldn't exist; they copied the very core of their platform from others (SRI [sri.com], PARC [parc.com], Alan Kay [smalltalk.org]). Apple does have better taste than Microsoft in what they copy, but I hardly think they are more original.
Re:A simple example of pivot tables (Score:3, Informative)
Excel is a great tool for certain tasks. Since it is the only data manipulation tool many people know, it gets used for things it really shouldn't.
sPh
Re:What IS a pivot table anyway? (Score:3, Informative)
Okay - I'll take a stab at this. I've been using pivot tables rather heavily since being introduced to them in the last 12 months or so. The step up from a pivot table is a full-blown business intelligence (BI) tool.
Pivot tables allow you to "slice-n-dice" your data. Say you have a set of data that contains the following items: Business unit, sales person, sales territory, customer, customer buying group, order number, item information, cost and revenue. It's all just row after row of data, got it so far?
With SQL or just a big honkin' Excel sheet you can easily calculate data in one dimension (i.e. select/calculate all orders by salesperson X).
It's much more difficult to work in more than on dimension with data in this format. This is where pivot tables are really handy. Say you have the above data, but need to quickly produce a report to show revenue, by customer group, by item, by salesperson, sorted by date. That's more tricky.
This is a bit simplified; with a pivot table you simple select your data "elements" customer group, item and salesperson on the left, your date element on the top and then drop your "revenue" data element (subtract COGS from revenue) in the middle of the pivot and all the fields are automagically calculated. The best part is this all happens from your big, honkin' list of data. You didn't need to figure out any tricky SQL joins, etc when you got the data. You let the pivot table do the work.
Where it gets nice is the "slice-n-dice" capability I mentioned. Say you present this fancy report to your boss who says, "Great! Now show me the same by customer and not customer buying group!" All you have to do is replace the "customer group" element with the "customer" element and all the data is recalculated as fast as you can drop the fields. You didn't have to go back and touch your source data at all, it takes literally seconds to change how you represent the data.
There are a lot of OLAP (online analytical processing) tools out there. We use Cognos' Powerplay [cognos.com] tool. It's nothing but a pivot table on steriods. It works on cubes (collections of data) which we produce from our ERP system. Then we can look at and present our data in the OLAP tool extremely easily. We even have executives using this tool (albeit at a less productive level than the hardcore BI geeks).
If you've ever seen someone whiz around with a pivot table or a BI tool you'll understand why I'm rather excited by all of this. It's powerful stuff and makes looking at data, trends in data and "drilling down" into data painless and rather fun. To see it for youself, whip up some bogus data in Excel and run the pivot table wizard (data - PivotTable and PivotChart Report...). It will walk you through your first pivot table in about 2 minutes!
Re:A simple example of pivot tables (Score:2, Informative)
So, back to the SQL prompt, GROUP BY sexMF ORDER BY sexMF, back to the end-user, to the inevitable "Thanks, that's just perfect. But... how do the ages and sexes total up across all the countries in Sales Region 3?".
The bonus, for our company anyway, of the Pivot table is that we can write a single query to bring back lots of raw data without any of the GROUP BY clauses in place, and then the Account Managers or Operations Supervisors or whoever can use the Pivot Table feature to choose their own GROUPings and subtotals, counts, averages, sort orders and so forth, in their own time, generating dozens of business-useful reports from a single half-hour chunk of chargeable IT time. And since IT time is recharged at about ten times the price for Account Management time, the Pivot Table approach saves us a lot of time and a lot of expensive IT time to use on more profitable work.
Re:What IS a pivot table anyway? (Score:3, Informative)
And allows you to choose any subset of the fields of the spreadsheet for display in a mini-spreadsheet, lets you choose any field you want to serve as a "grouping agent" and dynamically updates subtotals for you. Plus, the subtotals and totals field can be changed to a variety of formulae, including Sums, Averages, even just counts. And you can choose whether or not to show every record that contributed to that subtotal, or just the subtotal line, and make that choice for *each* group.
So, to continue your example, you could check to see what the average price was for a particular service, or count the number of times a particular procedure was performed during a particular month, or each month, whatever. You can twist and mold your data and get views into trends and summaries like never before. Well, perhaps you could get at it in the past, but not with this level of ease and thoroughness.
And whoever suggested using awk against a csv to achieve this is so far off-base, its laughable.