PMD Applied
Simon P. Chappell writes "It's a generally agreed fact within our industry that code reviews are good. Really good. Sliced-bread good. But have you actually tried organizing one? Even if you can get everyone who needs to be there into the same meeting room at the same time, you still have the challenge of keeping a roomful of geeks from getting trapped in minutiae and squabbling over details like formatting conventions. Well, what if I told you that you could get your code reviews done in less than five minutes and that there would be no arguing? Enter PMD, an open-source static analyzer for Java. Think of it as a code review in a box. As if that weren't wonderful enough, there's even a book, PMD Applied, written by Tom Copeland, the author of PMD." Read on for the rest of Simon's review.
PMD Applied
    author: Tom Copeland
    pages: 221
    publisher: Centennial Books
    rating: 9
    reviewer: Simon P. Chappell
    ISBN: 0976221411
    summary: A good book if you use or are considering using PMD for static analysis of Java code
I hope that it's not too much of a stretch to suggest that this book is primarily written for users of PMD. That said, I actually purchased it (yup, this one wasn't a freebie) because I was considering recommending PMD as part of the build process at a previous employer. This is not a slam on the documentation on the PMD website, but Mr. Copeland was offering signed copies of the book if you pre-ordered, and I wanted to add to my growing collection of books signed by the author. (No contest; geeky as charged.)
PMD Applied has ten chapters. The first chapter is the mandatory introduction and the second covers installation. Nothing unusual there. Chapter three covers using PMD from within Ant or Maven. I'm glad to see both build systems covered here: both have a good-sized user base and both deserve to be represented. Chapter four covers using PMD within an IDE. The range of IDEs covered is excellent and includes all of the ones that I would have covered and a couple of others that I hadn't heard of.
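To give a flavor of the Ant integration the chapter describes, a minimal target looks something like the sketch below. This is my own reconstruction from the PMD of that era, not an excerpt from the book; the task class name, jar paths and ruleset names will vary with your PMD version.

    <taskdef name="pmd"
             classname="net.sourceforge.pmd.ant.PMDTask"
             classpath="lib/pmd.jar"/>

    <target name="pmd-report">
      <pmd rulesetfiles="rulesets/basic.xml,rulesets/unusedcode.xml">
        <!-- write an HTML report instead of failing the build -->
        <formatter type="html" toFile="pmd-report.html"/>
        <fileset dir="src">
          <include name="**/*.java"/>
        </fileset>
      </pmd>
    </target>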
Chapter five begins the serious job of using PMD for static analysis by examining PMD's ability to detect duplicate code using the wonderfully named Copy and Paste Detector (CPD). Chapter six is titled Best Practices. It's a wonderful collection of advice from Mr. Copeland on how to begin applying PMD to your Java development process. As a Team Leader, I understand many of the points he makes. Teams generally don't enjoy having an impartial tool that does nothing but nitpick their code and programming style dumped on them in the middle of a project; such things are a fast track to being unpopular. Mr. Copeland's guidelines will increase the chance of your team accepting PMD, because you start with a subset of helpful rules with low chances of false positives.
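Back to chapter five for a moment: CPD can also be run straight from the command line, which is the quickest way to get a taste of it. The invocation below is from memory of the PMD 3.x era, so treat the class name and flags as approximate and check your version's docs.

    java -cp pmd.jar net.sourceforge.pmd.cpd.CPD \
         --minimum-tokens 100 --files /path/to/your/src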
For all of its cleverness, PMD is nothing without well-written rules, and chapter seven has this covered. PMD ships with a good-sized rule set of its own, even though not all of its rules will be suitable for every site. Creating custom rule sets is an important first step in tailoring PMD for your team. Once the team is used to PMD, they're likely to want to start adding new custom rules to the mix, and the book has you covered there as well, with instructions on writing rules both in Java and as XPath expressions.
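As a taste of the XPath flavor, a custom rule is just an entry in a ruleset file. The sketch below flags while loops written without braces; it's adapted from the style of example the PMD documentation used at the time, and the XPathRule class name assumes the old package layout, so adjust for your version.

    <?xml version="1.0"?>
    <ruleset name="Our house rules">
      <description>Custom rules for the team</description>
      <rule name="WhileLoopsMustUseBraces"
            message="Avoid 'while' statements without curly braces"
            class="net.sourceforge.pmd.rules.XPathRule">
        <priority>3</priority>
        <properties>
          <property name="xpath">
            <value><![CDATA[
              //WhileStatement[not(Statement/Block)]
            ]]></value>
          </property>
        </properties>
      </rule>
    </ruleset>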
Chapter eight addresses the matter of customizing PMD. Naturally, making changes to the code and recompiling is the first customization covered, but the chapter then looks at custom report formats, adding new languages to CPD, and advice on contributing code to the PMD project. Chapter nine is a look at the internals of PMD. Perhaps this won't be of much interest to some readers of this book, but I found it fascinating. There's nothing quite like having the internal workings of a useful tool explained clearly by the person who wrote it. If you don't like it, it's only twenty pages, so skip over it and check out chapter ten, where the playing field of similar open-source tools is examined. The book wraps up with a large appendix containing a very useful explanation of all of the rules that ship in the standard PMD deliverable.
I'm going to let a good part of my personal biases show here: I absolutely love slim, detailed technical books. PMD Applied is 221 pages, which is about the size of the pre-ANSI version of Kernighan and Ritchie's The C Programming Language, the greatest technical book in the known universe. This is the perfect size for a technical book, allowing the author to introduce the subject, explain everything that needs saying, and wrap up before the reader's eyes roll back into their head and information overload descends.
The level of technical discussion matches the size of the book and is just right. While the first few chapters deal with what PMD is and how to install and run it, the rest of the book deals with good solid code analysis using PMD. This is why we bought the book. We're geeks ... show us the code! Mr. Copeland doesn't disappoint, and there are many excellent code examples throughout the book.
The appendix is a snapshot of the rules that shipped with PMD at the time of the book's publication. While new rules have been added since then, the listing of every rule, each with a snippet of example code that would trigger it, remains useful. Even if you don't use PMD, buy the book so you can use the appendix as a comprehensive catalog of how not to write Java code!
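To show the sort of thing the appendix catalogs, here's a small example of my own (not taken from the book) that trips two of the stock rules, UnusedLocalVariable and EmptyCatchBlock:

    public class BadExample {
        public void readConfig() {
            int retries = 3; // UnusedLocalVariable: assigned, never read
            try {
                java.io.FileReader reader = new java.io.FileReader("app.conf");
                reader.close();
            } catch (java.io.IOException e) {
                // EmptyCatchBlock: the failure disappears silently
            }
        }
    }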
The book has been out for a year now, so the exact version of the software it describes has been passed by. This may bother you, but I feel that the basic principles of PMD have remained true, so I wouldn't let the version numbers dissuade you from purchasing.
The book is only available from the publisher, Centennial Books. They're a smaller publisher, but I had no problem purchasing through their website.
If you use or are considering using PMD for static analysis of Java code, then this book should be by your side.
Slashdot welcomes readers' book reviews -- to see your own review here, read the book review guidelines, then visit the submission page.
Misses the point (Score:5, Insightful)
Re: (Score:3, Insightful)
Heh - you would think people compile with all warnings turned on. I've been working on a project recently that didn't.
Also, the thing that neither static checkers nor semantic analysis can replace is really good up-front requirements and architecture. Those are often what's most lacking, and they have the most impact. A bad architectural decision early on can make life so much worse later, and that's not something that can be "fixed" in the traditional code review process.
Re: (Score:3, Funny)
This is, in my experience, more the rule than the exception. Some moron starts programming, doesn't bother with the compiler, figures someone will clean that up later ... and five years later you cannot even add CFLAGS=-ansi -Wall without having the whole codebase barf thousands of warnings at you.
Re: (Score:3, Informative)
It is an excellent thing to run before doing code review. It flags piddling sources of code errors and formatting non-compliance so that your peer review process doesn't have to.
Re: (Score:3, Insightful)
So using a style which leads to the minimum of errors is less important than your personal preference? It's not surprising that most software is bug-laden shit when so many programmers have attitudes like yours.
Re: (Score:2)
When you're not a first-year graduate anymore, you start seeing why they're useful. They are in fact MORE clear and MORE logical than a bulky if-then-else. Where I work no one has a problem with them, so we disabled that rule and everyone is happy. You're SUPPOSED to do that.
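(Assuming the rule being argued about here is the ternary one, the trade-off in Java looks like this; which version is "MORE clear" is exactly the taste question being fought over:)

    // Conditional (ternary) operator: one expression
    String label = (count == 1) ? "item" : "items";

    // Equivalent if-then-else: same logic, more ceremony
    String label2;
    if (count == 1) {
        label2 = "item";
    } else {
        label2 = "items";
    }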
Re: (Score:2)
You are absolutely correct.
I've done [Java] code reviews with and without static analysis tools.
If you don't have static analysis tools in place, the review tends to spend its time on finding and discussing little bugs. You'll still find the big ones, but the room is likely to be so worked up (or tuned out) by the discussion of what's "basics" to some and "nit-picking" to others that the discussion of the important stuff is difficult to keep constructive.
If you do have static analysis tools in place, you can settle the little stuff before the meeting and keep the review focused on the important things.
Re:Misses the point (Score:5, Insightful)
    of code) {
        is to be }
    [ understood, then
    formatting standards]
    // can really help.
Re: (Score:2)
astyle dirty-file
Re:Misses the point (Score:4, Interesting)
I've felt for many years that source should be stored as syntax trees and that editors should show you the source however _you_ want to see it, so it's most clear to you, not to the author.
Re: (Score:3, Interesting)
However, it would be cool if your text editor automatically cleaned up the code for your display without actually changing the file.
Re: (Score:2)
I do not believe that is possible. My tastes in formatting cannot be encoded into an algorithm.
And besides, how many fellow programmers do you know who have horrible formatting, but at the same time happen to name variables and types well, and document tastefully? I know none.
This idea was hot back in the early 1990s.
Re: (Score:2)
I'm curious. Do you have examples?
I don't think it's a matter of "horrible" formatting. It's just a matter of personal taste, like where the brace belongs, how many spaces to indent, etc. It certainly would be nicer if everybody could use their personal style.
Re: (Score:3, Insightful)
On the contrary, if you have a code formatting standard (and imho you should), then formal code reviews most certainly should flag transgressions.
PMD defined (Score:4, Informative)
Look at the sourceforge definition of PMD [sourceforge.net].
Re: (Score:1)
However, I don't trust boxes. The first box I opened had Justin Timberlake inside. I'm still paying child support to the second box.
Re: (Score:2)
Easy: Erick and Parrish Making Dollars [wikipedia.org]. Well, PMD was when Parrish went solo.
A note to the reviewer (Score:2)
Freshmeat [freshmeat.net] says, "PMD is a Java source code analyzer. It finds unused variables, empty catch blocks, unnecessary object creation, and so forth."
Re: (Score:2)
Won't somebody think of the
I have (Score:2)
still need human eyes (Score:4, Insightful)
Static analysis is the sh-t (Score:1)
Re: (Score:2, Informative)
From what I understand from my friend who works for M$, one of the main reasons it took so bloody long to write Vista was that they did precisely that. He may be wrong, or the automatic buffer-overflow-exploit finder may not have been perfect, but at least they're trying.
Re: (Score:3, Insightful)
As an aside, though, I question the sanity of such tools. Why does the language allow you to use constructs that an automated tool could tell you were screwy?!? Shouldn't the practices the tool catches be compiler errors, or at least glaring warnings? It seems to me that the underlying problems lie in the language specifications, and the tools are only there as a band-aid.
Re: (Score:1)
If you're writing a device driver, how are you going to have a language that tells you not to call a sleeping function when the IRQ level is raised?
Re: (Score:2)
You aren't. But I doubt a static analyzer is going to catch that either.
By abstracting away the mutation. You can do this with C++ RAII, or Monadically, or any number of other ways.
We will always need device drivers. C and ASM are great for that, no question about it.
Re: (Score:1)
There are tools that are powerful enough to find if small programs (such as drivers) have these bugs. They are limited by assumptions they make so they can't catch all of them, but they will usually either prove a program free of these bugs up to their assumptions or find a concrete counterexample.
(In general such a problem is of course undecidable, but for the tiny subset of possible programs that represent what we actually write, the analysis is often tractable in practice.)
Fortify does this (Score:1)
http://www.fortifysoftware.com/products/sca/scaHowItWorks.jsp [fortifysoftware.com]
I think these guys compile an object that makes it easy to examine every possible logic path in the source.
It's more than just source code pattern matching.
Re: (Score:2)
Because the linter is only right 90% of the time. A language has to be flexible to be useful; a linter doesn't.
Re: (Score:2)
Without even getting into non-mutable semantics, we
everyone's got a different name for it (Score:1)
Hey, if you want to call it [youtube.com] a "code review", that's your prerogative.
Business logic? Algorithms? (Score:3, Insightful)
Re: (Score:2)
I think that's exactly right - use tools like PMD to find nickel-and-dime things like certain null pointer exceptions [sf.net], unused code [sf.net], empty try blocks [sourceforge.net], etc. Let the code reviews be focused on things like "hey, we don't need all these accessors", "we should be using the business rule package here", "this is really more of a Map than a List" and that sort of thing.
The tools do the gruntwork and the people do the thinking... good times.
Re: (Score:2)
(What is "business logic", anyway? Can I assume it's a synonym for "logic", only made more palatable to PHBs?)
I guess that's one way of putting it, yes. It's also a tool for people to learn each other's parts of the code, and for understanding each other's styles and design choices.
Static Analyzer Run != Code Review (Score:3, Insightful)
No, no, no!
Neither of these development tools is a substitute for the other. Static analyzers are very good at catching *certain* *limited* *kinds* of obvious and likely or conceivable bugs in a short span of time, but in general they cannot deduce the intentions of the developer -- let alone the desires of the user. That's where your peers come in: they review your designs before you write the code and review patches prior to merging them into a revision control system. Static analyzers are *debugging tools*: use them while coding and/or when you run regression tests.
Why wasn't that obvious to the submitter?
Re: (Score:2)
Having worked in corporate I/S for far too many years now, I understand that the "right thing" is rarely what happens.
I agree with your point that PMD is not a substitute for a REAL code review, undertaken by a small group of talented programmers. Unfortunately, these rarely happen, so PMD starts to be a very useful second line of defense. For the pragmatic among us, using PMD makes much sense. I wish it weren't so, but I find that it is.
Re: (Score:1)
Ok, we agree so far...
Why do you wish that using PMD wouldn't make sense?
I think perhaps you missed my other point: Code review is not a substitute for running a static analyzer.
Re: (Score:1)
Nice summary. Static analysis tools have a firm place in a well-planned development cycle, and that place is on the developer's desktop, while they're coding. Applying static analysis as a "last thing before ship" stage is a recipe for failure.
So, apply static analysis while coding and get all the advantages that it brings.
Re: (Score:2)
To be fair, good static analysis tools can catch a LOT of non-obvious bugs. MS Research's SLAM project found a bug in the parallel port driver, a race condition that was triggered if a program was closing the port at the same time the port was being physically removed from the computer. (Hey, it could happen if you have a laptop docking station.) This bug went undetected for something like a decade.
Re: (Score:1)
they = static analysis tools
Got my antecedents confused...
Re: (Score:1)
Very true; and I've seen SAs catch some very non-obvious bugs so I shouldn't have said "obvious". (:
How good is that code? (Score:5, Funny)
Our diabolical prof fed the analysis code itself to it as data. It spat out a D. He got a D.
Re: (Score:3, Interesting)
PMD has a bunch of metrics stuff, including NPathComplexity [sourceforge.net] (thanks to Jason Bennett for writing that one) and CyclomaticComplexity [sourceforge.net]. All the "codesize" PMD rules are here [sourceforge.net].
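For reference, enabling those in your own ruleset looked roughly like this in the PMD releases of the time (the reportLevel property and rule paths are from memory, so double-check them against your version):

    <ruleset name="complexity-checks">
      <description>Size and complexity rules</description>
      <rule ref="rulesets/codesize.xml/CyclomaticComplexity">
        <properties>
          <!-- report methods at or above this complexity score -->
          <property name="reportLevel" value="10"/>
        </properties>
      </rule>
      <rule ref="rulesets/codesize.xml/NPathComplexity"/>
    </ruleset>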
Re: (Score:2)
"Our diabolical prof fed the analysis code itself to it as data. It spat out a D. He got a D."
What an idiot! Come on, if it's his freaking *project*, you'd think he'd at least do the vanity/sanity check of running it on itself. Something like that had better damned well produce an 'A' for itself. Did it never occur to him to at least practice the principles of design he's preaching?
I'd have done the same thing if I were the prof; it's just too damned obvious. And incredibly funny.
Re: (Score:2)
If it was anything like my senior design project, he was up late the last week just getting the thing to work right, so making lots of comments and doing lots of extra testing was more than likely not the top priority. Don't forget, you only have about two months to design, code, test, and make a presentation and a working example of a project.
If you're doing anything like this, you make sure it gives itself an A through the entire development cycle, even if it does nothing else.
findbugs is better (Score:2, Insightful)
PMD - OK, but... (Score:1)
It's like checkstyle - good for catching some things, but sometimes too irritating to take seriously.
Structural analysis for Java (Score:1, Interesting)
http://www.alphaworks.ibm.com/tech/sa4j [ibm.com] back in '04.
Yes, heuristics can catch probable mistakes in developed code, but the above sort of thing, if sophisticated enough, could warn you about serious architectural mistakes as you are making them.
Finding Time For Code Reviews Is The Problem (Score:2)
I don't believe a machine, no matter how well programmed, can replace a proper review done by real live human beings. But to do that, you need to give people the time to do it right. Unfortunately, that time is rarely given.
What happened to Slashdot? (Score:1)
This software cannot ever work properly. (Score:2)
Thus a general algorithm to solve the halting problem for all possible program-input pairs cannot exist, let alone one that can validate the semantics or correctness of the code itself.
Unfortunately there are enough managers who want to devalue software development from a skilled art into a brain-dead process.
Re: (Score:1)
I understand the computer theory behind your comment. However, I don't see what this has to do with the task of improving code quality. Do I need to prove whether a program halts to be able to suggest improvements through static analysis?
Static analysis tools are one part of the complex software development process. I wouldn't be so quick to throw them out completely because of something a very smart computer scientist once said!
Re: (Score:2)
There were many posts that pointed out that this software was not meant to take the place of code reviews. The original submitter/reviewer had it wrong. No need to rant about the Halting Problem and managers.
Defence contracting (Score:3, Insightful)
It's easy: the source is printed out and circulated.
It is the responsibility of each peer to be familiar with the requirements spec (even if it's not part of their project), and the design that has been signed off. If they don't agree with the design - too bad - the process allows for plenty of time to look at design options, document those options, and document reasons for choosing one approach over another. Don't waste people's time by trying to flog a dead horse at a code review.
The meetings are known in advance, so it's never a problem getting people in the same room at the same time.
The code review meeting is informally divided into four sections:
1) Intro - if this is NOT the first code review, quickly list the TODO items that came up from the last code review. Good developers will refer to these recurrent issues quickly and stick to making use of the time by covering new ground. Ten minutes to discuss any code formatting / spelling / naming convention issues. These ARE important for maintainability and consistency, so egos need to be put on hold for the duration of this ten-minute period. It's not hard to do.
2) Critical issues section - no time limit.
Strangely, when your peers rip the shit out of your code, yes it hurts, but it gives you more respect for those peers. There is nothing better than feeling that you are surrounded by really top-notch people that you CAN bounce ideas off, and get better insights into what you are working on. That is one side-benefit of tight code reviews.
Good, disciplined peers will stick to the agenda, and not wander off into talking about design options or formatting issues during this period.
Document everything - even if the conversation winds around and ends up coming to the conclusion that the code is not at fault after all, document it, so you don't end up going back down the same path later on down the track.
3) Requirements review: a quick session to discuss whether the code matches the requirements. Don't bother turning up to a meeting without a thorough understanding of those requirements - you will look like a dick, and nobody will invite you to a pissup on Friday arvo. Document any areas where the code falls short of requirements, AND ALSO document any areas where the developer has gone over the top and added a pile of 'features' that are outside of the requirements. Keep the code lean and mean.
4) Five-minute wrap-up: write up a TODO list of things that need to be done, estimate a timeframe, make sure it gets plugged into the Gantt chart, and set a fixed time for the next review.
Time and schedule should be on everyone's mind, but that's a management and sales problem, not a technical one. Good developers will ignore schedule constraints and stick to doing things right. Having this attitude MAY turn some people's hair grey, but that's not our problem.
And we do build good shit.
Most importantly, everyone should come out of a code review with that smug feeling that they really are part of an elite team of coders, who look at things from every possible angle. It builds morale and raises standards all round.
Re: (Score:2)
Why? Do you know something I don't?
easy enough (Score:2)
The key here is that the developer improves his code, or learns more about how the other developers work. That's all.
Lintian checking != peer reviews (Score:2)
So it seems that PMD is something like lint, splint or QAC, but for Java.
Where I work we use QAC on C code (embedded development), to obtain information and statistics on such things. However, there is also an extensive code review practice set up.
Presubmission reviews are the strongest way (Score:1)
The process that I have implemented within my team is quite simple.
1) Before the review, the developers prepare the submission comments as well (low-quality check-in comments are as bad as low-quality code)
2) The developers then run a script to prepare a diff for review
3) The review is sent to *all* developers
4) Any developer can put the submission on hold over a logic concern
Review Tooling (Score:1)
checkstyle - enforce a coding style. We integrate this into our SCM system so people can't check in unless their code meets the official style. It's draconian but our codebase would have 12 different styles and be close to unreadable without it.
SmartBear's CodeCollaborator - a tool which makes code reviews easy. This is a necessity when working with junior developers.