Book Review: Java Performance

Posted by samzenpus
from the read-all-about-it dept.
jkauzlar writes "The standard Oracle JVM has about sixty 'developer' (-XX) options which are directly related to performance monitoring or tuning. With names such as 'UseMPSS' or 'AllocatePrefetchStyle', it's clear that Joe Schmo Code Monkey was not meant to be touching them, at least until he/she learned how the forbidding inner recesses of the JVM work, particularly the garbage collectors and 'just-in-time' compiler. This dense, 600-page book not only explains these developer options and the underlying JVM technology, but also discusses performance, profiling, benchmarking and related tools in surprising breadth and detail. Not all developers will gain from this knowledge and a few will surrender to the book's side-effect of being an insomnia treatment, but for those responsible for maintaining production software, this will be essential reading and a useful long-term reference." Keep reading for the rest of jkauzlar's review.
Java Performance
author Charlie Hunt and Binu John
pages 693
publisher Addison Wesley
rating 9/10
reviewer Joe
ISBN 0-13-290525-6
summary Java performance monitoring and tuning
In my experience, performance tuning is not something that is given much consideration until a production program blows up and everyone is running around in circles with sirens blaring and red lights flashing. You shouldn't need a crisis, however, before worrying about slow responsiveness or long pauses while the JVM collects garbage at inconvenient times. If there's an opportunity to make something better, if only by five percent, you should take it, and the first step is to be aware of what those opportunities might be.

First off, here's a summary of the different themes covered:

The JVM technology: Chapter 3 in particular is dedicated to explaining, in gory detail, the internal design of the JVM, including the Just-In-Time Compiler and garbage collectors. Because this is requisite knowledge for anyone hoping to make any use of the rest of the book, especially the JVM tuning options, a reader would hope for it to be explained well, and it is.

JVM Tuning: Now that you know something about compilation and garbage collection, it's time to learn what control you actually have over these internals. As mentioned earlier, there are sixty developer options, as well as several standard options, at your disposal. The authors describe these throughout sections of the book, but summarize each in the first appendix.

Tools: The authors discuss tools useful for monitoring the JVM process at the OS level, tools for monitoring the internals of the JVM, profiling, and heap-dump analysis. When discussing OS tools, they're good about being vendor-neutral and cover Linux as well as Solaris and Windows. When discussing Java-specific tools, they tend to have a bias toward Oracle products, opting, for example, to describe the NetBeans profiler without mentioning Eclipse's. This is a minor complaint.

Benchmarking: But what good would knowledge of tuning and tools be without being able to set appropriate performance expectations? A good chunk of the text is devoted to lessons on the art of writing benchmarks for the JVM and for an assortment of application types.

Written by two engineers for Oracle's Java performance team (one former and one current), this book is as close to being the de facto document on the topic as you can get and there's not likely to be any detail related to JVM performance that these two men don't already know about.

Unlike most computer books, there's a lot of actual discussion in Java Performance, as opposed to just documentation of features. In other words, there are pages upon pages of imposing text, indicating that you actually need to sit down and read it instead of casually flipping to the parts you need at the moment. The subject matter is dry, and the authors thankfully don't try to disguise this with bad humor or speak down to the reader. In fact, it can be a difficult read at times, but intermediate to advanced developers will pick up on it quickly.

What are the book's shortcomings?

Lack of real-world case studies: Contrived examples are provided here and there, but I'm really, seriously curious to know what the authors, with probably two decades between them consulting on Java performance issues, have accomplished with the outlined techniques. Benchmarking and performance testing can be expensive processes and the main question I'm left with is whether it's actually worth it. The alternatives to performance tuning, which I'm more comfortable with, are rewriting the code or making environmental changes (usually hardware).

3rd Party tool recommendations: The authors have evidently made the decision not to try to wade through the copious choices we have for performance monitoring, profiling, etc, with few exceptions. That's understandable, because 1) they need to keep the number of pages within reasonable limits, and 2) there's a good chance they'll leave out a worthwhile product and have to apologize, or that better products will come along. From my point of view, however, these are still choices I have to make as a developer and it'd be nice to have the information with the text as I'm reading.

As you can see, the problems I have with the book concern what is missing from it, not what's already in there. It's really a fantastic resource and I can't say much more than that the material is extremely important and that if you're looking to improve your understanding of the material, this is the book to get.

You can purchase Java Performance from amazon.com. Slashdot welcomes readers' book reviews -- to see your own review here, read the book review guidelines, then visit the submission page.
This discussion has been archived. No new comments can be posted.

  • by idontgno (624372) on Friday February 17, 2012 @05:44PM (#39079663) Journal

    It seems this kind of volatile deep non-documented black magic might change radically from JVM revision to revision. Although the Oracle "documentation" page [oracle.com] seems to contain a lot of "legacy" options, there still seems a risk that this book would be outdated as soon as the next JVM release.

    Oh, well, the tech publishing industry seems to be doing pretty well, even if the rate of technology change means that a tech fact is OBE before it's committed to ink.

    • by MightyMartian (840721) on Friday February 17, 2012 @05:49PM (#39079715) Journal

      Indeed. These sorts of options are so version dependent (not even going to alternative implementations) that I think the overwhelming majority of developers would want to stay far away from this sort of book.

      • They're not going to throw out the JVM and rewrite it from scratch between releases. If there are 60 options now, there may be 66 in the next release. That means 90% of the book is still useful and the other 10% is just missing.

        On top of that, as the reviewer clearly states "Unlike most computer books, there's a lot of actual discussion in Java Performance, as opposed to just documentation of features.... there are pages upon pages of imposing text, indicating that you actually need to sit down and read it...". So this book is already the kind of book that isn't going to be overturned by one more JVM release. It may contain actual wisdom rather than a list of flags.

        • Exactly. Take even the simplest Linux command, like 'rm'. Now look at this excerpt from the man page:

          --no-preserve-root
          do not treat `/' specially

          What does that mean? That little blurb really isn't sufficient to learn what that option does. If you already are familiar with rm, then that blurb will likely remind you of the intended action. Unfortunately, the goal of a lot of online documentation is to refresh yo

    • by idontgno (624372)

      And, in good Slashdot form, replying to myself (explain again why don't we have an "edit" button?), I read on the very Oracle web page I cited earlier:

      Options that are specified with -XX are not stable and are not recommended for casual use. These options are subject to change without notice.

      So, the "-XX" options are unstable and are subject to change without notice, which is why we have to commit them to a $25 to $35 pile of dead trees.

      • by DiegoBravo (324012) on Friday February 17, 2012 @06:19PM (#39080003) Journal

        Apparently only Chapter 7 is about the -XX options:

          Chapter 1: Strategies, Approaches, and Methodologies, Chapter 2: Operating System Performance Monitoring, Chapter 3: JVM Overview, Chapter 4: JVM Performance Monitoring, Chapter 5: Java Application Profiling, Chapter 6: Java Application Profiling Tips and Tricks, Chapter 7: Tuning the JVM, Step by Step, Chapter 8: Benchmarking Java Applications, Chapter 9: Benchmarking Multitiered Applications, Chapter 10: Web Application Performance, Chapter 11: Web Services Performance, and Chapter 12: Java Persistence and Enterprise Java Beans Performance.

        BTW, of course you should avoid the XX options, but when in need, it is better to have some authoritative reference than having to rely on the uninformative Oracle docs or random forums.

    • by medv4380 (1604309)
      How many years were we on Java6 before Java7 came out?
  • Currently reading (Score:2, Informative)

    by Anonymous Coward

    The development team at my company is currently reading this book. I'm three chapters in, and am having a hard time following it all. I often have to reread paragraphs, or entire pages, as soon as I finish them just to keep terms and names straight. Some applications which are discussed are five words long, like Microsoft Process Performance Analysis Console, or something head-spinning like that. That's the actual name of the application, so there is no other way to refer to it, so it's just the nature

  • I can see the use for options to specify heap sizes, and to tweak latency vs. throughput of the GC. These are critical for memory-constrained and timing-constrained applications, respectively. I presume the other options are performance related, but how effective are they really?

    • by DalDei (1032670) on Friday February 17, 2012 @06:37PM (#39080185) Homepage
      -XX11 is useful as its +1 faster than -XX10, should be the default really.
      • by naasking (94116)

        And does the numeric value of that option dictate the degree of JIT-time optimization performed?

      • -XX11 is useful as its +1 faster than -XX10, should be the default really.

        I hear they're contemplating a "-XX12" option that threatens to rip asunder the very fabric of time and space.

  • Too many options (Score:5, Insightful)

    by medv4380 (1604309) on Friday February 17, 2012 @06:38PM (#39080205)
    The JVM really needs to get smarter. 60 different controls and switches is just too much. How hard can it be for the JVM to look at the available number of cores and just turn on the Parallel Garbage Collector? Do I really have to manually turn it on so that Minecraft will use it? Why can't the JVM allocate more memory on its own? Does it really need permission to use more than 1 Gig of memory? It just sits there waiting for the day some user decides to import every single possible datapoint into it, then crashes, having used 1 Gig of 8, with an "Out of Memory" error. It's not like developers know what the Xmx and Xms settings need to be. They just set them arbitrarily high in hopes that some user doesn't try to find out what the maximum datafile it can take is. That just slows it down and means that when the GC finally does fire off it has 10x the amount of trash it should have if the value was set lower. Those options are only useful on internal applications that never get into the hands of everyday users. It's probably a great book for server side development, but it is highlighting a major failing of the JVM.
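For anyone who wants to check what a particular JVM actually picked for itself before reaching for flags, the standard management API reports the core count, the heap ceiling, and the collectors in use. This is only a minimal sketch: the class and method names (`JvmDefaults`, `collectorNames`, `maxHeapMb`) are made up for illustration, and the output depends entirely on the JVM version and the machine.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class: reports the settings this JVM is really running with.
final class JvmDefaults {

    /** Names of the garbage collectors this JVM instance is running with. */
    static List<String> collectorNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    /** Maximum heap the JVM will attempt to use, in megabytes (from -Xmx or the default). */
    static long maxHeapMb() {
        return Runtime.getRuntime().maxMemory() / (1024 * 1024);
    }

    public static void main(String[] args) {
        System.out.println("cores:      " + Runtime.getRuntime().availableProcessors());
        System.out.println("max heap:   " + maxHeapMb() + " MB");
        System.out.println("collectors: " + collectorNames());
    }
}
```

Running it once with no options and once with something like -Xmx1024M shows how much of the configuration is a default and how much was requested.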
    • Re: (Score:2, Interesting)

      by Anonymous Coward

      Really, just set -Xincgc -Xmx# and walk away.

      It does periodic GC with that, not waiting for the crash.

      • Re:Too many options (Score:5, Interesting)

        by medv4380 (1604309) on Friday February 17, 2012 @07:53PM (#39080971)
        You didn't prove anything except point out yet another option to be set. If that was the best way to set the JVM, why isn't that the default? Why is it left up to the user to specify it? Why do users have to figure out how to tweak it so that Minecraft works "optimally" on multicore machines? When you have to figure out this

        java -Xmx1024M -Xms1024M -XX:+UseFastAccessorMethods -XX:+AggressiveOpts -XX:+DisableExplicitGC -XX:+UseAdaptiveGCBoundary -XX:MaxGCPauseMillis=500 -XX:SurvivorRatio=16 -XX:+UseParallelGC -XX:UseSSE=3 -XX:ParallelGCThreads=4 -jar /media/storage/minecraft.jar

        -Xincgc might sound good but then again

        -Xincgc Enable the incremental garbage collector. The incremental garbage collector, which is off by default, will eliminate occasional garbage-collection pauses during program execution. However, it can lead to a roughly 10% decrease in overall GC performance.

        • by devent (1627873)

          I think you already answered your own question. There is no "one size fits all", and if you can live with a 10% decrease in overall GC performance, then you can enable -Xincgc and have fewer GC-related pauses. It just depends on what you want and what your use case is.

          If I understand Performance Options Option and Default Value [oracle.com] correctly, then most of the options you mentioned are enabled by default already anyway. Except the flags DisableExplicitGC and UseParallelGC, some other you mentioned I didn't find
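Rather than reading the defaults off a web page, a HotSpot-based JVM can be asked directly. The sketch below assumes a HotSpot JVM (it uses the com.sun.management extension, which other JVMs may not ship); `FlagInspector` and its methods are illustrative names, and several flags from the command line quoted above may simply be unknown to a newer JVM, which the code reports instead of failing.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import com.sun.management.VMOption;
import java.lang.management.ManagementFactory;
import java.util.List;

// Hypothetical helper class; HotSpot-specific because of com.sun.management.
final class FlagInspector {

    /** Arguments that were passed to this JVM, e.g. "-Xmx1024M". */
    static List<String> commandLineFlags() {
        return ManagementFactory.getRuntimeMXBean().getInputArguments();
    }

    /** Returns "name=value (origin)", or a note if this JVM does not know the flag. */
    static String describe(String flagName) {
        try {
            HotSpotDiagnosticMXBean hotspot =
                    ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
            if (hotspot == null) {
                return flagName + ": HotSpot diagnostics are not available";
            }
            VMOption option = hotspot.getVMOption(flagName);
            return option.getName() + "=" + option.getValue() + " (" + option.getOrigin() + ")";
        } catch (IllegalArgumentException e) {
            // Thrown when the flag does not exist in this JVM version.
            return flagName + " is not known to this JVM";
        }
    }

    public static void main(String[] args) {
        System.out.println("passed in: " + commandLineFlags());
        for (String flag : new String[] {
                "UseParallelGC", "DisableExplicitGC", "MaxGCPauseMillis", "SurvivorRatio"}) {
            System.out.println(describe(flag));
        }
    }
}
```

The origin in parentheses distinguishes a value the JVM chose by default from one set on the command line. On HotSpot, running java -XX:+PrintFlagsFinal -version prints a similar table for every flag at once.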

    • by bws111 (1216812)

      The major use of Java is server side enterprise stuff, and all those controls are critical there. It is certainly not a major failing of the JVM, it is an important feature. The alternative to the JVM having all those controls is for each and every application to have its own controls.

    • Some of these directives may be VM specific and might change the way synchronization or allocation works in such a way that it's inconsistent with the "default" and break applications that weren't tested with these options. Additionally, Java has a tendency to pre-allocate memory when it's not needed in preparation for future allocation. Operating systems will likely show this as "memory in use" even though Java will give up this memory when it detects the OS is running low. Users that aren't aware of this
    • by TheSunborn (68004)

      "The JVM really needs to get smarter. 60 different controls and switches is just too much. How hard can it be for the JVM to look at the available number of cores and just turn on the Parallel Garbage Collector. "

      It's easy enough to count cores and enable the parallel collector, but the total CPU usage increases when using the Parallel Garbage Collector. This means that if you have a lightly loaded system the Parallel Garbage Collector is a net gain, but if your system is already running close to 100% at al

  • by roman_mir (125474) on Friday February 17, 2012 @06:42PM (#39080227) Homepage Journal

    In my experience, performance tuning is not something that is given much consideration until a production program blows up and everyone is running around in circles with sirens blaring and red lights flashing.

    - if production blows up it signals that the underlying problem is not likely to be fixed with 'performance tuning'.

    There is performance deficiency and then there is "production blows up" and those are different things and must be addressed by different sets of practices at different times.

    Production blowing up means the design is flawed, it means misunderstanding of how the application was going to be used.

    Slow response time, on the other hand, is about tuning, but it's very unlikely that environment tuning can really help to fix this.

    Back in 2001 I was working for then Optus that was bought out by Symcor and the main project they brought me to (contract) was for this Worldinsure [insurance-canada.ca] insurance provider, and the project was to do some weird stuff, business wise speaking, allow clients to compare quotes from different insurance providers. Business model was changing all the time, because insurance providers do not want their products to be compared against one another on line (big surprise).

    The contract was expensive (5 million) and WI wouldn't pay the last bit (a million I think) until the application would start responding at 200 requests per second, and it was doing 20 or so :)

    If anybody thinks that just some VM tuning can fix a problem where application is 10 times slower than expected, well, you haven't done it for long enough to talk about it then.

    It took a month of work (apparently I wrote a comment on it before) [slashdot.org] that included getting rid of middleware persistence layer, switching to jsp from xslt, reducing session size by a factor of 100, desynchronising some data generators, whatever. Finally it would do 300 requests per second.

    But the point is that when things are crashing or when performance is really a huge issue, you won't be optimising the VM.

    VM optimisation is not generally done because I think the application has to do something that is not generic.

    Imagine an application that only does one thing - say it only reads data from a file and then runs some transformation on it, maybe it's polygon rendering. Well then you know that your app is doing only ONE THING, then you can probably use VM optimisation, because you can check that the one thing your app does will become faster or more responsive, whatever.

    But if your app includes tons of functionality and tons of libraries that you don't have control over and it runs in some weird container on top of JVM, then what do you think you are going to achieve with this?

    You likely will optimise something very specific and then you'll introduce a strange imbalance in the system that will hit you later and you won't see it coming at all.

    If your app does one thing, maybe you have a distributed cluster with separate instances being responsible for one type of processing, then you probably can use specific optimisation parameters.

  • by msobkow (48369) on Friday February 17, 2012 @07:31PM (#39080711) Homepage Journal

    I've worked with Java since 1.0. The only optimization options I've ever used were the heap and stack size adjustments.

    Setting your memory heap too high actually degrades performance, oddly enough. I've got 4GB on this box and over 2.5G is normally used by disk cache, but if I allocate more than about 768MB to the heap, the performance suffers.

    Maybe some of these options have real effects on certain production code characteristics, but I've found the best performance tuning options are:

    • Whenever and wherever possible, use intrinsic types, especially extracting a char from a string for evaluation rather than using the object accessors for a String. For whatever perverse reason, the Oracle Java compiler will keep re-fetching the value by re-executing charAt() rather than realizing it's a constant once the value has first been extracted, because the String isn't changing. Net performance boost for my code: over 30% improvement for a day's coding on a multi-year project.
    • Instead of allocating and destroying objects, consider hanging on to used objects and using your own allocator. This won't help for implicit object construction, but reusing modifiable objects helps performance dramatically. I saw about a 10% performance improvement when I experimented with this. Raw object allocation is EXPENSIVE.
    • Wherever possible, tighten your loops into a single statement of execution. For unknown reasons, the JVM seems to perform better calling a small function fragment than it does executing inline code in a for/while loop block. This makes no sense based on what I know of C++ tuning, but there you have it: Java likes functions better than inline code when executing loops. Maybe there is some optimization that kicks in for a function that doesn't happen for a code block.
    • If possible, construct a huge single String assignment using conditional expressions (i.e. ( bool-expr ) ? ret-if-true : ret-if-false ) instead of appending to a string buffer with a sequence of if-then-elses. The code is harder to read, but for anything of moderate complexity you can achieve up to a 30% performance improvement by doing this.

    So there you have it -- my favourite REAL WORLD, TESTED, and PROVEN TO WORK performance tweaks.
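Tips like these are cheap to check on your own JVM before restructuring code around them. What follows is a rough sketch of the first tip only (calling the String accessor on every iteration versus extracting primitive chars once), with a warm-up pass so the JIT has a chance to compile both versions. `CharLoopCheck` and its method names are invented for the example, the iteration counts are arbitrary, and a hand-rolled timing loop like this is far less reliable than a proper benchmark harness, so treat the numbers as a hint at best.

```java
// Hypothetical comparison; results vary by JVM version, flags, and hardware.
final class CharLoopCheck {

    /** Counts spaces by calling charAt() on every iteration. */
    static int countWithCharAt(String s) {
        int count = 0;
        for (int i = 0; i < s.length(); i++) {
            if (s.charAt(i) == ' ') count++;
        }
        return count;
    }

    /** Counts spaces after copying the characters into a primitive char[] once. */
    static int countWithArray(String s) {
        int count = 0;
        for (char c : s.toCharArray()) {
            if (c == ' ') count++;
        }
        return count;
    }

    public static void main(String[] args) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 1_000; i++) sb.append("the quick brown fox ");
        String text = sb.toString();

        long sink = 0; // accumulate results so the JIT cannot discard the work

        // Warm-up: give the JIT time to compile both methods before timing.
        for (int i = 0; i < 5_000; i++) {
            sink += countWithCharAt(text) + countWithArray(text);
        }

        long t0 = System.nanoTime();
        for (int i = 0; i < 5_000; i++) sink += countWithCharAt(text);
        long t1 = System.nanoTime();
        for (int i = 0; i < 5_000; i++) sink += countWithArray(text);
        long t2 = System.nanoTime();

        System.out.println("charAt():      " + (t1 - t0) / 1_000_000 + " ms");
        System.out.println("toCharArray(): " + (t2 - t1) / 1_000_000 + " ms");
        System.out.println("(checksum " + sink + ")");
    }
}
```

Whichever way the timings fall on a given JVM, both methods must return the same count, which is the only thing this sketch guarantees.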

    • Actually, re-using objects is an excellent strategy if you are using JavaBeans (where you have symmetric getters and setters). Unfortunately, so many programmers hide setters away from you, supposedly to protect you from yourself, but it makes it impossible to re-use objects even where it would otherwise not only be perfectly possible and valid but also very efficient as well. In short, use JavaBeans when you need complex (non-intrinsic) value-objects and make the setters symmetric with the getters, *alway

      • by dr2chase (653338)

        Problem with object pools (if you are multi-threaded) is that then you need to synchronize access to them, and THAT adds overheads and potential bottlenecks. You can make your pools thread-local, but then you've got to worry about them getting too large (how many threads do you have, also?). Unless you're doing some really expensive object preparation, if you've got a generational collector, that's usually a good bet -- simpler, fast enough, not a source of confusion six months down the road.

        • All good points. You don't even need to use a thread local, just have an independent pool per thread (assuming your threads are long lived and do a lot of work). Simple and is nearly as efficient as having a global pool, but without the locking overhead.
        • by msobkow (48369)

          I haven't had to do this with Java yet, but I have implemented thread-specific pools with multi-threaded C++ application code, where each thread had its own pool. It was a critical performance tweak for one system I worked on 12-15 years ago, as we were pushing the hardware so hard that within a year the IO bus wouldn't be able to move the expected data even if it did nothing but shuffle MY module's data 24x7. (Yes, we EXPECTED to need new hardware, but we needed a solution NOW.)

          Fortunately, Java seem

          • by dr2chase (653338)

            Most likely it uses its own allocation pools for all objects, and may include some localization for non-escaping objects. It is possible that it could spot objects whose interfaces are side-effect free and that are never tested for object equality; those are effectively "values" and subject to loop-hoisting and common-subexpression optimizations.
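The per-thread pool idea discussed in this sub-thread might look roughly like the sketch below. It is an illustration only: `BufferPool`, `acquire`, `release`, and the cap of 16 are invented for the example, and it uses a ThreadLocal for brevity even though, as noted above, a long-lived worker thread could just as well hold its pool in an ordinary field. Whether pooling beats plain allocation with a generational collector is something to measure, not assume.

```java
import java.util.ArrayDeque;

// Hypothetical per-thread pool of reusable StringBuilder instances.
final class BufferPool {

    // Cap per thread so that many threads cannot grow their pools without bound.
    private static final int MAX_POOLED = 16;

    // One deque per thread: no locking is needed because no other thread touches it.
    private static final ThreadLocal<ArrayDeque<StringBuilder>> POOL =
            ThreadLocal.withInitial(ArrayDeque::new);

    /** Hands out a pooled builder if this thread has one, otherwise allocates. */
    static StringBuilder acquire() {
        StringBuilder sb = POOL.get().pollFirst();
        return (sb != null) ? sb : new StringBuilder(256);
    }

    /** Resets the builder and returns it to the calling thread's pool. */
    static void release(StringBuilder sb) {
        sb.setLength(0); // clear old contents before anyone reuses it
        ArrayDeque<StringBuilder> pool = POOL.get();
        if (pool.size() < MAX_POOLED) {
            pool.addFirst(sb);
        }
        // Otherwise drop it and let the garbage collector reclaim it.
    }
}
```

A caller would pair the two calls, typically with try/finally, so that a builder is returned on the same thread that acquired it.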

    • If you're building strings, use a sensible API like java.lang.StringBuilder (or if you're feeling unnecessarily anachronistic, java.lang.StringBuffer).

      StringBuilder s = new StringBuilder("whatever");
      s.append(somestuff);
      s.append("someotherstuff");
      return s.toString(); //note: all uses of APIs may be entirely inaccurate

      • by msobkow (48369)

        Practical experience and years of testing are why I recommend the ugly looking conditional statements. The Java compiler seems to build one big string buffer with one allocation when you use that technique, rather than repeated calls to StringBuffer.append(). But go ahead, stick with what's easy to read if you're not concerned with raw performance -- readability is every bit as important as performance when it comes time to enhance or maintain the code by hand.

        • by dkf (304284)

          Getting good performance in Java really requires knowing about StringBuilder (unless you're not doing anything much at all with strings). That's because the compiler changes this:

          s += "the quick " + "brown " + fox + " jumps " + over;

          Into this:

          s = new StringBuilder(s).append("the quick brown ").append(fox)
          .append(" jumps ").append(over).toString();

          The adjacent constants are combined into one, and the variable parts are done as separate calls to append(). (The old-style StringBuffer has the same interface, b
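The equivalence described above can be checked with a small sketch like this one. The class and method names are made up for illustration. Note that the StringBuilder expansion describes compilers of that era; newer javac versions (Java 9 and later) may compile the `+=` form differently, but the resulting string is the same either way.

```java
// Hypothetical demo class comparing the two spellings of the same concatenation.
final class ConcatDemo {

    /** The source-level form: += with a mix of constants and variables. */
    static String withPlus(String fox, String over) {
        String s = "";
        s += "the quick " + "brown " + fox + " jumps " + over;
        return s;
    }

    /** The hand-written equivalent using explicit StringBuilder calls. */
    static String withBuilder(String fox, String over) {
        String s = "";
        s = new StringBuilder(s).append("the quick brown ").append(fox)
                .append(" jumps ").append(over).toString();
        return s;
    }

    public static void main(String[] args) {
        String a = withPlus("fox", "over the lazy dog");
        String b = withBuilder("fox", "over the lazy dog");
        System.out.println(a);
        System.out.println("same result: " + a.equals(b));
    }
}
```

The practical consequence is for loops: a `+=` inside a loop creates a fresh builder on each pass under this expansion, whereas one explicit StringBuilder declared outside the loop is reused throughout.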

  • An issue recently came up on my Engineering team where a pig mapreduce job that stores in hbase slowed over the course of completing tasks until all the tasks failed due to timeouts. What appeared to be happening was a gc failure and pause due to tenure region exhaustion and the built-in cluster function to kill off the garbage collecting regionserver. The link below describes the issue and possible workarounds by implementing a custom memory allocation strategy. It's also a must read for anyone who isn't
