Decompiling Java
Decompiling Java | |
author | Godfrey Nolan |
pages | 264 |
publisher | Apress |
rating | 8/10 |
reviewer | Richard Rodger |
ISBN | 1590592654 |
summary | Learn how decompilation works in order to properly protect your intellectual property. |
If you are interested in decompiling Java, then this book tells you exactly how to do that. There's no fluff and every chapter counts. I can safely concur that Fiachra's observations are indeed correct. You'd better be prepared for some serious hard-core details, but then that's what you paid for. It is really great to read a book that doesn't end each chapter with a few links to the real material because the author couldn't be bothered to write it up.
So what do you get? As a battle-hardened Java coder of not a few years' programming, I wanted to find out about the gory details of bytecodes and how to get at them. It's a subject I always knew I should know about, but I never took the time to read up on it. Decompiling Java puts all that knowledge into one place.
Here's a quick run-through of the chapters so you know what you're getting:
Ch.1 Introduction
Decompilation isn't just another coding tool - there are other, real-world issues like ending up in jail to think about. Godfrey proposes a sort of code-of-honour for decompilers. This book could so easily have been positioned for the fr33ky kod3r skript kiddie market, and I'm glad that the author and publishers took a mature and sensible approach to the subject. I have had to decompile purchased code because of bugs and I'm glad that someone took the time to think about an ethical framework for doing this.
Ch.2 Ghost in the Machine
A good and solid introduction to the JVM and the classfile format. If you're in the market for this book, you probably already know most of this, but a refresher course is always good. For me, it definitely sorted out a lot of internal hand-waving on the subject. Just remember kids, the only thing to fear is fear itself - it's only binary data after all.
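To make the "it's only binary data" point concrete, here is a minimal sketch (not taken from the book) that reads the first eight bytes of a class file. It relies only on what the JVM specification says about the header: a four-byte magic number 0xCAFEBABE, then a two-byte minor version and a two-byte major version, all big-endian. The class and method names are made up for illustration.

```java
import java.nio.ByteBuffer;

// Illustrative helper; the names here are hypothetical, not from the book.
class ClassFileHeader {
    /** Magic number that opens every valid class file. */
    static final int MAGIC = 0xCAFEBABE;

    /** Returns the major version stored in the class file header. */
    static int majorVersion(byte[] classBytes) {
        ByteBuffer buf = ByteBuffer.wrap(classBytes);   // ByteBuffer is big-endian by default
        if (buf.getInt() != MAGIC) {                    // u4 magic
            throw new IllegalArgumentException("Not a class file: bad magic number");
        }
        buf.getShort();                                 // u2 minor_version (skipped)
        return buf.getShort() & 0xFFFF;                 // u2 major_version, read as unsigned
    }

    public static void main(String[] args) {
        // Hand-built header for demonstration; major version 52 is what Java 8 compilers emit.
        byte[] header = { (byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE, 0, 0, 0, 52 };
        System.out.println("major version: " + majorVersion(header));
    }
}
```

Everything after these eight bytes (constant pool, fields, methods) follows the same pattern of fixed-width counts and tagged entries, which is why the format is less scary than it first looks.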
Ch.3 Tools of the Trade
Although the author builds his own decompiler later in the book, it's nice to get a chapter devoted to the existing toolset and the Java decompiler scene.
Ch.4 Protecting your Source
For the honest developer, knowing how to decompile code is more about protecting your own source code than breaking someone else's (who wants to read other people's smelly code anyway!). This chapter is one of the most directly practical. I had always assumed that obfuscation was a magic fix that I could apply if necessary. In reality, good obfuscation is just like good encryption (that is, uncommon, difficult to verify, and still subject to lateral attacks). Even compiled bytecode has relatively low entropy, so the value of obfuscation must be considered carefully.
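One crude way to put a number on the entropy remark is to measure the Shannon entropy of a file's byte values. The sketch below is my own illustration, not something from the book, and byte-level entropy is only a rough proxy for how much structure a decompiler can exploit. The class name is hypothetical, and `main` assumes the path to a class file is passed as the first argument.

```java
import java.nio.file.Files;
import java.nio.file.Paths;

// Illustrative helper; the name is hypothetical.
class ByteEntropy {
    /** Shannon entropy of the byte values, in bits per byte (0.0 to 8.0). */
    static double bitsPerByte(byte[] data) {
        if (data.length == 0) {
            return 0.0;
        }
        int[] counts = new int[256];
        for (byte b : data) {
            counts[b & 0xFF]++;                 // treat each byte as an unsigned value
        }
        double entropy = 0.0;
        for (int count : counts) {
            if (count == 0) {
                continue;                       // unused byte values contribute nothing
            }
            double p = (double) count / data.length;
            entropy -= p * (Math.log(p) / Math.log(2));
        }
        return entropy;
    }

    public static void main(String[] args) throws Exception {
        // Assumption: args[0] is the path to a .class file (or any other file).
        byte[] bytes = Files.readAllBytes(Paths.get(args[0]));
        System.out.printf("%.3f bits per byte%n", bitsPerByte(bytes));
    }
}
```

A result well below 8 bits per byte means the file is far from random, which fits the point that renaming identifiers alone leaves plenty of structure for a decompiler to work with.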
Ch.5 Decompiler Design
This is where it starts getting a wee bit technical. Decompilation, as you can imagine, is a bit of a black art, and there are many ways of doing it. Some of them involve scary maths and some involve scary coding and the rest both. But that's why you don't meet many people who can write decompilers. Godfrey does a great job of taking you on a practical run through this fog of decompilers. At the end of this chapter you will be able to decide for yourself what approach is best suited to your problem domain. Again, this material can be challenging but it's like boot camp: You just gotta.
Ch.6 Decompiler Implementation
If the previous chapter hurt your brain and scared you silly then this chapter will have you weeping for joy. The author takes a practical, effective, and most importantly, understandable approach to actually implementing a decompiler. Now, as he freely admits, his design may encounter difficulties with edge effects and infrequently used idioms, but it will take you to the point where you can solve them yourself. I really had to smile at how simple and effective the approach taken here is - instead of the expected multiple passes and mind-bending parse tree manipulation, we have a single-pass, source-generating decompiler for Java. You won't follow it all the first time, but it does work and you can verify it for yourself. Like I said at the start, you don't get that empty feeling from this book, and this chapter is pretty much why. I bought a book about decompiling Java, and now I can.
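To give a flavour of what single-pass, source-generating can mean, here is a toy sketch of my own. It is not the author's design and handles only a handful of integer opcodes. It walks the bytecode once and keeps a stack of source-text fragments that mirrors the JVM operand stack. The opcode values come from the JVM specification; the class name and the `v0`..`v3` variable names are invented for illustration.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Toy illustration only; real decompilers must also handle branches, types, and much more.
class TinyDecompiler {
    /** Turns a straight-line run of integer bytecodes ending in ireturn into one Java statement. */
    static String decompile(int[] code) {
        Deque<String> stack = new ArrayDeque<>();   // holds source text instead of runtime values
        for (int op : code) {
            switch (op) {
                case 0x1a: case 0x1b: case 0x1c: case 0x1d:   // iload_0 .. iload_3
                    stack.push("v" + (op - 0x1a));
                    break;
                case 0x60: binary(stack, "+"); break;         // iadd
                case 0x64: binary(stack, "-"); break;         // isub
                case 0x68: binary(stack, "*"); break;         // imul
                case 0xac:                                    // ireturn
                    return "return " + stack.pop() + ";";
                default:
                    throw new IllegalArgumentException(
                            "Unsupported opcode 0x" + Integer.toHexString(op));
            }
        }
        throw new IllegalArgumentException("Code does not end in ireturn");
    }

    /** Pops two fragments and pushes the combined, parenthesised expression. */
    private static void binary(Deque<String> stack, String operator) {
        String right = stack.pop();   // top of stack is the right-hand operand
        String left = stack.pop();
        stack.push("(" + left + " " + operator + " " + right + ")");
    }

    public static void main(String[] args) {
        // iload_1, iload_2, iadd, ireturn
        System.out.println(decompile(new int[] { 0x1b, 0x1c, 0x60, 0xac }));
    }
}
```

The appeal of this style is that no parse tree is ever built: each instruction either pushes a fragment or combines the fragments already on the stack. Control flow is where a scheme like this gets hard.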
Ch.7 Case Studies
This chapter addresses the "why" of decompiling, returning again to the moral questions raised at the start. It's more food for thought than prescriptive preaching though, which again is refreshing. I have to admit to dipping into this chapter while reading the rest of the book - the human interest angle always works a treat!
Of course, no book is perfect. What I think could have helped a bit overall would have been an introductory chapter on bytecode. But it's not a great loss and bytecode is actually pretty simple once you get your head around it. Still, it might have lessened the learning curve somewhat.
Decompiling Java is a great addition to that section of your bookshelf dedicated to serious books that will be around for a while. The JVM specification and Java bytecode are not going to change that much, so this book is something you'll be able to use for a long time. Personally, the best thing about this book for me was that it took me to the next level. Not many books can do this. As a working coder, I pretty much put things like decompilation into the "too hard, just for academics, and I could never grok it" category. It's great when a book comes along that can take you out of that comfort zone.
What ethical problems? (Score:5, Insightful)
What ethical problems? Decompiling is perfectly moral and ethical. Whether it is illegal is a separate and, for me, almost irrelevant issue. If I legally own a copyrighted work I am allowed to read it, period and end of story. Corporate licences excepted, software is SOLD, not licensed, despite the scary words on the box and the dread click-through EULA.
Hell, I learned assembly by writing a disassembler (in BASIC) and reading the Microsoft BASIC ROMs, then later reading the commented listings that ran in Color Computer Magazine. (To avoid a copyright fight, and because M$ refused to grant them permission, CCM ran only the comments and memory locations, leaving the reader to run their own disassembly for the opcodes.)
The only ethical problem would be lifting the code and reusing it without permission and I think we all know that is wrong.
This is one of the features of Java (Score:2, Insightful)
As systems get more open/advanced, the sources are more difficult to hide. In the case of web apps, there is no need to decompile anything; the JavaScript is available for all to see in plain text. Even more advanced applications that use ASP pages that execute on the server can be seen by changing the URL to list the source rather than execute them (I don't remember the exact syntax, but I think it is related to the alternate data stream in NTFS).
That is the reason we have copyright. On a more personal note, I think it serves the community if someone can see your implementation in code, get inspiration and either correct mistakes or expand on the code.
You didn't sell it. (Score:5, Insightful)
>knowing how to decompile code is more about protecting your own source code.
There are many reasons to learn about, implement and use decompilers, but I don't think "to properly protect your intellectual property" should be one of them.
I got somewhat interested in this book (never heard about it before), but I think I'm going to pass. Sounds like the decompiling described is too much of a one-trick pony -- which is fine, it's about decompiling Java after all -- but I'd really like something like an extension and update of Cifuentes' work in book form, with the lessons from the IDA team too.
You know, from the beginning; starting with machine descriptions and disassembly for a generic front-end, efficient IR, and on up through the back end.
Now that'd be a tome [worth paying for].
Re:What ethical problems? (Score:3, Insightful)
I disagree here. I am a strong believer that people should be able to trade goods/services for prices/conditions they mutually agree upon. If I write software and say I will sell it to you for $x on condition that you do Y (perhaps Y is not decompiling the source), and you agree to these terms, I think it is morally repugnant of you to break our agreement and decompile. You had the choice to not purchase my product, after all.
Maintenance nightmare (Score:2, Insightful)
Let me get this straight: the author recommends that 'honest' developers obfuscate their code?
I've read programs that I thought were obfuscated, but later found out were just poorly written. Other times I've run into programmers who, tin hats firmly affixed, went to great lengths to make sure no one learned their Merlinesque techniques for getting the most out of BASIC.
In context, the author seems to be talking about obfuscating object code. Yikes! What's the opposite of debugging? Buggery?
Encrypting object code to make it harder to reverse engineer is a giant waste of time. Here are more productive ways to spend the same amount of energy:
In fact, I can't think of many worse wastes of time than making a compiled program hard to understand [multicians.org].
Which is really surprising to me (Score:4, Insightful)
who, as a compiler hacker, would have expected an optimization pass to transform the first form into the second form before generating the bytecode.
Or more precisely, to understand that both forms are testing for the same thing, and to produce identical simplified bytecode.
Re:What ethical problems? (Score:3, Insightful)
Very.
I just don't understand people with your greedy, assbackwards, mindset.
I don't understand people with your mindset, a mindset that strips individuals of their rights. Listen, if I have created something, and want to sell it to you with conditions, why shouldn't I be able to do that? If you don't want to abide by those conditions: DON'T FREAKING BUY WHAT I'M SELLING. Have a little restraint, Mr. Consumer. Jebus.
I am 100% for free trade between people. You, on the other hand, are against that, since you don't think a seller should be able to make a condition, and a buyer free to choose to accept or deny the sale based on that condition.
Finally, capitalism works. In a true capitalistic marketplace, having unnecessary, artificial conditions wouldn't be beneficial to the seller, since other sellers could enter the market without such fluff conditions and make the sale. Going back to the author denying readers the right to read certain chapters, who would buy those books? Rather, the authors who granted full access would far outsell those who did not.
Re:This is one of the features of Java (Score:3, Insightful)
Web applications are typically implemented server-side. Javascript is client-side code.
Javascript != web applications
Perhaps what you are referring to is the source for ASP and JSP/servlets. There have been bugs in servlet containers (specifically, I believe the issue was that the web server in front of a servlet container wasn't configured correctly, and thus instead of passing the request to the SC for handling, just retrieved the file and returned the content to the user's browser), but the code in a JSP or ASP is executed on the server before it ever reaches the client -- this means that it is not possible in the normal course of events for a client to see the "source" contents of such a server-side object.
This constraint can of course break down when web application servers are not built and/or configured correctly.
Re:This is one of the features of Java (Score:4, Insightful)
Even more advanced applications that use ASP pages that execute on the server, can be seen by changing the URL to list the source rather than execute them
Are you smoking crack?
You can't arbitrarily get at source code on someone's web server. Do you think eBay would want you seeing the passwords to their database servers?
Web apps aren't written in JavaScript. Sure, there might be some to drive calendar selection or something, but pretty much all real apps (shopping carts, etc.) are done server side.
Please get a clue and stop spreading your FUD around.
Additionally, this isn't a "feature" of Java. It's just a side-effect of its machine-independent bytecode. You could argue that it's not all that hard to reverse engineer compiled C - if you step it through a debugger you can see what it does fairly easily.
Systems being more "advanced" (let's wave our hands a little bit more) won't make it any more difficult to hide the source. Many many people run Java on the server side of web apps. It will always be impossible to view the source for such applications (unless the developers put it up for the world to see, of course). As for being "open", what do you mean? If you mean, "open source" then, well, duh...
Re:Links to books on Amazon (Score:2, Insightful)
Yes, it does make them less useful. Because now it is impossible to tell whether you are saying things like "a more in-depth look" because you really mean it, or because you stand to make a quick buck by making bogus claims about the book.
Nothing personal, of course; you can probably see yourself why the rest of us simply can't know if you are being honest or running an astroturf con.
obfuscators don't work? (Score:3, Insightful)
I'm not talking about tiny programs; but who even bothers decompiling tiny midlets? Isn't it obvious what they're doing? With tiny programs, if you know enough to be cracking Java programs, you might as well just write the thing out yourself. It's not magic.
But for larger applications, any decent obfuscator can make it very time-consuming to decompile and edit the programs. I posted more on this in another thread, so let me just say you really have to try it out before you say obfuscators don't work. They definitely DO work at foiling the average cracker who won't spend hours and hours reconstructing a $100 piece of software.
Re:What ethical problems? (Score:3, Insightful)
> anything to use GPL'd software, or copy it, or distribute it, as long
> as they meet its conditions. So if I don't have to sign anything, then
> the GPL isn't binding.
Exactly correct. If you copy a GNU program and distribute it you do not have to accept the GPL. However, when RMS and his squadron of elite attack lawyer ninjas descend upon you for violating their copyright, smiting thee with their righteous fury, only saying "I accept the GPL and have followed all of its conditions" will make them stop, because otherwise they have you dead to rights on copyright infringement. See the difference? The GPL is a LICENSE to perform an action otherwise forbidden by law. You don't have to sign it, but if you want to take advantage of the additional freedoms it grants you must accept it in whole, both the THOU SHALL and the THOU SHALL NOT parts, because nothing else gives you the right to distribute a copy of a GPL licensed work.
The GPL is, in essence, the following statement: "This program is copyrighted. This means that by law you may not copy it. However, because we are good hoopy froods and want software to be Free, we grant you the right to copy and redistribute it under the following conditions. By distributing copies it is presumed that you accepted the limitations of this license, since nothing else gives you permission to distribute copies, so any copies made under terms and conditions not covered by this license are by definition not permitted by this license. QED."
Now take the typical EULA: it removes rights the end user already has, offers nothing of value in exchange and expects to be taken sight unseen in most cases. Where is the implied consent as in the GPL? By ignoring it I still have the right to run the program because I purchased it, and I can reverse engineer it because I bought the copy and have as much of a right to read it as my computer does.