Book Review: Scalability Rules
eldavojohn writes "As a web developer in the 'Agile' era, I find myself making (or recognizing) more and more important decisions being made in developing for the web. Scalability Rules cemented and codified a lot of things I had suspected or picked up from blogs but failed to give much more thought to and had difficulty defending as a member of a team. A simple example is that I knew state is bad if unneeded but I couldn't quite define why. Scalability Rules provided this confidence, as each of the fifty rules is broken down in a chapter that is divided into what, when, how, why and key takeaways. A strength of the book is that these rules cover all aspects of web development, but that became a double-edged sword as I struggled through some rules meant for managers or technical operators."
Read below for the rest of eldavojohn's review.
You might recognize the authors as two of the three partners of AKF Partners, which means that the book pushes a lot of their concepts like the AKF Cube. A bonus is that they have a very long list of clients and aren't afraid to remind the reader that they have consulted for hundreds of companies, so when they say they see these rules solving problems frequently, there's weight to that. Also, they have two books, but don't confuse Scalability Rules with The Art of Scalability, as the latter focuses on people, processes and technology instead of the rules of scaling.
Scalability Rules: 50 Principles for Scaling Web Sites | |
author | Martin L. Abbott and Michael T. Fisher |
pages | 272 |
publisher | Addison-Wesley Professional |
rating | 8/10 |
reviewer | eldavojohn |
ISBN | 978-0321753885 |
summary | 50 Principles for Scaling Web Sites |
First off, this book gives you a primer of rules to start with, depending on whether you are mostly a manager, a software developer or technical operations staff. I'll concentrate on the specifics of the software developer chapters and summarize the others at the end of this review. Also note that aside from some SQL, I only saw PHP code in this book. Luckily there are only a handful of snippets presented and they are easy to follow. Additionally, each chapter ends with solid references (usually online resources) to back up the claims made in that set of rules.
The first chapter is devoted to reducing the equation and focuses on removing needless complexity from your solution. You can find this chapter here if you want to see how the layout looks. They give a lot of solid reasons for this and also a lot of good examples, like understanding what your users care about. Why build a prompt to export a blog post as a PDF if 99% of the users don't care about it? Next up, they say the rule to design to scale means designing for 20x capacity, implementing for 3x capacity and deploying to 1.5x capacity. A strength of the book is the set of grids that illustrate what is low, medium or high cost and impact throughout the chapters. Every time they discuss options at different parts of the solution development process, the reader is given a chart to understand why. The next rule stresses that you can usually get 80% of your benefit from 20% of the work (the 80-20 rule). Rule 4 is strangely specific and implores the reader to simply reduce DNS lookups. However (and this is the first of many such caveats), they remind the reader that this rule must be balanced against the temptation to put your whole system on one server just to reduce DNS lookups. Such a strategy can turn that server into a choke point. Rule 5 quite simply instructs the reader to use as few objects as possible in a web page.
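To make the 20x / 3x / 1.5x rule concrete, here is a minimal arithmetic sketch. It assumes the multipliers apply to your current peak load measured in requests per second; that unit, the function name `plan_capacity` and the sample figure of 1,000 requests per second are illustrative assumptions, not taken from the book.

```python
from dataclasses import dataclass

# Multipliers as quoted in the review: design 20x, implement 3x, deploy 1.5x.
DESIGN_FACTOR = 20.0
IMPLEMENT_FACTOR = 3.0
DEPLOY_FACTOR = 1.5


@dataclass(frozen=True)
class CapacityPlan:
    """Target capacities, in the same unit as the load passed in."""
    design: float
    implement: float
    deploy: float


def plan_capacity(current_peak: float) -> CapacityPlan:
    """Scale a measured peak load by each of the three multipliers.

    Assumption: the multipliers are applied to current peak load.
    """
    if current_peak < 0:
        raise ValueError("peak load cannot be negative")
    return CapacityPlan(
        design=current_peak * DESIGN_FACTOR,
        implement=current_peak * IMPLEMENT_FACTOR,
        deploy=current_peak * DEPLOY_FACTOR,
    )


if __name__ == "__main__":
    # Hypothetical site peaking at 1,000 requests per second.
    print(plan_capacity(1000))
```

Under these assumptions, the architecture should have an answer for 20,000 requests per second on paper, the code should cope with 3,000, and the hardware actually deployed only needs to cover 1,500.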
The final rule of chapter one is the first one I disagree with in the book. The rule says "Don't mix the vendor networking gear." And this goes against every fiber of my being. Why even have networking standards if you are not to mix the vendor networking gear? Looking to upgrade one component? Better stick to brand X no matter how crappy they have become. This results in being nickeled and dimed and in vendor lock-in. If scalability is your sole goal, then perhaps this is sound instruction. But I cannot understand how anyone would recommend vendor lock-in, especially with today's networking gear.
Chapter two is incredibly short but potent. It covers some basic database concepts, like why the ACID properties of databases make them difficult to split. This chapter is spot on and calls upon the AKF cube for dimensions of scalability. The three dimensions are: clone things, split different things, and split similar things (for example, by country or region). This cube reappears throughout the book, and it should be noted that the book does a good job of giving examples of when each dimension is a good choice for scaling and when it is a bad choice compared to the other two. In my line of work, massive scaling solutions have implemented all three.
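The three dimensions can be combined in a single routing decision, sketched roughly below. This is an illustration only, not code from the book: the `CubeRouter` class, the host names and the pool layout are all hypothetical. The service name stands in for "split different things", the region for "split similar things", and round-robin over identical hosts for "clone things".

```python
import itertools
from typing import Dict, Iterator, List, Tuple

PoolKey = Tuple[str, str]  # (service, region)

# Hypothetical topology: each (service, region) pair owns a pool of clones.
HYPOTHETICAL_POOLS: Dict[PoolKey, List[str]] = {
    ("search", "eu"): ["search-eu-1", "search-eu-2"],
    ("search", "us"): ["search-us-1", "search-us-2"],
    ("checkout", "eu"): ["checkout-eu-1"],
    ("checkout", "us"): ["checkout-us-1", "checkout-us-2"],
}


class CubeRouter:
    """Pick a host by service, then region, then round-robin over clones."""

    def __init__(self, pools: Dict[PoolKey, List[str]]) -> None:
        # One endless round-robin iterator per pool of identical clones.
        self._cycles: Dict[PoolKey, Iterator[str]] = {
            key: itertools.cycle(hosts) for key, hosts in pools.items()
        }

    def route(self, service: str, region: str) -> str:
        # Splitting different things (service) and similar things (region).
        key = (service, region)
        if key not in self._cycles:
            raise KeyError(f"no pool for service={service!r}, region={region!r}")
        # Cloning: any host in the pool can serve the request.
        return next(self._cycles[key])


if __name__ == "__main__":
    router = CubeRouter(HYPOTHETICAL_POOLS)
    for _ in range(3):
        print(router.route("search", "eu"))
```

Each dimension can be scaled independently in this sketch: adding a host to a pool adds a clone, adding a service name adds a functional split, and adding a region adds another split of similar data.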
Skipping to the next developer chapter on not duplicating your work, the text ranged from the incredibly obvious "Don't double check your work" to relaxing temporal constraints. The chapter is short like chapter two and didn't offer me a whole lot. A third rule was again oddly specific in saying not to do redirects and even getting down into the very fine specifics of what HTTP codes are and how they affect your response times.
The next chapter for developers is chapter ten, on avoiding or distributing state. Rule 40 actually came in handy at my job, as it simply states "Strive for Statelessness." There was an easy solution to a problem in one of our projects that involved storing an object in the session to keep track of what was being displayed to the user. Having read the book, I instead made this web application nearly stateless (except for user authentication and the like). Later on, as we started testing the application with multi-tabbed browsing, and users began opening many search tabs and viewing several objects at once to compare them, I was glad that I had not gone down this path. That doesn't have much to do with scalability, but I think all web developers should read this chapter, as it really does pay to avoid state when possible.
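The multi-tab problem can be reproduced in a few lines. The sketch below is a simplified simulation, not the application described above: the catalog, the `SessionBackedViewer` class and the `stateless_view` function are hypothetical stand-ins for "remember the displayed object in the session" versus "let every request say which object it wants".

```python
from typing import Dict

# Hypothetical catalog of objects a user might compare side by side.
CATALOG: Dict[int, str] = {1: "red widget", 2: "blue widget"}


class SessionBackedViewer:
    """Stateful approach: the server remembers one displayed item per user."""

    def __init__(self) -> None:
        self._current_item: Dict[str, int] = {}

    def open_item(self, user: str, item_id: int) -> None:
        # Every tab of the same user overwrites the same session slot.
        self._current_item[user] = item_id

    def refresh(self, user: str) -> str:
        # The server cannot tell which tab is asking.
        return CATALOG[self._current_item[user]]


def stateless_view(item_id: int) -> str:
    """Stateless approach: the request itself carries the item id."""
    return CATALOG[item_id]


if __name__ == "__main__":
    viewer = SessionBackedViewer()
    viewer.open_item("alice", 1)  # tab A opens item 1
    viewer.open_item("alice", 2)  # tab B opens item 2
    print("tab A refresh, session-backed:", viewer.refresh("alice"))
    print("tab A refresh, stateless:", stateless_view(1))
```

In the session-backed version, refreshing tab A returns the item opened in tab B, because both tabs share one slot. In the stateless version each tab keeps its own item id (in a URL parameter, for instance), so the tabs cannot interfere with each other.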
As the rules grew closer to 50, they lost their potency. The authors did a good job of trying to put a bit of ranking into the ordering of these rules. The final developer chapter, on asynchronous communication and message buses, is probably the most specific and was the least useful for me. While all the rules in this chapter are true, they again border on the banal with examples like "Avoid Overcrowding Your Message Bus."
Having read this book cover to cover, I found it to be a very short book with extremely succinct and organized summaries (the final chapter is a short review of each rule). The manager and operations chapters didn't really do a lot for me overall, but they would occasionally have very interesting rules that opened up a lot of the logic behind content delivery services to me. Occasionally I would take slight issue with some rules, but the most egregious rule I read was Rule 28, "Don't Rely on QA to Find Mistakes." The chapter then opens by calling the title of this rule "ugly and slightly misleading and controversial." Because it is, and it could probably be replaced with another sentence from the chapter: "You can't test quality into your system." Why rely on sensational headlines when I'm already holding your book? I think this book would have been a solid 9/10 if not for this oddity in the large rule set.
I've given each of these rules a decent amount of thought and will keep them at the back of my mind as I write code in an agile environment. Mistakes made early on can be very costly in scaling terms. This book will definitely be kept around at work when I need a solid argument for those design decisions that might take more work but save in the future when it needs to scale.
Don't mix the vendor networking gear? (Score:2)
Did they say why? That kind of strikes me as odd to the point of ridiculous too, but if they at least explain it, maybe it would get a little less weird.
No, I don't really think it would get less weird, but .. WTF! WHY?
Re:Don't mix the vendor networking gear? (Score:5, Insightful)
Did they say why? That kind of strikes me as odd to the point of ridiculous too, but if they at least explain it, maybe it would get a little less weird.
No, I don't really think it would get less weird, but .. WTF! WHY?
My best guess would be for support reasons. Once you are dealing with a network problem, being able to have a single support line for hardware / OS problems on your network gear really makes life easier. Especially when dealing with setups that cost lots of money, no one is happy when you have more than one vendor pointing the finger at the other. Or worse, you have to send your gear into their lab for diag, and they come back much later to tell you it must be a problem with the other vendor's gear, and can you please send that in as well so they can work on it in the lab (at growing cost to yourself/company, and depending on your contract, likely the loss of your million dollar router while they look at it). It sure can make for some interesting conference calls, though, if you can get the engineers from competing companies on the phone together and just listen to them duke it out.
Re: (Score:2)
Notwithstanding vendor incompatibilities, the "one throat to choke" position is often held by organizations that lack technical competency, at least in a lot of my experience.
Re: (Score:1)
The fundamental problem is accountability, not competence. When you have two vendors, both will point at the other, and you're left in the middle.
Re: (Score:2)
I would agree with homogeneous equipment in specific layers. Homogenous throughout? Not so much, especially when going through firewalls.
Re: (Score:2)
So you're really saying don't mix 2 types of load balancers in a cluster, and that type of thing? That would be insanity in production systems.
Honestly, while I have seen this in the real world, I often wonder if those responsible for it are qualified to even touch a keyboard or a network cable. For a production rollout, I spec homogenous layers, as it removes "configuration hell" from the equation. I only have to worry about single sets of configuration per equipment type in the layers, and removes an enti
Re: (Score:2)
Sorry, but no. I've been in the industry for over 10 years, and it's rare to experience these types of problems. Consumer-grade equipment is notorious for this type of thing, but it's much less common with a major vendor with a reputation to protect. Single vendor rarely means best of breed.
It may be rare, but when you're in a conf. call with Juniper and Cisco and F5 because you're finding that multicast is dropping packets, you can be pretty sure that the one that fixes it is the one who has a proposal to replace all of the others' equipment with their own.
Re: (Score:2)
Recommending a homogenous network infrastructure is a critical, rudimentary security mistake.
It's stunning that Addison Wesley allowed something like that through their editing process.
Re: (Score:2)
1) Vendors sometimes cut corners on standards. And sometimes standards that are supposed to guarantee interoperability between all conforming equipment turn out to have corner cases where two pieces of equipment implementing the standard in different (but both conforming) ways can't work together. All one vendor tends to guarantee interoperability
2) It also eliminates finger-pointing. It's all their equipment, so no matter what's broken, it's their job to fix it.
How did a summary this poor get approved? (Score:2)
It is shocking enough that it was written, much less 'approved.' Lol...
"I find myself making (or recognizing) more and more important decisions being made in developing for the web"
"Scalability Rules Read below for the rest of eldavojohn's review"
Re: (Score:2)
Well, content-wise the review author shares that state is bad for scalability, but not why. (It is because it makes load balancing less efficient, as consecutive requests from the same user have to be routed to the same server, the one that keeps that user's state. Optimally the load balancer would be able to send each request to the server with the lowest load for the moment.)
Re: (Score:2)
It's a book review, not Cliff's Notes.
Re: (Score:2)
Maybe you meant "superior writing skills" tongue in cheek (cheeky of you) but my skills in this regard are pretty average. BTW, I started looking through submissions posted by samzenpus and have noticed a bit of a trend... ;)
Re: (Score:2)
Yes, absolutely. Can't have those non-nutritional BluRay players, now, can we?
Re: (Score:2)
This does not seem to explain how to make a scalable server. Just advice for making your website grow in a maintainable way. However, the title sounds more catchy this way. I think it is misleading advertising.
The full title is actually "Scalability Rules: 50 Principles for Scaling Web Sites". Therefore, it is not about scalable servers, but about scalable web sites. I give your reading comprehension a D, or maybe a D=.
--
The previous "typo" is actually an inside joke. For more information, please Google "Taylor Mali".
Re: (Score:2)
Give him some slack, he could have inserted an xkcd link as well.
Agile era (Score:2)
The main reason that Agile has become so popular with developers is that it gives the appearance and metric generation for a clueless management of an impressive amount of work being accomplished, while actually just providing cover for an entire team of developers to systematically learn to overestimate tasks.
An Agile project to tie your shoes would consist of no less than two full walls of tasks written on notecards, not one of which would be estimated at less than one hour. Velocity will be through the r
Re: (Score:2)
I don't know what you are talking about. Agile has much smaller overhead than e.g. RUP or SCRUM. So would you please tell me what your reference point is when evaluating Agile?
Re: (Score:2)
Most of my experience is with scrum and XP, both of which are considered "agile" development methodologies, and both of which have the problems of which I speak.
Your post doesn't even make sense.
Re: (Score:2)
You're right, I meant waterfall and other traditional techniques.
Re: (Score:2)
My point is that by using velocity and the closing of tasks as the metric of efficiency, it just encourages developers to create ever more pointlessly explicit tasks, while also rewarding overestimating them.
In every scrum group I've been in, this has ended up happening. Nobody wants to be the velocity killer, so there is an unspoken trend to allocate increasing amounts of time to things as the project drags on.
The iterative process has some strong points, but they are mostly obliterated by reality of corpo
"You can't test quality into your system." (Score:3)
How true it is. You can approach defect-free by testing against clear requirements, but genuine quality has much more to do with decisions than defects.
"I knew state is bad if unneeded but" (Score:2)
It's very simple. State is something you have to understand while looking at the code under consideration but isn't *defined* in the code under consideration. That means you have to analyze and understand all the state before you can work properly with code, and that can be hard to do, and hard to know that you have in fact gotten it all.