Book Review: Scalability Rules

eldavojohn writes "As a web developer in the 'Agile' era, I find myself making (or recognizing) more and more important decisions being made in developing for the web. Scalability Rules cemented and codified a lot of things I had suspected or picked up from blogs but failed to give much more thought to and had difficulty defending as a member of a team. A simple example is that I knew state is bad if unneeded but I couldn't quite define why. Scalability Rules provided this confidence as each of the fifty rules is broken down in a chapter that is divided into what, when, how, why and key takeaways. A strength of the book is that these rules cover all aspects of web development; but that became a double edged sword as I struggled through some rules meant for managers or technical operators." Read below for the rest of eldavojohn's review.
Scalability Rules: 50 Principles for Scaling Web Sites
author: Martin L. Abbott and Michael T. Fisher
pages: 272
publisher: Addison-Wesley Professional
rating: 8/10
reviewer: eldavojohn
ISBN: 978-0321753885
summary: 50 Principles for Scaling Web Sites
You might recognize the authors as two of the three partners of AKF Partners, which means that the book pushes a lot of their concepts, like the AKF Cube. A bonus is that they have a very long list of clients and aren't afraid to remind the reader that they have consulted for hundreds of companies, so when they say they frequently see these rules solving problems, there's weight to that. They have also written two books, so don't confuse Scalability Rules with The Art of Scalability; the latter focuses on people, processes and technology instead of the rules of scaling.

First off, this book gives you a primer of rules to start with, depending on whether you are mostly a manager, a software developer or technical operations personnel. I'll concentrate on the specifics of the software developer chapters and summarize the others at the end of this review. Also note that aside from some SQL, I only saw PHP code in this book. Luckily there are only a handful of snippets and they are easy to follow. Additionally, each chapter ends with solid references (usually online resources) to back up the claims made in that set of rules.

The first chapter is devoted to reducing the equation and focuses on removing needless complexity from your solution. This chapter is available online if you want to see how the layout looks. The authors give a lot of solid reasons for this and also a lot of good examples, like understanding what your users care about. Why build a prompt to export a blog post as a PDF if 99% of the users don't care about it? Next, the rule on designing to scale says to design for 20x capacity, implement for 3x capacity and deploy for 1.5x capacity. A strength of the book is the grids that illustrate what is low, medium or high in cost and impact throughout the chapters. Every time the authors discuss options at different parts of the solution development process, the reader is given a chart to understand why. The next rule stresses that you can usually get 80% of your benefit from 20% of the work (the 80-20 rule). Rule 4 is strangely specific and implores the reader to simply reduce DNS lookups. However (and this is the first of many such caveats) they remind the reader that this rule must be balanced against the risk of putting your whole system on one server just to reduce DNS lookups, since that server can become a choke point. Rule 5 quite simply instructs the reader to use as few objects as possible in a web page.
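To make the 20x / 3x / 1.5x rule concrete, here is a minimal sketch of the arithmetic. The function name, the use of requests per second as the unit, and the 1,000 requests/sec starting figure are all assumptions made for illustration; the review does not say how the book measures capacity.

```python
def capacity_targets(current_peak_rps: float) -> dict[str, float]:
    """Scale a measured peak load by the three multipliers quoted in the review.

    Hypothetical helper: the unit (requests/sec) is an assumption.
    """
    return {
        "design": current_peak_rps * 20,     # architecture should allow for 20x
        "implement": current_peak_rps * 3,   # code is built to handle 3x
        "deploy": current_peak_rps * 1.5,    # hardware is provisioned for 1.5x
    }


targets = capacity_targets(1_000)
for phase, rps in targets.items():
    print(f"{phase}: {rps:,.0f} requests/sec")
```

The point of the staggered multipliers is that the cheap decisions (design) get the most headroom and the expensive ones (deployed hardware) get the least.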

The final rule of chapter one is the first one I disagree with in the book. The rule says "Don't mix the vendor networking gear." And this goes against every fiber of my being. Why even have networking standards if you are not to mix the vendor networking gear? Looking to upgrade one component? Better stick to brand X no matter how crappy they have become. This results in being nickeled and dimed and in vendor lock-in. If scalability is your sole goal, then perhaps this is sound instruction. But I cannot understand how anyone would recommend lock-in to a vendor, especially with today's networking gear.

Chapter two is incredibly short but potent. It covers some basic database concepts, like why the ACID properties of databases make them difficult to split. This chapter is spot on and calls upon the AKF cube for dimensions of scalability. The three dimensions are cloning things, splitting different things and splitting similar things (by country or region, for example). This cube reappears throughout the book, and it should be noted that the book does a good job of giving examples of when each dimension is a good choice for scaling and when it is a bad choice compared to the other two. In my line of work, massive scaling solutions have implemented all three.
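The three dimensions can be sketched as a toy request router. This is only an illustration of the idea as the review describes it, not code from the book: the pool table, the server names and the route function are all hypothetical, and the axis labels in the comments are my own mapping of the review's wording.

```python
import zlib

# Hypothetical topology; every name here is invented for illustration.
# Splitting different things: one pool per service (search vs. checkout).
# Splitting similar things:  one shard per region (eu vs. us).
# Cloning things:            identical replicas inside each pool.
POOLS = {
    ("search", "eu"): ["search-eu-1", "search-eu-2"],
    ("search", "us"): ["search-us-1", "search-us-2"],
    ("checkout", "eu"): ["checkout-eu-1"],
    ("checkout", "us"): ["checkout-us-1", "checkout-us-2"],
}


def route(service: str, region: str, request_id: str) -> str:
    """Pick a server: service and region select the pool, a hash selects a clone."""
    clones = POOLS[(service, region)]
    # Any clone can serve the request, so spread load with a stable hash.
    index = zlib.crc32(request_id.encode("utf-8")) % len(clones)
    return clones[index]


print(route("search", "us", "request-42"))
```

Each dimension can be grown independently: add a clone to a pool, add a pool for a new service, or add a shard for a new region.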

Skipping to the next developer chapter on not duplicating your work, the text ranged from the incredibly obvious "Don't double check your work" to relaxing temporal constraints. The chapter is short like chapter two and didn't offer me a whole lot. A third rule was again oddly specific in saying not to do redirects and even getting down into the very fine specifics of what HTTP codes are and how they affect your response times.

The next chapter for developers is chapter ten on avoiding or distributing state. Rule 40 actually came in useful at my job as it simply states "Strive for Statelessness." There was an easy solution to a problem in one of our projects that involved storing an object in the session to keep track of what was being displayed to the user. Having read the book, I instead made this web application nearly stateless (except user authentication and the like). Later on, as we started testing the application in multi-tabbed browsing and users began opening many search tabs and viewing several objects at once to compare them, I was glad that I had not stored that object in the session. This doesn't have much to do with scalability, but I think all web developers should read this chapter as it really does pay to avoid state when possible.
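The multi-tab problem is easy to reproduce in a few lines. The sketch below is a guess at the shape of the issue, not the reviewer's actual application: the catalogue, the class and the function names are all hypothetical.

```python
# Hypothetical catalogue; all names are invented for illustration.
CATALOGUE = {1: "red widget", 2: "blue widget"}


class SessionStateApp:
    """Stateful variant: the server remembers one 'current' object per session."""

    def __init__(self) -> None:
        self.sessions: dict[str, int] = {}

    def open_item(self, session_id: str, item_id: int) -> None:
        self.sessions[session_id] = item_id

    def show_current(self, session_id: str) -> str:
        return CATALOGUE[self.sessions[session_id]]


def show_item(item_id: int) -> str:
    """Stateless variant: the request itself names the object to display."""
    return CATALOGUE[item_id]


app = SessionStateApp()
app.open_item("alice", 1)  # tab A opens item 1
app.open_item("alice", 2)  # tab B opens item 2 and overwrites tab A's state

tab_a_stateful = app.show_current("alice")  # tab A refreshes and gets item 2
tab_a_stateless = show_item(1)              # tab A still asks for item 1

print(tab_a_stateful, "|", tab_a_stateless)
```

Both tabs share one session, so the stateful version can only ever remember the last object opened, while the stateless version has nothing to overwrite.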

As the rules grew closer to 50, they lost their potency. The authors did a good job of trying to order the rules by importance. The final developer chapter on asynchronous communication and message buses is probably the most specific and was the least useful for me. While all the rules in this chapter are true, they again border on the banal with examples like "Avoid Overcrowding Your Message Bus."

Having read this book cover to cover, I found it a very short book with extremely succinct and organized summaries (the final chapter is a short review of each rule). The manager and operations chapters didn't really do a lot for me overall but would occasionally have very interesting passages that opened up a lot of the logic behind content delivery services to me. Occasionally I would take slight issue with some rules, but the most egregious rule I read was Rule 28, "Don't Rely on QA to Find Mistakes." The chapter opens by calling the title of this rule "ugly and slightly misleading and controversial," which it is, and it could probably be replaced with another sentence from the chapter: "You can't test quality into your system." Why rely on sensational headlines when I'm already holding your book? I think this book would have been a solid 9/10 if not for this oddity in the large rule set.

I've given each of these rules a decent amount of thought and will keep them at the back of my mind as I write code in an agile environment. Mistakes made early on can be very costly in scaling terms. This book will definitely be kept around at work for when I need a solid argument for design decisions that take more work now but save effort later when the application needs to scale.



  • The rule says "Don't mix the vendor networking gear."

    Did they say why? That kind of strikes me as odd to the point of ridiculous too, but if they at least explain it, maybe it would get a little less weird.

    No, I don't really think it would get less weird, but .. WTF! WHY?

    • by Tynin ( 634655 ) on Wednesday September 28, 2011 @08:16PM (#37547836)

      The rule says "Don't mix the vendor networking gear."

      Did they say why? That kind of strikes me as odd to the point of ridiculous too, but if they at least explain it, maybe it would get a little less weird.

      No, I don't really think it would get less weird, but .. WTF! WHY?

      My best guess would be for support reasons. Once you are dealing with a network problem, being able to have a single support line for hardware / OS problems on your network gear really makes life easier. Especially when dealing with setups that cost lots of money, no one is happy when you have more than one vendor pointing the finger at the other (or worse, having to send your gear into their lab for diagnostics, only for them to come back much later and tell you it must be a problem with the other vendor's gear, and can you please send that in as well so they can work on it in the lab, at growing cost to yourself or your company and, depending on your contract, likely the loss of your million dollar router while they look at it). That said, it sure can make for some interesting conference calls if you can get the engineers from competing companies on the phone together and just listen to them duke it out.

    • by Tim99 ( 984437 )
      One throat to choke?
      • by HogGeek ( 456673 )

        Notwithstanding vendor incompatibilities, the "one throat to choke" position is often held by organizations that lack technical competency, at least in a lot of my experience.

        • by Tim99 ( 984437 )
          "One throat to choke" is, as you say, sometimes followed by organizations without a high degree of competency. It is also a guiding mantra for organizations who do know what they are doing and just want stuff to work. It is not a happy time when two vendors tell you that they are both OSI compliant for Layers 1 - 7, but something is probably wrong at the Session or Transport Layers. I have been around this stuff since the 70's - Maybe the lessons I learned with Oracle V5 made me think that way...
        • The fundamental problem is accountability, not competence. When you have two vendors, both will point at the other, and you're left in the middle.

    • by wabb ( 2459758 ) on Wednesday September 28, 2011 @08:36PM (#37548018) Homepage
      Hi folks - I'm one of the authors. All good answers in here and all correct. One throat to choke is a great response with which we agree. The point in our book though is along the lines of the "it should work, but it doesn't" statement already given. Most vendors implement proprietary protocols that they claim to be consistent with RFCs 792, 1058, 4271, 5340 (ICMP, RIP, BGP, OSPF for IPv6) and others. These RFCs are loose enough to allow for differences in implementation. This in turn can cause a bit of a "stacked dependency" problem where 2 or more providers' gear works correctly IAW the RFCs, but not always with each other. Based on our analysis of many companies (n>200) and our personal experiences at both large and small companies, the best solution is just to stick with one provider. Specifically, we often see device communication errors between providers as one of the top 5 reasons for downtime across our client companies where they have heterogeneous networks. The most common fix is to replace with homogeneity - take your pick of provider. The problem is exacerbated in that device providers don't test communication with each other's equipment as well as they test their own. So more problems are bound to occur. It's really not that much different than the browser problem where your site may work with 3 browsers but get jacked up with another 2. If you haven't had problems, it may not be worth fixing. If you are designing from scratch, our view is that it's best to stay homogeneous. Thanks for the review and the perspective!
      • Sorry, but no. I've been in the industry for over 10 years, and it's rare to experience these types of problems. Consumer-grade equipment is notorious for this type of thing, but it's much less common with a major vendor with a reputation to protect. Single vendor rarely means best of breed.
        • Sorry, but yes. The theme is scalability which means individuals will be taking care of huge infrastructures and cannot waste time troubleshooting little compatibility glitches. We've found that keeping the environment homogeneous pays a lot in terms of keeping variables under control and if you need to change vendors, do it in big chunks and fast with the goal of returning to homogeneity quickly.
          • by Gr8Apes ( 679165 )

            I would agree with homogeneous equipment in specific layers. Homogenous throughout? Not so much, especially when going through firewalls.

            • by wabb ( 2459758 )
              Good point Gr8Apes - we point out in the book that we typically exclude Firewalls from the homogeneity rule.
              • by Gr8Apes ( 679165 )

                So you're really saying don't mix 2 types of load balancers in a cluster, and that type of thing? That would be insanity in production systems.

                Honestly, while I have seen this in the real world, I often wonder if those responsible for it are qualified to even touch a keyboard or a network cable. For a production rollout, I spec homogenous layers, as it removes "configuration hell" from the equation. I only have to worry about single sets of configuration per equipment type in the layers, and removes an enti

                • by wabb ( 2459758 )
                  I agree - it would be really odd to see an F5 and a Netscaler paired in an HA configuration in our experience and to do so would seem like a recipe for disaster. By network device we mean core and border routers, distribution and access switches, etc - the explanation is laid out much more clearly within the rule. The lowest rate of network failures and total calculated contribution to non-availability within our companies happened when the non-load balancing and non-fire-walling network devices were homo
          • I'm happy to hear that you've found a strategy that works for you. There are many functional infrastructures out there that don't follow this model, and they are doing very well. The entire internet routing infrastructure is heterogeneous, for one example. I just think it's disingenuous to suggest that fundamental incompatibilities between major players with protocols like BGP and OSPF occur so often. Which equipment vendor do you work for?
        • by spacey ( 741 )

          Sorry, but no. I've been in the industry for over 10 years, and it's rare to experience these types of problems. Consumer-grade equipment is notorious for this type of thing, but it's much less common with a major vendor with a reputation to protect. Single vendor rarely means best of breed.

          It may be rare, but when you're in a conf. call with Juniper and Cisco and F5 because you're finding that multicast is dropping packets, you can be pretty sure that the one that fixes it is the one who has a proposal to replace all of the others' equipment with their own.

          • Your scenario (multi-vendor conference call, urgency of resolution) implies a problem in a production environment. When architectures are fully end-to-end tested before being put into production, problems like these are eliminated long before solutions like complete hardware replacement are palatable to the business people signing the checks. In my experience, these are best practices and extend to things like configuration management and code upgrades for everything from access switches to firewalls, load
      • Recommending a homogenous network infrastructure is a critical, rudimentary security mistake.

        It's stunning that Addison Wesley allowed something like that through their editing process.

    • 1) Vendors sometimes cut corners on standards. And sometimes standards that are supposed to guarantee interoperability between all conforming equipment turn out to have corner cases where two pieces of equipment implementing the standard in different (but both conforming) ways can't work together. All one vendor tends to guarantee interoperability

      2) It also eliminates finger-pointing. It's all their equipment, so no matter what's broken, it's their job to fix it.

  • Say Wha? (Score:2, Insightful)

    by lee1 ( 219161 )
    Does anyone understand the first sentence of the summary? Or most of the review, for that matter?
  • It is shocking enough that it was written, much less 'approved.' Lol...

    "I find myself making (or recognizing) more and more important decisions being made in developing for the web"

    "Scalability Rules Read below for the rest of eldavojohn's review"

    • Because someone with your superior writing skills has never chosen to write a review. Geeks prefer content over presentation, you should realize that by now.
      • Well, content-wise the review author shares that state is bad for scalability, but not why. (It is because it makes load balancing less efficient, as consecutive requests from the same user have to be routed to the same server, the one that keeps that user's state. Optimally the load balancer would be able to send each request to the server with the lowest load for the moment.)

      • Maybe you meant "superior writing skills" tongue in cheek (cheeky of you) but my skills in this regard are pretty average. BTW, I started looking through submissions posted by samzenpus and have noticed a bit of a trend... ;)

  • The main reason that Agile has become so popular with developers is that it gives the appearance and metric generation for a clueless management of an impressive amount of work being accomplished, while actually just providing cover for an entire team of developers to systematically learn to overestimate tasks.

    An Agile project to tie your shoes would consist of no less than two full walls of tasks written on notecards, not one of which would be estimated at less than one hour. Velocity will be through the r

    • I don't know what you are talking about. Agile has much smaller overhead than e.g. RUP or SCRUM. So would you please tell me what your reference point is when evaluating Agile?

      • Most of my experience is with scrum and XP, both of which are considered "agile" development methodologies, and both of which have the problems of which I speak.

        Your post doesn't even make sense.

        • You're right, I meant waterfall and other traditional techniques.

          • My point is that by using velocity and the closing of tasks as the metric of efficiency, it just encourages developers to create ever more pointlessly explicit tasks, while also rewarding overestimating them.

            In every scrum group I've been in, this has ended up happening. Nobody wants to be the velocity killer, so there is an unspoken trend to allocate increasing amounts of time to things as the project drags on.

            The iterative process has some strong points, but they are mostly obliterated by reality of corpo

  • by Wordplay ( 54438 ) <> on Thursday September 29, 2011 @03:15AM (#37550700)

    How true it is. You can approach defect-free by testing against clear requirements, but genuine quality has much more to do with decisions than defects.

  • It's very simple. State is something you have to understand while looking at the code under consideration but isn't *defined* in the code under consideration. That means you have to analyze and understand all the state before you can work properly with code, and that can be hard to do, and hard to know that you have in fact gotten it all.
