mod_perl Developer's Cookbook

Posted by timothy
from the boiling-over dept.
davorg writes "Over the last few years mod_perl has become a serious force in web development. If you're building a web site to run on an Apache server and you want to write the code in Perl, then you're going to want to install mod_perl on your server too as it's the best way to avoid many of the performance issues with traditional CGI. It's taken a while for publishers to wake up to the fact, however, and there haven't been many books in the shops. It looks like this will be the year that this changes. A number of mod_perl books are about to be published and this is the first." Read on below for davorg's thoughts on this one.
mod_perl Developer's Cookbook
author Geoffrey Young, Paul Lindner & Randy Kobes
pages 630
publisher Sams
rating 9
reviewer Dave Cross
ISBN 0-672-32240-4
summary What mod_perl programmers have been waiting for

This book uses the popular "cookbook" approach, where the content is broken down into short "recipes" each of which addresses a specific problem. There are almost two hundred of these recipes in the book arranged into chapters which discuss particular areas of mod_perl development. In my opinion the cookbook approach works much better in some chapters than in others.

It's at the start of the book that the cookbook approach seems most forced. In chapter 1, problems like "You want to compile and build mod_perl from source on a Unix platform" provide slightly awkward introductions to explanations about obtaining and installing mod_perl on various platforms (kudos to the authors for being up-to-date enough to include OS X in the list). All the information you want is there, however, so by the end of the chapter you'll have mod_perl up and running.

Chapter 2 looks at configuration options. It tells you how to get your CGI programs running under mod_perl using the Apache::Registry module, which simulates a standard CGI environment so that your CGI programs can run almost unchanged. This will give you an immediate performance increase, as you no longer pay the cost of starting up a Perl interpreter each time one of your CGI programs is run. This chapter also addresses issues like caching database connections and using mod_perl as a proxy server.
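To make the source of that speed-up concrete, here is a minimal sketch of the compile-once idea in Python rather than Perl. It is an illustration under stated assumptions, not Apache::Registry's actual implementation: the `ScriptCache` class, the `demo` helper and the `hello.py` script are all hypothetical names, and the modification-time check is only meant to mirror the general approach of recompiling a script when the file on disk changes.

```python
import os
import tempfile


class ScriptCache:
    """Compile each script once and reuse the compiled form across requests."""

    def __init__(self):
        self._compiled = {}  # path -> (mtime_ns, code object)
        self.compiles = 0    # number of real compilations, for illustration

    def run(self, path, env):
        mtime = os.stat(path).st_mtime_ns
        cached = self._compiled.get(path)
        if cached is None or cached[0] != mtime:
            # First request, or the file changed on disk: pay the compile cost.
            with open(path) as fh:
                code = compile(fh.read(), path, "exec")
            self._compiled[path] = (mtime, code)
            self.compiles += 1
        # Each request gets a fresh namespace (a simplification of this sketch).
        namespace = {"env": env, "output": []}
        exec(self._compiled[path][1], namespace)
        return "".join(namespace["output"])


def demo():
    """Serve two requests for the same script and report how often it compiled."""
    cache = ScriptCache()
    with tempfile.TemporaryDirectory() as tmp:
        script = os.path.join(tmp, "hello.py")
        with open(script, "w") as fh:
            fh.write("output.append('Hello, ' + env['name'])\n")
        first = cache.run(script, {"name": "alice"})
        second = cache.run(script, {"name": "bob"})
    return first, second, cache.compiles
```

Calling `demo()` serves two requests while compiling the script only once. Traditional CGI, by contrast, starts an interpreter and parses the script on every request.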

We then get to part II of the book. In this section we look at the mod_perl API, which gives us access to the full functionality of Apache. This allows us to write Perl code that is executed during any of the stages of Apache's processing.

Chapter 3 introduces the Apache request object which is at the heart of the API and discusses various ways to get useful information both out of and back into the object. Chapter 4 serves a similar purpose for the Apache server object which contains information about the web server and its configuration.

In chapter 5 the authors look at Uniform Resource Identifiers (URIs) and discuss many methods for processing them. Chapter 6 moves from the logical world of URIs to the physical world of files. This chapter starts by explaining the Apache::File module before looking at many ways to handle files in mod_perl.

The previous few chapters have built up a useful toolkit of techniques to use in a mod_perl environment. In chapters 7 and 8 we start to pull those techniques together and look in more detail at creating handlers, which are the building blocks of mod_perl applications. Chapter 7 deals with the creation of handlers and chapter 8 looks at how you can interact with them to build a complete application.

Chapter 9 is one of the most useful chapters in the book as it deals with benchmarking and tuning mod_perl applications. It serves as a useful guide to a number of techniques for squeezing the last drops of performance out of your web site. Chapter 10 is a useful introduction to using Object Oriented Perl to create your handlers. While the information is all good, this is, unfortunately, another chapter where the cookbook format seems a little strained.

Part III of the book goes into great detail about the Apache lifecycle. Each chapter looks at a small number of Apache's processing stages and suggests ways that handlers can be used during that stage. This is the widest ranging part of the book and it's full of example code that really demonstrates the power of the Apache API. I'll just mention one particular chapter in this section. Chapter 15 talks about the content generation phase. This is the phase that creates the actual content that goes back to the user's browser and, as such, is the most important phase of the whole transaction. I was particularly pleased to see that the authors spent most of this chapter looking at methods that separate the actual data from the presentation. They have recipes that look at all of the commonly used Perl templating systems, and a few more recipes cover the generation of output from XML.

Finally, two appendices give a brief reference to mod_perl hooks, build flags and constants and a third gives a good selection of pointers to further resources.

This is the book that mod_perl programmers have been waiting for. The three authors are all well-known experts in the field and it's great that they have shared their knowledge through this book. If you write mod_perl applications, then you really should read this book.


You can purchase mod_perl Developer's Cookbook from bn.com. Slashdot welcomes readers' book reviews -- to see your own review here, read the book review guidelines, then visit the submission page.

  • by MrBoombasticfantasti (593721) on Wednesday September 18, 2002 @10:03AM (#4281019)
    It doesn't actually add much to the info already available at CPAN. Still nice to have it on the shelf.
    • Except that I can read it while on the train.
    • by lindner (257395) <lindner@inuus.com> on Wednesday September 18, 2002 @10:51AM (#4281350) Homepage
      It doesn't actually add much to the info already available at CPAN. Still nice to have it on the shelf.

      [disclaimer: author post follows]

      The problem with CPAN is knowing what's useful and what's not. This book isn't just a collection of modules and documentation. Instead it's geared to people who are writing mod_perl code. The code examples are used to show you not just how to do some task, but also (in most cases) how the code does what it does.

      In fact, distilling mod_perl code into short, sweet examples was where most of the effort went into writing this book. You don't want pages and pages of code to illustrate one or two simple ideas.

      So, perhaps we didn't write a book that was useful to you. Given the feedback I've read, it is useful to many other people.

      • You raise a valid point: CPAN is overwhelming because it is (in essence) a disorganized heap of information. I'm sure the book helps a great many people that do not have the time and/or skills and/or need to look it all up for themselves. Distilling the info into a more palatable format is a Good Thing. The fact remains however that for me there is not much news in the book. As I said before, but in different order, it's a nice book to have on the shelf, but (for me) there isn't much of an addition to CPAN.
      • i haven't seen the book around my regular sydney bookshop, but then i generally tend to linger in the ora section... but i will certainly seek this one out.

        to those who would say that perl is useless and unmaintainable beyond your basic 2K-line shopping cart application - i have 125K-lines of OO mod_perl app (not counting any CPAN modules!) which would argue otherwise. i would say that this app is *more* maintainable than its java equivalent by virtue of the fact that perl is so much more expressive (and therefore easy to grok) than java (which is exceedingly verbose and clumsy IMHO). and while the running footprint for this app is large (~25meg/process), it does a hell of a lot of work and request/response times are still very good (though i would kill to be able to dynamically unload large, rarely-used modules in perl as easily as it is to dynamically load them).

        matt
    • I disagree. It is better organized and more clearly presented than most of the on-line documentation, and it provides more examples. It also shows how to do things that are not discussed anywhere else, like automatically caching the output of a content handler. It's a very handy book to have.
  • I appreciate the reviewer's candor, but couldn't he have done a more thorough review instead of just focusing on some of the book???
    • What he really did was summarize the book, not review it. I agree with the popular view though. This is a decent book but doesn't add too much to existing literature on this subject. More dead trees.
  • by ajs (35943) <<moc.sja> <ta> <sja>> on Wednesday September 18, 2002 @10:08AM (#4281054) Homepage Journal
    mod_perl provides a means for transparently wrapping CGI programs so that they run continuously instead of starting up (and thus parsing) every time a request requires them.

    However, it's much more than a CGI accelerator. It provides hooks into all of the stages of an apache transaction.

    As an example of the kind of power this gives you, you can write a Perl plugin for Apache that intercepts 404s, and generates a dynamic page which you then cache to disk for future access (far out-stripping even native C dynamic page generation speeds on subsequent hits). This is just one example. You can write whole content management systems using mod_perl, and in fact many have.
    • exactly how do you run interpreted perl code faster than compiled C code?
      • I may be wrong, but I think he was saying that by caching the output to disk, additional requests are served faster than compiled C code could dynamically generate the page.

        Of course, the binary could also cache the page...

      • Request 1: xyz.html

        file not found
        mod_perl intercept of 404 calls xyz.pl
        xyz.pl writes xyz.html (e.g. from database)

        Request 2: xyz.html

        file exists
        sendfile or tux used to fire file to socket


        Even C cannot dynamically generate a file as fast as it can be read from disk. Granted, you could write the same plugin in C as you wrote in mod_perl (mod_perl uses the C API for apache after all), but it would be a lot more work, and all you would get is the performance boost on that first page generation, after that they perform the same.

        This is the model used by at least one major content management system that uses a language that makes Perl look zippy by comparison. They still compete because most page views are found on disk.

        Of course, now you get to play the cache management game, but that's the right problem to have when serving lots of content.
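The two-request flow described above can be sketched in a few lines. This is a Python illustration of the idea only, not mod_perl code and not how any particular content management system does it. The `render_page` function is a hypothetical stand-in for the expensive dynamic step (for example a database query), and `serve`, `demo` and the `stats` counters are names invented for this sketch.

```python
import os
import tempfile


def render_page(name):
    """Hypothetical stand-in for the slow dynamic step (e.g. a database query)."""
    return f"<html><body>Generated page for {name}</body></html>"


def serve(docroot, name, stats):
    """Serve a page from disk if present; otherwise generate it and cache it."""
    if os.path.basename(name) != name:
        raise ValueError("page name must not contain path components")
    path = os.path.join(docroot, name)
    if os.path.exists(path):
        # Request 2 and later: the file exists, so this is plain static serving.
        stats["static"] += 1
        with open(path) as fh:
            return fh.read()
    # Request 1: the "404" case. Generate the page and write it to disk.
    stats["generated"] += 1
    body = render_page(name)
    tmp_path = path + ".tmp"
    with open(tmp_path, "w") as fh:
        fh.write(body)
    os.replace(tmp_path, path)  # rename into place so readers never see a partial file
    return body


def demo():
    """Request the same page twice and report how each request was served."""
    stats = {"static": 0, "generated": 0}
    with tempfile.TemporaryDirectory() as docroot:
        first = serve(docroot, "xyz.html", stats)
        second = serve(docroot, "xyz.html", stats)
    return first, second, stats
```

In a real server the second branch would never reach application code at all, since the web server would find the file and send it directly. The sketch also leaves out the cache management mentioned above: nothing here ever deletes or refreshes a generated file.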
        • Even C cannot dynamically generate a file as fast as it can be read from disk.

          That depends upon what the file is, and how fast your disk system is. Many large scientific computations which, in the past, precomputed values and stored them to disk now recompute as necessary, simply because the recomputation is faster than a disk access.

          You won't be able to regenerate a file as fast as it can be read from cache; but unless you have an infinite amount of cache memory, there are likely to be cases where you're better off to recompute and allow something else to be cached.
          • True, but only in the case of huge files that require no disk access to generate dynamically. Since most dynamic content on the Web requires a database....

            C's sendfile can (when possible) perform a DMA transfer from the disk controller to the ethernet controller, which will beat the snot out of any relational database access.
            • True, but only in the case of huge files that require no disk access to generate dynamically.

              Except that the database entries used are more likely to be reused for other requests -- so if the output could be cached, the database certainly would be.

              Obviously, in some cases it is better to precompute entire pages; but it is really something which has to be determined on a case-by-case basis.
              • Except that the database entries used are more likely to be reused for other requests -- so if the output could be cached, the database certainly would be.

                I'm going to explain why this is wrong, but first let me explain that you're in some very good company in having made this assumption. I and just about everyone I know who've seen a good caching content management system in use have been stunned by the simplicity and correctness of the solution. In the case of Vignette (the one I'm most familiar with), I was also stunned that such slipshod software written in a language that couldn't even do lexical scoping (TCL) was doing this one thing so well :-)

                Ok, on to the technical. Yes, you can cache your database in memory (Oracle lets you cache gigs and gigs in RAM), but that buys you a lot less than you would think.

                You still have to execute millions upon millions of instructions just to generate the simplest page. When an HTML file is on disk, apache just calls sendfile(2), which copies the file from disk to socket with no userland code in between. Trust me when I say that this is so much more efficient that it's not even worth the comparison.
                • Caching data, as opposed to just caching generated HTML, allows you to reuse that data in other pages, some of which can't be cached. For example, I worked on an application where we would cache data from a product catalog and use that data in the browsing pages, the shopping cart, the gift registry, etc.

                  A good system will allow for caching of both data and generated HTML.
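A rough sketch of what caching both layers might look like, written in Python for illustration. The `CatalogSite` class, its method names and the product fields are all hypothetical. The point is only that a cacheable page and an uncacheable per-user page can share the same cached data.

```python
class CatalogSite:
    """Two cache layers: raw product data, and HTML for pages that allow it."""

    def __init__(self, fetch_product):
        self._fetch = fetch_product  # hypothetical database lookup function
        self._data_cache = {}        # sku -> product record
        self._html_cache = {}        # sku -> rendered browse page
        self.db_hits = 0             # counts real lookups, for illustration

    def product(self, sku):
        if sku not in self._data_cache:
            self.db_hits += 1
            self._data_cache[sku] = self._fetch(sku)
        return self._data_cache[sku]

    def browse_page(self, sku):
        # Identical for every visitor, so the generated HTML itself is cached.
        if sku not in self._html_cache:
            p = self.product(sku)
            self._html_cache[sku] = f"<h1>{p['name']}</h1><p>${p['price']:.2f}</p>"
        return self._html_cache[sku]

    def cart_page(self, user, skus):
        # Differs per user, so the HTML is not cached, but the data still is.
        items = "".join(f"<li>{self.product(s)['name']}</li>" for s in skus)
        return f"<h1>Cart for {user}</h1><ul>{items}</ul>"


def demo():
    """Render a browse page twice and a cart page once from one database hit."""
    catalog = {"A1": {"name": "Camel mug", "price": 9.5}}
    site = CatalogSite(lambda sku: catalog[sku])
    browse = site.browse_page("A1")
    site.browse_page("A1")
    cart = site.cart_page("matt", ["A1"])
    return browse, cart, site.db_hits
```

Invalidation is deliberately left out. A production system would need to expire both caches when the catalog changes.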

                • You still have to execute millions upon millions of instructions just to generate the simplest page

                  Only if you write crappy code, or you have extremely complicated pages. A few hundred thousand cycles is reasonable for well written code generating a web page from cached data.
                  • Only if you write crappy code

                    Nope

                    or you have extremely complicated pages.

                    Nope

                    A few hundred thousand cycles is reasonable for well written code generating a web page from cached data.

                    Most assuredly nope!

                    Sure, I too can come up with a home page for peeling paint that I can generate with a six-line C program. But, even moderate complexity would run you aground.

                    How are you caching data? How are you locking/cleaning/managing/clearing that cache? Your page generation will have to be in bed with that to some extent in order to determine if a new page request invalidates some or all of the cached data that it touches. Then, you're going to have the small matter of how you share this cached data. Is it in a simple database (e.g. Berkeley DB) or a second-tier relational database, or do you try to manage a live, shared memory cache? Cache consistency management on that's going to get ugly fast!

                    Now, you start dealing with protocol management, HEAD vs GET vs POST requests, parsing POST bodies, URL-encoding, cookie access, security, etc, etc.

                    "Well written code" as defined by number of cycles consumed usually means that many of these needs are handled in a one-off way that does not take into account the mountain of special-cases that makes up what we call the World Wide Web.

                    Instead I suggest you spend all of that premature optimization energy on writing a good cache management system that can mix and match static HTML cache with dynamically generated pages on the fly. That would benefit everyone, not just one Web page.
        • This is the model used by at least one major content management system that uses a language that makes Perl look zippy by comparison. They still compete because most page views are found on disk.

          They compete with other commercial software. But at US$500,000 for licensing (average), they're nowhere near competing with mod_perl.
    • However, it's much more than a CGI accelerator. It provides hooks into all of the stages of an apache transaction.

      Yeah, I found it extremely appealing for two reasons: First, I hate writing configuration file readers - and with mod_perl, it's just $request->dir_config('whatever'); to read stuff that is set with PerlSetVar in .htaccess or the server conf. The second reason: Logging with various debug levels. Easy with Apache::Log.

    • Actually, one of the barriers to mod_perl use is that mod_perl by default does *not* provide transparent wrapping of CGI programs. It can be made to do so using PerlRun modules, but I think the documentation just needs to be more prominent about the fact that vanilla Apache::Registry scripts behave significantly differently from CGI. Perhaps the documentation should advertise more the PerlRun modules (etc) that do give transparent CGI wrapping. I, like many others, have fallen into the trap of just blindly switching a script from CGI to mod_perl and been bitten by many of the issues (documented, if you bother to RTFM, which of course I didn't at first =)

      Now that I know mod_perl in depth, the parent is correct about the immense flexibility of mod_perl with its ability to directly interface with Apache. Something you won't be able to do ever with CGI or even PHP.

      And about You can write whole content management systems using mod_perl, and in fact many have. Of course the CMS running here at Slashdot is powered by Slashcode [slashcode.com] which runs under Apache/mod_perl.
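The trap described in the comment above, where a script that worked under CGI misbehaves once it stays resident, comes down to state surviving between requests. Below is a minimal Python sketch of that effect. It is not mod_perl code, and the mapping onto Apache::Registry and PerlRun is an assumption for illustration: `run_persistent` stands in for a script whose globals live on between requests, and `run_fresh` for one that starts with a clean namespace each time, as under plain CGI.

```python
# A tiny "script" that assumes it starts from scratch on every request.
SCRIPT = "hits = globals().get('hits', 0) + 1\n"
CODE = compile(SCRIPT, "<counter script>", "exec")


def run_fresh(times):
    """Run the script in a new namespace per request (CGI-like behaviour)."""
    results = []
    for _ in range(times):
        namespace = {}
        exec(CODE, namespace)
        results.append(namespace["hits"])
    return results


def run_persistent(times):
    """Run the script in one long-lived namespace (persistent-interpreter behaviour)."""
    namespace = {}
    results = []
    for _ in range(times):
        exec(CODE, namespace)
        results.append(namespace["hits"])
    return results
```

With three requests, `run_fresh(3)` yields `[1, 1, 1]` while `run_persistent(3)` yields `[1, 2, 3]`. The script's unstated assumption that its variables start empty no longer holds once the interpreter persists.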
    • An example of one of these content management systems would be mason, http://www.masonhq.com [masonhq.com], and mason apps such as Fuse CMS [autistici.org] and Bricolage [thepirtgroup.com]. I find Mason to be just as powerful as multi-thousand dollar applications such as StoryServer [vignette.com]
  • website support (Score:4, Informative)

    by Anonymous Coward on Wednesday September 18, 2002 @10:16AM (#4281106)
    we (the authors) support a companion website where you can find a number of useful items, such as all the code [modperlcookbook.org] from the book (to save your fingers) and a full-text search engine [modperlcookbook.org] (to supplement the index).

    http://www.modperlcookbook.org/ [modperlcookbook.org]

    enjoy
  • by pizza_milkshake (580452) on Wednesday September 18, 2002 @10:54AM (#4281380)
    it's taken /. a while as well; this book was published in January [amazon.com]
    • As one of the authors it's been difficult to wait for this book to get more widespread exposure. One reason might be because it is published by SAMS. I suspect if there was a cute O'Reilly animal on the cover we'd be much more widespread at this point. Who knows, maybe we should have stuck with the (unfounded) SAMS stereotype and named the book mod_perl unleashed in 21 days for dummies. Nah..

      In any case, it's nice to see a new review on one of my favorite web sites. More good reviews over there at amazon and at the book's official web site [modperlcookbook.org].

    • It's partly my fault. I got my review copy in June :-/

  • http://www.modperlcookbook.org/modperl2.html
    One thing to note is that it is for the 1.3 version not the new 2.0 version. They say though there are not too many differences.
    • Re:Not mod_perl 2.0 (Score:2, Interesting)

      by lindner (257395)
      One thing to note is that it is for the 1.3 version not the new 2.0 version. They say though there are not too many differences.


      Funny thing. We were worried that mod_perl 2.0 would steal our thunder. It's now september 2002 and we still don't have the official release.


      Apache 1.3 and mod_perl 1.x will be around for a long time though. Especially on all those production servers that don't get the latest greatest software, only the boring reliable stuff...

  • A very useful book (Score:2, Interesting)

    by barries (15577)
    Apache and mod_perl are incredibly powerful but complex systems; it's very difficult for any one person to keep all of the details and possible approaches to all of the things you can do with them in my^Wtheir head.

    This book's approach helps me find tried and true approaches to the things I need to make mod_perl do. It's far better organized and written than the freely available documentation and covers a range of modules (many written for the book) that do things I used to do the hard way. It's clear, concise, and the material is well chosen. You'll get a lot further along on your next mod_perl project a lot faster with this book close to hand than by repeatedly scouring CPAN and the web for the modules, mail messages, and documentation.

    Yours in mod_perl,

    Barrie

  • I don't have the money to pay for my own dedicated server. Is there anywhere that I can get access to mod_perl for $10-20/month?

    I know what you are probably thinking: if my site is small enough to get by on a virtual hosting account, then I should probably just not worry about mod_perl. And maybe I should just leave it at that, but here is what I am thinking...

    I am not a mod_perl expert so I might be totally wrong about this but, hey, that's why I'm asking! If a host set up mod_perl with some basic modules preloaded, users could then run their scripts under it. Not only would users' websites run faster, but it would reduce system resource usage (overall), which should make the hosting company happy too.

    Yes, I know that there would be some problems particularly with security but has anyone figured out a way to do this successfully?

    REF: http://www.perl.com/pub/a/2002/05/22/mod_perl-isp.html [perl.com]
    • by Anonymous Coward
      I don't think so. Mod_perl gets its hooks so much deeper into Apache than CGI does, that it's hard to share.

      One problem is that a bad mod_perl program can bring down the server. A bad CGI program can't since by definition it's forking a new process to run, so all it can do is crash its own forked process.

      Another issue is that in order to load new or modified mod_perl scripts, you need the privileges of the process running the server.
      No way you can do that in a virtual hosting environment, unless you have a death wish.

      In order to install a mod_perl script you also have to be able to edit the apache config file. Typically (for good reasons) a file writable only by root.

      Another issue is that the apache processes running keep all the mod_perl programs in memory. If there were ten different mod_perl-enabled "web sites" run by the same apache server on the same box, that could get really inefficient.

      More I think about it, the only way this could work is if each mod_perl virtual host has its own apache server instance, with full ownership of its own config file and privileges to bring it down and restart it.

      You would need some kind of gateway server answering all requests coming in to port 80 and redirecting them to a different port depending on the request's URL; each mod_perl virtual host would have its own port number which the gateway server would redirect to.
      This sort of internal redirect is often used now, with a server without mod_perl handling static requests and a mod_perl-enabled server handling dynamic requests on the same box.

      -->So I have changed my off-the-cuff conclusion. You could do this all on one box, but it would be complicated, and a fundamentally different model than shared hosting with CGI only. And I'm not sure any hosting company is doing it. Nor am I sure that there aren't some bad security or performance implications I haven't thought of.

      Just get DSL and start messing around running mod_perl on your own computer. If it's a low-volume or self-instructional site that should be quite adequate.
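The gateway arrangement proposed in the comment above boils down to a lookup from the incoming request to a per-customer backend port. The Python sketch below shows only that routing decision, under loud assumptions: the host names and port numbers are invented, routing is keyed on the Host header rather than the full URL, and no actual proxying is done.

```python
# Hypothetical mapping: one private Apache/mod_perl instance per customer.
BACKENDS = {
    "alice.example.com": 8001,
    "bob.example.com": 8002,
}
STATIC_PORT = 8000  # hypothetical plain server for everything else


def backend_port(host_header):
    """Pick the backend port for a request based on its Host header."""
    # Drop an optional ":port" suffix and normalise case before the lookup.
    host = host_header.split(":", 1)[0].strip().lower()
    return BACKENDS.get(host, STATIC_PORT)
```

For example, `backend_port("Alice.Example.com:80")` selects port 8001, while an unknown host falls through to the static server on port 8000. A real front end would then forward the request to that port. Each customer can restart their own instance without touching the others, which is what makes the model different from ordinary shared CGI hosting.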

  • Not to knock Perl, but have you ever tried to maintain someone else's CGI scripts? Perl has a serious design flaw in my opinion - it's easy to write code that is almost unreadable to anyone else but the author.

    As a System Administrator, I see this as being detrimental to the work environment - so you have a Perl guru who can do it all - what happens when they leave? Who will you replace them with? How long will the new people need to familiarize themselves with obfuscated code?

    Consider a solution like Python or PHP. PHP quite simply is the shit when it comes to web programming - you can quickly put together a complex web application using its straightforward and simple syntax, which is easy to read, understand and modify. (Of course you all know this...)

    Python is even better in this regard.

    In a production environment, it makes sense to use tools that are conducive to efficiency.

    My motto: Keep It Simple Stupid!
    • Maybe obfusgated(sp?) code. My perl looks just as good as my C/C++. I've never had a problem reading the perl of others, either, provided it was formatted and commented with just a little care.
      • True, Perl doesn't have to be messy. It's all a matter of coding style. If you are a considerate developer, you will write clean, easy to read code that's replete with useful comments.
        Unfortunately, there are people who want to write messy code out there, and I'll be damned if I have to maintain it!
        Thanks for the point - clarity of code is a matter of style, but certainly the choice of language helps as well.
    • Another powerful solution is JavaServer Pages. With JSP you can combine HTML and Java code together in one file. It's also likely to be faster than Python because it doesn't need to interpret each line of code before executing it.

  • This is by far one of the most useful books on my bookshelf. If you have a problem just pick it up, thumb through it a bit, and find the answer. Its approach, which is very different from that of the Eagle book, was sorely needed for a long while.


  • PerlRun is the component that allows CGI to be run without modification, not Registry.
