Catch up on stories from the past week (and beyond) at the Slashdot story archive

 



Forgot your password?
typodupeerror
×

SQL Hacks 72

Scott Walters writes "Many of the recipes in SQL Hacks will improve the SQL you write day to day, and many will give you the confidence to attempt much more involved tasks with SQL. Other recipes will rarely if ever be needed, but make for a entertaining and education reading in a similar way that "worse case survival scenario" books do — SQL is pitted against the most difficult analysis tasks just as survival scenario books pit humans against pavement and lions. SQL Hacks fits well in the Hacks series, which raises the bar on advanced books by offering large, eclectic sets of tricks for problems that an unambitious person (a non-hacker) wouldn't ever push technology hard enough to run into. Put another way, the questions answered in a good Hacks book are ones that would get a "good question" comment rather than a an "RTFM!" response. It does a good job continuing where O'Reilly's SQL Cookbook left off, which is always difficult with two books written at slightly different times by different authors. Still, it's harder to review a Hacks book than a Learning book as, with hacks, the sky is the limit, and the reader will always find herself wishing for more. To this end, I hope O'Reilly continues to publish newer editions of their various Hacks books, drawing in more and more content in each edition, and identifying recipes that might better serve in the Cookbook counterpart." Read the rest of Scott's review.
SQL Hacks
author Andrew Cumming & Gordon Russell
pages 386
publisher O'Reilly & Associates
rating 7
reviewer Scott Walters
ISBN 0-596-52799-3
summary Tips & Tools for Digging into Your Data


SQL Hacks skips most of the tutelage and shows you very specific ways for doing specific chores, with more explanation of how to adapt it than theory behind it. Most hacks have database specific information for the five databases the book tackles, and many hacks are inherently different on each system, making them completely different solutions to the same problem. Those five databases are Microsoft Access, Microsoft SQL Server, MySQL, Oracle, and PostgresSQL; most of the ideas require work to adapt or are completely specific to the database system, so I wouldn't suggest straying from this supported set. The authors did their homework, and SQL Hack's strengths are the depth, detail, and level of knowledge with which each database system is covered, and the book's willingness to get down and gritty. There's never an impression that juicy details were omitted because the authors didn't want to expend the effort to pick a colleague's brain or hunt down a factoid that never got documented elsewhere. Learning how to create indices on functions with multiple arguments in Postgres was worth more than the "hack" it was a footnote in. This dedication carriers over to screen shots showing how something is done in Microsoft Access directly opposite Unix shell pipelines between grep, perl, and the SQL command shell. Most books, including mine, are a bit awkward or vague on either Unix or Microsoft Windows, but the author's and contributor's experience on this one expertly covered platforms specific database topics. Besides just database systems and platforms, the authors challenged themselves to show how to securely and efficiently use the database interfaces of a set of languages: C#, Java, Perl, Python, and PHP. The polish shows, and you'll have absolute confidence that all of the tricks really are at your fingertips, regardless of your choice of operating system, database system, or programming language.

It gets bonus points for mentioning non-obvious types of input, such as cookies, that must be sanitized or sent through bound parameters, in its discussion of SQL injections. In the security department, it looks at SQL injections from three points of view: early on in the book, correct code is shown; later, SQL injections are shown from the point of view of the attacker, with several pages of strategies and scenarios for formulating attacks; and then from the point of view of the defender, who has to defang and avoid these scenarios — extra bonus points for this comprehensive treatment.

If you're looking for a quick buy/don't buy indication, then, by all means, buy it. That is, assuming that it's not intended to be your first or only SQL book. By it's own indication, it won't teach you the basics of database normalization, installation, and so forth. I would buy it as a second SQL book, though, after the fantastic 'The Practical SQL Handbook', as it's written to a much higher standard than most books, and gets things right, such as security, the intricacies of using a database to handle accounts, and transactions and shopping carts. The cover text promises lots of advanced hackery, but that's vague. "Pushing the limits of SQL"... "Solve puzzles using SQL"... "Manage users and audit the changes they make to the database".

Here are the major sections: SQL Fundamental; Joins, Unions, and Views; Text Handling; Date Handling; Number Crunching; Online Applications; Organizing Data; Storing Small Amounts of Data; Locking and Performance; Reporting; Users and Administration; and Wider Access.

Wider Access requires some explanation. It deals with locking down the various database systems to securely providing guest accounts, or, more generally, to limit damage in the case of an SQL injection attack or similar compromise.

With some well designed tables, SQL Hacks will show you quite a few tricks, some of them quote involved, quite non-obvious, and quite clever, to extract meaning from the data. You'll probably learn quite a few new types of reports you can do — intersecting ranges from different sets of data, outputting SVG pie charts, swapping rows and columns, finding medians, computing running totals, and computing running functions such as compound interest struck me as the most useful and got mental bookmarks.

I have two metrics for this book. The first metric is whether I'd buy it if I came across it in a book store, and that's a function of whether I'd have exhausted what it had to offer after an hour or so of furious skimming and intentionally picking out the best parts from the table of contents. Very few books make this cut for me.

The other metric is whether the authors did at least what I imagine I would have done were I writing it. This test is also a difficult one but builds in a great deal of forgiveness as my ideas are quite likely dumb ones.

I totally dig the cut-and-paste ASCII query results. The authors could have easily marked all of those up in DocBook and made it prettier but also alien compared to what you'll see at the computer. They're not ashamed of the SQL command shell, and they're not ashamed of SQL.

Many hacks have several examples, covering the problem with different constraints and end goals in mind.

Multi-platform, and thoroughly so. One moment, it's showing how to use XSLT tools from the command line on Microsoft Windows, and on the next page, there's a Unix shell pipeline with wget, xsltproc, and grep. Perl one-liners abound, and there are screen shots from Windows applications with instructions for navigating the menus and setting the needed options. You won't feel shortchanged for running the "wrong" platform.

When a powerful, modern SQL extension, such as replace, gets ratified by the standards committees, the authors let you know. Sidebars are spread around sharing the good news that sometime you might not have heard of before is portable. At the same time, some features are just fluff, and you're warned off of operations intentionally left out of the SQL92 standard.

Sometimes database systems have non-portable local extensions, such as MySQL's full-text indexing and SQLServer's XML handling features, and lots of these get motioned too, usually as variations on examples demonstrating the feature as a short-cut or simplification.

The treatment of security is first rate. The polish is top notch. Writing a book is a huge undertaking, and the economics of book publishing gives publishers little margin for advances. A book that reads like it's third release but is actually in its first can only be the product of an exceptional level of dedication by the authors.

Rarely, the authors do get tutorial-ish, but only a little, and I think it works: "Choose the right join style for your relationship" deals the difference between inner and outer joins, and whether records should be partially populated with nulls or omitted entirely when relations between tables can't be made for a record. Another section shows how to convert between subqueries and outer joins, and talks about when it's possible, and this serves as a sort of lesson in demonstrating the equivalencies between the two.

The "Hacks" format is similar to the "Cookbook" format. Both offer small, randomly-accessible (flip to it when you need it) examples of how to accomplish various tasks. In the traditional, MIT circles, a hack is piece of work that's either brilliant in its simple elegance or else brilliant in its expediency and simple effectiveness, and as such, is worthy of some esteem. It's also work that's custom for a particular scenario and has limited domain — in other words, it's a highly specialized fix or improvement. If a stock fix is applied systematically, that's mechanical, not clever. By this definition, showing users how to invoke their SQL monitor, or showing users how to decide whether to use an outer or inner join, are not hacks. Few of the recipes triggered this peeve, and they were early in the book, but including those few muddles the question of who the audience is, and lowers the standard for the Hacks series, endangering its basic premise. 'SQL Hacks' isn't alone in this sin; most of the Hacks books do it to some degree.

It was written by two professors at Napier University in Edinburgh, Scotland. The style, grammar, and presentation are perfectly fine — but only that. It's not a bone dry college text book, but it was written with a dedication to professionalism that can make a technical book tedious and will certainly keep it from becoming a classic. The literary power of Brooks, Hoare, or Wall is conspicuously absent.

Authors of Hacks books are at liberty to tap the experiences of the best and brightest of the field, and the best and brightest often have tricks just too strange, clever, or specialized to fit into any ordinary sort of text. I'd like to imagine that if I were charged with writing one of these, I'd have hundreds of contributors (I'm not likable, but I am persistent). Nothing against the contributors (two of them more than 20 years experience each), but why stop at three?

I said I had two benchmarks: whether I'd be likely to walk out of a bookstore with it if I had an hour alone with it to try to get my fill, and whether it touched on the subject that I thought it should.

Before cracking the cover, I stopped to ponder what would really impress me, and what I'd like to see. The Internal Functional Programming Competition had a puzzle solved by the contest winner using SQL. I'd like to see similar combinatronics and optimization problems solved using SQL. I'd like to see a good implementations of semi-infinite-strings, the text indexing data structure and algorithm that Google uses. I've done a version of this, but my implementation leaves something wanting. When reforming badly non-normalized databases, I've had to build a normalized database in parallel and populate it from queries on the non-normalized one. It would be interesting to hear how other people approach that problem, and what I can learn from them. There are other jobs that I've tackled and managed despite never having been prepared for. Renumber a display_order priority on records in response to the user adjusting or reassigning priorities. Trees using self-joining tables is something more people should be exposed to, especially when presented with non-normalized data.

There was no semi-infinite-string implementation, but the book showed how to build full-text indexes the optimal way for each database, using built-in full-text indices and optional add-on modules offering full-text indexing. The renumbering example took the more general form of running-totals computations. There were a few examples of self-joining data, and one tree example visualized the structure. Normalizing data had tricks, including some with views, and it showed how to use Cartesian joins to do combinatronics problems. So, aside form one sort-of, the authors nailed my entire wish list. That's amazing — I've never had that happen before, actually.

The highest endorsement a book can earn from me (a cheapskate, who already has a good deal of knowledge from working the industry for ten years) is getting bought on a random trip to the bookstore where I hadn't been looking for or intending to buy anything, and paying full price on top of that. Books that are surprising, riveting, and so packed with information that I couldn't possibly copy all of the best parts down and exhaust it in an hour or two are the ones that get purchased in this manner. I have 'SQL Hacks' in my hot little hands here at home, so this benchmark is now synthetic, but... I'm somewhat undecided, and not sure whether I would or wouldn't walk out with it. More likely, I'd just put it on my wishlist and pick it up later, for a discount (I'm a cheapskate, remember). If you don't know how to do more than half of the things listed in the table of contents, most certainly buy it. If you find yourself frequently working with SQL and constantly face new problems, buy it. If you find yourself still learning SQL and wanting a variety of examples, buy it. If you're shopping for a handful of good SQL books, buy it.

On a scale of stuff laying around the house, I give it 7 gold stars, half a box of binder clips, some AA batteries, and a bottle of really good soy sauce.


You can purchase SQL Hacks from bn.com. Slashdot welcomes readers' book reviews -- to see your own review here, read the book review guidelines, then visit the submission page.
This discussion has been archived. No new comments can be posted.

SQL Hacks

Comments Filter:
  • by macadamia_harold ( 947445 ) on Monday January 08, 2007 @03:43PM (#17512512) Homepage
    SQL is pitted against the most difficult analysis tasks just as survival scenario books pit humans against pavement and lions.

    This "survival scenario book" you're referring to sounds like it takes place at the zoo.
  • Good Article (Score:2, Informative)

    by b1ad3runn3r ( 896115 )
    Yes, offtopic, but I think it deserves to be noted that its a pretty well written review.
    • by Osty ( 16825 )

      Yes, offtopic, but I think it deserves to be noted that its a pretty well written review.

      I must not have read the same review as you. The one I read was full of sentence fragments, run-on sentences, comma and semi-colon misuse and overuse, poor organization, minor typos, and unnecessary comparisons to the reviewer's own authoring abilities. The content of the review was decent, though I didn't get enough from the review to know whether or not the book would be useful for me. The presentation of the r

    • by fm6 ( 162816 )
      Why is it offtopic? Whenever we get poorly written reviews (and that's most of the time) everybody bitches and flames at length. It's only fair to note when the reviewer actually knows what they're doing.
  • by Bazman ( 4849 ) on Monday January 08, 2007 @04:11PM (#17512928) Journal
    From: http://thedailywtf.com/Articles/The_Data_Cleanup.a spx [thedailywtf.com]

    UPDATE OWNER_USER.all_candidates SET name = filter(translate(REPLACE(REPLACE
    (replace(replace(replace(replace(replace(replace(r eplace(replace(replace(replace
    (REPLACE(replace(replace(replace(replace(replace(r eplace(replace(replace(replace
    (replace(replace(replace(replace(replace(replace(r eplace(REPLACE(REPLACE(
            name,
    CHR(1),null),CHR(2),null),CHR(3),null),CHR(4),null ),CHR(5),null),CHR(6),null),
    CHR(7),null),CHR(8),null),CHR(9),null),CHR(10),nul l),CHR(11),null),CHR(12),
    null),CHR(13),null),CHR(14),null),CHR(15),null),CH R(16),null),CHR(17),null),
    CHR(18),null),CHR(19),null),CHR(20),null),CHR(21), null),CHR(22),null),CHR(23),
    null),CHR(24),null),CHR(25),null),CHR(26),null),CH R(27),null),CHR(28),null),
    CHR(29),null),CHR(30),NULL),CHR(31),NULL))), ...I hope.
    • by Otter ( 3800 ) on Monday January 08, 2007 @04:29PM (#17513234) Journal
      Being accustomed to human genome data -- my first thought was that whoever has 31 chromosomes has more pressing concerns than some ugly SQL.
    • by vivian ( 156520 )
      It's obviously replacing non-pritning chars with null then doing something else with that in the translate and filter functions. The replace's are nested because there's no function to simply giv a range of values or array of values in the replace function.

      If they had used Postgres or Oracle, you could have used a regular expression to replace all those posky non-printing characters with null.

  • by Anonymous Coward
    why is it in every book i read, and in also reviews, the author and reviewer refer to the read or user as "her" all the time. Do they actually thing the main demographic that is reading their tech books are women? I am not a sexist here but it is a fact that IT,coding,networking is a male dominiated field.. hacking linux exposed, and hacking web apps exposed also refered to me as a woman.. ??
    • Re: (Score:3, Funny)

      by theskipper ( 461997 )
      Well, when I code, I sprinkle niceties like "you look really nice today" and "have you lost weight?" among my comments.

      Just in case it's a buxom blonde that peruses my c++ in the future.

      • by Mike89 ( 1006497 )
        I do similiar things. Sometimes without remembering doing it. I found in an older project the other day that the top of the file was marked simply "Still 12 days till the Wii :-(".

        I've found other strange stuff too. In a project from a couple of years ago I'd used a close friends birthday as the read handle for several files. It's interesting, because anyone else who goes through it will be at a lose as to what I was talking about (for the most part - they'd understand the Wii business). I also get to se
    • No, they're just continuing to labor under the assumption that a lot of technical males get overly excited that there might be real girls out there that actually know what things like "grep", "floppy", and "hard drive" really mean.

      Either that or they think writing "her" will eventually lead to more geeks encouraging their daughters to take up technical careers.

      Of course, that's not to say there aren't already some very geeky girls out there ...
    • Re: (Score:3, Insightful)

      by Ardeaem ( 625311 )
      If this annoys you, do you think it annoys many women that the generic pronoun for animate things in English is "he"?
    • Re: (Score:1, Informative)

      by Anonymous Coward
      The author is illiterate.
      http://www.theregister.co.uk/2002/04/08/new_winnt_ 2k_xp_security/ [theregister.co.uk]

      All right, I'm only going to say this once: 'He' is the singular indefinite pronoun in English ("if a person drinks too much, he will likely experience a hangover"). 'He' also happens to be the masculine personal pronoun.

      'She' is the singular pronoun of personification in English ("if England fails to advance America's foreign-policy ambitions, she will suffer terrible consequences"). 'She' also happens to be the

      • by reanjr ( 588767 )
        Male = he
        Female = she
        Neuter = one

        If one drinks too much, one will likely experience a hangover.

        To avoid using "one", one should utilize alternating gender-based pronouns (which is usually the case, but men - predominantly - will only notice those pronouns that are feminine and not those that are masculine). The alternation frequency should be regular, but ultimately depends upon textual context.

        One could alternate gender every paragraph (excepting those cases where a pronoun is in reference to a predetermin
    • by Falesh ( 1000255 )
      Personally I like the convention that some authors use where they use "he" if the author is male or "she" if the author is female.
  • by Anonymous Coward
    is to not use SQL. Seriously, at this point, with persistence systems like Hibernate and EJB3, most of us should no longer be using SQL for most database work.
    • by Anonymous Coward on Monday January 08, 2007 @05:21PM (#17514150)

      is to not use SQL. Seriously, at this point, with persistence systems like Hibernate and EJB3, most of us should no longer be using SQL for most database work.

      Congratulations, you get the "Don't Hire This Guy" award for this particular database-related slashdot post.

      In two simple sentences, you've managed to display most of what's wrong with the field of data management today. Let's review:

      * You use an invented, ad-hoc terminology instead of an accepted term: "persistence system", instead of "database management system" or "DBMS". I assume you call it a "persistance system" to reinforce the fact that it doesn't do much for you besides remember some bits on disk.

      * You support the COBOL-era practice of application-centric databases. Tell me, how do I access your Hibernate-controlled data from my Ruby script?

      * You support COBOL-era network databases, since I assume you're using Java, which is object-oriented, and object databases are basically network databases under a different name (more ad-hoc terminology).

      * You commited one or both of Date's Great Blunders: you equate entire SQL tables with types, rather then demanding that your non-scalar types be stored directly in the database without a "mapping" step, and you believe all values should have unique simple IDs, even though that data has nothing to do with what you're modeling.

      * You believe that database queries should be procedural and navigational, rather than algebraic and declarative. Tell me, if you have a sequence of Java statements in your code that perform a complex data management task, are they optimized as a unit into simpler, equivalent statements? Is it even possible? No, didn't think so.

      * You (likely) believe that business logic belongs in the application, rather than the underlying DBMS, even though you also have business logic in the DBMS already, such as different column types.

      * You're generally very confused, since Hibernate actually uses SQL itself.

      Basically, you'd rather avoid the complexity of learning about what a DBMS is, what the underlying model is, how SQL approximates the model, and how to design a DB that can outlive it's various applications and "frameworks of the month".

      Congratulations! You win a swift kick in the nuts!

      • And you, Mr AC, has shown yourself to be _extremely_ clueless.
        Hibernate _is_ a persistance system, not a DBMS, it is a tool to help you work _with_ DBS.

        How you access my Hibernte DB from Ruby, well, I imagine that you would use something like: "SELECT * FROM Orders", other methods does exist for ruby, but this is the lowest level of abstraction.

        I have _no_ idea what you are talking about with objection databases or network databases, both has no relation whatsoever to hibernate.

        The rest of the post is not w
    • by Amadodd ( 620353 )
      Some people just don't get it. Here is some typical business logic as defined by business:
      If ItemA has Status4 and has been through Process1 three times already send it to Process4. Except if it has already been through Process4, send it through Process5. If it has been through Process5 once before set status to Status8. Except if it the last process it went through has been assigned to Employee23 or Employee46 or any employee in EmplGroup8, in which case you must add an alert to that employees queue.
      And
  • SELECT * FROM database_weenies WHERE cool_factor > 0;

    0 rows returned

    • UPDATE database_weenies SET paycheck =100000,units='DOLLARS',cool_factor=cool_factor+1
      • INSERT database_weenies (paycheck, units, cool_factor) SELECT 150000, 'DOLLARS', cool_factor*2 FROM web_weenies WHERE sql_injection IS NULL
  • by Anonymous Coward
    Sounds like SQL Hacks is a kitchen sink, which may appeal to some readers.

    I appreciate Joe Celko's SQL for Smarties series.
    Mostly for the series pure depth, without dragging in the periphery issues or proprietary extensions.
    A) Some databases are better within their own documentation demarking something non-portable.
    B) The series points out commonly confused non-portable items and makes note of those in the standards that
    a
  • It's not a bone dry college text book, but it was written with a dedication to professionalism that can make a technical book tedious and will certainly keep it from becoming a classic.
    So in order to be a classic, a technical book has to have all the digressions and cheap humor that plague the O'Reilly books? A classic technical work is useful, not entertaining. If you want entertainment, read a novel.
    • by Falesh ( 1000255 )
      The most effective learning books I have ever read were Head First HTML with CSS & XHTML and Head First Design Patterns. They also happen to be the most enjoyable.

      Note that this style may not be the best for you personally as we all learn in different ways.
      • by fm6 ( 162816 )

        There's something to be said for the "head first" approach, which uses humor and graphics to reinforce learning. But there's a limit to what you can teach that way. In the case of SQL, you could write an HF book to make the basic concepts easier to grasp, but once you started getting into the really complicated and abstract stuff, all that gonzo exposition just gets in the way.

        Anyway, the HF books are very calculated in the way to use humor for teaching. That's not true for most humor in computer textbook

  • by doom ( 14564 ) <doom@kzsu.stanford.edu> on Monday January 08, 2007 @11:42PM (#17518512) Homepage Journal
    Something I've been wondering about off and on: is there anything like a "schema cookbook" out there?

    The other day I was once again implementing yet another set of tables to represent mailing addresses, phone numbers, user names and so on; and it once again struck me as being completely ridiculous that we all don't just use the same standard schemas to do common tasks like this.

    • by qohen ( 104310 )
      Such things do exist--there are some free resources online as well as books that one can buy.

      Free data model collections:
      http://www.databasedev.co.uk/data_models.html [databasedev.co.uk]
      http://www.databaseanswers.org/data_models/ [databaseanswers.org]

      Books:
      A guy named Len Silverston has made this his niche--he's written these books:
      "The Data Model Resource Book, Vol. 1: A Library of Universal Data Models for All Enterprises"
      "The Data Model Resource Book, Vol. 2: A Library of Data Models for Specific Industries"

      More about them on Amazon:
      http://www.a [amazon.com]
      • by qohen ( 104310 )
        P.S. Just found a list of all the data models that Silverston has in his 2 Data Model Resource books:
        http://www.wiley.com/WileyCDA/Section/id-106466.ht ml [wiley.com]

        And, here's a list of all the (eleven) articles he did for Data Management Review:
        http://www.univdata.com/pubs.htm [univdata.com]
        (this includes the clickstream analysis data model article previously listed as well as others about "Universal Data Models for Financial Services", "Universal Data Models for Health Care", "Using "Universal Data Models to Jump-Start Your Data
      • I think I like David C. Hays "Data Modeling Pattern" a little more.

    • by ZenFu ( 692407 )
      Jim Arlow's Enterprise Patterns and MDA may also have what you want, but in UML.

      The archetypes are abstract organization and business models. Instead of customers and vendors, for example, you might have parties and roles.

      The models should give you some ideas on what a generalized database should look like.

      In my oft-questioned opinion, the downside to coming up with a generalized layout, regardless of the tier, is that your complexity is fixed. Sometimes such an approach feels like driving a 18-w
  • After reading the review I was expecting some pretty high level tips and tricks. I was disappointed. With few exceptions the "hacks" are things that most good DBA's already know. One saving grace of the book is that the examples are in several variants of SQL, so it's somewhat usefull as a translation guide.

fortune: No such file or directory

Working...