
Effective XML

milaf writes "Who doesn't know about XML nowadays? Quite a few people, actually: there has been so much hype around it that some people think that XML is a programming language, a database, or both at the same time. On the other hand, if you are a developer, chances are that you feel that -- no matter its usefulness -- there is not much to XML. After all, it may take just a few hours to get the hang of creating and parsing an XML document. Maybe this is why most of the many voluminous XML books discuss numerous XML-related technologies but say little about the usage of XML itself." Read on for milaf's review of a book that takes the opposite tack.
Effective XML: 50 Specific Ways to Improve Your XML
author: Elliotte Rusty Harold
pages: 336
publisher: Addison-Wesley
rating: 10/10
reviewer: milaf
ISBN: 0321150406
summary: Very well written collection of topics on XML best practices

In Effective XML: 50 Specific Ways to Improve Your XML, Elliotte Rusty Harold takes a different approach: know your elements and tags -- they are not the same thing! -- and weigh your choices in a context, because any technology applied for the wrong reasons may fail to deliver on its promises.

Following Scott Meyers' groundbreaking Effective C++, the author invites us to re-evaluate seemingly trivial issues to discover that life is not as simple as it seems in the world of XML. In each of the 50 items (chapters), he gets into the inner workings of the language, its usage, and related standards, giving us specific advice on how to use XML correctly and efficiently. The 300-page book is divided into four parts: Syntax, Structure, Semantics, and Implementation. Already in the introduction, the author sets the tone by discussing such fundamental issues as "Element versus Tag," "Children versus Child Elements versus Content," "Text versus Character Data versus Markup," etc. On these first pages the author started earning my trust and admiration for his knowledge and his ability to get right to the point in clear and simple language.

The first part, Syntax, contains items covering issues related to the microstructure of the language and best practices in writing legible, maintainable, and extensible XML documents. (In it, over 19 pages are dedicated to the implications of the XML declaration!) That seems like a lot for one XML statement that most people cut-and-paste at the top of their XML documents without giving it much thought, doesn't it? Actually, no, if you follow the author's reasoning and examples.

The second part, Structure, discusses issues that arise when creating data representation in XML, i.e. mapping real-world information into trees, elements, and attributes of an XML document; it also talks about tools and techniques for designing and documenting namespaces and schemas.

The third part, Semantics, explains the best ways to turn the structural information in XML documents into data with meaning. It teaches us how to choose the appropriate API and tools for different types of processing to achieve the best effect. This chapter has a lot of good advice for creating solutions that are simple, effective, and robust.

The final part, Implementation, advises the reader on design and integration issues related to the utilization of XML; these issues include data integrity, verification, compression, authentication, caching, etc.

This book will be useful to a professional at any level of experience. It may be used as a tutorial and read from cover to cover, or one can enjoy reading selected items, depending on experience and taste. The book's very detailed index makes it an excellent reference on the subject as well. In the preface, the author writes, "Learning the fundamentals of XML might take a programmer a week. Learning how to use XML effectively might take a lifetime." I'm not sure about the "lifetime" -- that's an awfully long time for using one technology -- but for the most confident of us this still may not be enough :) . Your mileage may vary, but I suspect that you could shave a few months off that time by browsing through this book once in a while. Most importantly, it will make you a better professional and make you proud of the results of your work. Wouldn't this worth your while?


You can purchase Effective XML: 50 Specific Ways to Improve Your XML from bn.com. Slashdot welcomes readers' book reviews -- to see your own review here, read the book review guidelines, then visit the submission page.

This discussion has been archived. No new comments can be posted.

Effective XML

Comments Filter:
  • library (Score:5, Interesting)

    by Pompatus ( 642396 ) on Monday November 24, 2003 @03:19PM (#7549847) Journal
    If you want to read any book for free, just ask your local library to order it and they will. Libraries guess at what books people want to read, so if anyone shows any interest in any book, they order it. They loose their federal funding if they don't spend the money they are allocated, so they are generally VERY willing to buy as much as possible.
    • Re:library (Score:3, Funny)

      by essdodson ( 466448 )
      They should spend more on dictionaries. Perhaps then they could tighten up their federal funding before it gets too loose.
  • by foistboinder ( 99286 ) on Monday November 24, 2003 @03:19PM (#7549851) Homepage Journal

    It's got to be better than Ineffective XML

    • Hate to reply to a joke, but there ARE books that discuss the 'wrong' way to do things in order to avoid them.

      One that comes to mind would be Bitter Java which demonstrates wrong patterns used in applications and alternatives that tend to be more effective.

      So don't be too sure that it is better than Ineffective XML ;-)
  • by NickFitz ( 5849 ) <slashdot.nickfitz@co@uk> on Monday November 24, 2003 @03:20PM (#7549858) Homepage
    Learning how to use XML effectively might take a lifetime
    ...
    you could shave a few months off that time by browsing through this book

    Reading this book shortens life expectancy. Still, it's your choice...

  • by Anonymous Coward on Monday November 24, 2003 @03:23PM (#7549898)
    Others have said it before, but I'll say it again. XML is heavy weight and isn't free. The best example of this is SQLXML. Although it sounds nice to use SQLXML, most commercial databases see a huge drop in performance with it. This is due to the fact that parsing XML blows and eats up copious amounts of CPU and memory. I've had people ask me how to solve problems with SOAP on Windows and Java applications. The bottom line is, unless you're using hardware XML accelerators, XML is a resource hog.

    On a related note, more details on Microsoft Indigo are finally available. According to this article on XML mania [xmlmania.com], Microsoft's future platform will use XML as much as possible. More details are available on Microsoft's site [microsoft.com]. The funniest part is they are claiming Indigo + Longhorn will be the best thing since sliced bread. Maybe they haven't learned the hard lesson that parsing XML kills performance.

    • I share your opinion regarding XML, and have yet to find a great reason to use it, other than feeding data to our vendors systems through their proprietary file layouts.

      On that note though, I wonder if this author has some insight into better uses for XML than what I've typically seen (XML does everything!). I won't, however, be running out to buy it, as XML will always be just more bloat and a resource hog by nature.
    • by mellon ( 7048 ) * on Monday November 24, 2003 @03:47PM (#7550085) Homepage
      XML is just text! If the XML parser is slow, write a faster one! Figure out where the bottlenecks are! Don't give me this XML is slow crap. This is slashdot - you're supposed to be a geek. If you don't like XML, fine, but come up with a geeky reason not to like it, not some problem whose solution is just to roll up your sleeves and do some hacking!

      Oy! :')
      • by nat5an ( 558057 ) on Monday November 24, 2003 @03:54PM (#7550140) Homepage
        Okay, fine, XML isn't slow by nature. But it's a generalized solution. Not every set of data needs to be stored in a general tree, so putting every set into one will often create a lot of extra work. The benefit of XML is its portability, and the price is the performance hit you take from packing and unpacking all that data.
      • by micromoog ( 206608 ) on Monday November 24, 2003 @04:09PM (#7550241)
        How about the fact that, by definition, it takes something like 10 times as much information to store/transfer data in XML as in a native binary format?

        Having a huge amount of metadata surround every piece of data is not always a good thing. XML is slow, parser issues notwithstanding.

      • by larry bagina ( 561269 ) on Monday November 24, 2003 @04:32PM (#7550395) Journal
        parsing any text involves character-by-character analysis. No amount of geekdom code rewriting can change that. If an XML file is 3-times as large as a CSV file, it will take 3-times as long to parse. And both will be magnitudes slower than a binary record.
        • This is by no means assured. When you store data in a binary format, you generally have to have code to deal with byte-swapping and other format conversions. Also, generally speaking, the limitation on character parsing is memory bandwidth - if you are using a modern CPU, it is going to spend most of its time waiting for bits to come out of memory, and it doesn't care whether they're an ASCII (or utf8) byte stream or binary words.

          Also, a lot of stuff that goes around in packets is free-form text anyway
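For what it's worth, the size gap being argued about above is easy to measure. Here's a rough Python sketch using an invented three-record dataset (the exact ratio depends entirely on tag names and field widths, so treat the numbers as illustrative only):

```python
import csv
import io
import xml.etree.ElementTree as ET

# The same three-field records in CSV and in XML form.
csv_text = "name,qty\nwidget,3\ngadget,7\n"
xml_text = (
    "<items>"
    "<item><name>widget</name><qty>3</qty></item>"
    "<item><name>gadget</name><qty>7</qty></item>"
    "</items>"
)

def parse_csv(text):
    # Each CSV row becomes a (name, qty) tuple.
    return [(r["name"], int(r["qty"])) for r in csv.DictReader(io.StringIO(text))]

def parse_xml(text):
    # Each <item> element becomes the same (name, qty) tuple.
    root = ET.fromstring(text)
    return [(i.findtext("name"), int(i.findtext("qty"))) for i in root]

# Both formats recover identical data...
assert parse_csv(csv_text) == parse_xml(xml_text)
# ...but the XML document is several times larger on the wire.
print(len(xml_text) / len(csv_text))  # roughly 3-4x for this toy dataset
```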
      • XML is just text!

        Exactly. It is just a form of text tagging. XML is evolutionary and not revolutionary in terms of technology. First there was SGML (Standard Generalized Markup Language) and then there was HTML.

        I remember working with SGML a dozen years ago. It was certainly not easier to use than the old system of formatting manuscripts. In fact it was much more time consuming. But the real benefit was the ability to make an archive of searchable articles with results that could be pulled up and be pro

      • by Anml4ixoye ( 264762 ) * on Monday November 24, 2003 @04:37PM (#7550425) Homepage

        You bring up some really good points. The reason that you hear a lot of "XML is slow" is because of the usage of XPATH. To use XPATH expressions, most implementations parse the entire XML document into memory.

        I suppose you *could* write a custom parser. If your structure is well-defined, and not subject to a lot of changes, you could significantly increase performance that way. The other option is to parse the document once, get out what you need to get out into smaller chunks, dump the larger document, and only work off the smaller chunks.

        Looks like TMTOWTDI is not just for Perl
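The parse-once-and-keep-only-the-chunks-you-need approach can be sketched with Python's standard library streaming parser, assuming a simple well-defined structure (the <log>/<entry> vocabulary here is made up):

```python
import io
import xml.etree.ElementTree as ET

# 1000 <entry> elements, streamed one at a time instead of building the
# whole tree up front (which is what most XPath engines must do).
doc = io.BytesIO(
    b"<log>" + b"".join(b"<entry>%d</entry>" % i for i in range(1000)) + b"</log>"
)

count = 0
for event, elem in ET.iterparse(doc, events=("end",)):
    if elem.tag == "entry":
        count += 1
        elem.clear()  # discard the subtree once processed, keeping memory flat

print(count)  # 1000
```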

        • XPath is not inherently a pig. Many APIs handle XPath with aplomb, usually building an alternative data structure behind the scenes for access. XSL usually wants the whole tree, but many implementations optimize this out unless large structures are being reorganized.

          Use the context, Luke.
      • How's this? It's not necessary, it has no real reason to exist, and everything that it can do can be done better by existing products?
        I'm at a loss, and have been ever since it came out, as to why this is becoming a common way of doing things.
      • by Anonymous Coward on Monday November 24, 2003 @04:40PM (#7550453)
        Have you ever tried storing a picture in it?

        <pixelrow>
        <pixel>
        <value channel="red" level="0.023"/>
        <value channel="blue" level="0.22"/>
        <value channel="green" level="0.5"/>
        </pixel>

        ...

        </pixelrow>

        ...

        :)
        • And that would be a PERFECT example of the WRONG tool for the job.

          Now, if you want to compare something to do with images and xml, try comparing Flash files to SVG files and see what conclusions you come up with...
        • Sure: (Score:5, Informative)

          by rodentia ( 102779 ) on Monday November 24, 2003 @05:22PM (#7550976)
          <?xml version="1.0" ?>
          <!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.0//EN"
          "http://www.w3.org/TR/2001/REC-SVG-20010904/DTD/sv g10.dtd">

          <svg>
          <line x1="50" y1="50" x2="300" y2="300"
          style="stroke:#FF0000;
          stroke-width:4;stroke-opacity:0.3;"/>
          <line x1="50" y1="100" x2="300" y2="350"
          style="stroke:#FF0000;
          stroke-width:4;stroke-opacity:1;"/>
          </svg>
        • Have you ever tried storing a picture in it?


          Actually, yes.

          It's called SVG, and it is a very nice way to represent graphics.

      • I really haven't touched XML. The reason? From the outside, it looks bulky and slow. Parsing strings and using only 128 bits of the character isn't my idea of efficiency.

        Could someone explain to me why conversion into, say a binary map or such, isn't an advantage? I can easily see its portability, and ease of use, I just don't see the speed and small size.
      • XML is just text! If the XML parser is slow, write a faster one! Figure out where the bottlenecks are! Don't give me this XML is slow crap. This is slashdot - you're supposed to be a geek. If you don't like XML, fine, but come up with a geeky reason not to like it, not some problem whose solution is just to roll up your sleeves and do some hacking!

        Oy! :')

        XML may not be slow, but when it's used as a network protocol like SOAP and that craziness called XML-RPC it sure is. An XML parser is unjustifiable ove

      • wow, wait a minute... you want a geeky reason not to use it... well, how about rolling your own binary parsing data format is a) much, much more difficult for others to understand, b) way faster, c) far more bandwidth efficient.

        there you go - 3 classic geek reasons to do something the hard way instead of the standard, ordinary, easy but OK for mortals way.

        Incidentally, XML really is slow. Sure it looks nice, is easy to understand, easy to create with the simplest of text editors, interoperable, and an ind
        • my previous company used XML everywhere (it was cool, after all), but after a while performance (when scaled to many users) became an issue. Rewriting the XML-handling object to use a binary format made things much, much, much faster. The XML blobs were then only used for the browser front end, and for debugging on a developer machine.

          No, your company did the exact right thing in choosing XML. When the nascent system is still being actively debugged, you made the process much easier because XML is human

    • This is the main reason I personally don't believe XML can be used as a functioning database. I see it being used more as a way to transport data across the internet and across different platforms. If two companies merge and one uses mostly UNIX-based servers and the other uses Microsoft, the two can combine their databases easily using XML.


      I see XML as a nice way to transport data but (at least right now) it's not mature and/or fast enough to serve as a fully functioning database.

    • speek kills... (Score:3, Insightful)

      by Broadcatch ( 100226 )
      ...resource hogs.

      While I'm not an XML zealot, I like the clarity it can bring to many domains of practice. Regarding the performance hit, get a faster computer! If you don't have a fast enough one yet, wait a year.

      Lisp was shunned in the past primarily for speed reasons, too. Now the main reason many don't like Lisp is because they don't understand advanced software engineering concepts and write poor Lisp code.
    • by Citizen of Earth ( 569446 ) on Monday November 24, 2003 @04:01PM (#7550182)
      Others have said it before, but I'll say it again. XML is heavy weight and isn't free.

      XML needs to be updated to allow binary encoding [cubewerx.com]. The open-source high-performance parser/generator library at the link demonstrates the performance gain [cubewerx.com].
    • I have to agree with that. Last year I did a work term in a department where they were converting their software to XML and SOAP. When I came in they asked me to learn XML and SOAP (C++'s gSOAP and Java SOAP). We were making and converting distributed applications, usually with a user client made in Java and a C++ server (for performance). After a few weeks into my work term I was still in the process of working on one of these SOAP servers when finally one group finished converting one of our main produ
    • by Ed Avis ( 5917 ) <ed@membled.com> on Monday November 24, 2003 @04:45PM (#7550484) Homepage
      If you know your XML will conform to a particular DTD, FleXML [sourceforge.net] can be used to generate a very fast parser for it in the style of lex/yacc. You don't have to mess with all that slow DOM or SAX stuff if you're concerned about speed. It may still be a resource hog compared with binary file formats and protocols but not nearly as sucky as often seen (my own code included).
    • By the time Longhorn actually ships, we'll all have 20 TeraHertz processors to go with our moon colonies and personal rocket packs. Problem solved.
    • piffle (Score:3, Interesting)

      by rodentia ( 102779 )
      Bandwidth is an order of magnitude more limiting than tree parsing, egg. That and the facilities the tool vendors decorate their stuff with. Of course it's not free; what is?

      SQLXML and most other value-adds are bull. Your business objects should optimize the hell out of their DB access and return XML. XML is messaging and presentation tier glue. Read the book.
    • XML is very fast (Score:5, Interesting)

      by Doug Merritt ( 3550 ) <<gro.euqramer> <ta> <guod>> on Monday November 24, 2003 @06:08PM (#7551689) Homepage Journal
      XML is heavy weight ... see a huge drop in performance. This is due to the fact that parsing XML blows and eats up copious amounts of CPU and memory.

      That's because everyone uses slow XML parsers. Some years ago at one of the then-top 5 web portals I was unhappy with the standard SAX/DOM parser in use; it was ridiculously slow (and buggy).

      So I wrote a new one. Parsing XML became one hundred fold faster! I timed it quite carefully.

      Other people in this thread are saying "of course XML is slower than binary formats, it's 3 times bigger." But a factor of 3 in performance is nothing, considering some of the advantages.

      A slowdown of 100, on the other hand, is absurd.

      I don't know why people don't rebel against this and make faster XML parsers the widely-used ones; for whatever reason, apparently everyone continues using slow parsers.

      At any rate, no, XML is not slow. It's just a simple, easy to parse format, for which IBM and others have written very, very slow parsers.

      And everyone just assumes that it has to be slow. Sheesh, why should an XML parser be slower than a C++ compiler??? Come on.

  • XML... (Score:5, Insightful)

    by the man with the pla ( 710711 ) on Monday November 24, 2003 @03:23PM (#7549899)
    I think one of the main problems with the embedding of XML architecture into office productivity software is unfortunately the end user. I mean, how long have programmes like MS Word had "document properties" contained in them, and how many people are actually using them? I'm currently working on a project to retrieve documents across a company's backed-up data from the past 10 years, and there is very, very little metadata available for us to do any searching on. Unless the embedded XML contained within office suites is brought more "to the fore" and in the face of users, instead of being a behind-the-scenes 'option', people just are not going to use it.
    • Re:XML... (Score:3, Interesting)

      by Zo0ok ( 209803 )
      I saw a Microsoft demo that was supposed to show how powerful and useful it could be to insert XML tags into Word documents. The idea was to fill the Word document with useful information (just fill in the user's name here, and all information about the user is automatically inserted; now how good isn't that?). MS calls this Smart Document.

      So, I took a look in the XML file that they connected to the Word document to make it smart. I wasn't very impressed (but fairly amused) when I saw that the XML file was li
    • Interestingly, XML was originally intended as a userland technology, bringing the strength of SGML to the web, fixing what was broken in HTML (the last great userland data format). The game has lost sight of the goal a bit, I think, which is the root of much of the kvetching this topic generates.

      Frankly, ERH is a great writer and has good insights into the use and abuse of markup. This book is one of the things that was missing while the pro/anti-XML hype trains were picking up steam.
  • by acomj ( 20611 ) on Monday November 24, 2003 @03:25PM (#7549915) Homepage

    XML would work better if there were consistent DTDs for tagging information that everyone would use. There should be an open database of these DTDS.

    I was looking for a simple one to tag photos with. Couldn't find it, made my own. Is there a repository of these DTDs out there?

    • by Anonymous Coward on Monday November 24, 2003 @03:45PM (#7550073)

      Maybe here [xml.org]?

    • What's the matter, is your Google finger [wohlberg.net] broken?

      Let's see... A <digital> element contains zero or more <frame>s, each of which can contain an <image> with a URL.
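The DTD the parent describes might look something like this sketch (the element names come from the comment; the url attribute name is a guess):

```xml
<!-- Hypothetical photo DTD: a <digital> roll of zero or more <frame>s,
     each optionally holding an <image> that points at a URL. -->
<!ELEMENT digital (frame*)>
<!ELEMENT frame (image?)>
<!ELEMENT image EMPTY>
<!ATTLIST image url CDATA #REQUIRED>
```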

    • W3C (Score:5, Informative)

      by sielwolf ( 246764 ) on Monday November 24, 2003 @04:08PM (#7550236) Homepage Journal
      Browse the Technical Reports, Recommendations [w3.org] and Proposed Recommendations [w3.org] at W3C as there are a lot of DTDs and Schemas there. I found a DTD [w3.org] for generic simulation representation there. There's quite a bit if you take the time to look.
  • XML (Score:2, Funny)

    by Anonymous Coward
    ... a floor wax and a dessert topping...
  • by Randolpho ( 628485 ) on Monday November 24, 2003 @03:27PM (#7549936) Homepage Journal
    Does the book discuss the pros and cons of XML? Such as, when is it a good idea to use XML? When would a CSV, INI, or other structured text document be a better choice than XML?

    These are issues that need to be solved first, before one creates an effective XML structure. Does the book address them?
    • by LetterJ ( 3524 ) <j@wynia.org> on Monday November 24, 2003 @03:48PM (#7550095) Homepage
      Unfortunately, most Slashdot reviews are little more than book reports with pretty much no analysis. They end up just listing what the chapters contain.

      Incidentally, one of the main reasons to choose XML over either CSV or INI is that both of those formats are pretty driven by rigid "column" type structures. In most INI files there's only room for pairs of names and single values. In CSV records are one row with a set number of fields.

      XML lets you expand the children fully and represent more complex data. For instance, a classical CSV file with address information for customers would have columns for street address, city and then start to have problems when you start having columns for State (when you actually consider the world outside the US), postal codes, etc. If this is in XML, you can have your schema be more flexible and say that each <customer> contains a <shippingaddress> element which can contain either a <state> or a <province> or neither.

      In other words, you can use trees to represent data instead of flat rows. I'm not saying that it's the be-all and end-all that the evangelists say it is. There are still lots of places that simpler text files and other data storage formats are better, but XML can be useful.
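The parent's <customer>/<shippingaddress> idea can be sketched in a few lines of Python (the document and element names follow the comment; this is an illustration, not a real schema):

```python
import xml.etree.ElementTree as ET

# Two <customer> records whose <shippingaddress> children differ in shape:
# one has a <state>, the other a <province>. A CSV row would need columns
# for both; the tree simply omits whichever doesn't apply.
doc = ET.fromstring("""
<customers>
  <customer><name>Acme</name>
    <shippingaddress><city>Austin</city><state>TX</state></shippingaddress>
  </customer>
  <customer><name>Maple</name>
    <shippingaddress><city>Ottawa</city><province>ON</province></shippingaddress>
  </customer>
</customers>
""")

regions = []
for cust in doc.findall("customer"):
    addr = cust.find("shippingaddress")
    # Take whichever regional element this record actually has.
    region = addr.findtext("state") or addr.findtext("province")
    regions.append((cust.findtext("name"), region))

print(regions)  # [('Acme', 'TX'), ('Maple', 'ON')]
```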
      • You can represent tree structured data dead easily in .INI files (as long as your API for parsing them can enumerate sections and keys, not just ask for ones by names you already know).

        Actually there's nothing forcing you to stop at trees; you could represent arbitrary directed graphs in .INI files without any trouble, other than that of remembering that your application needs to avoid running round loops forever.
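A minimal sketch of the parent's point, assuming a made-up convention in which each .INI section lists its child sections under a children key:

```python
import configparser

# Each section is a node; its "children" key names the sections below it.
ini_text = """
[root]
children = left right

[left]
children =

[right]
children = leaf

[leaf]
children =
"""

cfg = configparser.ConfigParser()
cfg.read_string(ini_text)

def walk(node, depth=0):
    # Depth-first traversal of the tree encoded in the flat file.
    yield (depth, node)
    for child in cfg[node]["children"].split():
        yield from walk(child, depth + 1)

print(list(walk("root")))  # [(0, 'root'), (1, 'left'), (1, 'right'), (2, 'leaf')]
```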
        • Imagine that, you can represent just about anything in a flat text file!

          Now, what do the contents of this ini file mean and how shall I edit it to do what I want it to?

          [fido.ini]

          (Contents don't matter for the point to be made)
      • Actually, I already understand that. The problem is; what happens if you don't *need* to represent a tree for your data? Why should you use XML rather than some flat, easier-to-parse CSV file? I mean other than the fact that XML is the current buzzword, of course.

        Every book on XML should address this issue. I wonder if this book does.
    • Here's another one:

      If XML is so great, why wasn't the review written in XML? Why wasn't the book written in XML? Why aren't its advantages obvious, as opposed to the disadvantages (bloat, slowness, etc.)?

  • by AllergicToMilk ( 653529 ) on Monday November 24, 2003 @03:28PM (#7549945)
    One of the things that I have found limiting about XML is that it is inherently hierarchical. Real "things" can be categorized many ways. Hierarchical classification systems (such as our modern file systems) work poorly to classify a broad scope of information. Thus, some of the new development in the FS in Longhorn and also some I've heard about, but can't remember, for Linux.
    • e.g. IBM's take [ibm.com].

      You can link between XML entities quite easily.

      Also consider that RDF, which describes directed graphs, is quite easily expressed in XML; there's nothing to say that you can't describe a graph and reference actual elements with IDREFs. I don't think you've really thought about this.
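Here's a sketch of the ID/IDREF approach with an invented <graph>/<node>/<edge> vocabulary: a cyclic directed graph flattened into an XML tree and recovered as an adjacency list.

```python
import xml.etree.ElementTree as ET

# Nodes carry unique id attributes; edges point back into the document
# with idref attributes, so the tree can encode an arbitrary graph.
doc = ET.fromstring("""
<graph>
  <node id="a"><edge idref="b"/><edge idref="c"/></node>
  <node id="b"><edge idref="c"/></node>
  <node id="c"><edge idref="a"/></node>
</graph>
""")

adjacency = {
    node.get("id"): [e.get("idref") for e in node.findall("edge")]
    for node in doc.findall("node")
}
print(adjacency)  # {'a': ['b', 'c'], 'b': ['c'], 'c': ['a']} -- a cycle, not a tree
```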
  • Hmm.. (Score:3, Funny)

    by jpsowin ( 325530 ) on Monday November 24, 2003 @03:31PM (#7549965) Homepage
    Wouldn't this worth your while?

    Wouldn't this what my while???
    All your base are belong to us!

    (huge eye roll)
  • Glad you cleared that up for us non-programmers. Now if I could just figure out what it really is!
  • by FearUncertaintyDoubt ( 578295 ) on Monday November 24, 2003 @03:32PM (#7549971)
    Syntax:
    Include an XML Declaration
    Mark Up with ASCII if Possible
    Stay with XML 1.0
    Use Standard Entity References
    Comment DTDs Liberally
    Name Elements with Camel Case
    Parameterize DTDs
    Modularize DTDs
    Distinguish Text from Markup
    White Space Matters

    Structure:
    Make Structure Explicit through Markup
    Store Metadata in Attributes
    Remember Mixed Content
    Allow All XML Syntax
    Build on Top of Structures, Not Syntax
    Prefer URLs to Unparsed Entities and Notations
    Use Processing Instructions for Process-Specific Content
    Include All Information in the Instance Document
    Encode Binary Data Using Quoted Printable and/or Base64
    Use Namespaces for Modularity and Extensibility
    Rely on Namespace URIs, Not Prefixes
    Don't Use Namespace Prefixes in Element Content and Attribute Values
    Reuse XHTML for Generic Narrative Content
    Choose the Right Schema Language for the Job
    Pretend There's No Such Thing as the PSVI
    Version Documents, Schemas, and Stylesheets
    Mark Up According to Meaning

    Semantics:
    Use Only What You Need
    Always Use a Parser
    Layer Functionality
    Program to Standard APIs
    Choose SAX for Computer Efficiency
    Choose DOM for Standards Support
    Read the Complete DTD
    Navigate with XPath
    Serialize XML with XML
    Validate Inside Your Program with Schemas

    Implementation:
    Write in Unicode
    Parameterize XSLT Stylesheets
    Avoid Vendor Lock-In
    Hang On to Your Relational Database
    Document Namespaces with RDDL
    Preprocess XSLT on the Server Side
    Serve XML+CSS to the Client
    Pick the Correct MIME Media Type
    Tidy Up Your HTML
    Catalog Common Resources
    Verify Documents with XML Digital Signatures
    Hide Confidential Data with XML Encryption
    Compress if Space Is a Problem
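As one illustration, the "Encode Binary Data Using Quoted Printable and/or Base64" item from the list above might look like this Python sketch (the <thumbnail> element and its encoding attribute are invented for the example):

```python
import base64
import xml.etree.ElementTree as ET

# Arbitrary bytes can't go into an XML text node directly,
# but their Base64 form can.
raw = bytes(range(8))  # some binary payload

elem = ET.Element("thumbnail", encoding="base64")
elem.text = base64.b64encode(raw).decode("ascii")
xml_text = ET.tostring(elem, encoding="unicode")

# Round-trip: parse the document and decode the payload back out.
parsed = ET.fromstring(xml_text)
assert base64.b64decode(parsed.text) == raw
print(xml_text)
```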

    • And one of them is Just Plain Wrong, also IMHO.

      Here are two heuristics for good XML design that I dearly wish more people would take to heart:

      1. If processing any text field requires parsing, Something Is Wrong, and you probably need to break it apart into more elements/subelements.

      The only exceptions to this rule are fields that are numbers, or maybe date/time stamps that adhere to ISO standards.

      2. If you're using attributes, You'll Wish You Hadn't In The Future.

      Attributes are supposed to be the way X
  • by Valar ( 167606 ) on Monday November 24, 2003 @03:35PM (#7549997)
    It has been my experience with XML that it is like a lot of other things in development: the good developers understand it immediately and have native intuition towards best practices. The bad developers never really get it and spend their time reproducing tricks they saw in a cookbook. That's good and fine until you need something that doesn't quite fit into categories a, b, or c. Another example of this is how high school and university data structure/algorithm classes never spend any time on the development of new data structures that exactly meet the problem specification. Instead they lay out half a dozen types of linear lists, a couple of trees, and some hashing functions and say, "Well, you can glue just about anything together from this." Perhaps this book takes what is, IMHO, the better approach -- laying out the tools and politely explaining what the implications of each are, rather than attempting to list out pages of cute examples of what each can do.
  • I know that as a student maintaining a website I am in the minority of XML users, but the main thing that stops me from moving my site (small-scale though it may be) over to using more XML is sheer server load. The fact of the matter is that we still don't have true low-bandwidth database solutions, and until this changes, I doubt that much will be done with technologies like XML (at least on smaller, non-corporate sites) no matter how much potential they have.
  • by pong ( 18266 ) on Monday November 24, 2003 @03:42PM (#7550050) Homepage
    ... and it is starting to dawn on me that trends like pervasive XMLization are going to haunt us forever. The combination of business-minded consultants who push a market to create demand for themselves and a huge number of clueless but enthusiastic developers who will jump on any new idea and push it where it doesn't want to go unsurprisingly leads to this kind of instability.

    I hate XML with a passion. Let me present you with three examples

    1) Programming languages based on XML.

    Yes, it is true. Perverted minds, somewhere on this planet, actually seem to think that this is a neat idea! Since their initial conception, the pivotal point of programming languages has been to raise the level of programming. To move from the computer's domain to the human domain - to make it more intuitive and natural for a human being to program a computer. With these new XML-based languages we are moving a step backwards, because truly the only benefit of XML in this context is that it is easier for computers to parse, while it is certainly harder for humans.

    2) XSLT

    Have you tried it? I rest my case.

    3) SOAP

    Okay, initially this actually seemed like a good idea to me, but having thought about it, I really think it sucks. Okay, so it is easier to implement SOAP for a particular platform or programming language, but a wire protocol is like a compiler or an OS kernel in a certain sense - it is okay that it is very hard to write, as long as it is stable and high performance, because it is such a central component.
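For readers who haven't tried XSLT, the standard identity-transform idiom gives the flavor of point 2. This sketch copies its input unchanged except for renaming one invented element:

```xml
<?xml version="1.0"?>
<!-- Identity transform plus one override: rename <state> to <region>.
     (The element names are made up for the example.) -->
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- Default rule: copy every attribute and node as-is. -->
  <xsl:template match="@*|node()">
    <xsl:copy><xsl:apply-templates select="@*|node()"/></xsl:copy>
  </xsl:template>
  <!-- Override for the one element we want to rewrite. -->
  <xsl:template match="state">
    <region><xsl:apply-templates select="@*|node()"/></region>
  </xsl:template>
</xsl:stylesheet>
```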
    • i second.. (Score:4, Informative)

      by Hooya ( 518216 ) on Monday November 24, 2003 @03:53PM (#7550136) Homepage
      i use XML for a lot of things and it's been quite decent. but on the other hand, we're using dual pentium IIIs for trivial stuff that was running fine on a PII with c/c++ app without XML.

      the fact is that XML is just marshalling and unmarshalling of all computational data to and from strings, thereby negating the fast numerical performance that a CPU inherently has. you want to add two numbers? create a string representation, pass it around through a bunch of parsers/transformers as strings, then finally convert it back to the numbers they really are, add them, then convert the result back to a string for passing around all over again... what a waste.
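      (the round trip being complained about, as a toy Python sketch - element names are made up for illustration:)

```python
import xml.etree.ElementTree as ET

# Two numbers a CPU could add in a single instruction...
a, b = 2, 3

# ...get serialized into markup,
doc = "<add><x>%d</x><y>%d</y></add>" % (a, b)

# parsed back out of strings,
root = ET.fromstring(doc)
x = int(root.find("x").text)
y = int(root.find("y").text)

# added,
result = x + y

# and re-serialized for the next hop in the pipeline.
print("<result>%d</result>" % result)  # <result>5</result>
```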
      • Ahh, I see, so your problem with XML is that it is a really slow math processor?

        Right tool for the job. I don't believe I've EVER heard somebody suggest that one should remove some heavy-duty number crunching from a c++ app and stuff it into XML...

        On the other side however, ever tried stuffing a family tree into a relational database? Or doing large quantities of text processing in c++?
    • I hate XML with a passion. Let me present you with three examples

      1) Programming languages based on XML.


      Yup.

      2) XSLT

      Have you tried it? I rest my case.


      I'm coding some right now, and it's not easy. The thing is this: it is tremendously powerful, and good at doing the one thing it's good at: converting XML to XML. There aren't many cases where you should need to do this, but when you do, XSLT beats perl, IMHO.
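      (for the curious, the flavor of XML-to-XML restructuring XSLT is built for can be sketched in a few lines of Python with hypothetical data - a real stylesheet would express the same rewrite declaratively with template rules:)

```python
import xml.etree.ElementTree as ET

src = "<staff><person name='Ada'/><person name='Bob'/></staff>"

# Transform: rename the root and turn attributes into child elements,
# the sort of structural rewrite an XSLT template rule would express.
out = ET.Element("team")
for person in ET.fromstring(src):
    member = ET.SubElement(out, "member")
    member.text = person.get("name")

print(ET.tostring(out, encoding="unicode"))
# <team><member>Ada</member><member>Bob</member></team>
```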

      3) SOAP

      Okay, initially this actually seemed like a good idea to me, but having thought about it, I really
      • Most of those files that use # for a comment do have a single parser in common: /bin/sh

        Most of the time those "config files" are actually just little scripts that get sourced into whatever startup script needs the information. The variables you're setting are actually environment variables. Interestingly enough, if you changed the /etc files to XML, you would have to add a step to parse the variables out of the XML files into (most likely) environment variables so your scripts could use them.

        Besides, i
        • Most of those files that use # for a comment do have a single parser in common: /bin/sh

          A bunch do, but many don't. For those that are really dressed up sh files, I have to agree with you. For the rest, a standard format (XML) would be nice.

          bombadil% ls /etc/*.conf ...
          • /etc/6to4.conf: shell
          • /etc/gdb.conf: not shell
          • /etc/inetd.conf: not shell
          • /etc/kern_loader.conf: blank?
          • /etc/named.conf: not shell
          • /etc/ntp.conf: not shell
          • /etc/resolv.conf: not shell
          • /etc/rtadvd.conf: no clue
          • /etc/slpsa.conf: probably s
    • XSLT: Have you tried it? I rest my case.

      I've tried it, and I ended up loving it. Of course, I'm not using it as it was intended. I'm using it to convert DocBook into HTML and PDF statically. This is a heck of a lot better than using SGML/Jade.

      Like XML, XSLT is being adopted in areas the authors never really intended, but ignored in those they did. I use XML in several areas, so when my employer offered to send me to an XML class for free, I accepted. It was horrible! The examples used by the professor al
    • 1) Programming languages based on XML

      Yeah, but code *generation* with XML is the cat's pyjamas.

      2) XSLT

      You clearly haven't tried it, or did not use it as intended. Do you have any experience with other functional languages? I work almost exclusively with XSLT at the moment and wouldn't have it any other way.

      3) SOAP

      is butt-stupid, I admit. But hey, ninety-odd percent of the beef this topic has generated can be fixed with a glance at the book being reviewed.
  • ... is XML-RPC. A sort of lightweight SOAP. Very very useful for APIs when you're doing cross-platform coding...

    The site [xmlrpc.org] has loads of implementations of both server and client code, some in *very* obscure languages :-)

    Simon.
  • by jefu ( 53450 ) on Monday November 24, 2003 @03:44PM (#7550061) Homepage Journal
    and L is for the Laughter it brings us.

    I have not read this book, but it sounds interesting already.

    XML is an interesting technology that has the potential for changing the way we use technology in all kinds of weird and wonderful ways. (And in a few ways that may not be so wonderful.) But using XML correctly is tough. I've written and discarded more DTDs and schemata than I care to admit because they were seriously flawed. Getting it right is important and very, very hard.

    XML looks simple, and in some ways it is. But in so many other ways it is not simple at all - in large part because it gives us a tool to approach some very hard problems. And hard problems, often even when expressed in the simplest way possible, tend to stay hard. (Calculus makes saying some things simple, for example, but understanding those things still takes work and insight.)

    I will be taking a good look at this book in the near future to see what it has to say. And I'd urge those who dislike XML to do the same. And finally, even those who like XML need to think hard about how to use it well, so perhaps this would be a good read for them too.

  • by Chromodromic ( 668389 ) on Monday November 24, 2003 @04:20PM (#7550317)
    Reading through the posts on this board, I tend to agree with the criticisms about XML. It's a big dreadnought of a specification when, in most cases, a nice light corsair or even single-seat fighter would do the trick. Still, I would normally be inclined to say of XML what is said about Democracy: it's the worst system out there, except for all the others.

    Then I found YAML [yaml.org]. Long and short, YAML is very lightweight, eminently readable, easy to use (parsers exist in multiple languages) and a pleasure for all kinds of projects that require data serialization. Where XML branches off into other types of uses, like XSL programming, YAML doesn't really compete. I find this to be a strength, actually, because once you've used YAML and seen it in action, XSL seems like a big, fat add-on. But for those that rely on XSL and other things, YAML won't do the trick.

    But if all you need is data serialization in a compact, easy-to-read, easy-to-use package -- and this, in my opinion, is by far what XML is most used for -- then YAML is great. Give it a shot.

    As for XML: I used to hate it with a passion. Now I still hate it, but I'm less passionate. The creators of XML are ambitious people, and they tried to do something in that spirit. It works, basically, and XML doesn't deserve *all* the bad press it gets.
    • by oren ( 78897 )
      XML and YAML have different "sweet spot" domains, though you can apply both technologies outside their intended domain.

      XML is great for "documents" - text documents, that is. XML does an admirable job separating "content" from "markup" which can be used to drive "presentation". It really is a big improvement over SGML. Things like DocBook and CSS stylesheets make XML the choice for writing documents.

      YAML is great for "data" - data structures, that is. YAML directly maps to common application data struct
  • <karmawhore type="shameless">

    If you have a Safari account, you can read it Here [oreilly.com]

    </karmawhore>

  • by wdavies ( 163941 ) on Monday November 24, 2003 @04:51PM (#7550571) Homepage
    Ok, maybe I'm missing a point, but the next time I see an XML file like this...
    <RECORD NAME=".." ADDRESS=".." AGE = "..">
    <RECORD NAME=".." ADDRESS=".." AGE = "..">
    <RECORD NAME=".." ADDRESS=".." AGE = "..">
    <RECORD NAME=".." ADDRESS=".." AGE = "..">
    instead of this
    ..\t..\t..
    ..\t..\t..
    ..\t..\t..
    I am going to go nuts. Yes, XML is an improvement for truly hierarchical or repeating data, but efficient it isn't, and it's a pain in the butt to use with AWK or any one of a million Unix utilities. The one downside I have on ESR's Art of Unix [catb.org] is that while espousing how clean it is with pipes and text, he then starts waxing lyrical about XML... Winton
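    (the size overhead being complained about is easy to put numbers on - a quick Python sketch with made-up records:)

```python
import xml.etree.ElementTree as ET

rows = [("Al", "1 Main St", "33"), ("Bo", "2 Oak Ave", "41")]

# Tab-separated: one line per record, trivially pipeable through awk.
tsv = "\n".join("\t".join(r) for r in rows)

# The same data as attribute-per-field XML records.
root = ET.Element("RECORDS")
for name, addr, age in rows:
    ET.SubElement(root, "RECORD", NAME=name, ADDRESS=addr, AGE=age)
xml = ET.tostring(root, encoding="unicode")

# For flat tabular data the markup overhead dwarfs the payload.
print(len(tsv), len(xml))
assert len(xml) > 2 * len(tsv)
```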
    • by Anonymous Coward
      The whole idea with XML is that it will catch the error when a user or script writes "RECROD" in one place, or forgets a space. Your AWK script will likely just crash without an explanation, or miss a record if the user e.g. forgets the carriage return between lines.
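      (that difference is easy to demonstrate in Python - the data here is hypothetical:)

```python
import xml.etree.ElementTree as ET

# The "RECROD" typo: opening tag misspelled, closing tag correct.
bad = "<RECROD><NAME>Al</NAME></RECORD>"

try:
    ET.fromstring(bad)
    caught = False
except ET.ParseError:
    caught = True  # a conforming parser rejects malformed input loudly
print(caught)  # True

# The tab-separated equivalent with a missing field just misparses silently:
fields = "Al\t1 Main St".split("\t")  # expected 3 fields, got 2
print(len(fields))  # 2
```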

      And just assume that six months after releasing your program you realize it would be very useful with an "OCCUPATION" field too. What do you do now? Maintain a separate collection of databases for each generation of your software?

      The XML-database
    • Well, duh, if you are using XML for non-hierarchical data, then you're using it wrong.

      On the other hand if it looked more like this:

      <Records>
      <RECORD id=".." NAME=".." ADDRESS=".." AGE=".."/>
      <RECORD id=".." NAME=".." ADDRESS=".." AGE=".."/>
      <RECORD id=".." NAME=".." ADDRESS=".." AGE=".."/>
      <RECORD id=".." NAME=".." ADDRESS=".." AGE=".."/>
      </Records>

      and if the tag was nested in something else, then xml is appropriate.

      A
  • by BrittPark ( 639617 ) on Monday November 24, 2003 @05:10PM (#7550817) Homepage Journal
    XML is highly overrated and generally over-used. Admittedly XML + CSS is better than html, but beyond that its only reasonable use is as a generalized syntax for configuration files, and as such it does a good job; at least I've had success using it that way in the past. Many (if not most) of its other uses are just poor program design. Soap is an extremely silly idea. Why use XML as a marshalling syntax for RPC? It's slower, bulkier, and just a bad choice in comparison to a binary marshalling mechanism. Now as a syntax for an RPC's IDL, XML makes a lot of sense, but not as a transport.
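    (to put rough numbers on the marshalling overhead - a hypothetical Python comparison, not any particular SOAP stack:)

```python
import struct

# Marshal two 32-bit integers for an imaginary RPC call.
a, b = 7, 42

# Binary: 8 bytes, fixed layout, decoded with a single byte copy.
binary = struct.pack("!ii", a, b)

# A greatly simplified XML envelope: several times the bytes,
# plus a full parse on the receiving end.
xml = "<call><arg>%d</arg><arg>%d</arg></call>" % (a, b)

print(len(binary), len(xml))
assert len(xml) > 4 * len(binary)
```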

    Glad to get that off my chest. I have a bitter history with XML. I was the first person at my former company to bring XML in as a uniform configuration file format for our product, but then found myself a couple of years later forced into adding XML specific features to the filesystem that was the core of our company's product. I spent a week thinking about the idea, and concluded that it was a bad one. Thus followed a long (and fruitless) battle with management to scratch the plan. The end result was a technically nifty but useless set of features. The work remains unreleased for lack of customer interest. At least I get a bit of "I told you so." pleasure.
  • by zontroll ( 714448 ) on Monday November 24, 2003 @05:14PM (#7550873)
    VeryGeekyBooks [verygeekybooks.com] has more reviews of this book.
  • by elharo ( 2815 ) <elharoNO@SPAMmetalab.unc.edu> on Tuesday November 25, 2003 @11:50AM (#7558449) Homepage
    Nice review. Thanks! It's interesting how many of the comments here relate directly to chapters in the book. For instance, there's a lot of concern about XML's perceived verboseness. This is addressed directly in Item 50, Compress if space is a problem. [cafeconleche.org] This chapter and ten others are online at http://www.cafeconleche.org/books/effectivexml/ [cafeconleche.org] . Check it out.
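    (The gist of that item is easy to see for yourself: markup is highly repetitive, so it compresses extremely well. The document below is made up for illustration.)

```python
import zlib
import xml.etree.ElementTree as ET

# A verbose document: 100 near-identical records.
root = ET.Element("RECORDS")
for i in range(100):
    ET.SubElement(root, "RECORD", NAME="name%d" % i, AGE="33")
xml = ET.tostring(root)

packed = zlib.compress(xml)

# The repetitive tags all but vanish under compression.
print(len(xml), len(packed))
assert len(packed) < len(xml) // 4
```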
