Programming Books Media Book Reviews IT Technology

Text Processing in Python 215

Ursus Maximus writes "If you have read an introductory book or two about Python programming, but you are far from being an expert, then you will benefit a lot from reading this book. If you are a competent programmer in any other language, you will benefit from this book. If you are an expert Python programmer, you will also benefit from this book." Ursus Maximus's review continues below.
Text Processing in Python
author David Mertz
pages 520
publisher Addison Wesley
rating 10
reviewer Ursus Maximus
ISBN 0321112547
summary How to use Python to process text.

As you probably know, there are many good introductory texts about Python. This is not one of them, for this is an advanced book, but not an inaccessible one. David Mertz has a unique style and focus that we have become familiar with from his series of articles on IBM developerWorks. Dr. Mertz is more interested in facilitating our learning process than in lecturing us, and rather than fill his pages with impressive examples designed to illustrate his expertise, he gently guides us by offering subtle yet important examples of code and analysis that make us think for ourselves.

He has a special talent for programming in the functional style, and this is a great introduction to that style of Python programming. Thus, this is also a good guide to using the newer features introduced into Python in the last few revisions, which often facilitate the functional style of programming.

The text includes, in an appendix, a 40 page tutorial covering the basic Python language. This tutorial is, like the book, unique in its approach and is worthwhile even for experienced Pythonistas, as it sheds light on some of the underlying ideas behind the syntax and semantics, and it also illustrates the functional style of programming, which is sometimes quite useful when doing text processing. And, despite its many other virtues, this is a book about text processing.

Chapter 1 covers the Python basics, but with a particular eye towards those features most critical and useful for text processing. Chapter 2 covers the basic string operations as found in the string module and the newer built-in string functions. Chapter 3 is about Regular Expressions, and, although I am shy about regexes because of their relative complexity, I am very glad to have read this chapter and will no longer be intimidated when regexes are the correct approach to take! Chapter 4 is on Parsers and State Machines, which are important for processing nested text, as in everyday HTML, XML and the like. This chapter is not as esoteric as its title may sound to relative newbies (like myself), as it does offer useful ideas and principles for dealing with HTML. How much more useful can a topic be than that? It is true that a deep understanding of this subject may be beyond myself and other relative duffers, but this chapter has much to offer those like me and I am sure much more to offer professionals.
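
As a rough illustration of the three approaches these chapters cover (plain string methods, regular expressions, and a small state machine), here is a minimal sketch. It is not taken from the book: the sample text and the function name `strip_tags` are hypothetical, it is written for current Python 3, and the state machine handles only trivially simple markup.

```python
import re

line = "Name: David Mertz; ISBN: 0321112547"

# Chapter 2 territory: plain string methods.
fields = [part.strip() for part in line.split(";")]

# Chapter 3 territory: a regular expression pulling out the ISBN digits.
match = re.search(r"ISBN:\s*(\d+)", line)
isbn = match.group(1) if match else None

# Chapter 4 territory: a two-state machine that drops anything between < and >.
def strip_tags(text):
    """Return text with <...> spans removed (no handling of comments, quotes or entities)."""
    out = []
    in_tag = False
    for ch in text:
        if in_tag:
            if ch == ">":
                in_tag = False
        elif ch == "<":
            in_tag = True
        else:
            out.append(ch)
    return "".join(out)

print(fields)
print(isbn)
print(strip_tags("<p>Hello, <b>world</b>!</p>"))
```

For real HTML or XML a proper parser is the safer choice; the loop above is only meant to show the idea of tracking state while scanning text.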

Chapter 5 is on Internet tools and techniques, and this is a good example of how text processing touches every important area of computer programming. We manipulate text for email, newsgroups, CGI programs, HTML and many other aspects of net programming. A good summary of XML programming is included, as well as useful synopses of other Python internet modules, from a text processing point of view.

Appendix A is the aforementioned selective and short review of Python basics. Appendix B is a ten page Data Compression primer that is quite educational. Appendix C offers the same good service for Unicode, and Appendix D covers the author's own software, a state machine for adding markup to text, which is backed up by his extensive web site that has a lot of free software to support those doing extensive text processing. Lastly, Appendix E is a Glossary for technical terms from the book. This is very much an educational book, and would be suitable for classroom work at the University level, beyond the introductory programming level; in fact, as part of a curriculum to teach programming using Python at the University level, this would be an excellent text for the second course.

One of the highlights of the book is that each chapter is concluded with a problem and discussion section. These are of the highest quality I have encountered in computer texts. Rather than overwhelming the reader with a large number of problems, the author has obviously given a lifetime of thought in coming up with a few key problems that are meant to stimulate thought, creativity, and ultimately understanding and growth in the reader. I will be coming back to the problems often, as they cannot be absorbed quickly anyway; they require thought. These would be most useful in a classroom environment; but as they are accompanied by excellent discussion material, and backed up by the author's web site, the individual reader will be well served also.

The book is more than the sum of its parts. It will be a most useful reference source for when I am doing various text related tasks for some time to come, and it was also a delightful and educational quick read in the here and now. It also amply illustrates the centrality of text processing in all areas of computer science, and I am confident that the book will be useful and educational for all programmers, whatever their area of expertise.

To sum it all up, this book is educational. It is also beautifully bound and printed, and excellently written. I rate it five stars, my highest rating, and heartily recommend its purchase.


You can purchase Text Processing in Python from bn.com. Slashdot welcomes readers' book reviews -- to see your own review here, read the book review guidelines, then visit the submission page.

This discussion has been archived. No new comments can be posted.

  • Great Intro (Score:3, Interesting)

    by GoofyBoy ( 44399 ) on Monday July 07, 2003 @11:03AM (#6383334) Journal

    Exactly who wouldn't benefit from reading this book?
    • by Motherfucking Shit ( 636021 ) on Monday July 07, 2003 @11:12AM (#6383407) Journal
      Exactly who wouldn't benefit from reading this book?
      Dr. David Mertz, probably... :)
      • by Lulu of the Lotus-Ea ( 3441 ) <mertz@gnosis.cx> on Monday July 07, 2003 @12:46PM (#6384014) Homepage
        Actually, although this remark lacks modesty, I wrote the book for myself, in a way. That is, whenever I want to remind MYSELF of a particular method in an odd little module I only use occasionally, I turn to my own explication of it. It reminds me of what I found the most important aspect when I investigated that particular feature during writing. So I benefit from having a copy too (or usually the e-copy that you can find on my website).

        Btw. I also have some author copies that I'd like to sell to US buyers who can pay by check. Basically, I get the most money if you do it that way. If that's not convenient, please buy it some other place... but if you want to drop me an email, so much the better.

        David Mertz
        http://gnosis.cx/TPiP/
    • Re:Great Intro (Score:3, Informative)

      Novice coders. You should either have some background in Python or have the fundamentals that allow you to treat languages as tools rather than being a "language X programmer."
    • I read that intro about five times to figure out what he was saying. Basically, if you want to learn Python, you will benefit from this book.

      or....

      This book is good. (Python is implied)

      There you go, I distilled the whole intro into four words.

      Or even better yet: Good book.
  • The book in full (Score:5, Informative)

    by TheRoss ( 28211 ) * on Monday July 07, 2003 @11:03AM (#6383337) Homepage
    is here [gnosis.cx], as a series of text files. This is official.
  • "If you have read an introductory book or two about Python programming, but you are far from being an expert, then you will benefit a lot from reading this book. If you are a competent programmer in any other language, you will benefit from this book. If you are an expert Python programmer, you will also benefit from this book."

    If you are a practitioner of voodoo and merely handle large pythons, you will benefit from this book.

    If you are an undersea explorer but have heard of pythons....

    --

    Was it the sheep climbing onto the altar, or the cattle lowing to be slain,
    or the Son of God hanging dead and bloodied on a cross that told me this was a world condemned, but loved and bought with blood.
  • by rkz ( 667993 ) on Monday July 07, 2003 @11:09AM (#6383376) Homepage Journal
    This one is a great addition to the book shelf. I know how to do certain things in Python by using the docs, but this book clarifies nicely why you are actually doing it and provides better language-specific ways of doing things that might not occur to you. Also, it introduces nice Python concepts in a clear and easy way which scripters might not have come across before.
  • Another... (Score:4, Informative)

    by Pinguu ( 677142 ) on Monday July 07, 2003 @11:09AM (#6383381)
    good book [amazon.com]
    • Re:Another... (Score:5, Informative)

      by Mister Furious ( 413397 ) <ben@@@someguysserver...com> on Monday July 07, 2003 @11:49AM (#6383647) Homepage
      yeah, this is a good book. also it's released under the GNU Free Documentation License and is available to download in various formats here [greenteapress.com].
      • Woe to XHTML (Score:3, Interesting)

        by fm6 ( 162816 )
        The GTP site naturally links to the Open Books Project [ibiblio.org] site. Here things get sort of depressing. The HTML includes a reference to the XHTML DTD at w3.org. If you try to open this page with Internet Explorer, it tries to download and parse the DTD, with unfortunate results:

        Parameter entity must be defined before it is used. Error processing resource 'http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd'. Line 85, Position 2

        IE behaves correctly if you give it an out-of-band indication that this is HTML (such as co

    • YES! This is the book that got me started programming. Highly recommended for anyone who's a total newbie to programming!
  • by SpaceRook ( 630389 ) on Monday July 07, 2003 @11:12AM (#6383408)
    Maybe it would be useful to review some BAD books. First, it would steer people away from them. Second, it would provide good examples of where a lot of tech writing goes wrong. Finally, it's just fun to read someone bash the sh!t out of something.
    • Maybe it would be useful to review some BAD books. First, it would steer people away from them. Second, it would provide good examples of where a lot of tech writing goes wrong. Finally, it's just fun to read someone bash the sh!t out of something.

      Why are you so focused on negativity? With the nightly news pushing out stories left and right about what's wrong with the world, can't we at least keep our Slashdot book reviews a good positive example of what's right with the world?

      Speaking of positive review
      • Negativity can be fun. It's like having the same thing for dinner. Variety is good.
      • Always with the negativity.
        Woof, woof, woof.
      • by Jerf ( 17166 ) on Monday July 07, 2003 @12:59PM (#6384140) Journal
        Why are you so focused on negativity? With the nightly news pushing out stories left and right about what's wrong with the world, can't we at least keep our Slashdot book reviews a good positive example of what's right with the world?

        For a given reviewer, you need both positive and negative reviews so you can get a feel for what the reviewer is looking for, and how closely it matches what you are looking for. In something as subjective as books or video games, this is critical. This allows you to align your views with the reviewer.

        In this environment, where a different reviewer is reviewing each time, it's much less useful. Reviews are really only useful in the context of knowing something about the reviewer. (I just thought of this, and after I post this I intend to shut off reviews from my Slashdot feed, since they are uniformly useless to anybody seriously looking to use them due to this overwhelming flaw in the process.)

        In fact, the bad reviews are typically far more informative than the good ones. Most good reviews can be boiled down to "It's great!" with little loss of content, where the bad reviews have actual criticisms of the reviewed product. What you do then is read the criticisms and see if you might agree with them. If you're reading a video game review (which I use because it has great examples), and it says "Game X has far too many little numbers to keep track of for your characters", and you're old-skool and you like fiddly little numbers, then the negative review may actually boost your opinion. A lot of what appears in reviews is that sort of opinion; relatively little is concerned with universal things like "I couldn't get this game to run stably for more than 5 minutes on any of the four computers I tried it on here."

        For a book review, such negative comments really go a long way towards clarifying what the book is. "This book didn't give any examples on how to process XML" tells you more about the book's focus than "This book is great for anyone who programs and uses text!".

        The point of "The Power of Positive Thinking", IIRC, wasn't to be unremittingly positive in every way; that's actually counterproductive and can take you out of touch with the real world. In fact, IIRC, it can best be summarized as "Don't be negative; that's bad." ;-)
    • They do review bad books, but they say they are good anyway. A couple years ago, I bought a book based on a good slashdot review and the book really sucked.
    • Yes, in fact I didn't really recommend The Linux Problem Solver in my review [slashdot.org] from a while ago. (The formatting is not my fault.)

      Also, feel free to submit your own reviews [slashdot.org] of BAD books.
    • by hding ( 309275 ) on Monday July 07, 2003 @11:50AM (#6383656)
      Actually I think it's considerably less useful to review a bad book. Why? There are many, many times more books written than I will read. Therefore a bad review is most likely to warn me away from a book that I wasn't going to read anyway. And chances are (given the limited number of reviews) that no review will appear of a bad book that I planned to read.

      However, a good review may point me to a useful or interesting book that I would have otherwise overlooked.

      The obvious exception to this is when one can give a bad review to a book that is expected to have a very wide readership (and thus can warn many people away from a bad book), but how many technical books fall into this category?

    • by Anonymous Coward
      You need to learn the Slashdot Book Rating System.

      Anything above a "9" is a good book.

      A "9" is an average book. Read it only if you are particularly interested in the subject.

      Anything below a "9" is a bad book. Avoid like the plague.
    • I recommend never buying a book that is for Idiots, Dummies, or Stupid. IMO these books suck and leave their readers little smarter for having read them.

      I have seen several series of Learn Visually books and I think they are much better in most cases. That's what I will usually give newbies to learn from.
  • benefits (Score:5, Funny)

    by tmark ( 230091 ) on Monday July 07, 2003 @11:14AM (#6383424)
    If you have read an introductory book or two about Python programming, but you are far from being an expert, then you will benefit a lot from reading this book. If you are a competent programmer in any other language, you will benefit from this book. If you are an expert Python programmer, you will also benefit from this book

    And if you're the website posting this glowing review, and collecting affiliate fees, you will also benefit from this book.
    • I'd like you to know that I am *not* an affiliate of any company's, and you can not link to Amazon or anywhere else from my site giving me a commission. I do it for love, or fun, or whatnot, but the 35 or so book reviews on my site and the rest of my site, do not earn any money anyway. www.awaretek.com/plf.html
      • You are not an affiliate. Slashdot is. I'll give them this, slashdot wears the conflict of interest on its sleeve, as they've stated since they began doing reviews how most reviews were going to be glowing because of the affiliation.

        One might imagine that a little integrity would spur more buying of books that were well-reviewed, because the review would mean something, but apparently for now it's worth just getting mentioned on slashdot.

        Slashdot used you.
  • by ACK!! ( 10229 ) on Monday July 07, 2003 @11:17AM (#6383448) Journal
    I have not really used the language much but I have used a few programs like Redhat config tools that are python driven.

    What do Slashdotters use python for?

    What are its strengths and its weaknesses?

    Why is it worth learning another programming language?

    Just being curious and all that.

    • What do Slashdotters use python for?

      "Fire & Forget" scripts, i.e. quickly fixing entries in databases as one-offs. System monitors, checking that the computers on our network are OK. As a calculator ;) and as a tool to decode base64-encoded text. (I want to know that htaccess username & password ;)

      What are its strengths and its weaknesses?

      Quick to code something VERY powerful, but slow to execute.

      Why is it worth learning another programming language?

      It's not; you have already learned Python, it's just that you don't know you have!

      Just being curious and all that.
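
The base64 use mentioned in this comment can be sketched with the standard library's `base64` module. This is a minimal sketch for current Python 3 (not the 2003-era interpreter); the credentials `user:secret` are made up for illustration and encoded in the snippet itself, in the `name:password` form that HTTP Basic authentication uses.

```python
import base64

# Hypothetical credentials, encoded the way HTTP Basic authentication sends them.
encoded = base64.b64encode(b"user:secret").decode("ascii")

# Decoding reverses the transformation; base64 is an encoding, not encryption.
decoded = base64.b64decode(encoded).decode("utf-8")
username, password = decoded.split(":", 1)
print(encoded, "->", username, password)
```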
    • First, a disclaimer: I haven't used Python for about a year and a half, and so may be out of touch with the most recent developments in the language. I am writing the following NOT to bash Python or to invite flames, merely to explain what I feel to be weaknesses of Python. If someone can counter them rationally, please do so.

      That said, I learned, wrote in, and loved Python for a few months. However, the whole whitespace issue eventually drove me away from Python; some people like it, I didn't.

      Second,
      • However, the whole whitespace issue eventually drove me away from Python; some people like it, I didn't.

        I think you are the first person I have ever heard to hold this POV. Most people I see seem to hate the whitespace at first, and then grow to love it.

        I disliked how you had to explicitly pass "this" as a parameter to each method.

        You don't. You have to explicitly indicate "this" (or "self" in Python) as an argument in the method definition, but you don't pass it as an argument -- Python passes
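
The point about `self` can be sketched as follows. The class `Greeter` and its method are hypothetical names chosen for illustration, and the snippet assumes current Python 3.

```python
class Greeter:
    def __init__(self, name):
        self.name = name

    # "self" appears in the method definition...
    def greet(self, greeting):
        return f"{greeting}, {self.name}!"

g = Greeter("Slashdot")

# ...but is not written at the call site; Python supplies the instance.
implicit = g.greet("Hello")

# The equivalent explicit form, calling through the class.
explicit = Greeter.greet(g, "Hello")
print(implicit, explicit)
```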
      • This happened a couple years ago. This is no longer a reason to prefer Perl.

        I haven't succumbed to Ruby for the same reason most Java-heads haven't succumbed to Python yet. I am not a Java-head because I like my programming languages free as in liberty.

    • by tuffy ( 10202 ) on Monday July 07, 2003 @11:34AM (#6383553) Homepage Journal
      What do Slashdotters use python for?

      I use it for data management, system administration chores and CGI programming.

      What are its strengths

      Python has a nice clean syntax that tends to re-use language constructs, which makes it easy to learn and read. It makes good use of objects and exceptions and it has a solid standard library of goodies. And, it has no shortage of additional modules to use. Plus, the whole of it is highly malleable.

      and its weaknesses?

      It's not the fastest language out there, some don't like its whitespace-based syntax and it doesn't have the breadth of pre-built modules as older languages like Perl have.

      Why is it worth learning another programming language?

      It is if you have problems to solve and don't particularly care for the tools you're using now.

      • by 4of12 ( 97621 ) on Monday July 07, 2003 @01:17PM (#6384293) Homepage Journal

        it doesn't have the breadth of pre-built modules as older languages like Perl have.

        Maybe not quite as many modules as Perl, but the standard Python library provides interfaces for a lot of different tasks. It's not skimpy [python.org], in case any of you potential Python users was worried.

        There's good reason the motto is "Batteries Included".

        I've found Python useful for all kinds of tasks and love the clean, short syntax devoid of punctuation characters.

        If you need more of a recognized authority to recommend how great and wonderful is Python, then listen to Bruce Eckel [mindview.net] or Eric Raymond [linuxjournal.com].

    • by Metrol ( 147060 ) on Monday July 07, 2003 @12:13PM (#6383812) Homepage
      I've recently started going through O'Reilly's "Learning Python" here myself. I'd spent a healthy bit of time trying to get C++ functionally working in my head, but I just couldn't get it. For someone who wants to code the logic and leave the nit picky stuff to someone else, Python seems to be a better approach.

      Mostly what got me going was an article in Linux Journal recently concerning wxWindows. Just the notion that I could code up a GUI application that is truly cross platform with Python and this windowing kit has got me focused on learning this language. I'm also rather interested in the fact that Python also binds in with KDE's API, as that's my preferred desktop.

      That is what all got me going. What I'm finding interesting as I learn this language is how it approaches various problems. Python is an interpreted language, but upon running a program the program is compiled into bytecode like with Java, except that the compile process is automatic. You can manually compile beforehand as well. Read a blurb in there about being able to convert a standing Python program to C, which then in turn can be compiled into a full executable. Haven't even begun to play with any of this stuff yet, but it is interesting.
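
The automatic bytecode step described above can be observed from within Python. The following is a minimal sketch for current Python 3 using the built-in `compile()` and the standard `dis` module; the source string and the `<example>` filename are arbitrary. For compiling files ahead of time, the standard library also offers `py_compile` and `compileall`.

```python
import dis

source = "total = sum(n * n for n in range(4))"

# compile() turns source text into a code object holding bytecode.
code = compile(source, "<example>", "exec")

# Executing the code object runs that bytecode in the given namespace.
namespace = {}
exec(code, namespace)
print(namespace["total"])

# dis prints the bytecode instructions; the exact listing varies by Python version.
dis.dis(code)
```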

      I'm personally impressed with the OOP approach that Python takes. I mostly code in PHP these days, and will most likely continue to do so for web stuff. Still, I never did much care for PHP's approach to OOP. C++'s approach just up and lost me. Python's approach seems to make a lot more sense, and even at this early stage of learning it I can see how I would utilize it in the kinds of stuff I'm looking to write.

      It has a module system similar to Perl's, and there's a LOT of them. Pretty much all the stuff I'm looking to do has some kind of module in play to help me along. I've only coded a little bit of Perl, but every time I did I really didn't care for the language. Too many esoteric symbols in place of where commands should be in play for my taste.

      I know that in every Slashdot thread concerning Python there needs to be at least one person bitching about code indenting as a part of the syntax. I personally love this. I imagine that anyone who has had to follow up behind someone who didn't indent code might just appreciate this. Python's indenting schema is pretty much exactly what I've been doing now in PHP and JavaScript for years now anyway. My eyes are still tuned in to looking for that closing brace that isn't there, but my brain is slowly starting to come around.

      At this early stage, about the only thing I'm finding a little confusing is how variables are handled. This is neither good or bad at this point, just that there's enough concepts I hadn't really dealt with before that there's a learning curve I haven't yet gotten through. From what I can tell, there's an odd mix of C++ style variables that act more like pointers than the scalars that I'm used to working with in PHP.

      This far into it, I'm still having fun going through this beginner's book. Been playing around a bit with the wxPython tutorials, and getting lost in BoaConstructor. I'm still of the opinion that my time being invested here is being well spent. Seems like a pretty cool approach to getting an application slapped together.
      • <<<At this early stage, about the only thing I'm finding a little confusing is how variables are handled. This is neither good or bad at this point, just that there's enough concepts I hadn't really dealt with before that there's a learning curve I haven't yet gotten through. From what I can tell, there's an odd mix of C++ style variables that act more like pointers than the scalars that I'm used to working with in PHP.>>>

        Python doesn't have variables in the traditional sense, only refere
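
The behaviour being discussed here, names acting as references to objects, can be sketched as follows. This is a minimal illustration in current Python 3 with arbitrary variable names.

```python
a = [1, 2, 3]
b = a            # b is a second name for the same list object
b.append(4)      # the change is visible through both names
same_object = a is b

c = list(a)      # an explicit copy is a separate object
c.append(5)      # a is unaffected

x = 10
y = x
y += 1           # ints are immutable: y is rebound, x is unchanged
print(a, c, x, y, same_object)
```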
    • by DeadVulcan ( 182139 ) <dead@vulcan.pobox@com> on Monday July 07, 2003 @12:22PM (#6383872)

      The type of object that an identifier points to cannot be declared; it's established at run-time. This is either a strength or a weakness depending on your philosophical leanings.

      It's a strength in that it makes prototyping very fast. If you want some function to operate on a class that it wasn't originally intended to operate on, then you just have to make the new class interface-compatible and jam it in there. No worrying about subclassing or prototypes or anything.

      It's a weakness for maintenance, because, when you're debugging this function, all you know is something has been passed in, and you're calling GetValue() on it. And cripes, you've got fifty six classes that have a GetValue() method! Which one is it getting? You have to run the program to find out.
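
Both sides of this trade-off can be sketched in a few lines. The classes `Sensor` and `ConfigEntry` and the function `report` are hypothetical; only the `GetValue()` method name comes from the comment above.

```python
class Sensor:
    def GetValue(self):
        return 21.5

class ConfigEntry:
    # An unrelated class; it only shares the method name.
    def GetValue(self):
        return "verbose"

def report(source):
    # No declared type: anything with a GetValue() method is accepted.
    return f"value={source.GetValue()}"

results = [report(Sensor()), report(ConfigEntry())]
print(results)

# The flip side: a wrong argument is only detected when the call actually runs.
try:
    report(42)
except AttributeError as exc:
    error = str(exc)
    print("runtime failure:", error)
```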

      If you're doing scripting, then dynamic typing can be a godsend. If you're doing larger scale development, it can be a pain in the butt, because all of your developers need to be very disciplined.

      In general, Python is almost too powerful for its own good. If you have any undisciplined or "cowboy" programmers on your team, Python gives them enough rope to hang themselves... and everyone else... and their managers.

      But I love it. Treat it with respect, and Python will work wonders for you.

      • Has anyone ever done a study to find out if the time saved by not debugging dynamic type problems is greater than the time wasted by developers worrying about compiler typing rules? In my experience, dynamic type issues are somewhat rare in a language like Python, but when programming in a language like C++ it seems that a large fraction of your time can be consumed with trying to get the compiler happy with your type declarations. (Or structuring your code in an unnatural way to match someone else's type decl
        • Studies that compare programming languages are hard to find (not to mention hard to do!). The best example I know is Lutz Prechelt [ipd.uka.de], who did a comparison [ipd.uka.de] of C,C++,Java,Perl,Python, Rexx and TCL for one particular text-processing application. He tried to measure different things like productivity, number of bugs, memory consumption, speed, etc.
    • I use python for everything more complex than a couple lines of shell. ~/proj/assorted_hacks/ contains stuff like a parser for libpcap dumps of AIM sessions, a script that pulls quotes out of a quotefile, a script which (using a module I wrote to parse a certain flavor of XML document) pretty-prints my bookmark URLs... I've also written a converter from the contact list format of my IM client of choice to '.blt', and at work I've written a substantial amount of CGI and some moderately tricky security-relat
    • GUI programming!! (Score:5, Interesting)

      by Balinares ( 316703 ) on Monday July 07, 2003 @12:27PM (#6383898)
      Python and Qt are the killer combo. I once coded during a break, just for fun (and as an example for the management, alright), a complex widget that took our head VB programmer *three days*. Only the Python/Qt widget was dynamically resizable (the VB one wasn't) and could hold any subwidget (the VB one could only hold buttons).

      Now I use Python for a variety of tasks ranging from things just a little too complex to be cleanly done in Perl, to large things that usually belong in Java's sphere but are much faster coded in Python. But GUI programming is an area where it particularly shines.
    • by Qbertino ( 265505 ) <moiraNO@SPAMmodparlor.com> on Monday July 07, 2003 @12:53PM (#6384083)
      What do Slashdotters use python for?

      Software Agents / Content Syndication 'bots
      Web/Internet Application Server (Zope)
      3D (me: Blender, ILM for Maya and others)

      I've used Python on various things, one of the more ambitious being, well, actually Text Processing :-). In the wider sense of the term, that is. A Software Agent for scanning and retrieving certain information from different Inet Sources - a very serial process that's hard to 'objectivise'. Python did/does a great job at keeping things overseeable.

      Zope is the other area I use Python in. Zope I consider the most sophisticated Application Server available. It's GPLd of course :-) (www.zope.org)

      Just as with me Python is very popular within the 3D Field. ILM use it as their prime scripting language and I like Blenders built in Python controlled/based realtime engine.

      What are its strengths and its weaknesses?
      Surely its tab-based delimiting of blocks ('whitespace syntax') is a big feature. I can be sure to be able to read *any* code from anybody who did it in Python instantly. Think of how teamwork improves (especially in extreme programming) when bad indentation means your code is broken!
      Python is completely GPLd, which means a lot to me and to the overall future-safety of a PL. That's why I don't feel so good about Java (although I like it too in a way)
      Python is very easy to learn. "Perl is executable line noise, Python is executable pseudocode" actually sums that one up.
      The only *weakness* that comes to mind is that it's a younger language. But it's catching up rapidly in terms of breadth and width of the 'lib' availability - also due to Python being completely GPLd!

      Why is it worth learning another programming language?
      It's actually one of the most modern and sophisticated. It realizes what developers theorized as ideal some 20 years ago.
      The obligatory famous quote:
      |||We will perhaps eventually be writing only small modules which are identified by name as they are used to build larger ones, so that devices like indentation, rather than delimiters, might become feasible for expressing local structure in the source language
      - Donald E. Knuth,1974|||

      Oh, and, yet again, it's GPLd all the way through. Want a better PL? Use Python.
      • The only *weakness* that comes to mind is that it's a younger language.

        This is the second time someone mentions this. Perl is from the end of 1987, Python from the end of 1989. Not enough difference to be a concern now, I'd say.

        I think Perl mostly seems older as it rode the wave of CGI scripting, and became popular as the CGI language. So a lot of people heard of it before they heard of Python, and it seems that Perl is a lot older. But really they're from the same period.

        • You're right about 1987 for Perl's origin, but the earliest origin for Python seems to be 1991 [python.org], and it doesn't look like it actually went anywhere until 1995. In any case, age is really a proxy for the amount of development that's occurred on a language, and it still seems that Perl is way ahead on modules.
          • I was going on this link [python.org], that I should have read more carefully before posting that date. Coding started late '89, it was used internally during '90, the first release was in early '91.

            But of course, there's nothing that comes close to CPAN anywhere else. It's also beautiful to see a CPAN install, automatically grabbing lots of modules and installing everything correctly :-). There are only a few Linux distributions that can do the same sort of thing, I think.

      • ...also due to Python being completely GPLd!

        While I generally agree with your post, you gave this incorrect information several times. Python is not licensed under the GPL. It uses its own unique license [python.org] that is more similar to the LGPL or BSD than the GPL.

    • I've barely begun to investigate Python, but this article [linuxjournal.com] is the one that convinced me it was worth a look. That, and anecdotes from half a dozen acquaintances who said essentially the same thing Eric Raymond did...
    • What do Slashdotters use python for?

      For the same things I would use Perl, Java, or Matlab for: CGI scripts, system administration, text processing, numerical processing, GUI application development, and many other applications.

      What are its strengths and its weaknesses?

      Strengths: easy to learn, easy to read, much better error checking than Perl, much more concise than Java, lots of libraries, lots of GUI toolkits (Gtk+, wxWindows, Qt, Tk, FLTK, others).

      Weaknesses: library modules are somewhat haphazar
    • So far I've used python for:
      - an X 4.3 cursor editor (http://iki.fi/psavo/pub/code/xcurs)
      - a thumbnailer (shortish script, ~100 LOC)
      - a file/data sorter, image classifier.

      Python was _very_ much worth learning; it's the best language I've used to date (unless you count some Scheme weirdness).
      Its strength is that you can implement stuff very quickly, and most things can be run and tested separately.
    • What do Slashdotters use python for?

      All the code I wrote for my thesis (text analysis, decision trees, genetic algorithms). On the job, scripting - backup scripts, hacking up the output from a database designer program into input xml files for Torque (Java data objects library). At home, a lot of little scripts, most notably the thing that drives mpg123 for me.

      What are its strengths and its weaknesses?

      Strengths: It's fun to code. It can be extremely powerful in a few lines and still stay readable. T

    • What are its strengths and its weaknesses?

      Its greatest strength is that it is incredibly quick to write, ESR wasn't exaggerating when he said that within a few hours of first learning the language he was writing fully working production-ready code. It is that easy to use. Also, Python tends to be easy to read, but really sloppy coders can still write crap Python code.

      Its greatest weakness is that it is incredibly quick to write. It is really really easy to code yourself into a hole without even reali

  • for text processing? Does it have the same libraries? I know it is less complicated, or that is what I hear....
    • Careful. You're going to get an "emacs vs vi" debate going.

      Perl and Python have different coding philosophies. Some find Perl's flexibility better. Others find Python's rigidity better. Use what you like.

      As far as core language support and add-on modules, they both cover similar areas.

    • You can go back to your Python program in 6 months and still understand it.
      • "Insightful". Huh.

        I just picked up a largish complex perl program I wrote two years ago, and even though I'd written very little perl since (I wrote python and java) and it was largely uncommented, I still understood all of it just fine.

        Granted, perl gives people all kinds of ways to be "clever", so reading someone else's code can be a nightmare. Given the ease of operator overloading and creating metaclasses in python, it's quite possible to create python code that looks perfectly readable on the surfa
  • Python Jobs (Score:5, Interesting)

    by Line_Fault ( 247536 ) on Monday July 07, 2003 @11:26AM (#6383494) Homepage
    Strangely enough, there seem to be a lot of jobs, at least where I am, where the only major language requirement is Python.
    I'm not sure if this is maintaining legacy apps, but it certainly scared me!
    • So where are you? :-)
    • Python is the Lord (Score:5, Informative)

      by ultrabot ( 200914 ) on Monday July 07, 2003 @11:56AM (#6383694)
      I'm not sure if this is maintaining legacy apps, but it certainly scared me!

      Python jobs are hardly for legacy app maintenance. More like rapid development of cutting edge stuff, prototyping, exploring, enterprise application integration... and Agile development in general. I introduced Python to my previous workplace, and after the guys there learned it, they didn't switch back (even though their chief python advocate/fascist, i.e. yours truly, left :-).

      Python can be used for very large problems (hundreds of modules, and many more classes), in addition to trivial scripts (0 functions). It is *fun* as hell. A Python programmer is always an architect; there is very little monkey-level "grunt work", which tends to form most of your day-to-day C++/Java programming.

      You really have no clue about OOP before you have tried one of the dynamic OOP languages: Python, Smalltalk, or Ruby. Smalltalk has fallen to a legacy role these days, while Ruby is much less mature and has a smaller community than Python. Additionally, Ruby is less "tasteful", in that it borrows more heavily from perl, but that is a matter of controversy ;-).

      Additionally, Python is an embodiment of Open Source, because the code is actually readable and concise enough to lower the barrier of reading it. In fact I have taken a look at the source code of several Open Source projects that use Python "just for kicks", while I hardly bother in case of e.g. C programs. One line of Python is equivalent of 10-20 lines of C++, so you can digest more with the typical geek attention span (i.e. borderline ADD ;-).
      • by Tack ( 4642 )
        Additionally, Python is an embodiment of Open Source, because the code is actually readable and concise enough to lower the barrier of reading it.

        At the risk of being redundant, I have to emphatically agree with this. A few years ago I started a project that required me to wrap a C library as a python module. (The project was ORBit-Python.) Having done a lot in perlXS before that, I was quite prepared to struggle with the Python/C API.

        But it wound up being truly a breath of fresh air. There are a few

      • You forgot Objective-C, it is a "dynamic" OOP language.
      • You really have no clue about OOP before you have tried one of the dynamic OOP languages: Python, Smalltalk, or Ruby. Smalltalk has fallen to a legacy role these days, while Ruby is much less mature and has a smaller community than Python. Additionally, Ruby is less "tasteful", in that it borrows more heavily from perl, but that is a matter of controversy ;-).

        I am not sure what you mean by Ruby being "less mature"; as a language in this niche, it appears (to me) to be among the most mature. It does h

  • to simplify (Score:2, Funny)

    by MasTRE ( 588396 )
    "If you have read an introductory book or two about Python programming, but you are far from being an expert, then you will benefit a lot from reading this book. If you are a competent programmer in any other language, you will benefit from this book. If you are an expert Python programmer, you will also benefit from this book."
    = No matter what, you will benefit from this book.

    Do I hear a "best thing since sliced bread" coming?
  • by jdavidb ( 449077 ) on Monday July 07, 2003 @11:55AM (#6383689) Homepage Journal

    A good programmer can write Perl in any language. :)

    (Just kidding. ;) )

  • I've been curious about learning Python for awhile now. But, seriously, what is the great advantage of using Python vs. C++? All I really even know about it is that it is object oriented, just like C++, but that you have to be very particular about your whitespace.

    Not sure how significant one could take this to be, but over at meetup.com, [meetup.com] the C/C++ group [meetup.com] looks to be a dying breed while relatively many are flocking to the Python meetings. [meetup.com] Oh well. At least the D&D meeting [meetup.com] is still going strong. ;)
    • Here's what I like (Score:5, Interesting)

      by truthsearch ( 249536 ) on Monday July 07, 2003 @12:53PM (#6384075) Homepage Journal
      I just started learning Python a few weeks ago, with my background being C++, Java, and Visual Basic. As a side note I have to point out that Python is an absolutely fantastic option for someone wanting to switch from VB to something more modern, useful, and platform independent.

      These are the benefits of Python (mostly over C++) I personally like:
      - It's a very forgiving language; i.e. you don't need to be overly concerned about string lengths or list bounds, no pointers and simple garbage collection
      - List notations built into the syntax are extremely handy for referring to portions of the list and making changes; far less code needed for working with lists
      - The OO parts are sufficient without being complex; everything is public; multiple inheritance
      - Modules are compiled as needed and compiled version is used when available, so it's pretty quick
      - Lots of runtime information easily available
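      To make the list-notation and runtime-information points concrete, here is a minimal sketch. The list contents and the Greeter class are made up purely for illustration. One caveat on "list bounds": slices are clipped silently, but indexing a single item past the end still raises IndexError.

```python
# Slice notation: read and rewrite portions of a list without explicit loops.
words = ["alpha", "beta", "gamma", "delta", "epsilon"]

first_two = words[:2]        # the first two items
last_two = words[-2:]        # the last two items
every_other = words[::2]     # every second item

# Slice assignment replaces a stretch of the list in place.
words[1:3] = ["B", "C", "C2"]

# Out-of-range slices are clipped rather than raising an error.
beyond = words[10:20]


class Greeter:
    """A hypothetical class, used only to demonstrate runtime introspection."""

    def __init__(self, name):
        self.name = name

    def greet(self):
        return "Hello, " + self.name


g = Greeter("world")
type_name = type(g).__name__       # class name, looked up at runtime
has_greet = hasattr(g, "greet")    # does the attribute exist?
instance_attrs = vars(g)           # instance attributes as a dict
public_names = [n for n in dir(g) if not n.startswith("_")]
```

      Everything here is plain built-in behaviour; nothing needs to be imported.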
    • by k8to ( 9046 )
      The primary difference between Python and C++ is quite simple. C++ is a low-productivity language. By comparison, Python is a very high-productivity language.

      By this I mean that per line, or per time, you're getting far more done in Python. Your programs are accomplished much more quickly, and you can move on to the next job.

      Like many high-productivity languages, Python is a nicer choice than a language like C++ except for where it's inappropriate to be used at all. Some examples include: an unusually
      • I agree completely.

        In the end, if you find that some particular part of your Python code is limiting your performance, then code it up in C or C++ and make it available as a Python object.

        Then, you've obtained the best of both worlds: fast development and the ability to quickly test and prototype in Python, combined with the sheer speed of C exactly where and when it's needed (at the end, because [DEK] premature optimization is the root of all evil).
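        A rough sketch of the "measure first" step, using the standard library's cProfile and pstats modules to see where the time goes before deciding what, if anything, deserves a C rewrite. The two functions below are hypothetical stand-ins for real application code, and no claim is made here about which of them is faster.

```python
import cProfile
import io
import pstats


def concat_loop(n):
    """Hypothetical workload: build a string by repeated concatenation."""
    out = ""
    for i in range(n):
        out += str(i)
    return out


def join_digits(n):
    """Hypothetical workload: build the same string with str.join."""
    return "".join(str(i) for i in range(n))


def run():
    concat_loop(20000)
    join_digits(20000)


# Collect timing data for everything called from run().
profiler = cProfile.Profile()
profiler.enable()
run()
profiler.disable()

# Format the ten most expensive entries, sorted by cumulative time.
buffer = io.StringIO()
stats = pstats.Stats(profiler, stream=buffer)
stats.sort_stats("cumulative").print_stats(10)
report = buffer.getvalue()
print(report)
```

        Only the functions that dominate such a report are candidates for recoding in C; the rest can stay in Python.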

      • > If nothing else, Python is excellent for prototyping C/C++ applications.

        That's what I've been learning it for as well these last few weeks. Public methods, typeless, combined declaration/definition, all things that speed up prototyping (provided you keep in mind things that might get awkward in the final language). Another thing I've found trivial to prototype are web services; due to Python's dynamic nature, you can call remote methods as if they were real Python methods and everything gets marshalle
    • With the high speed of current computers, coding speed is more important than running/execution speed (unless you are programming a real-time data gathering device).
      Even if the program is slow, you could leave it running overnight; that costs less than an average programmer's hourly fee.
      5-10 years ago, I would have said: learn C. But now Python has enough advantages to be considered a "serious" language.
      If you are fine with C, keep with it, but I think trying Python won't hurt.
  • Hello? Is anybody there? Can the reviewer be bothered to say anything at all about the actual subject of the book?

    "Text processing" could mean ANYTHING AT ALL. Consider the humble Turing machine...

    • Well, but that is pretty much the point. The book, from what I've seen from the free version on the website, is pretty general, in that it covers using Python to process text. Yes, pretty much any text. Yes, pretty much anything you want to do with text -- parse it and extract meaning, generate it, manipulate it, you-name-it. Whether it be marked-up, semantically meaningful text, or just a blob of text.
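      For a flavour of what "parse it, extract meaning, generate it, manipulate it" can look like in practice, here is a small sketch using only the standard library (re and collections.Counter). The sample text and the replacement word are invented for illustration, and this is not taken from the book.

```python
import re
from collections import Counter

text = """Python is executable pseudocode.
Perl is executable line noise.
Python is fun."""

# Parse: split the blob into lowercase word tokens.
tokens = re.findall(r"[a-z]+", text.lower())

# Extract: count how often each word occurs.
counts = Counter(tokens)

# Manipulate: swap one word for another, matching whole words only.
rewritten = re.sub(r"\bPerl\b", "Awk", text)

# Generate: build a small report from the most frequent words.
top = counts.most_common(2)
report = "\n".join(f"{word}: {n}" for word, n in top)
print(report)
```

      The same four steps apply whether the input is a log file, marked-up text, or just a blob of text; only the patterns change.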
  • by dpbsmith ( 263124 ) on Monday July 07, 2003 @12:23PM (#6383879) Homepage
    On taking a lightning-quick skim of the text at gnosis [gnosis.cx], I still don't quite get the point.

    SNOBOL was a mind-opener for me, because it really had a radically different approach to text processing. And it was genuinely useful. I haven't used it recently enough to know how I would feel about it today.

    Many languages now are more convenient for text processing than, say, C++ with STL or MFC. The traditional BASICs at least recognize strings as good citizens and make it easy to do the fundamental operations. MUMPS improves on BASIC incrementally, as do PERL, Java, Javascript, etc., mostly to the degree that their standard libraries provide a useful suite of string functions. More and more languages have a Regex feature (e.g. REALBasic) and this is a really nice thing to have.

    So, I just read the review, and, as I say, took a lightning-quick browse through the online text of the book, and neither of them bothers to tell me how Python fits in.

    Both of them seem to assume from the beginning that I have already decided that Python is the language I want to use.

    Is there anything about Python that renders it especially appropriate for text processing? With regard to text processing, is it in a different category altogether from Java/Javascript/PERL/MUMPS/REALbasic?

    Or is it just a good language with string primitives and a decent string library?

    • With regard to text processing, is it in a different category altogether from Java/Javascript/PERL/MUMPS/REALbasic?

      For meaningless and arbitrary text (text with no syntax or semantics, text with a very primitive syntax and still no semantics, or text treated as an arbitrary set of strings regardless of any syntax or semantics), none of the imperative languages is good.

      If you want to work with text as with meaningful set of information, where both syntax and semantic should be taken to consideration and pr

    • No, Python is not particularly good for text processing. Python is very much a general-purpose language, and there's no specific task for which Python was designed.

      Text processing is, after all, only the start of things. Eating and spitting out text gets kind of boring pretty quick (see Awk or XSLT). More often you'll want to do something with that text. You'll process it then present it, email it, perform actions based on it, etc.

      That said, Python is quite good for text processing. For instance

  • by Qbertino ( 265505 ) <moiraNO@SPAMmodparlor.com> on Monday July 07, 2003 @12:25PM (#6383884)
    ...Python is executable Pseudocode.

    I have had a stack of Perl books for something like 3 years and haven't gotten around to studying them thoroughly.
    Now that I've done some stuff in Python, I actually think I never will. Everything that Perl can do, Python can do better by now. Unless you're used to the Unix CLI and syntax quirks, Python will get you farther in a shorter period of time - and you'll be able to read your code a year from now.
    Although the annual Perl obfuscation contest actually can be somewhat funny. :-)
    • Actually, your situation sounds very similar to mine. I was/am a very experienced C programmer, a pretty good shell script programmer, and with a decent knowledge of stuff like awk, sed, etc.

      Somehow I'd never got round to learning Perl, even though I thought I'd love it, as it seemed to combine the best parts of what I already knew in one single language, which many people raved about.

      Anyway, I bought a couple of the O'Reilly Perl books, and immediately started to think "Whoa, this is just too much functi
    • The "executable line noise" criticism has gotten to be a standard knee-jerk reaction, and as such it has lost all meaning.

      Perl has built-in syntax for various common tasks, such as regular expression matching and common file operations (Does this file exist? What is the size of this file?). This drives the purists crazy. But if you think about it, putting the syntax directly into the language has some benefits. You can check if a file exists with a single operator. In Python, you have to remember the na
      • But if you think about it, putting the syntax directly into the language has some benefits. You can check if a file exists with a single operator. In Python, you have to remember the name of the function *and* which module it is located in, then you have to import that module. This adds up to a lot of extra mental noise.

        Which is small price to pay for readable code (and that's assuming that -f and -d, etc is easier to remember than os.path.isfile() and os.path.isdir().) I can't believe that people still

      • Just on the topic of file processing, the path [jorendorff.com] module for Python is really cool. I'd like to see it become a part of the standard library, actually. I think it makes Python code much more on-par with Perl for that task (and I fully admit that Python's os.path functions are not very pretty).
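        For comparison, a minimal sketch of the file checks discussed in this subthread, done first with the os.path functions and then with pathlib, a standard-library module (added in later Python versions, since 3.4) that offers a similar object-style interface. It is not the third-party path module linked above. The temporary directory and the file name notes.txt are just scaffolding for the example.

```python
import os.path
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    file_path = os.path.join(tmp, "notes.txt")
    with open(file_path, "w") as fh:
        fh.write("hello\n")

    # os.path functions: roughly what Perl spells -f, -d and -s.
    is_file = os.path.isfile(file_path)
    is_dir = os.path.isdir(tmp)
    size = os.path.getsize(file_path)

    # pathlib: the same checks as methods on a path object.
    p = Path(tmp) / "notes.txt"
    p_is_file = p.is_file()
    p_parent_is_dir = p.parent.is_dir()
    p_size = p.stat().st_size

    # A path that does not exist simply tests False.
    missing_exists = os.path.exists(os.path.join(tmp, "no-such-file"))
```

        Either way the module has to be imported first, which is the trade-off being argued about here.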
    • Python is probably the better language. However, Perl can be written in a nice and easy-to-understand way. What it really has is the killer app - well, actually the library. Lots of examples of Perl code (both good and bad) and a lot of useful stuff.

      This book, though, is good and shows one of Python's strengths. It is just a pity that Python doesn't have such a library.

  • "Dr. Mertz is more interested in facilitating our learning process ..."

    What the hell does that mean?

  • Does anyone know what solutions exist for quick/dirty embedding of Python inside HTML, à la embPerl?
