Book Review: Version Control With Git, 2nd Edition

kfogel writes "Two thumbs up, and maybe a tentacle too, on Version Control with Git, 2nd Edition by Jon Loeliger and Matthew McCullough. If you are a working programmer who wants to learn more about Git, particularly a programmer familiar with a Unix-based development environment, then this is the book for you, hands down (tentacles down too, please)." Read below for the rest of Karl's review.
Version Control with Git, 2nd Edition
Author: Jon Loeliger, Matthew McCullough
Pages: 456
Publisher: O'Reilly Media
Rating: Very good.
Reviewer: Karl Fogel
ISBN: 978-1-4493-1638-9
Summary: Using the Git version control system for collaborative programming.

There's a catch. You have to read the book straight through, from front to back. If you try to skip around, or just read the parts you feel you need, you'll probably be frustrated, because — exaggerating, but only slightly — every part of the book is linked to every other part. Perhaps if you're already expert in Git and merely want a quick reminder about something, it would work, but in that case you're more likely to do a web search anyway. For the rest of us, taking the medicine straight and for the full course is the only way. To some degree, this may have been forced on the authors by Git's inherent complexity and the interdependency of its basic concepts, but it does make this book unusual among technical guides. A common first use case, cloning a repository from somewhere else, isn't even covered until Chapter 12, because understanding what cloning really means requires so much background.

Like most readers, I'm an everyday user of Git but not at all an expert. Even this everyday use is enough to make me appreciate the scale of the task faced by the authors. On more than one occasion, frustrated by some idiosyncrasy, I've cursed that Git is a terrific engine surrounded by a cloud of bad decisions. The authors might not put it quite so strongly, but they clearly recognize Git's inconsistencies (the footnote on p. 47 is one vicarious acknowledgment) and they gamely enter the ring anyway. As with wrestling a bear, the question is not "Did they win?" but "How long did they last?"

For the most part, they more than hold their own. You can sometimes sense their struggle over how to present the information, and one of the book's weaknesses is a tendency to fall too quickly into implementation-driven presentation after a basic concept has been introduced. The explanation of cloning on p. 197 is one example: the jump from the basics to Git-specific terminology and repository details is abrupt, and forces the reader to either mentally cache terms and references in hope of later resolution, or to go back and look up a technical detail that was introduced many pages ago and is suddenly relevant again[1]. On the other hand, it is one of the virtues of the book that these checks can almost always be cashed: the authors accumulate unusual amounts of presentational debt as they go (in some cases unnecessarily), but if you're willing to maintain the ledger in your head, it all gets repaid in the end. Your questions will generally be answered[2], just not in the order nor at the time you had them. This isn't a book you can read for relaxation; give it your whole mind and you shall receive enlightenment in due proportion.

The book begins with a few relatively light chapters on the history of Git and on basic installation and local usage, all of which are good, but in a sense its real start is Chapters 4-6, which cover basic concepts, the Git "index" (staging area), and commits. These chapters, especially Chapter 4, are essentially a design overview of Git, and they go deep enough that you could probably re-implement much of Git based just on them. It requires a leap of faith to believe that all this material will be needed throughout the rest of the book, but it will, and you shouldn't move on until you feel secure with everything there.
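
As a rough illustration of the level of detail those chapters reach, here is a minimal Python sketch of how Git derives the ID of a blob (file content) object, assuming the default SHA-1 object format. The function name `git_blob_id` is made up for this example; it is not taken from the book or from Git's source.

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """Return the SHA-1 object ID Git would assign to a blob with this content.

    Hypothetical helper for illustration; assumes a SHA-1 repository.
    """
    # Git hashes a short header ("blob <size in bytes>" plus a NUL byte)
    # followed by the raw file content.
    header = b"blob " + str(len(content)).encode("ascii") + b"\x00"
    return hashlib.sha1(header + content).hexdigest()


if __name__ == "__main__":
    # Identical content always yields the same ID, whatever the file is called.
    print(git_blob_id(b"hello world\n"))
```

Because the ID depends only on the content, two files with the same bytes share one stored object, which is one way to read the book's remark that Git tracks content rather than file names.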

From that point on, the book is at its best, giving in-depth explanations of well-bounded areas of Git's functionality. The chapter on git diff tells you everything you need to know, starting with an excellent overview and then presenting the details in a well-thought-out order, including an especially good annotated running example starting on p. 112. Similarly, the branching and merging chapters ensure that you will come out understanding how branches are central to Git and how to handle them, and the explanations build well on earlier material about Git's internal structure, how commit objects are stored, etc. (Somewhere around p. 227 my eyes finally glazed over in the material about manipulating tracking branches: I thought "if I ever need this, I know where to find it". Everyone will probably have that reaction at various points in the book, and the authors seem to have segregated some material with that in mind.) The chapter-level discussions on how to use Git with Subversion repositories, on the git stash command, on using GitHub, and especially on different strategies for assembling multi-source projects using Git, are all well done and don't shirk on examples nor on technical detail. Given the huge topic space the authors had to choose from, their prioritizations are intelligently made and obviously reflective of long experience using Git.

Another strength is the well-placed tips throughout the book. These are sometimes indented and marked with the (oddly ominous, or is that just me?) O'Reilly paw print tip graphic, and sometimes given in-line. Somehow the tips always seem to land right where you're most likely to be thinking "I wish there were a way to do X"; again, this must be due to the authors' experience using Git in the real world, and readers who use Git on a daily basis will appreciate it. The explanation of --assume-unchanged on p. 382 appeared almost telepathically just as I was about to ask how to do that, for example. Furthermore, everything they saved for the "Advanced Manipulations" and "Tips, Tricks, and Techniques" chapters is likely to be useful at some point. Even if you don't remember the details of every tip, you'll remember that it was there, and know to go looking for it later when you need it (so it might be good to get an electronic copy of the book).

If there's a serious complaint to be made, it's that with a bit more attention the mental burden on the reader could have been reduced in many places. To pick a random example, in the "Branches" chapter on p. 90, the term "topic branch" is defined for the first time, but it was already used in passing on p. 68 (with what seems to be an assumption that the reader already knew the term) and again on pp. 80-81 (this time compounding the confusion with an example branch named "topic"). There are many similar instances of avoidable presentational debt; usually they are only distractions rather than genuine impediments to understanding, but they make the book more work than it needs to be. There are also sometimes ambiguous or not-quite-precise-enough statements that will cause the alert reader — which is the only kind this book really serves — to pause and have to work out what the authors must have meant (a couple of examples: "Git does not track file or directory names" on p. 34, or the business about patch line counts at the top of p. 359). Again, these can usually be resolved quickly, or ignored, without damage to overall understanding, but things would go a little bit more smoothly had they been worded differently.

Starting around p. 244 is a philosophical section that I found less satisfying than the technical material. It makes sense to discuss the distinction between committing and publishing, the idea that there are multiple valid histories, and the idea that the "central" repository is purely a social construct. But at some point the discussion starts to veer into being a different book, one about patterns for using Git to manage multi-developer projects and about software development generally, before eventually veering back. Such material could be helpful, but then it might have been better to offer a shallower overview of more patterns, rather than a tentative dive into the "Maintainer/Developer" pattern, which is privileged here beyond its actual prominence in software development. (This is perhaps a consequence of the flagship Git project, the Linux kernel, happening to use that pattern — but Linux is unusual in many ways, not just that one.)

The discussion of forking and of the term "fork", first from p. 259 and reiterated from p. 392, is confusing in several ways. It first uses the term as though it has no historical baggage, then later takes that historical baggage for granted, then finally describes the baggage but misunderstands it by failing to distinguish clearly between a social fork (a group of developers trying to persuade users and other developers to abandon one version and join another), which is a major event, and a feature fork (that is, a branch that happens to be in another repository), which is a non-event and which is all that sites like GitHub mean by forking. The two concepts are very different; to conflate them just because the word "fork" is now used for both is thinking with words, and doesn't help the reader understand what's going on. I raise this example in particular because I was surprised that the authors who had written so eloquently about the significance of social conventions elsewhere would give such an unsatisfactory explanation of this one.

Somewhat surprisingly, the authors don't review or even mention the many sources of online help about Git, such as the #git IRC channel at Freenode, the user discussion groups, wikis, etc. While most users can probably find those things quickly with a web search, it would have been good to point out their existence and maybe make some recommendations. Also, the book only covers installation of Git on GNU/Linux and MS Windows systems, with no explicit instructions for Mac OS X, the *BSD family, etc (however, the authors acknowledge this and rightly point out that the differences among Unix variants are not likely to be a showstopper for anyone).

But this is all carping. The book's weaknesses are minor, its strengths major. Any book on so complicated a topic is bound to cause disagreements about presentation strategy and even about philosophical questions. The authors write well, they must have done cubic parsecs of command testing to make sure their examples were correct, they respect the reader enough to dive deeply into technical details when the details are called for, and they take care to describe the practical scenarios in which a given feature is most likely to be useful. Its occasional organizational issues notwithstanding, this book is exactly what is needed by the everyday Git user who wants to know more — and is willing to put in the effort required to get there. I will be using my copy for a long time.

Footnotes

[1] One of my favorite instances of this happened with the term "fast-forward". It was introduced on p. 140, discussed a little but with no mention of a "safety check", then not used again until page 202, which says: "If present, the plus sign indicates that the normal fast-forward safety check will not be performed during the transfer." If your memory is as bad as mine, you might at that point have felt like you were suddenly reading the owner's manual for an early digital wristwatch circa 1976.

[2] Though not absolutely always: one of the few completely dangling references in the book is to "smudge/clean filters" on p. 294. At first I thought it must be a general computer science term that I didn't know, but it appears to be Git-specific terminology. Happy Googling.

[3] (This is relegated to a floating footnote because it's probably not relevant to most readers.) The book discusses other version control systems a bit, for historical perspective, and is not as factually careful about them as it is about Git. I've been a developer on both CVS and Subversion, so the various incorrect assertions, especially about Subversion, jumped out at me (pp. 2-3, p. 120, pp. 319-320). Again, this shouldn't matter for the intended audience. Don't come to this book to learn about Subversion; definitely come to it to learn about Git.

[4] As long as we're having floating footnotes, here's a footnote about a footnote: on p. 337, why not just say "Voltaire"?

[5] Finally, I categorically deny accusations that I gave a positive review solely because at least one of the authors is a fellow Emacs fanatic (p. 359, footnote). But it didn't hurt.
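
For readers who, like me in footnote [1], had lost track of what the "fast-forward safety check" is: the idea is that a branch update is accepted only if the branch's old tip is an ancestor of the new tip, so no existing commits are dropped. The following is a minimal Python sketch of that idea over a toy history. The names `EXAMPLE_HISTORY`, `is_ancestor`, and `update_allowed` are hypothetical and chosen for this example; this is not how Git itself is implemented.

```python
from typing import Dict, List

# Hypothetical toy history: each commit name maps to its list of parents.
# A <- B <- C is one line of development; D branches off from A.
EXAMPLE_HISTORY: Dict[str, List[str]] = {
    "A": [],
    "B": ["A"],
    "C": ["B"],
    "D": ["A"],
}


def is_ancestor(history: Dict[str, List[str]], ancestor: str, descendant: str) -> bool:
    """True if `ancestor` can be reached from `descendant` by following parents."""
    stack = [descendant]
    seen = set()
    while stack:
        commit = stack.pop()
        if commit == ancestor:
            return True
        if commit in seen:
            continue
        seen.add(commit)
        stack.extend(history.get(commit, []))
    return False


def update_allowed(history: Dict[str, List[str]], old_tip: str, new_tip: str,
                   force: bool = False) -> bool:
    """Accept a fast-forward update; `force` models the leading plus sign."""
    return force or is_ancestor(history, old_tip, new_tip)


if __name__ == "__main__":
    print(update_allowed(EXAMPLE_HISTORY, "B", "C"))              # fast-forward
    print(update_allowed(EXAMPLE_HISTORY, "C", "D"))              # diverged
    print(update_allowed(EXAMPLE_HISTORY, "C", "D", force=True))  # check skipped
```

In this toy model, moving a branch from B to C passes the check, while moving it from C to D would discard B and C and is refused unless the check is explicitly skipped.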

You can purchase Version Control with Git: Powerful tools and techniques for collaborative software development from amazon.com. Slashdot welcomes readers' book reviews -- to see your own review here, read the book review guidelines, then visit the submission page.

  • by Qbertino ( 265505 ) <moiraNO@SPAMmodparlor.com> on Monday November 26, 2012 @03:06PM (#42097147)

    First Post!

  • by jthill ( 303417 ) on Monday November 26, 2012 @03:14PM (#42097225)

    cloning a repository from somewhere else, isn't even covered until Chapter 12, because understanding what cloning really means requires so much background

    That's ... that's ... just ... what?

    Cloning is part of the brutally simple (and amazingly flexible) guts of git. Given Linus's hatred of C++ I think what git has become is deliciously ironic, but the basics could not be easier to understand.

    • OT, but since you seem to be familiar with git:

      Do you know if it has any sort of mechanism for holding submits until approved? We're currently using Subversion and the general consensus on that one is giving everyone their own dev branch, then having the overseer handle merging, which is tolerable for some small number of developers.

      • Assuming you're not referring to git-flow, I GTFY: https://github.com/Nextdoor/git-change [github.com]
      • Re: (Score:2, Informative)

        by Anonymous Coward

        Sounds like you may want gerrit [google.com].

        If you can't tell from the doc, check out this video [bandlem.com].

      • by jthill ( 303417 ) on Monday November 26, 2012 @03:35PM (#42097465)

        You have complete control over what goes into your repository. Off the top of my head, I'd have a repository for submitters to push to and another one only you have write access to, into which you can pull anything you want and which you can all agree to call the "official" repository of approved commits.

        Nowhere is it written developer branches must be pushed anywhere. What goes on in their own repos is really nobody else's business, because see above: nobody has to take or keep anything they don't want. Do take the time to learn how the object db, blobs and trees and commits, is structured, and the thumbs into that db. If you're like me, once you see it you'll have the itch to dink with the innards with shell commands just because it's so ridiculously easy. Build from there until you understand the low-level commands, and after that you'll see all the rest as merely conveniences for common tasks, which is what they are. For my money, any other route to understanding git is like trying to learn math without understanding equality.

      • What you should be doing is giving each substantial unit of change a branch.

        Personal branches become fiefdoms. Unless the developer concerned is disciplined, you are in effect creating a bunch of forks.

        Make a branch for each feature, bugfix, etc. It helps that the most painful thing about SVN, merging, is so much easier in Git.

        • Why do you call merging painful in svn?

          The only thing I think painful about it is all of the freaking URLs you have to use all over the place. Then again, I'm comparing it to the mostly easier cvs... and from a currently non-git user point of view, you go from rev #s in cvs to big URLs in svn to completely incomprehensible long hex #s in git. But at least I seem to have fewer merge conflicts with svn than I do with cvs.

          • It used to be extremely painful merging in SVN. They say it's changed recently.
            • Recently as in when? I've been using svn for at LEAST a few years, and it's been the same as long as I've been using it.

              svn merge -c NUM URL
              is what I use the vast majority of the time.

              I still don't know what you mean by "painful", unless it means merge conflicts... and as I said in my original reply, at least anecdotally, I saw merge conflicts far less often in svn than cvs... and I say this as someone who prefers cvs.

              • Well I have outright rage over CVS. Primarily because of its slowness in large projects....

                It is possible the reason you don't have problems with SVN merge and I did was because you use the command line and I use tortoise. I'm not the only one who thinks it's painful, though [assembla.com].

                Other source control systems parse the code to detect methods that have moved around. This makes merging even easier, since if all you did was move a method, or add some whitespace, it will notice that it's not really a conflict. Th
              • You haven't been using it long enough so. There was a point in time where svn supported using merge but failed to record any information about what had been merged to where. So if you repeated the operation, you got a giant clusterphuck of a result. Same thing applied to subsequent merges: since there was no information stored in subversion, you had to know beforehand which revisions were previously merged.

                The consequence of this was that many teams ended up having to create custom tools to perform the mer
      • Re: (Score:3, Informative)

        by Anonymous Coward

        http://en.wikipedia.org/wiki/Gerrit_(software)

        Gerrit is a code-review tool for git -- you submit your changes to gerrit, it holds them until they are code reviewed and approved, then they get passed into the central repo. You can configure it so that one person (or a small subset) is a required gatekeeper for passing things to the central repo, or so that it requires some number of "yes" votes from reviewers to be approved.

        It gives a very nice web front end for diffs and comments when doing code reviews, t

      • https://help.github.com/articles/using-pull-requests [github.com] works well when you let everyone have their own "fork" but afaik wouldn't work for a single shared repository.

        You can also do it the same way you do with subversion though in which everyone has commit/push access to the same repo and just create a lot of branches.

        And obviously you can mix the two methods just fine.

      • by hackula ( 2596247 ) on Monday November 26, 2012 @04:59PM (#42098829)
        Yeah, don't pull them into your branch! It takes a bit to get used to, but your brain gets knocked straight eventually ;)

        Seriously though, ENORMOUS projects with thousands of submitters are using Git effectively. Most of the larger projects use a "lieutenant" based system where the head of the project has several trusted sources that he/she pulls from. The lieutenants are able to divide up the pull requests and each test/integrate a portion.

      • by fatphil ( 181876 )
        Just give every developer their own *repo*, which only they have commit rights to. It can be centrally stored on a honking great central server with a reference repo if you have loads of developers, and you're worried about space. (Of course, they can do their day-to-day work on their own repos on their own desktops/laptops if they'd rather work with a local copy, distributed vcs's are designed to be flexible that way.)

        Each coherent change (one feature, one bugfix, one update) should be on a branch in their
    • Given Linus's hatred of C++ I think what git has become is deliciously ironic,

      Huh? How is that ironic? Linus is a very good C programmer who has some very ignorant and very silly ideas about C++. He probably could have written an excellent VCS in asm. Wouldn't stop it being a bloody stupid idea, though.

      • by jthill ( 303417 )

        Actually I think he was dead on target about the need to keep the hordes away from critical code.

        The irony I see is in how complicated people make git seem by not religiously focusing on its core, only using the higher-level abstractions and conveniences to add lasting value, when really do eliminate endless petty little error-prone tasks now or in the future -- say with a straight face that the need for rerere is any less abstruse or rare than for template name binding or partial specialization, I dare

  • Did a certain editor watch a little hentai this weekend?

  • Read Pro Git first (Score:3, Interesting)

    by flimm ( 1626043 ) on Monday November 26, 2012 @03:21PM (#42097309)
    I am reading it now, and I would probably be having a much harder time if I hadn't read Pro Git first (available online gratis). I'm thinking specifically of the concept of branches and HEAD being pointers to commits. I do appreciate the thoroughness of this book, though.
  • git flow? (Score:4, Interesting)

    by vlm ( 69642 ) on Monday November 26, 2012 @03:24PM (#42097353)

    Does the book discuss my favorite workflow automation, git-flow? Other than in the obvious sense, like fundamentally "git flow feature finish 2012-11-26-whatever" basically just runs 3 to 6 git commands on the 2012-11-26-whatever branch and the develop branch although I can't be bothered to remember which commands exactly?

  • by Deus.1.01 ( 946808 ) on Monday November 26, 2012 @03:26PM (#42097367) Journal

    I have my own simple and versatile way of keeping track of branches.

    *Project
    *Copy of project
    *Copy of Copy of project
    *Copy of Copy of copy of project

    • by vlm ( 69642 )

      That file naming convention is too predictable. Around here we use names like:
      winter-project
      nov-15-project
      project-2.0
      11.05.12-project
      project-somebodys-name

      Which one is the most recent? Oh look at the last modified date for the file, of course.

      • I grew up in the projects..I know. :)
      • It's funnier when you walk into an environment with something like tfs in place and you see that kind of naming convention in use anyways...

        Speaking from exp.

        And then somebody makes an argument that all devs should be paid equal and the servers start smoking :)
      • That file naming convention is too predictable.

        True, and so is the version control idea I stole:

        age file

        does "mv file file-; cp -p file- file" so you end up with files named "file" and "file-". Run age again and you get file, file-, and file--.

        Which one is the most recent? In this example, it's always the file named "file".

        Simple but very effective. However, I'm moving into the 21st century with git.
  • That is possibly the best review of a technical book I have ever read. Thank you!
  • Coles Notes version please.

  • If you try to skip around, or just read the parts you feel you need

    That's ironic, because those are the features that make git so awesome.

  • by GodfatherofSoul ( 174979 ) on Monday November 26, 2012 @04:11PM (#42098047)

    They've changed their licensing to make Perforce free for small development teams and added some kind of GIT interface. My guess is so many developers are coming out of college having used GIT that it's building user lock-in.

    • by kfogel ( 1041 )

      It is not possible to compete with Git on its own terms without being open source. Being zero-charge for small teams is not going to cut it, IMHO.

      • Totally disagree with this. While I've never done any admin work with our Perforce server, we've never needed to rewrite the code. If you're working with a less featureful product, there might be a need for the code base. Perforce is enterprise-quality software.

    • It's fine for me.

      Git's source code is open and there's no DRM on top of the data storage format.

      So any lock-in is completely voluntary on behalf of the user.

  • by HyperQuantum ( 1032422 ) on Monday November 26, 2012 @04:35PM (#42098429) Homepage

    IMHO, git is a shining example of bad design. You need too much info on how it works on the inside, to be able to use it. It is simply way too complicated. I regret the fact that it seems to be the most popular VCS for open-source projects. I'd prefer something simpler like bzr.

    • Unbeliever! (Score:3, Insightful)

      by Anonymous Coward

      You will be burned at the stake for the heretical belief that Git may not be the most appropriate tool for *every* VCS need!

      FWIW, I've used them all and prefer bzr (or svn, depending). Not every project is the Linux kernel, which needs to allow thousands of people to collaborate in an independent/distributed fashion... Git is great for *that*, but then again so is bzr.

      What?!
      No! Wait! It's not what it looks like, I swear!
      AAARGH! THE FLAMES! THEY BUUURRRNNN!

    • by Old Wolf ( 56093 )

      I have no idea how git works on the inside, and make great use of it. It is such a massive time-saver coming from CVS.

      • Re: (Score:2, Insightful)

        by Anonymous Coward

        This all may be but imagine this. I work in a project where delivery time is fixed. We have a fixed time line of 6 months (now 8 due to delays). I imagine we wasted at least 3 weeks because people did not know how to deal with git and with commit,merge,pull,push i.e. basic functionalities. They knew other tools still prevalent in the rest of the company so there was no real need to switch except one engineer having a say and deciding for himself and the rest too. Now he is hardly using it and the rest never

        • by lattyware ( 934246 ) <gareth@lattyware.co.uk> on Monday November 26, 2012 @07:34PM (#42100375) Homepage Journal
          Seriously, if it takes anyone on your team 3 weeks to learn how to use GIT at a basic level, you need to find new people.
        • This all may be but imagine this. I work in a project where delivery time is fixed. We have a fixed time line of 6 months (now 8 due to delays). I imagine we wasted at least 3 weeks because people did not know how to deal with git and with commit,merge,pull,push i.e. basic functionalities. They knew other tools still prevalent in the rest of the company so there was no real need to switch except one engineer having a say and deciding for himself and the rest too. Now he is hardly using it and the rest never works from home so we wasted time to learn it only because one person liked the tool. In other words - git is possibly the biggest of all version control systems but because its concepts are so different from the others it means that switching to it should be carefully considered - are benefits evaluated against the incumbent ones.

          What version control system were you using beforehand?

          It seems odd that you'd all decide to switch to a tool without a plan for it, especially since you can quite easily use Git alongside other version control systems (just maintain a branch that you use when you commit back to the other tool).

          Half the power of Git is that you have a fairly gentle migration path.

        • by devent ( 1627873 )

          No wonder you are posting it as AC, I would be ashamed myself to work with such incompetent people.

          I mean, WTF is so complicated with git add, git commit, git branch, git checkout, git merge, git push, git pull and git mergetool? Yes, that is only 8 commands that you need to use git for production.
          If you are using an IDE like Eclipse, the IDE can manage git add, git commit, git branch, git checkout just fine.

    • Re: (Score:1, Insightful)

      by Anonymous Coward

      +1 for you. Git's internals may be nice, but its usability (or lack thereof) is epic fail

    • by devent ( 1627873 )

      That is slashdot: a troll gets +5 insightful.
      If only you would bother to elaborate on what about git is "complicated" and how that compares to bzr.

      I started with Subversion and that was a pain. Then I switched to git and I didn't know anything about it but could use it in 5 minutes. After some 3 years of productive usage, and my own repositories*, I still only know the high-level commands and have never had any trouble.

      * https://www.anr-institute.com/gitpublic/ [anr-institute.com]

    • by JigJag ( 2046772 )

      Personally (and professionally), I use Fossil [fossil-scm.org]. It has much of the strength of Git while remaining approachable. Bonus, the built-in cgi server/bug tracking/wiki/timeline tree. All this in a single binary less than 1 Mb.

    • I was using git long before I knew any nitty gritty details. Here's a useful site for beginners: Everyday GIT With 20 Commands Or So [kernel.org]

    • IMHO, git is a shining example of bad design. You need too much info on how it works on the inside, to be able to use it. It is simply way too complicated. I regret the fact that it seems to be the most popular VCS for open-source projects. I'd prefer something simpler like bzr.

      Git is very often explained wrong. Especially for those brain-damaged by the use of CVS or SVN. (I myself was/am too). And yes, 'brain-damaged' is a quite fitting term in this case. Think switching from Basic to OOP Java. That's the

  • by Old Wolf ( 56093 ) on Monday November 26, 2012 @05:05PM (#42098917)

    The reviewer has clearly read the book he is writing a review for.. what is the world coming to

  • by Anonymous Coward

    I find it hard to reconcile the high rating given to the book with the actual review. It seems like a very long list of very fundamental flaws with the book concluded with "so it's great!".

    Of course, people who like git are perhaps the kind of people who like reading an overly complex and confused book as some kind of puzzle.

  • Discount code CYBERDAY gets you 50% off at O'Reilly's shop [oreilly.com] until November 26, 2012 at 11:59pm PT!
  • even less from alternate vendors or as ebook
    Why is it so hard for the editor or submitter to include the street price on book reviews?
    Doesn't seem a bad price for an almost essential developer's reference.
    you'd think amazon marketdroids would be all over including a hyperlink to a /. special kickback price
  • # On branch master
    # Your branch is ahead of 'origin/master' by commits.
    #
    nothing to commit (working directory clean)

    Every single fucking one. And it's probably the most searched phrase relating to git, with the most questions and the most complete lack of fucking helpful answers. If the precious fucking git maintainers could come up with a more informative error message and a simple default action for the most common case - syncing local/remote branches then they'd probably reduce google traffic by at least 10

  • Version control was a solved problem 20+ years ago ... if you used VMS, that is ... Unfortunately VMS was never cool, for some reason I can't fathom.
  • Odd typo (or transcript-o) -- the word "privatizations" should be "prioritizations" in this sentence:

    "Given the huge topic space the authors had to choose from, their privatizations [should be 'prioritizations'] are intelligently made and obviously reflective of long experience using Git."

    I don't know how that happened. It was "prioritizations" in my submission at http://slashdot.org/submission/2368485/book-review-version-control-with-git-2nd-edition [slashdot.org] .

    -Karl Fogel
