Deploying With Git

I’ve been banging my head on this for a while. It really did take me a year and reading lots of things to begin to understand that I was totally wrong. As with many things, I have to sit down and use them for a while to understand what I’m doing wrong, and what I need to learn. I finally had my git breakthrough. It’s very possible (no, likely) that I got some of this wrong, but I feel like I now understand more about git and how it should be used, and that made me more confident in what I’m doing with it.

Speaking as a non-developer (hey, sometimes I am!), I just want a command line upgrade for my stuff. This code also lacks a WordPress-esque click to upgrade, so I have to do a four-step tango (download, unpack, copy, delete) in order to upgrade. [1] The more steps I have, the more apt I am to make an error. So in the interests of reducing my errors and my overhead, I wanted to find a faster and safer way to deploy. [2]

I already think that automating and simplifying deployment is good, and all I want to do is get one version, the ‘good’ version, of code and be able to easily update it. One or two lines is best. Simple, reliable, and easy to use. That’s what I want.

Recently, Ryan Hellyer pointed out git archive, which he claims is faster than clone. I’d believe it if I could get it to work. When I tried using HTTPS, I got this: fatal: Operation not supported by protocol. So I tried using ssh and got Could not resolve hostname… instead. Basically I had all these problems. Turns out GitHub turned off ‘git archive --remote’ so I’m dead in the water there for any code hosted there, which is most of my code.

I kicked various permutations of this around for a couple afternoons before finally throwing my hands up, yet again, and looking into something else, including Capistrano, which Mark Jaquith uses in WP Stack. It’s something I’m personally interested in for work related reasons. Capistrano is a Ruby app, and vulnerability fears aside, it’s not very user friendly. At my old job, we used Ant a lot to deploy, though there don’t seem to be Ant tasks yet for Git. The problem with both of those is that they require you to pull down the whole hunk o’ code and I’m trying to avoid that in this use case. Keep it simple, stupid. Adding more layers of code and complication onto a project that doesn’t need it is bad.

Finally I went back to git and re-read how the whole distributed deployment works. I know how to clone a repository, which essentially gets me ‘trunk.’ And I know that a pull does a fetch followed by a merge, in case I’d done any edits, and it saves my edits. Hence merge, and why I dig it for dev. At length it occurred to me that what I wanted was to check out the git repo without downloading the code at first. Well I know how to do that:

$ git clone --no-hardlinks --no-checkout https://github.com/wp-cli/wp-cli.git wp-cli
Cloning into 'wp-cli'...
remote: Counting objects: 10464, done.
remote: Compressing objects: 100% (3896/3896), done.
remote: Total 10464 (delta 6635), reused 10265 (delta 6471)
Receiving objects: 100% (10464/10464), 1.20 MiB | 1.04 MiB/s, done.
Resolving deltas: 100% (6635/6635), done.

That brings down just a .git folder, which is small. And from there, I know how to get a list of tags:

$ git tag -l
v0.3.0
[...]
v0.8.0
v0.9.0

And now I can check out version 0.8.0!

$ git checkout v0.8.0
Note: checking out 'v0.8.0'.

You are in 'detached HEAD' state. You can look around, make experimental
changes and commit them, and you can discard any commits you make in this
state without impacting any branches by performing another checkout.

If you want to create a new branch to retain commits you create, you may
do so (now or later) by using -b with the checkout command again. Example:

  git checkout -b new_branch_name

HEAD is now at 8acc57d... set version to 0.8.0

Well damn it, that was simple. But I don’t want to be in a detached HEAD state, as that means it’s a little weird to update. I mean, I could do it with a switch back to master, a pull, and a checkout again, but then I thought about local branches. Even though I’m never making changes to core code (ever), let’s be smart.

[Image: Linus Torvalds flipping Nvidia the bird]

One codebase I use has the master branch as their current version, which is cool. Then there’s a 1.4.5 branch where they’re working on everything new, so when a new version comes out, I can git pull and be done. [3]

One conundrum was that there are tags and branches, and people use them as they see fit. While some of the code I use defines branches so I can check out a branch, others just use tags, which are tree-ish. Thankfully, you can make your own branch off a tag, which is what I did.
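When it isn’t obvious whether a given name is a tag or a branch, a small helper can ask git directly. This is only a rough sketch: ref_kind is a made-up name, and it assumes the remote is called origin.

```shell
#!/bin/sh
# ref_kind is a hypothetical helper, not a git command.
# Assumption: the remote is named "origin".
ref_kind() {
    if git show-ref --verify --quiet "refs/tags/$1"; then
        # A tag: make a local branch off it to avoid a detached HEAD.
        echo tag
    elif git show-ref --verify --quiet "refs/remotes/origin/$1"; then
        # A remote branch: a local branch can follow it.
        echo branch
    else
        echo unknown
    fi
}
```

Run inside a clone, ref_kind v0.8.0 would print tag if that tag exists locally.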

I tried it again with a non-GitHub slice of code: MediaWiki. [4] Now we have a new issue. MediaWiki’s .git folder is 228.96 MiB… Interestingly, my MediaWiki install is about 155MiB in and of itself, and disk space is cheap. If it’s not, you’ve got the wrong host. Still, it’s a drawback and I’m not really fond of it. Running repack makes it a little smaller. Running garbage collection made it way smaller, but it’s not recommended. This, however, is recommended:

git repack -a -d --depth=1 --window=1

It doesn’t make it super small, but hey, it worked.
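To see how much the repack actually saves, the size of .git can be measured before and after. A minimal sketch, where repack_report is a made-up helper name and the numbers will differ for every repo:

```shell
#!/bin/sh
# repack_report is a hypothetical helper; run it from the top of a checkout.
repack_report() {
    # du -sk prints the size in KiB, a tab, then the path.
    before=$(du -sk .git | cut -f1)
    git repack -a -d --depth=1 --window=1 -q
    after=$(du -sk .git | cut -f1)
    echo "before: ${before} KiB, after: ${after} KiB"
}
```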

Speaking of worked, since the whole process worked twice, I decided to move one of my installs (after making a backup!) over to this new workflow. This was a little odd, but for MediaWiki it went like this:

git clone --no-hardlinks --no-checkout https://gerrit.wikimedia.org/r/p/mediawiki/core.git wiki2
mv wiki2/.git wiki/
rmdir wiki2
cd wiki
git reset --hard HEAD

That clones the repo without checking out any files, moves its .git folder into my existing install, and resets the working files to match HEAD. Now I’m ready to set up my install to use the latest tag, and this time I’m going to make a branch (mysite-1.20.3) based on the tag (1.20.3):

git checkout -b mysite-1.20.3 1.20.3

And this works great.

The drawback to pulling a specific tag is that when I want to update to a new tag (1.20.4 let’s say), I have to update everything and then checkout the new tag in order to pull down the files. Now, unlike svn, I’m not making a full copy of my base code with every branch or tag, it’s all handled by head files, so there’s no harm keeping these older versions. If I want to delete them, it’s a simple git branch -D mysite-1.20.3 call and I’m done. No code changes (save themes and .htaccess), no merging needed. And if there’s a problem, I can switch back really fast to the old version with git checkout mysite-1.20.3. The annoyance is that I just want to stay on the 1.20 branch, don’t I? Update the minors as they come, just like the WP minor-release updater only updates changed files.
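The tag-to-tag upgrade described above could be wrapped up roughly like this. It is a sketch under assumptions: switch_to_tag and the mysite- prefix are illustrative names, and the remote is assumed to be called origin.

```shell
#!/bin/sh
# switch_to_tag is a hypothetical helper; "mysite-" is just a naming habit.
# Assumption: the remote is named "origin".
switch_to_tag() {
    new_tag="$1"
    # Download new commits and tags without touching the working files.
    git fetch --tags origin || return 1
    # Create a local branch at the tag and switch to it.
    git checkout -b "mysite-$new_tag" "$new_tag"
}

# Example, run inside the install directory:
#   switch_to_tag 1.20.4
#   git branch -D mysite-1.20.3   # optional: drop the old local branch
```

Keeping the old branch around until the new version is confirmed working leaves the quick rollback path open.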

Thus, I asked myself if there was a better way and, in the case of MediaWiki, there is! In the world of doing_it_right(), MediaWiki has branches and tags [5], and they use branches called ‘REL’. If you’re not sure what branches your repo uses, type git remote show origin and it will list everything. There I see REL1_20 and since I’m using version 1.20.3 here, I surmised that I can actually do this instead:

git checkout -b mysite-REL1_20 origin/REL1_20 

This checks out my branch and says “This branch follows along with REL1_20.” so when I want to update my branch it’s two commands:

git fetch --all
git pull

The fetch downloads the changesets and the pull applies them. It looks like this in the real world (where I’m using REL1_21 since I wanted to test some functionality on the alpha version):

$ git fetch --all
Fetching origin
remote: Counting objects: 30, done
remote: Finding sources: 100% (14/14)
remote: Getting sizes: 100% (17/17)
remote: Total 14 (delta 10), reused 12 (delta 10)
Unpacking objects: 100% (14/14), done.
From https://gerrit.wikimedia.org/r/p/mediawiki/core
   61a26ee..fb1220d  REL1_21    -> origin/REL1_21
   80347b9..431bb0a  master     -> origin/master
$ git pull
Updating 61a26ee..fb1220d
Fast-forward
 includes/actions/HistoryAction.php | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

This doesn’t work on all the repos, as not everyone follows the same code practices; one repo I use only uses tags. Still, it’s enough to get me fumbling through to success in a way that doesn’t terrify me, since it’s easy to flip back and forth between versions.
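For a tags-only repo, one possible approach is to let git sort the tags as version numbers and take the top one. This is a sketch, not a tested recipe: latest_tag is a made-up name, the --sort option needs a reasonably recent git, and pre-release tags (rc, beta) may sort in surprising places.

```shell
#!/bin/sh
# latest_tag is a hypothetical helper, not a git command.
# Assumption: tags look like version numbers (v0.9.0, 1.20.3, ...).
latest_tag() {
    # -v:refname sorts as versions, newest first, so v0.10.0 beats v0.9.0.
    git tag -l --sort=-v:refname | head -n 1
}
```

After a git fetch --tags, the result could feed the branch-off-a-tag step, for example git checkout -b "mysite-$(latest_tag)" "$(latest_tag)".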

Fine. I’m sold. git’s becoming a badass. The only thing left is to protect myself with .htaccess:

 # SVN and GIT protection
 RewriteRule ^(.*/)?(\.svn|\.git)/ - [F,L]
 ErrorDocument 403 "Access Forbidden"

Now no one can look at my svn or git files.

And to figure out how to get unrelated instances of git in subfolders to all update [6]:

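#!/bin/sh
# Run git pull in every immediate subdirectory that is a git checkout.
for dir in ./*/; do
        # Skip directories that are not git repositories.
        [ -e "${dir}.git" ] || continue
        # The subshell keeps each cd from affecting the next iteration.
        (cd "$dir" && git pull)
done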

And I call that via ./git.sh which lives in /wiki/extensions and works great.

From here out, if I wanted to script things, it’s pretty trivial, since it’s a series of simple if/else checks, and I’m off to the races. I still wish every app had a WordPress-esque updater (and plugin installer, hello!) but I feel confident now that I can use git to get to where my own updates are faster.
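As a rough idea of what those if/else checks might look like, here is a sketch that only updates when the current branch follows a remote branch. The name update_install is made up, and the choice to allow only fast-forwards is an assumption about what is safe for an install with no local commits.

```shell
#!/bin/sh
# update_install is a hypothetical helper; run it inside the install directory.
update_install() {
    git fetch --all || return 1
    # '@{u}' names the upstream of the current branch, if there is one.
    if git rev-parse --abbrev-ref --symbolic-full-name '@{u}' >/dev/null 2>&1; then
        # Fast-forward only: refuse anything that would need a real merge.
        git merge --ff-only '@{u}'
    else
        # Detached HEAD or a branch made from a tag: nothing to follow.
        echo "No upstream branch here; check out a newer tag by hand." >&2
        return 2
    fi
}
```

Fetching and then fast-forwarding, rather than a bare pull, keeps the update from quietly creating a merge commit on a live site.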


Notes:

  1. By the way, more software should have one-click upgrades like that, it would make life easier for everyone. I do know that the backend support is non-trivial, so I would love to see a third-party act as a deployment hub, much like GitHub is a repository hub.
  2. My previous job was all about deployment. We had a lot of complicated scripts to take our code, compile it, compress it, move it to a staging site, and then email that it was ready. From there, we had more scripts to ‘move to test’ and ‘move to prod’ which made sense.
  3. In this moment, I kind of started to get how you should be using git. In SVN, trunk is where you develop and you check into tags (for WordPress at least) to push finished versions. In git, you make your own branch, develop there, and merge back into master when you’re ready to release. Commence head desking.
  4. MediaWiki, compared to wp-cli, is huge, and took a while to run on my laptop. It was a lot faster on my server. 419,236 objects vs 10464. I’m just saying someone needs to rethink the whole ‘Clone is faster!’ argument, since it’s slow now, or slow later, when downloading large files. Large files is large.
  5. So does WP if you looked at trac.
  6. MediaWiki lets you install extensions via git, but then you don’t have a fast/easy way to update…

Comments

  1. If you want a quicker way to grab a branch down:

    $ git clone --depth 1 --branch v0.8.0 https://github.com/wp-cli/wp-cli.git wp-cli

  2. `git pull` is probably a bad choice for deployment. Much better to `fetch` origin (or whatever your remote’s name is) and checkout/merge the branch or tag you want.

    So if you’re dealing with tags…

    $ git fetch origin
    $ git checkout the_tag_name

    Or branches

    # say you’re on branch “stable”
    $ git fetch origin
    $ git merge origin/stable

    And to figure out how to get unrelated instances of git in subfolders to all update

    Install them as submodules, which will let the “parent” repo know about them and keep those submodules synced up to a certain commit.

    $ git submodule foreach git fetch
    $ git submodule update --init --recursive

    My usual sequence for deployment (I use branches, for the most part) is something like this:

    $ git fetch origin
    $ git merge origin/stable
    $ git submodule foreach git fetch
    $ git submodule update --init --recursive

    All automated with fabric.

  3. I am not 100% clear about all the things you’re saying here — though I have been using Git a lot for the past 2 years or so.

    Anyway, you might want to look at Beanstalk, if you are deploying to client sites. It has an auto-deploy feature, so that when you push to it, it can automatically “push” to the live or staging site. (You can also make this a manual feature, so you need to be sure you want to do it first.) And the target site only needs to have FTP/SFTP.

    Anyway, it’s just an absolutely beautiful thing.
