Half-Elf on Tech

Thoughts From a Professional Lesbian

Tag: git

  • Git: Combining Your Messy Commits

    Sometimes I end up making a lot of commits while I’m working on a branch in order to get the code right. It mostly happens when I’m going back and forth between my new branch and the old (live) one to double check some code that I think got lost. It happens when changing theme structures.

    However. This left me with a conundrum. I had about 100 commits and really it was going to be the messiest pull request ever. Which I didn’t want.

    One Branch to Rule Them All

    In order to fix this, I finished up all my errant commits, as messy as they were, and then went back and checked out the clean development branch (named development). Since everything was up to date, I made a new branch from it.

    That gave me three branches:

    1. development – The actual dev branch
    2. messy-dev – My super messy branch
    3. clean-dev – My clean branch

    Of course, nothing of mine was actually in that clean branch.

    One Branch to Find Them

    Once I switched to my new branch ( git checkout clean-dev ) I imported my old branch with this: git merge --squash messy-dev

    Yep, that was it. I then went through all my regular checks, made sure the code was working, did a few more fiddly changes, and then ran git commit to finish off the squash.

    This gave me a commit message filled with … well … this:

        Merge branch 'development' into messy-dev
    
    commit 6c7534b9f7eabb5db59a85880bbf42ff2b982d84
    Author: Mika Ipstenu Epstein <ipstenu@ipstenu.org>
    Date:   Sat Sep 23 19:47:27 2017 -0700
    
        Cards again
    
    commit 9fe21f380a1befcbbe34a79937399b679c31c06f
    Merge: 0362bb5 14b4fab
    Author: Mika Ipstenu Epstein <ipstenu@ipstenu.org>
    Date:   Sun Sep 24 18:08:18 2017 -0700
    
        I hate Sara Lance so much!
    
    commit 0362bb5a289e2694bd4872137ad470091529021d
    Author: Mika Ipstenu Epstein <ipstenu@ipstenu.org>
    Date:   Sun Sep 24 17:57:33 2017 -0700
    

    One Branch to Bring Them All

    Don’t worry. I didn’t keep that. In fact, I’d been writing a log of the entire work, listing out what was changed, fixed, added, deleted, etc. So I deleted that entire commit message and pasted mine in its stead.
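
    For anyone who wants to try the whole sequence safely, here is a rough sketch in a throwaway repository. The branch names match the ones above, but the file name, commit messages, and temp directory are made up for the demo, so treat it as an illustration rather than the exact commands I ran.

```shell
#!/bin/sh
# Throwaway demo of the squash workflow. File names and messages are invented.
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/development
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "base" > theme.css
git add theme.css
git commit -q -m "Start of development"

# The messy branch: lots of small, badly named commits.
git checkout -q -b messy-dev
for n in 1 2 3; do
    echo "change $n" >> theme.css
    git commit -q -a -m "wip $n"
done

# The clean branch starts from development and takes everything in one go.
git checkout -q development
git checkout -q -b clean-dev
git merge -q --squash messy-dev

# --squash stages the combined changes without committing,
# so this is where the carefully written message goes.
git commit -q -m "Restructure theme styles"

git log --oneline clean-dev
```

    The clean branch ends up with one new commit on top of development, while messy-dev keeps its full history in case you need to dig through it later.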

    Well written inline documentation is one thing, but a good commit message saves lives. Since I planned to submit this as a pull request, I knew I had to have a good, simple, commit that listed things that had changed.

    But there also had to be more…

    And In The Pull Request Bind Them

    I’m rather pedantic about all that and wrote about 500 words to explain what all the code was in that Pull Request. Since I’m working with other people, and I’m not the lead developer on this project, I know not to commit my changes to the dev server right away.

    Instead, I made a pull request with a repeat of data in my commit, but also a different and more detailed explanation. A pull request has to explain why the work was done and why the pull is needed.

  • Git Attributes: Control Your Vendor Folders

    When you’re developing code, a lot of the time you have vendor or bower_component folders that you don’t need. That is, you don’t need them for the code, but you do for the development.

    A year ago I explained how I handle my vendor folders. Essentially, I use SVN to ignore things so they don’t get uploaded to WordPress. And that’s great but

    What about Git?

    Enter Git Attributes

    The .gitattributes file lets you define attributes for paths. What that means is you can assign an attribute to a file which will then impact how various Git operations occur.

    Operations are things like checking your code out or pushing a release (something familiar to Githubbers). Basically an operation is when Git does a ‘thing.’ Whatever that thing will be. And the Attributes file allows you to specify what happens to specific files (or folders) when that thing happens.

    How Do You Use It?

    There’s a lot more to it than this, but if your goal is to exclude vendor folders and .git files from your zips to send them off to people, then you’ll want to have your .gitattributes file look like this:

    /vendor/ export-ignore
    .gitattributes export-ignore  
    .gitignore export-ignore
    

    The result is that when someone downloads the zip of your repository from GitHub, those files are left out of it.
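
    You can check the effect locally before pushing anything, since git archive honors export-ignore as well. This is a minimal sketch in a temporary repository. The file names (plugin.php, vendor/lib.php) are placeholders I made up, and the attributes only take effect once the .gitattributes file is committed.

```shell
#!/bin/sh
# Demo: confirm that export-ignore keeps paths out of an archive.
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"

# Placeholder project files.
mkdir vendor
echo "third-party code" > vendor/lib.php
echo "my code" > plugin.php
echo "node_modules/" > .gitignore

# The same three rules as in the example above.
printf '%s\n' \
    '/vendor/ export-ignore' \
    '.gitattributes export-ignore' \
    '.gitignore export-ignore' > .gitattributes

git add .
git commit -q -m "Add plugin with vendor folder"

# List what an export of HEAD would contain.
archive_listing=$(git archive --format=tar HEAD | tar -tf -)
echo "$archive_listing"
```

    If vendor/ or either dotfile shows up in the listing, the pattern in .gitattributes is not matching the path you expected.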

    What Else Can I Do With It?

    Have a problem with line endings because one person on your dev team uses Windows? The .gitattributes file can help.
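
    As a small sketch of the line-ending case: the two rules below are one common setup, not the only one, and the file name deploy.sh is hypothetical. The text and eol attributes are standard Git attributes, and git check-attr shows which ones apply to a given path.

```shell
#!/bin/sh
# Demo: normalize line endings with .gitattributes and inspect the result.
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q

# Assumed rules: let Git detect text files, and always use LF for shell scripts.
printf '%s\n' \
    '* text=auto' \
    '*.sh text eol=lf' > .gitattributes

# check-attr reports the attributes for a path (the file does not need to exist).
sh_attrs=$(git check-attr text eol -- deploy.sh)
echo "$sh_attrs"
```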

    Need to make a change like that for everyone on a system? Or maybe you want to make sure you never include those vendor and documentation folders? The Pro Git Book says this:

    Attributes for all users on a system should be placed in the $(prefix)/etc/gitattributes file.

    Before you ask, unless you changed it, $(prefix) is nothing for pretty much everyone. You may have a /usr/local/git/etc/ location though.

  • Git Subtrees

    I have a project in Hugo where I wanted the content to be editable by anyone but the theme and config to remain mine. In this way, anyone could add an article to a new site, but only I could publish. Sounds smart, right? The basic concept would be this:

    • A private repository, on my own server, where I maintained the source code (themes etc)
    • A public repository, on GitHub or GitLab, where I maintained the content

    Taking into consideration how Hugo stores data, I had to rethink how I set up the code. By default, Hugo has two main folders for your content: content and data. Those folders are at the main (root) level of a Hugo install. This is normally fine, since I deploy by having a post-deploy hook that pushes whatever I check in on master out to a temp folder and then runs a Hugo build on it. I’m still using this deploy method because it lets me push commits without having to build locally first. Obviously there are pros and cons, but what I like is being able to edit my content and push and have it work from my iPad.

    Now, keeping this setup, in order to split my repository I need to solve a few problems.

    Contain Content Collectively

    No matter what, I need to have one and only one location for my content. Two folders are fine, but they have to live within a single parent folder. Doing this is fairly straightforward.

    In the config.toml file, I set two defines:

    contentdir = "content/posts"
    datadir = "content/data"
    

    Then I moved the files in content to content/posts and moved data to content/data. I ran a quick local test to make sure it worked and, since it did, pushed that change live. Everything was fine. Perfect.
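
    The move itself can be sketched like this in a scratch directory. The sample files are invented, and the glob assumes the posts are Markdown files sitting directly in content. In a real repository you would likely use git mv so the move is staged for you.

```shell
#!/bin/sh
# Demo: restructure a Hugo-style layout so posts and data share one parent folder.
set -eu

site=$(mktemp -d)
cd "$site"

# Stand-in for the default layout: content/ and data/ at the root.
mkdir content data
echo "title: Hello" > content/hello.md
echo "name = 'demo'" > data/site.toml

# Move the posts down a level, then tuck data in beside them.
mkdir content/posts
mv content/*.md content/posts/
mv data content/data

find content -type f | sort
```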

    Putting Posts Publicly

    The second step was making a public repository ‘somewhere.’ The question of ‘where’ was fairly simple. You have a lot of options, but for me it boils down to GitLab or GitHub. While GitHub is the flavor du jour, GitLab lets you make a private repository for free, but both require users to log in with an account to edit or make issues. Pick whichever one you want. It doesn’t matter.

    What does matter is that I set it up with two folders: posts and data

    That’s right. I’m replicating the inside of my content folder. Why? Well that’s because of the next step.

    Serving Subs Simply

    This is actually the hardest part, and led me to complain that every time I use Submodules in Git, I remember why I hate them. I really want to love Submodules. The idea is you check out a module of a specific version of another repository and now you have it. The problem is that updates are complicated. You have to update the Submodule separately and if you work with a team, and one person doesn’t, there’s a possibility you’ll end up pushing the old version of the Submodule because it’s not version controlled in your git repository.

    It gets worse if you have to solve merge conflicts. Just run away.

    On the other hand, there’s a tool called Subtree, which two of my Twitter friends introduced me to after I tweeted my Submodule complaint. Subtree uses a merge trick to get the same result as a Submodule, only it actually stores the files in the main repository, and then merges your changes back up to its own. Subtrees are not a silver bullet, but in this case it was what I needed.

    Checking out the subtree is easy enough. You tell it where you want to store the repository (a folder named content) and you give it the location of your remote, the branch name, and voila:

    $ git subtree add --prefix content git@github.com:ipstenu/hugo-content.git master --squash
    git fetch git@github.com:ipstenu/hugo-content.git master
    From gitlab.com:ipstenu/hugo-content
     * branch            master     -> FETCH_HEAD
    Added dir 'content'
    

    Since typing in the full path can get pretty annoying, it’s savvy to add the subtree as a remote:

    $ git remote add -f hugo-content git@github.com:ipstenu/hugo-content.git
    

    Which means the add command would be this:

    $ git subtree add --prefix content hugo-content master --squash
    

    Maintaining Merge Maneuverability

    Once we have all this in, we hit a new problem. The subtree is not synced by default.

    When a subproject is added, it is not automatically kept in sync with the upstream changes so you have to pull it in like this:

    $ git subtree pull --prefix content hugo-content master --squash
    

    When you have new code to add, run this:

    $ git subtree push --prefix content hugo-content master
    

    That makes the process for a new article a little extra weird but it does work.

    Documenting Data Distribution

    Here’s how I update in the real world:

    1. Edit my local copy of the content folder in the hugo-library repository
    2. Add and commit the changed content with a useful message
    3. Push the subtree
    4. Push the main repository

    Done.

    If someone else has a pull request, I would need to merge it (probably directly on GitHub) and then do the following:

    1. Pull from the subtree
    2. Push to the main repository
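
    Both routines can be wrapped in a couple of small shell functions. This is only a sketch: the function names and the DRY_RUN switch are things I made up for illustration, while the prefix, remote, and branch match the commands earlier in the post. With DRY_RUN=1 it prints the commands instead of running them, which is a handy way to see the order of operations.

```shell
#!/bin/sh
# Hypothetical helpers for the two routines. Names and DRY_RUN are invented.
set -eu

PREFIX=content
REMOTE=hugo-content
BRANCH=master

# Print the command when DRY_RUN=1, otherwise run it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# My own edits: commit, push the subtree, then push the main repository.
# (--squash is an option for subtree add, merge and pull, so it is left off the push.)
publish_content() {
    run git add "$PREFIX"
    run git commit -m "$1"
    run git subtree push --prefix "$PREFIX" "$REMOTE" "$BRANCH"
    run git push
}

# Someone else's merged pull request: pull the subtree, then push the main repository.
sync_content() {
    run git subtree pull --prefix "$PREFIX" "$REMOTE" "$BRANCH" --squash
    run git push
}

DRY_RUN=1
publish_content "Add new article"
sync_content
```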

    My weird caveat is that updating via Coda can get confused as it doesn’t always remember what repository I want to be on, but since I do all of my pushes from command line, that really doesn’t bother me much.

  • A Simpler Hugo Deploy

    I have a Hugo site that I’ve been deploying by running Hugo on the server. But this isn’t the only way about it.

    If you use Git and it’s on the same server as your site, and owned by the same user, it’s remarkably easy to do this.

    First make sure the public folder in your Hugo repository is being tracked. Yes, this can make your repository a little large but that’s not something to worry about too much. Space is cheap, or it should be. Next make a folder in tmp – I called mine library – to store the Git output in.

    The new post-update code then looks like this:

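    #!/bin/sh
    
    # Built site inside the temp checkout, and the live web root.
    # The trailing slashes tell rsync to copy the folder contents.
    SRC_DIR="$HOME/tmp/library/public/"
    DST_DIR="$HOME/public_html/library/"
    
    # Check the latest commit out into the temp work tree
    export GIT_WORK_TREE="$HOME/tmp/library/"
    git checkout -f
    
    # Mirror the site, deleting anything that is no longer in the source
    rsync -a --delete "$SRC_DIR" "$DST_DIR"
    
    exit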
    

    What this does is check out the Git repository and then copy it over. The --delete flag on the sync removes anything from the destination that isn’t in the source. Done.

    The benefit of this method is that you don’t need to install Go or Hugo on your server, and everything is pure and simple Git and copy. Rsync is a delightful way to copy everything over as well. You can delete the temp folder when you’re done, but the checkout process handles things for you. Another nice trick is you can specify what branch to check out, so if you have a special one for publishing, just use that.
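
    Here is a rough sketch of the branch trick, run against a throwaway bare repository instead of a real server. The branch name publish and all the paths are assumptions for the demo. In the hook itself, the only change would be adding the branch name to the git checkout -f line.

```shell
#!/bin/sh
# Demo: check out one specific branch from a bare repository into a work tree.
set -eu

demo=$(mktemp -d)

# A bare repository standing in for the one on the server.
git init -q --bare "$demo/library.git"

# A working copy with a default branch and a hypothetical "publish" branch.
git init -q "$demo/work"
cd "$demo/work"
git config user.name "Demo User"
git config user.email "demo@example.com"
mkdir public
echo "draft" > public/index.html
git add public
git commit -q -m "Draft build"
git checkout -q -b publish
echo "live" > public/index.html
git commit -q -a -m "Published build"
git push -q "$demo/library.git" --all

# What the hook would do: check the publish branch out into the temp folder.
mkdir -p "$demo/tmp/library"
GIT_DIR="$demo/library.git" GIT_WORK_TREE="$demo/tmp/library" \
    git checkout -q -f publish

cat "$demo/tmp/library/public/index.html"
```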

    But could this be even easier? Yes and no. You see, what I’m doing is checking out the whole thing and then copying over folders. What if I could tell Git to just check out the code in that one folder?

    There’s a thing called a ‘sparse checkout’ wherein I can tell Git “Only check out this folder.” Then all I have to do is go into that folder and check out the content I wanted. The problem there is it literally checked out the folder ‘public’ and what I wanted was the content of the public folder. Which means while it’s ‘easier’ in that I’ve only checked out the code I need, I can’t just check it out into where I want. I will always have to have a little extra move.

    To set up my folder, I did this:

    cd ~/tmp/library/
    git init
    git remote add -f origin ~/repositories/library.git
    git config core.sparsecheckout true
    echo public/ >> .git/info/sparse-checkout
    git checkout master
    

    And then my script remains the same. But! This is going to be a faster checkout since it’s only ever going to be exporting and seeing the folders it needs.
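
    To see what the sparse checkout actually leaves on disk, here is a sketch that runs the same steps against a scratch repository. The source repository and its files (themes/, README.md) are invented stand-ins for library.git.

```shell
#!/bin/sh
# Demo: sparse checkout that only materializes the public/ folder.
set -eu

demo=$(mktemp -d)

# Stand-in source repository with build output plus other files.
git init -q "$demo/source"
cd "$demo/source"
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User"
git config user.email "demo@example.com"
mkdir public themes
echo "<h1>site</h1>" > public/index.html
echo "layout" > themes/base.html
echo "notes" > README.md
git add .
git commit -q -m "Site with build output"

# The same setup steps as above, pointed at the demo repository.
mkdir -p "$demo/tmp/library"
cd "$demo/tmp/library"
git init -q
git remote add -f origin "$demo/source" >/dev/null 2>&1
git config core.sparseCheckout true
mkdir -p .git/info
echo "public/" >> .git/info/sparse-checkout
git checkout -q master

# Only the public folder should be present in the work tree.
ls
```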

  • Why I Don’t Use Git Flow Anymore

    Please don’t get me wrong. I love git-flow. I think it’s great. But it was great to teach me how to use git. It taught me not to use master for my development, and how to make branches and all that. Git Flow got me in the habit of doing good things and testing and showed me how to work with multiple projects. It was a great crutch to get comfortable with the ideas of Git that (for a long time) confounded me.

    But I don’t need it anymore. Instead, I do things very, very simply and my flow is as follows.

    $ git checkout master ; git pull

    I always start by assuming I’ve forgotten something and need to sync up. This works for me, since I run on two computers.

    $ git checkout NewProject

    Once I’m in the new project, I start making all my edits, add my code, etc. Now here’s where I get a little silly. If I’m working on my own stuff, it’s Coda, always, so I’ll constantly ‘commit all changes’ and fill in my commit messages and then cancel out. I do this over and over until I’ve reached a point where I think “This code is ready to be tested.” Then I commit for real.

    This means my commit logs look like this:

    Convert Font Icon to SVGs
    
     - Add new images for social media
     - Optimize CSS for pagespeed
     - Remove unused function.php file
     - Add shortcodes
    
    Fixes #1234
    

    There are other ways to do this, of course. I’m a huge proponent of keeping change logs but a commit message should be useful too.

    It’s too easy to put in this: git commit -m "Adding new icons"

    While it’s more time consuming, just use git commit and put in a good message like I did up at the top. Now, this is not new. A hundred people have all said this before, but it bears repeating.

    • The first line is your subject, keep it to 50 characters.
    • Capitalize the subject line but don’t use a period
    • Use the imperative mood – “Add new icons” and not “Adding new icons”
    • Leave an empty line between subject and body
    • Explain what you did in the body, keeping lines to 72 characters
    • Bullet points are okay – use a space before a hyphen for best compatibility
    • Reference any issues at the bottom – “Fixes: #123” or “See Also: #456 #789”
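
    If you want a machine to nag you about the mechanical rules, something like the following would do it. This is a sketch with a made-up function name, and it only checks length, capitalization, the trailing period, and the blank second line. It could be called from a commit-msg hook, since Git passes that hook the path to the message file as its first argument.

```shell
#!/bin/sh
# Hypothetical checker for the mechanical commit message rules.
set -eu

check_commit_message() {
    # $1 is a file holding the commit message.
    subject=$(sed -n '1p' "$1")
    second=$(sed -n '2p' "$1")

    if [ "${#subject}" -gt 50 ]; then
        echo "Subject is longer than 50 characters"; return 1
    fi
    case "$subject" in
        *.) echo "Subject ends with a period"; return 1 ;;
    esac
    case "$subject" in
        [A-Z]*) ;;
        *) echo "Subject is not capitalized"; return 1 ;;
    esac
    if [ -n "$second" ]; then
        echo "Second line is not empty"; return 1
    fi
    # Body lines (comment lines excluded) should stay within 72 characters.
    if sed '1,2d' "$1" | grep -v '^#' | grep -q '.\{73,\}'; then
        echo "Body line is longer than 72 characters"; return 1
    fi
    echo "Message looks fine"
}

msg=$(mktemp)
printf '%s\n' \
    'Convert Font Icon to SVGs' \
    '' \
    ' - Add new images for social media' \
    'Fixes #1234' > "$msg"
check_commit_message "$msg"
```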

    If, like me, you commit and then, before merge, realize you have changes, use git commit --amend to add your new changes to the existing commit.

  • Changing Git History

    Working on a group project in Git, I did the smart thing with my code. I made a branch and proceeded to edit my files. I also did a dumb thing. I made four commits.

    The first was for the first, ugly, functional version of the code. The second was a less ugly, kind of broken version. The third was the rewrite and the fourth was the working version. When I wanted to submit my changes for a review, it was going to be ugly. I did not need or want people looking at four commits. They only wanted the one.

    Now I’m a weird person for how I do commits. I add a new feature like a new function to parse things, and I commit that. Then I change my CSS and commit that. And so on and so on. This means I can look through my commit history and see exactly when I made a change. When I’m ready to do my release, I document all the changes based on that commit log and have it as my message.

    But when you’re working with a team, and all they want is one clean commit? Well I’m their worst nightmare. There is a cure for this, though! You can squash your commits, merging them all into one.

    Squash

    Actually it’s rebase. It can be squash too, though. I ran the following command which says to rebase my last 4 commits:

    git rebase -i HEAD~4
    

    That opens up another editor:

    pick b17617p Crap I need to do this thing!
    pick 122hdla Added feature HUMAN to autogenerate a humans.txt file
    pick nw9v88a Changed comment avatar size to 96px
    pick 8jsdy1m Updated CSS for comment avatars to make them a circle
    
    # Rebase b17617p..8jsdy1m onto b17617p
    #
    # Commands:
    #  p, pick = use commit
    #  r, reword = use commit, but edit the commit message
    #  e, edit = use commit, but stop for amending
    #  s, squash = use commit, but meld into previous commit
    #  f, fixup = like "squash", but discard this commit's log message
    #  x, exec = run command (the rest of the line) using shell
    #
    # If you remove a line here THAT COMMIT WILL BE LOST.
    # However, if you remove everything, the rebase will be aborted.
    #
    

    Now here’s where it’s weird. The first one, b17617p is the one I have to merge everything into. And it has the worst commit message, doesn’t it? Oh and I was totally not using the right formatting for how the company wants me to format my commits. They want the comment to be “Feature: Change” so I would have “Humans: Added new feature to autogenerate humans.txt”

    Since I knew I wanted to merge it all and totally rewrite the commit, I just did this:

    pick b17617p Crap I need to do this thing!
    squash 122hdla Added feature HUMAN to autogenerate a humans.txt file
    squash nw9v88a Changed comment avatar size to 96px
    squash 8jsdy1m Updated CSS for comment avatars to make them a circle
    

    Which, once saved and exited, gave me this:

    # This is a combination of 4 commits.
    # The first commit's message is:
    
    Crap I need to do this thing!
    
    # This is the 2nd commit message:
    
    Added feature HUMAN to autogenerate a humans.txt file
    
    # This is the 3rd commit message:
    
    Changed comment avatar size to 96px
    
    # This is the 4th commit message:
    
    Updated CSS for comment avatars to make them a circle
    
    # Please enter the commit message for your changes. Lines starting
    # with '#' will be ignored, and an empty message aborts the commit.
    # Explicit paths specified without -i nor -o; assuming --only paths...
    # Not currently on any branch.
    # Changes to be committed:
    #   (use "git reset HEAD <file>..." to unstage)
    #
    #	new file:   LICENSE
    #	modified:   README.textile
    #	modified:   Rakefile
    #	modified:   bin/jekyll
    #
    

    Since everything with a # is ignored, I deleted it and made it this:

    Humans: New Feature -- Humans.txt is now autogenerated
    Comments: Changed avatar size to 96px and edited CSS to make it a circle
    

    Yeah, that’s it. Admittedly, these should be two separate changes, but they’re all a part of the same project in this case so it’s okay.
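
    If you’d like to practice on something disposable first, this sketch builds a scratch repository with four messy commits and squashes them without opening an editor. The GIT_SEQUENCE_EDITOR line is a stand-in for changing pick to squash by hand, and GIT_EDITOR=true simply accepts the combined message. The file and commit names are invented.

```shell
#!/bin/sh
# Demo: squash the last four commits into one with a scripted interactive rebase.
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "start" > notes.txt
git add notes.txt
git commit -q -m "Base commit"

for n in 1 2 3 4; do
    echo "step $n" >> notes.txt
    git commit -q -a -m "Messy commit $n"
done

# Keep the first "pick" and turn every later one into "squash".
GIT_SEQUENCE_EDITOR="sed -i.bak -e '2,\$s/^pick/squash/'" GIT_EDITOR=true \
    git rebase -q -i HEAD~4

git log --oneline
```

    In real life you would still want to rewrite the combined message by hand, as I did above, rather than accept the default.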

    Of course, at the end of this, I looked at my code on our web tool and swore, because I’d left a debug line in. My hero Mike said “Don’t worry! Amend!”

    I made my change and, instead of a normal git commit -a -m "These are my changes", I ran git add FILENAME and git commit --amend to fix up my most recent commit.

    It lets you combine staged changes with the previous commit instead of committing it as an entirely new snapshot. It can also be used to simply edit the previous commit message without changing its snapshot.
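
    A quick sketch of that fix in a scratch repository, with a made-up file and message. The --no-edit flag keeps the existing commit message, so leave it off if you want to reword the message at the same time.

```shell
#!/bin/sh
# Demo: fold a forgotten fix into the most recent commit.
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"

# First attempt, with a debug line left in by mistake.
printf '%s\n' 'real code' 'debug line' > feature.txt
git add feature.txt
git commit -q -m "Humans: Add autogenerated humans.txt"

# Remove the debug line, stage the file, and amend instead of committing anew.
echo "real code" > feature.txt
git add feature.txt
git commit -q --amend --no-edit

git log --oneline
git show HEAD:feature.txt
```

    Amending rewrites the commit, so if you had already pushed the original, the branch on the remote will need a forced push.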

    And yes, it’s pretty awesome. Use it wisely.