I have a project in Hugo where I wanted the content to be editable by anyone but the theme and config to remain mine. In this way, anyone could add an article to a new site, but only I could publish. Sounds smart, right? The basic concept would be this:
- A private repository, on my own server, where I maintained the source code (themes etc)
- A public repository, on GitHub or GitLab, where I maintained the content
Taking into consideration how Hugo stores data, I had to rethink how I set up the code. By default, Hugo has two main folders for your content: content and data. Those folders are at the main (root) level of a Hugo install. This is normally fine, since I deploy by having a post-deploy hook that pushes whatever I check in at master out to a temp folder and then runs a Hugo build on it. I’m still using this deploy method because it lets me push commits without having to build locally first. Obviously there are pros and cons, but what I like is being able to edit my content and push and have it work from my iPad.
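For anyone who hasn’t set up a deploy like that, a server-side hook of this kind might look roughly like the sketch below. It is not a copy of the real hook: the web root path, the use of mktemp for the temp folder, and the post-receive hook name are all assumptions for illustration.

```shell
#!/bin/sh
# Sketch of hooks/post-receive in the bare repository on the server.
# PUBLIC_DIR and BRANCH are placeholder values, not taken from a real setup.
set -eu

BRANCH="master"                       # branch that triggers a deploy
PUBLIC_DIR="/var/www/example/public"  # assumed web root
WORK_TREE="$(mktemp -d)"              # temp folder for the checkout

# Git feeds the hook one "old new ref" line per updated ref on stdin.
while read -r oldrev newrev refname; do
    [ "$refname" = "refs/heads/$BRANCH" ] || continue
    # Export the pushed branch into the temp folder...
    git --work-tree="$WORK_TREE" checkout -f "$BRANCH"
    # ...then build the site from there into the web root.
    hugo --source "$WORK_TREE" --destination "$PUBLIC_DIR"
done

# Clean up the temp checkout once the build is done.
rm -rf "$WORK_TREE"
```

Because the build happens on the server, a plain push from any device is enough to publish.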
Now, keeping this setup, in order to split my repository I need to solve a few problems.
Contain Content Collectively
No matter what, I need to have one and only one location for my content. Two folders are fine, but they have to live within a single parent folder. Doing this is fairly straightforward.
In the config.toml file, I set two defines:
contentdir = "content/posts"
datadir = "content/data"
Then I moved the files in content to content/posts and moved data to content/data. I ran a quick local test to make sure it worked and, since it did, pushed that change live. Everything was fine. Perfect.
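The move itself can be sketched in a few lines of shell. This version runs in a throwaway directory with made-up sample files (hello.md and authors.yml are hypothetical), so it is safe to try before touching a real site. In a real repository you would probably want git mv instead of mv so the history follows the files.

```shell
#!/bin/sh
# Sketch: restructure a Hugo site so posts and data both live under content/.
# Runs in a temp directory; the sample files are invented for illustration.
set -eu

SITE="$(mktemp -d)"
cd "$SITE"

# Stand-in for the default Hugo layout: content/ and data/ at the root.
mkdir -p content data
echo "title: hello" > content/hello.md
echo "name: example" > data/authors.yml

# Move the posts down one level, via a temp folder so that any
# subfolders inside content/ come along too.
mkdir posts.tmp
mv content/* posts.tmp/
mv posts.tmp content/posts

# Bring data in beside the posts.
mv data content/data

# Point Hugo at the new locations (the same two settings as above).
cat > config.toml <<'EOF'
contentdir = "content/posts"
datadir = "content/data"
EOF

# Show the resulting layout.
find content -type f | sort
```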
Putting Posts Publicly
The second step was making a public repository ‘somewhere.’ The question of ‘where’ was fairly simple. You have a lot of options, but for me it boils down to GitLab or GitHub. GitHub is the flavor du jour, while GitLab lets you make a private repository for free. Both require users to log in with an account to edit or open issues. Pick whichever one you want. It doesn’t matter.
What does matter is that I set it up with two folders: posts and data.
That’s right. I’m replicating the inside of my content folder. Why? Well, that’s because of the next step.
Serving Subs Simply
This is actually the hardest part, and it led me to complain that every time I use Submodules in Git, I remember why I hate them. I really want to love Submodules. The idea is that you check out a specific version of another repository as a module, and now you have it. The problem is that updates are complicated. You have to update the Submodule separately, and if you work with a team and one person doesn’t, there’s a possibility you’ll end up pushing the old version of the Submodule, because its files aren’t version controlled in your Git repository; only a pointer to a specific commit is.
It gets worse if you have to solve merge conflicts. Just run away.
On the other hand, there’s a tool called Subtree, which two of my Twitter friends introduced me to after I tweeted my Submodule complaint. Subtree uses a merge trick to get the same result as a Submodule, only it actually stores the files in the main repository and can then push your changes back up to the subproject’s own repository. Subtrees are not a silver bullet, but in this case it was what I needed.
Checking out the subtree is easy enough. You tell it where you want to store the repository (a folder named content) and you give it the location of your remote, the branch name, and voila:
$ git subtree add --prefix content git@github.com:ipstenu/hugo-content.git master --squash
git fetch git@github.com:ipstenu/hugo-content.git master
From gitlab.com:ipstenu/hugo-content
 * branch master -> FETCH_HEAD
Added dir 'content'
Since typing in the full path can get pretty annoying, it’s savvy to add the subtree as a remote:
$ git remote add -f hugo-content git@github.com:ipstenu/hugo-content.git
Which means the add command would be this:
$ git subtree add --prefix content hugo-content master --squash
Maintaining Merge Maneuverability
Once we have all this in, we hit a new problem. A subtree is not automatically kept in sync with upstream changes after it’s added, so you have to pull them in like this:
$ git subtree pull --prefix content hugo-content master --squash
When you have new content of your own to send upstream, run this:
$ git subtree push --prefix content hugo-content master --squash
That makes the process for a new article a little extra weird but it does work.
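To see the whole round trip without touching a real site, here is a self-contained sketch that builds everything in a temp directory. The local bare repository stands in for the public GitHub or GitLab one, and the seed repository, file names, and commit identity are all made up for illustration. It assumes your Git install includes the subtree command.

```shell
#!/bin/sh
# Sketch: subtree add, pull, and push against throwaway local repositories.
set -eu

# Hypothetical identity so commits work on a machine with no Git config.
export GIT_AUTHOR_NAME="Example" GIT_AUTHOR_EMAIL="example@example.com"
export GIT_COMMITTER_NAME="Example" GIT_COMMITTER_EMAIL="example@example.com"
export GIT_MERGE_AUTOEDIT=no   # don't open an editor for merge messages

WORK="$(mktemp -d)"

# 1. The "public" content repository (a bare repo playing the hosted one).
git init -q --bare "$WORK/hugo-content.git"

# Seed it with posts/ and data/ from a scratch clone-like repository.
git init -q "$WORK/seed"
cd "$WORK/seed"
git symbolic-ref HEAD refs/heads/master
mkdir posts data
echo "first post" > posts/first.md
echo "name: example" > data/authors.yml
git add .
git commit -q -m "Seed content"
git push -q "$WORK/hugo-content.git" master

# 2. The private site repository, with the content added as a subtree.
git init -q "$WORK/site"
cd "$WORK/site"
git symbolic-ref HEAD refs/heads/master
printf 'contentdir = "content/posts"\ndatadir = "content/data"\n' > config.toml
git add .
git commit -q -m "Site config"
git remote add -f hugo-content "$WORK/hugo-content.git"
git subtree add --prefix content hugo-content master --squash

# 3. Someone changes the public repository; pull that into the site.
cd "$WORK/seed"
echo "second post" > posts/second.md
git add .
git commit -q -m "Add second post"
git push -q "$WORK/hugo-content.git" master
cd "$WORK/site"
git subtree pull --prefix content hugo-content master --squash

# 4. A local edit inside content/ goes back up to the public repository.
echo "third post" > content/posts/third.md
git add content
git commit -q -m "Add third post"
git subtree push --prefix content hugo-content master
```

After the last step, the public repository has all three posts, and the site repository has them under content/posts.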
Documenting Data Distribution
Here’s how I update in the real world:
- Edit my local copy of the content folder in the hugo-library repository
- Add and commit the changed content with a useful message
- Push the subtree
- Push the main repository
Done.
If someone else has a pull request, I would need to merge it (probably directly on GitHub) and then do the following:
- Pull from the subtree
- Push to the main repository
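Both routines are short enough to wrap in shell functions. The sketch below is one way to do it, not a required setup: the function names are hypothetical, and it assumes the content remote is called hugo-content, the main remote is called origin, and everything lives on master.

```shell
#!/bin/sh
# Hypothetical helpers for the two routines above.
# Source this file, then call the functions from the top level
# of the main repository.

PREFIX="content"               # folder the subtree lives in
CONTENT_REMOTE="hugo-content"  # remote for the public content repository
SITE_REMOTE="origin"           # assumed name of the main repository's remote
BRANCH="master"

# Local edit: commit the content, push the subtree, push the main repository.
publish_content() {
    message="$1"
    git add "$PREFIX" &&
    git commit -m "$message" &&
    git subtree push --prefix "$PREFIX" "$CONTENT_REMOTE" "$BRANCH" &&
    git push "$SITE_REMOTE" "$BRANCH"
}

# Merged pull request upstream: pull the subtree, push the main repository.
sync_content() {
    git subtree pull --prefix "$PREFIX" "$CONTENT_REMOTE" "$BRANCH" --squash &&
    git push "$SITE_REMOTE" "$BRANCH"
}
```

With that loaded, publishing a new article is one call such as publish_content "Add subtree article", and picking up a merged pull request is sync_content. Each step is chained with &&, so a failed subtree push stops before the main repository is pushed.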
My weird caveat is that updating via Coda can get confused, as it doesn’t always remember what repository I want to be on, but since I do all of my pushes from the command line, that really doesn’t bother me much.