Half-Elf on Tech

Thoughts From a Professional Lesbian

Tag: jekyll

  • Deploy Jekyll Without Ruby

    This is a bit of a lie. I do have Ruby, it’s just on my laptop.

    I don’t have it on my server, though my git repo lives there. I could install it, but there are a couple of reasons I don’t, and because I don’t, I can’t just run a jekyll build on my server. I’m not the only one with this particular issue. A lot of people on shared hosts can’t do it, for example. And people on cloud-based tools can’t really either.

    Option one, which is very common, is what I have been doing. I added _site (the Jekyll output folder) to Git and I copy that over on a post commit hook. For what it was, that worked just fine. It only ran when I did a git commit, and if I wanted to work on a version I could totally do that in a branch, edit it, and bring it back in without accidentally deploying things.

    But option two would be rsync and that appealed to me more.

    I found the gem I was looking for eventually. It’s called (simply) Jekyll deploy and you add it to your Gemfile with this:

    group :jekyll_plugins do
      gem 'jekyll_deploy'
    end
    

    Then run the bundle command:

    $ bundle

    Now you have a new command called deploy, which first runs a build and then deploys based on the configuration options you put in. In my case it’s an rsync deploy, but you can do Git too. There was just one problem with it. Because it ran a build before every deploy, the whole site was regenerated each time, which meant the rsync would always see 100% new files, and that was more traffic than I really wanted.

    So I did what you always do here and I made a fork of Jekyll Deploy and changed my Gemfile to this:

    group :jekyll_plugins do
      gem 'jekyll_deploy', :git => 'https://github.com/ipstenu/jekyll_deploy.git', :branch => 'develop'
    end
    

    Now my deploy only runs a deploy.

    A better solution would have been to put in some options and create jekyll deploy --build to allow me to run a build first, but I actually kind of like having them separate.

    The only question left was if I should keep _site under version control. I decided that I should, since the git repository would keep the file dates under control, assuring me that only the files I changed would be pushed with a deploy.
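    A possible alternative to relying on file dates, sketched in Ruby: rsync can compare file contents with --checksum instead of size and modification time, so a fresh build with new timestamps would only re-send files whose contents changed. This is not what jekyll_deploy does; the method name and the destination string are placeholders for illustration.

```ruby
# Build an rsync command that compares file contents (--checksum) rather than
# timestamps. The destination passed in is a placeholder, not a real host.
def rsync_command(source, destination, dry_run: false)
  command = ['rsync', '--archive', '--compress', '--checksum', '--delete']
  # --dry-run reports what would be transferred without changing anything.
  command << '--dry-run' if dry_run
  # A trailing slash on the source copies the folder's contents, not the folder.
  command + ["#{source.chomp('/')}/", destination]
end

# Usage (commented out so the sketch has no side effects):
# system(*rsync_command('_site', 'user@example.com:www/jekyll/', dry_run: true))
```

    Note that --delete removes remote files that no longer exist in _site, so a dry run first is a sensible habit.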

    I will note that the only reason it’s so simple for me is that I have passwordless SSH set up, so I don’t have to put in my password when I connect from a trusted computer. Since I only have this set up on a trusted machine, and without it I’d need a password to get at the git repo anyway, I felt it was safe.

  • Bad Habits, Bad Dates

    First of all, the migration of MediaWiki to Jekyll went fine. I binge watched “Person of Interest” and converted things with the clever use of grep and regex. Once I got to the point where I was converting templated files from Wiki to Jekyll, it got a lot easier. The hardest part was date conversion, and it started with some bad filenames.

    MediaWiki let me use whatever I wanted in (almost) whatever way I wanted, which is a problem. Also a problem is MediaWiki’s flat-level structure. Everything was the same level for the URLs, so you had http://example.com/wiki/NAME and, for the most part, that worked out okay. The problem I ran into was how I chose to name files.

    You see, I used the logical names “Interview Source (dd M yyyy)” for the interviews. That converted to the URL of http://example.com/wiki/Interview_Source_(dd_M_yyyy) which is nice and descriptive, if long. And it worked great right up until my subject had seven interviews on one day, two with the same source.

    Take this example. If you have an interview with the CBS morning news and the CBS evening news, on the same CBS local station, do you name the files “CBS Morning News (28 October 2015)” or “CBS News (28 October 2015)”? Obviously you have to go by the unique name (or the more unique one) to avoid name collisions. And for a time that worked out just fine. Except. I also had news articles. So if the CBS Morning News put out a news article on the same date as the interview, I was screwed. I ended up with multiple stupid filenames like “CBS Morning News (28 October 2015 b)” and so on. It was annoying.

    This could have been ‘avoided’ or at least mitigated more if I’d used the subpage hierarchy for articles, making things http://example.com/wiki/Interview/Interview_Source_(dd_M_yyyy) and http://example.com/wiki/News/Interview_Source_(dd_M_yyyy) instead. And certainly I could have moved everything.

    But for whatever reason, subpages aren’t really super popular with MediaWiki. At least not the self-managed ones I’ve seen. They take a level of awareness that not everyone has. You can’t ‘see’ the subpages easily, not like categories with WordPress, or collections with Jekyll. And that means people just don’t use them. How do you train everyone to know how to do everything?

    Conversely, this naming issue isn’t a problem with WordPress because there has always been a clear delineation between URL and page name. This is even more so when you use plugins like Yoast SEO, which allows you to remove ‘stopwords’ like ‘a’ and ‘the’ from your URL strings. This looks ‘wrong’ on MediaWiki, sadly, which is used to making pretty URLs that are descriptive.

    In the move to Jekyll, I renamed everything. First I made folders for each year and then I moved all files with that year in the name into the right folder. Since that muddled a few ‘extra’ files in there, I checked each file for the content {{InterviewTemplate or {{NewsTemplate and sorted them into /interviews/year/ or /news/year/ as appropriate. That was easy.
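    The sorting step above can be sketched in Ruby. This is a minimal illustration, assuming the year is the only four-digit number in the filename and that the template name appears literally in the file; it collapses the two passes (year first, then type) into one, and the method names are made up.

```ruby
require 'fileutils'

# Pick a target folder from the wiki template a page uses and the year in its
# name. Returns nil when the file is neither an interview nor a news article.
def destination_for(filename, content)
  # First standalone four-digit number in the name, e.g. the 2015 in "(28_October_2015)".
  year = filename[/(?<!\d)(\d{4})(?!\d)/, 1]
  kind = if content.include?('{{InterviewTemplate')
           'interviews'
         elsif content.include?('{{NewsTemplate')
           'news'
         end
  return nil if year.nil? || kind.nil?

  File.join(kind, year)
end

# Move every matching file in source_dir into kind/year/ below it.
def sort_files(source_dir)
  Dir.glob(File.join(source_dir, '*')).each do |path|
    next unless File.file?(path)

    target = destination_for(File.basename(path), File.read(path))
    next if target.nil?

    FileUtils.mkdir_p(File.join(source_dir, target))
    FileUtils.mv(path, File.join(source_dir, target))
  end
end
```

    Calling sort_files('export') on a copy of the exported pages would leave anything it can’t classify where it was, which makes the ‘extra’ files easy to spot.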

    To rename the files, I used my favorite tool Name Mangler to convert the filenames from Interview_Source_(dd_M_yyyy) to interview-source – nice and short. The ‘gotcha’ with that was, of course, multiple posts from the same source in a given year. And that was a problem because of that stupid naming convention. I would have to sort out some kind of script to rename things in bulk to convert the names into something I could then re-rename in order.

    And then I remembered something…

    'Automating' comes from the roots 'auto-' meaning 'self-', and 'mating', meaning 'screwing'.

    Not that. I remembered that the post-slug didn’t matter. It could represent the date of the post, but also possibly the order in which the posts were created. Which meant they didn’t matter in the slightest and I could batch rename.
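    Since the slug carries no meaning, a batch rename only needs to keep names unique. A rough Ruby sketch of that idea, assuming wiki-style names ending in a parenthesised date; the method names and the numbering scheme are illustrative, not what Name Mangler does.

```ruby
# Turn "CBS_Morning_News_(28_October_2015_b)" into "cbs-morning-news".
def slug_for(wiki_name)
  wiki_name
    .sub(/_?\(.*\)\z/, '')       # drop the trailing "(dd_M_yyyy)" part
    .downcase
    .gsub(/[^a-z0-9]+/, '-')     # underscores and punctuation become hyphens
    .gsub(/\A-|-\z/, '')         # trim stray hyphens at either end
end

# Give each name a unique slug, numbering repeats in input order.
def unique_slugs(wiki_names)
  seen = Hash.new(0)
  wiki_names.map do |name|
    slug = slug_for(name)
    seen[slug] += 1
    seen[slug] == 1 ? slug : "#{slug}-#{seen[slug]}"
  end
end
```

    Feeding it the names for one year’s folder gives interview-source, interview-source-2, and so on, in whatever order the names were listed.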

    Furthermore, my date convention led to a massive annoyance inside the content. Jekyll wanted my dates to be yyyy-mm-dd and there was no really easy way to take yyyy-M-dd and convert it. There is no single regex that maps a month name to its number. In the end, I converted dd M yyyy into yyyy-M-dd (which regex can do nicely), then searched all files for date: (\d{4})-January-(\d{2}) to replace with date: \1-01-\2, and repeated that for every month.

    Annoying, but it worked.
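    The month lookup that a regex alone can’t do is easy to hand to a date parser. A minimal sketch in Ruby, assuming front matter lines of the form date: 28 October 2015; the method names are made up for illustration and this is not the search-and-replace used above.

```ruby
require 'date'

# Convert "dd Month yyyy" (e.g. "28 October 2015") to "yyyy-mm-dd".
def iso_date(text)
  Date.strptime(text, '%d %B %Y').strftime('%Y-%m-%d')
end

# Matches a whole front matter line such as "date: 28 October 2015".
DATE_LINE = /^date: (\d{1,2} [A-Z][a-z]+ \d{4})$/

# Rewrite every matching date line in a file's content.
def fix_dates(content)
  content.gsub(DATE_LINE) { "date: #{iso_date(Regexp.last_match(1))}" }
end
```

    Running fix_dates over File.read of each page and writing the result back would handle all twelve months in one pass.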

  • Jekyll Layouts vs Wiki Templates

    One of the things I was doing in MediaWiki was using a lot of templates. A lot. The way a template works in MediaWiki, you have a special page called Template:NAME and you can embed it with {{NAME}} in any post. You can even embed a template in a template. They’re basically static ‘blurbs.’ You can make them dynamic, but I have found that even after ten years of using MediaWiki, it’s still a bit of a mystery.

    With Jekyll, that gets thrown out the window.

    Let’s take, for example, my list of interviews. I have 14 or so years of interviews, broken up into a separate page by year and internally sorted by date. Manually. I also have a template {{Interviews}} which outputs a pretty formatted link to each year. Also made manually. For every new interview, I edited at least two pages (the interview itself and the year). And for every year I had to update the main interviews page and the template.

    My end goal was to do the following:

    1. Each year index would dynamically list the posts for that year
    2. The interview main page would list links to all the available years
    3. The interview ‘template’ would be output on every page
    4. The interview year page would list everything from that year
    5. All those things would dynamically update when I added a new item

    Oh and I also wanted a layout to be intelligent enough to show a special header with specific information on the individual interview pages.

    Love Collections

    To convert this, I first made use of collections, making one for _interviews and within that I have a folder for each year with the interview as a flat file and an index.md to make the main index. I don’t have to do this. I could have the index anywhere I wanted, but this was easier for me.

    There is a big gotcha here, though. Subfolders, collections, and sorting by date don’t work together the way you’d think they would. I could make it easily sort by title, and I could reverse it, but sorting by date proved to be a killer. Eventually I figured this out:

    1. All the pages have to have a date, even if you’re not going to sort that page (see my index)
    2. You can’t sort in a for loop

    The final code looks like this:

    {% assign posts = site.interviews | sort: 'date' %}
    <ul>
    	{% for post in posts %}
    		{% if post.topic != 'index' and post.tags contains page.year %}
    			<li><a href="{{ site.baseurl }}{{ post.url }}">{{ post.title }} ({{ post.date | date: "%d %B" }})</a></li>
    		{% endif %}
    	{% endfor %}
    </ul>
    

    Front Matters

    This is funnier if you know that the ‘header’ of a Jekyll file is called the front matter. Here’s an example of mine:

    ---
    title: Interviews
    author: Mika E.
    layout: interview
    permalink: /interviews/
    date: 2001-01-01
    topic: index
    tags:
      - 2001
    ---
    

    Everything except topic: index is a default variable. I made the topic, and what that does is tell me “This page is an index page” and what year things are. There are reasons for this down the line. Now I also want to sort by year, but I can parse the date for that.

    Design the Layout

    I designated my layout as ‘interview’ in the first example, so I made a file called interview.html in layouts and made it a child of my default layout. In there, I have this code:

    <ul>
    {% for post in site.interviews %}
    	{% if post.topic == 'index' %}
    		<li><a href="{{ site.baseurl }}{{ post.url }}">{{ post.date | date: "%Y" }}</a></li>
    	{% endif %}
    {% endfor %}
    </ul>
    

    That says “if a page is an index, list it.” Now when I want a new year, I just add in a new folder with an index file.

    I’ve gone even further, taking the logic from some WordPress themes I’ve seen, and the layout file has all the code for both the index view and the per-item view, allowing me to format my interviews with custom headers and footers around the content.

    Does it Work?

    Yes it does! Mostly.

    The problem with this, and yes there’s a problem, is that the interview layout page doesn’t regenerate itself. I have to go and re-save the layout for interviews in order to regenerate any lists I have on that page.

    I can get away with typing this in shell: touch _content/_jekyll/layouts/interview.html && jekyll build but it is a little annoying. Even running a manual jekyll build won’t do it because the layout doesn’t realize it has a change yet. I do understand why, though. It may be worth moving that somewhere else, though I have a feeling even if I make it a template it would have the same problem, since that template file wouldn’t know to update until it was edited.

    It took me a while to find that the magic sauce is a bit of code called regenerate: true. This is not something you should use everywhere! I use it on my interviews index pages because those pages get updated when a new item is added to their folder. It actually lets my index pages be totally blank except for the YAML headers, which is nice and simple.

  • Mailbag: Why Jekyll?

    Why didn’t you convert your site to WordPress? You said you had to import it from Mediawiki to WordPress already.

    I had this conversation with my wife, too.

    WordPress is awesome at being a dynamic website. To be a static ‘wiki’ style website, it sucks. It’s not meant to be static like that. It’s not intended to be static. Even if you turn off comments on your site, you mean for WordPress to generate index pages and categories and the like.

    With WordPress, all that work is done on the server. When you visit a page, it’s generated for the first time. I may have a cache that lets reader number 2 see that page, but always the page, the HTML, is being dynamically built on-demand. MediaWiki works the same way. In contrast, Jekyll is dynamically built on my laptop and deployed as an in-situ static site. Each HTML page is a real HTML page on the server. No extra work has to happen. It’s small, it’s light, and it’s fast, because all that processing was done by me on my laptop before putting it on the server.

    And that actually illustrates the problem with WordPress, and why we struggle with things like Varnish and nginx and caching. We want our sites to do more and be faster. We need flexibility and posting to Twitter and dynamic page generation when we make an edit, because we’re constantly making changes.

    Except I didn’t. I don’t. Not the particular site I was working on, anyway. The site has about 1000 pages (probably closer to 600 once I decided not to import some of the things) and they’re pretty static. At most I updated them once a week for half the year. WordPress would be overkill. Hell, the Wiki was overkill and the only reason I kept using it was technological debt. I didn’t want to add to the debt. I didn’t want to make things even weirder and harder to use. I didn’t want to put a site more at risk with software I didn’t want to upkeep (MediaWiki, not WordPress).

    So it was clearly time to dig myself out with a little sweat equity and decide what I really wanted. I made a list of what I needed, what I wanted, and what I could live without. When I did that, Jekyll started looking more and more like a viable option. I would have spent as much time removing the aspects of WordPress I don’t need as I would have learning a new theme system and language.

    Also in the end I didn’t use the WordPress import. I manually copy/pasted content. The content was what I wanted, and I needed it text only, and MediaWiki made that damn hard to get at. Of course the Jekyll exporter for WordPress was pretty freaking cool. If I was pure WordPress to Jekyll, I’d be fine. I guess there just aren’t a lot of people doing MediaWiki exports.

  • Jekyll Collections

    Early on, Jekyll’s developers said that if someone was using posts for non-blog content, they were doing something wrong. That left one other avenue open, the first time I looked at Jekyll, which was pages. They’re nice, but they’re not what I wanted.

    Enter Jekyll collections. These are ‘arbitrary’ groups of related content which you put in their own folder. I had 15 years of interviews collected, so for me this seemed like a perfect idea. I read up on Ben Balter’s – Explain Jekyll Collections like I’m 5 and it helped me sort out what I wanted.

    Configure

    This is easy. You just add the collection code to _config.yaml:

    # Collections
    collections:
      interviews:
        output: true
    

    Having the output set to true means that when I run jekyll build the pages are generated. That’s pretty simple. They don’t get auto-generated when you run a jekyll serve and you’re testing locally, however. Which sucks. I upgraded to Jekyll 3.0 beta and it started working, though, and I’m okay with running a beta.

    Create A Folder

    Also easy. Make a folder called _interviews in the main Jekyll folder. I will note, this gave me a fit. I wish I could put all my collections in a subfolder, because now I have this:

    _data
    _includes
    _interviews
    _layouts
    _pages
    _posts
    _sass
    _site
    

    It’s messy, and if I didn’t know that some of those folders are special (like _includes) I could easily be confused. The _site folder makes some sense, that’s where my site is output. But even if I use the source setting to move all my source pages into a folder (called _source in my case), I still can’t separate the code from the content. What I would like is this:

    _assets – Store all of my ‘code’ like layouts, plugins, css, etc here.
    _content – Store all my post content, collections, pages, etc here.

    Still this is a little better for me. Less insane. I will note, I was able to move my folders by defining the directories in my configuration file like this:

    # Moving Folders
    source:       _content
    plugins_dir:  _jekyll/plugins
    layouts_dir:  _jekyll/layouts
    includes_dir: _jekyll/includes
    

    So now my main folder has two folders _site and _content which is a lot easier for me to work with. I feel less muddled. Inside the content folder is a _jekyll folder which is my ‘wp-content’ folder, and a _data folder, which has some data files. More on that later.

    NB: This only works on Jekyll 3.0 and up!

    Create Files

    All I had to do was make my files in my _interviews folder and I was done. Well. Not really. I needed a way for Jekyll to link through everything, and I really didn’t think making manual pages was smart. I tossed in this code to my interviews post file and it cleverly looped through everything it found, generating the page on the fly:

    <ul>
    {% for topic in site.interviews %}
    	<li><a href="{{ site.baseurl }}{{ topic.url }}">{{ topic.title }}</a></li>
    {% endfor %}
    </ul>
    

    If you’re familiar with WordPress loops, this is the same thing as saying “For all posts in a category…”

    Customize the Hell Out of It

    Of course you know that’s what I did next. I went and made it super-complex by putting my interviews in year subfolders and then making the main interview page a list of all the years, with links to those pages, and loops back and … well. That’s another post.

  • Bye Wiki, Hello Jekyll

    I’m trying to make life less messy by learning an entirely new system.

    I have a Wiki with 1000 or so pages and it’s running MediaWiki. And it’s overkill. I don’t update it often enough to need all the bells and whistles. I need it to be fast, I need it to be simple. I need it to work for one editor (hi). Oh and I need it to be secure.

    Create a Git Repository

    There’s a reason for this. My plan is to commit my changes for Jekyll to a git repo and then have it auto-copy the proper files up to the folder on my webserver. My git repository is private and on the same server owned by the same account, so I can do this. Once I had my bare git repo, I ran this in my local repository folder on my laptop:

    git clone ipstenu@example.com:/home/ipstenu/repositories/jekyll.git site-jekyll
    

    And I got a warning: warning: You appear to have cloned an empty repository.

    Which I knew. But that’s fine. I wanted it empty.

    Install Jekyll On Your Computer

    Full stop. This is where I got confused before.

    $ brew install ruby
    $ gem install jekyll
    

    That’s it. That’s how to get it started.

    Create Your Site

    I was still in that other folder, so I ran an install:

    jekyll new . --force
    

    The reason for the force was that I did have some git files in there and a readme. Then I spent a few hours trying to figure out how to write posts and pages in Jekyll. Posts are ‘easy’ in that you create a file named yyyy-mm-dd-PostName.md and it will generate a post with that name. You can read up on Writing Posts for more.

    But. I’m converting a Wiki and pretty much the whole thing is going to be ‘pages’. To be honest, Jekyll’s idea of pages is ugly. The Writing Pages directions want me to put it all in the same folder and I didn’t like that. I thought I’d rather write a mess of posts in the _posts folder and then let Jekyll generate on the fly.

    To do that was relatively easy. I set up permalinks:

    # Outputting
    permalink: "/:title"
    

    After I did that, I realized I would still have to name things that ugly way, so I added this to my _config.yaml file:

    # Pages
    include: ['_pages']
    

    Then I made a folder called _pages and put my files in there, named CSI_Crime_Scene_Investigation_(season_1).html and so on, with headers like this:

    ---
    layout: default
    title:  "CSI: Crime Scene Investigation (season 1)"
    permalink: "/CSI_Crime_Scene_Investigation_(season_1)/"
    categories: television
    tags: csi
    ---
    

    Yeah. It’s starting to make sense. I could change the permalink to "/:title/" and get the same result, where it would match the filename. But for now, the basic idea is enough.

    Theming

    It was harder than expected. I had to convert a lot of random PHP includes into Jekyll includes (pity I can’t just say ‘include this file, yes, I know it’s PHP…’). Then I wanted to add some features like a table of contents, like I had from MediaWiki, which was a little tricky. But. Once I sorted out the way you do includes and how I could do them, it was all a bit easier.

    Importing MediaWiki

    This proved to be incredibly hard. Like table flipping, teeth gnashing, up at night, wondering why the universe was created this way hard. It was so hard, I exported the wiki to XML (easy), converted that to WordPress XML via Perl (hard because of dependencies), edited all instances of <wp:post_type>wiki</wp:post_type> to be a post, imported into a WordPress site (easy), and then …

    Then I spent a long time going through the import, fixing the pages, formatting things, uploading images properly, etc. The wiki I was importing was old. It happens to be the oldest part of the website it’s on, and I was using a lot of templates. In a way that was great. But in another way it was really a terrible idea because it locked me in.

    So a lot of things had to happen. First, I had to rebuild all my templates. The wonderful thing with this is that I was using a lot of templates to list things like episodes and I could convert those to yml (or csv) and then have Jekyll run a loop to display them. Once I realized that, it meant I had a lot more freedom with content.
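    As a rough illustration of the template-to-data idea, here is a Ruby sketch that turns a list of episodes into YAML text for a file under _data. The episode rows, field names, and file name are hypothetical stand-ins; Jekyll exposes files in _data through site.data, so a layout could then loop over them.

```ruby
require 'yaml'

# Hypothetical episode rows, standing in for what a wiki template used to list.
episodes = [
  { 'number' => 1, 'title' => 'Example Episode One' },
  { 'number' => 2, 'title' => 'Example Episode Two' }
]

# Serialize rows to YAML text suitable for a file under _data/.
def episodes_yaml(rows)
  YAML.dump(rows)
end

puts episodes_yaml(episodes)
# To write it out instead of printing:
# File.write('_data/episodes.yml', episodes_yaml(episodes))
```

    With a file named episodes.yml in _data, the rows would be available to Liquid as site.data.episodes.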

    I ended up not importing everything. A lot of what was on that Wiki was never looked at by anyone but me, and fifteen plus years of cruft leads to a lot of messy things. Between Jekyll collections and data, I was able to break things out into sanity again. But that’s a whole post on its own.

    Pushing To My Server

    I’m using Git, and it’s set to auto-deploy when I push. But this time I did it a little differently. Normally I’d run jekyll on the server, but in this case I don’t have the option, so I went with adding my _site folder to the git repo (which meant editing .gitignore) and then writing this:

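    #!/bin/bash -l
    # Deploy the committed _site folder to the web root.
    GIT_REPO="$HOME/repositories/jekyll.git"
    TMP_GIT_CLONE="$HOME/tmp/git/repositories"
    PUBLIC_WWW="$HOME/www/jekyll"
    
    # Clone into a scratch folder, copy the built site over, then clean up.
    git clone "$GIT_REPO" "$TMP_GIT_CLONE"
    cp -r "$TMP_GIT_CLONE"/_site/* "$PUBLIC_WWW"
    rm -Rf "$TMP_GIT_CLONE"
    exit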
    

    This is not what I would consider a great idea. I’d rather run Jekyll on the box, but Ruby has been misbehaving there, and this actually lets me use the code on a shared box too.