Half-Elf on Tech

Thoughts From a Professional Lesbian

Tag: hugo

  • Hugo and Lunr – Client Side Searching

    I use Hugo on a static website that has few updates but still needs a bit of maintenance. With a few thousand pages, it also needs a search. For a long time, I was using a Google Custom Search, but I’m not the biggest Google fan and they insert ads now, so I needed a new solution.

    Search is the Worst Part

    Search is the worst thing about static sites. Scratch that. Search is the worst part of any site. We all bag on WordPress’ search being terrible, but anyone who’s attempted to install and manage something like Elasticsearch knows that WordPress’ search is actually pretty good. It’s just limited. The wider world of search, by contrast, is, well, complicated.

    The beauty of many CMS tools like WordPress, Drupal, and MediaWiki is that they have a rudimentary but perfectly acceptable search built in. And that’s the headache of static tools like Jekyll and Hugo: they simply don’t have one.

    Lunr

    If you don’t want to use third-party services and are interested in self-hosting your solution, then you’re going to have to look at a JavaScript option. Mine was Lunr.js, a fairly straightforward tool that searches a JSON file for matching items.

    There are pros and cons to this. Having it all in JavaScript means the load on my server is pretty low. At the same time, I have to regenerate the JSON file every time the content changes. In addition, every time someone visits the search page, they have to download that JSON file, which can get pretty big. Mine’s 3 megs for 2,000 or so pages. That’s something I need to keep in mind.

    This is, by the way, the entire reason I made that massive JSON file the other day.

    To include Lunr.js in your site, download the file and put it in your /static/ folder wherever you want. I have it at /static/js/lunr.js next to my jquery.min.js file. Now when you build your site, the JS file will be copied into place.

    The Code

    Since this is for Hugo, it has two steps. The first is the markdown code to make the post and the second is the template code to do the work.

    Post: Markdown

    The post is called search.md and this is the entirety of it:

    ---
    layout: search
    title: Search Results
    permalink: /search/
    categories: ["Search"]
    tags: ["Index"]
    noToc: true
    ---
    

    Yep. That’s it.

    Template: HTML+GoLang+JS

    I have a template file in layouts/_default/ called search.html and that has all the JS code as well as everything else. This is shamelessly forked from Seb’s example code.

    {{ partial "header.html" . }}
    
    	{{ .Content }}
    
    	<h3>Search:</h3>
    	<input id="search" type="text" placeholder="Just start typing...">
    
    	<h3>Results:</h3>
    	<ul id="results"></ul>
    
    	<script type="text/javascript" src="/js/lunr.js"></script>
    	<script type="text/javascript">
    	var lunrIndex, $results, pagesIndex;
    
    	function getQueryVariable(variable) {
    		var query = window.location.search.substring(1);
    		var vars = query.split('&');
    
    		for (var i = 0; i < vars.length; i++) {
    			var pair = vars[i].split('=');
    
    			if (pair[0] === variable) {
    				return decodeURIComponent(pair[1].replace(/\+/g, '%20'));
    			}
    		}
    	}
    
    	var searchTerm = getQueryVariable('query');
    
    	// Initialize lunrjs using our generated index file
    	function initLunr() {
    		// First retrieve the index file
    		$.getJSON("/index.json")
    			.done(function(index) {
    				pagesIndex = index;
    				console.log("index:", pagesIndex);
    				lunrIndex = lunr(function() {
    					this.field("title", { boost: 10 });
    					this.field("tags", { boost: 5 });
    					this.field("categories", { boost: 5 });
    					this.field("content");
    					this.ref("uri");
    
    					pagesIndex.forEach(function (page) {
    						this.add(page)
    					}, this)
    				});
    			})
    			.fail(function(jqxhr, textStatus, error) {
    				var err = textStatus + ", " + error;
    			console.error("Error getting Hugo index file:", err);
    			});
    	}
    
    	// Nothing crazy here, just hook up a listener on the input field
    	function initUI() {
    		$results = $("#results");
    		$("#search").keyup(function() {
    			$results.empty();
    
    			// Only trigger a search once at least 2 characters have been provided
    			var query = $(this).val();
    			if (query.length < 2) {
    				return;
    			}
    
    			var results = search(query);
    
    			renderResults(results);
    		});
    	}
    
    	/**
    	 * Trigger a search in lunr and transform the result
    	 *
    	 * @param  {String} query
    	 * @return {Array}  results
    	 */
    	function search(query) {
    		return lunrIndex.search(query).map(function(result) {
    				return pagesIndex.filter(function(page) {
    					return page.uri === result.ref;
    				})[0];
    			});
    	}
    
    	/**
    	 * Display the first 100 results
    	 *
    	 * @param  {Array} results to display
    	 */
    	function renderResults(results) {
    		if (!results.length) {
    			return;
    		}
    
    		// Only show the first 100 results
    		results.slice(0, 100).forEach(function(result) {
    			var $result = $("<li>");
    			$result.append($("<a>", {
    				href: result.uri,
    				text: "» " + result.title
    			}));
    			$results.append($result);
    		});
    	}
    
    	// Let's get started
    	initLunr();
    
    	$(document).ready(function() {
    		initUI();
    	});
    	</script>
    {{ partial "footer.html" . }}
    

    It’s important to note that you will also need to load jQuery. I do that in my header.html file, since I have a bit of jQuery I use on every page. If you don’t, remember to include it above the <script type="text/javascript" src="/js/lunr.js"></script> line, otherwise nothing will work.
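
    If you don’t already load jQuery somewhere, a minimal include (assuming it lives at /js/jquery.min.js, which is where I keep mine) looks like this, just above the lunr.js line:

    <script type="text/javascript" src="/js/jquery.min.js"></script>
    <script type="text/javascript" src="/js/lunr.js"></script>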

    Caveats

    If you have a large search file, this will make your search page slow to load.

    Also, I don’t know how to have a form on one page trigger the search on another, but I’m making baby steps in my javascripting.
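
    That said, the template above already reads a query parameter via getQueryVariable('query'), so in theory (a sketch I haven’t wired up myself) a plain GET form on any other page could point at the search page, using the /search/ permalink from the front matter above:

    <form action="/search/" method="get">
    	<input type="text" name="query" placeholder="Just start typing...">
    	<input type="submit" value="Search">
    </form>

    The searchTerm variable, which the template defines but never uses as posted, could then prefill the #search box and kick off a first search once the index loads.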

  • Hugo Making JSON

    While it rhymes with bacon, it’s not at all the same.

    There are a lot of reasons you might want a JSON file output from your static site (I like Hugo). Maybe you’re using Hugo to build out the backend of an API. Maybe you want it to include a search function. Today I’m going to show you how to have a JSON file created with a complete site archive. The end goal of this example is a searchable JSON file that you can use with Lunr.js or Solr or anything else of that ilk.

    The Old Way: Node

    Since I was initially doing this to integrate Hugo with Lunr.js, I spent some time wondering how I could make a JSON file and I ran into Lunr Hugo, a fork of Hugo Lunr but with YAML support (which I needed). I actually use a private fork of that, because I wanted to change what it saved, but this is enough to get everyone started.

    To use it, you install it via Node:

    npm install lunr-hugo
    

    Then you add the scripts to your Node package file (normally called package.json):

      "scripts": {
        "test": "echo \"Error: no test specified\" && exit 1",
        "index": "lunr-hugo -i \"site/content/posts/**\" -o site/static/js/search.json"
      },
    

    Change the value of “site/content/posts/**” as you see fit. Once installed, you can build the index by typing npm run index and it makes the file in the right location.

    The obvious downside to this is I have to run it outside of my normal build process.
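
    One way to soften that (a sketch, assuming Hugo is installed and your site lives in a /site/ folder as above) is to chain the indexing into an npm script, so one command rebuilds the index and then the site:

      "scripts": {
        "index": "lunr-hugo -i \"site/content/posts/**\" -o site/static/js/search.json",
        "build": "npm run index && hugo -s site"
      },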

    Another Old Way: Grunt

    This idea comes from Seb, one of the lead developers of Hugo, who uses a Grunt script to do this. First you have to install the Node dependencies via this command:

    npm install --save-dev grunt string toml conzole

    Next you make a Gruntfile.js file like this:

    var toml = require("toml");
    var S = require("string");
    var conzole = require("conzole"); // console-style logger used for error reporting below
    
    var CONTENT_PATH_PREFIX = "site/content";
    
    module.exports = function(grunt) {
    
        grunt.registerTask("lunr-index", function() {
    
            grunt.log.writeln("Build pages index");
    
            var indexPages = function() {
                var pagesIndex = [];
                grunt.file.recurse(CONTENT_PATH_PREFIX, function(abspath, rootdir, subdir, filename) {
                    grunt.verbose.writeln("Parse file:",abspath);
                    pagesIndex.push(processFile(abspath, filename));
                });
    
                return pagesIndex;
            };
    
            var processFile = function(abspath, filename) {
                var pageIndex;
    
                if (S(filename).endsWith(".html")) {
                    pageIndex = processHTMLFile(abspath, filename);
                } else {
                    pageIndex = processMDFile(abspath, filename);
                }
    
                return pageIndex;
            };
    
            var processHTMLFile = function(abspath, filename) {
                var content = grunt.file.read(abspath);
                var pageName = S(filename).chompRight(".html").s;
                var href = S(abspath)
                    .chompLeft(CONTENT_PATH_PREFIX).s;
                return {
                    title: pageName,
                    href: href,
                    content: S(content).trim().stripTags().stripPunctuation().s
                };
            };
    
            var processMDFile = function(abspath, filename) {
                var content = grunt.file.read(abspath);
                var pageIndex;
                // First separate the Front Matter from the content and parse it
                content = content.split("+++");
                var frontMatter;
                try {
                    frontMatter = toml.parse(content[1].trim());
                } catch (e) {
                    conzole.failed(e.message);
                }
    
                var href = S(abspath).chompLeft(CONTENT_PATH_PREFIX).chompRight(".md").s;
                // href for index.md files stops at the folder name
                if (filename === "index.md") {
                    href = S(abspath).chompLeft(CONTENT_PATH_PREFIX).chompRight(filename).s;
                }
    
                // Build Lunr index for this page
                pageIndex = {
                    title: frontMatter.title,
                    tags: frontMatter.tags,
                    href: href,
                    content: S(content[2]).trim().stripTags().stripPunctuation().s
                };
    
                return pageIndex;
            };
    
            grunt.file.write("site/static/js/lunr/PagesIndex.json", JSON.stringify(indexPages()));
            grunt.log.ok("Index built");
        });
    };
    

    Take note of where it saves the file: site/static/js/lunr/PagesIndex.json. That works for Seb because his setup has everything Hugo in a /site/ folder.

    To build the file, type grunt lunr-index and off you go.

    The New Way: Output Formats

    All of that sounded really annoying, right? I mean, it’s great, but you have to structure your site to separate Hugo from the Node folders, and you have to run all those steps outside of Hugo.

    Well, there’s good news. You can have this all done automatically if you have Hugo 0.20 or greater. In recent releases, Hugo introduced Output Formats. The extra formats let you output your content as RSS feeds, AMP, or (yes) JSON automatically.

    In this example, since I only want to make a master index file with everything, I can do it by telling Hugo that I want my home page, and only my home page, to have a JSON output. In order to do this, I put the following in my config.toml file:

    [outputs]
    	home = ["HTML", "JSON"]
    	page = ["HTML"]
    

    If I wanted to have it on more pages, I could do that too. I don’t.
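
    For instance (a sketch I don’t use myself), turning on JSON for sections and regular pages as well would look like this:

    [outputs]
    	home = ["HTML", "JSON"]
    	section = ["HTML", "JSON"]
    	page = ["HTML", "JSON"]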

    Next I made a file in my layouts folder called index.json:

    {{- $.Scratch.Add "index" slice -}}
    {{- range where .Site.Pages "Type" "not in" (slice "page" "json") -}}
    {{- $.Scratch.Add "index" (dict "uri" .Permalink "title" .Title "content" .Plain "tags" .Params.tags "categories" .Params.categories) -}}
    {{- end -}}
    {{- $.Scratch.Get "index" | jsonify -}}
    

    To generate the file, just run a build and it makes a file called index.json in the site root.
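
    For reference, the output is a single JSON array with one object per page (jsonify emits it all on one line; it’s prettified here, with hypothetical values):

    [
    	{
    		"uri": "https://example.com/search/",
    		"title": "Search Results",
    		"content": "Yep. That’s it.",
    		"tags": ["Index"],
    		"categories": ["Search"]
    	}
    ]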

    How do you statically build JSON Files?

    Do you have a trick or an idea of how to make building JSON files better? Leave a comment and let me know!

  • Sharing Content with Static Sites Dynamically

    When I wrote how to serve content to Hugo, I did so using something that was mostly static. You see, that code requires someone to push a new version of the Hugo site to rebuild the pages.

    Now let’s be serious, who wants to do that?

    The Concept

    Sadly, you can’t just include a PHP file in Hugo (or any static site builder) and have it echo content. Their whole point is to be static and not change. And my problem is that I immediately ran into a week where I knew the message on the header was going to be changing daily.

    Ew, right? Right. So I looked at that which I should be embracing deeply: JavaScript. Or in this case, jQuery and its getJSON call. Yes, that’s right, with jQuery you can fetch JSON and output it where you want.

    I do not recommend doing this for full page content. This is only vaguely smart if you’re trying to output something small that loads fast and isn’t going to mess up your site if someone has JavaScript disabled.

    The Code

    <script>
    	// Wait for the DOM to be ready so .wpcontent exists before we fill it
    	$(function () {
    		$.getJSON( "https://example.com/wp-json/wp/v2/pages/14363", function (json) {
    			var content = json.content.rendered;
    			document.querySelector('.wpcontent').innerHTML = content;
    		});
    	});
    </script>
    
    <div class="utility-bar">
    	<div class="wrap">
    		<section id="text-16" class="widget widget_text">
    			<div class="widget-wrap">
    				<div class="textwidget">
    					<div class="wpcontent"></div>
    				</div>
    			</div>
    		</section>
    	</div>
    </div>
    

    What that code does is grab the JSON and set the variable content to the value of the content’s ‘rendered’ property. Then, using document.querySelector, it drops that HTML into my wpcontent div and I’m done.
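
    And since the whole point is not to mess things up for people with JavaScript disabled, it’s worth shipping default text inside the div for the script to replace. A hypothetical version:

    <div class="wpcontent">Check back soon for this week’s message.</div>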

  • Sharing WordPress Content with Hugo

    When your life is just WordPress, there’s not a lot of headaches involved in making some posts and updating widgets and keeping your whole site in sync.

    When your life isn’t just WordPress, it gets a little weird.

    When you want to use WordPress to run your life a little more, you un-weird it by making it weirder.

    Rethinking Where the Content Lives

    Normally we think about content living on its own site. Well, I have a ‘message’ that needs to be the same on five different domains. Just work with me here. The point is, if I want to update the header message on the five sites, I have to update five sites. Yuck.

    Now, there are a lot of solutions to this. I decided I wanted one, and only one, place to update the header message. The most obvious is making a static text file that I could update when needed and import/include it in everywhere. But as I started looking into how I do that, my eyes drifted to WordPress.

    What if I used JSON? What if instead of a file, I made a page on WordPress, grabbed the content from https://example.com/wp-json/wp/v2/pages/12345 and parsed that so I didn’t have to do a whole mess of editing anywhere but on WordPress?

    Hugo

    Guess what. That works. And it works extra well for Hugo (a static site generator I’m fond of) because Hugo understands dynamic content. The one drawback is that it can’t live-refresh remote data, so I will always have to push a change to the site to trigger a rebuild.

    However if you ever wondered how to include WordPress’ JSON data into a Hugo theme, here’s what I have in my template for utility-bar.html:

    {{ $wordpressURL  := "https://example.com/wp-json/wp/v2/pages/12345" }}
    {{ $wordpressJSON := getJSON $wordpressURL }}
    
    <div class="utility-bar">
    	<div class="wrap">
    		<section id="text-16" class="widget widget_text">
    			<div class="widget-wrap">
    				<div class="textwidget">
    					{{ $wordpressJSON.content.rendered | safeHTML }}
    				</div>
    			</div>
    		</section>
    	</div>
    </div>
    

    The reason safeHTML is there is that otherwise Hugo wants to escape my HTML. Which is a wise choice! Default to not trusting.

    This outputs the post content and I have a happy (enough) day. The more I look at it, the more I realize how much I can do with WordPress and Hugo, since regenerating the site just takes a push of Hugo content.
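
    One caveat worth knowing: Hugo caches the data getJSON fetches, so if the WordPress page changes and a rebuild still shows the old content, the cache is the likely culprit. Assuming a reasonably recent Hugo, a build that skips it looks like this:

    $ hugo --ignoreCache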

  • Git Subtrees

    I have a project in Hugo where I wanted the content to be editable by anyone but the theme and config to remain mine. In this way, anyone could add an article to a new site, but only I could publish. Sounds smart, right? The basic concept would be this:

    • A private repository, on my own server, where I maintained the source code (themes etc)
    • A public repository, on GitHub or GitLab, where I maintained the content

    Taking into consideration how Hugo stores data, I had to rethink how I set up the code. By default, Hugo has two main folders for your content: content and data, both at the main (root) level of a Hugo install. This is normally fine, since I deploy via a post-deploy hook that pushes whatever I check in on master out to a temp folder and then runs a Hugo build on it. I’m still using this deploy method because it lets me push a commit without having to build locally first. Obviously there are pros and cons, but what I like is being able to edit my content, push, and have it work from my iPad.

    Now, keeping this setup, in order to split my repository I need to solve a few problems.

    Contain Content Collectively

    No matter what, I need to have one and only one location for my content. Two folders is fine, but they have to be within a single parent folder. Doing this is fairly straightforward.

    In the config.toml file, I set two defines:

    contentdir = "content/posts"
    datadir = "content/data"
    

    Then I moved the files in content to content/posts and moved data to content/data. I ran a quick local test to make sure it worked and, since it did, pushed that change live. Everything was fine. Perfect.
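
    The move itself was roughly this (a sketch; it assumes the posts are Markdown files sitting at the top of content):

    $ mkdir content/posts
    $ git mv content/*.md content/posts/
    $ git mv data content/data
    $ git commit -m "Move content and data for the new layout"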

    Putting Posts Publicly

    The second step was making a public repository ‘somewhere.’ The question of ‘where’ was fairly simple. You have a lot of options, but for me it boils down to GitLab or GitHub. While GitHub is the flavor du jour, GitLab lets you make a private repository for free, but both require users to log in with an account to edit or make issues. Pick whichever one you want. It doesn’t matter.

    What does matter is that I set it up with two folders: posts and data.

    That’s right. I’m replicating the inside of my content folder. Why? Well that’s because of the next step.

    Serving Subs Simply

    This is actually the hardest part, and it led me to complain that every time I use Submodules in Git, I remember why I hate them. I really want to love Submodules. The idea is that you check out a specific version of another repository as a module, and now you have it. The problem is that updates are complicated. You have to update the Submodule separately, and if you work with a team and one person doesn’t, there’s a possibility you’ll end up pushing an old version of the Submodule, because its contents aren’t version controlled in your main repository.

    It gets worse if you have to solve merge conflicts. Just run away.

    On the other hand, there’s a tool called Subtree, which two of my Twitter friends introduced me to after I tweeted my Submodule complaint. Subtree uses a merge trick to get the same result as a Submodule, only it actually stores the files in the main repository and then merges your changes back up to the subtree’s own repository. Subtrees are not a silver bullet, but in this case they were what I needed.

    Checking out the subtree is easy enough. You tell it where you want to store the repository (a folder named content) and you give it the location of your remote, the branch name, and voila:

    $ git subtree add --prefix content git@github.com:ipstenu/hugo-content.git master --squash
    git fetch git@github.com:ipstenu/hugo-content.git master
    From github.com:ipstenu/hugo-content
     * branch            master     -> FETCH_HEAD
    Added dir 'content'
    

    Since typing in the full path can get pretty annoying, it’s savvy to add the subtree as a remote:

    $ git remote add -f hugo-content git@github.com:ipstenu/hugo-content.git
    

    Which means the add command would be this:

    $ git subtree add --prefix content hugo-content master --squash
    

    Maintaining Merge Maneuverability

    Once we have all this in, we hit a new problem. The subtree is not synced by default.

    When a subproject is added, it is not automatically kept in sync with upstream changes, so you have to pull them in like this:

    $ git subtree pull --prefix content hugo-content master --squash
    

    When you have new code to add, run this:

    $ git subtree push --prefix content hugo-content master
    

    That makes the process for a new article a little extra weird but it does work.

    Documenting Data Distribution

    Here’s how I update in the real world:

    1. Edit my local copy of the content folder in the hugo-library repository
    2. Add and commit the changed content with a useful message
    3. Push the subtree
    4. Push the main repository

    Done.
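
    In commands, that whole dance looks roughly like this (a sketch, with a hypothetical commit message):

    $ git add content/
    $ git commit -m "Add a new article"
    $ git subtree push --prefix content hugo-content master
    $ git push origin master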

    If someone else has a pull request, I would need to merge it (probably directly on GitHub) and then do the following:

    1. Pull from the subtree
    2. Push to the main repository

    My one weird caveat is that Coda can get confused when updating, as it doesn’t always remember which repository I want to be on, but since I do all of my pushes from the command line, that doesn’t really bother me much.

  • Making HTTPS Everywhere

    With the advent of Let’s Encrypt, which introduced free and easy SSL certificates for everyone, and with Plesk, cPanel, and home-grown panels like DreamHost’s all providing easy ways to install, renew, and support certs, we’re finally inching our way to the dream of HTTPS Everywhere.

    Why HTTPS?

    The S makes it secure. The green lock in a browser tells a person that their visit is encrypted and that sensitive data is obscured. It means a visit is confidential. It means the site is the real site. It cannot be easily monitored, modified, or impersonated.

    While this blog has no sensitive data of yours, it does accept (require) your email when you leave a comment. You don’t want everyone knowing that, I suspect. You probably don’t want everyone grabbing your IP either.

    Why do we care about security in general? Because nothing is non-sensitive anymore. Everything we do and say on the Internet can be used against us. Entering your mother’s maiden name on a form over HTTP? Someone can snipe that and use it to steal your identity. Use the same password on multiple accounts, one of which is served over HTTP? Your password can be stolen. The list goes on and on.

    How I Turned This Site HTTPS Everywhere

    Every single domain on ipstenu.org is now HTTPS. Every one has either a Let’s Encrypt certificate or a Comodo one. First I turned on Let’s Encrypt. Then I used WP-CLI to search and replace my URLs:

    $ wp search-replace http://halfelf https://halfelf
    $ wp search-replace http://ipstenu https://ipstenu
    

    And so on and so forth down the line.
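
    If a mass replace makes you nervous (it should, a little), wp search-replace accepts a --dry-run flag that reports what it would change without touching the database:

    $ wp search-replace http://halfelf https://halfelf --dry-run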

    Next I checked my mu-plugins folder and my content folder to make sure none of my home-grown code was hardcoding http (it wasn’t), and updated my wp-config.php to include this:

    define('FORCE_SSL_ADMIN', true);
    define('FORCE_SSL_LOGIN', true); // deprecated since WordPress 4.0; FORCE_SSL_ADMIN covers logins too
    

    That probably wasn’t required but why not? Finally I tossed this into my .htaccess:

    # Force non WWW and SSL for everyone.
    <IfModule mod_rewrite.c>
    	RewriteEngine On
    
    	RewriteCond %{HTTP_HOST} ^www\.(.*)$ [NC]
    	RewriteRule ^(.*)$ https://%1%{REQUEST_URI} [R=301,L]
    
    	RewriteCond %{HTTPS} off
    	RewriteRule (.*) https://%{HTTP_HOST}%{REQUEST_URI} [R=301,L]
    </IfModule>
    

    Really. That’s all it took to go HTTPS everywhere here.

    Gotchas

    Not all my plugins were happy about this.

    Most were, actually, which was nice, but a couple did some incredibly stupid things with hardcoded http resources. Fixing them for myself is trivial. For others… I recommend WordPress HTTPS or Really Simple SSL, both of which will let you force https for all URLs or block the http ones.

    For the most part, with WordPress, you don’t need to worry about this. In recent years, the ability to force SSL from within WP itself has gotten better and better. The problem has always been our themes and plugins.

    Other than that, it’s been pretty smooth going.

    Non WordPress

    But… what about my non-WordPress sites? Yeah, you know I have them.

    Well, my ZenPhoto20 site doesn’t run any extensions, so I just checked the box for using SSL and went on my way. I’d cleverly written all my theming with protocol-relative URLs (//example.com/my/path/file.css). While that’s really an anti-pattern, and https should be used whenever possible, I had everything in one file, so I search-and-replaced that and it was done.

    My Hugo site required two changes to a config file. It looked like this:

      StaticURL = "https://example-static.net"
      HomeURL = "https://example.net"
    

    The reason I did that was so my templates could look like this:

    <link rel="shortcut icon" href="{{ .Site.Params.StaticURL }}/images/favicon.ico">
    

    Once I saved the variables in my config, I could push the site (which automatically rebuilds and deploys) and be done.