Half-Elf on Tech

Thoughts From a Professional Lesbian

Tag: json

  • Hugo Making JSON

    While it rhymes with bacon, it’s not at all the same.

    There are a lot of reasons you might want a JSON file output from your static site (I like Hugo). Maybe you’re using Hugo to build out the backend of an API. Maybe you want to have it include a search function. Today I’m going to show you how to have a JSON file created with a complete site archive. The end goal of this example is to have a searchable JSON file that you can use with Lunr.js or Solr or anything else of that ilk.

    The Old Way: Node

    Since I was initially doing this to integrate Hugo with Lunr.js, I spent some time wondering how I could make a JSON file and I ran into Lunr Hugo, a fork of Hugo Lunr but with YAML support (which I needed). I actually use a private fork of that, because I wanted to change what it saved, but this is enough to get everyone started.

    To use it, you install it via Node:

    npm install lunr-hugo
    

    Then you add the scripts to your Node package file (normally called package.json):

      "scripts": {
        "test": "echo \"Error: no test specified\" && exit 1",
        "index": "lunr-hugo -i \"site/content/posts/**\" -o site/static/js/search.json"
      },
    

    Change the value of “site/content/posts/**” as you see fit. Once installed you can build the index by typing npm run index and it makes the file in the right location.

    The obvious downside to this is I have to run it outside of my normal build process.

    Another Old Way: Grunt

    This idea comes from Seb, one of the lead developers for Hugo, who uses a Grunt script to do this. First you have to install the Node packages it needs via this command:

    npm install --save-dev grunt string toml conzole

    Next you make a Gruntfile.js file like this:

    var toml = require("toml");
    var S = require("string");
    var conzole = require("conzole"); // used further down to report TOML parse errors
    
    var CONTENT_PATH_PREFIX = "site/content";
    
    module.exports = function(grunt) {
    
        grunt.registerTask("lunr-index", function() {
    
            grunt.log.writeln("Build pages index");
    
            var indexPages = function() {
                var pagesIndex = [];
                grunt.file.recurse(CONTENT_PATH_PREFIX, function(abspath, rootdir, subdir, filename) {
                    grunt.verbose.writeln("Parse file:",abspath);
                    pagesIndex.push(processFile(abspath, filename));
                });
    
                return pagesIndex;
            };
    
            var processFile = function(abspath, filename) {
                var pageIndex;
    
                if (S(filename).endsWith(".html")) {
                    pageIndex = processHTMLFile(abspath, filename);
                } else {
                    pageIndex = processMDFile(abspath, filename);
                }
    
                return pageIndex;
            };
    
            var processHTMLFile = function(abspath, filename) {
                var content = grunt.file.read(abspath);
                var pageName = S(filename).chompRight(".html").s;
                var href = S(abspath)
                    .chompLeft(CONTENT_PATH_PREFIX).s;
                return {
                    title: pageName,
                    href: href,
                    content: S(content).trim().stripTags().stripPunctuation().s
                };
            };
    
            var processMDFile = function(abspath, filename) {
                var content = grunt.file.read(abspath);
                var pageIndex;
                // First separate the Front Matter from the content and parse it
                content = content.split("+++");
                var frontMatter;
                try {
                    frontMatter = toml.parse(content[1].trim());
                } catch (e) {
                    conzole.failed(e.message);
                }
    
                var href = S(abspath).chompLeft(CONTENT_PATH_PREFIX).chompRight(".md").s;
                // href for index.md files stops at the folder name
                if (filename === "index.md") {
                    href = S(abspath).chompLeft(CONTENT_PATH_PREFIX).chompRight(filename).s;
                }
    
                // Build Lunr index for this page
                pageIndex = {
                    title: frontMatter.title,
                    tags: frontMatter.tags,
                    href: href,
                    content: S(content[2]).trim().stripTags().stripPunctuation().s
                };
    
                return pageIndex;
            };
    
            grunt.file.write("site/static/js/lunr/PagesIndex.json", JSON.stringify(indexPages()));
            grunt.log.ok("Index built");
        });
    };
    

    Take note of where it’s saving the file: site/static/js/lunr/PagesIndex.json. That works for Seb because his setup keeps everything Hugo-related in a /site/ folder.

    To build the file, type grunt lunr-index and off you go.
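
    If you want the index rebuilt every time you build the site, you can chain the two commands in an npm script. This is just a sketch, and it assumes grunt-cli and hugo are both on your PATH and that your Hugo source lives in /site/ like Seb’s does:

      "scripts": {
        "index": "grunt lunr-index",
        "build": "npm run index && hugo -s site"
      }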

    The New Way: Output Formats

    All of that sounded really annoying, right? I mean, it’s great, but you have to structure your site to separate Hugo from the Node folders, and you have to run all those steps outside of Hugo.

    Well there’s good news. You can have this all done automatically if you have Hugo 0.20.0 or greater. In recent releases, Hugo introduced Output Formats. The extra formats let you spit out your content with RSS feeds, AMP, or (yes) JSON formatting automatically.

    In this example, since I only want to make a master index file with everything, I can do it by telling Hugo that I want my home page, and only my home page, to have a JSON output. In order to do this, I put the following in my config.toml file:

    [outputs]
    	home = [ "HTML", "JSON"]
    	page = [ "HTML"]
    

    If I wanted to have it on more pages, I could do that too. I don’t.

    Next I made a file in my layouts folder called index.json:

    {{- $.Scratch.Add "index" slice -}}
    {{- range where .Site.Pages "Type" "not in" (slice "page" "json") -}}
    {{- $.Scratch.Add "index" (dict "uri" .Permalink "title" .Title "content" .Plain "tags" .Params.tags "categories" .Params.categories) -}}
    {{- end -}}
    {{- $.Scratch.Get "index" | jsonify -}}
    

    To generate the file, just run a build and it makes a file called index.json in the site root.
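
    The output is one big array with one object per page. Trimmed down (and pretty-printed here; the real thing is one long line), it looks something like this:

    [
      {
        "uri": "https://example.com/2017/04/some-post/",
        "title": "Some Post",
        "content": "The plain text of the post ...",
        "tags": ["json", "hugo"],
        "categories": ["How To"]
      }
    ]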

    How do you statically build JSON Files?

    Do you have a trick or an idea of how to make building JSON files better? Leave a comment and let me know!

  • Sharing WordPress Content with any PHP App

    Last week I explained how I shared my WordPress content with Hugo. Now, that was all well and good, but is there a way I can do this when I don’t have a tool that understands dynamic content?

    I’ve got PHP so the answer is “You bet your britches, baby!”

    The First Version

    If you just want to grab the URL and output the post content, you can do this:

    $curl = curl_init();
    curl_setopt_array( $curl, array(
    	CURLOPT_FAILONERROR    => true,
    	CURLOPT_CONNECTTIMEOUT => 30,
    	CURLOPT_TIMEOUT        => 60,
    	CURLOPT_FOLLOWLOCATION => false,
    	CURLOPT_MAXREDIRS      => 3,
    	CURLOPT_SSL_VERIFYPEER => false,
    	CURLOPT_RETURNTRANSFER => true,
    	CURLOPT_URL            => 'https://example.com/wp-json/wp/v2/pages/12345'
    ));
    
    $result = curl_exec( $curl );
    curl_close( $curl );
    
    $obj = json_decode( $result );
    
    echo $obj->content->rendered;
    

    Now this does work, and it works well, but it’s a little basic and it doesn’t really sanitize things. Plus I would be placing this code in multiple files per site (not everyone themes as nicely as WordPress, ladies and gentlemen). So I wanted to write something that was more easily repeatable.

    The Advanced Code

    With that in mind, I whipped up a PHP file that checks and validates the URL, makes sure it’s a wp-json URL, makes sure it’s JSON, and then spits out the post content.

    <?php
    
    /* This code shows the content of a WP post or page.
     *
     * To use, pass the variable ?url=FOO
     *
     */
    
    if ( empty( $_GET['url'] ) ) return;
    
    include_once( "StrictUrlValidator.php" );
    
    $this_url = (string) filter_var( $_GET['url'], FILTER_SANITIZE_URL );
    
    if ( strpos( $this_url, 'wp-json' ) === false ) return;
    
    function do_curl ( $url ) {
    	$curl = curl_init();
    
    	curl_setopt_array( $curl, array(
    		CURLOPT_FAILONERROR    => true,
    		CURLOPT_CONNECTTIMEOUT => 30,
    		CURLOPT_TIMEOUT        => 60,
    		CURLOPT_FOLLOWLOCATION => false,
    		CURLOPT_MAXREDIRS      => 3,
    		CURLOPT_SSL_VERIFYPEER => false,
    		CURLOPT_RETURNTRANSFER => true,
    		CURLOPT_URL            => $url
    	) );
    
    	$result = curl_exec( $curl );
    	curl_close( $curl );
    	return $result;
    }
    
    if ( StrictUrlValidator::validate( $this_url, true, true ) === false ) {
    	$return = "ERROR: Bad URL";
    } else {
    	$obj = json_decode( do_curl ( $this_url ) );
    
    	if ( json_last_error() === JSON_ERROR_NONE ) {
    		$return = $obj->content->rendered;
    	} else {
    		$return = "ERROR: Bad JSON";
    	}
    }
    
    echo $return;
    

    You can see I have some complex and some basic checks in there. The URL validation is done via a PHP library called StrictUrlValidator. If I were using WordPress, I’d have access to esc_url() and other nifty things, but since I’m running this out in the wild, I make do with what I have.

  • Sharing WordPress Content with Hugo

    When your life is just WordPress, there’s not a lot of headaches involved in making some posts and updating widgets and keeping your whole site in sync.

    When your life isn’t just WordPress, it gets a little weird.

    When you want to use WordPress to run your life a little more, you un-weird it by making it weirder.

    Rethinking Where the Content Lives

    Normally we think about content living on its own site. Well, I have a ‘message’ that needs to be the same on five different domains. Just work with me here. The point is, if I want to update the header message on the five sites, I have to update five sites. Yuck.

    Now, there are a lot of solutions to this. I decided I wanted one, and only one, place to update the header message. The most obvious is making a static text file that I could update when needed and import/include it in everywhere. But as I started looking into how I do that, my eyes drifted to WordPress.

    What if I used JSON? What if instead of a file, I made a page on WordPress, grabbed the content from https://example.com/wp-json/wp/v2/pages/12345 and parsed that so I didn’t have to do a whole mess of editing anywhere but on WordPress?

    Hugo

    Guess what. That works. And it works extra well for Hugo (a static site generator I’m fond of) because Hugo understands dynamic content. The one drawback is that it can’t live refresh remote data, so I will always have to push a change to the site to trigger this rebuild.

    However if you ever wondered how to include WordPress’ JSON data into a Hugo theme, here’s what I have in my template for utility-bar.html:

    {{ $wordpressURL  := "https://example.com/wp-json/wp/v2/pages/12345" }}
    {{ $wordpressJSON := getJSON $wordpressURL }}
    
    <div class="utility-bar">
    	<div class="wrap">
    		<section id="text-16" class="widget widget_text">
    			<div class="widget-wrap">
    				<div class="textwidget">
    					{{ $wordpressJSON.content.rendered | safeHTML}}
    				</div>
    			</div>
    		</section>
    	</div>
    </div>
    

    The reason safeHTML is there is that otherwise Hugo wants to escape my HTML. Which is a wise choice! Default to not trusting.

    This outputs the post content and I have a happy (enough) day. The more I look at it, the more I realize how much I can do with WordPress and Hugo, since regenerating the site just takes a push of Hugo content.

  • REST API – Democratizing Reading

    Democratize Publishing.

    We say that a lot in WordPress. We boldly state that the goal of WordPress is to democratize publishing through Open Source, GPL software.

    I’d like to turn that on its ear for a moment.

    The purpose of the REST API is to democratize reading.

    Democratize Discovery

    One of the myths of websites is that if you build it, people will come. If that was true, we’d never need plugins like Yoast SEO and we’d never need to self-promote. The reality is that we have to work hard and make our content good. WordPress helps us make it look good, and makes it easier to get it out there, and usually the rest is up to us.

    There’s another aspect to discovery, though, and that is the ease of findability. We already address this in WordPress to a degree with RSS, but as much as I’d love to say that was that, it’s not. It’s 2017. People don’t want RSS, they want email and they want the information on social media immediately.

    And? They want apps on their phones that alert them to new data.

    Democratize Consuming

    Reading data with WordPress is easy. We go to a webpage, we see the data, we consume, we move on. With emails and social media, people usually have to click to come to your site and read. While this is good for your traffic, anytime a person has to click you’re running a risk they might not click. That’s why Tweetstorms are more consumable than blog posts. While they are far more transient, they’re accessible in an immediate way.

    Making the consuming of data more direct, a more one-to-one relationship, improves the chances that someone will actually read the information. Much of this can be achieved with a good lede, but readers are well aware of click-bait today. “She tried this blogging software. You won’t believe what happened next!”

    Right. You get the point. In order to get people to read, you have to make it easier for them to read. Converting WordPress posts to something an app can read needs to be easier for the blogger, or they won’t be able to have success.

    Democratize Reading

    The REST API does just that.

    It makes the discovery and consumption of your website easier by simplifying your output.

    When you look at a website you see the design and the layout and the wonderful beauty. When an app reads your data, however, it doesn’t want or need any of that. An app needs the raw data. And the REST API does that. It outputs the data in a super basic and simple way. Furthermore, it lets you add on to this and output specific data in special ways.
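
    If you’ve never looked at it, a single post from /wp-json/wp/v2/posts comes back roughly like this (heavily trimmed; a real response has many more fields):

    {
      "id": 123,
      "date": "2017-05-01T09:00:00",
      "link": "https://example.com/2017/05/some-post/",
      "title": { "rendered": "Some Post" },
      "content": { "rendered": "<p>The post content.</p>" },
      "excerpt": { "rendered": "<p>The post excerpt.</p>" }
    }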

    Democratize Publishing

    You can publish your data in multiple ways. Today, we all know about the blogging aspect of WordPress. But here are some other ways you could share your data:

    • Allow users to download a CSV with all the products on your site and their costs
    • Create a plugin that calls custom data, allowing others to have a widget with the data
    • Power a mobile app without loading your theme

    Could you do all that without the REST API? Of course.

    Does the REST API make it easier and thus help you democratize faster? Yes. It does.

    The REST API can change your life, if you’d only let it.

  • WP JSON API Challenges

    In my frustration with my static process, I ripped everything out by its roots and decided to take things in a strict order.

    Decision: All-In-One or Separate?

    The first question was do I want to download everything as one big JSON file and split it out later, or do I want to, from the start, download each post as its own file? The time to download shouldn’t change, and if I do it right, I could script it to check if the date stamp on my downloaded file was older than the post date in WP. That means I could have this logic:

    if ( ! post file exists || post file date < post datestamp ) {
        download JSON for this post
    }
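
    A rough sketch of that check in Node, given a post object from the /wp-json/wp/v2/posts listing (this assumes node-fetch for the HTTP call, and the json/ folder is made up):

    var fs = require("fs");
    var fetch = require("node-fetch"); // npm install node-fetch

    function maybeDownload(post) {
        var file = "json/" + post.id + ".json";
        var modified = new Date(post.modified_gmt + "Z"); // WP reports GMT timestamps

        // Skip the download when the local copy is at least as new as the post.
        if (fs.existsSync(file) && fs.statSync(file).mtime >= modified) {
            return Promise.resolve();
        }

        return fetch("http://local.wordpress.dev/wp-json/wp/v2/posts/" + post.id)
            .then(function (res) { return res.json(); })
            .then(function (json) { fs.writeFileSync(file, JSON.stringify(json, null, 2)); });
    }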
    

    That made it an easy win. Let’s split away from the original plan. Now my goal is to extract all the posts from my WP site in JSON format.

    Remember the goal here is to run a static website with WP. There are plugins that convert WP to static HTML, which you can then use to power a site, and this isn’t a bad idea.

    How To: Get all the posts from a WP site via the JSON API?

    This sounded easy. http://local.wordpress.dev//wp-json/wp/v2/posts lists posts, and since the v1 comparison chart says you can list all posts, it stands to reason that I can easily get all of mine.

    In a way, this is a WordPress problem. WP doesn’t want you to show all posts. It would make your site very slow if you had, say, 10,000 posts. Which means that the claim that the JSON API can list all posts is true, it just doesn’t do them all at once.

    I happen to have 40 published posts on my test site, so when I GET with this URL, it loads all of them: http://local.wordpress.dev//wp-json/wp/v2/posts?per_page=100

    But that isn’t sustainable. And the old trick of making per_page equal to -1 didn’t work.

    I did determine that the JSON API actually knows how many posts there are!

    JSON reports X-WP-Total which, in my case is 40

    The header “X-WP-Total: 40” is what I needed. Of course, that is rather weird to get. It means to get all the files, I have to write ‘something’ that does this:

    1. Get http://local.wordpress.dev//wp-json/wp/v2/posts and parse it to get the value of X-WP-Total
    2. Get http://local.wordpress.dev//wp-json/wp/v2/posts?per_page=X-WP-Total
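
    Spelled out as code, those two steps aren’t much. This is a sketch, again assuming node-fetch, and it glosses over the fact that WordPress caps per_page at 100, so past a hundred posts you’d have to page through:

    var fetch = require("node-fetch");

    function getAllPosts() {
        var base = "http://local.wordpress.dev/wp-json/wp/v2/posts";

        // Step 1: grab the first page and read X-WP-Total off the headers.
        return fetch(base).then(function (res) {
            var total = res.headers.get("X-WP-Total");
            // Step 2: ask for that many posts in one request.
            return fetch(base + "?per_page=" + total);
        }).then(function (res) {
            return res.json();
        });
    }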

    Okay. So how?

    Well remember how I mentioned a plugin that made a static site already? I was talking about things like Really Static. Save a post, it saves the JSON file! Why not write this plugin?

    Why not use “Really Static”? Well, I don’t want WP doing anything more than being my content generator. I want to separate content (the blog) from WP. And that means I will end up writing a plugin…

    Am I In Too Deep? Or Am I Over My Head?

    So far, this has been an incredible amount of work. It’s been a lot of two steps forward and three steps back, over and over and over. And yes, it’s been very frustrating. I’ve given up many times and, if my posts about multiple approaches are any indication, I’ve deleted and restarted many, many times.

    And I have to be honest here… I’m giving up on this overly ambitious plan right now.

    This marks two major failures (three if I count how insane the entire ‘use Jekyll/Hugo to run a gallery’ plan was). First, I’ve failed to sensibly convert JSON files to MD in a scriptable and reproducible way. Second, I’ve failed to export JSON from WordPress in a simple way. Both those failures can be worked around. I can build a plugin that exports the JSON. It’s not trivial, but having done similar things I’m sure I can do it. The issue is ‘is it worth it’? And right now I think the answer is no.

    I’m back to the drawing board right now, sketching up new plans and ideas. I want to eliminate as many steps as possible. I’d like to leave WordPress alone, let it be WordPress, and similarly let my static generator be static. These failures have taught me more about the interdependencies than I expected, and I’ve learned more from them than I may have if I’d had outright success from the start.

    While I’m starting over, again, I feel hopeful and less ‘I hate you, JSON’ than I thought I would. I now understand things more clearly and I see how I made things far, far more complicated for myself.

    The future of this project is, weirdly, bright.

  • JSON to Markdown

    Now that I understand how Hugo and JSON work separately and together, it’s time to figure out the next question.

    Can I generate Markdown files based on the contents of a JSON?

    My JSON File

    I started with an SQL DB where my photos have lived for a long time. I exported the albums table to a JSON file (yes, phpMyAdmin can do that), cleaned it up by removing the cruft I didn’t need, and ended up with something like this:

    [
     {
        "categories": "Adventures",
        "tags": "Index",
        "folder": "foobar",
        "title": "Foobar and Baz",
        "desc": "The adventures of Foobar and Baz.",
        "date": "2006-03-24 19:46:40"
      },
      {
        "categories": "Vacations",
        "tags": "Index",
        "folder": "botbob",
        "title": "Bot and Bob",
        "desc": "Bot and Bob's vacation photos",
        "date": "2006-03-24 11:51:04"
      }
    ]
    

    By the way, I’m not going to address how to actually make the gallery with images here. Yet. The eventual idea I have here is to run this with WordPress doing that, but since I was messing around with this anyway, I figured I’d reuse what I had.

    The end goal here, remember, is to export the JSON file from WordPress (list all top level categories, let’s say) and have Hugo generate a page for each one. That’s all. But WordPress has so much data, I decided to start with the smaller, simpler stuff.

    Let’s move on.

    Create a Markdown Page from JSON

    Thankfully someone else has already tossed this idea around. I started by reading Dynamic Pages with Hugo.

    My idea is: Import any JSON or CSV from any local file or URL and make the JSON or CSV content available in a shortcode or directly in the layout files.

    That was very close to what I wanted, but it would result in me having to make a page for each item. I don’t want that. I want a script to make the .md file for me.

    So I stepped back and stopped asking “How do I make a Hugo page from JSON?” and I asked “How do I make a separate file for each entry in a .json file?”

    Sidebar: My wife was trying to convert the Hebrew year (3595) into all the possible other calendars and asked me to help her with the math. I asked her “What Julian year is that?” She said it was BCE so it didn’t work like that, but that it was 167. I Googled “What is 167 BCE in the Chinese calendar?” Boom. Wikipedia’s page for 167 BC lists them all. As I pointed out, the trick to these things is not converting it all yourself, but getting from ‘weird’ to ‘Base 10.’ Once you’re at the default (the lowest common denominator), it’s a lot easier to figure out other people’s weird.

    This is related to my task at hand. Weird is being specific. Weird is caring what system I’m using. The root of my issue is this: I want to make a Markdown file for every entry in JSON.

    I use Mustache. And, yes, someone already had this idea. He was using mustache-render in Grunt, which I’m already familiar with, but I made a few changes.

    1. I changed the templates folder to my Hugo Archetypes folder. This is my source, my style.
    2. I changed the data folder to my Hugo Data folder.
    3. The output went to (you got this one, right?) the Hugo Content folder.

    This should work, right?

    Mustache Render

    My Gruntfile has this:

        mustache_render: {
          all: {
            options: {
              directory: "partials"
            },
            files: [
              {
                data: "../data/parents.json",
                template: "../archetypes/gallery.mustache",
                dest: "../content/gallery.md"
              }
            ]
          }
    

    And my gallery.mustache template looks like this:

    ---
    title: {{title}}
    author: Me
    layout: default
    permalink: /{{folder}}/
    categories: {{categories}}
    tags: {{tags}}
    ---
    
    {{title}}
    

    Except that doesn’t work. If I changed {{title}} to {{0.title}} then it showed the Foobar and Baz adventure information. Obviously the render was only going to show each entry individually. Now, this could work in a variety of situations, just not mine. Alternatively, I can wrap the output in this: {{#0}} ... {{/0}}
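
    That is, something like this (trimmed down to a couple of front matter fields), which only ever renders the first entry:

    {{#0}}
    ---
    title: {{title}}
    permalink: /{{folder}}/
    ---
    {{/0}}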

    But what I want is ‘for each key in json, make a file…’ and since I can put the JSON output right into the gruntfile (data: { greeting: "Hello", target: "world" }) then in theory I want to generate this:

    files: [
      { data: OUTPUT FROM KEY,
        dest: KEY ID },
    ]
    

    Okay so how do I tell Grunt to read from a JSON and for each key, do a thing? First I use grunt.file.readJSON:

    pkg: grunt.file.readJSON('../data/parents.json'),
    
        mustache_render: {
            [...]
                files: [{
                    template: '../archetypes/gallery.mustache',
                    data: '<%= pkg.1 %>',
                    dest: '../content/<%= pkg.1.id %>-foo.md'
                }]
          }
        }
    

    And this will output item 1 (Bot and Bob’s vacations). So stage one is complete. Now I want the 1 in pkg.1 to be iterative or I want to pass a variable to the mustache file and change {{#0}} ... {{/0}} based on its value. Obviously if I had one JSON file for each ‘post’ I wanted to make, this would be easy.

    Multiple Markdown Files from One JSON

    I have no idea.

    Seriously. I spent a few hours on this, lying in bed trying to feel less cold-and-flu-ish. At this point, I don’t really care what I use to do it, but the bones of it is this: Take one JSON file and split each object into its own file.

    Can’t figure that out yet. But hey, there’s my 2016 project!

    FYI: If you reply with ‘try project X!’ without a code example of how to do it, I’ll probably delete your comment. I’ve seen ‘Try handlebars!’ or ‘Try Assemble’ but my issue here is no one has an example of how to do it, and I work best by example. I need that Hello World.

    I am looking at Handlebars and Assemble, by the way, but at 900+ words, it’s time to call a pause on the adventure.