Half-Elf on Tech

Thoughts From a Professional Lesbian

Tag: wordpress

  • GPL Isn’t Protecting You

    GPL Isn’t Protecting You

    Some days I know my plugin reviews are going to wreck me. January has had a lot of complaints from people about aspects of the GPL. Specifically they wanted to know how to protect themselves with the GPL.

    The truth is the GPL is not protecting anything except the right of the next guy to take your code and do stuff with it. And that terrifies people.

    I’m not entertaining a discourse on the merits or legality of the GPL here. Those comments will be deleted. Simply put, a requirement of the WordPress.org repositories is that to be hosted there you must be GPLv2 (or later). At that point, every other argument is moot. Your code has to be GPLv2 to be in the repositories. End of story.

    Okay. So what’s there left to discuss about protecting yourself and your code? Three things: Trademarks, copyright, and theft. Here we go.

    Trademarks

    GPLv2 doesn’t protect your trademark, but that doesn’t mean your trademark isn’t protected. While any image you put in your WordPress theme or plugin has to be given as GPLv2 compatible, that doesn’t void your trademark. A freely offered image that is trademarked (say, the WordPress logo) can be used in your plugin, but it comes with restrictions after all. The inclusion of the SVG of the logo in GPL code doesn’t change that.

    One of the things that changed in GPLv2 and GPLv3 was related to this. Remember, GPLv2 allows all code that does not include any restrictions that were not already in GPLv2. As long as license was as free (or freer) than GPLv2, it was deemed to be GPL-compatible (see the WTFPL). The issue with that is some licenses were very easy to comply with but had clauses like you couldn’t use certain trademarks. This caused confusion, as it was read as a restriction. The thing was that it wasn’t! Regardless of what the license said, you never had permission to use the trademark.

    This is good for companies. You can trademark your logo and, if someone takes it redistributes a fork with the logos still in it, they’ve violated trademark law. And you can protect yourself there. I suggest you read Joomla’s post on the matter of Trademark protection to get a better idea of how it all works.

    Copyright

    Copyrights are another thing that the GPL doesn’t protect. Except it does.

    GPLv2 and GPLv3 are both copyleft:

    To copyleft a program, we first state that it is copyrighted; then we add distribution terms, which are a legal instrument that gives everyone the rights to use, modify, and redistribute the program’s code or any program derived from it but only if the distribution terms are unchanged. Thus, the code and the freedoms become legally inseparable.

    What does that mean? Your copyright is yours. By the act of writing code, you own the copyright (with some exceptions, like if you’re hired to write the code). When you contribute code to an open source project like WordPress, you STILL retain the copyright unless you give it away, but the license is whatever the project’s license is. Most of the time this is fine, but as I recently saw with Hugo, this can be problematic when a project wants to change their license. Hugo had to get permission from every single person who had contributed.

    This is, by the way, why WordPress will probably always be GPLv2.

    One way around this is to require everyone to waive their copyrights in order to contribute. I believe DotNuke did this. Whomever owns the copyright, if the code is still licensed in a way that allows for free distribution then nothing’s really changed. The code is still open.

    Of course, then there’s the jQuery Foundation does with their Individual Contributor License Agreement – In order to contribute to jQuery’s code or website, you have to sign that and provide a valid email. This gives them a way to contact everyone and also makes sure you understand what you signed up for. WordPress just has a checkbox when you submit your code to remind you that you’ve given it up.

    If you’ve ever looked at the jQuery Foundation License, you may have noticed this line:

    You are free to use any jQuery Foundation project in any other project (even commercial projects) as long as the copyright header is left intact.

    This is not imposing a restriction more than GPLv2. See the bit in trademarks. Legally you had to do that anyway, they’re just reminding you not to be a tool and leave this simple line in:

    • Copyright jQuery Foundation and other contributors

    I bark at developers a lot for removing the license headers from javascript files. Don’t do it. You’re violating copyright and, if the original devs complain, you’ll lose your code until you fix it. Which is the point here. Copyright exists beyond GPL, so the fact that it doesn’t actively protect it doesn’t make it not enforceable.

    Theft

    I don’t mean legal here.

    A lot (a lot) of people argue that their plugin should be able to be encrypted or obfuscated to make it ‘harder to steal.’ I hear that about once a week, if not more. And my answer to all of them is “Not if you want to be hosted on WordPress.org.” WordPress.org has an ‘above and beyond’ understanding of the idea of distribution and allowing people to edit. It’s felt that the spirit of GPL means your code should be easy for someone to read and fork.

    I said a dirty thing there, I know. The ‘spirit’ of the GPL is probably causing some of my friends to roll their eyes so hard they’ve got migraines. Sorry about that. But it really is the one time I use it. When I say the ‘spirit’ I mean the intention of the license and it’s application to WordPress.org’s repositories only. Right or wrong, agree or disagree, it’s straightforward. If you want to have your code in the .org repos, it’s gotta be human readable.

    There’s a simple reason for this. The GPL Copyleft is all about freedom and keeping that freedom alive. The Copyleft says that anyone who redistributes the software, with or without changes, must pass along the same freedom to further copy and change it. In order to allow people to change the code, we want it to be human-readable. We want people to be able to look at your code and say “Oh I understand how this works. I will improve it!” When you take away, or overly complicate their ability to do that, we feel you’re intentionally impinging on that freedom. You’re trying to find a way around it, basically.

    About the only time I’ve heard someone not claim they were smushing the code up to protect it from being stolen is when someone has smashed their javascript into a p,a,c,k,e,d() type compression file. I actually hate those files. Javascript is hard enough as is! Stop making it harder. Plus I need to tell you something really important.

    While minifying your javascript will improve a website’s performing by decreasing the load time, it doesn’t make it run any faster for the majority of code out there. Of course there are situations (large libraries or limited devices) where this is not the case, but trust me here. Your 7 line javascript is not going to be significantly faster just because you compressed it. I advocate using the .min version of common libraries, but unless your code is huge, leave it alone and let other people see how to edit it.

    Bonus: Distribution

    GPL comes into play when your code is distributed. If I put my code on my server and never give it to anyone, it’s not been distributed so licenses don’t really matter. As the GPL FAQ explains:

    But if you release the modified version to the public in some way, the GPL requires you to make the modified source code available to the program’s users, under the GPL.

    It’s the big if there. What constitutes distribution? Is your browser downloading a javascript file in order to run my site distribution? Is handing you a zip file distribution?

    I always recommend people play it safe.

  • Mailbag: Why Do Cannonical URLs Mess Me Up?

    Mailbag: Why Do Cannonical URLs Mess Me Up?

    This came from a DreamHoster who said it was okay to blog it if I didn’t mention them by name. They were embarrassed about the mistake.

    Why was my site having a redirect loop, and why did changing the domain to force www in panel fix it?

    The ‘panel’ being mentioned here is DreamHost’s Panel. We don’t have cPanel, we have PANEL, and it’s our special, 100%, tool. In there, you can edit your domain settings. One of the settings has to do with your domain settings. Simply it asks “Do you want the www in your URL?”

    DreamHost Panel: Do you want www or not?

    In general I check “no” because I’m a no-www kind of person. I don’t like it.

    The issue this person was faceing was that you’d go to http://www.wordpress.dev and everything was fine, but his site went all over the place weird when you went to

    In order to fix that person’s website, I changed “Leave it alone” to “Add WWW” (their preference, not mine).

    Why did that work? Let’s understand the issue first. What was happening was that your server generally doesn’t care if you visit http://www.wordpress.dev or http://wordpress.dev because it knows that they both mean /home/user/wordpress/public_html/ (or whatever). But! Some servers (like DreamHost) let you prioritize and redirect. If they don’t, you can use nginx/htaccess to force www or not. As far as the server cares, none of this matters. If you force no-www and someone goes to www, they get redirected and everyone’s happy.

    Except WordPress. WordPress is picky. WordPress has special settings:

    Your home and site URLs in WP

    Recognize that? If you have WordPress set to use www and you go to the non-www URL, it won’t redirect you. It will, however, force all internal links and generated paths to have the www. The same thing happens if you don’t force https for your site. WordPress will still serve http://wordpress.dev but all the links on that page would be to https://wordpress.dev and so will your stylesheets and javascript.

    This may make you think that WordPress doesn’t care. You’d be wrong. WordPress cares deeply, and it shows up when you go a URL like http://wordpress.dev/i-am-fake/ – assuming you don’t have a page named that. If you’ve forced www, WordPress will look for that page and do a 404 redirect. Suddenly you’ll have www!

    Except in some cases, what happens is WordPress accepts the URL you gave it and shows without the www. Then it remembers it should have the www and redirects to that. Only it knows you told it non-www and redirects to that. And you get an endless 301 redirect loop. And you cry.

    If you force www in DreamHost’s Panel and you force it in htaccess and you set it your settings, this still may not be enough! You may have to do a search and replace to change all the non-www urls to www!

    Amusingly, when WordPress 4.4.1 dropped, we found a rare race condition with http and https and WooCommerce, born from a simple mistake.

    Woo lets you force http on post-checkout pages

    Everyone I know who looked at that went “Oh, right, if my site has https in the home and site URLs, maybe I shouldn’t try to force http here!” But Woo lets you shoot yourself in the foot. Or at least they did. There’s a ticket open where I suggested perhaps they prevent that.

    And for what it’s worth, I totally get how these things happen. You set up your domain and your WordPress site and your http/https settings at different times. It’s understandable that in 2013 you didn’t have SSL, but you added it in 2014. And at that time, you only put SSL on your checkout pages (right?) and since you wanted your cache to work better, you said “no https for checkout’ed!” That’s perfectly sane and logical. But since LetsEncrypt and HTTPS Everywhere became big in 2015, you changed the whole site over and simply forgot about that one, teensy, toggle…

    At least, until someone got that horrible 301 redirect on checkout.

    Like I told the people I fixed, don’t kick yourself over this. You have a lot of moving parts and the secret is understanding how each bit works. If you know that you set Panel properly, and .htaccess properly, and site/home URLs properly, and it only happens on checkout, you can zero in on what’s left and debug that.

  • Why CDN Media Needs a Plugin

    Why CDN Media Needs a Plugin

    I wanted to write a blog post for work (something I try to do once a month) so I thought “I should write about using a CDN for images. I wonder if you can do this without a plugin…”

    One hour later, and 1200 words, the answer is no. You cannot.

    My initial goal was to move media files (and only the media) from http://example.com/wp-content/uploads/ to http://static.example.com/wordpress/ on DreamObjects. I’ve written a plugin that does that, but to my frustration, I found you cannot do this without a plugin for one incredibly simple reason.

    But before the big reveal, let’s break down, in simple terms, what I would actually need to do in order to move the uploads. My two assumptions are that I already have a WordPress powered site (check) and I already have DreamObjects (check)

    1. Setup a bucket on DreamObjects (or AWS whatever) to house my images, and give it a CDN alias.
    2. Copy all the images to the bucket.
    3. Change the Upload URL to http://static.example.com/wordpress/ (my CDN alias for the bucket).
    4. Edit all posts and GUIDs to the new URL.

    The thing is, I can actually do all of that!

    Setup a Bucket on DreamObjects

    Log in to your Panel, click on Cloud, go to DreamObjects, and create a new bucket in DreamObjects. I called my domain-static because that’s easy for me to remember. Static content lives here. Once you have your bucket, click on the “Change Settings” link and add an alias of “static” to the domain you’re hosting all this on (example.com in this example).

    You can name your alias anything you want. I just picked static because I knew I wanted my URLs for images to be http://static.example.com/wordpress/2016/01/happynew.jpg and this makes it simple. I like to leave my CDN available for more than just WordPress, and doing this will let me have more folders like wiki or gallery.

    Next, scroll down to “DreamSpeed CDN” and turn on CDN Support. You’ll notice that tells you a special URL for CDN:

    Connect to the accelerated DreamSpeed CDN version of this bucket at: examplecom-uploads.objects-us-east-1.dream.io

    Don’t worry! You can use your custom URL too. You don’t have to, but most of us want to, to feel special. In order to use your custom alias that we just made, scroll up a little and check the box for ‘CDN’ on your alias. Press save and you’re good to go!

    Copy all the Images

    Set the bucket public first!

    Just trust me, here. Okay? Good. Set the bucket public and then upload all the images. I used Transmit. Cyberduck also works.

    Change the Upload URL

    This part was scary.

    We need a moment of history first. Up until WordPress 3.5 this was actually pretty easy to do. You went to your media page and you changed things. As of 3.5, WordPress decided that this was more trouble than it was worth. Too many people were accidentally doing silly things, breaking their sites, and it was too dangerous. So they removed the part of the screen that let you change this.

    I have issues with this, since for what I want to do, it makes it a hassle. But given the flaw in my great plan, I recognize the change as one that protects people who know less than I do.

    Using a plugin to restore this missing setting would be the easiest on many levels. And there are a few plugins that do this but I like Upload Url and Path Enabler. It’s simple and to the point.

    Of course, I love wp-cli and we have it on all our servers at DreamHost, so this command is similarly awesome:

    wp option set upload_url_path http://static.example.com/wordpress

    The last option for this is the secret WordPress page called “All Settings” — This is hands down the most powerful and dangerous page in all of WordPress. It lists all your options and settings that you have in the database. And yes, you can edit many of them in your browser at http://example.com/wp-admin/options.php

    Go ahead. Take a look. It lists a lot of things, most of which you should never, ever, ever, touch. If you were going to use this, search for upload_url_path, change the value from empty to http://static.example.com/wordpress, and press save.

    Edit All Your Posts

    This is a sucky part. You have to edit all your posts now to move the existing images. This is because those changes you made are only for images going forward. Which is really cool and means you didn’t break anything so far. But WordPress hardcodes the paths to images in your posts (for many reasons) so you need to change all of them.

    If you know how to use the command line, this is actually really easy:

    wp search-replace http://example.com/wp-content/uploads/ http://static.example.com/wordpress/

    Run that command and WP-CLI will magically fix all your posts.

    If you don’t want to use the command line, I recommend Velvet Blues Update URLs.

    Done!

    And to be clear on things, this worked perfectly. My media library showed properly, all my images showed, and everything looked perfect. So I decided to run some basic tests and upload new images…. and that’s where my house of cards fell apart.

    You cannot save files to another server.

    Actually You Can… if you sudo

    Now I have to tell you something. I lied. You totally can do this. The problem is that it’s hard and requires mounting your bucket as a local filesystem so that it would be available via the file-path of the server. You see, you could save files to /folder/location/wordpress/ anywhere on the server. I save mine to /home/user/public_html/static/ on one set up, and I have a static domain (domain-static.com) that runs from that folder.

    What that means if I had a location I could access, I could tell WordPress to save there, making it ‘local’ enough to trick it. You can do this with tools like s3fs-fuse, however they’re incredibly complicated and weird for the layman. And that right there was my problem.

    If you’re trying to present something as ‘Without a plugin’ then you don’t want to make the tradeoff be ‘..but with sudo access to a VPS.’ That’s unreasonable.

    And it’s why, right now, if you want to use a CDN for your media in WordPress, use a plugin.

  • Mailbag: How do plugins update?

    From Ken the Web Mechanic comes this:

    Hi! I wonder if you’d have any insight on this… I’ll mention Wordfence, though I imagine that it applies to any security plugin that compares plugin files with those in the WordPress repository… One of the most common things that Wordfence reports as a file “inconsistency” is the readme.txt of many plugins. Frequently, when using Wordfence’s compare feature, the readme of an installed and up-to-date plugin is shown as not the most current version and the repository shows the latest version – which is for the version that is installed!! If you delete the plugin and then download and install it anew from the repository, the readme is for the current, Wordfence has no complaint and all is right with the world… So I guess my question is… When a plugin is updated through WordPress, does the readme.txt file always get updated? It’s always been my understanding that during an update all of the old filed are deleted and then the new version is installed from the repository… It’s seeming like this may not be 100% of the case. It’s certainly only a minor irritation to me… but it makes me curious… Inquiring minds and all of that! 😉 Thanks!

    I put the whole question in because while the crux is “How are these buggers updating?” the whole picture is interesting. I’ve talked about this once before, but it was a very long time ago. How the WordPress Upgrade Works was posted in 2011, and Why does the WordPress background auto-upgrade work? was back in 2013. So it’s about time for a revisit.

    The short answer is “Actually, the readme.txt is deleted on plugin (and theme) upgrades.”

    The longer answer means first we should understand what’s going on with upgrades! How plugins (and themes) update is pretty basic and you can check out /wp-admin/includes/class-wp-upgrader.php to see all this code:

    1. Connect to the filesystem.
    2. Download a package.
    3. Unpack a compressed package file.
    4. Clears the directory where this item is going to be installed into.
    5. Install a package.

    These are all great failsafes, making sure multiple times that we’re not about to leave a user in a bad state. We check to make sure we can download and use the file before deleting and replacing. We check to make sure WordPress can find all the folders it needs, like plugins and wp-content and so on. If it can’t connect to any of the ones it needs, it will fail. We make sure we can download and unzip the file. Even when we look at the complex fourth step, we’re all checking over and over to make sure that when we really do delete these files, we have something to replace them with.

    The goal is that WordPress should never be able to leave you with a plugin deleted and not installed again, nor should it leave you with a half-deleted plugin. Obviously critical failures, like a server reboot mid-stream, will have some catastrophic effects. This is why WordPress tosses a .maintenance file down, mind you. Stops your site from looking like total poop while all this is going on.

    When we look at the class Plugin_Upgrader itself (line 766 or so), we get into some nitty gritty things. The public function upgrade does some interesting things:

    		add_filter('upgrader_pre_install', array($this, 'deactivate_plugin_before_upgrade'), 10, 2);
    		add_filter('upgrader_clear_destination', array($this, 'delete_old_plugin'), 10, 4);
    

    Here we are clearly deactivating and deleting safely before we upgrade. And at this point, I’m very confident that we’re deleting that readme.txt before we upgrade.

    Okay! So why is WordFence being a dillweed?

    I have a theory it’s related to the updates we do when WordPress core has a new version. You see, instead of updating the whole plugin, we update the readmes only to edit the “tested up to…” value. If there’s no code change, after all, why would we bother? No need to push an update! This means you no longer have a plugin that 100% matches what’s on Wordpress.org, you have a plugin where the code files match, but not the readme.txt files.

    And then poor WordFence notices that the readme you have and the readme on .org’s servers is out of whack, and tosses an error.

    I don’t know how WordFence would fix that, but I’m pretty sure this would hang ’em up.

    WordFence folks, got any ideas?

  • Not Ditching ZenPhoto After All

    Not Ditching ZenPhoto After All

    When we last left our heros… We had taken a SQL database, converted it to JSON, split it into 885 separate JSON files, renamed and moved them based on the folder path previously assigned.

    That took a while, but now we’re ready for the fun!

    Convert them to MD

    I already know how to do this one. Run grunt-mustache-render!

    One problem. It didn’t like recursive folders. Pause for laughter. Thankfully I had done a cp and not an mv on my files with Mike’s script, so I just changed that to handle moving .md files instead of .json and it all went fine.

    Turn on Hugo

    I had a theme already so I copied it over and then started making my changes. I needed the layout to be a little different.

    First I needed a list of all the sections and files and I needed to grab all the items that were an ‘Index’ file and format them specially:

    <div id="albums">
    {{ range $section, $taxonomy := .Site.Sections }}{{ range $taxonomy.Pages }}{{if eq .Type "index" }}
    		<div class="indexalbum">
    			<div class="thumb"><a href="{{ .Permalink}}" title="View album: {{ replace $section "-" " " | title }}">IMAGE HERE</a></div>
    			<div class="albumdesc"><h3><a href="{{ .Permalink}}" title="View album: {{ replace $section "-" " " | title }}">{{ replace $section "-" " " | title }}</a></h3>
                {{ with .Params.desc }} {{ . | safeHTML}} {{ end }}</div>
    			<p style="clear: both; "></p>
    		</div>
    {{ end }}{{ end }}{{ end }}
    </div>
    

    I felt very pleased with myself at this point. Obviously I would need to replace “IMAGE HERE” with an image, but I had the basic idea down.

    Inside each post, there was a shortcode mentioned before, {{< gallery >}}, and this is where my drama started. What I wanted I thought would be (fairly) easy. Get a list of all the folders where this index.md file was and list them similar to what I had done before. They were sub-sections (in Hugo Terms), but as it happens, Hugo doesn’t have a way to handle those! There are no templates for subsections. Now this in and of itself wasn’t a deal breaker. I could made do with that shortcode except…

    Limitations

    The thing I really do like about ZenPhoto is that I can drop my new media in a folder and magically knows what’s up. Add a new folder? It knows it’s there. With Hugo (or Jekyll) I found it was incredible hard to do something as ‘simple’ as getting a list of all the items in a folder. With Jekyll I’d need a plugin (which I am certainly not adverse to). Hugo has a readDir call, but it’s limited to the working directory and I wanted to read the images in a CDN folder (on the same server, but still).

    Perhaps ironically, it’s easier to run a gallery with Hugo and Jekyll if you host the images off your own domain. It goes against my goal of having a self-hosted site, though. But for now, it’s not possibly to do what I wanted.

    Do I feel like I wasted time?

    No! In fact, I’ve learned a lot of things that tell me where the foibles are in my plans.

    ustwo actually did make a React powered site and in looking at their source code I see they did it by making the site ‘all one page.’ This is something I can understand today and wrap my head around. Because the alternative would be something like this:

    1. Write a post
    2. Run a script to grab all the pages via the JSON API as individual .json files
    3. Run a script to convert them into .md
    4. Build static site (test and then deploy)

    It’s not actually that horrible when I put it that way. It means everyone could write on WordPress and, when they’re ready to deploy, I just run my script, push to my dev site, validate, and push to live.

    But. That loses a lot of what makes WordPress cool. Editing posts on the fly, pushing live updates, and scheduling posts all becomes a nightmare. For my Gallery idea, it means that WordPress remains a viable option except for the fact that I can’t easily move from ZenPhoto to WordPress and WordPress behaves like a blog when I don’t want it to be one.

    I’m all about using WordPress for things other than blogging but the sad truth is it’s still structured like a blog. Custom Post Types are doing a great job, and a lot of what I do on my estore site is similar to how I’d want to handle a gallery. I could do posts as posts and organize them as categories, for example, and structure the whole theme to handle that…

    Or I could just stay on ZenPhoto, which does all that out of the box.

    And yes. That’s what I’m doing.

  • Ditching ZenPhoto for Hugo?

    Ditching ZenPhoto for Hugo?

    This started out as a 230 word bullet point list of how to do things. It’s now two entire posts!

    (Almost) Leaving ZenPhoto

    I have to preface this by telling you that I didn’t in the end. Yeah. I know. I planned to. And here’s why:

    I like ZenPhoto a lot. It’s a great Gallery tool, and after Gallery2 went to Gallery3 and I hated it, I was quite in love with them. But. They’ve started to inch towards being a general CMS and the whole ZenPhoto/ZenPhoto20 split left a weird taste in my mouth.

    Plus the DB I have is already 65 megs, which has started to behave a little oddly. Now I have 38,534 images in 885 albums, and to a degree that makes sense. If you think about it in WordPress terms, I have 885 taxonomies (tags and categories), and 38,534 posts. Given that each post in this example would be a sentence at most (describing the image), the WordPress DB would be around 15megs based on some napkin math I did. Validating this, there’s a WordPress Joke DB with 40k Jokes and it’s only 14 megs.

    Oh and did I mention that ZenPhoto has no export feature? I guess there’s a reason the first PHP script I wrote was a simple Gallery. All I really want is to display images in a structured format.

    But. As I said before, I did not leave ZenPhoto! Why not? Because I was unable to do what I wanted to in pure Hugo, and because porting everything over to WordPress is incredibly hard and complex.

    Why Not WordPress?

    WordPress has a lot of awesome advantages. It auto generates thumbnails for one, which Hugo does not, so porting the gallery over would save me one level of complication. But I have 38,534 images to import. No matter how you slice that, I’d have to import them and associate them with the right post. Probably manually.

    With a static generator, I can very easily assign a variable for each album to say ‘get images from this folder and display them.’ And if there are folders, list each folder as a link to another page. By contrast, I would have to use a plugin like NextGen gallery to do this in WordPress, and while I do like the plugin, it’s got the same issue that ZenPhoto has for me. It’s doing too much. And, to be honest, part of this process is to limit my potential security issues. Adding in really big, really complex plugins is the opposite of that, no matter how rigorously reviewed they are.

    Besides, if I hated it, I knew I could import MD files into WordPress. Someone already did it with Jekyll after all.

    Get and Grep the Data

    No matter what I needed the data. Thankfully I knew how to do this with phpMyAdmin already. I dropped the _albums table in JSON format. Why JSON? Well as I’d already learned, I could generate a Markdown file from a JSON pretty easily. Of course, I got stumped on having it loop through the JSON file, but I have a plan for this.

    On export, albums.json was 643 KB while images.json was 61.4 MB. And they were all one line. I’m not using images.json just yet.

    I tossed it into BBedit and scrubbed the data, removing everything I didn’t need. This did not change the file size at first, since I also made the JSON human readable. There was a lot of gripping here, but I took each entry from 16-22 items down to six: tags, categories, folder path, title, description, date. And the file went down to 143 KB. One-sixth, more or less.

    Finally I had to split each entry because, as I’d learned, trying to loop through a nested JSON array in grunt is complicated. So fine, I’ll do it manually. But how? Answer: By loving linux. You can split files!

    $ split -a 6 -l 8 ../albums.json album-
    

    There’s also csplit, which is contextual split. Either way, I now have 885 album-xxx.json files. Whew.

    Rename and Move the Files

    I messed around with linux commands to move files based on their folder paths and rename them based on the last variable. I dreaded having to just move them and then going to mess with the renames. So after two days, I asked Twitter and was saved by Mike Little and Kailey Lampert. Between them I learned that there was a jq tool for messing with JSON (homebrew install it) and that a shell script is a godsend.

    #! /bin/sh
    
    # awk script:
    # set field seperator to :
    # strip comma from field 2
    # strip quotes from field 2
    # strip lead space from field 2
    # strip trail space from field 2
    # print field 2
    
    for FILE in `ls *.json` ; do
        NAME=`cat $FILE | grep folder | awk 'BEGIN { FS= ":" } { 
            gsub( ",", "", $2 );   \
            gsub( "\"", "", $2 );  \
            gsub(/^[ \t]+/,"",$2); \
            gsub(/[ \t]+$/,"",$2); \
            print $2 
        }'`
        
        mkdir -p $NAME;
        cp "$FILE" "$NAME/index.json"
    done
    

    I ended up changing it to reduce future broken URLs. But basically everything except the mkdir comes from Mike. He’s my hero.

    To Be Continued

    Once I had all the files, it was time to change them all to Markdown files and whip out my templates!

    But you’ll need to wait for that because this post got really long.