Half-Elf on Tech

Thoughts From a Professional Lesbian

Category: How To

    Editor Sidebar Madness and Gutenberg

    Way back when WP was simple and adding a sidebar to the editor meant a simple metabox, I had a very straightforward setup: a box that, on page load, would tell you if the data in the post matched the remote API, and if not, what to update.

    My plan was to have it update on refresh, and then auto-correct if you press a button (because sometimes the API would be wrong, or grab the wrong account — loose searching on people’s names is always rough). But my plan was ripped asunder by this new editor thingy, Gutenberg.

    I quickly ported over my simple solution and added a note “This does not refresh on page save, sorry.” and moved on.

    Years later brings us to 2024, with November being my ‘funemployment’ month, when I worked on little things to keep myself sharp before I started at AwesomeMotive. Most of the work was fixing security issues, moving the plugin into the theme so there was less to manage, modernizing processes, upgrading libraries, and so on.

    But one of those things was also making a real Gutenbergized sidebar that autoupdates (mostly).

    What Are We Doing?

    On LezWatch.TV, we collect actor information that is public and use it to generate our pages. So if you wanted to add in an actor, you put in their name, a bio, an image, and then all this extra data like websites, social media, birthdates, and so on. WikiData actually uses us to help determine gender and sexuality, so we pride ourselves on being accurate and regularly updated.

    In return, we use WikiData to help ensure we’re showing the data for the right person! We do that via a simple search based on either their WikiData ID (QID), IMDb ID, or their name. The last one is pretty loose since actors can have the same name now (oh for the days when SAG didn’t allow that…). We use the QID to override the search in cases where it grabs the wrong person.

    I built a CLI command that, once a week, checks actors for data validity. It makes sure the IMDb IDs and socials are formatted properly, it makes sure the dates are valid, and it pings WikiData to make sure the birth/death etc data is also correct.

    With that already in place, all I needed was to call it.

    You Need an API

    The first thing you need to know is that Gutenberg uses the JSON API to pull in data. You can have it pull in everything via custom post meta, but as I already have a CLI tool run by cron to generate that information, making a custom API call was actually going to be faster.

    I went ahead and made it work in a few different ways (you can call it by IMDb ID, post ID, QID, and the slug) because I planned for the future. But really all any of them are doing is a search like this:

    	/**
    	 * Get Wikidata by Post ID
    	 *
    	 * @param int $post_id
    	 * @return array
    	 */
    	private function get_wikidata_by_post_id( $post_id ): array {
    		if ( get_post_type( $post_id ) !== 'post_type_actors' ) {
    			return array(
    				'error' => 'Invalid post ID',
    			);
    		}
    
    		$wikidata = ( new Debug_Actors() )->check_actors_wikidata( $post_id );
    
    		return array( $wikidata );
    	}
    

    The return array is a list of the data we check for, and each item is either the string ‘matches’ (or true), or an array with WikiData’s value and our value.
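
    For completeness, here’s a minimal sketch of how such an endpoint might be registered (the namespace and path match the fetch call later in this post, but the callback wiring and the wrapper function name are illustrative, not our exact code):

    	add_action( 'rest_api_init', function () {
    		register_rest_route( 'lwtv/v1', '/wikidata/(?P<id>\d+)', array(
    			'methods'             => 'GET',
    			// Hypothetical wrapper around the private method shown above.
    			'callback'            => function ( WP_REST_Request $request ) {
    				return rest_ensure_response( lwtv_get_wikidata_by_post_id( (int) $request['id'] ) );
    			},
    			'permission_callback' => '__return_true', // Public data; tighten as needed.
    		) );
    	} );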

    Making a Sidebar

    Since we have our API already, we can jump to making a sidebar. Traditionally in Gutenberg, we make a sidebar panel for the block we’re adding in. If you want a custom panel, you can add in one with an icon on the Publish Bar:

    A screenshot of the Gutenberg Publish bar, with the Jetpack and YoastSEO icons

    While that’s great and all, I wanted this to be on the side by default for the actor, like Categories and Tags. Since YoastSEO (among others) can do this, I knew it had to be possible:

    Screenshot of the Gutenberg Sidebar, with YoastSEO's custom example.

    But when I started to search around, all anyone told me was how I had to use a block to make that show.

    I knew it was bullshit.

    Making a Sidebar – The Basics

    The secret sauce I was looking for is decidedly simple.

    	const MetadataPanel = () => (
    		<PluginDocumentSettingPanel
    			name="lwtv-wikidata-panel"
    			title="WikiData Checker"
    			className="lwtv-wikidata-panel"
    		>
    			<PanelRow>
    				<div>
    					[PANEL STUFF HERE]
    				</div>
    			</PanelRow>
    		</PluginDocumentSettingPanel>
    	);
    

    I knew about PanelRow but finding PluginDocumentSettingPanel took me far longer than it should have! The documentation doesn’t actually tell you ‘You can use this to make a panel on the Document settings!’ but it is obvious once you’ve done it.
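
    The other missing piece, for anyone following along: the component has to be registered as a plugin before the editor will render it. A minimal sketch (the plugin slug is mine, and note that newer WordPress exports PluginDocumentSettingPanel from @wordpress/editor rather than @wordpress/edit-post):

    	import { registerPlugin } from '@wordpress/plugins';
    	import { PluginDocumentSettingPanel } from '@wordpress/edit-post';
    	import { PanelRow } from '@wordpress/components';

    	// Register the panel so it renders alongside Status, Categories, Tags, etc.
    	registerPlugin( 'lwtv-wikidata', {
    		render: MetadataPanel,
    	} );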

    Making it Refresh

    This is a pared down version of the code, which I will link to at the end.

    The short and simple way is I’m using useEffect to refresh:

    	useEffect(() => {
    		if (
    			postId &&
    			postType === 'post_type_actors' &&
    			postStatus !== 'auto-draft'
    		) {
    			const fetchData = async () => {
    				setIsLoading(true);
    				try {
    					const response = await fetch(
    						`${siteURL}/wp-json/lwtv/v1/wikidata/${postId}`
    					);
    					if (!response.ok) {
    						throw new Error(
    							`HTTP error! status: ${response.status}`
    						);
    					}
    					const data = await response.json();
    					setApiData(data);
    					setError(null);
    				} catch (err) {
    					setError(err.message);
    					setApiData(null);
    				} finally {
    					setIsLoading(false);
    				}
    			};
    			fetchData();
    		}
    	}, [postId, postType, postStatus, siteURL, refreshCounter]);
    

    The reason I’m checking post type and status is that I don’t want to run this if it’s not an actor, or if it’s not at least a real draft.

    The constants are as follows:

    	const [apiData, setApiData] = useState(null);
    	const [isLoading, setIsLoading] = useState(true);
    	const [error, setError] = useState(null);
    
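    For reference, postId, postType, and postStatus come from the editor’s data store. Here’s a sketch, assuming the standard core/editor selectors:

    	import { useSelect } from '@wordpress/data';

    	// Pull the current post’s ID, type, and status from the editor store.
    	const { postId, postType, postStatus } = useSelect(
    		( select ) => ( {
    			postId: select( 'core/editor' ).getCurrentPostId(),
    			postType: select( 'core/editor' ).getCurrentPostType(),
    			postStatus: select( 'core/editor' ).getEditedPostAttribute( 'status' ),
    		} ),
    		[]
    	);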

    Right below this I have a second check:

    	if (postType !== 'post_type_actors') {
    		return null;
    	}
    

    That simply prevents the rest of the code from trying to run. You have to put it after the useEffect because React requires hooks to be called in the same order on every render; an early return before the hook breaks the rules of hooks, so it fails to pass a lint (and I enforce linting on this project).

    How it works: on page load of an auto-draft, it tells you to save the post before it will check. As soon as you do save the post (with a title), it refreshes and tells you what it found, speeding up initial data entry!

    But then there’s the issue of refreshing on demand.

    HeartBeat Flatline – Use a Button

    I did, at one point, have a functioning heartbeat checker. That can get pretty expensive and it calls the API too many times if you leave a window open. Instead, I made a button that uses a constant:

    const [refreshCounter, setRefreshCounter] = useState(0);
    

    and a handler:

    	const handleRefresh = () => {
    		setRefreshCounter((prevCounter) => prevCounter + 1);
    	};
    

    Then the button itself:

    <Button
    	variant="secondary"
    	onClick={handleRefresh}
    	isBusy={isLoading}
    >
    	{isLoading ? 'Refreshing...' : 'Refresh'}
    </Button>
    

    Works like a champ.

    Output the Data

    The data output is the interesting bit, because I’m still not fully satisfied with how it looks.

    I set up a filter to process the raw data:

    	const filteredPersonData = (personData) => {
    		const filteredEntries = Object.entries(personData).filter(
    			([key, value]) => {
    				const lowerCaseValue = String(value).toLowerCase();
    				return (
    					lowerCaseValue !== 'match' &&
    					lowerCaseValue !== 'n/a' &&
    					!['wikidata', 'id', 'name'].includes(key.toLowerCase())
    				);
    			}
    		);
    		return Object.fromEntries(filteredEntries);
    	};
    

    The API returns the WikiData ID, the post ID, and the name, none of which need to be checked here, so I filter them out. (The capitalizing, so everything looks grown up, happens later in the output.)

    Then there’s a massive amount of code in the panel itself:

    <div>
    	{isLoading && <Spinner />}
    	{error && <p>Error: {error}</p>}
    	{!isLoading && !error && apiData && (
    		<>
    			{apiData.map((item) => {
    				const [key, personData] = Object.entries(item)[0];
    				const filteredData = filteredPersonData(personData);
    				return (
    					<div key={key}>
    						<h3>{personData.name}</h3>
    						{Object.keys(filteredData).length === 0 ? (
    							<p>[All data matches]</p>
    						) : (
    							<div>
    								{Object.entries(filteredData).map(([subKey, value]) => (
    									<div key={subKey}>
    										<h4>{subKey}</h4>
    										{value ? (
    											<ul>
    												{Object.entries(value).map(([innerKey, innerValue]) => (
    													<li key={innerKey}>
    														<strong>{innerKey}</strong>:{' '}
    														<code>{innerValue || 'empty'}</code>
    													</li>
    												))}
    											</ul>
    										) : (
    											'empty'
    										)}
    									</div>
    								))}
    							</div>
    						)}
    					</div>
    				);
    			})}
    		</>
    	)}

    	{!isLoading && !error && !apiData && (
    		<p>No data found for this post.</p>
    	)}
    	<Button />
    </div>
    

    <Spinner /> is from '@wordpress/components' and is a default component.

    Now, innerKey is actually not a simple output. I wanted to capitalize the first letter and, unlike PHP, JavaScript has no ucfirst() function, so it looks like this:

    {innerKey.charAt( 0 ).toUpperCase() + innerKey.slice( 1 )}
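
    If you use that in more than one place, a tiny helper (hypothetical name, not in the shipped block) keeps it readable:

    	// A ucfirst() stand-in for JavaScript.
    	const ucFirst = ( str ) => str.charAt( 0 ).toUpperCase() + str.slice( 1 );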

    Sometimes JavaScript makes me want to drink.

    The Whole Code

    You can find the whole block, with some extra bits I didn’t mention but I do for quality of life, on our GitHub repo for LezWatch.TV. We use the @wordpress/scripts tooling to generate the blocks.

    The source code is located in folders within /src/ – that’s where most (if not all) of your work will happen. Each new block gets a folder and in each folder there must be a block.json file that stores all the metadata. Read Metadata in block.json if this is your first rodeo.

    The blocks will automagically build anytime anyone runs npm run build from the main folder. You can also run npm run build from the blocks folder.
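
    If you haven’t used @wordpress/scripts before, those npm scripts are thin wrappers around wp-scripts. A typical package.json sketch (ours may differ slightly):

    	{
    		"scripts": {
    			"build": "wp-scripts build",
    			"start": "wp-scripts start"
    		}
    	}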

    All JS and CSS from blocks defined in blocks/*/block.json get pushed to the blocks/build/ folder via the build process. PHP scans this directory and registers blocks in php/class-blocks.php. The overall code is called from the /blocks/src/blocks.php file.

    The build subfolders are NOT stored in Git, because they don’t need to be. We run the build via actions on deploy.

    What It Looks Like

    Gif showing how it auto-loads the data on save.

    One of the things I want to do is have a way to say “use WikiData” or “use ours” to fill in each individual data point. Sadly sometimes it gets confused and uses the wrong person (there’s a Katherine with an E Hepburn!) so we do have a QID override, but even so there can be incorrect data.

    WikiData often lists socials and websites that are defunct. Mostly that’s X these days.

    Takeaways

    It’s a little frustrating that I either have to do a complex ‘normal’ custom meta box with a lot of extra JS, or make an API. Since I already had the API, it’s no big, but sometimes I wish Gutenberg was a little more obvious with refreshing.

    Also finding the right component to use for the sidebar panel was absolutely maddening. Every single document was about doing it with a block, and we weren’t adding blocks.

    Finally, errors in Javascript remain the worst. Because I’m compiling code for Gutenberg, I have to hunt down the likely culprit, which is hard when you’re still newish to the code! Thankfully, JJ from XWP was an angel and taught me tons in my 2 years there. I adore her.

    Small Hugo, Big Images

    In working on my Hugo powered gallery, I ran into some interesting issues, one of which was from my theme.

    I use a Bootstrap powered theme called Hinode. And Hinode is incredibly powerful, but it’s also very complicated and confusing, as Hugo’s documentation is still in the alpha stage. It’s like the early days of other web apps, which means a lot of what I’m trying to do is trial and error. Don’t ask me when I learned about errorf, okay?

    My primary issues are all about images: sizing and filing them.

    Image Sizes

    When you make a gallery, logically you want to save the large image as the zoom in, right? Click to embiggen. The problem is, in Hinode, you can load an image in a few ways:

    1. Use the standard old img tag
    2. Call the default Hugo shortcode of {{< figure >}}
    3. Call a Hinode shortcode of {{< image >}}
    4. Use a partial

    Now, that last one is a little weird, but basically you can’t use a shortcode inside a theme file. While WordPress has a do_shortcode() method, in Hugo you use partial calls. And you have to know not only the exact file, but whether your theme even uses partials! Some don’t, and you’re left reconstructing the whole thing.

    Hinode has the shortcodes in partials and I love them for it! To call an image using the partial, it looks like this:

    	{{- partial "assets/image.html" (dict
    		"url" $imgsrc
    		"ratio" "1x1"
    		"wrapper" "mx-auto"
    		"title" $title)
    	-}}
    

    That call will generate webp versions of my image, saved to the static image folder (which is a post of its own), and have the source sets so it’s handy and responsive.

    What it isn’t is resized. Meaning if I used that code, I would end up with the actual huge ass image used. Now, imagine I have a gallery with 30 images. That’s 30 big ass images. Not good. Not good for speed, not good for anyone.

    I ended up making my own version of assets/image.html (called lightbox-image.html) and in there I have this code:

    	{{/* $image and $imagesrc are declared earlier in the partial with := */}}
    	{{ with resources.Get $imagefile }}
    		{{ $image    = .Fill "250x250" }}
    		{{ $imagesrc = $image.RelPermalink }}
    	{{ end }}
    

    If the file is local, which is what that get call is doing, it uses the file ($imagefile is the ‘path’ to the file) to make a 250×250 sized version and then grabs that new permalink to use.

    	{{ if $imagefile }}
    		<img src="{{ $imagesrc }}" class="img-fluid img-lightbox" alt="{{ $title }}" data-toggle="lightbox" data-gallery="gallery" data-src="{{ $imagefile }}">
    	{{ end }}
    

    Boom!

    This skips over all the responsive resizing, but then again I don’t need that when I’m making a gallery, do I?

    Remote Image Sizes

    Now let’s add in a wrinkle. What if it’s a remote image? What if I passed a URL of a remote image? For this, you need to know that on build, Hinode will download the image locally. Local images load faster. I can’t use the same get, I need the remote get, but now I have a new issue!

    Where are the images saved? In the img folder. No subfolders, the one folder. And I have hundreds of images to add.

    Mathematically speaking, you can put about four billion files in a folder before it’s an issue for the computers. But if you’ve ever tried to find a specific file to check in a folder that large, you’ve seriously reconsidered your career trajectory. And practically speaking, the more files, the slower the processing.

    Anyone else remember when GoDaddy announced a maximum of 1024 files in a folder on their shared hosting? While I question the long term efficacy of that, I do try to limit my files. I know that the get/remote get calls will tack on a randomized name at the end, but I’d like them to be organized.

    Since I’m calling all my files from my assets server (assets.example.com), I can organize them there and replicate that in my build. And my method to do that is as follows:

    	{{ if eq $image "" }}
    		{{- $imageurl = . | absURL -}}
    		{{- $imagesrc = . | absURL -}}

    		{{ $dir := (urls.Parse $imageurl).Path }}

    		{{ with resources.GetRemote $imageurl | resources.Copy $dir }}
    			{{ with .Err }}
    				{{ warnf "%s" . }}
    			{{ else }}
    				{{ $image    = . }}
    				{{ $imageurl = $image.Permalink }}
    				{{ $image    = $image.Fill "250x250" }}
    				{{ $imagesrc = $image.RelPermalink }}
    			{{ end }}
    		{{ end }}
    	{{ end }}
    

    I know that shit is weird. It pairs with the earlier code: if the image variable was never created, then you know the image wasn’t local. So I start by getting the image URL from ‘this’ (that’s what the period is) as an absolute URL. Then I use the path of the URL to generate my local folder path! When I use the Copy command with the pipe, it will automatically use that as the destination.
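
    To make that concrete, here’s a worked example (the file name is hypothetical, but my real assets do live on assets.example.com):

    	{{/* For https://assets.example.com/gallery/cats/fluffy.jpg,
    	     (urls.Parse $imageurl).Path is /gallery/cats/fluffy.jpg,
    	     so resources.Copy saves the download under that same path. */}}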

    Conclusion

    You can totally make images that are resized in Hugo. While I wish that was easier to do, most people aren’t as worried as I am about storing the images on their repository, so it’s less of an issue. Also galleries on Hugo are fairly rare.

    I’m starting some work now with Hinode for better gallery-esque support, and we’ll see where that goes. Maybe I’ll get a patch in there!

    Hugo and a Lot of Images

    One issue with Hugo is that the way I’m deploying it is via Github actions, which means every time I want to update, the site has to be totally rebuilt. Now the primary flaw with that process is that when Hugo builds a lot of images, it takes a lot of time. About 8 minutes.

    The reason Hugo takes this long is that every time it runs its builds, it regenerates all the images and resizes them. This is not a bad thing, since Hugo smartly caches everything in the /resources/_gen/ folder, which is not sync’d to Github, and when you run builds locally it doesn’t take half as long.

    Now, this speed is about the same whether the images are local (as in, stored in the repository) or remote (which is where mine are located – assets.example.com), because regardless it has to build the resized images. This only runs on a build, since it’s only needed for a build. Once the content is on the server, it’s unnecessary.

    The obvious solution to my speed issues would be to include the folder in Github, only I don’t want to store any images on Github if I can help it (legal reasons: if there’s a DMCA it’s easier to nuke them from my own storage). The less obvious solution is how we got here.

    The Basic Solution

    Here’s your overview:

    1. Checkout the repo
    2. Install Hugo
    3. Run the repo installer (all the dependencies etc)
    4. Copy the files from ‘wherever’ to the Git resource
    5. Run the build (which will use what’s in the resource folder to speed it up)
    6. Copy the resources folder content back down to the server

    This would allow me to have a ‘source of truth’ and update it as I push code.

    The Setup

    To start with, I had to decide where to upload the content. The folder is (right now) about 500 megs, and that’s only going to get bigger. Thankfully I have a big VPS and I was previously hosting around 30 gigs there, so I’m pretty sure this will be okay.

    But the ‘where’ specifics needed a little more than that. I went with a subdomain like secretplace.example.com and in there is a folder called /resources/_gen/

    Next, what do I want to upload for starters? I went with only uploading the static CSS files, because my plan involves pushing things back down after I re-run the build.

    Then comes the downloading. Did you know that there’s nearly no documentation about how to rsync from a remote source to your Github Action instance? It doesn’t help that the words are all pretty generic, and search engines think “Oh you want to know about rsync and a Github Action? You must want to sync from your action to your server!” No, thank you, I wanted the opposite.

    While there’s a nifty wrapper for syncing over SSH for Github, it only works one way. In order to do it the other way, you have to understand the actual issue that action is solving. The SSH-sync isn’t solving rsync at all; that’s baked into the action image (assuming you’re using ubuntu…). No, what the action solves is the mishegas of adding in your SSH details (the key, the known hosts, etc).

    I could use that action to copy back down to the server, but if you’re going to have to solve the issue once, you may as well use it all the time. Once that’s solved, the easy part begins.

    Your Actions

    Once we’ve understood where we’re going, we can start to get there.

    I’ve set this up in my ci.yml, which runs on everything except production, and it’s a requirement for a PR to pass it before it can be merged into production. I could skip it (as admin) but I try very hard not to, so I can always confirm my code will actually push and not error when I run it.

    name: 'Preflight Checks'
    
    on:
      push:
        branches-ignore:
          - production   # excludes production.
    
    concurrency:
      group: ${{ github.ref }}-ci
      cancel-in-progress: true
    
    jobs:
      preflight-checks:
        runs-on: ubuntu-latest
    
        steps:
          - name: Do a git checkout including submodules
            uses: actions/checkout@v4
            with:
              submodules: true
    
          - name: Install SSH Key
            uses: shimataro/ssh-key-action@v2
            with:
              key: ${{ secrets.SERVER_SSH_KEY }}
              known_hosts: unnecessary
    
          - name: Adding Known Hosts
            run: ssh-keyscan -H ${{ secrets.REMOTE_HOST }} >> ~/.ssh/known_hosts
    
          - name: Setup Hugo
            uses: peaceiris/actions-hugo@v3
            with:
              hugo-version: 'latest'
              extended: true
    
          - name: Setup Node and Install
            uses: actions/setup-node@v4
            with:
              node-version-file: '.nvmrc'
              cache: 'npm'
    
          - name: Install Dependencies
            run: npm install && npm run mod:update
    
          - name: Lint
            run: npm run lint
    
          - name: Make Resources Folder locally
            run: mkdir resources
    
          - name: Download resources from server
            run: rsync -rlgoDzvc -i ${{ secrets.REMOTE_USER }}@${{ secrets.REMOTE_HOST }}:/home/${{ secrets.REMOTE_USER }}/${{ secrets.HUGO_RESOURCES_URL }}/ resources/
    
          - name: Test site
            run: npm run tests
    
          - name: Copy back down all the regenerated resources
            run: rsync -rlgoDzvc -i --delete resources/ ${{ secrets.REMOTE_USER }}@${{ secrets.REMOTE_HOST }}:/home/${{ secrets.REMOTE_USER }}/${{ secrets.HUGO_RESOURCES_URL }}/
    

    Obviously this is geared towards Hugo. My command npm run tests is a home-grown command that runs a build and then some tests on said build. It’s separate from the linting, which comes with my theme. Because it’s running a build, this is where I can make use of my pre-built resources.

    You may notice I set known_hosts to ‘unnecessary’ — this is a lie. They’re totally needed but I had a devil of a time making it work at all, so I followed the advice from Zell, who had a similar headache, and put in the ssh-keyscan command.

    When I run my deploy action, it only runs the build (no tests), but it also copies down the resources folder to speed it up. I only copy it back up on testing for the teeny speed boost.

    Results

    Before all this, my builds took 8 to 9 minutes.

    After, they took 1 to 2, which is way better. Originally I only had it down to 4 minutes, but I was using wget to test things (and that’s generally not a great idea — it’s slow). Once I switched to rsync, it’s incredibly fast. The Hugo build is still the slowest part, but it’s around 90 seconds.

    Looping Links With The WP HTML Processor

    Here’s your backstory.

    You need to search all the links in a post and, if the link is to a specific site (wikipedia.com), you want to add it to an array you output at the bottom of your post as citations. To do this, you will:

    1. Search for tags for every single link (<a href=.... )
    2. If the link contains our term (Wikipedia), put it in the array
    3. If the link also has a title, we’ll use that

    If we do all that, our output looks something like this:

    • Source: https://en.wikipedia.com/wiki/foobar
    • Source: Foobar2

    While you can do this with regex, you can also use the (new) HTML Processor class to do it for you.

    RegEx

    As I mentioned, you can do this with regex (I’ll spare you the drama of coming up with this in the first place):

    $citations = array();

    preg_match_all( '#<\s*a[^>]*href="([^"]+)"[^>]*>.*?<\s*/\s*a>#', get_the_content(), $matches );

    // Loop through the matches:
    foreach ( $matches[0] as $i => $match ) {

        // If the URL contains Wikipedia, we'll process:
        if ( str_contains( $match, 'wikipedia.com' ) ) {

            // Build initial data:
            $current_citation = [
                'url'   => $matches[1][ $i ],
                'title' => $matches[1][ $i ],
            ];

            // If there's a title, use it.
            if ( str_contains( $match, 'title=' ) ) {
                preg_match( '#<\s*a[^>]*title="([^"]+)"[^>]*>.*?<\s*/\s*a>#', $match, $title_matches );
                $current_citation['title'] = ( ! empty( $title_matches[1] ) ) ? $title_matches[1] : $current_citation['title'];
            }

            // Add the citation inside the check, so we only collect Wikipedia links:
            $citations[] = $current_citation;
        }
    }
    
    ?>
    <ol>
        <?php foreach ( $citations as $citation ): ?>
            <li itemprop="citation" itemscope itemtype="https://schema.org/CreativeWork">
                 Source: <a rel="noopener noreferrer external" itemprop="url" class="wikipedia-article-citations__url" target="_blank" href="<?php echo esc_url( $citation['url'] ) ?>"><?php echo esc_html( $citation['title'] ) ?></a>
             </li>
         <?php endforeach; ?>
    </ol>
    

    This is a very oversimplified version, but the basis is sound. This will loop through the whole post, find everything with a URL, check if the URL includes wikipedia.com, and output a link to it. If the editor added a link title, it will use that; if not, it falls back to the URL itself.

    But … a lot of people will tell you regex is super powerful and a pain in the ass (it is). And WordPress now has a better way to do this, one that’s both more readable and extendable.

    HTML Tag Processor

    Let’s try this again.

    What even is this processor? Well, it’s basically building out something similar to DOM nodes of all your HTML in a WordPress post and letting us edit them. They’re not really DOM nodes, though, they’re a weird little subset, but if you think of each HTML tag as a ‘node’ it may help.

    To start using it, we’re going to ditch regex entirely, but we still want to process our tags from the whole content, so we’ll ask WordPress to use the new class to build out our tags:

    $content_tags = new WP_HTML_Tag_Processor( get_the_content() );

    This makes the object, which also lets us use all of its methods. In this case, we know we want URLs, so we can use next_tag() to get things:

    $content_tags->next_tag( 'a' );

    This finds the next tag matching our query of a, which is for links. If we were only getting the first item, that would be enough. But we know we have multiple links in posts, so we’re going to need to loop. The good news here is that next_tag() in and of itself can keep running!

    while ( $content_tags->next_tag( 'a' ) ) {
        // Do checks here
    }
    

    That code will actually run through every single link in the post content. Inside the loop, we can check if the URL matches using get_attribute():

    if ( str_contains( $content_tags->get_attribute( 'href' ), 'wikipedia.com' ) ) {
        // Do stuff here
    }
    

    Since the default of get_attribute() is null if it doesn’t exist, this is a safe check (though on PHP 8.1+ you may want to cast to string first, as passing null to str_contains() triggers a deprecation notice), and it means we can reuse it to get the title:

    if ( ! is_null( $content_tags->get_attribute( 'title' ) ) ) {
        // Change title here
    }
    

    And if we apply all this to our original code, it now looks very different:

    Example:

    		// Array of citations:
    		$citations = array();
    
    		// Process the content:
    		$content_tags = new WP_HTML_Tag_Processor( get_the_content() );
    
    		// Search all tags for links (a)
    		while ( $content_tags->next_tag( 'a' ) ) {
    			// If the href contains wikipedia, build our array:
    			if ( str_contains( $content_tags->get_attribute( 'href' ), 'wikipedia.com' ) ) {
    				$current_citation = [
    					'url'   => $content_tags->get_attribute( 'href' ),
    					'title' => $content_tags->get_attribute( 'href' ),
    				];
    
    				// If title is defined, replace that in our array:
    				if ( ! is_null( $content_tags->get_attribute( 'title' ) ) ) {
    					$current_citation['title'] = $content_tags->get_attribute( 'title' );
    				}
    
    				// Add this citation to the main array:
    				$citations[] = $current_citation;
    			}
    		}
    
    		// If there are citations, output:
    		if ( ! empty( $citations ) ) :
    			// Output goes here.
    		endif;
    

    Caveats

    Since we’re only searching for links, this is pretty easy. There’s a decent example on looking for multiple items (say, by class and span) but if you read it, you realize pretty quickly that you have to be doing the exact same thing.

    If you wanted to do multiple loops though, looking for all the links but also all span classes with the class ‘wikipedia’ you’d probably start like this:

    while ( $content_tags->next_tag( 'a' ) ) {
        // Process here
    }
    
    while ( $content_tags->next_tag( 'span' ) ) {
        // Process here
    }
    

    The problem is that you would only end up looking for any spans that happened after the last link! You could try a more complex search-and-if check, but those are all risky, as you might miss something. To work around this, you’ll use set_bookmark() to set a bookmark to loop back to:

    $content_tags = new WP_HTML_Tag_Processor( get_the_content() );
    $content_tags->next_tag();

    // Set a bookmark:
    $content_tags->set_bookmark( 'start' );

    while ( $content_tags->next_tag( 'a' ) ) {
        // Process links here.
    }

    // Go back to the beginning:
    $content_tags->seek( 'start' );

    while ( $content_tags->next_tag( 'span' ) ) {
        // Process span here.
    }
    
    

    I admit, I’m not a super fan of that solution, but by gum, it sure works!

    Interlude: Gutenberg Moves Fast

    I’m taking a pause on my plugin posts to talk about Gutenberg.

    I really love Gutenberg. I’m not kidding! For fiction, I find it far more enjoyable to write in plain apps (I use Apple Pages because it syncs between laptop and iPad; yes, I am often using my iPad to write my novel; yes, I will let the world know when it’s done). But when I write for the web, it’s a more visual medium, and Gutenberg is fantastic at representing what I’m writing as it will be properly seen by all!

    But.

    Gutenberg moves fast. Hella fast. So fast it can leave you in the dust, and it has a critical flaw that I feel has been stifling its growth and usage among developers.

    JS isn’t your Momma’s PHP

    This is obvious. Javascript ain’t PHP. PHP is a simple language that can be coerced into doing complex things if you understand basic algebra. Surprise! Everyone who considers themselves good at PHP? You’ve mastered the concepts of algebra! Algebra is one of the easier ‘complex’ mathematical concepts to wrap your head around. You get “if a + b = c and a = 10 and c = 11 then b = 1” and you win!

    Javascript though, it’s a little more like calculus and trig, in that you have to understand the formulas a little deeper, and they have that thing where not just numbers and letters appear, but weird symbols.

    [nb: Like all analogies, this falls apart at scale, don’t read too much into it.]

    For the thousands of developers who cut their teeth on PHP, jumping into JS feels like being a first-year high schooler in senior maths! It’s scary, it’s complicated, and worst of all … it isn’t actually documented at the micro level, because it’s generally compiled.

    Micro vs Macro / Interpreted vs Compiled

    The macro scale is, more or less, the big picture of what your code is supposed to do. Micro would be each individual element. For PHP, you can clearly identify both the macro (the overall function) and the micro (each teeny process in that function). This is less so for JS because the languages are different.

    There are two primary types of code languages. PHP is what we call an interpreted language: while the PHP binary is a compiled app, what you write is interpreted at runtime. Basic JS (like jQuery) is also an interpreted language!

    Compiled languages need a “build” step – they need to be manually compiled first. And if that suddenly made you think “Wait, Gutenberg is JS but I have to build it!” then you have spotted the quirk! The JS we use in Gutenberg is actually JSX!

    JSX was designed for React (which is what we use to build in Gutenberg) and while it may contain some plain Javascript, it’s impossible to use the code without React. That’s why we have the build process, it takes the JSX, compiles it into JS, and saves it to a file.
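
    To make that concrete, here’s roughly what the build step does (a sketch; the exact output depends on your tooling):

    	// What you write (JSX):
    	// const Hello = () => <p className="hello">Hi there!</p>;

    	// Roughly what the build turns it into (plain JS):
    	const Hello = () =>
    		React.createElement( 'p', { className: 'hello' }, 'Hi there!' );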

    The Compilation Downfall

    This is where it gets messy … messier.

    When there’s an error in PHP, we get the error message either on the page or in our logs, depending on how we set up our environment. I personally pipe things to debug.log and just keep that file up as I bash on things. Those errors tend to be incredibly helpful!

    $mastodon not defined on /path/to/file.php:123

    In that example, I know “Ooops, I’m calling the variable $mastodon on line 123 of file.php and forgot to declare it!” Either I need an isset() check or (in this case) I brain farted and copied a line but forgot to rename the variable so I was setting $tumblr twice. Mea culpa, pop in, edit, save, done.

    On the other hand, I was testing out some blocks and modernizing them a little when suddenly … the block didn’t load. I got the WP notice that the block had an error. You’ve probably seen this if you’re a dev:

    Example of an error which says "This block has encountered an error and cannot be previewed"

    or this:

    An error: This block contains unexpected or invalid content.

    And if you’re like me, you used foul language and wondered ‘well… now what.’

    Enter the Console

    Unlike PHP, the errors don’t go to a nice debug.log file; they go to your in-browser console. This is because, again, PHP is being directly interpreted on the server, and the server happily converts the PHP to HTML and Bob’s your uncle.

    JS (and JSX in this case) isn’t processed by the server. It’s processed on the fly in the browser. If you’ve ever wondered why too much JS, or bad JS, causes your browser to hang, that’s why. We moved the processing from the server (PHP) to the browser. On top of that, it’s also why JS content isn’t really cacheable by traditional methods! But that’s another story.

    In this case, I got the first error (cannot be previewed) and being somewhat savvy with the world of Gutes, I popped open the console and saw this gem:

    wp.blockEditor.RichText value prop as children type is deprecated

    The rest of the message was warning me that the thingy would be removed in WP 6.3, and it had a link to ‘help’ resolve it. Spoilers? It didn’t. But take a deep breath. Let’s debug.

    Debugging Gutenberg

    The first issue was that the error came on a page with multiple blocks. I happened to be using a custom plugin I wrote that contains about 6 blocks, you see, so I opened a new page on localhost and added each block, one at a time, until I determined the issue was my incredibly simple spoiler block.

    How simple is this block? It’s basically a custom formatted paragraph, so everyone could use the same design without having to remember the exact colours. I could have made it a ‘reusable block’ on the site but, at the time, I wanted the practice.

    Next I went to that link, which was for “Introducing Attributes and Editable Fields”. I admit, I was a little confused, since I was already using attributes and editable fields! But I did the logical thing and searched that page for the word ‘children.’ My thought process was that if something was being deprecated, it would have a warning, right?

    Gif from Blazing Saddles, where Dom DeLuise is a director and walks up to an actor who made a mistake. He uses his bullhorn to scream WRONG! at the man, and bops him in the head with the bullhorn.

    Okay, maybe I was looking in the wrong place. This error is specific to RichText so I clicked on the link to read the RichText Reference and again, looked for “children.” Nothing. Zip. Nada. I followed the link for the more in-depth details on GitHub and still nothing.

    At this point, I ranted on Mastodon because I was chapped off. I also popped open the Gutenberg Deprecations page, and looked for “children” but all I could find was a message to use children!

    RichText explicit element format removed. Please use the compatible children format instead.

    Logically there should be a note that “children is deprecated, please use…” but there is not.

    Now, here is where I accidentally stumbled on a fix, but it was only after I made my fix that I found the Github issue about this!

    If you are still using “children” or “node” sources in the block attribute definition, like so:

    content: {
    	type: 'array',
    	source: 'children',
    	selector: 'p',
    }
    

    Then change it to use the “html” source instead to get rid of the deprecation warning:

    content: {
    	type: 'string',
    	source: 'html',
    	selector: 'p',
    }
    

    And in fact, that was the correct fix.

    Here’s the Flaw

    None of that was properly documented.

    The link to ‘help’ fix the error didn’t mention the specific error, it talked about attributes at the MACRO level. I was (obviously) already using attributes, else I wouldn’t have had that error at all.

    There is no proper documentation that could help someone fix the issue on their own UNLESS they happened to be trawling through all the issues on GitHub.

    As I put it to my buddy, the reasons developers are salty about Gutenberg are:

    1. It changes pretty much every release
    2. There’s no real way to tell people if it impacts you so you have to check every release and read the console logs, which is not what devs are used to
    3. The JS console won’t tell you (I don’t know if it can) what file caused the warning, so finding it is a crap shoot
    4. The documentation is high level, which is not helpful when you get micro level errors

    Okay, can we fix it?

    At this point, if you’ve made a WordPress Block for Gutenberg, make sure you test every single release with the JS console open. If you don’t do this, you will have a rude awakening until things are made a little better.

    How can things be made better? It will have to begin with a culture shift. Traditionally WordPress has used a “release and iterate” model. With Gutenberg, we’ve become “move fast and break things,” but that is only sustainable if everything broken can be documented for a fix.

    That means I see only one way to correct this, and it’s to slow down Gutenberg enough that deprecations AND THEIR CORRECTIONS are properly documented, and the error messages link to a page about deprecations.

    We need to not link to the general “here’s how attributes work” page, but instead to a specific page that lists those deprecations along side the WordPress versions impacted.

    Another matter is we should be posting in the Field Guide about these things. Currently the 6.3 field guide links to all the various pages where you can find information, but that means you have to click and open each one and hopefully find your exact usage. In my case, those links to the ten Gutenberg versions being ported to core never mention the issue I had.

    If we don’t start slowing down and paving the road for developers, we will begin haemorrhaging the very people who make WordPress a success.

    Zaptodon

    On a site, I use Zapier to automate a specific set of tasks. Every day, the website sets up a show/character of the day (I think you know what site this is…) and it posts that show/character to Twitter, Tumblr, Facebook and … Mastodon.

    Or at least it does now.

    Zapier

    Caveat! I pay for this service. The webhooks used are in the starter package at $19.99/month, billed annually.

    There’s a service, Zapier, that allows you to make incredibly complex if/then/else checks and perform an action. It’s not cheap, but at $250 a year it’s not expensive for my needs. Way back when, I picked it because I needed something that would let me actually script, and I recognized that running pushes like that all from my own server was a lot more work than it should be, what with all the libraries and so on.

    Seeing as that wasn’t driven by a post or any action except time, it had its own quirks. But then one day in May we all woke up and saw how dumb Twitter was. They priced WordPress.com and Automattic out of their own API!

    But not Zapier.

    Note: Automattic made the right choice here. With the API cost starting at $42,000 a month (yes, a month, and I can remember when I made that a year and thought I was hot shit), knowing how inexpensive Jetpack Pro/Premium/Whatever is, there was no way they could justify it.

    Zapier’s business model has a large chunk invested in pushing things to social (among all sorts of cool things, like calendar integration). So when I had to revisit how I posted new articles to Twitter anyway, I figured I’d wrangle Mastodon as well.

    The Zap Flow

    Overall, my flow looks like this:

    Screenshot of the Zapier flow, which shows: 1 - new item in RSS, 2 Post in Webhooks, 3 Create Tweet

    But what goes in each?

    The first one is a built-in and easy trigger. It follows the RSS for the site and, when there’s a new article, off it goes.

    The third one is the tweet, which is about as straightforward as you might expect.

    The second one is Mastodon. That’s where we’re going to concentrate today.

    Add an App

    To do this, you need to create an ‘app’ for your Mastodon account. Log in to your instance (mine here is mstdn.social) and edit your profile. On that page, on the menu to the left, is an item called Development.

    On that page you’ll see a list of Applications, if you have any, and a button to create a New Application. That’s what we want to do today. Click on that button and you’ll get the basic data:

    New Application setup on Mastodon, asking for name and website.

    I put in “Zapier” and “https://zapier.com” as my application name and website.

    Scroll further down and there are a bunch of options. You only need to have these two checked:

    • read:accounts
    • write:statuses 

    The read will let us know things worked, and write … I really hope that one is obvious for you, but you have to be able to write to post.

    Click create and you will be redirected to the Application page, where it will now list your app for Zapier. Click on that and you’ll be shown a page with the setup info, but now it has a Client Key and a Client Secret.

    Example of the client keys and tokens.

    I clicked regenerate right after I took this screenshot.

    You’ll be able to get at this whenever you want, so don’t panic.

    Back to Zapier

    Over on Zapier, pop open your Zap. In my case, I had a pre-built one for Twitter, so I added this in by telling it I wanted to add an Action. Since I pay for Zapier, I have access to their premium webhook to post:

    Pretty clear I think. I need “Webhooks by Zapier” and the Event is POST. That’s telling Zapier what to do with the hook.

    The next part of the step is the Action and that has a lot of stuff. The first two are the URL and the Payload:

    The URL is going to be https://yourinstance.com/api/v1/statuses?access_token=[YourToken] — What’s your token? Remember the third item shown on the edit page for your Application over on your instance? Yep! Paste that in.

    I picked JSON for my payload since I’d been using it elsewhere. For the next part, I have to dig out the data. For Mastodon, you want your data ‘type’ to be status, since that is literally telling it “I wanna make a status post!”, and for the content I used the description and the link.

    Example of STATUS and content.

    If you click on that box where I have description etc, it’ll pop up with more options!

    Example of other data you could insert.

    Pretty nifty! I left the rest as default:

    • Wrap Request In Array – no
    • File – empty
    • Unflatten – yes
    • Basic Auth – empty
    • Headers – empty

    Click on continue and you can test.
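
    If you want to sanity-check the token before (or after) wiring up the Zap, the same call works from the command line (the instance URL and token here are placeholders):

    	curl -X POST \
    		-H "Content-Type: application/json" \
    		-d '{"status": "Hello from my new app!"}' \
    		"https://yourinstance.com/api/v1/statuses?access_token=YOURTOKEN"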

    Done!

    That’s it! Now your RSS feeds will auto post to Mastodon.

    I’m sure someone’s wondering “Why aren’t you using ActivityPub, Mika!?!” And the answer is … it doesn’t actually work on all hosts. ActivityPub requires you to be able to write to your .well-known/ folder and, currently, you cannot do that on DreamHost because it’s managed at the server level.

    This is not a wrong choice by either party! DreamHost (especially on DreamPress, the managed WP solution) wants to prevent you from breaking your SSL. Now, thanks to @diziara, there is a workaround if you can edit the .htaccess file in your .well-known folder:

    # Permit access to the challenge files but nothing else
    Order allow,deny
    Allow from all
    
    RewriteCond %{REQUEST_URI} ^/[.]well-known/webfinger$
    RewriteRule .* /wp-json/activitypub/1.0/webfinger [L]
    
    RewriteCond %{REQUEST_URI} ^/[.]well-known/acme-challenge/[a-zA-Z0-9_-]+$
    RewriteRule .* - [L]
    
    RewriteRule .* - [F]
    

    Assuming your install is in root (mine is) you put that into the .htaccess and it works! I was surprised that it also let me edit on DreamPress, but I’m not sure if that will last. I’ll keep my support-thread updated though.

    And the other thing… I don’t want people to ‘follow’ my blog like that. I mean, you could, but also people follow me as me, and if I auto-post to ‘me’ then it works. Telling people to follow my blog and me is tricky since people are lazy (seriously we all are). But if that’s your thing, then yes, you absolutely can follow @ipstenu@halfelf.org and get all my articles.

    I’m still going to use a combination, since while I do want people to follow my blog, I suspect more will follow me instead. Also it’s easier to follow up on engagements (questions etc) if I’m watching ‘me’ and not two places. The other problem is it’s letting you follow ME at my blog. My other site has many more authors, and this isn’t quite right for that.

    The nice thing, though, is that there isn’t a single perfect answer for everyone’s use case. For most people, ActivityPub will work and they can be made discoverable. For the others, though, hold on until end of June. My friends at Automattic are planning to have post-to-mastodon support in their next iteration.

    I’ll still need my zaps since I post things that aren’t blog posts, but I’m looking forward to one less.