Half-Elf on Tech

Thoughts From a Professional Lesbian

Author: Ipstenu (Mika Epstein)

  • Custom Alexa Skills and WordPress

    Before you start, please note that there is no Amazon Alexa SDK for PHP. And this is a big problem.

    An SDK is a software development kit that helps standardize the development of apps for the software. Using an SDK is basically using a standard library that everyone can access and call, so nobody has to reinvent the wheel all the bloody time. And Amazon, for whatever reason, has decided that they’d rather push their Lambda hosting, which yes they charge you for, instead of clearly and cleanly documenting PHP code. Node.js? No problem. PHP? You’re on your own.

    Rant aside, I now have a custom Amazon Skill, self hosted, and powered by WordPress.

    Amazon’s Requirements

    On Monday I showed you how to build a very generic skill. It had no options, and it was actually missing a critical piece. You see, Amazon has six basic requirements to be an app:

    1. The service must be Internet-accessible.
    2. The service must adhere to the Alexa Skills Kit interface.
    3. The service must support HTTP over SSL/TLS, leveraging an Amazon-trusted certificate.
    4. The service must accept requests on port 443.
    5. The service must present a certificate with a subject alternate name that matches the domain name of the endpoint.
    6. The service must validate that incoming requests are coming from Alexa.

    The first five are pretty normal. If it’s not internet accessible, it’s not going to work. Same with the adherence to the skills kit interface. But that last one was surprisingly difficult and annoying. Mostly because of that lack of a standardized PHP SDK.

    Basically there isn’t a standard way to validate that incoming requests are coming from Alexa, but boy howdy, are there requirements.

    Validating Requests for Alexa

    While it says the requirement is to validate the requests, that’s only one aspect of the game. The three basic parts are these:

    • Verifying that the Request was Sent by Alexa
    • Checking the Signature of the Request
    • Checking the Timestamp of the Request

    And none of those are really well documented for PHP. Thanks.

    The Code

    In Monday’s post, I framed out the majority of the code that will be used. The change will be in this section:

    public function last_post_rest_api_callback( $data ) {
    	$response = $this->last_post();
    	return $response;
    }
    

    It now shows this:

    public function bury_your_queers_rest_api_callback( WP_REST_Request $request ) {
    	$date = ( isset( $request['request']['intent']['slots']['Date']['value'] ) )? $request['request']['intent']['slots']['Date']['value'] : false;
    	$validate_alexa = $this->alexa_validate_request( $request );
    	if ( $validate_alexa['success'] !== 1 ) {
    		$error = new WP_REST_Response( array( 'message' => $validate_alexa['message'], 'data' => array( 'status' => 400 ) ) );
    		$error->set_status( 400 );
    		return $error;
    	}
    	$response = $this->last_post( $date );
    	return $response;
    }
    

    This makes two changes. First it’s grabbing the date from the weirdly stored JSON POST from Alexa and passing it to my last_post function. That code I’m skipping, since taking the date, parsing it, and changing your output from last_post is beyond the scope of this post. No, I’m going to concentrate on the alexa_validate_request function.
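
    For the curious, here is a minimal sketch of the date half, not the code running on this site. The helper name halfelf_get_date_slot is made up for illustration, and it assumes you only care about a full calendar day. AMAZON.DATE can also send things like weeks (2017-W24) or months (2017-06), which this sketch simply rejects.

    ```php
    <?php
    // Hypothetical helper: pull the Date slot out of a decoded Alexa request
    // body and return it as YYYY-MM-DD, or false if it is missing or not a day.
    function halfelf_get_date_slot( array $body ) {
    	$value = $body['request']['intent']['slots']['Date']['value'] ?? false;
    	if ( ! is_string( $value ) ) {
    		return false;
    	}

    	// The leading "!" resets unspecified fields so only Y-m-d is considered.
    	$date = DateTime::createFromFormat( '!Y-m-d', $value );

    	// Round-trip the value to reject overflow dates such as 2017-02-30.
    	if ( ! $date || $date->format( 'Y-m-d' ) !== $value ) {
    		return false;
    	}

    	return $date->format( 'Y-m-d' );
    }

    // A trimmed-down request body with only the keys this sketch reads.
    $sample = json_decode( '{"request":{"intent":{"slots":{"Date":{"name":"Date","value":"2017-06-12"}}}}}', true );
    echo halfelf_get_date_slot( $sample ); // 2017-06-12
    ```

    Returning false for anything unexpected keeps the callback on its "no date given" path instead of guessing.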

    You should take note of the success check if ( $validate_alexa['success'] !== 1 ) however. You must use a rest response with a 400 because Amazon is very picky.

    alexa_validate_request

    The bulk of the validation is to check if the URL came from Amazon, if the certificate behind that URL is legit and matches the signature, and finally if the request was made in the last 60 seconds. Which is a lot to look for.

    In order to write this function, I forked Rich Bowen’s Validate Echo request via PHP code for WordPress. This takes into account some WordPress code that isn’t otherwise available:

    function alexa_validate_request( $request ) {
    	$url            = $request->get_header( 'signaturecertchainurl' );
    	$timestamp      = $request['request']['timestamp'];
    	$signature      = $request->get_header( 'signature' );
    
    	// Validate that it even came from Amazon ...
    	if ( !isset( $url ) )
    		return array( 'success' => 0, 'message' => 'This request did not come from Amazon.' );
    
    	// Validate proper format of Amazon provided certificate chain url
    	$valid_uri = $this->alexa_valid_key_chain_uri( $url );
    	if ( $valid_uri != 1 )
    	    	return array( 'success' => 0, 'message' => $valid_uri );
    
    	// Validate certificate signature
    	$valid_cert = $this->alexa_valid_cert( $request, $signature, $url );
    	if ( $valid_cert != 1 )
    	    	return array ( 'success' => 0, 'message' => $valid_cert );
    
    	// Validate time stamp
    	if (time() - strtotime( $timestamp ) > 60)
    		return array ( 'success' => 0, 'message' => 'Timestamp validation failure. Current time: ' . time() . ' vs. Timestamp: ' . $timestamp );
    
    	// If there was no error, it's a success!
    	return array( 'success' => 1, 'message' => 'Success' );
    }
    

    Within that function, I reference two more: alexa_valid_key_chain_uri and alexa_valid_cert which parse the chain and validate the certificate.

    function alexa_valid_key_chain_uri( $keychainUri ){
    
        $uriParts = parse_url($keychainUri);
    
        if (strcasecmp( $uriParts['host'], 's3.amazonaws.com' ) != 0 )
            return ( 'The host for the Certificate provided in the header is invalid' );
    
        if (strpos( $uriParts['path'], '/echo.api/' ) !== 0 )
            return ( 'The URL path for the Certificate provided in the header is invalid' );
    
        if (strcasecmp( $uriParts['scheme'], 'https' ) != 0 )
            return ( 'The URL is using an unsupported scheme. Should be https' );
    
        if (array_key_exists( 'port', $uriParts ) && $uriParts['port'] != '443' )
            return ( 'The URL is using an unsupported https port' );
    
        return 1;
    }
    
    /*
        Validate that the certificate and signature are valid
    */
    function alexa_valid_cert( $request, $signature, $url ) {
    
        $md5pem     = get_temp_dir() . md5( $url ) . '.pem';
        $echoDomain = 'echo-api.amazon.com';
    
        // If we haven't received a certificate with this URL before,
        // store it as a cached copy
        if ( !file_exists( $md5pem ) ) file_put_contents( $md5pem, file_get_contents( $url ) );
    
        // Validate certificate chain and signature
        $pem = file_get_contents( $md5pem );
        $ssl_check = openssl_verify( $request->get_body(), base64_decode( $signature ), $pem, 'sha1' );
        if ($ssl_check != 1 ) return( openssl_error_string() );
    
        // Parse certificate for validations below
        $parsedCertificate = openssl_x509_parse( $pem );
        if ( !$parsedCertificate ) return( 'x509 parsing failed' );
    
        // Check that the domain echo-api.amazon.com is present in
        // the Subject Alternative Names (SANs) section of the signing certificate
        if(strpos( $parsedCertificate['extensions']['subjectAltName'], $echoDomain) === false) {
            return( 'subjectAltName Check Failed' );
        }
    
        // Check that the signing certificate has not expired
        // (examine both the Not Before and Not After dates)
        $validFrom = $parsedCertificate['validFrom_time_t'];
        $validTo   = $parsedCertificate['validTo_time_t'];
        $time      = time();
        if (!($validFrom <= $time && $time <= $validTo)) {
            return( 'certificate expiration check failed' );
        }
    
        return 1;
    }
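
    One thing worth knowing about the timestamp check in alexa_validate_request: it only rejects requests that are too old. A minimal sketch of a stricter check, one that also rejects unparseable and future-dated timestamps, might look like this. The function name and the $now parameter are assumptions added so it can be tried without WordPress; the 60 second window is the one used above.

    ```php
    <?php
    // Hypothetical helper: true when $timestamp (ISO 8601, as Alexa sends it)
    // is within $tolerance seconds of $now, in either direction.
    function halfelf_timestamp_is_fresh( $timestamp, $now, $tolerance = 60 ) {
    	$sent = strtotime( $timestamp );
    	if ( false === $sent ) {
    		return false; // Unparseable timestamps are rejected outright.
    	}
    	return abs( $now - $sent ) <= $tolerance;
    }

    $now = strtotime( '2017-06-12T12:00:30Z' );
    var_dump( halfelf_timestamp_is_fresh( '2017-06-12T12:00:00Z', $now ) ); // bool(true)
    var_dump( halfelf_timestamp_is_fresh( '2017-06-12T11:00:00Z', $now ) ); // bool(false)
    ```

    In the real callback you would pass time() as $now.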
    

    A Word About Testing…

    The problem with all this weird code is that the only way to test is to use Amazon’s testing platform and that doesn’t actually throw back errors. The testing environment is fun, because you can type in ‘when was last post’ and it prepends “Alexa, ask HalfElf…” for you. And it shows you exactly what JSON it’s passing to your API and what your API returned.

    But…

    In the event your API throws an error, you don’t get to see what the error was. No, you get a message saying that the API returned an invalid output.

    Basically the Amazon API has no actual debugging if you’re trying to debug the connection requirements.

    There may have been a lot of swearing involved on my end.
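
    One workaround, sketched here under the assumption that you can read your own server’s PHP error log, is to write the validation message there before returning the 400. The helper name and the log prefix are made up for illustration; the array shape is the one alexa_validate_request returns.

    ```php
    <?php
    // Hypothetical helper: format a validation failure for the PHP error log.
    function halfelf_format_alexa_failure( array $result, $request_id = 'unknown' ) {
    	return sprintf(
    		'[alexa-skill] request %s rejected: %s',
    		$request_id,
    		$result['message'] ?? 'no message'
    	);
    }

    // Same shape as the array returned by alexa_validate_request().
    $result = array( 'success' => 0, 'message' => 'subjectAltName Check Failed' );

    if ( 1 !== $result['success'] ) {
    	error_log( halfelf_format_alexa_failure( $result, 'example-request-id' ) );
    }
    ```

    It won’t make Amazon’s console any chattier, but at least the reason for the rejection ends up somewhere you can see it.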

  • Hello World: WordPress, the Rest API, and Alexa

    I have a big issue with Amazon’s ‘documentation.’ Trying to learn how to do anything is akin to sniffing your cat’s butt to find out where the dog is. It’s written from a mindset I don’t share, it rarely has practical and functional examples, and the font is maddeningly small.

    How I learn best is by creating a “Hello World” type app. Even if it’s just copy and pasting, by doing that, my brain is able to follow the pathways and understand the logic steps.

    If you’re like me and have been swearing at Amazon just trying to make a simple ‘Hello World’ app for your Echo, here we go.

    The Outcome

    To summarize, what we want is to be able to turn to our Echos and say this:

    Hey Alexa, ask HalfElf what the last post is.

    And we want Alexa to reply:

    The last post on HalfElf was [Post Name]

    This is really simple on purpose. While eventually we want to be able to ask for a post on a specific date, we’re not there yet. You’ve got to walk before you can run.

    The Design

    Designing your API requires forethought. In a previous post, I named my flash briefing URL /MYSITE/v1/alexa-skills/briefing and, in keeping with that, this one will be named /MYSITE/v1/alexa-skills/last-post/

    You’ll need to hang on to your URL – https://example.com/wp-json/MYSITE/v1/alexa-skills/last-post/ – as you will need to put this into your Amazon Skill. This will be a custom skill and you’ll need to have the intent look like this:

    {
      "intents": [
        {
          "slots": [
            {
              "name": "Date",
              "type": "AMAZON.DATE"
            }
          ],
          "intent": "HalfElf"
        }
      ]
    }
    

    Remember, the goal is eventually to be able to use that date slot. We’re not right now, but be prepared, as Mr. Lehrer would say.

    With that in mind, the sample utterances look like this:

    HalfElf last post
    HalfElf post on {Date}
    HalfElf today's post
    HalfElf what the last post is
    

    You’ll notice I’m trying to think of every which way someone might ask the question. You need to do this. Alexa is very picky.

    The Rest API Code

    Once you’ve built out your Amazon Skill (and yes, that really is the easy part), you have to have the response. This is built off the same model I used before, and can be slipped in and shared.

    class MYSITE_Alexa_Skills {
    
    	public function __construct() {
    		add_action( 'rest_api_init', array( $this, 'rest_api_init') );
    	}
    
    	public function rest_api_init() {
    
    		register_rest_route( 'MYSITE/v1', '/alexa-skills/last-post/', array(
    			'methods' => [ 'GET', 'POST' ],
    			'callback' => array( $this, 'last_post_rest_api_callback' ),
    		) );
    	}
    
    	public function last_post_rest_api_callback( $data ) {
    		$response = $this->last_post();
    		return $response;
    	}
    
    	public function last_post() {

    		// Fallback text in case the query finds nothing.
    		$lastpost = 'There are no posts on HalfElf yet.';

    		// WP_Query uses posts_per_page; numberposts only works with get_posts().
    		$query = new WP_Query( array( 'posts_per_page' => 1 ) );
    		if ( $query->have_posts() ) {
    			while ( $query->have_posts() ) {
    				$query->the_post();

    				$lastpost = 'The last post on HalfElf was ' . get_the_title();
    			}
    			wp_reset_postdata();
    		}

    		$response = array(
    			'version'  => '1.0',
    			'response' => array (
    				'outputSpeech' => array (
    					'type' => 'PlainText',
    					'text' => $lastpost,
    				),
    				'shouldEndSession' => true,
    			)
    		);

    		return $response;
    	}
    
    }
    new MYSITE_Alexa_Skills();
    

    As I said, it’s pretty simple. The output looks like this:

    {"version":"1.0","response":{"outputSpeech":{"type":"PlainText","text":"The last post on HalfElf was [POSTNAME]"},"shouldEndSession":true}}
    

    This is not the most efficient way to grab one post, but for the purposes of this example, it does get your head around the basic idea.
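
    The response array is the part you’ll reuse for every intent, so it can be pulled out into its own function. This is only a sketch, and halfelf_alexa_plain_text is a name invented for the example, but it runs without WordPress and produces the same JSON shown above.

    ```php
    <?php
    // Hypothetical helper: wrap a sentence in the Alexa response structure.
    function halfelf_alexa_plain_text( $text, $end_session = true ) {
    	return array(
    		'version'  => '1.0',
    		'response' => array(
    			'outputSpeech'     => array(
    				'type' => 'PlainText',
    				'text' => $text,
    			),
    			'shouldEndSession' => (bool) $end_session,
    		),
    	);
    }

    echo json_encode( halfelf_alexa_plain_text( 'The last post on HalfElf was [POSTNAME]' ) );
    ```

    Inside a REST callback you would return the array itself and let WordPress do the JSON encoding.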

    Next Up?

    There are two issues with this code. First of all, it doesn’t meet Amazon’s requirements. Secondly, it doesn’t accept parameters. The first issue is much bigger, because as it turns out, Amazon requires you to check if the requests are coming from Amazon, are valid, and aren’t a bot-net attack. This is actually very smart, but very annoying, since they don’t make it easy to figure out how to do all that.

    But that’s next.

    If you hear screaming from California, it’s just me.

  • Tracking Changes On Sites

    When you make a site dependent on others for your data, it’s important to be able to get updates on those sites promptly. Most of the time, a site has a way to see what’s recently updated, be it by a page that lists what’s new, or an RSS feed, or an email list.

    But what happens when they don’t?

    Well. Then you need to look into monitors. And the bad news? Nothing is perfect. I’ve picked the top two services I tried over the month of May.

    VisualPing

    If the content of a page is HTML only, then it’s great. But if you’re trying to monitor a highly dynamic javascript site, it can time out. Especially if the site has a lot of data. The interface of the site is nice, having a simple UX that was easy to understand. At the same time, it doesn’t handle the abnormal well, and often wouldn’t tell me there were changes simply because it couldn’t tell.

    Overall, it was a disappointment for me and not useful for the javascript heavy page I was trying to monitor. As such, I’m not using it anymore.

    Versionista

    This is much better for a javascript heavy page that has a long load time. It can list out the URLs added that are new, and you can review the changes down to the minutiae. But. The emails are incredibly inconsistent and the UX is overly complex. While I can go in and see what’s changed, down to the source-code, I’m supposed to get a daily email about that and I don’t. Also there are too many options. I just want to see what changed. A list of the changes, maybe a list of the new links. Instead I have to click around to figure out how to see the list better.

    Between that and the email situation, I’m unhappily still using it.

    Overall…

    The real issue I have is not with these services, but the fact that the webpage I’m trying to monitor was not intelligently designed. It’s trying to list everything on one page, using javascript, and sadly it’s not well optimized. I can’t even get the page to load properly on my iPad. The content is also not sortable. It’s always alphabetical, no matter what.

    My biggest takeaway from this is that with some content it makes sense to hard define your content. That is, sorting everything by name and not allowing it to be re-sorted may make sense for many people. But you have to allow people an easy way to see what’s new if you want them to keep coming back.

  • Flash Briefing JSON Feed

    The other day I mentioned a ‘solution’ to my problem of video enclosures would also be to use a JSON feed. As much as I’d like to tell you just to use JSON feed, you can’t because their specs don’t match Amazon’s.

    The creation of a JSON Feed that does match their specs is somewhat peculiar, but still straightforward. I went with making a JSON API output instead of making a true feed, since frankly I don’t think Amazon’s all that consistent with their own spec, and I’ll need to tweak it later. I’d like to do so without breaking everything else.

    The Code

    class MYSITE_Alexa_Skills {
    
    	/**
    	 * Constructor
    	 */
    	public function __construct() {
    		add_action( 'rest_api_init', array( $this, 'rest_api_init') );
    	}
    
    	/**
    	 * Rest API init
    	 *
    	 * Creates callbacks
    	 *   - /MYSITE/v1/alexa-skills/briefing
    	 */
    	public function rest_api_init() {
    
    		// Skills
    		register_rest_route( 'MYSITE/v1', '/alexa-skills/briefing/', array(
    			'methods' => 'GET',
    			'callback' => array( $this, 'flash_briefing_rest_api_callback' ),
    		) );
    	}
    
    	/**
    	 * Rest API Callback for Flash Briefing
    	 */
    	public function flash_briefing_rest_api_callback( $data ) {
    		$response = $this->flash_briefing();
    		return $response;
    	}
    
    	/**
    	 * Generate the Flash Briefing output
    	 */
    	public function flash_briefing() {
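    		// Start with an empty list so count() below is safe when there are no posts.
    		$responses = array();

    		// WP_Query uses posts_per_page; numberposts only works with get_posts().
    		$query = new WP_Query( array( 'posts_per_page' => 10 ) );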
    		if ( $query->have_posts() ) {
    			while ( $query->have_posts() ) {
    				$query->the_post();
    
    				$response = array(
    					'uid'            => get_the_permalink(),
    					'updateDate'     => get_post_modified_time( 'Y-m-d\TH:i:s.\0\Z', true ), // true = GMT, to match the trailing Z
    					'titleText'      => get_the_title(),
    					'mainText'       => get_the_excerpt(),
    					'redirectionUrl' => home_url(),
    				);
    
    				$responses[] = $response;
    			}
    			wp_reset_postdata();
    		}
    
    		if ( count( $responses ) === 1 ) $responses = $responses[0];
    
    		return $responses;
    	}
    }
    new MYSITE_Alexa_Skills();
    

    Some Notes…

    This is built out with the assumption I will later be adding more information and skills to this site. That’s why the class is named for the skills in general and has the rest route set up for sub-routes already. If that’s not on your to-do, you can simplify.

    I also made the point to strip out the possibility of a StreamURL, which I don’t plan to use at all on this site. If you do, I recommend having a look at VoiceWP’s briefing.php file which does a nice job with handling that.
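
    The odd-looking date format string in updateDate is easier to trust once you see it on its own. Here is a small sketch using PHP’s gmdate() instead of the WordPress function, with a made-up wrapper name, so it can be run anywhere. The backslashes make T, 0 and Z literal characters rather than date codes.

    ```php
    <?php
    // Hypothetical helper: format a Unix timestamp the way the briefing code
    // above does, in UTC.
    function halfelf_briefing_date( $unix_timestamp ) {
    	return gmdate( 'Y-m-d\TH:i:s.\0\Z', $unix_timestamp );
    }

    echo halfelf_briefing_date( 0 ); // 1970-01-01T00:00:00.0Z
    ```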

  • Alexa Flash Briefing Skills and Video Enclosures

    One of my goals this year, aided by the inimitable Chris Lema, was to make an Amazon Echo app.

    There’s a lot more to the whole plan, but I want to start with the simple stuff first. So the very first step is that I want to make a “Flash Briefing” app. That will allow people to get the latest posts from my site.

    For the most part, this is trivial. Creating a Flash Briefing Skill is fairly well documented and essentially is this:

    1. Make an account
    2. Create a Skill
    3. Point it to your RSS feed
    4. Give it an icon

    And that works great. Unless, of course, you have videos in a post.

    You see, when I went to add my RSS feed, I got this rather useless error:

    Error: Item [https://example.com/?p=9619] doesn't contain a valid Stream Url. Video Url (if present) must be accompanied with a valid Stream Url in an Item.
    

    What Went Wrong?

    The error was caused by having a video in a post. Now, I need to stress the stupidity here. I have a video inside the post. It’s not a video post, it just has an embedded video because it was contextually needed.

    Logically I googled the error and came up empty. This did not surprise me. I’ve resigned myself to learning that Amazon is not actually very helpful with their UX or error messages. I’m not sure why this is, but their tech UX, the stuff made for developers and not the devices made for end-users, tends to be incredibly poorly designed and ill documented for new people.

    That said, I understood the error was reflecting on a ‘video’ URL, and I had a video in that specific post. I removed the video, tested, and it worked. Ergo the error was caused by the video’s existence. But as it happened, Stream URL had nothing to do with it.

    It Was Elements

    The real issue was found when I read through the feed format details, which mentioned that audio content needs a “URL specifying the location of audio content for an audio feed.”

    This wasn’t an audio file, but the example for a JSON feed was to include a “streamUrl” value. Oh. And for RSS? An “enclosure element with type attribute set to audio/mpeg”

    This had to be related.

    When I looked at my RSS feed, however, I saw this:

    <enclosure url="https://example.com/path/to/myvideo.mp4" length="3120381" type="video/mp4" />
    

    Wasn’t that what I needed?

    A Second Enclosure

    Apparently the flash briefing RSS code is stupid and thinks that any enclosure has to have the “audio/mpeg” type. So how do I add in this?

    <enclosure url="https://example.com/path/to/myvideo.mp4" length="3120381" type="audio/mpeg" />
    

    By the way, yes, I reported this to them as a bug. Anyway, the first attempt at fixing this was for me to add a new custom post meta for the enclosure like this:

    
    https://example.com/path/to/myvideo.mp4
    3120381
    audio/mpeg
    

    That auto-added the proper enclosure code because WordPress knows what it’s doing. Once I was sure that worked, I filed the full bug report and then went the other way.

    Remove The Enclosures

    This is not something I generally recommend. However if you’re not podcasting or vlogging and you have no need to encourage people to download your videos and media via RSS, then you can get away with this:

    function delete_enclosure(){
        return '';
    }
    add_filter( 'do_enclose', 'delete_enclosure' );
    add_filter( 'rss_enclosure', 'delete_enclosure' );
    add_filter( 'atom_enclosure', 'delete_enclosure' );
    

    That removes the enclosure code.
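
    If wiping out every enclosure feels too heavy handed, a gentler variation is to drop only the video ones. This is a sketch and not what I’m running: the function name is made up, and it assumes the string handed to the rss_enclosure and atom_enclosure filters contains a type="video/..." attribute like the enclosure shown earlier.

    ```php
    <?php
    // Hypothetical filter callback: keep audio enclosures, drop video ones.
    function halfelf_drop_video_enclosure( $enclosure ) {
    	// Look for type="video/..." (or single quoted) anywhere in the tag.
    	if ( preg_match( '/\btype=(["\'])video\/[^"\']*\1/i', $enclosure ) ) {
    		return '';
    	}
    	return $enclosure;
    }

    // Only hook in when WordPress is loaded, so the function can be tried alone.
    if ( function_exists( 'add_filter' ) ) {
    	add_filter( 'rss_enclosure', 'halfelf_drop_video_enclosure' );
    	add_filter( 'atom_enclosure', 'halfelf_drop_video_enclosure' );
    }
    ```

    Audio enclosures pass through untouched, so a future podcast wouldn’t be affected.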

    Build Your Own

    Another fix would have been to make a JSON output or use something like JSONFeed itself. Or of course I could have auto-duplicated the embeds, but that just felt wrong to me.

  • SEO and URLs and Indexes

    The question of the day. “Does having all your posts indexed on the main page of your site cause the highest SEO value to be in your main domain name and not the individual posts or categories?”

    No.

    What is your homepage for?

    As a reminder, you don’t have to have all your posts listed on your main page, or any page when you get down to it. When you don’t, we call those ‘static sites’ but really what we mean is “A non-newspaper site.”

    Yoast talks about this with regards to what they call homepage SEO. As Michiel notes in that post, the point of your homepage is to load fast, explain the purpose of the site, and direct people to where they need to be.

    Where Is SEO Value?

    The SEO value in your site is not going to be in the homepage or the category pages. It’s not in the archive pages either. The value of your site is found in your important content. We call this your flagship or cornerstone content. Those are the pages you want to drive people to, to get the most out of their visit.

    There’s a lot of good advice about how to make good content like that, from CopyBlogger and Yoast and more. But the point they all make is that the mead and meat part of your site is the content and not the index.

    Do index pages lose SEO?

    Again. No. Look. I get it. The real question is “Will sending everyone to my home page screw up my cornerstone SEO?”

    No. That’s not how it works. If people are looking for “your website topic” then yes, they will end up on the home page. And if your home page is a constantly rotating list of pages, then yes, they will see links to some deeper content.

    But that doesn’t hurt your SEO. Google will rank your cornerstone pages properly because they will rank higher. They will have more specific content. They will be your centers. So spending all your time coming up with fancy ways to get rid of underperforming content, hiding it and removing it, is just a waste of time and energy. Of course that’s a bit of a different topic.

    Your homepage won’t hurt your SEO

    Listing your recent posts on your home page doesn’t hurt your SEO. Actually it helps a little to have a ‘recent posts’ section. But no, having the post list doesn’t hurt the SEO. Your site will be just fine. Don’t make weird CPTs to shuffle things around.