Half-Elf on Tech

Thoughts From a Professional Lesbian

Tag: search

  • Elasticsearch as a Service

    Search is hard. Searching when you have custom meta data in your posts is harder. By default, WordPress does not search your custom meta data. And my LezWatchTV site is 75% custom meta data.

    I’d been using Google Search, but that has a lot of issues of its own, like privacy, ads, accuracy, and most importantly, no way to tune it. I decided to try out Elasticsearch since I knew that was what WordPress.org’s internal search engine was moving towards. Once I’d added custom post meta to my search content, this post was going to be about how to install Elasticsearch as part of an ELK stack on DreamCompute, which turned out to be rather easy, if time-consuming and messy. And getting WordPress to work with it was as easy as installing the ElasticPress plugin (thank you 10up).

    What was complicated was making Elasticsearch work remotely. By default, it only wants to be accessible locally, for your own security. But adding in Shield to secure it, while still keeping all the logs and pretty things I needed to understand what was happening and how to manage it, escalated quickly when it was all new. It was simply too much all at once for me. Instead I decided to look into Elasticsearch as a service.

    There are a lot of options here.

    Self Managed

    I know I said ‘as a service’ but you really can use DigitalOcean or DreamCompute to do this. And there’s all sorts of documentation about how to do it available (like DigitalOcean’s ‘How to install the ELK Stack on Ubuntu 14.04’ which works on DreamCompute too). And Amazon Elasticsearch is also an option here.

    But… they’re all very self-managed. They require you to jump into servers and run a lot of commands, and they’re not new-user friendly. Look, I get that this is complicated stuff, but people aren’t going to find out whether they want to learn all this if you make it monumental to get into.

    Services

    You can break these down into two main types.

    Enterprise Level:

    Free ‘Trials’:

    I wanted to use something ‘free’ to get started so I could figure out how to properly use Elasticsearch, and exactly what I wanted to do with search, before deciding if I wanted to pay. That meant I needed something ‘free’ to test with, something with logs, that would help me understand it all. I ended up trying both Bonsai and Searchly. Bonsai gave me more room and Searchly had more information in the interface, but neither had a ‘Hey, here’s how you tune Elasticsearch!’ page.

    Neither had Kibana 4 though, which is a little sad.

    So when you don’t know how to do ‘anything’ with Elasticsearch, what can you test? The same search. I checked which was faster, which was more accurate, and which had the results I wanted. Bonsai was the winner here, so that’s what I went with.

    Integrating WordPress

    Thankfully this is the easy part.

    Install the ElasticPress plugin. Go to Settings -> ElasticPress and add in the URL from your Bonsai panel as your Host. It should look like https://username:password@yourcluster.us-west-2.bonsai.io (with some variation based on location). Save, press the ‘Run Index’ button, and you’re done.
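
    One gotcha with that Host URL: if the password has characters like @ or : in it, they need to be percent-encoded or the URL won’t parse the way you expect. Here’s a minimal sketch of building the URL in PHP. The function name helf_build_es_host and the credentials are made up for illustration; only the URL shape comes from the example above.

```php
<?php
/**
 * Build an Elasticsearch host URL with URL-encoded credentials.
 * Hypothetical helper: the name and all values are placeholders.
 */
function helf_build_es_host( string $user, string $pass, string $cluster_host ): string {
    // rawurlencode() protects characters such as "@" or ":" inside the credentials.
    return 'https://' . rawurlencode( $user ) . ':' . rawurlencode( $pass ) . '@' . $cluster_host;
}

// Placeholder credentials, not a real Bonsai cluster.
$host = helf_build_es_host( 'username', 'p@ss:word', 'yourcluster.us-west-2.bonsai.io' );
echo $host, PHP_EOL;
```

    That prints https://username:p%40ss%3Aword@yourcluster.us-west-2.bonsai.io, which is what goes in the Host field. As far as I know, ElasticPress can also read the host from an EP_HOST constant in wp-config.php, which would keep the credentials out of the settings page.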

    The nice thing about the plugin is if it breaks (like the service goes down), the plugin reverts to WordPress search! Which isn’t great, but … well.

    Next? How do I tune Elasticsearch?!

  • Search Options for Custom Post Data

    I use CMB2 to add in a bunch of custom meta data for my posts on a site. Seeing as I’m using it to allow layouts and formats to be consistent, it’s not surprising that I’ve chosen to split out my data like that. In another world, maybe it would be done differently, but this works.

    Except that search sucks. WordPress doesn’t search custom post meta out of the box, which just kills me. That meant all the data I stored for names and dates was never getting searched. There are at least two ‘easy’ solutions for this.

    Google Search

    Ew. I know. But ew. Since I’m using Genesis as my theme, it’s not super hard, just a little weird. Assuming you already have a Custom Search Engine set up, and you’re using Genesis, here’s what to do next.

    First I added this into my functions-site.php (note: I made a functions-site.php file so I can easily update my functions.php file on the rare occasion I need to update the child theme – it’s really rare – but also so I always know what’s me and what was Genesis):

    /* Google Custom Search Engine */
    add_filter( 'genesis_search_form', 'helf_search_form', 10, 4 );
    function helf_search_form( $form, $search_text, $button_text, $label ) {
        // esc_js() keeps quotes in the placeholder text from breaking the inline handlers.
        $onfocus = " onfocus=\"if (this.value == '" . esc_js( $search_text ) . "') {this.value = '';}\"";
        $onblur  = " onblur=\"if (this.value == '') {this.value = '" . esc_js( $search_text ) . "';}\"";
        // Send the query to the /search page as ?q= instead of the default ?s=.
        $form = '<form method="get" class="searchform search-form" action="' . esc_url( home_url( '/search' ) ) . '" >' . $label . '
    <input type="text" value="' . esc_attr( $search_text ) . '" name="q" class="s search-input"' . $onfocus . $onblur . ' /><input type="submit" class="searchsubmit search-submit" value="' . esc_attr( $button_text ) . '" /></form>';
        return $form;
    }
    

    Then I made a custom page template thanks to Rick Duncan:

    <?php
    /*
     * Template Name: Google Custom Search Engine
     *
     * This file adds the Google SERP template to our Genesis Child Theme.
     *
     * @author     Rick R. Duncan
     * @link       http://www.rickrduncan.com
     * @license    http://www.opensource.org/licenses/gpl-license.php GPL v2.0 (or later)
     *
     */
    
    //* Force Full-Width Layout
    add_filter( 'genesis_pre_get_option_site_layout', '__genesis_return_full_width_content' );
    
    //* Add Noindex tag to the page
    add_action( 'genesis_meta', 'lez_noindex_page' );
    function lez_noindex_page() {
    	echo '<meta name="robots" content="noindex, follow">';
    }
    
    //* Insert Google CSE code into <head> section of webpage
    add_action( 'genesis_meta', 'lez_google_cse_meta', 15 );
    function lez_google_cse_meta() { ?>
    
    	<script>
    	  (function() {
    	    var cx = '017016624276440630536:tpoclrwnxyy';
    	    var gcse = document.createElement('script');
    	    gcse.type = 'text/javascript';
    	    gcse.async = true;
    	    gcse.src = 'https://cse.google.com/cse.js?cx=' + cx;
    	    var s = document.getElementsByTagName('script')[0];
    	    s.parentNode.insertBefore(gcse, s);
    	  })();
    	</script><?php
    }
    //* Add custom body class
    add_filter( 'body_class', 'lez_add_body_class' );
    function lez_add_body_class( $classes ) {
    
       $classes[] = 'google-cse';
       return $classes;
    
    }
    //* Remove standard Genesis loop and insert our custom page content
    remove_action( 'genesis_loop', 'genesis_do_loop' );
    add_action( 'genesis_loop', 'lez_custom_content' );
    function lez_custom_content() { ?>
    
    	<div itemtype="http://schema.org/SearchResultsPage" itemscope="itemscope">
    		<header class="entry-header">
    			<h1 itemprop="headline">
        			<?php echo get_the_title(); // $ID was never defined; with no argument this uses the current page. ?>
        		</h1>
        	</header>
        	<div class="entry-content" itemprop="text">
        		<?php echo get_the_content();
        		//* Obtain querystring value if present and display on-screen
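    			if ( isset( $_REQUEST['q'] ) && ! empty( $_REQUEST['q'] ) ) {
        			// Sanitize and escape the user-supplied query before echoing it back.
        			$query = sanitize_text_field( wp_unslash( $_REQUEST['q'] ) );
        			echo '<strong>You Searched For:</strong> <em>' . esc_html( $query ) . '</em>';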
    			}
    			else {
    				echo 'Please enter a search phrase.';
    			}
    			if ( is_active_sidebar( 'google-cse' ) ) {
    				dynamic_sidebar( 'google-cse' );
    			}
    			?>
        		<gcse:searchresults-only></gcse:searchresults-only>
        	</div>
        </div>
    
    <?php }
    genesis();
    

    Finally I added a page called “Search Results” and assigned it this template (make sure its slug matches the /search URL the form submits to). Done. Google, which indexes the whole rendered page, will get everything. It just looks like Google.

    Having WordPress search your Custom Post Meta

    This was surprisingly annoying, but not as hard as all that. Adam Balee wrote ‘Search WordPress by Custom Fields without a Plugin’ which, I know, is ‘without a plugin’ so this is sort of silly, but I put that in as an MU plugin and it worked perfectly!

    <?php
    /**
     * Extend WordPress search to include custom fields
     *
     * http://adambalee.com
     */
    
    /**
     * Join posts and postmeta tables
     *
     * http://codex.wordpress.org/Plugin_API/Filter_Reference/posts_join
     */
    function cf_search_join( $join ) {
        global $wpdb;
    
        if ( is_search() ) {    
            $join .=' LEFT JOIN '.$wpdb->postmeta. ' ON '. $wpdb->posts . '.ID = ' . $wpdb->postmeta . '.post_id ';
        }
        
        return $join;
    }
    add_filter('posts_join', 'cf_search_join' );
    
    /**
     * Modify the search query with posts_where
     *
     * http://codex.wordpress.org/Plugin_API/Filter_Reference/posts_where
     */
    function cf_search_where( $where ) {
        global $wpdb;
       
        if ( is_search() ) {
            $where = preg_replace(
                "/\(\s*".$wpdb->posts.".post_title\s+LIKE\s*(\'[^\']+\')\s*\)/",
                "(".$wpdb->posts.".post_title LIKE $1) OR (".$wpdb->postmeta.".meta_value LIKE $1)", $where );
        }
    
        return $where;
    }
    add_filter( 'posts_where', 'cf_search_where' );
    
    /**
     * Prevent duplicates
     *
     * http://codex.wordpress.org/Plugin_API/Filter_Reference/posts_distinct
     */
    function cf_search_distinct( $distinct ) {
        // The join can return one row per matching meta entry, so collapse duplicates.
        if ( is_search() ) {
            return "DISTINCT";
        }
    
        return $distinct;
    }
    add_filter( 'posts_distinct', 'cf_search_distinct' );
    

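    This is not the most efficient search, I know: a LIKE with a leading wildcard against meta_value can’t use an index, so every search has to scan the joined postmeta rows. But it works and gets my data where it’s needed.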
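
    If the preg_replace in cf_search_where looks like magic, here’s a standalone sketch of what it does to a search clause. The table names and the WHERE fragment are simplified stand-ins I made up for illustration (a real WordPress query is longer), and preg_quote() is my addition, but the pattern and replacement otherwise follow the code above.

```php
<?php
// Hypothetical table names and WHERE fragment, for illustration only.
$posts_table    = 'wp_posts';
$postmeta_table = 'wp_postmeta';
$where          = " AND (((wp_posts.post_title LIKE '%willow%') OR (wp_posts.post_content LIKE '%willow%')))";

// Match "(wp_posts.post_title LIKE '...')" and capture the quoted search term.
$pattern = '/\(\s*' . preg_quote( $posts_table, '/' ) . '\.post_title\s+LIKE\s*(\'[^\']+\')\s*\)/';

// Keep the title check and add the same LIKE against postmeta.meta_value.
$replacement = '(' . $posts_table . '.post_title LIKE $1) OR (' . $postmeta_table . '.meta_value LIKE $1)';

$rewritten = preg_replace( $pattern, $replacement, $where );
echo $rewritten, PHP_EOL;
```

    The title condition comes back followed by OR (wp_postmeta.meta_value LIKE '%willow%'), and the rest of the clause is untouched. That extra OR is the whole trick: every search term that gets checked against the title also gets checked against the meta values.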