Half-Elf on Tech

Thoughts From a Professional Lesbian

Author: Ipstenu (Mika Epstein)

  • Gravity Forms and Disallowed Keys

    Recently Gravity Forms was added to a site I work on. Now, I’ve never used it before, so I was hands off (except for changing the email it sent to) and I know pretty much nothing at all about it. But what I do know is that there’s a real jerk out there who’ll spam it, given a chance.

    Unlike other contact form plugins out there, Gravity Forms comes with built in free integration with Akismet! But, like pretty much every other plugin out there, it does not integrate with my disallowed keys.

    I’m a big proponent of not reinventing the wheel, and I strongly feel that being able to block someone from comments and contact forms should be a done deal. I opted to mark people who do this as spam, instead of a rejection, so they will never know if I ever saw their email or not. This is a questionable use of the spam settings, but at the same time, it’s been a rough couple of years.

    The Process

    Since the disallowed_keys list contains emails and words, the first thing I wanted to do was strip out everything that wasn’t an email or an @-domain. That means foobar@example.com is a valid entry, and @spammers-r-us.com is a valid entry, but foobar on its own is not. I run through my disallowed list and add everything valid to an array in a new variable.

    Before I can pass through the email, though, I need to remove any periods from the username. You see, Gmail allows you to use foobar and foo.bar and fo.o.b.a.r all as the same valid username on your email. Yes, all those would go to the same person. To get around this, I remove all periods and make a clean username.

    Also I have to consider the reality of jerks, who do things like foobar+cheater@example.com — Gmail allows you to use the + sign to get clever and isolate emails, which I use myself to track what sign-up spams me. At the same time, I don’t want people to get around my blocks, so I have to strip everything following the plus-sign from the email.

    While I’m doing this, I’ll save the domain as its own variable, because that will allow me to check if @spammers-r-us.com is on my list or not.

    Once I’ve got it all sorted, I do an in-array: if either the exact (clean) email is in the array, or the exact @-domain is in the array, it’s spam and I reject.

    The Code

    // gform_entry_is_spam is a filter (it returns a value), so hook it with add_filter.
    add_filter( 'gform_entry_is_spam_1', 'my_spam_filter_gform_entry_is_spam_1', 10, 3 );
    
    function my_spam_filter_gform_entry_is_spam_1( $is_spam, $form, $entry ) {
    
    	// If this is already spam, we're gonna return and be done.
    	if ( $is_spam ) {
    		return $is_spam;
    	}
    
    	// Email is field 2.
    	$email = rgar( $entry, '2' );
    
    	// Build a list of valid emails & domains from disallowed_keys
    	$disallowed_emails = array();
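    	// Trim each line so stray spaces or Windows line endings don't break exact matches.
    	$disallowed_array  = array_map( 'trim', explode( "\n", get_option( 'disallowed_keys' ) ) );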
    
    	// Make a list of spammer emails and domains.
    	foreach ( $disallowed_array as $spammer ) {
    		if ( is_email( $spammer ) ) {
    			// This is an email address, so it's valid.
    			$disallowed_emails[] = $spammer;
    		} elseif ( strpos( $spammer, '@' ) !== false ) {
    			// This contains an @ so it's probably a whole domain.
    			$disallowed_emails[] = $spammer;
    		}
    	}
    
    	// Break apart email into parts. No single @ means nothing to check.
    	$emailparts = explode( '@', $email );
    	if ( 2 !== count( $emailparts ) ) {
    		return false;
    	}
    	$username   = $emailparts[0];       // i.e. foobar
    	$domain     = '@' . $emailparts[1]; // i.e. @example.com
    
    	// Remove all periods (i.e. foo.bar > foobar )
    	$clean_username = str_replace( '.', '', $username );
    
    	// Remove everything AFTER a + sign (i.e. foobar+spamavoid > foobar )
    	$clean_username = strstr( $clean_username, '+', true ) ? strstr( $clean_username, '+', true ) : $clean_username;
    
    	// rebuild email now that it's clean.
    	$clean_email = $clean_username . '@' . $emailparts[1];
    	
    	// If the email OR the domain is an exact match in the array, then we know this is a spammer.
    	if ( in_array( $clean_email, $disallowed_emails, true ) || in_array( $domain, $disallowed_emails, true ) ) {
    		return true;
    	}
    
    	// If we got all the way down here, we're not spam!
    	return false;
    }
    

    Of Note…

    Before you use this yourself, you will need to customize two things!

    1. gform_entry_is_spam_1 is actually the specific form I’m checking. Form ID 1. Customize that to match your form ID.
    2. $email = rgar( $entry, '2' ): you may have noticed I put ‘Email is field 2’ as a note above it. That’s because email is the second field on form 1, so I hard grabbed it. If yours is different, change that.

    Also … I actually broke this out into two files, one that just checks “Is this a spammer?” and the Gravity Forms file, so the latter calls spammers.php and checks the email against the is_spammer() function. The reason I did that is because I need to run this same check on Jetpack’s contact form. Both call the same function to know if someone is evil.
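
    Here is a minimal sketch of what that shared check might look like. The name is_spammer() comes from the post, but the signature is an assumption: it takes the raw disallowed list as a string so it runs as plain PHP, where in WordPress you would pass in get_option( 'disallowed_keys' ). It also lowercases everything and uses a simple strpos() check rather than is_email(), neither of which the original code does.

```php
<?php
/**
 * Hypothetical is_spammer() helper; not the author's actual spammers.php.
 *
 * @param string $email           The address submitted in the form.
 * @param string $disallowed_keys Newline-separated list (assumed to be passed in by the caller).
 * @return bool True if the cleaned email or its @-domain is on the list.
 */
function is_spammer( $email, $disallowed_keys ) {
	// Keep only entries containing an @ (full emails or @-domains).
	$disallowed = array();
	foreach ( preg_split( '/\R/', $disallowed_keys ) as $line ) {
		$line = strtolower( trim( $line ) ); // Lowercasing is an addition in this sketch.
		if ( '' !== $line && false !== strpos( $line, '@' ) ) {
			$disallowed[] = $line;
		}
	}

	// An address needs exactly one @, with something on both sides, to be split safely.
	$parts = explode( '@', strtolower( trim( $email ) ) );
	if ( 2 !== count( $parts ) || '' === $parts[0] || '' === $parts[1] ) {
		return false;
	}

	// Drop everything after a + sign, then remove periods (foo.bar+x > foobar).
	$username = explode( '+', $parts[0] )[0];
	$username = str_replace( '.', '', $username );

	$clean_email = $username . '@' . $parts[1];
	$domain      = '@' . $parts[1];

	return in_array( $clean_email, $disallowed, true ) || in_array( $domain, $disallowed, true );
}

// Usage: the plain word "viagra" is ignored, the email and the @-domain are kept.
$list = "foobar@example.com\n@spammers-r-us.com\nviagra";
var_dump( is_spammer( 'foo.bar+cheater@example.com', $list ) ); // bool(true)
var_dump( is_spammer( 'someone@elsewhere.example', $list ) );   // bool(false)
```

    Each form plugin’s hook then only needs to pull the email out of the entry and hand it to this one function.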

  • When It’s (Not?) Burnout

    I took 2020 as a break from speaking at conferences, live for obvious reasons, and online for a couple different reasons. It took me until November to get my home office set up in a ‘non-embarrassing’ way so that I didn’t feel like I was showing everyone my mess when we video’d. Also I was exhausted and realized how close I was to burnout after the last four+ years of stress and travel.

    But there has been one other thing. I’d talked to a number of friends. I’ve broken down sobbing after a coworker mentioned what was going on. I’ve had long talks with therapists and experts in this sort of thing. The issue wasn’t my workload, it wasn’t even the work I was doing. But I absolutely was burnt out.

    … But it’s not for the reason you’re probably thinking. I’m dead ass burned from being harassed.

    Harassment

    The largest contributor to my burn-out is an ongoing, over two years, harassment.

    A year ago I gave a talk in NYC about how to deal with being attacked online, and the tools you can use to protect yourself. What I didn’t mention in detail in that post was what has been going on since November 2018.

    Back then I was watching the Macy Parade (like I do every year), waiting for the oven to heat up, and cleaning out the emails for the plugin review team, when I got pinged by a forum mod. A plugin developer was being cruel to users, making weird threats and claims, and said volunteer wanted to know what to do, since that person had a flag on their account saying “If there are any guideline violations, report to plugins ASAP.” So I threw the turkey in the oven and pulled up the records.

    What I found was a series of minor issues, but all repeating. The developer was asked (twice) to change their plugin name to be less spammy (ex. “The world’s greatest slider plugin! Greater than anyone has known! Used by millions!”). There were also multiple emails reminding them not to ask to contact people off the forums.

    There was also a strange email from a couple months prior. A woman had emailed the plugins team about this developer, saying that after she left a bad review she was harassed by them on Facebook. At the time, we issued a final warning about behaviour (which is why the flag in the account existed). I had forgotten about it being related to this developer, as it was about their other plugin, but also we get a hundred emails a day, and I don’t memorize everyone’s drama.

    In looking at that, and the post the forum mod was worried about, I saw the parallels. This was very obviously repeat behaviour, and at the time I was pretty sure that the developer was account sharing (multiple people using the one dev account), which meant not only did they not understand the message about not being unkind, but they were not making sure everyone who worked for/with them did either, and they didn’t understand basic security (there’s no need to ‘share’ accounts on WordPress.org; you can make new ones and add them to your plugin as support reps after all).

    This meant I did what I hate doing. I closed their plugins, locked the accounts, and emailed them saying that they were banned for repeat abusive behavior. After all, they’d had multiple warnings.

    In retrospect, I should have seen this all coming.

    Megs of Logs

    At this point I’ve amassed megabytes of logs on this drama. I’ve written up a nearly 30 page document (with citations no less) of everything that’s happened before and since. I thought about listing everything they did ‘wrong’ here but honestly it doesn’t matter if I list out everything. That was all ‘normal’ poor behaviour by developers. People make mistakes, and many times they really just do not grasp how serious things are even when the email says “This is your last chance.” Which means I know I have to be the bad guy to tell people “Hey. This ends now.”

    Now, banning people, especially existing developers, is not a common thing! It’s not un-common or rare, but it’s not like I do it every day. Around 4 people a year get banned following a final warning. Usually it’s only one person each year (though due to people being people, it may involve multiple accounts — we still consider that one). More often, people get insta-banned for trying to use the directory for malware. Once in a while someone will be banned without warning for lying about being previously banned, but usually we catch those pretty quickly these days. Even so, it’s not an every month thing, or even an every season occurrence! The majority of people get that final warning and stop and rethink their choices. That’s normal.

    What was abnormal is what happened after they were banned.

    Between November 21st and the 27th, the Plugins team received over 30 emails. The first few were answered in kind, pointing out that they had their fair chance (and a couple extra) and they squandered it. After that, emails went unanswered for 24 hours, at which point they were informed again of their numerous violations, and asked to stop emailing or their actions would be treated as harassment.

    The emails did not stop. 21 more were sent following that caution.

    Yes that means over 50 emails in a 6 day span. Probably closer to 100, since we only tracked them by subject rather than by how many replies they got.

    On the 24th, they tried to bribe me by sending me money via PayPal (it was refunded and reported — and yes, this is why generally I don’t like when developers send me a donation, though I understand most are not trying this). The message asked me to ‘forgive’ them and rescind the ban. At that point I blocked their email on all my personal systems and went on my merry way.

    Instead, they thought “Well she blocked us on one email, let’s use a different one!” and found my old, only used for Google events, account. By the way, none of those personal emails were ever provided to them. It’s not hard to guess what my email on Gmail might be, though.

    On November 27th, a threat was made. They emailed saying they prayed to their god to “take away all your name, fame, respect, wealth everything” and more.

    And then it escalated…

    Yeah some of you are thinking “Wait, THEN it escalated?”

    • From November 24th to the end of the year, 77 separate email chains were sent, using 3 separate emails.
    • In 2019 there were over 600 separate email chains from 126 separate email addresses.
    • 2020? 34 separate email chains.
    • 2021? Only 3 email chains, but it’s only February.

    So yeah, 2019 was rough. My Dad died in the start, and this developer had the gall to say Dad’s death was my fault, as I was being punished by their (the developer’s) god. Yes, that really happened.

    I did a lot fewer talks in 2019 because I was coping with the world without my dad, and in 2020 …. well. We all took an in-person break, and I took a virtual one as well, because I was tired of prepping myself before talks.

    See, every time I would go to a WordCamp, I had to prepare myself. What will I do if they show up? They had made, after all, ‘threats’ to come to California, and they’d already sent physical items to my office. So how would I handle it? The odds of them getting to the United States, given our then administration, seemed unlikely, but what if… What if?

    I rehearsed, I practiced not being alone, I made sure at least one trusted person knew why I was nervous. My wife and I talked about strategies. But online? What if he saw something on my backdrop that let him figure out my home? What if he tracked me? What if he did something to put my family in danger? It was all too much to bear, so I simply didn’t.

    Somewhat related, my office knew and went way above and beyond what I had any reason to expect to make sure I felt safe there. I love those people.

    So … where are we now?

    The developer still emails, on average twice a month now. We’ve sent a cease & desist (which was repudiated) and I’ve spent a lot of time literally ignoring everything that comes in. I do have a list of all the various claims made, and all the email subjects. I stopped tracking the content of the emails in mid 2019 because they were so outlandish that I couldn’t even anymore. I mean, does anyone think Alexandria Ocasio-Cortez cares that someone in another country is angry they got banned from a website?

    Effectively? I am still being cyberstalked and harassed. And my god, it’s draining.

    I sat here, thinking “Is this even a good idea? It’s just going to make them be bigger annoyances.”

    After how disastrous 2020 has been? I think it’s right to step up and say “Hey, so this shitshow happens, and people are out there who are going to make it their mission to make you miserable. You’re not alone.”

    This is me, walking back into the fire because I’m refusing to let it make me smaller.

    What I want?

    It’s super simple. I want it to stop. I want them to accept that they’ve burnt every single bridge a human can burn, short of physically attacking me, and now, even if anyone accepted their apology, we cannot unban.

    There’s no way to know they won’t start this up again, or use this as the freedom to be a bigger harm to the community. There’s no way to walk back from this level of harassment. And if that means I have to shoulder this to protect everyone else? Well. I’ll do it, but I’ll do it my way, which means I post this. I share to the world “This is a thing.”

    And this sucks. I hate telling someone “Buddy, it’s over. You’re done.” But they are. Even if I overstepped or over-reacted, 700 emails, physical packages, cards, threats, accusations of killing people, etc … how do you go back and say “Oops, I was wrong” and expect everything to be okay?

    It’s not, because it can’t be. Things don’t just go away and get better because you said you were sorry. I do believe they’re sorry, but I think they’re sorry because they got caught and punished. They aren’t sorry they did harm (if they were, they’d have stopped). Right now, they’re at the point where their argument is “We will stop hurting you when you do exactly what we want.”

    And that, I simply cannot do. Not just because I’m standing to protect the rest of the WordPress.org users, but for the principle of the thing.

    What I want? I want them to stop trying to contact me in any way, shape, or form. I want them to accept the (painful) fact that they made a massive mistake and acted in a harmful manner. I want them to be grown ups and walk away.

    Sadly, this appears to be something they cannot do.

    It’s totally Burnout

    This absolutely is burnout.

    I’m socially burned out in a lot of ways. While I had some phenomenal support from WordPress, from my work, from my friends, from professionals, it was exhausting to have to deal with this. Legally? There isn’t much I can really do. The persons involved don’t live in the US, so our laws are not in play here. International harassment laws don’t really exist. There’s nothing the police can do to stop it unless they show up in the US (which is highly unlikely).

    At best, I can file complaints (which I have) and block their contacts (ditto). I can also be proactive, look them up, find out everything that’s them, and block them before they contact me (did that). I’ve done a lot more than I list here, by the way. I don’t want to tip my hand.

    People have done everything I could possibly expect from them, and more, but … it’s still going on.

    And yes, this is part of why Plugin Team emails went anonymous.

    It’s absolutely, 100%, burnout.

    And about speaking at events?

    I don’t know.

    The last two years I just needed a break from all that to process how I felt about the situation. I knew I was tired, but that isn’t really how I feel emotionally. The last year was so hard for everyone, so brutal for us all, that having it sit on top of the pain of loss meant I never really got the chance to process. I don’t feel like it’s been two years since Dad died, I feel like it was yesterday.

    What I feel is anger and annoyance and a lot of ‘damn it to hell.’ And I am filled with defiance.

    Now that there’s a little less stress in my life (and most of ours), and with the hope that people in charge will be held accountable for their seditious actions, I feel like I’m freer to say that this happens. This happened. This is happening.

    Soon, hopefully, I’ll feel like I can safely do interviews and talks again.

    Why did I post this on my Tech blog?

    The world is angry right now. Everyone’s at their limit for coping, and for most we’re well beyond what our brains can wrap around. Half a million dead in the United States alone? It’s nearly unimaginable. And I think we’re letting our anger get the best of us.

    I posted on HalfElf and not my personal me-blog because in tech, we can easily forget there are other people on the screen. I knew, when I banned this person, that I was harming a human. I felt I had run out of other options to get them to understand that they were doing harm to the world in general, and I didn’t want anyone else to get hurt. This is not an excuse, though. I hurt someone. I hate that I did it. I hate that I have to. But there’s literally no way to stop someone from hurting others without hurting them in some way. At least not that I’ve found.

    But if I banned someone from a physical location, I could get the cops to do something (in theory, I know). I could get legal help. I could have security escort them from my location and be within my rights.

    Online?

    We don’t build our tools to handle harassment. We just don’t.

    If someone harasses you on Twitter, or Facebook, the ‘solution’ is to turn your account private, because these people will just make more and more accounts. We can’t block by IP, because they can use VPNs. We could ban all VPNs, but that has a negative impact (just for an example, I can’t edit Wikipedia when I’m at my office because we have a firewall and VPN).

    Looking at WordPress, how would you stop someone from harassing you? You make use of banned terms and plugins, but did you know most contact form plugins don’t have block tools? Logically it’s so if someone’s accidentally blocked from commenting, they can get a hold of you. But most don’t even have this as an option.

    So I post this here to put a human face on the damage being caused by our own negligence, and to make us more aware of the monster we’ve created.

    When you write new code, think about how it can be abused. Think about disrupting harassment. Think about allowing people to protect themselves. And, above all, if someone tells you this is going on? Believe them. I was lucky. Everyone believed me. Most people are not.

  • Email Verification and Unsubscribing

    If you follow me on Twitter (no you probably don’t want to), you know I’ve been dealing with the messy technical side of death for around 2 years now. My father died, unexpectedly, and I picked up his digital life and dropped it on my laptop in order to untangle things. While my father had shared his login information with me before, I did run into a number of technical issues like needing the phone for an SMS confirmation when I logged in from a new location.

    Now all that said… Here’s the technical problem a LOT of companies created for themselves.

    1. They don’t require you to verify an email before sending you advertisements
    2. Those emails do not have unsubscribe links

    Yeah, those two things are killing me, smalls.

    Why not delete the account?

    Someone’s thinking this…

    Because the last time someone emailed it, legit looking for my Dad to tell him something funny/relatable/personal, was December 2020.

    Dad was in his 70s. He had a lot of sporadic friends over that time, and sometimes they would randomly think about him and reach out. Many were long-standing friends, some I knew and hadn’t seen since I was in elementary school. He lived in a lot of places. Those people needed to be told he was dead.

    Maybe one day I’ll delete his account and his website, but it won’t be any time soon.

    How to Fix This

    The good news here is all this is fixable if people start caring about data properly.

    See the problem here stems from companies wanting your data. They want it so much that they use any excuse to grab it and never let go. But this is wrong both legally and morally.

    It’s not their data. It is YOUR data, and you should have a right to it. Per the GDPR, UK’s Right of Removal, and even California’s new laws, my data belongs to me, and I have a right (in most cases) to get it off their system. In the case of my dead father? That data is as useful for you as wings on a mongoose. But as his estate’s legal representative, I legally own Dad’s data, which means I should have control.

    Check The Email First

    Anyone who’s signed up for anything online lately knows that you have to opt-in to getting ads. That’s just how the world works now. But you also have to confirm your email before you can use your account fully.

    At the outset, that sounds great, right? It forces people to confirm! The reality though is that by letting people make an account, with or without verification of the email, those companies add the email to their mailing lists. That means that when some moron uses my father’s email to ‘test’ (or because they’re some idiot in the midwest who regularly thinks it’s his email even though Dad made it in the 1990s and has used it since then, seriously buddy, stop it), I get the email. And when they correct the email in the account, they retain access and I keep getting emails that I cannot unsubscribe from.

    We’ll get to the lack of links in a minute.

    The obvious thought process here is “People wouldn’t put in the wrong email!” but the reality? They do. They totally do. There’s a guy who bought a Ford, has a credit line, and a loan from a bank, and I know a whole lot about all this because he is a total idiot who keeps using the email that was my father’s. Seriously. It’s never been his email. The first owner of the domain was Dad. The second is me. The email he used has been in use, by my father, since March 2, 1995. Not joking.

    Now, if you keep along with the (incorrect) thought train, you’d think “Once someone enters their email, I can add it to my mailing lists as I have their consent.” And again, sure. IF the email is actually theirs. And what’s happening is all these sites add in your email to their lists before they confirm (if they confirm at all) that it’s really your email. This means my poor Dad’s email is not just added to an account, it’s added to all their lists as well.
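
    The fix for that ordering problem can be sketched in a few lines. This is a hypothetical, in-memory example (the class and method names are mine, and a real system would store tokens in a database, expire them, and actually send the email): the address only lands on the mailing list after the token mailed to it comes back.

```php
<?php
/**
 * Hypothetical signup flow: an email joins the mailing list only after
 * the confirmation token that was sent to that inbox is presented.
 * All names are illustrative; storage is a plain array.
 */
class SignupList {
	private $pending    = array(); // token => email, awaiting confirmation
	private $subscribed = array(); // confirmed emails only

	// Step 1: record the signup and return the token to email out.
	public function request( $email ) {
		$token                   = bin2hex( random_bytes( 16 ) );
		$this->pending[ $token ] = strtolower( trim( $email ) );
		return $token;
	}

	// Step 2: only someone who can read that inbox can present the token.
	public function confirm( $token ) {
		if ( ! isset( $this->pending[ $token ] ) ) {
			return false;
		}
		$this->subscribed[] = $this->pending[ $token ];
		unset( $this->pending[ $token ] );
		return true;
	}

	public function is_subscribed( $email ) {
		return in_array( strtolower( trim( $email ) ), $this->subscribed, true );
	}
}

// Usage: a typo'd or borrowed address never reaches the list unless its owner confirms.
$list  = new SignupList();
$token = $list->request( 'dad@example.com' );
var_dump( $list->is_subscribed( 'dad@example.com' ) ); // bool(false)
$list->confirm( $token );
var_dump( $list->is_subscribed( 'dad@example.com' ) ); // bool(true)
```

    If the token never comes back, the pending entry can simply be dropped, and nobody gets ads they never asked for.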

    Let Us Unsubscribe

    The other (related) issue is there’s no unsubscribe link.

    Look, I get it. There are emails that are not unsubscribeable for as long as you have an active account. There are legal reasons why you have to be mailed some things. However all those emails must have a way you can actually close/remove your account. A link would be great, but even an explanation “Hey, we cannot unsubscribe you unless you close your account, here’s how to do that.” would be better than the message from a certain ISP who told me I had to log in to the account… but were unable to provide me with the login info.

    In the case of two separate companies, if you do have to legally send out emails to people because they have an active account, you should be including some information like ‘Your account name is X’ or even ‘Your account number is X’ so that we can have a place to start. Instead, I have a bunch of emails that all say they can’t unsubscribe me while I have an active account, please log in …

    And what do you think happens when I go to log in? Of course ‘There is no account with this email…’

    Which brings me to…

    Let Us Recover Accounts

    It needs to be ‘easier’ to recover accounts. Especially if someone’s dead.

    Now, I’m not talking about Facebook’s idiocy on locking people out and requiring them to have someone else verify them, only to send another email that bounces and you can never log in. Although that was certainly fun to do with my Dad’s stuff.

    Take a hard look around. People are dying by the thousands per day, and those are not ‘expected’ deaths by any means. This means the number of humans who were unprepared and unorganized are stuck trying to find things like account numbers, and have no clue where to start. If we’re lucky, we can get into their email and change the passwords so we can keep it but…

    This is not actually very easy! The only reason I had Dad’s email was because I was his email admin. If I wasn’t, I’d have to have logged in while I was still in Japan, from his laptop, and then hoped beyond reason that I was able to change the passwords without knowing the current one.

    Think about that for a second. My father lived in Japan, had a Japanese number. He’s dead, the phone number was closed, and I can’t get it back as I’m not a Japanese resident. Which means the methods to recover are … email. But that isn’t enough for some companies.

    My ‘favourite’ is someone telling me that there was no way to know what account used my Dad’s email. Yeah, they had no way to connect an email to any account, and required me to provide a local phone number to call me about it. I blocked their emails because I literally have no other solution. They can’t tell me what account uses the address I own, and they can’t help me except by a local-to-them phone call.

    Summary? Let People Own Their Data

    Okay, here’s your summary:

    1. Require email confirmation in all cases where an account is being made. No verification? No account.
    2. Allow people to correct the emails if they can’t verify. If someone put in stevejobs@appl.com and forgot that E, they should be able to fix this.
    3. Allow people to unsubscribe from all emails with an easy to find method. A link, some explanations, whatever. Make it obvious.
    4. If people cannot legally unsubscribe while having an account, then you need to make it possible to cancel accounts when a user DOES NOT KNOW the account name. If you’ve verified emails, yo, magic. “I forgot my account name…” And again, this needs to be easy to find information.
    5. If someone sends you a damn death certificate, you should honour it.

    This is not going to fix everything, but it would certainly make us hate a couple companies a lot less.

  • Algolia: Search Faster

    Note: While this post is about using Algolia, the irony is that shortly after I posted it, I removed Algolia. The reason being, InstantSearch counts as a separate search per letter used — that means I was about to skyrocket over my allowance and hit the thousands-a-month. I feel their pricing was quite unclear about this. But hey, now you know!

    My friends, it’s been a while. And if you follow my rants on Twitter (this is not a suggestion that you should), you saw I faced off with ElasticSearch, my nemesis.

    Moons ago, I attempted to use it to make a site I run faster. At the time, I struggled with search ranking and all those things. Then it broke with Jetpack and made my server core-dump. So in 2016 I tossed it in the can and walked away. After all, I didn’t need it. WP’s search was sufficient.

    Fast forward 4 years and with around 12k posts to search, guess what isn’t so okay anymore?

    Nothing.

    When you search on WordPress, it uses SQL queries to check in and find all instances of ‘a thing’ (whatever it is you searched for). So logically if you have a lot of posts (or a lot of content, be that in the form of a few huge posts or a high number of smaller ones), you’re going to experience slower searches.

    Also WordPress’ search isn’t customizable. You can’t tell it “Don’t search page X” or even “Prioritize post titles over content.” This leads to some odd results.

    But realistically neither of those issues are ‘wrong.’ Those are broad choices made to support 80% of WordPress users.

    This means your question of “What’s wrong with search?” is really “Are there specific cases wherein the default search won’t be the best choice for me?” And those two issues? They’re why. If your site is large (or getting there) and if you need to ‘weigh’ search results to prioritize A over B, then this post is for you.

    Solving The Right Problem

    While the first thing you always look at is “What do I need to solve?” by the time you get around to ranting about how WP search sucks, you kind of know where to start. That is, either search is too slow or you need to customize it. Or both.

    If you need to customize your search results, I recommend you look at plugins like Relevanssi, which does a great job of handling that. However there are two critical flaws for most (if not all) self-hosted plugin solutions. They’re going to make your database big. And let’s be clear here, a bigger DB is not going to help your speed issues. It becomes harder to back up and more fragile. Relevanssi is refreshingly honest about this, warning you that your DB will triple in size, but also making sure you know that over 50k posts won’t work.

    Subsequently, a large site means you need to start looking at services. Algolia, Swiftype, ElasticSearch, and Solr are all amazing, viable, services. Some have plugins for easy WordPress integration, some do not. Some are open source, some are not. Some let you build your own… Let me just show you:

    Name      Open Source  Service  Roll Your Own  Plugin
    Algolia   No           Yes      No             Unofficial
    Swiftype  No           Yes      No             Official
    Elastic   Yes          Yes      Yes            Unofficial
    Solr      Yes          No       Yes            Unofficial

    You get the idea. Lots of options. And I did not pick Elastic (who owns Swiftype now, BTW).

    You see, Elastic is more than just a search. It’s really a whole database of your content. This means you can hook into it to speed up WordPress queries for long/large tables. But … That isn’t my problem. My problem is just search.

    Services are Spendy

    I ended up using a service because I was going bonkers. Seriously. The ‘directions’ for both Solr and Elastic are really terrible. They go in with some assumptions that you’ve done similar things (haven’t) and don’t have what I would call an ‘intro doc.’ Solr I got a lot further than Elastic, but WP integration was weird. And Elastic … No.

    Installing it was weirdly easy. The problem was I could not find any information about configuring it. People say “You must secure it.” Okay, sure, I can do that… But no one sat and explained why you want the nodes, what they do together, why you want them on separate servers (or even that you do) and .. Honestly I wanted to throw my laptop out the window.

    It does, mind you, bring up an important note. Search storage and Elastic services are expensive. Even Jetpack, who offers a bundled Elasticsearch integration (yes, that’s what Jetpack Search is) would cost a site with 10,000 posts around $600 a year. Even using Amazon’s Elasticsearch it’s going to run you a lot. How much? Well if you just toss in their defaults and accept the large settings, it’s to the tune of $22 a day. Give or take. Small settings for my site? Around $8 a day.

    ElasticPress (whom I do recommend if you’re using Elastic) starts at $79 a month. Jetpack’s new search is free for small sites, but for mine (again, we’re over 10k posts) it would begin at $60.

    Algolia though … 12k records is about $3 a month. And it’s all search.
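    To put those prices on one scale, here is a rough back-of-the-envelope sketch that annualizes the numbers quoted above. It assumes flat pricing, 365 days and 12 months a year, which real plans may not follow, and the labels are just mine.

    ```php
    <?php
    // Rough yearly cost from the prices quoted in this post.
    // Assumption: flat pricing, 365 days / 12 months per year.
    $yearly_cost = array(
    	'Amazon Elasticsearch, large defaults' => 22 * 365, // $22 a day
    	'Amazon Elasticsearch, small settings' => 8 * 365,  // $8 a day
    	'ElasticPress, entry plan'             => 79 * 12,  // $79 a month
    	'Jetpack Search, 10k posts'            => 600,      // quoted per year
    	'Algolia, 12k records'                 => 3 * 12,   // $3 a month
    );

    asort( $yearly_cost ); // Cheapest first, keys preserved.

    foreach ( $yearly_cost as $service => $dollars ) {
    	printf( '%-40s $%d' . PHP_EOL, $service, $dollars );
    }
    ```

    Even with generous rounding, the gap between “a few dollars a month” and “a few dollars a day” is what made the decision for me.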

    Enter Algolia

    The name meaning delights me:

    inability to speak due to mental deficiency or a manifestation of dementia.

    Because when you are searching, you often feel like you’re losing your mind and you have a problem. Search is hard okay? There’s a reason AltaVista, Lycos, Yahoo, and now Google are important. Searching is crazy weird and hard and sometimes it’s faster to go “lezwatchtv ACTOR NAME” than to search on our site.

    That was not good at all.

    Algolia is one of the more straightforward setups I’ve had in a while.

    1. Register on algolia.com
    2. Spin up a new app
    3. Install WP Search with Algolia
    4. Add in your keys
    5. Tell it what to search
    6. Tell it if you want auto-complete
    7. Tell it if you want a new search-result page
    8. Index
    9. Done

    Oh you know there’s a little more.

    Reducing Records

    I decided to make my records smaller. Algolia only cares about the number of records, not the size, as long as each record is under 10k. I have a lot of meta data and a lot of records. If I was to index everything, I’d be around 15k records, which isn’t bad but I really only needed about 12k of them.

    One of the odd things the plugin does is that it uses separate indexes for Auto-Complete. So I could store all my searchable posts and all my shows, characters, etc etc. Which would make for 50k records, and I didn’t want that. Sure it makes some aspects of search easier, but I knew I could do this a better way.

    I started by making a plugin and removing some records:

    // Never index users or taxonomy terms.
    add_filter( 'algolia_should_index_user', 'my_prefix_algolia_never_index' );
    add_filter( 'algolia_should_index_term', 'my_prefix_algolia_never_index' );
    
    function my_prefix_algolia_never_index() {
    	return false;
    }
    

    This tells the plugin “Never index users or taxonomies.” Most of you will want the taxonomies! I don’t, mostly because they don’t really impact how people search. And yes, I did study my logs. No one cares who wrote what for my site, and that’s okay.

    Refining Search Results

    Next I needed to make sure that autocomplete (which I use) and the search page both put the right content to the top.

    There is one and only one ‘flaw’ with Algolia, and that’s they don’t make it easy to define a ‘perfect’ match. I have a case where I have 5 post types (posts, pages, shows, actors, characters) and there’s crossover. If I search for “One Day at a Time” I get everything that mentions it. Which is not what I wanted. And while the title of the post I wanted to find was the TV show “One Day at a Time”, it was bringing up my blog posts (and the page!) first.

    This was solvable because the plugin is amazing. I filtered and told it what attributes to remove:

    // Run the same cleanup for both the posts index and the searchable posts index.
    add_filter( 'algolia_post_shared_attributes', 'my_prefix_algolia_attributes', 10, 2 );
    add_filter( 'algolia_searchable_post_shared_attributes', 'my_prefix_algolia_attributes', 10, 2 );
    
    // The function name has to match the callback registered above.
    function my_prefix_algolia_attributes( array $attributes, WP_Post $post ) {
    
    	// Remove things we're not using to make it easier.
    	$remove_array = array( 'taxonomies_hierarchical', 'post_excerpt', 'post_modified', 'comment_count', 'menu_order', 'taxonomies', 'post_author', 'post_mime_type' );
    	foreach ( $remove_array as $remove_this ) {
    		if ( isset( $attributes[ $remove_this ] ) ) {
    			unset( $attributes[ $remove_this ] );
    		}
    	}
    	return $attributes;
    }
    

    This ensured I keep my records small enough because I had some math to do.

    The function my_prefix_algolia_attributes() also needed to promote certain posts over others, so I added in a switch (before the return) using some data I already saved:

    // Add Data for individual ranking
    switch ( $post->post_type ) {
    	case 'post_type_shows':
    		// Base score on show score + 50
    		$attributes['score'] = round( get_post_meta( $post->ID, 'lezshows_the_score', true ), 2 );
    		$attributes['score'] = 50 + (int) $attributes['score'];
    		break;
    	case 'post_type_characters':
    		$attributes['score'] = 150;
    		break;
    	case 'post_type_actors':
    		$attributes['score'] = 150;
    		break;
    	default:
    		$attributes['score'] = 0;
    		break;
    }
    

    This adds an attribute of ‘score’ based on post type. I could weigh them up or down as I wanted.

    Then I went into Algolia’s admin, and this is where the search tool becomes a champ. Under Indices -> Configuration, I changed up the Ranking and Sorting:

    Their default is: ["typo","geo","words","filters","proximity","attribute","exact","custom"]

    Mine is: ["exact","score","post_title","attribute","post_type_label","typo","proximity","words","is_sticky","post_date"]

    This actually handled 90% of what I needed without any custom tweaks or rules.

    But weight, there’s more!

    Those ‘Attributes’ are the searchable parts of the attributes I was messing with in the refinements section. Most of what they’re used for is helping rank and sort the relevant data to make sure Sara Lance is on top. Which she always is. But. I always wanted to make some related data show up.

    By default, the searchable attributes were title and content. I added in a new attribute called lwtv_meta and in it I added more data. When the index is built for a character (say), it adds a list of all the actors who play the character and all the shows they’re on into that meta attribute. Then I added that attribute to the search. This means if you look for “Legends of Tomorrow” you will see our girl Sara Lance.
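    As a minimal sketch of that idea (not the code I actually run), building the extra attribute boils down to flattening the related names into one searchable string. The helper name my_prefix_build_search_meta() and the sample data are hypothetical; on the real site the names would come from post meta, and the assignment would happen inside the attributes filter.

    ```php
    <?php
    /**
     * Hypothetical helper: flatten related actor and show names
     * into a single string that can be stored as a searchable attribute.
     */
    function my_prefix_build_search_meta( array $actors, array $shows ) {
    	$names = array_merge( $actors, $shows );
    	$names = array_map( 'trim', $names );
    	// Drop empty entries and duplicates so the record stays small.
    	$names = array_unique( array_filter( $names, 'strlen' ) );

    	return implode( ', ', $names );
    }

    // Sample data only; stands in for what the filter would receive.
    $attributes = array( 'post_title' => 'Sara Lance' );

    $attributes['lwtv_meta'] = my_prefix_build_search_meta(
    	array( 'Caity Lotz' ),
    	array( 'Arrow', 'Legends of Tomorrow', 'Arrow' )
    );

    echo $attributes['lwtv_meta'] . PHP_EOL;
    ```

    Keeping it as one flat string means a single extra searchable attribute in Algolia rather than one per relationship.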

    That has a small side effect though… Where’s the show!?

    So I still have some kinks to work out, but the point is that with a couple tweaks and some extra data, I got everything set up in 3 days. Bonus? The plugin came with templates I quickly tweaked to match my theme. And I’m bad at design!

    Algolia? Let’s be demented together!

  • The Importance of Correct Casting

    The Importance of Correct Casting

    So PHP 7.4 is rolling out, WP is hawking it, and you know your code is fine so you don’t worry. Except you start getting complaints from your users! How on earth could that be!? Your plugin is super small and simple, why would this happen? You manage to get your hands on the error messages and it’s weird:

    NOTICE: PHP message: PHP Warning:  Illegal string offset 'TERM' in /wp-content/plugins/foobar/search-form.php on line 146
    NOTICE: PHP message: PHP Warning:  ksort() expects parameter 1 to be array, string given in /wp-content/plugins/foobar/search-form.php on line 151
    NOTICE: PHP message: PHP Warning:  Invalid argument supplied for foreach() in /wp-content/plugins/foobar/search-form.php on line 153
    NOTICE: PHP message: PHP Warning:  session_start(): Cannot start session when headers already sent in /wp-content/plugins/foobar/config.php on line 12
    

    But that doesn’t make any sense because your code is pretty straightforward:

    [...]
    
    $name_array = '';
    if( !empty( $term_children )){
        foreach ( $term_children as $child ) {
            $term = get_term_by( 'id', $child, $taxonomy );
    
            $name_array[ $term->name ]= $child;
        }
    
        if( !empty( $name_array ) ){
    
            ksort( $name_array );
    
            foreach( $name_array as $key => $value ) {
        [...]
    

    Why would this be a problem?

    Error 1: Illegal string offset

    The first error we see is this:

    Illegal string offset 'TERM' in /wp-content/plugins/foobar/search-form.php on line 146

    We’re clearly trying to save things into an array, but the string offset actually means you’re trying to use a string as an array.

    An example of how we might force this error is as follows:

    $fruits_basket = array(
        'persimmons' => 1, 
        'oranges'    => 5,
        'plums'      => 0,
    );
    
    echo $fruits_basket['persimmons']; // echoes 1
    
    $fruits_basket = "a string";
    
    echo $fruits_basket['persimmons']; // illegal string offset error
    $fruits_basket['peaches'] = 2; // this will also throw the same error in your logs
    

    Simply, you cannot treat a string as an array. Makes sense, right? The second example (peaches) fails because you had re-set $fruits_basket to a string, and once it’s that, you have to re-declare it as an array.

    But with our error, we can see line 146 is $name_array[ $term->name ]= $child; and that should be an array, right?

    Well. Yes, provided $name_array is an array. Hold on to that. Let’s look at error 2.

    Error 2: ksort expects an array

    The second error is that the function wanted an array and got a string:

    NOTICE: PHP message: PHP Warning: ksort() expects parameter 1 to be array, string given in /wp-content/plugins/foobar/search-form.php on line 151

    We use ksort() to sort the order of an array and here it’s clearly telling us “Buddy, $name_array isn’t an array!” Now, one fix here would be to edit line 149 to be this:

    if( !empty( $name_array ) && is_array( $name_array ) ){
    

    That makes sure it doesn’t try to do array tricks on a non-array, but the question is … why is that not an array to begin with? Hold on to that again, we want to look at the next problem…

    Error 3: Invalid argument

    Now that we’ve seen the other two, you probably know what’s coming here:

    NOTICE: PHP message: PHP Warning: Invalid argument supplied for foreach() in /wp-content/plugins/foobar/search-form.php on line 153

    This is foreach() telling us that the argument you passed isn’t an array. Again.

    What Isn’t An Array?

    We’re forcing the variable $name_array to be an array on line 146. Or at least we thought we were.

    From experience, using $name_array[KEY] = ITEM; was just fine from PHP 5.4 up through 7.3, but as soon as I updated a site to 7.4, I got that same error all over.

    The issue was resolved by changing line 141 to this: $name_array = array();

    Instead of defaulting $name_array as empty with '', I used the empty array() which makes it an array.
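    Here is a stripped-down sketch of the corrected pattern that runs outside WordPress. The $terms array is a stand-in I made up for the get_term_by() lookups in the real plugin; everything else follows the loop above.

    ```php
    <?php
    // Stand-in data (term ID => term name). In the plugin these
    // values come from get_term_by(); the names here are made up.
    $terms = array(
    	12 => 'Plums',
    	7  => 'Oranges',
    	31 => 'Persimmons',
    );

    // Declare the variable as an array up front, not as ''.
    $name_array = array();

    foreach ( $terms as $term_id => $term_name ) {
    	$name_array[ $term_name ] = $term_id;
    }

    // Belt and braces: only sort and loop if we really have an array.
    if ( ! empty( $name_array ) && is_array( $name_array ) ) {
    	ksort( $name_array ); // Sort alphabetically by term name.

    	foreach ( $name_array as $name => $term_id ) {
    		echo $name . ' => ' . $term_id . PHP_EOL;
    	}
    }
    ```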

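    An alternative is this: $name_array = (array) '';

    This casts the variable as an array, but be careful: casting an empty string gives you an array with one element (an empty string at index 0), not an empty array. Since the array is meant to be empty here, array() is the safer choice.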
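    A quick way to see what each option actually produces is to dump them side by side. This is only a sanity check of how PHP converts values to arrays, with variable names of my own choosing.

    ```php
    <?php
    $declared  = array();      // Truly empty array.
    $cast      = (array) '';   // One element: an empty string at index 0.
    $from_null = (array) null; // Casting null does give an empty array.

    var_dump( empty( $declared ) );  // bool(true)
    var_dump( count( $cast ) );      // int(1)
    var_dump( empty( $cast ) );      // bool(false)
    var_dump( empty( $from_null ) ); // bool(true)
    ```

    That stray element matters in a check like !empty( $name_array ), which would pass even when no terms were ever added.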

    Backward Incompatibility

    Where did I learn this? I read the PHP 7.4 migration notes, and found it in the backward incompatibility section.

    Array-style access of non-arrays

    Trying to use values of type null, bool, int, float or resource as an array (such as $null["key"]) will now generate a notice.

    The lesson here is that PHP 7.4 is finally behaving in a strict fashion with regards to data types. Whenever a non-array variable is being used like an array, you get an error because you didn’t say “Mother May I…” and declare it was an array.

    Whew.

    Now to be fair, this was a warning previously, but a lot of us (hi) missed it.

    So. Since WordPress is pushing PHP 7.4, go check all your plugins and themes for that and clean it up. Declare an array or break.

    Oh and that last error? Headers already sent? Went away as soon as we fixed the variable.

  • Shlinky Dinks

    Shlinky Dinks

    For a number of reasons it was time to move on to new things. I was looking for a better, more modern solution to running my own short URLs.

    There are a lot of reasons people want these. When I started with them, it was because Twitter had limits and I wanted to control my tweets and short URLs. But then time moved on, Twitter decided to meh, not care about URL length, which meant I didn’t really need the extra weight.

    But I had a reason to keep mine around, and that’s WordCamps. 99.999% of the use I have for short URLs is to link people to things for WordCamps, like my slides but also related links that otherwise would be too long for anyone to write down in a reasonable time frame.

    And while I’d been using the same old, functional, system, it had quirks that had long since frustrated me, including not being a modern design. I felt like I was stepping back into the early 2000s, and yes, that UX matters to me.

    Enter Shlink.io

    After experimenting around, I found Shlink.io, a GDPR (yes!) friendly self-hosted URL shortener that is a little more tech, but a lot more smooth. It has a full blown API, a deep command line, and an (optional) admin that is, well, nifty.

    Features include:

    • Custom short URLs
    • Multiple Domains
    • QR Codes
    • Tags
    • Robust stats
    • Validates URLs before linking

    It’s not a set-and-forget install, to be sure, and each server is going to have some quirks, but overall I’m happy with it already.

    What’s Missing

    There’s no WordPress plugin. Yet. I suspect this will happen once people realize the API is so freaking crazy.

    There’s no way to import everything from another service, but I did a fast export of my DB and then grep’d and search/replaced so I could run commands like this:

    php bin/cli short-url:generate -c SHORT https://example.com/
    

    Done and done. Imported a few thousand URLs. I will note that most of those links don’t matter, since nearly no one hit them, but I’m just a stickler for old URLs continuing to work. Most of the time. I went back through all the failed imports and found I had old links to things like test sites.
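    If you would rather script that step than search/replace by hand, a rough sketch looks like this. The $old_links array and the my_prefix_shlink_command() helper are hypothetical stand-ins for whatever your export gives you; the only part taken from above is the shape of the command itself.

    ```php
    <?php
    /**
     * Hypothetical helper: build one import command per old link,
     * quoting both values so odd characters in a URL don't break the shell.
     */
    function my_prefix_shlink_command( $short_code, $long_url ) {
    	return sprintf(
    		'php bin/cli short-url:generate -c %s %s',
    		escapeshellarg( $short_code ),
    		escapeshellarg( $long_url )
    	);
    }

    // Made-up export: old short code => long URL.
    $old_links = array(
    	'wcus'   => 'https://example.com/slides/wordcamp-us/',
    	'plugin' => 'https://example.com/a/very/long/path/?with=query&and=more',
    );

    foreach ( $old_links as $short_code => $long_url ) {
    	echo my_prefix_shlink_command( $short_code, $long_url ) . PHP_EOL;
    }
    ```

    Printing the commands instead of running them means you can eyeball the list (and weed out the test sites) before importing anything.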

    Also the admin backend is an add-on (or non-hosted but I’m neurotic). I installed the web client at a subdomain and then used the configurator to allow passwordless logins. No, I didn’t leave it unprotected! I went old school:

    #Protect Directory
    AuthName "Dialog prompt"
    AuthType Basic
    AuthUserFile /home/ipstenu/example.com/admin/.htpasswd
    Require valid-user
    
    SSLOptions +StrictRequire
    SSLRequireSSL
    SSLRequire %{HTTP_HOST} eq "sub.example.com"
    
    ErrorDocument 403 https://example.com
    
    <Files "servers.json">
      Order Allow,Deny
      Deny from all
    </Files>
    

    What Was Messy

    The GeoLiteDB stuff was weird. It took me a while to realize I was running out of space in tmp and that was blocking me from doing anything. Since I host this VPS on DreamHost at the moment, and I work there, I went and set tmp to disk instead of memory and that magically worked.

    Now. Would I like the admin stuff to be built in and easier to manage? Of course. And I would like ‘better’ security when I use the servers.json file (like maybe telling people to protect it and hide their API keys, hey), but I’ve properly opened up a ticket for them on that one.

    End Result?

    I like it. So I’m using Shlink now and there you go.