Author: Ipstenu (Mika Epstein)

Greylist, RBLs, and Spam

Recently I noticed I had 13 spam emails all from the same ‘company.’ The content was incredibly similar, though subtly different. The from email was always different, but you could tell by looking at it that it was the same sender. And even more damning, it all had ‘junk’ content and 100+ recipients. But for some reason, SpamAssassin wasn’t catching it!

After 5 emails came in back to back, I decided to do something about it.

At first I was trying to find a way to tell SpamAssassin or Exim how to auto-turf the emails with 100+ people listed in the ‘To’ field. This proved to be a little more difficult and complicated than I wanted, and I was sure that these spammers would catch on to that sooner or later.

What I really wanted was for SpamCop to pick up on this, but I’ve been sending these emails in to no avail for a while. That got me looking into how cPanel handles SpamCop in the first place.

Real-Time Blackhole Lists

cPanel uses RBLs, Real-time Blackhole Lists, to determine if an email sent to you is spam or not. By default, it comes with SpamCop and Spamhaus. That means it will reject mail at SMTP time if the sender host is in the bl.spamcop.net or zen.spamhaus.org RBL. Well that was well and good, but could I add more to that list?

Of course. I pulled up cPanel’s documentation on RBLs and determined I could add as many as I wanted. On the top of the Basic EXIM Editor is a link to Manage Custom RBLs which is what I wanted. All I had to do was figure out what to add.

After reading through Wikipedia’s comparison of DNS blacklists, I picked a few and tested the latest emails that had come through, looking for ones that caught them. Then I tested known good emails and made sure they weren’t caught. I ended up adding Barracudacentral and IPRange.
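For anyone curious what “testing” an email against an RBL looks like under the hood, here is a minimal sketch. It assumes the usual DNSBL convention: reverse the octets of the sending IP, append the list’s zone, and see whether that name resolves. The function names are made up for illustration, and this is not how cPanel or Exim does it internally. Some lists may also answer with special codes or refuse queries that come through public resolvers, so treat a hit as a hint, not a verdict.

```php
<?php

/**
 * Build the DNS name to query for an IPv4 address in a given DNSBL zone.
 * DNSBLs are queried by reversing the octets and appending the zone.
 */
function dnsbl_query_name(string $ip, string $zone): ?string
{
    if (filter_var($ip, FILTER_VALIDATE_IP, FILTER_FLAG_IPV4) === false) {
        return null; // this sketch only handles IPv4
    }
    return implode('.', array_reverse(explode('.', $ip))) . '.' . $zone;
}

/**
 * Return the zones (from $zones) that list the given IP.
 * A listed IP resolves to an A record; an unlisted one does not resolve.
 */
function dnsbl_hits(string $ip, array $zones): array
{
    $hits = [];
    foreach ($zones as $zone) {
        $name = dnsbl_query_name($ip, $zone);
        // The trailing dot keeps the resolver from appending a search domain.
        if ($name !== null && checkdnsrr($name . '.', 'A')) {
            $hits[] = $zone;
        }
    }
    return $hits;
}

// 192.0.2.1 is a documentation-only address standing in for a sender's IP.
// print_r(dnsbl_hits('192.0.2.1', ['bl.spamcop.net', 'zen.spamhaus.org']));
```

Run the sender IP from a spam email through it, then the sender IP from a known good email, and compare which zones come back.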

Greylisting

The next thing I did was introduce Greylisting to my email. The way Greylisting works is that if it doesn’t recognize the sender, it will temporarily reject the email and tell the sending server to try again. If the email is real, the server tries to send it again after a little while. There are some downsides to this, as it’s possible for a legit email to be trapped for a few hours (or days) if someone’s set up their server poorly. On the other hand, within half an hour, I blocked 11 emails.
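The idea is easier to see in code. This is a toy sketch only: it assumes the greylist is keyed on the sender IP, envelope sender, and recipient, and that the retry window is five minutes. Both are assumptions for illustration, not cPanel’s actual settings, and the class name is made up.

```php
<?php

/**
 * Toy greylist keyed on the (sender IP, envelope from, envelope to) triplet.
 * Real servers persist this state; this sketch keeps it in memory.
 */
class Greylist
{
    /** @var array<string,int> first-seen timestamp per triplet */
    private $firstSeen = [];

    /** @var int how long a new triplet must wait before it is accepted */
    private $delaySeconds;

    public function __construct(int $delaySeconds = 300)
    {
        $this->delaySeconds = $delaySeconds;
    }

    /** Returns an SMTP-style response for a delivery attempt made at $now. */
    public function check(string $ip, string $from, string $to, int $now): string
    {
        $key = strtolower("$ip|$from|$to");
        if (!isset($this->firstSeen[$key])) {
            // Never seen before: remember it and ask the sender to retry.
            $this->firstSeen[$key] = $now;
            return '451 Temporarily rejected, try again later';
        }
        if ($now - $this->firstSeen[$key] < $this->delaySeconds) {
            // Retried too quickly: still inside the waiting window.
            return '451 Temporarily rejected, try again later';
        }
        return '250 OK';
    }
}
```

A properly configured mail server queues the message and retries after the delay, so it gets through on a later attempt. Most spam cannons never bother to come back.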

I mean. I’m pretty sure monica@getoffherpes.com is spam. You know what I mean?

This was super easy to do, too. I turned on Greylisting, I restarted Exim, I walked away.

Okay no, I didn’t. I sat and watched it to see if anyone legit got caught (one did, it passed itself through properly).

Result?

A little less spam. I don’t expect this to work for everything, but it had an immediate impact on many of the spam emails that were annoying me.

10 August, 2016
CMB2 And The Dropdown Years
At WordCamp Montreal, I mentioned the database of dead lesbians that Tracy and I maintain. The camper looked at it and said “You know it would be awesome if you showed the shows’ airdates.”

Good point! Except I just plain struggled with the concepts and how to do them in CMB2. I knew I could make multiple fields in one ‘metabox’ as I read up on the snippet for an address field, but try as I might, I couldn’t make it work.

I tweeted my headache and ended up talking to Justin Sternberg who asked me if I could explain my use case better.
I have 300+ posts, all of which have a start and end date. Some may have an end date of “current” however.

Examples of valid data:
- 1977-1979
- 2016-current
- 2000-2016
I also need to sort by start and end year. So I can search for all posts with a start of 2014.

I could have two year-sorts, easily, but that makes for a clunky interface as it would be separate fields. I know CMB2 can have a combined field (like addresses) but while I got it to save, it wouldn’t properly display on the edit page.

This only needs to be editable on the WP admin edit post.
That night, he replied and asked if this year-range field type would work.

Mind? Blown. It works exactly how I need it to. I tweaked the code (and threw in a pull request) to set up a way to reverse the years (show newest first) which is more useful for my needs.
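To make the use case concrete, here is a rough sketch of the data rules described in the tweet: a four-digit start year, an end year that may be “current,” and a way to find everything with a given start year. It assumes the range is handled as a single string like “2016-current,” which is not necessarily how the CMB2 field stores it, and the function names are hypothetical.

```php
<?php

/**
 * Parse a string such as "1977-1979" or "2016-current" into start and end years.
 * An end of "current" is stored as null. Returns null if the input is not valid.
 */
function parse_year_range(string $range): ?array
{
    if (!preg_match('/^(\d{4})-(\d{4}|current)$/', trim($range), $m)) {
        return null;
    }
    $start = (int) $m[1];
    $end   = $m[2] === 'current' ? null : (int) $m[2];
    if ($end !== null && $end < $start) {
        return null; // a show cannot end before it starts
    }
    return ['start' => $start, 'end' => $end];
}

/** Keep only the shows (title => range string) whose range starts in $year. */
function shows_starting_in(array $shows, int $year): array
{
    return array_filter($shows, function ($range) use ($year) {
        $parsed = parse_year_range($range);
        return $parsed !== null && $parsed['start'] === $year;
    });
}
```

Storing start and end as separate values is what makes the “all posts with a start of 2014” query possible, even though the editor only sees one combined field.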

Now? Editing 319 show entries.
8 August, 2016
Mobile Ad Detection
I screwed up not that long ago.

I got an email from Google Adsense telling me that one of my sites was in violation because it was showing two ads on the same mobile screen, which is not allowed. Until I started using some of Google’s whole-page mobile ads (an experiment), this was never an issue. Now it was. Now I had to fix it.

Thankfully I knew the simplest answer would be to detect if a page was mobile and not display the ads. Among other things, I know that I hate too many ads on mobile. So all I wanted was to use the Google page level ads – one ad for the mobile page, easily dismissible. Therefore it would be best if I hid all but two other ads. One isn’t really an ad as much as an affiliate box, and one is a Google responsive ad.

For my mobile detector, I went with MobileDetect, which is a lightweight PHP class. I picked it because I was already using PHP to determine what ads showed based on shortcodes so it was a logical choice.

Now the way my simple code works is you can use a WordPress shortcode like [showads name="google-responsive"] and that calls a file, passing a parameter for name into the file to generate the ad via a mess of switches and sanitization. Really you can go to http://example.com/myads.php?name=leaderboard and it would show you the ad.

The bare bones of the code looks like this:
```
<?php

require_once 'Mobile_Detect.php';
$detect = new Mobile_Detect;

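// Read the requested ad name; default to an empty string if it's missing.
$thisad = isset($_GET['name']) ? trim(strip_tags($_GET['name'])) : '';
$mobileads = array('google-responsive', 'affiliate-ad');

// If it's mobile AND it's not in the array, bail.
if ( $detect->isMobile() && !in_array($thisad, $mobileads) ) {
	return;
}

// Escape the name before putting it in an attribute; strip_tags() leaves quotes alone.
echo '<div class="jf-adboxes '.htmlspecialchars($thisad, ENT_QUOTES).'">';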

switch ($thisad) {
	case "half banner":
		echo "the ad";
		break;
	case "line-buttons":
		echo "the ad";
		break;
	default:
		echo "Why are you here?";
}

echo '</div>';
```
The secret sauce is that check for two things:
1. Is the ad not one I’ve authorized for mobile?
2. Is this mobile?
Only if both are true does the script bail. In every other case it continues to run. It’s simple but I like to have things stop as soon as possible to make loading faster. There’s no CSS trickery to hide things via mobile size detection. It’s as simple, and as close to a binary check as I can make it.
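If you want to convince yourself the logic is right, the same check can be pulled out into a tiny function that takes the mobile flag as a plain boolean. This is just a sketch for testing the combinations. The function name is made up, and in the real script the flag comes from `$detect->isMobile()`.

```php
<?php

/**
 * Decide whether an ad should render.
 * Desktop visitors get every ad; mobile visitors get only the allowed ones.
 */
function should_show_ad(bool $isMobile, string $name, array $mobileAds): bool
{
    // Same condition as the bail-out in the script, just inverted.
    return !($isMobile && !in_array($name, $mobileAds, true));
}

$mobileAds = array('google-responsive', 'affiliate-ad');
```

On desktop every ad shows. On mobile, `google-responsive` and `affiliate-ad` show and everything else is skipped.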
5 August, 2016
Backtrack to Clean Code

I was watching The Bletchley Circle, about four women who were part of the code breakers in World War II, and how they stumbled upon a serial killer because only they could see the patterns. In the third episode of the first season, the main character is trying to explain why they need to understand the killer, Crowley, from before he started killing, and she says the following:

At Bletchley, when we came across corrupted data, we had to backtrack till we hit clean code. That’s how you find an error in the pattern. All Crowley’s giving us is corrupted data. We need to backtrack to before he was killing. We need to start from there. That’s how we’ll find him.

I’d never thought of it in those words, but that’s exactly right.

When we debug code, when we find errors, we always backtrack to clean code. Most of us aren’t trying to find psychopaths and serial killers, of course. What we’re trying to do is find the patterns and understand what went wrong. And many times, we’re trying to find patterns when the telling of the breaking doesn’t lend itself to any patterns.

Think about how you describe a situation, how you explain what’s broken. You start with your part. “I was trying to do X.” Then you explain what you expected to happen. “Normally that makes the color blue.” Next you say what did happen. “Instead, it made the color red.”

That’s all well and good, except there’s a great deal missing. Some of it will be pertinent and some won’t. Some will be overkill and useless signal to noise, and some minutiae will be just what is needed to solve a problem. The difficulty is that you may not know what happened that is important. If all you know is ‘I upgraded WordPress’ for example, then you may not be aware of all the changes that went into the WP_Http API. You may not know about the new Multisite functions.

If you’re not a developer who read the field guide for WordPress 4.6 RC1 and all the linked posts, and ran a compare of 4.5.3 to 4.6-RC1, then maybe you’d be surprised when your plugin breaks. And while you thought well of yourself for testing on the release candidate, you’re stunned at how much changed, and not sure what on earth happened.

So you backtrack. You know that the magic sauce is in the requests sent to the server. And you know you’re using wp_remote_request() to do it. So you look at anything related to that. What does it call? Did that change? You step back and back until you find as much as you can, and when you’ve determined it’s ‘something,’ you reach out for help.

In WordPress, this is why we tell people to switch to default themes or disable plugins. We’re asking people to backtrack to code we know is clean. We can’t read minds and know the little things. So we ask people to backtrack in the most obvious ways. “Does it happen with all the other plugins off?”
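Written as code, the “turn everything off, then add things back” routine looks something like this. It’s a sketch of the idea only: `find_culprit` and the `$isBroken` callback are hypothetical names, and in real life the “check” is you reloading the page and looking.

```php
<?php

/**
 * Start from a known-clean state (no plugins active) and re-enable plugins
 * one at a time until the problem shows up again.
 *
 * $isBroken is a caller-supplied check that receives the list of active
 * plugins and returns true when the bug reproduces.
 */
function find_culprit(array $plugins, callable $isBroken): ?string
{
    $active = [];
    if ($isBroken($active)) {
        return null; // broken even with everything off: not a plugin problem
    }
    foreach ($plugins as $plugin) {
        $active[] = $plugin;
        if ($isBroken($active)) {
            return $plugin; // the last plugin switched on tipped it over
        }
    }
    return null; // never reproduced
}
```

The first check matters most: if it’s still broken with everything off, the code you thought was clean isn’t, and you need to backtrack further.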

Backtracking to clean code.

3 August, 2016
Long Term Vision
Say what you will about Jetpack, the plugin serves a great purpose in a few major ways.
1. Once you register for the API, you never have to again.
2. Everything is easy to find to update and configure (Menu -> Jetpack).
3. New Features are added and you don’t need to install a new plugin.
Now look at something else. A company released over a dozen Facebook plugins. All the plugins required you to connect via their API (a separate connection in each). All the plugins required you to use their admin panel to set up a per-plugin configuration. All the plugins deleted those settings on deactivation. Or how about a WooCommerce related set of plugins that all required the use of their API (legitimately) but all the plugin did was connect you and send you to where that specific plugin part was configured?

Got that in your head? Good. Now what if Jetpack did that? What if to enable each aspect of Jetpack you had to install Jetpack Stats, Jetpack Comment Form, Jetpack Subscriptions, etc etc etc.

You’d hate Jetpack. And worse, the Jetpack developers would too. They’d have to work extra hard to ensure the whole suite of plugins conformed to style and protocol. Shared libraries? Gotta update them in all of the plugins. Oh and don’t forget to make sure they’re all backwards compatible in case someone updates one but not another. Figure out which one takes priority, make sure someone else’s changes on Stats don’t break Comment Form, and on and on and on.

There’s a reason Jetpack works as well as it does, and it’s not just because Automattic is behind it. Jetpack has one sign up, one registration, one setup for the connection. Each sub-app is toggled via Jetpack. New additions, when the main plugin is updated, are all easily checked for backcompat and everyone tests together before pushing out.
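As a rough illustration of the one-plugin, many-toggles approach, here is a minimal sketch. To be clear, this is not Jetpack’s actual code: the class and method names are invented, and a real plugin would read the enabled list from its saved settings.

```php
<?php

/**
 * One plugin, many optional modules: each feature registers a callback under
 * a slug, and a single settings array decides which ones actually load.
 */
class ModuleRegistry
{
    /** @var array<string,callable> boot callback per module slug */
    private $modules = [];

    public function register(string $slug, callable $boot): void
    {
        $this->modules[$slug] = $boot;
    }

    /** Boot every registered module that is switched on; return the slugs booted. */
    public function bootEnabled(array $enabled): array
    {
        $booted = [];
        foreach ($this->modules as $slug => $boot) {
            if (!empty($enabled[$slug])) {
                $boot();
                $booted[] = $slug;
            }
        }
        return $booted;
    }
}
```

Adding a feature means registering one more module in the same codebase. The shared code ships once, and users flip a switch instead of installing another plugin.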

So why do I call this the long view?

Because the long view considers not just adding new users to your system, but keeping them in a way that makes them happy. The long view looks at the reality that your developers will leave. The long view thinks about the easiest way to maintain a lot of code. The long view makes sure that introducing old users to new things is easy.

And that means, the long view would look at your 15 or 20 plugins that all use the same ‘base library’ and tell you it’s a shitty plan. It’s more hours on more code with more potential conflicts. It’s less cross-code checking. It’s more testing. It’s more unit tests that have to be repeated over and over.

The biggest reason I see people argue that 18 plugins is better than 1 is ‘SEO.’ The quotes are there on purpose. Because it’s bullshit. Anyone who thinks 18 plugins will net you better SEO than one, well written, well curated document file on the master plugin has failed at SEO school and needs to meet Ted. Ted is a 12 inch lead pipe that the boss keeps in the top drawer of his desk at DreamHost. No, not really. But the point remains, they’re not an SEO Expert.

Content is king. Remember that? Duplicate content is bad.

However, in some cases, content is deliberately duplicated across domains in an attempt to manipulate search engine rankings or win more traffic. Deceptive practices like this can result in a poor user experience, when a visitor sees substantially the same content repeated within a set of search results.

That applies to your code too. Duplicate code, duplicate functionality, is bad.

Now there is always a time and a place for multiple separate plugins. I only want to use Easy Digital Downloads extension for Stripe, not any other payment gateway. So I don’t need the extra plugins in a ‘payment gateway suite.’ But there, EDD cleverly has all the base code in their plugin and the add-ons just enable more features. Yoast’s Video SEO is similarly an add-on. They didn’t waste time making a dupe of their main SEO plugin just to add in videos.

I hope the point is made. You can make your code simpler, easier to maintain, and easier for your users to find the new things if you keep it all in one. And that is a win.
1 August, 2016
Balancing Information and Monetization
One of the many ways in which newspapers are failing online is in monetization. We have very few options, when you get down to it.
1. Ads
2. Subscriptions
3. Donations
No company can really survive off donations, so the question really becomes how do we balance ads and subscriptions? Many newspapers have tried the simple tracking method of allowing people to read X number of articles before informing the reader they have to pay. Others throw up splash ads before the article is shown. And others show only some of the article before requiring registration.

They’re all problematic.

Users ignore the ads, they don’t register, and they walk away instead of reading. The issue for the user is that they want as few barriers as possible between themselves and the news. They want to pick an article, click the link, and read. To be inundated with ads and signup popups is annoying, and I suspect the attrition rate is abysmal.

This only gets worse when ads get ‘clever’ and make it hard to find the X to click out and get away from them. They trick users into clicking the wrong thing, which only annoys them more. Plus ads can slow things down on mobile, which is increasingly the way for things to go.

Recently I caught myself thinking that one way to encourage registrations in WordPress would be to have the post content ‘disappear’ after X days, unless the user was a member. Of course, that wouldn’t work for all sites, as not everyone wants to register on People.com. Also the old, archival news on The New York Times is something that really only the deep diving researchers (and weird net denizens) are after. Considering we can all go to the library and look everything old up on microfiche, why do we have to pay for everything old?

So what should be limited?

How about we start with that cesspool of the internet: Comments. This is a double edged sword. If you allow open comments on a news site, consider requiring registration for them. This will allow you to more easily track and ban assholes. Sure, they can make new accounts, but in doing so you can follow them and block them. A win for everyone. Also you can track people who false-report bad people. Spam catchers will stop most bots from signing up at all.

In addition, you can turn off comments for older posts to non-paying users. After 45 days, only paid up members can comment. And make sure you don’t offer refunds if the guidelines are violated. If haters are gonna hate, make ’em pay for it.
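The rule is simple enough to write down. This is a sketch under the assumptions in the text (registration required to comment, a 45 day cutoff for non-paying users), with a made-up function name. It doesn’t hook into WordPress at all.

```php
<?php

/**
 * Can this visitor comment on a post?
 * Registration is always required; after the cutoff only paid members may comment.
 */
function can_comment(bool $isRegistered, bool $isPaidMember, int $postAgeDays, int $cutoffDays = 45): bool
{
    if (!$isRegistered) {
        return false; // no anonymous comments at all
    }
    if ($postAgeDays > $cutoffDays) {
        return $isPaidMember; // old posts are members-only
    }
    return true;
}
```

A registered but non-paying reader can comment on a ten day old post, but not on one from last year.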

Aaron Jorbin – Haters Gonna Hate (by Helen)

As for what content to restrict, it has to be more granular than just time. Take an election year. All articles about Hillary Clinton and Donald Trump should be readable. But read-only. No comments on any of them. Be realistic. Someone famous dies? Unlock all their posts so everyone can read all about them. The Olympics should have historical, important, events unlocked, but at the same time you don’t need every little detail.

This would be a tremendous amount of work, don’t get me wrong, but the days of assuming the internet is free money are long over. If we want people to pay us for content, we have to make it worthwhile.
29 July, 2016