Everyone’s heard of Semalt by now. They are, weirdly, an actual company run by actual people, who are entirely weird and annoying.
I should explain. I’ve talked to them over email and Twitter, and I’ve read about them all over the net, like everyone else has. They’re an ‘SEO’ company that trawls the net via bots, just like Google and everyone else, tracking you and your competitors. Here’s how they explain it:
Semalt is a professional webmaster analytics tool that opens the door to new opportunities for the market monitoring, yours and your competitors’ positions tracking and comprehensible analytics business information.
That sounds vaguely legit when you look at it on the surface. They’re based in Ukraine, which explains the imperfect English, and showed up right around the time Russia was invading, so most of us made Putin jokes and moved on. They’re not actually doing anything bad; they’re just acting like a regular bot, scanning your site…
Except they’re not.
When asked, they’ll tell you that Semalt crawler bots visit websites and gather statistical data for their service, simulating real user behavior. Their crawler bots, and yes, they admit they’re bots, don’t click on advertising banners or extend links. And all the visits are automatic and random.
This means their goal is to build a bot that acts like a human. Now I don’t know about you, but I don’t trust anything when I can’t see its brain, and I certainly know better than to believe in true random when it comes to software. But what gets me is how you stop the bot from scanning.
Everyone uses a robots.txt file to block bots from scanning things they don’t need to scan. If you use WordPress and have pretty permalinks on, go to http://example.com/robots.txt and you’ll see a default file, made by WordPress, to block various folders like wp-admin from being scanned.
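For the curious, here is a minimal sketch of what a well-behaved bot is supposed to do with that file before it fetches anything. It uses Python's standard urllib.robotparser. The robots.txt contents below are an assumption about what a default WordPress install serves (the real file varies by version and plugins), and the bot name PoliteBot is made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# Assumed contents of a default WordPress robots.txt; the real file
# varies by WordPress version and installed plugins.
ROBOTS_TXT = """\
User-agent: *
Disallow: /wp-admin/
"""


def allowed(user_agent: str, url: str) -> bool:
    """Return True if the rules above permit user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines directly, so no network call is made.
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # "PoliteBot" is a hypothetical crawler name.
    print(allowed("PoliteBot", "http://example.com/wp-admin/"))
    print(allowed("PoliteBot", "http://example.com/hello-world/"))
```

With these assumed rules, the wp-admin URL is refused and the ordinary post URL is allowed. The catch is that the check is entirely voluntary: robots.txt is a request, and nothing on the server enforces it.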
Semalt ignores these. They also ignore things like bot rate limiting, and they use IPs from around the world to scan your site (arguably to get a better idea of real speed and response), so they end up acting a little like a DDoS attack. Worse, they claim to act like a ‘user’ but I never have a link to my wp-admin pages from the front of my site, which means their bot is checking for WordPress and going there not because a user would have any reason, but simply because Semalt knows WordPress is there.
Besides that, what’s the real issue here? Semalt is screwing up my stats. They’re using referrer links to check my sites out, which means I end up with a bunch of referral links from them in my stats.
Those links tell me someone linked to me, and generally I go back and check them out to see if they’re someone I want to talk to or work with. These are not. Worse, they don’t really act like ‘real’ users, despite the claim. Karen Francis has a great explanation as to why Semalt is ruining your bounce rates in Google, and a couple of good ways to block them.
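If you would rather clean up the numbers than block the bot outright, one option is to drop hits by referrer when you process your own logs. What follows is only a sketch of that idea, not the method from Karen's post. The blocklist, the function names, and the shape of a "hit" (a dict with a referrer key) are all assumptions made for illustration.

```python
from urllib.parse import urlparse

# Hypothetical blocklist; add whatever domains show up in your own stats.
BLOCKED_REFERRER_DOMAINS = {"semalt.com"}


def is_blocked_referrer(referrer: str, blocked=BLOCKED_REFERRER_DOMAINS) -> bool:
    """Return True if the referrer's host is a blocked domain or a subdomain of one."""
    # hostname is None for empty or scheme-less values, so those are never blocked.
    host = (urlparse(referrer).hostname or "").lower()
    return any(host == domain or host.endswith("." + domain) for domain in blocked)


def filter_hits(hits):
    """Return only the hits whose referrer is not on the blocklist."""
    return [hit for hit in hits if not is_blocked_referrer(hit.get("referrer", ""))]


if __name__ == "__main__":
    # Made-up sample data; the paths and referrers are illustrative only.
    sample = [
        {"path": "/", "referrer": "http://semalt.com/anything"},
        {"path": "/about/", "referrer": "http://example.org/some-post/"},
        {"path": "/", "referrer": ""},
    ]
    for hit in filter_hits(sample):
        print(hit["path"], hit["referrer"])
```

Matching on the parsed hostname rather than searching the whole string keeps an innocent referrer that merely mentions the name in its path or query from being thrown out. It only tidies the stats after the fact, though. The bot still visits.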
Am I blocking them? No, not right now. Do I trust them? Not at all. They make it ‘easier’ for someone else to compare themselves to me, which is laudable, but they do it in a way that makes it harder for me to understand how my sites are doing. And that, to me, is the epitome of the goal of all black hat SEO companies. They gain at someone else’s loss.