I got an odd email from ‘Yan’ who, amid the odd hateful and sexist remarks about how I apparently don’t like people in the WP community, had a legit question:
Seen your post:
But… Your actual robots.txt does not reflect the content of the article. Hmmm…
Well. That was almost two years ago, and it never reflected the content of this site. I don’t care if this site has its images snagged; I have anti-hotlinking protection in my .htaccess anyway. It’s my other site, my massively huge gallery with 10G of photos, that I protect.
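I'm not reproducing my actual .htaccess rules here, but anti-hotlinking protection usually looks something like this mod_rewrite sketch (example.com is a placeholder, and the exact conditions are my assumptions, not my real config):

```apache
# Block other sites from embedding my images directly.
RewriteEngine On
# Allow empty referrers (direct visits, some privacy tools)
RewriteCond %{HTTP_REFERER} !^$
# Allow my own domain (placeholder: example.com)
RewriteCond %{HTTP_REFERER} !^https?://(www\.)?example\.com/ [NC]
# Anything else asking for an image gets a 403 Forbidden
RewriteRule \.(gif|jpe?g|png)$ - [F,NC]
```

The same idea can also serve a "no hotlinking" placeholder image instead of a 403, if you'd rather advertise than block.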
That brings up a point that many people miss about this site: I’m not always talking about what I do here. For example, I talked about making a slide-up bit of code that duplicates WordPress.com’s follow tab. That email setup isn’t here and it never will be; I’m not particularly worried about my readers here being able to find the sidebar email registration box. This is a site for slightly more technical people.
However, the site that code is on is visited primarily by luddites. They want news about their thing in a way that is simple, straightforward, and easy, and they need a reminder. They also share links on social media a lot, so making a slide-up that auto-pops when you arrive from Facebook, Tumblr, or Twitter was the right choice to make sure people knew what was going on and how to sign up. It’s had amazing results.
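The actual slide-up code isn't shown here, but the core of the "came from social media" check is just a referrer test. Here's a minimal TypeScript sketch; the function name, host list, and `showFollowSlideUp` hook are my own placeholders, not the real site's code:

```typescript
// Hosts that should trigger the auto-pop (my guesses, adjust to taste).
const SOCIAL_HOSTS = ["facebook.com", "tumblr.com", "t.umblr.com", "twitter.com", "t.co"];

// Returns true if the referrer URL belongs to one of the social hosts,
// including subdomains like www.facebook.com.
function cameFromSocial(referrer: string): boolean {
  if (!referrer) return false; // direct visit or stripped referrer
  try {
    const host = new URL(referrer).hostname;
    return SOCIAL_HOSTS.some((h) => host === h || host.endsWith("." + h));
  } catch {
    return false; // malformed referrer: don't pop the tab
  }
}

// In the browser you'd wire it up roughly like:
// if (cameFromSocial(document.referrer)) { showFollowSlideUp(); }
```

Keeping the check as a pure function makes it easy to test without a browser, and the actual slide-up animation stays a separate concern.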
There are lessons I learn from running multiple sites and I bring them all back here to people who would appreciate them.
Now, if people wonder: yes, I do think Google’s still evil for how they hotlink images. Of course, I’d think them equally evil for copying my images. Image search is just a really messy thing. You get two options: either Google keeps a copy of every image on the planet, or it hotlinks them. Even assuming they’re clever enough to deduplicate images with some sort of super-powerful algorithm, each option has a problem. If you’ve copied everything, you have to have a file server the likes of which would make the pyramids look teeny tiny, and in both cases you need a database with enough speed to stop us all from running Google PageSpeed tests on Google.
Am I the only one who does that? Oh. Sorry.
We’re talking about Google being evil and robots.txt files. The site where I do block Google Images has a very large robots.txt:
User-Agent: *
# My stuff
Disallow: /cgi_bin/
# WordPress
Disallow: /trackback/
Disallow: /blog/
Disallow: /wp/
Disallow: /wordpress/wp-admin/
Disallow: /wordpress/wp-includes/
Disallow: /wordpress/xmlrpc.php
Disallow: /wp-admin/
Disallow: /wp-content/
Disallow: /wp-includes/
Disallow: /xmlrpc.php
Disallow: /wp-
# Gallery
Disallow: /gallery/albums/
Disallow: /gallery/themes/
Disallow: /gallery/zp-core/
Disallow: /gallery/zp-data/
Disallow: /gallery/page/search/
Disallow: /gallery/uploaded/
Disallow: /gallery/rss.php
Disallow: /gallery/rss-comments.php
Disallow: /gallery/README.html
Disallow: /gallery/rss-news-comments.php
Disallow: /gallery/rss-news.php
# Wiki
Disallow: /wiki/images/
Disallow: /wiki/bin/
Disallow: /wiki/cache/
Disallow: /wiki/config/
Disallow: /wiki/docs/
Disallow: /wiki/extensions/
Disallow: /wiki/languages/
Disallow: /wiki/maintenance/
Disallow: /wiki/math/
Disallow: /wiki/public/
Disallow: /wiki/serialized/
Disallow: /wiki/tests/
Disallow: /wiki/skins/
Disallow: /wiki/t/
Disallow: /wiki/index.php

User-agent: Mediapartners-Google
Allow: /

User-agent: Adsbot-Google
Allow: /

User-agent: Googlebot-Image
Disallow: /

User-agent: Googlebot-Mobile
Allow: /

User-agent: Browsershots
Allow: /

User-agent: Dotbot
Allow: /
So yes, actually, I am still using that code. There you are.