Half-Elf on Tech

Thoughts From a Professional Lesbian

Tag: google

  • Great Walls of Fire

    If your website is ‘international’, you’ve probably already run into this. If you’ve ever worked for corporate America, you’ve seen this happen. If you’re on an out-of-date browser, you’ve definitely seen it.

    It’s that horrible day when your site looks like absolute crap because something critical to your site isn’t loading because it cannot be accessed from where you are.

    Let’s step back.

    A wall of fire from a fireball

    Using Google’s Open Sans doesn’t work in China because Google is, generally speaking, blocked in China. What does this mean for people there? Well, their WordPress dashboard is going to load slowly. In fact, for many it’s so slow as to be unusable. This is because WP 3.8 uses Google fonts on the back end. Now, you can disable it with the Disable Google Fonts plugin, but many people feel that’s a poor user experience.

    I want to take a moment to note that while I, personally, dislike using Google for required fonts on WordPress, there was a discussion on bundling vs linking that took place in November 2013. At that time, everyone agreed bundling (that is, including the fonts with WP) would be best. How to do that, however, was more complicated. In order to provide a good experience for all users and all languages, it got messy and large really fast. You can read up on the history on make/core: Open Sans, bundling vs. linking.

    That being said, by remotely calling Google, there are two main issues: privacy (which I’m not getting into) and accessibility (that’s what I want to poke at).

    We know that calling Google remotely, for everyone in a country with good internet speeds, is great as it speeds up your site! That’s the premise behind Use Google Libraries after all. If someone’s already downloaded the files in their cache for JS (or fonts), then the next site they visit will load faster if it uses the same ones! It’s a sound, and accurate, theory.

    But are you making your site inaccessible with your need for speed?

    What happens when someone can’t load Google? In the case of the Use Google Libraries plugin, it falls back to using WP’s standard, which is good, but that initial connection has to fail first! The same happens for WordPress’s back end, which means the time it takes to load your site is longer for those users.
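
    To make that "has to fail first" cost concrete, here is a minimal shell sketch of the same try-remote-then-fall-back pattern. It is not how the Use Google Libraries plugin or WordPress actually does it; the function name fetch_or_fallback, its arguments, and the 3 second default timeout are all made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper (not from any plugin): try a remote asset first,
# then fall back to a local copy.
# Usage: fetch_or_fallback REMOTE_URL LOCAL_FILE [TIMEOUT_SECONDS]
fetch_or_fallback() {
    remote_url=$1
    local_copy=$2
    timeout_secs=${3:-3}   # assumed default; pick what suits your users
    tmp_file=$(mktemp) || return 1

    # --fail returns non-zero on HTTP errors; --max-time caps the total wait.
    if curl --silent --fail --max-time "$timeout_secs" \
            --output "$tmp_file" "$remote_url" 2>/dev/null; then
        cat "$tmp_file"
        status=$?
    else
        # Only reached after the remote attempt has failed or timed out,
        # so a blocked visitor pays for the wait before seeing anything.
        cat "$local_copy"
        status=$?
    fi

    rm -f "$tmp_file"
    return "$status"
}
```

    The point to notice is the else branch: the local copy is only served after the remote attempt has used up its time, which is the delay a blocked visitor feels on every load.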

    I tell this to people a lot, and generally they reply “Well I don’t have users in China.” or “This doesn’t affect my visitors.” Really? REALLY? Sorry, but that is one of the most shortsighted views I’ve ever heard in the history of ever, and one I tell people “Yeah, that’s an incorrect and rather arrogant conceit.”

    Mt. Fuji

    Let me tell you a story. I went to Japan for 12 days to hike O’Henro with my Buddhist brother and our agnostic father. We made the “A Jew, a Buddhist, and an Agnostic walk into a temple…” jokes. While there, I checked in on my websites and on my regular sites I visited, only to find out that some were blocked because we were using a net connection that went through a country those sites blocked. Why? Because “no one” who would ever visit their site was from there.

    I no longer use those services, I no longer support those sites. My father goes to China regularly (he does risk assessment on the disused weapons caches in China). I turn off Google Fonts on his website in order that he, and his customers, can see the site as intended. You can tell me all you want that ‘no one’ visits from those places, and you’re just wrong, and arrogant, and yes, I strongly disagree with WordPress having made Open Sans via Google a requirement like that.

    My personal dislike of Google aside, it’s a reality I look at more than I wish I had to that people and places block third parties. This is why I get on the case of plugin developers who use third-party services when they don’t have to. They’ve created an unnecessary dependency on this other service, which will be crazy to debug when someone says their code doesn’t work.

    Your code should be as self-sufficient as humanly possible. Offloading for ‘speed’ doesn’t work for all situations, and instead can make things slower by causing more external resources to load. Have you ever looked at those scan reports where they say your site calls too many sites for JS or CSS? This is what you’re doing. Even though it increases the odds that people who can get to these sources will already have the files cached, people who can never access them have a worse time. And then loading them locally will make your site heavier and load slower, unless you use proxy-caching (like Pagespeed or Varnish).

    It’s not a perfect solution for everyone. This is why websites are hard.

  • Plugin Wish: Login With Google

    Now I know what you’re thinking. “Mika, there are a hundred plugins that let you log in via Google!”

    That’s not what I mean. Let me explain with a story.

    You have a business, example.com, and you use Google Apps for everything. Then you start tying this into other companies, like a time sheet company, that lets you ‘Login with Google’ and redirects you to the right company settings. Cool, right? Kind of like this:

    replicon

    And you think you’d like an internal, private, blog, where people can post cat pictures. Or whatever. What if you could just have the login screen be that Google button? And you know there are a bajillion plugins for it, but you want to have it be only people on example.com. So you@gmail.com can’t log in, but me@example.com and dad@example.com can!

    I want that.

    I have not yet seen it, but I think that would be an amazing plugin. By default, the domain it ‘validates’ would be the one on which it’s installed (so here it’d be halfelf.org), but you could override it (which is good, since I’d want to use ipstenu.org). Then you’d want it to ‘generate’ new users if they don’t exist, since you don’t want to have to add every single new person, right?
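
    The domain check at the heart of that wish is small. Here is a rough shell sketch of just the comparison, with a made-up function name (is_allowed_email) and made-up arguments. It assumes the address has already come back from a verified Google sign-in; a real plugin would need to do that verification, and this does not.

```shell
#!/bin/sh
# Hypothetical check: does this email address belong to the allowed domain?
# Usage: is_allowed_email ADDRESS ALLOWED_DOMAIN
is_allowed_email() {
    address=$1
    allowed_domain=$2

    # Reject anything that does not look like local@domain at all.
    case $address in
        *@*) ;;
        *) return 1 ;;
    esac

    # Everything after the last "@" is the domain part.
    domain=${address##*@}

    # Domain names are not case-sensitive, so compare in lowercase.
    domain_lc=$(printf '%s' "$domain" | tr '[:upper:]' '[:lower:]')
    allowed_lc=$(printf '%s' "$allowed_domain" | tr '[:upper:]' '[:lower:]')

    # Exact match only, so notexample.com does not pass for example.com.
    [ "$domain_lc" = "$allowed_lc" ]
}
```

    With that, is_allowed_email me@example.com example.com succeeds and is_allowed_email you@gmail.com example.com fails. The override described above would just mean passing a different second argument.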

    Oh and you don’t have to terribly worry about that fired guy, bob@example.com, because once he’s fired and you disable the email account, he can’t log in!

    Some concerns of course would be Two-Factor Authentication. Also how do you handle multisite? I would envision a default nothing-set option for Multisite, where the network admin could network activate, and set the default domain there. Add in a check box for “Allow individual sites to override?” at the very least. Maybe a sneaky “Always allow the super admin to log in” setting too, though that gets complicated fast.

    Cliff Seal pinged me about this and said he’d been fiddling with https://github.com/logoscreative/wordpress-openid but he never finished. Who’s up for the challenge?

    And no, it did not escape me the hilarity of me, a loud “I don’t like Google owning all my data!” person suggesting this.

  • Privacy and Evil and Money

    Google likes to say ‘You can make money without doing evil.’ It’s right in their Company Philosophy.

    I’ve never bought into that. I mean, I agree you can do it without being evil, but I think that evil is highly subjective and what I feel is evil may not be what they do. Case in point would be endorsements.

    Maybe you’ve noticed when you Google search, sometimes your friends’ recommendations pop up in the results. Like I searched for fabric stores and got results from my BFF, Andrea. That was amusing, but also disturbing. See, there’s a big difference between search results, and results in ads.

    Let’s step back. Here’s what Google says about their ‘endorsement’ system:

    Google makes it easy for you to get great recommendations from your friends. For example, when you visit the Google Play music store, you may see that a friend has +1’d a new album by your favorite artist. When you search for a restaurant, you may see an ad including a 5-star review by another friend.

    That sounds pretty cool, right? My friends, people I follow on G+, contribute to my results. That’s sensible, since one presumes I share some interests with my friends. But then you scroll down the page and see a section about endorsements in ads.

    This setting below allows you to limit the use of your name and photo in shared endorsements in ads. It applies only to actions that Google displays within ads; the “Summertime Spas” example above shows a shared endorsement appearing in an ad on Google Search. Changing this setting does not impact how your name and photo might look in a shared endorsement that is not in an ad — for example, when you share a music recommendation that is displayed in the Play Store. You can limit the visibility of activity outside of ads by deleting the activity or changing its visibility settings.

    Let me get this straight. People pay for ads on Google, so Google is making money. People click on the ads, so the advertiser makes money. My ‘endorsements’ are posted, without my permission, to drive traffic to those ads to make people money. I am not paid for this service.

    Thanks, Google. Guess what I just unchecked?

    Look, if you want to use me in search results, that’s one thing. Using me in ads is another. If a company took a comment I made in email and used it on their site to say “The Half-Elf loves our cocoa!” without asking me first, I’d be upset. I don’t ever expect to be compensated for my endorsements, but I do expect to opt in to them. Here’s a real world example. I went to a spa and they had a ‘fill out this card to tell us what you think’ thing at the end. At the bottom was a box. “Check here if we can use your comments, or excerpts thereof, in our advertising.” I thought about it, looked at what I wrote, and checked the box.

    But they let me opt in. They asked me for my permission to use me to make more money than the money I gave them for services rendered. I have no idea if they did use what I said, but I liked that they asked (and I liked the services) so I went back a couple times before moving across the country.

    I wish Google understood that sort of respect.

    Have a read of their updated TOS just for fun.

  • Hotlinking is Evil (And So Is Google)

    We all know hotlinking is a bad thing. Hotlinking uses up someone else’s bandwidth, which costs them money. It takes away from any profit they might make on ads, because you’re not going to their site. It removes their credit from images. So why did Google decide to hotlink when they made their faster image search?

    This is what the new image search looks like:

    Faster Image Search

    I’ll admit, that looks pretty nifty. It’s a fast way to see images. But it’s also a fast way to lose attribution. Here’s what just the new image box looks like.

    Close Up

    This image now loads ‘seemingly’ locally. It’s totally a part of Google, though; there’s no reference back to how the site looks (it used to be an overlay). In fact, most people will just see the image, copy it if they want, and move on to the next site. No one has any reason to dig deeper and visit the image’s page.

    By contrast, the thumbnail images you see on Google, if you viewed source, look like this: https://encrypted-tbn3.gstatic.com/images?q=tbn:ANd9GcQPNtMLkk8rwj3lLv6a2kEQ8_eo6BuiUZYn3N5z3cbMu6rVPo3Xkw

    If you go to gstatic.com though, all you get is a 404 error page, but it’s pretty easy to find out this is where Google saves all their static content. Including images. These thumbnails are in moderate to low quality, and if that was all Google did, show small, iffy, thumbnails and redirect people to the real site, that would be great. Instead, now they actively hotlink from you. Oh yes, that full image you saw in my screenshot was directly linked to the owner’s media file.

    The first thing I did after noticing this was to add the following to my robots.txt:

    User-agent: Googlebot-Image
    Disallow: /
    

    Those directions are right from Google, who doesn’t even pitch you any reason why you wouldn’t want to do this. Normally they’ll tell you ‘You can, we’d rather you didn’t because of XYZ, but here it is anyway.’ This time, it’s a straight up ‘Here’s how.’ I find that rather telling.

    Naturally I went on to read why Google thought this was a good idea.

    The following points are all reasons Google thinks this is better.

    We now display detailed information about the image (the metadata) right underneath the image in the search results, instead of redirecting users to a separate landing page.

    The first part about this, the detailed information, is great. Having the metadata right there without redirecting to the separate page like they used to, with the data on the side that no one read, is an improvement. Thank you for that.

    We’re featuring some key information much more prominently next to the image: the title of the page hosting the image, the domain name it comes from, and the image size.

    Again, this is great. I think that the data should be more visible than it is, especially the ‘This image may be copyright protected’ stuff. Considering Google won’t allow you to use ads if you use copyright protected material (which they claim I do here, by the way), they really have a higher standard to live up to when it comes to informing people of the stick by which they are measured.

    The domain name is now clickable, and we also added a new button to visit the page the image is hosted on. This means that there are now four clickable targets to the source page instead of just two. In our tests, we’ve seen a net increase in the average click-through rate to the hosting website.

    I can see this being true. Again, the links should be more obvious, and really they should link not to the image directly, but the contextual page in all cases. Traffic is important, and if you send people to the image page where they don’t see the ads, you’re causing them to lose money. So the idea behind this part is really nice, and I’m for it, it just needs some kick-back improvements. Google should give people a good reason to go to the parent site. And this next item is where they fail…

    The source page will no longer load up in an iframe in the background of the image detail view. This speeds up the experience for users, reduces the load on the source website’s servers, and improves the accuracy of webmaster metrics such as pageviews. As usual, image search query data is available in Top Search Queries in Webmaster Tools.

    And now we hit the problem. While this is true (it will be both faster and use less of my bandwidth while decreasing load), it’s still showing my image off my servers! Worse? It’s got the full sized image from my server, which means if I have a 4 meg photo (and I do), they’ll be pulling all 4 megs down, and the reader can just right-click and save. They never need to touch my site.

    As Bill and Ted would say, Bogus.

    Go back to how Google shows thumbnails. They have their own, lower-rez version. I regularly post other people’s images on a site, and when I do, I purposefully keep a lower resolution version on my site, and link to them for the best. Why? Because it’s their image. They did the work, they made it, I should honor them and respect them, and be a good net-denizen. Google’s failing on that.

    For me their search has always been a little questionable for images. Now it’s outright evil.

  • CentOS and PHP 5.4

    I finally got around to PHP 5.4.

    Alas this meant reinstalling certain things, like ImageMagick and APC.

    This also brought up the question of PageSpeed, which I keep toying with. I use it at work, but since this server’s on CentOS with EasyApache, there’s no ‘easy’ way to install PageSpeed yet (not even a yum install will work), so it’s all manual work plus fiddling. I don’t mind installing ImageMagick and APC, but Google’s own ‘install from source’ instructions aren’t really optimized for CentOS, even though they say they are, and I’m nervous about the matter. Well… I did it anyway. It’s at the bottom.

    The only reason I had to do this all over is that I moved to a new major version of PHP. If I’d stayed on 5.3 and up’d to 5.3.21, that wouldn’t have mattered. But this changed a lot of things, and thus, a reinstall.

    ImageMagick

    I started using ImageMagick shortly after starting with DreamHost, since my co-worker Shredder was working on the ‘Have WP support ImageMagick’ project. It was weird, since I remembered using it before, and then everyone moved to GD. I used to run a photo gallery with Gallery2, and it had a way to point your install to ImageMagick. Naturally I assumed I still had it on my server, since I used to (in 2008). Well since 2008, I’ve moved servers. Twice. And now it’s no longer default.

    Well. Let’s do one of the weirder installs.

    First you install these to get your dependencies:

    yum install ImageMagick
    yum install ImageMagick-devel
    

    Then you remove them, because nine times out of ten, the yum packages are old:

    yum remove ImageMagick
    yum remove ImageMagick-devel
    

    This also cleans out any old copies you may have, so it’s okay.

    Now we install ImageMagick latest and greatest from ImageMagick:

    cd ~/tmp/
    wget http://imagemagick.mirrorcatalogs.com/ImageMagick-6.8.1-10.tar.gz
    tar zxf ImageMagick-6.8.1-10.tar.gz
    cd ImageMagick-6.8.1-10
    ./configure --with-perl=/usr/bin/perl
    make
    make install
    

    Next we install the -devel again, but this time we tell it where from:

    rpm -i --nodeps http://www.imagemagick.org/download/linux/CentOS/x86_64/ImageMagick-devel-6.8.1-10.x86_64.rpm
    

    Finally we can install the PHP stuff. Since I’m on PHP 5.4, I have to use imagick-3.1.0RC2. Normally I’m not up for RCs on my live server, but this is a case where if I want PHP 5.4, I have to. By the way, next time you complain that your webhost is behind on PHP, this is probably why. If they told you ‘To get PHP 5.4, I have to install Release Candidate products, so your website will run on stuff that’s still being tested,’ a lot of you would rethink the prospect.

    cd ~/tmp/
    wget http://pecl.php.net/get/imagick-3.1.0RC2.tgz
    tar zxf imagick-3.1.0RC2.tgz
    cd imagick-3.1.0RC2
    phpize
    ./configure
    make
    make install
    

    Next, edit your php.ini to add this:

    extension=imagick.so
    

    Restart httpd (service httpd restart) and make sure PHP is okay (php -v), and you should be done! I had to totally uninstall and start over to make it work, since I wasn’t starting from clean.
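
    If you want a bit more than php -v, a quick way to confirm the extension actually loaded is to look for it in the module list that php -m prints. This is only a sketch; the helper name module_loaded is made up, and it just reads a module list on stdin.

```shell
#!/bin/sh
# Hypothetical helper: reads a module list on stdin (one name per line,
# the way `php -m` prints it) and succeeds if the named module is listed.
# Usage: php -m | module_loaded imagick
module_loaded() {
    # -q quiet, -i ignore case, -x whole-line match, -F fixed string
    grep -qixF -- "$1"
}
```

    Something like php -m | module_loaded imagick && echo "imagick is loaded" does the job, and works the same for apc. One caveat: the command line PHP can read a different php.ini than the one Apache uses, so treat this as a sanity check and not proof.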

    Speaking of clean, cleanup is:

    yum remove ImageMagick-devel
    rm -rf ~/tmp/ImageMagick-6.8.1-10*
    rm -rf ~/tmp/imagick-3.1.0RC2*
    

    APC

    I love APC. I can use it for so many things, and I’m just more comfortable with it than xcache. Part of it stems from a feeling that if PHP built it, it’s more likely to work. Also it’s friendly with my brand of PHP, and after 15 years, I’m uninclined to change. I like DSO, even if it makes WP a bit odd.

    Get the latest version and install:

    cd ~/tmp/
    wget http://pecl.php.net/get/APC-3.1.14.tgz
    tar -xzf APC-3.1.14.tgz
    cd APC-3.1.14
    phpize
    ./configure
    make
    make install
    

    Add this to your php.ini:

    extension = "apc.so"
    

    Restart httpd again, clean up that folder, and then one more…

    mod_pagespeed

    I hate Google. Well, no I don’t, but I don’t trust them any more than I do Microsoft, and it’s really nothing personal, but I have issues with them. Now, I use PageSpeed at work, so I’m more comfortable than I was, and first I tried Google’s installer. The RPM won’t work, so I tried to install from source, but it got shirty with me, fast, and I thought “Why isn’t this as easy as the other two were!?” I mean, APC was stupid easy, and even easier than that would be yum install pagespeed, right?

    Thankfully for my sanity, someone else did already figure this out for me, Jordan Cooks, and I’m reproducing his Installing mod_pagespeed on a cPanel/WHM server notes for myself. (By the way, I keep a copy of this article saved to Dropbox since invariably I will half-ass this and break my site.) Prerequisite was to have mod_deflate, which I do.

    The commands are crazy simple:

    cd /usr/local/src
    mkdir mod_pagespeed
    cd mod_pagespeed
    wget https://dl-ssl.google.com/dl/linux/direct/mod-pagespeed-beta_current_x86_64.rpm
    rpm2cpio mod-pagespeed-beta_current_x86_64.rpm | cpio -idmv
    cp usr/lib64/httpd/modules/mod_pagespeed.so /usr/local/apache/modules/
    chmod 755 /usr/local/apache/modules/mod_pagespeed.so
    mkdir -p /var/mod_pagespeed/cache
    chown nobody:nobody /var/mod_pagespeed/*
    

    Once you do this, you have to edit the file, and this is where I differ from Jordan’s direction. He just copied this over /usr/local/apache/conf/pagespeed.conf but I had an older version from a ‘Let’s try Google’s way….’ attempt and someone else’s directions, so I made a backup and then took out the ModPagespeedGeneratedFilePrefix line since I know that’s deprecated. I also added in a line to tell it to ignore wp-admin.

    Here’s my pagespeed.conf (edited):

    LoadModule pagespeed_module modules/mod_pagespeed.so
    
    	# Only attempt to load mod_deflate if it hasn't been loaded already.
    <IfModule !mod_deflate.c>
    	LoadModule deflate_module modules/mod_deflate.so
    </IfModule>
    
    <IfModule pagespeed_module>
    	ModPagespeed on
    
    	AddOutputFilterByType MOD_PAGESPEED_OUTPUT_FILTER text/html
    
    	ModPagespeedFileCachePath "/var/mod_pagespeed/cache/"
    
        ModPagespeedEnableFilters rewrite_javascript,rewrite_css
        ModPagespeedEnableFilters collapse_whitespace,elide_attributes
        ModPagespeedEnableFilters rewrite_images
        ModPagespeedEnableFilters remove_comments
    
    	ModPagespeedFileCacheSizeKb 102400
    	ModPagespeedFileCacheCleanIntervalMs 3600000
    	
    	# Bound the number of images that can be rewritten at any one time; this
    	# avoids overloading the CPU. Set this to 0 to remove the bound.
    	#
    	# ModPagespeedImageMaxRewritesAtOnce 8
    
    	<Location /mod_pagespeed_beacon>
    		SetHandler mod_pagespeed_beacon
    	</Location>
    
    	<Location /mod_pagespeed_statistics>
    		Order allow,deny
    		# You may insert other "Allow from" lines to add hosts you want to
    		# allow to look at generated statistics. Another possibility is
    		# to comment out the "Order" and "Allow" options from the config
    		# file, to allow any client that can reach your server to examine
    		# statistics. This might be appropriate in an experimental setup or
    		# if the Apache server is protected by a reverse proxy that will
    		# filter URLs in some fashion.
    		Allow from localhost
    		Allow from 127.0.0.1
    		SetHandler mod_pagespeed_statistics
    	</Location>
    
    	ModPagespeedMessageBufferSize 100000	
    	ModPagespeedDisallow */wp-admin/*
    	ModPagespeedXHeaderValue "Powered By mod_pagespeed"
    
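    	<Location /mod_pagespeed_message>
    		# Same restriction as the statistics handler above: without an
    		# explicit "Order allow,deny" the Allow lines do not limit access.
    		Order allow,deny
    		Allow from localhost
    		Allow from 127.0.0.1
    		SetHandler mod_pagespeed_message
    	</Location>
    	<Location /mod_pagespeed_referer_statistics>
    		Order allow,deny
    		Allow from localhost
    		Allow from 127.0.0.1
    		SetHandler mod_pagespeed_referer_statistics
    	</Location>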
    </IfModule>
    

    To tell Apache to run this, edit /usr/local/apache/conf/includes/pre_main_global.conf and add:

    Include conf/pagespeed.conf

    Note: We put this code here because EasyApache and httpd.conf will eat your changes.

    Finally, you rebuild the Apache config, restart Apache, and test your headers to see goodness! My test was a success.

    HTTP/1.1 200 OK
    Date: Mon, 21 Jan 2013 03:12:13 GMT
    Server: Apache
    X-Powered-By: PHP/5.4.10
    Set-Cookie: PHPSESSID=f4bcdae48a1e5d5c5e8868cfef35593a; path=/
    Cache-Control: max-age=0, no-cache
    Pragma: no-cache
    X-Pingback: https://ipstenu.org/xmlrpc.php
    X-Mod-Pagespeed: Powered By mod_pagespeed
    Vary: Accept-Encoding
    Content-Length: 30864
    Content-Type: text/html; charset=UTF-8
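
    If you’d rather not eyeball the headers every time, the check can be scripted. This is a small sketch with a made-up helper name (has_pagespeed_header); it reads response headers on stdin and only looks for the X-Mod-Pagespeed line shown above.

```shell
#!/bin/sh
# Hypothetical helper: reads HTTP response headers on stdin and succeeds
# if an X-Mod-Pagespeed header is present.
# Header names are case-insensitive, hence -i.
has_pagespeed_header() {
    grep -qi '^x-mod-pagespeed:'
}
```

    Feed it the output of a HEAD request, for example curl -sI https://example.com/ | has_pagespeed_header && echo "PageSpeed is on" (swap in your own URL). It says nothing about which filters ran, only that the module answered.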
    

    For those wondering why I’m ignoring wp-admin, well … sometimes, on some servers, in some setups, if you don’t do this, you can’t use the new media uploader. It appears that PageSpeed is compressing the already compressed JS files, and changing their names, which makes things go stupid. By adding in the following, I can avoid that:

    	ModPagespeedDisallow */wp-admin/*
    

    Besides, why do I need to cache admin things anyway, I ask you?

    So there you are! Welcome to PHP 5.4!

  • Google Apps Ain’t Free

    A lot of my friends tout Google Apps for email. I use them on three sites, mostly as an experiment.

    Google Apps for Business isn’t free anymore. You used to be able to go to their pricing and click on free to set everything up for a few users. No more. (Irony: Google Apps doesn’t accept Google Checkout for currency.)

    What does this mean? Well, a lot of people will have to run their own mail servers again. It’s not surprising to me, given how hard Google had been making it to find the free version, but it is a bit of a dick move to say “And now we’re charging.” I would have thought dropping the number of free email accounts from whatever it is now to five, or even one, would have been better. Actually, what would be great is if Google offered email-only domain mapping, but they don’t.

    This sucks a lot for a lot of people, including me as I was thinking it’d be nice to have all my friends who are hosted here using Google for email – they know it and are used to it.

    ETA: As Otto pointed out, the no-longer-free is only for new people. Anyone with an existing account is fine. So don’t make any more domains! (And now we see how IP addresses will last…)