Half-Elf on Tech

Thoughts From a Professional Lesbian

Tag: domain

  • HTTPS and WordPress

    Really, there’s a right way and a not-quite-as-right way to handle HTTPS on WordPress. It’s not that hard to do, and if your whole site is going to be HTTPS, then the easiest way is to change your home and site URLs to https://example.com/ and put define( 'FORCE_SSL_ADMIN', true ); in your wp-config.php file. Then, if this is an existing site, search your database for the old HTTP URL and change it to HTTPS.
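If you would rather pin those URLs in code than in the settings screen, a minimal sketch of the wp-config.php side could look like this. It assumes the placeholder domain https://example.com from this post; WP_HOME and WP_SITEURL are standard WordPress constants that override the values stored in the database.

```php
<?php
// wp-config.php fragment (sketch): goes above the "stop editing" line.
// https://example.com is the placeholder domain used in this post.
define( 'WP_HOME', 'https://example.com' );
define( 'WP_SITEURL', 'https://example.com' );

// Force logins and the admin area over HTTPS.
define( 'FORCE_SSL_ADMIN', true );
```

Hardcoding the URLs this way greys out the matching fields in Settings, which is handy if you want to stop anyone changing them back by accident.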

    Seriously, that’s it. That tells WordPress to be HTTPS all the way and you’re done. Of course, that doesn’t actually work 100% for everyone, because there are some silly plugins and themes that do things like this:

    add_action('wp_enqueue_scripts', 'enqueue_google_maps');
    function enqueue_google_maps() {
      wp_enqueue_script('google-maps', 'http://maps.googleapis.com/maps/api/js?&sensor=false', array(), '3', true);
    }
    

    The problem there is they’ve defined the script as HTTP and if your site is HTTPS then you’re going to get mixed content messages. And the real issue here is that means your connection is only partially encrypted! That non-encrypted content is accessible to sniffers and can be modified by man-in-the-middle attackers. This, clearly, is not safe anymore. The right way to do your enqueues is with protocol relative URLs:

      wp_enqueue_script('google-maps', '//maps.googleapis.com/maps/api/js?&sensor=false', array(), '3', true);
    

    Alternately you can just use the HTTPS url, because that won’t break HTTP visits and it won’t make anything less secure.

    But. Since you really can’t go in and edit all your themes and plugins, the plugin WordPress HTTPS is the way to go. I know it hasn’t been updated in a long time, but it still works. I keep thinking I’ll fork it and clean it up… Well, in my free time. The point of that plugin is that it lets you force everything to HTTPS, and it will rewrite things on the fly. It’s a good idea.
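To give a feel for what rewriting on the fly can mean, here is a small sketch that upgrades enqueued script and stylesheet URLs as WordPress prints them. This is not how the WordPress HTTPS plugin works internally; halfelf_force_https_src is a made-up name, and the sketch assumes every third-party host you load from actually serves HTTPS. The script_loader_src and style_loader_src filters are core WordPress hooks.

```php
<?php
/**
 * Upgrade a plain-HTTP asset URL to HTTPS. Other URLs pass through untouched.
 * halfelf_force_https_src is a hypothetical name used only for this sketch.
 */
function halfelf_force_https_src( $src ) {
    // Only rewrite a leading "http://"; protocol-relative and HTTPS URLs are left alone.
    return preg_replace( '#^http://#i', 'https://', $src );
}

// Hook in only when WordPress is loaded, so the file also runs standalone.
if ( function_exists( 'add_filter' ) ) {
    add_filter( 'script_loader_src', 'halfelf_force_https_src' );
    add_filter( 'style_loader_src', 'halfelf_force_https_src' );
}
```

Dropped into a small must-use plugin, that would catch the Google Maps enqueue above without editing the theme or plugin that registered it.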

    Instead of using the plugin, I’ve seen a lot of people do this in their .htaccess:

    RewriteEngine On 
    RewriteCond %{SERVER_PORT} 80 
    RewriteRule ^(.*)$ https://example.com/$1 [R=301,L]
    

    In and of itself, this isn’t wrong. This forces everything HTTP to redirect to HTTPS. The problem is you’re still sending that first request over HTTP, and you’re right back to opening up to man-in-the-middle attacks because the data goes over HTTP first and that’s, say it with me kids, insecure!

    Now that said. This should be okay for most things. The POST calls should be sent securely, and all you should see on the return end is everything after that 301 redirect, but we can’t be absolutely sure about this. My buddy Jan used mod_substitute to force HTTPS (back before he moved to nginx). His code looks like this:

    <Location />
     AddOutputFilterByType SUBSTITUTE text/html
     Substitute "s|href=\"http://example.com/|href=\"https://example.com/|"
     Substitute "s|href='http://example.com/|href='https://example.com/|"
     Substitute "s|src='http:|src='|"
     Substitute "s|src=\"http:|src=\"|"
    </Location>
    

    In doing this, he doesn’t need to worry about the HTTPS plugin I mentioned, because it forces everything with a src attribute to be protocol relative. He also doesn’t have to search/replace his content if he doesn’t want to, which makes switching back easier. If you wanted to do that. But as Jan pointed out to me, he switched to nginx because it’s easier and supports variable substitutions.

    Should you use .htaccess or nginx to force https instead of a plugin? That’s totally up to you. I use the plugin since I trust it to only mess with WordPress and not anything else I may have lying around. Also since my domains are often more than just WordPress, it’s a little easier for me to segregate their control. The flip side to this is that WordPress doesn’t redirect http traffic.

    By this I mean if you turn your whole site to HTTPS properly, you can still go to http://example.com/this-is-a-page/ and WordPress will load it as HTTP. This is and is not a bug. WordPress is (properly) trusting your server to tell it what it should be. Your server is saying “Be HTTP or HTTPS! Whatever!” Now there is a trac ticket to have FORCE_SSL really force SSL but that’ll be a while because there are a lot of complications in that change.

    So yes, for now, I would use .htaccess to add an extra layer of SSL forcing, but with a bit of caution. If you’re proxying HTTPS (like you’re on a Varnish cache behind something like Pound or nginx) then you may need to use this code for your .htaccess redirect.

    RewriteCond %{HTTP:X-Forwarded-Proto} !https
    RewriteRule .* https://%{HTTP_HOST}%{REQUEST_URI} [R,L]
    

    The reason for this is that Apache can’t always see SSL if it’s not in charge of it (because it’s proxied or handled by a load balancer), so you have to teach it where the HTTPS really lives. The trick there is that the code I just showed you may not be right, because every server’s a little different. There’s a great StackOverflow post on the problems of redirect loops while forcing https that you should read.
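WordPress has the same blind spot as Apache here: is_ssl() looks at $_SERVER['HTTPS'], which a proxy in front of the server never sets. A common workaround, sketched below, is to flag it in wp-config.php when the proxy sends X-Forwarded-Proto. The helper name is made up for this example, and the sketch assumes your proxy sets that header itself and strips any copy sent by visitors.

```php
<?php
/**
 * Return a copy of the server array with HTTPS flagged on when the
 * proxy reports that the original request was HTTPS.
 * halfelf_flag_proxied_https is a hypothetical helper name.
 */
function halfelf_flag_proxied_https( array $server ) {
    $proto = isset( $server['HTTP_X_FORWARDED_PROTO'] ) ? $server['HTTP_X_FORWARDED_PROTO'] : '';
    if ( strtolower( $proto ) === 'https' ) {
        $server['HTTPS'] = 'on'; // is_ssl() checks this value.
    }
    return $server;
}

// In wp-config.php, before wp-settings.php is loaded:
$_SERVER = halfelf_flag_proxied_https( $_SERVER );
```

Without something like this, FORCE_SSL_ADMIN behind a proxy is a classic source of the redirect loops that StackOverflow post talks about.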

    Good luck, and safe HTTPSing!

  • Mailbag: I Lost My Site

    Don from Worcestershire started an email in a way that normally would result in a quick delete:

    Before you cut me off I am a 78 year old gambling historian, specializing in horse racing between 1850 and 2015.the age alone will convince you that my knowledge of this topic is limited.

    It’s not the age, Don, it’s that the email starts with a long, dramatic, kind of non-essential story about your life, your ideas, your dreams, your goals, and your son. By the way, I’m sorry for your loss.

    Yes, I read the whole email.

    It took a long time to sift through the drama to sort out the issues.

    1. Spam – He had a lot of spam and used Akismet (yay!) to deal with it.
    2. Bandwidth – The spam made his site hit the overages on his hosting.
    3. Domain suspension – Due to circumstances, he was late in paying his fees and the domain was sniped.

    I feel bad. I really do. But here’s the thing, Don. Your domain is like your rent. If you don’t pay for it, you lose your home. And while this sucks a lot, and yes there were mitigating circumstances, you didn’t pay in time, and the company is legally within their rights to sell. It’s the same as your phone. No pay, no phone. It’s just that simple.

    Now I have good news.

    According to whois records, you still own your domain! So you actually didn’t lose the domain, you lost the hosting plan.

    These are different things. They’re very easy to get confused.

    There isn’t a great analogy to all this, I’m afraid, but as it works, you’re paying for two things.

    First you pay for the domain. This reserves your ‘name’ on the internet. I recommend paying for it first for a year and then, if you like how things are going, pay for as long as you possibly can. I did mine for a decade at one point in time. I knew I wanted the domains and I knew I was using them.

    Once you have the domain, you need to pay for webhosting. The host is where your data is stored.

    Now I need to take a digression.

    BACK UP YOUR WEBSITE OFFLINE.

    If the backup tool you use only lets you back up to your webserver, it’s a shitty backup tool. Stop using it unless you are automatically downloading that backup somewhere else.

    Because you see, Don, what happened was that you didn’t pay for your webhosting. You didn’t pay for the storage unit that housed your data. And they can auction that off like they do on shitty shows like Storage Wars.

    So what do you do when this happens?

    If it’s just the webhost, it’s easy. Contact the webhost. http://www.whoishostingthis.com/ is a great resource to find out who your host is. Be honest but keep it short. “I’m sorry, I wasn’t able to pay on time and my site was suspended. Is there any way I can get it back?” That’s it! That’s all you have to do. If you’re lucky, they may still have all your data. You pay, they flip a switch, it’s back.

    The worst case … that’s why you need backups.

    If it’s the domain registration, though, that can be a mess. If you bought the domain through your webhost, it may surprise you to find out that the host doesn’t have control over your domain registration. The host is an intermediary. That means, if you go to your site and see a placeholder page owned by a domain registrar, it may or may not be a cybersquatter.

    You may have heard about domain hijacking or domain theft. That’s when someone changes the registration of your domain name without your permission. A hijacking is not the same as when you’ve failed to pay for your domain and the registrar slaps up a placeholder. A great many hosts put up a branded placeholder if you’ve registered a domain and not yet updated content. Sometimes it says “This domain has been registered at…”

    If it says “This domain is suspended” then the issue is with the webhost. If it says “This domain has expired” then it’s likely to be the registrar. You need to figure out who the registrar is, log in with your info, pay the fine, and get the site back.

    I strongly urge you to put a reminder in your to-do list or whatever you use to keep track of things. “Domain name renewal due on day X.” It’s like paying your rent. Don’t forget. Make reminders. Do it.

    By the way, no matter whom you talk to, don’t give them the sob story. While they do care, in as much as any human does, it rarely changes the reality of what’s going on. Shit happened, you couldn’t pay. Your personal drama is not their problem. I know how harsh that sounds, but it’s not. And the more you make it how you need an exception because you’re a special case, the more people hear it as an excuse.

    I know it’s not. You know it’s not. Except sometimes, Don, for a lot of people, it is. If I told you how many idiots complain they couldn’t pay for $4/month hosting, while still buying a top of the line iPhone, you’d understand why it’s draining.

    And as someone who’s fucked up before, I find that being honest where it’s my fault gets better results. “I’m sorry. I screwed up and didn’t pay. Is there anything I can do to get my content back? It matters a lot to me, and I’d appreciate anything you can do to help me.”

    Works great.

    By the way, Don, I see that you have your site back right now. You should upgrade. You’re running WordPress 2.6.1 and that’s really old and vulnerable.

  • Mapped Domains And Google Search

    The other day I was surprised to learn that Google still looks for tech.ipstenu.org

    Kind of.

    If you go search for it, Google still believes that URL is a real thing: https://www.google.nl/search?q=site:tech.ipstenu.org

    Some of those URLs were made long after I mapped the domain, by the way. And yes, of course I have a 301 redirect for the subdomain.

    <If "%{HTTP_HOST} == 'code.ipstenu.org' || %{HTTP_HOST} == 'tech.ipstenu.org' ">
        RedirectMatch 301 (.*) https://halfelf.org$1
    </If>
    

    What’s going on here? Strictly speaking, Google’s right and stupid. The URLs are correct, but Google should be honoring the 301 redirect. Because it’s not, you have to tell it not to trawl your subdomains and use a robots.txt file, just for your mapped subdomains.

    First we’ll need to make a special robots.txt file, like robots-mapped.txt, and put the following in it:

    User-agent: *
    Disallow: /
    
    User-agent: Googlebot
    Noindex: /
    

    This tells Google to sod off. Then you need to specify when to use this special file, and that brings us to the lands of options. Since .htaccess is a top-down file, that is it reads from the top of the file down, you can get away with this:

    RewriteCond %{HTTP_HOST} ^(code|tech)\.ipstenu\.org$ [NC]
    RewriteRule ^robots\.txt$ /robots-mapped.txt [L]
    

    Just have that above any redirect rules for other things. But what if, like me, you’ve got Apache 2.4?

    <If "%{HTTP_HOST} == 'code.ipstenu.org' || %{HTTP_HOST} == 'tech.ipstenu.org' ">
        RedirectMatch 301 ^/robots\.txt /robots-mapped.txt
        RedirectMatch 301 (.*) https://halfelf.org$1
    </If>
    

    Of course, that sends tech.ipstenu.org/robots.txt to https://halfelf.org/robots-mapped.txt which is scary but still works, so don’t panic.

    Another way to do it would be to have a massive rewrite for all my subdomains:

    # All Mapped
    <If "%{HTTP_HOST} == 'code.ipstenu.org' || %{HTTP_HOST} == 'tech.ipstenu.org' || %{HTTP_HOST} == 'photos.ipstenu.org' ">
        RedirectMatch 301 ^/robots\.txt /robots-mapped.txt
    </If>
    

    I will note, it should be possible to have (code|tech).example.com work in there, instead of all those OR statements, but I’ve yet to sort that out (corrections welcome in the comments!).

    The last step is to fight with Google Webmaster Tools. Add your subdomains and you should get this on the robots.txt checker:

    [Image: Example of robots.txt for tech.ipstenu.org in Google Webmaster]

    If you don’t, don’t panic. Go to the Fetch as Google page and tell it to fetch robots.txt. That will force it to recache. Once you have it right, ask Google to remove the URL from their index, and in a few days it’ll sort out.

    It’s very annoying and I don’t know why the 301 isn’t honored there, but oh well. At least I can make it work.

  • Mapping Domains Without a Plugin

    Recently I wrote WordPress Murder Mystery. The day I released it, I got on a plane to fly to Miami, and proceeded to have a pretty awful travel day, thanks to a storm that pretty much knocked travel to the SouthEast out of commission. While I waited for my flight to DFW, I got pinged. “Hey, I can’t add anything from your store to my cart!”

    I pulled up my laptop and thought about what I’d changed. Oh. I’d turned off domain mapping. But not really.

    Ghost Cookies

    You see, I love the WordPress Domain Mapping plugin. But I’m not using it here. No, I’m actually doing something I expressly and patently tell people not to do … because times have changed with WordPress 3.9 and it’s almost okay to do things this way. The change here is that WordPress is smarter now, and it’s safer, and you actually can just edit the home and site URLs in the Network Dashboard. But you know how I said ‘almost’ back there? Your ability to shoot yourself in the foot is directly proportional to how smart you think you are.

    What I did? It’s basic but assumes you already know how to add a domain onto another.

    1. Go to Network Admin -> Sites
    2. Edit the site in question
    3. Change the URL to the mapped domain, check the ‘change home and site URL’ box, and click update

    That was it! Three steps and everything still worked! This also let me force change a site to https all the way, and since I didn’t have content, I didn’t bother with a search/replace. If I had, I’d use that Interconnectit Script or WP-CLI for it. Still, like a wise person, I always get the domain ‘right’ before I add content.

    And at this point, everything seemed to work just fine! And so I left it as is and published my book. And as you know, it all failed, spectacularly. I rolled everything back (because I knew what I’d changed last, see? always remember that!) and it worked again, so I knew I had to have missed something big here. Since I was stuck in an airport with choppy wifi, I disconnected everything and fired up my localhost version of my site. Banging on that for a while, I saw I missed a small, but hugely important factor in the plugin.

    See most of what it does is that pretty interface to say “Pull domain.foo content from site (aka foo.example.com)” and force redirects. It also lets you do cool things like “Allow users to log in from foo.example.com instead of domain.foo” but really I didn’t need any of that. The meat of the code is in the sunrise.php file, which when I studied, I realized was just doing the redirects “Send Site to domain.foo” and for me, by renaming the home and siteURLs, I was already doing that.

    So what had I missed?

    define( 'COOKIE_DOMAIN', $_SERVER[ 'HTTP_HOST' ] );

    That was it. I forgot to tell it “Cookies belong to the domain you’re on.” What this means is that if you log in at example.com, the cookie you get is for example.com and not domain.foo! For the most part, this isn’t a problem since no one logs in but me … until you try to make a purchase and it validates a cookie which doesn’t match the domain. I added that to my wp-config.php (down at the end of my Multisite section, where the SUNRISE define had been earlier) and everything magically worked.
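One wrinkle with using HTTP_HOST raw is that it can arrive with mixed case or a port number tacked on, and neither belongs in a cookie domain. Here is a slightly more defensive sketch of the same define; halfelf_cookie_domain is a made-up helper name, and the port stripping is my own addition rather than anything the Domain Mapping plugin does.

```php
<?php
/**
 * Turn a Host header into a value suitable for COOKIE_DOMAIN:
 * lowercase, with any ":port" suffix removed.
 * halfelf_cookie_domain is a hypothetical helper name.
 */
function halfelf_cookie_domain( $host ) {
    return strtolower( preg_replace( '/:\d+$/', '', $host ) );
}

// wp-config.php usage, in the Multisite section:
if ( ! defined( 'COOKIE_DOMAIN' ) && isset( $_SERVER['HTTP_HOST'] ) ) {
    define( 'COOKIE_DOMAIN', halfelf_cookie_domain( $_SERVER['HTTP_HOST'] ) );
}
```

The isset() check keeps WP-CLI and cron runs, where there is no Host header at all, from throwing notices.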

    Two lessons! First, test everything. Second, you can map domains without a plugin, safely.

    I will note that, over time, it’s possible those settings for home and site URL may vanish from display. They’re powerful and dangerous settings, and you should not mess with them without a good backup.

  • How To Duplicate Content

    I’ve talked about this before. 100% duplication of content on multiple sites is bad. So why am I going to tell you how to do it? And better, why am I going to tell you how to do it without Multisite? Because as a proof-of-concept, it was interesting.

    The rest of this post tells you how to do something I don’t advocate, nor will I support. If you have a better way, or improvements, please leave comments. Otherwise, you’re on your own when you do this. I will not help you do it in any way, shape, or form.

    Honestly, I still think this is a pretty silly idea. Duplicating content is a terrible user experience, and I still flat-out decline to accept any work for doing this. Sharing content is one thing, but 100% duplication of sites makes no sense at all to me. Yoast also says it’s a bad idea. But, if you really are totally 100% dedicated to do this, and you absolutely are going to, damn the torpedoes, then you should do this in the least computationally expensive way possible. And that would be a single install.

    Now all that said, this means you’ll need to do a lot of monkey-work, so why do I call this ‘easier’? In many ways, easy is relative and this will be hard, complicated, and may I stress, entirely unnecessary. You’re going the hard way around for something that good planning and a solid understanding of the Internet totally negates. Remember the absolute rule of the Internet: Use one URL per page and never change that URL. (With all rules, there are exceptions, of course.)

    The way to make all this work, without Multisite, is by tricking your domain a little. There’s a neat trick with parking (or mirroring, depends on your host) domains, that lets you keep the other domain URL in your browser’s address bar. That’s what I do with this site, actually, halfelf.org is parked on top of ipstenu.org. And with a park, the URL always stays as halfelf.org. Hey look! Two URLs, one site! Multisite has secret sauce to know “Someone’s coming to HalfElf, send ’em to site .” But on a single site, all my links would still be ipstenu.org and not halfelf.org.

    Now how do you use this to duplicate content? You use relative URLs.

    So here’s a real example. I have twofer.elftest.net set up to mirror plugins.elftest.net (which will give you a coming soon page, it’s just where I like to blow things up for tests).

    In the beginning of this post, when I linked to my old post about Duplication Dilution, the URL was https://halfelf.org/2012/duplication-dillution/ and that is what we call an absolute URL. Because I’m mapping domains, I can leave those in without worry, but if I wasn’t, I’d change that URL to /2012/duplication-dillution/ instead. Right away this makes my URLs entirely relative, no domain name included, and I’m off to the races.

    This doesn’t solve everything, though. See, WordPress really wants to use absolute URLs. There are plugins like root relative URLs, and those will help a lot. None of them back-ports your existing posts, though, so there’s nothing for it but to search/replace the DB and change your post content. (ONLY change your post content. Do not change GUIDs!) I really like those plugins because now for a new post, when I add a link and choose to link to existing content, it happily works:

    [Image: addinglink]

    And when I add an image, it too smartly handles as I want it to:

    [Image: addingimages]
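For the old posts, the search/replace boils down to stripping the scheme and host off links that point at your own site. Here is a rough sketch of that transformation on a single chunk of post content. The function name is made up, it only touches href and src attributes, and it deliberately leaves bare links to the domain itself (no path) alone. It works on content strings only, so GUIDs never come into it.

```php
<?php
/**
 * Rewrite absolute links to $host inside href/src attributes as
 * root-relative ones. halfelf_root_relative is a hypothetical name.
 */
function halfelf_root_relative( $html, $host ) {
    // Match href="http(s)://host only when a path follows (the lookahead),
    // so look-alike hosts such as host.evil.example are not touched.
    $pattern = '#\b(href|src)=(["\'])https?://' . preg_quote( $host, '#' ) . '(?=/)#i';
    return preg_replace( $pattern, '$1=$2', $html );
}

// Example: prints <a href="/2012/duplication-dillution/">old post</a>
echo halfelf_root_relative(
    '<a href="https://halfelf.org/2012/duplication-dillution/">old post</a>',
    'halfelf.org'
), "\n";
```

Run something like this against a copy of your database first. Regex and HTML are uneasy friends, and a backup is cheaper than an apology.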

    That’s the easy part of all this, though. Now you have to disable canonical URLs, so that you don’t end up with even more of the dread duplicate content penalty in WordPress.

    remove_action('wp_head', 'rel_canonical');
    

    This also stops WP from redirecting things like http://plugins.elftest.net/?p=1, however, so keep that in mind. Of course, that’s what you wanted. But it doesn’t address the problem of your source code. When I viewed source on twofer.elftest.net, it still showed plugins.elftest.net, and that would be a problem for images and themes. You’ll need to toss this into your wp-config.php, which will dynamically change your URL to be whatever URL you’re visiting from. Awesome.

    define('WP_HOME', 'http://' . $_SERVER['HTTP_HOST']);
    define('WP_SITEURL', 'http://' . $_SERVER['HTTP_HOST']);
    define('DOMAIN_CURRENT_SITE', $_SERVER['HTTP_HOST']);
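
A caution on those defines: HTTP_HOST is whatever the visitor’s browser sends, and the scheme is hardcoded to http. Here is a sketch of a more careful version that only accepts hosts from a list and picks the scheme from the request. The function name and the fallback behaviour are my own assumptions for this example; the two hostnames are the test domains from this post.

```php
<?php
/**
 * Build the site URL from the request, accepting only known hosts.
 * halfelf_site_url is a hypothetical helper name.
 */
function halfelf_site_url( array $server, array $allowed_hosts, $fallback_host ) {
    $host = isset( $server['HTTP_HOST'] ) ? strtolower( $server['HTTP_HOST'] ) : '';
    if ( ! in_array( $host, $allowed_hosts, true ) ) {
        $host = $fallback_host; // Unknown or missing Host header.
    }
    // Treat any non-empty HTTPS value other than "off" as a secure request.
    $https = isset( $server['HTTPS'] ) && $server['HTTPS'] !== '' && strtolower( $server['HTTPS'] ) !== 'off';
    return ( $https ? 'https://' : 'http://' ) . $host;
}

// wp-config.php usage:
$halfelf_url = halfelf_site_url(
    $_SERVER,
    array( 'plugins.elftest.net', 'twofer.elftest.net' ),
    'plugins.elftest.net'
);
if ( ! defined( 'WP_HOME' ) ) {
    define( 'WP_HOME', $halfelf_url );
    define( 'WP_SITEURL', $halfelf_url );
}
```

Falling back to the real site for anything not on the list means a forged Host header can’t get WordPress to print links to someone else’s domain.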
    

    Now I want to tell WordPress that wp-content is not in URL/wp-content/ so let’s just put this in and make it relative too!

    define('WP_CONTENT_URL', '/wp-content');
    

    I’m still going to have to search and replace my old post content (I used Velvet Blues for this):

    [Image: Velvet Blue Update]

    But that didn’t address the problem of the source code. 90% of WP now thinks it’s all on twofer, which is what I wanted, but look at XMLRPC:

    [Image: sourcecode]

    And even better, when I try to log in via twofer, it still says I’m going to plugins. Oh, and it doesn’t pass through cookies, so really, I never log in to Twofer. Realistically? This isn’t a problem. I’m always going to use plugins.elftest for all this when I log in on the backend, since the convenience of all this was meant for the front end, and it’s just pingbacks. And why is that? Honestly, I don’t know. I have a guess that since, at WordPress’ heart, the site is always plugins, the absolute URL there has to be what it is, but insofar as all that goes, I think it meets the needs of why most people want to do this.

    Conclusions? You can do this. If you wanted to, you can hardcode the theme so the domain you visit the site with will dynamically change the header image, or widgets, or anything else you want. PHP is pretty cool that way and WordPress is too. But I would never do it, except as an experiment to see if I could do it at all.

  • Subdomain vs Domain

    When two words are very similar, it’s easy to get confused. Which witch is which? Whether, weather, and wether. Affect vs effect… Okay, you know, English sucks. We have way too many words that will drive you to drink, and if you know anyone who’s learned English as a second language, please take time to tell them how amazing they are. My father’s wife is Japanese, bilingual in French, and is learning English. I know a smattering of French. Our conversations are fantastically amusing and thankfully we have great senses of humor.

    Because I’m that familiar with the crazy of my native language, I have no surprise that people get subdomains and domains confused. Here’s the basic statement:

    A subdomain is not the same as a domain.

    That’s it. But since I don’t expect everyone to know what the heck I just said, I’m going to explain. Remember, don’t think you’re stupid for not knowing this! You can’t magically know everything, you have to learn it, and there are people like me who want to help you. Where the confusion kicks in isn’t that we call it a ‘subdomain’ but that the official definition is “a subdomain is a domain that is part of a larger domain.” So we’ve just said a domain of a domain, and yet here I am pushing you and saying that a subdomain isn’t a domain when it clearly is.

    It is and it isn’t.

    • A domain is pretty simple: elftest.net is a domain. It’s the solid basis that all websites are built on.
    • A subdomain is a subset of the domain: tools.elftest.net is a subdomain on elftest.net.

    Notice how ‘tools.’ is in front of elftest.net? That extra period between tools and elftest is how we know this is a subdomain. The .net part is called the ‘Top Level Domain’ and any time you see www, that actually isn’t a subdomain, but a special term… You know what, let’s break this down with a picture.

    [Image: Domain Example]

    You can ignore protocol for now (we can get into that another time). What we’re looking at is this:

    URL: http://www.example.com/index.html
    Top-level domain name: com
    Second-level domain name: example.com
    Host name: www.example.com OR example.com
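If it helps to see that breakdown as code, here is a deliberately naive sketch that splits a URL’s host name the same way. The function name is made up, and it assumes the top-level domain is a single label like com or net. It will get example.co.uk wrong, and doing this properly means consulting the Public Suffix List.

```php
<?php
/**
 * Naively split a URL's host into TLD, domain and subdomain.
 * Assumes a single-label TLD; halfelf_split_host is a hypothetical name.
 */
function halfelf_split_host( $url ) {
    $host   = strtolower( (string) parse_url( $url, PHP_URL_HOST ) );
    $labels = explode( '.', $host );
    $tld    = array_pop( $labels ); // Rightmost label, e.g. "com".
    $second = array_pop( $labels ); // Next label left, e.g. "example".
    return array(
        'host'      => $host,
        'tld'       => $tld,
        'domain'    => ( null === $second ) ? $tld : $second . '.' . $tld,
        'subdomain' => implode( '.', $labels ), // Empty string when there is none.
    );
}

print_r( halfelf_split_host( 'http://www.example.com/index.html' ) );
```

Reading right to left is the whole trick: pop the TLD, pop the name in front of it, and whatever is left over is the subdomain.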

    Why is www special? It has to do with a lot of boring history, but suffice to say that used to be how we knew it was a webpage! Now we use http:// to say ‘This will be a webpage’ so many of us (myself included) feel that www is unnecessary and just makes URLs longer. However because of history, http://www.elftest.net and http://elftest.net will forever point you to the same place. This actually means that www is a subdomain, but it’s a very special one that points to the same place as no www at all. In very rare cases, a fancy website will redirect www and non-www to different places, but this is the exception, not the rule. Good SEO practices are to have the www and non-www point to the exact same place.

    The meat of the matter is that most of the time, when someone asks ‘What’s your domain?’ they really mean ‘What’s your host name?’ My host name is elftest.net (or ipstenu.org or halfelf.org…. I have a lot of domains).

    Now let’s look at a subdomain.

    [Image: Subdomain Example]

    They look shockingly similar, except that instead of www in front, I have sub. So what’s the deal here? Well because I’m using something other than www, I’ve designated sub.example.com as a subset of example.com, and thus a subdomain. Yes, it’s backwards. Sub should be below or behind, but remember, we’re calling .com the top level domain, so right-to-left this makes more sense.

    I know. It’s all clear as mud. Even writing this I sat there and muttered “This stuff is nuts.” I know all this didn’t explain everything as clearly as I could wish, but I’ll break it down into the simplest terms that, while not 100% technically accurate, will tell every decent web tech what you mean:

    When someone asks “What’s the subdomain?” you answer ‘sub.example.com’

    If someone asks “What’s the domain?” you say ‘example.com’ (sometimes they’ll ask “What’s the main domain?”)

    If you’re on Multisite and someone asks “What’s the mapped domain, and what subdomain does it point to?” you say ‘mappeddomain.com and it points to mapped.example.com’

    And never ever use domain mapping plugins for your subdomains. Those are for grownup domains only, not your subdomains.

    For extra credit: Third level domains are what you get when you see things like example.co.uk – example isn’t a subdomain here, it’s the main domain. co.uk is the TLD. Why third? Well, we’d already used sub and second, and we needed some way to say that this is part of the primary URL, and not a subset. Also geeks love to confuse people.