On The Segregation of Code

WordPress stores your URLs in the database as the full URL. That is to say all the links to other pages on your site, all the images, use the full path.

When I changed this site from tech.ipstenu.org to halfelf.org, I knew that meant I’d need to do a quick search/replace in the database. That meant first I searched wp_2_posts for everything saying tech.ipstenu and replaced it with halfelf. Then I did the same in wp_posts (since I knew I’d done some of that). I did this directly in SQL because I can, and I believe in using the right tool for the job. To whit:

UPDATE wp_2_posts SET post_content = REPLACE (
post_content,
'tech.ipstenu',
'halfelf');

That code searched all my posts and changed the URLs so my images looked right. Now, since I’m using a domain mapping plugin, I don’t need to go in and change wp_blogs, which I would be very hesitant to do anyway. I’d sooner make a new site with the same name (either manually or via a replicator), delete the old one, and use .htaccess to redirect.

Why would I do all this? Well, because WordPress stores URLs in a non-relative way. The link to my about page is http://halfelf.org/about, not just /about, and that screams in the face of what I learned back when we all agreed that paying for Netscape as a browser wasn’t a terrible idea.

The real question people are asking is why are we this way?

There are a lot of arguments, one is that you shouldn’t change your URLs. Another is that WP is written for the user, not the developer. A third is that you’re trying to force your process on a tool, when you should really develop the process for the tool. That third argument is where I live. I believe in being adaptable, while I want all my code to be flexible, adaptable and fluid, there’s a time and place for the flexibility being the code, and the flexibility being me. In the case of URLs, I think the flexibility must be mine.

In the ‘real’ world, you should never change your URLs, and if you do you always have the old ones forwarding to the new ones. Go on, go to http://tech.ipstenu.org/about/ and where do you end up? That’s because the plugin works. Go to http://code.ipstenu.org/about/ or even http://ebooks.ipstenu.org/ and see what happens. That’s because I forward those (obvious) URLs to where they should be. I know that people can adapt to change, but I can help with them a well crafted .htaccess rule. That’s flexibility in my code.

When we take another step back and consider the flexibility of code, we muddle that up quite nicely with the flexibility of content. These are different things. Code should be flexible and adapt to any situation you put it in, but you need to be flexible to adapt your content to the situation. Content cannot be considered in the same breath as code, because they are segregated by their very nature. We should be able to pick up our content and plunk it down in any code, and have it work equally well. This argument was fought, and won, by CSS back in the day. So how does this work with URLs? They’re in my content as links, and if I change them, I have to change my content.

Let’s take the example of building a site locally. Personally, I use a hosts file to make halfelf.loc so that when I’m done building it all out, I export the DB, replace halfelf.loc with halfelf.org, and I’m done. Why do I pick halfelf.loc? If you’ve been here before, you may remember I wrote about Moving Multisite. In there I mention serialization. That’s why. Searching for domain.loc and replacing with domain.org will not break serialization! This is, some claim, a ‘bush league’ maneuver (yes, zamoose, I heard you), but have you ever tried to move an application on your desktop to a new location? You have something installed in Program Files, let’s say, and you want to move that to Program Files/Half Elf? Congratulations, you get to search the whole registry!

See, it’s not just WordPress. This isn’t an excuse, but a statement of fact. Many applications on the web don’t use relative paths, and they do force you to search/replace things to move to a new location. This is especially the case when you think about how we write links. Some people use the link interface, which if used, could insert variables for ‘base path.’ But some of us use the old fashioned manual way, and now you have to code a really fancy bit to check “Did Ipstenu post a link to myself? If so, I need to change halfelf.org/foobar into [basepath]/foobar so I can update it later if I move.” Certainly it can be done, but it’s actually going to be easier to just search/replace. The ‘more’ you search for, the easier it gets. I actually know someone who did this. He says it was more trouble that it’s worth.

That really doesn’t surprise me. What all the proposed fixes to this try to do is to force WordPress into working a certain way, rather than customizing a solution. That probably sounded the same to a lot of people, but it’s not. The method by which you move static HTML files around servers is, for lack of a better term, a migration method. When you migrate an application from sys to test to production, you do so because there are a lot of moving parts to test. It’s in these moves that we face headaches with the relative path locations. When we hard code locations into our content, like in a URL, we lock ourselves into that URL, now and forever. But does the content actually matter?

For me, the real question is ‘What moving parts of WordPress might I be testing that I would need to ‘push’ to my production site?’ I came up with three things:

  • Themes
  • Plugins
  • Content

I ‘push’ themes and plugins out to a test server and then my production server all the time. I do it for hundreds of different applications a day, and it’s all automated. We never have a problem (unless the code is bad) because we perfected our migration method, and let go of the idea of pushing our content. Our content is not our code.

Initially, we didn’t replicate content for legal reasons. We have sample data to test with when we’re checking out theme and plugin changes, but we don’t bother with data replication. Realistically we can’t, since the people who are testing things out don’t have the security clearance to look at the content. Because of that, we have to trust that the live data is the live data, and that it’s good. Instead of sweating over the content, we concentrate on copying up, in this case, themes and plugins. Since those may have customizations, we take a snapshot of the database, install the theme, make our changes, and take a snapshot after. Get the diff, and now we know what needs to be applied up. Then we script it. The job will FTP up the files, run the SQL diff (which we’ve already edited for the new domain name if needed, but rarely since we use hosts to point at our dev server instead), and off we go. Same thing for plugins.

If I apply this to WordPress, suddenly I have but one, minor, headache in moving my code, and that’s the wp_options table, and it’s serialization which includes the URL. I agree, that’s a headache and if I was going to put paid effort to fixing anything about relative URLs, that would be it.

But I don’t think that URLs for WordPress should be non-relative. Given the alternatives and the possibilities, right now the issue isn’t that WordPress stores the like that, but that we don’t have a migration ‘process’ defined yet. We should stop getting hung up on ‘fixing’ what isn’t broke, and instead start looking at the best ways to move what needs moving. See, once we found we couldn’t replicate our content, we started looking into what needed to be done to protect it, outside of ‘migrations.’ In WordPress, today, the best way is to have your writers submit their posts, and your editors review and publish. You may want to look into groupflow plugins, or do what the Bangor Daily News did.

The right tool for the write job. 1 Moving your content between testing and live servers isn’t needed. Just concentrate on moving the ‘app’ as it were. It’ll work better.

Notes:

  1. Pun intended.
StudioPress Theme of the Month
Half-Elf? Try Half OFF WordPress ebooks!