Half-Elf on Tech

Thoughts From a Professional Lesbian

Category: How It Works

  • How the WordPress Upgrade Works

    Edited: This post was linked to by the folks at WTC, so I’m getting a lot more traffic! If you asked a question, I will try to answer it promptly, but if you need serious help fixing a problem, please consider posting in the WordPress forums for help.

    I was 90% sure about this before I started writing the post, and Andrew Nacin was nice enough to tweet me the exact file I needed to look at, so when I got home to look, I was ready to go!

    There are two kinds of automated upgrades for WordPress. The main ‘core’ upgrader and then the ‘child’ upgrades used for themes and plugins. They behave differently.

    Most of the time, we all see the plugin and theme installer, where it downloads the plugin to /wp-content/upgrade/, extracts the plugin, deletes the old one, and copies up the new one. Since this is used more often than the core updater (most of the time), it’s the sort of upgrade we’re all used to and familiar with. And we think ‘WordPress deletes before upgrading, sure.’ This makes sense. After all, you want to make sure to clean out the old files, especially the ones that aren’t used anymore.

    This is not how a core update works.

    WordPress core updates, the ones to take you from 3.0.3 to 3.0.4, do not run a blanket delete. They don’t even run a variable delete. They don’t even run a wild-card delete on files in wp-admin (which they could). Instead they have a manually created list of files to delete, files that have been deprecated, and delete only those files. Here’s a snippet of what it deletes:

    $_old_files = array(
    'wp-admin/bookmarklet.php',
    'wp-admin/css/upload.css',
    [...]
    // MU
    'wp-admin/wpmu-admin.php',
    'wp-admin/wpmu-blogs.php',
    [...]
    
    // 3.1
    'wp-includes/js/tinymce/blank.htm',
    'wp-includes/js/tinymce/plugins/safari',
    [...]
    'wp-admin/images/visit-site-button-grad.gif'
    );
    

    As you can see, the files and folders are all very carefully specified. They are not ad hoc; they are all determined based on what WordPress has removed. If you read the whole thing, the part that would impress you the most is, perhaps, that the folder wp-content, where your themes and plugins are installed, is nowhere on that list. Don’t believe me? Go look at wp-admin/includes/update-core.php and search for it. It’s not there!
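    If it helps to see the idea outside of PHP, here is a minimal Python sketch of ‘delete only what is on the list.’ It is not WordPress’s actual code: the function name remove_listed and the shortened list are made up for illustration, and the real list lives in wp-admin/includes/update-core.php.

```python
import shutil
from pathlib import Path

# Shortened, illustrative stand-in for WordPress's $_old_files list.
OLD_FILES = [
    "wp-admin/bookmarklet.php",
    "wp-admin/css/upload.css",
    "wp-includes/js/tinymce/plugins/safari",
]


def remove_listed(root, old_files):
    """Delete only the listed paths under root; return the ones removed."""
    removed = []
    for relative in old_files:
        target = Path(root) / relative
        if target.is_dir():
            shutil.rmtree(target)      # a listed folder goes away whole
        elif target.exists():
            target.unlink()            # a listed file is deleted
        else:
            continue                   # already gone: nothing to do
        removed.append(relative)
    return removed
```

    Anything that is not named in the list, wp-content included, is never touched by a function like this, which is the whole point.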

    Once the old files are listed (remember, we have not deleted anything yet), the upgrader runs through 9 steps.

    1. Download the zip file of the new release, unzip it and delete the zip
    2. Make sure the file unzipped!
    3. Make a .maintenance file in WordPress base (this makes your blog ‘down for maintenance’ so no one can do anything and screw you up mid-stream)
    4. Copy over the new files. This is a straight copy/replace. Not delete.
    5. Upgrade the database. This may or may not happen.
    6. Delete the unzipped file
    7. Delete the .maintenance file
    8. Remove the OLD files. This is where it goes through the list of deprecated and unused files and deletes them.
    9. Turn off the flag that tells you to upgrade every time you’re in wp-admin
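    Steps 3, 4 and 7 can be sketched in a few lines of Python. This is only a rough model of the order of operations, not the real upgrader: the function name copy_over_release is invented, the download, database and cleanup steps are left out, and it assumes Python 3.8 or newer for dirs_exist_ok.

```python
import shutil
from pathlib import Path


def copy_over_release(site, new_release):
    """Flag the site as down, copy the release over it, then remove the flag."""
    site, new_release = Path(site), Path(new_release)
    maintenance = site / ".maintenance"
    maintenance.write_text("upgrading")      # step 3: site is 'down for maintenance'
    try:
        # Step 4: straight copy/replace. Files in the release overwrite files
        # with the same path; nothing already on the site is deleted.
        shutil.copytree(new_release, site, dirs_exist_ok=True)
    finally:
        maintenance.unlink()                 # step 7: site comes back up
```

    The thing to notice is what a copy/replace does and does not do: a file that ships in the release replaces whatever was at that path, and a file the release does not ship is left exactly as it was.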

    Nowhere in there is wp-content and your themes mentioned.

    Well, except in this one weird way.

    See, in order for WordPress to work out of the box, a theme must be included, right? WordPress has the default theme of Twenty Ten. Now, I’ve mentioned before that you should never edit your themes directly, and instead make child themes. This is why. When the WordPress core files update, they copy over everything. Included in everything are two plugins (Akismet, Hello Dolly) and one theme (Twenty Ten).

    Normally this isn’t a problem. Sure, someone always edits those files directly and lives to regret it, but they do live. This last cycle, with 3.0.4, Akismet was accidentally rolled back from 2.5.1 to 2.4. Again, normally? Not a problem. We just upgrade our Akismet installs, remark on the silliness and annoyance, and move on. The problem for one user is she had another plugin or theme edit that hooked into the new Akismet. With the way WordPress updates core, it only deletes the files it knows to delete (and Akismet isn’t any of them) and it copies over the ‘new’ files. Or in this case, the old ones. Which broke her site. Her fix? Just re-update Akismet manually. Not a big deal and an easy fix. (Personally I think NOT updating core, even a security upgrade, with the latest Akismet is a poor choice, but the rationale is that they want to keep it small and easy to maintain. So okay, I get why, I just don’t like it. I will support that as their choice, and continue to do my manual upgrades, which include deleting Akismet, Hello Dolly and Twenty Ten before I upgrade.)

    So then, why doesn’t this work 100% of the time? (Yes, I mentioned this before in the 3.0.1 days, but apparently it bears repeating. Read Why doesn’t the WordPress Auto-Upgrade Work? for more thoughts on the subject.) Well, to start with, that’s impossible. Nothing works 100% of the time. Me submitting this post won’t work 100% of the time. You safely walking up the stairs won’t work 100% of the time. This is just how the world works. There are, simply, far too many variables out there to allow this. Even though I make part of my living ensuring that I have an automated process run in a repeatable fashion, I tell people that I only ask for a 75-80% success ratio on the process before I’ll agree to automate it. Why? Because that’s actually phenomenal.

    If you take into consideration all the moving parts, variables, and possibilities that go into any one individual WordPress install, it’s sort of impressive this stuff works at all. In baseball, if you hit a ball and get on base 33% of the time, you’re considered a fantastic hitter. If you do it 40% of the time, you’re pretty much promised a slot in Cooperstown. Whenever I try and sort out if something is worth the risk, I quote my father: “What can go wrong? How likely is it? What are the consequences?” It boils down to an understanding of risk analysis, and what that actually means. The problem is that most of the people using WordPress are users. They’re not generally expected to think about risk. Not that many don’t, but just that Joe Blogger doesn’t tend to look at an upgrade as a ‘risk.’ After all, WordPress tested this, so it should be easy.

    Most of the time, it is. And when it’s not, it was an acceptable risk. All those horrible and terrifying outcomes you read about (from a very vocal minority) are gut-wrenching when they happen to you, don’t get me wrong, but at the end of the day, you have to ask ‘If I use the default WordPress theme and no plugins, do they happen?’ If the answer is no, then WordPress has done all the testing that is required. The onus is not on WordPress to test every upgrade with every theme and plugin. That responsibility is firmly on the shoulders of the person using the themes and plugins. A good developer tests their plugins and themes the moment a release candidate comes out for a new version of WordPress, and sorts out how best to support both the current users and the future ones.

    It’s what we call ‘acceptable’ risks. You take them all the time. You took an acceptable risk brushing your teeth. You take one every time you walk out your door. You know the risks and you accept them. So when you get to arguing that ‘WordPress upgrades always break’ or ‘I hate upgrading because it means my themes/plugins won’t work’ and using these as reasons to show that WordPress is bad software, then I think you’re missing the point. The more you customize, the more things break. This is something I mentioned in When To Code, and it bears repeating. The more you customize, the more things will break when you upgrade.

    But this is an acceptable risk for most of us!

    So when the WordPress auto-upgrade breaks, and I promise you, it will for you at least once in your experience, you have to learn to accept this. It’s not going to work on every server, it’s not going to work on every host. Your server is being constantly upgraded and tweaked. Security patches for PHP, FTP and everything else are applied, most of the time automatically so you don’t have to waste time thinking about it. And when you combine all those things, yeah, it’s going to break.

    This isn’t meant to scare you off of upgrading, but an attempt to raise your awareness of what’s going on, so when things break (and they will), you have a better understanding of why, and what to do. If the automated upgrade of WordPress breaks, upgrade manually. If every single automated install (upgrades, plugins, themes) always breaks, then start to diagnose your server. But if it’s a one off, just do it manually. It is, literally, copying files up to your server. If that’s too hard for you, you may not want to run your own, self-managed, WordPress install.

    As a blanket reminder, in order to prepare yourself for an upgrade, always make backups of your database and your files. A good backup. Never edit core files (even themes) and always remember that the computer is out to get you.

  • The Latest Malware Malfeasance

    I preface this with I really don’t have time to de-malware everyone’s site who emailed me, so please don’t ask for help right now, I’m not a freelancer for a reason and I’m booked till … Uh, August at this rate. So, no. I’m not going to be able to help you. I am going to post HOW to fix it, but if you need serious help after that, at the bottom are links of people to help you.

    If you find this helpful, great! There’s a donate link to the right on my site, but personally I feel it’s more important people get the right information!

    So you logged into your site and the admin side looked something like this:

    The odds are that you’ve been hacked by the latest malware. Malware is short for “malicious software” and basically it’s someone screwing with you. Why? Because they can. I’m not going to get into why, it doesn’t matter. What matters are two things:

    1. How can I fix it?
    2. How can I stop it from happening again?

    Before we go any further, though, go run the Sucuri Scan. That will tell you if you’ve really been hacked, or if it’s something else. For the rest of this post, I’m assuming you’ve been hacked.

    How can I fix it?
    Make a fresh backup of everything on your site. Download it all. Yes, it’s probably got the virus in it, but that’s okay. It won’t hurt your desktop. Also back up your database to your desktop computer. The hack doesn’t seem to have affected your database, but you should always make a good backup before you try this stuff. Make note of your theme name (and where you got it from), as well as all your plugins. You’ll need this in a moment.

    Put a copy of the following files/folders in a safe place, separate from the rest of your backup:

    /public_html/.htaccess
    /public_html/wp-config.php
    /public_html/wp-content/uploads (and ALL files and folders under this)
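
    If you are doing this from a local copy of your backup rather than by hand, setting those three items aside can be sketched like this. The function name set_aside and the folder names you pass in are placeholders I made up; it assumes Python 3.8 or newer.

```python
import shutil
from pathlib import Path

# The three items to keep, relative to the web root (public_html).
KEEP = (".htaccess", "wp-config.php", "wp-content/uploads")


def set_aside(web_root, safe_dir, keep=KEEP):
    """Copy the keepers into safe_dir, preserving their relative paths."""
    web_root, safe_dir = Path(web_root), Path(safe_dir)
    copied = []
    for relative in keep:
        source = web_root / relative
        if not source.exists():
            continue                                   # e.g. a site with no .htaccess
        destination = safe_dir / relative
        destination.parent.mkdir(parents=True, exist_ok=True)
        if source.is_dir():
            shutil.copytree(source, destination, dirs_exist_ok=True)
        else:
            shutil.copy2(source, destination)
        copied.append(relative)
    return copied
```

    The returned list tells you which of the three were actually found, so a missing wp-config.php is noticed before you delete anything.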

    Now, delete everything from public_html on your server. Yeah, everything. This is why I said make a backup, folks!

    Once the server is naked, change your passwords for FTP/SSH. If you’re using a non-Secure method of accessing your server, stop and get something like WinSCP or CyberDuck or anything that allows SECURE FTP access. SFTP should be the ONLY way you FTP to your site.

    Download, from WordPress.org, a new copy of the latest and greatest core WordPress files (at this posting, it’s 2.9.2, but 3.0 is in beta, so that may change shortly). Install from that, NOT from your site’s automated installer. You should be able to copy all the files up and then add those files I told you to put aside. Remember them? The .htaccess, the wp-config.php and the uploads folders all go back up.

    Under no circumstance should you upload anything else from your backup at this time! Also don’t bother visiting your site, it’ll look weird.

    Once your files are back, go to http://wordpress.org/extend/plugins/ and download all your plugins. One at a time.

    Repeat with your themes, going to http://wordpress.org/extend/themes/ or wherever you got your theme from in the first place.

    If you made your own theme, it’s a little harder, since you’ll need to go over every single PHP file in your theme and look for ‘weird’ code. Sucuri has a cleanup script, but the manual approach is to open every file and look for encoded information like the samples in this post from Sucuri. If you see that in a file, kill it with fire.
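
    A first pass over a custom theme can be automated. The sketch below is an assumption on my part about what ‘weird’ usually looks like (eval wrapped around a decoder function, or a very long base64-looking blob); it is not Sucuri’s script, and the function name suspicious_php_files is made up. A hit means ‘open this file and read it,’ not ‘this file is infected,’ and no hits does not prove the theme is clean.

```python
import re
from pathlib import Path

# Assumed red flags: eval() around a decoder, or a 200+ character base64-like run.
SUSPICIOUS = re.compile(
    r"eval\s*\(\s*(base64_decode|gzinflate|str_rot13)\s*\("
    r"|[A-Za-z0-9+/=]{200,}"
)


def suspicious_php_files(theme_dir):
    """Return sorted paths of .php files that contain a suspicious pattern."""
    hits = []
    for path in Path(theme_dir).rglob("*.php"):
        text = path.read_text(errors="replace")   # tolerate odd encodings
        if SUSPICIOUS.search(text):
            hits.append(str(path))
    return sorted(hits)
```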

    Finally, go into your /public_html/cgi-bin folder. If there’s a file called php.ini in there, delete it. There may not be, so don’t worry about it too much if not.

    How can I stop it from happening again?

    I’ve got some advice, but right now, if you’ve been told ‘Just upgrade WordPress’, well, that’s not enough. Yes, I know that GoDaddy was claiming for a LONG time that’s what you needed to do. I’m here to tell you this: GoDaddy is incorrect when they tell you ‘Just Upgrade.’

    That doesn’t mean you shouldn’t upgrade, in fact, you may note I said to get the latest and greatest WordPress version (again, 2.9.2 as I write this). That’s because it’s going to have every security fix they’ve come up with to date. It’s almost always best to use the latest version of software. For most of you, it’s always better.

    You may want to look into something like WordPress File Monitor, which emails you if files are changed. Just turn it off when you plan on making a lot of changes!
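
    The idea behind a file monitor is simple enough to sketch: record a fingerprint of every file, then compare later. This is my own illustration of the concept, not how the WordPress File Monitor plugin is implemented, and the function names are invented.

```python
import hashlib
from pathlib import Path


def snapshot(root):
    """Map each file's relative path to the SHA-256 of its contents."""
    root = Path(root)
    return {
        path.relative_to(root).as_posix(): hashlib.sha256(path.read_bytes()).hexdigest()
        for path in root.rglob("*")
        if path.is_file()
    }


def diff_snapshots(before, after):
    """Return (added, removed, changed) as sorted lists of relative paths."""
    added = sorted(set(after) - set(before))
    removed = sorted(set(before) - set(after))
    changed = sorted(p for p in before.keys() & after.keys() if before[p] != after[p])
    return added, removed, changed
```

    Take a snapshot right after a clean install, keep it somewhere off the server, and any file that shows up as added or changed when you did not touch it deserves a look.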

    By deleting your files, getting a secure FTP client and changing passwords, you’ve closed the biggest security hole: You. I hate to say it, but every time I’ve ever been hacked it’s been right after I opted not to follow security protocol that I know damn well. And here’s my protocol: Always use secure connections to your website when editing data or accessing sensitive areas.

    And that’s really simple. If I use cPanel or WebHost Manager, I connect via HTTPS, which is secure. If I use shell, I’m using SSH (secure!). If I’m FTPing, I’m using SFTP. You see the trend? I’m also only using software I know and trust. My browsers of choice are Chrome, Firefox and Safari. The last time I used IE 8, I got hacked. My SSH terminal is the Mac Terminal or PuTTY for Windows (which I only download from http://www.chiark.greenend.org.uk/~sgtatham/putty/ – there are other, fake, PuTTY sites). My FTP clients are (for Macintosh) Transmit and CyberDuck. For Windows… Well I actually don’t FTP much from Windows. I have been known to use WinSCP, but I’m not comfortable recommending it, as I haven’t had time to really look into its security. In addition, I don’t connect to my site’s back end from non-secure WiFi. That means I don’t go in on my laptop in Starbucks. Anyone can jimmy my connection!

    Now that you’re being secure, go talk to your web host. Tell them what happened. Since you have a backup of your files, you can even show them the hack! Any decent web host will sit up and pay attention. Sometimes they’ll be a bit shady, but pay attention. If they say ‘We’re going to look into this, but in the meantime, please upgrade and change passwords.’ then they’re okay. If they just say ‘Yeah, it’s your fault, upgrade.’ then you’re in trouble. When I was hacked, my host helped me sort out what it was, admonished me appropriately where I’d screwed up, and pointed out ‘Here’s when and where it happened.’ To which I said ‘Shoot! That was all on me!’ But they took the time to work with me.

    If you’re on GoDaddy, LEAVE. GoDaddy Doesn’t Give A Damn, or at least they’re acting like they don’t. A user found the code used to inject malware and it’s not a WordPress specific file. In fact, this annoyance is attacking multiple servers, multiple hosts, and multiple PHP based apps.

    Besides, Go Daddy is telling people to upgrade to fix the issue, but they’re running an old version of WordPress on http://community.godaddy.com (which is where they happen to be telling people to upgrade).

    It’s 2010, and apps like WordPress are here to stay. Mark Jaquith wrote a deft admonishment to web hosts, telling them to adapt:

    WordPress is the number one user-installed web app, and its growth is showing no signs of slowing. If you are a web host, and you don’t have a specific strategy for WordPress, you’re likely operating your service inefficiently, and may be opening yourself up to security issues. This is the year to adapt, or be left behind by nimbler upstarts.

    As a side note, GoDaddy has contacted Sucuri, saying they are looking into it, but they’ve taken weeks from when this issue first sprung, Athenaesque, into the spotlight. The full-grown goddess has a spear, guys. Pay attention. If they had said, from the get go, “Gosh, this is weird, we’re looking into it!” or asked for information, or not dismissed willing technical users, they might not be on my shit-list right now. As it stands, I cannot recommend them as a host.

    GoDaddy has a special contact form just for these security issues. If you were infected, use it.

    Me? I use LiquidWeb

    So you still need help?

    Ask your host for help. If they can’t (or won’t), try to get them to do a restore from backup. But some hosts are better than others about this.

    Your next step is to open your wallet:

    Those are three people I ‘know’ (as much as you can know anyone on the net). Plugged In is the only one who, up front, says she’ll remove malware, but the other two are savvy enough that I suspect they may as well. If not, they’ll tell me. Kim Woodbridge assured me that she does indeed remove malware (thanks, Kim!), and I’m fairly sure WP Turnkey might, but if not, based on his services listed, he can get you up on a new server that isn’t GoDaddy. Chip, of WP-Turnkey, also said he does this, so there you have it! Ask them, and please feel free to tell them ‘Ipstenu sent me!’

    And yes, these are going to cost you money. Well, running a website costs money. Welcome to the costs. I’ve paid out the nose to bail myself out of these situations before, which is why I’ve learned what to do. And even then, I pay a good host a lot of money a month to help when I’m in over my head.

  • Hack’n’Slash Security

    I was intending on a totally different post, but, well, this came up instead.

    Recently, WordPress, my preferred blogging software, has been under attack by both hackers and critics. There were actually three attacks that all got lumped into one so I’ll try and break this down. If you’re of the ‘Too long! Didn’t Read!’ variety today, you can get by with knowing this: If your WordPress install is not secure and if your web host is not secure and if YOU do not follow security practices, then you will be hacked. Period. Security relies on you, your web host and your web apps all being sensible about the whole thing to be effective. Remember, it’s okay to ask for help!

    Also go read Hardening WordPress right now.

    Okay, so security.

    Back in February/March, there was a sudden influx of users complaining their sites had been hacked by inii.info, whereby the hack was to edit the wp-blog-header.php and change it so any time a search engine bot visited your site, it went to inii instead. This matters because search engine bots collect information about your site and use it to rank your website against all the other sites about a given topic. There was a second hack where a file named ... (yes, three periods) had even more redirect code in it. And it was heavily encoded so you couldn’t read it without decoding.

    The reason I call this a Media Temple hack was that it seemed to be prevalent on Media Temple installs. While at first people jumped the gun and said ‘It’s WordPress!’ Media Temple came out with a detailed Q&A about the matter and the attack appeared to affect ALL webapps via compromised passwords. If Media Temple ever revealed what happened, I’m not aware of it, but it wasn’t just WordPress that was affected. They ended up changing DB passwords for every webapp, from Drupal to vBulletin.

    In early April, there was another rash of hacks, this time targeting Network Solutions. This time, it looked like a clear cut case of database changes. WordPress, like most PHP/SQL apps out there, uses a database to store all its information. In this instance, the database entry for the site’s URL was changed from (for example) https://ipstenu.org to an iframe link I’m not reproducing here.

    At the same time, there was a ‘Pharma’ hack, where links with ‘pharma’ in them were slipped into your site, in a rather genius fashion. Chris Pearson has a decent explanation on the matter, but I feel he’s barking at the wrong car for part of it.

    Chris and Media Temple and Network Solutions and a horde of people on Twitter and forums everywhere jumped up and said “AHA! It’s a WordPress hack!!!111!” Which … well, yes, but not exactly. As the very wise Andrea_r put it, there’s a difference between attacking WordPress installs and targeting WordPress installs.

    An analogy if you please. There’s a rash of break-ins in a small town. The houses that are broken into are all bungalows. People shout ‘Aha! It’s a problem with bungalows not being secure!’ The police look into the matter and find out that in every house broken into, the bathroom window was left open. Now, is this the fault of the builder, who designed bungalows to have a window people could fit in through or is this the fault of the residents who didn’t close and lock their windows?

    If you said ‘It’s a little of each!’ then thank you, you can stay after class and clean the erasers.

    Security depends on many things, but to the topic at hand, server security is a tripod, and relies primarily on these three legs:

    • The Web Host is responsible for making sure the server itself is up to date with the latest patches etc, and that the server is configured in a safe way.
    • Web-apps are responsible for not unleashing needless insecurities to the system.
    • The end-user, we pray to the flying spaghetti monster, has not done something to violate security out of ignorance.

    To understand how these hacks all worked, yes all of them, you have to look at the perfect storm. This is what had to happen in order for all these accounts to be compromised:

    1. Someone saved their wp-config.php file in a way that it was readable by the free world.
    2. Someone scanned for and found that file.
    3. The user was using their ID and Password, rather than creating a DB user just for the blog.
    4. That account had read access to other accounts on the same server.
    5. The malicious user used the account to scan for other wp-config.php files, even if they were saved securely, and compromised their accounts/databases as well.

    That’s a lot of wrong on one box. With most webhosts, you’re on what’s called ‘Shared Hosting’ which means a whole mess of people are on the same server, each with their own ID and password. Much like if multiple people have IDs on a desktop PC, the inherent security of the server does not allow Joe to look at Jane’s files, unless she saves them in a public space. Alas, on a couple of sites, this was not the case. So Joe, who saved his wp-config.php file with 777, and used his server ID and password to access his database in that file, was compromised. And once the hacker had Joe’s information, he scanned the entire server and hurt everyone.

    Ouch.

    But wait, doesn’t that mean it’s WordPress’ fault for saving passwords in the wp-config.php file in a way a hacker can read them!? Well, yes, it’s certainly WordPress’ ‘fault’ but you have to realize that doing so is an accepted risk of most PHP/SQL webapps, in that for the SQL DB to be read, the password to that database must be kept in clear text (i.e. not encrypted). This is in the wp-config.php file.

    Okay, so it’s Joe’s fault for saving his file in a readable fashion? Somewhat. By having his wp-config.php file set so that anyone can read it (bad permissions – 777 for example), Joe put himself at risk. This IS NOT a flaw in the web-app or the ISP, it’s just … well, ignorant (unless the ISP is forcing the file to be 777 to run WordPress, at which point it’s their fault, and yes, there’s an ISP that does that!). In addition, I know a lot of people who, instead of making a DB user for their blog, will put their server ID and password in that file, which means once it’s been read, ANYONE can log into that server as them. I suspect this is done from ignorance as well. By the way, your server ID and password is the same as your FTP user ID and password in most cases.
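
    If you want to check your own file, the ‘anyone can read it’ part is easy to test for. This is a small sketch for Unix-style servers only; the function name is mine, and it only looks at the ‘everyone else’ permission bits, so a clean result does not mean the settings are right for your particular host.

```python
import os
import stat


def config_permission_problems(path):
    """Return warnings about what 'everyone else' on the server can do to the file."""
    mode = stat.S_IMODE(os.stat(path).st_mode)   # just the permission bits, e.g. 0o777
    problems = []
    if mode & stat.S_IROTH:
        problems.append("world-readable")
    if mode & stat.S_IWOTH:
        problems.append("world-writable")
    return problems
```

    A file set to 777 trips both warnings; one set to 600 (owner read/write only) trips neither.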

    Back to WordPress, shouldn’t they check for that? Maybe. But it’s not that easy, since there are a lot of different ‘acceptable’ security settings for that file, and it all depends on the server. Maybe one day WordPress will figure that out, but right now they tell you to make it secure.

    What about the web host? They are responsible for making sure that if Joe User sets his WP config file to 777, and puts his server ID/Password in there, the worst he can do is shoot himself in the foot. They do that by preventing his account from reading anyone else’s user directory. Limit the destruction on a per-user basis. There are a lot of Shared Hosts out there with lax security policies, which makes this more prevalent than I’d like.

    Hopefully that made sense.

    All of these hacks seem to be looking for people with wp-config files that can be read, logging into the account as the user (or the database user), and either adding files that edit the database, editing the database, or both editing the database and adding the fake plugin files.

    Once your server is insecure, because of compromised IDs and Passwords, you have to go back to zero, reset ALL your passwords, scan your PC for viruses, and be careful. Remember, if they have your password, they can do everything you can do.

    Good luck out there. Be smart, be secure, be safe.

    Edited to add…
    Also check out Mark’s well written post about how your security? Is your responsibility. Because dude, it SO is.

  • MediaWiki – All Powerful, All Annoying

    Don’t get me wrong, I love MediaWiki. It’s ‘overkill’ for what I need, but then again, I wanted a stand-alone ‘encyclopedia’ where primarily text based articles were listed, without the ability to comment. And until someone can trim WordPress to run as fast as MediaWiki, I’m sticking with it. Well, that and they need an ‘import from MediaWiki’ tool, cause at 700-odd pages, I’m not doing it by hand. It’s a static website, and it does its job well.

    But right now, and every time I need to update it, I hate it.

    I don’t mind using command line to wget the latest version and unzip it, overlaying the new files atop the old ones. What I mind is having to manually visit the pages for all my extensions, and determine if I need to upgrade or not. It makes me wish for WordPress with the happy ‘Hey, that plugin needs updating!’

    See, there’s no admin ‘side’ to MediaWiki, like there is for WordPress, or ZenGallery, or anything else I run on my sites. MediaWiki is for the hardcore people who don’t mind getting their hands dirty. And as a user, I think this is the real problem with the whole thing. Until they make a user friendly admin side of the whole thing, MediaWiki will remain used by the nerdy, the geeky and the techie, rather than the whole world. Part of why WordPress became so popular is they made it not easy, but easier to run your own blog. It’s still got problems, sure, but they made it so you could easily learn how to manage your own site.

    And then there’s MediaWiki.

    MediaWiki sucks to admin. Like today I found out I could turn on File Caching. That’s great news, I think! I use it for my gallery and my blogs (runs faster among other things). Except that, unlike WordPress (where Donncha’s freakin’ amazing WP Super Cache can clear out files on a scheduled basis) or ZenPhoto (where it runs once a day, or whenever I press ‘clear!’), MediaWiki has no cache expiry. That blew my mind, but seeing as MediaWikis are ‘mostly’ static content, it makes a little sense.

    So I turned it on and ran $php maintenance/rebuildFileCache.php which force caches everything. All at once. This is awesome to get your site ‘started’ and all told, it took up a moderate, but not huge, bit of space.

    Also, I was told ‘When you edit a page, the cache is refreshed’ except I did, and it didn’t. Then I was told ‘Add this to your page URL and it will prompt you to recache.’ (this being ?action=purge) except that didn’t either. If I was logged in, it did nothing. If I was logged out, it did, but then I went back and it was still the old page. Finally I sorted out that the cache pages had to be owned by ‘nobody:nobody’ (this isn’t too weird, BTW). The problem NOW is that if they were owned by that, then the script rebuildFileCache.php didn’t work!

    So, great, it now works, it now flushes when I edit and save a page. If I run the rebuild command, I’ll have to manually go in and chown the files to nobody, which annoys me, but I have godlike access to the server and I can always fix it. But what if I want to delete everything in the cache? Basically I have to dump the entire folder. Which is annoying, but at least it’s working now.
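
    For the record, ‘dump the entire folder’ can be done without removing the folder itself, which matters here because the folder’s owner and permissions stay as they were. A rough sketch, with a made-up function name; you pass in whatever directory your wiki’s file cache is configured to use.

```python
import shutil
from pathlib import Path


def empty_directory(cache_dir):
    """Remove everything inside cache_dir but keep the directory itself."""
    removed = 0
    for entry in Path(cache_dir).iterdir():
        if entry.is_dir() and not entry.is_symlink():
            shutil.rmtree(entry)       # cached sub-folders go away whole
        else:
            entry.unlink()             # plain files (and symlinks) are deleted
        removed += 1
    return removed                     # number of top-level entries removed
```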

    Why would I have to flush the whole cache? Because I make a formatting change, let’s say. Also, I have advertising on my sites. How does this get affected?

    In the end, I’m going to keep the cache running for a month, see how it goes. But it still annoys me how much of this is lacking because of no admin ‘dashboard.’

    Then again, that’s MediaWiki. Function over form. All powerful, all annoying.

  • But If, Baby, I’m The Bottom, You’re The TOP

    Earlier this month I talked about how my server was acting wonky and how I fixed it using, among other tools, TOP.

    This week I was chatting with a fellow about CPU usage and his site. He runs a rather large WordPress blog and the database is about 500 megs. As a comparison, this site, with about 500 posts, is under 5 megs, and my big site, with thousands of posts, comments, and a forum, is 10 megs. The biggest site I run on my server is 850 megs (just down from 910 after some clean up). The difference between his site and mine is that his is slow and he knows it. As we discussed ways to speed it up, I had some thoughts on WordPress and how, at a certain point, you’re going to need to dig into the guts of your server and learn TOP.

    The ‘problem’ with most ‘How do I make my WordPress site run faster?’ tutorials, as I’ve seen it, is they address surviving the digg effect. That is, they talk about how to deal with having a high volume of traffic on your site and, for the most part, you can make it with just adding caching plugins.

    Once your site gets ‘big’ or ‘popular’ you’re going to have to move off shared/cloud hosting and over to your own server. For most of us, the first step is a VPS (Virtual Private Server). Shared Hosting means ‘You have an account on a server with a hundred other people.’ It’s great for small sites, inexpensive and easy to use. The problem is you could have terrible neighbors, who use up all the CPU. Think of it like those old New York apartments where someone’s a jerk at 5am and uses up the hot water so you, at 7am, have none. Yeah, it’s kind of like that. That’s the day you think ‘I want a house!’

    Only, well, we’re not all up for houses just yet. A house would be a dedicated server, where it’s just you. Cloud hosting, which I touched on earlier, would be the college dorms of webhosting. It has a lot of benefits for the really small sites, and actually some for large sites, but I’m not sold on their overall usefulness yet, so I’ll talk about them some other time. What I want to talk about are Virtual Private Servers, the condo-sub-leasing (or rent-to-own maybe) of website hosting, and how the new VPS user should really get on TOP of things (sorry, bad pun) to make their lives easier.

    TOP. Well ‘top’ really. Unix commands are generally all lower case like that.

    The top command is a system monitor tool that outputs a list of processes. Have you ever seen Task Manager in Windows? It’s kind of like that tab for ‘Processes’ that you look at and run away from. By default, top sorts by percentage of CPU usage, so the “top” CPU users are listed first. See? The name made sense. You can also see how much processing power is being used, memory hogs and other cool things. Most modern Unix systems let you sort the list, colorize it, etc., though you have to be command line savvy.

    Here’s what top looked like for me about an hour ago.

    top - 12:44:44 up 126 days, 23:13,  1 user,  load average: 0.12, 0.17, 0.17
    Tasks:  91 total,   1 running,  90 sleeping,   0 stopped,   0 zombie
    Cpu(s):  0.0% us,  0.0% sy,  0.0% ni, 100.0% id,  0.0% wa,  0.0% hi,  0.0% si
    Mem:    524288k total,   358248k used,   166040k free,        0k buffers
    Swap:        0k total,        0k used,        0k free,        0k cached
    
      PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
    15616 nobody    15   0 94540  65m  20m S  1.0 12.8   0:00.27 httpd
    12261 ipstenu   16   0  1908 1012  780 R  0.4  0.2   0:00.30 top
    [...]
    28630 root      16   0  107m  86m 1096 S  0.0 16.8   1:06.30 /usr/sbin/clamd

    I wanted to point out clamd, which has been the bane of my existence. Thing won’t DIE. I ended up going into /etc/exim.conf and manually commenting out the clamd line (and restarting the service) to finally get it gone.

    But top, as you can see, has a freakishly large amount of information. My server is doing fine, at this point, so I don’t have a whole lot to show you. What you can see right away is that I can tell, with a glance, what’s going on. At this point I have a ‘nobody’ process. That just means someone’s accessing my website. No, really! That’s good! The CPU and memory usage seem high, but they vanish in a second. Basically, someone rang my doorbell and for that brief moment, electricity was used. The next thing I see is the top command, which is run by me (hi!) and down the line is that idiot, clamd.
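
    If you’d rather have a script do the glancing for you, here’s a rough sketch in Python that reads the same lines I pasted above and pulls out the load averages and the biggest memory user. The function names (parse_load_average, parse_processes) are mine, made up for this example, and it assumes the twelve-column layout shown in my output. Your version of top may arrange things differently.

```python
import re

# Sample lines copied from the top output shown earlier in this post.
SAMPLE = """\
top - 12:44:44 up 126 days, 23:13,  1 user,  load average: 0.12, 0.17, 0.17
  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
15616 nobody    15   0 94540  65m  20m S  1.0 12.8   0:00.27 httpd
12261 ipstenu   16   0  1908 1012  780 R  0.4  0.2   0:00.30 top
28630 root      16   0  107m  86m 1096 S  0.0 16.8   1:06.30 /usr/sbin/clamd
"""


def parse_load_average(text):
    """Return the 1, 5 and 15 minute load averages as floats."""
    match = re.search(r"load average:\s*([\d.]+),\s*([\d.]+),\s*([\d.]+)", text)
    if match is None:
        raise ValueError("no load average line found")
    return tuple(float(value) for value in match.groups())


def parse_processes(text):
    """Return one dict per process row found below the PID header."""
    rows = []
    in_table = False
    for line in text.splitlines():
        fields = line.split()
        if fields[:2] == ["PID", "USER"]:
            in_table = True
            continue
        # Assumes the 12-column layout above: %CPU is column 9, %MEM column 10.
        if in_table and len(fields) >= 12 and fields[0].isdigit():
            rows.append({
                "pid": int(fields[0]),
                "user": fields[1],
                "cpu": float(fields[8]),
                "mem": float(fields[9]),
                "command": " ".join(fields[11:]),
            })
    return rows


load = parse_load_average(SAMPLE)
procs = parse_processes(SAMPLE)
biggest = max(procs, key=lambda proc: proc["mem"])
print(load)
print(biggest["command"], biggest["mem"])
```

    Run against my sample, the biggest memory user comes out as clamd. Of course it does.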

    I actually scan top a lot at work these days, trying to understand what’s causing issues. It’s good for ‘right now!’ things, but not so much if I want to see what started a strange spike a couple hours (weeks) ago. For that you need a whole mess of tools.
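
    One cheap way to start building that history, before you go shopping for bigger tools, is to write the load averages down somewhere on a schedule. This is only a sketch under my own assumptions: record_load is a name I invented, the CSV layout is arbitrary, and os.getloadavg() only works on Unix-like systems.

```python
import csv
import os
import time


def record_load(path, now=None):
    """Append one timestamped row of load averages to a CSV file."""
    # os.getloadavg() returns the 1, 5 and 15 minute averages (Unix only).
    one, five, fifteen = os.getloadavg()
    # now=None means "use the current time".
    stamp = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(now))
    with open(path, "a", newline="") as handle:
        csv.writer(handle).writerow([stamp, one, five, fifteen])
    return stamp, one, five, fifteen


# Example call (the file name is hypothetical):
# record_load("load_history.csv")
```

    Call it from cron every minute or so and you’ll have something to look back at when a spike shows up. It won’t tell you which process was to blame, only when things got busy.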

  • Woop(ra)! There it is!

    A couple months ago, I stumbled onto this statistics site called Woopra, and signed up to be a Beta tester. I already use things like Google Analytics and SiteMeter, which let me see how much traffic a site gets a day, based on about twelve different interpolations of the metrics.

    Basically, I’ve learned I can tweak my results to make it look like I get a lot of traffic or a little, which serves no purpose. But I can also compare my sites to previous days, which I actually do find helpful. I can learn what days my sites are heavily hit, which days are good days to upgrade code because no one’s there, and so on and so forth. What you have to figure out is why you need stats. Statistics are meaningless for a site like ipstenu.org, because there’s no money to be made here. For jorjafox.net, I find that they help me understand trends and as that site averages about $.75 a day in ads, it’s beneficial.

    Google Analytics and SiteMeter are both ‘yesterday’ code, however. I don’t get to see the current status of my site until the day after. Most of the time that’s okay. If I really am desperate for pageviews and such, I have other tools on my server to figure that out (and Google AdSense can be brute forced into helping). But sometimes you want to watch what people are doing as they’re doing it, in real time.

    Enter Woopra.

    With Woopra, I can sit and watch people ping the heck out of my sites and see what they do as they do it. It’s a little Big Brother, but honestly, if you didn’t know that someone can tell who you are when you visit their website, it’s too late for you. Woopra lets me watch as people from different countries sneak in and out, where they come from and where they go to when they leave. Like I find that the majority of my hits come from the Gallery (200 pageviews an hour, give or take), and most of the referrers are from the main site or the wiki. This is all stuff I knew, but it’s nice to see them in live tracking.

    Do you need this stuff to run a good website? No, not at all. But if you’re starting to move your site from ‘good’ to ‘moneymaking’, then these are things you have to start to study and understand. Like that it’s okay to have an 11% drop in pageviews at noon, because the average at the end of the day will balance out. Or that you get a lot of traffic at 3pm from YouTube. All these things help you better understand the Venn diagram that is your website, and the more you know …

    Well there you are, then, aren’t you?