Half-Elf on Tech

Thoughts From a Professional Lesbian

Author: Ipstenu (Mika Epstein)

  • I take it back. WP-Super-Cache is a Super Hero

    I’m going to be upfront and admit that I’ve never actually liked this plugin. A very large part of me wants to side with Matt Mullenweg in that if you have a good server, configured properly, with a decent host, you should be just fine. Also, it doesn’t really work well with my favorite anti-spam plugin, Bad Behavior, which stops 99.999% of my spam cold. But. Over the years of running a vaguely popular fan site, I’ve been nailed by service spikes that killed me and everyone else on my shared hosting setup (multiple websites, not all connected, sharing a virtual server). At one point, I had to offload ‘news’ to LiveJournal, but since then, I’ve pulled it all back to WordPress, moved to a virtual private server (VPS, just me and my sites on a virtual server) due to the need for better support, and I was kind of complacent. Things were trucking along just fine, we had some major news that was handled without a blip, and I thought I was cool.

    Yesterday I had to cycle the HTTP service three times to clear things up. The first time, someone was using a really old URL (for a part of the site gone 2 years now) and, when it didn’t give them what they wanted, they kept hitting it. I blocked the IP address and we were fine. But. Then the news that I had some new, cool, information hit, and suddenly I was spiking like mad. I checked my stats, trying to see what was the culprit. The gallery is pretty tolerant of these things (though I have turned on the Static HTML cache right now) and while I did have some hefty images (1 to 2 MB, I usually try to keep ’em to .75 megs), it wasn’t ZenPhoto borking.

    No, my poor, poor WordPress was having a heart attack because I’d gotten myself crosslinked from a couple high traffic sites. How bad? Well that spike on the graph below may explain it.

    [Graph showing the spike]

    The first thing I did was tune the server. Actually, I’d done that months ago, dropping my memory usage from 77% to about 60%, but now I went in to see how well that was working. There was a little more I could do, so I optimized a couple more settings and things eased up a little. Not enough. I scrubbed the CPU usage, too, and normally we never spiked over 1 for a load average, but that wasn’t working yesterday. Sidebar. CPU Load is a very bizarre thing to most newbie server admins, and I’m not great at sorting it out myself. Of course, I know that a ‘good’ load is anything around .3 and a ‘bad’ load is something like, oh, 9. And yes, I hit 9 yesterday on my 8 processor VPS box. I’m not going to explain it here, as I’m still learning and I’m sure I’d get it wrong, but the gist is that you don’t really worry if your load hops up to 1 or 2 for a short amount of time. When it stays there and is spiking at 4 or 5, however, you need to pay attention.
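
    One rule of thumb that helps with reading those numbers is to divide the load average by the number of processors, so a load of 9 on an 8 processor box works out to a bit over 1 per processor. Below is a minimal Python sketch of that idea. The function names and the 0.7 and 1.0 thresholds are illustrative assumptions, not settings taken from this server.

```python
import os


def load_per_cpu(load_average, cpu_count):
    """Return the load average divided by the number of processors."""
    if cpu_count < 1:
        raise ValueError("cpu_count must be at least 1")
    return load_average / cpu_count


def describe_load(load_average, cpu_count, busy=0.7, overloaded=1.0):
    """Classify a load average. The thresholds are illustrative guesses."""
    ratio = load_per_cpu(load_average, cpu_count)
    if ratio >= overloaded:
        return "overloaded"
    if ratio >= busy:
        return "busy"
    return "fine"


def current_status():
    """Classify this machine's five-minute load average (Unix only)."""
    _one_min, five_min, _fifteen_min = os.getloadavg()
    return describe_load(five_min, os.cpu_count() or 1)
```

    Calling describe_load(9, 8) lands in the "overloaded" bucket, while describe_load(0.3, 8) comes back "fine". A short hop over the line matters less than a load that stays there.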

    What kept happening to me was that it would spike up to a load between 5 and 9, and the HTTP service (the bit that serves up webpages) would scream and fall over. Email, FTP, shell access and the rest were all okay, though, so I knew the server itself was fine. Thus I deduced something was sucking up the load and I knew I had three choices: JFO’s blog (it’d happened before), JFO’s gallery, or YTDaW. While I host YTDaW, I don’t actively admin it in any authoritative stance. The only ‘mod’ work I’ve done is turn off email alerts for people who are using non-existent emails (and then, only when I’m tired of getting their bouncing email). Devon pretty much keeps me on tap for server admin and security stuff, and I do my utmost best to keep my hands OUT of the pie. It’s her baby, I’m just the tech.

    And while they’re using a pretty old version of the forum software, it’s secure enough and solid enough that I didn’t think they were the culprit. The evidence (heh) supported that theory, so I went to look at JFO. It was definitely my old girl, and right away I could see that we were getting a lot of traffic from new users. Four times the traffic. Before you could say ZOMG! I was on Google Analytics and Woopra, checking out who the hell was hitting my site and the answer was surprising.

    Everyone. (Well, mostly Facebook, AfterEllen and Twitter, but really, it was all over.)

    I’d accidentally broken news about three hot topics within a couple hours, and now everyone and their mother wanted to see JFO and, as many people have mentioned, WordPress was hemorrhaging under the ‘Digg’ effect. Basically it was trying to serve up dynamic (generated on the fly) pages to too many people at once. If I were serving static HTML, it would go faster, but WordPress doesn’t do that. Except … except it does if you use WP Super Cache.
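
    To make the dynamic versus static difference concrete, here is a minimal Python sketch of the general idea: render a page once, save the HTML to disk, and hand back the saved copy until it gets stale. This is not WP Super Cache’s actual code. The StaticPageCache class, the render callback and the one hour expiry are all assumptions made up for illustration.

```python
import hashlib
import tempfile
import time
from pathlib import Path


class StaticPageCache:
    """Keep rendered pages on disk so repeat visitors skip the slow render."""

    def __init__(self, cache_dir=None, max_age_seconds=3600):
        # Fall back to a throwaway temp directory if none is given.
        self.cache_dir = Path(cache_dir or tempfile.mkdtemp(prefix="page-cache-"))
        self.cache_dir.mkdir(parents=True, exist_ok=True)
        self.max_age_seconds = max_age_seconds

    def _path_for(self, url):
        # Hash the URL so any path maps to a safe file name.
        digest = hashlib.sha256(url.encode("utf-8")).hexdigest()
        return self.cache_dir / (digest + ".html")

    def get_page(self, url, render):
        """Return cached HTML if it is fresh, otherwise render and store it."""
        path = self._path_for(url)
        if path.exists():
            age = time.time() - path.stat().st_mtime
            if age < self.max_age_seconds:
                return path.read_text(encoding="utf-8")
        html = render(url)  # the expensive, generated-on-the-fly step
        path.write_text(html, encoding="utf-8")
        return html
```

    With something like this in front of the app, a thousand visitors hitting the same post cost one render and 999 file reads, which is the trade that makes a traffic spike survivable.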

    As I mentioned before, I don’t (didn’t!) like that plugin. I want my app to behave correctly without it. I mean, the PM of Britain uses WordPress! I was sure they didn’t need caching. They probably have a rack of servers on a co-located cluster. Except I viewed source and they were using it. The Library of Congress wasn’t, though, and neither were The Speaker of the House (Nancy Pelosi) or the Army. Honestly, I wasn’t sure how to take that, but after four hours of babysitting my server, I took a plunge and installed WP Super Cache for the fourth time.

    The first few times sucked, I admit. It was a lot of massaging and manual crap that, while I’m perfectly capable of doing, I didn’t like. This was easier. A chmod, an install, a click, another chmod and then I was done. And guess what? My loads dropped from an average of 3.45 to one of .35 by morning. On top of that, my memory had one spike since I turned it on, and that was right when I was running backups and the like.

    [Graph: memory usage, showing the one spike]

    So I’m keeping it on for now, especially with what I expect tonight, but I think that I can say … yeah. WP-Super-Cache does what it says.

  • Every Site should have a Favicon

    Imagine summing up everything a website is about in a 16x16px square. That’s the goal of a favicon (short for favorites icon). Pretty much every site out there has one, and it’s a devil of a task to make one that looks appropriate, identifiable and understandable in such a small space. I spend as much time on a favicon as I do tweaking a design, because they are that important for the look and feel of a site. A site without one is nearly naked.

    Back in the days of IE 5 (yeah, 5, so 1999), Microsoft hit upon a great idea. If you made a teeny picture and saved the file as favicon.ico in your html root, their browser would pick it up and use it as the icon on your bookmarks menu. It didn’t take long for people to figure out microsoft.com was doing this, and they began implementing it all over for every site they could. As people got smarter, they figured out how to fake it, so you could have a different favicon for every page, just by manipulating the head of your html document.

    Back in the day, you had to use .ico (Microsoft Icon) files as your favicon, but these days most modern browsers pick up .png, .gif and .jpg happily enough. This allows people to make animated favicons, which need to be shot and killed. For maximum compatibility, though, most people still use .ico, since IE doesn’t like the others. Or it didn’t. Someone on IE 8 will have to check.
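
    The per-page trick boils down to a link tag in the head of each page. Here is a minimal Python sketch that builds one. The favicon_link function and the ICON_TYPES table are made up for illustration, and the MIME types are the commonly used ones, not anything specific to this site.

```python
import os.path
from html import escape

# Commonly used MIME types for the icon formats mentioned above.
ICON_TYPES = {
    ".ico": "image/x-icon",
    ".png": "image/png",
    ".gif": "image/gif",
    ".jpg": "image/jpeg",
}


def favicon_link(href):
    """Build the <link> tag that points a page at its own favicon."""
    extension = os.path.splitext(href)[1].lower()
    if extension not in ICON_TYPES:
        raise ValueError("unsupported favicon format: " + repr(extension))
    return '<link rel="icon" type="{}" href="{}">'.format(
        ICON_TYPES[extension], escape(href, quote=True)
    )
```

    The returned string goes inside the head of the page. Older versions of IE looked for rel="shortcut icon" instead, so sites chasing maximum compatibility often used that spelling.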

    The real problem boils down to size, for most people. At 16x16px, you don’t have a lot of room. This site actually has a non-recognizable icon (it’s the Xena/Gabby picture). Technically you can go up to 32×32 for an image, and I have one that’s 240×240, but in the end, they all render at 16×16 on 99.999% of browsers, so looking good at that size is your goal.

    If you think I’m being silly, about a year ago, Google changed their favicon and admitted that it wasn’t final. Right away they basically started open submissions for a better one. When they changed it in January, it became the favicon heard ’round the world. Eventually, Google stepped up to explain the change. It’s important to have an icon that matches your site, as Google explains, as well as a unified look for all aspects of your design. Should you have a different look for each app on your site, or an all in one? How does it affect the other aspects of your site, like the iPhone’s new icons for saved webpages?

    These aren’t simple answers, but to explain how I go about it, here are some favicons that I have made and use out there in the world. Not this site’s though. I need to come up with something better for it.

    When I moved JFO from orange to green a year or so ago, I made a new favicon to reflect the design. The image is a cropped shot from the original header (which is now a full color photo, but still), and is a close up of Jorja’s face. It’s JUST recognizable as Jorja, I think.

    Alternately, I came up with this image, which is a copy of the shot used on the header currently, done in greens to match the site. In a way, it’s both more and less recognizable, as the image is harder to make out (it’s a head and shoulders) but as it’s the same used in the header, people might make the connection. I’ve yet to use this on a live site, but it shows up on my test sites right now.

    For the website ‘SCA Jews’, I had gone with a slightly eastern feel that evoked both the idea of camping and the concept of days gone by. Evening Sun came from spectacu.la, and took minimal editing to fit my plan. The problem was I had no favicon. Originally I put a little sun up there, but then it struck me that the ‘meaning’ of the site was to promote the meal plan “Meals on Camels”. What better way to express this than with … a camel.

    I also helped design (or rather optimize the design for) my friend’s site, The Yeast I Could Do. She had no favicon and I spent a couple hours scrounging for something bread-ish, and eventually picked this one, even though it’s questionable. It does look a bit like a loaf of bread, and she recognized it, so I think it went okay. In its .ico format, it has a transparent background.

    Finally there’s this one. Pony Wars is a joke site I made up with a friend for a “My Little/Pretty Pony RPG”. I mocked up the site because I was bored one day and finding an icon for it has been a bear. In the end, I went with this 33×33 (yes I know) icon of a pink pony. It doesn’t scale very well and looks weird on the site itself, but it’s a hard icon to shrink.

    If, in the end, you’re stumped at making one, there are a lot of favicon collections out there to help you. Be warned, they can take a LONG time to load.

    What are your favorite favicons?

  • I Haven’t Got Time For The Pain!

    [Image: Carly Simon, and you should get the joke here]

    Two months ago (give or take) I mused over photo gallery options for my sites. For Ipstenu, I’m now using WordPress and treating it like a photoblog. For JFO, however, I couldn’t answer it that easily.

    I really do like the Gallery project. I do! I learned a great deal about photography from it, and I’m thankful for it. But. I needed to move on as a user, a developer and a photographer. On that last one, I’m not a professional one, I’m just a goofy girl with a camera who likes to remember where she’s been. As a user, Gallery2 did the job well and without major issues. As a developer, it made me want to cry. Many times. Once I had to log into my friend’s server to fix his install. That just whomps.

    Even the developers admit that Gallery2 suffered from bloat:

    The code base is too complex and over-engineered because it was designed to fix every single thing that was wrong with Gallery 1 (Second System Effect) leaving its scope hazy and broad.

    The whole idea of it was “Your photos, your website.” And personally I love that. I hate having Flickr or Picasa in charge of MY photos. Let alone Facebook. I have a blog on my domain for that same reason. But Gallery2 was too much. I never used half of it and it was 16+ megs at its slimmest install. That the developers agreed with my feelings delighted me. And the Feature List was also exciting. As soon as G3 popped out, I grabbed a copy and started playing.

    With each version of Gallery3’s beta releases, I would get excited and then disappointed. Excited for the new toys and disappointed for how the overall effect felt. It just felt wrong for me. It wasn’t really Web2.0, even though it was, and the usage felt off. It didn’t make as much intuitive sense as G2, though it was still far better than Coppermine (which frankly I hate, and I know more people who argue with it than anything). At first I thought it was because I was so used to G1 and G2, but then I realized that over the last 10 years, I’ve used so many different systems that I’m fine with subtle differences. I’m savvy, I’m smart, I can code, so why did G3 feel wrong to me?

    It was too hard. Too much was built in and not pluggable. Too much was hard coded in itself. Theming was impossible in the first release, and way too hard in the third. Understanding the theme system in G2 was easy, though implementing it was hard. Understanding it in G3 was hard and implementing was horrific. And before someone reminds me, AGAIN, that this isn’t even a beta product but an alpha, quite frankly that’s not an excuse. The basic thing you need to be able to do with a first public release (be it beta, alpha or whatever) is to use it: Upload photos, change options, theme. That’s it. Those are the three things that, at its most basic, photo gallery software has to have, or you may as well be using an off-site solution.

    And while I may sound like I’m ranting, I’m not. I’m sad and frustrated and … You know, I really like Gallery! I really do. But it was starting to feel like Movable Type. They made a big shift and suddenly I wanted to know who peed in my coffee. The code felt wrong, it felt clunky, it felt raw. It was like starting over, and I didn’t like where it was going. And I realized the fact was that I was going to say goodbye to an old friend.

    Personally I’m all about the simplest, best, tool for the job. I wanted a way to update news on JFO and, when that was ALL I needed, I used CuteNews. When I realized the site was going to need something more, I weighed my options, tested software, and decided that while WordPress was a bit of overkill, I knew how to support it and customize it to be what I needed. In the end, that proved to be a perfect choice. When I had a forum (the first time around), it was IPB, which I liked, but it always felt too big. Now I use the very basic bbPress and it’s what I need and nothing more.

    If WordPress had PhotoPress, I’d probably have snagged that. Instead, I shopped around. I installed Coppermine, again, to test. I put up G3-alpha3 and then 4. I went to Wikipedia and dug out the comparisons and ended up in a head to head battle between ZenPhoto and Gallery3.

    ZenPhoto won by feeling better.

    Seriously, it’s aesthetics at this point. There are only two features I miss: Being able to re-upload a picture and keep its metadata, and having ‘new’ images show up with a different background color. But I can live without those.

  • Plug It In, Plug It In

    I am not a great programmer by any means. I can hack around and muddle my way through with the best of the great net scapegraces. I’m not the genius who invents a brand new way of doing things. That said, I do, eventually, get annoyed with things enough that I force myself to learn how to code.

    Yesterday I was pissed off at WordPress because of its user management tools, and no plugins really did what I wanted. See, I have open registration. It lets me sync my blog and forum and let people post. But where it fails is that I can’t set users as ‘banned’ in WordPress. This is a simple thing, I feel. A user role that has no rights and is just banned from commenting. They can read all they want, but no comment. I’ve tried just about every tool out there, but they never work. In addition to that, spammers sign up to my blog.

    Since creating a ‘bozo’ user role is outside my ability, I decided what I wanted was a plugin to prevent people from registering if they were on my blacklist, similar to how I can prevent them from commenting on my comment blacklist. At first I was using TimesToCome Stop Bot Registration, which (among other things) uses StopForumSpam’s list of spammers as a stop-gap.

    The problem with TTC is that if you register with a bad email (jane132@gmail.com instead of jane123@gmail.com) and then try to register with the RIGHT email, it notes that the IP is the same and bans both emails and the IP. Which caused a couple people no end of problems on my site. It had to go.

    From there, I tried No Disposable Email, which checks against a list of known baddies. That was nice, but it was a text file list that you had to update by hand. But it got me thinking.

    I quickly converted it into Ban Hammer, which allowed me to update and edit the text file from a submenu inside my admin session. But that wasn’t enough. Why did I have to have two places to keep my jerk list? If someone was on my WordPress Comment Blacklist, I didn’t want them to comment. That implies they’re just not welcome at all. So why don’t I make Ban Hammer pull from that list? Which I did.
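
    The check itself is simple enough to sketch. The plugin is PHP, so the Python below is only an illustration of the logic, not Ban Hammer’s code. It assumes the blacklist is one entry per line and that an entry matches if it appears anywhere in the email address, which is how I understand the comment blacklist to behave. The function names and the error message are made up.

```python
def parse_blacklist(text):
    """Split the blacklist into lowercase entries, skipping blank lines."""
    return [line.strip().lower() for line in text.splitlines() if line.strip()]


def is_banned(email, blacklist_entries):
    """Return True if any blacklist entry appears anywhere in the email."""
    candidate = email.strip().lower()
    return any(entry in candidate for entry in blacklist_entries)


def check_registration(email, blacklist_text,
                       message="Sorry, that email address is not allowed."):
    """Return (allowed, error_message) for a would-be registration."""
    if is_banned(email, parse_blacklist(blacklist_text)):
        return False, message
    return True, ""
```

    Because it only looks at the email address, a typo like jane132 followed by a second try as jane123 does not get anyone banned by IP, which was the problem with the earlier plugin.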

    I still have things I want to do to the code, like put in an option to use StopForumSpam’s list, and a way to edit the error message. But for now, Ban Hammer sits by my other plugins, Recently Registered (lists the last 25 registrations) and my bbPress plugin Spoiler Bar (adds in spoiler ‘code’ to bbPress) on my Google Code site. It’s not for ‘public’ release, but it’s there so my friends who have been helping me test out my ideas can easily download. What? I have nerd friends!

  • You’re not the boss of me

    After having my domains on three different servers for a long time, I mathed it out that it’d cost me the same to put ’em all on one VPS (virtual private server). After I called up my ISP (the fanfreakintastic LiquidWeb), they had me all moved over without me having to fuss! Combine two shared accounts into one VPS? Sure, done. I suspect my next bill will look … weird, but that’s okay. I’m sure that even if it’s all messed up, I can call them and get it sorted out.

    The first thing I did was make sure everything was running and then I left it alone for a day. Did anyone notice? No? Good, the fix was in!

    Then I started fiddling. I didn’t know a lot about VPS, having only mucked about with a RedHat distro before, and LiquidWeb provided me with cPanel and WHM, which I’d never used before. They also had the very familiar shell world for me to jump into. Google being what it is, I quickly found a VPS Optimization Guide that gave me some ideas to start.

    What I’ve Done So Far
    My memory usage, with one beefy site and two baby sites, was hitting 50% which, in my mind, was bad. Now the beefy site runs off WordPress which is known to have these issues. My CPU load was barely passing 0.01 (yes, that’s right) though, so that was good. My first thought was to try WP-Super-Cache again, except last time I did that, CPU went through the roof and stayed there. Also, you lose dynamic feeds etc. (unless you use AJAX). I’ve heard great things about WP-Super-Cache, but the fact that it’s not a locked-in part of WP has always made me wonder about its viability. If it really was that good, or the only solution, it would be built in. Not to knock it, but I consider it only one option.

    While I know I need to optimize WP, my first stab was to optimize the server. Except that I didn’t. I switched from Zend to APC. Now, I’m not really sure if that was the best thing to do. I find a lot of people clamoring that APC is better and since I’d had weird issues with Zend before (outright borking MediaWiki if not configured specially), I decided to give APC a shot. If someone has info on some benchmarks or a good link to why APC is better than other PHP cache tools, I’d like to see it.

    Then I removed Clamd (and ClamAV). Yes, I know it’s virus scan software, but I’ve never actually seen it catch anything. What I run on the server, and what my ONE (yes one) resold client will run, aren’t going to get caught by it. We run the same stuff. So call it a calculated risk. I also turned off EntropyChat (never gonna use it), MailMan (resource hog), Analog Stats and Webalizer (leaving AWStats, though personally I use Woopra and Google for stats). Gave the server a bounce after all that and my memory usage dropped from around 50% to the 30s. I consider that a success.

    My only issue is that my phpinfo page looks weird… No idea what happened there.

  • Google’s Blog Search is Irrelevant

    Google is a great search tool to find a website or general information about a topic, but quite frankly I’ve come to despise their blog search engine and I’m seeing serious flaws in their ranking app. Specifically, they now search blog links (aka the blogroll) and when you search blogs about a topic, you get unrelated posts.

    If you search for Laurence Fishburne because you saw him on an episode of M*A*S*H recently as a soldier with a racist CO, Google gives you two hits for IMDb, one for Wikipedia, one about news (GoogleNews that is), and then, finally, his official website. While Google claims they don’t adjust ratings (that is, they don’t give more or less weight to a website on their own) and allow their PageRank algorithm to sort all this out, it seems to me that any official website should be ranked first. Also, IMDb shouldn’t be listed twice. But that depends on what people are looking for and what Google offers.

    We stand alone in our focus on developing the “perfect search engine,” defined by co-founder Larry Page as something that, “understands exactly what you mean and gives you back exactly what you want.”

    With that in mind, as I look at their tech overview for people who aren’t super geeky, I think that they come to the process a little flawed. PageRank is a great idea, don’t get me wrong. The more pages that link to a site, the higher the site is ranked (in essence). Okay, that’s great! Until you have those damn splogs. You know the ones. Spam blogs that promise you information about a person/place/thing, but are nothing more than a ton of links and 100 popups.
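
    For the curious, the ‘more links, higher rank’ idea can be sketched in a few lines of Python. This is the simplified textbook version of PageRank, not whatever Google actually runs. The site names, the pagerank function and the 50 iterations are made up for illustration, and 0.85 is the commonly quoted damping factor.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Rank pages by repeatedly passing each page's score along its links.

    `links` maps a page name to the list of pages it links to.
    """
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    pages = sorted(pages)
    if not pages:
        return {}
    count = len(pages)
    ranks = {page: 1.0 / count for page in pages}
    for _ in range(iterations):
        # Every page gets a small baseline score regardless of links.
        new_ranks = {page: (1.0 - damping) / count for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # Split this page's score evenly among the pages it links to.
                share = damping * ranks[page] / len(targets)
                for target in targets:
                    new_ranks[target] += share
            else:
                # A page with no outgoing links spreads its score to everyone.
                share = damping * ranks[page] / count
                for other in pages:
                    new_ranks[other] += share
        ranks = new_ranks
    return ranks
```

    Feed it a handful of pages that all link to one ‘official’ site and that site floats to the top. It also shows why splogs are a problem: a ton of junk links counts as votes just the same.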

    Why would I search blogs? Easy, a lot of news sites are using blogs these days, and I want to read those too. It’s not rocket surgery, it’s how news is disseminated in 2009, folks. And I, personally, like to search by ‘date’ because I want to know what’s newest.

    Our search engine also analyzes page content. However, instead of simply scanning for page-based text (which can be manipulated by site publishers through meta-tags), our technology analyzes the full content of a page and factors in fonts, subdivisions and the precise location of each word. We also analyze the content of neighboring web pages to ensure the results returned are the most relevant to a user’s query.

    This looks like it should take care of spam blogs, but if you’ve ever done a search on blogs about someone (let’s use Mr. Fishburne again), you know it’s a crap-shoot.

    A news search is actually pretty helpful. I get some articles of interest right up front. If I flip the bit and sort by date it’s still pretty useful. When I go to blog search (which is a sidebar link off news), it’s still mostly beneficial.

    But I dare you, I dare you, to make sense of the articles when you click sort by date. Three posts on that first page might actually be something worth reading. Good luck finding them, and I hope they actually are what you want. But at the end of the day, those spam blogs aren’t the problem that makes me hate the blog-search.

    No, the problem, as I see it, is posts like this:
    [Screenshot: a blog search result, with the label count circled]

    That bit I circled for you means that the ‘label’ (tag, category, whatever) for ‘Laurence Fishburne’ has been used 4 times. Go to that post and you will not find a single thing on the page of use. 99.999% of these blogs are blogspot and, while I don’t begrudge them their posts, they’re getting false promotion! And your post that you lovingly crafted about how totally amazing Fishburne is, and how he acted the hell out of that scene last night is now 10th on the list, and bound for page 2 any second now.

    The only official Google response I can find on the matter is a post by Jeremy Hylton in their Google forums (dated November 2008).

    We expected some problems from blogroll matches, but may have
    underestimated the impact on searches using the link: operator or
    where the query matches a blog or blogger’s name. We do expect to fix
    the problem you’re seeing. We’ll use the full page content, but
    exclude the content that isn’t really part of the post. I’m not sure
    if we’ll be able to make the change before the end of the year, but we
    are working on it and are pretty confident that it can be solved.
    We’ll post an update here when we’ve got a solution.

    And no, there is no update to that post.

    The hoopla from other blog sites has died down, but as this is still a prevalent problem on the blog search, I would really like to see it heat up again. Google’s blog search is pretty much dead useless to me if I can’t find information I want. As finding what I want is the whole point of Google (they said it first), they’ve made themselves irrelevant.