Half-Elf on Tech

Thoughts From a Professional Lesbian

Tag: troubleshooting

  • Backtrack to Clean Code

    I was watching The Bletchley Circle, about four women who were part of the code breakers in World War II, and how they stumbled upon a serial killer because only they could see the patterns. In the third episode of the first season, the main character is trying to explain why they need to understand the killer, Crowley, from before he started killing, and she says the following:

    At Bletchley, when we came across corrupted data, we had to backtrack till we hit clean code. That’s how you find an error in the pattern. All Crowley’s giving us is corrupted data. We need to backtrack to before he was killing. We need to start from there. That’s how we’ll find him.

    I’d never thought of it in those words, but that’s exactly right.

    When we debug code, when we find errors, we always backtrack to clean code. Most of us aren’t trying to find psychopaths and serial killers, of course. What we’re trying to do is find the patterns and understand what went wrong. And many times, we’re trying to find patterns when the telling of the breaking doesn’t lend itself to any patterns.

    Think about how you describe a situation, how you explain what’s broken. You start with your part. “I was trying to do X.” Then you explain what you expected to happen. “Normally that makes the color blue.” Next you say what did happen. “Instead, it made the color red.”

    That’s all well and good, except there’s a great deal missing. Some of it will be pertinent and some won’t. Some will be overkill and useless signal to noise, and some minutiae will be just what is needed to solve a problem. The difficulty is that you may not know what happened that is important. If all you know is ‘I upgraded WordPress’ for example, then you may not be aware of all the changes that went into the WP_Http API. You may not know about the new Multisite functions.

    If you’re not a developer who read the field guide for WordPress 4.6 RC1 and all the linked posts, and ran a compare of 4.5.3 to 4.6-RC1, then maybe you’d be surprised when your plugin breaks. And while you thought well of yourself for testing on the release candidate, you’re stunned at how much changed, and not sure what on earth happened.

    So you backtrack. You know that the magic sauce is in the requests sent to the server. And you know you’re using wp_remote_request() to do it. So you look at anything related to that. What does it call? Did that change? You step back and back until you find as much as you can, and when you’ve determined it’s ‘something,’ you reach out for help.

    In WordPress, this is why we tell people to switch to default themes or disable plugins. We’re asking people to backtrack to code we know is clean. We can’t read minds and know the little things. So we ask people to backtrack in the most obvious ways. “Does it happen with all the other plugins off?”
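
    With wp-cli, that backtracking can be scripted. What follows is only a rough sketch: find_breaking_plugin is a made-up helper name, and the check command you hand it is an assumption (anything that exits 0 while the site behaves and non-zero once the bug appears). It switches every plugin off, confirms the problem is gone, then switches plugins back on one at a time until the check fails.

```shell
# Sketch only: find_breaking_plugin is a hypothetical helper, not part of wp-cli.
# $1 is a command of your own that exits 0 while the site works.
find_breaking_plugin() {
    check_cmd="$1"

    # Remember which plugins were active before touching anything.
    active="$(wp plugin list --status=active --field=name)"

    # Backtrack to code we know is clean: no plugins at all.
    wp plugin deactivate --all >/dev/null

    if ! sh -c "$check_cmd"; then
        echo "still broken with every plugin off: look at the theme or the server"
        return 2
    fi

    # Re-activate one plugin at a time and re-run the check after each.
    for plugin in $active; do
        wp plugin activate "$plugin" >/dev/null
        if ! sh -c "$check_cmd"; then
            echo "breaks after activating: $plugin"
            return 1
        fi
    done

    echo "no single plugin reproduced the problem"
}
```

    Something like find_breaking_plugin 'curl -sf https://example.com/ >/dev/null' would do for a site that throws a 500. Plugins listed after the culprit stay off, so re-activate them by hand when you are done, and try this on a staging copy first.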

    Backtracking to clean code.

  • Turning It Off And On Again

    Apple’s watchOS 2.0 came out on Monday Sept 21, and I was one of the unlucky ones who had a problem. All of the new ‘native’ apps crashed.

    If you’re unfamiliar with the concept, the original Apple Watch didn’t allow apps to really run on the watch. They ran on the phone and you had to use Bluetooth to connect for data. Now, with watchOS 2, the apps can load locally and use wifi on the Watch itself, making them faster. Exciting times for all. As I explained to my wife, all the Apple default native apps worked fine. The 3rd party ones did not. They all crashed.

    Also my battery life went to shit. So I did what one logically does. I rebooted my Watch. That didn’t help. So I went to Google and Reddit (yes, Reddit) and I dug around and found what everyone else had done to fix it:

    • Unpair and re-pair
    • Restore from backup of 2.0
    • Set up as new
    • Let it sit overnight
    • Uninstall apps from phone, reboot phone and watch, reinstall apps on phone, reinstall apps on watch
    • Reinstall from my 1.0.1 backup

    None of that worked for me, so I filed a ticket with Apple support at about 7:15am. They called me back at 7:30 (which was nifty) and we discussed what I’d tried. They walked me through things, I confirmed I’d tried all of that, and detailed what I’d seen happen. Finally the woman apologized, said she didn’t know why it couldn’t work, and asked if I wanted to mail it in to Apple for a replacement.

    I didn’t. I was sure this wasn’t a hardware bug. I asked if I could take it to an Apple Store, and she said yes, making me an appointment at the store for the weekend (the earliest time) but I work .5 miles from a store so I planned to head down after lunch to have a go.

    I ended up not doing that.

    I work in tech. I’m used to troubleshooting. I went over everything I’d done. I checked and double checked that I was sure I did it right. I went back to the Reddit thread and looked to see if anything new had been posted. Sure enough, there was something. A Zen man in the MacRumors forum had an answer:

    • Doing a iphone backup with encryption of data on itunes.
    • Delete content of iphone.
    • Restore from a backup.
    • All native apps are working fine!

    While I couldn’t say that was a ‘great’ idea, I figured I had nothing left to lose. Since I always keep a spare cable for my phone and my watch in my bag, I connected them both and tried.

    And yes. It worked. Immediately I canceled the appointment with the Apple Help Gurus and started a live chat with them to explain how I fixed it. I also contacted the two app companies I’d been chatting with about it and made sure to confirm on Reddit that it worked for me. Because I will never be DenverCoder9.

    The debugging process with the Apple Watch is convoluted. I had a similar headache when I couldn’t get the WiFi working properly. I ended up having to disconnect WiFi from my phone and then re-add it for the Watch to pick it up. It’s not really the best experience, and there’s not a lot of ways to debug things.

    While I do like the Apple Watch, the black-box technology aspect of the iPhone is increased since it’s, literally, impossible to use the watch without a phone. You have to both attempt to fix things on the watch and the phone, without having a way to determine which is the broken one. And a ‘reinstall’ is not really the friendliest thing. Had I not had a handy laptop, I would have had to do an iCloud restore, which would kill my activity history (something I’d already accidentally wiped out).

    The problem comes back to meaningful error messages. All I could say was “The app crashes and kicks me back to the home screen.” Apple faces the same issues we all do with errors. How do we explain things in an informative way that allows people to react to the errors and know what to do next, when there is no way to gauge their skill set? Sadly, Apple’s route is “Take it to a professional.”

    We can’t all do that with our products, and more often than not it leads to frustration and things like ‘Bendgate,’ where people just rant and make a product seem worse when it’s really only a very small percentage of those who are impacted.

    Is there an answer? No. But it’s just one more thing to consider when we discuss elegant failures.

  • Mailbag: Debugging the Dread “No Permissions”

    Sometimes customers hunt me up here. That’s okay. I’d rather you opened a ticket since I like to eat, sleep, and hang out with my wife, but in this case they also had an interesting problem.

    When I log in, I get “You do not have sufficient permissions to access this page.”

    I found the support ticket and solved it, and then I mailed the DreamPress Ninjas with this methodology. It relies on wp-cli, so bear with me.

    The first thing to do is check the user list:

    user@server:~/example.com$ wp user list
    +----+-----------------+--------------------------------+-----------------------------+---------------------+---------------+
    | ID | user_login      | display_name                   | user_email                  | user_registered     | roles         |
    +----+-----------------+--------------------------------+-----------------------------+---------------------+---------------+
    | 2  | bobby           | Bobby Done Nightly             | bobby@example.com           | 2014-02-12 17:44:28 | administrator |
    | 3  | darren          | Darren Done Rightly            | darren@example.com          | 2014-09-29 17:49:11 | contributor   |
    | 4  | ethan           | Ethan Done Wrongly             | ethan@example.com           | 2014-10-27 21:01:07 | subscriber    |
    | 5  | jimmybear       | Jimmy Eaten By A Bear          | jimmy@example.com           | 2013-12-16 14:45:18 | author        |
    | 1  | iamempty        | Admin Account                  | admin@example.com           | 2015-03-05 01:30:09 |               |
    | 6  | sonnyboy        | Sonny Boyd                     | sonny@example.com           | 2014-10-27 22:13:08 | contributor   |
    +----+-----------------+--------------------------------+-----------------------------+---------------------+---------------+
    

    Notice how iamempty has NO role? That’s the problem. So let’s give it a role!

    user@server:~/example.com$ wp user add-role 1 administrator
    Success: Added 'administrator' role for iamempty (1).
    

    Check again and all is happy!

    user@server:~/example.com$ wp user list
    +----+-----------------+--------------------------------+-----------------------------+---------------------+---------------+
    | ID | user_login      | display_name                   | user_email                  | user_registered     | roles         |
    +----+-----------------+--------------------------------+-----------------------------+---------------------+---------------+
    | 2  | bobby           | Bobby Done Nightly             | bobby@example.com           | 2014-02-12 17:44:28 | administrator |
    | 3  | darren          | Darren Done Rightly            | darren@example.com          | 2014-09-29 17:49:11 | contributor   |
    | 4  | ethan           | Ethan Done Wrongly             | ethan@example.com           | 2014-10-27 21:01:07 | subscriber    |
    | 5  | jimmybear       | Jimmy Eaten By A Bear          | jimmy@example.com           | 2013-12-16 14:45:18 | author        |
    | 1  | iamempty        | Admin Account                  | admin@example.com           | 2015-03-05 01:30:09 | administrator |
    | 6  | sonnyboy        | Sonny Boyd                     | sonny@example.com           | 2014-10-27 22:13:08 | contributor   |
    +----+-----------------+--------------------------------+-----------------------------+---------------------+---------------+
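
    On a site with more than a handful of users, eyeballing the table gets old fast. A minimal sketch for pulling out only the accounts with an empty roles column, assuming wp-cli's CSV output and a stock awk (users_without_role is a hypothetical helper name):

```shell
# Sketch only: users_without_role is a made-up helper, not a wp-cli command.
# Prints "ID user_login" for every user whose roles column is empty.
users_without_role() {
    wp user list --fields=ID,user_login,roles --format=csv |
        # Skip the CSV header row, then keep rows where the third field is blank.
        awk -F, 'NR > 1 && $3 == "" { print $1 " " $2 }'
}
```

    Each ID it prints can then be handed to wp user add-role with whichever role that account is actually supposed to have.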
    

    If the roles had been totally empty for every user, it’s a case where the database is looking for the ‘wrong’ table prefix and that’s a little messier. There, you’ll want to grab the table prefix from the wp-config.php file:

    $table_prefix  = 'wp_hsy671e_';
    

    Then go into the database and look in the wp_hsy671e_options table for a row whose option_name is wp_hsy671e_user_roles. If you don’t see one, that’s the problem. Check for one named wp_user_roles and rename it.

    If you DO see the right user_roles, then the problem is all the users are pointing to the wrong table. You can just run wp search-replace wp_user_roles wp_hsy671e_user_roles to force it back.
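
    The prefix check can also be done from the command line. This is a sketch under a few assumptions: wp-cli can reach the database, the options table uses the prefix from wp-config.php, and check_roles_option is a hypothetical helper name rather than a real command.

```shell
# Sketch only: check_roles_option is a made-up helper, not a wp-cli command.
# Compares the configured table prefix with the *user_roles rows that exist.
check_roles_option() {
    prefix="$(wp config get table_prefix)"
    expected="${prefix}user_roles"

    # List every option whose name ends in user_roles, one name per line.
    found="$(wp db query "SELECT option_name FROM ${prefix}options WHERE option_name LIKE '%user_roles'" --skip-column-names)"

    if printf '%s\n' "$found" | grep -qxF "$expected"; then
        echo "ok: $expected exists"
    else
        echo "missing: $expected (found: ${found:-nothing})"
        return 1
    fi
}
```

    If it reports a leftover wp_user_roles row, that is the one to rename to the expected name. Take a database backup (wp db export) before changing anything.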

  • Don’t Say WordPress

    This time I’m absolutely 100% serious. Yes, I can be sarcastic and humorous when I talk about WP, but in this case, I’m being honest, and I promise you serious. I work for DreamHost as a WordPress Guru. I’ve been training people, and teaching them one at a time, and in doing so, confirmed a bias I’ve had for years: Tech Support goes blind sometimes.

    I don’t think this is really their fault. They have to handle 60 to 100 tickets a day about everything from “How do I reset passwords?” to “My Database is speaking in R’lyehian. HALP!” In order to get through that volume, they look for the key words, the important ones that tell them that this is the problem. And one of those keywords is “WordPress.”

    This is not great, because sometimes the problem isn’t WordPress. Like PHP isn’t running, or the DB is missing, or a hundred other ‘It’s not WP’ problems. Naturally, that means a handful of tickets escalated to me aren’t WordPress at all, and I have to dig into it, and explain why.

    Before my coworkers think I’m pointing fingers or blaming them, I really don’t. It’s a volume thing, and it’s got to do with how the customer presents the error. If they tell you “My WordPress site is down, I’m getting an error 500 on all pages!” you think “Oh, it’s probably .htaccess or they’re using too many resources.” Those are the most common causes after all. After that, you start getting messy and into weird things like “PHP memory is set too high, causing WP to crash” (which I didn’t even know you could do to be honest until November). And sometimes it really takes someone who knows how WordPress works to put the pieces together and determine “Oh! This is it!”

    However, hands down, when I’m working with Multisite and I see someone say “My wildcard subdomain isn’t working!” and the ‘error’ page they get is not a WordPress styled 404, I will tell them “DO NOT mention ‘WordPress’ or ‘Multisite’ to your host. Tell them this:” and here’s my copy/pasta:

    I’m trying to set up a wildcard subdomain, so anything.mydomain.com will pull the files from mydomain.com, however I’m having problems. I’m getting a server error instead of seeing the content on my site. Is there a trick to setting this up on this server?

    Now some hosts will look and say “Oh well you’re using WordPress, that’s why.” and I want to kick them a little. No, that’s not why. When you go to a subdomain and get the server error (like subdomain not found) or worse a DNS error (like Google saying the domain doesn’t exist), then the problem is not, and cannot be WordPress.
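
    One way to check this with WordPress out of the picture entirely, sketched under the assumption that dig and curl are installed on your machine (check_subdomain is a hypothetical helper name, not an existing tool):

```shell
# Sketch only: check_subdomain is a made-up helper.
# Tells a DNS failure apart from a web server that at least answers.
check_subdomain() {
    host="$1"

    # dig +short prints nothing when the name does not resolve.
    if [ -z "$(dig +short "$host")" ]; then
        echo "dns: $host does not resolve"
        return 1
    fi

    # The name resolves, so ask the web server and report only the status code.
    status="$(curl -s -o /dev/null -w '%{http_code}' "http://$host/")"
    echo "http: $host answered with status $status"
}
```

    A dns line means the request never reached any PHP at all, so it is a question for whoever runs your DNS. An http line only says the server answered; you still have to look at the page to see whether it is a WordPress styled 404 or the server's own error.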

    That’s why it’s really important to present your error in the best way possible. The most accurate to the actual problem. Of course, if you have no idea, then you should just be honest and say what you did. If you really, truly, didn’t do anything, though, be prepared for someone to ask “Are you sure? You didn’t change a setting on the dashboard?” And sadly this is because a lot of people lie, a lot of people misrepresent the facts, and a lot of people play dumb. There is a very small percentage of people who will come back and say “You know, I may have done something, but I cannot remember what I did.” I like those people a lot. They’re my people. They admit they may have, but they can’t recall.

    WordPress FauxGo (yes, this is the FAKE logo)

    Sadly all those people who aren’t quite as truthful screw it up for the rest of you, which is why there’s a time and a place to point at WordPress, and there’s a time and place to not do so.

    How do you know the difference? Well you have to think. Is what you’re trying to do something you do with a plugin or theme? Did it happen after you made a change to your site’s settings? It’s probably WordPress. However if you’re trying to do something outside of WordPress, like domain mapping or wildcard subdomains or creating a database? Then don’t mention WordPress.

    It’s counter-intuitive, I know. I’m telling you to be honest and say what you did or what you’re doing, but at the same time I’m telling you to leave out what might be important information. And that’s why you have to think. Is the error a WordPress error? Learning that takes a long time, so for a lot of rookies, the easier question is “Does the error happen without WordPress involved?”

    Let’s go back to that subdomain thing. Turn off Multisite. Does the same problem happen? Probably not WordPress. So don’t bring it up just yet. Now if they ask “What are you trying to do?” or why, tell them. “I’m trying to setup wildcard subdomains so I can use it with WordPress, but at this point, I’m not even getting a WordPress error.”

    Of course, it’s not always that simple. Like what if I told you that, on Multisite, not getting the CSS to display on subsites could be a server error? That’s when you get to say:

    My complex .htaccess rules don’t seem to be honored by my server. Is AllowOverride set to either All or Options All in the httpd.conf (or equivalent) file?

    Notice how I didn’t mention WordPress? This is because I know that if my .htaccess rules are right, the problem’s not me. Unless of course my host blocks that on purpose because they don’t want to let me run Multisite on a shared box.

    It’s not cut and dried, it’s not ‘If this, then that!’ But what it is, is education and thinking. As long as you can learn what is and is not WP, you’re on your way to knowing when to ask about WordPress problems, and when to ask about server problems.

  • WordPress and the Erroneous Update Message

    It’s time for a little example in debugging!

    This domain is running WordPress trunk.  When I say that, what I mean is I’m running the very latest SVN, no more than four hours behind, thanks to a cron job.  At the moment I’m writing this, I’m on revision 18690.  I did this so that I could get off my butt and actually test things without having to think about it.  To a degree, it tells you how much trust I have in WordPress and the core commit team.  My whole site runs because they know what they’re doing.

    This doesn’t mean there aren’t errors, though.  So far I’ve been very “helpful” breaking the responsive CSS on the admin dashboard.  I’m sure Helen, Andrew and Andrew just adore me right now.  Yes, that was sarcasm.  My methodology is pretty straightforward.  Just Use WordPress on ipstenu.org.  If I find a problem, make a note and bring up my local install.  I can only do this at home, on my Mac, so usually I come home with three or four notes.  Update SVN manually on ipstenu.loc (yes) and ipstenu.org.  Is the problem still there?

    Most of the time the problem goes away.  When it doesn’t, I take a screenshot and make a trac ticket (though perhaps I should add them all to the one ticket… if any of you core folks wants to tell me, please comment away!).  I’ve also taken to popping onto the IRC channel -ui and chatting with people there before trac’ing.  Last night I found one, told someone working on the project, and she patched it right then and there.  Teamwork!

    Yesterday, I noticed my ipstenu.org site had a weird problem.  I have a subsite on the network called test.ipstenu.org (feel free to check it out).  It’s just there so when someone says “When I use this theme/plugin…” I can quickly go look.  I make fake posts, etc etc.  It’s quite seriously just for testing.  At one point, I’d spun up bbPress on that site.  I’d since turned it off, but it was on RC4.  RC5 just came out this week, and I had an update message on only that site, telling me to upgrade it.  Instead I deleted it.  That made the update notice go away on my network dashboard, like it should, but not on test.ipstenu.org.

    Say what?

    I tried to reproduce this like mad.  I installed bbPress RC4 on my local box, activated it, left it active, deleted it, and pretty much every which way I could think of to break it.  The error only happened on that site, even though bbPress had been running on another subsite as well!  I checked on multiple browsers, and wiped cache, logged out, nuked cookies, etc. Multiple computers even! Finally I gave up and said “I need help.”

    Weird WordPress MultiSite Question

    I have a multisite and one site on that network is showing that I need to update a plugin. Every other site correctly says “No Updates!” This one doesn’t.

    I’ve poked around, but I had assumed (bad me) that the admin bar would cache that network wide, somehow. But then why is it only on one site? So I wiped and rebuilt the wp_10_options table and it still shows up.

    I haven’t the foggiest idea why it’s happening. Luckily this is only on my test site, test.ipstenu.org, but it’s maddening.

    I don’t find a huge amount of use for Google Plus, but that was great for me.  I posted, I tagged it with my WordPress circles, and went to catch my train.  It was too long for a tweet, too weird for the forums as I didn’t want people to get fussy – I’ve noticed if I raise a post, it scares people.  I guess I have ‘street cred’ on WP and some people worry when I have a problem I can’t fix right away.  Flattering.

    I got replies from Raincoaster, Brad, Ron & Andrea, and Otto, who all said “That is weird.”  Andrea pointed out that it could be caching.  Brad asked about plugins.  Otto and Ron said that if it was cached, it was network wide (which made it even weirder to only see it on one site), and then Ron told me to look in wp_sitemeta table.  I was, I admit, already looking there, but I’d gotten distracted when I found the “Add a Link” page was broken on trunk.

    After Ron’s comment kicked my pants, I went to that table and thought to myself “Where are the caches?”  I knew this from ages back, that anything named _transient… was a cache.  There are tons of transient feeds in your wp_options table because the RSS feeds you see on the dashboard are cached.  Cool, right?  Well, what if, just what if one of them was corrupted?  You can delete them without hurting your site!  So I hovered my mouse over the update alert and noticed the mouseover said “1 Plugin Update.”  Then I looked at the transients and found this:

    _site_transient_update_plugins

    And I deleted it.

    And my error went away.
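
    If wp-cli is available, the same cleanup can be done without opening the database. A minimal sketch, assuming the stale row is the network-wide site transient (reset_plugin_update_cache is a made-up name for illustration):

```shell
# Sketch only: reset_plugin_update_cache is a hypothetical helper.
# Drops the cached plugin-update check so WordPress rebuilds it on the next check.
reset_plugin_update_cache() {
    # --network targets site transients, which are stored as _site_transient_* rows.
    # Any extra arguments (for example --url=...) are passed straight through to wp.
    wp transient delete update_plugins --network "$@"
}
```

    Called with no arguments it acts on the main site of the network; adding something like --url=test.example.com points wp-cli at one subsite instead.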