Half-Elf on Tech

Thoughts From a Professional Lesbian

Category: How It Is

Making philosophy about the why behind technical things.

  • Data Deletion May Not Be What You Think

    Data Deletion May Not Be What You Think

    So you’re handling GDPR and you have a privacy doc and policy and a plan for people requesting data and, yes, deleting it.

    Eventually someone is going to ask you to delete their content from your site. This is the scary part for most people. Remember, you get 30 days to reply, so don’t panic. Next, figure out what they’re asking for, and if you can say no.

    This is the fun part. You can say no. Sometimes.

    When You Can Say No

    In general, yes, you should delete people’s information if they ask. But if your website stores complicated information, this is not actually as black and white as all that. The right to erasure does not apply if retaining the data is necessary for one of the following reasons:

    • exercising your right of freedom of expression and information
    • meeting any legal obligations
    • performing a task in the public interest or under your legal authority
    • archiving information of public interest or for research where deletion would impair the work significantly
    • dealing with any legal claims you have (or may have)

    This helps you balance out the problem of being told to delete things you need to keep for tax reasons. It also keeps sites that may collect public data for the general public (like Wikipedia or a website that tracks queer characters on TV) from losing everything. It won’t protect you from other lawsuits, of course.

    It’s that last one I feel is really important to everyone. That’s the one that means if I block you, I may not have to delete your data, even if you ask, because I may need it for the establishment of legal claims. But that has to be a legit claim.
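
    If you want to keep yourself honest about this, the list above can be turned into a small record you fill in for every request. This is only a minimal sketch: the class names, field names, and the rule that a refusal needs written evidence are my assumptions for illustration, not anything the GDPR text spells out, and the 30 day window is simply the figure mentioned earlier.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from enum import Enum
from typing import Optional


class Exemption(Enum):
    """The five reasons listed above (the names are hypothetical)."""
    FREEDOM_OF_EXPRESSION = "freedom of expression and information"
    LEGAL_OBLIGATION = "legal obligation"
    PUBLIC_INTEREST_TASK = "public interest or legal authority"
    ARCHIVING_OR_RESEARCH = "archiving or research"
    LEGAL_CLAIMS = "legal claims"


@dataclass
class DeletionRequest:
    requester: str
    received: date
    # Filled in by a human after reviewing the request; None means no exemption applies.
    exemption: Optional[Exemption] = None
    # Notes that back up the exemption (assumption: you must be able to show your work).
    evidence: str = ""


REPLY_WINDOW = timedelta(days=30)  # the 30 days to reply mentioned earlier


def reply_due(request: DeletionRequest) -> date:
    """Date by which the requester should have an answer."""
    return request.received + REPLY_WINDOW


def may_refuse(request: DeletionRequest) -> bool:
    """Refuse only with a named exemption AND documented evidence."""
    return request.exemption is not None and bool(request.evidence.strip())
```

    For a request received on 29 May 2018, reply_due returns 28 June 2018. A request with no exemption, or an exemption with nothing written down to support it, makes may_refuse return False, which is the "have a legit claim" rule in code form.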

    You can also just say no for any reason you feel is justified. Now again, do not use this flagrantly. You still have to turn around and tell someone that you’re not deleting their data, so you need to be serious about this.

    Self Protection

    And speaking of being serious, you can actually say no to protect yourself. You see, people can only ask for deletion if the data is no longer needed for the reason it was collected. So if they want to delete their account but keep shopping at your store, you can say no since the information is needed to keep shopping!

    So remember why you track the data in the first place. When people leave a comment, for example, you track their username, email, and IP (and web address if they provide it) in order to know who they are and prevent spam, but also abuse.

    Here’s an excerpt from one of my privacy policies:

    Comments: When visitors leave comments on the website, the collected data shown in the comments form, as well as the visitor’s IP address and browser user agent string are saved in order to help spam detection and abuse.

    Since I retain data to prevent abuse (that is, serial internet harassers), you can ask me all you want to delete any data I save about you, but I can say no to protect myself.

    When You Say No

    If you decide to tell someone no to a deletion request, you must:

    • provide the reason
    • inform them of their rights to make a complaint
    • inform them of their right to a ‘judicial remedy’

    That last one means yes, they can sue you to delete the data. If they’re abusing you (harassing etc) and you’ve saved all that, you’ll probably win. Which is one reason you should actually save and document people’s actions. I hate having a whole folder on my laptop that documents a bunch of people hating on me, but I need it.

    Basically if you’re going to say no, have a damn good reason, document it, and be prepared for a fight.
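
    The three things a refusal must include are easy to forget when you’re annoyed, so here is a rough sketch of a helper that won’t let you send one without them. The function name and the wording of the notice are made up for illustration; check the actual phrasing you need for your own situation.

```python
def refusal_notice(name: str, reason: str) -> str:
    """Build a plain-text refusal that always carries the three required parts."""
    if not reason.strip():
        # No reason, no refusal: the first required item is the reason itself.
        raise ValueError("A refusal needs a stated reason.")
    lines = [
        f"Hello {name},",
        f"I am not deleting the data you asked about. Reason: {reason}",
        "You have the right to make a complaint to your data protection authority.",
        "You also have the right to seek a judicial remedy.",
    ]
    return "\n".join(lines)
```

    Calling it with an empty reason raises a ValueError, which is the point: if you can’t write the reason down, you probably shouldn’t be saying no.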

    Say Yes If You Can

    Most of the time, it’s no skin off your ear to delete a comment or edit a post. But sometimes it’s going to be a huge deal. And in fact, you can turn around and tell people “If I delete all your data, I will retain information required to identify you in order to prevent you from returning to this site. A deletion request means you will not be welcome back.”

    If that sounded harsh, well, it can be. Because for most small blogs, consider what they’re asking. When someone asks to delete the content of a personal blog, it’s most likely going to be for a pretty petty reason. Unless they’re asking you to remove information that shouldn’t be public (like their phone or email – and yes, someone’s asked me to delete that before), it’s probably going to be someone asking you to remove a comment that makes them look foolish. Or at least it has been in my experience.

    Make Your Life Easier

    Keep this in mind too. Make your life easier. If you don’t need comments on your site, don’t have them. Turn off that contact form too. There’s no law that says you need to let people talk to you on your blog.

    This won’t be true for all situations, but do as much as you can and save yourself that GDPR headache.

  • Consent and Awareness

    Consent and Awareness

    GDPR.

    It’s the cause of many headaches for web developers, web admins, and in general anyone who uses the internet. If you’re reading this, it’s probably a headache for you too. So let’s have a real, non-lawyer talk about what’s going on and why you need to care.

    Notice: I’m not a lawyer. This post is not legal advice. Please read the EU GDPR Information Portal and research your specific situation.

    Everyone Needs to Care

    If you thought this only has to do with people who use eCommerce products, think again. The centre of the GDPR is data privacy. That is, the right to have your data removed from websites, when you want. The point of all this is that if you have a website, and people visit, you need to care if any of the following apply:

    • You have ads on your site
    • You allow comments
    • You use custom avatars (Gravatar)
    • You track visitors (Jetpack, Google, etc)
    • You embed content (Twitter, YouTube, etc)

    Does any of that sound like you? It sounds like pretty much every public website in existence. And congratulations, you need to care about GDPR.

    What You Need

    There are a lot of moving parts here, but the pared down version is this:

    • Know what 3rd party services you use
    • Know what your CMS tool tracks
    • Have a privacy policy
    • Have a way for people to request data deletion

    The first two are surprisingly complicated because, in the case of WordPress, you might be tracking a lot more than you think. Remember all those things I mentioned above? They all are common situations where your CMS might be tracking people. But what if I told you that a lot of plugins you use also add on tracking? Or record more data than WordPress knows about?

    Like. I wrote a plugin that adds in the IP address used to register an account to the user meta. This means WordPress now records more data. Thankfully that gets deleted when you delete a user account, and it’s generally covered under the broad disclosure that you track users’ IPs (which every website does). But I have to make sure people who use the plugin know that, and communicate it to others.

    That’s a very simple example. Take a plugin that logs user activity for, oh, let’s say security. Now you have to tell everyone about exactly what it tracks (browser information etc) and what you use it for. And you get to figure that out for every single plugin you use.

    This won’t be easy. Unless you read every single plugin you use, you’re going to be at the mercy of developers who may not be aware of exactly what they need to disclose.
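
    One way to make the audit less painful is to keep a plain inventory of everything that touches visitor data and let it tell you where the gaps are. The sketch below is an assumption about how you might lay that out: the Tracker class, the function names, and the plugin names are all hypothetical, and the entries are examples rather than a real audit.

```python
from dataclasses import dataclass, field


@dataclass
class Tracker:
    name: str                                  # plugin, theme, or 3rd party service
    data: list = field(default_factory=list)   # what it stores about visitors
    purpose: str = ""                          # why it stores it


def undocumented(trackers):
    """Names of anything with no documented data or purpose: go ask the developer."""
    return [t.name for t in trackers if not t.data or not t.purpose]


def data_collection_section(trackers):
    """Plain-text 'Data Collection' section built from the documented entries only."""
    lines = []
    for t in trackers:
        if t.data and t.purpose:
            lines.append(f"{t.name}: stores {', '.join(t.data)} for {t.purpose}.")
    return "\n".join(lines)


# Hypothetical inventory; the entries are examples, not a complete audit.
INVENTORY = [
    Tracker("Comments", ["name", "email", "IP address", "browser user agent"],
            "spam detection and abuse prevention"),
    Tracker("Registration IP plugin", ["IP address used at registration"],
            "abuse prevention"),
    Tracker("Activity log plugin"),  # nobody has written down what this one saves yet
]
```

    Here undocumented(INVENTORY) returns only the activity log plugin, which is your to-do list, and data_collection_section(INVENTORY) gives you the two lines you can already put in a privacy policy.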

    Privacy Policies Are a Must

    Every site should have a privacy policy. While for most smaller blogs, the odds are low that anything will happen, you should have one anyway. The problem is that no one can tell you exactly what yours needs to have. I try to cover the four basics:

    • Terms of Use: all the things you agree to by using this site
    • Data Collection: what situations result in my tracking your data, including details on 3rd party services regularly used
    • Data Usage: what I do with data and how long I keep it – also how to request it
    • Policy Changes: a CYA that they’ll likely change

    There are a lot of details in those four sections, especially the Terms, which exculpate me if I get information wrong, allow me time to handle a DMCA, and a whole lot of things. And yes, it’s super daunting, I know. I mean, the privacy policy here isn’t half as robust as some of my other sites.

    The Bottom Line

    You can distill all this into consent and awareness. People need to know what they’re getting into on your site (or at least be able to know – you can’t help people who refuse to read). And you need to understand exactly what your site does. You need to be aware, as a website owner and a user.

    All those terms you ignored when signing up for Google Adsense and Analytics? Now is the time to knuckle down and read, because you need to cover that. All those extensions (plugins and themes) you added? Read up on them too. If they don’t explain what they do with data, ask the developers.

    Developers? Step up. Document exactly what data you save. If you allow for the saving of different kinds of data, based on what the user picks, explain that. But you have to tell people what’s being saved and how to delete it. Most CMS apps now have tools to hook into to aid deletion, so research.
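
    To show what “hooking into a deletion tool” looks like in spirit, here is a toy registry where each extension registers a function that knows how to erase its own data. It is not any real CMS’s API: the ERASERS dictionary, the register_eraser decorator, and the in-memory USER_META store are all invented for this sketch.

```python
ERASERS = {}  # label -> function(email) returning how many items it removed


def register_eraser(label):
    """Decorator an extension would use to announce 'I store data, and here is how to delete it'."""
    def decorator(func):
        ERASERS[label] = func
        return func
    return decorator


# Hypothetical in-memory store standing in for user meta in a real database.
USER_META = {
    "ann@example.com": {"registration_ip": "203.0.113.7"},
    "bob@example.com": {"registration_ip": "198.51.100.4"},
}


@register_eraser("Registration IP")
def erase_registration_ip(email):
    """Remove the stored registration IP for one user, if there is one."""
    meta = USER_META.get(email, {})
    return 1 if meta.pop("registration_ip", None) is not None else 0


def erase_everything(email):
    """Run every registered eraser and report what each one removed."""
    return {label: eraser(email) for label, eraser in ERASERS.items()}
```

    The design choice that matters is that the site owner calls one function and every extension cleans up after itself. The report that comes back doubles as documentation of what was being saved in the first place.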

    GDPR kicked in four days ago, but it’s not too late to fix things.

  • FUD: The Sky Is Not Falling

    FUD: The Sky Is Not Falling

    Every day it seems like there’s a new Zero Day vulnerability about our websites. SSL is being deprecated, PHP is out of date, the CMS we use has a critical vulnerability, security isn’t all that safe, and OMG we’re all going to get our identities stolen and our lives hacked.

    Making matters worse are those myriad security tools we use to keep ourselves from getting hacked or attacked, and they alert us to horrible things. I say worse because they terrify people without actually explaining and educating them, so the uninformed users come running to complain the sky is falling. And when those people are told an answer by other experts, they don’t know who to believe.

    Can you blame them?

    Responsible Disclosure

    It’s four years now, and Nacin’s post about how security is nuanced is still required reading.

    The problem we face is that telling the world about a security issue is complicated. We definitely need to tell people who are responsible for fixing it, and in a perfect world we should trust that they’ll push out that fix in a reasonable time frame. We also should be able to trust they’ll tell the appropriate people.

    But who, exactly, is the appropriate person to tell about a Drupal patch? Not the hack, the patch. In a different light, who are the right people to tell that a new security fix for an operating system has been released?

    There are millions of users. How do you get to all of them quickly, with the right amount of information so they can understand how important this patch is to them, and how quickly they should apply it?

    Enter Security Companies

    Many companies make their milk and meat off being the people who monitor and announce security releases. There’s nothing wrong with this. In fact, I laud them for being a much needed service. With so much data flowing, it’s important to have a service that can help users winnow down what’s critical to them and their setups.

    But… That’s not what’s happening.

    Security companies face the same problem we do. There’s just too much data, and it’s being updated all the damn time, and there’s no way to keep up with all of it. Which means that they do what I tend to do when I’m trying to explain things to a wide variety of people. They simplify as much as possible.

    The problem with simplification is that you have to skip over things and leave out the nuances that help people understand what’s actually going on. They have no idea what they actually need to worry about. And we’re back to zero.

    To Know When To Worry …

    You have to actually understand context to know what to worry about.

    There’s literally no other way around it. There’s no shortcut, there’s no cheat sheet, there’s just knowing what your site is doing.

    Let’s take OpenSSL as an example. Back in 2014, a serious issue called Heartbleed was discovered. The bug was phenomenal in that it allowed people to steal and read secure data. If you ran a website, this was a massive issue. For your webhost.

    Was it a huge issue to you? Well. Maybe.

    A lot of people sounded the alarm and declared this a crisis, saying we should all grab our web hosts and ask what they were doing and when we would be fixed. And the rest of us said “Hang on. Webhosts are aware. See if they have an announcement, which most will, and if they say they’re working on it, trust them.”

    Sounds like I’m passing the buck, but the reality is that unless I’m using my site for privileged data (like a private blog, or a store), then the odds are for my individual site … I don’t need to panic. Especially if I use unique passwords and take regular backups.

    This doesn’t mean Heartbleed wasn’t a huge problem, and that I didn’t want to see my host putting this as their number one priority, but it means that I’m aware of the risk (private data being stolen) and the likelihood of it happening (moderate to high) and the level of risk. That last one is the most important.

    What’s the worst that could happen, today, on this site if someone stole private data? Well. They’d see my password maybe, and some draft posts, and have access to my API keys for a couple services. Nothing I can’t fix relatively quickly. They can’t log in to those API services and they can’t destroy my life.

    When I was still running a store (as I was at the time of the initial vulnerability), I paid close attention to the fixes being released and, the moment one was out for my system, applied it. But there was no need to panic or rush about. I understood what was going on.

    If You Don’t Know …

    If, however, you have no idea how it all works and what it means, then I recommend the following checklist:

    1. Do I have good passwords?
    2. Do I have good backups?
    3. Does my web host have a reliable track record for fixing this stuff?
    4. Do I run any private/privileged data on my site that could be dangerous to release to the public?

    If the answer to that last item is ‘yes’, then I’d better be paying my host (or an expert) a lot to protect me ASAP. If you’re still on budget web hosting, it’s time to move up to something managed, or hire someone to manage for you.

    Otherwise, if the first three are all ‘yes’ then I’m not going to panic. I’m going to trust in the experts to do their job.
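
    If it helps to see the checklist as a decision, here is a small sketch. The function name and the return strings are mine, and the third outcome (what to do when you hold no private data but the basics are not all in place) is my own assumption, since the checklist above doesn’t spell that case out.

```python
def triage(good_passwords, good_backups, reliable_host, private_data):
    """Turn the four checklist answers (True/False) into a next step."""
    if private_data:
        # Item 4 is 'yes': privileged data is at stake, so get professional help.
        return "get expert help now"
    if good_passwords and good_backups and reliable_host:
        # First three are all 'yes': trust the experts to do their job.
        return "don't panic"
    # Assumption: anything else means shoring up the basics first.
    return "fix the basics"
```

    Note the order: the private data question is checked first, because a ‘yes’ there overrides everything else, no matter how good your passwords and backups are.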

  • Facebook to Automation: Nuts

    Facebook to Automation: Nuts

    Massive hat tip to Amanda Rush for pointing this out to me.

    Facebook is dropping support for apps to publish. I quote their recent post on policy updates:

    The publish_actions permission will be deprecated. This permission granted apps access to publish posts to Facebook as the logged in user. Apps created from today onwards will not have access to this permission. Apps created before today that have been previously approved to request publish_actions can continue to do so until August 1, 2018. No further apps will be approved to use publish_actions via app review. Developers currently utilizing publish_actions are encouraged to switch to Facebook’s Share dialogs for web, iOS and Android.

    What is a Publish Action?

    Facebook uses actions to do ‘things’ within Facebook itself. A publish action is, logically, an action that triggers a publish of a post. When you create a Facebook app, you grant it special permissions to do specific actions, in order to prevent people from posting to your Facebook feed when they shouldn’t. If you’ve ever seen one of those popups like this, Facebook is asking you to confirm permissions:

    Screen asking to connect an app to your facebook account

    Most common are things that read content, like your posts and your friends, and so on. A publish_action would be like having your WordPress site automatically make a post on Facebook when you publish a post on your blog.

    Why Are They Doing This?

    The argument is that Facebook is maturing and “taking user privacy seriously” because the majority of people never read what permissions they’re granting, or who they’re going to be spamming with the cross posts. The reality? They’re locking down Facebook so if you want to get traffic from Facebook and your articles, you have to manually post them.

    Now. I hate artificial (and real) monopolies as much as the next nerd, and I do think this is a really cretinous move. But at the same time, by preventing auto-posting, they actually now have a way to combat fake news.

    If you watch The Good Fight, then you may have seen an episode where a bot script auto-generated posts, purporting to be someone, using fake news sites that people spun up. In season one, that was used to discredit Maia on Twitter. In season two, they took it to Facebook and demonstrated how the fake news sites could be used to target jurors and ensure they got the news.

    Seriously, everyone needs to watch The Good Fight. It’s brilliant.

    But the point is this: by restricting people from auto-posting, someone has to log in and make connections, and it’s much easier to track behaviour. Facebook can block a VPN, but they can’t block Amazon AWS servers, after all. And those auto-posts are going to show as coming from your server, not your personal account.

    Do I Need to Care?

    Do you use Jetpack’s Publicize to post to your personal account? Then yes. Maybe. I don’t actually know what Jetpack’s going to do about this. My contact (i.e. my friend) just said they were on it, but I imagine there’s a lot of cursing in the background.

    Now, notice how I said personal account? And maybe?

    I noticed that Buffer, an app that auto-posts tweets and Facebook posts, said they’d be fine. On the other hand, Bridgely said they’re killing off their Facebook publish because of this. And on Facebook’s documentation for the APIs, the post to personal timelines information is gone, but the post to pages is still there.

    Which means I have no idea how horrible this will be. An incomplete block means spammers and fake reporters will move to posting to pages, which many users can post to. I can’t see how that will move the needle very far at all.

    Overall, I hate this and I think it’s a good thing.

  • Organization

    Organization

    In September 2005, Lorelle wrote what I consider to be the definitive piece on tags vs categories. In 12 years, my opinions have not changed and I still feel her explanation is correct. That said, there is room for improvement at scale.

    The Gist

    Her advice boils down to this:

    • Categories are a table of contents
    • Tags are index words

    By this we mean that categories are the high-level, big ticket items, and tags are the smaller, more precise terms. This is, I feel, the heart of understanding the two.

    Further down, Lorelle states that at around 25 posts, a tag is ‘big enough’ to be a category, and that if a category dominates a blog, it should perhaps be a separate blog. And that’s where I disagree.

    On Beyond Zebra

    When she wrote her post, the concept of custom taxonomies was barely a gleam in someone’s eyes. Multisite was still WPMU, and a separate installation. Today we have the ability to add our own taxonomies (either in category or tag styles) and we can create a network of related sites on our own. All we need is a little more technical know-how.

    When we add on custom taxonomies, we afford ourselves a new way to classify posts, so to the above I would add this:

    • Custom Taxonomies are critical but exceptionally unique index words that must be grouped together

    Okay that was long, I know, but a Custom Taxonomy is in essence a new subdivision of your site. You can either make it a new table of contents or a new index … or a combination of the two. It’s a little wild, especially when you factor in custom post types.

    Overwhelming Category? Custom Post Type!

    Instead of making a new blog when your category gets too large and unwieldy, I would recommend making a new custom post type. If I use my helpful example of LezWatchTV, we currently have three custom post types: Shows, Actors, and Characters.

    While we could have made them into posts, and used categories to index them, having them be their own post type means instead of a table of contents, I’ve made an appendix. This gives me access to all the cool WordPress features, like archives and sorting and organization, but it does so outside the realm of posts which restricts crossovers. Unless you’re really clever with cross-related content.

    A custom post type keeps it all on one blog, but separates them like your laundry.

    Too Many Tags? Custom Taxonomy!

    If you find yourself having too many tags, it’s time to consider a custom taxonomy. Again, pointing to LezWatchTV, actors have two custom taxonomies: gender identity and sexuality. While those are the same as we use for characters, by having them separate and only applicable to the actor post type, we are able to give a list of all trans female actors with a click. In other words, we’re using WordPress’s native features.
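
    To make the “post type plus taxonomy” idea concrete, here is a toy model of it. This is not WordPress code: the Entry class, the find function, the taxonomy keys, and the placeholder names are assumptions made purely to show why keeping a taxonomy tied to one post type makes that one-click list possible.

```python
from dataclasses import dataclass, field


@dataclass
class Entry:
    title: str
    post_type: str                             # "show", "actor", or "character"
    terms: dict = field(default_factory=dict)  # taxonomy name -> set of terms


def find(entries, post_type, **wanted):
    """Titles of one post type whose terms include every wanted taxonomy=term pair."""
    return [
        e.title
        for e in entries
        if e.post_type == post_type
        and all(term in e.terms.get(tax, set()) for tax, term in wanted.items())
    ]


# Placeholder data; the names and terms are invented for the example.
ENTRIES = [
    Entry("Actor A", "actor", {"gender": {"trans woman"}, "sexuality": {"lesbian"}}),
    Entry("Actor B", "actor", {"gender": {"cis woman"}, "sexuality": {"bisexual"}}),
    Entry("Character C", "character", {"gender": {"trans woman"}, "sexuality": {"lesbian"}}),
]
```

    Asking for find(ENTRIES, "actor", gender="trans woman") returns only Actor A. Character C has the same terms but a different post type, so it stays out of the actor list, which is exactly the separation described above.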

    But if we look at the custom post type for TV shows, we have a lot more taxonomies, including two that are constantly being added on to: nations and stations. Every time a new station airs a show, we have to add it in. And there, as of April 1, we end up having 29 nations and 168 TV stations.

    Which brings up the next problem, and one that Lorelle does indeed address, but not the way I would.

    When Tags Go Rogue

    Can tags still go too large? Yes. Oh my lordy, yes.

    Recently I saw a site that used unique tags on every single post. I physically flinched when I realized that.

    You see, they had around 30,000 posts and 48,000 tags, and for the life of me I couldn’t understand why until I read the site and looked. For every single post there was a commensurate tag for the post title and the date. After 365 dates they thankfully started to repeat, so you might have 10 posts for the march-25 tag. Except they weren’t consistent and someone else used 25-march and now you can see the rabbit hole fall into infinity and beyond.

    Now that said, I have 168 tags for TV stations. Each TV show has one, maybe two if they’re lucky or weird, and some tags only have 1 show listed. Others, like ABC, NBC, and CBS, have around 60. Do I think any of those are ‘too large’?

    I don’t. Because the number of 25 posts to a tag only holds up at a smaller scale. With 100 to 200 posts, yes, that starts to make sense. At 600 to 3000 posts, suddenly having 198 posts tagged with “Bury Your Queers” doesn’t sound so out of place. It’s about the percentages, somewhat, and also the use-case.
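
    The percentages are quick to check. A tiny sketch, using only the numbers from the paragraph above (the function name is mine):

```python
def tag_share(tagged_posts, total_posts):
    """Percentage of all posts that carry a given tag, rounded to one decimal."""
    return round(100 * tagged_posts / total_posts, 1)
```

    A tag with 25 posts is 25% of a 100-post site and 12.5% of a 200-post site, so it really is a big chunk of the content. A tag with 198 posts on a 3000-post site is only 6.6%.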

    If I know people are looking for a smaller tag (say they really want to see the 10 shows that have the ‘Fake Relationship’ tag), then for the purpose of this site, it’s important. On the other hand, if only one character was tagged cougar, I might not keep the tag as it’s too small to make the data useful.

    Optimal Organization

    There is no magic number of tags to categories to custom post types to taxonomies. It all comes down to understanding the goal of your site, the way users look for data, and what is maintainable to you.

    In the case of the site with 48k tags, I would have them delete all the date ones, as well as the ones with the same names as posts, and stick to using topical tags. After all, if a tag is only used once, or duplicates some feature already found in WordPress, it’s perhaps not the best idea.
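
    A cleanup like that is easier to face with a report in hand. The sketch below sorts tags into the three buckets just described: date tags in either order (march-25 or 25-march), tags that copy a post title, and tags used only once. Everything in it is an assumption for illustration: the function names, the (title, tags) input shape, and the slug rule are mine, and it does not check that the day number is valid for the month. Single-use tags are only flagged for review, since a small tag can still be worth keeping.

```python
import re
from collections import Counter

MONTHS = (
    "january", "february", "march", "april", "may", "june", "july",
    "august", "september", "october", "november", "december",
)
# Matches month-day or day-month, e.g. "march-25" or "25-march".
DATE_TAG = re.compile(r"^(?:(?P<m1>[a-z]+)-\d{1,2}|\d{1,2}-(?P<m2>[a-z]+))$")


def slugify(text):
    """Lowercase the text and join its words with hyphens (a simplified slug rule)."""
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")


def is_date_tag(tag):
    """True for tags like 'march-25' or '25-march'."""
    match = DATE_TAG.match(tag)
    if not match:
        return False
    return (match.group("m1") or match.group("m2")) in MONTHS


def review_tags(posts):
    """posts is a list of (title, [tags]); returns tags grouped by why they look suspect."""
    counts = Counter(tag for _, tags in posts for tag in tags)
    title_slugs = {slugify(title) for title, _ in posts}
    report = {"date": [], "title": [], "single_use": []}
    for tag in sorted(counts):
        if is_date_tag(tag):
            report["date"].append(tag)
        elif tag in title_slugs:
            report["title"].append(tag)
        elif counts[tag] == 1:
            report["single_use"].append(tag)
    return report
```

    A topical tag used on more than one post doesn’t show up in any bucket, which is the goal: what’s left after the cleanup is the index you actually wanted.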

  • A Name is Not A Description

    A Name is Not A Description

    One day, you found an app or plugin or add-on for something. It was a feature you always wanted, did exactly what you needed, was well written and supported. It was that pinnacle of perfection. You loved it. Then you had a computer crash, or a house fire, or moved, and you forgot what the name was. All you could remember was the name was something about what it did. So you decided to Google for it, and quickly found a billion things that fit the bill.

    SEO vs Generic

    When you’re naming your product or company, you work very hard to think of a name that encapsulates what you are, what you do, and what makes you unique. For example, you don’t name yourself “Shoe Company” and expect people to be able to find you. With very few exceptions (and really only No Name comes to mind), if you want to stand out, you pick a good name where you are prominent.

    This directly relates to SEO, and people’s ability to find you. Ever used Apple Pages or Sheets and tried to Google something? Like “How do I make Pages Templates” perhaps. You often feel damn lucky when you get the right result immediately:

    A google search that has useful results!

    But you’re not Apple, are you? So if you named your product “Foods,” you’d probably have a devil of a time getting ranked so people could find you in search!

    Unique vs Memorable

    Take a look at WordPress. Pretend you’re looking for a slider plugin. Hush, just come with me here. Now. You remember a really cool slider plugin, but all you remember is it was named something like “Best Slider Plugin.” Yeah. You ain’t gonna find it. Probably ever. But what if you were looking for a lightbox plugin, and you remembered the name as “Foobox Lightbox” … Hang on a second. That’s one you’re going to be able to find. It has a unique name, but better than that, it has a memorable name!

    The only reason Apple Pages actually works is that Apple is huge and also the fact that most of us Google “Apple Pages whatever” and not just “Pages.” It’s the same with the Apple Watch. It’s nice they call it “Watch.” We call it the “iWatch” because we have to be able to find it, and they picked stupid generic names. Being Apple, they can get away with it.

    To their credit, the name is memorable. It’s not unique, but you will remember it. Even if you remember it as “That stupid Pages app Apple made.” You remember Microsoft Word, but you also will remember WordPerfect, and possibly WordStar. But if you listed four Twitter apps, could you remember what differentiates each one without looking? Definitely unique names, like Tweetbot and Twitterific, and certainly memorable, but in the wrong way.

    Names vs Descriptions

    Many people make a common mistake. They remember the tools they use on their computers, like “TextEdit” and “Notepad” and they think that in order to be found, the name must be short and descriptive. That’s why we get Notepad++ and iTerm. To an extent, this works. LastPass and 1Password are going to be memorable and unique and descriptive names. But the longer a product or suite exists, the more likely it is to corner a market and make it harder for the little people.

    Let’s go back to WordPress. You’ve made a great popup plugin and you want everyone to know it. There are roughly 500 plugins that use ‘popup’ or ‘popups’ as a tag. There are 2500 or so plugins that show up for a search on ‘popup’ in the directory. Besides the fact that you really should use the ‘popup’ tag in your plugin, there’s no way in the world you’re going to get your new popup plugin to the top of the list in a day.

    But … users don’t look for ‘popup’ or even ‘best popup plugin.’ They look for something else. “WordPress popup plugin with call to action on page exit.” They may simplify that to “wordpress popup plugin call to action page exit” but they’re going to look for what they need. And they’re going to remember the plugin named “Wait Don’t Go! Popups” that has a nice plugin description of “Grab your visitors’ attention one more time before they leave your page forever.”

    Humans vs Robots

    Putting a million buzzwords in your product’s name, the description, and the URL isn’t ever going to make you popular. The only thing that does is bring people in the yard. If they see your website is filled with upsell and hyperbole, they’re going to walk right out again. If they see features and explanations and proof that you are, indeed, the bee’s knees, they’ll stay. If you have a catchy or unique name, they’ll remember and recommend you to their friends.

    And then, then you will be a success.