Half-Elf on Tech

Thoughts From a Professional Lesbian

Category: How It Is

Making philosophy about the why behind technical things.

  • Phishing Games

    Phishing Games

    One night, not so very long ago, I got a scam email that reminded me how few people actually pay attention to these things. It’s funny, but we’re all pretty lazy about these scams. You rarely get them from your real banks and money places anymore, or they’re very obviously not real, so you ignore them. Far more people fall for cold-calls on their cell, you know, like ‘This is Mary from Cardmember Services, calling about…’ And I always just hang up. But with so many emails, a lot of us blindly follow them. We click a link, we go to the site, and we don’t think.

    This not thinking led to a few WordPress developers being phished. This is not being ‘hacked’; this is a simple ‘you accidentally gave someone your password’ type mistake. While sites do the best they can to protect you from yourself, we can’t stop you from posting with your real name and not your handle (someone did this recently and asked me to remove the post, which I did), and we can’t stop you from not paying attention.

    So we repeat this over and over again. Read the email, look at the site you end up on, use your brain.

    Here was the email I got (one of three copies, from two separate email addresses):

    Dear WordPress Plugin Developer,

    Unfortunately, a plugin you are hosting has been temporarily removed from the WordPress repository. We’are going to manually review your plugin because it has been reported for violating our Terms of Service. If your plugin is not approved by this review then your plugin will be permanently removed from the WordPress repository.

    You can check if your plugin has been approved or rejected at

    http://wordpress.org/extend/plugins/my-plugins-status/

    Four things were wrong with this.

    1. The email did not come from plugins@wordpress.org – the only official email for plugin yanks.
    2. The email didn’t come from someone I know on the pluginrepo ‘team.’
    3. None of my friends who work for WP poked me directly (and I’m fairly sure Otto, Nacin or Mark would have).
    4. The email source showed the wrong URL.

    I quickly did a few checks on the email source, traced it back, verified it wasn’t WordPress, posted on the forums, and alerted the masses. Because ignorance is where this sort of thing festers. I’m a little impressed, though, since I’ve not seen a phishing attempt aimed at WordPress like this before.

    Clearly it’s time to go over a quick reminder about what phishing is, its goals, and how it works.

    Phishing is when you try to get someone else’s login credentials by mimicking a real site, so you can log in as them and do naughty things. It works by having people not pay attention to a URL when they go to a site. PayPal was an early hit on this, and they finally said “We will never send you an auto-login link or a link to our site in our emails. Just go to paypal.com and log in.” I don’t know if they still do it, but it was a very smart idea.

    Too often we blindly trust our emails. The email appears to come from our bank? We click the link, log in, and … hey, it didn’t work? Banks are a huge target for this, and as I work for one, I’m very familiar with making sure we’re not phished. I mean, an email like this looks pretty safe, right?

    That link, if you’d clicked on it, would take you to a fake site. Now some of those fake sites are smart, and when you enter your ID and password, will redirect you to the real site’s bad password page! That would make you think you only typoed, which happens to all of us.

    You may have noticed that most banks and money-type places have you enter your username and then take you to a page with a picture and a passphrase. As long as those are yours, you know you’re on the right site. That helps take care of a lot of attempts, but when you’re faced with something like a phishing attempt on WordPress, there’s less security because there’s less at stake. A bank can make it annoying and inconvenient to log in and get your money and you’ll put up with it because, well, it’s your money. You’ll put up with a lot to get to your money.

    But if you have to jump through the same hoops to log in to a forum, or check in code to open source, you’d probably walk away. This is a complicated problem, trying to balance out the needs of the many and the protection of all. I’m not going to delve into possible answers, since they’re always going to be specific to your community.

    Also, you can usually easily spot the fake emails. Here’s one I got today:
    Fake Delta Email

    This came from “Delta Air Lines – support_8312@delta.com” which looks ‘legitish’, but as soon as you look at the email body, it seems weird. No real airline sends out your tickets in a ZIP file for one. Without looking any further, I know this is fake and I can delete it. But what if they’d sent a link? Would I have clicked on it? Again, no, since I’ve only been to Newark twice in my life, and I know I’m not going any time soon, but that’s not the point. The point is the email would have been less off if there’d been a link. If I’d really been concerned, I would have looked at the email headers, but before we jump into that, let’s review what you can do!

    The rules to not be phished:

    1. Look at the URL before you enter your password and ID.
    2. Copy and paste those URLs, never click.
    3. If the email looks ‘off,’ don’t click.
    4. If there’s an attachment and there isn’t normally, delete the email.

    That’s really the best you can do for most people. The rest of us, though, can go the extra level. When you get that weird email, the one that looks just a little off and hits your spider sense, view the email source, which looks like this (this is the actual header from the phishing email, by the way; you can see the whole thing here):

    Return-path: 
    Envelope-to: ipstenu@ipstenu.org
    Delivery-date: Sat, 24 Mar 2012 18:14:57 -0500
    Received: from blu0-omc4-s14.blu0.hotmail.com ([65.55.111.153]:4132)
    	by gamera.ipstenu.org with esmtp (Exim 4.77)
    	(envelope-from )
    	id 1SBaAh-0001wn-Sk
    	for ipstenu@ipstenu.org; Sat, 24 Mar 2012 18:14:56 -0500
    Received: from BLU0-SMTP348 ([65.55.111.135]) by blu0-omc4-s14.blu0.hotmail.com with Microsoft SMTPSVC(6.0.3790.4675);
    	 Sat, 24 Mar 2012 16:14:54 -0700

    By the way, notice how this came from Hotmail? 1990 called, it wants Nirvana back. WordPress emails don’t come from Hotmail, and I really hope that I’m not about to get a comment from someone at Automattic about how they still use it. Hotmail is like an AOL account. You’re not old school, you’re living in the icky past.
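    Checking a header like that doesn’t have to be done by eye. Here’s a rough sketch of the same check from a shell, using a couple of lines trimmed from the phishing email’s actual headers (the message.eml filename is just an example):

```shell
# Save the suspicious message's raw source to a file, then pull out the
# delivery chain. These sample lines are from the phishing email above.
cat > message.eml <<'EOF'
Envelope-to: ipstenu@ipstenu.org
Delivery-date: Sat, 24 Mar 2012 18:14:57 -0500
Received: from blu0-omc4-s14.blu0.hotmail.com ([65.55.111.153]:4132)
	by gamera.ipstenu.org with esmtp (Exim 4.77)
EOF

# The bottom-most Received: line names the machine that actually sent the
# mail -- here, a Hotmail box, which wordpress.org would never use.
grep -i '^received:' message.eml
```

    If that line doesn’t name a server belonging to the supposed sender, you have your answer.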

    Now in that email, once you have the raw source, you scroll down to the body of the email and see this:

    <HTML><HEAD>
    <META name=GENERATOR content="MSHTML 8.00.7601.17744"></HEAD>
    <BODY>
    <P>Dear WordPress Plugin Developer,</P>
    <P>Unfortunately, a plugin you are hosting has been temporarily removed from&nbsp;the WordPress repository. We&nbsp;are going to manually review your&nbsp;plugin because it has been reported for violating our Terms of Service. If your plugin does not get approved then it will be permanently removed from the WordPress repository.</P>
    <P>You can check if your plugin has been approved or rejected at</P>
    <P><A href="http://wordpresss.comule.com/bb-login.php">http://wordpress.org/extend/plugins/my-plugins-status/</A> </P>
    <P>&nbsp;</P></BODY></HTML>

    I don’t know about you, but something’s fishy in that email. comule.com has nothing to do with WordPress, we have a winner.
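    You can surface that mismatch mechanically, too. A crude sketch (the regex is only meant for eyeballing one message, not for parsing HTML in general; body.html is a hypothetical saved copy of the message body):

```shell
# The href attribute is where the link really goes; the text between the
# tags is only what you see. Printing both side by side exposes the bait.
cat > body.html <<'EOF'
<P><A href="http://wordpresss.comule.com/bb-login.php">http://wordpress.org/extend/plugins/my-plugins-status/</A> </P>
EOF

grep -oE 'href="[^"]*"|>http[^<]*' body.html
```

    The first line of output is the real destination (comule.com); the second is the official-looking URL you were shown.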

    How do you see your raw source? For most email apps, select the message, go to the view menu, and look for ‘message’ or ‘message source.’ If there are further options, like in mail.app, you want ‘Raw Source.’ Outlook has it under options, I believe. Once you get that view, just take the time to look at the ‘content’ of the email. If you’re extra paranoid, you can even turn off your email app’s ability to automatically render HTML, so you’d see that right away (I actually no longer do that because of the value of HTML emails).

    Now you know how to protect yourself just a little bit more. What are your best anti-phish-tips?

  • Once GITten, Twice SVN

    Once GITten, Twice SVN

    I’ve been putting off learning git. I’m lazy, svn works fine for what I need, and … oh. Look. MediaWiki switched to git. This is a problem for me, because I upgrade my MediaWiki install via svn. When it comes time for a new version, I just run something like svn sw http://svn.wikimedia.org/svnroot/mediawiki/tags/REL1_18_1/phase3/ ~/www/wiki/ and I’m off to the races. But now I have to git off my ass and learn it.

    Before someone asks, I use svn to update my code because I like it better than wget-ting files, unzipping, and copying over. I can certainly do that, but I prefer svn. It runs fast, it lets me see what changed, and it’s easy for me to back out, just by switching back to the previous version. It never touches my personal files, and I’m happy. But the more I looked at git and how it works, the more I wondered whether this was still a practical approach to upkeep.

    Now, I actually like git for development when you’re working with a team. It’s great. So any posts about how I’m stupid for wanting to do this weird thing will just be deleted. Constructive or GTFO. The rest of this post details the journey from ‘this is what I do in svn’ to ‘how do I replace it with git?’ The tl;dr of it is that I can’t.

    What I want to do

    I want to check out a specific branch (or tag) of code and upload it to a folder called ~/www/wiki/

    That’s it.

    The road…

    First up, I have to install git on my server. Unlike SVN, which has been bog-standard on servers for a hog’s age, git is the new kid in town, and the odds are it’s not on your server. Thankfully, my host had note-perfect directions on installing git on CentOS 5. (I’ll probably upgrade to CentOS 6 this summer, when my sites are less busy.)

    Second I have to ‘install’ the git repository. This was weird. See, I’m used to svn, where you check out code, use ‘svn up’ to upgrade it, or ‘svn sw URL’ to switch to a new branch/tag. The official directions for gitting MediaWiki were: git clone https://gerrit.wikimedia.org/r/p/mediawiki/core.git

    That doesn’t tell me what version I’m getting, and I only want the latest stable release. It was unclear at this point what this cloning did, but I was told over and over that I had to clone to get the version I wanted. Thankfully, git knows a lot of us are in the WTF world, and has a nice crash course for SVN users. Most important to me was this:

    So git clone URL is svn co URL and that’s how I first installed MediaWiki. But… I know MediaWiki is working on a million versions and I’m not specifying one here. The git doc goes on to say that for git, you don’t need to know the version; you use the URL and it’ll pull the version named ‘master’ or ‘default.’ That basically means I’m checking out ‘trunk,’ to use an svn term. This is exactly what I do not want. I have no need for trunk; I want a specific version.

    After more reading I found git checkout --track -b branch origin/branch which they say is the same as svn switch url which is excellent. I can also use git checkout branch to check out the specific version I want!

    I’m very used to SVN, where I can browse and poke around to see the tags, branches, and so on. When I go to https://gerrit.wikimedia.org however, all I see are the latest revisions. The directions said I should get the file /r/p/mediawiki/core.git which makes me presume that’s what tells git the version I’m checking out, but I can’t see that. I don’t like obfuscation when I’m checking out code. After some digging, I found the URL I wanted was https://gerrit.wikimedia.org/r/gitweb?p=mediawiki/core.git and that shows me the files, and their tags.

    The tags were important for the command I wanted to run: git checkout tags/NAME where NAME is the tag. I want version 1.18.2, which is the latest stable, so what I want to do is git clone -b tags/1.18.2 https://gerrit.wikimedia.org/r/p/mediawiki/core.git

    I tried it with pulling a copy of 1.18.1, and it ended up pulling about 89 megs and put it in a subfolder called ‘core’. Oh, and it told me it couldn’t find tags/1.18.2 and used ‘HEAD’ instead, which put me on the bleeding edge. It took a while to find something that looked a little better:

    cd ~/www/wiki/
    git init
    git remote add -t REL1_18 -f origin https://gerrit.wikimedia.org/r/p/mediawiki/core.git
    git checkout REL1_18
    

    In theory, that would get me the latest and greatest of release 1.18, be that 1.18.1 or 1.18.2 and then, when 1.19 was released, I could switch over to that by doing this: git checkout REL1_19

    It was time to test it! I pulled REL1_17 and … ran out of memory. Twice. fatal: Out of memory? mmap failed: Cannot allocate memory happened right after resolving deltas. Everyone said ‘You need more swap memory’ but when I looked, it was fine (not even using half). So that was when I said ‘screw it, I’ll go back to tarballs.’

    Just as I hit that wall, some Twitter peeps stepped in to explain that, no matter what, I have to clone the whole archive first. I’m not really happy about it. I like SVN, since even when I switch, it’ll only update my changed files. I wanted to be able to do that in git without needing to pull down the whole repo, since I only use it to update, and not develop.

    And that there is my one and only issue with git. It’s entirely for developers. Sometimes, some of us just want to pull down the latest version to play with, and not the bleeding edge. With git, I have to clone the repository, then export the version I want to play with. That’s not optimal for what I want to do. On the other hand, I totally get why it’s done that way. It’s akin to the basic svn check out, where I get all the branches and tags, only much much cleaner. And that’s cool! Smaller and leaner is fantastic.
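    For what it’s worth, git can get closer to the lightweight checkout I wanted with a shallow, single-branch clone, assuming your git is new enough to combine --depth with --branch and the server allows shallow fetches. A sketch, demonstrated against a throwaway local repo (for MediaWiki you’d swap the file:// URL for gerrit’s core.git):

```shell
# Build a tiny stand-in repository with a release branch to clone from
cd "$(mktemp -d)"
git init -q src
git -C src -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "1.18 work"
git -C src branch REL1_18

# The actual trick: fetch only the tip of one branch, with no history
git clone -q --depth 1 --branch REL1_18 "file://$PWD/src" wiki
git -C wiki rev-parse --abbrev-ref HEAD   # prints REL1_18
```

    That still clones rather than exports, but it skips most of the history, which was the expensive part.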

    And yet I’m not a developer in this instance. I don’t need, or want, the whole release, just a specific stable release. So now my workflow would be…

    Install

    On Git: (Of note, every time I try to run checkout I get “fatal: Out of memory, malloc failed (tried to allocate 1426604 bytes)” and I get the same thing when I try to archive. If I use the tool to change memory allocation, it also fails … out of memory. I’m taking this as a sign.)

    mkdir ~/mediawiki/
    cd ~/mediawiki/
    git clone https://gerrit.wikimedia.org/r/p/mediawiki/core.git .
    git checkout 1.18.1 -f
    cp -R ~/mediawiki/. /home/user/public_html/wiki/
    

    On svn:

    mkdir ~/www/wiki/
    svn co http://svn.wikimedia.org/svnroot/mediawiki/tags/REL1_18_1/phase3/ ~/www/wiki/
    

    Upgrading
    On Git:

    cd ~/mediawiki/
    git pull
    git checkout 1.18.2 -f
    cp -R ~/mediawiki/. /home/user/public_html/wiki/
    

    On svn:

    svn sw http://svn.wikimedia.org/svnroot/mediawiki/tags/REL1_18_2/phase3/ ~/www/wiki/
    

    You see how that’s much easier on svn?

    Now, someone wanted to know why I like this. Pretend you’re trying to test the install. You often need to test a specific install version, say version 1.17.1, against some code to see how far back something was broken. Being able to quickly pull that, without coding any extra steps, is crucial to fast testing and deploying. The tool we use at my office required me to reverse-engineer and script a way to do that with an ant script, and certainly I could script all that here. It’s not even that hard. But what’s one line in svn is three in git on its best day, and you’re always pulling the latest and reverting back to the one you want, which seems a little silly.

    Now, someone pointed out I could use this: git checkout-index -a --prefix=/home/user/public_html/wiki/ When I read the man page, it doesn’t say you can pick a tag/branch to push, though. If you have some advice on how to fix that, I’ll give it a shot, but I suspect my memory issues will block me.
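    One lead worth chasing: git archive, unlike checkout-index, does accept a tag or branch name directly, and pipes that tree straight into a directory, memory limits permitting (it also fails on my host). Another throwaway-repo sketch; for MediaWiki you’d run it inside the clone with the real tag name:

```shell
# Stand-in repo with one tagged release
cd "$(mktemp -d)"
git init -q src
printf '<?php // the wiki code ?>\n' > src/index.php
git -C src add index.php
git -C src -c user.name=demo -c user.email=demo@example.com commit -q -m "release"
git -C src tag 1.18.2

# Export exactly the 1.18.2 tree -- no .git directory tags along
mkdir wiki
git -C src archive 1.18.2 | tar -x -C wiki
ls wiki   # prints index.php
```

    That gets you a clean export of one version, which is the closest git analogue to the svn checkout workflow above.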

    I will be using git for a lot of things, but a fast way to update, spawn versions, and test will not be one. For development, however, I totally adore it.

  • Duplication Dilution

    Duplication Dilution

    A common enough request in the WordPress forums is people wanting two sites on a MultiSite network to have the same content. Usually I take the time to ask why they’re doing this, and see if there are better ways around it. Like if you want to have a collection of ‘all’ posts, then you want WordPress MU Sitewide Tags Pages (weirdest name ever) or Diamond MultiSite Widgets, both of which help you do that admirably. And if you want to have the same ‘about’ page on multiple sites so they can keep their per-site branding, I point out that the best companies don’t do that; they link home to mama. Amazon.com’s subsites all link back to amazon.com’s about page, after all. That’s good business practice!

    Now, before someone gets all het up with an angry comment, there are good reasons to duplicate content, but it’s never 100% in the same language. If you have a site that needs to be bilingual, chances are you’re going to have a copy in English and (let’s say) French that, if translated, would be pretty much the same. Not 100%, because of what idioms are, but basically the same. That’s a great reason for a duplicated site. Regional stores, where most of the content is the same but some of it differs, are another good one. And those really aren’t ‘duplicated’ content.

    But imagine the guy who wants to have 50 sites with different domains (no problem), and they’re all the ‘same.’ Well. What? You mean I can go to example.com and example2.com and example3.com and they’re 100% the same? Same theme, same posts, same content? That seems silly, doesn’t it?

    So why do they persist in doing this? Some people cite SEO reasons (“My site ranks better for this domain…”) and others say it’s regional (“I need a separate domain for Canada!”), and they’re pretty much all wrong. Unless your domains are offering up different content, you are going to lose SEO ranking and readers by having multiple, identical, domains.

    In D&D (yes, I play it), we always say ‘Don’t split the party!’ For the gamers, this is good because everyone gets to play with everyone together. For the person running the game (GM or DM, depending on your flavor of game), they can talk to everyone at once. Splitting the party means half the time, some of your players have to leave the room and kill time while the other people get to do fun things. And then the game runner has to repeat things for the next group when you switch places! It’s annoying and boring 50% of the time, and it causes duplication of effort.

    Splitting up your visitors means you have to figure out how to push content that is identical. This is not difficult, but it can cause problems. Every time you edit a post, the PHP calls your database and overwrites things. Multiply that by however many places you’re overwriting, and that could slow down posting. But then you think about using something like Content Mirror, which pulls post data in from across sites. Sounds great, until you remember that the blog switching code isn’t efficient (i.e. slows things down), and that all the smart people tell you switch_to_blog() is rough on caching.

    All that aside, there are major reasons you don’t want to duplicate content. The foremost is that Google hates it. So do you, by the way. Duplicating content is what spammers and splogs do. (The Illustrated Guide to Duplicate Content in the Search Engines.) For those who go “Hey, but I have the same content on my archive pages for dates and categories!” you should read Ozh on the ‘Wrong SEO Plugins.’ The tl;dr takeaway is that good themes already do this for you!

    Second only to SEO is that now you’ve given your users multiple places to have the same conversation. Cross-posting is never a good idea. You dilute your content by having multiple conversations about the same thing. Should I post on foobar.com or foobaz.com to talk about my foo? The more time your readers spend thinking about where to comment, the less time they’re engaging with your site. This is, by the way, one of my nagging dislikes about BuddyPress. With groups and forums and blogs, you can dilute your message. Thankfully, you can use the group page to pull in all those conversations to one place where people will see them, which helps a lot.

    I put the question to my friends. Why would you want 100% content duplication on multiple sites, in the same language, just with different URLs? Here are their answers:

    http://twitter.com/jan_dembowski/status/175590010555342848

    http://twitter.com/bluelimemedia/status/175606857967214593

    What’s yours?

  • Pretty URLs Matter

    Pretty URLs Matter

    Just over a year ago, Lifehacker changed their site and we all learned about the hashbang. I believed that pretty URLs mattered:

    I would posit that, since the web is based on look and feel, the design of your site still relies, in part, on the ease of someone in understanding the URL.

    Well Dan Webb agrees with me: (It’s About The Hashbangs)

    After quite a lot of thought and some attention to some of the issues that surround web apps that use hashbang URLs I’ve come to conclusion that it most definitely is about the hashbangs. This technique, on its own, is destructive to the web. The implementation is inappropriate, even as a temporary measure or as a downgrade experience.

    This Dan guy happens to be in charge of switching Twitter from the #! back into nice, normal, readable URLs:

    You can read the whole Twitter convo on Storify. But the take-away is, I admit, a little dance from me of ‘I told you so.’

    In the comments of my previous post, someone pointed out that Google uses hashbangs in a search. I have not seen them in my searches (here’s an example of what I do see on page 3 of a search for ‘sadasdas’):

    https://www.google.com/search?sourceid=chrome&ie=UTF-8&
    q=sadasdas#q=sadasdas&hl=en&prmd=imvns&ei=8CthT86EEoWDgwerpqDwDA&
    start=20&sa=N&bav=on.2,or.r_gc.r_pw.r_cp.r_qf.,cf.osb&fp=6b8f6c2ef22f8dde&
    biw=1324&bih=879

    The thing is, Google doesn’t matter in this instance. Actually, none of Google’s URLs ever really matter when it comes to the reasons we want Pretty URLs. Remember our credos?

    URLs are important. Yes, they are, but they’re important when you’re looking for something. They should be important on, say, YouTube, and they certainly are important on Google Code pages. But for a normal Google search? You don’t type in google.com/ipstenu or even google.com/search/ipstenu to find pages about ipstenu (Though, really, that would be really awesome if you could! At the very least, the 404 should have a ‘Do you want to search for that?’ option. Get on it, Google Monkeys!) Which is the point I was trying to make. When you go to a site for specific content, you want people to say ‘Go to ipstenu.org/about’ and be done with it.

    URLs are forever. Again, yes they are, except when they’re not. For Google? This doesn’t matter. They don’t search themselves, and rarely do I know people who bookmark a Google search. But… Did you know Google’s search URL formats are backwards compatible? That’s right. If you had an old URL saved, it would still be usable.

    Cool URLs don’t change. Except for search URLs. But Google knows if you’ve gotta change ’em, make ’em backwards compatible. Ben Cherry has a nice article on why it’s good, too, which I found enlightening. I still don’t like them on the front end of websites, mind you, for a reason Ben points out:

    The hashbang also brings a tangible benefit. Someday, we will hopefully be rid of the thing, when HTML5 History (or something like it) is standard in all browsers. The Web application is not going anywhere, and the hash is a fact of life.

    That’s the problem right there. We’re building something we know is going to go away! We just broke a credo!

    Are there ever reasons for using Hashbangs? Sure, but most of us will never need them. They’re great for loading AJAX which is awesome because you can load new parts of your site faster. Heck, WordPress has some AJAXification, but it’s primarily on the backend. They’re fantastic for web apps and they have some attractive functionality. But they don’t work for everyone, and they never will.

    The ‘replacement’ is HTML5, which introduces pushState. You use a JavaScript library with HTML5 history.pushState and … well, basically you write an app with that instead of AJAX and the hashbang; your URLs stay pretty and your page can still have that nice ‘click to add the new tweets’ functionality that we’ve all seen on the new-new-new-Twitter. And Facebook.

    See, pushState is giving us these three things (among others):

    • Better-looking URLs: a ‘real’, bookmarkable, ‘server-reachable’ URL.
    • Graceful degradation: render correct pages when you don’t have JS enabled.
    • Faster loading: by loading only parts of the page, slow-net users can get the content faster.

    The downside? It doesn’t work on IE right now. That’s okay, though, since it’s got graceful degradation, IE users will just have to manually refresh things. Which sucks for IE users, but perhaps will be the fire under Microsoft’s feet to get on that one.
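    That degradation path can be sketched in a few lines. This is a hypothetical helper, not Twitter’s (or anyone’s) actual code; the win handle is passed in so the idea can be shown outside a browser:

```javascript
// Use pushState for a clean URL when the browser supports it; otherwise
// fall back to a hashbang so older IE still gets a working, shareable URL.
function navigate(win, path) {
  if (win.history && typeof win.history.pushState === "function") {
    win.history.pushState({}, "", path); // address bar shows /about
    return path;
  }
  win.location.hash = "#!" + path;       // address bar shows #!/about
  return "#!" + path;
}
```

    In a real app you’d also listen for popstate (or hashchange in the fallback) so the back button renders the right content.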

    Keep an eye out in your webapps. They’ll be moving over to this sooner than you think.

  • GPL – Oh Dear God

    GPL – Oh Dear God

    I come to code from a strange direction. I was a fangirl and I learned all about webpages because of that. Perhaps it’s because of that humble beginning that I look at the GPL arguments much as I look at ‘shipping’ arguments in fandom. Shipping is when fans believe a relationship to exist between two characters, regardless of anything being scripted. A great example of this is Xena: Warrior Princess. Some people thought Xena and Gabrielle were a couple. Some people didn’t. When the two argued, there were fireworks. Now, with shipping, sometimes a miracle occurs and the couple do get together (see Mulder/Scully in The X-Files and Grissom/Sara in CSI: Crime Scene Investigation), but most of the time the arguments go on for eternity.

    That’s pretty much how GPL arguments make me feel. Ad nauseam. I hate them, and I was sorely tempted to post this with comments turned off. But various people asked me my opinion and said they’d like to see this posted, so… you’re masochists. This all started with Rarst asking me what I thought about GPLv2 vs GPLv3, in a WordPress trac ticket about fixing the plugin about-page license requirement. I should note that he and I don’t see eye to eye about GPL, and we’re still pretty friendly.

    Here’s the thing. I don’t have a horse in this race. GPLv2, GPLv3, MIT, Apache, whatever. I don’t have a license I love beyond measure. You see, I work for The Man, and having done so for 13 years, I have an acceptance of things I can’t change. One of those things is ‘This is how we do things.’ You accept that, even if things aren’t the most efficient, or even if they’re not the way you’d do them if you could choose, this is what they are.

    I view the GPL issue in WordPress with the same antipathy. It’s not that I don’t care, it’s that I accept the rules for what they are and I have no reason to rock the boat. WordPress says ‘If you want to be in our repositories, and you want to be supporting our official stuff, you have to play by our rules.’ This is good business sense, it’s good branding, and it’s self-protection. With that in mind, I recently changed a plugin of mine to use code that was MIT licensed.

    Expat License (#Expat)

    This is a simple, permissive non-copyleft free software license, compatible with the GNU GPL. It is sometimes ambiguously referred to as the MIT License. (MIT License compatibility with GPLv2)

    That’s easy enough. But I’m sure you’ve heard people argue that GPLv3 and GPLv2 are incompatible, or Apache is, and those are true statements. That they’re incompatible, however, does not mean you can’t use them together; it means you can’t incorporate one into the other!

    Let me explain. No. Let’s let Richard Stallman explain:

    When we say that GPLv2 and GPLv3 are incompatible, it means there is no legal way to combine code under GPLv2 with code under GPLv3 in a single program. This is because both GPLv2 and GPLv3 are copyleft licenses: each of them says, “If you include code under this license in a larger program, the larger program must be under this license too.” There is no way to make them compatible. We could add a GPLv2-compatibility clause to GPLv3, but it wouldn’t do the job, because GPLv2 would need a similar clause.

    Fortunately, license incompatibility only matters when you want to link, merge or combine code from two different programs into a single program. There is no problem in having GPLv3-covered and GPLv2-covered programs side by side in an operating system. For instance, the TeX license and the Apache license are incompatible with GPLv2, but that doesn’t stop us from running TeX and Apache in the same system with Linux, Bash and GCC. This is because they are all separate programs. Likewise, if Bash and GCC move to GPLv3, while Linux remains under GPLv2, there is no conflict.

    What does that mean? It means I could write a plugin in GPLv3 and use it with the GPLv2 (or later) WordPress and not have any issues. However what I cannot do is take that code, put it into WordPress, and distribute it as a new version of WP Core. And that’s (most of) why we shouldn’t have the GPLv3 plugins in the WordPress repository. The fact is that often new core features get their start as plugins, and if we allow GPLv3 in there, we run the risk of breaking licensing.

    Worse, if we get our ideas from a GPLv3 plugin and rewrite it to GPLv2, we still are at risk for violation. This is the same reason, to bring it back to fandom, why authors can’t read fanfiction. If they use your ideas, you can sue them. We have enough headaches with Hello Dolly, let’s not add to them. And before someone thinks I’m overreacting about that, I wish I was. I’ve watched copyright wars before, seen friends lose their websites over it, and I can’t imagine that a TV show would be less aggressive than a software company when they feel their rights have been infringed. It’s terrible, but it’s the reality of the litigious society in which we live.

    So why don’t we just move WordPress to GPLv3?

    If it helps to think of software in a different way, pretend you have two versions of server software, Server 2.0 and Server 3.0. You can use the files from Server 2.0 on the box with Server 3.0, but you can’t use Server 3.0 files on Server 2.0. They’re not backwards compatible. That’s the first problem with GPLv3: we’d have to make sure every single bit of code in WordPress is able to be moved to GPLv3.

    This is clarified in the GPLv3 FAQ: (GPLv3 FAQ Update – Converting GPLv2 to GPLv3)

    I have a copy of a program that is currently licensed as “GPLv2, or (at your option) any later version.” Can I combine this work with code released under a license that’s only compatible with GPLv3, such as ASL 2.0?
    Once GPLv3 has been released, you may do this. When multiple licenses are available to you like this, you can choose which one you use. In this case, you would choose GPLv3.

    If you do this, you may also want to update the license notices. You have a number of options:

    • You may leave them as they are, so the work is still licensed under “GPLv2, or (at your option) any later version.” If you do, people who receive the work from you may remove the combination with parts that are only compatible with GPLv3, and use the resulting work under GPLv2 again.

      If you do this, we suggest you include copies of both versions of the GPL. Then, in a file like COPYING, explain that the software is available under “GPLv2, or (at your option) any later version,” and provide references to both the included versions. If you have any additional restrictions, per section 7 of GPLv3, you should list those there as well.

    • You may update them to say “GPLv3, or (at your option) any later version.” At this point, any versions of the work based on yours can only be licensed under GPLv3 or later versions. Include a copy of GPLv3 with the software.
    • You may also update them to say “GPLv3” only, with no upgrade option, but we recommend against this. Include a copy of GPLv3 with the software.

    That doesn’t sound terrible, does it? You can go from GPLv3 to GPLv2, so long as you remove all the GPLv3-and-up code. Backwards compatibility is complicated, and forward isn’t much better. The second problem is the bigger one. From the GPLv3 FAQ: (GPLv3 FAQ Update – permission to change release)

    Consider this situation: 1. X releases V1 of a project under the GPL. 2. Y contributes to the development of V2 with changes and new code based on V1. 3. X wants to convert V2 to a non-GPL license. Does X need Y’s permission?
    Yes. Y was required to release its version under the GNU GPL, as a consequence of basing it on X’s version V1. nothing required Y to agree to any other license for its code. Therefore, X must get Y’s permission before releasing that code under another license.

    And why would WordPress want to switch to GPLv3 anyway? Other than having more licenses compatible with GPLv3 than GPLv2, the other main differences didn’t strike me as anything WordPress needs. You can read ifrOSS’s document and come up with your own opinion, of course. Basically in order to get WordPress moved to GPLv3, every single person who ever submitted code to WordPress has to be contacted and sign off on the change. Now, unless a clever lawyer type can argue that the act of submitted code to core reliquieshes your ‘ownership’ thereof, and the code becomes community

    code maintained and manged by WordPress, and as such WordPress does not need to gain consent from all contributors, then maybe  this can be done. But that’s a long, mess, legal conversation.

    Finally, the real question is who are we protecting here? Remember, GPL is not about the developer but the user. The extra ‘freedoms’ that come in GPLv3 don’t really strike me as being about the user, and that worries me a little. Patent protection, permitting code to be kept from the end user, and tivoization are all great things for a developer who puts out the code, but as the person who wants to adapt it later, that sure feels pretty restrictive to me.

    Will I use GPLv3 ever? Sure. But not in my WordPress code.

  • Risk vs Transparency

    Risk vs Transparency

    There's a 'fucking close to water' joke here. This was written without any special insider knowledge. I’ve simply watched, paid attention, and kept track for the last two years. Often when I report a plugin, Mark and Otto are nice enough to explain things to me, and I’ve listened.

    Occasionally a plugin vanishes from the WordPress repository. As a forum mod I tend to see people freak about this more often than not, and the question that inevitably comes up is ‘Why doesn’t WordPress publicize these things?’

    Let’s go down the list for why a plugin is removed first. This list is very short, and boils down to three:

    1. It breaks the rules
    2. It has a security exploit
    3. The author asks for it to be removed

    That’s pretty much it. The rules cover a lot, though Otto and I have been known to sum it up with ‘Don’t be a spamming dick.’ I actually had the chance to talk to folks about this before the ‘expanded guidelines’ went live, and I think I have a pretty good understanding of what the rules are. The majority of plugins, that I see removed, are done so for the most obvious reasons:

    • Phoning home (i.e. sending the author data about you without your permission)
    • Forward facing links (i.e. opt OUT links on the front of your site when you use the plugin)
    • Affiliate links (i.e. the author gets revenue from the plugin without disclosure)
    • Obfuscated code

    None of those are security reasons, and most of them are ‘fixed’ by us reporting the plugin, the plugin repo mods contacting the author, the author making the fix, and all is well. When the author doesn’t reply, or in the case of a ‘phone home’, often the plugin is yanked from the repo pending review. So where are these ‘security reasons’ to yank a plugin, and why should WordPress disclose them. Phoning home is, sometimes, a security reason, depending on what’s actually being transmitted.Usually it’s a vulnerability or an outright backdoor that would be a reason to pull a plugin.

    I see what you're thinkingThere’s an argument that ‘Trust requires transparency’ when it comes to security (see Verisign’s recent rigmarole) and that would mean WordPress needs to publish things like ‘This month, these plugins were removed for this reason.’ Except WordPress doesn’t, and in fact, if you look, rarely do companies do this until they have a fix. The ‘problem’ with WordPress is they don’t do the fix, the plugin devs do, and a surprisingly high amount of times, the plugin author fucks off like a monkey.

    On the other side of this argument is FUD(Fear, Uncertainty and Doubt) which is something you never want to feed. Look at the plugin “ToolsPack,” helpfully shown up on Sucuri. Now that was never hosted on WordPress.org, but if it had been, it would have been removed for exploitation. But once the offending plugin is removed, should WP go ahead

    In October of 2010, WordPress.org ‘introduced’ a kill switch for plugins. Not really, but kind of. BlogPress SEO was spam. Yoast, one of the few true ‘SEO experts’ I know of, caught it and decided to fix it the best way he knew how. See, this plugin was never on the WordPress repository and so WP could do little about it. Yoast registered a plugin with the same name, gave it a newer version of the plugin, and everyone saw that as an ‘update’ and thus were saved. Sort of. Now, even Yoast admits this is abuse of the system, and I’ll leave the coulda/woulda/shoulda to someone else.

    The reason I bring it up is this shows there is a way to handle bad plugins. But it’s not very efficient, it’s not very friendly, and it doesn’t promise that it will work. First off, not enough people run updates, and secondly it’s putting a lot of work on a very small group of people. While the theme reviewers have a lot of folks helping out, the plugins do not. Should they? Yes, but the number of people who understand all the code that could be in a plugin is far smaller than for a theme. I suppose it’s saying ‘plugins are harder than themes.’ I may be wrong, but it’s how I feel.

    Traffic Jam!To fix all this, you’d need to basically reboot the plugins directory, turn them all off, review each of the 18,000+ plugins, and turn them back on. Then you need an Otto or Nacin going through each one to make sure every check in is okay, every update and every change isn’t spamming. Oh yes, that’s what happens to theme devs, didn’t you know? All releases are approved before they go live. Can you see the plugin developers agreeing to that? That’s a nonsense complaint of mine, actually. If tomorrow the rules changed, maybe half the plugins in the repo would vanish and never come back, but most of the rest would be fine. Of course, we would need a dedicated team of people to do nothing but review and approve plugins to keep up with the traffic.

    So accepting what we have today, the wild west, why isn’t there a running list of all plugins yanked from the repo, and why? The list itself isn’t a bad idea. Having a list to say ‘This plugin was disabled on this date’ would be nice for a lot of us, and more so, having the plugin page show ‘This was disabled.’ would be nice. I can even think of a couple code ways to do it, but all of them need a full time person to go through the ‘removals’ and put up a splash page with ‘If you used this plugin, please consider alternatives: .’ and ‘If you wrote this plugin, please contact plugin support.’ Also, this would increase emails to the plugins support account, not from the authors, but from people who want to know why a plugin was removed. And what about a day when a plugin is removed because of a bad thing, but the authors fix it? Did we create a false feeling of doubt in a plugin that had a typo?

    On paper, it all sounds like we should be keeping a public list for this still, though. Put it all up there for the public, disclose everything.

    Every time I write that sentence, I wince.

    It sounds nice on paper, and all I can think is about the people who will cry foul, complain, and want to know more. “Why was this plugin removed and not that one?” Well, most of the time it’s because no one mentioned that plugin. Right now, the plugins that get yanked are ones people stumble across or report.

    But why worry about a simple list of removed plugins? Because the first thing I would do, if I was a nefarious hacker, would be to script a pull from that list and scan the web looking for sites that use the plugins, thus implementing a vector for attack. See, we already know people don’t update plugins as often as they should (which is why Yoast’s ‘fix’ isn’t as good an idea as we’d hope), but now not only are we leaving people at risk, we’re opening them to even more risk. If we email them to tell them the plugin’s risky, we have the same problem.

    There’s no safe way to inform people without putting anyone who’s not up to date at risk. Given that the most dangerous day to have an unpatched system is the day of disclosure, the only way WordPress, or anyone, could keep a list like that would be if, like Chrome, WP auto-pushed updates right away, forcing anyone who opened the site to upgrade. And that’s fraught with it’s own issues.

    Until then, I can’t advocate anyone keeping a list of removed plugins. It’s too risky.