I’ve been putting off learning git. I’m lazy, svn works fine for what I need, and … oh. Look. MediaWiki switched to git. This is a problem for me, because I upgrade my MediaWiki install via svn. When it comes time for a new version, I just run something like svn sw http://svn.wikimedia.org/svnroot/mediawiki/tags/REL1_18_1/phase3/ ~/www/wiki/ and I’m off to the races. But now I have to git off my ass and learn it.
Before someone asks, I use svn to update my code because I like it better than wget-ting files, unzipping, and copying over. I can certainly do that, but I prefer svn. It runs fast, it lets me see what changed, and it’s easy for me to back out, just by switching back to the previous version. It never touches my personal files, and I’m happy. But the more I looked at git and how it works, the more I wondered whether this was still a practical approach for upkeep.
Now, I actually like git for development when you’re working with a team. It’s great. So any comments about how I’m stupid for wanting to do this weird thing will just be deleted. Constructive or GTFO. The rest of this post details the journey from ‘This is what I do in svn; how do I replace it with git?’ The tl;dr of it is that I can’t.
What I want to do
I want to check out a specific branch (or tag) of code and upload it to a folder called ~/www/wiki/
That’s it.
The road…
First up, I have to install git on my server. Unlike SVN, which has been bog-standard on servers for a hog’s age, git is the new kid in town, and odds are it’s not on your server. Thankfully, my host had note-perfect directions for installing git on CentOS 5. (I’ll probably upgrade to CentOS 6 this summer, when my sites are less busy.)
Second, I have to ‘install’ the git repository. This was weird. See, I’m used to svn, where you check out code, use ‘svn up’ to upgrade it, or ‘svn sw URL’ to switch to a new branch/tag. The official directions for gitting MediaWiki were: git clone https://gerrit.wikimedia.org/r/p/mediawiki/core.git
That doesn’t tell me what version I’m getting, and I only want the latest stable release. It was unclear at this point what this cloning did, but I was told over and over that I had to clone to get the version I wanted. Thankfully, git knows a lot of us are in the WTF world, and has a nice crash course for SVN users. Most important to me was this:
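(What sat here was a command-by-command table from git’s SVN crash course. Roughly, and from memory, the gist was:

svn checkout URL   →  git clone URL
svn update         →  git pull
svn commit         →  git commit -a, then git push to share it
svn switch branch  →  git checkout branch)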

So git clone URL is svn co URL, and that’s how I first installed MediaWiki. But… I know MediaWiki is working on a million versions, and I’m not specifying one here. The git doc goes on to say that with git you don’t need to know the version; you use the URL and it’ll pull the version named ‘master’ or ‘default.’ That basically means I’m checking out ‘trunk,’ to use an svn term. This is exactly what I do not want. I have no need for trunk; I want a specific version.
After more reading I found git checkout --track -b branch origin/branch, which they say is the same as svn switch URL, which is excellent. I can also use git checkout branch to check out the specific version I want!
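For my setup, that mapping would look something like this (a sketch; REL1_18 is how MediaWiki names its release branches, and the paths are mine):

svn sw http://svn.wikimedia.org/svnroot/mediawiki/tags/REL1_18_1/phase3/ ~/www/wiki/

becomes, from inside the clone:

cd ~/www/wiki/
git checkout --track -b REL1_18 origin/REL1_18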
I’m very used to SVN, where I can browse and poke around to see the tags, branches, and so on. When I go to https://gerrit.wikimedia.org, however, all I see are the latest revisions. The directions said I should get the file /r/p/mediawiki/core.git, which made me presume that’s what tells git the version I’m checking out, but I can’t see it. I don’t like obfuscation when I’m checking out code. After some digging, I found the URL I wanted was https://gerrit.wikimedia.org/r/gitweb?p=mediawiki/core.git, which shows me the files and their tags.
The tags were important for the command I wanted to run: git checkout tags/NAME, where NAME is the tag. I want version 1.18.2, which is the latest stable, so what I want to do is git clone -b tags/1.18.2 https://gerrit.wikimedia.org/r/p/mediawiki/core.git
I tried it, pulling a copy of 1.18.1, and it ended up pulling about 89 megs and put it in a subfolder called ‘core’. Oh, and it told me it couldn’t find tags/1.18.2 and used ‘HEAD’ instead, which put me on the bleeding edge. It took a while to find something that looked a little better:
cd ~/www/wiki/
git init
git remote add -t REL1_18 -f origin https://gerrit.wikimedia.org/r/p/mediawiki/core.git
git checkout REL1_18
In theory, that would get me the latest and greatest of release 1.18, be that 1.18.1 or 1.18.2, and then, when 1.19 was released, I could switch over to it by running: git checkout REL1_19
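And if I ever need an exact point release instead of whatever the branch tip happens to be, my understanding is I can grab the tags from the same setup and check one out directly; a sketch, assuming the tags really are named as plain version numbers like the gitweb listing shows:

git fetch --tags origin
git checkout 1.18.2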
It was time to test it! I pulled REL1_17 and … ran out of memory. Twice. fatal: Out of memory? mmap failed: Cannot allocate memory happened right after resolving deltas. Everyone said ‘You need more swap’ but when I looked, swap was fine (not even half used). So that was when I said ‘screw it, I’ll go back to tarballs.’
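For what it’s worth, the stock advice for that error (beyond adding swap) is to cap how much memory git uses when it packs and unpacks objects. These are real git config options, though as I note further down, they didn’t save me:

git config --global pack.windowMemory 32m
git config --global pack.packSizeLimit 32m
git config --global pack.threads 1
git config --global core.packedGitLimit 128m
git config --global core.packedGitWindowSize 32m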
Just as I hit that wall, some Twitter peeps stepped in to explain that, no matter what, I have to clone the whole archive first. I’m not really happy about that. I like SVN, since even when I switch, it only updates my changed files. I wanted to do the same in git without pulling down the whole repo, since I only use it to update, not to develop.
And that there is my one and only issue with git. It’s entirely for developers. Sometimes, some of us just want to pull down the latest version to play with, and not the bleeding edge. With git, I have to clone the repository, then export the version I want to play with. That’s not optimal for what I want to do. On the other hand, I totally get why it’s done that way. It’s akin to the basic svn check out, where I get all the branches and tags, only much much cleaner. And that’s cool! Smaller and leaner is fantastic.
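The one compromise I’ve turned up is a shallow clone, which skips most of the history. I can’t promise it dodges the memory errors, and older gits restrict what you can do with a shallow repository afterwards, but the flags themselves are real:

git clone --depth 1 -b REL1_18 https://gerrit.wikimedia.org/r/p/mediawiki/core.git ~/www/wiki/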
And yet I’m not a developer in this instance. I don’t need, or want, the whole repository, just a specific stable release. So now my workflow would be…
Install
On Git: (Of note, every time I try to run checkout I get “fatal: Out of memory, malloc failed (tried to allocate 1426604 bytes)” and I get the same thing when I try to archive. If I use the tool to change memory allocation, it also fails … out of memory. I’m taking this as a sign.)
mkdir ~/mediawiki/
cd ~/mediawiki/
git clone https://gerrit.wikimedia.org/r/p/mediawiki/core.git .
git checkout 1.18.1 -f
cp -R ~/mediawiki/. /home/user/public_html/wiki/
On svn:
mkdir ~/www/wiki/
svn co http://svn.wikimedia.org/svnroot/mediawiki/tags/REL1_18_1/phase3/ ~/www/wiki/
Upgrading
On Git:
cd /home/user/mediawiki/
git fetch --tags origin
git checkout 1.18.2 -f
cp -R /home/user/mediawiki/. /home/user/public_html/wiki/
On svn:
svn sw http://svn.wikimedia.org/svnroot/mediawiki/tags/REL1_18_2/phase3/ ~/www/wiki/
You see how that’s much easier on svn?
Now, someone wanted to know why I like this. Pretend you’re trying to test the install. You often need to test a specific version, say 1.17.1, against some code to see how far back something was broken. Being able to quickly pull that, without coding any extra steps, is crucial to fast testing and deploying. The tool we use at my office required me to reverse engineer and script a way to do that with an ant script, and certainly I could script all that here. It’s not even that hard. But what’s one line in svn is three in git, on its best day, and you’re always pulling the latest and then reverting back to the one you want, which seems a little silly.
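To prove the point, here’s what that script might look like; a sketch, using the hypothetical paths from my workflow above:

#!/bin/sh
# wiki-switch: point my live wiki at any MediaWiki branch or tag.
# Usage: wiki-switch REL1_18   (or a tag like 1.17.1)
set -e
cd /home/user/mediawiki/
git fetch --tags origin        # grab any new branches and tags
git checkout "$1" -f           # force-switch to the requested version
cp -R /home/user/mediawiki/. /home/user/public_html/wiki/

It works, but it’s scripting around git to get what svn gives me in one line.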
Now, someone pointed out I could use this: git checkout-index -a --prefix=/home/user/public_html/wiki/ When I read the man page, it doesn’t say you can pick a tag/branch to push, though. If you have some advice on how to fix that, I’ll give it a shot, but I suspect my memory issues will block me.
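From what I can tell, checkout-index just exports whatever is currently in the index, so the trick would be to check out the version first and then export it; git archive should also manage it in one shot. Both of these are sketches, assuming the tag names hold and the memory doesn’t give out:

git checkout 1.18.2 -f
git checkout-index -a --prefix=/home/user/public_html/wiki/

or

git archive 1.18.2 | tar -xf - -C /home/user/public_html/wiki/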
I will be using git for a lot of things, but a fast way to update, spawn versions, and test will not be one. For development, however, I totally adore it.

