Half-Elf on Tech

Thoughts From a Professional Lesbian

Category: How It Works

  • SEO “Experts” Are Lying To You (About Backlinks)

    Stop me if you’ve heard this one.

    “For just $19.95, we offer hundreds of certified backlinks!”
    “Quality backlinks for your site!”
    “In just one week, we can make your site number one in Google Searches!”

    I see people ask, a lot, what the best WordPress plugin is to generate backlinks. And I always reply something like this: “The best way to get backlinks is to write good posts that people will link to and share.”

    But what is a backlink anyway? As obvious as it sounds, a backlink is a link from someone else’s site back to yours. So when I say things like “Yoast’s explanation on how BlogPress SEO Plugin generates spam is an invaluable resource”, I’ve linked back to his site and made a backlink. If he has pings on, he’ll see my remark and link, and it’ll show up on his site in the comments section. (I actually turn pings off, because of the high number of spammers and the low value they were giving me. If the only reason you’re linking to someone is to get the link BACK to your site, you’re doing something wrong, but that’s another blog post.) Backlinks, honest ones between two good sites, are great. I love getting linked to from CNN (it happened once) or other sites that like my writing. It’s a great compliment.

    However, people seem to think that backlinks are going to ‘generate SEO.’ First off, they’re not using the words correctly. SEO stands for ‘Search Engine Optimization.’ My first grown-up job, where I wasn’t just fiddle-farting around on the computer, was to optimize meta data for sites to get them ranked first on AltaVista, so yes, I do know what I’m talking about here. Due to that early work, I’ve got pretty awesome Google-Fu, because I used to spend hours going over the specs for search engines, and reading up on how they worked, what their algorithmic engines were, and how to get legitimately good results for my keywords. I also learned which keywords were useless.

    Back in the day, search engines would rate your site based solely on your self-contained content. One of the ways we would promote our sites would be to use hidden text or meta keywords that only the search engine would see. We’d list all the keywords related to our site about dog biscuits, and awesomely, we’d get rewarded. Naturally, some people would shove totally irrelevant keywords in, to game the system for other searches. Which is why sometimes you’d search for ‘free range catnip’ and get a link for ‘wetriffs.com’ (note: wetriffs.com is NOT SAFE FOR WORK!). Today, no search engine relies on keyword meta data because of that (though most sites still include it).

    Nothing can ‘generate’ SEO, because by its nature, optimization isn’t something you generate. It’s something you can leverage and build on, but we don’t generate it. Backlinks are, certainly, a component in getting your site highly ranked on Google for your keywords, but if you think about it, you’re really not optimizing your site for backlinks by doing anything other than making good posts. Maybe I’m splitting hairs, but your page rank (i.e. how cool Google thinks you are) is going to be built on a few things, and while backlinks are one of them, they’re not everything.

    Here’s how you make a good site that’s highly ranked in Google:

    1. Write good content
    2. Include decent meta data in your site’s HTML (If you need help with that, check out Google’s page on Meta Tags.)
    3. Network with other (related) sites to share links
    4. Advertise
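
    For step 2, the meta data doesn’t have to be fancy. Here’s a minimal sketch (the title and description are just examples; write your own):

    ```html
    <!-- Basic meta data in the <head> of your site. The description is
         what Google often shows under your link in search results. -->
    <head>
      <title>Half-Elf on Tech - Thoughts From a Professional Lesbian</title>
      <meta name="description" content="Articles on WordPress, SEO, and how the web actually works.">
      <meta name="robots" content="index, follow">
    </head>
    ```

    Note there’s no keywords tag in there. As covered above, search engines stopped trusting it years ago.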

    So why do people get all fired up about backlinks? Google themselves say:

    Your site’s ranking in Google search results is partly based on analysis of those sites that link to you. The quantity, quality, and relevance of links count towards your rating. The sites that link to you can provide context about the subject matter of your site, and can indicate its quality and popularity.(Link Schemes – Google.com)

    Sounds great, doesn’t it? If a lot of people link back to me, like Wikipedia, then my content is proven to be good, and I win! You knew it wasn’t that simple, right? Google’s smart. They actually care about the quality and relevance of people linking to you! Heck, Google actually agrees with me when we both say the best way to get a good page ranking is to make good content. More to the point, those get-backlink-quick tools are going to engage in what basically amounts to spam, which will adversely impact your page ranking.

    Of course, there are good backlinks. Like mine to Yoast’s (not that he needs the ‘link juice’; that’s the term we use for the ‘value’ of a link coming back to a site. If I link to you, I give you ‘juice’, which boosts your page rank. In Yoast’s case, he doesn’t need any help, but I give it anyway). But the best way to get those is to get yourself known in your arena. People don’t link to new sites because they don’t know about them, so you need to get out there and get known. Talk to a site you admire (or people you admire) and ask them if they’ll read and review your site. Post your articles on Twitter/Facebook/Digg/whatever and basically put in the sweat equity to make your site shine. And if that sounded like a lot of work for you, then you’re right. It is work. It’s hard work.

    The obvious question now is: if these so-called experts are telling you that they can generate hundreds of backlinks, what are they actually doing? They’re ripping you off. There is no automated way to create legitimate backlinks. So if someone tells you they can do it for $19.95, they may not be lying, but they are cheating you out of money and giving you something useless. If you’ve fallen for one of those scams, I’d cancel that credit card ASAP. I have a horror story about a guy who got scammed and then ripped off for a couple grand.

    The lessons learned from this are pretty simple: There is no quick fix, no magic bullet, no perfect tool that will make you popular. You have to find your audience and pitch good content to them. You have to work hard and yes, this takes a lot of time and effort. Anyone who says differently is selling something. Of course, optimizing the hell out of your site (with caching software and minification and a CDN) is a great way to speed your site up, but at the end of the day, all the advice in the world boils down to this: If there’s nothing here for people to read and find beneficial, your site is useless.

    Before you get depressed and think there’s nothing you can do to improve your site, I refer back to Joost de Valk. When people tell me they’re an SEO expert, I compare their website and work to Yoast’s, because in my opinion, he’s the example of what an SEO expert looks like, and he doesn’t call himself an expert. He says he’s an ‘SEO and online marketer.’ Sounds to me like a guy with his head on straight. Pretty much everyone else I ignore. And he’s written the Ultimate SEO Guide, which is free for anyone to use.

    SEO Folks I Would Hire (culled from my ‘Folks I’d Hire’ list):

  • Website Viewability

    The goal of making your site look cool, easy for people to use, and available to all is a holy grail trifecta that is rarely achieved. Many times, you have to sacrifice one leg of the tripod in order to achieve your goals.

    The advent of Typekit has led to a lot of websites using cool custom fonts in a way that is supposed to solve that age-old problem of what happens when you design your site with a font the end-user doesn’t have. For a very long time I couldn’t understand what the big deal was, since I often read these sites from work, and their fonts were all jaggedy and ugly. Then I fired up a site from home and was astounded at the difference.

    This is what I normally see when I go to TypeKit:

    This is what you’re supposed to see:

    I know it doesn’t look too bad, but basically what I don’t get are the nice, smooth edges on fonts, so when I read a whole page like that, it’s hard on the eyes. TypeKit works via JavaScript, so arguably, it should work on all browsers with JS enabled (which is to say all modern browsers). I’m using Chrome (latest and greatest) and I get crap.

    That’s from Ed Jeavons’ Beyond web-safe fonts with Typekit, which is a great article. But the whole thing is unreadable to me because of that.

    So where is the breakdown here? TypeKit’s goal is to make their fonts work on every site, regardless of whether the end-user has the font installed. Jeavons says “Typekit degrades gracefully so that anyone without JavaScript, or with a browser that doesn’t support the necessary features, will simply revert to your standard CSS rules.” If that was the case, shouldn’t I be seeing a better site?
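
    For what it’s worth, those “standard CSS rules” are just a font stack with fallbacks. A sketch (the kit font name here is made up; substitute whatever your kit actually serves):

    ```css
    /* If the Typekit font never loads (no JS, blocked by a firewall, etc.),
       the browser falls through to fonts the reader already has. */
    body {
      font-family: "my-kit-font", Georgia, "Times New Roman", serif;
    }
    ```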

    According to TypeKit, the problem is that the sites I’m seeing didn’t make good standard CSS rules. My anecdotal evidence suggests otherwise. After all, every site I go to has the exact same problem. So I turned off javascript and went back to the site:

    Now that looks like you’d expect graceful degradation! At this point, my answer is that something TypeKit does is unwelcome on my office computer. Or more likely, my office firewall. That’s a whole new kettle of fish. I can’t reasonably expect everyone to go find an office behind a firewall made of adamantium and test their site. But clearly this is not the fault of the individual site. Is it reasonable to expect TypeKit to look into this? I went on a search and found they have a cool little checker tool, typecheck, which says I’m fine:

    There’s nothing in their FAQ or help desk that mentions firewalls having this issue, so I decided to check out Google Web Fonts. Lo and behold, I get the same problem. Some more digging and I found someone who ‘fixed’ the problem using CSS. My Twitter friend @cgrymala suggested I also try ClearType, since I’m on Windows XP at work. That actually helped a lot (seriously, I cannot tell you how much nicer things look) but the main problem is still there.

    Where’s my problem? My problem is that TypeKit and Google Web Fonts, while they purport to be one-size-fits-all, degrades-nicely apps, are not. If you’re not on the forefront of technology, if you’re behind a firewall, if you’re on a weird setup, these things are not going to work. This is not really TypeKit’s or Google’s fault. They’ve done an amazing job setting things up so it works most of the time. At best, they could have their JavaScript detect the browser and OS (yes, you can do these things) and if it’s IE 6 or Windows XP (for example), revert to the JavaScript-less version of the site.

    It’s nigh impossible to solve the firewall problem. You can’t detect a firewall easily, and part of the point of a firewall is that it obfuscates who and what it is. And if the problem is a combination of OS, browser and firewall, then the best you might be able to do is somehow detect if any one of those three is on the known ‘possible’ trouble list, and shunt them off to a non-JS version. And now you’ve added a lot more load to your server.

    The best you can do is to avoid using these cool systems and features until they’re more supported, which is where the whole concept of sacrifice comes in. If it’s more important for you to have your site look cool than to work for everyone, you have to find a way to degrade better. For a long time I had an alert bar on my site to tell you that if you were using IE 6, you needed to upgrade. Going back further, we used to regularly make sites that said ‘Best viewed in Netscape Navigator.’ Thankfully sanity struck, web standards started to stick, and we began to design sites that looked good in most browsers.

    I cannot advocate a return to ‘Best viewed in…’, but I can suggest that if you’re relying heavily on cool, cutting edge, features, you also have a printer-friendly version of your site that runs without any of the bells and whistles.

  • Software Freedoms

    Like a million other posts, I’m starting this with a warning: I Am Not A Lawyer.  Sure, my mom is, but that qualifies me for a cup of coffee, if I have the cash.  Personally, I support open-data and open-code because I think it makes things better, but there are a lot of weird issues when you try and pair up software licenses, explain what ‘freedom’ means, and where it’s applicable. For the record, I am not getting into the ‘is a plugin/theme derivative software or not’ debate. I will wiggle my toe and point out it is a point of contention.

    I’m presuming you are already familiar with the idea of what GPL is. If not, read the GPL FAQ.

    Why are WordPress and Drupal GPL anyway?

    The people who built WordPress took an existing app (b2) and forked it.  Forking happens when developers take a legally acquired copy of some code and make a new program out of it.  Of the myriad caveats in forking, you have to remember that the fork must be a legal copy of the code.  In order to create WordPress, Matt et al. were legally obligated to make WordPress GPL.  No one argues that.  The only way to change a license from GPL is to get everyone who has ever committed any code to the project to agree to this, and you know, that’s like saying you’re going to get everyone in your house to agree to what pizza to order.

    WordPress and Drupal are GPL because they must be.  There is no other option.

    So why is this a problem?

    GPL poses a problem because of interpretations of what ‘derivative works’ are.  It’s very clear cut that if you take or use WordPress’ or Drupal’s code, you are taking code built on GPL, which means you must keep your code GPL.  The definition of ‘code’ is a bit squidgy.  A generally accepted rule of thumb is that if your code can exist, 100%, without WordPress or Drupal’s support, then it’s not a derivative.  By their very nature, plugins and modules are seen as derivative.  Both Drupal and WordPress have long since stated that this is, indeed, the case.

    Themes, modules and plugins are GPL because they must be.  There is no other option.

    Except…

    Except there is.   Only the code that relies on the GPL code has to be GPL.  Your theme’s CSS and your images actually can be non-GPL (though WordPress won’t host you on their site if you don’t).  Also, if you have code that lives on your own server, and people use the plugin to help the app talk to that code, only the code that sits on WordPress or Drupal has to be GPL.  Your server’s code?  No problem, it can be as proprietary as you want!  Akismet, a product made by Automattic (who ‘makes’ WordPress, in a really broad interpretation) works like this.  So does Google Analytics (most certainly not owned by WordPress), and there are many plugins to integrate WordPress and Google.  This is generally done via APIs (aka application programming interfaces), which are totally kosher to be as proprietary as you want.
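
    To make the split concrete, here’s a rough sketch of that Akismet-style pattern. The endpoint, option name, and function name are all made up, but wp_remote_post() and friends are real WordPress functions. This file would be GPL; whatever runs at the other end of that URL can stay as closed as its owner likes.

    ```php
    <?php
    // GPL plugin code: asks a proprietary remote service whether a comment
    // is spam. Only this side has to be GPL; the server code does not.
    // The endpoint, option name, and 'spam' response are illustrative.
    function myplugin_is_spam( $comment_text ) {
        $response = wp_remote_post( 'https://api.example.com/check', array(
            'body' => array(
                'key'     => get_option( 'myplugin_api_key' ),
                'comment' => $comment_text,
            ),
        ) );

        if ( is_wp_error( $response ) ) {
            return false; // Service unreachable? Fail open.
        }

        return 'spam' === wp_remote_retrieve_body( $response );
    }
    ```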

    Themes, modules and plugins are GPL where they need to be, and proprietary (if you want) where they don’t.

    So what is GPL protecting?

    As we often carol, the GPL is about freedom.  And “free software” is a matter of liberty, not price. To understand the concept, you should think of “free” as in “free speech,” not as in “free beer.”  Freedom is a tetchy subject, misunderstood by most of us.  For example, freedom of speech does not mean you get to run around saying what you want wherever you want.  Free software is a matter of the users’ freedom to run, copy, distribute, study, change and improve the software.  This is pretty much the opposite of what you’re used to with iOS, Microsoft and Adobe.  Free software may still cost you, but once you buy the software, you can do what you want with it.  Your freedom, as a user, is protected.

    WordPress’s adherence to GPL is for the user, not the developer.

    What’s so free about this anyway?

    The term ‘free’ is just a bad one to use in general. Remember, freedom of speech, as it’s so often used in inaccurate Internet debates, does not mean you can say whatever you want. ‘Free speech’ means ‘You have the right to say what you want, but I have the right to kick you out of MY house if I don’t like it.’ So what are these GPL freedoms anyway? In the GPL license you have the four freedoms: (1) to run the software, (2) to have the source code, (3) to distribute the software, (4) to distribute your modifications to the software. Really they should be ‘rights’ and not ‘freedoms’ if you want to nit-pick, and I tend to think of the freedom of source code as similar to data freedom. The freedoms of open-whatever are for the people who use the whatever, not those who come up with it.

    Software freedoms are for the user to become the developer.

    So if GPL is for the users, what protects the developer?

    Not much, and this is where people get pissed off.  If anyone can buy my software and give it away for free (or pay), why would I even consider releasing something GPL?  The question, as Otto puts it, really should be ‘What exactly are you selling in the first place?’ What are we selling when we sell software?  I work on software for a living, and I never sell my code.  I’m hired to write it, certainly, and I do (not as often as I’d like).  Most of what I do is design.  It’s part math, and part art.  My contract doesn’t allow me to keep ownership of my art, which sucks, but if I was a painter, I’d sell the painting and lose the ownership anyway, so what’s the difference?  That painting can get sold and resold millions of times for billions of dollars.  And most artists die starving.

    Software Freedom doesn’t stop people from being dicks (though it should).

    So what good is the GPL to the developer trying to make a buck?  

    It’s not.  But that’s not the point.  GPL isn’t about the guy who wrote the code, it’s about the guy who gets the code (again, legally) and says “You know, this is great, but it should make milkshakes too!” and writes that. GPL is all about the guy who uses the code and the next guy who takes the code and improves on it. If you have an open community where everyone has the privilege and right to use, view, share and edit the code, then you have the ability to let your code grow organically. If you want to watch some staid, tie-wearing, Dilbert PHB lose his mind, try and explain the shenanigans of Open Source development. “Develop at the pace of ingenuity” versus “Develop at the pace of your whining users.”

    Software Freedom isn’t about making money, it’s about making the next thing.

    Why would I want to use GPL?

    If you use WordPress, you use it because you have to. I prefer the Apache licenses, myself, but the purpose of using any software freedom license is, at its Communist best, a way to make software all around the world better for everyone. You stop people from reinventing the wheel if you show them how to make the axle in the first place! Did you know that Ford and Toyota independently came up with a way to make your brakes charge your hybrid battery? They later opened up and shared their tech with each other, only to find out how similar they already were! Just imagine how much faster we could have had new technology if they’d collaborated earlier on! With an open-source/free license, my code is there for anyone to say “You know, this would work better…” And they have! And I’ve made my code better thanks to them.

    I use ‘free software’ open source licensing on my software to make my software better.

  • Limitations on Sortable Columns

    As of WordPress 3.1 you can add new columns to admin pages and sort them. You could always add them, but being able to sort the columns is new! This was very much welcomed by pretty much everyone who makes extra columns, for whatever reason. As someone who likes them (they make sorting so much easier), I pinged out there to ask how it was done.

    My Twitter friends bailed me out and found these very helpful links:

    These both told you how to add columns to the posts table. But what I wanted to do was add a column to the USERS table. When you see what I wanted to add, I’m sure it’ll be obvious:

    1. Show the DATE someone registered in the users menu
    2. Allow the column to be sortable

    Adding in the date was really painless, but I could not, for the life of me, get it to sort! Small problem. Again, I appealed to Twitter, and bless his little black heart, Otto bailed me out and explained to me why that one could work, but others would not.

    What I learned

    Unless the data in your column is stored in the database, you cannot sort by it. End of story. No further discussion needs to happen. This makes sense, as you cannot sort by dynamically generated content.

    Unlike the wp_posts table, you cannot sort by user-generated headers in the users table because of how it draws from the database. Your plugin would have to write to the wp_users table (making new columns instead of using wp_usermeta, which is preferred), and even then, the sortables are hard coded. That one broke my head a little. The sortables in the posts table are pluggable (that is, you can make your own). This just isn’t the same because WordPress knows what columns are in wp_users. After all, we add our stuff to wp_usermeta. This created a circle of ‘Auuuuugh!’ for me.

    Thankfully, Otto pointed out that since ‘registered’ is listed for MultiSite, you can leverage that on SingleSite and MultiSite (and also explained why my tweaks worked on MultiSite!).
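
    Put together, the whole dance looks roughly like this. The filter names are real WordPress hooks; the rest is a sketch, and the ‘registered’ orderby is exactly the MultiSite-listed value Otto pointed me at:

    ```php
    <?php
    // Sketch: add a "Registered" column to the Users screen and make it
    // sortable. This works because user_registered is a real column in
    // the wp_users table, not something generated on the fly.

    // 1. Add the column header.
    function myplugin_users_columns( $columns ) {
        $columns['registered'] = 'Registered';
        return $columns;
    }
    add_filter( 'manage_users_columns', 'myplugin_users_columns' );

    // 2. Fill in each row with the stored registration date.
    function myplugin_users_column_value( $value, $column_name, $user_id ) {
        if ( 'registered' === $column_name ) {
            $user  = get_userdata( $user_id );
            $value = $user->user_registered;
        }
        return $value;
    }
    add_filter( 'manage_users_custom_column', 'myplugin_users_column_value', 10, 3 );

    // 3. Map the column to the 'registered' orderby value so clicking
    //    the header actually sorts the query.
    function myplugin_users_sortable( $columns ) {
        $columns['registered'] = 'registered';
        return $columns;
    }
    add_filter( 'manage_users_sortable_columns', 'myplugin_users_sortable' );
    ```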

    In the end, I was able to take all that and tweak my “Recently Registered” plugin for WordPress 3.1, and made it look hella cool. If you’re using WordPress MultiSite, you don’t need this at all.

  • WordPress MultiSite – New Dashboards

    Back in the WordPress MU and the recent WordPress Multisite 3.0.x days, we had something called a ‘Dashboard Blog.’ This was the ‘main’ site of your install, and ostensibly was the default blog to sign users up to and control them from. This was also where you, the admin, had the Super Admin menu. So what were those things for and why were they moved? After all, a lot of people will tell you they worked just fine.

    The simplest answer is that it’s considered good design to separate the ‘user’ interface from the ‘admin’ interface. That’s why, when a regular user with the lowest role possible logs in to a regular (non-MultiSite) WordPress install, they see a very limited site. They see a dashboard, their profile, and that’s it. You want to keep the subscribers out of your meat and potatoes. Pursuant to that, there are plugins like WP Hide Dashboard that kick users to just their profile. I love that plugin, because it hides the man behind the curtain. If the Dashboard of WordPress is not a part of your desired experience (and really, it only is for the people who run the site), then you keep Dorothy, Toto, the Scarecrow, the Tin Man and the Cowardly Lion out, Ruby Slippers or not.

    When WordPress 3.0 came out, it was a bit of a chimera. We’ve got all sorts of weird parts where we call things blogs instead of sites, and from the back end, it’s really confusing. The sad thing is we cannot declare it fixed by fiat and move on, because that would break backwards compatibility. Did you know WordPress is backwards compatible, nearly all the way to the start of WordPress 1? (17 Reasons WordPress is a Better CMS than Drupal – Mike Schinkel, Dec 1st, 2010) In order to be able to upgrade from WordPress MU (which was a fork – i.e. a totally separate version – of WordPress), the fold-in of MU to regular WordPress was a lot of work and duplication. There are some things I’m sure the devs would have chosen to do differently in a perfect world, but they decided the headache for them was worth it because it was beneficial to the users. For that alone, I laud them and owe them beers and coffee.

    One of the many drawbacks of that mentality is the users are very much used to getting what they ‘want.’ The users think ‘This worked before, it will always work, therefore, it’s cool to do it now.’ Take (not for random example) the issue with the /blog/ folder in the main site of any subfolder install. (Switching to WordPress MultiSite Breaks Links – Mika Epstein, 14 July, 2010) Back in the 3.0 days, we had a work-around to fix this, but that was a ‘bug.’ We were all taking advantage of a flaw in the system, and that flaw was plugged (mostly) in 3.1. Of course, fixing the flaw meant breaking things, and those people who were not up to speed on the dev channels (which in this instance included me) went ‘Hey, what the hell!?’ We were angry, we were upset, and then Ron told me that it was a bug and I stepped down.

    A lot of people are still annoyed by this, and while there is still a buggy workaround, it’s not something I would generally suggest be used for my clients (myself, yes). Then again, the original tweak wasn’t something I considered using for clients, since I was always aware that WordPress’s stated intent was to make that /blog/ slug customizable. And I hope they do.

    What does this have to do with the new dashboards? It’s another change WordPress implemented to ‘fix’ things people didn’t see as broken. The people are wrong.

    Now don’t get all het up, thinking I’m drinking the WordPress Kool-Aid. There’s a vast difference between being ‘WordPress is always right, WordPress can do no wrong’ and the acceptance that what WordPress did was for a good, understandable, reason. In software development, I’ve learned to distance myself from the all too personal feelings of investment in my product. Many times, the product needs to be designed in a certain way to work better for the majority of people, and many times, I am not that person. Look at JetPack. This is a fantastic plugin for people moving off WordPress.com and onto self-hosted WordPress. It has absolutely no meaning to me, and I won’t be using it. But it’s great for the target audience. I accept that I am not that audience, and I look at the product with as unbiased an eye as is possible.

    I have to look at the Network Admin and User Dashboard the same way.

    The Network Admin was moved from a Super-Admin sidebar menu to its own section, in order to provide a clearer delineation between the Site Admin (in charge of one site) and the Network Admin (in charge of all sites). (Network Admin – Trac Ticket) (Network Admin – WordPress MustUse Tutorials, October 21, 2010) This is a basic, normal, every-day bit of separation in my everyday life. For one app I use, I even have a totally separate ‘Admin App’ to use when I want to control the whole network, versus just one part of it. It’s done for security, but also to kick our brains over and go ‘Hey, moron, you’re in the Network Admin section!’ Our brains need that kick, and it lessens the human errors. In doing this, we also found the plugin management was separate. Per-site admins saw the non network-activated plugins only. The Network Admin had to go to the Network Admin section to see the network-activated plugins and the must-use plugins, though many plugins needed to be recoded to handle this move. (Adding a menu to the new network admin – WordPress Must Use Tutorials, November 30, 2010) While this is annoying and takes a little time to get used to, this is good, sound UI/UX. It’s called “Separation of Duties” in the buzzwords game, and it’s really a blessing.

    Once they moved the Network Admin, the devs took a shot at getting rid of the Dashboard Blog. (Personal Dashboard – trac ticket) Once you moved the super users off to their own network, there’s no need to sign up users to a main blog. I assume this was originally done because you had to hook them in somewhere with 3.0, to make them be a ‘user.’ Well, WordPress.org Multisite now behaves like WordPress.com. You sign up for a blog, but unless you get assigned a role on the blog, you’re not a ‘member’ of the blog. And you know… that’s sensible. You have no real role as a pseudo-subscriber. Nor do you need one.

    As I pointed out, part of the goal with moving the menus to Network Admin is that the whole ‘Dashboard Blog’ concept was a massive annoyance to everyone code-wise and UI wise. Having to say “Oh yeah, the main site is the master site and it’s where I control the universe” is logistically unsound. Much like you cannot in-line edit posts, you should not be mixing up Admin and User areas. So to further that separation, your users are not assigned to any site when they register. I find I need to repeat, a lot, that in most cases, this has no effect on usability. It doesn’t affect my BuddyPress site at all, because the users are the users. They just don’t have blog access. They can comment, which is all they need to do for me, and they’re happy. If they need to make posts, I can add them if I want to. But now I have security, knowing they can’t accidentally get in and poke around.

    Like it or not, it’s not going away. And most of us won’t need it to come back. I do know that some people do need it, and are struggling to find a way to auto-assign users a role on their main site at ID creation, so if you know of a fix for 3.1, please share it!

  • Google vs Splogs – Part 2

    Now that you know all about the myth of the duplicate content penalty, we can look into spam.

    This year, Google got slammed because the quality of their search was being degraded by spammers. Mostly splogs, I will admit, but Google rightly points out that their ability to filter out spam and splogs in all languages is actually much better than it was five years ago. (Google search and search engine spam – 1/21/2011 09:00:00 AM) No, Google isn’t getting worse, there are just more spammers out there. They also take the time to differentiate between “pure webspam” and “content farms.”

    “Pure webspam” is what you see in a search result when a website uses meta data or hidden content in order to bully their way into being highly ranked in unrelated searches, or just basically game the system. A decade ago, this was horrific. Now it’s nearly negligible. This type of spam grew pretty organically out of people trying to understand the algorithm behind search engines and manipulate it legally. As we gained greater understanding of meta keywords and in-context content, we came up with more and more tricks to legitimately make our sites more popular. There was a point in time when stuffing hidden text with as many keywords as you could relate to your site was not only commonplace, but lauded. It didn’t last long, as shortly after the good guys sorted that out, the bad guys did too.

    “Content farms” are the wave of the future, and Google calls them sites with “shallow or low-quality content.” The definition is vague, and basically means a content farm is a website that trawls the internet, takes good data from other sites, and reproduces it on its own. Most content farms insert that data automatically. There is no man behind the scenes manually scanning the internet for related topics and copy/pasting them into their site. Instead, this is all done via software known as content scrapers. The reasons why they do this I’ll get to in a minute, but I think that Google’s statement that they’re going to spend 2011 burning down the content farms is what’s got people worried about duplicate content again.

    A content farm is (partly) defined as a website that exists by duplicating content. Your site’s activity feed/archives/post tags pages are duplicating content for the users. Does that mean your site will be adversely affected because of this?

    No. It will not.

    Google’s algorithm is targeting sites of low content quality. While your stolen post is a beautifully written piece of art on its own, it’s the site as a whole that is used to generate a search ranking. As I’ve been touting for a decade, the trick to getting your site promoted in Google searches is to make a good site. Presuming you made a good site, with good content, and good traffic, and it’s updated regularly, there is very little risk that Google will peg your site as being of “low content quality.” Keep that phrase in mind and remember it well. Your site isn’t highly ranked because of low content, remember! It’s the reverse. If you’re being ranked for good behavior, good content, and good work, you will continue to be rewarded. In a weird way, content farms are actually helping Google refine their search so that it can tell the difference between good sites and bad! (Why The Web Needs Content Farms – by Eric Ward on February 16, 2011)

    The next Google algorithm update will focus on cleaning content farms from positions of unfair advantage in our index. This will likely affect websites with considerable content copied from other online sources. Once this update is complete, preference will be given to the originators of content. We expect this to be in effect in no less than 60 days. (Google search and search engine spam – 1/21/2011 09:00:00 AM)

    What Google is doing is not only laudable, but necessary. They are adapting to the change in how spam is delivered, and doing so in a way that should not impact your site. The only way I can see this affecting ‘innocent’ sites is those blogs which use RSS feed scrapers to populate their sites. This is why anytime someone asks me how to do that, I either tell them don’t or I don’t answer at all. While I certainly use other news articles to populate my site, I do so by quoting them and crafting my own, individual, posts. In that manner I both express my own creativity and promote the high quality of my own site. I make my site better. And that is the only way to get your site well-ranked. Yes, it is work, and yes, it is time consuming. Anything worth doing is going to take you time, and the sooner you accept that, the happier you will be.

    For most small to medium sites, there’s not a thing you need to do in order to maintain your ranking. There are no magic bullets or secrets behind the SEO, to manipulate your site to a better ranking. In point of fact, doing so can be seen as gaming the system and can downgrade your results! Once again. Make a good site and you will be rewarded. Certainly, as I said yesterday, optimizing your robots.txt file and getting a good sitemap will help, and I really do suggest a Google Webmaster Tools account to help you with that. In 2011, Google is still king, so once you get your site well listed within Google’s machine, you’re pretty much going to be tops everywhere.
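
    And if you’re wondering what “optimizing your robots.txt file” even looks like, it’s this small. A sketch (the sitemap URL is an example; use your own):

    ```text
    # robots.txt: keep crawlers out of the admin, point them at the sitemap.
    User-agent: *
    Disallow: /wp-admin/
    Sitemap: https://example.com/sitemap.xml
    ```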

    Why do splogs and content farms game the system in order to get highly ranked? Profit. Some do it to get their domain highly ranked and then sell it for a lot of money, others do it to infect your computer with a virus, and then there’s the rare hero who thinks this will get them money because of the ads on their site. Sadly, this still works just enough to generate a profit that keeps the splogs going. This is also true of spam emails. Yes, that means your grandmother and Carla Tallucci are still falling for the Nigerian Princess scam emails. The only way to stop all of that is to stop those methods from being productive money makers for the spammers, and that is something that will take us all a very long time and a great deal of education of the masses.

    Your takeaways are pretty simple: Make a good site with good content. Update it regularly. Use a sitemap to teach search engines what’s important. You’ll be fine. Don’t sweat internal duplication.