Half-Elf on Tech

Thoughts From a Professional Lesbian

Tag: search engines

  • SEO: Impossible


    For someone who thinks SEO is crap, I sure talk about it a lot. Google’s got a new toy: Disavow Links.

    In the wake of Panda, a lot of sites got hit with bad SEO rankings from having crappy backlinks. Specifically, I know many WordPress theme developers were hurt, including WPMUDev, because spammers and scammers used their themes. Basically, their own popularity bit them in the ass through no fault of their own save their success. After all, a pretty common question people have is “Do those crappy, low-quality inbound links hurt me?” And most of the time, the answer was no. Except when it did with Panda. At the time, it didn’t seem fair to anyone that your popularity would be detrimental to your SEO, and thus we have Disavow. (Amusingly enough, Bing got there first.)

    But what does it do? Here’s Matt Cutts explaining this:

    For the rest of us, it lets you say ‘These links are crap and they’re not related to me, so please don’t let them impact my search ranking.’ Many of you are looking confused here, and wondering why they impacted you in the first place. After all, it’s not your responsibility to monitor the quality of sites on the Internet, is it? That’s why Google and Bing make the big bucks. And yet we all know how terrible search results can be, and frankly Google’s blog search is horrible. I have to hand it to Google, though. Search is hard, and crowdsourcing the work of teaching a computer what is and is not spam is actually a good idea.

    Google’s (and Bing’s) methodology rubs me the wrong way. Now that Google has us doing the work for them, by picking out spammy sites and effectively reporting them, you’d think all would be well for the theme world. Alas, not so. I’ve heard rumblings that Google is now asking theme developers to remove backlinks!

    While I don’t think this will break theme developers, it will make it much harder for them to promote their work. On the plugin end of things, I’ve had people ask me to remove their plugins because we don’t permit WordPress plugins to show backlinks unless they’re opt-in, and this means the dev can’t make money. Part of the reason is that you can have hundreds of plugins, but only one active theme. The other part is we feel it looks spammy. Now, so does Google.

    But all that aside, if you want to disavow your backlinks, you can now do it, and the directions aren’t complicated. Click on the disavow link, upload a text file formatted in a certain way, reap benefits. Sounds great, right? What if I told you that Google sends you no confirmation at all? There’s no confirmation, no way to see if what you did worked or not, and worst of all, this could take weeks, if not months, for them to crawl, sort, and re-crawl your sites. During that time, you hear nothing. When it’s done, you hear nothing.
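
    For reference, that “certain way” is dead simple: a plain text file, one URL per line, a “domain:” prefix to disavow every link from an entire site, and “#” for comments. A minimal sketch (the spammy domains here are invented for illustration):

```text
# Pages I never asked to be linked from
http://spam-site.example/stolen-theme-post/
# Disavow every link from an entire domain
domain:scraper-farm.example
```

    You upload that through Webmaster Tools, and then the waiting begins.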

    You do all this work and end up in a vacuous hole of ‘well, there’s that then’ with no assurance of anything at all being done. That caught my attention in a bad way. How can I tell I’ve done the right thing? We’re already being killed by not being able to track encrypted search terms, and now we’re not going to be able to tell if removing the links from the bad people is going to help our SERP?

    This is why I think SEO is full of it. To one degree or another, it’s always been about gaming the system, and tricking search engines into letting you rise to the top. Meta tags trumped quality, and then it was links (because obviously if people link to you, you’re valuable). Now we know people game links, so we remove that, which actually doesn’t hurt as much as you think. See, a lot of your search engine ranking came from the quality of sites that linked back to you. But the most valuable sites (like Wikipedia) have stringent policies and rules about not linking, or linking and using nofollow, to prevent you from getting link-juice. In the case of Wikipedia, it makes sense since anyone can edit it.

    But…

    That just went to prove the system was broken. Blogs (WordPress included) nofollow comment links for the same reason. If the door was open, the spammers would use it and make themselves look more important. And as the tools got smarter and started making those links worthless, the spammers started scraping your quality content, which Google et al had to learn to filter. We’re at the point where links are valueless. It doesn’t matter who links to you anymore, because none of the good sites will give you a lot of value since they’re trying to get rid of the spammers. So why is Google giving any weight to these spammer links?
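
    The mechanism here is nothing fancier than an attribute on the link. A comment-author link, as WordPress renders it, carries rel="nofollow", which tells crawlers not to pass any ranking credit through the link (the URL below is invented):

```html
<!-- WordPress adds nofollow to comment-author links so spammers
     get no ranking benefit from dropping drive-by comments -->
<a href="http://example.com/" rel="external nofollow">Commenter Name</a>
```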

    If the state of link-relativity is so poor that search engines are asking us to remove backlinks from themes, and also to tell them which links to us are worthless, then all links are more trouble than they’re worth and we need to figure out a better way to measure the usefulness of our sites. What measuring sticks do you use?

  • Google’s Blog Search is Irrelevant

    Google is a great search tool to find a website or general information about a topic, but quite frankly I’ve come to despise their blog search engine and I’m seeing serious flaws in how it ranks results. Specifically, they now index blog links (aka the blogroll), so when you search blogs about a topic, you get unrelated posts.

    If you search for Laurence Fishburne because you saw him on an episode of M*A*S*H recently as a soldier with a racist CO, Google gives you two hits for IMDb, one for Wikipedia, one about news (GoogleNews that is), and then, finally, his official website. While Google claims they don’t adjust rankings (that is, they don’t give more or less weight to a website on their own) and allow their PageRank algorithm to sort all this out, it seems to me that any official website should be ranked first. Also, IMDb shouldn’t be listed twice. But that depends on what people are looking for and what Google offers.

    We stand alone in our focus on developing the “perfect search engine,” defined by co-founder Larry Page as something that, “understands exactly what you mean and gives you back exactly what you want.”

    With that in mind, as I look at their tech overview for people who aren’t super geeky, I think that they come to the process a little flawed. PageRank is a great idea, don’t get me wrong. The more pages that link to a site, the higher the site is ranked (in essence). Okay, that’s great! Until you have those damn splogs. You know the ones. Spam blogs that promise you information about a person/place/thing, but are nothing more than a ton of links and 100 popups.
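
    The core idea is easy to sketch, even if Google’s real version is far more elaborate. Here’s a toy PageRank power iteration in Python, with a made-up three-site graph; the “links confer rank” principle is the whole game:

```python
# Toy PageRank power iteration. The graph and the site names are
# invented; 0.85 is the damping factor from the original algorithm.

def pagerank(links, damping=0.85, iterations=50):
    """links: dict mapping each page to the list of pages it links to."""
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Every page starts with the "random jump" share of rank.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page, outlinks in links.items():
            if not outlinks:
                # Dangling page: spread its rank evenly over everyone.
                for p in pages:
                    new_rank[p] += damping * rank[page] / n
            else:
                # Split this page's rank evenly among its outlinks.
                share = damping * rank[page] / len(outlinks)
                for target in outlinks:
                    new_rank[target] += share
        rank = new_rank
    return rank

graph = {
    "official-site": [],
    "fan-blog": ["official-site"],
    "splog": ["official-site", "fan-blog"],
}
ranks = pagerank(graph)
# The page everyone links to ends up ranked highest, which is
# exactly why splog link farms are worth building in the first place.
```

    And that last comment is the rub: the moment rank flows through links, manufacturing links manufactures rank.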

    Why would I search blogs? Easy, a lot of news sites are using blogs these days, and I want to read those too. It’s not rocket surgery, it’s how news is disseminated in 2009, folks. And I, personally, like to search by ‘date’ because I want to know what’s newest.

    Our search engine also analyzes page content. However, instead of simply scanning for page-based text (which can be manipulated by site publishers through meta-tags), our technology analyzes the full content of a page and factors in fonts, subdivisions and the precise location of each word. We also analyze the content of neighboring web pages to ensure the results returned are the most relevant to a user’s query.

    This looks like it should take care of spam blogs, but if you’ve ever done a search on blogs about someone (let’s use Mr. Fishburne again), you know it’s a crap-shoot.

    A news search is actually pretty helpful. I get some articles of interest right up front. If I flip the bit and sort by date it’s still pretty useful. When I go to blog search (which is a sidebar link off news), it’s still mostly beneficial.

    But I dare you, I dare you, to make sense of the articles when you click sort by date. Three posts on that first page might actually be something worth reading. Good luck finding them, and I hope they actually are what you want. But at the end of the day, those spam blogs aren’t the problem that makes me hate the blog-search.

    No, the problem, as I see it, is posts like this:
    [Screenshot of a splog post in Google’s blog search, with the label count circled]

    That bit I circled for you means that the ‘label’ (tag, category, whatever) for ‘Laurence Fishburne’ has been used 4 times. Go to that post and you will not find a single thing on the page of use. 99.999% of these blogs are on Blogspot and, while I don’t begrudge them their posts, they’re getting false promotion! And your post that you lovingly crafted about how totally amazing Fishburne is, and how he acted the hell out of that scene last night, is now 10th on the list, and bound for page 2 any second now.

    The only official Google response I can find on the matter is a post by Jeremy Hylton in the Google forums (dated November 2008).

    We expected some problems from blogroll matches, but may have underestimated the impact on searches using the link: operator or where the query matches a blog or blogger’s name. We do expect to fix the problem you’re seeing. We’ll use the full page content, but exclude the content that isn’t really part of the post. I’m not sure if we’ll be able to make the change before the end of the year, but we are working on it and are pretty confident that it can be solved. We’ll post an update here when we’ve got a solution.

    And no, there is no update to that post.

    The hoopla from other blog sites has died down, but as this is still a prevalent problem on the blog search, I would really like to see it heat up again. Google’s blog search is pretty much dead useless to me if I can’t find information I want. As finding what I want is the whole point of Google (they said it first), they’ve made themselves irrelevant.