Google’s Blog Search is Irrelevant

Google's blog search screws up SEO for everyone.

Google is a great search tool to find a website or general information about a topic, but quite frankly I’ve come to despise their blog search engine and I’m seeing serious flaws in their ranking app. Specifically, they now search blog links (aka the blogroll), so when you search blogs about a topic, you get unrelated posts.

If you search for Laurence Fishburne because you saw him on an episode of M*A*S*H recently as a soldier with a racist CO, Google gives you two hits for IMDb, one for Wikipedia, one about news (GoogleNews that is), and then, finally, his official website. While Google claims they don’t adjust ratings (that is, they don’t give more or less weight to a website on their own) and allow their PageRank algorithm to sort all this out, it seems to me that any official website should be ranked first. Also, IMDb shouldn’t be listed twice. But that depends on what people are looking for and what Google offers.

We stand alone in our focus on developing the “perfect search engine,” defined by co-founder Larry Page as something that, “understands exactly what you mean and gives you back exactly what you want.”

With that in mind, as I look at their tech overview for people who aren’t super geeky, I think that they come to the process a little flawed. PageRank is a great idea, don’t get me wrong. The more pages that link to a site, the higher the site is ranked (in essence). Okay, that’s great! Until you have those damn splogs. You know the ones. Spam blogs that promise you information about a person/place/thing, but are nothing more than a ton of links and 100 popups.
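For the curious, here is a minimal sketch of that "more links in, higher rank" idea, using the textbook power-iteration version of PageRank. This is not Google's actual code: the function name, the damping value of 0.85, and the toy set of pages are all my own assumptions for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Textbook PageRank by power iteration.

    links: dict mapping a page name to the list of pages it links to.
    Returns a dict mapping every page to a score; scores sum to 1.
    """
    # Collect every page that appears as a source or as a link target.
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    n = len(pages)

    # Start with rank spread evenly across all pages.
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page gets a small baseline share each round.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its rank evenly among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its rank over everyone.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical toy web: everyone and their mother links to IMDb.
toy_web = {
    "fan-blog": ["official-site", "imdb"],
    "news-post": ["imdb", "wikipedia"],
    "splog": ["imdb"],
    "wikipedia": ["imdb", "official-site"],
    "imdb": [],
    "official-site": [],
}

ranks = pagerank(toy_web)
for page, score in sorted(ranks.items(), key=lambda item: item[1], reverse=True):
    print(f"{page}: {score:.3f}")
```

In this toy web IMDb collects more inbound links than the official site, so it ends up with the higher score. Notice that the splog's link counts just as much as anyone else's.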

Why would I search blogs? Easy, a lot of news sites are using blogs these days, and I want to read those too. It’s not rocket surgery, it’s how news is disseminated in 2009, folks. And I, personally, like to search by ‘date’ because I want to know what’s newest.

Our search engine also analyzes page content. However, instead of simply scanning for page-based text (which can be manipulated by site publishers through meta-tags), our technology analyzes the full content of a page and factors in fonts, subdivisions and the precise location of each word. We also analyze the content of neighboring web pages to ensure the results returned are the most relevant to a user’s query.

This looks like it should take care of spam blogs, but if you’ve ever done a search on blogs about someone (let’s use Mr. Fishburne again), you know it’s a crap-shoot.

A news search is actually pretty helpful. I get some articles of interest right up front. If I flip the bit and sort by date it’s still pretty useful. When I go to blog search (which is a sidebar link off news), it’s still mostly beneficial.

But I dare you, I dare you, to make sense of the articles when you click sort by date. Three posts on that first page might actually be something worth reading. Good luck finding them, and I hope they actually are what you want. But at the end of the day, those spam blogs aren’t the problem that makes me hate the blog-search.

No, the problem, as I see it, is posts like this:
[Image: splog-1]

That bit I circled for you means that the ‘label’ (tag, category, whatever) for ‘Laurence Fishburne’ has been used 4 times. Go to that post and you will not find a single thing of use on the page. 99.999% of these blogs are on Blogspot and, while I don’t begrudge them their posts, they’re getting false promotion! And your post that you lovingly crafted about how totally amazing Fishburne is, and how he acted the hell out of that scene last night, is now 10th on the list, and bound for page 2 any second now.

The only official Google response I can find on the matter is a post by Jeremy Hylton in their Google forums (dated November 2008).

We expected some problems from blogroll matches, but may have underestimated the impact on searches using the link: operator or where the query matches a blog or blogger’s name. We do expect to fix the problem you’re seeing. We’ll use the full page content, but exclude the content that isn’t really part of the post. I’m not sure if we’ll be able to make the change before the end of the year, but we are working on it and are pretty confident that it can be solved. We’ll post an update here when we’ve got a solution.

And no, there is no update to that post.
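To make the fix Hylton describes concrete, here is a rough sketch of the difference between matching a query against the whole page and matching it against the post alone. It is purely hypothetical: the `BlogPage` structure, the field names, the two matcher functions, and the example URLs are all made up, and it assumes something has already worked out which part of the page is the post and which is the sidebar. I have no idea how Google actually does it.

```python
from dataclasses import dataclass


@dataclass
class BlogPage:
    # Hypothetical structure: a real crawler would first have to decide
    # which part of the HTML is the post and which is the sidebar.
    url: str
    title: str
    post_body: str
    sidebar: str = ""  # blogroll links, label counts, archives


def matches_full_page(page, query):
    """Match the query anywhere on the page, sidebar included."""
    q = query.lower()
    return any(q in text.lower() for text in (page.title, page.post_body, page.sidebar))


def matches_post_only(page, query):
    """Match the query against the title and post body, ignoring the sidebar."""
    q = query.lower()
    return any(q in text.lower() for text in (page.title, page.post_body))


pages = [
    BlogPage(
        url="fan.example/review",
        title="Last night's episode",
        post_body="Laurence Fishburne acted the hell out of that scene.",
        sidebar="Labels: TV (12)",
    ),
    BlogPage(
        url="random.example/lunch",
        title="What I had for lunch",
        post_body="A sandwich. It was fine.",
        sidebar="Labels: Laurence Fishburne (4)",
    ),
]

query = "Laurence Fishburne"
full_hits = [page.url for page in pages if matches_full_page(page, query)]
post_hits = [page.url for page in pages if matches_post_only(page, query)]

print("Whole page:", full_hits)
print("Post only: ", post_hits)
```

Matching the whole page returns both posts, lunch included, because the label in the sidebar counts as a hit. Matching the post alone returns only the one that is actually about Fishburne.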

The hoopla from other blog sites has died down, but as this is still a prevalent problem on the blog search, I would really like to see it heat up again. Google’s blog search is pretty much dead useless to me if I can’t find information I want. As finding what I want is the whole point of Google (they said it first), they’ve made themselves irrelevant.

3 Comments

    • Jeremy, how totally awesome of you to reply! Thank you. And I note that if I go to look for the last day of posts AND sort by date, your suggestion still works. For Fishburne.

      When I try it for Jorja Fox I get the same old problem.

      I’m very happy you guys are working on it 🙂 What’s up next? forumsearch.google.com?

  1. Absently – I think Wikipedia and IMDb should always be the third and fourth sites that crop up on a search. Why? Because due to their high infiltration on the net (everyone and their mother points to them), they pop up above things like ‘official’ sites. If I look up Subversion, the first link should be SVN’s webpage. Which it is. But you try it with any actor? Not so much.
