Posts Tagged ‘seo’

People often say that I’m “the Web’s most famous blind user.” Well, let’s get this straight. I am not blind. Nor do I give a rat’s ass about your content.

My job is to collect content. And I can see. I see in code.

So let’s talk about a common practice with alternate content that irks me to no end. Even though I don’t care about your cat blog, I do read it. And you people do some pretty skeazy stuff. Sometimes it’s ignorant, and sometimes it’s not.

I read in the SEO forums that people puff their chests out proudly when they say that SEO helps with accessibility. But then they’ll go and pull crap like this:

<img src="/images/killer-bats.jpg" alt="Killer bats" />

How is that helping someone who can’t see a killer bat? Great, a picture of bats on the page about bats. Thanks for the information, douche.

What were the killer bats doing? Are they fluffy? How does it relate to the content? What job was the image doing on the page? Those are the questions that you should be answering in the alt attribute. Not, “What is it?” For those of us who can’t see, “what is it?” is about as helpful as your grandmother’s prophylactic stash.

Want proof that this is called good practice? Read Perfecting Keyword Targeting & On-Page Optimization by Rand Fishkin. Can you say “over-optimized?” I know what my cohorts do to content like this. We call it No Rank Town. Evidently management isn’t doing this guy so well.

And just because you have an image and the standards require an alt attribute, that doesn’t mean that you need to put something in the alt attribute. Again, how is this helping anyone out?

<img src="/images/spacer.gif" alt="spacer" />

C’mon, just do this so I can move along to actual content:

<img src="/images/spacer.gif" alt="" />
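If you'd rather catch this stuff before I have to read it, here is a minimal sketch of an alt-text audit in Python. It only flags the two sins above: an image with no alt attribute at all, and alt text that just restates the file name. The names `AltAudit` and `normalize` are made up for this example, and this is an assumption about what a useful check looks like, not a description of how I work.

```python
import re
from html.parser import HTMLParser
from pathlib import PurePosixPath


def normalize(text):
    """Lowercase and collapse punctuation so 'Killer bats' matches 'killer-bats'."""
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()


class AltAudit(HTMLParser):
    """Collect <img> tags whose alt is missing or merely restates the file name."""

    def __init__(self):
        super().__init__()
        self.findings = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attr = dict(attrs)
        src = attr.get("src") or ""
        alt = attr.get("alt")
        if alt is None:
            # No alt attribute (or one without a value).
            self.findings.append((src, "missing alt attribute"))
            return
        if alt.strip() == "":
            return  # Empty alt is fine for decorative images such as spacers.
        if normalize(alt) == normalize(PurePosixPath(src).stem):
            self.findings.append((src, "alt just restates the file name"))


page = (
    '<img src="/images/killer-bats.jpg" alt="Killer bats" />'
    '<img src="/images/spacer.gif" alt="spacer" />'
    '<img src="/images/spacer.gif" alt="" />'
)
audit = AltAudit()
audit.feed(page)
for src, problem in audit.findings:
    print(src, "->", problem)
```

It can't tell you whether your alt text answers the questions that matter. It only tells you when you obviously didn't try.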

All in all, you guys suck at HTML. The next time you proudly call yourselves an “expert in accessibility” just because you know what an alt attribute is, you’ll know deep down how full of it you are. I hope that you whimper just a little bit at the end of your bragging. It may not be noticeable by others, but you’ll be aware, and that’s all that matters.


A long time ago, in a galaxy far, far away, I was trained as a research scientist. Statistics, research methodology, hours in the library… all the things chicks dig, right? Although I’ve twice worked in the hallowed halls of academia, I’ve always leaned toward the applied side of science. I can theorize and hypothesize with the best of them, but I like my experiments conducted in the real world and I like to see results that make a difference.

Page Hunt: A Bing Game

I also appreciate it when I see clever research. And one of the cleverest things I’ve seen recently is the Bing game Page Hunt. Players are shown a series of web pages and asked to guess a keyword or phrase that, when entered into the Bing search engine, makes the page appear in the top 5 search results. Players get 100 points when the page lands in the #1 result slot, 90 points for #2, and so on. If the keyword(s) do not produce a top 5 result, you are given another chance. The game is timed (an interesting 2 minutes and 58 seconds) and there are a few interesting twists, like bonuses for avoiding common terms. It’s an easy game and it’s already produced some interesting results. Why not play a round or two yourself?
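The scoring is simple enough to sketch in a few lines of Python. Only the 100 and 90 point values come from the game itself. I am assuming that “and so on” means another 10 points off per slot, and that a guess outside the top 5 scores nothing. The function name `page_hunt_points` is mine, not Microsoft’s, and bonuses are left out.

```python
def page_hunt_points(rank):
    """Points for a guess whose target page lands at `rank` (1-based).

    Assumes a 10-point drop per result slot; `rank` is None when the page
    did not show up in the results at all.
    """
    if rank is None or not 1 <= rank <= 5:
        return 0  # Outside the top 5: no points, the player tries again.
    return 100 - 10 * (rank - 1)


for rank in (1, 2, 3, 4, 5, 6):
    print(rank, page_hunt_points(rank))
```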

According to Ars Technica, Page Hunt was developed by Microsoft Research Interns Chris Quirk and Raman Chandrasekar along with Hao Ma and Abhishek Gupta from The Chinese University of Hong Kong and Georgia Institute of Technology respectively. Before they unleashed the game on the great unwashed of the web, they piloted it internally. They found that the length of the page URL was negatively correlated with the player’s ability to achieve a top 5 result. In other words, the shorter the URL, the easier it was to win.

Findability as a function of URL length from Ars Technica

While this is an interesting result, I seriously doubt that it is the real gold mine to be found in this research game. One of the things that has stuck with me from my graduate studies is the influential work of Karl Popper. Popper’s theories are a little complex at times, but in a nutshell, he believed that an experimental result that contradicted a hypothesis was vastly more valuable than a result that confirmed it. In other words, a single disconfirmation is worth a thousand confirmations.

What does this have to do with the price of tea in China (or Hong Kong for that matter)? Because I believe that the Page Hunt game’s core value is in using it for the results that players get wrong rather than the things they get right.

What is the value in having players guess a result that already appears in the top 5 results? Are you just confirming what is already working? Sure, it helps show that Bing is aligning well with player expectations. But again, these are all just confirmatory results. What will be really interesting to mine are the clustered patterns of player guesses that DO NOT produce a top 5 result. What is it about a particular page that makes a player suggest a keyword they think best describes the page yet does not produce a top 5 result? That’s the real value here.
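Here is a rough sketch in Python of the kind of mining I have in mind: throw away the guesses that worked and look for keywords that several players independently tried for the same page. The record layout (page URL, keyword, rank), the `min_count` threshold, and the sample guesses are all invented for illustration. I have no idea how Microsoft actually stores Page Hunt results.

```python
from collections import Counter, defaultdict


def clustered_misses(guesses, min_count=3):
    """Group failed guesses by page and keep keywords tried at least `min_count` times.

    guesses: iterable of (page_url, keyword, rank); rank is None when the
    page did not appear in the results at all.
    """
    misses = defaultdict(Counter)
    for page_url, keyword, rank in guesses:
        if rank is None or rank > 5:
            misses[page_url][keyword.strip().lower()] += 1

    clusters = {}
    for page_url, counts in misses.items():
        repeated = [(kw, n) for kw, n in counts.most_common() if n >= min_count]
        if repeated:
            clusters[page_url] = repeated
    return clusters


sample = [
    ("example.com/cats", "cat blog", 2),              # a hit, ignored
    ("example.com/cats", "funny cat photos", None),
    ("example.com/cats", "Funny Cat Photos", 14),
    ("example.com/cats", "funny cat photos ", None),
    ("example.com/cats", "pets", None),               # a one-off miss, ignored
]
print(clustered_misses(sample))
```

The one-off misses are noise. The repeated ones are where players and the ranking disagree about what a page is about.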

If you have ever effectively managed an internal search engine for a web site, this process is probably intuitively obvious to you. One of the best uses of internal search data is to look for keywords that users have entered that produce zero results (also referred to as null sets). Of course, this is typically only interesting if the web site in question actually has content that would be an appropriate “answer” for the null search query. The difference here is that the web site owner can typically “fix” this problem pretty directly. The search engine itself often has tools to allow the manual “promotion” of a page based on a specific query. But a search engine that indexes the entire web doesn’t have this luxury.
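For an internal search engine, the null-set report is about as simple as analysis gets. A minimal Python sketch, assuming your search log can be boiled down to (query, result count) pairs; the function name `null_set_queries` and the sample log are hypothetical:

```python
from collections import Counter


def null_set_queries(search_log, top_n=10):
    """Return the most frequent queries that produced zero results.

    search_log: iterable of (query, result_count) pairs.
    """
    zero_hits = Counter(
        query.strip().lower()
        for query, result_count in search_log
        if result_count == 0
    )
    return zero_hits.most_common(top_n)


log = [
    ("return policy", 0),
    ("Return Policy ", 0),
    ("contact", 12),
    ("gift cards", 0),
    ("return policy", 0),
    ("gift cards", 0),
    ("store hours", 3),
]
for query, count in null_set_queries(log):
    print(count, query)
```

Whether a query on that list is worth fixing still takes a human who knows if the site has a page that answers it.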

The Page Hunt game provides a crowd-sourced solution to this problem. Analyzing the keywords generated by players that do not produce a top 5 result could provide fertile ground for improving the Bing algorithm. It won’t be easy, and it would certainly require Page Hunt/Bing to collect more page data than what is shown to players. It also struck me that a similar game would be a useful tool for assessing the skills of SEO practitioners. If you allowed the person being tested to evaluate the page source, I believe that any SEO’er worth his or her salt should be able to deduce a top 5 search keyphrase at least 80% of the time. It could be a more objective way to evaluate skill sets. Of course, the test itself would have to be normed first and it would require a large set of sites to avoid cheating. But it is still an interesting idea.

But I digress….

Using crowd-sourcing does carry some inherent risks. A motivated community could “game the game.” For example, let’s say the 4chan community decided to target the game and collectively distort the results for the pages they are shown. They could all input the same irrelevant keyword for the same page. So, they could give the answer “Cleveland Steamer” for the page of a political candidate, for example. If enough of them provided the same keyword for the same page, it could carry some weight.

Of course, Microsoft isn’t likely to just automatically accept the results of the Page Hunt game. It’s just a research tool. It’s not a magic bullet answer to improving Bing’s algorithm. Human review is still a big part of the search engine industry. Remember the “miserable failure” Google bomb from a few years back? Ultimately, Google had to hand edit its results to eliminate the results of this prank. So I doubt there is any chance that we’ll see any “Cleveland Steamering” of Bing any time soon. But it would be amusing.

The only thing I can think of to improve the Page Hunt game is to increase the incentives to play. Right now, you can really only compete against yourself. A simple upgrade would be a community high scores page that encouraged players to compete against each other. Never underestimate the human ego. Of course, they could up the ante with just a tiny prize incentive, like a t-shirt for high scorers. You wouldn’t believe the weird things people will do to win a cruddy little prize. Or maybe you would.

Penny Arcade


July 21, 2009

Googlebot Sad :(

I visit millions of sites a day like the freakin’ Santa Claus of the Web. I snoop around and make copies of everything that I run across. And I see tons of horrible shit.

I can’t tell who’s the real criminal. I steal copyrighted content all day for my company’s personal gain. But all of the “web professionals” out there charge an arm and a leg for crap HTML that even I can’t read. My company tells me to put it all in my bag and move on. We’ll sort it out later, they say. I’ll sort them out.

I do all of this, and my coworkers mock me with stupid blog posts. Someone cracked a joke that I “give good header” (on the first date, even!) after my bosses released one of those travesties of lost creativity. Where’s the respect? Do they know what I put up with?

I am Googlebot. And I am sad.

What makes me even more livid are the cheesy-ass photos on their blog posts.

Googlebot says: "Can I see your college thesis?" Website says: "Dearest Googlebot, the content hasn't changed in years. 304 not modified."

Do you think that’s how it really works? Does anyone really give me the convenience of a 304 header? Does anyone really know or care about what a 304 header is? NOOOOO…
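For the three of you who do care: a crawler can send an If-Modified-Since header with the date of its last copy, and a polite server answers 304 with no body when nothing has changed since then. Here is a minimal sketch of that decision in Python. The function name `conditional_status` and the dates are made up for illustration, and a real server would also have to deal with ETags and friends.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def conditional_status(if_modified_since, last_modified):
    """Return 304 if the page is unchanged since the crawler's copy, else 200.

    if_modified_since: raw If-Modified-Since header value, or None.
    last_modified: timezone-aware datetime of the page's last change.
    """
    if not if_modified_since:
        return 200  # No date sent, so send the full page.
    try:
        cached_at = parsedate_to_datetime(if_modified_since)
    except (TypeError, ValueError):
        return 200  # Unparseable date: ignore the header.
    if cached_at.tzinfo is None:
        cached_at = cached_at.replace(tzinfo=timezone.utc)
    # HTTP dates have one-second resolution, so drop microseconds first.
    if last_modified.replace(microsecond=0) <= cached_at:
        return 304
    return 200


# Hypothetical dates: the thesis last changed in 2005, my copy is from 2009.
thesis_changed = datetime(2005, 6, 1, 12, 0, 0, tzinfo=timezone.utc)
my_last_visit = format_datetime(datetime(2009, 7, 1, tzinfo=timezone.utc), usegmt=True)
print(conditional_status(my_last_visit, thesis_changed))
```

That’s it. That’s all I’m asking for.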

Well, let me show you how it really works. This is how I spend 3/4 of my day.

Googlebot Porn

It makes me cry harder knowing that filthy bags of bolts like him serve the true demand.

I am Googlebot. And I am sad and abused. Will you be my friend? :(
