Robots.txt to Disallow All but Adsense Mediabot
May 7th, 2007 by Stefan Juhl
Sometimes I’m well on my way to making something far more complicated than it really is. Today I was on the verge of cloaking my robots.txt file for some creative domain parking stuff.
My simple goal was to block all bots but the Google Adsense Mediabot (Mediapartners-Google/2.1), since I didn’t want to blast out millions of duplicate pages to the search engine spiders, yet I wanted to monetize the websites through Adsense.
My first thought was that if I just made the robots.txt files disallow all bots then Mediabot wouldn’t crawl them either, and that wouldn’t be good. The first solution that occurred to me was cloaking it. Not smart…
That seemed like overkill, so I put my extraordinary googling skills to work and found a solution that is dead simple: start your robots.txt file with a record for Mediapartners-Google* that disallows “nothing”, like the example below.
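# Let the AdSense crawler fetch everything (an empty Disallow blocks nothing)
User-agent: Mediapartners-Google*
Disallow:

# Keep every other bot out of /content/
User-agent: *
Disallow: /content/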
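As a sanity check, rules like these can be run through Python's standard-library urllib.robotparser. This is only a sketch: the helper name allowed and the test paths are made up for illustration, and the agent name is written without the trailing * because that parser compares agent names literally instead of treating * as a wildcard there. Real crawlers may read the file differently, so treat the result as a hint, not proof.

```python
from urllib.robotparser import RobotFileParser

# Same rules as the example above, minus the trailing "*" on the agent
# name (an assumption made so Python's literal name matching applies).
ROBOTS_TXT = """\
User-agent: Mediapartners-Google
Disallow:

User-agent: *
Disallow: /content/
"""


def allowed(agent: str, path: str) -> bool:
    """Return True if `agent` may fetch `path` under ROBOTS_TXT."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(agent, path)


if __name__ == "__main__":
    # Hypothetical paths, used only to exercise the rules.
    for agent in ("Mediapartners-Google/2.1", "Googlebot"):
        for path in ("/content/page.html", "/about.html"):
            print(agent, path, allowed(agent, path))
```

With this parser, the Mediapartners-Google record wins for the AdSense crawler, while every other agent falls through to the * record and is kept out of /content/.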
Today’s reminder: before coding something “really smart”, step back and consider if there’s already a simple solution to the problem.
Posted in White Hat SEO, Monetization

July 24th, 2007 at 5:30 pm
Hi. I’m not too hot on robots.txt, but my logic tells me the wildcard record would override the Mediapartners allow?
August 9th, 2007 at 11:36 pm
Yes, one could see it that way. But I think the logic is that when a robot finds a record addressed specifically to it, it follows that record and does not go on to apply the general rules under the * record.
We should also remember that the search engines, and especially Google, tend to make their own standards.