Surprise, surprise... Google knows all your sites!
November 26th, 2006 by Stefan Juhl
There’s one thing that happened at PubCon that has generated quite a lot of buzz: Matt Cutts could look up pretty much all the sites/domains you have. So many people seem to be shocked about this. Seriously, what did those people expect?
First of all, I’m pretty sure that this isn’t new. What Matt Cutts did in the ‘Interactive Site Reviews and SERP Quality Control Forum’ session was to “complain” that the guy whose site was being reviewed had a bunch of semi-related domains. Apparently it came as a shock to many that Google could see this. I believe I saw Matt do almost the exact same thing at PubCon in Orlando back in February 2004. So why was there no buzz about it back then? It must be that the people attending PubCon in 2004 were of a higher technical level and had already realized that this is possible. (No offense...)
How come Google knows this?
What most people are saying at the moment is that it must be because Google became a registrar a while back, so now they may be able to gather all the domain whois information. There’s also a lot of talk about Google being able to gather whois info on private registrations.
I don’t know what Google is able to look up or what they gather. But I do know one thing… Tracking down most of your websites isn’t that hard, even without access to any whois information. Webmasters leave so many other footprints that make it possible to track them down.
As an example, the last time I found some very well made search engine spam, I decided to dig a bit deeper into the sites this guy had. In just three hours I had found more than 100 of his sites. And this guy was actually quite careful, e.g. using private domain registrations, different IP addresses, not interlinking etc.
There are so many footprints…
If you’re using some of Google’s services, it’s easy for them to determine which sites are possibly owned by you. The same applies to other search engines working out which sites are probably yours.
- Google Webmaster Central, Adwords / Analytics account, Toolbar etc.
Do you use Google’s Webmaster Central for your sites? Do you advertise your sites with Adwords under the same account? Do you track your sites with Google Analytics? Do you have the Google Toolbar installed, and are you constantly visiting your own sites?
These are footprints for search engines, as well as for others wanting to locate sites that are possibly yours.
- Templates, content etc.
Do you have your own custom templates and are you using them for multiple sites? Do you republish your content on multiple sites? Do the templates or content have unique footprints?
- Links to your sites etc.
Are you linking between your sites? Do you tend to get links from the same sites, directories etc.? Do your sites link back to some kind of “main” site? Do you have a main site linking to all your other sites?
Footprints to determine how likely it is that sites are actually yours.
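The link footprint can be sketched in a few lines of Python. This is only an illustration, not how any search engine is known to do it: it assumes you already have, for each site, the set of domains linking to it (the `inbound` mapping, the function names and the 0.5 threshold are all made up for the example), and it flags pairs of sites whose link sources overlap heavily.

```python
def jaccard(a, b):
    """Jaccard similarity: size of the intersection divided by size of the union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def similar_backlink_profiles(inbound, threshold=0.5):
    """Return (site_a, site_b, score) for pairs with overlapping link sources.

    `inbound` maps a site to the set of domains linking to it (hypothetical
    input; collecting this data is out of scope here).
    """
    sites = sorted(inbound)
    pairs = []
    for i, a in enumerate(sites):
        for b in sites[i + 1:]:
            score = jaccard(inbound[a], inbound[b])
            # The threshold is an arbitrary illustrative cut-off.
            if score >= threshold:
                pairs.append((a, b, score))
    return pairs
```

A high score is only a hint. Two unrelated sites listed in the same handful of directories would score high as well, so this kind of signal is best combined with others.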
- Domain registration, IP address, DNS servers etc.
These are the simple ones. Do your domains have the same public whois information? Do you have your sites on your own dedicated IPs? Do you have your own DNS servers?
- Affiliate links, advertising networks, Adsense / YPN, payment systems etc.
If you’re using any kind of affiliate or advertising network across multiple sites of yours, then it’s pretty easy to determine that the sites must be yours. You probably know that almost all these ads carry a unique ID for your account. And if the same ID is used on multiple sites, it’s fair to say the sites belong to the same person, since that person gets the revenue from them.
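As a rough illustration of the shared-ID footprint, here is a minimal Python sketch that scans page source for account IDs and groups the sites that embed the same one. The ID shapes in `ID_PATTERNS` are assumptions made for the example (an AdSense-style "pub-" ID with 16 digits and an Analytics-style "UA-" ID), and the function names and the `pages` input are hypothetical. Fetching the pages is left out.

```python
import re
from collections import defaultdict

# Assumed ID shapes, for illustration only; real networks may format IDs differently.
ID_PATTERNS = {
    "adsense": re.compile(r"pub-\d{16}"),
    "analytics": re.compile(r"UA-\d+"),
}


def extract_ids(html):
    """Return the set of (network, account_id) pairs found in a page's HTML."""
    found = set()
    for network, pattern in ID_PATTERNS.items():
        for match in pattern.findall(html):
            found.add((network, match))
    return found


def group_sites_by_id(pages):
    """Map each account ID to the sorted list of sites that embed it.

    `pages` is a dict of {site_name: html_source}.
    """
    sites_by_id = defaultdict(set)
    for site, html in pages.items():
        for account in extract_ids(html):
            sites_by_id[account].add(site)
    # Only IDs seen on more than one site say anything about shared ownership.
    return {acct: sorted(sites) for acct, sites in sites_by_id.items() if len(sites) > 1}
```

Given page sources for a handful of sites, `group_sites_by_id` returns only the IDs that show up on two or more of them, which is exactly the overlap the paragraph above describes.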
With all the above footprints in mind, imagine how easy it is to locate sites you possibly own, and to determine if they’re really yours. And if you’re Google or any other search engine that constantly crawls the entire web, then you have so much data that it won’t take more than a well made algorithm to determine almost all of your websites.
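What such an algorithm might look like is anyone’s guess, but the basic idea of combining footprints can be sketched with a simple union-find clustering in Python. Everything here is an assumption for illustration: the `footprints` input, the `(kind, value)` labels and the function name are hypothetical, and treating a single shared footprint as enough to merge two sites is a deliberately naive rule.

```python
from collections import defaultdict


def cluster_sites(footprints):
    """Group sites that share at least one footprint, directly or transitively.

    `footprints` maps a site to a set of (kind, value) pairs, e.g.
    ("ip", "192.0.2.10") or ("ns", "ns1.dns.example").
    """
    parent = {site: site for site in footprints}

    def find(site):
        # Follow parent pointers to the cluster root, halving the path as we go.
        while parent[site] != site:
            parent[site] = parent[parent[site]]
            site = parent[site]
        return site

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    # The first site seen with a given footprint becomes the merge target for the rest.
    first_seen = {}
    for site, marks in footprints.items():
        for mark in marks:
            if mark in first_seen:
                union(first_seen[mark], site)
            else:
                first_seen[mark] = site

    clusters = defaultdict(list)
    for site in footprints:
        clusters[find(site)].append(site)
    return sorted(sorted(group) for group in clusters.values())
```

Note that a site sharing an IP with one site and an ad ID with another pulls all three into the same cluster. A real system would presumably weight the footprints, since something like a shared hosting IP puts many unrelated sites on the same address.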
What you should look for when tracking down websites is the one kind of footprint that is impossible to hide: who benefits from the website. The benefit can take many forms, e.g. revenue, links etc. This can of course give some false positives, but only very few, I believe.
Posted in Black Hat SEO, White Hat SEO

November 27th, 2006 at 11:29 am
I was not at PubCon, but I already read about this on other forums, and I must say that it is like a myth no one wanted to be true…
So now it is. Just be more careful with this stuff and you will be OK.
November 27th, 2006 at 12:09 pm
Hi RazvanG. My whole point with this is that it really wasn’t a myth or something new. For years it has pretty much been a fact. Most people just hadn’t realized it, apparently…
November 27th, 2006 at 9:53 pm
Yes.
Everyone should know that Google is on its way to becoming the next Microsoft. At a certain point people will probably feel about it the way they now feel about Microsoft.
They must be very careful and smart not to get into that position, and what Matt did at PubCon does not help them at all.
Anyway, Google needs us and we need Google, so… it is simple. Both parties work for each other.
:) (secret to success)
November 28th, 2006 at 9:12 pm
The point of my post wasn’t “omg, they have access to a list of all my domains”. It was more: why does Google care so much about other domains I have (that I may or may not be using) that they bother making a point of mentioning it in that manner?
Further, just because they have access to the information to determine which domains are in a given person or corporation’s portfolio doesn’t mean there is reason to believe or assume that the data is being collected and analyzed. Matt’s comments at PubCon did, however, cause me to go “Oh hey, why are they keeping an eye on that?”
If a domain is just sitting idle, it shouldn’t count for or against another site just because the two domains happen to share the same registrant.
November 28th, 2006 at 9:56 pm
Hi cshel, I can see that the way my post is written, it can be read as if I’m pointing you out as being shocked. That wasn’t my intent. You just had the best write-up of what happened!
You’re absolutely right to question whether they collect and analyze all this data. But we do know that Google is insanely hungry for data.
We also know they are working hard (struggling) to fight spam with their algorithms.
You’re also right when you say “it shouldn’t count for or against another site” because that would hurt many good sites and would give an opening for further “abuse” of Google’s algorithms.
November 29th, 2006 at 8:21 pm
[…] The buzz about Google knowing all your sites is still going strong. Some are still questioning if Google can see behind private registrations. Many have suggested that in some way or another Google is profiling webmasters. If you’re getting paranoid over all of this I strongly suggest you continue reading this post. […]