After running a site: query on an abandoned site, I found that Google had indexed only four URLs:
com/images/?DA
com/images/?SA
com/images/?ND
com/images/?MA
I can't make sense of it.
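The access log of another site shows something odd as well. The requesting browser is Chrome/4.1.249.1045,
though of course it could be something else. The requested URLs are related to the ones above:
"GET /?C=N;O=D HTTP/1.1"
"GET /?C=M;O=A HTTP/1.1"
"GET /?C=S;O=A HTTP/1.1"
"GET /?C=D;O=A HTTP/1.1"
The same four damned things again.
In addition, here is a Q&A from an English-language forum: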
Hi,
Googlebot/2.1 (IP is 66.249.72.134) keeps trying to get strange URLs like the following at our site, and it always gets a 404 error:
get /sub1/tacmpa.h%20…
get /sub2/ke-es%20…
This Googlebot/2.1 also tries to get some pictures in a strange format, like:
get /pictures/MA/?C=S;O=A HTTP/1.1 200 584
get /pictures/PJ/?C=S;O=A HTTP/1.1 200 5706
get /pictures/IQ/?C=N;O=D HTTP/1.1 200 5904
Here the googlebot gets a 200 status, and we cannot understand why:
1) Why does it try to get pictures at such strange URLs? We have never had such links, external or internal.
2) Why is Googlebot/2.1 getting images? There should be another Googlebot that spiders images, right?
Thanks.
Here the googlebot gets a 200 status
The first order of business is fixing that so that you serve a 404 Not Found. Otherwise the 200s may sink you!
You may never know exactly where those URLs are coming from. Googlebot may have a coding error, or may be testing your server to see how you respond to bad URLs (that happens). Some competitor might have noticed that you are vulnerable and is now posting those bad links somewhere. Or they may be surfing to those bad URLs with the Toolbar turned on. Someone may be directly submitting those URLs. Or you may have a dynamic script on your site that is misfiring somehow.
So the urgent need you have is to start responding with a 404. Then, if you want, you can be a detective and find the source.
get /pictures/MA/?C=S;O=A HTTP/1.1 200 584
get /pictures/PJ/?C=S;O=A HTTP/1.1 200 5706
get /pictures/IQ/?C=N;O=D HTTP/1.1 200 5904
You get those if you re-sort a directory listing by Name, Last modified, Size or Description.
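To make that reply concrete, here is a minimal Python sketch that decodes such a query string. The letter mapping (C = column with N, M, S, D; O = order with A, D) follows the convention used by Apache-style auto-generated directory listings; the function name decode_autoindex_query is hypothetical and only for illustration.

```python
import re

# Column and order letters as used in auto-generated directory listings.
COLUMNS = {"N": "Name", "M": "Last modified", "S": "Size", "D": "Description"}
ORDERS = {"A": "ascending", "D": "descending"}


def decode_autoindex_query(query):
    """Return (column, order) for a query like 'C=S;O=A', or None if it does not match.

    Hypothetical helper for illustration only.
    """
    pairs = {}
    # The listing links separate parameters with ';', but accept '&' as well.
    for part in re.split(r"[;&]", query.lstrip("?")):
        key, sep, value = part.partition("=")
        if sep:
            pairs[key] = value
    column = COLUMNS.get(pairs.get("C"))
    order = ORDERS.get(pairs.get("O"))
    if column is None or order is None:
        return None
    return column, order


if __name__ == "__main__":
    for q in ("C=N;O=D", "C=M;O=A", "C=S;O=A", "C=D;O=A"):
        print(q, "->", decode_autoindex_query(q))
```

Read this way, the four requests in the log are simply the four column-header links of a listing page: sort by name, by modification time, by size, and by description.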
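It was not very clear at first. It did not look like a vulnerability, and seemed more likely to be a Google test or a bug.
Only now do I understand: these are just the sort links of a directory index, which show up when a site has no index file and directory indexing has not been disabled.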
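As a rough way to find which directories are exposing such a listing, the sketch below scans access-log lines for the sort-link pattern and collects the directory paths. It assumes each log line contains the request in the form "GET /path/?C=X;O=Y HTTP/1.1", as in the excerpts above; the function name find_autoindex_dirs is hypothetical, and a hit only suggests that a directory is auto-indexed, it does not prove it.

```python
import re

# Matches a GET/HEAD request whose path ends in '/' followed by a
# directory-listing sort query such as '?C=S;O=A'.
SORT_REQUEST = re.compile(
    r"\b(?:GET|HEAD)\s+(\S*?/)\?C=[NMSD];O=[AD](?=\s|$)",
    re.IGNORECASE,
)


def find_autoindex_dirs(log_lines):
    """Return a sorted list of directory paths requested with a sort query.

    Hypothetical helper for illustration only.
    """
    dirs = set()
    for line in log_lines:
        match = SORT_REQUEST.search(line)
        if match:
            dirs.add(match.group(1))
    return sorted(dirs)


if __name__ == "__main__":
    sample = [
        '"GET /?C=N;O=D HTTP/1.1"',
        "get /pictures/MA/?C=S;O=A HTTP/1.1 200 584",
        "get /pictures/IQ/?C=N;O=D HTTP/1.1 200 5904",
        '"GET /index.html HTTP/1.1"',
    ]
    for path in find_autoindex_dirs(sample):
        print(path)
```

Each directory reported this way is a candidate for either adding an index file or turning directory indexing off in the web server configuration.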
