Hot For Coding
Yahoo Spider, what are you doing?

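Over the past few days my traffic spiked and both bandwidth and I/O shot up, so I checked the Nginx logs and found that a large share of the entries were spider crawls. The Google and Baidu spiders account for most of them, and seven or eight others, such as Youdao, Microsoft Bing, and Yahoo, are crawling as well. The Yahoo spider seems to be more stubborn than the rest.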

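I have a go page that redirects to the websites of people who leave comments, so a spider that hits it mostly gets a 302, and in the end the spiders move on. Only the Yahoo spider refuses to give up: it keeps retrying many times every day before it finally stops.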

110.75.171.110 - - [20/Mar/2013:17:07:17 +0800] "GET /20120554.html HTTP/1.1" 200 7965 "-" "Yahoo! Slurp China" -
110.75.172.109 - - [20/Mar/2013:17:10:20 +0800] "GET /go/BeT0h1 HTTP/1.1" 302 5 "-" "Yahoo! Slurp China" -
110.75.172.107 - - [20/Mar/2013:17:10:26 +0800] "GET /go/BeT0h1 HTTP/1.1" 302 5 "-" "Yahoo! Slurp China" -
110.75.172.108 - - [20/Mar/2013:17:10:35 +0800] "GET /go/BeT0h1 HTTP/1.1" 302 5 "-" "Yahoo! Slurp China" -
110.75.172.112 - - [20/Mar/2013:17:10:48 +0800] "GET /go/BeT0h1 HTTP/1.1" 302 5 "-" "Yahoo! Slurp China" -
110.75.172.111 - - [20/Mar/2013:17:10:54 +0800] "GET /go/BeT0h1 HTTP/1.1" 302 5 "-" "Yahoo! Slurp China" -
110.75.172.109 - - [20/Mar/2013:17:10:56 +0800] "GET /go/BeT0h1 HTTP/1.1" 302 5 "-" "Yahoo! Slurp China" -
110.75.172.109 - - [20/Mar/2013:17:10:58 +0800] "GET /go/BeT0h1 HTTP/1.1" 302 5 "-" "Yahoo! Slurp China" -
110.75.172.107 - - [20/Mar/2013:17:11:00 +0800] "GET /go/BeT0h1 HTTP/1.1" 302 5 "-" "Yahoo! Slurp China" -
110.75.172.108 - - [20/Mar/2013:17:11:02 +0800] "GET /go/BeT0h1 HTTP/1.1" 302 5 "-" "Yahoo! Slurp China" -
110.75.172.112 - - [20/Mar/2013:17:11:07 +0800] "GET /go/BeT0h1 HTTP/1.1" 302 5 "-" "Yahoo! Slurp China" -
110.75.172.111 - - [20/Mar/2013:17:11:17 +0800] "GET /go/BeT0h1 HTTP/1.1" 302 5 "-" "Yahoo! Slurp China" -
110.75.172.110 - - [20/Mar/2013:17:11:26 +0800] "GET /go/BeT0h1 HTTP/1.1" 302 5 "-" "Yahoo! Slurp China" -
110.75.172.109 - - [20/Mar/2013:17:11:32 +0800] "GET /go/BeT0h1 HTTP/1.1" 302 5 "-" "Yahoo! Slurp China" -
110.75.172.107 - - [20/Mar/2013:17:11:36 +0800] "GET /go/BeT0h1 HTTP/1.1" 302 5 "-" "Yahoo! Slurp China" -
110.75.172.108 - - [20/Mar/2013:17:11:38 +0800] "GET /go/BeT0h1 HTTP/1.1" 302 5 "-" "Yahoo! Slurp China" -
110.75.172.112 - - [20/Mar/2013:17:11:39 +0800] "GET /go/BeT0h1 HTTP/1.1" 302 5 "-" "Yahoo! Slurp China" -
110.75.172.111 - - [20/Mar/2013:17:11:41 +0800] "GET /go/BeT0h1 HTTP/1.1" 302 5 "-" "Yahoo! Slurp China" -
110.75.172.110 - - [20/Mar/2013:17:11:44 +0800] "GET /go/BeT0h1 HTTP/1.1" 302 5 "-" "Yahoo! Slurp China" -
110.75.172.109 - - [20/Mar/2013:17:11:48 +0800] "GET /go/BeT0h1 HTTP/1.1" 302 5 "-" "Yahoo! Slurp China" -
110.75.172.107 - - [20/Mar/2013:17:11:55 +0800] "GET /go/BeT0h1 HTTP/1.1" 302 5 "-" "Yahoo! Slurp China" -
110.75.173.195 - - [20/Mar/2013:17:59:07 +0800] "GET /201302280.html HTTP/1.1" 200 6544 "-" "Yahoo! Slurp China" -

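The above is yesterday's log, and it looks like this pretty much every day. If the crawler stays this stubborn I really will have to find a countermeasure and block the Yahoo spider. Hardly anyone uses Yahoo search anyway.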
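To put a number on how often one crawler retries the same URL, the log lines above can be tallied with a small script. This is only a rough sketch: the regular expression assumes the log format shown in the excerpt, and the names `LOG_LINE`, `count_hits`, and `sample` are made up for illustration and are not part of my server setup.

```python
import re
from collections import Counter

# Assumed format, taken from the excerpt above:
# ip - - [time] "METHOD path PROTO" status bytes "referer" "user-agent" ...
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def count_hits(lines, agent_substring):
    """Count requests per (path, status) whose user agent contains agent_substring."""
    hits = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        # Lines that do not fit the assumed format are skipped silently.
        if match and agent_substring in match.group("agent"):
            hits[(match.group("path"), int(match.group("status")))] += 1
    return hits


# Three lines copied from the log excerpt, used as sample input.
sample = [
    '110.75.171.110 - - [20/Mar/2013:17:07:17 +0800] "GET /20120554.html HTTP/1.1" 200 7965 "-" "Yahoo! Slurp China" -',
    '110.75.172.109 - - [20/Mar/2013:17:10:20 +0800] "GET /go/BeT0h1 HTTP/1.1" 302 5 "-" "Yahoo! Slurp China" -',
    '110.75.172.107 - - [20/Mar/2013:17:10:26 +0800] "GET /go/BeT0h1 HTTP/1.1" 302 5 "-" "Yahoo! Slurp China" -',
]

if __name__ == "__main__":
    for (path, status), n in count_hits(sample, "Yahoo! Slurp").most_common():
        print(n, status, path)
```

Feeding it the whole access log instead of `sample` (for example by iterating over an open file) would show which paths a given spider hammers the most, which helps when deciding whether blocking it is worth it.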

Finally, here is a sample of Yahoo China search:

[Image: Yahoo China search screenshot]

TITLE: Yahoo Spider, what are you doing?

LINK: https://www.qttc.net/293_yahoo_spider.html

NOTE: Please credit the source when reposting.