Intro Response About Meta Name Robots Content Noindex NoFollow

This post looks at a response that contains <META NAME="ROBOTS" CONTENT="NOINDEX, NOFOLLOW">.

Response

The response received:
<html style="height:100%">
<head>
<META NAME="ROBOTS" CONTENT="NOINDEX, NOFOLLOW">
<meta name="format-detection" content="telephone=no">
<meta name="viewport" content="initial-scale=1.0">
<meta http-equiv="X-UA-Compatible" content="IE=edge,chrome=1">
</head>
<body style="margin:0px;height:100%">
<iframe src="/_Incapsula_Resource?SWUDNSAI=xxxxx"
frameborder=0 width="100%" height="100%" marginheight="0px" marginwidth="0px">
Request unsuccessful. Incapsula incident ID: xxxxx</iframe>
</body>
</html>

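The key tag is <META NAME="ROBOTS" CONTENT="NOINDEX, NOFOLLOW">.
Together with the "Request unsuccessful" message from Incapsula, it means the other side has classified your request as coming from a web crawler and has restricted access.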
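
As a rough illustration, a client could check a response body for this pattern before trying to parse it. The sketch below is only a heuristic based on the sample response above: the function name looks_like_block_page is hypothetical, and the assumption that a block page always carries both the ROBOTS meta tag and an _Incapsula_Resource reference comes from this one example, not from any documented guarantee.

```python
import re

# Matches <meta name="robots" content="..."> regardless of letter case.
# Assumption: name comes before content, as in the sample response.
ROBOTS_META = re.compile(
    r'<meta\s+name=["\']robots["\']\s+content=["\']([^"\']*)["\']',
    re.IGNORECASE,
)


def looks_like_block_page(html: str) -> bool:
    """Heuristically detect a block page like the sample response (hypothetical helper)."""
    match = ROBOTS_META.search(html)
    if match is None:
        return False
    # Split "NOINDEX, NOFOLLOW" into a normalized set of directives.
    directives = {d.strip().lower() for d in match.group(1).split(",")}
    has_block_directives = {"noindex", "nofollow"} <= directives
    # The sample block page also loads an Incapsula resource in an iframe.
    return has_block_directives and "_Incapsula_Resource" in html
```

Calling looks_like_block_page(body) on each response lets a script stop early instead of treating the block page as real content.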

Solution

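1. If this is a legitimate, agreed-upon request and response exchange, ask the other party to add your domain to their whitelist.
2. If this is a crawler process, add headers such as User-Agent or Cookie to the request so that it resembles a request made by a person using a browser.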
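
The second option could look roughly like the sketch below, which uses only Python's standard urllib.request module. The helper name build_request, the User-Agent string, and the cookie value in the usage note are all hypothetical placeholders. Whether these headers are enough to get past the check depends on the site's configuration.

```python
from typing import Optional
from urllib.request import Request

# Hypothetical browser-like User-Agent; replace with the string of a browser you actually use.
DEFAULT_USER_AGENT = (
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) "
    "AppleWebKit/537.36 (KHTML, like Gecko) "
    "Chrome/120.0.0.0 Safari/537.36"
)


def build_request(
    url: str,
    user_agent: str = DEFAULT_USER_AGENT,
    cookie: Optional[str] = None,
) -> Request:
    """Build (but do not send) a request carrying browser-like headers."""
    headers = {"User-Agent": user_agent}
    if cookie:
        # Cookie value copied from a real browser session, if one is needed.
        headers["Cookie"] = cookie
    return Request(url, headers=headers)
```

The returned object can be passed to urllib.request.urlopen, for example urlopen(build_request("https://example.com/", cookie="session=abc")), where both the URL and the cookie are placeholders.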