使用requests爬取网站的调试过程
使用requests爬取网站的调试过程
目的检查是否跳转,cookies,headers的问题! 打开logging的调试开关
import logging
logging.basicConfig(level=logging.DEBUG)
例子:
In [34]: import logging
In [35]: import requests
In [36]: r = requests.get('http://www.xhj.com/ershoufang/pg1io71/')
DEBUG:urllib3.connectionpool:Starting new HTTP connection (1): www.xhj.com:80
DEBUG:urllib3.connectionpool:http://www.xhj.com:80 "GET /ershoufang/pg1io71/ HTTP/1.1" 302 258
DEBUG:urllib3.connectionpool:http://www.xhj.com:80 "GET /ershoufang/pg1io71/ HTTP/1.1" 200 None
由此可见,发生了跳转!
常用库
- fake_useragent
- requests