Debugging a site crawl with requests

Goal: check whether a problem is caused by a redirect, cookies, or headers. Turn on the logging debug switch:


import logging
logging.basicConfig(level=logging.DEBUG)
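The urllib3 debug log shows the request line and status code, but not the headers or cookies themselves. One possible extension, sketched here as an assumption rather than part of the original recipe, is to also raise the debug level of the standard library's `http.client`, which urllib3 builds on. The helper name `enable_http_debug` is hypothetical.

```python
import http.client
import logging


def enable_http_debug():
    """Turn on verbose output for requests/urllib3 traffic (hypothetical helper)."""
    # Same switch as above: show DEBUG records from all loggers.
    logging.basicConfig(level=logging.DEBUG)
    # Make sure the urllib3 logger itself is at DEBUG, even if the
    # root logger was already configured elsewhere.
    logging.getLogger("urllib3").setLevel(logging.DEBUG)
    # http.client prints the raw request and response headers to stdout
    # when debuglevel is greater than 0.
    http.client.HTTPConnection.debuglevel = 1


enable_http_debug()
```

With this enabled, each request should additionally print the outgoing headers (including any `Cookie` header) and the response headers (including `Set-Cookie` and `Location`). Set `debuglevel` back to 0 to silence it.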


Example (with debug logging enabled):


In [34]: import logging

In [35]: import requests

In [36]: r = requests.get('http://www.xhj.com/ershoufang/pg1io71/')
DEBUG:urllib3.connectionpool:Starting new HTTP connection (1): www.xhj.com:80
DEBUG:urllib3.connectionpool:http://www.xhj.com:80 "GET /ershoufang/pg1io71/ HTTP/1.1" 302 258
DEBUG:urllib3.connectionpool:http://www.xhj.com:80 "GET /ershoufang/pg1io71/ HTTP/1.1" 200 None


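This shows that a redirect occurred: the first GET returned 302, and requests then issued a second GET that returned 200.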
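The redirect can also be confirmed from the response object, without reading the log. The following is a minimal sketch; the helper name `describe_redirects` is made up for illustration, while `history`, `status_code`, `url`, and `headers` are standard attributes of a requests `Response`.

```python
import requests


def describe_redirects(response):
    """Return (status_code, url, Location header) for each hop.

    The intermediate redirect responses come first (from response.history),
    followed by the final response itself.
    """
    hops = []
    for hop in list(response.history) + [response]:
        hops.append((hop.status_code, hop.url, hop.headers.get("Location")))
    return hops


# Usage (performs a real network request, so it is left commented out):
# r = requests.get('http://www.xhj.com/ershoufang/pg1io71/')
# for status, url, location in describe_redirects(r):
#     print(status, url, location)
# print(r.request.headers)     # headers sent with the final request
# print(r.cookies.get_dict())  # cookies set on the final response
```

An empty `r.history` means no redirect took place. Passing `allow_redirects=False` to `requests.get` stops at the first response, which makes it easier to inspect the 302 itself.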

Commonly used libraries

  • fake_useragent
  • requests