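The crawling libraries in Python 3 differ considerably from those in Python 2. Python 3 merges urllib2 and urllib into a single package, urllib, which contains four modules: request, parse, error, and robotparser. The request module provides the urlopen method, whose prototype (abbreviated, other optional parameters omitted) is urlopen(url, data=None). It returns a response object. The url argument can be either a string or a Request object, so passing a string is equivalent to: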
import urllib.request
res = urllib.request.Request(url)
response = urllib.request.urlopen(res)
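To make the equivalence concrete, here is a minimal sketch that opens the same URL both ways and compares the results. The `data:` URL is my own assumption, chosen only so the example runs without network access; an ordinary http URL would be passed in exactly the same way.

```python
import urllib.request

# A data: URL is used only so the sketch runs offline (an assumption for
# illustration); an http:// or https:// URL is passed the same way.
url = "data:text/plain,hello"

# Form 1: pass the URL string directly.
with urllib.request.urlopen(url) as response:
    body_from_string = response.read()

# Form 2: wrap the URL in a Request object first, then open it.
req = urllib.request.Request(url)
with urllib.request.urlopen(req) as response:
    body_from_request = response.read()

# Both forms should yield the same bytes.
print(body_from_string == body_from_request)
```

The Request form is the one to reach for when headers or a request body need to be attached before the call.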
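Coming back to the data parameter of urlopen: first build a dictionary, then convert it to the required format with urllib.parse.urlencode() (urllib.urlencode() in Python 2). In Python 3 the encoded string must also be converted to bytes before it is passed as data.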
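A minimal sketch of preparing the data argument in Python 3 might look like the following. The form fields and the endpoint URL are hypothetical, and the request is only constructed, not sent, so nothing here touches the network.

```python
import urllib.parse
import urllib.request

# Hypothetical form fields and endpoint, used only for illustration.
form = {"name": "alice", "lang": "python"}
url = "http://example.com/submit"

# urlencode() turns the dict into "key=value&key=value" text ...
encoded = urllib.parse.urlencode(form)
# ... and Python 3 expects bytes for the data argument, not str.
data = encoded.encode("utf-8")

# Supplying data switches the request method from GET to POST.
req = urllib.request.Request(url, data=data)
print(encoded)
print(req.get_method())

# Sending it would be: urllib.request.urlopen(req)  (needs network access)
```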
My machine has Python 2.7 installed, so this code has to be converted. I looked up the corresponding names online:
| Python 2 name | Python 3 name |
| --- | --- |
| urllib.urlretrieve() | urllib.request.urlretrieve() |
| urllib.urlcleanup() | urllib.request.urlcleanup() |
| urllib.quote() | urllib.parse.quote() |
| urllib.quote_plus() | urllib.parse.quote_plus() |
| urllib.unquote() | urllib.parse.unquote() |
| urllib.unquote_plus() | urllib.parse.unquote_plus() |
| urllib.urlencode() | urllib.parse.urlencode() |
| urllib.pathname2url() | urllib.request.pathname2url() |
| urllib.url2pathname() | urllib.request.url2pathname() |
| urllib.getproxies() | urllib.request.getproxies() |
| urllib.URLopener | urllib.request.URLopener |
| urllib.FancyURLopener | urllib.request.FancyURLopener |
| urllib.ContentTooShortError | urllib.error.ContentTooShortError |
| urllib2.urlopen() | urllib.request.urlopen() |
| urllib2.install_opener() | urllib.request.install_opener() |
| urllib2.build_opener() | urllib.request.build_opener() |
| urllib2.URLError | urllib.error.URLError |
| urllib2.HTTPError | urllib.error.HTTPError |
| urllib2.Request | urllib.request.Request |
| urllib2.OpenerDirector | urllib.request.OpenerDirector |
| urllib2.BaseHandler | urllib.request.BaseHandler |
| urllib2.HTTPDefaultErrorHandler | urllib.request.HTTPDefaultErrorHandler |
| urllib2.HTTPRedirectHandler | urllib.request.HTTPRedirectHandler |
| urllib2.HTTPCookieProcessor | urllib.request.HTTPCookieProcessor |
| urllib2.ProxyHandler | urllib.request.ProxyHandler |
| urllib2.HTTPPasswordMgr | urllib.request.HTTPPasswordMgr |
| urllib2.HTTPPasswordMgrWithDefaultRealm | urllib.request.HTTPPasswordMgrWithDefaultRealm |
| urllib2.AbstractBasicAuthHandler | urllib.request.AbstractBasicAuthHandler |
| urllib2.HTTPBasicAuthHandler | urllib.request.HTTPBasicAuthHandler |
| urllib2.ProxyBasicAuthHandler | urllib.request.ProxyBasicAuthHandler |
| urllib2.AbstractDigestAuthHandler | urllib.request.AbstractDigestAuthHandler |
| urllib2.HTTPDigestAuthHandler | urllib.request.HTTPDigestAuthHandler |
| urllib2.ProxyDigestAuthHandler | urllib.request.ProxyDigestAuthHandler |
| urllib2.HTTPHandler | urllib.request.HTTPHandler |
| urllib2.HTTPSHandler | urllib.request.HTTPSHandler |
| urllib2.FileHandler | urllib.request.FileHandler |
| urllib2.FTPHandler | urllib.request.FTPHandler |
| urllib2.CacheFTPHandler | urllib.request.CacheFTPHandler |
| urllib2.UnknownHandler | urllib.request.UnknownHandler |
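One way to use this mapping, sketched here as a suggestion rather than the only approach, is a try/except import that picks the Python 3 locations first and falls back to the Python 2 ones. The names imported and the sample query are chosen only for illustration.

```python
try:
    # Python 3 locations (right-hand column of the table)
    from urllib.request import urlopen, Request
    from urllib.parse import urlencode, quote
    from urllib.error import URLError, HTTPError
except ImportError:
    # Python 2 locations (left-hand column of the table)
    from urllib2 import urlopen, Request, URLError, HTTPError
    from urllib import urlencode, quote

# After the import block, the rest of the script uses the same names
# regardless of which interpreter is running it.
query = urlencode({"q": "python urllib"})
escaped = quote("a b")
print(query)
print(escaped)
```

With this in place, the rest of the code calls `urlopen`, `Request`, and `urlencode` directly instead of spelling out the module path.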