

Basic Usage of Python Web Crawlers

2019-11-08 03:04:43
Contributed by: a site user

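1. Import the urllib library.

2. Send the request.

3. Read the returned content.

4. Set the encoding. (The b'...' prefix marks a bytes object, which must be decoded as UTF-8 to get a string.)

5. Print the result.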

import urllib.request

response = urllib.request.urlopen("http://www.baidu.com")
html = response.read()         # raw bytes
html = html.decode("utf-8")    # bytes -> str
print(html)
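The decode step above hard-codes UTF-8. As a minimal sketch, assuming the server may declare a different charset in its Content-Type header, the decoding can follow that declaration and fall back to UTF-8 otherwise. The helper names `decode_body` and `fetch_text` are hypothetical and are not part of the original article.

```python
import urllib.request


def decode_body(raw, declared_charset=None, fallback="utf-8"):
    """Decode response bytes with the declared charset, else the fallback.

    Hypothetical helper for illustration only.
    """
    try:
        return raw.decode(declared_charset or fallback, errors="replace")
    except LookupError:
        # The declared charset name is unknown to Python; use the fallback.
        return raw.decode(fallback, errors="replace")


def fetch_text(url, timeout=10):
    """Fetch a URL and return its body as str."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        raw = response.read()
        # get_content_charset() returns None when no charset is declared.
        charset = response.headers.get_content_charset()
    return decode_body(raw, charset)
```

Calling `fetch_text("http://www.baidu.com")` would then replace steps 2 to 4 above. The `with` block also closes the connection, which the original snippet leaves open.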

II. Download an image and save it locally

import urllib.request

# First way: pass the URL string straight to urlopen().
# response = urllib.request.urlopen("https://img6.bdstatic.com/img/image/smallpic/weiju112.jpg")

# Second way: wrap the URL in a Request object first.
req = urllib.request.Request("https://img6.bdstatic.com/img/image/smallpic/weiju112.jpg")
response = urllib.request.urlopen(req)
cat_img = response.read()

# Write the raw bytes in binary mode.
with open('aaaabbbbcccc.jpg', 'wb') as f:
    f.write(cat_img)

3. Youdao Translate

import urllib.request
import urllib.parse
import json

content = input("Please input the content that you will translate:")
url = 'http://fanyi.youdao.com/translate?smartresult=dict&smartresult=rule&smartresult=ugc&sessionFrom=https://www.baidu.com/link'

# Form fields sent in the POST body.
data = {}
data['action'] = 'FY_BY_CLICKBUTTON'
data['doctype'] = 'json'
data['i'] = content
data['keyfrom'] = 'fanyi.web'
data['type'] = 'auto'
data['typoResult'] = 'true'
data['ue'] = 'UTF-8'
data['xmlVersion'] = '1.8'
data = urllib.parse.urlencode(data).encode("utf-8")

response = urllib.request.urlopen(url, data)
html = response.read().decode('utf-8')
res = json.loads(html)  # res is a dict
print("The result:%s" % (res['translateResult'][0][0]['tgt']))

4. Youdao Translate with header information (1): create a dictionary of headers and pass it as the headers argument.

import urllib.request
import urllib.parse
import json

content = input("Please input the content that you will translate:")
url = 'http://fanyi.youdao.com/translate?smartresult=dict&smartresult=rule&smartresult=ugc&sessionFrom=https://www.baidu.com/link'

# Request headers: the User-Agent imitates a visit from a browser.
head = {}
head['User-Agent'] = "Mozilla/5.0 (Windows NT 6.3; WOW64; rv:51.0) Gecko/20100101 Firefox/51.0"

# Form fields sent in the POST body.
data = {}
data['action'] = 'FY_BY_CLICKBUTTON'
data['doctype'] = 'json'
data['i'] = content
data['keyfrom'] = 'fanyi.web'
data['type'] = 'auto'
data['typoResult'] = 'true'
data['ue'] = 'UTF-8'
data['xmlVersion'] = '1.8'
data = urllib.parse.urlencode(data).encode("utf-8")

# response = urllib.request.urlopen(url, data)
req = urllib.request.Request(url, data, head)
response = urllib.request.urlopen(req)
html = response.read().decode('utf-8')
res = json.loads(html)  # res is a dict
print("The result:%s" % (res['translateResult'][0][0]['tgt']))
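To see what the Request object built above actually carries, it can be inspected without sending anything. The following is a small sketch under assumptions: the function name `build_form_request`, the placeholder URL, and the sample field values are hypothetical and chosen only for illustration.

```python
import urllib.parse
import urllib.request


def build_form_request(url, fields, user_agent):
    """Build a Request with a URL-encoded body and a User-Agent header.

    Hypothetical helper; nothing is sent over the network here.
    """
    body = urllib.parse.urlencode(fields).encode("utf-8")
    return urllib.request.Request(url, data=body, headers={"User-Agent": user_agent})


# Placeholder URL and fields, assumed for illustration only.
req = build_form_request(
    "http://example.invalid/translate",
    {"i": "hello", "doctype": "json"},
    "Mozilla/5.0 (sketch)",
)

print(req.get_method())              # supplying data turns the request into a POST
print(req.data)                      # the encoded form body, as bytes
print(req.get_header("User-agent"))  # header names are stored capitalized
```

Two details are worth noting. Passing `data` is what makes urllib issue a POST instead of a GET, and Request stores header names in capitalized form, so the lookup key is `"User-agent"` even though the header was supplied as `"User-Agent"`.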

5. Youdao Translate with header information (2): use Request.add_header().

import urllib.request
import urllib.parse
import json

content = input("Please input the content that you will translate:")
url = 'http://fanyi.youdao.com/translate?smartresult=dict&smartresult=rule&smartresult=ugc&sessionFrom=https://www.baidu.com/link'

'''
head = {}  # request headers: the User-Agent imitates a visit from a browser
head['User-Agent'] = "Mozilla/5.0 (Windows NT 6.3; WOW64; rv:51.0) Gecko/20100101 Firefox/51.0"
'''

# Form fields sent in the POST body.
data = {}
data['action'] = 'FY_BY_CLICKBUTTON'
data['doctype'] = 'json'
data['i'] = content
data['keyfrom'] = 'fanyi.web'
data['type'] = 'auto'
data['typoResult'] = 'true'
data['ue'] = 'UTF-8'
data['xmlVersion'] = '1.8'
data = urllib.parse.urlencode(data).encode("utf-8")

# response = urllib.request.urlopen(url, data)
req = urllib.request.Request(url, data)
req.add_header('User-Agent', "Mozilla/5.0 (Windows NT 6.3; WOW64; rv:51.0) Gecko/20100101 Firefox/51.0")
response = urllib.request.urlopen(req)
html = response.read().decode('utf-8')
res = json.loads(html)  # res is a dict
print("The result:%s" % (res['translateResult'][0][0]['tgt']))

7. Using a proxy.

1. Create the parameter dictionary {'type': 'proxy ip:port'}.

proxy_support=urllib.request.ProxyHandler({})

2. Customize and create the opener.

opener=urllib.request.build_opener(proxy_support)

3. Install the opener.

urllib.request.install_opener(opener)

4. Call the opener.

opener.open(url)

The code is as follows:

import urllib.request
import random
import time

while True:
    url = 'http://www.whatismyip.com.tw'  # a website that reports the IP address of your device
    # each entry must be in ip:port form
    iplist = ['171.39.32.171:9999', '112.245.170.47:9999', '111.76.129.119:808', '27.206.143.225:9999', '114.138.196.144:9999']
    # 1. Create the parameter dictionary {'type': 'proxy ip:port'}
    proxy_support = urllib.request.ProxyHandler({'http': random.choice(iplist)})
    # proxy_support = urllib.request.ProxyHandler({'http': '123.163.219.132:81'})
    # 2. Customize and create the opener.
    opener = urllib.request.build_opener(proxy_support)
    opener.addheaders = [('User-Agent', 'Mozilla/5.0 (Windows NT 6.3; WOW64; rv:51.0) Gecko/20100101 Firefox/51.0')]
    # 3. Install the opener.
    urllib.request.install_opener(opener)
    res = urllib.request.urlopen(url)
    html = res.read().decode('utf-8')
    print(html)
    time.sleep(5)
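The loop above runs forever and stops with an exception as soon as one proxy is unreachable. A possible variation, sketched here under assumptions, limits the number of attempts, sets a timeout, and calls the opener directly (step 4) instead of installing it globally. The function names `make_opener` and `fetch_with_proxies` and their default values are hypothetical and do not come from the original article.

```python
import random
import urllib.request

USER_AGENT = "Mozilla/5.0 (Windows NT 6.3; WOW64; rv:51.0) Gecko/20100101 Firefox/51.0"


def make_opener(proxy):
    """Build an opener that routes http requests through `proxy` (ip:port)."""
    handler = urllib.request.ProxyHandler({"http": proxy})
    opener = urllib.request.build_opener(handler)
    opener.addheaders = [("User-Agent", USER_AGENT)]
    return opener


def fetch_with_proxies(url, proxies, attempts=3, timeout=10):
    """Try up to `attempts` randomly chosen proxies and return the body bytes.

    Hypothetical helper: raises RuntimeError if every attempt fails.
    """
    last_error = None
    for _ in range(attempts):
        opener = make_opener(random.choice(proxies))
        try:
            # opener.open() uses this opener only; no global state changes.
            with opener.open(url, timeout=timeout) as response:
                return response.read()
        except OSError as exc:
            # URLError and socket timeouts are both OSError subclasses.
            last_error = exc
    raise RuntimeError("all %d attempts failed" % attempts) from last_error
```

With the list from the article this could be called as `fetch_with_proxies('http://www.whatismyip.com.tw', iplist).decode('utf-8')`. Because the opener is not installed, other calls to `urllib.request.urlopen` in the same program are unaffected.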

