Example code: scraping Baidu Cloud links with Python urllib

2019-11-25 16:05:12


Source: reprinted

Contributed by: a site user

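Looking back through programs I wrote in the past, I found one that scrapes Baidu Cloud resources from Panduoduo (盤多多). I wrote it purely because I wanted to watch Transformers at the time. It was also my first contact with Python, and it took me about two days to get the program working, so you can tell that the code from back then really is crude. I am not much better now, haha, but I am still learning. I won't explain much and will go straight to the code, because I have forgotten what the variables were even declared for. Back then I did not know how to write to a file, and I did not know that a class can be initialized through __init__ either. Learning Python has taught me far more than I realized. Thank you, Python.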

# Python 2 code: it relies on raw_input, print statements and urllib.quote.
from bs4 import BeautifulSoup
import urllib
import requests  # imported in the original, but never used
import re

adr = []

# URL-encode the name of the resource to search for
search_text = raw_input('Enter the resource name to search for: ')
search_text = search_text.decode('gbk')  # the input is treated as GBK bytes
search_text = search_text.encode('utf-8')
search_text = urllib.quote(search_text)

# Fetch the search results page
home = urllib.urlopen('http://www.panduoduo.net/s/name/' + search_text)


# Get the Baidu Cloud addresses
def getbaidu(adr):
  for i in adr:
    url = urllib.urlopen('http://www.panduoduo.net' + i)
    bs = BeautifulSoup(url)
    bs1 = bs.select('.dbutton2')
    # The link is stored percent-encoded, so it starts with "http%"
    href = re.compile(r'http%(%|\d|\w|//|/|\.)*')
    b = href.search(str(bs1))
    name = str(bs.select('.center')).decode('utf-8')
    # Pull the resource title out of <h1 class="center">...</h1>
    text1 = re.compile(r'<h1\sclass="center">[\d\w\D\W]*</h1>')
    text2 = text1.search(name)
    rag1 = re.compile(r'>[\d\w\D\W]*<')
    if text2:
      text3 = rag1.search(text2.group())
      if text3:
        print text3.group()
    if b:
      text = urllib.unquote(str(b.group())).decode('utf-8')
      print text


# Initialization: collect the detail-page paths from the search results
def init(adr):
  soup = BeautifulSoup(home)
  soup = soup.select('.row')
  pattern = re.compile(r'/r/\d+')
  for i in soup:
    i = str(i)
    adress = pattern.search(i)
    if adress:  # skip rows that contain no detail link
      adr.append(adress.group())


print 'running---------'
init(adr)
getbaidu(adr)
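The two pure string steps in the script above, encoding the search keyword and decoding the percent-encoded link, can be tried without touching the network. The following is only a rough sketch in Python 3, not the author's code: the helper names `build_search_url` and `extract_link`, the simplified pattern, and the sample HTML are all made up for illustration, and the real page markup may differ.

```python
import re
from urllib.parse import quote, unquote

# Base search URL taken from the script above.
SEARCH_BASE = 'http://www.panduoduo.net/s/name/'

# Simplified variant of the script's pattern (an assumption, not the original):
# 'http%' followed by percent signs, word characters, slashes and dots.
ENCODED_LINK = re.compile(r'http%[%\w/.]*')


def build_search_url(keyword):
    """Percent-encode the keyword as UTF-8 and append it to the search URL."""
    return SEARCH_BASE + quote(keyword)


def extract_link(html):
    """Return the first percent-encoded link in html, decoded, or None."""
    match = ENCODED_LINK.search(html)
    return unquote(match.group()) if match else None


# Hypothetical markup, invented for this example only.
sample_html = (
    '<a class="dbutton2" '
    'href="/go?url=http%3A%2F%2Fpan.baidu.com%2Fs%2Fabc123">open</a>'
)

print(build_search_url('變形金剛'))
print(extract_link(sample_html))
```

In Python 3, `quote` encodes `str` input as UTF-8 by default, so the manual `decode('gbk')` and `encode('utf-8')` steps from the Python 2 script are not needed here.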

