A demo of scraping online stock information with a Python crawler

2020-02-16 11:28:50
Contributed by: a site user

The example is shown below:

import re
import traceback

import requests
from bs4 import BeautifulSoup


def getHTMLText(url):
    """Fetch a page and return its text, or "" if the request fails."""
    try:
        r = requests.get(url)
        r.raise_for_status()
        # Guess the encoding from the page content.
        r.encoding = r.apparent_encoding
        return r.text
    except requests.RequestException:
        return ""


def getStockList(lst, stockURL):
    """Append every stock code (e.g. sh600000, sz000001) linked from the list page."""
    html = getHTMLText(stockURL)
    soup = BeautifulSoup(html, 'html.parser')
    a = soup.find_all('a')
    for i in a:
        try:
            href = i.attrs['href']
            # "sh" or "sz" followed by six digits.
            lst.append(re.findall(r"[s][hz]\d{6}", href)[0])
        except (KeyError, IndexError):
            # The link has no href, or the href holds no stock code.
            continue


def getStockInfo(lst, stockURL, fpath):
    """Fetch each stock's page and append its fields to fpath, one dict per line."""
    for stock in lst:
        url = stockURL + stock + ".html"
        html = getHTMLText(url)
        try:
            if html == "":
                continue
            infoDict = {}
            soup = BeautifulSoup(html, 'html.parser')
            stockInfo = soup.find('div', attrs={'class': 'stock-bets'})

            name = stockInfo.find_all(attrs={'class': 'bets-name'})[0]
            # The key '股票名称' means "stock name".
            infoDict.update({'股票名称': name.text.split()[0]})

            # Each <dt> holds a field name and the matching <dd> its value.
            keyList = stockInfo.find_all('dt')
            valueList = stockInfo.find_all('dd')
            for i in range(len(keyList)):
                key = keyList[i].text
                val = valueList[i].text
                infoDict[key] = val

            with open(fpath, 'a', encoding='utf-8') as f:
                f.write(str(infoDict) + '\n')
        except Exception:
            # Report the failure and move on to the next stock.
            traceback.print_exc()
            continue


def main():
    stock_list_url = 'http://quote.eastmoney.com/stocklist.html'
    stock_info_url = 'https://gupiao.baidu.com/stock/'
    output_file = 'D:/BaiduStockInfo.txt'
    slist = []
    getStockList(slist, stock_list_url)
    getStockInfo(slist, stock_info_url, output_file)


if __name__ == '__main__':
    main()
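The link-filtering step in getStockList can be tried without any network access. The following is a minimal sketch, assuming the pattern is meant to use the `\d` digit class ("sh" or "sz" followed by six digits). The helper name extract_stock_codes and the sample hrefs are hypothetical and made up for illustration; they are not taken from the real list page.

```python
import re

# "sh" or "sz" followed by exactly six digits, as in getStockList.
STOCK_CODE = re.compile(r"s[hz]\d{6}")


def extract_stock_codes(hrefs):
    """Return the first stock code found in each href, skipping hrefs without one."""
    codes = []
    for href in hrefs:
        match = STOCK_CODE.search(href)
        if match:
            codes.append(match.group(0))
    return codes


# Hypothetical hrefs, invented for this example only.
sample_hrefs = [
    "http://quote.eastmoney.com/sh600000.html",
    "http://quote.eastmoney.com/sz000001.html",
    "http://quote.eastmoney.com/center/",
]

print(extract_stock_codes(sample_hrefs))  # ['sh600000', 'sz000001']
```

Using search and an explicit check in place of findall(...)[0] avoids relying on an IndexError to skip links that carry no stock code.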

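Optimized version with a progress display

This version passes the page encoding to getHTMLText explicitly instead of detecting it with apparent_encoding on every request, and it prints a running completion percentage on a single line.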

import re
import traceback

import requests
from bs4 import BeautifulSoup


def getHTMLText(url, code="utf-8"):
    """Fetch a page decoded with the given encoding, or return "" on failure."""
    try:
        r = requests.get(url)
        r.raise_for_status()
        # Use the caller-supplied encoding rather than detecting it.
        r.encoding = code
        return r.text
    except requests.RequestException:
        return ""


def getStockList(lst, stockURL):
    """Append every stock code (e.g. sh600000, sz000001) linked from the list page."""
    # The list page is requested as GB2312.
    html = getHTMLText(stockURL, "GB2312")
    soup = BeautifulSoup(html, 'html.parser')
    a = soup.find_all('a')
    for i in a:
        try:
            href = i.attrs['href']
            lst.append(re.findall(r"[s][hz]\d{6}", href)[0])
        except (KeyError, IndexError):
            # The link has no href, or the href holds no stock code.
            continue


def getStockInfo(lst, stockURL, fpath):
    """Fetch each stock's page, append its fields to fpath, and show progress."""
    count = 0
    for stock in lst:
        url = stockURL + stock + ".html"
        html = getHTMLText(url)
        try:
            if html == "":
                continue
            infoDict = {}
            soup = BeautifulSoup(html, 'html.parser')
            stockInfo = soup.find('div', attrs={'class': 'stock-bets'})
            name = stockInfo.find_all(attrs={'class': 'bets-name'})[0]
            # The key '股票名称' means "stock name".
            infoDict.update({'股票名称': name.text.split()[0]})
            keyList = stockInfo.find_all('dt')
            valueList = stockInfo.find_all('dd')
            for i in range(len(keyList)):
                key = keyList[i].text
                val = valueList[i].text
                infoDict[key] = val
            with open(fpath, 'a', encoding='utf-8') as f:
                f.write(str(infoDict) + '\n')
                count = count + 1
                # "\r" returns to the start of the line so the percentage is overwritten.
                print("\rCurrent progress: {:.2f}%".format(count * 100 / len(lst)), end="")
        except Exception:
            # A failed stock still counts towards the progress.
            count = count + 1
            print("\rCurrent progress: {:.2f}%".format(count * 100 / len(lst)), end="")
            continue


def main():
    stock_list_url = 'http://quote.eastmoney.com/stocklist.html'
    stock_info_url = 'https://gupiao.baidu.com/stock/'
    output_file = 'BaiduStockInfo.txt'
    slist = []
    getStockList(slist, stock_list_url)
    getStockInfo(slist, stock_info_url, output_file)


if __name__ == '__main__':
    main()
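The single-line progress display can be isolated from the scraping logic. The sketch below is one possible arrangement, not the author's code: format_progress and run_with_progress are hypothetical helper names, and the stock codes passed in are placeholder values. It assumes the progress line is meant to begin with a carriage return so that each update overwrites the previous one.

```python
import sys


def format_progress(done, total):
    """Build a progress line; the leading "\\r" moves the cursor back to column 0."""
    return "\rCurrent progress: {:.2f}%".format(done * 100 / total)


def run_with_progress(items, handle):
    """Call handle(item) for every item and redraw the percentage after each one."""
    total = len(items)
    for done, item in enumerate(items, start=1):
        try:
            handle(item)
        except Exception:
            # A failed item still counts towards the progress.
            pass
        sys.stdout.write(format_progress(done, total))
        sys.stdout.flush()
    sys.stdout.write("\n")


# Placeholder stock codes and a no-op handler, for illustration only.
run_with_progress(["sh600000", "sz000001", "sz000002"], lambda code: None)
```

Counting with enumerate after every item, whether it succeeded or not, lets the display reach 100% even when some pages come back empty or fail to parse.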