

Extracting the Links from a Web Page with ASP.NET

2024-07-10 12:55:05
Source: reposted
Contributed by: a site user
Enter a URL and the links in that page are extracted. The code below does this, relying mainly on a regular expression.
  
  The code for geturl.aspx is as follows:
  
  <%@ page language="vb" codebehind="geturl.aspx.vb" autoeventwireup="false" inherits="aspxweb.geturl" %>
  <html>
  <head>
  <meta http-equiv="content-type" content="text/html; charset=gb2312">
  </head>
  <body>
  <form id="form1" method="post" runat="server">
   <p>
   <asp:label id="label1" runat="server"></asp:label>
   <asp:textbox id="urltextbox" runat="server" width="336px">
   http://lucky_elove.www1.dotnetplayground.com/
   </asp:textbox>
   <asp:button onclick="scrapebutton_click" id="scrapebutton" runat="server"></asp:button>
   </p>
   <hr width="100%" size="1">
   <p>
   <asp:label id="tipresult" runat="server"></asp:label>
   <asp:textbox id="resultlabel" runat="server" textmode="multiline"
   width="100%" height="400"></asp:textbox>
   </p>
  </form>
  </body>
  </html>
  The code-behind file geturl.aspx.vb is as follows:
  
  imports system.io
  imports system.net
  imports system.text
  imports system.text.regularexpressions
  imports system
  
  public class geturl
   inherits system.web.ui.page
   protected withevents label1 as system.web.ui.webcontrols.label
   protected withevents urltextbox as system.web.ui.webcontrols.textbox
   protected withevents scrapebutton as system.web.ui.webcontrols.button
   protected withevents tipresult as system.web.ui.webcontrols.label
   protected withevents resultlabel as system.web.ui.webcontrols.textbox
  
  #region " Web Form Designer generated code "
  
   'This call is required by the Web Form Designer.
   <system.diagnostics.debuggerstepthrough()> private sub initializecomponent()
  
   end sub
  
   private sub page_init(byval sender as system.object, byval e as system.eventargs) handles mybase.init
   'codegen: this method call is required by the Web Form Designer.
   'Do not modify it using the code editor.
   initializecomponent()
   end sub
  
  #end region
  
   private sub page_load(byval sender as system.object, byval e as system.eventargs) handles mybase.load
   'Put user code to initialize the page here
   label1.text = "Please enter a URL:"
   scrapebutton.text = "Extract href links"
   end sub
   private report as new stringbuilder()
   private webpage as string
   private countofmatches as int32
  
   public sub scrapebutton_click(byval sender as system.object, byval e as system.eventargs)
   webpage = graburl()
   dim mydelegate as new matchevaluator(addressof matchhandler)
  
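   'Match <a ... href="..."> tags whose target does not start with http:// or mailto:.
   'The foundanchor group is greedy so that it captures the whole href value up to the closing quote.
   dim linksexpression as new regex( _
   "<a.+?href=['""](?!http://)(?!mailto:)(?<foundanchor>[^'"">]+)[^>]*?>", _
   regexoptions.multiline or regexoptions.ignorecase or regexoptions.ignorepatternwhitespace)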
  
   dim newwebpage as string = linksexpression.replace(webpage, mydelegate)
  
   tipresult.text = "<h2>href links extracted from " & server.htmlencode(urltextbox.text) & "</h2>" & _
   "<b>Found and fixed " & countofmatches.tostring() & " links</b><br><br>" & _
   report.tostring().replace(environment.newline, "<br>")
   tipresult.text &= "<h2>The fixed page</h2><script>window.document.title='Extracting the links from a web page'</script>"
   resultlabel.text = newwebpage
   end sub
  
   public function matchhandler(byval m as match) as string
   dim link as string = m.groups("foundanchor").value
   dim rtol as new regex("^", regexoptions.multiline or regexoptions.righttoleft)
   dim col, row as int32
   dim linebegin as int32 = rtol.match(webpage, m.index).index
  
   row = rtol.matches(webpage, m.index).count
   col = m.index - linebegin
  
   report.appendformat( _
   "link <b>{0}</b>, fixed at row: {1}, col: {2}{3}", _
   server.htmlencode(m.groups(0).value), _
   row, _
   col, _
   environment.newline _
   )
   dim newlink as string
   if link.startswith("/") then
   newlink = link.substring(1)
   else
   newlink = link
   end if
  
   countofmatches += 1
   return m.groups(0).value.replace(link, newlink)
   end function
  
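   private function graburl() as string
   'Download the page; the url is trimmed in case the input carries surrounding whitespace.
   dim wc as new webclient()
   dim s as stream = wc.openread(urltextbox.text.trim())
   dim sr as streamreader = new streamreader(s, system.text.encoding.default)
   try
   graburl = sr.readtoend()
   finally
   'Closing the reader also closes the underlying stream.
   sr.close()
   wc.dispose()
   end try
   end function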
  
  end class
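
To try the link-matching step on its own, without the page or the download, something like the following console sketch may help. It is only an illustration under stated assumptions: the module name LinkSketch, the function ExtractRelativeLinks and the sample markup are hypothetical and not part of the original article. The pattern follows the same idea as linksexpression above, with the foundanchor group written greedily so that it captures the whole href value.

```vb
Imports System
Imports System.Collections.Generic
Imports System.Text.RegularExpressions

' Hypothetical stand-alone module, not part of the article's page class.
Module LinkSketch

    ' Matches <a ... href="..."> tags whose target does not start with
    ' http:// or mailto:, capturing the target in the group foundanchor.
    Private ReadOnly RelativeLink As New Regex( _
        "<a.+?href=['""](?!http://)(?!mailto:)(?<foundanchor>[^'"">]+)[^>]*?>", _
        RegexOptions.IgnoreCase)

    ' Returns the href value of every relative link found in html.
    Function ExtractRelativeLinks(ByVal html As String) As List(Of String)
        Dim links As New List(Of String)()
        For Each m As Match In RelativeLink.Matches(html)
            links.Add(m.Groups("foundanchor").Value)
        Next
        Return links
    End Function

    Sub Main()
        ' Hypothetical sample markup: two relative links and one absolute link.
        Dim sample As String = _
            "<a href=""/news/index.html"">News</a>" & Environment.NewLine & _
            "<a href=""http://example.com/"">External</a>" & Environment.NewLine & _
            "<a href='about.html' class='nav'>About</a>"

        For Each link As String In ExtractRelativeLinks(sample)
            Console.WriteLine(link)
        Next
    End Sub

End Module
```

With this sample, the absolute http:// link is skipped by the negative lookahead, so only /news/index.html and about.html are printed. A regular expression is a pragmatic choice for a quick demo like this; for arbitrary real-world HTML, a proper HTML parser would likely be more robust.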