[Discussion] A question about reading the source code of downloaded web pages

liujingnjau [Expert points: 50] posted on 2007-06-19 10:22:00

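I downloaded some web pages with WebZip and want to read their source code.
I open each file as plain text, using FileStream and StreamReader. Some pages read correctly, but others come out garbled. I suspect the page encoding is the cause, so I found some code online that detects a text file's encoding, but it did not help. Some of the pages I need to read are in Traditional Chinese. I am posting part of my code here; please help me figure this out.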
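Dim fs As System.IO.FileStream
Dim sr As System.IO.StreamReader

Dim strtemp As String
Dim strfulltext As String
Dim strfilename As String
Dim enco As System.Text.Encoding

strfilename = "e:\…….htm"
enco = getcode(strfilename)  ' call the function that detects the text file's encoding
strfulltext = ""

fs = New System.IO.FileStream(strfilename, IO.FileMode.Open, IO.FileAccess.Read, IO.FileShare.Read)
sr = New System.IO.StreamReader(fs, enco)
Do
    strtemp = sr.ReadLine()
    If Not strtemp Is Nothing Then
        ' ReadLine strips the line terminator, so put a line break back
        strfulltext = strfulltext & strtemp & vbCrLf
    End If
Loop Until strtemp Is Nothing  ' ReadLine returns Nothing at end of file; without this the loop never ends
sr.Close()  ' closing the reader also closes the underlying FileStream

TextBox1.Text = strfulltext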
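
For saved HTML pages, generic text-file detection often fails because most pages carry no byte order mark; the encoding is usually declared inside the page itself, in a `charset=` attribute near the top (Traditional Chinese pages often declare `big5`). The following is only a minimal sketch of that idea, not the `getcode` function used above: `GuessHtmlEncoding` and `ReadHtmlSource` are hypothetical names, and the sketch assumes the declaration appears within the first 4 KB of the file. It also assumes the .NET Framework, where legacy code pages such as Big5 are available by default; on newer .NET runtimes those code pages would have to be registered first, otherwise the fallback encoding is used.

```vbnet
Imports System
Imports System.IO
Imports System.Text
Imports System.Text.RegularExpressions

Module HtmlEncodingSketch

    ' Hypothetical helper: guess an HTML file's encoding from its byte order mark
    ' or from the charset declared in the page, else return the caller's fallback.
    Function GuessHtmlEncoding(ByVal path As String, ByVal fallback As Encoding) As Encoding
        ' Read at most the first 4 KB; a charset declaration normally sits in <head>.
        Dim buffer(4095) As Byte
        Dim count As Integer
        Using fs As New FileStream(path, FileMode.Open, FileAccess.Read, FileShare.Read)
            count = fs.Read(buffer, 0, buffer.Length)
        End Using

        ' 1. A byte order mark, when present, is the most reliable signal.
        If count >= 3 AndAlso buffer(0) = &HEF AndAlso buffer(1) = &HBB AndAlso buffer(2) = &HBF Then
            Return Encoding.UTF8
        End If
        If count >= 2 AndAlso buffer(0) = &HFF AndAlso buffer(1) = &HFE Then
            Return Encoding.Unicode            ' UTF-16, little endian
        End If
        If count >= 2 AndAlso buffer(0) = &HFE AndAlso buffer(1) = &HFF Then
            Return Encoding.BigEndianUnicode   ' UTF-16, big endian
        End If

        ' 2. Decode the head as Latin-1 (every byte maps to one character, so the
        '    ASCII markup stays readable) and look for charset=... in it.
        Dim head As String = Encoding.GetEncoding("iso-8859-1").GetString(buffer, 0, count)
        Dim m As Match = Regex.Match(head, "charset\s*=\s*[""']?([A-Za-z0-9_\-]+)", RegexOptions.IgnoreCase)
        If m.Success Then
            Try
                Return Encoding.GetEncoding(m.Groups(1).Value)
            Catch ex As ArgumentException
                ' Unknown or unsupported charset name: fall through to the fallback.
            End Try
        End If

        ' 3. Nothing found: use whatever the caller considers a sensible default.
        Return fallback
    End Function

    ' Hypothetical helper: read the whole file with the guessed encoding.
    Function ReadHtmlSource(ByVal path As String, ByVal fallback As Encoding) As String
        Dim enc As Encoding = GuessHtmlEncoding(path, fallback)
        Using sr As New StreamReader(path, enc)
            Return sr.ReadToEnd()   ' keeps the original line breaks
        End Using
    End Function

End Module
```

With helpers like these, the loop above could be replaced by something like `TextBox1.Text = ReadHtmlSource(strfilename, System.Text.Encoding.Default)`. Pages that declare no charset at all would still need a manually chosen fallback.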
