Web Crawler (Reposted)

Published: 2024/4/15
You first need to determine the page's character encoding. The code below handles most web pages.
    using System;
    using System.IO;
    using System.Net;
    using System.Text;
    using System.Text.RegularExpressions;
    using System.Windows.Forms;

    // These methods belong inside a Windows Forms class that owns textBox1 and button3.

    private void button3_Click(object sender, EventArgs e)
    {
        string[] urlList =
        {
            "http://www.kbs.co.kr/",
            "http://rosemary.kbs.co.kr/",
            "http://sbcx.saic.gov.cn/trade/index.jsp",
            "http://www.csdn.net",
            "http://www.google.cn/",
            "http://www.baidu.com",
            "http://www.javaeye.com/",
            "http://blog.163.com/kel_scott66/blog/static/1150539632009614115635700/",
            "http://www.sina.com.hk/",
            "http://www.rthk.org.hk/"
        };

        foreach (string u in urlList)
        {
            textBox1.Text = GetWebPage(u, "GET");
            MessageBox.Show(u);
        }
    }

    // Builds a request with the method, timeout, and User-Agent used for both passes.
    private HttpWebRequest CreateRequest(string uri, string method)
    {
        HttpWebRequest req = (HttpWebRequest)WebRequest.Create(uri);
        req.Method = method;
        req.Timeout = 10000;
        req.UserAgent = "Mozilla/5.0 (Windows; U; Windows NT 5.2; zh-CN; rv:1.9.1.4) Gecko/20091016 Firefox/3.5.4 (.NET CLR 3.5.30729)";
        return req;
    }

    public string GetWebPage(string uri, string method)
    {
        try
        {
            string returnedEncoding = "";

            // First pass: read the page as UTF-8 only to discover its real encoding.
            HttpWebRequest req = CreateRequest(uri, method);
            using (HttpWebResponse res = (HttpWebResponse)req.GetResponse())
            using (StreamReader sr = new StreamReader(res.GetResponseStream(), Encoding.UTF8))
            {
                string returnedContent = sr.ReadToEnd();

                // 1. A charset declared in the HTML, for example:
                //    <meta http-equiv='Content-Type' content='text/html; charset=big5'>
                Regex regCharset = new Regex(
                    @"charset\b\s*=\s*[""']?(?<charset>[\w\-]+)", RegexOptions.IgnoreCase);
                Match m = regCharset.Match(returnedContent);
                if (m.Success)
                {
                    returnedEncoding = m.Groups["charset"].Value;
                }

                // 2. The charset parameter of the Content-Type response header.
                if (returnedEncoding == "")
                {
                    string ct = (res.ContentType ?? "").ToLower().Replace(" ", "");
                    int pos = ct.IndexOf("charset=");
                    if (pos > -1)
                    {
                        returnedEncoding = ct.Substring(pos + 8).Split(';')[0].Trim('"', '\'');
                    }
                }

                // res.ContentEncoding names the compression (for example gzip), not the
                // character set, so it is not consulted here.

                // 3. The character set as parsed by HttpWebResponse.
                if (returnedEncoding == "")
                {
                    returnedEncoding = res.CharacterSet ?? "";
                }
            }

            // Fall back to the system default encoding when nothing was found.
            Encoding htmlEncoding = Encoding.Default;
            if (returnedEncoding != "")
            {
                htmlEncoding = Encoding.GetEncoding(returnedEncoding);
            }

            // Second pass: fetch the page again and decode it with the detected encoding.
            req = CreateRequest(uri, method);
            using (HttpWebResponse res = (HttpWebResponse)req.GetResponse())
            using (StreamReader sr = new StreamReader(res.GetResponseStream(), htmlEncoding))
            {
                return sr.ReadToEnd();
            }
        }
        catch (Exception)
        {
            // Network errors and unknown encoding names both end up here.
            return "Fetch failed!";
        }
    }
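
The detection order used in the listing (a charset declared in the HTML first, then the Content-Type header) can also be applied to a body that has been downloaded only once. The following is a minimal sketch of that idea, not the original author's method. The names CharsetSniffer, FindCharset, and Decode are hypothetical, and the UTF-8 fallback for a missing or unknown charset is an assumption.

```csharp
using System;
using System.Text;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the original post.
public static class CharsetSniffer
{
    // Matches charset=name, charset="name", or charset='name'.
    private static readonly Regex CharsetPattern = new Regex(
        @"charset\s*=\s*[""']?(?<charset>[\w\-]+)", RegexOptions.IgnoreCase);

    // Returns the first charset name found in the text, or "" if there is none.
    public static string FindCharset(string text)
    {
        if (string.IsNullOrEmpty(text)) return "";
        Match m = CharsetPattern.Match(text);
        return m.Success ? m.Groups["charset"].Value : "";
    }

    // Decodes an already downloaded body. The HTML is checked first, then the
    // Content-Type header value, mirroring the order in the listing.
    public static string Decode(byte[] body, string contentType)
    {
        // A charset declaration is plain ASCII, so an ASCII view of the bytes
        // is enough to find it; non-ASCII bytes simply become '?'.
        string name = FindCharset(Encoding.ASCII.GetString(body));
        if (name == "") name = FindCharset(contentType);

        Encoding encoding = Encoding.UTF8; // assumed default
        if (name != "")
        {
            try
            {
                encoding = Encoding.GetEncoding(name);
            }
            catch (ArgumentException)
            {
                // Unknown charset name: keep the assumed default.
            }
        }
        return encoding.GetString(body);
    }
}
```

To use it with the listing, the response stream would be copied into a byte array once and passed to CharsetSniffer.Decode together with res.ContentType, which avoids the second request. On .NET Core and later, legacy code pages such as big5 are only available after registering CodePagesEncodingProvider.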

Reposted from: https://www.cnblogs.com/z2002m/archive/2009/11/09/1599176.html
