[ShoppingPeeker] - Companion to ShoppingWebCrawler, the WebKit-based crawler spider engine - Visual task web management...
ShoppingPeeker
This project is the visual task site for the spider project.
GitHub repository: ShoppingPeeker
Language: C#
Development tools: Visual Studio 2017 + .NET Core 2.1
Platforms: Linux/Windows
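# Overview
ShoppingPeeker is the visual task tool for the ShoppingWebCrawler project.
The project is built on .NET Core 2.x and runs on Windows, Linux, and Mac.
Local collection tasks talk to the spider server over sockets, which keeps the communication stable and efficient.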
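How to deploy?
1. Clone the ShoppingWebCrawler project locally, e.g. d:\src\ShoppingWebCrawler.
2. Clone the ShoppingPeeker project locally, e.g. d:\src\ShoppingPeeker.
(Note: both projects must sit under the same parent directory, because the build uses file-path references between them!)
3. Start the spider, ShoppingWebCrawler.
4. Open the project in Visual Studio 2017, or go to the project folder in cmd/PowerShell and run dotnet build, then dotnet run.
5. Congratulations, the project is now running. The input box on the demo site demonstrates one feature: enter a product keyword and it fetches the product list from the corresponding e-commerce platform.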
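Project structure
ShoppingPeeker.Web: a .NET Core 2.x ASP.NET MVC site.
Plugin mode: each e-commerce platform has its own plugin that parses the collection tasks for that platform. When the site starts, it scans the plugin directory, attaches the plugins it finds, and watches for changes.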
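The post does not show the plugin loader itself. The following is a minimal sketch of how scanning a plugin directory and watching it for changes could look, not the project's actual code. `IPlatformPlugin`, `PluginScanner`, and `DemoPlugin` are hypothetical names introduced only for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Reflection;

// Hypothetical plugin contract; the real project's interface is not shown in the post.
public interface IPlatformPlugin
{
    string PlatformName { get; }
}

// Hypothetical plugin used only to demonstrate discovery.
public class DemoPlugin : IPlatformPlugin
{
    public string PlatformName { get { return "demo"; } }
}

public static class PluginScanner
{
    // Load every *.dll in pluginDir and instantiate the plugin types found in it.
    public static List<IPlatformPlugin> LoadAll(string pluginDir)
    {
        var plugins = new List<IPlatformPlugin>();
        foreach (var dll in Directory.EnumerateFiles(pluginDir, "*.dll"))
        {
            Assembly asm;
            try { asm = Assembly.LoadFrom(dll); }
            catch (BadImageFormatException) { continue; } // skip files that are not managed assemblies
            plugins.AddRange(FromAssembly(asm));
        }
        return plugins;
    }

    // Instantiate every concrete class with a public parameterless constructor
    // that implements IPlatformPlugin.
    public static IEnumerable<IPlatformPlugin> FromAssembly(Assembly asm)
    {
        return asm.GetTypes()
            .Where(t => t.IsClass && !t.IsAbstract
                        && typeof(IPlatformPlugin).IsAssignableFrom(t)
                        && t.GetConstructor(Type.EmptyTypes) != null)
            .Select(t => (IPlatformPlugin)Activator.CreateInstance(t));
    }

    // Raise onChanged with the file path whenever a plugin dll is created or modified.
    // The caller keeps the returned watcher alive and disposes it on shutdown.
    public static FileSystemWatcher Watch(string pluginDir, Action<string> onChanged)
    {
        var watcher = new FileSystemWatcher(pluginDir, "*.dll");
        watcher.Created += (sender, e) => onChanged(e.FullPath);
        watcher.Changed += (sender, e) => onChanged(e.FullPath);
        watcher.EnableRaisingEvents = true;
        return watcher;
    }
}

public static class Program
{
    public static void Main()
    {
        // Discover the plugins compiled into this assembly.
        foreach (var plugin in PluginScanner.FromAssembly(Assembly.GetExecutingAssembly()))
        {
            Console.WriteLine(plugin.PlatformName);
        }
    }
}
```

In a real site the `Watch` callback would typically reload the affected plugin; how ShoppingPeeker handles reloads is not described in the post.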
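Data persistence
A Dapper-based DataAccess layer. It wraps Dapper behind a LINQ-style interface that supports create, read, update, and delete operations on the raw table data.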
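The post does not show how the DataAccess layer turns a lambda into SQL. As a rough illustration of the general technique, the sketch below walks an expression tree and emits a parameterized WHERE clause that could then be handed to a micro-ORM such as Dapper. It is not the project's implementation: `WhereClauseBuilder` and `StudentRow` are hypothetical names, and only simple comparisons joined by `&&` / `||` are handled (no method calls such as `Contains`, no NULL handling).

```csharp
using System;
using System.Collections.Generic;
using System.Linq.Expressions;
using System.Text;

// Hypothetical row type used only for this illustration.
public class StudentRow
{
    public int Id { get; set; }
    public string Name { get; set; }
}

// Translates a small subset of predicates into a parameterized WHERE clause.
public sealed class WhereClauseBuilder : ExpressionVisitor
{
    private readonly StringBuilder _sql = new StringBuilder();
    private readonly Dictionary<string, object> _parameters = new Dictionary<string, object>();

    public static string Build<T>(Expression<Func<T, bool>> predicate,
                                  out Dictionary<string, object> parameters)
    {
        var builder = new WhereClauseBuilder();
        builder.Visit(predicate.Body);
        parameters = builder._parameters;
        return builder._sql.ToString();
    }

    protected override Expression VisitBinary(BinaryExpression node)
    {
        _sql.Append('(');
        Visit(node.Left);
        _sql.Append(' ').Append(ToSqlOperator(node.NodeType)).Append(' ');
        Visit(node.Right);
        _sql.Append(')');
        return node;
    }

    protected override Expression VisitMember(MemberExpression node)
    {
        // x.Id on the lambda parameter becomes a column name.
        if (node.Expression is ParameterExpression)
        {
            _sql.Append(node.Member.Name);
            return node;
        }
        // Anything else (for example a captured local) is evaluated and sent as a parameter.
        AddParameter(Expression.Lambda(node).Compile().DynamicInvoke());
        return node;
    }

    protected override Expression VisitConstant(ConstantExpression node)
    {
        AddParameter(node.Value);
        return node;
    }

    protected override Expression VisitMethodCall(MethodCallExpression node)
    {
        throw new NotSupportedException("Method calls are outside the scope of this sketch.");
    }

    private void AddParameter(object value)
    {
        string name = "@p" + _parameters.Count;
        _parameters[name] = value;
        _sql.Append(name);
    }

    private static string ToSqlOperator(ExpressionType type)
    {
        switch (type)
        {
            case ExpressionType.Equal: return "=";
            case ExpressionType.NotEqual: return "<>";
            case ExpressionType.GreaterThan: return ">";
            case ExpressionType.GreaterThanOrEqual: return ">=";
            case ExpressionType.LessThan: return "<";
            case ExpressionType.LessThanOrEqual: return "<=";
            case ExpressionType.AndAlso: return "AND";
            case ExpressionType.OrElse: return "OR";
            default: throw new NotSupportedException(type.ToString());
        }
    }
}

public static class Program
{
    public static void Main()
    {
        Dictionary<string, object> parameters;
        string where = WhereClauseBuilder.Build<StudentRow>(
            x => x.Id > 0 || x.Name == "demo", out parameters);

        Console.WriteLine(where); // ((Id > @p0) OR (Name = @p1))
        foreach (var pair in parameters)
        {
            Console.WriteLine(pair.Key + " = " + pair.Value);
        }
    }
}
```

Keeping values in a parameter dictionary instead of inlining them into the SQL string is the design choice that matters here, since it avoids SQL injection regardless of which data access library executes the query.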
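TCP communication with the spider: example
TCP communication is wrapped in a connection API modeled on ADO.NET connections, so it is easy to pick up and configurable.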
Basic communication:

    var connStr = ConfigHelper.WebCrawlerSection.ConnectionStringCollection.First();
    using (var conn = new SoapTcpConnection(connStr))
    {
        if (conn.State == ConnectionState.Closed)
        {
            conn.Open();
        }

        // Send a ping. resut is a string declared outside this excerpt.
        var str = conn.SendString(CommandConstants.CMD_Ping);
        resut = string.Concat("time :", DateTime.Now.ToString(), "; tcp server response: ", str);
    }
Collection request:

    var connStrConfig = webArgs.SystemAttachParas["SoapTcpConnectionString"] as WebCrawlerConnection;

    // Rewrite the resolved URL: the sliced JSONP URL of the first page
    string urlOfSlicedJsonp = this.ResolveSlicedSearchPageSilcedUrl(webArgs, next_start, show_items);
    webArgs.ResolvedUrl = new ResolvedSearchUrlWithParas { Url = urlOfSlicedJsonp };

    using (var conn = new SoapTcpConnection(connStrConfig))
    {
        if (conn.State == ConnectionState.Closed)
        {
            conn.Open();
        }

        // Send the SOAP message
        var soapCmd = new SoapMessage() { Head = CommandConstants.CMD_FetchPage };
        soapCmd.Body = JsonConvert.SerializeObject(webArgs);
        var dataContainer = conn.SendSoapMessage(soapCmd);
        if (null != dataContainer && dataContainer.Status == 1)
        {
            htmlItemsContent = dataContainer.Result;
        }
        else
        {
            // Log the failed request together with any server-side error message
            StringBuilder errMsg = new StringBuilder("Page fetch request failed! Parameters: ");
            errMsg.Append(soapCmd.Body);
            if (null != dataContainer && !string.IsNullOrEmpty(dataContainer.ErrorMsg))
            {
                errMsg.Append("; server error message: ").Append(dataContainer.ErrorMsg);
            }
            PluginContext.Logger.Error(errMsg.ToString());
        }
    }
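The internals of `SoapTcpConnection` are not shown in the post. To make the "ADO.NET-style connection over TCP" idea concrete, here is a minimal sketch under stated assumptions: `LineTcpConnection` is a hypothetical class, the `host:port` connection string format is invented for the example, and messages are framed as single text lines, which may differ from the real protocol. The demo runs against a throwaway loopback server that answers `ping` with `pong`.

```csharp
using System;
using System.Data;
using System.IO;
using System.Net;
using System.Net.Sockets;
using System.Text;
using System.Threading.Tasks;

// Hypothetical connection type with the Open/State/Dispose shape of an ADO.NET connection.
public sealed class LineTcpConnection : IDisposable
{
    private readonly string _host;
    private readonly int _port;
    private TcpClient _client;
    private StreamReader _reader;
    private StreamWriter _writer;

    public ConnectionState State { get; private set; }

    // Assumed connection string format: "host:port".
    public LineTcpConnection(string connectionString)
    {
        var parts = connectionString.Split(':');
        _host = parts[0];
        _port = int.Parse(parts[1]);
        State = ConnectionState.Closed;
    }

    public void Open()
    {
        _client = new TcpClient();
        _client.Connect(_host, _port);
        var stream = _client.GetStream();
        _reader = new StreamReader(stream, Encoding.UTF8);
        _writer = new StreamWriter(stream, new UTF8Encoding(false)) { AutoFlush = true };
        State = ConnectionState.Open;
    }

    // Send one line and block until one line comes back.
    public string SendString(string command)
    {
        if (State != ConnectionState.Open)
        {
            throw new InvalidOperationException("Connection is not open.");
        }
        _writer.WriteLine(command);
        return _reader.ReadLine();
    }

    public void Dispose()
    {
        if (_client != null)
        {
            _client.Close();
        }
        State = ConnectionState.Closed;
    }
}

public static class Program
{
    public static void Main()
    {
        // Throwaway server on a free loopback port; it handles a single request.
        var listener = new TcpListener(IPAddress.Loopback, 0);
        listener.Start();
        int port = ((IPEndPoint)listener.LocalEndpoint).Port;

        var server = Task.Run(() =>
        {
            using (var client = listener.AcceptTcpClient())
            using (var stream = client.GetStream())
            using (var reader = new StreamReader(stream))
            using (var writer = new StreamWriter(stream) { AutoFlush = true })
            {
                var line = reader.ReadLine();
                writer.WriteLine(line == "ping" ? "pong" : "unknown");
            }
        });

        using (var conn = new LineTcpConnection("127.0.0.1:" + port))
        {
            if (conn.State == ConnectionState.Closed)
            {
                conn.Open();
            }
            Console.WriteLine(conn.SendString("ping")); // pong
        }

        server.Wait();
        listener.Stop();
    }
}
```

Mirroring the `Open` / `State` / `using` pattern means callers who already know database connections can use the TCP client the same way, which matches the usage shown in the two excerpts above.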
Data access examples

    /// <summary>
    /// Add
    /// </summary>
    [TestMethod()]
    public void AddOneStudentsModelTest()
    {
        var model = new StudentsModel
        {
            Name = "你猜猜-" + DateTime.Now.ToString(),
            Age = DateTime.Now.Second,
            Sex = true,
            Score = 55.98m,
            Longitude = 555555.6666,
            AddTime = DateTime.Now
        };
        var result = serviceOfStudents.AddOneStudentsModel(model);

        // Time a second insert
        var watch = new System.Diagnostics.Stopwatch();
        watch.Start();
        model = new StudentsModel
        {
            Name = "你猜222222222猜-" + DateTime.Now.ToString(),
            Age = DateTime.Now.Second,
            Sex = true,
            Score = 6655.98m,
            Longitude = 99999999,
            AddTime = DateTime.Now
        };
        result = serviceOfStudents.AddOneStudentsModel(model);
        watch.Stop();
        Console.WriteLine(string.Format("real for insert one data use time is :{0} ms.", watch.ElapsedMilliseconds));
        Assert.IsTrue(result > 0);
    }

    /// <summary>
    /// Bulk add
    /// </summary>
    [TestMethod()]
    public void AddMulitiStudentsModelsTest()
    {
        var lstData = new List<StudentsModel>();
        var rand = new Random(DateTime.Now.Millisecond);
        for (int i = 0; i < 100; i++)
        {
            var model = new StudentsModel
            {
                Name = "你猜猜-" + Guid.NewGuid().ToString(),
                Age = rand.Next(1, 100),
                Sex = false,
                Score = 33355.98m,
                Longitude = 59595959,
                AddTime = DateTime.Now
            };
            lstData.Add(model);
        }
        var result = serviceOfStudents.AddMulitiStudentsModels(lstData);
        Assert.IsTrue(1 == 1);
    }

    /// <summary>
    /// Update an entity by primary key
    /// </summary>
    [TestMethod()]
    public void UpdateOneStudentsModelTest()
    {
        var model = new StudentsModel
        {
            Id = 1,
            Age = 100
        };
        var result = serviceOfStudents.UpdateOneStudentsModel(model);
        Assert.IsTrue(result);
    }

    /// <summary>
    /// Conditional update
    /// Multiple conditions joined with AND
    /// </summary>
    [TestMethod()]
    public void UpdateStudentsModelsByConditionTest()
    {
        var model = new StudentsModel
        {
            Age = 333
        };
        var result = serviceOfStudents.UpdateStudentsModelsByCondition(model, x => x.Id > 0 && x.Name.Contains("你猜猜%"));
        Assert.IsTrue(result);
    }

    /// <summary>
    /// Get by primary key
    /// </summary>
    [TestMethod()]
    public void GetstudentsElementByIdTest()
    {
        var model = this.serviceOfStudents.GetstudentsElementById(1);
        Assert.IsTrue(null != model);
    }

    /// <summary>
    /// Conditional query
    /// Conditions joined with OR
    /// </summary>
    [TestMethod()]
    public void GetfStudentsElementsByConditionTest()
    {
        var lstData = this.serviceOfStudents.GetstudentsElementsByCondition(x => x.Id == 1 || x.Name.Contains("你猜猜%"));
        //(x => x.PubSubWsAddr.LenFuncInSql() > 0);
        Assert.IsTrue(lstData.Count > 0);

        // A null condition returns all rows
        lstData = this.serviceOfStudents.GetstudentsElementsByCondition(null);
        Assert.IsTrue(lstData.Count > 0);
    }

    /// <summary>
    /// Conditional delete
    /// </summary>
    [TestMethod()]
    public void DeleteMulitiservicesAddressByConditionTest()
    {
        //var result = this.serviceOfStudents
        //    .DeleteMulitiservicesAddressByCondition(x => x.PubSubWsAddr.LenFuncInSql() > 0);
        var result = this.serviceOfStudents.DeleteMulitistudentsByCondition(x => x.Id == 1 || x.Name.Contains("你猜猜%"));
        Assert.IsTrue(result);
    }

    // Build a query from multiple conditions (lambda expressions whose bodies are merged)
    [TestMethod()]
    public void GetByMultipleConditionsTest()
    {
        // Combine conditions
        var predicate = PredicateBuilder.CreatNew<StudentsModel>();
        string id = "55";
        if (!string.IsNullOrEmpty(id) && id.ToInt() > 0)
        {
            predicate = predicate.And(s => s.Id <= id.ToInt());
        }

        // Start combining the expression bodies
        predicate = predicate.Or(s => s.Name.Contains("你猜猜-2%"));
        var model = this.serviceOfStudents.GetstudentsElementsByCondition(predicate);
        Assert.IsNotNull(model);
    }

    [TestMethod()]
    public void GetStudentsModelsElementsByPagerAndConditionTest()
    {
        var pageSize = 10;
        var pageIndex = 0;
        var totalRecords = -1;
        var totalPages = -1;
        var lstData = this.serviceOfStudents.GetstudentsElementsByPagerAndCondition(
            pageIndex,
            pageSize,
            out totalRecords,
            out totalPages,
            x => x.Id > 0, // PubSubWsAddr.LenFuncInSql()
            "Id",
            OrderRule.DESC);
        Assert.IsTrue(lstData.Count > 0);
    }
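The multiple-conditions test relies on a `PredicateBuilder` whose source is not included in the post. A common way to merge lambda bodies is to rewrite the second lambda so it uses the first lambda's parameter, then join the two bodies with `AndAlso` or `OrElse`. The sketch below shows that general technique under assumed names (`PredicateSketch`, `Student`). It is not the project's `PredicateBuilder`, and it has no `CreatNew` seed predicate.

```csharp
using System;
using System.Linq.Expressions;

// Hypothetical entity used only for this illustration.
public class Student
{
    public int Id { get; set; }
    public string Name { get; set; }
}

public static class PredicateSketch
{
    public static Expression<Func<T, bool>> And<T>(
        this Expression<Func<T, bool>> left, Expression<Func<T, bool>> right)
    {
        return Combine(left, right, Expression.AndAlso);
    }

    public static Expression<Func<T, bool>> Or<T>(
        this Expression<Func<T, bool>> left, Expression<Func<T, bool>> right)
    {
        return Combine(left, right, Expression.OrElse);
    }

    private static Expression<Func<T, bool>> Combine<T>(
        Expression<Func<T, bool>> left,
        Expression<Func<T, bool>> right,
        Func<Expression, Expression, BinaryExpression> merge)
    {
        // Both bodies must refer to the same parameter object,
        // so the right body is rewritten to use the left lambda's parameter.
        var parameter = left.Parameters[0];
        var rightBody = new ParameterReplacer(right.Parameters[0], parameter).Visit(right.Body);
        return Expression.Lambda<Func<T, bool>>(merge(left.Body, rightBody), parameter);
    }

    private sealed class ParameterReplacer : ExpressionVisitor
    {
        private readonly ParameterExpression _from;
        private readonly ParameterExpression _to;

        public ParameterReplacer(ParameterExpression from, ParameterExpression to)
        {
            _from = from;
            _to = to;
        }

        protected override Expression VisitParameter(ParameterExpression node)
        {
            return node == _from ? _to : base.VisitParameter(node);
        }
    }
}

public static class Program
{
    public static void Main()
    {
        Expression<Func<Student, bool>> predicate = s => s.Id <= 55;
        predicate = predicate.Or(s => s.Name.StartsWith("demo"));

        var matches = predicate.Compile();
        Console.WriteLine(matches(new Student { Id = 10, Name = "x" }));      // True
        Console.WriteLine(matches(new Student { Id = 99, Name = "demo-1" })); // True
        Console.WriteLine(matches(new Student { Id = 99, Name = "x" }));      // False
    }
}
```

Because the result is still an `Expression<Func<T, bool>>` rather than a compiled delegate, a data access layer can inspect it and translate it to SQL instead of filtering in memory.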
Contact the author
MyBlog: http://www.cnblogs.com/micro-chen/
QQ:1021776019
Reposted from: https://www.cnblogs.com/micro-chen/p/9075658.html