Pork Price Prediction Using Machine Learning Methods

Published 2023/12/14
This article on pork price prediction with machine learning was collected and organized by 生活随笔 and is shared here for reference.

Pork Price Prediction Based on Machine Learning

  • Pork price prediction
    • Problem background
    • Importing the data
    • 1. Support vector machine
    • 2. Random forest
    • 3. MLP neural network

Pork Price Prediction

  • Support vector machine regression
  • Random forest regression
  • MLP neural network regression
  • Problem background

    "Pigs and grain secure the realm": hogs have long held an important place in the national economy and people's livelihood, and pork is an indispensable item in the "shopping basket" of urban and rural residents in China. Since the African swine fever (ASF) outbreak in 2018, however, the hog industry has taken a heavy hit and market prices have fluctuated sharply, causing large economic losses for farmers and considerable trouble for consumers. The sudden arrival of COVID-19 in 2020 again set back the gradually recovering hog industry.
    (The choice of indicators here is debatable; this analysis is just for fun.)

    Importing the Data

```r
# Package installation helper
# Set the CRAN mirror with:
options(repos = 'http://mirrors.ustc.edu.cn/CRAN/')
# Check that the mirror was changed
getOption('repos')
# Install packages if needed
#install.packages('h2o')
```

'http://mirrors.ustc.edu.cn/CRAN/'

```r
# Set the working directory
setwd("D:/LengPY")
# Import the data
library(readxl)
data <- read_excel("liudata.xlsx")
head(data)
```

A tibble: 6 × 11 (columns: 日期 date, 活豬 live hog, 仔豬 piglet, 去骨牛肉 boneless beef, 帶骨羊肉 bone-in mutton, 豆粕 soybean meal, 小麥麩 wheat bran, 玉米 corn, 育肥豬綜合飼料 finishing-pig compound feed, 非洲豬瘟 ASF, 新冠疫情 COVID)

| 日期 | 活豬 | 仔豬 | 去骨牛肉 | 帶骨羊肉 | 豆粕 | 小麥麩 | 玉米 | 育肥豬綜合飼料 | 非洲豬瘟 | 新冠疫情 |
|---|---|---|---|---|---|---|---|---|---|---|
| 2006-01-01 | 7.56 | 9.89 | 18.60 | 19.09 | 2.64 | 1.24 | 1.26 | 1.81 | NA | NA |
| 2006-02-01 | 7.11 | 9.48 | 18.65 | 18.76 | 2.75 | 1.24 | 1.27 | 1.83 | NA | NA |
| 2006-03-01 | 6.68 | 8.85 | 18.37 | 18.25 | 2.69 | 1.23 | 1.28 | 1.83 | NA | NA |
| 2006-04-01 | 6.21 | 7.82 | 18.33 | 18.41 | 2.60 | 1.21 | 1.28 | 1.82 | NA | NA |
| 2006-05-01 | 5.96 | 6.98 | 18.31 | 18.35 | 2.56 | 1.21 | 1.34 | 1.84 | NA | NA |
| 2006-06-01 | 6.08 | 6.84 | 18.32 | 18.23 | 2.54 | 1.20 | 1.39 | 1.86 | NA | NA |

```r
# Keep only the numeric price columns
data1 <- data[, 2:9]
head(data1)
```

A tibble: 6 × 8

| 活豬 | 仔豬 | 去骨牛肉 | 帶骨羊肉 | 豆粕 | 小麥麩 | 玉米 | 育肥豬綜合飼料 |
|---|---|---|---|---|---|---|---|
| 7.56 | 9.89 | 18.60 | 19.09 | 2.64 | 1.24 | 1.26 | 1.81 |
| 7.11 | 9.48 | 18.65 | 18.76 | 2.75 | 1.24 | 1.27 | 1.83 |
| 6.68 | 8.85 | 18.37 | 18.25 | 2.69 | 1.23 | 1.28 | 1.83 |
| 6.21 | 7.82 | 18.33 | 18.41 | 2.60 | 1.21 | 1.28 | 1.82 |
| 5.96 | 6.98 | 18.31 | 18.35 | 2.56 | 1.21 | 1.34 | 1.84 |
| 6.08 | 6.84 | 18.32 | 18.23 | 2.54 | 1.20 | 1.39 | 1.86 |

```r
## Visualize the correlations between the features
library(corrplot)
cor <- cor(data1)
corrplot.mixed(cor, tl.col = "black", tl.pos = "lt", tl.cex = 0.8, number.cex = 0.7)
```

    The correlation matrix gives a first look at how strongly the variables move together.
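As a minimal base-R illustration of what `cor()` computes here (the variables below are synthetic, not from the pork data):

```r
# Correlation matrix on synthetic data: one driver, one derived series, one noise series
set.seed(1)
n     <- 100
feed  <- runif(n, 1, 3)                  # hypothetical feed price
pig   <- 5 * feed + rnorm(n, sd = 0.1)   # price strongly tied to feed
noise <- rnorm(n)                        # unrelated series
toy   <- data.frame(pig, feed, noise)
cmat  <- cor(toy)
round(cmat, 2)   # pig-feed near 1, pig-noise near 0
```

A strong pairwise correlation like this flags candidate predictors, though it does not by itself establish a causal link.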

    1. Support Vector Machine

```r
## Support vector machine regression model
library(e1071)
library(caret)
library(Metrics)
library(readr)

system.time(
  svmreg <- svm(活豬 ~ ., data = data1, kernel = "radial")
)
```

```
   user  system elapsed
   0.02    0.00    0.02
```

```r
summary(svmreg)
```

```
Call:
svm(formula = 活豬 ~ ., data = data1, kernel = "radial")

Parameters:
   SVM-Type:    eps-regression
   SVM-Kernel:  radial
   cost:        1
   gamma:       0.1428571
   epsilon:     0.1

Number of Support Vectors:  111
```

```r
# 70/30 train-test split
set.seed(123)
index <- sample(nrow(data1), round(0.7 * nrow(data1)))
train_data <- data1[index, ]
test_data  <- data1[-index, ]
# Check the training-set dimensions
dim(train_data)
```

128 8
```r
## Error on the training set
train_pre <- predict(svmreg, train_data)
train_mae <- mae(train_data$活豬, train_pre)
sprintf("Training-set MAE: %f", train_mae)
```

'Training-set MAE: 1.007821'

```r
## Error on the test set
test_pre <- predict(svmreg, test_data)
# Note: the original computed mae(train_data$活豬, test_pre), comparing test
# predictions against the training labels; compare against the test labels instead
test_mae <- mae(test_data$活豬, test_pre)
sprintf("Test-set MAE: %f", test_mae)
```

'Test-set MAE: 6.022539'

```r
data <- data.frame(train_data$活豬, train_pre)
```

    The fit on the training set is good, while the test-set error is considerably larger.

```r
## Error over the full data set (note: mse(), despite the original "absolute error" label)
total_pre <- predict(svmreg, data1[2:8])
total_mae <- mse(data1$活豬, total_pre)
sprintf("Full-data MSE: %f", total_mae)
```

'Full-data MSE: 1.651221'
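The MAE used throughout this section is just the mean absolute difference between actual and predicted values; a hand-rolled base-R sketch with made-up numbers:

```r
# Mean absolute error: average of |actual - predicted|
mae_fn <- function(y, y_hat) mean(abs(y - y_hat))

y_actual <- c(7.5, 6.2, 9.1)   # hypothetical prices
y_hat    <- c(7.0, 6.5, 9.0)   # hypothetical predictions

mae_fn(y_actual, y_hat)        # mean of 0.5, 0.3, 0.1 -> 0.3
```

The MSE reported just above differs only in squaring each difference before averaging, which penalizes large misses more heavily.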

    Output the corresponding predictions:

```r
data5 <- data.frame(total_pre, data1$活豬)
colnames(data5) <- c('預測', '實際')   # predicted, actual
data5
```

A data.frame: 183 × 2 (預測 = predicted, 實際 = actual)

| 預測 | 實際 |
|---|---|
| 7.081429 | 7.56 |
| 7.326892 | 7.11 |
| 7.125876 | 6.68 |
| 6.871381 | 6.21 |
| 6.957324 | 5.96 |
| 7.221186 | 6.08 |
| 7.400246 | 6.47 |
| 7.658352 | 7.17 |
| 7.709680 | 7.84 |
| 7.599010 | 7.93 |
| 7.786543 | 8.33 |
| 8.612998 | 9.18 |
| 8.892967 | 9.55 |
| 9.065229 | 9.20 |
| 9.556271 | 8.91 |
| 9.566126 | 9.02 |
| 10.184165 | 10.20 |
| 11.079956 | 11.37 |
| 12.172093 | 13.12 |
| 13.362894 | 14.27 |
| 14.095941 | 13.60 |
| 13.871400 | 13.21 |
| 14.403343 | 14.13 |
| ... | ... |
| 34.98679 | 35.93 |
| 34.07102 | 33.41 |
| 34.44107 | 29.50 |
| 34.54299 | 30.97 |
| 33.63167 | 35.73 |
| 32.10181 | 36.50 |
| 29.42965 | 35.20 |
| 31.75152 | 30.93 |
| 32.81464 | 29.23 |
| 32.59779 | 32.40 |
| 29.80336 | 35.17 |
| 29.01780 | 31.43 |
| 29.59932 | 27.90 |

    2. Random Forest

```r
library(readr)
library(VIM)
library(rpart)
library(rpart.plot)
library(Metrics)
library(ROCR)
library(tidyr)
library(randomForest)
library(ggRandomForests)

# 70/30 train-test split
set.seed(123)
index <- sample(nrow(data1), round(0.7 * nrow(data1)))
train_data <- data1[index, ]
test_data  <- data1[-index, ]

rfreg <- randomForest(活豬 ~ ., data = train_data, ntree = 500)
summary(rfreg)
```

```
                Length Class  Mode
call              4    -none- call
type              1    -none- character
predicted       128    -none- numeric
mse             500    -none- numeric
rsq             500    -none- numeric
oob.times       128    -none- numeric
importance        7    -none- numeric
importanceSD      0    -none- NULL
localImportance   0    -none- NULL
proximity         0    -none- NULL
ntree             1    -none- numeric
mtry              1    -none- numeric
forest           11    -none- list
coefs             0    -none- NULL
y               128    -none- numeric
test              0    -none- NULL
inbag             0    -none- NULL
terms             3    terms  call
```

```r
## Visualize how the OOB error changes as trees are added
par(family = "STKaiti")
plot(rfreg, type = "l", col = "red", main = "隨機森林回歸")
```

    Notice that the OOB error already levels off at around 30 trees.
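The OOB (out-of-bag) error plotted above is free validation: each tree is grown on a bootstrap sample, which leaves out roughly a third of the rows. A base-R sketch of that fraction:

```r
# A bootstrap sample of n rows omits about (1 - 1/n)^n ~ 36.8% of them;
# those out-of-bag rows act as a built-in test set for each tree
set.seed(123)
n    <- 1000
boot <- sample(n, n, replace = TRUE)
oob_fraction <- 1 - length(unique(boot)) / n
oob_fraction   # close to exp(-1) ~ 0.368
```

Averaging each row's error over the trees that did not see it yields the OOB curve without holding out any extra data.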

```r
## Visualize the error with the ggRandomForests package
plot(gg_error(rfreg)) + labs(title = "隨機森林回歸")
```

```r
## Visualize variable importance
importance(rfreg)
```

A matrix: 7 × 1 of type dbl

| variable | IncNodePurity |
|---|---|
| 仔豬 (piglet) | 1938.7910 |
| 去骨牛肉 (boneless beef) | 1623.8613 |
| 帶骨羊肉 (bone-in mutton) | 1522.7004 |
| 豆粕 (soybean meal) | 143.6410 |
| 小麥麩 (wheat bran) | 517.8845 |
| 玉米 (corn) | 215.0726 |
| 育肥豬綜合飼料 (compound feed) | 406.2389 |

```r
varImpPlot(rfreg, pch = 20, main = "Importance of Variables")
```


    Piglet, beef, and mutton prices have the strongest influence on the hog price. Since beef and mutton are substitutes for pork and piglets are an upstream input, this result is quite plausible.

```r
## Predict on the test set and compute the mean squared error
rfpre <- predict(rfreg, test_data)
sprintf("Test-set MSE: %f", mse(test_data$活豬, rfpre))
```

'Test-set MSE: 1.646604'

```r
## Parameter search: tune randomForest for the optimal mtry parameter
## (note: tuning on test_data leaks information; tuning on train_data would be sounder)
set.seed(1234)
rftune <- tuneRF(x = test_data[, 2:8], y = test_data$活豬, stepFactor = 1.5, ntreeTry = 500)
```

```
mtry = 2  OOB error = 2.785136
Searching left ...
Searching right ...
mtry = 3  OOB error = 2.63388
0.05430821 0.05
mtry = 4  OOB error = 2.531242
0.03896842 0.05
```

```r
print(rftune)
```

```
  mtry OOBError
2    2 2.785136
3    3 2.633880
4    4 2.531242
```

```r
## The smallest OOB error in the search above occurs at mtry = 4
## Build the tuned random forest regression model
rfregbest <- randomForest(活豬 ~ ., data = train_data, ntree = 500, mtry = 6)

## Visualize how the OOB error of both models changes as trees are added
rfregerr <- as.data.frame(plot(rfreg))
```

```r
colnames(rfregerr) <- "rfregerr"
rfregbesterr <- as.data.frame(plot(rfregbest))
colnames(rfregbesterr) <- "rfregbesterr"

plotrfdata <- cbind.data.frame(rfregerr, rfregbesterr)
plotrfdata$ntree <- 1:nrow(plotrfdata)
plotrfdata <- gather(plotrfdata, key = "Type", value = "Error", 1:2)
ggplot(plotrfdata, aes(x = ntree, y = Error)) +
  geom_line(aes(linetype = Type, colour = Type), size = 0.9) +
  theme(legend.position = "top") +
  ggtitle("隨機森林回歸模型") +
  theme(plot.title = element_text(hjust = 0.5))
```
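Selecting mtry from the `tuneRF` table above boils down to taking the argmin of the OOB-error column; in base R (error values copied from the search output):

```r
# Pick the mtry with the smallest OOB error from the search results above
oob <- c(`2` = 2.785136, `3` = 2.633880, `4` = 2.531242)
best_mtry <- as.integer(names(oob)[which.min(oob)])
best_mtry   # -> 4
```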

```r
## Predict on the test set with the tuned random forest model and compute the MSE
rfprebest <- predict(rfregbest, test_data[, 2:8])
sprintf("Tuned-model test-set MSE: %f", mse(test_data$活豬, rfprebest))
```

'Tuned-model test-set MSE: 1.660115'

```r
## Predict on the full data with the tuned random forest model and compute the MSE
total <- predict(rfregbest, data1[, 2:8])
sprintf("Tuned-model full-data MSE: %f", mse(data1$活豬, total))
```

'Tuned-model full-data MSE: 0.773364'

```r
# Prediction results
totalpre <- data.frame(data1$活豬, total)
totalpre
```

A data.frame: 183 × 2 (data1.活豬 = actual, total = predicted)

| data1.活豬 | total |
|---|---|
| 7.56 | 7.745717 |
| 7.11 | 7.473767 |
| 6.68 | 6.802603 |
| 6.21 | 6.484864 |
| 5.96 | 6.340616 |
| 6.08 | 6.334053 |
| 6.47 | 6.463621 |
| 7.17 | 7.035551 |
| 7.84 | 6.900711 |
| 7.93 | 6.985457 |
| 8.33 | 7.875041 |
| 9.18 | 9.065969 |
| 9.55 | 9.289838 |
| 9.20 | 9.244121 |
| 8.91 | 9.259116 |
| 9.02 | 9.250085 |
| 10.20 | 9.805307 |
| 11.37 | 10.590388 |
| 13.12 | 13.012034 |
| 14.27 | 13.206428 |
| 13.60 | 13.403372 |
| 13.21 | 13.176563 |
| 14.13 | 13.715155 |
| 15.46 | 15.510558 |
| ... | ... |
| 33.41 | 33.70117 |
| 29.50 | 31.42532 |
| 30.97 | 32.10734 |
| 35.73 | 34.39926 |
| 36.50 | 35.47128 |
| 35.20 | 34.27110 |
| 30.93 | 32.03528 |
| 29.23 | 31.95318 |
| 32.40 | 32.44584 |
| 35.17 | 33.15130 |
| 31.43 | 32.34892 |
| 27.90 | 32.95256 |
```r
## Prepare the data
index <- order(test_data$活豬)
X  <- sort(index)
Y1 <- test_data$活豬[index]
rfpre2     <- rfpre[index]
rfprebest2 <- rfprebest[index]
plotdata <- data.frame(X = X, Y1 = Y1, rfpre = rfpre2, rfprebest = rfprebest2)
plotdata <- gather(plotdata, key = "model", value = "value", c(-X))

## Visualize the prediction errors of the models
ggplot(plotdata, aes(x = X, y = value)) +
  geom_line(aes(linetype = model, colour = model), size = 0.8) +
  theme(legend.position = c(0.1, 0.8), plot.title = element_text(hjust = 0.5)) +
  ggtitle("隨機森林回歸模型")
```

    The predictions track the actual prices well overall, but deviate more during the swine-fever period, probably because the direct impact of ASF is not quantified in the features; this is a natural direction for improvement.

    3. MLP Neural Network

```r
library(RSNNS)
# 70/30 train-test split
set.seed(123)
index <- sample(nrow(data1), round(0.7 * nrow(data1)))
train_data <- data1[index, ]
test_data  <- data1[-index, ]
dim(train_data)

## Max-min normalize the features to [0, 1]
data1[, 2:8] <- normalizeData(data1[, 2:8], "0_1")
## Normalize the hog price to [0, 1] and keep the normalization parameters
price <- normalizeData(data1$活豬, type = "0_1")
NormParameters <- getNormParameters(price)
```

128 8

```r
head(data1)
```

A tibble: 6 × 8 (活豬 still on the original scale; the other columns normalized)

| 活豬 | 仔豬 | 去骨牛肉 | 帶骨羊肉 | 豆粕 | 小麥麩 | 玉米 | 育肥豬綜合飼料 |
|---|---|---|---|---|---|---|---|
| 7.56 | 0.029987219 | 0.0046354825 | 0.013367831 | 0.2395437 | 0.07368421 | 0.000000000 | 0.000000000 |
| 7.11 | 0.025956150 | 0.0053378283 | 0.008624407 | 0.2813688 | 0.07368421 | 0.006451613 | 0.011049724 |
| 6.68 | 0.019762069 | 0.0014046917 | 0.001293661 | 0.2585551 | 0.06315789 | 0.012903226 | 0.011049724 |
| 6.21 | 0.009635237 | 0.0008428150 | 0.003593503 | 0.2243346 | 0.04210526 | 0.012903226 | 0.005524862 |
| 5.96 | 0.001376462 | 0.0005618767 | 0.002731062 | 0.2091255 | 0.04210526 | 0.051612903 | 0.016574586 |
| 6.08 | 0.000000000 | 0.0007023458 | 0.001006181 | 0.2015209 | 0.03157895 | 0.083870968 | 0.027624309 |

```r
hist(price)
```
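For reference, RSNNS's `normalizeData(..., "0_1")` is min-max scaling; a base-R sketch that also keeps the parameters needed to invert it (mirroring `denormalizeData`), using the first few prices from the data above:

```r
# Min-max scaling to [0, 1], with the parameters needed to undo it
normalize01 <- function(x) {
  rng <- range(x)
  list(scaled = (x - rng[1]) / (rng[2] - rng[1]), min = rng[1], max = rng[2])
}
denormalize01 <- function(scaled, min, max) scaled * (max - min) + min

price_raw <- c(7.56, 7.11, 6.68, 6.21, 5.96, 6.08)  # first prices from the data
z <- normalize01(price_raw)
range(z$scaled)                        # spans exactly 0 to 1
denormalize01(z$scaled, z$min, z$max)  # recovers the original prices
```

Keeping the min/max is what lets the MLP's predictions be mapped back to yuan per kilogram later in this section.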

```r
## Split the data
set.seed(123)
datasplist <- splitForTrainingAndTest(data1[, 2:8], price, ratio = 0.3)

## MLP regression model
system.time(
  mlpreg <- mlp(datasplist$inputsTrain, datasplist$targetsTrain,
                maxit = 200, size = c(100, 50, 100, 50),
                learnFunc = "Rprop",
                inputsTest = datasplist$inputsTest,
                targetsTest = datasplist$targetsTest,
                metric = "RSME")
)
```

```
Warning message in snnsObject$setUnitName(num, iNames[[i]]):  (repeated several times)
"SNNS error message in setUnitName : SNNS-Kernel Error: Symbol pattern invalid (must match [A-Za-z][^|, ]*)"

   user  system elapsed
  13.86    0.00   14.25
```

```r
## Visualize training progress
plotIterativeError(mlpreg, main = "MLP Iterative Error")
```

```r
plotRegressionError(datasplist$targetsTrain, mlpreg$fitted.values, main = "MLP train fit")
```

```r
plotRegressionError(datasplist$targetsTest, mlpreg$fittedTestValues, main = "MLP test fit")
```

```r
## Error on the training set (denormalize back to the original price scale first)
mlppre_train <- denormalizeData(mlpreg$fitted.values, NormParameters)
mlp_train <- denormalizeData(datasplist$targetsTrain, NormParameters)
train_mae <- mae(mlp_train, mlppre_train)
sprintf("Training-set MAE: %f", train_mae)
```

'Training-set MAE: 0.456312'

```r
## Error on the test set
mlppre_test <- denormalizeData(mlpreg$fittedTestValues, NormParameters)
mlp_test <- denormalizeData(datasplist$targetsTest, NormParameters)
test_mae <- mae(mlp_test, mlppre_test)
sprintf("Test-set MAE: %f", test_mae)
```

'Test-set MAE: 6.967324'

    The test-set error is large and calls for improvement.

    For example, the ASF and COVID variables could be brought into the analysis, or the models combined with sequence methods such as RNNs to obtain more accurate results.
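One hedged sketch of that idea in base R: encode the two events as 0/1 indicator columns keyed to dates (the window start dates below are illustrative assumptions, not taken from the data):

```r
# Hypothetical 0/1 indicators for the ASF and COVID periods
dates <- seq(as.Date("2018-01-01"), as.Date("2020-12-01"), by = "month")
asf   <- as.integer(dates >= as.Date("2018-08-01"))   # assumed ASF window start
covid <- as.integer(dates >= as.Date("2020-01-01"))   # assumed COVID window start
ind   <- data.frame(date = dates, asf = asf, covid = covid)
head(ind)
# columns like these could be joined onto data1 before refitting the models
```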
