线性回归之决定系数(coefficient of determination)
1. Sum Of Squares Due To Error
對(duì)于第i個(gè)觀(guān)察點(diǎn), 真實(shí)數(shù)據(jù)的Yi與估算出來(lái)的Yi-head的之間的差稱(chēng)為第i個(gè)residual, SSE 就是所有觀(guān)察點(diǎn)的residual的和
2. Total Sum Of Squares
3. Sum Of Squares Due To Regression
通過(guò)以上我們能得到以下關(guān)于他們?nèi)叩年P(guān)系
決定系數(shù): 判斷 回歸方程 的擬合程度
(coefficient of determination)決定系數(shù)也就是說(shuō): 通過(guò)回歸方程得出的 dependent variable 有 number% 能被 independent variable 所解釋. 判斷擬合的程度
(Correlation coefficient) 相關(guān)系數(shù) : 測(cè)試dependent variable 和 independent variable 他們之間的線(xiàn)性關(guān)系有多強(qiáng). 也就是說(shuō), independent variable 產(chǎn)生變化時(shí) dependent variable 的變化有多大.
可以反映是正相關(guān)還是負(fù)相關(guān)
參考鏈接:http://blog.csdn.net/ytdxyhz/article/details/51730995
注意此決定系數(shù)不能用來(lái)衡量非線(xiàn)性回歸的擬合優(yōu)度
Why Is It Impossible to Calculate a Valid R-squared for Nonlinear Regression?
R-squared is based on the underlying assumption that you are fitting a linear model. If you aren’t fitting a linear model, you shouldn’t use it. The reason why is actually very easy to understand.
For linear models, the sums of the squared errors always add up in a specific manner: SS Regression + SS Error = SS Total.
This seems quite logical. The variance that the regression model accounts for plus the error variance adds up to equal the total variance. Further, R-squared equals SS Regression / SS Total, which mathematically must produce a value between 0 and 100%.
In nonlinear regression, SS Regression + SS Error do not equal SS Total! This completely invalidates R-squared for nonlinear models, and it no longer has to be between 0 and 100%.
參考鏈接:http://blog.minitab.com/blog/adventures-in-statistics-2/why-is-there-no-r-squared-for-nonlinear-regression
更新:
For cases other than fitting by ordinary least squares, theR2statistic can be calculated as above and may still be a useful measure. If fitting is byweighted least squaresorgeneralized least squares, alternative versions of R2can be calculated appropriate to those statistical frameworks, while the "raw"R2may still be useful if it is more easily interpreted. Values forR2can be calculated for any type of predictive model, which need not have a statistical basis.
參考鏈接:https://en.wikipedia.org/wiki/Coefficient_of_determination
更新:
https://stats.stackexchange.com/questions/7357/manually-calculated-r2-doesnt-match-up-with-randomforest-r2-for-testing
這篇回答中給了兩個(gè)信息:
(1)線(xiàn)性回歸的R方等于實(shí)際值與預(yù)測(cè)值的相關(guān)系數(shù)的平方
(2)randomForest is reporting variation explained as opposed to variance explained.
總結(jié)
以上是生活随笔為你收集整理的线性回归之决定系数(coefficient of determination)的全部?jī)?nèi)容,希望文章能夠幫你解決所遇到的問(wèn)題。
- 上一篇: 使用Java+SAP云平台+SAP Cl
- 下一篇: html5关键帧动画,一个小例子快速理解