Overview of "Neural networks and deep learning"

Published: 2025/7/25

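I recently read the book "Neural networks and deep learning" (an online book, not yet published in print) fairly carefully. The first few chapters cover relatively simple material, so I focused on Chapter 3, "Improving the way neural networks learn", which covers a range of techniques for optimizing and training deep neural networks. I took detailed notes on Chapter 3 (drawing on other material as well; I will add to or revise them as I read related papers). Anyone else reading this book is welcome to discuss it with me. What follows is my own understanding, so please point out any mistakes.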

What this book is about?

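The book's code is written in Python. Starting from the MNIST example, it introduces artificial neural networks (Neural networks) and works step by step toward deep learning (Deep Learning), covering code implementations and some optimization methods along the way. It works well as an introductory book.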

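1. Using neural nets to recognize handwritten digits

Chapter summary

Uses an artificial neural network to recognize the digits in the MNIST dataset, implemented in Python and depending only on the NumPy library.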

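2. How the backpropagation algorithm works

Chapter summary

The previous chapter left open how the network is actually optimized: it did not explain how to compute the gradient of the cost function. That is what this chapter covers, the backpropagation (BP) algorithm.
The BP algorithm appeared in the 1970s, but it only became popular after the 1986 paper by Rumelhart, Hinton, and Williams.
BP implementation code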

The code was contained in the update_mini_batch and backprop methods of the Network class. In particular, the update_mini_batch method updates the Network's weights and biases by computing the gradient for the current mini_batch of training examples.
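
The following is a minimal sketch of how these two methods might fit together. It follows the method names quoted above, but it is not the book's exact code: the quadratic cost, the sigmoid activation, the `seed` parameter, and the initialization with `np.random.default_rng` are assumptions made to keep the example self-contained.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_prime(z):
    s = sigmoid(z)
    return s * (1.0 - s)

class Network:
    """Minimal feedforward network: sigmoid neurons, quadratic cost (assumed)."""

    def __init__(self, sizes, seed=0):
        rng = np.random.default_rng(seed)  # seed is a hypothetical convenience
        self.sizes = sizes
        self.biases = [rng.standard_normal((n, 1)) for n in sizes[1:]]
        self.weights = [rng.standard_normal((n_out, n_in))
                        for n_in, n_out in zip(sizes[:-1], sizes[1:])]

    def feedforward(self, a):
        for w, b in zip(self.weights, self.biases):
            a = sigmoid(w @ a + b)
        return a

    def backprop(self, x, y):
        """Return (nabla_b, nabla_w): the cost gradient for one example."""
        activations, zs = [x], []
        for w, b in zip(self.weights, self.biases):
            zs.append(w @ activations[-1] + b)
            activations.append(sigmoid(zs[-1]))
        # Output-layer error for the quadratic cost C = 0.5 * ||a - y||^2.
        delta = (activations[-1] - y) * sigmoid_prime(zs[-1])
        nabla_b = [None] * len(self.biases)
        nabla_w = [None] * len(self.weights)
        nabla_b[-1] = delta
        nabla_w[-1] = delta @ activations[-2].T
        # Propagate the error backwards through the earlier layers.
        for l in range(2, len(self.sizes)):
            delta = (self.weights[-l + 1].T @ delta) * sigmoid_prime(zs[-l])
            nabla_b[-l] = delta
            nabla_w[-l] = delta @ activations[-l - 1].T
        return nabla_b, nabla_w

    def update_mini_batch(self, mini_batch, eta):
        """One gradient-descent step using the (x, y) pairs in mini_batch."""
        nabla_b = [np.zeros(b.shape) for b in self.biases]
        nabla_w = [np.zeros(w.shape) for w in self.weights]
        for x, y in mini_batch:
            delta_b, delta_w = self.backprop(x, y)
            nabla_b = [nb + db for nb, db in zip(nabla_b, delta_b)]
            nabla_w = [nw + dw for nw, dw in zip(nabla_w, delta_w)]
        scale = eta / len(mini_batch)  # average the summed gradients
        self.weights = [w - scale * nw for w, nw in zip(self.weights, nabla_w)]
        self.biases = [b - scale * nb for b, nb in zip(self.biases, nabla_b)]
```

Here each `x` and `y` is a column vector, and `eta` is the learning rate. Note that the loop calls `backprop` once per training example, which is the part the matrix-based approach replaces.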
Fully matrix-based approach to backpropagation over a mini-batch

Our implementation of stochastic gradient descent loops over training examples in a mini-batch. It's possible to modify the backpropagation algorithm so that it computes the gradients for all training examples in a mini-batch simultaneously. The idea is that instead of beginning with a single input vector, x, we can begin with a matrix X = [x1 x2 … xm] whose columns are the vectors in the mini-batch.

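In other words, all the samples in a mini-batch are stacked into one large matrix and the gradient is computed on that matrix. This lets the code lean on a linear algebra library, which cuts the running time considerably.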
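
As a rough illustration of the matrix-based idea, the sketch below computes the gradients for a whole mini-batch in one pass. The function name `backprop_batch`, the quadratic cost, and the sigmoid activation are assumptions for this example and are not taken from the book's code.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def backprop_batch(weights, biases, X, Y):
    """Gradients of the quadratic cost, summed over all columns of X.

    X has shape (n_inputs, m) and Y has shape (n_outputs, m); each column
    is one training example of the mini-batch.
    """
    activations = [X]
    for w, b in zip(weights, biases):
        # b has shape (n, 1) and broadcasts across the m columns.
        activations.append(sigmoid(w @ activations[-1] + b))

    nabla_b = [None] * len(biases)
    nabla_w = [None] * len(weights)
    # sigmoid'(z) = a * (1 - a), computed from the stored activations.
    a_out = activations[-1]
    delta = (a_out - Y) * a_out * (1 - a_out)
    for l in range(1, len(weights) + 1):
        nabla_b[-l] = delta.sum(axis=1, keepdims=True)
        # The matrix product sums the per-example outer products.
        nabla_w[-l] = delta @ activations[-l - 1].T
        if l < len(weights):
            a_prev = activations[-l - 1]
            delta = (weights[-l].T @ delta) * a_prev * (1 - a_prev)
    return nabla_b, nabla_w
```

Calling it with an `X` that has a single column gives the per-example gradient, so the result for a full mini-batch should equal the sum of the single-column results. Dividing by `m` gives the average used in the update step.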
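How fast is the BP algorithm?

When the BP algorithm was first invented, computing power was extremely limited. Today BP is used widely in deep learning, thanks to the large jump in computing power and to many useful tricks.
What's the algorithm really doing?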

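This part discusses the BP algorithm in more depth and is essentially a proof. A change at some node early in the network propagates forward layer by layer and ends up changing the cost function. The relationship between these two changes can be expressed as:

Working through it layer by layer, it can also be expressed as:

And there is a lot more after that……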
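
For reference, the two relations described above can be written roughly as follows, using the book's notation: $w^l_{jk}$ is a weight in layer $l$, $a^l_j$ is the activation of neuron $j$ in layer $l$, $L$ is the output layer, and $C$ is the cost. This is a sketch based on the book's Chapter 2 discussion, so treat the exact indexing as an assumption.

```latex
% A small change in one weight produces a small change in the cost:
\Delta C \approx \frac{\partial C}{\partial w^l_{jk}} \, \Delta w^l_{jk}

% Following the change layer by layer and summing over every path
% from the weight to the output:
\Delta C \approx \sum_{mnp \ldots q}
  \frac{\partial C}{\partial a^L_m}
  \frac{\partial a^L_m}{\partial a^{L-1}_n}
  \frac{\partial a^{L-1}_n}{\partial a^{L-2}_p}
  \cdots
  \frac{\partial a^{l+1}_q}{\partial a^l_j}
  \frac{\partial a^l_j}{\partial w^l_{jk}} \, \Delta w^l_{jk}
```

Comparing the two expressions shows that $\partial C / \partial w^l_{jk}$ is a sum, over all paths through the network, of products of these layer-to-layer rate factors.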

For the principles behind BP, I recommend Andrew Ng's UFLDL tutorial. Related blog posts are also worth reading.

3. Improving the way neural networks learn

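This chapter discusses techniques that speed up the BP algorithm and improve network performance. These techniques and tricks are used all the time when training and optimizing networks. They are listed below. (I have not finished organizing my notes on every part, and they are long, so I am splitting them across several blog posts that will be linked at [article link] as they appear.)

A better choice than the quadratic (squared-error) cost function: the cross-entropy cost function [article link]

Four regularization methods (to improve generalization and avoid overfitting): [article link]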

L1 regularization
L2 regularization
dropout
artificial expansion of the training data

Weight initialization methods [article link]

How to choose hyperparameters (learning rate, regularization parameter, mini-batch size) [article link]
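
A few of the techniques listed above can be sketched compactly. The example below shows the output-layer error under the quadratic and cross-entropy costs, a single weight update with L1 or L2 regularization, and Gaussian weight initialization scaled by the number of inputs. All function names are hypothetical, sigmoid neurons are assumed, and dropout and training-data expansion are left out.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def output_delta_quadratic(z, y):
    """Output error for the quadratic cost: (a - y) * sigmoid'(z)."""
    a = sigmoid(z)
    return (a - y) * a * (1 - a)

def output_delta_cross_entropy(z, y):
    """Output error for the cross-entropy cost: the sigmoid'(z) factor cancels."""
    return sigmoid(z) - y

def regularized_step(w, grad_w, eta, lmbda, n, kind="l2"):
    """One gradient step on weights w with L1 or L2 regularization.

    grad_w is the gradient of the unregularized cost, lmbda is the
    regularization parameter, and n is the training-set size.
    """
    if kind == "l2":
        # Weight decay: shrink w by a constant factor, then step.
        return (1 - eta * lmbda / n) * w - eta * grad_w
    if kind == "l1":
        # Shrink each weight toward zero by a constant amount, then step.
        return w - (eta * lmbda / n) * np.sign(w) - eta * grad_w
    raise ValueError("kind must be 'l1' or 'l2'")

def init_weights(n_in, n_out, rng):
    """Gaussian weights with mean 0 and standard deviation 1/sqrt(n_in)."""
    return rng.standard_normal((n_out, n_in)) / np.sqrt(n_in)
```

For a badly saturated output neuron (large `z` with target `y = 0`), the quadratic-cost error is close to zero while the cross-entropy error stays close to 1, which is why learning slows down far less with cross-entropy. Scaling the initial weights by `1/sqrt(n_in)` keeps the weighted input to each neuron from being large at the start, so neurons are less likely to begin saturated.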

4. A visual proof that neural nets can compute any function

If you repost this, please credit the source: http://blog.csdn.net/u012162613/article/details/44220115
