De-mystifying Machine Learning
The terms “artificial intelligence”, “machine learning”, and “neural networks” are loosely thrown around nowadays. With all of the hype built up around it in the news, machine learning is often praised as a silver bullet — a magical technique that can solve ANY problem, no matter how complicated.
Well, as you’d probably expect, that’s not completely true. What is true is that modern machine learning (ML) techniques have been able to perform tasks that were previously seemingly impossible. Not only that: compared to the solutions we already had for certain tasks, modern ML techniques have proven to be better in almost every way — they are faster to develop, more robust, and often run faster. However, these techniques come with their own limitations that make them better suited to certain applications. So while the field is constantly evolving, with hundreds of researchers all around the world expanding the state of the art, it’s important to understand the fundamentals of how it works so that we can apply it to solve the problems at which it excels.
Define the terms
Let’s first define the 3 most commonly used “buzz word” terms in this industry:
Artificial Intelligence — The theory and development of computer systems able to perform tasks that normally require human intelligence [1]
Machine Learning — Machine Learning is the study of computer algorithms that improve automatically through experience [2]
Neural Network/Deep Learning — A machine learning technique, very loosely modeled on the structure of the human brain, that is effective at learning complex patterns in data.
So, machine learning is a subset of artificial intelligence, and neural networks are a subset of machine learning. Much of the news and many of the advancements in artificial intelligence and machine learning have been due to neural networks, which is why the terms have been used interchangeably. Over the past couple of years, researchers have shown that neural nets are capable of highly complicated, nuanced, and diverse tasks. For example, they excel at image-based tasks such as object detection and classification, human pose detection, and human mood detection. They have been used for audio tasks, such as speech-to-text translation and music generation. All modern language translation services apply neural nets to extract the meaning from phrases and convert those phrases to different languages. Some recent notable advancements include beating the world champions in the board game Go (a game that is notoriously hard because it requires long-term strategy) and generating paragraphs of text on provided topics that are almost indistinguishable from human-generated text.
Because of their outstanding performance, all of the big tech companies have been investing heavily in applying neural nets in their products. Google uses them for its search engine, translation, ad targeting, photo tagging, generating maps, and so much more. All of the big social media companies use them for recommending content to their users and understanding user sentiment. Self-driving car companies apply them to processing data about the surrounding environment so the car can make safe decisions. This list is by no means comprehensive.
由于其出色的性能,所有大型高科技公司都在其產品上應用神經網絡進行了大量投資。 Google將它們用于搜索引擎,翻譯,廣告定位,照片標記,生成地圖等等。 所有大型社交媒體公司都使用它來向用戶推薦內容并了解用戶情緒。 自動駕駛汽車公司將其應用于處理有關其周圍環境的數據,以便汽車可以做出安全的決策。 此列表絕不是全面的。
Example: neural network performing object detection [3]

Since most people refer to deep learning and neural networks when they talk about machine learning, we will focus only on those in this article.
由于大多數人在談論機器學習時都提到了深度學習和神經網絡,因此在本文中我們將僅著重于此。
How does a Neural Net Work?
Neural nets can process complex data such as images, audio clips, and videos. We humans perceive this data through our senses as colors and sounds. To computers, however, images are just a collection of brightness values. As a result, to process and understand the contents of these rich data sources, we need to apply mathematical techniques. Neural nets are really just big and complicated math functions, like y = mx + b or y = e^x or y = sin(x). They take in a collection of numbers representing the data and output another collection of numbers, describing the answer you “taught” them to give you.
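To make the “data is just numbers” idea concrete, here is a minimal sketch. The 4×4 “image” and the toy scoring function are invented for illustration; a real network would be far larger, but the shape of the idea — numbers in, numbers out — is the same:

```python
import numpy as np

# A grayscale "image" is just a grid of brightness values (0 = black, 255 = white).
image = np.array([
    [  0,  50, 200, 255],
    [ 10,  60, 210, 250],
    [  5,  55, 205, 245],
    [  0,  40, 190, 255],
])

def toy_net(x):
    # A stand-in for a neural net: a math function mapping the
    # image's numbers to a single score squashed into (0, 1).
    return 1.0 / (1.0 + np.exp(-(x.mean() / 255.0 - 0.5)))

score = toy_net(image)
print(score)  # a single number between 0 and 1
```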
As we mentioned previously, neural nets are loosely based on the structure of the brain (consider the brain a useful analogy, not an exact representation of how they work). Neural nets consist of multiple layers of “neurons”, where each layer is multiple “neurons” wide. Each neuron represents a relatively simple mathematical operation, think mx + b (where x is the number input to the neuron), followed by a non-linear function, like sin(x) (the actual function used depends on the specific task you’re doing). The output from each neuron gets fed into each of the neurons in the next layer. When these layers of neurons get stacked very deep (this is where the term deep learning comes from), the result is a function capable of describing very complicated relationships in the data.
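The layer structure described above can be sketched in a few lines of NumPy. The layer widths, random weights, and the choice of sin as the non-linearity are arbitrary here (real networks usually use functions like ReLU):

```python
import numpy as np

rng = np.random.default_rng(0)

def layer(x, W, b):
    # Each neuron computes a weighted sum plus a bias ("mx + b"),
    # then a non-linear function is applied to the result.
    return np.sin(W @ x + b)

# Three stacked layers of widths 4 -> 5 -> 5 -> 2, with random weights.
W1, b1 = rng.normal(size=(5, 4)), rng.normal(size=5)
W2, b2 = rng.normal(size=(5, 5)), rng.normal(size=5)
W3, b3 = rng.normal(size=(2, 5)), rng.normal(size=2)

x = np.array([0.2, -1.0, 0.5, 0.8])                   # input: 4 numbers
out = layer(layer(layer(x, W1, b1), W2, b2), W3, b3)  # output: 2 numbers
print(out.shape)  # (2,)
```

Each layer’s output feeds into the next, exactly as the lines between circles in the diagram below suggest.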
Neural net architecture: circles represent neurons, lines represent data flow (source: Wikimedia Commons)

So we’ve covered what a neural net IS, but we haven’t talked about what it means to “train” one. How does a neural net “learn”? Remember how we mentioned that each neuron applies mx + b to its inputs? The “m” and “b” in that formula are actually learnable parameters. In other words, we tune the values of “m” and “b” in each neuron to change what the neural network does. Does this sound familiar? It is actually the same as the process of linear regression in statistics! In linear regression, we try to find a best-fit line for our data by finding the correct parameters “m” and “b”. Neural nets are just doing this at a massive scale. Instead of finding 2 parameters for a line, we are finding millions or even billions of parameters for a very complicated function. So neural nets are, in a sense, best-fit lines for your data. It’s a little strange because instead of converting a single input to a single output like in linear regression, neural nets convert data like images into labels describing what the image contains. However, they are fundamentally the same thing.
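The linear-regression analogy is easy to demonstrate with synthetic data: np.polyfit finds the best “m” and “b” for a line, and a neural net does the same kind of fitting with vastly more parameters:

```python
import numpy as np

rng = np.random.default_rng(1)
x = np.linspace(0, 10, 50)
# Noisy samples from a line with true slope 3 and intercept 2.
y = 3.0 * x + 2.0 + rng.normal(scale=0.5, size=x.size)

m, b = np.polyfit(x, y, deg=1)  # find the two best-fit parameters
print(m, b)  # should land near the true values 3 and 2
```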
Because neural nets are so complicated, we can’t find the m’s and b’s (called weights) the same way we would for linear regression, so other methods had to be devised. The structure of neural nets described above is very intentional: they are built so that the whole neural net is differentiable (the whole neural net has a derivative, crazy right?). The details of what exactly that means are unimportant here. The consequence is that we can use a technique called “gradient descent” to select the correct weights. Gradient descent has proven to be an extremely successful method for finding those weights. In fact, the gradient descent technique is the reason that neural nets have blown up in popularity; they would largely be useless without it.
由于神經網絡非常復雜,因此我們無法找到與線性回歸相同的m和b(稱為權重),因此我們不得不設計其他方法。 如上所述,神經網絡結構化的原因是非常故意的。 建立它們是為了使整個神經網絡具有可區分性(整個神經網絡具有派生的,瘋狂的權利嗎?)。 確切含義的細節并不重要。 但是結果是我們可以使用一種稱為“梯度下降”的技術來選擇正確的權重。 事實證明,梯度下降是找到這些權重的非常成功的方法。 實際上,梯度下降技術是神經網絡Swift普及的原因。 沒有它們,他們將大體上毫無用處。
The idea behind gradient descent is very simple and is actually similar to how humans learn (going back to the brain analogy). Let’s take a specific task: determining whether an image contains a dog or not. We feed a bunch of images (some contain dogs, some don’t) into the neural net and get an output for each image. At the start of training, the weights are set randomly, so the output of the neural net is meaningless. After feeding a couple of images through, we compare the output of the neural net to the correct output for each image. We do this comparison using something called a loss function, which tells the algorithm how “wrong” the neural net was. Through some mathematical magic (really just multi-variable calculus), we then calculate how to change each weight (remember, just a lot of m’s and b’s) based on the loss function so that the neural net gets closer to the right answer. We apply those changes and then try again on a new set of images. As we repeat this process, the neural net gets better and better at identifying dogs. After a few thousand cycles, the neural net gets very good at the task! So just like humans, the neural net “practices”, trying again and again and getting better every time.
梯度下降背后的想法非常簡單,實際上類似于人類的學習方式(回到大腦的類比)。 我們來討論一個特定的任務-確定圖像是否包含狗。 我們將一堆圖像(有些圖像包含狗,有些不包含狗)輸入神經網絡,并獲取每個圖像的輸出。 在訓練開始時,權重是隨機設置的,因此神經網絡的輸出毫無意義。 在饋送了幾張圖像之后,我們將神經網絡的輸出與每個圖像的正確輸出進行比較。 我們使用稱為損失函數的某種東西進行比較,該函數告訴算法神經網絡有多么“錯誤”。 通過一些數學魔術(實際上只是多變量演算),我們然后基于損失函數計算如何更改每個權重(記住,只有很多m和b),以便神經網絡更接近正確的答案。 我們應用從每張圖像計算出的更改,然后重試一組新圖像。 當我們重復此過程時,神經網絡在識別狗方面變得越來越好。 經過數以千計的循環后,神經網絡可以很好地完成任務! 因此,就像人類一樣,神經網絡“練習”,一次又一次地嘗試,每次都變得更好。
That’s it! Those are the fundamental concepts of how a neural net learns.
What Conclusions Can We Gather from this?
Now that you have a basic understanding of what a neural net is, let’s discuss some important points about applying neural nets to your tasks.
Neural nets rely on data — lots of data
When you’re doing linear regression, you need lots of data points to get a good best fit line. If your dataset is too small, you run the risk of getting a bad best fit line resulting in poor estimates.
Orange: best-fit line with 2 data points; blue: best-fit line with 50 data points (including the orange ones)

The same is true for neural nets, except on a massive scale. Neural nets need thousands of examples to learn from. The more diverse and varied your dataset is, the better the neural network will perform on new, unseen examples (this ability is called generalization).
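The effect in the figure is easy to reproduce numerically: fitting the same noisy line with 2 points versus 50 points gives very different reliability (the data here is synthetic):

```python
import numpy as np

rng = np.random.default_rng(3)

def fit_slope(n):
    # Sample n noisy points from a line with true slope 3, then fit a line.
    x = np.linspace(0, 10, n)
    y = 3.0 * x + 2.0 + rng.normal(scale=2.0, size=n)
    m, _ = np.polyfit(x, y, deg=1)
    return m

# The 2-point estimate can land well away from 3; the 50-point estimate stays close.
print(fit_slope(2), fit_slope(50))
```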
Over the past couple of years, massive datasets have become available for a large variety of problems. However, the data requirement is a limiting factor for applying neural networks to brand-new tasks, as it is often not feasible to collect that many examples AND record the correct output for each one. Luckily, in recent years, there have been significant advances in reducing the amount of data needed for training. One popular technique is called transfer learning, where a neural network is trained to do a certain task and then “fine-tuned” to do another, similar task with less data available.
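Transfer learning reduced to its essence can be sketched like this: a layer of “pretrained” features is kept frozen, and only a small new output layer is trained on the new task’s limited data. Everything below (the layer sizes, the random stand-in for “pretrained” weights, and the synthetic task) is invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(4)

def relu(z):
    return np.maximum(z, 0.0)

# Pretend this hidden layer was already trained on a big dataset for "task A".
# We keep it frozen: its weights are never updated below.
W_hidden = rng.normal(size=(8, 3))

# A small dataset for the new "task B" (synthetic linear targets).
x = rng.normal(size=(200, 3))
y = x @ np.array([1.0, -2.0, 0.5]) + 1.0

features = relu(x @ W_hidden.T)  # reuse the "pretrained" layer as a feature extractor

# Fine-tune only a new output layer on top of the frozen features.
w_out = np.zeros(8)
b_out = 0.0
lr = 0.01
for _ in range(3000):
    err = features @ w_out + b_out - y
    w_out -= lr * 2 * (features.T @ err) / len(y)  # only the new layer is updated
    b_out -= lr * 2 * err.mean()

mse = float(np.mean((features @ w_out + b_out - y) ** 2))
print(mse)  # much lower than just predicting the mean of y
```

Because only the small output layer is trained, far fewer examples are needed than training the whole network from scratch.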
Neural Nets require powerful computers to run
As we mentioned earlier, modern neural nets have millions or billions of weights and perform millions of multiplication and addition operations to calculate their outputs. This makes it difficult, and sometimes impossible, to run them on older or less powerful computers. There is a significant research effort dedicated to making neural networks smaller and running them on small devices.
Wrapping it up
Remember how we said neural nets are not a catch-all solution to all of your problems? Most of the time, one of the two conclusions above is the limiting factor on their usefulness. As time goes on, these limits will loosen, but they will never completely go away. Before you settle on deep learning for a new problem, consider other solutions first. There are often other machine learning/statistics tools that will provide satisfactory results while requiring less data or computational power. However, if you do have access to a large dataset and plenty of computational power, deep learning has the potential to build an unparalleled solution.
還記得我們怎么說神經網絡不是您所有問題的萬能解決方案嗎? 在大多數情況下,以上兩個結論之一是其實用性的限制因素。 隨著時間的流逝,這些限制將減少,但永遠不會完全消失。 在對新問題進行深入學習之前,請先考慮其他解決方案。 通常還有其他機器學習/統計工具可以提供令人滿意的結果,而所需的數據或計算能力卻更少。 但是,如果您確實有權訪問大型數據集和強大的計算能力,則深度學習有可能構建無與倫比的解決方案。
Hopefully, this article helped to de-mystify some of the concepts behind the buzzwords that get thrown around and to clarify when these techniques should be considered.
希望本文有助于消除流行語背后的一些概念的神秘性,并闡明何時應考慮使用這些技術。
Sources:
[1] “artificial intelligence.” Oxford Reference. https://www.oxfordreference.com/view/10.1093/oi/authority.20110803095426960. Accessed 27 Jul. 2020.
[2] Mitchell, Tom M. Machine Learning. New York, NY: McGraw Hill, 2017.
[3] https://commons.wikimedia.org/wiki/File:Detected-with-YOLO--Schreibtisch-mit-Objekten.jpg
Translated from: https://medium.com/swlh/de-mystifying-machine-learning-482049ee8c02