

Starter's Pack for Computer Vision

發(fā)布時間:2023/12/15 编程问答 29 豆豆
生活随笔 收集整理的這篇文章主要介紹了 arduino 入门套件_计算机视觉入门套件 小編覺得挺不錯的,現(xiàn)在分享給大家,幫大家做個參考.


Among the many disciplines in the field of machine learning, computer vision has arguably seen unprecedented growth. In its current form, it offers a plethora of models to choose from, each with its own shine. It is quite easy, then, to lose your way in this abyss. Fret not, for this foe can be defeated, like many others, using the power of mathematics and a touch of intuition.


Before we venture forth, it is important to have the basic knowledge of machine learning under your belt. To start with, we should understand the concept of convolution in general, and then we can narrow it down to its application in machine learning. Roughly speaking, by convolution we mean that one function strides over the domain of another function, thereby leaving its imprint (Fig 1).


A computer can’t really “see” an image; all it can perceive are bits, 0/1. To comply with this lingo, we can represent an image as a matrix of numbers, where each number corresponds to the pixel intensity (0–255). We can then perform convolution by taking a filter window which strides over the image progressively (Fig 2). Each filter is associated with a set of numbers which is multiplied element-wise with a portion of the image to extract a specific piece of information. We employ multiple kernels to capture various aspects of the image. The end goal is to learn suitable kernel weights which best encode the data for our use case. This information-capture process is what grants a computer the ability to “see”.
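As a minimal sketch of this mechanism (the tiny image and the hand-picked edge-detecting kernel below are illustrative assumptions, not from the article):

```python
import numpy as np

def conv2d(image, kernel):
    """Slide `kernel` over `image` (stride 1, no padding) and sum the
    element-wise products at each position -- a plain 2-D convolution."""
    kh, kw = kernel.shape
    ih, iw = image.shape
    out = np.zeros((ih - kh + 1, iw - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# A tiny "image": a dark region next to a bright region (pixel values 0-255)
image = np.array([[0, 0, 255, 255],
                  [0, 0, 255, 255],
                  [0, 0, 255, 255],
                  [0, 0, 255, 255]], dtype=float)

# A kernel that responds where intensity changes horizontally
kernel = np.array([[-1.0, 1.0]])

feature_map = conv2d(image, kernel)
print(feature_map)  # large response only at the dark-to-bright edge
```

In a real network these kernel weights are not hand-picked: they start random and are learned by gradient descent, which is exactly the "learn suitable kernel weights" step described above.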


Fig 2: Convolution at work; the input image is convolved using kernel weights. Source: Wikimedia

You might have noticed that as we move down the neural net, the feature map tends to shrink. In fact, it is entirely possible for the image to vanish down the line. In addition, the edges of the image get almost no say in the result, since the filter passes over them only once. This is where the concept of “padding” debuts. Padding means shielding our original image with an additional border of zero-valued vectors. This immediately solves the first problem, feature-map shrinkage: with a smart choice of padding we can make the output feature map have exactly the same dimensions as the input, which is called “SAME” padding. It also ensures that the kernel filter overlaps the edges of our image more than once. The case where we don’t employ this feature is called “VALID” padding.
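The arithmetic behind this is short enough to sketch (the input and kernel sizes below are arbitrary examples, assuming stride 1 and an odd kernel width):

```python
import numpy as np

def conv_output_size(n, k, padding, stride=1):
    """Width of a convolution output: floor((n + 2p - k) / s) + 1."""
    return (n + 2 * padding - k) // stride + 1

n, k = 28, 3  # e.g. a 28-pixel-wide input and a 3-wide kernel

# VALID padding: no zeros added, so the feature map shrinks
print(conv_output_size(n, k, padding=0))             # 26

# SAME padding: (k - 1) // 2 zeros per side keep the size intact
print(conv_output_size(n, k, padding=(k - 1) // 2))  # 28

# The padding itself is just a border of zeros around the image
image = np.ones((28, 28))
padded = np.pad(image, pad_width=1, mode="constant", constant_values=0)
print(padded.shape)  # (30, 30)
```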


Any significant stride in technology must surpass its predecessor in every way. In this light, the question arises: where exactly does the classic framework fall behind? This is easily explained once we examine the computational cost. A dense layer consists of a tight-knit connection between every pair of layers, and “each connection is associated with a weight” (Ardakani et al.). On the contrary, since convolution only considers a portion of the image, we can interpret it as a “sparsely-connected neural network”. In this architecture, “each neuron is only connected to a few neurons based on a pattern and a set of weights is shared among all neurons” (Ardakani et al.). The tight-knit nature of dense layers is the reason they have a far higher number of learnable parameters than convolutional layers.
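A back-of-the-envelope comparison makes the gap concrete (the layer sizes here are hypothetical, chosen only to illustrate the scale):

```python
# Hypothetical first layer on a 32x32 RGB input
h, w, c = 32, 32, 3

# Dense layer: every one of the h*w*c inputs connects to every hidden unit,
# and each connection carries its own weight (plus one bias per unit)
hidden_units = 1000
dense_params = (h * w * c) * hidden_units + hidden_units

# Convolutional layer: 64 filters of size 3x3x3 whose weights are shared
# across all spatial positions (plus one bias per filter)
num_filters, k = 64, 3
conv_params = num_filters * (k * k * c) + num_filters

print(dense_params)  # 3073000
print(conv_params)   # 1792
```

Weight sharing is what collapses the count: the convolutional layer's parameters do not grow with the image's spatial size at all.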


I think there are two main advantages of convolutional layers over just using fully connected layers. And the advantages are parameter sharing and sparsity of connections.


- Andrew Ng


It is often observed that a convolutional layer appears in conjunction with a “pooling” layer. The pooling layer, as the name suggests, down-samples the feature map of the previous layer. This is important because plain convolution latches tightly onto the input feature map, meaning even the slightest distortion in the image may lead to entirely different results. By down-sampling, we obtain a summary statistic of the input, thereby making the model translation invariant.


Imagine an image of a cat comes in.


Imagine the same image comes in, but rotated.


If you have the same response for the two, that’s invariance.


- Source: Tapa Ghosh, www.quora.com

By “translation invariant” we mean the output is invariant to a linear shift of the target image.
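A toy demonstration of this property with max pooling (the 1-D "feature response" below is made up for illustration; real invariance holds only for shifts small relative to the pooling window):

```python
import numpy as np

def max_pool_1d(x, size=4, stride=4):
    """Max pooling over non-overlapping windows of a 1-D signal."""
    return np.array([x[i:i + size].max()
                     for i in range(0, len(x) - size + 1, stride)])

# A 1-D "feature response" and the same response shifted right by one step
signal = np.array([0, 9, 1, 0, 0, 0, 0, 0])
shifted = np.roll(signal, 1)

print(max_pool_1d(signal))   # [9 0]
print(max_pool_1d(shifted))  # [9 0] -- the pooled summary is unchanged
```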


Fig 3: The original image (left) shifted left (right) produces the same output. Source: Vishal Sharma, www.quora.com

Pooling layers cut through this noise and polish the dominant features so they shine brighter. This provides immunity against such distortions and makes our model robust to change. Two methods of sampling are often employed:


  • Average Pool: This takes a filter and averages over the image, giving a fair say to the nuances of the image. This method is seldom employed.


  • Max Pool: Used prevalently, it takes the maximum of the pixel values under its window. In this method, we take only the most dominant feature into consideration.


It is important to note that the pooling layer in itself has no learnable parameters. Its operations are of a fixed size and are set before training, a.k.a. “hyper-parameters”. Some models, e.g. MobileNet, don’t rely on pooling layers for down-sampling; instead, “down sampling is handled with strided convolution” (Howard et al.).
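Both pooling operations can be sketched in a few lines (a NumPy toy with an arbitrary 4x4 feature map; note that nothing here is learned):

```python
import numpy as np

def pool2d(x, size=2, stride=2, mode="max"):
    """Down-sample `x` with a (size x size) window. The window size and
    stride are fixed hyper-parameters -- there are no weights to learn."""
    oh = (x.shape[0] - size) // stride + 1
    ow = (x.shape[1] - size) // stride + 1
    reduce_fn = np.max if mode == "max" else np.mean
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            window = x[i * stride:i * stride + size,
                       j * stride:j * stride + size]
            out[i, j] = reduce_fn(window)
    return out

fmap = np.array([[1, 3, 2, 0],
                 [4, 6, 1, 1],
                 [0, 2, 8, 5],
                 [1, 1, 4, 7]], dtype=float)

print(pool2d(fmap, mode="max"))  # keeps only the dominant value per window
print(pool2d(fmap, mode="avg"))  # averages each window instead
```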


The convolutional neural network is now the go-to method for computer-vision problems. Its introduction to this field has been a true game changer. It continues to grow more precise, faster, and more robust by the day, but its roots nevertheless are humble.


Bibliography

  • Ardakani, Arash, Carlo Condo, and Warren J. Gross. “Sparsely-connected neural networks: towards efficient VLSI implementation of deep neural networks.” arXiv preprint arXiv:1611.01427 (2016).


  • Andrew Y. Ng, “Why Convolutions?”, www.coursera.org


  • Howard, Andrew G., et al. “MobileNets: Efficient convolutional neural networks for mobile vision applications.” arXiv preprint arXiv:1704.04861 (2017).


Translated from: https://medium.com/analytics-vidhya/starters-pack-for-computer-vision-779b240cb045

