AI for 3-D Printing (Part 2): One Class Learning for Anomalous Melt Pool Detection
This article is part 2 of the AI for 3-D Printing series. Read part 1 and part 3.
From part 1, we concluded that there is a need to define a single metric to measure the degree of anomaly of melt pool video frames for an LPBF in-situ monitoring system. Ideally, the evaluation of this metric should be computationally inexpensive so that it can be computed in a near real-time setting. This article introduces an anomaly detection framework centred on the concept of One Class Learning.
Disclaimer: The purpose of this article is to showcase how different models can be put together in an anomaly detection framework. Details such as hyperparameter fine-tuning and model architecture will be omitted.
Introduction
One class learning is an unsupervised method for training classifiers when some classes in the dataset are either present only in small numbers or have no well-defined characteristics. The imbalanced dataset issue introduces overfitting risks, as supervised machine learning models usually struggle to generalise for the minority class. With one class learning, a classifier is trained to specialise in recognising well-characterised instances from a single class; hence, this type of classifier is also known as a One Class Classifier.
In the context of detecting anomalous melt pools, although the proportion of anomalous melt pools is small relative to the normal ones, we could, in theory, fix the data imbalance by generating a batch of anomalous training examples with a deliberately poor printing process. Whether the purposely generated anomalies share the same distribution as the real anomalies would be one concern; more importantly, the real problem here is that the anomalies are not well characterised. This means that manually annotating the anomalies is prone to inconsistency errors, due to the loosely established definition of an anomaly (e.g. how many spatter particles must a melt pool eject, and how big must they be, for the melt pool to be considered an anomalous instance?). One class learning provides an alternative to supervised training, as no labelling is required for the training phase.
Examples of one class learning algorithms include the autoencoder and the one class support vector machine. In this project, a deep convolutional autoencoder is used for anomaly detection.
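As a quick illustration of the one class idea (not the model used in this project), a one class SVM can be fitted on normal instances only and then asked to judge unseen points. The sketch below uses synthetic 2-D data as a placeholder.

```python
import numpy as np
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(3)
normal = rng.normal(0.0, 1.0, size=(300, 2))   # the well-characterised class
outliers = rng.uniform(5.0, 6.0, size=(5, 2))  # anomalies, never seen in training

# Fit on normal data only; nu bounds the fraction of training points
# allowed to fall outside the learned boundary
clf = OneClassSVM(nu=0.05, gamma="scale").fit(normal)

print(clf.predict(outliers))   # -1 means "anomalous", +1 means "normal"
```

Points far from the training distribution are assigned -1, while most normal points are assigned +1, without a single label being provided during training.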
With one class learning, we can assign a single continuous metric to measure the degree of melt pool anomaly. Operating on an unsupervised basis, a one class classifier requires no labelled data for training.
Autoencoder
This section provides a brief introduction to an autoencoder.
Generic architecture of an autoencoder.

The diagram above shows the generic architecture of an autoencoder. An autoencoder consists of two main components: the encoder and the decoder. The encoder compresses the high dimensional input data into a lower dimensional latent space, which is the bottleneck of the autoencoder's architecture. The decoder then decodes the encodings back into the original dimensional space. The encoding-decoding process is subject to the constraint that the input and output must be similar in the Euclidean sense.
Mathematically, this can be written as Z = g(X) and X′ = f(Z), subject to X ≈ X′, where X is the input image, g is the encoder, Z is the latent vector, f is the decoder and X′ is the output image.
Loss function of an autoencoder.

The loss function for training an autoencoder is the Euclidean distance between X and X′. The autoencoder aims to minimise this loss during training so that the resulting output is similar to the input data. Note that the loss function is also a measure of dissimilarity between the input and output; as such, it is also known as the reconstruction error (RE).
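As a minimal sketch, the per-frame RE can be computed as the mean squared pixel-wise difference between a batch of input frames and their reconstructions (the frame shapes here are placeholders):

```python
import numpy as np

def reconstruction_error(x, x_hat):
    """Per-image reconstruction error: mean squared Euclidean distance
    between the input batch x and the reconstructed batch x_hat."""
    x = np.asarray(x, dtype=np.float64)
    x_hat = np.asarray(x_hat, dtype=np.float64)
    # Average over the height and width axes, keeping one RE value per image
    return np.mean((x - x_hat) ** 2, axis=(1, 2))

# A perfect reconstruction gives RE = 0 for every frame
frames = np.random.rand(4, 32, 32)
print(reconstruction_error(frames, frames).tolist())  # → [0.0, 0.0, 0.0, 0.0]
```

A frame that the autoencoder fails to reconstruct yields a correspondingly larger RE, which is what makes the metric usable as an anomaly score.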
With the smaller dimensional space in the bottleneck, the encoder is forced to encode only the most representative features of the input into the bottleneck, so that the decoder can still reconstruct the input data. Choosing the dimension of the latent space is crucial: too small a dimension imposes too much restriction on the flow of information from the encoder to the decoder, making it hard for the decoder to reconstruct the input data. On the other hand, with an overly large latent dimension, the encoder will not learn to capture the important features of the input data, as little restriction is imposed on the flow of information.
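To make the bottleneck idea concrete, here is a hedged sketch using a fully connected autoencoder (scikit-learn's MLPRegressor fitted with X as both input and target) as a simple stand-in for the deep convolutional autoencoder used in this project; the frames are random placeholders and the layer sizes are illustrative assumptions.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

# Synthetic stand-in data: 200 flattened 8x8 "melt pool" frames
rng = np.random.default_rng(0)
X = rng.random((200, 64))

# A fully connected autoencoder: the 2-unit middle layer is the bottleneck,
# i.e. the latent space Z. Input and training target are both X.
bottleneck = 2
ae = MLPRegressor(hidden_layer_sizes=(32, bottleneck, 32),
                  activation="relu", max_iter=2000, random_state=0)
ae.fit(X, X)

X_hat = ae.predict(X)                     # reconstructed frames
re = np.mean((X - X_hat) ** 2, axis=1)    # per-frame reconstruction error
```

Widening or narrowing the `bottleneck` value in this sketch is exactly the latent-dimension trade-off discussed above: too narrow and the reconstructions degrade for everything, too wide and the model stops being forced to learn compact representative features.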
Data Pre-sieving
This section presents a data filtering method for preparing the training data. As mentioned, one class learning requires the anomalous instances to be a minority in the training dataset. A quick way to ensure this constraint is via an unsupervised clustering algorithm such as k-means clustering. Specifically, the focused melt pools from clusters 1 and 9 are shuffled and used to train our autoencoder.
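A pre-sieving step of this kind could be sketched as follows with scikit-learn; the frame data, the number of clusters (10), and the choice of clusters 1 and 9 as the "normal" ones are all placeholders standing in for the results from part 1.

```python
import numpy as np
from sklearn.cluster import KMeans

# Synthetic stand-in for the flattened melt pool video frames
rng = np.random.default_rng(42)
frames = rng.random((500, 64))

# Cluster the frames without labels (10 clusters assumed here)
kmeans = KMeans(n_clusters=10, n_init=10, random_state=0).fit(frames)

# Keep only frames from the clusters deemed normal (1 and 9 in this sketch)
keep = np.isin(kmeans.labels_, [1, 9])
train_frames = frames[keep]
rng.shuffle(train_frames)   # shuffle before feeding to the autoencoder
```

Because the clusters dominated by anomalies are simply dropped, the resulting training set satisfies the one class requirement without any manual labelling.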
Focused melt pools from cluster 1 and cluster 9 are used for the training of the autoencoder.

Since the autoencoder is exposed to mostly normal melt pool images during training, it will learn to capture the underlying normal melt pool representations, so that the majority of the reconstructed output is similar to the input. This results in a small overall training loss.
Anomaly Detection
The reconstruction error (RE) metric is used to measure the dissimilarity between the input and the reconstructed output. As a sanity check, the RE metric of out-of-focus melt pools was computed and visualised.
Sanity check: trained autoencoder applied to out-of-focus melt pool video frames.

As illustrated, the autoencoder fails to reconstruct the anomalies it encounters and, as a result, the RE metric spikes, indicating the occurrence of anomalous events. The test also shows that, for out-of-focus printing, the RE provides a good relative measure of the degree of anomaly. For example, plume instances with larger coverage give a larger RE than smaller plume instances.
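Turning the RE trace into anomaly events then reduces to a simple threshold test per frame, as this small sketch shows (the RE values and threshold are illustrative, not the project's actual numbers):

```python
import numpy as np

def flag_anomalies(re_per_frame, threshold):
    """Mark frames whose reconstruction error exceeds the threshold."""
    return np.asarray(re_per_frame) > threshold

# Hypothetical RE trace with spikes at frames 2 and 4
re_trace = np.array([0.002, 0.003, 0.009, 0.002, 0.007])
print(flag_anomalies(re_trace, 0.006).tolist())  # → [False, False, True, False, True]
```

Because the comparison is a single vectorised operation, it adds essentially nothing to the per-frame processing time.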
Melt pools sampled from (a) RE < 0.003, (b) 0.004 < RE < 0.005 and (c) RE > 0.006.

Some anomalous melt pool reconstructions are also presented below.
Original (first row), reconstructed (second row) and error (third row) melt pool images, with RE × 10³ shown on top.

Notice how the autoencoder fails to reconstruct the anomalous parts, such as the unstable tails of melt pools and spatter particles. For comparison, melt pool images with smaller RE are shown below.
Original (first row), reconstructed (second row) and error (third row) melt pool images, with RE × 10³ shown on top.

We can also project melt pool video frames onto a 2-D scatter plot to visualise the melt pool representations. Shown below is a plot of the first latent component against the second latent component, with anchored training melt pool video frames.
Second latent component Z2 plotted against first latent component Z1 for some of the training data points.

Interestingly, the autoencoder captures the travelling direction of the melt pools, and normal melt pools do appear as clusters in the latent space.
Second latent component Z2 plotted against first latent component Z1 for some of the testing data points.

In the same plot on the testing data, we observe that some anomalies are located far away from the clusters.
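If the latent dimension is larger than two, a projection such as PCA can produce the 2-D view; the sketch below uses synthetic 8-D vectors as a stand-in for the trained encoder's outputs and confirms that encodings far from the normal cluster stay far away after projection.

```python
import numpy as np
from sklearn.decomposition import PCA

# Hypothetical latent encodings; in the project these would come from
# the trained encoder. Here they are synthetic 8-D vectors.
rng = np.random.default_rng(1)
normal = rng.normal(0.0, 0.3, size=(200, 8))      # a tight normal cluster
anomalies = rng.normal(3.0, 0.3, size=(10, 8))    # encodings far away
encodings = np.vstack([normal, anomalies])

# Project the latent vectors onto 2-D for a Z1-vs-Z2 style scatter plot
z2d = PCA(n_components=2).fit_transform(encodings)

# Distance of every projected point from the normal cluster's centre
centre = z2d[:200].mean(axis=0)
dist = np.linalg.norm(z2d - centre, axis=1)
```

The distance-from-cluster view is complementary to the RE metric: both pick out the same far-away anomalous instances, but the 2-D plot makes the structure of the latent space visible.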
Model Performance Evaluation
Next, ~1500 unseen data points were labelled. The labelled data points can then be used to help determine a suitable RE threshold and to quantify the performance of the autoencoder.
ROC curve used to determine the best RE threshold.

An appropriate RE threshold was determined with a receiver operating characteristic (ROC) curve. Note that anomalies are defined as the positive class, while normal melt pools are labelled as the negative class.
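One common way to pick such a threshold is to maximise Youden's J statistic (tpr − fpr) over the ROC curve; the sketch below does this on synthetic labelled RE scores (the distributions, class sizes and resulting numbers are placeholders, not the project's actual results).

```python
import numpy as np
from sklearn.metrics import roc_curve, precision_score, recall_score

# Synthetic labelled test set: 1 = anomalous (positive), 0 = normal
rng = np.random.default_rng(7)
y_true = np.concatenate([np.zeros(1400, dtype=int), np.ones(100, dtype=int)])
re = np.concatenate([rng.normal(0.002, 0.0005, 1400),   # normal melt pools
                     rng.normal(0.008, 0.002, 100)])    # anomalies

# Sweep all candidate thresholds and pick the one maximising Youden's J
fpr, tpr, thresholds = roc_curve(y_true, re)
best = thresholds[np.argmax(tpr - fpr)]

y_pred = (re > best).astype(int)
print("RE threshold:", best)
print("anomaly recall:", recall_score(y_true, y_pred))
print("anomaly precision:", precision_score(y_true, y_pred))
```

The same labelled scores and chosen threshold then give the per-class recall and precision figures reported in the table below.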
Finally, with the determined RE threshold, the autoencoder's performance in terms of recall and precision for both melt pool classes was evaluated and summarised in the table below:
Some Thoughts
Measuring the time required for this anomaly detection framework, from min-max normalising an image to computing the RE, the autoencoder on average took 1 microsecond to output a prediction. This is approximately 1000 times faster than region props feature extraction alone. More importantly, the image processing is also faster than the frame rate of the LPBF in-situ monitoring system used.
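A timing measurement of this kind could be set up as below; the model here is a placeholder identity function, so the numbers it produces say nothing about the real autoencoder's speed, only how the per-frame average would be measured.

```python
import time
import numpy as np

def min_max_normalise(img):
    # Scale pixel values to [0, 1]; epsilon guards against a constant image
    lo, hi = img.min(), img.max()
    return (img - lo) / (hi - lo + 1e-12)

def fake_autoencoder(x):
    # Placeholder for the trained model's forward pass (identity here)
    return x

frames = np.random.rand(1000, 32, 32)     # synthetic melt pool frames

start = time.perf_counter()
for frame in frames:
    x = min_max_normalise(frame)
    x_hat = fake_autoencoder(x)
    re = np.mean((x - x_hat) ** 2)        # reconstruction error per frame
elapsed = time.perf_counter() - start

per_frame_us = (elapsed / len(frames)) * 1e6
print(f"average time per frame: {per_frame_us:.1f} us")
```

Comparing this per-frame time against the camera's frame interval is what determines whether the pipeline can keep up with the in-situ monitoring stream.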
Being a completely unsupervised model, the autoencoder requires no manual labelling for training; this is a huge saving in effort, as supervised deep learning models typically need a lot of labelled data. Overall, the implementation of the one class learning framework shows a great computational time saving for LPBF in-situ monitoring. With faster processing speed, we can now selectively extract features of melt pools for subsequent anomaly analysis.
Alternatively, we can use the latent vectors as extracted features for training classifiers, since the encoding contains the most valuable information about the melt pool geometries. This will be illustrated in more detail in part 3 of this series.
Caveats
The performance of an autoencoder varies significantly when deployed on a very different unseen dataset. Data extrapolation is an issue for most machine learning models; however, since the autoencoder works by encoding the most informative features of the input data, its usage is even more data specific. This is good for the purpose of one class learning, but the robustness of the RE would be questionable for melt pool images produced under different printing parameters. For instance, an autoencoder trained on melt pools captured from a meander scanning strategy would probably be bad at reconstructing most melt pool images from an island scanning strategy, regardless of whether they are anomalous or not.
Concluding Remark
This article illustrates the usage of a deep convolutional autoencoder for melt pool anomaly detection. Trained on a dataset pre-sieved by k-means clustering, the autoencoder was able to reconstruct normal melt pool images very well. When it fails to reconstruct anomalous melt pools, the autoencoder incurs a large reconstruction error. The usage of the autoencoder shows promising results in potential computational savings and performance. In the next article, we will explore the usage of a probabilistic variant of the vanilla autoencoder for anomaly classification. Thanks for reading :)
Translated from: https://towardsdatascience.com/ai-for-3-d-printing-anomalous-melt-pools-detection-and-classification-part-2-895704203c5a