Detecting Malaria Using Transfer Learning
Written by Francesco Palma and Isaac Rosat
In this article, we will first describe how malaria works and how to diagnose this disease, as well as the problems inherent to it. Then we will talk about the ML model used to detect malaria with the help of blood samples and the results of its performance.
Malaria is one of the deadliest parasite-related diseases humankind has ever known. It has been around ever since humans have been on the face of the earth and it is certainly here for the long term.
In 2018, 228 million people were infected and 405,000 died, making malaria a significant health concern. Africa is home to 94% of all malaria cases (2018), but the disease is also prevalent in southern China and present on nearly all continents. Because malaria mainly hits African countries, it is even harder to contain and fight effectively: the developing countries in which it prospers do not always have the resources to reduce their vulnerability to it. Malaria-hit countries lose an estimated 12 billion US dollars per year to the inability of many citizens to work during big outbreaks, high healthcare costs and adverse effects on tourism. Add to that around 2.7 billion dollars of humanitarian aid, and there is no doubt that malaria has a tremendous negative impact on the economy of affected countries.
Malaria, what is it?
Malaria is actually only the name given to the disease. It is caused by a unicellular protozoan parasite known by its genus name, Plasmodium.
Plasmodium comprises a wide variety of parasites, around 200 species, but only a handful can infect humans. As they are parasites, they use several hosts to feed and reproduce.
Five species of Plasmodium can successfully parasitize humans and harm them: P. falciparum, P. vivax, P. ovale, P. malariae, and P. knowlesi (P. stands for Plasmodium). While all five can cause malaria, it is P. falciparum that is usually lethal and causes severe malaria.
Two different types of malaria-causing Plasmodium, source: https://microbenotes.com/differences-between-plasmodium-vivax-and-plasmodium-falciparum/
These parasites have a rather simple life cycle compared to other types of parasites, as they only have two hosts, mosquitoes and humans.
The parasite’s life starts in the mosquito’s gut, and after a multiplication stage, the parasites move to the insect’s salivary glands. When the mosquito feeds on a human’s blood, the plasmodium enters the human’s bloodstream. Following the infection, the parasites first travel to the liver and go through a multiplication phase. Once mature and more numerous, they leave the liver and enter the red blood cells (erythrocytes).
At this point, the parasite will keep multiplying until the body can effectively fight the infection. Some of the parasites will differentiate into sexual, egg-like cells (gametocytes) and flow freely in the human’s blood until a mosquito feeds again and takes them up. Once back in the mosquito’s gut, the male cell will fertilize the female, and the cycle can start again.
Life cycle of the plasmodium, source: https://sk.pinterest.com/pin/706220785288117948/
Malaria fevers usually occur when the plasmodium bursts out of the erythrocytes in large numbers and provokes an immune response, with symptoms such as high fever, vomiting, nausea, and headaches. As many red blood cells are lysed, it can also result in anemia in some instances, and this anemia can lead to multiple organ failure and metabolic acidosis.
The best way to fight the parasite lies above all in prevention. This includes using mosquito repellent as well as insecticide-treated mosquito nets when sleeping at night. Some high-risk areas are also fogged with insecticide to keep the mosquito population down.
Several medications exist for people who are already infected, but they must be given very rapidly to avoid severe harm or even death.
One big problem with this parasite, and probably one of the main reasons it is still around, is its ability to rapidly evolve and develop resistance to the medication it encounters. Mosquitoes also develop resistance to insecticides, which makes it even harder to fight malaria. Additionally, the unhygienic conditions in developing countries, particularly in sub-Saharan Africa, are an ideal mosquito breeding ground, especially in the rainy seasons when the insects have many puddles to lay their eggs in.
The most effective way to avoid severe complications from malaria is a rapid diagnosis. This prevents deaths and helps contain the parasite so that it does not infect more people.
The most common way to detect a malaria infection is to carry out a microscopic blood analysis. This does not require a complex set of skills, but basic medical knowledge is essential. The problem is that in developing countries the appropriate medical equipment and personnel are not always readily available, making this diagnosis slower than desirable, if not non-existent.
Another major issue is that when the resources are available, there is a tendency to overestimate the number of infected individuals. Many studies suggest that there is a huge problem of overdiagnosis (as high as 98% wrong diagnoses in certain rural health centers; Angola, 2012). This results in a massive misuse of malaria treatment and could therefore end in a shortage of malaria drugs. In addition, this misuse of medication lowers the patients’ response to the drugs when they are really sick and strengthens the plasmodium’s resistance. Finally, it entrenches the lack of the knowledge needed to do the microscopic blood analysis properly.
Despite many efforts trying to fight malaria, misuse of resources is still a major problem
How could A.I. help with malaria?
We have seen that misuse of resources is a major problem, but how could money be better invested in order to help with malaria?
A.I. could be the answer, and on several fronts. First and foremost, rapidly detecting this disease using A.I. and ML programs could save countless lives by quickly giving a diagnosis. Not only would it be faster than a human, but it could potentially be much more accurate, given the overdiagnosis figures seen earlier. This accuracy would lead to better management of the medication and to enormous savings, which could then be reinvested and put to better use.
Taking photos of red blood cells is very easy and one would only need a microscope and an adaptor ring to take pictures of the erythrocytes. Once the photo is taken, it would be possible to simply run it through an ML program, trained to recognize infected cells.
Given the rather simple way to detect an infection, we thought we could try to develop a malaria detection program using Giotto, no code involved. The point would be to input unclassified images of red blood cells and let the program do the classification, with satisfactory accuracy.
Giotto is a machine learning platform that can develop an ML program in image classification, without having to code. For a more detailed, step-by-step description of Giotto, you can read one of our previous blog-posts.
No-code platforms are straightforward to use, and this could really democratize A.I. Not only can anyone use it, but the possibility of deploying a Docker container means that, with only a computer and no Wi-Fi, you can use A.I. in the simplest way. This is a critical point, as many of the high-risk malaria zones are very remote and therefore don’t have access to a decent internet connection. If this worked, rural communities would not have to travel to the closest health center (which is sometimes very far away). The point would not be to replace human expertise, but when there are no other choices, this could be an answer.
The key to properly training a program is to have the appropriate data. For image classification, this means having one correctly labeled folder per class, with more or less the same number of images in each folder. We used a data set composed of 27,558 images of erythrocytes, half parasitized and half normal.
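The folder layout described above can be checked before uploading anything. The following is a minimal sketch, assuming one subfolder per class under a root directory; the function names, the accepted file extensions and the example folder names are illustrative and not part of Giotto.

```python
from pathlib import Path

# Assumption: images are stored as PNG or JPEG files.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg"}


def count_images_per_class(root):
    """Return {class_folder_name: number_of_image_files} for each subfolder of root."""
    counts = {}
    for class_dir in sorted(Path(root).iterdir()):
        if class_dir.is_dir():
            counts[class_dir.name] = sum(
                1
                for f in class_dir.iterdir()
                if f.is_file() and f.suffix.lower() in IMAGE_SUFFIXES
            )
    return counts


def imbalance_ratio(counts):
    """Largest class size divided by the smallest; 1.0 means perfectly balanced."""
    sizes = list(counts.values())
    if not sizes or min(sizes) == 0:
        raise ValueError("every class folder needs at least one image")
    return max(sizes) / min(sizes)
```

Calling `count_images_per_class("cell_images")` on a hypothetical `cell_images` directory with two class subfolders would return the image count of each; a ratio close to 1.0 from `imbalance_ratio` indicates the "more or less the same number" condition holds.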
Having properly labeled images is essential
In order to work with this data set, we first had to upload it to Giotto. We then had to choose the data augmentation techniques to apply to our images. In this case, all of the methods were applied, as none of them would alter the images’ integrity. This augmentation process boosted our model’s performance by feeding it more data than the initial set.
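The article does not list which augmentation methods Giotto offers, so the sketch below only illustrates the general idea with three common label-preserving transforms (two flips and a quarter-turn rotation). It assumes an image is a plain list of pixel rows; the function names are hypothetical.

```python
def flip_horizontal(image):
    """Mirror the image left to right."""
    return [row[::-1] for row in image]


def flip_vertical(image):
    """Mirror the image top to bottom."""
    return [row[:] for row in image[::-1]]


def rotate_90_clockwise(image):
    """Rotate the image a quarter turn clockwise."""
    return [list(row) for row in zip(*image[::-1])]


def augment(image):
    """Return the original image plus three transformed copies."""
    return [
        image,
        flip_horizontal(image),
        flip_vertical(image),
        rotate_90_clockwise(image),
    ]
```

A blood cell has no natural orientation under the microscope, which is why transforms of this kind leave the label (infected or not) unchanged while giving the model more varied examples.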
All of the data augmentation processes were applied to our data.
Once this was done, we had to choose the specifics of our model, such as the ResNet depth and the number of epochs. Our model performed best with a ResNet-34 and 30 epochs.
Training time was rather long on this project as the data set was massive; it took around 3 hours to complete the training.
With a 20% validation split, our model achieved 97.88% accuracy, without a single line of code.
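For readers who want to see what a validation split and an accuracy figure mean in practice, here is a minimal sketch. It is not how Giotto computes them internally; the function names, the shuffling strategy and the fixed seed are assumptions made for illustration.

```python
import random


def split_train_val(items, val_fraction=0.2, seed=0):
    """Shuffle a copy of items and hold out val_fraction of them for validation."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split reproducible
    n_val = int(len(shuffled) * val_fraction)
    return shuffled[n_val:], shuffled[:n_val]


def accuracy(predictions, labels):
    """Fraction of predictions that match the true labels."""
    if len(predictions) != len(labels) or not labels:
        raise ValueError("predictions and labels must be non-empty and equal in length")
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)
```

With 27,558 images, a 20% split holds out roughly 5,500 of them. If the reported accuracy was measured on that held-out set, about 2% of those images, on the order of a hundred cells, were misclassified.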
This accuracy is satisfying, as the settings chosen for the program are not too demanding (ResNet-34 and 30 epochs). Very similar results (~96%) were already achieved with 10 epochs.
Conclusion
Despite medicine’s impressive innovations, this parasite still thrives in many parts of the world, taking hundreds of thousands of lives every year. Finding new ways to fight malaria is crucial. This could help millions as well as promote economic development in high-risk areas.
An accuracy of 97.88% is satisfactory, especially as it was achieved after only 3 hours of training, using absolutely no code. In low-resource locations, this could easily achieve better results than someone who has little or no training in running an appropriate analysis. Such an easy application would be straightforward to deploy in rural places, allowing populations to self-medicate if there is no other choice.
Summary of our model.
You can try the program yourself: https://cloud.giotto.ai/ic/malaria (password: malaria123)
Sources and links
Translated from: https://towardsdatascience.com/detecting-malaria-using-transfer-learning-fab5e1810a88