生信漫谈如何利用MEGA7构建系统进化树
生活随笔
收集整理的這篇文章主要介紹了
生信漫谈如何利用MEGA7构建系统进化树
小編覺得挺不錯的,現在分享給大家,幫大家做個參考.
前言
生物技術近年發展越來迅猛,掌握一門生信語言或者一個生信軟件的使用,這將為我們的科研學習之路提供非常大的便利。今天我們主要來介紹如何用MEGA7進行進化樹。
1、下面以MEGA7為例來進行講解,下面是下載地址,大家根據自己的系統進行下載即可。
http://www.megasoftware.net/?
?
?
?
2、序列的準備,必須是fasta結尾的格式,其他像txt格式,軟件不能識別,以下以擬南芥SPL15基因的蛋白序列為例,進行同源序列查找
>NP_191351.1 SPL15 [organism=Arabidopsis thaliana] [GeneID=824961] MELLMCSGQAESGGSSSTESSSLSGGLRFGQKIYFEDGSGSRSKNRVNTVRKSSTTARCQVEGCRMDLSN VKAYYSRHKVCCIHSKSSKVIVSGLHQRFCQQCSRFHQLSEFDLEKRSCRRRLACHNERRRKPQPTTALF TSHYSRIAPSLYGNPNAAMIKSVLGDPTAWSTARSVMQRPGPWQINPVRETHPHMNVLSHGSSSFTTCPE MINNNSTDSSCALSLLSNSYPIHQQQLQTPTNTWRPSSGFDSMISFSDKVTMAQPPPISTHQPPISTHQQ YLSQTWEVIAGEKSNSHYMSPVSQISEPADFQISNGTTMGGFELYLHQQVLKQYMEPENTRAYDSSPQHF NWSL?
選擇其中同源序列高的前19條蛋白序列進行下載進行示范
?
?
下載后的序列形式:
>AST51816.1 Venus [Cloning vector pSTB205] MVSKGEELFTGVVPILVELDGDVNGHKFSVSGEGEGDATYGKLTLKLICTTGKLPVPWPTLVTTLGYGLQCFARYPDHMK QHDFFKSAMPEGYVQERTIFFKDDGNYKTRAEVKFEGDTLVNRIELKGIDFKEDGNILGHKLEYNYNSHNVYITADKQKN GIKANFKIRHNIEDGGVQLADHYQQNTPIGDGPVLLPDNHYLSYQSALSKDPNEKRDHMVLLEFVTAAGITLGMDELYKE LLMCSGQAESGGSSSTESSSLSGGLRFGQKIYFEDGSGSRSKNRVNTVRKSSTTARCQVEGCRMDLSNVKAYYSRHKVCC IHSKSSKVIVSGLHQRFCQQCSRFHQLSEFDLEKRSCRRRLACHNERRRKPQPTTALFTSHYSRIAPSLYGNPNAAMIKS VLGDPTAWSTARSVMQRPGPWQINPVRETHPHMNVLSHGSSSFTTCPEMINNNSTDSSCALSLLSNSYPIHQQQLQTPTN TWRPSSGFDSMISFSDKVTMAQPPPISTHQPPISTHQQYLSQTWEVIAGEKSNSHYMSPVSQISEPADFQISNGTTMGGF ELYLHQQVLKQYMEPENTRAYDSSPQHFNWSL >NP_191351.1 squamosa promoter binding protein-like 15 [Arabidopsis thaliana] MELLMCSGQAESGGSSSTESSSLSGGLRFGQKIYFEDGSGSRSKNRVNTVRKSSTTARCQVEGCRMDLSNVKAYYSRHKV CCIHSKSSKVIVSGLHQRFCQQCSRFHQLSEFDLEKRSCRRRLACHNERRRKPQPTTALFTSHYSRIAPSLYGNPNAAMI KSVLGDPTAWSTARSVMQRPGPWQINPVRETHPHMNVLSHGSSSFTTCPEMINNNSTDSSCALSLLSNSYPIHQQQLQTP TNTWRPSSGFDSMISFSDKVTMAQPPPISTHQPPISTHQQYLSQTWEVIAGEKSNSHYMSPVSQISEPADFQISNGTTMG GFELYLHQQVLKQYMEPENTRAYDSSPQHFNWSL >KAG7634825.1 SBP domain superfamily [Arabidopsis suecica] MELLMGSGQAESGGSSSTESSSLSGGLRFGQKIYFEDGSGSRSKNRVNTVRKSSTTARCQVEGCRMDLSNVKAYYSRHKV CCIHSKSSKVIVSGLHQRFCQQCSRFHQLSEFDLEKRSCRRRLACHNERRRKPQPTTALFTSHYSRIAPSLYGNPNAAMI KSVLGDPTAWSTARSVMQRPGPWQINPVRETHPHMNVLSHGSSSFTTCPEMINNNSTDSSCALSLLSNSYPIHQQQLQTP TNTWRPSSGFDSMISFSDKVTMAQPPPISTHQPPISTHQQYLSQTWEVIAGEKSNSHYMSPVSQISEPADFQISNGTTMG GFELYLHQQVLKQYMEPENTRAYDSSPQHFNWSL >CAA0387110.1 unnamed protein product [Arabidopsis thaliana] MELLMGSGQAESGGSSSTESSSLSGGLRFGQKIYFEDGSGSRSKNRVNTVRKSSTTARCQVEGCRMDLSNVKAYYSRHKV CCIHSKSSKVIVSGLHQRFCQQCSRFHQLSEFDLEKRSCRRRLACHNERRRKPQPTTALFTSHYSRIAPSLYGNPNAAMI KSVLGDPTAWSTARSVMQRPGPWQINPVRETHPHMNVLSHGSSSFTTCPEMINNNSTDSSCALSLLSNSYPIHQQQLQTP TNTWRPSSGFDSMISFSDKVTMAQPPPISTHQPPISTHQQYLSQTWEVIAGEKSNSHYMSPVSQISEPVDFQISNGTTMG GFELYLHQQVLKQYMEPENTRAYDSSPQHFNWSL >CAD5326126.1 unnamed protein product [Arabidopsis thaliana] MELLMGSGQAESGGSSSTESSSLSGGLRFGQKIYFEDGSGSRSKNRVNTVRKSSTTARCQVEGCRMDLSNVKAYYSRHKV CCIHSKSSKVIVSGLHQRFHQLSEFDLEKRSCRRRLACHNERRRKPQPTTALFTSHYSRIAPSLYGNPNAAMIKSVLGDP TAWSTARSVMQRPGPWQINPVRETHPHMNVLSHGSSSFTTCPEMINNNSTDSSCALSLLSNSYPIHQQQLQTPTNTWRPS SGFDSMISFSDKVTMAQPPPISTHQPPISTHQQYLSQTWEVIAGEKSNSHYMSPVSQISEPVDFQISNGTTMGGFELYLH QQVLKQYMEPENTRAYDSSPQHFNWSL >KAG7561265.1 SBP domain superfamily [Arabidopsis thaliana x Arabidopsis arenosa] MELLMGSGQAESGGSSSTESSSLSGGLRFGQKIYFEDGSGSGSKNRVNTGRKSTMTARCQVEGCRMDLSNVKAYYSRHKV CCIHSKSSKVIVSGLHQRFCQQCSRFHQLSEFDLEKRSCRRRLACHNERRRKPQPTTALFTSRYSRIAPSLYGNPNAAMI KSVLGDPMAWSTAKSVMRRSGPWQINPERESHQLLNVLSHGSSSFTTCPEIINNNSTDSSCALSLLSNSNPIQQQQLQTP TNLWRPSSGFDSLISFSDRVTMAQPPPISTHHQYLSQTWEVMAGEKSNSHYISPVSQISEPADFQISNGTTMGGFELSLH QQVLRQYMEPENTRAYDSSPQHFNWSL >XP_002878178.1 squamosa promoter-binding-like protein 15 [Arabidopsis lyrata subsp. lyrata] MELLMGSGQAESGGSSSTESSSLSGGLRFGQKIYFEDGSGSGSKNRVNTGRKSTMTARCQVEGCRMDLSNVKAYYSRHKV CCIHSKSSKVIVSGLHQRFCQQCSRFHQLSEFDLEKRSCRRRLACHNERRRKPQPTTALFTSRYTRIAPSLYGNPNAAMI KSVLGDPTAWSTARSVMRRSGPWQINPERESHQIMNVLSHGSSSFTTCPEITNNNSTDSSCALSLLSNSNPIQQQQLQTP TNLWRPSSGFDSMISFSDRVTMAQPPPISTHHQYLSQTWDVMAGGKSNSHYMSPVSQISEPAEFQISNGTTMGGFELSLH QQVLRQYMEPENTRAYDSSPQHFNWSL >KAG7566101.1 SBP domain [Arabidopsis suecica] MELLMGSGHAESGGSSSTESSSLSGGLRFGQKIYFEDGSGSGSKNRVNTGRKSTMTARCQLEGCRMDLSNVKAYYSRHKV CCIHSKSSKVIVSGLHQRFCQQCSRFHQLSEFDLEKRSCRRRLACHNERRRKPQPTSSLFTSRYSRIAPSLYGNPNAAMI KSVLGDPMAWSTAKSVMRRSGPWQINPERESHQLLNVLSHGSSSFTTCPEIINNNSTDSSCALSLLSNSNPIQQQQLQTP TNLWRPSSGFDSLISFSDRVTMAQPPPISTHHQYLSQTWEVMAGEKSNSHYISPVSQISEPAGFQISNGTTMGGFELSLH QQVLRQYMEPENTRAYDSSPQHFNWSL >CAE6076605.1 unnamed protein product [Arabidopsis arenosa] MRRGRGKGKRQNATAREDRGSGEEEKIPAFRRRGRPQKPVKDEIEEEEVELVKKTEEEEDKDDDTNGSVTSKEDVTENGR KRKKPVESKESNITEEENGVGSKSSTEDSMKSSSSIGFRQNGSRRKNKPRRAAEAVVECNGAESGGSSSTESSSLSGGLR FGQKIYFEDGSGSGSKNRVNTGRKSTMTARCQVEGCRMDLSNVKAYYSRHKVCCIHSKSSKVIVSGLHQRFCQQCSRFHQ LSEFDLEKRSCRRRLACHNERRRKPQSTTSLFTSRYSRIAPSLYGNPNAAMIKSVLGDPMAWSTAKSVMRRSGPWQINPE RESHQLLNVLSHGSSSFTTCPEIINNNSTDSSCALSLLSNSNPIQQQQLQTPTNLWRPSSGFDSLISFSDRVTMAQPPPI STHHQYLSQTWEVMAGEKSNSHYISPVSQISEPADFQISNGSTMGGFELSLHQQVLRQYMEPENTRAYDSSPQHFNWSL >XP_006291402.1 squamosa promoter-binding-like protein 15 [Capsella rubella] MELLMGSGQAESGGSSSTESSLLSGGLRFGQKIYFEDGSGSGSKNRVSTGHKSSMTTVARCQVEGCKMDLSNAKAYYSRH KVCCIHSKSSKVIVSGLHQRFCQQCSRFHHLSEFDLEKRSCRRRLACHNERRRKPQPATLFTSHYTRIAPSLYGNANAAM IKSVLGDPTAWSTSRSVMRSSGPWQINPVKESNQLMNVYSQESSSFTITCPEMMNNNSTDSGCALSLLSNSNPIQQQQQQ PQTQTNIWRSSSGFDSMILDRVTMAQPPPISGHHQYLNQTLAFMAGEKSNSHYMSPVLGPSQISEPDEFQISNGTTMDGF ELSLHQQVLRQYMEPENTRAYDSSPHYFNWSL >CAH2063751.1 unnamed protein product [Thlaspi arvense] MELLMGSGQNRTESYGSSSTESSSLSGGLRFGQKIYFEDGSGSGGGSNKNRVNTGRKSRTARCQVEGCRMDLSNVKTYYS RHKVCCIHSKSSKVIVSGLHQRFCQQCSRFHQLSEFDLEKRSCRRRLACHNERRRKPQATTSLLTSRYSRIAPSLYGNAN TAMIRSVLGDPTAWSTARSVMRRSAPWQINPERESHQLMNVFSHDSSSFTTTCPEMMNSNGTDSSCALSLLSNSNTNQQQ QLLQTSTNIWRPSSGFDSANADRATMAQPPPVSNQHQYLNQTWEFMAGEKSNSHYLSPVLGLSQISEPVDFQISNGTTMG GFELSIHQQVLRHYMEPENTRAYDSSAQHFNWSL >XP_010516431.1 PREDICTED: squamosa promoter-binding-like protein 15 [Camelina sativa] MELLIGGSGQTESGGASSTKSSSLSGGLRFGQKIYFEDGSGSGSKNRVGTGHKSSTTTTTARCQVEGCKMDLSNAKAYYS RHKVCCIHSKSSKVIVSGLRQRFCQQCSRFHQLSEFDLEKRSCRRRLACHNERRRKPQPTTLYTSQYTRIAPSLYGDANA AMMKSVLGDPTVWSTARSVMRRSGPWQISPVKESHHQLMNVFSQESSSFTITCPEMMNNNSTDSSCALSLLSNSNSNSNP IQQQQQQLQTQTHIWRPSLGFDSMTVDRVTMAQPPPISSHHQYLNQTLEFMAGEKSSSHYMSPVLGPSQISEPDEFQISN GTTMDGFELSLHQQVLRQYMEPENTRAYDSSPHHFNWSL >AKC05620.1 squamosa promoter-binding-like protein 15 [Cardamine hirsuta] MELLMGSGQSESGASSSNESSSLSGGLRFGQKIYFEDGSGSGSKNRVSSTGRKSSTTTARCQVEGCRMDLSNAKTYYSRH KVCCIHSKSSKVIVSGLHQRFCQQCSRFHQLSEFDLEKRSCRRRLACHNERRRKPQPATTLFTSRFTRTAPSHYGNANAA MIKSVLGDPTAWTAERSVMRRSAPWQSNPSHQVMIDFSHGSSSLTTTCPEMMNNTSTDSSCALSLLSNSNQTQQLQQQLQ TPANIWRASSGFDSMIADRVTMAQPPPISTHHQYLNQSWEFMPGEKNDSHYMSPMSQISEPADLHMRNRTTMGGFEVSLH QQVMRQYMAPENTRAYDSSPQHFNWSL >XP_010504729.1 PREDICTED: squamosa promoter-binding-like protein 15 [Camelina sativa] MELLMGGSGQTESGGASSTESSSLSGGLRFGQKIYFEDGSGSGSKNRVGAGHKSSTTARCQVEGCKMDLSNAKAYYSRHK VCCIHSKSSKVIVSGLHQRFCQQCSRFHQLSEFDLEKRSCRRRLACHNERRRKPQPTTLYTRIAASLYGNANAAMIKSVL GDPTVWSTARSVMRRSGPWQINPVKESHHQHMNVFSQESSSFTITCPEMMNNNSTDSSCALSLLSNSNSNPIQQQQQQLQ TQTNIWRPSSGFDYMTVDRVTLAQPPPIPSHHQYLNQTLEFMTGEKNSSHYMSPALGPSQISAPDEFQISNGTTMDGFEL SLHQQVLRQYMAPENTRAYDSSPHHFNWSL >CAA7060637.1 unnamed protein product [Microthlaspi erraticum] MELLMDSSQTESGGSSSIESSSLTGGLRFGQKIYFEDGSGSGAKSSKNRVNTARKSSTSTARCQVEGCRMDLSNAKTYYS RHKVCCIHSKSSNVIVSGLHQRFHLLSEFDLEKRSCRRRLACHNERRRKPHATTNLLTSRYSRIAPSLYENANTAIFRSV LGDTTAWSAARPVMRRSGPWQINPERESNLNVFSHGSSSFTTCPAMMNNNSTDSSCALSLLSNSNTNTNQQQQQPLQTST DTWRPSSGFDSMIADRVTMAQPPPVSIHNQYLNQSWDFMEGEKSNSHHMSPVLGLSQISEPADFQLSNGMGGGFELSLHQ QVLKQYMEPENTRAYDSSPQHFNWSL >KAG2324838.1 hypothetical protein Bca52824_007566 [Brassica carinata] MELLMGSGQDHPQSAGSSSTLSGGLRFGQKIYFEDGSGAGLSRNRVNNTGRKSMTARCQVEGCRMDLSNAKTYYSRHKVC CVHSKSSKVIVSGLHQRFCQQCSRFHQLSEFDLEKRSCRRRLACHNERRRKPQTTTTLLTSHYSSIAPSLYGNAIRSVLG DPTLWSTARGSSAPWQINPERESHHQLMNIISFGSSSFTNSTDSSCALSLLSNSNRNQQEQQPLQTPTNAWRPSLDFDSI VADRVTMAQPPPVSIQNQYLNQTWEFMSGEKSNAHCISPVLGLSQISEPVDFQTSNGATMSGVELSLHQQVLRQYLEPEN TRAYDSSHQHFNWSL >CAH8384605.1 unnamed protein product [Eruca vesicaria subsp. sativa] MELEMGSGQKKPESAGSSSTLSGGLRFGQKIYFEDGSGAGLSKNRVSSTGRKSMTARCQVEGCRTDLSNAKTYYSRHKVC CVHSKSSKVIVSGLHQRFCQQCSRFHQLSEFDLEKRSCRRRLACHNERRRKPQATTTLLTSRYSSLYGNAIRSVLGDPTT WSTARGSAPWKINQESDRHQLMNVISFGSSSFTTCPEMMNNNSTDSSCALSLLSNSNPNQQEQQPLQTSNTIWRPSLDFD STVADRVTMAQPPPVSMQNQYLNQTWEFMSGEKSNAQCISPVLGQSQISEPVDFQIGTTMGGGFELSLHQQVLRQYMEPE NTRAYDTSPQYFNWSL >KAF8114775.1 hypothetical protein N665_0034s0114 [Sinapis alba] MELLMGSGQNQPESAGSSSSTLSGGLRFGQKIYFEDGSGAGLSKNRVNTGRKSTTARCQVEGCRMDLSSAKTYYSRHKVC CIHSKSSKVIVSGLHQRFCQQCSRFHQLSEFDLEKRSCRRRLACHNERRRKPQATTTFLTSHYSSIAPSLYGNAIRGVLG DSTTWSTARGSAPLQINPERESHRLMNVFSFGSSSFTNNSTDSSCALSLLSNSNPNQQEQQPLQTPTNTWRPSLDFDSIV ADRVTMAQPPPVSVQNQYLNQTWEFMSGEKSNGQHYISPVLGLSQISEPVDFQISNGATMSGVELSLHQQVLRQYLEPEN TRAYDSSPQHFNWSL >XP_010427684.1 PREDICTED: squamosa promoter-binding-like protein 15 [Camelina sativa] MELLMGGTESGGASSTESSSLSGGLRFGQKIYFEDGSGSGSKNRVVTGHKSSTTTTTARCQVEGCKMDLSNAKAYYSRHK VCCIHSKSSKVIVSGLHQRFCQQCSRFHQLSEFDLEKRSCRRRLACHNERRRKPQPTTLFTSHYTRIAPSLYGNANAAMI KSVLGDPTVWSTARSVMRRSGPWQINPVKESHHQLMNVFSQESSSFTITCPEMMNNNNSTDSSCALSLLSNSNSNPIQQQ QQQLQTQTNIWRPSLGFDSMTVDRVTLAQPPPILSHHQYMSPVLGPSQISAPDEFQISNVTTMDGFELSLHQQVLRQYME PQNTRAYDSSPHHFNWSL3、導入蛋白序列,點擊File菜單欄導入或者直接拖進MEGA7軟件都可以
?
?
?
?
以上任一種形式都可以。
4、多序列比對,導入成功后如下圖所示。
?
?
選擇Alignment > Align by ClustalW > OK > 默認參數
?
?
?
?
比對后的結果如下圖:
?
?
5、系統進化樹構建,選擇NJ法進行構建系統進化樹
?
?
?
?
?
?
6、結果展示
第一種,步長樹
?
?
第二種步長對齊樹
?
?
其他形狀的樹,點擊以上按鈕進行展示
?
?
7、保存nwk樹文件,導入樹圖片進行美化
?
?
?
?
?
?
生信漫談
生信漫談,認識生信,學習生信,跨越生信入門路上的障礙,從而利用生信技術解決科研學習路上的絆腳石!
總結
以上是生活随笔為你收集整理的生信漫谈如何利用MEGA7构建系统进化树的全部內容,希望文章能夠幫你解決所遇到的問題。
- 上一篇: 卧槽,发现一堆Windows神器软件!!
- 下一篇: java信息管理系统总结_java实现科