Integrative Genomics Viewer (IGV)

Published: 2023/12/13 by 生活家


http://software.broadinstitute.org/software/igv/igvtools_commandline

Running igvtools from the Command Line

Downloading igvtools

The igvtools utilities can be downloaded from the Downloads page on the IGV Web site.

igvtools_<version #>.zip includes the jar file and shell scripts for running igvtools, as well as the genome files.
igvtools_nogenomes_<version #>.zip includes the jar file and shell scripts for running igvtools, without the genome files.

Starting with shell scripts

The igvtools utilities can be invoked, with or without the graphical user interface (GUI), from one of the following scripts:

igvtools (command-line version for Linux and Mac OS 10.x)
igvtools_gui (GUI version for Linux and Mac OS 10.x)

igvtools.bat (command-line version for Windows)
igvtools_gui.bat (GUI version for Windows)

The general form of the command-line version is:

igvtools [command] [options] [arguments]
or
igvtools.bat [command] [options] [arguments]

Recognized commands, options, arguments, and file types are described below.

Starting with java

Igvtools can also be started directly using Java. This option allows more control over Java parameters, such as the maximum memory to allocate. In this example, igvtools is started with 1500 MB of memory allocated:

java -Xmx1500m -Djava.awt.headless=true -jar igvtools.jar [command] [options] [arguments]

To start with a GUI, the command is:

java -Xmx1500m -jar igvtools.jar -g

Memory settings

The scripts above allocate a fixed amount of memory. If this amount is not available on your platform, you will get an error along the lines of "Could not start the Virtual Machine". If this happens, either edit the scripts to reduce the amount of memory requested or use the Java startup option. The memory is set via the "-Xmx" parameter. For example, -Xmx1500m requests 1500 MB and -Xmx1g requests 1 gigabyte.
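
As a rough sketch of the Java startup option, the helper below assembles the headless command line with a caller-chosen -Xmx value. The function name igvtools_java_cmd is made up for illustration, and the sketch assumes igvtools.jar sits in the current directory. It only prints the command; it does not run igvtools.

```shell
# Hypothetical helper: print the headless Java command line for igvtools
# with a chosen heap size. Assumes igvtools.jar is in the current directory.
igvtools_java_cmd() {
    heap="$1"   # any value accepted by -Xmx, e.g. 1500m or 1g
    shift
    echo "java -Xmx${heap} -Djava.awt.headless=true -jar igvtools.jar $*"
}

# Request 1 gigabyte instead of 1500 MB.
igvtools_java_cmd 1g count alignments.bam alignments.cov.tdf hg18
```

Pasting the printed line into a shell starts igvtools with the requested heap size.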

Genome

The genome argument in the toTDF and count commands can be either an id, or a full path to a .chrom.sizes or an IGV .genome file.
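
The text above does not describe the .chrom.sizes layout. Assuming the common UCSC convention of two tab-separated columns (sequence name, length in bp), one way to produce such a file is from a FASTA index (.fai), whose first two columns hold the same information. The function name, file names, and the tiny index below are made up for illustration.

```shell
# Derive a two-column chrom.sizes listing (name <TAB> length) from a FASTA
# index. Assumes the UCSC-style layout is what igvtools expects.
fai_to_chrom_sizes() {
    cut -f1,2 "$1"
}

# Tiny made-up .fai (columns: name, length, offset, line bases, line width).
work="$(mktemp -d)"
printf 'chr1\t1000\t6\t60\t61\nchr2\t500\t1029\t60\t61\n' > "$work/toy.fa.fai"
fai_to_chrom_sizes "$work/toy.fa.fai" > "$work/toy.chrom.sizes"
cat "$work/toy.chrom.sizes"
rm -rf "$work"
```

The resulting path could then be passed as the genome argument in place of an id such as hg18.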

Commands

toTDF

The toTDF command converts a sorted data input file to a binary tiled data (.tdf) file. Use this command to pre-process large datasets for improved IGV performance.

Supported input file formats are: .wig, .cn, .snp, .igv, and .gct.

Note: This tool was previously known as "tile".

Usage:

igvtools toTDF [options] [inputFile] [outputFile] [genome]

Required arguments:

inputFile The input file (see supported formats above).

outputFile Binary output file. Must end in ".tdf".

genome A genome id or path to a .chrom.sizes or .genome file. Default is hg18.

Options:

-z num Specifies the maximum zoom level to precompute. The default
value is 7 and is sufficient for most files. To reduce file
size at the expense of IGV performance this value can be
reduced.

-f list A comma delimited list specifying window functions to use
when reducing the data to precomputed tiles. Possible
values are min, max, and mean. By default only the mean
is calculated.

-p file Specifies a "bed" file to be used to map probe identifiers
to locations. This option is useful when preprocessing .gct
files. The bed file should contain 4 columns:
chr start end name
where name is the probe name in the .gct file.
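
To make the four-column layout for -p concrete, the sketch below writes a tiny probe map and checks that every line has exactly four tab-separated fields. The probe names, coordinates, and the helper name has_four_columns are invented for illustration; real names must match the probe names in the .gct file.

```shell
# Succeed only if every line has exactly 4 tab-separated fields
# (chr, start, end, name).
has_four_columns() {
    awk -F '\t' 'NF != 4 { bad = 1 } END { exit bad + 0 }' "$1"
}

# Made-up probe map for illustration.
probes="$(mktemp)"
printf 'chr1\t1000\t1025\tprobe_A\nchr1\t5000\t5025\tprobe_B\nchr2\t200\t225\tprobe_C\n' > "$probes"

if has_four_columns "$probes"; then
    echo "probe map has 4 columns on every line"
else
    echo "probe map is malformed"
fi
rm -f "$probes"
```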

Example:

igvtools toTDF -z 5 copyNumberFile.cn copyNumberFile.tdf hg18

Notes:

Data file formats, with the exception of .gct files, must be sorted by start position. Files can be sorted with the sort command described below. Attempting to preprocess an unsorted file will result in an error.
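
A quick pre-check can catch an unsorted file before toTDF rejects it. The sketch below assumes a BED-style layout (chromosome in column 1, start in column 2, tab-separated) and only verifies that start positions never decrease within each run of the same chromosome. The helper name is_sorted_by_start is made up, and this is not necessarily the exact check igvtools performs.

```shell
# Exit 0 if, within each run of identical chromosome names (column 1),
# start positions (column 2) never decrease; exit 1 otherwise.
is_sorted_by_start() {
    awk -F '\t' '
        $1 != chr { chr = $1; prev = -1 }      # new chromosome run: reset
        $2 + 0 < prev { bad = 1; exit }        # start went backwards
        { prev = $2 + 0 }
        END { exit bad + 0 }
    ' "$1"
}

# Made-up two-line file whose second start precedes the first.
demo="$(mktemp)"
printf 'chr1\t100\t200\nchr1\t50\t80\n' > "$demo"
if is_sorted_by_start "$demo"; then
    echo "sorted"
else
    echo "not sorted: run igvtools sort first"
fi
rm -f "$demo"
```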

Count

The count command computes average feature density over a specified window size across the genome. Common uses include computing coverage for alignment files and counting hits in ChIP-seq experiments. By default, the resulting file will be displayed as a bar chart when loaded into IGV.

Supported input file formats are: .sam, .bam, .aligned, .psl, .pslx, and .bed.

Usage:

igvtools count [options] [inputFile] [outputFile] [genome]

Required arguments:

inputFile The input file (see supported formats above).

outputFile The output file, which can be binary "tdf" or ASCII "wig" format. The filename must end in ".tdf" or ".wig", or be the special string "stdout", in which case the output is written to the standard output stream in wig format. To write both a .tdf and a .wig file, list both output filenames as a single string, separated by a comma with no other delimiters.

genome A genome id or path to a .chrom.sizes or .genome file. Default is hg18.

Options:

-z, --maxZoom num

Specifies the maximum zoom level to precompute.

-w, --windowSize num

The window size over which coverage is averaged. Defaults to 25 bp.

-e, --extFactor num

The read or feature is extended by the specified distance in bp prior to counting. This option is useful for ChIP-seq and RNA-seq applications. The value is generally set to the average fragment length of the library minus the average read length.

--preExtFactor num

The read is extended upstream from the 5' end by the specified distance.

--postExtFactor num

Effectively overrides the read length, defines the downstream extent from the 5' end. Intended for use with preExtFactor.

-f, --windowFunctions list

A comma delimited list specifying window functions to use when reducing the data to precomputed tiles. Possible values are min, max, mean, median, p2, p10, p90, and p98. The "p" values represent percentile, so p2=2nd percentile, etc.

--strands [arg]

By default, counting is combined among both strands. This setting outputs the count for each strand separately. Legal argument values are 'read' and 'first': 'read' separates counts by read strand, while 'first' uses the strand of the first read in the pair. Results are saved in a separate column for .wig output, and a separate track for TDF output.

--bases

Count the occurrence of each base (A,G,C,T,N). Takes no arguments. Results are saved in a separate column for .wig output, and a separate track for TDF output.

--query [querystring]

Only count a specific region. Query string has syntax <chr>:<start>-<end>. e.g. chr1:100-1000. Input file must be indexed.

--minMapQuality [mqual]

Set the minimum mapping quality of reads to include. Default is 0.

--includeDuplicates

Include duplicate alignments in the count. By default, duplicates are not counted. Takes no arguments.

--pairs

Compute coverage from paired alignments counting the entire insert as covered. When using this option only reads marked "proper pairs" are used.
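
To illustrate what window functions such as mean, min, and max do when data are reduced to fixed-size windows, the sketch below summarizes made-up per-position values in 25 bp windows using awk. It illustrates the idea only and is not igvtools' implementation; the helper name window_stats and the input layout (position, value) are assumptions.

```shell
# Reduce "<position> <value>" lines (sorted by position) to one line per
# window: window start, mean, min, max. Window size is the first argument.
window_stats() {
    awk -v w="$1" '
        function emit() {
            if (n) printf "%d\t%.2f\t%d\t%d\n", win * w, sum / n, lo, hi
        }
        {
            cur = int($1 / w)                       # window index of this position
            if (n && cur != win) { emit(); n = 0 }  # crossed into a new window
            if (n == 0) { win = cur; sum = 0; lo = $2; hi = $2 }
            sum += $2; n++
            if ($2 < lo) lo = $2
            if ($2 > hi) hi = $2
        }
        END { emit() }
    '
}

# Five made-up values spanning two 25 bp windows.
printf '0 2\n10 4\n24 6\n25 1\n40 3\n' | window_stats 25
```

With this input, the first window (start 0) has mean 4.00, min 2, max 6, and the second (start 25) has mean 2.00, min 1, max 3.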

Notes:

The input file must be sorted by start position. See the sort command below.

Example:
igvtools count -z 5 -w 25 -e 250 alignments.bam alignments.cov.tdf hg18
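
The -e value follows the rule given for --extFactor (average fragment length minus average read length). The sketch below works that arithmetic for assumed library values of 300 bp fragments and 50 bp reads, and also shows the comma-joined form for requesting both .tdf and .wig output. The numbers and file names are illustrative, and the command is printed rather than run.

```shell
fragment_len=300   # assumed average fragment length (bp)
read_len=50        # assumed average read length (bp)
ext_factor=$((fragment_len - read_len))

# Comma-joined names with no other delimiters request both outputs.
outputs="alignments.cov.tdf,alignments.cov.wig"

echo "igvtools count -z 5 -w 25 -e ${ext_factor} alignments.bam ${outputs} hg18"
```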

Index

Creates an index for an alignment or feature file. Index files are required for loading alignment files into IGV, and can significantly improve performance for large feature files. The index file is not loaded into IGV directly; IGV looks for it when the alignment or feature file is loaded. This command does not take an output file argument. Instead, the index filename is generated by appending ".sai" (for alignments) or ".idx" (for features) to the input filename, because IGV relies on this naming convention to find the index. The input file must be sorted by start position (see the sort command, below).

Supported input file formats are: .sam, .aligned, .vcf, .psl, and .bed.

NOTES:

This command will not index a binary (BAM) file. Use the samtools package to sort and index BAM files.
The "sai" index is an IGV format; it does not work with samtools or any other application.
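
For BAM input, a minimal samtools sketch might look like the following. It assumes a samtools release that supports "sort -o" (1.3 or later) is on the PATH; the function name sort_and_index_bam and the file names in the usage comment are placeholders.

```shell
# Coordinate-sort a BAM file and index the result with samtools.
# Usage: sort_and_index_bam input.bam sorted.bam
sort_and_index_bam() {
    in_bam="$1"
    out_bam="$2"
    if ! command -v samtools >/dev/null 2>&1; then
        echo "samtools not found on PATH" >&2
        return 127
    fi
    # samtools index writes the .bai index next to the sorted BAM.
    samtools sort -o "$out_bam" "$in_bam" && samtools index "$out_bam"
}
```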

Usage:

igvtools index [inputFile]
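
Because the index name is derived from the input name, a script can predict where the index will land. The sketch below encodes the ".sai" / ".idx" rule described above; which extensions count as alignment versus feature files is an assumption here (.sam as alignment, .bed and .vcf as features), and the helper name expected_index_name is made up.

```shell
# Print the index file name igvtools would be expected to create.
# The extension-to-type mapping is an assumption; other extensions are
# deliberately left undecided.
expected_index_name() {
    case "$1" in
        *.sam)
            echo "$1.sai" ;;
        *.bed|*.vcf)
            echo "$1.idx" ;;
        *)
            echo "unsure which index suffix applies to: $1" >&2
            return 1 ;;
    esac
}

expected_index_name reads.sorted.sam   # prints reads.sorted.sam.sai
expected_index_name peaks.bed          # prints peaks.bed.idx
```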

Sort

Sorts the input file by start position, as required.

Supported input file formats are: .cn, .igv, .sam, .aligned, .psl, .bed, and .vcf.

NOTE: This command does not work with BAM files. The samtools package can be used to sort .bam files.

Usage:

igvtools sort [options] [inputFile] [outputFile]

Required arguments:

inputFile

outputFile

The special string "stdout" can be used as [outputFile], in which case the output will be written to the standard output stream instead of a file.

Options:

-t tmpdir

Specify a temporary working directory. For large input files this directory will be used to store intermediate results of the sort. The default is the user's temp directory.

-m maxRecords

The maximum number of records to keep in memory during the sort. The default value is 500000. Increase this number if you receive "too many open files" errors. Decrease it if you experience "out of memory" errors.
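
The trade-off behind -m can be made concrete with a little arithmetic. The sketch below assumes, and this is an assumption not stated above, that the sort writes one temporary file per in-memory batch and later merges them, so the number of temporary files is roughly the record count divided by maxRecords, rounded up. The record count and the helper name temp_chunks are invented for illustration.

```shell
# Estimate the number of intermediate files: ceil(total / maxRecords).
# Arguments: total record count, maxRecords.
temp_chunks() {
    echo $(( ($1 + $2 - 1) / $2 ))
}

temp_chunks 120000000 500000    # default -m: prints 240
temp_chunks 120000000 2000000   # larger -m:  prints 60
```

Under that assumption, raising -m lowers the number of files open during the merge at the cost of more memory per batch, which matches the advice for the two error messages.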
