Protein Alignment Program
A multiple sequence alignment (MSA) is a sequence alignment of three or more biological sequences, generally protein, DNA, or RNA. In many cases, the input set of query sequences is assumed to have an evolutionary relationship: the sequences share a lineage and descend from a common ancestor. Protein queries can be searched using Blastp (protein-protein BLAST). Software is available to align DNA, RNA, protein, or DNA + protein sequences via pairwise and multiple sequence alignment algorithms, including MUSCLE, Mauve, MAFFT, Clustal Omega, Jotun Hein, Wilbur-Lipman, Martinez Needleman-Wunsch, Lipman-Pearson, and dot plot analysis.
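To make the pairwise case concrete, here is a minimal sketch of Needleman-Wunsch global alignment, one of the algorithm families named above. It is not the implementation used by any of the listed packages. The function name `needleman_wunsch` and the scoring values (match +1, mismatch -1, linear gap penalty -2) are illustrative assumptions; real tools typically use substitution matrices and affine gap penalties.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-2):
    """Globally align strings a and b; return (score, aligned_a, aligned_b).

    The scoring values are illustrative assumptions, not taken from any tool.
    """
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]

    # First column and first row: aligning a prefix against nothing costs gaps.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap

    # Fill the table: each cell is the best of substitution, deletion, insertion.
    for i in range(1, rows):
        for j in range(1, cols):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + sub,  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,      # gap in b
                score[i][j - 1] + gap,      # gap in a
            )

    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = len(a), len(b)
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            sub = match if a[i - 1] == b[j - 1] else mismatch
            if score[i][j] == score[i - 1][j - 1] + sub:
                out_a.append(a[i - 1])
                out_b.append(b[j - 1])
                i -= 1
                j -= 1
                continue
        if i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1

    return score[-1][-1], "".join(reversed(out_a)), "".join(reversed(out_b))


if __name__ == "__main__":
    print(needleman_wunsch("ACGT", "AGT"))  # → (1, 'ACGT', 'A-GT')
```

The table costs time and memory proportional to the product of the two sequence lengths, which is one reason progressive and heuristic methods are used for multiple alignment and for large inputs.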
Benchmarking

PFAM 30.0 (2016)
SMART (2015): Letunic, Copley, Schmidt, Ciccarelli, Doerks, Schultz, Ponting, Bork
BAliBASE 3 (2015): Thompson, Plewniak, Poch
Oxbench (2011): Raghava, Searle, Audley, Barber, Barton
Benchmark collection (2009): Edgar
HOMSTRAD (2005): Mizuguchi
PREFAB 4.0 (2005): Edgar
SABmark (2004): Van Walle, Lasters, Wyns

Short-read sequence alignment

Each entry gives a description followed, where the source table provides them, by paired-end support, use of FASTQ quality, gapped alignment, multi-threading, license, and year.

Arioc: Computes Smith-Waterman gapped alignments and mapping qualities on one or more GPUs. Supports BS-seq alignments. Processes 100,000 to 500,000 reads per second (varies with data, hardware, and configured sensitivity). Paired-end: yes. FASTQ quality: no. Gapped: yes. Multi-threaded: yes. License: free. Year: 2015.

BarraCUDA: A GPGPU-accelerated (FM-index) short-read alignment program based on BWA. Supports alignment of indels with gap openings and extensions. Paired-end: yes. FASTQ quality: no. Gapped: yes. Multi-threaded: yes. License: free.

BBMap: Uses short k-mers to rapidly index the genome; no size or scaffold count limit. Higher sensitivity and specificity than Burrows-Wheeler aligners, with similar or greater speed. Performs affine-transform-optimized global alignment, which is slower but more accurate than Smith-Waterman. Handles Illumina, 454, PacBio, Sanger, and Ion Torrent data. Splice-aware; capable of processing long indels and RNA-seq. Pure Java; runs on any platform. Paired-end: yes. FASTQ quality: yes. Gapped: yes. Multi-threaded: yes. License: free. Year: 2010.

[name missing]: Explicit time and accuracy tradeoff with a prior accuracy estimation, supported by indexing the reference sequences. Optimally compresses indexes. Can handle billions of short reads. Can handle insertions, deletions, SNPs, and color errors (can map ABI SOLiD color space reads). Performs a full Smith-Waterman alignment. Multi-threaded: yes. License: free. Year: 2009.

BigBWA: Runs BWA on a cluster. Supports the algorithms BWA-MEM, BWA-ALN, and BWA-SW, working with paired and single reads. Running in a Hadoop cluster substantially reduces computation time and adds scalability and fault tolerance. Paired-end: yes. FASTQ quality: low-quality base trimming. Gapped: yes. Multi-threaded: yes. License: free. Year: 2015.

BLASTN: BLAST's nucleotide alignment program. Slow and not accurate for short reads; uses a sequence database (EST, Sanger sequence) rather than a reference genome.

[name missing]: Can handle one mismatch in the initial alignment step. Multi-threaded: yes, client-server. License: for academic and noncommercial use. Year: 2002.

[name missing]: Creates a permanent, reusable index of the genome; 1.3 GB memory footprint for the human genome. Aligns more than 25 million Illumina reads in 1 CPU hour. Supports Maq-like and SOAP-like alignment policies. Paired-end: yes. FASTQ quality: yes. Gapped: no. Multi-threaded: yes. License: free. Year: 2009.

HIVE-hexagon: Uses a bloom matrix, among other structures, to create and filter potential positions on the genome. For higher efficiency, it uses cross-similarity between short reads and avoids realigning non-unique redundant sequences. It is faster than Bowtie and BWA and allows indels and divergent sensitive alignments on viruses and bacteria, as well as more conservative eukaryotic alignments. Paired-end: yes. FASTQ quality: yes. Gapped: yes. Multi-threaded: yes. License: for academic and noncommercial users registered to a HIVE deployment instance. Year: 2014.

BWA: Creates an index of the genome. It is a bit slower than Bowtie but allows indels in alignment. Paired-end: yes. FASTQ quality: low-quality base trimming. Gapped: yes. Multi-threaded: yes. License: free. Year: 2009.

BWA-PSSM: A probabilistic short-read aligner based on position-specific scoring matrices (PSSM). The aligner is adaptable: it can take into account the quality scores of the reads and models of data-specific biases, such as those observed in ancient DNA, PAR-CLIP data, or genomes with biased nucleotide compositions. Paired-end: yes. FASTQ quality: yes. Gapped: yes. Multi-threaded: yes. License: free. Year: 2014.

CASHX: Quantifies and manages large quantities of short-read sequence data.
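Several of the aligners listed above, Arioc among them, are described as computing Smith-Waterman alignments. The following is a minimal sketch of Smith-Waterman local alignment for illustration only; it is not the code of any listed tool. The function name `smith_waterman` and the scoring values (match +2, mismatch -1, linear gap penalty -2) are assumptions chosen for the example.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-2):
    """Return (best_score, local_a, local_b) for the best local alignment.

    The scoring values are illustrative assumptions, not taken from any tool.
    """
    rows, cols = len(a) + 1, len(b) + 1
    h = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)

    # Fill the table; scores are clamped at zero so an alignment can start anywhere.
    for i in range(1, rows):
        for j in range(1, cols):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            h[i][j] = max(
                0,
                h[i - 1][j - 1] + sub,  # align a[i-1] with b[j-1]
                h[i - 1][j] + gap,      # gap in b
                h[i][j - 1] + gap,      # gap in a
            )
            if h[i][j] > best:
                best, best_pos = h[i][j], (i, j)

    # Trace back from the highest-scoring cell until a zero cell is reached.
    out_a, out_b = [], []
    i, j = best_pos
    while i > 0 and j > 0 and h[i][j] > 0:
        sub = match if a[i - 1] == b[j - 1] else mismatch
        if h[i][j] == h[i - 1][j - 1] + sub:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i -= 1
            j -= 1
        elif h[i][j] == h[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1

    return best, "".join(reversed(out_a)), "".join(reversed(out_b))


if __name__ == "__main__":
    print(smith_waterman("TTTACGTTT", "GGACGGG"))  # → (6, 'ACG', 'ACG')
```

Unlike global alignment, the zero floor lets the best-scoring region be reported without penalizing unrelated flanking sequence, which suits mapping a short read into a much longer reference.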