The Sequence Manipulation Suite - a collection of simple programs for generating, formatting, and analyzing short DNA and protein sequences.
annotate_SNPs.pl - this Perl script annotates SNPs identified by next-generation sequencing of genomic DNA or transcripts.
backup.sh - this shell script archives directories of interest on a Linux-based system.
genome_pattern_search.pl - a Perl program that reads a genomic sequence in FASTA format and searches for the patterns you specify using regular expressions.
get_cds.pl - this Perl script accepts a GenBank or EMBL file, extracts the protein translations or the DNA coding sequences, and writes them to a new file in FASTA format.
get_genes_in_area.pl - this Perl script accepts as input a position or list of positions in a genome and returns descriptions of nearby genes.
get_orfs.pl - this Perl script accepts a sequence file as input and extracts the open reading frames (ORFs) greater than or equal to the size you specify.
get_snps_by_gene_ontology.pl - this Perl script accepts a species name and a Gene Ontology (GO) accession number, and returns a list of SNPs located in or near genes associated with the GO accession.
maq_pipeline.sh - this bash script processes short sequence reads from Illumina's Genome Analyzer (Solexa) system, using the Maq package.
md5_sums.pl - this Perl script accepts a list of directories and recursively lists the files they contain along with their MD5 checksums.
NGS-SNP - this collection of scripts annotates raw SNP lists returned from programs such as Maq.
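
To illustrate the kind of task genome_pattern_search.pl is described as handling (reading a FASTA sequence and searching it with a regular expression), here is a minimal Python sketch. It is not the script's actual code: the function names read_fasta and find_pattern, the demo record, and the output layout are assumptions made for this example. It searches the forward strand only and reports non-overlapping matches.

```python
import re
from typing import Dict, List, Tuple


def read_fasta(text: str) -> Dict[str, str]:
    """Parse FASTA-formatted text into {record title: sequence}."""
    records: Dict[str, List[str]] = {}
    title = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new record begins; the rest of the line is its title.
            title = line[1:].strip()
            records[title] = []
        elif title is not None:
            # Sequence lines are accumulated and joined once at the end.
            records[title].append(line.upper())
    return {name: "".join(parts) for name, parts in records.items()}


def find_pattern(
    sequences: Dict[str, str], pattern: str
) -> List[Tuple[str, int, int, str]]:
    """Return (title, start, end, match) for each hit, 1-based inclusive.

    Uses re.finditer, so overlapping matches are not reported, and only
    the strand given in the file is searched.
    """
    regex = re.compile(pattern, re.IGNORECASE)
    hits = []
    for name, seq in sequences.items():
        for m in regex.finditer(seq):
            hits.append((name, m.start() + 1, m.end(), m.group()))
    return hits


if __name__ == "__main__":
    # Hypothetical two-line record containing two EcoRI sites (GAATTC).
    demo = ">chr_demo\nACGTGAATTCAA\nGGGAATTCTT\n"
    for title, start, end, match in find_pattern(read_fasta(demo), "GAATTC"):
        print(f"{title}\t{start}\t{end}\t{match}")
```

For a real genome, the file contents would be read from disk and passed to read_fasta; very large sequences may call for a streaming parser rather than holding everything in memory.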