Core Tools


LaTeX is a document preparation system (built on Don Knuth’s TeX program) that is ideal for writing manuscripts that use a lot of equations. I wrote my PhD thesis in LaTeX, and so have most of my graduate students. A LaTeX cheat sheet (list of symbols, etc.) can be found here.

LaTeX stuff: provides a description of the programs that I use for managing PDF reprints, BibTeX databases, and other activities related to writing manuscripts in LaTeX.


GNU Emacs is “an extensible, customizable, free/libre text editor — and more.” An Emacs cheat sheet can be found here.

Emacs stuff: provides a description of my Emacs configuration and packages that I have found useful.

Sequed is an Emacs package I wrote for viewing alignments and manipulating DNA sequence data in FASTA format.

Proggy fonts

These are my favorite terminal fonts. Available here.

Programming Tools

Lex and Yacc

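Lex and yacc (nowadays usually their free replacements, flex and bison) can be used to write a lexer and parser for such things as configuration files, trees in Newick format, and so on. The lexer splits the input into tokens, and the parser checks those tokens against a grammar and builds a structure from them. A basic tutorial is available here. There are also detailed manuals for flex and bison.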
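To make the lexer/parser split concrete, here is a minimal sketch for a small subset of Newick. It is written in plain Python as a hand-rolled tokenizer and recursive-descent parser, not with flex and bison, and it does not handle quoted labels or comments. The names `Node`, `tokenize`, `parse_newick` and `leaf_names` are made up for this illustration.

```python
import re
from dataclasses import dataclass, field
from typing import List, Optional

# Lexer: a token is either one punctuation character or a run of other characters.
TOKEN_RE = re.compile(r"\s*([();,:]|[^\s();,:]+)")
PUNCTUATION = set("();,:")


def tokenize(text: str) -> List[str]:
    """Split a Newick string into punctuation and label/number tokens."""
    text = text.strip()
    tokens, pos = [], 0
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if match is None:
            raise ValueError(f"cannot tokenize input at position {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


@dataclass
class Node:
    name: str = ""
    length: Optional[float] = None
    children: List["Node"] = field(default_factory=list)


class Parser:
    """Recursive-descent parser: tree -> subtree ';'"""

    def __init__(self, tokens: List[str]) -> None:
        self.tokens = tokens
        self.pos = 0

    def peek(self) -> Optional[str]:
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def take(self, expected: Optional[str] = None) -> str:
        token = self.peek()
        if token is None or (expected is not None and token != expected):
            raise ValueError(f"expected {expected!r}, got {token!r}")
        self.pos += 1
        return token

    def parse_tree(self) -> Node:
        node = self.parse_subtree()
        self.take(";")
        return node

    def parse_subtree(self) -> Node:
        # subtree -> [ '(' subtree (',' subtree)* ')' ] [ name ] [ ':' length ]
        node = Node()
        if self.peek() == "(":
            self.take("(")
            node.children.append(self.parse_subtree())
            while self.peek() == ",":
                self.take(",")
                node.children.append(self.parse_subtree())
            self.take(")")
        token = self.peek()
        if token is not None and token not in PUNCTUATION:
            node.name = self.take()
        if self.peek() == ":":
            self.take(":")
            node.length = float(self.take())
        return node


def parse_newick(text: str) -> Node:
    return Parser(tokenize(text)).parse_tree()


def leaf_names(node: Node) -> List[str]:
    """Collect leaf labels from left to right."""
    if not node.children:
        return [node.name]
    return [name for child in node.children for name in leaf_names(child)]
```

For example, `parse_newick("((A:0.1,B:0.2)AB:0.3,C:0.4);")` returns a root node with two children, the first of which is the internal node `AB`. With flex and bison the same division of labor applies: the regular expression would live in the flex rules and the grammar in the bison file.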

Javascript Resources

React stuff: React is an open source JavaScript library that comes with a pretty nice syntax extension to JavaScript. The basic idea is to have components that include both logic and presentation (rather than having logic in a JavaScript file, presentation in an HTML file, and styles in a CSS file). The syntax extension, called JSX, lets you write HTML-like markup directly in the code, which is then transpiled into plain JavaScript using Babel. Plain JavaScript and JSX can be mixed together in a source file as needed. Pretty cool. React can also use a component-based approach to styling. I have been experimenting with React and will eventually collect some useful sites here.


GNU Autotools is a suite of GNU tools for creating portable source code distributions. A great tutorial can be found here. The definitive “goat” book on Autotools and Libtool can be found online here.


This is a GNU standard library that is incredibly useful for parsing command-line options. It is best used with the gcc/c++ compiler. A brief description is found here.


Git (git) is a great tool for maintaining prior versions of your source code (or manuscripts) in an organized hierarchy. A free book on Git is here. A quick reference is here. Open source projects can be hosted for free on GitHub, and private repositories for academic research (with some limitations) can be hosted on Bitbucket. I use both. Projects on GitHub can have web pages (GitHub Pages) hosted as well. This website is hosted on GitHub.


NGS data file formats

The FASTQ format appears to be emerging as a standard for next-generation sequencing data (particularly Illumina GAII reads). The Wikipedia description of the format is here.
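As a rough illustration of the format, here is a minimal sketch of a reader for the common four-line FASTQ layout (header, sequence, `+` separator, quality string). It assumes each record is exactly four lines with no wrapped sequences. The quality offset defaults to 33 (Sanger-style encoding), which is an assumption: older Illumina pipelines used an offset of 64, so it is left as a parameter. `FastqRecord` and `parse_fastq` are names made up for this example.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, List


@dataclass
class FastqRecord:
    name: str
    sequence: str
    quality: str

    def phred_scores(self, offset: int = 33) -> List[int]:
        """Decode the quality string; the offset is an assumption (33 or 64)."""
        return [ord(ch) - offset for ch in self.quality]


def parse_fastq(lines: Iterable[str]) -> Iterator[FastqRecord]:
    """Yield records from four-line FASTQ input (no multi-line sequences)."""
    stripped = (line.rstrip("\r\n") for line in lines)
    for header in stripped:
        if not header:
            continue  # tolerate blank lines between records
        if not header.startswith("@"):
            raise ValueError(f"expected '@' header line, got {header!r}")
        try:
            sequence = next(stripped)
            separator = next(stripped)
            quality = next(stripped)
        except StopIteration:
            raise ValueError("truncated FASTQ record") from None
        if not separator.startswith("+"):
            raise ValueError(f"expected '+' separator line, got {separator!r}")
        if len(sequence) != len(quality):
            raise ValueError("sequence and quality lengths differ")
        yield FastqRecord(header[1:], sequence, quality)
```

Because `parse_fastq` accepts any iterable of lines, it can be handed an open file object directly, for example `for record in parse_fastq(open("reads.fastq")): ...`, where `reads.fastq` is a placeholder file name.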

NCI wiki

This wiki explains the barcoding scheme and other features of cancer genome databases such as GCAT.