PHYLIP and phylogeny inference

PHYLIP Phylogeny

The PHYlogeny Inference Package, PHYLIP, written by Joe Felsenstein (University of Washington) has reached the milestone of being 30 years old. Almost every recent volume of Annals of Botany has papers using this program for analysis of data, among the 32,000 papers that cite it published since 1980. There are few software packages that have stood the test of time over so many generations of computers – I have been using it regularly since I was a PhD student. Developed long before WIMP (windows, icons, mouse, pointer) interfaces, PHYLIP still retains its modular simplicity and versatility. You are much less likely to end up with garbage-in, garbage-out, none of the algorithms is proprietary, there is a gigantic literature associated with the program, and the documentation is a model of clarity. I always feel uncomfortable with the big modern informatics analysis packages where you are so ‘protected’ (or is that ‘unprotected’?) from the algorithms, and the PHYLIP methods have a robust (if sometimes controversial) literature behind them.

Over this period, Dr Felsenstein has continued to extend the program and, as well as the documentation, there is a PHYLIP Facebook page, where he responds to every enquiry from the most basic to the most advanced. Despite the widespread use of PHYLIP and its value to a wide community, the anti-acknowledgements at the bottom of the Credits page also make interesting reading, showing just how difficult the funding of development and publication of such a widely used method can be.

Because of its straightforward nature, I use PHYLIP regularly in courses, at the basic level through one of the many web interfaces, so downloading and installation are not needed. One of my uses is shown at : the students score a number of leaf characters from trees and shrubs, which classifies the evergreens (Quercus ilex and Osmanthus) together, and then compare that with DNA data, which follow a natural classification in which the oaks lie together. Another easily visualized application of PHYLIP for teaching is to work out relationships between Scotch whiskies. A book by the second most famous Michael Jackson (after, that is, the editor of AoB Plants) classifies 109 different malt whisky types on the basis of 68 organoleptic qualities (colour, nose, body, palate and finish), and Lapointe and Legendre published a “Classification of Pure Malt Scotch Whiskies” (Applied Statistics 1994; 43(1):237; archive repository version available here; data matrix for analysis is here at Pierre Legendre’s website, and he has another related paper on distance matrices between malt whiskies).
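For anyone trying the character-scoring exercises above, here is a minimal sketch of how a table of presence/absence scores might be written out in the classic PHYLIP input layout: a first line giving the number of taxa and the number of characters, then one line per taxon with the name padded to 10 columns. The function name `to_phylip`, the taxon labels and the 0/1 scores are all made up for illustration; they are not the course data or the whisky matrix.

```python
def to_phylip(characters):
    """Format {taxon name: '0101...'} as PHYLIP discrete-character input text."""
    lengths = {len(states) for states in characters.values()}
    if len(lengths) != 1:
        raise ValueError("every taxon needs the same number of characters")
    n_chars = lengths.pop()
    # Header line: number of taxa, then number of characters.
    lines = [f"{len(characters):5d}{n_chars:5d}"]
    for name, states in characters.items():
        # Classic PHYLIP names occupy exactly 10 columns, padded with spaces.
        lines.append(f"{name[:10]:<10}{states}")
    return "\n".join(lines) + "\n"


# Invented scores for four hypothetical leaf characters (1 = present, 0 = absent).
leaves = {
    "Q_ilex": "1001",
    "Q_robur": "0101",
    "Osmanthus": "1000",
}

infile_text = to_phylip(leaves)
print(infile_text)
```

The resulting text could be saved as a file and offered to one of the discrete-character programs in the package, or pasted into a web interface, assuming that interface accepts the same layout.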

So happy birthday PHYLIP, and congratulations to your creator.

Edit 1 Dec: add link to malt whisky (not whiskey) datamatrix.

Peter Bradbury from USDA, Cornell, also notes other useful SNP analysis programs:


There is also PowerMarker, which I also use in courses and which allows simple pasting of Excel-formatted data.

  • Thanks!

    I would join in a toast of a single-malt whiskey (if I had any). When I lived in Edinburgh on sabbatical 1982-1983 I used to treat friends to a single-malt tasting session by going down to the nearest off-license and buying some of those little “airline bottles” of whiskey, some of which were actually single malts. We then had very tiny but very inexpensive tastes.

    During that year I happened to hear a presentation on a numerical taxonomy study of microorganisms in whiskey malts, which contained the following memorable sentence: “First we went around and visited 150 Highland distilleries – that was the fun part.”

    Now if I could only persuade the granting agencies that PHYLIP was worth funding. Maybe if I sent them bottles of single-malt …