Friday, December 11, 2009

Who's Hungry for DIM SUM?

Kevin Savidge, Emily McTavish and I have been working on a program for the simulation of demography and individual migration events on a continuous landscape. We've decided to call it DIM SUM (Demography and Individual Migration Simulated Using a Markov chain). DIM SUM was primarily written by Kevin as an undergraduate research project.

DIM SUM is a stand-alone Java program that takes one main XML input file, as well as files specifying carrying capacity and borders across an arbitrary landscape. These secondary files can be provided as numeric matrices or as images. For images, particular color channels correspond to the different input variables. Because it can take images as input, you can export your favorite map from GIS software (with some color corresponding to how suitable a given pixel of habitat may be) and simulate directly on that landscape! It also uses great circle distances, in case the landscape on which you're simulating has a large latitudinal span (since two lines of longitude are closer to one another near the poles than at the equator).
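To make the great circle point concrete, here is a minimal sketch of a great circle distance calculation. It is not taken from the DIM SUM source: the haversine formula, the spherical Earth with a mean radius of 6,371 km, and the class and method names are all assumptions chosen for illustration.

```java
// Illustrative sketch only; not DIM SUM's actual code.
final class GreatCircle {
    // Assumed mean Earth radius for a spherical approximation.
    static final double EARTH_RADIUS_KM = 6371.0;

    // Haversine great circle distance between two points given in decimal degrees.
    static double haversineKm(double lat1, double lon1, double lat2, double lon2) {
        double phi1 = Math.toRadians(lat1);
        double phi2 = Math.toRadians(lat2);
        double dPhi = Math.toRadians(lat2 - lat1);
        double dLambda = Math.toRadians(lon2 - lon1);
        double sinHalfPhi = Math.sin(dPhi / 2.0);
        double sinHalfLambda = Math.sin(dLambda / 2.0);
        double a = sinHalfPhi * sinHalfPhi
                 + Math.cos(phi1) * Math.cos(phi2) * sinHalfLambda * sinHalfLambda;
        return 2.0 * EARTH_RADIUS_KM * Math.asin(Math.sqrt(a));
    }

    public static void main(String[] args) {
        // The same one-degree step in longitude, measured at two latitudes.
        System.out.println("At the equator: " + haversineKm(0.0, 0.0, 0.0, 1.0) + " km");
        System.out.println("At 60 N:        " + haversineKm(60.0, 0.0, 60.0, 1.0) + " km");
    }
}
```

Under these assumptions, a one-degree step in longitude at 60° N covers roughly half the ground distance it does at the equator, which is why treating pixel coordinates as a flat grid can distort migration distances on landscapes that span many degrees of latitude.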

As output, DIM SUM provides trees of ancestor-descendant relationships and the locations of all individuals, and it can also plot all individuals directly on the landscape. Here's a simple example using the Hawaiian islands, where all land area is specified as equally good habitat. In this example, we've used four separate starting populations of size 1, with each population of origin tagged with a different color. Note how the population starting on the island with the greatest expanse of suitable habitat (the Big Island) begins to dominate the populations on the other islands. Here's the population after 5 generations with a population size of 21...




...after 25 generations with a population size of 2,288...


...after 200 generations with a population size of 2,635...



...and after 800 generations with a population size of 2,636...



Our original motivation for creating such a simulator was to generate data for the testing of phylogeographic methods (although DIM SUM has many other potential uses). In particular, these types of simulations should provide a more evenhanded test of coalescent-based methods and allow the testing of methods that model migration on a continuous landscape (Lemmon and Lemmon, 2008, Syst. Biol., 57(4): 544-561). One very nice feature of continuous landscape methods is that they allow you to infer the geographic locations (think latitude and longitude) of ancestors. In fact, you can plot a likelihood surface directly on the landscape! To do this, you sequentially constrain the location of the ancestor of interest to a series of locations across the landscape, calculate the maximum likelihood score at each one, and then use this matrix of scores to approximate the surface.
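The scanning procedure described above can be sketched roughly as follows. This is not PhyloMapper's code or API: the class name, the `scan` and `argMax` helpers, and the use of a `DoubleBinaryOperator` as a stand-in for "maximize the likelihood with the ancestor fixed at this latitude and longitude" are all hypothetical, and the toy scoring function in `main` is a placeholder rather than a real phylogeographic likelihood.

```java
import java.util.function.DoubleBinaryOperator;

// Illustrative sketch only; the names here are hypothetical, not PhyloMapper's.
final class LikelihoodSurface {
    // Evaluates constrainedLogLik on a regular lat/lon grid.
    // constrainedLogLik stands in for "the maximum log-likelihood obtained
    // when the ancestor of interest is fixed at (lat, lon)".
    static double[][] scan(double latMin, double latMax, int nLat,
                           double lonMin, double lonMax, int nLon,
                           DoubleBinaryOperator constrainedLogLik) {
        if (nLat < 2 || nLon < 2) {
            throw new IllegalArgumentException("need at least 2 grid points per axis");
        }
        double[][] surface = new double[nLat][nLon];
        for (int i = 0; i < nLat; i++) {
            double lat = latMin + i * (latMax - latMin) / (nLat - 1);
            for (int j = 0; j < nLon; j++) {
                double lon = lonMin + j * (lonMax - lonMin) / (nLon - 1);
                surface[i][j] = constrainedLogLik.applyAsDouble(lat, lon);
            }
        }
        return surface;
    }

    // Returns {row, column} of the highest score on the surface.
    static int[] argMax(double[][] surface) {
        int bestI = 0;
        int bestJ = 0;
        for (int i = 0; i < surface.length; i++) {
            for (int j = 0; j < surface[i].length; j++) {
                if (surface[i][j] > surface[bestI][bestJ]) {
                    bestI = i;
                    bestJ = j;
                }
            }
        }
        return new int[] {bestI, bestJ};
    }

    public static void main(String[] args) {
        // Toy placeholder score that peaks at (20, -157); a real analysis would
        // re-optimize the model's other parameters at every grid point.
        DoubleBinaryOperator toy =
            (lat, lon) -> -((lat - 20.0) * (lat - 20.0) + (lon + 157.0) * (lon + 157.0));
        double[][] surface = scan(18.0, 22.0, 5, -160.0, -154.0, 7, toy);
        int[] peak = argMax(surface);
        System.out.println("Peak at row " + peak[0] + ", column " + peak[1]);
    }
}
```

The resulting matrix can be drawn as a heat map or contour plot over the landscape. A finer grid gives a smoother surface at the cost of one constrained optimization per grid point.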

DIM SUM can generate data suitable for testing such a method: because the true location of the ancestor is known, it can be compared to the inferred likelihood surface. We performed some simple simulations to try out the Lemmons' program (PhyloMapper) for inferring ancestral locations. In the plots below, the true location of the ancestor is marked with a "*".

Here's a surface for a simulation in which we allowed a population to expand from a single individual in the middle of the range for 5,000 generations and then double in size for another 5,000 generations. The area in red gives the 95% confidence envelope...



...and here's a case where we allowed both the population and range sizes to double (the range was originally constrained to the left half of the landscape)...



Note that there is some bias when migration occurs more often in one direction (when the range expands), but it wasn't substantial enough to reject the true location in this case. Given the new perspective on phylogeographic data that it provides, I hope PhyloMapper sees more widespread use and development.

If you're interested in using DIM SUM, you can download it at http://code.google.com/p/bio-dimsum. Please let us know what you think!

P.S. Count yourself lucky if you can find an undergrad who can write a program like this in under a semester!