Thursday, July 9, 2009

Welcome to geodendron!

Now that this blog has been up for several weeks, it's probably time to post something. It's become increasingly apparent (at least in my opinion) that the time is ripe for some serious testing of phylo- and biogeographic methods. Hopefully this blog will serve as a forum for thoughts and developments related to (i) approaches to simulating appropriate benchmark datasets, (ii) simulation studies to test existing methods, and (iii) the development of new analytical tools.

I envision a few posts each week on published papers or unpublished thoughts, each of which will spark comments and discourse on the post's topic. Not that this should need to be pointed out, but just in case: all discussions on this blog should remain civil and respectful in tone.

Jeet and I realized the potential usefulness of this blog when we discovered that we were duplicating each other's efforts in writing generalized simulation programs for phylogeographic data. Roughly, both of our programs relax coalescent assumptions (among other things) and allow individuals to move on a continuous landscape. My collaborators and I are in the process of writing up an application note about our program and are trying to figure out what simple simulations would be good for illustrating the utility of continuous-landscape simulations over existing coalescent-based simulations.

So, to kick things off, what are your thoughts on this? What are some simple scenarios where relaxing the coalescent assumption might be particularly important?

3 comments:

  1. Hi all, this should be fun!
    Following Jeremy's post, here are a few thoughts:

    1) A very nice paper relevant to these topics, and well-written, is this one:
    Muster C, Maddison WP, Uhlmann S, Berendonk TU, Vogler AP (2009) Arctic-alpine distributions – metapopulations on a continental scale? American Naturalist, 173, 313–326.

    There are probably too many real-life violations of coalescent assumptions to list, but some of the most commonly encountered ones (recognised or not) would include (1) 'fuzzy' (cf. 'crisp') population boundaries, (2) subtle substructure within groups (can come in a variety of flavours, including restricted GF across / IBD, kin clustering, etc.), and (3) presence of one or more first-generation (i.e., original) migrants in the 'wrong' population (similar to fuzzy boundaries, but related to the dispersal biology of the organism rather than just trying to draw circles on a map). Most of these involve relatively slight violations of assumptions. I think that these occur frequently in analysed datasets, whereas researchers tend to pick up on the very obvious stuff (e.g., deep substructure with little or no GF) and deal with it appropriately. In general, I would try to keep the simulations closely tied to how coalescent methods are actually being used for empirical data.

    A question: Jeremy, are we really talking about 'relaxing' coalescent assumptions, or do you mean knowingly violating them? I think that could be an important difference.

    cheers,
    Ryan

  2. Hi all!
    I think this blog is a great idea (and will hopefully gain more momentum after the summer doldrums).
    Thanks for the paper suggestion Ryan.
    I had some ideas and questions following up on Jeremy's coalescent post, but specifically related to the simulation program SPLATCHE
    (Currat, Ray and Excoffier (2004) SPLATCHE: a program to simulate genetic diversity taking into account environmental heterogeneity. Mol Ecol Notes, 4(1), 139-142)
    and the follow-up paper
    (Klopfstein, Currat and Excoffier (2006) The fate of mutations surfing on the wave of a range expansion. Molecular Biology and Evolution)
    (I know these are a bit old, but I have been thinking about them recently)
    Have any of you worked with splatche?
    I agree with Ryan that the way populations are delimited in the coalescent is one of the most likely problems.
    The Klopfstein paper uses simulations in SPLATCHE to show how clines can be set up by mutations "surfing" along the edge of a range expansion. However, it seems to me that this is largely due to the range expansion consisting of colonization of empty "demes" followed by expansion within each of these populations (and some migration between occupied demes). I'm not convinced that you would get this surfing effect in a continuous (individual-based) range expansion (as opposed to stepwise population expansion). The authors do mention that the extent of surfing depends on the population size of the demes simulated. Anyhow, I thought that this might be a (simulated) case where the coalescent assumptions about population structure lead to entirely different results than an individual-based model would. But I suppose that, depending on what species you are dealing with, this stepwise population-by-population expansion might not necessarily be biologically unrealistic. What do you think?

    Hope everyone is having a great summer,
    Emily

  3. Allele surfing... this paper shows that the phenomenon can happen in real life, albeit with pampered bugs in petri dishes! Hallatschek et al. (2007) Genetic drift at expanding frontiers promotes gene segregation. PNAS, 104:19926-19930.

    This is just a nice overview: Excoffier & Ray (2008) Surfing during population expansions promotes genetic revolutions and structuration. TREE, 23:347-351.

    For those who are interested, the idea is that at the leading edge of an expanding population, densities are thin on the ground, and that means effective population sizes are locally very small (even though Ne could be huge in other parts of the species' range). Small Ne = high potential for rapid drift over very few generations, and so alleles that are otherwise very rare can 'surf' to locally high frequency at the leading edge. Importantly, these dramatic frequency shifts can be maintained, and so this is potentially a big deal when it comes to making historical inferences based on observations of abrupt spatial-genetic discontinuities across a landscape - rather than isolation/vicariance, it could be produced by simple uni-directional range expansion.

    Maybe I'm being overly simplistic, but I think this would be pretty easy to recognise, if you have the right sort of data. Imagine 'apparent breaks' caused by Holocene range expansions (masquerading as early-mid Pleistocene vicariance). The allele frequency changes could be abrupt in space, but the divergences among those alleles will be very shallow in time. Also, there is no reason (in my mind) to expect multiple loci to show 'allele surfing' breaks in the same geographic location if range expansion is the underlying process, yet vicariance should affect whole genomes. So, if you get DNA sequences plus frequency data, and have >1 locus, it should be pretty straightforward to tease this apart?

    cheers,
    Ryan

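
For anyone who wants to poke at the surfing idea discussed in the comments, here is a minimal sketch in Python. It is a toy serial-founder model, not the algorithm used by SPLATCHE or by either of the simulation programs mentioned in the post. The assumptions are all mine: a single line of demes, each new deme founded by `n_founders` haploid individuals drawn at random from the deme directly behind it, no migration after founding, and no selection. The function names and parameter values are hypothetical and chosen only for illustration.

```python
import random


def simulate_expansion(n_demes, n_founders, p0, rng):
    """Allele frequency in each deme along a one-dimensional colonisation axis.

    Deme 0 starts at frequency p0. Each later deme is founded by
    n_founders haploid individuals sampled (binomially) from the deme
    immediately behind it, so drift at the front scales with 1/n_founders.
    """
    if n_demes < 1 or n_founders < 1 or not 0.0 <= p0 <= 1.0:
        raise ValueError("need n_demes >= 1, n_founders >= 1, 0 <= p0 <= 1")
    freqs = [p0]
    for _ in range(n_demes - 1):
        p = freqs[-1]
        # Each founder carries the allele with probability p.
        carriers = sum(1 for _ in range(n_founders) if rng.random() < p)
        freqs.append(carriers / n_founders)
    return freqs


def fraction_surfed(n_reps, n_demes, n_founders, p0, seed=1):
    """Fraction of replicate expansions in which the allele is fixed at the front."""
    rng = random.Random(seed)
    fixed = 0
    for _ in range(n_reps):
        if simulate_expansion(n_demes, n_founders, p0, rng)[-1] == 1.0:
            fixed += 1
    return fixed / n_reps


# Same starting frequency, very different founder numbers at the front.
small = fraction_surfed(n_reps=500, n_demes=200, n_founders=5, p0=0.1)
large = fraction_surfed(n_reps=20, n_demes=20, n_founders=2000, p0=0.1)
print("fixed at the front, 5 founders per deme:   ", small)
print("fixed at the front, 2000 founders per deme:", large)
```

With only a handful of founders per deme, the allele is quickly either lost or fixed at the front, and because the model is neutral the fraction of runs ending in fixation should sit near the starting frequency. With thousands of founders the frequency barely moves over the same kind of expansion. Each call to `simulate_expansion` can be read as one independent locus, so repeated calls give a rough feel for how differently unlinked loci can behave under the same expansion history. Whether an individual-based expansion on a continuous landscape behaves like the small-founder case is exactly the open question raised above, and this sketch does not answer it.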