Sets up and executes the original GeoHiSSE model (Hidden Geographic State Speciation and Extinction) on a phylogeny and character distribution.
GeoHiSSE.old(phy, data, f=c(1,1,1), speciation=c(1,2,3), extirpation=c(1,2),
hidden.areas=FALSE, trans.rate=NULL, assume.cladogenetic=TRUE,
condition.on.survival=TRUE, root.type="madfitz", root.p=NULL, sann=TRUE,
sann.its=1000, bounded.search=TRUE, max.tol=.Machine$double.eps^.50,
mag.san.start=0.5, starting.vals=NULL, speciation.upper=1000, extirpation.upper=1000,
trans.upper=100, ode.eps=0)

phy 
a phylogenetic tree, in ape “phylo” format. 
data 
a matrix (or data frame) with two columns containing species information. The first column has the species names and the second column has the area codes. Values for the areas need to be 0, 1, or 2, where 0 is the widespread area '01', 1 is endemic area '0', and 2 is endemic area '1'. See 'Details'. 
f 
vector of length 3 with the estimated proportion of extant species in state 1 (area '0'), state 2 (area '1'), and state 0 (widespread area '01') that are included in the phylogeny. A value of c(0.25, 0.25, 0.5) means that 25 percent of species in areas '0' and '1' and 50 percent of species in area '01' are included in the phylogeny. By default all species are assumed to be sampled. 
speciation 
a numeric vector of length equal to 3 + (number of hidden areas * 3). The default is c(1,2,3). See 'Details'. 
extirpation 
a numeric vector of length equal to 2 + (number of hidden areas * 2). The default is c(1,2). See 'Details'. 
hidden.areas 
a logical indicating whether the model includes a hidden area. The default is FALSE. 
trans.rate 
provides the transition rate model. See function TransMatMakerGeoHiSSE.old. 
assume.cladogenetic 
assumes that cladogenetic events occur at nodes. The default is TRUE. 
condition.on.survival 
a logical indicating whether the likelihood should be conditioned on the survival of two lineages and the speciation event subtending them (Nee et al. 1994). The default is TRUE. 
root.type 
indicates whether root summarization follows the procedure described by FitzJohn et al. 2009, “madfitz”, or Herrera-Alsina et al. 2018, “herr_als”. 
root.p 
a vector indicating fixed root state probabilities. The default is NULL. 
sann 
a logical indicating whether a two-step optimization procedure is to be used. The first step is a simulated annealing approach, and the second is a refinement using the subplex routine. The default is TRUE. 
sann.its 
a numeric indicating the number of times the simulated annealing algorithm should call the objective function. 
bounded.search 
a logical indicating whether or not bounds should be enforced during optimization. The default is TRUE. 
max.tol 
supplies the relative optimization tolerance to the subplex routine. 
mag.san.start 
Sets the extinction fraction to estimate the starting values for the diversification parameters. The equation used is based on Magallon and Sanderson (2001), and follows the procedure used in the original GeoSSE implementation. 
starting.vals 
a numeric vector of length 3 with starting values for the model for all areas and hidden states. Position [1] sets turnover, [2] sets extinction fraction, and [3] dispersal rates. 
speciation.upper 
sets the upper bound for the speciation parameters. 
extirpation.upper 
sets the upper bound for the extirpation parameters. 
trans.upper 
sets the upper bound for the transition rate parameters. 
ode.eps 
sets the tolerance for the integration at the end of a branch. Essentially, if the sum of compD is less than this tolerance, the function assumes the results are unstable and discards them. The default is set to zero, but in testing a value of 1e-8 can sometimes produce stable solutions for both easy and very difficult optimization problems. 
This function sets up and executes the original GeoHiSSE model. The model closely follows diversitree, although here we employ modified optimization procedures. As for the data file format, GeoHiSSE expects a two-column matrix or data frame, with the first column containing the species names and the second containing the area information. The area information needs to be coded as one of three numbers: 0 for area '01', 1 for area '0', and 2 for area '1'. Please note that the coding for the areas here differs from that of the make.geosse function of package diversitree.
The order of the species in the data file and the tip names in the “phylo” object need not be the same; hisse deals with this internally. Also, the character information MUST be 0, 1, or 2; otherwise, the function will return an error message.
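The area coding described above can be sketched in base R as follows. This is only an illustration: the species names, the `raw` table, and the `area_code` lookup are hypothetical and not part of hisse.

```r
# Hypothetical raw data: species names plus a text label for the range.
raw <- data.frame(
  species = c("sp_a", "sp_b", "sp_c", "sp_d"),
  range   = c("0", "1", "01", "0"),
  stringsAsFactors = FALSE
)

# Lookup from range label to the coding described above:
# widespread '01' -> 0, endemic '0' -> 1, endemic '1' -> 2.
area_code <- c("01" = 0, "0" = 1, "1" = 2)

# Two-column data frame: species names first, area codes second.
dat <- data.frame(
  species = raw$species,
  area    = unname(area_code[raw$range]),
  stringsAsFactors = FALSE
)

# Guard against anything other than 0, 1, or 2 before fitting.
stopifnot(!anyNA(dat$area), all(dat$area %in% c(0, 1, 2)))
```

Checking the codes up front gives a clearer failure than waiting for the fitting function to reject the data.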
To set up a model, users input vectors containing values to indicate how many free parameters are to be estimated for each of the variables in the model. This is done using the speciation and extirpation parameters. One needs to specify a value for each of the parameters of the model; when two parameters show the same value, they are linked during the estimation of the model. For example, a GeoHiSSE model with 1 hidden area and all free parameters has speciation = 1:6. The same model with speciation rates constrained to be the same for all hidden areas has speciation = c(1,2,3,1,2,3). This same format applies to extirpation. Please note that GeoHiSSE currently works with up to 4 hidden areas. The most complex model would be speciation = 1:15 and extirpation = 1:10.
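As a rough illustration of the indexing scheme just described, the vectors can be generated with a small helper. `make_index` is a hypothetical function written for this sketch, not something provided by hisse; it assumes 3 speciation and 2 extirpation parameters per set of areas, as implied by the examples above.

```r
# Hypothetical helper (not part of hisse): index vector for a rate class
# with `n_base` parameters per set of areas and `n_hidden` hidden areas.
# linked = TRUE reuses the same indices across the hidden areas.
make_index <- function(n_base, n_hidden, linked = FALSE) {
  n_sets <- 1 + n_hidden
  if (linked) {
    rep(seq_len(n_base), times = n_sets)
  } else {
    seq_len(n_base * n_sets)
  }
}

# One hidden area, every speciation rate free.
speciation_free   <- make_index(3, n_hidden = 1)
# One hidden area, speciation rates shared across the hidden areas.
speciation_linked <- make_index(3, n_hidden = 1, linked = TRUE)
# One hidden area, every extirpation rate free.
extirpation_free  <- make_index(2, n_hidden = 1)
# Most complex case mentioned above: four hidden areas, all rates free.
speciation_max    <- make_index(3, n_hidden = 4)
extirpation_max   <- make_index(2, n_hidden = 4)
```

Repeated index values mark parameters that are estimated as a single shared rate, so the number of distinct values is the number of free parameters in that rate class.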
Once the model is specified, the parameters can be estimated using the subplex routine alone (sann=FALSE), or using a two-step process (sann=TRUE, the default shown in the usage above) that first employs a stochastic simulated annealing procedure, which is then refined using the subplex routine.
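A minimal sketch of running the same model under both optimization modes is given below. The wrapper `fit_both_modes` and its argument names are hypothetical; it uses only arguments listed in the usage above, and it assumes the caller supplies a tree, a coded data set, and a transition matrix built beforehand (for example with TransMatMakerGeoHiSSE.old). The function is only defined here, not run.

```r
# Sketch only: fits the same model with and without simulated annealing.
# `phy`, `dat`, and `trans` are supplied by the caller; `trans` is assumed
# to have been built with TransMatMakerGeoHiSSE.old.
fit_both_modes <- function(phy, dat, trans) {
  # Arguments shared by both fits (no hidden areas, all species sampled).
  common <- list(phy = phy, data = dat, f = c(1, 1, 1),
                 speciation = c(1, 2, 3), extirpation = c(1, 2),
                 hidden.areas = FALSE, trans.rate = trans)
  list(
    # Subplex routine only.
    subplex_only = do.call(hisse::GeoHiSSE.old,
                           c(common, list(sann = FALSE))),
    # Simulated annealing first, then subplex refinement.
    two_step     = do.call(hisse::GeoHiSSE.old,
                           c(common, list(sann = TRUE, sann.its = 1000)))
  )
}
```

Comparing the `$loglik` elements of the two results is one way to check whether the annealing step changed where the search ended up.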
The “trans.rate” input is the transition model and has an entirely different setup than speciation and extirpation rates. See the TransMatMakerGeoHiSSE.old function for more details.
For a user-specified “root.p”, you should specify the probability for each area. If you are fitting a hidden model, there will be six areas: 0A, 1A, 01A, 0B, 1B, 01B. So if you wanted to say the root had to be in area 0 (widespread distribution), you would specify “root.p = c(0.5, 0, 0, 0.5, 0, 0)”. In other words, the root has a 50% chance of being in each of the areas 0A and 0B.
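The root vector from the example above can be built by name rather than by position, which makes the intended states easier to read. This sketch assumes the state order listed in the text (0A, 1A, 01A, 0B, 1B, 01B); the `states` and `root_p` objects are illustrative names only.

```r
# State order as listed above for a model with one hidden area.
states <- c("0A", "1A", "01A", "0B", "1B", "01B")

# Start from zero everywhere, then split the root probability evenly
# between the two hidden states of area 0.
root_p <- setNames(numeric(length(states)), states)
root_p[c("0A", "0B")] <- 0.5

# Fixed root probabilities should sum to one.
stopifnot(isTRUE(all.equal(sum(root_p), 1)))
```

Passing `unname(root_p)` as “root.p” would then reproduce the vector given in the text.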
For the “root.type” option, we are currently maintaining the previous default of “madfitz”. However, it was recently pointed out by Herrera-Alsina et al. (2018) that at the root, the individual likelihoods for each possible state should be conditioned prior to averaging the individual likelihoods across states. This can be done by setting “root.type” to “herr_als”. It is unclear to us which is exactly correct, but it does seem that both “madfitz” and “herr_als” behave exactly as they should in the case of character-independent diversification (i.e., the likelihood reduces to the likelihood of the tree plus the likelihood of the trait model). We have also tested the behavior: the likelihood differences are very subtle, and the parameter estimates in simulation are nearly indistinguishable from those under the “madfitz” conditioning scheme. We provide both options and encourage users to try both and let us know the conditions under which the results vary dramatically between the two root implementations. We suspect they do not.
Also note that “root.type=user” and “root.type=equal” are no longer explicit “root.type” options. Instead, either “madfitz” or “herr_als” is specified, and “root.p” can be set to allow for custom root options.
GeoHiSSE returns an object of class geohisse.fit. This is a list with elements:
$loglik 
the maximum negative log-likelihood. 
$AIC 
Akaike information criterion. 
$AICc 
Akaike information criterion corrected for sample size. 
$solution 
a matrix containing the maximum likelihood estimates of the model parameters. 
$index.par 
an index matrix of the parameters being estimated. 
$f 
user-supplied sampling frequencies. 
$hidden.areas 
a logical indicating whether hidden areas were included in the model. 
$assume.cladogenetic 
a logical indicating whether cladogenetic events were allowed at nodes. 
$condition.on.survival 
a logical indicating whether the likelihood was conditioned on the survival of two lineages and the speciation event subtending them. 
$root.type 
indicates the user-specified root prior assumption. 
$root.p 
indicates the user-specified fixed root probabilities. 
$timeslice 
indicates the user-specified timeslice that split the tree. 
$phy 
user-supplied tree 
$data 
user-supplied dataset 
$trans.matrix 
the user-supplied transition matrix 
$max.tol 
relative optimization tolerance. 
$starting.vals 
The starting values for the optimization. 
$upper.bounds 
the vector of upper limits to the optimization search. 
$lower.bounds 
the vector of lower limits to the optimization search. 
$ode.eps 
The ode.eps value used for the estimation. 
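For orientation, the relationship between the log-likelihood, AIC, and AICc can be sketched with the standard formulas. All numbers below are hypothetical, and treating the sample size as the number of tips in the tree is an assumption of this sketch rather than something stated here.

```r
# Hypothetical values for illustration only.
loglik <- -250   # log-likelihood at the optimum
k      <- 7      # number of free parameters
n      <- 100    # sample size (assumed here to be the number of tips)

# Standard definitions of AIC and its small-sample correction.
aic  <- -2 * loglik + 2 * k
aicc <- aic + (2 * k * (k + 1)) / (n - k - 1)
```

The correction term shrinks as `n` grows relative to `k`, so AIC and AICc converge for large trees.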
Jeremy M. Beaulieu
Caetano, D.S., B.C. O'Meara, and J.M. Beaulieu. 2018. Hidden state models improve state-dependent diversification approaches, including biogeographic models. Evolution, 72:2308-2324.
Beaulieu, J.M., and B.C. O'Meara. 2016. Detecting hidden diversification shifts in models of trait-dependent speciation and extinction. Syst. Biol. 65:583-601.
FitzJohn R.G., W.P. Maddison, and S.P. Otto. 2009. Estimating trait-dependent speciation and extinction rates from incompletely resolved phylogenies. Syst. Biol. 58:595-611.
Goldberg, E. E., L. T. Lancaster, and R. H. Ree. 2011. Phylogenetic Inference of Reciprocal Effects between Geographic Range Evolution and Diversification. Syst. Biol. 60:451-465.
Maddison W.P., P.E. Midford, and S.P. Otto. 2007. Estimating a binary character's effect on speciation and extinction. Syst. Biol. 56:701-710.
Nee S., R.M. May, and P.H. Harvey. 1994. The reconstructed evolutionary process. Philos. Trans. R. Soc. Lond. B Biol. Sci. 344:305-311.