ivy.chars.hrm module¶

ivy.chars.hrm.AIC(l, k)[source]¶: Akaike information criterion

ivy.chars.hrm.any_equal(ar)[source]¶: Given a 3-D matrix, test if any of the sub-matrices are equal. For use in evaluating symmetric constraint in hrm likelihood function

ivy.chars.hrm.cluster_models(tree, chars, Q, nregime, pi=u'Equal', findmin=True)[source]¶: Given an MLE Q, return candidate models with more parsimonious parameter set by merging similar parameters

ivy.chars.hrm.create_hrm_ar(tree, chars, nregime, findmin=True)[source]¶: Create arrays to be used in hrm likelihood function

ivy.chars.hrm.create_likelihood_function_hrm_mk_MLE(tree, chars, nregime, Qtype, pi=u'Equal', findmin=True, constraints=u'Rate', orderedRegimes=True)[source]¶

Create a function that takes values for Q and returns likelihood.

Specify the Q to be ER, Sym, or ARD

Returned function to be passed into scipy.optimize

Parameters:

tree (Node) – Root node of a tree. All branch lengths must be greater than 0 (except root)
chars (dict) –
Dict mapping character states to tip labels. Character states should be coded 0,1,2...

Can also be a list with tip states in preorder sequence
Qtype (str) – ARD only
pi (str) – Either “Equal”, “Equilibrium”, or “Fitzjohn”. How to weight values at root node.
min (bool) – Whether the function is to be minimized (False means it will be maximized)
orderedRegimes (bool) – Whether or not regimes are ordered such that a transition from 1 -> 3 is not possible (a branch in state 1 must pass through 2 to get to 3)

Notes

Constraints:: The maximum rate within each rate category must be ordered ( fastest rate in slower regime must be slower than fastest rate in fastest regime)

Returns:	Function accepting a list of parameters and returning log-likelihood. To be optmimized with scipy.optimize.minimize
Return type:	function

ivy.chars.hrm.extract_Qparams(Q, nregime)[source]¶

ivy.chars.hrm.extract_regimes(Q, nobschar, nregime)[source]¶: Given a HRM Q matrix,extract the within-regime transitions

ivy.chars.hrm.fill_Q_layout(regimetype, Qparams)[source]¶: Fill a single regime matrix based on the regime type and the given parameters.

ivy.chars.hrm.fill_Q_matrix(nobschar, nregime, wrparams, brparams, Qtype=u'ARD', out=None, orderedRegimes=True)[source]¶

Fill a Q matrix with nchar*nregime rows and cols with values from Qparams

Parameters:

nchar (int) – number of observed characters
nregime (int) – number of hidden rates per character
wrparams (list) – List of unique Q values for within-regime transitions, in order as they appear in columnwise iteration
brparams (list) – list of unique Q values for between-regime transition, in order as they appear in columnwise iteration
Qtype (str) – Either “ARD” or “Simple”. What type of Q matrix to fill.
out (np.array) – Optional numpy array to put output into
orderedRegimes (bool) – Whether or not regimes are ordered such that a transition from 1 -> 3 is not possible (a branch in state 1 must pass through 2 to get to 3)

Returns:

Q-matrix with values filled in. Check to make sure values: have been filled in properly

Return type:

array

ivy.chars.hrm.fill_distinct_regime_Q(regimetypes, Qparams, nregime, nobschar, Q=None, Q_layout=None, br_variable=False)[source]¶

ivy.chars.hrm.fill_model_Q(mod, Qparams, Q)[source]¶

Fill Q matrix given model and parameters.

Parameters:

mod (list) –
List of tuples. Code for the mode used. The first nregime tuples correspond to the within-regime transition rates for each regime. The last tuple corresponds to the between-regime transition rates. Ex. [(1, 2), (3, 4), (5, 6, 7, 8)] corresponds to a matrix of:

[-,1,5,-] [2,-,-,6] [7,-,-,3] [-,8,4,-]
Qparams (list) – List of floats corresponding to the values for slow, medium, fast, etc. rates. The first is always 0
Q (np.array) – Pre-allocated Q matrix

ivy.chars.hrm.fit_hrm(tree, chars, nregime, pi=u'Equal', constraints=u'Rate', Qtype=u'ARD', orderedRegimes=True, startingvals=None)[source]¶

Fit a hidden-rates mk model to a given tree and list of characters, and number of regumes. Return fitted ARD Q matrix and calculated likelihood.

Parameters:	tree (Node) – Root node of a tree. All branch lengths must be greater than 0 (except root) chars (dict) – Dict mapping character states to tip labels. Character states should be coded 0,1,2... Can also be a list with tip states in preorder sequence nregime (int) – Number of hidden rates per character pi (str) – Either “Equal”, “Equilibrium”, or “Fitzjohn”. How to weight values at root node. Defaults to “Equal” Method “Fitzjohn” is not thouroughly tested, use with caution orderedRegimes (bool) – Whether or not to constrain regime transitions to only occur between adjacent regimes startingvals (list) – Values to initialize optimization at. Optional
Returns:	Tuple of fitted Q matrix (a np array) and log-likelihood value
Return type:	tuple

ivy.chars.hrm.fit_hrm_distinct_regimes(tree, chars, nregime, nparams, pi=u'Equal', br_variable=False, parallel=False, ncores=2, out_file=None)[source]¶

Fit hrm with distinct regime types given number of regimes and number of parameters

BINARY CHARACTERS ONLY

Parameters:

tree (Node) – Root node of tree
chars (dict) –
Dict mapping character states to tip labels. Character states should be coded 0,1,2...

Can also be a list with tip states in preorder sequence
nregime (int) – Number of regimes to test in model
nparams (int) – Number of unique parameters available for regimes
pi (str) – Root prior
br_variable (bool) – Whether or not between-regime rates are allowed to vary
parallel (bool) – Whether to run in parallel
ncores (int) – If parallel is True, number of cores to run on
out_file (str) – Optional output file name. If given, results will written into the current directory with this string as the first part of the filename

ivy.chars.hrm.fit_hrm_mkARD(tree, chars, nregime, pi=u'Equal', constraints=u'Rate', orderedRegimes=True, startingvals=None)[source]¶

Fit a hidden-rates mk model to a given tree and list of characters, and number of regumes. Return fitted ARD Q matrix and calculated likelihood.

Parameters:	tree (Node) – Root node of a tree. All branch lengths must be chars (dict) – Dict mapping character states to tip labels. Character states should be coded 0,1,2... Can also be a list with tip states in preorder sequence nregime (int) – Number of hidden rates per character pi (str) – Either “Equal”, “Equilibrium”, or “Fitzjohn”. How to weight values at root node. Defaults to “Equal” Method “Fitzjohn” is not thouroughly tested, use with caution orderedRegimes (bool) – Whether or not to constrain regime transitions to only occur between adjacent regimes
Returns:	Tuple of fitted Q matrix (a np array) and log-likelihood value
Return type:	tuple

ivy.chars.hrm.fit_hrm_mkSimple(tree, chars, nregime, pi=u'Equal', orderedRegimes=True, startingvals=None)[source]¶

Fit a hidden-rates mk model to a given tree and list of characters, and number of regumes. Return fitted ARD Q matrix and calculated likelihood.

Parameters:	tree (Node) – Root node of a tree. All branch lengths must be greater than 0 (except root) chars (dict) – Dict mapping character states to tip labels. Character states should be coded 0,1,2... Can also be a list with tip states in preorder sequence nregime (int) – Number of hidden rates per character pi (str) – Either “Equal”, “Equilibrium”, or “Fitzjohn”. How to weight values at root node. Defaults to “Equal” Method “Fitzjohn” is not thouroughly tested, use with caution
Returns:	Tuple of fitted Q matrix (a np array) and log-likelihood value
Return type:	tuple

ivy.chars.hrm.fit_hrm_model(tree, chars, nregime, mod, pi=u'Equal', findmin=True, initialvals=None)[source]¶: Fit parameters for a single model specified by the user

ivy.chars.hrm.fit_hrm_model_likelihood(tree, chars, nregime, mod, pi=u'Equal', findmin=True)[source]¶: Likelihood function for fitting user-specified model

ivy.chars.hrm.fit_hrm_qidx(tree, chars, nregime, qidx, pi=u'Equal', orderedRegimes=True, startingvals=None)[source]¶

Fit a hidden-rates mk model to a given tree and list of characters, and number of regumes. Return fitted ARD Q matrix and calculated likelihood.

Parameters:	tree (Node) – Root node of a tree. All branch lengths must be chars (dict) – Dict mapping character states to tip labels. Character states should be coded 0,1,2... Can also be a list with tip states in preorder sequence nregime (int) – Number of hidden rates per character pi (str) – Either “Equal”, “Equilibrium”, or “Fitzjohn”. How to weight values at root node. Defaults to “Equal” Method “Fitzjohn” is not thouroughly tested, use with caution orderedRegimes (bool) – Whether or not to constrain regime transitions to only occur between adjacent regimes
Returns:	Dict of fitted Q matrix (a np array) and log-likelihood value
Return type:	dict

ivy.chars.hrm.fit_hrm_randstartingpoints(tree, chars, nregime, nstart=10, pi=u'Equal', constraints=u'Rate', Qtype=u'ARD', orderedRegimes=True)[source]¶

Fit a hidden-rates mk model to a given tree and list of characters, and number of regumes. Return fitted ARD Q matrix and calculated likelihood. Run nstart times, each with different randomized starting values, and return the highest likelihood

Parameters:	tree (Node) – Root node of a tree. All branch lengths must be greater than 0 (except root) chars (dict) – Dict mapping character states to tip labels. Character states should be coded 0,1,2... Can also be a list with tip states in preorder sequence nregime (int) – Number of hidden rates per character nstart (int) – Number of times to run likelihood fit pi (str) – Either “Equal”, “Equilibrium”, or “Fitzjohn”. How to weight values at root node. Defaults to “Equal” Method “Fitzjohn” is not thouroughly tested, use with caution orderedRegimes (bool) – Whether or not to constrain regime transitions to only occur between adjacent regimes
Returns:	Tuple of fitted Q matrix (a np array) and log-likelihood value
Return type:	tuple

ivy.chars.hrm.format_mod(mod, nregime, nobschar)[source]¶: Convert flattened model to formatted model for use in fill_model_Q

ivy.chars.hrm.hrm_disctinct_regimes_likelihoodfunc(tree, chars, regimetypes, pi=u'Equal', findmin=True, br_variable=False, ar=None)[source]¶

ivy.chars.hrm.hrm_mk(tree, chars, Q, nregime, pi=u'Equal', returnPi=False, ar=None)[source]¶

Return log-likelihood of hidden-rates-markov mk model as described in Beaulieu et al. 2013

Parameters:	tree (Node) – Root node of a tree. All branch lengths must be greater than 0 (except root) chars (dict) – Dict mapping character states to tip labels. Character states should be coded 0,1,2... Can also be a list with tip states in preorder sequence Q (np.array) – Instantaneous rate matrix p (np.array) – Optional pre-allocated p matrix pi (str or np.array) – Option to weight the root node by given values. Either a string containing the method or an array of weights. Weights should be given in order. Accepted methods of weighting root: Equal: flat prior Equilibrium: Prior equal to stationary distribution of Q matrix Fitzjohn: Root states weighted by how well they explain the data at the tips. returnPi (bool) – Whether or not to return the final values of root node weighting ar (dict) – Dict of pre-allocated arrays to improve speed by avoiding creating and destroying new arrays
Returns:	Log-likelihood of model
Return type:	float

Examples

from ivy.examples import primates,primate_data Q = np.array([[-0.11,0.1 ,0.01 ,0 ],

[0.1 ,-0.11,0.0 ,0.01 ], [0.01 ,0.0 ,-0.51,0.5 ], [0.0 ,0.01 ,0.5 ,-0.51]])

hrm_mk(primates,primate_data,Q,nregime=2)

ivy.chars.hrm.make_identity_regime(regimePair)[source]¶: Given one regime pair, reduce it to its identity (eg (2,2),(3,3) becomes (1,1),(2,2)

ivy.chars.hrm.make_regime_type_combos(nregime, nparams)[source]¶: Create regime combinations for a binary character

ivy.chars.hrm.optimize_regime(comb, tree=None, chars=None, nregime=None, nparams=None, pi=None, br_variable=None, out_file=None, ar=None)[source]¶: Top-level function for optimizing hrm distinct-regime model in parallel.

ivy.chars.hrm.pairwise_merge(tree, chars, Q, nregime, pi=u'Equal')[source]¶

Given an MLE Q, merge similar parameters pairwise to find more parsimonious models

Parameters:

tree (Node) – Root node of a tree. All branch lengths must be greater than 0 (except root)
chars (dict) –
Dict mapping character states to tip labels. Character states should be coded 0,1,2...

Can also be a list with tip states in preorder sequence
Q (np.array) – Instantaneous rate matrix
nregime (int) – Number of regimes in the model
pi (str or np.array) –
Option to weight the root node by given values. Either a string containing the method or an array of weights. Weights should be given in order. Accepted methods of weighting root:

Equal: flat prior Equilibrium: Prior equal to stationary distribution

of Q matrix

Fitzjohn: Root states weighted by how well they

explain the data at the tips.

ivy.chars.hrm.remove_redundant_regimes(regimePairs)[source]¶: Remove redundant regime pairs (eg (1,1),(2,2) and (2,2),(3,3) are redundant and only one should be kept)