ivy.chars.hrm module¶
-
ivy.chars.hrm.
any_equal
(ar)[source]¶ Given a 3-D matrix, test if any of the sub-matrices are equal. For use in evaluating symmetric constraint in hrm likelihood function
-
ivy.chars.hrm.
cluster_models
(tree, chars, Q, nregime, pi=u'Equal', findmin=True)[source]¶ Given an MLE Q, return candidate models with more parsimonious parameter set by merging similar parameters
-
ivy.chars.hrm.
create_hrm_ar
(tree, chars, nregime, findmin=True)[source]¶ Create arrays to be used in hrm likelihood function
-
ivy.chars.hrm.
create_likelihood_function_hrm_mk_MLE
(tree, chars, nregime, Qtype, pi=u'Equal', findmin=True, constraints=u'Rate', orderedRegimes=True)[source]¶ Create a function that takes values for Q and returns likelihood.
Specify the Q to be ER, Sym, or ARD
Returned function to be passed into scipy.optimize
Parameters: - tree (Node) – Root node of a tree. All branch lengths must be greater than 0 (except root)
- chars (dict) –
Dict mapping character states to tip labels. Character states should be coded 0,1,2...
Can also be a list with tip states in preorder sequence
- Qtype (str) – ARD only
- pi (str) – Either “Equal”, “Equilibrium”, or “Fitzjohn”. How to weight values at root node.
- min (bool) – Whether the function is to be minimized (False means it will be maximized)
- orderedRegimes (bool) – Whether or not regimes are ordered such that a transition from 1 -> 3 is not possible (a branch in state 1 must pass through 2 to get to 3)
Notes
- Constraints:
- The maximum rate within each rate category must be ordered ( fastest rate in slower regime must be slower than fastest rate in fastest regime)
Returns: - Function accepting a list of parameters and returning
- log-likelihood. To be optmimized with scipy.optimize.minimize
Return type: function
-
ivy.chars.hrm.
extract_regimes
(Q, nobschar, nregime)[source]¶ Given a HRM Q matrix,extract the within-regime transitions
-
ivy.chars.hrm.
fill_Q_layout
(regimetype, Qparams)[source]¶ Fill a single regime matrix based on the regime type and the given parameters.
-
ivy.chars.hrm.
fill_Q_matrix
(nobschar, nregime, wrparams, brparams, Qtype=u'ARD', out=None, orderedRegimes=True)[source]¶ Fill a Q matrix with nchar*nregime rows and cols with values from Qparams
Parameters: - nchar (int) – number of observed characters
- nregime (int) – number of hidden rates per character
- wrparams (list) – List of unique Q values for within-regime transitions, in order as they appear in columnwise iteration
- brparams (list) – list of unique Q values for between-regime transition, in order as they appear in columnwise iteration
- Qtype (str) – Either “ARD” or “Simple”. What type of Q matrix to fill.
- out (np.array) – Optional numpy array to put output into
- orderedRegimes (bool) – Whether or not regimes are ordered such that a transition from 1 -> 3 is not possible (a branch in state 1 must pass through 2 to get to 3)
Returns: - Q-matrix with values filled in. Check to make sure values
have been filled in properly
Return type:
-
ivy.chars.hrm.
fill_distinct_regime_Q
(regimetypes, Qparams, nregime, nobschar, Q=None, Q_layout=None, br_variable=False)[source]¶
-
ivy.chars.hrm.
fill_model_Q
(mod, Qparams, Q)[source]¶ Fill Q matrix given model and parameters.
Parameters: - mod (list) –
List of tuples. Code for the mode used. The first nregime tuples correspond to the within-regime transition rates for each regime. The last tuple corresponds to the between-regime transition rates. Ex. [(1, 2), (3, 4), (5, 6, 7, 8)] corresponds to a matrix of:
[-,1,5,-] [2,-,-,6] [7,-,-,3] [-,8,4,-] - Qparams (list) – List of floats corresponding to the values for slow, medium, fast, etc. rates. The first is always 0
- Q (np.array) – Pre-allocated Q matrix
- mod (list) –
-
ivy.chars.hrm.
fit_hrm
(tree, chars, nregime, pi=u'Equal', constraints=u'Rate', Qtype=u'ARD', orderedRegimes=True, startingvals=None)[source]¶ Fit a hidden-rates mk model to a given tree and list of characters, and number of regumes. Return fitted ARD Q matrix and calculated likelihood.
Parameters: - tree (Node) – Root node of a tree. All branch lengths must be greater than 0 (except root)
- chars (dict) –
Dict mapping character states to tip labels. Character states should be coded 0,1,2...
Can also be a list with tip states in preorder sequence
- nregime (int) – Number of hidden rates per character
- pi (str) – Either “Equal”, “Equilibrium”, or “Fitzjohn”. How to weight values at root node. Defaults to “Equal” Method “Fitzjohn” is not thouroughly tested, use with caution
- orderedRegimes (bool) – Whether or not to constrain regime transitions to only occur between adjacent regimes
- startingvals (list) – Values to initialize optimization at. Optional
Returns: Tuple of fitted Q matrix (a np array) and log-likelihood value
Return type:
-
ivy.chars.hrm.
fit_hrm_distinct_regimes
(tree, chars, nregime, nparams, pi=u'Equal', br_variable=False, parallel=False, ncores=2, out_file=None)[source]¶ Fit hrm with distinct regime types given number of regimes and number of parameters
BINARY CHARACTERS ONLY
Parameters: - tree (Node) – Root node of tree
- chars (dict) –
Dict mapping character states to tip labels. Character states should be coded 0,1,2...
Can also be a list with tip states in preorder sequence
- nregime (int) – Number of regimes to test in model
- nparams (int) – Number of unique parameters available for regimes
- pi (str) – Root prior
- br_variable (bool) – Whether or not between-regime rates are allowed to vary
- parallel (bool) – Whether to run in parallel
- ncores (int) – If parallel is True, number of cores to run on
- out_file (str) – Optional output file name. If given, results will written into the current directory with this string as the first part of the filename
-
ivy.chars.hrm.
fit_hrm_mkARD
(tree, chars, nregime, pi=u'Equal', constraints=u'Rate', orderedRegimes=True, startingvals=None)[source]¶ Fit a hidden-rates mk model to a given tree and list of characters, and number of regumes. Return fitted ARD Q matrix and calculated likelihood.
Parameters: - tree (Node) – Root node of a tree. All branch lengths must be
- chars (dict) –
Dict mapping character states to tip labels. Character states should be coded 0,1,2...
Can also be a list with tip states in preorder sequence
- nregime (int) – Number of hidden rates per character
- pi (str) – Either “Equal”, “Equilibrium”, or “Fitzjohn”. How to weight values at root node. Defaults to “Equal” Method “Fitzjohn” is not thouroughly tested, use with caution
- orderedRegimes (bool) – Whether or not to constrain regime transitions to only occur between adjacent regimes
Returns: Tuple of fitted Q matrix (a np array) and log-likelihood value
Return type:
-
ivy.chars.hrm.
fit_hrm_mkSimple
(tree, chars, nregime, pi=u'Equal', orderedRegimes=True, startingvals=None)[source]¶ Fit a hidden-rates mk model to a given tree and list of characters, and number of regumes. Return fitted ARD Q matrix and calculated likelihood.
Parameters: - tree (Node) – Root node of a tree. All branch lengths must be greater than 0 (except root)
- chars (dict) –
Dict mapping character states to tip labels. Character states should be coded 0,1,2...
Can also be a list with tip states in preorder sequence
- nregime (int) – Number of hidden rates per character
- pi (str) – Either “Equal”, “Equilibrium”, or “Fitzjohn”. How to weight values at root node. Defaults to “Equal” Method “Fitzjohn” is not thouroughly tested, use with caution
Returns: Tuple of fitted Q matrix (a np array) and log-likelihood value
Return type:
-
ivy.chars.hrm.
fit_hrm_model
(tree, chars, nregime, mod, pi=u'Equal', findmin=True, initialvals=None)[source]¶ Fit parameters for a single model specified by the user
-
ivy.chars.hrm.
fit_hrm_model_likelihood
(tree, chars, nregime, mod, pi=u'Equal', findmin=True)[source]¶ Likelihood function for fitting user-specified model
-
ivy.chars.hrm.
fit_hrm_qidx
(tree, chars, nregime, qidx, pi=u'Equal', orderedRegimes=True, startingvals=None)[source]¶ Fit a hidden-rates mk model to a given tree and list of characters, and number of regumes. Return fitted ARD Q matrix and calculated likelihood.
Parameters: - tree (Node) – Root node of a tree. All branch lengths must be
- chars (dict) –
Dict mapping character states to tip labels. Character states should be coded 0,1,2...
Can also be a list with tip states in preorder sequence
- nregime (int) – Number of hidden rates per character
- pi (str) – Either “Equal”, “Equilibrium”, or “Fitzjohn”. How to weight values at root node. Defaults to “Equal” Method “Fitzjohn” is not thouroughly tested, use with caution
- orderedRegimes (bool) – Whether or not to constrain regime transitions to only occur between adjacent regimes
Returns: Dict of fitted Q matrix (a np array) and log-likelihood value
Return type:
-
ivy.chars.hrm.
fit_hrm_randstartingpoints
(tree, chars, nregime, nstart=10, pi=u'Equal', constraints=u'Rate', Qtype=u'ARD', orderedRegimes=True)[source]¶ Fit a hidden-rates mk model to a given tree and list of characters, and number of regumes. Return fitted ARD Q matrix and calculated likelihood. Run nstart times, each with different randomized starting values, and return the highest likelihood
Parameters: - tree (Node) – Root node of a tree. All branch lengths must be greater than 0 (except root)
- chars (dict) –
Dict mapping character states to tip labels. Character states should be coded 0,1,2...
Can also be a list with tip states in preorder sequence
- nregime (int) – Number of hidden rates per character
- nstart (int) – Number of times to run likelihood fit
- pi (str) – Either “Equal”, “Equilibrium”, or “Fitzjohn”. How to weight values at root node. Defaults to “Equal” Method “Fitzjohn” is not thouroughly tested, use with caution
- orderedRegimes (bool) – Whether or not to constrain regime transitions to only occur between adjacent regimes
Returns: Tuple of fitted Q matrix (a np array) and log-likelihood value
Return type:
-
ivy.chars.hrm.
format_mod
(mod, nregime, nobschar)[source]¶ Convert flattened model to formatted model for use in fill_model_Q
-
ivy.chars.hrm.
hrm_disctinct_regimes_likelihoodfunc
(tree, chars, regimetypes, pi=u'Equal', findmin=True, br_variable=False, ar=None)[source]¶
-
ivy.chars.hrm.
hrm_mk
(tree, chars, Q, nregime, pi=u'Equal', returnPi=False, ar=None)[source]¶ Return log-likelihood of hidden-rates-markov mk model as described in Beaulieu et al. 2013
Parameters: - tree (Node) – Root node of a tree. All branch lengths must be greater than 0 (except root)
- chars (dict) –
Dict mapping character states to tip labels. Character states should be coded 0,1,2...
Can also be a list with tip states in preorder sequence
- Q (np.array) – Instantaneous rate matrix
- p (np.array) – Optional pre-allocated p matrix
- pi (str or np.array) –
Option to weight the root node by given values. Either a string containing the method or an array of weights. Weights should be given in order. Accepted methods of weighting root:
Equal: flat prior Equilibrium: Prior equal to stationary distributionof Q matrix- Fitzjohn: Root states weighted by how well they
- explain the data at the tips.
- returnPi (bool) – Whether or not to return the final values of root node weighting
- ar (dict) – Dict of pre-allocated arrays to improve speed by avoiding creating and destroying new arrays
Returns: Log-likelihood of model
Return type: Examples
from ivy.examples import primates,primate_data Q = np.array([[-0.11,0.1 ,0.01 ,0 ],
[0.1 ,-0.11,0.0 ,0.01 ], [0.01 ,0.0 ,-0.51,0.5 ], [0.0 ,0.01 ,0.5 ,-0.51]])hrm_mk(primates,primate_data,Q,nregime=2)
-
ivy.chars.hrm.
make_identity_regime
(regimePair)[source]¶ Given one regime pair, reduce it to its identity (eg (2,2),(3,3) becomes (1,1),(2,2)
-
ivy.chars.hrm.
make_regime_type_combos
(nregime, nparams)[source]¶ Create regime combinations for a binary character
-
ivy.chars.hrm.
optimize_regime
(comb, tree=None, chars=None, nregime=None, nparams=None, pi=None, br_variable=None, out_file=None, ar=None)[source]¶ Top-level function for optimizing hrm distinct-regime model in parallel.
-
ivy.chars.hrm.
pairwise_merge
(tree, chars, Q, nregime, pi=u'Equal')[source]¶ Given an MLE Q, merge similar parameters pairwise to find more parsimonious models
Parameters: - tree (Node) – Root node of a tree. All branch lengths must be greater than 0 (except root)
- chars (dict) –
Dict mapping character states to tip labels. Character states should be coded 0,1,2...
Can also be a list with tip states in preorder sequence
- Q (np.array) – Instantaneous rate matrix
- nregime (int) – Number of regimes in the model
- pi (str or np.array) –
Option to weight the root node by given values. Either a string containing the method or an array of weights. Weights should be given in order. Accepted methods of weighting root:
Equal: flat prior Equilibrium: Prior equal to stationary distributionof Q matrix- Fitzjohn: Root states weighted by how well they
- explain the data at the tips.