ivy.chars.hrm module

ivy.chars.hrm.AIC(l, k)[source]

Akaike information criterion

ivy.chars.hrm.any_equal(ar)[source]

Given a 3-D matrix, test if any of the sub-matrices are equal. For use in evaluating symmetric constraint in hrm likelihood function

ivy.chars.hrm.cluster_models(tree, chars, Q, nregime, pi=u'Equal', findmin=True)[source]

Given an MLE Q, return candidate models with more parsimonious parameter set by merging similar parameters

ivy.chars.hrm.create_hrm_ar(tree, chars, nregime, findmin=True)[source]

Create arrays to be used in hrm likelihood function

ivy.chars.hrm.create_likelihood_function_hrm_mk_MLE(tree, chars, nregime, Qtype, pi=u'Equal', findmin=True, constraints=u'Rate', orderedRegimes=True)[source]

Create a function that takes values for Q and returns likelihood.

Specify the Q to be ER, Sym, or ARD

Returned function to be passed into scipy.optimize

Parameters:
  • tree (Node) – Root node of a tree. All branch lengths must be greater than 0 (except root)
  • chars (dict) –

    Dict mapping character states to tip labels. Character states should be coded 0,1,2...

    Can also be a list with tip states in preorder sequence

  • Qtype (str) – ARD only
  • pi (str) – Either “Equal”, “Equilibrium”, or “Fitzjohn”. How to weight values at root node.
  • min (bool) – Whether the function is to be minimized (False means it will be maximized)
  • orderedRegimes (bool) – Whether or not regimes are ordered such that a transition from 1 -> 3 is not possible (a branch in state 1 must pass through 2 to get to 3)

Notes

Constraints:
The maximum rate within each rate category must be ordered ( fastest rate in slower regime must be slower than fastest rate in fastest regime)
Returns:
Function accepting a list of parameters and returning
log-likelihood. To be optmimized with scipy.optimize.minimize
Return type:function
ivy.chars.hrm.extract_Qparams(Q, nregime)[source]
ivy.chars.hrm.extract_regimes(Q, nobschar, nregime)[source]

Given a HRM Q matrix,extract the within-regime transitions

ivy.chars.hrm.fill_Q_layout(regimetype, Qparams)[source]

Fill a single regime matrix based on the regime type and the given parameters.

ivy.chars.hrm.fill_Q_matrix(nobschar, nregime, wrparams, brparams, Qtype=u'ARD', out=None, orderedRegimes=True)[source]

Fill a Q matrix with nchar*nregime rows and cols with values from Qparams

Parameters:
  • nchar (int) – number of observed characters
  • nregime (int) – number of hidden rates per character
  • wrparams (list) – List of unique Q values for within-regime transitions, in order as they appear in columnwise iteration
  • brparams (list) – list of unique Q values for between-regime transition, in order as they appear in columnwise iteration
  • Qtype (str) – Either “ARD” or “Simple”. What type of Q matrix to fill.
  • out (np.array) – Optional numpy array to put output into
  • orderedRegimes (bool) – Whether or not regimes are ordered such that a transition from 1 -> 3 is not possible (a branch in state 1 must pass through 2 to get to 3)
Returns:

Q-matrix with values filled in. Check to make sure values

have been filled in properly

Return type:

array

ivy.chars.hrm.fill_distinct_regime_Q(regimetypes, Qparams, nregime, nobschar, Q=None, Q_layout=None, br_variable=False)[source]
ivy.chars.hrm.fill_model_Q(mod, Qparams, Q)[source]

Fill Q matrix given model and parameters.

Parameters:
  • mod (list) –

    List of tuples. Code for the mode used. The first nregime tuples correspond to the within-regime transition rates for each regime. The last tuple corresponds to the between-regime transition rates. Ex. [(1, 2), (3, 4), (5, 6, 7, 8)] corresponds to a matrix of:

    [-,1,5,-] [2,-,-,6] [7,-,-,3] [-,8,4,-]
  • Qparams (list) – List of floats corresponding to the values for slow, medium, fast, etc. rates. The first is always 0
  • Q (np.array) – Pre-allocated Q matrix
ivy.chars.hrm.fit_hrm(tree, chars, nregime, pi=u'Equal', constraints=u'Rate', Qtype=u'ARD', orderedRegimes=True, startingvals=None)[source]

Fit a hidden-rates mk model to a given tree and list of characters, and number of regumes. Return fitted ARD Q matrix and calculated likelihood.

Parameters:
  • tree (Node) – Root node of a tree. All branch lengths must be greater than 0 (except root)
  • chars (dict) –

    Dict mapping character states to tip labels. Character states should be coded 0,1,2...

    Can also be a list with tip states in preorder sequence

  • nregime (int) – Number of hidden rates per character
  • pi (str) – Either “Equal”, “Equilibrium”, or “Fitzjohn”. How to weight values at root node. Defaults to “Equal” Method “Fitzjohn” is not thouroughly tested, use with caution
  • orderedRegimes (bool) – Whether or not to constrain regime transitions to only occur between adjacent regimes
  • startingvals (list) – Values to initialize optimization at. Optional
Returns:

Tuple of fitted Q matrix (a np array) and log-likelihood value

Return type:

tuple

ivy.chars.hrm.fit_hrm_distinct_regimes(tree, chars, nregime, nparams, pi=u'Equal', br_variable=False, parallel=False, ncores=2, out_file=None)[source]

Fit hrm with distinct regime types given number of regimes and number of parameters

BINARY CHARACTERS ONLY

Parameters:
  • tree (Node) – Root node of tree
  • chars (dict) –

    Dict mapping character states to tip labels. Character states should be coded 0,1,2...

    Can also be a list with tip states in preorder sequence

  • nregime (int) – Number of regimes to test in model
  • nparams (int) – Number of unique parameters available for regimes
  • pi (str) – Root prior
  • br_variable (bool) – Whether or not between-regime rates are allowed to vary
  • parallel (bool) – Whether to run in parallel
  • ncores (int) – If parallel is True, number of cores to run on
  • out_file (str) – Optional output file name. If given, results will written into the current directory with this string as the first part of the filename
ivy.chars.hrm.fit_hrm_mkARD(tree, chars, nregime, pi=u'Equal', constraints=u'Rate', orderedRegimes=True, startingvals=None)[source]

Fit a hidden-rates mk model to a given tree and list of characters, and number of regumes. Return fitted ARD Q matrix and calculated likelihood.

Parameters:
  • tree (Node) – Root node of a tree. All branch lengths must be
  • chars (dict) –

    Dict mapping character states to tip labels. Character states should be coded 0,1,2...

    Can also be a list with tip states in preorder sequence

  • nregime (int) – Number of hidden rates per character
  • pi (str) – Either “Equal”, “Equilibrium”, or “Fitzjohn”. How to weight values at root node. Defaults to “Equal” Method “Fitzjohn” is not thouroughly tested, use with caution
  • orderedRegimes (bool) – Whether or not to constrain regime transitions to only occur between adjacent regimes
Returns:

Tuple of fitted Q matrix (a np array) and log-likelihood value

Return type:

tuple

ivy.chars.hrm.fit_hrm_mkSimple(tree, chars, nregime, pi=u'Equal', orderedRegimes=True, startingvals=None)[source]

Fit a hidden-rates mk model to a given tree and list of characters, and number of regumes. Return fitted ARD Q matrix and calculated likelihood.

Parameters:
  • tree (Node) – Root node of a tree. All branch lengths must be greater than 0 (except root)
  • chars (dict) –

    Dict mapping character states to tip labels. Character states should be coded 0,1,2...

    Can also be a list with tip states in preorder sequence

  • nregime (int) – Number of hidden rates per character
  • pi (str) – Either “Equal”, “Equilibrium”, or “Fitzjohn”. How to weight values at root node. Defaults to “Equal” Method “Fitzjohn” is not thouroughly tested, use with caution
Returns:

Tuple of fitted Q matrix (a np array) and log-likelihood value

Return type:

tuple

ivy.chars.hrm.fit_hrm_model(tree, chars, nregime, mod, pi=u'Equal', findmin=True, initialvals=None)[source]

Fit parameters for a single model specified by the user

ivy.chars.hrm.fit_hrm_model_likelihood(tree, chars, nregime, mod, pi=u'Equal', findmin=True)[source]

Likelihood function for fitting user-specified model

ivy.chars.hrm.fit_hrm_qidx(tree, chars, nregime, qidx, pi=u'Equal', orderedRegimes=True, startingvals=None)[source]

Fit a hidden-rates mk model to a given tree and list of characters, and number of regumes. Return fitted ARD Q matrix and calculated likelihood.

Parameters:
  • tree (Node) – Root node of a tree. All branch lengths must be
  • chars (dict) –

    Dict mapping character states to tip labels. Character states should be coded 0,1,2...

    Can also be a list with tip states in preorder sequence

  • nregime (int) – Number of hidden rates per character
  • pi (str) – Either “Equal”, “Equilibrium”, or “Fitzjohn”. How to weight values at root node. Defaults to “Equal” Method “Fitzjohn” is not thouroughly tested, use with caution
  • orderedRegimes (bool) – Whether or not to constrain regime transitions to only occur between adjacent regimes
Returns:

Dict of fitted Q matrix (a np array) and log-likelihood value

Return type:

dict

ivy.chars.hrm.fit_hrm_randstartingpoints(tree, chars, nregime, nstart=10, pi=u'Equal', constraints=u'Rate', Qtype=u'ARD', orderedRegimes=True)[source]

Fit a hidden-rates mk model to a given tree and list of characters, and number of regumes. Return fitted ARD Q matrix and calculated likelihood. Run nstart times, each with different randomized starting values, and return the highest likelihood

Parameters:
  • tree (Node) – Root node of a tree. All branch lengths must be greater than 0 (except root)
  • chars (dict) –

    Dict mapping character states to tip labels. Character states should be coded 0,1,2...

    Can also be a list with tip states in preorder sequence

  • nregime (int) – Number of hidden rates per character
  • nstart (int) – Number of times to run likelihood fit
  • pi (str) – Either “Equal”, “Equilibrium”, or “Fitzjohn”. How to weight values at root node. Defaults to “Equal” Method “Fitzjohn” is not thouroughly tested, use with caution
  • orderedRegimes (bool) – Whether or not to constrain regime transitions to only occur between adjacent regimes
Returns:

Tuple of fitted Q matrix (a np array) and log-likelihood value

Return type:

tuple

ivy.chars.hrm.format_mod(mod, nregime, nobschar)[source]

Convert flattened model to formatted model for use in fill_model_Q

ivy.chars.hrm.hrm_disctinct_regimes_likelihoodfunc(tree, chars, regimetypes, pi=u'Equal', findmin=True, br_variable=False, ar=None)[source]
ivy.chars.hrm.hrm_mk(tree, chars, Q, nregime, pi=u'Equal', returnPi=False, ar=None)[source]

Return log-likelihood of hidden-rates-markov mk model as described in Beaulieu et al. 2013

Parameters:
  • tree (Node) – Root node of a tree. All branch lengths must be greater than 0 (except root)
  • chars (dict) –

    Dict mapping character states to tip labels. Character states should be coded 0,1,2...

    Can also be a list with tip states in preorder sequence

  • Q (np.array) – Instantaneous rate matrix
  • p (np.array) – Optional pre-allocated p matrix
  • pi (str or np.array) –

    Option to weight the root node by given values. Either a string containing the method or an array of weights. Weights should be given in order. Accepted methods of weighting root:

    Equal: flat prior Equilibrium: Prior equal to stationary distribution
    of Q matrix
    Fitzjohn: Root states weighted by how well they
    explain the data at the tips.
  • returnPi (bool) – Whether or not to return the final values of root node weighting
  • ar (dict) – Dict of pre-allocated arrays to improve speed by avoiding creating and destroying new arrays
Returns:

Log-likelihood of model

Return type:

float

Examples

from ivy.examples import primates,primate_data Q = np.array([[-0.11,0.1 ,0.01 ,0 ],

[0.1 ,-0.11,0.0 ,0.01 ], [0.01 ,0.0 ,-0.51,0.5 ], [0.0 ,0.01 ,0.5 ,-0.51]])

hrm_mk(primates,primate_data,Q,nregime=2)

ivy.chars.hrm.make_identity_regime(regimePair)[source]

Given one regime pair, reduce it to its identity (eg (2,2),(3,3) becomes (1,1),(2,2)

ivy.chars.hrm.make_regime_type_combos(nregime, nparams)[source]

Create regime combinations for a binary character

ivy.chars.hrm.optimize_regime(comb, tree=None, chars=None, nregime=None, nparams=None, pi=None, br_variable=None, out_file=None, ar=None)[source]

Top-level function for optimizing hrm distinct-regime model in parallel.

ivy.chars.hrm.pairwise_merge(tree, chars, Q, nregime, pi=u'Equal')[source]

Given an MLE Q, merge similar parameters pairwise to find more parsimonious models

Parameters:
  • tree (Node) – Root node of a tree. All branch lengths must be greater than 0 (except root)
  • chars (dict) –

    Dict mapping character states to tip labels. Character states should be coded 0,1,2...

    Can also be a list with tip states in preorder sequence

  • Q (np.array) – Instantaneous rate matrix
  • nregime (int) – Number of regimes in the model
  • pi (str or np.array) –

    Option to weight the root node by given values. Either a string containing the method or an array of weights. Weights should be given in order. Accepted methods of weighting root:

    Equal: flat prior Equilibrium: Prior equal to stationary distribution
    of Q matrix
    Fitzjohn: Root states weighted by how well they
    explain the data at the tips.
ivy.chars.hrm.remove_redundant_regimes(regimePairs)[source]

Remove redundant regime pairs (eg (1,1),(2,2) and (2,2),(3,3) are redundant and only one should be kept)