Figure 1 – MCOA mapping between ontology, ontology extension and Markov chain. (A) Simple ontology and extension. (B) Markov chain representing the simple ontology and extension according to the MCOA method. (C) Adjusted transition probability matrix for the Markov chain according to the MCOA method. (D) Information rank values generated from the adjusted transition probability matrix using α = 0.15 and ω = 0.01.

Markov Chain Ontology Analysis (MCOA) supports the analysis of hierarchical models relative to collections of domain data. This methodology was developed to enable the analysis of more complex ontological structures, against a wider range of data distributions, than has been possible using existing methods. MCOA specifically targets the challenges of class overlaps, continuously valued data, inter-instance relationships, non-hierarchical relationships between classes, semantic distance and sparse data.

MCOA models the classes in one or more ontologies, the instances from one or more datasets, and all directional inter-class, class-to-instance and inter-instance relationships as a single finite ergodic Markov chain. The adjusted transition probability matrix for this Markov chain enables the calculation of eigenvector-based metrics for each ontology class and dataset instance. Depending on the direction and weight of the state transitions, these metrics can be used to quantify the importance of classes relative to the dataset, the importance of instances relative to classes, or the importance of one set of classes relative to another set of classes based on annotations of a common dataset.
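To make the idea of an eigenvector-based metric concrete, here is a minimal Python sketch. It is not the MCOA implementation: it assumes a PageRank-style adjustment in which each row of a weighted transition matrix is normalized and then mixed with a uniform jump of probability `alpha`, which guarantees an ergodic chain. The role of ω from Figure 1 is not modeled, and the four states and their edges are hypothetical.

```python
import numpy as np

def adjusted_transition_matrix(weights, alpha=0.15):
    """Row-normalize edge weights, then mix in a uniform jump.

    Assumption: a PageRank-style adjustment; the adjustment used by
    MCOA itself may differ.
    """
    w = np.asarray(weights, dtype=float)
    n = w.shape[0]
    row_sums = w.sum(axis=1, keepdims=True)
    # States without outgoing transitions move uniformly to any state.
    safe_sums = np.where(row_sums > 0, row_sums, 1.0)
    p = np.where(row_sums > 0, w / safe_sums, 1.0 / n)
    return (1.0 - alpha) * p + alpha / n

def stationary_distribution(p, tol=1e-12, max_iter=10_000):
    """Power iteration for the left eigenvector of p with eigenvalue 1."""
    pi = np.full(p.shape[0], 1.0 / p.shape[0])
    for _ in range(max_iter):
        nxt = pi @ p
        if np.abs(nxt - pi).sum() < tol:
            return nxt
        pi = nxt
    return pi

# Hypothetical chain: two instances point to class B, and B points to
# its parent class A. Entry [i][j] is the weight of transition i -> j.
states = ["class_A", "class_B", "instance_1", "instance_2"]
W = [
    [0, 0, 0, 0],  # class_A has no outgoing edges
    [1, 0, 0, 0],  # class_B -> class_A
    [0, 1, 0, 0],  # instance_1 -> class_B
    [0, 1, 0, 0],  # instance_2 -> class_B
]

P = adjusted_transition_matrix(W, alpha=0.15)
rank = stationary_distribution(P)
for name, value in zip(states, rank):
    print(f"{name}: {value:.4f}")
```

In this toy chain, probability mass flows from instances up to classes, so the classes receive higher stationary values than the instances. Reversing the edge directions would rank instances relative to classes instead.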

We have developed an MCOA enrichment analysis approach and implemented it as a plugin to the Ontologizer framework. For implementation details, see the MCOA paper referenced below.

To use:

- Launch the MCOA-customized version of the Ontologizer via Java Web Start: MCOA Ontologizer
- Load desired population and study datasets and select the "MCOA" method for analysis.
- Note: To associate weights with genes in a study set, specify each weight as a floating-point number following the gene name.

- Several MCOA parameters are exposed on the "MCOA" tab in the preferences window.
- For general instructions regarding the use of the Ontologizer, please see the Ontologizer documentation.
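As an illustration of the weighted study set format mentioned in the note above, the following Python sketch reads lines consisting of a gene name followed by a floating-point weight. Several details are assumptions rather than documented behavior: the separator is taken to be whitespace, genes without a weight receive a default of 1.0, and the function name `parse_weighted_study_set` and the sample values are hypothetical.

```python
def parse_weighted_study_set(text, default_weight=1.0):
    """Map each gene name to its weight.

    Assumptions: one gene per line, whitespace between the name and
    the weight, and default_weight for genes listed without a weight.
    """
    weights = {}
    for raw_line in text.splitlines():
        fields = raw_line.split()
        if not fields:
            continue  # skip blank lines
        gene = fields[0]
        weight = float(fields[1]) if len(fields) > 1 else default_weight
        weights[gene] = weight
    return weights

# Made-up study set for illustration only.
sample = """\
geneA 0.92
geneB 0.35

geneC
"""

study_weights = parse_weighted_study_set(sample)
print(study_weights)
```

Checking a study set this way before loading it can catch malformed weights early, since `float` raises a `ValueError` on anything that is not a number.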

- If at all possible, you should specify false positive and false negative values (via the MCOA preferences tab) that reflect your best understanding of the experimental conditions under which the study gene list was generated. Although the method can be set to automatically estimate these values from the data, this feature can result in suboptimal settings.
- It is recommended that you initially generate and interpret uncorrected p-values. However, depending on your approach or philosophy regarding multiple hypothesis correction, you will likely want to apply some form of correction to the p-values (for the counter-view, see Rothman, K. J. No adjustments are needed for multiple comparisons. Epidemiology 1, 43–46 (1990)). Please note that if you choose to apply one of the multiple hypothesis correction methods supported by the Ontologizer, the correction will be based on the number of GO terms that were found to maximize the likelihood function (and for which hypergeometric-based p-values were computed), NOT the total number of GO terms that were used to build the initial Markov chain. Starting from the uncorrected p-values, you can instead apply any correction method and parameters appropriate for your specific investigation.
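As one example of correcting exported, uncorrected p-values yourself, the sketch below applies the Benjamini-Hochberg procedure in plain Python. This is only one of many possible corrections and is not part of the MCOA plugin. The `n_tests` parameter is an assumption added here so that the correction can be based on a larger number of tests than the number of p-values reported, for example the total number of GO terms in the initial Markov chain; the p-values shown are made up.

```python
def benjamini_hochberg(p_values, n_tests=None):
    """Return Benjamini-Hochberg adjusted p-values in the input order.

    n_tests defaults to len(p_values). Passing a larger value treats
    the unreported tests as having p-values of 1.
    """
    m = len(p_values) if n_tests is None else n_tests
    if m < len(p_values):
        raise ValueError("n_tests must be at least len(p_values)")
    # Indices of the p-values, sorted from smallest to largest.
    order = sorted(range(len(p_values)), key=lambda i: p_values[i])
    adjusted = [0.0] * len(p_values)
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(len(p_values), 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Made-up uncorrected p-values for four GO terms.
uncorrected = [0.01, 0.04, 0.03, 0.005]

adjusted_reported = benjamini_hochberg(uncorrected)
adjusted_all_terms = benjamini_hochberg(uncorrected, n_tests=8)
print(adjusted_reported)
print(adjusted_all_terms)
```

Basing the correction on a larger `n_tests` yields adjusted values that are never smaller than those based on the reported terms alone, which reflects the more conservative choice described above.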