2008 IGERT Project Meeting

Abstract

Abstract Title:
Multi-species Integrative Biclustering with cMonkey

Graduate Student Presenter: Peter Waltman
Name of the Author(s) and Affiliation(s): Peter Waltman, NYU COB program; Thadeus Kacmarczyk, NYU CGSB

Organisms respond to their changing environment via complex regulatory interaction networks that relay information to target genes and cellular processes. Extensive data from high-throughput experimental technologies challenge systems biologists to create new methods to clarify these inherently complex regulatory networks.

Integrative biclustering: Complete and accurate models of complex biological systems benefit from the integration of multiple forms of evidence derived from measuring these systems at different information levels (e.g., interaction networks, sequence motifs, and protein and RNA expression). Biclustering, the simultaneous clustering of both genes and experiments, has emerged as an effective approach for the analysis of multiple systems biology data types. In recent work, we introduced cMonkey, an algorithm that integrates diverse systems biology data types to form optimal biclusters (Bonneau et al. Cell 2007).
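
To make the notion of a bicluster concrete, the following minimal sketch scores a subset of genes (rows) and experiments (columns) by how coherently they vary together. It uses the mean squared residue of Cheng and Church as a stand-in score; this is an assumption made purely for illustration and is not the integrated scoring that cMonkey uses. The function name and the toy expression values are hypothetical.

```python
def mean_squared_residue(matrix, rows, cols):
    """Cheng-Church mean squared residue of the submatrix rows x cols.

    Lower values mean the selected genes vary more coherently across the
    selected experiments. Illustrative score only, not cMonkey's.
    """
    sub = [[matrix[i][j] for j in cols] for i in rows]
    n_rows, n_cols = len(sub), len(sub[0])
    row_means = [sum(r) / n_cols for r in sub]
    col_means = [sum(sub[i][j] for i in range(n_rows)) / n_rows
                 for j in range(n_cols)]
    overall = sum(row_means) / n_rows
    total = 0.0
    for i in range(n_rows):
        for j in range(n_cols):
            residue = sub[i][j] - row_means[i] - col_means[j] + overall
            total += residue ** 2
    return total / (n_rows * n_cols)


# Toy expression matrix: rows are genes, columns are experiments.
# All values are made up for illustration.
expression = [
    [1.0, 2.0, 9.0, 3.0],
    [2.0, 3.0, 1.0, 4.0],
    [5.0, 6.0, 7.0, 7.0],
    [4.0, 0.0, 3.0, 8.0],
]

# Genes 0-2 follow the same trend across experiments 0, 1 and 3.
coherent = mean_squared_residue(expression, rows=[0, 1, 2], cols=[0, 1, 3])
# The full matrix mixes unrelated patterns and scores worse.
whole = mean_squared_residue(expression, rows=[0, 1, 2, 3], cols=[0, 1, 2, 3])
```

Selecting a subset of genes together with a subset of experiments, rather than clustering either dimension alone, is what lets the coherent block stand out from the rest of the matrix.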

Comparative Integrative Biclustering: We have extended the original cMonkey algorithm to simultaneously bicluster the genomes of multiple species. This method provides a framework in which insights from a well-studied organism aid the analysis of related but less-studied organisms. By leveraging the power of comparative analysis, we identify both conserved modules of orthologous genes and modules that have diverged, yielding evolutionary insights into the formation and conservation of regulatory modules. We present results from the integrative biclustering of ~7 prokaryotic species and discuss approaches for validating this method.
