The research activities of the Bioinformatics Laboratories of the DISCo are centered around a number of projects, loosely grouped around the themes of Bioinformatics, Systems biology, Sequence analysis, Natural computing, and Experimental Algorithmics.
Systems Biology and Data Analysis
One of the most interesting problems in cancer research concerns the evolution of a tumor from its early stages. Issues of heterogeneity and timing enter the picture and make the reconstruction of cancer progression models a complex endeavor, one that must also take into account the differences between individual tumors and large ensemble data sets. In recent years the BIMIB group has developed a number of tools and algorithms that address these issues. First, there are tools and algorithms that use Suppes' Probabilistic Causality Networks to infer a cancer progression model; these algorithms are collected in the TRONCO library. Second, there are studies on the simulation of cancer evolution in relation to its clonal makeup, with CABERNET being the most recently published tool.
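As a rough illustration of the idea behind Suppes' probabilistic causation, the sketch below checks the two conditions usually associated with it on a binary sample-by-event table: temporal priority (approximated here by the cause being more frequent than the effect, an assumption commonly made for cross-sectional data) and probability raising, P(effect | cause) > P(effect | not cause). The function name and the data layout are hypothetical; this is not the TRONCO interface, only a minimal sketch of the underlying test.

```python
def is_prima_facie_cause(samples, cause, effect):
    """Check Suppes' two conditions for `cause` -> `effect`.

    `samples` is a list of rows; each row is a list of 0/1 flags, one per
    event (e.g. a mutation observed in a tumor sample). `cause` and `effect`
    are column indices. Layout and names are illustrative assumptions.
    """
    n = len(samples)
    n_cause = sum(row[cause] for row in samples)
    n_effect = sum(row[effect] for row in samples)
    n_both = sum(row[cause] and row[effect] for row in samples)

    # Conditional probabilities are undefined if the cause is never
    # or always observed, so no claim is made in those cases.
    if n_cause == 0 or n_cause == n:
        return False

    # Temporal priority, approximated by marginal frequencies.
    temporal_priority = n_cause / n > n_effect / n

    # Probability raising: P(effect | cause) > P(effect | not cause).
    p_effect_given_cause = n_both / n_cause
    p_effect_given_not_cause = (n_effect - n_both) / (n - n_cause)
    probability_raising = p_effect_given_cause > p_effect_given_not_cause

    return temporal_priority and probability_raising


# Toy data: event 0 appears in three samples, event 1 only alongside event 0.
toy_samples = [[1, 1], [1, 1], [1, 0], [0, 0]]
forward = is_prima_facie_cause(toy_samples, cause=0, effect=1)
backward = is_prima_facie_cause(toy_samples, cause=1, effect=0)
```

On the toy data, event 0 qualifies as a candidate cause of event 1 but not the other way round; real progression inference adds statistical testing and model selection on top of this pairwise check.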
The analysis of biological systems relies more and more on computational and mathematical methods. The goals of such analysis are multifarious; among the most important ones is the discovery of the biochemical and genetic machinery responsible for pathology development, its control and, possibly, elimination. Such discoveries also rely on an understanding of the spatio-temporal development of biological phenomena, their cause (often "mutations") and their effects on different scales.
The RetroNet project intends to address this problem and others by i) sharing the data and knowledge needed for a new integrative research approach in medicine, ii) sharing or jointly developing multiscale models, simulators and analysis tools, with particular attention to the development of colorectal cancer (CRC) and some of its metastatic effects, and iii) creating the prototype of a collaborative environment supporting research in this highly interdisciplinary field, leveraging the experience gained in previous FP6 projects.
The RetroNet project concentrates on the development and tuning of algorithms for detecting emerging behavior in cell ensembles, by searching for, analysing and formulating hypotheses about various feedback cycles in biological systems. The approach will leverage several control-theoretic concepts, especially the notions of state estimation and control-policy learning as implicit drivers of biological behavior selection. The emerging-behavior detection algorithms will consider the content of pathway and model databases, as well as knowledge gained directly from clinicians and biologists running bio-banks or wet-laboratory research.
The amount of biomedical information that can be accessed through the Internet has reached a level no one could have dreamt of just ten years ago. The success of the genome sequencing projects has created an enormous amount of data that cannot be analysed manually. Since disease phenotypes arise from complex interactions between genetic factors and the environment, the value of high-throughput genomic research would be dramatically enhanced by associations with key patient data. These data are generally available, but they come from disparate sources and vary in quality. A data management system that integrates genomic databanks, clinical databases, and data mining tools into a common resource accessible to health care professionals would be extremely advantageous.
Ischemic stroke is a major health problem in developed countries. It is a complex, multigenic disorder: there are several subtypes and risk factors, and most cases show non-Mendelian inheritance. The integration and analysis of a large number of well-defined clinical, radiological and molecular data will improve the evidence on the different roles played by genetic and environmental risk factors in stroke pathophysiology.
NEUROWEB was funded by the European Commission in the FP6 program. The project identifier was IST-2006-518513.
The Biological processes Redescriptions by ONtological Expressions (BRONTE) project addresses the needs of biologists by leveraging recent developments in the basic science and technology of biological research, computer science, knowledge-based systems and statistical data-mining techniques. In particular, BRONTE addresses the problem of sifting through large microarray datasets by combining statistical-numerical analyses with terminological descriptions constructed from a wealth of newly developed knowledge bases, ontologies and controlled vocabularies.
Bioinformatics and Sequence Analysis
Alternative splicing (AS) is currently considered one of the main mechanisms able to explain the huge gap between the number of predicted genes and the high complexity of the human proteome. The main goal of this project is the development of fast and reliable computational tools for analyzing and predicting AS from ESTs and genomic data.
Phylogenetic Reconstruction and Comparison
Our research on this basic topic of Computational Biology mainly concerns the computational complexity and algorithmic solution of optimization problems derived from specific instances of the more general problem of comparing phylogenies (or evolutionary networks) in order to combine them into a single representation (i.e. an evolutionary tree or network). We address computational problems derived from consensus tree methods, such as the maximum agreement subtree (MAST) problem and the maximum isomorphic subtree (MIT) problem. A basic problem we investigate in comparative phylogenetics is the reconciliation (or inference) of a species tree from gene trees.
Algorithms for Haplotype Inference (HI) and Genetic Variation Analysis
Our research in this field is mainly focused on the design and experimentation of algorithms for solving combinatorial problems related to haplotype inference and genetic variation analysis.
Specific computational problems of interest are: (1) inferring the complete information on haplotypes from (incomplete or partial) haplotypes or genotypes assuming the Coalescent model, and (2) efficient reconstruction of the perfect phylogeny describing the evolutionary history of SNP (single nucleotide polymorphism) data in the presence of recurrent mutations.
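To make the perfect phylogeny setting concrete, the sketch below applies the classical four-gamete test to a binary haplotype matrix: when no mutation recurs, a perfect phylogeny exists exactly when no pair of SNP sites exhibits all four combinations 00, 01, 10 and 11. Site pairs that fail the test are the places where recurrent mutations (or other events) would have to be allowed. The function name and the input layout are assumptions made for illustration, not part of any tool described here.

```python
from itertools import combinations


def conflicting_site_pairs(haplotypes):
    """Return the pairs of SNP sites that fail the four-gamete test.

    `haplotypes` is a list of equal-length rows of 0/1 values, one row per
    haplotype and one column per SNP site (an assumed, illustrative layout).
    """
    if not haplotypes:
        return []
    n_sites = len(haplotypes[0])
    conflicts = []
    for i, j in combinations(range(n_sites), 2):
        # Collect the distinct (allele at i, allele at j) combinations.
        gametes = {(row[i], row[j]) for row in haplotypes}
        # With binary alleles, four distinct pairs means 00, 01, 10, 11.
        if len(gametes) == 4:
            conflicts.append((i, j))
    return conflicts


# Sites 0 and 1 are compatible; site 2 conflicts with site 0.
example = [
    [0, 0, 0],
    [1, 0, 0],
    [1, 1, 1],
    [0, 0, 1],
]
example_conflicts = conflicting_site_pairs(example)
```

An empty result means the data admit a perfect phylogeny without recurrent mutations; a non-empty result only locates the conflicts and does not by itself say which site mutated more than once.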
Sequence Analysis and Comparison
The main goal of this project is the development of algorithms for sequence analysis based on novel alignment methodologies, and for sequence comparison based on consensus sequence methods, with applications in several fields of genome sequence comparison (genome sequence rearrangement, multiple sequence comparison). Our investigation in this area has concerned the design of approximation and heuristic algorithms for the longest common subsequence (LCS) and shortest common supersequence (SCS) problems, as well as for the Exemplar Longest Common Subsequence problem.
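For readers unfamiliar with the LCS problem, the following is a minimal sketch of the textbook dynamic-programming solution for two sequences. It is the standard exact algorithm for the pairwise case, not one of the approximation or heuristic algorithms mentioned above, which target harder variants such as many input sequences or exemplar constraints. For two sequences, the length of a shortest common supersequence follows directly as len(a) + len(b) minus the LCS length.

```python
def longest_common_subsequence(a, b):
    """Return one longest common subsequence of strings `a` and `b`."""
    m, n = len(a), len(b)
    # table[i][j] = LCS length of the prefixes a[:i] and b[:j].
    table = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if a[i - 1] == b[j - 1]:
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                table[i][j] = max(table[i - 1][j], table[i][j - 1])

    # Walk back from the bottom-right corner to recover one solution.
    result = []
    i, j = m, n
    while i > 0 and j > 0:
        if a[i - 1] == b[j - 1]:
            result.append(a[i - 1])
            i -= 1
            j -= 1
        elif table[i - 1][j] >= table[i][j - 1]:
            i -= 1
        else:
            j -= 1
    return "".join(reversed(result))


def shortest_common_supersequence_length(a, b):
    """Length of a shortest common supersequence of two strings."""
    return len(a) + len(b) - len(longest_common_subsequence(a, b))


lcs_example = longest_common_subsequence("AGGTAB", "GXTXAYB")
scs_len_example = shortest_common_supersequence_length("AGGTAB", "GXTXAYB")
```

The table takes time and space proportional to the product of the two lengths, which is acceptable for short sequences but motivates heuristics once many or very long sequences are involved.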
Splicing systems and regular languages
In our research, we focus on the original concept of finite splicing system that is closest to the real biological process: the splicing operation is meant to act by a finite set of rules (modeling enzymes) on a finite set of initial strings (modeling DNA sequences). Under this formal model, a splicing system is a generative mechanism of languages which turn out to be regular languages. A main goal of our research in this field is providing a characterization of the computational power of finite splicing systems and algorithmic procedures to decide regular splicing languages and build systems (synthesis of splicing systems) for such languages.
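The splicing operation itself can be sketched in a few lines. The example below uses the four-part rule notation (u1, u2, u3, u4) commonly attributed to Păun, chosen here only because it is compact; it is an assumption for illustration and not necessarily the exact formalization used in this research. A rule cuts a string x between u1 and u2, cuts a string y between u3 and u4, and joins the left part of x to the right part of y. The `generate` helper, also hypothetical, applies the rules for a bounded number of rounds, so it explores only a finite portion of a language that may be infinite.

```python
def splice(x, y, rule):
    """Apply one splicing rule (u1, u2, u3, u4) to the ordered pair (x, y).

    For every occurrence of u1+u2 in x and of u3+u4 in y, produce the
    string formed by the part of x up to and including u1, followed by
    the part of y starting at u4.
    """
    u1, u2, u3, u4 = rule
    left_site, right_site = u1 + u2, u3 + u4
    results = set()
    i = x.find(left_site)
    while i != -1:
        prefix = x[: i + len(u1)]
        j = y.find(right_site)
        while j != -1:
            suffix = y[j + len(u3):]
            results.add(prefix + suffix)
            j = y.find(right_site, j + 1)
        i = x.find(left_site, i + 1)
    return results


def generate(initial, rules, rounds):
    """Close `initial` under `rules` for at most `rounds` iterations."""
    language = set(initial)
    for _ in range(rounds):
        new = set()
        for x in language:
            for y in language:
                for rule in rules:
                    new |= splice(x, y, rule)
        if new <= language:  # nothing new: a fixed point was reached
            break
        language |= new
    return language


# A single rule that cuts between "c" and "g" in both strings.
rule = ("c", "g", "c", "g")
one_step = splice("aacgtt", "ggcgcc", rule)
closure = generate({"aacgtt", "ggcgcc"}, [rule], rounds=3)
```

In this toy system the closure stabilizes after one round at four strings, a finite and therefore regular language.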
The main activity carried out in this lab is the design of efficient algorithms for solving a number of combinatorial problems. Both theoretical and practical aspects are studied, as efficiency is sought at both the algorithmic and the implementation level. Consequently, the design and implementation of efficient data structures is of fundamental relevance. Ongoing research is focused on the design of approximation and exact algorithms, as well as on the analysis of algorithms, in both the average case and the worst case. The techniques employed are mathematical and combinatorial when the emphasis is on the theoretical side, while an experimental study is preferred when real-world behavior is analyzed.
More information can be found on the main page of the ALGO-lab.