INESC-ID   Instituto de Engenharia de Sistemas e Computadores Investigação e Desenvolvimento em Lisboa
technology from seed


Knowledge Discovery and Bioinformatics
Inesc-ID Lisboa


Online Bayesian Time-varying Parameter Estimation of HIV-1 data

01/27/2012 - 14:30
01/27/2012 - 15:30

The importance of a systems-theory-based approach to understanding immunological diseases, in particular HIV-1 infection, is increasingly recognized. The dynamics of viral infection may be effectively represented by compact state-space models in the form of nonlinear ordinary differential equations (ODEs).
Nonlinear Bayesian filtering offers various online tools for system identification of parametric ordinary differential equation models. Since parameters may change over time, a relevant question is how well time-varying parameters can be estimated from data.
For this purpose, two filtering methods, the Extended Kalman Filter and the Particle Filter, were applied to state and time-varying parameter estimation. After evaluating the methods on simulated time series, we applied them to long-term clinical datasets. The time-varying parameters estimated from clinical data are consistent with results previously reported for offline algorithms.
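
As an illustration of online estimation of a time-varying parameter, the following is a minimal sketch of a bootstrap particle filter. It is not the model or the code used in the talk: it assumes a toy one-compartment decay model dV/dt = -c V observed through log V, and lets the clearance rate c drift as a random walk. The function name `track_decay_rate` and all default values are hypothetical.

```python
import math
import random


def track_decay_rate(observations, dt, n_particles=500, seed=0,
                     c_walk_sd=0.02, obs_sd=0.1):
    """Bootstrap particle filter for the toy model dV/dt = -c(t) * V.

    observations: noisy measurements of log V taken every `dt` time units.
    Returns one estimate of c per observation after the first.
    Assumption (not from the talk): c follows a Gaussian random walk.
    """
    rng = random.Random(seed)
    # Each particle holds a guess of the state (log V) and of the parameter c.
    particles = [(observations[0] + rng.gauss(0.0, obs_sd), rng.uniform(0.0, 2.0))
                 for _ in range(n_particles)]
    estimates = []
    for y in observations[1:]:
        # Prediction: perturb c (kept non-negative), then propagate log V exactly.
        proposed = []
        for log_v, c in particles:
            c_new = max(0.0, c + rng.gauss(0.0, c_walk_sd))
            proposed.append((log_v - c_new * dt, c_new))
        # Update: weight each particle by the Gaussian likelihood of y.
        weights = [math.exp(-0.5 * ((y - log_v) / obs_sd) ** 2)
                   for log_v, _ in proposed]
        total = sum(weights)
        if total == 0.0:
            # All particles are far from the data: fall back to uniform weights.
            weights = [1.0] * n_particles
            total = float(n_particles)
        # Posterior-mean estimate of c at this time step.
        estimates.append(sum(w * c for w, (_, c) in zip(weights, proposed)) / total)
        # Resampling: draw a new particle set in proportion to the weights.
        particles = rng.choices(proposed, weights=weights, k=n_particles)
    return estimates
```

Calling `track_decay_rate(log_viral_loads, dt=0.5)` on a series of log measurements returns a trajectory of estimates for c; a larger `c_walk_sd` lets the filter follow faster parameter changes at the cost of noisier estimates.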

Tools and medical applications of an evolutionary cell biology

12/06/2011 - 12:30
12/06/2011 - 13:30

In the Computational Genomics Lab we combine the study of evolutionary cell biology with translational, or medical, bioinformatics. We study evolutionary cell biology, i.e. the evolutionary mechanisms underlying the origins and evolution of cellular life and the complex structures within the cell. We are also very interested in the medical, or translational, applications of bioinformatics and evolutionary genomics, and are conducting collaborative projects on pathogenic bacteria, protozoa and several types of human cancers. We are a small research group that includes biologists, computer scientists, chemists and clinicians.

G-Tries: an efficient data-structure for counting subgraphs

12/02/2011 - 14:30
12/02/2011 - 15:30

Complex networks are ubiquitous in real-world systems. In order to understand their design principles, the concept of network motifs emerged. These are recurrent overrepresented patterns of interconnections that can be seen as building blocks of networks. Algorithmically, discovering these motifs is a hard problem, which limits their practical applicability.

I will give an overview of the state of the art in algorithms for finding these patterns. I will then present a novel data structure, g-tries, designed to efficiently represent a collection of graphs and to search for them as induced subgraphs of another, larger graph. I will explain how it takes advantage of common substructure and how symmetry-breaking conditions can be used to avoid redundant computations. I will also briefly introduce a sampling methodology capable of trading accuracy for even better execution times, and give some notes on the scalability of the methods, showing that they are suitable for parallel computation.

Finally, I will show an extensive empirical evaluation of the developed algorithms on a set of diversified complex networks, showing that g-tries can outperform all previously existing competitor algorithms.
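
The idea of symmetry breaking can be illustrated on the smallest interesting pattern, the triangle. The sketch below is not the g-trie data structure itself; it only assumes an undirected graph stored as a dictionary of neighbour sets and compares a naive enumeration with one that imposes the ordering condition a < b < c, so that each triangle is reported once instead of once per automorphism. The function names are hypothetical.

```python
def count_triangle_matches_naive(adj):
    """Count ordered triples (a, b, c) that form a triangle.

    Every triangle is found 6 times, once per ordering of its vertices.
    """
    matches = 0
    for a in adj:
        for b in adj[a]:
            for c in adj[b]:
                if c != a and c in adj[a]:
                    matches += 1
    return matches


def count_triangles(adj):
    """Count triangles once each by enforcing the condition a < b < c."""
    triangles = 0
    for a in adj:
        for b in adj[a]:
            if b <= a:
                continue  # symmetry breaking: only extend to larger vertices
            for c in adj[b]:
                if c > b and c in adj[a]:
                    triangles += 1
    return triangles


# Complete graph on 4 vertices: every 3-subset of vertices is a triangle.
K4 = {v: {u for u in range(4) if u != v} for v in range(4)}
```

On `K4` the naive search performs six times as many matches as there are triangles, which is the redundancy that symmetry-breaking conditions remove.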

Advances in understanding the epidemiology of HIV over the past decade

11/28/2011 - 11:00
11/28/2011 - 12:30

***Short biography***
Sir Roy is Professor of Infectious Disease Epidemiology in the School of Public Health, Faculty of Medicine, Imperial College London. His recent appointments include Rector of Imperial College London and Chief Scientist at the Ministry of Defence, UK. Sir Roy has also served as Director of the Wellcome Centre for Parasite Infections from 1989 to 1993 (at Imperial College London) and as Director of the Wellcome Centre for the Epidemiology of Infectious Disease from 1993 to 2000 (at the University of Oxford). He is the author of over 450 scientific articles and has sat on numerous government and international agency committees advising on public health and disease control, including the World Health Organisation and UNAIDS. From 1991 to 2000 he was a Governor of the Wellcome Trust. He is currently a Trustee of the Natural History Museum, London, a Governor of the Institute of Government London, a Member of the Singapore National Research Foundation, a Member of the International Advisory Committee of Thailand National Science and Technology Development Agency, and a Member of the Bill and Melinda Gates Grand Challenges advisory board. He is a non-executive director of GlaxoSmithKline and a member of the International Advisory Board of Hakluyt and Company Ltd. Sir Roy was elected Fellow of the Royal Society in 1986, a Founding Fellow of the Academy of Medical Sciences in 1998, a Foreign Associate Member of the Institute of Medicine at the US National Academy of Sciences in 1999 and a Foreign Member of the French Academy of Sciences in 2009. He was knighted in the 2006 Queen's Birthday Honours.

A Relational Data Mining perspective for Bioinformatics applications

10/31/2011 - 12:30
10/31/2011 - 13:45


Room C01 of IST (cave do pavilhão central)

The talk will have two parts. In the first part I will give a broad overview of the area of Inductive Logic Programming (ILP) as a promising approach to Relational Data Mining. The advantages of such an approach, together with its main applications, will then be presented. In the second part I will focus on: i) a technique for conceptual clustering in Relational Data Mining that I have been working on recently; ii) the application of ILP to rational drug design; iii) work on using ILP for protein folding.

"CAMP - Computational Analysis of MicroRNAs in Plants" & "NetDyn: Understanding real large networks, from structure to dynamics"

09/23/2011 - 14:30
09/23/2011 - 15:30


Room 336 (INESC-ID)

Paulo Fonseca and Alexandre Francisco are researchers at INESC-ID. They are the Principal Investigators of the two projects listed below, approved in the last FCT call. This Friday they will talk informally about them.

Title: CAMP - Computational Analysis of MicroRNAs in Plants, PTDC/EIA-EIA/122534/2010
Speaker: Paulo Fonseca

Title: NetDyn: Understanding real large networks, from structure to dynamics, PTDC/EIA-CCO/118533/2010
Speaker: Alexandre Francisco

Towards a mathematical model of risk assessment of biocide induced antibiotic resistance

07/08/2011 - 15:00
07/08/2011 - 16:30

Biocides have been widely used for several decades to preserve materials including food and cosmetics, to decontaminate surfaces, to disinfect instruments, to treat fabrics and even toys, for personal hygiene, and to prevent the transmission of infections. Nevertheless, when used in large volumes or at high concentrations, biocides have toxic effects, and excessive use is dangerous for the environment, including animals and humans. Despite this widespread and ever-increasing use of biocides, most bacterial and fungal species remain susceptible; however, decreased susceptibility has been reported and occasionally linked to antibiotic resistance, mainly in human and veterinary pathogens.

The development of resistance, together with the possibility of preventing it, has been carefully considered by the EC in the Biocides Directive 98/8/EC, which provides for a high level of protection for the environment and humans and harmonizes the rules for placing active substances and biocidal products on the market within the European Union.

This work is developed in the context of the European project BIOHYPO (Proposal No 227258 of the Programme "FP7 Cooperation Work Programme: Food, Agriculture and Fisheries, and Biotechnologies") (Dr. Marco Oggioni, PI). The main goal is to evaluate the risk of a clinically significant increase or spread of antibiotic resistance in food pathogens due to biocide use. Statistical analyses are performed on a large data set of Staphylococcus aureus in order to gain insight into the real clinical relevance of any antibiotic/biocide co- and cross-resistance.

Research challenges from Free Software Distributions

07/06/2011 - 09:00
07/06/2011 - 10:00

Free Software distributions, like Debian, RedHat, or Ubuntu, are some of the largest component-based software systems, and they all use packages as their building blocks, together with tools for selecting, installing and removing packages on a running system. Evolving such complex software systems is a daunting task that carries significant challenges. In this talk, after providing a simple formalisation of packages and distributions, we will survey some recent results and algorithms developed to answer questions like "which is the most important package among the 27000 ones in Debian squeeze?" or "what version change is most likely to have an impact on the system?"
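
To make the notion of a formalisation of packages concrete, here is a minimal sketch. It is not the formalisation presented in the talk: it assumes that a package is just a name with a set of dependencies and a set of conflicts (no versions, no alternative dependencies), and it takes the number of packages that transitively depend on a package as a rough proxy for its importance. The names `Package`, `is_healthy`, `impact_set`, `most_important` and the toy repository `REPO` are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Package:
    name: str
    depends: frozenset = frozenset()    # names of required packages
    conflicts: frozenset = frozenset()  # names of incompatible packages


def is_healthy(installation, repository):
    """True if all dependencies are installed and no installed packages conflict."""
    for name in installation:
        pkg = repository[name]
        if not pkg.depends <= installation:
            return False
        if pkg.conflicts & installation:
            return False
    return True


def impact_set(name, repository):
    """Names of the packages that depend on `name`, directly or transitively."""
    impacted = set()
    frontier = [name]
    while frontier:
        current = frontier.pop()
        for pkg in repository.values():
            if (current in pkg.depends and pkg.name != name
                    and pkg.name not in impacted):
                impacted.add(pkg.name)
                frontier.append(pkg.name)
    return impacted


def most_important(repository):
    """Package with the largest impact set (an assumed importance measure)."""
    return max(repository, key=lambda n: len(impact_set(n, repository)))


# Toy repository, invented for illustration only.
REPO = {p.name: p for p in [
    Package("libc"),
    Package("bash", frozenset({"libc"})),
    Package("git", frozenset({"libc", "bash"})),
    Package("mta-a", frozenset({"libc"}), frozenset({"mta-b"})),
    Package("mta-b", frozenset({"libc"}), frozenset({"mta-a"})),
]}
```

In this toy repository, `is_healthy({"libc", "bash"}, REPO)` holds, while installing both `mta-a` and `mta-b` does not yield a healthy installation because they conflict.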

Local identifiability of a HIV-1 infection model using a sensitivity approach

07/01/2011 - 14:00
07/01/2011 - 15:00

The dynamic modeling of Human Immunodeficiency Virus 1 (HIV-1) infection is still one of the great challenges in systems biology. The high prevalence of Acquired Immune Deficiency Syndrome (AIDS), known to be caused by HIV, and the fact that no cure has yet been discovered make this area of study highly relevant. In this paper, a dynamic model of HIV-1 infection is analyzed. The sensitivity and identifiability issues are addressed with the purpose of optimizing the time points at which patients' blood samples should be drawn. This paper shows that some time periods are far more informative than others, and that sampling in them improves parameter identifiability and estimability in the reverse engineering step.
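
A sensitivity analysis of this kind can be sketched as follows. The abstract does not specify the model, so the code assumes the widely used three-state target-cell model (healthy cells T, infected cells I, free virus V) with illustrative parameter values, and approximates the normalized sensitivity of the viral load to each parameter by central finite differences. All names and numbers are assumptions made for the example, not values from the paper.

```python
def hiv_rhs(state, p):
    """Right-hand side of the assumed model:
    dT/dt = s - d*T - beta*T*V,  dI/dt = beta*T*V - mu*I,  dV/dt = k*I - c*V.
    """
    T, I, V = state
    infection = p["beta"] * T * V
    return (p["s"] - p["d"] * T - infection,
            infection - p["mu"] * I,
            p["k"] * I - p["c"] * V)


def simulate_viral_load(p, state0, dt, n_steps):
    """Integrate with classical RK4; return V at every step (initial value included)."""
    state = tuple(state0)
    viral_load = [state[2]]
    for _ in range(n_steps):
        k1 = hiv_rhs(state, p)
        k2 = hiv_rhs(tuple(x + 0.5 * dt * k for x, k in zip(state, k1)), p)
        k3 = hiv_rhs(tuple(x + 0.5 * dt * k for x, k in zip(state, k2)), p)
        k4 = hiv_rhs(tuple(x + dt * k for x, k in zip(state, k3)), p)
        state = tuple(x + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
                      for x, a, b, c, d in zip(state, k1, k2, k3, k4))
        viral_load.append(state[2])
    return viral_load


def relative_sensitivity(p, name, state0, dt, n_steps, rel_step=1e-4):
    """Approximate (dV/dtheta) * (theta / V) over time by central differences."""
    up = dict(p, **{name: p[name] * (1.0 + rel_step)})
    down = dict(p, **{name: p[name] * (1.0 - rel_step)})
    v_up = simulate_viral_load(up, state0, dt, n_steps)
    v_down = simulate_viral_load(down, state0, dt, n_steps)
    v_nom = simulate_viral_load(p, state0, dt, n_steps)
    return [(u - d) / (2.0 * rel_step * v) for u, d, v in zip(v_up, v_down, v_nom)]


# Illustrative values only (per-day rates), not taken from the paper.
PARAMS = {"s": 10.0, "d": 0.02, "beta": 2.4e-5, "mu": 0.24, "k": 100.0, "c": 2.4}
STATE0 = (500.0, 0.0, 1.0)  # T, I, V at time zero
```

Time points at which the absolute sensitivity to a parameter is large are the ones where a measurement carries the most information about that parameter, so a curve such as `relative_sensitivity(PARAMS, "c", STATE0, 0.01, 500)` gives a first indication of when to sample.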

A mixture-of-experts approach to biclustering

06/17/2011 - 14:00
06/17/2011 - 15:00

Biclustering is the unsupervised learning task of mining a data matrix for submatrices, known as biclusters, with desirable properties. For instance, the goal can be to find groups of genes that are co-expressed under particular biological conditions. Many biclustering methods do not allow biclusters to overlap; others do, but need to specify how the biclusters interact at the overlapping regions. It is therefore of interest to devise methods that allow flexible, overlapping bicluster structures while not forcing the practitioner to specify bicluster interaction models. We propose a mixture modelling framework allowing biclusters to overlap but not requiring the practitioner to postulate any parameter interaction models between biclusters. Sharing a similar intuition to mixture-of-experts models, our model allows biclusters to specify partly overlapping regions of expertise in which the biclusters are able to model the data adequately. The uncertainty over assignments of data points to biclusters depends on the membership of data points to these regions of expertise. We perform inference and parameter estimation via a variational expectation-maximization framework. The model is easily adaptable to different data types and compares favorably to other approaches, both in a binary DNA copy number variation data set and in a miRNA expression data set.