Knowledge discovery is the process of developing strategies to extract useful, and ideally all, previously unknown knowledge from historical or real-time data.
Data structures are called succinct when they take little space (usually of lower order) compared to the data they give access to. A more ambitious challenge is that of compressed data structures, which aim to operate within space proportional to that of the compressed data they give access to. Designing compressed data structures goes beyond compression in the sense that the data must remain manageable in compressed form, without first decompressing it. This trend has gained much attention in recent years.
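The classic building block behind such structures is a bit vector supporting rank queries in constant time with only lower-order extra space. The sketch below (our illustration, not taken from the abstract; `block` sampling is the standard textbook technique) stores sampled prefix counts so that only a short tail of bits is scanned per query:

```python
# Illustrative sketch of a succinct-structure primitive: a bit vector with
# sampled prefix counts (o(n) extra space) supporting rank queries.

class RankBitVector:
    """Stores n bits plus a running 1-count sampled every `block` bits."""

    def __init__(self, bits, block=64):
        self.bits = bits
        self.block = block
        self.samples = [0]          # samples[k] = ones in bits[0 : k*block]
        count = 0
        for i, b in enumerate(bits, 1):
            count += b
            if i % block == 0:
                self.samples.append(count)

    def rank1(self, i):
        """Number of 1-bits in bits[0:i]: one sample plus a short scan."""
        s = i // self.block
        return self.samples[s] + sum(self.bits[s * self.block:i])

bv = RankBitVector([1, 0, 1, 1, 0, 1, 0, 0] * 16)
print(bv.rank1(10))  # ones among the first 10 bits -> 5
```

Real succinct structures replace the scan with two-level sampling and table lookup to make rank O(1), but the space/functionality trade-off is already visible here.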
Data integration in life sciences presently faces a conundrum. On the one hand, the diversity of data is increasing as explosively as its volume; on the other, the value of individual data sets can only be appreciated when enough of those distinct pieces of the systemic puzzle are put together. Consequently, it is just as imperative to have agreed-upon standard formats as it is that they not be enforced so strictly as to become an obstacle to reporting the very novel data that brings value to systemic integration.
We describe the modelling of genetic regulatory networks by piecewise-affine dynamical systems in discrete time. We present the results of applying this model to simple circuits, namely positive and negative circuits with one and two genes.
Traditional multimedia indexing methods are based on hierarchical clustering of the data space, in which metric properties are used to build a tree that can then be used to prune branches while processing queries. However, the performance of these methods deteriorates rapidly as the dimensionality of the data space increases.
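The pruning principle can be sketched with a simple ball tree (our illustration, not any specific paper's index): each node covers its points with a ball, and the triangle inequality guarantees that a node whose ball lies farther than `eps` from the query can be skipped entirely. In high dimensions distances concentrate, balls overlap the query region almost always, and this pruning stops working, which is the degradation the abstract describes.

```python
import math

# Illustrative ball tree with triangle-inequality pruning.
# If dist(q, center) - radius > eps, no point in the node's ball
# can be within eps of q, so the whole branch is skipped.

def dist(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def build(points):
    if len(points) <= 4:
        return {"pts": points}                       # leaf: scan directly
    center = points[0]
    radius = max(dist(center, p) for p in points)    # ball covering all points
    pts = sorted(points, key=lambda p: dist(center, p))
    mid = len(pts) // 2
    return {"center": center, "radius": radius,
            "left": build(pts[:mid]), "right": build(pts[mid:])}

def range_search(node, q, eps, out):
    if "pts" in node:
        out.extend(p for p in node["pts"] if dist(q, p) <= eps)
        return
    if dist(q, node["center"]) - node["radius"] > eps:
        return                                       # prune: balls cannot intersect
    range_search(node["left"], q, eps, out)
    range_search(node["right"], q, eps, out)

pts = [(i * 0.37 % 1.0, i * 0.73 % 1.0) for i in range(100)]
tree = build(pts)
hits = []
range_search(tree, (0.5, 0.5), 0.25, hits)
print(len(hits))
```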
Bacterial activity in nature is predominantly associated with surface-bound microbial communities that form complex and heterogeneous assemblages consisting of single- or multiple-species biofilms. These aggregates are involved in several human activities, ranging from the detrimental effects of unwanted biofilms in human health and industry to beneficial uses in environmental treatment processes.
Using signal processing, we wish to gain knowledge about biological complexity and to use this knowledge to engineer better technology. Three areas are identified as critical to understanding bio-complexity: 1) understanding DNA, 2) understanding protein pathways, and 3) evaluating overall biological function subject to external conditions. First, DNA is investigated for coding structure and redundancy, and a new tandem repeat region, an indicator of a neurodegenerative disease, is discovered.
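A minimal sketch of the signal-processing view of tandem repeats (our illustration, not the paper's algorithm): treating the sequence as a symbolic signal, a positional autocorrelation peaks at the repeat period, so a trinucleotide expansion of the kind implicated in several neurodegenerative diseases shows up as a strong period-3 correlation.

```python
# Illustrative autocorrelation-style period detector for a DNA string.

def repeat_period(seq, max_lag=None):
    """Return the lag maximising the fraction of positions with seq[i] == seq[i+lag]."""
    n = len(seq)
    max_lag = max_lag or n // 2
    best_lag, best_score = 0, -1.0
    for lag in range(1, max_lag + 1):
        matches = sum(seq[i] == seq[i + lag] for i in range(n - lag))
        score = matches / (n - lag)
        if score > best_score:          # keep the smallest lag on ties
            best_lag, best_score = lag, score
    return best_lag

# A CAG-expansion-like region correlates perfectly at lag 3.
print(repeat_period("CAG" * 20))  # -> 3
```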
Using a More Powerful Teacher to Reduce the Number of Queries of the L* Algorithm in Practical Applications
Submitted by aml on Sun, 02/10/2008 - 12:27.
We propose to use a more powerful teacher to effectively apply query learning algorithms for regular languages in practical, real-world problems. More specifically, we define a more powerful set of replies to the membership queries posed by the L* algorithm that reduces the number of such queries by several orders of magnitude. The basic idea is to avoid the needless repetition of membership queries in cases where the reply will be negative as long as a particular condition is met by the string in the membership query.
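A hedged sketch of this idea (ours, not the paper's exact condition): wrap the expensive membership oracle in a teacher that caches replies and, under an assumption we supply for illustration only (the target language is prefix-closed, so a rejected prefix dooms every extension), answers whole families of queries negatively without consulting the oracle again.

```python
# Illustrative caching teacher that avoids needless membership queries.
# Assumption (ours, for this sketch): the target language is prefix-closed.

class CachingTeacher:
    def __init__(self, expensive_oracle):
        self.oracle = expensive_oracle
        self.cache = {}
        self.rejected_prefixes = set()
        self.calls = 0                      # expensive queries actually made

    def member(self, s):
        if s in self.cache:
            return self.cache[s]
        # free negative reply if some prefix of s is already known rejected
        for i in range(len(s) + 1):
            if s[:i] in self.rejected_prefixes:
                self.cache[s] = False
                return False
        self.calls += 1
        ans = self.oracle(s)
        self.cache[s] = ans
        if not ans:
            self.rejected_prefixes.add(s)   # every extension is now free
        return ans

# Example target: strings over {a, b} with no "bb" factor (prefix-closed).
teacher = CachingTeacher(lambda s: "bb" not in s)
teacher.member("abb")    # one expensive query, answer False
teacher.member("abba")   # answered for free via the rejected prefix "abb"
print(teacher.calls)     # -> 1
```

L* would call `teacher.member` in place of the raw oracle; the saving compounds because observation-table rows repeatedly extend already-rejected strings.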
Syntons, metabolons and interactons: an exact graph-theoretical approach for exploring neighbourhood between genomic and functio
Submitted by aml on Sun, 02/10/2008 - 11:11.
Modern comparative genomics is not restricted to sequence comparison but involves the comparison of metabolic pathways and protein-protein interactions as well. Central in this approach is the concept of neighbourhood between entities (genes, proteins, chemical compounds). There is therefore a growing need for new methods that merge connectivity information from different biological sources in order to infer functional coupling. We present a generic approach to merging the information from two or more graphs representing biological data. The method is based on two concepts.
A full-text index provides fast search for any pattern in a text. Traditional full-text indexes, however, have a serious problem with space usage. A recent trend is to develop indexes that exploit the compressibility of the text so that their size is a function of the size of the compressed text. This field has introduced the concept of self-indexes, which, in addition to providing index capabilities, are capable of replacing the original text. The two most successful lines of research are the ones exploring compressed suffix arrays and the Burrows-Wheeler transform.
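The core of the Burrows-Wheeler line can be sketched compactly: build the BWT of the text, then count pattern occurrences by backward search, touching only the BWT, never the original text. This toy version (naive O(n) rank; real self-indexes use compressed rank structures over the BWT) shows the mechanism:

```python
# Toy FM-index core: BWT construction plus backward-search counting.

def bwt(text):
    text += "\0"                       # unique terminator, smallest character
    rotations = sorted(text[i:] + text[:i] for i in range(len(text)))
    return "".join(r[-1] for r in rotations)

def count_occurrences(bw, pattern):
    # C[c] = number of characters in bw strictly smaller than c
    C, total = {}, 0
    for c in sorted(set(bw)):
        C[c] = total
        total += bw.count(c)
    rank = lambda c, i: bw[:i].count(c)   # occurrences of c in bw[0:i] (naive)
    lo, hi = 0, len(bw)                   # half-open row interval of the sorted rotations
    for c in reversed(pattern):           # extend the match one symbol leftwards
        if c not in C:
            return 0
        lo = C[c] + rank(c, lo)
        hi = C[c] + rank(c, hi)
        if lo >= hi:
            return 0
    return hi - lo

print(count_occurrences(bwt("abracadabra"), "abra"))  # -> 2
```

Counting runs in time proportional to the pattern, not the text; compressed suffix arrays achieve comparable functionality through a compressed representation of the suffix-array permutation.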