
    CAMUR: Knowledge extraction from RNA-seq cancer data through equivalent classification rules

    Nowadays, methods for extracting knowledge from Next Generation Sequencing data are in high demand. In this work, we focus on RNA-seq gene expression analysis, and specifically on case-control studies with rule-based supervised classification algorithms that build a model able to discriminate cases from controls. State-of-the-art algorithms compute a single classification model that contains few features (genes). In contrast, our goal is to elicit a larger amount of knowledge by computing many classification models, and therefore to identify most of the genes related to the predicted class.
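
    The many-models idea can be illustrated with a toy sketch (not the actual CAMUR implementation): repeatedly learn a simple one-gene threshold rule, then exclude that gene and learn again, so that each round is forced to reveal a different informative gene. The data, the accuracy threshold, and the single-feature rule learner are all illustrative assumptions.

```python
# Toy sketch of the "many models" idea (hypothetical data, not CAMUR itself):
# learn a one-gene threshold rule, exclude that gene, and repeat while the
# rules stay accurate, so each round exposes a different informative gene.

def best_rule(X, y, excluded):
    """Best single-feature threshold rule: predict 1 when X[i][j] > t."""
    best_acc, best_j, best_t = 0.0, None, None
    for j in range(len(X[0])):
        if j in excluded:
            continue
        for t in sorted({row[j] for row in X}):
            preds = [1 if row[j] > t else 0 for row in X]
            acc = sum(p == c for p, c in zip(preds, y)) / len(y)
            if acc > best_acc:
                best_acc, best_j, best_t = acc, j, t
    return best_acc, best_j, best_t

def many_models(X, y, min_acc=0.9):
    """Extract one rule per round, excluding already-used genes."""
    excluded, rules = set(), []
    while True:
        acc, j, t = best_rule(X, y, excluded)
        if j is None or acc < min_acc:
            break
        rules.append((j, t, acc))
        excluded.add(j)  # force the next model onto different genes
    return rules

# Genes 0 and 2 both separate cases (label 1) from controls (label 0).
X = [[5, 1, 9], [6, 2, 8], [1, 1, 2], [2, 3, 1]]
y = [1, 1, 0, 0]
print(many_models(X, y))  # one rule for gene 0, then one for gene 2
```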

    A global optimization algorithm for protein surface alignment

    Background: A relevant problem in drug design is the comparison and recognition of protein binding sites. Binding site recognition is generally based on geometry, often combined with physico-chemical properties of the site, since the conformation, size and chemical composition of the protein surface are all relevant for the interaction with a specific ligand. Several matching strategies have been designed for the recognition of protein-ligand binding sites and of protein-protein interfaces, but the problem cannot be considered solved. Results: In this paper we propose a new method for local structural alignment of protein surfaces based on continuous global optimization techniques. Given the three-dimensional structures of two proteins, the method finds the isometric transformation (rotation plus translation) that best superimposes the active regions of the two structures. We draw our inspiration from the well-known Iterative Closest Point (ICP) method for three-dimensional (3D) shape registration. Our main contribution is the adoption of a controlled random search as a more efficient global optimization approach, along with a new dissimilarity measure. The reported computational experience and comparison show the viability of the proposed approach. Conclusions: Our method performs well at detecting similarity in binding sites when it indeed exists. In the future we plan a more comprehensive evaluation of the method, considering large datasets of non-redundant proteins and applying a clustering technique to the results of all comparisons in order to classify binding sites.
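
    As background for the superposition step, the following sketch recovers the optimal rigid transformation (rotation plus translation) between two point sets with known correspondences, using the standard SVD-based Kabsch procedure. ICP alternates this step with nearest-neighbour matching; the paper's contribution, the controlled random search and the new dissimilarity measure, is not shown here.

```python
import numpy as np

def kabsch(P, Q):
    """Least-squares rotation R and translation t with R @ P[i] + t ~ Q[i]."""
    cP, cQ = P.mean(axis=0), Q.mean(axis=0)
    H = (P - cP).T @ (Q - cQ)            # cross-covariance of centred sets
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    D = np.diag([1.0, 1.0, d])           # guard against reflections
    R = Vt.T @ D @ U.T
    t = cQ - R @ cP
    return R, t

# Apply a known rotation + translation to a toy patch and recover it.
rng = np.random.default_rng(0)
P = rng.random((10, 3))
theta = 0.5
R_true = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                   [np.sin(theta),  np.cos(theta), 0.0],
                   [0.0, 0.0, 1.0]])
t_true = np.array([1.0, 2.0, 3.0])
Q = P @ R_true.T + t_true
R, t = kabsch(P, Q)
print(np.allclose(R, R_true), np.allclose(t, t_true))  # True True
```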

    MISSEL: a method to identify a large number of small species-specific genomic subsequences and its application to viruses classification

    Continuous improvements in next generation sequencing technologies have led to ever-increasing collections of genomic sequences, which are not easily characterized by biologists and whose analysis requires a huge computational effort. The classification of species has emerged as one of the main applications of DNA analysis and has been addressed with several approaches, e.g., multiple alignment-, phylogenetic tree-, statistical- and character-based methods.
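
    A minimal character-based sketch of the underlying idea is to identify small subsequences (k-mers) that occur in every sequence of one species and in no sequence of the others. The toy sequences and the exhaustive k-mer enumeration below are illustrative assumptions, not the MISSEL algorithm itself.

```python
def kmers(seq, k):
    """All length-k substrings of a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

def species_specific(groups, k):
    """k-mers present in every sequence of a species and absent from
    every sequence of all other species."""
    union = {sp: set.union(*(kmers(s, k) for s in seqs))
             for sp, seqs in groups.items()}
    shared = {sp: set.intersection(*(kmers(s, k) for s in seqs))
              for sp, seqs in groups.items()}
    return {sp: shared[sp] - set.union(*(union[o] for o in groups if o != sp))
            for sp in groups}

# Toy "genomes" of two species (illustrative sequences only).
groups = {"A": ["ACGTAC", "TACGTT"], "B": ["GGGTAC", "ACGGGG"]}
print(species_specific(groups, 3))  # {'A': {'CGT'}, 'B': {'GGG'}}
```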

    Comparing Alzheimer’s and Parkinson’s diseases networks using graph communities structure

    Background: Recent advances in the analysis of large datasets offer new insights to modern biology, allowing system-level investigation of pathologies. Here we describe a novel computational method that exploits the ever-growing amount of "omics" data to shed light on Alzheimer's and Parkinson's diseases. Neurological disorders exhibit a huge number of molecular alterations due to a complex interplay between genetic and environmental factors. Classical reductionist approaches focus on a few elements, providing a narrow overview of the etiopathogenic complexity of multifactorial diseases. On the other hand, high-throughput technologies allow the evaluation of many components of biological systems and of their behavior. Analyzing Parkinson's Disease (PD) and Alzheimer's Disease (AD) from a network perspective can highlight proteins or pathways that are common to both but differently represented, and that can therefore discriminate between the two pathological conditions, thus highlighting their similarities and differences. Results: In this work we propose a strategy that exploits the network community structure identified with InfoMap, a state-of-the-art community discovery algorithm based on information-theoretic principles. We used two similarity measures to quantify the functional and topological similarities between the two pathologies. We built a Similarity Matrix to highlight similar communities, and we analyzed the statistically significant GO terms found in clustered areas of the matrix and in the network communities. Our strategy allowed us to identify common known and unknown processes, including DNA repair, RNA metabolism and glucose metabolism, that are not detected by simple GO enrichment analysis. In particular, we were able to capture the connection between mitochondrial dysfunction and metabolism (glucose and glutamate/glutamine). Conclusions: This approach allows the identification of communities present in both pathologies, which highlight common biological processes. Conversely, the identification of communities without any counterpart can be used to investigate processes that are characteristic of only one of the two pathologies. In general, the same strategy can be applied to compare any pair of biological networks.
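
    A Similarity Matrix of the kind described above can be sketched, for the topological case, as the pairwise Jaccard overlap between the node sets of the communities of the two disease networks. The gene names below are purely illustrative and not taken from the study.

```python
def jaccard(a, b):
    """Fraction of shared nodes between two communities."""
    return len(a & b) / len(a | b) if a | b else 0.0

def similarity_matrix(comms_x, comms_y):
    """Pairwise Jaccard overlap between the communities of two networks."""
    return [[jaccard(a, b) for b in comms_y] for a in comms_x]

# Hypothetical communities from an AD and a PD network (illustrative genes).
ad = [{"APP", "PSEN1", "MAPT"}, {"GSK3B", "CDK5"}]
pd_net = [{"SNCA", "PARK7", "MAPT"}, {"GSK3B", "CDK5", "LRRK2"}]
for row in similarity_matrix(ad, pd_net):
    print([round(v, 2) for v in row])  # high cells mark matching communities
```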

    Combining EEG signal processing with supervised methods for Alzheimer’s patients classification

    Background: Alzheimer's Disease (AD) is a neurodegenerative disorder characterized by progressive dementia, for which no cure is currently known. An early detection of patients affected by AD can be obtained by analyzing their electroencephalography (EEG) signals, which show a reduction of complexity, a perturbation of synchrony, and a slowing down of the rhythms. Methods: In this work, we apply a procedure that exploits feature extraction and classification techniques on EEG signals, with the aim of distinguishing patients affected by AD from those affected by Mild Cognitive Impairment (MCI) and from healthy control (HC) samples. Specifically, we perform a time-frequency analysis by applying both the Fourier and the Wavelet Transform on 109 samples belonging to the AD, MCI, and HC classes. The classification procedure consists of the following steps: (i) preprocessing of the EEG signals; (ii) feature extraction by means of the Discrete Fourier and Wavelet Transforms; and (iii) classification with tree-based supervised methods. Results: By applying our procedure, we are able to extract reliable, human-interpretable classification models that automatically assign each patient to the corresponding class. In particular, by exploiting Wavelet feature extraction we achieve 83%, 92%, and 79% accuracy on the HC vs AD, HC vs MCI, and MCI vs AD classification problems, respectively. Conclusions: By comparing the classification performances obtained with the two feature extraction methods, we find that the Wavelet analysis outperforms the Fourier one. Hence, we suggest using it in combination with supervised methods for the automatic classification of patients based on their EEG signals, in order to aid the medical diagnosis of dementia.
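
    The Fourier side of step (ii) can be sketched, under the simplifying assumption that each feature is the spectral power in one of the classic EEG bands; this is a minimal DFT-based illustration with a synthetic signal and an assumed sampling rate, not the paper's full pipeline.

```python
import numpy as np

# Delta, theta, alpha, beta frequency bands (Hz) -- standard EEG ranges.
BANDS = ((0.5, 4), (4, 8), (8, 13), (13, 30))

def band_powers(signal, fs, bands=BANDS):
    """Spectral power per band from the discrete Fourier transform."""
    spec = np.abs(np.fft.rfft(signal)) ** 2
    freqs = np.fft.rfftfreq(len(signal), d=1 / fs)
    return [spec[(freqs >= lo) & (freqs < hi)].sum() for lo, hi in bands]

fs = 256                            # assumed sampling rate (Hz)
t = np.arange(fs * 4) / fs          # 4 s of signal
x = np.sin(2 * np.pi * 10 * t)      # pure 10 Hz tone: an alpha rhythm
p = band_powers(x, fs)
print(p.index(max(p)))  # 2 -> the alpha band dominates, as expected
```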

    Rationale and design of an independent randomised controlled trial evaluating the effectiveness of aripiprazole or haloperidol in combination with clozapine for treatment-resistant schizophrenia

    Background: One third to two thirds of people with schizophrenia have persistent psychotic symptoms despite clozapine treatment. Under real-world circumstances, the need to provide effective therapeutic interventions to patients who do not have an optimal response to clozapine has been cited as the most common reason for simultaneously prescribing a second antipsychotic drug in combination treatment strategies. In a clinical area where the pressing need to provide therapeutic answers has progressively increased the occurrence of antipsychotic polypharmacy, despite the lack of robust evidence of its efficacy, we sought to implement a pre-planned protocol where two alternative therapeutic answers are systematically provided and evaluated within the context of a pragmatic, multicentre, independent randomised study. Methods/Design: The principal clinical question to be answered by the present project is the relative efficacy and tolerability of combination treatment with clozapine plus aripiprazole compared with combination treatment with clozapine plus haloperidol in patients with an incomplete response to treatment with clozapine over an appropriate period of time. This project is a prospective, multicentre, randomised, parallel-group, superiority trial that follows patients over a period of 12 months. Withdrawal from allocated treatment within 3 months is the primary outcome. Discussion: The implementation of the protocol presented here shows that it is possible to create a network of community psychiatric services that accept the idea of using their everyday clinical practice to produce randomised knowledge. The pragmatic attitude employed allowed us to randomly allocate more than 100 individuals, which makes this study the largest antipsychotic combination trial conducted so far in Western countries. We expect that the current project, by generating evidence on whether it is clinically useful to combine clozapine with aripiprazole rather than with haloperidol, will provide physicians with a solid evidence base to be applied directly in the routine care of patients with schizophrenia. Trial Registration: ClinicalTrials.gov identifier NCT00395915.

    AN O(MN) ALGORITHM FOR REGULAR SET-COVERING PROBLEMS

    A clutter L is a collection of m subsets of a ground set E(L) = {x_1, …, x_n} with the property that, for every pair A_i, A_j ∈ L, A_i neither contains nor is contained in A_j. A transversal of L is a subset of E(L) intersecting every member of L. If we associate with each element x_j ∈ E(L) a weight c_j, the problem of finding a transversal of minimum weight is equivalent to the following set-covering problem: min { c^T x | M_L x ≥ 1_m, x_j ∈ {0, 1}, j = 1, …, n }, where M_L is the matrix whose rows are the incidence vectors of the subsets A_i ∈ L and 1_m denotes the vector of m ones. A set-covering problem is regular if there exists an ordering of the variables σ = (x_1, …, x_n) such that, for every feasible solution x with x_i = 1 and x_j = 0 for some j < i, the vector x + e_j − e_i is also a feasible solution, where e_i is the i-th unit vector. The matrix M of a regular set-covering problem is said to be regular, and a regular clutter is any clutter whose incidence matrix is regular. In this paper we describe some properties of regular clutters and propose an algorithm which, in O(mn) steps, generates all the minimal transversals of a regular clutter L and produces the transversal of minimum weight.
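
    The definitions can be made concrete with a brute-force sketch that enumerates the minimal transversals of a tiny clutter and picks the one of minimum weight. Note that this exhaustive enumeration is exponential and only illustrates the problem, not the O(mn) algorithm proposed in the paper.

```python
from itertools import combinations

def is_transversal(T, clutter):
    """T intersects every member of the clutter."""
    return all(T & A for A in clutter)

def minimal_transversals(clutter, ground):
    """Enumerate inclusion-minimal transversals by increasing size
    (exponential brute force, for illustration only)."""
    found = []
    for r in range(1, len(ground) + 1):
        for T in map(set, combinations(ground, r)):
            if is_transversal(T, clutter) and not any(M <= T for M in found):
                found.append(T)
    return found

# The clutter {1,2}, {2,3}, {1,3}: every pair of elements is a minimal
# transversal, and the weights select the cheapest one.
clutter = [{1, 2}, {2, 3}, {1, 3}]
mts = minimal_transversals(clutter, [1, 2, 3])
weights = {1: 3, 2: 1, 3: 2}
best = min(mts, key=lambda T: sum(weights[x] for x in T))
print(len(mts), best)  # 3 {2, 3}
```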