118 research outputs found

    Shape and Boundary Similarity Features for Accurate HCC Image Recognition

    Computational Methods for the Analysis of Genomic Data and Biological Processes

    In recent decades, new technologies have made remarkable progress in helping to understand biological systems. Rapid advances in genomic profiling techniques such as microarrays or high-performance sequencing have brought new opportunities and challenges in the fields of computational biology and bioinformatics. Such genetic sequencing techniques allow large amounts of data to be produced, whose analysis and cross-integration could provide a complete view of organisms. As a result, it is necessary to develop new techniques and algorithms that carry out an analysis of these data with reliability and efficiency. This Special Issue collected the latest advances in the field of computational methods for the analysis of gene expression data, and, in particular, the modeling of biological processes. Here we present eleven works selected to be published in this Special Issue due to their interest, quality, and originality

    Machine Learning and Integrative Analysis of Biomedical Big Data.

    Recent developments in high-throughput technologies have accelerated the accumulation of massive amounts of omics data from multiple sources: genome, epigenome, transcriptome, proteome, metabolome, etc. Traditionally, data from each source (e.g., genome) is analyzed in isolation using statistical and machine learning (ML) methods. Integrative analysis of multi-omics and clinical data is key to new biomedical discoveries and advancements in precision medicine. However, data integration poses new computational challenges as well as exacerbates the ones associated with single-omics studies. Specialized computational approaches are required to effectively and efficiently perform integrative analysis of biomedical data acquired from diverse modalities. In this review, we discuss state-of-the-art ML-based approaches for tackling five specific computational challenges associated with integrative analysis: curse of dimensionality, data heterogeneity, missing data, class imbalance and scalability issues

    Data fusion techniques for biomedical informatics and clinical decision support

    Data fusion can be used to combine multiple data sources or modalities to facilitate enhanced visualization, analysis, detection, estimation, or classification. It can be applied at the raw-data, feature-based, and decision-based levels. Data fusion applications of different kinds have been developed in areas such as statistics, computer vision and other branches of machine learning. It has been employed in a variety of realistic scenarios such as medical diagnosis, clinical decision support, and structural health monitoring. This dissertation includes investigation and development of methods to perform data fusion for cervical cancer intraepithelial neoplasia (CIN) and a clinical decision support system. The general framework for these applications includes image processing followed by feature development and classification of the detected region of interest (ROI). Image processing methods such as k-means clustering based on color information, dilation, erosion and centroid locating methods were used for ROI detection. The features extracted include texture, color, nuclei-based and triangle features. Analysis and classification were performed using feature- and decision-level data fusion techniques such as support vector machines, statistical methods such as logistic regression and linear discriminant analysis, and voting algorithms --Abstract, page iv
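    Decision-level fusion by voting, as mentioned in the abstract above, can be sketched as follows. This is a minimal illustration and not the dissertation's implementation: the abstract does not say which voting scheme was used, so simple majority voting is assumed, and the function name, classifier names and class labels are hypothetical.

```python
from collections import Counter


def majority_vote(predictions):
    """Fuse per-classifier label lists into one label per sample.

    predictions: list of lists, one inner list per classifier, all of
    equal length (one label per sample). Ties go to the label that
    appears first among the classifiers for that sample.
    """
    fused = []
    for labels in zip(*predictions):  # labels = all votes for one sample
        fused.append(Counter(labels).most_common(1)[0][0])
    return fused


# Hypothetical outputs of three classifiers on three ROIs.
svm_out = ["CIN1", "CIN2", "Normal"]
logreg_out = ["CIN1", "CIN3", "Normal"]
lda_out = ["CIN2", "CIN2", "Normal"]

fused = majority_vote([svm_out, logreg_out, lda_out])
print(fused)
```

    Each classifier decides independently and only the final labels are combined, which is what distinguishes decision-level fusion from feature-level fusion, where the feature vectors are merged before a single classifier is trained.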

    Finding new genes and pathways involved in cancer development by analysing insertional mutagenesis data

    Master's dissertation in Bioinformatics. Cancer emerges from the uncontrolled division of an organism's cells, which creates a tumour. These tumours can arise in any part of the body. The increase in cell division and growth can be caused by mutations in the genome. Research has taken several approaches to finding new cancer genes. Insertional mutagenesis (IM) has been one of the most widely used: a mouse is infected with a retrovirus or a transposon, which increases gene expression in the vicinity of the insertion. The data used in this work are a collection of independent IM studies in mice. After processing, the data comprise 3,414 samples with information on 7,751 genes. Each sample corresponds to one type of cancer (colorectal, hematopoietic, hepatocellular carcinoma, lymphoma, malignant peripheral nerve sheath, medulloblastoma and pancreatic). The main goal of this project is to determine whether there are genes specific to a particular type of cancer and, if so, which 15 genes are most involved in that type of cancer. Machine learning (ML) aims to gain knowledge from experimental data so that it can make accurate predictions and decisions. To answer this question, the data must be transformed into a dissimilarity relation between samples. Four approaches were used: two known from the literature (Hamming distance and Jaccard distance) and two newly developed metrics (the Gene Dependent Method (GDM) and the Gene Independent Method (GIM)). With these transformations, unsupervised learning methods (such as Principal Component Analysis (PCA) and t-distributed stochastic neighbor embedding (t-SNE)) and a supervised learning approach, testing different classifiers by cross-validation, were applied. The main results show that some genes may be specific to a particular type of cancer. It is therefore possible to rank genes by their importance to a type of cancer. 105 genes are presented (15 for each type of cancer), of which 18 have not yet been annotated and 19 have already been reported in the literature as involved in the development of cancer in the selected tissue. Proper in vitro and in vivo validation must follow.
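    The two literature metrics named in the abstract above, Hamming and Jaccard distance, can be illustrated with a short sketch. It assumes each sample is encoded as a binary vector over genes (1 if the gene carries an insertion, 0 otherwise); the abstract does not state the encoding, so this is an assumption, and the example vectors are made up. The GDM and GIM metrics are not shown because the abstract does not define them.

```python
def hamming_distance(a, b):
    """Fraction of positions at which two equal-length vectors differ."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def jaccard_distance(a, b):
    """1 - |intersection| / |union| over the positions set to 1."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    both = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    # Two all-zero vectors share no insertions; treat them as identical.
    return 0.0 if either == 0 else 1.0 - both / either


# Hypothetical samples over five genes.
sample_1 = [1, 0, 1, 1, 0]
sample_2 = [1, 1, 0, 1, 0]

print(hamming_distance(sample_1, sample_2))  # 2 of 5 positions differ
print(jaccard_distance(sample_1, sample_2))  # 2 shared of 4 hit genes
```

    Hamming distance counts shared absences as agreement, whereas Jaccard distance ignores genes that are hit in neither sample, which matters when most genes carry no insertion.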

    Non-communicable Diseases, Big Data and Artificial Intelligence

    This reprint includes 15 articles in the field of non-communicable diseases, big data, and artificial intelligence, giving an overview of the most recent advances in AI and their application potential in 3P medicine.

    Mass spectral imaging of clinical samples using deep learning

    A better interpretation of tumour heterogeneity and variability is vital for the improvement of novel diagnostic techniques and personalized cancer treatments. Tumour tissue heterogeneity is characterized by biochemical heterogeneity, which can be investigated by unsupervised metabolomics. Mass Spectrometry Imaging (MSI) combined with machine learning techniques has generated increasing interest as an analytical and diagnostic tool for the analysis of spatial molecular patterns in tissue samples. Given the high complexity of the data produced by MSI, which can consist of many thousands of spectral peaks, statistical analysis, and in particular machine learning and deep learning, have been investigated as novel approaches to deduce the relationships between the measured molecular patterns and the local structural and biological properties of the tissues. Machine learning has historically been divided into two main categories: supervised and unsupervised learning. In MSI, supervised learning methods may be used to segment tissues into histologically relevant areas, e.g. the classification of tissue regions in H&E (Haematoxylin and Eosin) stained samples. Initial classification by an expert histopathologist through visual inspection enables the development of univariate or multivariate models based on tissue regions that have significantly up- or down-regulated ions. However, complex data may result in underdetermined models, and alternative methods that can cope with high-dimensional and noisy data are required. Here, we describe, apply, and test a novel diagnostic procedure built using a combination of MSI and deep learning, with the objective of delineating and identifying biochemical differences between cancerous and non-cancerous tissue in metastatic liver cancer and epithelial ovarian cancer. The workflow investigates the robustness of single (1D) to multidimensional (3D) tumour analyses and also highlights possible biomarkers which are not accessible from classical visual analysis of the H&E images. The identification of key molecular markers may provide a deeper understanding of tumour heterogeneity and potential targets for intervention.