443 research outputs found

    Radiomics and machine learning methods for 2-year overall survival prediction in non-small cell lung cancer patients

    Get PDF
    In recent years, the application of radiomics to lung cancer has shown encouraging results in the prediction of histological outcomes, survival times, disease staging, and so on. In this thesis, radiomics and deep learning applications are compared by analyzing their performance in the prediction of the 2-year overall survival (OS) of patients affected by non-small cell lung cancer (NSCLC). The dataset under examination contains 417 patients with both clinical data and computed tomography (CT) examinations of the chest. Radiomics extracts handcrafted features from the three-dimensional tumor region of interest (ROI) and is the approach that best predicts the 2-year overall survival, with training and test areas under the receiver operating characteristic curve (AUC) of 0.683 and 0.652, respectively. Concerning deep learning applications, two methods are considered in this thesis: deep features and convolutional neural networks (CNN). The first method is similar to radiomics, but replaces the handcrafted features with deep features extracted from the two-dimensional slices that build up the three-dimensional tumor ROI. In particular, two main classes of deep features are considered: the latent variables returned by a convolutional autoencoder (CAE) and the inner features learnt by a pre-trained CNN. The latent variables returned by the CAE yield an AUC of 0.692 on the training set and 0.631 on the test set. The second method classifies the CT images directly by means of CNNs. These perform better than deep features, reaching an AUC of 0.692 on the training set and 0.644 on the test set. For the CNNs, the impact of using generative adversarial networks (GAN) to increase the dataset size is also investigated. This analysis results in poorly defined images, in which the synthesized bones are incompatible with the actual structure of the tumor mass. In general, deep learning applications perform worse than radiomics, both in terms of lower AUC and of a greater generalization gap between training and test sets. The main issue encountered in their training is the limited number of patients, which is responsible for overfitting in the CNNs, inaccurate reconstructions by the CAE, and poor synthetic images from the GAN. This limitation forces a reduction of model complexity through a two-dimensional analysis of the tumor masses, in contrast with the three-dimensional study performed by radiomics. However, the two-dimensional restriction yields an incomplete description of the tumor masses, reducing the predictive capability of the deep learning applications. In summary, our analysis, spanning a wide set of more than 7000 combinations, shows that with the current dataset it is only possible to match the performance of previous works. This detailed survey suggests that we have reached the state of the art in terms of analysis and that more data are needed to improve the predictions.
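    As a rough illustration of the radiomics arm described above, the sketch below trains a plain logistic-regression classifier on a matrix of pre-extracted handcrafted features and reports the training and test AUC. It is only a minimal sketch: the random feature matrix, the choice of classifier, and the 70/30 split are assumptions for illustration, not the pipeline used in the thesis, and real features would come from a radiomics toolkit applied to the tumor ROI.

        import numpy as np
        from sklearn.linear_model import LogisticRegression
        from sklearn.metrics import roc_auc_score
        from sklearn.model_selection import train_test_split
        from sklearn.pipeline import make_pipeline
        from sklearn.preprocessing import StandardScaler

        # Placeholder data: 417 patients x 100 handcrafted features and a binary
        # 2-year overall-survival label (random stand-ins, not the thesis dataset).
        rng = np.random.default_rng(0)
        X = rng.normal(size=(417, 100))
        y = rng.integers(0, 2, size=417)

        X_tr, X_te, y_tr, y_te = train_test_split(
            X, y, test_size=0.3, stratify=y, random_state=0
        )

        # Standardize the features and fit a simple linear classifier.
        clf = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
        clf.fit(X_tr, y_tr)

        auc_tr = roc_auc_score(y_tr, clf.predict_proba(X_tr)[:, 1])
        auc_te = roc_auc_score(y_te, clf.predict_proba(X_te)[:, 1])
        print(f"train AUC = {auc_tr:.3f}, test AUC = {auc_te:.3f}")

    Comparing the two AUC values in this way is also how the generalization gap discussed in the abstract can be quantified.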

    Professor Anita Soto received the national award for women's contribution to rural development

    Get PDF

    Interpretable machine learning of amino acid patterns in proteins: a statistical ensemble approach

    Full text link
    Explainable and interpretable unsupervised machine learning helps understand the underlying structure of data. We introduce an ensemble analysis of machine learning models to consolidate their interpretation. Its application shows that restricted Boltzmann machines compress consistently into a few bits the information stored in a sequence of five amino acids at the start or end of α-helices or β-sheets. The weights learned by the machines reveal unexpected properties of the amino acids and the secondary structure of proteins: (i) His and Thr have a negligible contribution to the amphiphilic pattern of α-helices; (ii) there is a class of α-helices particularly rich in Ala at their end; (iii) Pro most often occupies slots otherwise occupied by polar or charged amino acids, and its presence at the start of helices is relevant; (iv) Glu and especially Asp on one side, and Val, Leu, Iso, and Phe on the other, display the strongest tendency to mark amphiphilic patterns, i.e., extreme values of an "effective hydrophobicity", though they are not the most powerful (non) hydrophobic amino acids. Comment: 15 pages, 9 figures
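    The central object in this abstract is a restricted Boltzmann machine trained on short amino-acid windows. The sketch below is only an illustration, not the authors' ensemble analysis: it one-hot encodes random five-residue windows (5 positions x 20 residues = 100 binary inputs) and trains scikit-learn's BernoulliRBM with a handful of hidden units; the learned weights (rbm.components_) are the kind of quantity such an analysis would interpret.

        import numpy as np
        from sklearn.neural_network import BernoulliRBM

        AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # 20 canonical residues
        WINDOW = 5                             # window length used in the abstract

        def one_hot(window):
            """Encode a 5-residue window as a flattened 5 x 20 binary vector."""
            vec = np.zeros((WINDOW, len(AMINO_ACIDS)))
            for i, aa in enumerate(window):
                vec[i, AMINO_ACIDS.index(aa)] = 1.0
            return vec.ravel()

        # Random placeholder windows; real input would be helix/sheet boundary sequences.
        rng = np.random.default_rng(0)
        windows = ["".join(rng.choice(list(AMINO_ACIDS), WINDOW)) for _ in range(2000)]
        X = np.array([one_hot(w) for w in windows])

        # A few hidden units ("bits"); hyperparameters are illustrative only.
        rbm = BernoulliRBM(n_components=4, learning_rate=0.05, n_iter=20, random_state=0)
        rbm.fit(X)
        hidden = rbm.transform(X)                    # hidden-unit activation probabilities
        print(hidden.shape, rbm.components_.shape)   # (2000, 4) and (4, 100) learned weights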

    A Time-Saving Technique for Specimen Extraction in Sleeve Gastrectomy

    Full text link

    Machine learning understands knotted polymers

    Full text link
    Simulated configurations of flexible knotted rings confined inside a spherical cavity are fed into long short-term memory neural networks (LSTM NNs) designed to distinguish knot types. The results show that they perform well in knot recognition even when tested against flexible, strongly confined, and therefore highly geometrically entangled rings. In agreement with the expectation that knots are delocalized in dense polymers, a suitable coarse-graining procedure on the configurations boosts the performance of the LSTMs when knot identification is applied to rings much longer than those used for training. Notably, when the NNs fail, the wrong prediction usually still belongs to the same topological family as the correct one. The fact that the LSTMs are able to grasp some basic properties of the ring's topology is corroborated by a test on knot types not used for training. We also show that the choice of NN architecture is important: simpler convolutional NNs do not perform as well. Finally, all results depend on the features used as input: surprisingly, the coordinates or bond directions of the configurations provide the best accuracy, even though they are not invariant under rotations (while the knot type is). Other rotationally invariant features we tested are based on distances, angles, and dihedral angles.
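    To make the setup concrete, the sketch below shows one possible LSTM classifier that maps a ring's bead coordinates to a knot-type label. It is a hypothetical toy, not the architecture from the paper: the ring length, the number of knot classes, and all hyperparameters are assumptions, and the coarse-graining procedure and alternative input features mentioned in the abstract are not shown.

        import torch
        import torch.nn as nn

        N_BEADS, N_KNOT_TYPES = 100, 5   # placeholder ring length and number of knot classes

        class KnotLSTM(nn.Module):
            def __init__(self, hidden=64):
                super().__init__()
                self.lstm = nn.LSTM(input_size=3, hidden_size=hidden, batch_first=True)
                self.head = nn.Linear(hidden, N_KNOT_TYPES)

            def forward(self, coords):            # coords: (batch, N_BEADS, 3) xyz per bead
                _, (h_n, _) = self.lstm(coords)   # final hidden state summarizes the ring
                return self.head(h_n.squeeze(0))  # class logits, one per knot type

        model = KnotLSTM()
        fake_rings = torch.randn(8, N_BEADS, 3)   # random stand-in configurations
        logits = model(fake_rings)
        print(logits.shape)                       # torch.Size([8, 5])

    Feeding raw coordinates, as here, corresponds to the non-rotation-invariant input that the abstract reports as surprisingly effective; distance- or angle-based features would replace the 3-component input.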

    A Symposium on Management of Barrett’s in Patients Having Bariatric Surgery

    Full text link

    A rigorous approach to facilitate and guarantee the correctness of the genetic testing management in human genome information systems

    Get PDF
    Background: Recent medical and biological technology advances have stimulated the development of new testing systems that provide huge, varied amounts of molecular and clinical data. Growing data volumes pose significant challenges for information processing systems in research centers. Additionally, the routines of a genomics laboratory are typically characterized by high parallelism in testing and constant procedure changes.
    Results: This paper describes a formal approach to address this challenge through the implementation of a genetic testing management system applied to a human genome laboratory. We introduced the Human Genome Research Center Information System (CEGH) in Brazil, a system that is able to support constant changes in human genome testing and can provide patients with updated results based on the most recent and validated genetic knowledge. Our approach uses a common repository for process planning to ensure reusability, specification, instantiation, monitoring, and execution of processes, which are defined using a relational database and rigorous control flow specifications based on process algebra (ACP). The main difference between our approach and related works is that we were able to combine two important aspects: 1) process scalability, achieved through the relational database implementation, and 2) correctness of processes, ensured by process algebra. Furthermore, the software allows end users to define genetic tests without requiring any knowledge of business process notation or process algebra.
    Conclusions: This paper presents the CEGH information system, a Laboratory Information Management System (LIMS) based on a formal framework to support genetic testing management for Mendelian disorder studies. We have proved the feasibility and shown the usability benefits of a rigorous approach that is able to specify, validate, and perform genetic testing through simple end-user interfaces.
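    The abstract's key architectural idea is a common relational repository of process definitions. The hypothetical sketch below illustrates only that idea with an in-memory SQLite schema: the table and column names are invented for illustration and are not the CEGH schema, and the process-algebra (ACP) correctness checks are not shown.

        import sqlite3

        conn = sqlite3.connect(":memory:")
        conn.executescript("""
        CREATE TABLE process (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE step (
            id INTEGER PRIMARY KEY,
            process_id INTEGER REFERENCES process(id),
            name TEXT,
            seq INTEGER          -- execution order within the process definition
        );
        """)

        # Define one genetic-testing process as data, not as hard-coded workflow logic.
        conn.execute("INSERT INTO process VALUES (1, 'Sanger sequencing test')")
        conn.executemany(
            "INSERT INTO step (process_id, name, seq) VALUES (1, ?, ?)",
            [("DNA extraction", 1), ("PCR amplification", 2),
             ("Sequencing", 3), ("Report", 4)],
        )

        # Instantiating the process amounts to reading its steps in execution order.
        for name, seq in conn.execute(
            "SELECT name, seq FROM step WHERE process_id = 1 ORDER BY seq"
        ):
            print(seq, name)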

    From science to e-science: paradigms of knowledge discovery

    Get PDF
    Computer Science is gradually evolving from a mere "supporting tool" for research in other fields into an intrinsic part of the very methods of the sciences with which it interacts. The synergy between Computer Science and other fields of knowledge has created a novel way of doing science – called e-science – which unifies theory, experiments, and simulation, enabling researchers to deal with huge amounts of information. The use of cloud computing has the potential to allow any researcher to conduct work previously restricted to those with access to supercomputers. This article presents a brief history of the evolution of scientific paradigms (from empiricism to the current landscape of e-science) and discusses the potential of cloud computing as a tool capable of catalyzing transformative research.