18 research outputs found

    Machine learning in multiscale modeling and simulations of molecular systems

    Collective variables (CVs) are low-dimensional representations of the state of a complex system, which help us rationalize molecular conformations and sample free energy landscapes with molecular dynamics simulations. However, identifying a representative set of CVs for a given system is far from obvious, and most often relies on physical intuition or partial knowledge about the system. An inappropriate choice of CVs is misleading and can lead to inefficient sampling. Thus, there is a need for systematic approaches to identify CVs effectively. In recent years, machine learning techniques, especially nonlinear dimensionality reduction (NLDR), have shown their ability to automatically identify the most important collective behavior of molecular systems. These methods have been widely used to visualize molecular trajectories. However, in general they do not provide a differentiable mapping from the high-dimensional configuration space to the low-dimensional representation, as required in enhanced sampling methods, and they cannot deal with systems with inherently nontrivial conformational manifolds. In the first part of this dissertation, we introduce a methodology that, starting from an ensemble representative of molecular flexibility, builds smooth and nonlinear data-driven collective variables (SandCV) from the output of nonlinear manifold learning algorithms. We demonstrate the method with a standard benchmark molecule and show how it can be non-intrusively combined with off-the-shelf enhanced sampling methods, here the adaptive biasing force method. SandCV identifies the system's conformational manifold, handles out-of-manifold conformations by a closest point projection, and exactly computes the Jacobian of the resulting CVs. We also illustrate how enhanced sampling simulations with SandCV can explore regions that were poorly sampled in the original molecular ensemble.
We then demonstrate that NLDR methods face serious obstacles when the underlying CVs present periodicities, e.g., arising from proper dihedral angles. As a result, NLDR methods collapse very distant configurations, thus leading to misinterpretations and inefficiencies in enhanced sampling. Here, we identify this largely overlooked problem and discuss possible approaches to overcome it. Additionally, we characterize the flexibility of the alanine dipeptide molecule and show that it evolves around a flat torus in four-dimensional space. In the final part of this thesis, we propose a novel method, the atlas of collective variables, that systematically overcomes topological obstacles, ameliorates geometrical distortions, and thus allows NLDR techniques to perform optimally in molecular simulations. This method automatically partitions the configuration space and treats each partition separately. It then connects these partitions from the standpoint of statistical mechanics.

    Histopathologic Findings of Olfactory Mucosa in COVID-19 Patients

    Background: Olfactory manifestations are common during the course of COVID-19, but their exact physiopathology is not known. Aim: We review histological changes of the nasal olfactory mucosa in COVID-19 non-survivors who died in the ICU. Methods: Sampling was done within 1 hour of death under direct vision. Specimens were taken medial to the middle turbinate in the cribriform area, embedded in paraffin blocks, and stained with haematoxylin and eosin. Results: The most frequent histologic finding was infiltration of inflammatory cells, mostly lymphocytes. Inflammatory infiltration of the mucosa was seen in all 11 patients, with ulceration in 9 cases and neuritis in 3 cases. Conclusion: Inflammatory infiltration of the olfactory mucosa may be associated with smell manifestations. Further histological studies will clarify the role of the nasal mucosa in the physiopathology of COVID-19, especially olfactory involvement.

    Federated Learning for Breast Density Classification: A Real-World Implementation

    Building robust deep learning-based models requires large quantities of diverse training data. In this study, we investigate the use of federated learning (FL) to build medical imaging classification models in a real-world collaborative setting. Seven clinical institutions from across the world joined this FL effort to train a model for breast density classification based on Breast Imaging, Reporting & Data System (BI-RADS). We show that despite substantial differences among the datasets from all sites (mammography system, class distribution, and dataset size) and without centralizing data, we can successfully train AI models in federation. The results show that models trained using FL perform on average 6.3% better than their counterparts trained on an institute's local data alone. Furthermore, we show a 45.8% relative improvement in the models' generalizability when evaluated on the other participating sites' testing data. Comment: Accepted at the 1st MICCAI Workshop on "Distributed And Collaborative Learning".
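    The abstract above does not say how the per-site models were combined. As an illustration only, the sketch below assumes federated averaging, a common aggregation rule in which each site sends its model parameters (never its data) and the server averages them weighted by local dataset size. The function name, the parameter vectors, and the dataset sizes are all hypothetical.

```python
def federated_average(site_weights, site_sizes):
    """Average per-site parameter vectors, weighted by local dataset size.

    Assumption: plain federated averaging; the study's actual aggregation
    rule is not stated in the abstract.
    """
    total = float(sum(site_sizes))
    n_params = len(site_weights[0])
    return [
        sum(w[i] * n for w, n in zip(site_weights, site_sizes)) / total
        for i in range(n_params)
    ]


# Hypothetical parameter vectors reported by three sites after local training.
site_weights = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
# Hypothetical number of local training examples at each site.
site_sizes = [100, 300, 600]

global_weights = federated_average(site_weights, site_sizes)
print(global_weights)
```

    Weighting by dataset size lets larger sites contribute more to the shared model. With the class and size imbalances the abstract mentions, other weightings may be preferable.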

    MONAI: An open-source framework for deep learning in healthcare

    Artificial Intelligence (AI) is having a tremendous impact across most areas of science. Applications of AI in healthcare have the potential to improve our ability to detect, diagnose, prognose, and intervene on human disease. For AI models to be used clinically, they need to be made safe, reproducible and robust, and the underlying software framework must be aware of the particularities (e.g. geometry, physiology, physics) of the medical data being processed. This work introduces MONAI, a freely available, community-supported, and consortium-led PyTorch-based framework for deep learning in healthcare. MONAI extends PyTorch to support medical data, with a particular focus on imaging, and provides purpose-specific AI model architectures, transformations and utilities that streamline the development and deployment of medical AI models. MONAI follows best practices for software development, providing an easy-to-use, robust, well-documented, and well-tested software framework. MONAI preserves the simple, additive, and compositional approach of its underlying PyTorch libraries. MONAI is being used by and receiving contributions from research, clinical and industrial teams from around the world, who are pursuing applications spanning nearly every aspect of healthcare. Comment: www.monai.i

    Topological obstructions in the way of data-driven collective variables

    Nonlinear dimensionality reduction (NLDR) techniques are increasingly used to visualize molecular trajectories and to create data-driven collective variables for enhanced sampling simulations. The success of these methods relies on their ability to identify the essential degrees of freedom characterizing conformational changes. Here, we show that NLDR methods face serious obstacles when the underlying collective variables present periodicities, e.g., arising from proper dihedral angles. As a result, NLDR methods collapse very distant configurations, thus leading to misinterpretations and inefficiencies in enhanced sampling. Here, we identify this largely overlooked problem and discuss possible approaches to overcome it. We also characterize the geometry and topology of conformational changes of alanine dipeptide, a benchmark system for testing new methods to identify collective variables. Peer Reviewed. Postprint (author’s final draft)
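    The periodicity issue can be made concrete with a small numerical sketch. It is not the authors' method. It uses the standard embedding of two periodic angles as (cos φ, sin φ, cos ψ, sin ψ), which traces a flat torus in four dimensions, and compares distances measured on raw angle values with distances in that embedding. The two conformations and all function names are hypothetical.

```python
import math


def torus_embedding(phi, psi):
    """Map two dihedral angles (radians) to a point on a flat torus in R^4."""
    return (math.cos(phi), math.sin(phi), math.cos(psi), math.sin(psi))


def euclidean(a, b):
    """Plain Euclidean distance between two equal-length tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Two hypothetical conformations whose phi angles sit on either side of the
# +/-pi cut: physically almost identical, numerically far apart.
conf_a = (math.pi - 0.05, 1.0)
conf_b = (-math.pi + 0.05, 1.0)

raw_distance = euclidean(conf_a, conf_b)
torus_distance = euclidean(torus_embedding(*conf_a), torus_embedding(*conf_b))

print(f"distance on raw angles:     {raw_distance:.4f}")
print(f"distance on torus in R^4:   {torus_distance:.4f}")
```

    On raw angle values the two conformations look almost a full period apart, while in the four-dimensional embedding they are neighbors. Any two-dimensional flat map of such data has to cut or fold the torus somewhere, which is the kind of obstruction the abstract describes.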

    Socioeconomic characterization of regions through the lens of individual financial transactions

    People are increasingly leaving digital traces of their daily activities by interacting with their digital environment. Among these traces, financial transactions are of paramount interest since they provide a panoramic view of human life through the lens of purchases, from food and clothes to sport and travel. Although many analyses have studied individual preferences based on credit card transactions, characterizing human behavior at larger scales remains largely unexplored. This is mainly due to the lack of models that can relate individual transactions to macro-socioeconomic indicators. Such models would not only provide nearly real-time information about the socioeconomic characteristics of regions, which is usually available only yearly or quarterly through official statistics, but could also reveal hidden social and economic structures that official indicators cannot capture. In this paper, we aim to elucidate how macro-socioeconomic patterns can be understood based on individual financial decisions. To this end, we reveal the underlying interconnection of the network of spending by leveraging anonymized individual credit/debit card transaction data, craft micro-socioeconomic indices that cover various social and economic aspects of human life, and propose a machine learning framework to predict macro-socioeconomic indicators.

    Does Airway Pressure Release Ventilation Mode Make Difference in Cardiopulmonary Function of ICU Patients?

    Introduction: Tuberculosis (TB), with different types of respiratory tract involvement, has a high mortality rate all around the world. Endobronchial involvement, which is a somewhat common form of tuberculous infection, requires special attention due to its severe complications, such as bronchostenosis. The aim of this study was to describe a type of pulmonary tuberculosis that is underdiagnosed and often treated late. A high index of suspicion is needed for diagnosis, and bronchoscopy may be required to confirm it. The disease can be associated with severe complications, so early diagnosis and treatment are necessary to prevent adverse outcomes. Materials and Methods: This retrospective study was conducted in a teaching hospital during 2005-2010. Patients diagnosed with endobronchial tuberculosis through bronchoscopic biopsy were included in the study. Diagnosis was confirmed by observation of caseous necrosis, bronchial lavage fluid or positive acid-fast staining in tissue samples obtained through bronchial biopsy. Moreover, demographic information, endobronchial view, lab tests, as well as clinical and radiographic findings were reviewed and evaluated retrospectively. Results: A total of 20 cases of endobronchial tuberculosis were confirmed, 75% of whom were female, with a mean age of 60 years. The most common clinical symptom was cough (80%), the most common finding on chest X-ray was consolidation (75%), and the most common bronchoscopic feature was anthracosis (55%). Conclusion: TB is still a major concern, particularly in developing countries. Thus, for early diagnosis and prevention of this disease, we need to pay meticulous attention to its clinical manifestations and bronchoscopic features.

    Comparison of performance of proposed methodology using different regression methods.

    The proposed method can use any regression method at its core. Here, its performance in predicting GDP per capita from micro-socioeconomic indices is illustrated for four common regression methods. The rightmost points can be interpreted as the values obtained without the dimensionality reduction phase.