101 research outputs found

    Design and implementation of a platform for predicting pharmacological properties of molecules

    MSc thesis, Bioinformatics and Computational Biology, Universidade de Lisboa, Faculdade de Ciências, 2019.
The drug discovery and design process is expensive, time-consuming, and resource-intensive. Various in silico methods are used to make the process more efficient and productive. Methods such as virtual screening often take advantage of QSAR machine learning models to pinpoint the most promising drug candidates from large pools of compounds. QSAR (Quantitative Structure-Activity Relationship) is a ligand-based method in which structural information on known ligands of a specific target is used to predict the biological activity of another molecule against that target. QSAR models are also used to improve an existing molecule's pharmacological potential by elucidating which structural features confer the desirable properties. Several researchers create and develop QSAR machine learning models for a variety of therapeutic targets. However, their use is limited by lack of access to said models. Beyond access, there are often difficulties in using published software, given the need to manage dependencies and replicate the development environment. To address this issue, the application documented here was designed and developed. In this centralized platform, researchers can access several QSAR machine learning models and test their own datasets for interaction with various therapeutic targets. The platform accepts widespread molecular identifiers as input, such as SMILES and InChI, handling the conversion into the appropriate molecular descriptors to be used in each model.
The platform can be accessed through a web application with a full graphical user interface developed with the R package Shiny, and through a REST API developed with the Flask-RESTful package for Python. The complete application is packaged using container technology, specifically Docker. The main goal of this platform is to grant widespread access to the QSAR models developed by the scientific community by concentrating them in a single location and removing the user's need to install or set up unfamiliar software, thereby fostering knowledge creation and facilitating the research process.
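The identifier-to-descriptor-to-model flow described above can be sketched in plain Python. This is a hypothetical illustration, not the platform's code: the descriptor function and the dummy model are stand-ins, and a real deployment would compute proper cheminformatics descriptors behind the Shiny and Flask-RESTful front ends.

```python
# Hypothetical sketch of the prediction flow: a molecule identifier
# (SMILES or InChI) is converted into molecular descriptors, which are
# then fed to a per-target QSAR model. All names are illustrative.

def identifier_kind(molecule: str) -> str:
    """Distinguish InChI from SMILES input (InChI strings are prefixed)."""
    return "inchi" if molecule.startswith("InChI=") else "smiles"

def toy_descriptors(smiles: str) -> dict:
    """Stand-in for a real descriptor calculator."""
    return {
        "length": len(smiles),
        "n_rings": smiles.count("1"),  # crude ring-closure count
        "n_aromatic": sum(c.islower() for c in smiles if c.isalpha()),
    }

def predict_activity(molecule: str, model) -> dict:
    kind = identifier_kind(molecule)
    # A real service would convert InChI to SMILES first; here we assume SMILES.
    descriptors = toy_descriptors(molecule)
    return {"input_kind": kind, "active": model(descriptors)}

# Dummy "model": flags molecules containing aromatic atoms as active.
dummy_model = lambda d: d["n_aromatic"] > 0

result = predict_activity("c1ccccc1O", dummy_model)  # phenol-like SMILES
```

A real per-target model would be a trained estimator loaded inside its Docker container; the dictionary-in, label-out shape of the call is the only part carried over here.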

    Optimization of the ion-selective electrode monitoring process

    During ion-selective electrode research, many manual calibrations are needed to achieve better and more effective results. Most researchers in this area are primarily interested in material behaviours and reactions, but there is no way to test these without building the electrode and testing it against various solutions and media. Because of this, the research process usually cannot run continuously, since it is interrupted by the manual steps needed to validate all previous developments. Human-based calibration is fundamental to many conclusions, but most of the calibration logic can be implemented in a machine that automates the process, collects the data, and generates the necessary output. The main goal of this MSc thesis is to optimize the calibration process by building a fully configurable, replicable calibration box that replaces the researcher's manual steps during membrane testing. Herein, it is possible to find the selected hardware and software architecture, as well as the business logic implemented to automate this process. The architecture and implementation were designed to work with a digital potentiometer (Crison GLP 21 pH potentiometer) and a precise digital fluid pump (Legato 100).
The solution contains four main components:
• External devices (digital pH meter, digital peristaltic pump);
• Server-side web services and web application;
• Cloud-based deployment (serverless, storage, database);
• Hardware automation box.
The implementation of this thesis uses the following technologies:
• Code: Java, Spring Boot, Spring Data;
• Data: JSON, Excel XML OpenFormat, PostgreSQL;
• Hardware: Raspberry Pi, relay boards, LCD display, button, LED;
• I/O: Ethernet, USB-to-RS232, RCA, BNC;
• Cloud: Heroku Serverless, Heroku PostgreSQL, Amazon S3 storage.
All the membranes used were made by BioMark researchers.
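The automated calibration loop described above can be sketched as follows. This is an illustrative sketch, not the thesis code: the device classes are stand-ins for the real Crison GLP 21 and Legato 100 drivers (which in the actual system speak over USB-to-RS232), and the readings are fabricated.

```python
# Illustrative calibration loop: the pump adds a known standard volume,
# the electrode potential is read, and the collected (volume, mV) points
# form the calibration data set. Device classes are hypothetical stubs.

class StubPump:
    def infuse(self, volume_ml: float) -> None:
        pass  # a real driver would command the Legato 100 over serial

class StubMeter:
    def __init__(self, readings):
        self._readings = iter(readings)
    def read_mv(self) -> float:
        return next(self._readings)  # a real driver polls the Crison GLP 21

def run_calibration(pump, meter, additions_ml):
    """Add each standard volume, read the electrode, collect (volume, mV)."""
    points = []
    total = 0.0
    for vol in additions_ml:
        pump.infuse(vol)
        total += vol
        points.append((total, meter.read_mv()))
    return points

points = run_calibration(StubPump(),
                         StubMeter([210.0, 185.5, 160.9]),
                         [0.1, 0.1, 0.1])
```

In the real box, this loop would run on the Raspberry Pi, with the server-side application storing each point for later analysis by the researcher.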

    Exploring Kubernetes and DevOps in an IoT context

    Containerized solutions and container orchestration technologies have recently been of great interest to organizations as a way of accelerating both software development and delivery processes. However, adopting them is a rather complex shift that may impact an organization and teams that were already established. This is where development cultures such as DevOps emerge to ease such a shift, promoting collaboration and automation of development and deployment processes throughout. The purpose of this dissertation is to illustrate the path that led to the use of DevOps and containerization as means to support the development and deployment of a proof-of-concept system, Firefighter Sync, an Internet of Things-based solution applied to a firefighting monitoring scenario. The goal, besides implementing Firefighter Sync, was to propose and deploy a development and operations ecosystem based on DevOps practices to achieve a fully automated pipeline for both the development and operations processes. Firefighter Sync enabled the exploration of state-of-the-art solutions such as Kubernetes, to support container-based deployment, and Jenkins, for a fully automated CI/CD pipeline. Firefighter Sync clearly illustrates that addressing the development of a system from a DevOps perspective from the very beginning, although it demands a steep learning curve due to the broad range of concepts and technologies involved, effectively improves the development process and eases future evolution of the solution.
A good example is the automation pipeline: while it allows easy integration of new features within a DevOps process (which implies addressing development and operations as a whole), it abstracts specific technological concerns, making them transversal to the traditional stages from development to deployment.
MSc in Informatics Engineering
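The container-based deployment mentioned above can be illustrated with a minimal Kubernetes manifest. This is a hypothetical example, not taken from Firefighter Sync: the names, labels, image, and port are placeholders.

```yaml
# Hypothetical Deployment manifest illustrating container-based
# deployment on Kubernetes; all names and the image are placeholders.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: sensor-gateway
spec:
  replicas: 2
  selector:
    matchLabels:
      app: sensor-gateway
  template:
    metadata:
      labels:
        app: sensor-gateway
    spec:
      containers:
        - name: gateway
          image: example/sensor-gateway:1.0.0
          ports:
            - containerPort: 8080
```

In a CI/CD setup such as the one described, a Jenkins pipeline would build the image, push it to a registry, and apply a manifest like this one to roll out the new version.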

    On-premise containerized, light-weight software solutions for Biomedicine

    Bioinformatics software systems are critical tools for analysing large-scale biological data, but their design and implementation can be challenging due to the need for reliability, scalability, and performance. This thesis investigates the impact of several software approaches on the design and implementation of bioinformatics software systems. These approaches include software patterns, microservices, distributed computing, containerisation, and container orchestration. The research focuses on understanding how these techniques affect bioinformatics software systems' reliability, scalability, performance, and efficiency. Furthermore, this research highlights the challenges and considerations involved in their implementation. This study also examines potential solutions for implementing container orchestration in bioinformatics research teams with limited resources and the challenges of using container orchestration. Additionally, the thesis considers microservices and distributed computing and how these can be optimised in the design and implementation process to enhance the productivity and performance of bioinformatics software systems. The research was conducted using a combination of software development, experimentation, and evaluation. The results show that implementing software patterns can significantly improve the code accessibility and structure of bioinformatics software systems. Microservices and containerisation likewise enhanced system reliability, scalability, and performance. Additionally, the study indicates that adopting advanced software engineering practices, such as model-driven design and container orchestration, can facilitate efficient and productive deployment and management of bioinformatics software systems, even for researchers with limited resources. Finally, we developed a software system integrating all of our findings; the proposed system demonstrated the ability to address these challenges in bioinformatics.
The thesis makes several key contributions in addressing the research questions surrounding the design, implementation, and optimisation of bioinformatics software systems using software patterns, microservices, containerisation, and advanced software engineering principles and practices. Our findings suggest that incorporating these technologies can significantly improve bioinformatics software systems' reliability, scalability, performance, efficiency, and productivity.

    Implementation of new tools and approaches for the reconstruction of genome-scale metabolic models

    MSc dissertation in Bioinformatics. The reconstruction of high-quality genome-scale metabolic (GSM) models can play a relevant role in the investigation and study of an organism, since these mathematical models can be used to phenotypically manipulate an organism and predict its response, in silico, under different environmental conditions or genetic modifications. Several bioinformatics tools and software packages have been developed to facilitate and accelerate the reconstruction of these models by automating some of the steps that compose the traditional reconstruction process. “Metabolic Models Reconstruction Using Genome-Scale Information” (merlin) is a free, user-friendly Java application that automates the main stages of the reconstruction of a GSM model for any microorganism. Although it has already been used successfully in several works, many plugins are still being developed to improve its resources and make it more accessible to any user. In this work, the new tools integrated into merlin are described in detail, as well as improvements to other features of the platform. The general improvements and the newly implemented tools enhance the overall user experience during the reconstruction of GSM models in merlin. The main feature implemented in this work is the incorporation of the BiGG Integration Tool (BIT) into merlin. This plugin collects the metabolic data that make up the models in the BiGG Models database and associates it, by homology, with the genome of the organism under study, creating, where possible, the boolean rule for each BiGG reaction in the model under construction. All the computation required to execute merlin's BIT takes place remotely, to accelerate the process. Within a few minutes, the results are returned by the server and imported into the user's workspace.
Running the tool outside the user's machine also brings advantages in terms of information storage, since the BiGG data structure that supports the entire tool is available remotely. The implementation of this tool provides an alternative to obtaining metabolic information from the KEGG database, the only option available in merlin so far. To test the implemented tool, several draft genome-scale metabolic networks were generated and analyzed.
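The boolean rules mentioned above encode which genes enable a reaction. As a hedged sketch (this is not merlin's actual code, and the gene identifiers are invented), one common convention is to join subunits of the same enzyme complex with AND and alternative isoenzymes with OR:

```python
# Illustrative construction of a boolean gene rule for a reaction:
# each inner list is one enzyme complex (subunits joined with "and"),
# and alternative complexes/isoenzymes are joined with "or".

def boolean_rule(complexes):
    """complexes: list of gene lists; each inner list is one enzyme complex."""
    parts = []
    for genes in complexes:
        clause = " and ".join(sorted(genes))
        parts.append(f"({clause})" if len(genes) > 1 else clause)
    return " or ".join(parts)

# Two alternatives: a two-subunit complex and a single isoenzyme gene.
rule = boolean_rule([["b0001", "b0002"], ["b0077"]])
# rule == "(b0001 and b0002) or b0077"
```

In the tool described, the gene lists would come from homology hits against the genome under study rather than being given by hand.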

    Workflow models for heterogeneous distributed systems

    The role of data in modern scientific workflows is becoming more and more crucial. The unprecedented amount of data available in the digital era, combined with recent advancements in Machine Learning and High-Performance Computing (HPC), has let computers surpass human performance in a wide range of fields, such as Computer Vision, Natural Language Processing, and Bioinformatics. However, a solid data management strategy is crucial for key aspects like performance optimisation, privacy preservation, and security. Most modern programming paradigms for Big Data analysis adhere to the principle of data locality: moving computation closer to the data to remove transfer-related overheads and risks. Still, there are scenarios in which it is worthwhile, or even unavoidable, to transfer data between different steps of a complex workflow. The contribution of this dissertation is twofold. First, it defines a novel methodology for distributed modular applications, allowing topology-aware scheduling and data management while separating business logic, data dependencies, parallel patterns, and execution environments. In addition, it introduces computational notebooks as a high-level and user-friendly interface to this new kind of workflow, aiming to flatten the learning curve and improve the adoption of the methodology. Each of these contributions is accompanied by a full-fledged, open-source implementation, which has been used for evaluation purposes and allows the interested reader to experience the related methodology first-hand. The validity of the proposed approaches has been demonstrated on five real scientific applications in the domains of Deep Learning, Bioinformatics, and Molecular Dynamics simulation, executing them on large-scale mixed cloud-HPC infrastructures.
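The idea of separating a step's business logic from its declared data dependencies can be sketched in a few lines. This is a toy illustration of the general pattern, not the dissertation's implementation: steps declare inputs and outputs, and a tiny scheduler derives the execution order from those declarations.

```python
# Toy workflow engine: each step declares (function, inputs, output);
# a topological sort of the dependency graph gives a valid run order,
# so business logic stays separate from scheduling concerns.

from graphlib import TopologicalSorter

steps = {
    "load":   (lambda ctx: [1, 2, 3],                   [],        "raw"),
    "clean":  (lambda ctx: [x * 2 for x in ctx["raw"]], ["raw"],   "clean"),
    "report": (lambda ctx: sum(ctx["clean"]),           ["clean"], "total"),
}

def run(steps):
    # A step depends on whichever step produces each of its inputs.
    producer = {out: name for name, (_, _, out) in steps.items()}
    graph = {name: {producer[i] for i in ins}
             for name, (_, ins, _) in steps.items()}
    ctx = {}
    for name in TopologicalSorter(graph).static_order():
        fn, _, out = steps[name]
        ctx[out] = fn(ctx)
    return ctx

ctx = run(steps)  # ctx["total"] == 12
```

A topology-aware scheduler of the kind described would additionally decide *where* each step runs (cloud vs. HPC) based on this same dependency information, rather than only in what order.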

    Tox21Enricher-Shiny: an R Shiny application for toxicity functional annotation analysis

    Inference of toxicological and mechanistic properties of untested chemicals through structural or biological similarity is a commonly employed approach for initial chemical characterization and hypothesis generation. We previously developed a web-based application, Tox21Enricher-Grails, on the Grails framework that identifies enriched biological/toxicological properties of chemical sets for the purpose of inferring properties of untested chemicals within the set. It was able to detect significantly overrepresented biological (e.g., receptor binding), toxicological (e.g., carcinogenicity), and chemical (e.g., toxicologically relevant chemical substructures) annotations within sets of chemicals screened in the Tox21 platform. Here, we present an R Shiny application version of Tox21Enricher-Grails, Tox21Enricher-Shiny, with more robust features and updated annotations. Tox21Enricher-Shiny allows users to interact with the web application component (available at http://hurlab.med.und.edu/Tox21Enricher/) through a user-friendly graphical user interface or to directly access the application’s functions through an application programming interface. This version now supports InChI strings as input in addition to CASRN and SMILES identifiers. Input chemicals that contain certain reactive functional groups (nitrile, aldehyde, epoxide, and isocyanate groups) may react with proteins in cell-based Tox21 assays: this could cause Tox21Enricher-Shiny to produce spurious enrichment analysis results. Therefore, this version of the application can now automatically detect and ignore such problematic chemicals in a user’s input. The application also offers new data visualizations, and the architecture has been greatly simplified to allow for simple deployment, version control, and porting. The application may be deployed onto a Posit Connect or Shiny server, and it uses Postgres for database management. 
    As other Tox21-related tools are being migrated to the R Shiny platform, the development of Tox21Enricher-Shiny is a logical transition to leverage R's strong data analysis and visualization capacities and to provide aesthetic and developmental consistency with other Tox21 applications developed by the Division of Translational Toxicology (DTT) at the National Institute of Environmental Health Sciences (NIEHS).
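The reactive-group screening described above can be caricatured in plain Python. This is a deliberately naive sketch, not Tox21Enricher's implementation: real tools use proper substructure (SMARTS) matching, e.g. via RDKit, whereas the substring tests here are only illustrative and will miss or mis-flag many structures.

```python
# Naive flagging of input SMILES that hint at reactive functional groups.
# Substring matching is NOT chemically sound; it only illustrates the idea
# of filtering problematic chemicals before enrichment analysis.

REACTIVE_HINTS = {
    "nitrile": "C#N",
    "isocyanate": "N=C=O",
    "aldehyde": "C=O",   # overly broad: also matches ketones, acids, ...
}

def flag_reactive(smiles: str):
    """Return the sorted names of reactive-group hints found in a SMILES."""
    return sorted(name for name, pat in REACTIVE_HINTS.items() if pat in smiles)

flags = flag_reactive("CCC#N")  # propionitrile-like SMILES
clean = flag_reactive("CCO")    # ethanol: no hits
```

In the application itself, chemicals flagged this way are automatically excluded from the user's input so they cannot distort the enrichment results.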

    Web-Interface for querying and visualizing Alcoholic Liver Disease Patients’ data from database using GraphQL

    Alcoholism is one of the most serious and most common problems faced by modern societies.
Approximately 5-10% of the population in European countries abuses alcohol, with prolonged alcohol consumption causing liver fibrosis and cirrhosis (alcoholic liver disease, ALD). Alcoholic disease progresses from fatty liver to alcoholic hepatitis and, finally, cirrhosis of the liver. The early stages of fibrosis and alcoholic hepatitis are symptomless, and when the disease finally manifests, the clinical picture is acute. In clinical practice, the diagnosis of ALD is based on the history of alcohol ingestion, patient symptomatology, and laboratory tests (e.g. liver enzymes, blood pressure, blood glucose, etc.). This dissertation aims to create a database for the collection and classification of all laboratory, clinical, and other examinations of patients. Data queries, plots, and charts are generated in real time using GraphQL queries and middleware query caching. The design of the interface takes into account changes in the data, as well as reuse of the tool with different kinds of data from other tests or experiments; it can run on all types of computing systems, as it is containerized and responsive. This bioinformatics tool will help physicians and researchers simplify data selection, analysis, and visualization using graphs and diagrams of all the data. As a result, the tool eases the day-to-day schedule of physicians and researchers, letting them focus more on the essence of research, i.e. drawing conclusions about the main categories of data that lead patients to alcoholic liver disease, and less on process.
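A GraphQL interface of the kind described lets the client ask for exactly the fields it needs for a given chart. The query below is a hypothetical shape only; the field and argument names are assumptions, not the application's actual schema.

```graphql
# Hypothetical query: fetch one patient's liver-enzyme results for plotting.
query PatientLiverPanel($id: ID!) {
  patient(id: $id) {
    labTests(category: "liver_enzymes") {
      name
      value
      unit
      date
    }
  }
}
```

Because the client controls the selection set, the same endpoint can serve very different visualizations (and future data types) without server-side changes, which is the reusability property the abstract emphasizes.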

    Automated Identification of Targeted Therapy Strategies in Precision Oncology

    Precision in cancer treatment builds upon targeted strategies tailored to the genomic traits that instigate pathological abnormalities in patients. Translating phenotype-genotype relationships into oncology clinics has led to a less costly and more efficient cancer care model. However, its implementation remains challenging, due to a complex analysis trajectory requiring various bioinformatics tools and databases; it relies on the individual expertise of molecular tumor boards (MTBs) executing a non-standardised framework with a limited number of pharmacogenomics sources. The disadvantages of existing tools stem from requiring programming skills, not addressing data privacy concerns, the large number of clinical evidence databases, and the lack of a graphical user interface (GUI) tailored to the MTB workflow. We created ClinVAP, a cohesive framework for clinical annotation of genomic variants, which automates the generation of patient-specific diagnostic reports by translating the long list of mutations into clinical implications. We enriched it with gene-gene interactions that also reveal the content of disrupted pathways. We provide the combined results in an interactive GUI that isolates backend operations from the users and allows them to work through the results.
We measured the adaptability of ClinVAP using retrospective cases, comparing their content-wise equivalence with the MTB's manual implementation. The differences were mainly based on expert opinion. The content and the structure of the automated patient reports form a comprehensive foundation for decision making. The future of precision oncology depends on the accessibility of the accumulated molecular knowledge of disease-contributing factors. The number of bioinformatics tools and the sheer size of genome data are barriers to making this information available in hospitals. Our solutions not only increase their clinical applicability, but also demonstrate the field's readiness to generate automated solutions. Moreover, standardization and archiving will facilitate population studies, allowing molecular analyses to be archived and returned to the system as information.
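The core annotation step, translating a list of mutations into clinical implications, can be sketched as a lookup against an evidence table. The table entries and the function below are illustrative assumptions for this sketch, not ClinVAP's actual data model or API:

```python
# Illustrative sketch only: a toy evidence table and a lookup that maps a
# patient's variants to known drug associations. A real pipeline such as
# ClinVAP queries curated clinical evidence databases; the entries here
# are hypothetical placeholders.
EVIDENCE_DB = {
    # gene -> list of (protein change, associated drug, evidence level)
    "BRAF": [("V600E", "vemurafenib", "A")],
    "EGFR": [("L858R", "gefitinib", "A")],
}

def annotate(variants):
    """Return clinical implications for (gene, protein change) pairs."""
    report = []
    for gene, change in variants:
        for known, drug, level in EVIDENCE_DB.get(gene, []):
            if change == known:
                report.append({"gene": gene, "variant": change,
                               "drug": drug, "evidence": level})
    return report

# Variants without an evidence entry (here TP53 R175H) are simply omitted.
print(annotate([("BRAF", "V600E"), ("TP53", "R175H")]))
```

In a production system the table would of course be replaced by queries against versioned clinical databases, which is where the data-privacy and standardization concerns discussed above arise.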

    Decoding Microbial Genomes: Novel User-Friendly Tools Applied to Fermented Foods

    Over the past two decades, the cost of DNA sequencing per base has fallen faster than Moore's law would predict. Many organizations and research groups have exploited this trend, generating large amounts of genomic data and making it possible to tackle new research questions. This growth also brings challenges, including the need for faster algorithms, more efficient ways to visualize and explore data, more automated data processing, and systematic data management. For more than a century, Agroscope has been collecting lactic acid bacteria (LAB) from the Swiss dairy environment. Today, the collection comprises more than 10’000 strains, and so far the genomes of about 15% of them have been sequenced. The overarching goal of this thesis is to find new ways of exploiting this genetic potential to design new fermented food products with potential additional health benefits, and to understand the underlying mechanisms. One compound with potential health benefits is indole. Previous experiments have shown that indole compounds modulate the gut immune system via the aryl hydrocarbon receptor (AhR). Our objective was to create a yoghurt enriched in indole metabolites through fermentation, and then to examine whether maternal consumption of this yoghurt would enhance gut immune system maturation in germ-free mice. To reduce the number of strains to test, I developed comparative genomics tools to pre-select strains from the strain collection. This led to the successful development of a yoghurt with significantly increased AhR activation activity. In germ-free mice, we observed the expected effect. Based on these comparative genomics tools, I developed the software OpenGenomeBrowser to enable biologists, who know their organisms of interest in great detail, to efficiently explore the genomic data by themselves, without bioinformatics skills or the need for a middleman bioinformatician.
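The strain pre-selection mentioned above can be sketched as filtering the collection for genomes that carry a marker gene of the pathway of interest. The example below uses tryptophanase (tnaA), the enzyme that converts tryptophan to indole; the annotation data and the actual selection criteria used in the thesis are assumptions for illustration:

```python
# Hypothetical sketch of comparative-genomics pre-selection: keep only
# strains whose annotated gene set contains every required marker gene.
# Strain names and gene sets below are invented toy data.
GENOME_ANNOTATIONS = {
    "strain_A": {"tnaA", "ldh", "pfk"},
    "strain_B": {"ldh", "pfk"},
    "strain_C": {"tnaA", "ldh"},
}

def preselect(annotations, required_genes):
    """Return strains whose annotation contains all required genes."""
    required = set(required_genes)
    return sorted(s for s, genes in annotations.items() if required <= genes)

print(preselect(GENOME_ANNOTATIONS, ["tnaA"]))  # ['strain_A', 'strain_C']
```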
The foundation of OpenGenomeBrowser is a simple system for transparent data management of microbial genomes, which makes it possible to automate common bioinformatics workflows. In addition, I built a user-friendly website based on modern web technologies to facilitate these workflows. Because of this solid foundation, OpenGenomeBrowser is the first software of its kind that can be self-hosted and is dataset-independent, making it potentially useful for many similar genome datasets. During the project, we measured thousands of metabolites in yoghurts made using different strains. However, we found that no existing tools could adequately connect such a high-dimensional phenotypic dataset to the genomic information, i.e., the presence or absence of orthogenes. Finding high-confidence causative links between these datasets is challenging because of the properties of microbial genomes. For instance, clonal reproduction leads to genome-wide linkage disequilibrium, which prohibits the use of techniques developed for human genome-wide association studies (hGWAS). To this end, I developed Scoary2, a complete rewrite and extension of the original microbial GWAS (mGWAS) software Scoary. The key improvements include an implementation of the core algorithm that is orders of magnitude faster and an interactive web app that enables efficient exploration of the output, which is crucial given the size of the dataset. With this software, we discovered two previously uncharacterized genes involved in carnitine metabolism.
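The statistical core of a Scoary-style gene-trait association can be illustrated with Fisher's exact test on a 2x2 contingency table of orthogene presence/absence versus a binary trait. This is a minimal sketch of the basic association step only; it omits Scoary2's corrections for population structure (the pairwise-comparisons algorithm), and the function names and toy data are assumptions:

```python
from math import comb

def fisher_exact_p(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the 2x2 table [[a, b], [c, d]]."""
    n, row1, col1 = a + b + c + d, a + b, a + c
    def prob(x):  # hypergeometric probability of x in the top-left cell
        return comb(row1, x) * comb(n - row1, col1 - x) / comb(n, col1)
    p_obs = prob(a)
    lo, hi = max(0, col1 - (n - row1)), min(row1, col1)
    # Sum the probabilities of all tables at least as extreme as the observed one.
    return sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))

def gene_trait_association(presence, trait):
    """presence, trait: dicts strain -> bool; returns the association p-value."""
    a = sum(presence[s] and trait[s] for s in presence)
    b = sum(presence[s] and not trait[s] for s in presence)
    c = sum(not presence[s] and trait[s] for s in presence)
    d = sum(not presence[s] and not trait[s] for s in presence)
    return fisher_exact_p(a, b, c, d)

# Toy example: a gene present in exactly the 4 trait-positive strains out of 8.
strains = [f"s{i}" for i in range(8)]
presence = {s: i < 4 for i, s in enumerate(strains)}
trait = dict(presence)  # perfectly correlated trait
print(gene_trait_association(presence, trait))  # 2/70, about 0.0286
```

Because clonal reproduction links genes across the whole genome, a low p-value here is only a starting point; this is exactly why Scoary-style tools add lineage-aware corrections on top of the raw test.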