
    Structural Cheminformatics for Kinase-Centric Drug Design

    Drug development is a long, expensive, and iterative process with a high failure rate, while patients wait impatiently for treatment. Kinases have been among the main drug targets studied over the last decades to combat cancer, the second leading cause of death worldwide. These efforts have resulted in a plethora of structural, chemical, and pharmacological kinase data, which are collected in the KLIFS database. In this thesis, we apply ideas from structural cheminformatics to the rich KLIFS dataset, aiming to provide computational tools that speed up the complex drug discovery process. We focus on methods for target prediction and fragment-based drug design that study characteristics of kinase binding sites (also called pockets).
    First, we introduce the concept of computational target prediction, which is vital in the early stages of drug discovery. This approach identifies biological entities such as proteins that may (i) modulate a disease of interest (targets or on-targets) or (ii) cause unwanted side effects due to their similarity to on-targets (off-targets). We focus on the research field of binding site comparison, which lacked a freely available and efficient tool to determine similarities between the highly conserved kinase pockets. We fill this gap with the novel method KiSSim, which encodes and compares spatial and physicochemical pocket properties for all structurally resolved kinases (the kinome). We study kinase similarities in the form of kinome-wide phylogenetic trees and detect expected and unexpected off-targets. To allow multiple perspectives on kinase similarity, we propose an automated and production-ready pipeline; user-defined kinases can be inspected complementarily based on their pocket sequence and structure (KiSSim), pocket-ligand interactions, and ligand profiles.
    Second, we introduce the concept of fragment-based drug design, which is useful to identify and optimize active and promising molecules (hits and leads). This approach identifies low-molecular-weight molecules (fragments) that bind weakly to a target and are then grown into larger, high-affinity, drug-like molecules. With the novel method KinFragLib, we provide a fragment dataset for kinases (a fragment library) by viewing kinase inhibitors as combinations of fragments. Kinases have a highly conserved pocket with well-defined regions (subpockets); we fragment kinase inhibitors in experimentally resolved protein-ligand complexes based on the subpockets they occupy. The resulting dataset is used to generate novel kinase-focused molecules that recombine the previously fragmented kinase inhibitors while respecting their subpockets. The KinFragLib and KiSSim methods are published as freely available Python tools.
    Third, we advocate for open and reproducible research that applies the FAIR principles (data and software shall be findable, accessible, interoperable, and reusable) and software best practices. In this context, we present the TeachOpenCADD platform, which contains pipelines for computer-aided drug design. We use open source software and data to demonstrate ligand-based applications from cheminformatics and structure-based applications from structural bioinformatics. To emphasize the importance of FAIR data, we dedicate several topics to accessing life science databases such as ChEMBL, PubChem, PDB, and KLIFS. These pipelines are not only useful for novices in the field to gain domain-specific skills but can also serve as a starting point for studying research questions.
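    As a concrete illustration of the binding-site comparison at the heart of KiSSim, the sketch below encodes each pocket as a fixed-length numeric fingerprint (one feature vector over the 85 KLIFS pocket residues) and compares fingerprints all-against-all; the feature values, kinase selection, and distance metric are illustrative assumptions, not KiSSim's actual encoding.

```python
# Minimal sketch of fingerprint-based pocket comparison (illustrative only;
# not KiSSim's actual encoding). Each kinase pocket becomes a flat vector of
# per-residue physicochemical features; pairwise distances form the basis of
# the kinome-wide trees mentioned above.
import numpy as np
from scipy.spatial.distance import pdist, squareform
from scipy.cluster.hierarchy import linkage

N_RESIDUES = 85   # length of the KLIFS pocket alignment
N_FEATURES = 3    # e.g., size, charge, exposure (placeholder features)

rng = np.random.default_rng(42)
kinases = ["EGFR", "BRAF", "CDK2", "ABL1"]
fingerprints = np.vstack(
    [rng.random(N_RESIDUES * N_FEATURES) for _ in kinases]  # random stand-ins
)

# All-against-all Euclidean distances between pocket fingerprints.
distance_matrix = squareform(pdist(fingerprints, metric="euclidean"))
print(np.round(distance_matrix, 2))

# Hierarchical clustering of these distances yields a kinome tree.
tree = linkage(pdist(fingerprints), method="average")
```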
    Furthermore, we show an example of how to build a stand-alone tool that formalizes recurring, project-overarching tasks: OpenCADD-KLIFS offers a clean and user-friendly Python API to interact with the KLIFS database and fetch different kinase data types. This tool has been used in this thesis and beyond to support kinase-focused projects. We believe that the FAIR-based methods, tools, and pipelines presented in this thesis (i) are valuable additions to the toolbox for kinase research, (ii) provide relevant material for scientists who seek to learn, teach, or answer questions in the realm of computer-aided drug design, and (iii) contribute to making drug discovery more efficient, reproducible, and reusable.
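    The snippet below sketches how OpenCADD-KLIFS is typically used; the call names follow our recollection of the opencadd documentation and should be treated as assumptions that may differ between package versions.

```python
# Hedged usage sketch for OpenCADD-KLIFS; method names are assumptions based
# on the opencadd documentation and may vary across versions.
from opencadd.databases.klifs import setup_remote

session = setup_remote()                 # open a remote session to KLIFS
kinases = session.kinases.all_kinases()  # all kinases as a pandas DataFrame
structures = session.structures.by_kinase_name("EGFR")  # structures for EGFR
print(kinases.shape, structures.shape)
```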

    Technology for Comprehensive Life-Cycle Support of Next-Generation Semantically Compatible Intelligent Computer Systems

    This publication describes the current version of the open technology for the ontological design, production, and operation of semantically compatible hybrid intelligent computer systems (the OSTIS Technology). It proposes a standardization of intelligent computer systems, as well as of the methods and tools for their design, as the key factor in ensuring semantic compatibility between intelligent computer systems and their components, which substantially reduces the effort required to develop such systems. The book is intended for anyone interested in problems of artificial intelligence, as well as for specialists in intelligent computer systems and knowledge engineering. It can also be used by undergraduate, master's, and doctoral students in the "Artificial Intelligence" specialty. 8 tables, 223 figures, 665 references.

    A Digital Triplet for Utilizing Offline Environments to Train Condition Monitoring Systems for Rolling Element Bearings

    Manufacturing competitiveness is related to making a quality product while incurring the lowest costs. Unexpected downtime caused by equipment failure negatively impacts manufacturing competitiveness due to the ensuing defects and delays. Manufacturers have adopted condition monitoring (CM) techniques to reduce unexpected downtime and augment maintenance strategies. CM adoption has transitioned maintenance from Breakdown Maintenance (BM) to Condition-Based Maintenance (CbM), which anticipates impending failures and triggers maintenance actions before equipment failure. CbM is the umbrella term for maintenance strategies that use condition monitoring techniques, such as Preventive Maintenance (PM) and Predictive Maintenance (PdM). Preventive Maintenance involves periodic checks based on either time or sensory input. Predictive Maintenance utilizes continuous or periodic sensory inputs to determine the machine health state and predict equipment failure. The overall goal of this work is to improve bearing diagnostic and prognostic predictions for equipment health by utilizing surrogate systems to generate failure data that represents production equipment failure, thereby providing training data for condition monitoring solutions without waiting for real-world failure data. This research addresses the challenge of obtaining failure data for CM systems by incorporating a third system into monitoring strategies, creating a Digital Triplet (DTr) that increases the amount of data available for condition monitoring. Bearings are a critical component in rotational manufacturing systems, with wide application to industries outside of manufacturing, such as energy and defense. The DTr system considers three components: the physical, surrogate, and digital systems. The physical system represents the real-world application in production that cannot be allowed to fail. The surrogate system represents a physical component in a test system in an offline environment, where data is generated to fill gaps left by data unavailable from the real-world system. The digital system is the CM system, which provides maintenance recommendations based on the data ingested from the real-world and surrogate systems. In pursuing the research goal, a comprehensive bearing dataset detailing four failure modes under different operating parameters was created. The collections occurred under different operating conditions: speed-varying, load-varying, and steady-state. Different frequency- and time-domain measures were used to analyze and identify criteria that differentiate the failure classes across the operating conditions. These empirical observations were recreated using simulations to filter out potential outliers. The outputs of the physical model were combined with knowledge from the empirical observations to create "spectral deltas" that augment existing bearing data and create new failure data with frequency characteristics similar to the original data. The primary verification occurred on a laboratory bearing test stand. A conjecture is provided on how to scale to a larger system by analyzing a larger system from a local manufacturer. The subsequent analysis of machine learning diagnosis and prognosis models shows that the original and augmented bearing data can complement each other during model training.
    The subsequent data substitution verifies that bearing data collected under different operating conditions and sizes can be substituted between different systems. Ostensibly, the full formulation of the Digital Triplet system is that bearing data generated at a smaller size can be scaled to train predictive failure models for larger bearing sizes. Future work should consider implementing this method for systems other than bearings, such as gears; non-rotational equipment, such as pumps; or even larger complex systems, such as computer numerically controlled machine tools or car engines. In addition, the method should not be restricted to mechanical systems and could be applied to electrical systems, such as batteries. Furthermore, an investigation should consider further data-driven approximations of specific bearing characteristics related to the stiffness and damping parameters needed in modeling. A final consideration is further investigation into the scalability quantities within the data and how to track these changes through different system levels.
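    The "spectral deltas" idea can be sketched in a few lines: subtract a healthy magnitude spectrum from a faulty one, then graft the difference onto healthy data from another system to synthesize failure-like training samples. The signal parameters and fault frequency below are illustrative placeholders, not values from the dataset described above.

```python
# Illustrative sketch of spectral-delta augmentation (placeholder signals).
import numpy as np

fs = 20_000                       # sampling rate [Hz], assumed for the sketch
t = np.arange(0, 1.0, 1 / fs)

healthy = np.random.default_rng(0).normal(0.0, 0.1, t.size)
# A fault adds periodic impact energy at a characteristic frequency
# (157 Hz here is a placeholder, e.g., an outer-race defect frequency).
faulty = healthy + 0.5 * np.sin(2 * np.pi * 157.0 * t)

# Spectral delta: difference of magnitude spectra between faulty and healthy.
delta = np.abs(np.fft.rfft(faulty)) - np.abs(np.fft.rfft(healthy))

# Graft the delta onto a healthy signal from a different system,
# keeping that signal's own phase.
other = np.random.default_rng(1).normal(0.0, 0.1, t.size)
spectrum = np.fft.rfft(other)
augmented_magnitude = np.maximum(np.abs(spectrum) + delta, 0.0)
augmented = np.fft.irfft(
    augmented_magnitude * np.exp(1j * np.angle(spectrum)), n=t.size
)
```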

    Novel Architectures for Offloading and Accelerating Computations in Artificial Intelligence and Big Data

    Due to the end of Moore's Law and Dennard Scaling, performance gains in general-purpose architectures have slowed significantly in recent years. While increasing the number of cores has been a viable approach for further performance gains, Amdahl's Law and its implications for parallelization limit such gains as well. Consequently, research has shifted towards different approaches, including domain-specific custom architectures tailored to specific workloads. This has led to a new golden age for computer architecture, as noted in the Turing Award Lecture by Hennessy and Patterson, which has spawned several new architectures and architectural advances specifically targeted at currently prominent workloads, including Machine Learning. This thesis introduces a hierarchy of architectural improvements ranging from minor incremental changes, such as High-Bandwidth Memory, to more complex architectural extensions that offload workloads from the general-purpose CPU towards more specialized accelerators. Finally, we introduce novel architectural paradigms, namely Near-Data and In-Network Processing, as the most complex architectural improvements. This cumulative dissertation then investigates several architectural improvements to accelerate Sum-Product Networks, a novel Machine Learning approach from the class of Probabilistic Graphical Models. Furthermore, we use these improvements as case studies to discuss the impact of novel architectures, showing that both minor and major architectural changes can significantly increase performance in Machine Learning applications. In addition, this thesis presents recent work on Near-Data Processing, which introduces Smart Storage Devices as a novel architectural paradigm that is especially interesting in the context of Big Data. We discuss how Near-Data Processing can be applied to improve performance in different database settings by offloading database operations to smart storage devices. Offloading data-reductive operations, such as selections, reduces the amount of data transferred, thus improving performance and alleviating bandwidth-related bottlenecks. Using Near-Data Processing as a use case, we also discuss how Machine Learning approaches, like Sum-Product Networks, can improve novel architectures. Specifically, we introduce an approach for offloading Cardinality Estimation using Sum-Product Networks that could enable more intelligent decision-making in smart storage devices. Overall, we show that Machine Learning can benefit from the development of novel architectures, while also showing that Machine Learning can be applied to improve the applications of novel architectures.
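    To make the Sum-Product Network workload concrete, the sketch below evaluates a tiny SPN: leaves are univariate densities, product nodes combine children over disjoint variable scopes, and sum nodes form weighted mixtures. The structure and parameters are illustrative, not taken from the dissertation.

```python
# Minimal SPN evaluation sketch (illustrative structure and parameters).
import math

def leaf_gaussian(mean, std, var_index):
    """Univariate Gaussian density over one variable of the input vector."""
    def density(x):
        z = (x[var_index] - mean) / std
        return math.exp(-0.5 * z * z) / (std * math.sqrt(2 * math.pi))
    return density

def product(*children):
    """Product node: children must cover disjoint variable scopes."""
    return lambda x: math.prod(child(x) for child in children)

def weighted_sum(weights, children):
    """Sum node: weighted mixture of children over the same scope."""
    return lambda x: sum(w * child(x) for w, child in zip(weights, children))

# A tiny SPN over two variables: a mixture of two independence models.
spn = weighted_sum(
    [0.3, 0.7],
    [
        product(leaf_gaussian(0.0, 1.0, 0), leaf_gaussian(5.0, 2.0, 1)),
        product(leaf_gaussian(3.0, 1.0, 0), leaf_gaussian(-1.0, 0.5, 1)),
    ],
)
print(spn([1.0, 2.0]))  # joint density at (x0, x1) = (1.0, 2.0)
```

Because an SPN evaluates as a single bottom-up pass of sums and products, it maps naturally onto the specialized accelerators and near-data offloading discussed above.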

    Systematic Approaches for Telemedicine and Data Coordination for COVID-19 in Baja California, Mexico

    Conference proceedings info: ICICT 2023, The 6th International Conference on Information and Computer Technologies, Raleigh, HI, United States, March 24-26, 2023, pages 529-542. https://doi.org/10.1007/978-981-99-3236-
    We provide a model for systematic implementation of telemedicine within a large evaluation center for COVID-19 in the area of Baja California, Mexico. Our model is based on human-centric design factors and cross-disciplinary collaborations for scalable, data-driven enablement of smartphone, cellular, and video teleconsultation technologies to link hospitals, clinics, and emergency medical services for point-of-care assessments for COVID testing and for subsequent treatment and quarantine decisions. A multidisciplinary team was rapidly created in cooperation with different institutions, including the Autonomous University of Baja California, the Ministry of Health, the Command, Communication and Computer Control Center of the Ministry of the State of Baja California (C4), the Colleges of Medicine, and the College of Psychologists. Our objective is to provide information to the public, to evaluate COVID-19 in real time, and to track regional, municipal, and state-wide data in real time that informs supply chains and resource allocation in anticipation of a surge in COVID-19 cases.

    D7.5 FIRST consolidated project results

    The FIRST project commenced in January 2017 and concluded in December 2022, including a 24-month suspension period due to the COVID-19 pandemic. Throughout the project, we successfully delivered seven technical reports, conducted three workshops on Key Enabling Technologies for Digital Factories in conjunction with CAiSE (in 2019, 2020, and 2022), produced a number of PhD theses, and published over 56 papers (plus a number of submitted journal papers). The purpose of this deliverable is to provide an updated account of the findings from our previous deliverables and publications. It involves compiling the original deliverables with the necessary revisions to accurately reflect the final scientific outcomes of the project.

    Modeling Deception for Cyber Security

    In the era of software-intensive, smart, and connected systems, the growing power and sophistication of cyber attacks pose increasing challenges to software security. The reactive posture of traditional security mechanisms, such as anti-virus and intrusion detection systems, has not been sufficient to combat the wide range of advanced persistent threats that currently jeopardize systems operation. To mitigate these threats, more active defensive approaches are necessary. Such approaches rely on the concept of actively hindering and deceiving attackers. Deceptive techniques allow for additional defense by thwarting attackers' advances through the manipulation of their perceptions. Manipulation is achieved through the use of deceitful responses, feints, misdirection, and other falsehoods in a system. Of course, such deception mechanisms may result in side effects that must be handled. Current methods for planning deception chiefly attempt to bridge military deception to cyber deception, providing only high-level instructions that largely ignore deception as part of the software security development life cycle. Consequently, little practical guidance exists on how to engineer deception-based techniques for defense. This PhD thesis contributes a systematic approach to specify and design cyber deception requirements, tactics, and strategies. This deception approach consists of (i) a multi-paradigm modeling approach for representing deception requirements, tactics, and strategies, (ii) a reference architecture to support the integration of deception strategies into system operation, and (iii) a method to guide engineers in deception modeling. A tool prototype, a case study, and an experimental evaluation show encouraging results for the application of the approach in practice. Finally, a conceptual coverage mapping was developed to assess the expressivity of the deception modeling language created.
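    As a small, purely illustrative sketch (hypothetical names; not the thesis's actual modeling language), deception tactics and strategies can be represented as plain data structures that separate the attacker-facing trigger, the deceitful response, and the side effects that must be handled:

```python
# Illustrative only: hypothetical data structures echoing the separation of
# deception requirements, tactics, and strategies described above.
from dataclasses import dataclass, field

@dataclass
class DeceptionTactic:
    name: str                  # e.g., "fake-admin-login"
    trigger: str               # observable attacker action
    response: str              # deceitful system behavior
    side_effects: list[str] = field(default_factory=list)

@dataclass
class DeceptionStrategy:
    goal: str                  # security requirement being protected
    tactics: list[DeceptionTactic]

    def select(self, observed_action: str) -> list[DeceptionTactic]:
        """Return the tactics whose trigger matches the observed action."""
        return [t for t in self.tactics if t.trigger == observed_action]

strategy = DeceptionStrategy(
    goal="delay credential harvesting",
    tactics=[DeceptionTactic(
        name="fake-admin-login",
        trigger="probe /admin",
        response="serve decoy login page and record attacker fingerprint",
        side_effects=["extra monitoring load"],
    )],
)
print([t.name for t in strategy.select("probe /admin")])
```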