
    Inter-individual variation of the human epigenome & applications
    Genome-wide association studies (GWAS) have led to the discovery of genetic variants influencing human phenotypes in health and disease. However, almost two decades later, most human traits still cannot be accurately predicted from common genetic variants. Moreover, genetic variants discovered via GWAS mostly map to the non-coding genome and have historically resisted interpretation via mechanistic models. The epigenome, by contrast, lies at the crossroads between genetics and the environment. There is therefore great interest in mapping epigenetic inter-individual variation, since its study may link environmental factors to human traits that remain unexplained by genetic variants. For instance, the environmental component of the epigenome may serve as a source of biomarkers for accurate, robust and interpretable phenotypic prediction on low-heritability traits, which cannot be attained by classical genetics-based models. Additionally, its research may provide mechanisms of action for genetic associations at non-coding regions that mediate their effect via the epigenome. The aim of this thesis was to explore epigenetic inter-individual variation and to mitigate some of the methodological limitations faced towards its future valorisation.

Chapter 1 is dedicated to the scope and aims of the thesis. It begins by describing historical milestones and basic concepts in human genetics, statistical genetics, the heritability problem and polygenic risk scores. It then moves towards epigenetics, covering the several dimensions it encompasses. It subsequently focuses on DNA methylation, with topics such as mitotic stability, epigenetic reprogramming, X-inactivation and imprinting. This is followed by concepts from epigenetic epidemiology such as epigenome-wide association studies (EWAS), epigenetic clocks, Mendelian randomization, methylation risk scores and methylation quantitative trait loci (mQTL).
The chapter ends by introducing the aims of the thesis.

Chapter 2 focuses on stochastic epigenetic inter-individual variation resulting from processes occurring post-twinning, during embryonic development and early life. Specifically, it describes the discovery and characterisation of hundreds of variably methylated CpGs in the blood of healthy adolescent monozygotic (MZ) twins showing equivalent variation among co-twins and unrelated individuals (evCpGs), which could not be explained by measurement error on the DNA methylation microarray alone. DNA methylation levels at evCpGs were shown to be stable in the short term but susceptible to ageing and epigenetic drift in the long term. The identified sites were significantly enriched at the clustered protocadherin loci, known for stochastic methylation in neurons in the context of embryonic neurodevelopment. Critically, evCpGs were capable of clustering technical and longitudinal replicates while differentiating young MZ twins. The discovered evCpGs can thus be considered a first prototype of a universal epigenetic fingerprint, relevant for discriminating MZ twins in forensic casework, which is currently impossible with standard DNA profiling.

Furthermore, DNA methylation microarrays are the preferred technology for EWAS and mQTL mapping studies. However, their probe design inherently assumes that the assayed genomic DNA is identical to the reference genome, leading to genetic artifacts whenever this assumption is not fulfilled. Building upon the previous experience analysing microarray data, Chapter 3 covers the development and benchmarking of UMtools, an R package for the quantification and qualification of genetic artifacts on DNA methylation microarrays based on the unprocessed fluorescence intensity signals. These tools were used to assemble an atlas of genetic artifacts encountered on DNA methylation microarrays, including interactions between artifacts or with X-inactivation, imprinting and tissue-specific regulation.
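The screening logic behind Chapter 2's evCpGs can be caricatured in a few lines: a CpG qualifies when methylation differences between co-twins are about as large as those between unrelated individuals, and clearly larger than the technical noise seen in replicate measurements. The sketch below is purely illustrative, with made-up thresholds and toy beta values; it is not the actual statistical procedure used in the thesis.

```python
# Hypothetical evCpG screen: co-twin variation ~ unrelated variation,
# and both above technical noise. Thresholds are illustrative only.
import statistics

def mean_abs_diff(pairs):
    """Mean absolute methylation (beta) difference over pairs of samples."""
    return statistics.mean(abs(a - b) for a, b in pairs)

def is_evcpg(cotwin_pairs, unrelated_pairs, replicate_pairs,
             equivalence_tol=0.2, noise_factor=2.0):
    within = mean_abs_diff(cotwin_pairs)       # co-twin variation
    between = mean_abs_diff(unrelated_pairs)   # population variation
    noise = mean_abs_diff(replicate_pairs)     # technical error
    equivalent = between > 0 and abs(within - between) / between <= equivalence_tol
    above_noise = within > noise_factor * noise
    return equivalent and above_noise

# Toy beta values: co-twins differ about as much as strangers,
# and well above replicate noise, so the CpG qualifies.
cotwins = [(0.20, 0.60), (0.75, 0.30), (0.50, 0.10)]
unrelated = [(0.25, 0.65), (0.70, 0.35), (0.45, 0.05)]
replicates = [(0.40, 0.42), (0.61, 0.60), (0.33, 0.35)]
print(is_evcpg(cotwins, unrelated, replicates))  # True
```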
Additionally, to distinguish artifacts from genuine epigenetic variation, a co-methylation-based approach was proposed. Overall, this study revealed that genetic artifacts continue to filter through into the reported literature, since current methodologies to address them have overlooked this challenge.

Furthermore, EWAS, mQTL and allele-specific methylation (ASM) mapping studies have all been employed to map epigenetic variation, but they require matching phenotypic/genotypic data and can only map specific components of epigenetic inter-individual variation. Inspired by the previously proposed co-methylation strategy, Chapter 4 describes a novel method to simultaneously map inter-haplotype, inter-cell and inter-individual variation without these requirements. Specifically, the binomial likelihood function-based bootstrap hypothesis test for co-methylation within reads (Binokulars) is a randomization test that can identify jointly regulated CpGs (JRCs) from pooled whole-genome bisulfite sequencing (WGBS) data by relying solely on the joint DNA methylation information available in reads spanning multiple CpGs. Binokulars was tested on pooled WGBS data from whole blood, sperm and a combination of the two, and benchmarked against EWAS and ASM. Our comparisons revealed that Binokulars can integrate a wide range of epigenetic phenomena under the same umbrella, since it simultaneously discovered regions associated with imprinting, cell type- and tissue-specific regulation, mQTL, ageing and even unknown epigenetic processes. Finally, we verified examples of mQTL and polymorphic imprinting by employing another novel tool, JRC_sorter, to classify regions based on epigenotype models and non-pooled WGBS data in cord blood.
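Binokulars itself is built on a binomial-likelihood bootstrap, but the underlying principle, that reads spanning multiple CpGs carry joint methylation information which can reveal joint regulation, can be illustrated with a much simpler permutation test. In this hypothetical sketch the statistic is the fraction of fully concordant reads, and the null distribution is obtained by shuffling each CpG column independently across reads; none of the names or numbers come from the actual tool.

```python
# Toy within-read co-methylation randomization test (illustrative only;
# the real Binokulars uses a binomial-likelihood bootstrap statistic).
# Each read is a tuple of 0/1 methylation calls at the CpGs it spans.
import random

def concordance(reads):
    """Fraction of reads whose CpG calls are all identical."""
    return sum(len(set(r)) == 1 for r in reads) / len(reads)

def comethylation_test(reads, n_perm=2000, seed=0):
    rng = random.Random(seed)
    observed = concordance(reads)
    n_cpgs = len(reads[0])
    # Null model: break within-read linkage by shuffling each CpG column
    # across reads, preserving per-CpG methylation frequencies.
    columns = [[r[j] for r in reads] for j in range(n_cpgs)]
    hits = 0
    for _ in range(n_perm):
        for col in columns:
            rng.shuffle(col)
        permuted = list(zip(*columns))
        if concordance(permuted) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)

# Toy region: reads are almost always all-0 or all-1, as expected under
# joint regulation (e.g. imprinting or an mQTL haplotype).
reads = [(1, 1, 1)] * 20 + [(0, 0, 0)] * 20 + [(1, 0, 1), (0, 1, 0)]
obs, p = comethylation_test(reads)
print(obs, p)  # high concordance, small permutation p-value
```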
In the future, we envision how this cost-effective approach can be applied to larger pools to simultaneously highlight regions of interest in the methylome, a highly relevant task in light of the post-GWAS era.

Moving towards future applications of epigenetic inter-individual variation, Chapters 5 and 6 are dedicated to solving some of the methodological issues faced in translational epigenomics. Firstly, due to its simplicity and well-known properties, linear regression is the default methodology for predicting a continuous outcome from a set of predictors. However, linear regression is incompatible with missing data, a common phenomenon and a major threat to the integrity of data analysis in the empirical sciences, including (epi)genomics. Chapter 5 describes the development of combinatorial linear models (cmb-lm), an imputation-free, CPU/RAM-efficient and privacy-preserving statistical method for linear regression prediction on datasets with missing values. Cmb-lm provide prediction errors that take into account the pattern of missing values in the incomplete data, even at extreme missingness. As a proof of concept, we tested cmb-lm in the context of epigenetic ageing clocks, one of the most popular applications of epigenetic inter-individual variation. Overall, cmb-lm offer a simple and flexible methodology with a wide range of applications that can provide a smooth transition towards the valorisation of linear models in the real world, where missing data are almost inevitable.

Beyond microarrays, due to its high accuracy, reliability and sample-multiplexing capabilities, massively parallel sequencing (MPS) is currently the preferred methodology for translating prediction models for traits of interest into practice. At the same time, tobacco smoking is a frequent habit, sustained by more than 1.3 billion people in 2020, and a leading (and preventable) health risk factor in the modern world.
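One way to picture the combinatorial idea behind Chapter 5's cmb-lm: fit an ordinary least squares submodel for every subset of predictors, then route each incomplete sample to the submodel matching its pattern of observed values, so no imputation is ever needed. The sketch below is a toy reading of that concept under our own assumptions; the published method's internals may well differ, and the exhaustive enumeration used here would not scale to many predictors.

```python
# Illustrative "one submodel per missingness pattern" linear prediction.
# Not the thesis's cmb-lm implementation; a conceptual toy only.
import itertools

def ols_fit(X, y):
    """Least squares via normal equations, with an intercept column."""
    A = [[1.0] + list(row) for row in X]
    n, p = len(A), len(A[0])
    # Build the augmented normal equations [A^T A | A^T y].
    M = [[sum(A[k][i] * A[k][j] for k in range(n)) for j in range(p)]
         + [sum(A[k][i] * y[k] for k in range(n))] for i in range(p)]
    # Gauss-Jordan elimination with partial pivoting.
    for i in range(p):
        piv = max(range(i, p), key=lambda r: abs(M[r][i]))
        M[i], M[piv] = M[piv], M[i]
        for r in range(p):
            if r != i:
                f = M[r][i] / M[i][i]
                M[r] = [a - f * b for a, b in zip(M[r], M[i])]
    return [M[i][p] / M[i][i] for i in range(p)]

def fit_all_patterns(X, y):
    """One submodel per non-empty subset of predictors (the 'combinatorial' part)."""
    p = len(X[0])
    models = {}
    for pattern in itertools.chain.from_iterable(
            itertools.combinations(range(p), k) for k in range(1, p + 1)):
        Xsub = [[row[j] for j in pattern] for row in X]
        models[pattern] = ols_fit(Xsub, y)
    return models

def predict(models, row):
    """Route a (possibly incomplete) sample to its pattern's submodel."""
    pattern = tuple(j for j, v in enumerate(row) if v is not None)
    b = models[pattern]
    return b[0] + sum(c * row[j] for c, j in zip(b[1:], pattern))

# Toy training data generated exactly from y = 1 + 2*x0 + 3*x1.
X = [[0, 0], [1, 0], [0, 1], [1, 1], [2, 1], [1, 2]]
y = [1, 3, 4, 6, 8, 9]
models = fit_all_patterns(X, y)
print(round(predict(models, [2, None]), 2))  # x1 missing: uses the x0-only submodel
```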
Predicting smoking habits from a persistent biomarker, such as DNA methylation, is not only relevant to account for self-reporting bias in public health and personalized medicine studies, but may also broaden forensic DNA phenotyping. Previously, a model to predict whether someone is a current, former or never smoker had been published based on only 13 CpGs out of the hundreds of thousands included in the DNA methylation microarray. However, a matching lab tool with lower marker throughput and higher accuracy and sensitivity was missing for translating the model into practice. Chapter 6 describes the development of an MPS assay and data analysis pipeline to quantify DNA methylation at these 13 smoking-associated biomarkers for the prediction of smoking status. Although our systematic evaluation on DNA standards of known methylation levels revealed marker-specific amplification bias, our novel tool was still able to provide highly accurate and reproducible DNA methylation quantification and smoking habit prediction. Overall, our MPS assay allows the technological transfer of DNA methylation microarray findings and models to practical settings, one step closer towards future applications.

Finally, Chapter 7 provides a general discussion of the results and topics covered across Chapters 2-6. It begins by summarizing the main findings of the thesis, including proposals for follow-up studies. It then covers technical limitations pertaining to bisulfite conversion and DNA methylation microarrays, as well as more general considerations such as restricted data access. The chapter ends with the outlook of this PhD thesis, including topics such as bisulfite-free methods, third-generation sequencing, single-cell methylomics, multi-omics and systems biology.

    Microcredentials to support PBL


    Contributions to time series analysis, modelling and forecasting to increase reliability in industrial environments.

    The integration of the Internet of Things into the industrial sector is key to achieving business intelligence. This study focuses on improving, or proposing new, approaches to increase the reliability of AI solutions based on time series data in industry. Three phases are addressed: improving the quality of the data, the models and the errors. A standard definition of quality metrics is proposed and included in the dqts R package. The steps of time series modelling are explored, from feature extraction to the choice and application of the most efficient prediction model. The KNPTS method, based on searching for patterns in historical data, is presented as an R package for estimating future values. In addition, the use of elastic similarity measures for evaluating regression models is suggested, as is the importance of appropriate metrics in imbalanced-class problems. The contributions were validated in industrial use cases from different fields: product quality, electricity consumption forecasting, porosity detection and machine diagnostics.
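The KNPTS method mentioned above forecasts by searching the history for patterns. A minimal, hypothetical sketch of that family of forecasters is shown below: pick the past window closest to the most recent one and replay the values that followed it. The window length, the Euclidean distance and the single-neighbour rule are illustrative assumptions, not the package's actual choices (which include elastic similarity measures).

```python
# Toy pattern-search forecast: nearest historical window, then replay.
import math

def pattern_forecast(series, window=4, horizon=2):
    query = series[-window:]
    best_dist, best_end = math.inf, None
    # Scan all complete historical windows that leave room for a horizon.
    for start in range(len(series) - window - horizon):
        candidate = series[start:start + window]
        dist = math.dist(query, candidate)
        if dist < best_dist:
            best_dist, best_end = dist, start + window
    return series[best_end:best_end + horizon]

# A repeating cycle: the forecast should continue the pattern.
history = [1, 2, 3, 4, 3, 2, 1, 2, 3, 4, 3, 2, 1, 2, 3, 4]
print(pattern_forecast(history))  # [3, 2]
```

In practice such forecasters average over the k nearest windows rather than trusting a single match, and use similarity measures robust to shifts and warping.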

    Security Technologies and Methods for Advanced Cyber Threat Intelligence, Detection and Mitigation

    The rapid growth of Internet interconnectivity and the complexity of communication systems have led to a significant increase in cyberattacks globally, often with severe and disastrous consequences. The swift development of more innovative and effective (cyber)security solutions and approaches that can detect, mitigate and prevent these serious consequences is therefore vital. Cybersecurity is gaining momentum and scaling up in many areas. This book builds on the experience of the Cyber-Trust EU project's methods, use cases, technology development, testing and validation, and extends into broader science, the leading IT industry market and applied research with practical cases. It offers new perspectives on advanced (cyber)security innovation (eco)systems from several key viewpoints. The book provides insights into new security technologies and methods for advanced cyber threat intelligence, detection and mitigation. We cover topics such as cybersecurity and AI, cyber-threat intelligence, digital forensics, moving target defense, intrusion detection systems, post-quantum security, privacy and data protection, security visualization, smart contracts security, software security, blockchain, security architectures, system and data integrity, trust management systems, distributed systems security, dynamic risk management, and privacy and ethics.

    The effects of user assistance systems on user perception and behavior

    The rapid development of information technology (IT) is changing how people approach and interact with IT systems (Maedche et al. 2016). IT systems can increasingly support people in performing ever more complex tasks (Vtyurina and Fourney 2018). However, people's cognitive abilities have not evolved as quickly as technology (Maedche et al. 2016). Thus, different external factors (e.g., complexity or uncertainty) and internal conditions (e.g., cognitive load or stress) reduce decision quality (Acciarini et al. 2021; Caputo 2013; Hilbert 2012). User-assistance systems (UASs) can help to compensate for human weaknesses and cope with new challenges. UASs aim to improve the user's cognition and capabilities, benefiting individuals, organizations, and society. To achieve this goal, UASs collect, prepare, aggregate, analyze information, and communicate results according to user preferences (Maedche et al. 2019). This support can relieve users and improve the quality of decision-making. Using UASs offers many benefits but requires successful interaction between the user and the UAS. However, this interaction introduces social and technical challenges, such as loss of control or reduced explainability, which can affect user trust and willingness to use the UAS (Maedche et al. 2019). To realize the benefits, UASs must be developed based on an understanding and incorporation of users' needs. Users and UASs are part of a socio-technical system to complete a specific task (Maedche et al. 2019). To create a benefit from the interaction, it is necessary to understand the interaction within the socio-technical system, i.e., the interaction between the user, UAS, and task, and to align the different components. For this reason, this dissertation aims to extend the existing knowledge on UAS design by better understanding the effects and mechanisms during the interaction between UASs and users in different application contexts. 
Therefore, theory and findings from different disciplines are combined and new theoretical knowledge is derived. In addition, data is collected and analyzed to validate the new theoretical knowledge empirically. The findings can be used to reduce adaptation barriers and realize a positive outcome. Overall, this dissertation addresses the four classes of UASs presented by Maedche et al. (2016): basic UASs, interactive UASs, intelligent UASs, and anticipating UASs. First, this dissertation contributes to understanding how users interact with basic UASs. Basic UASs do not process contextual information and interact little with the user (Maedche et al. 2016). This behavior makes basic UASs suitable for application contexts, such as social media, where little interaction is desired. Social media is primarily used for entertainment and focuses on content consumption (Moravec et al. 2018). As a result, social media has become an essential source of news but also a target for fake news, with negative consequences for individuals and society (Clarke et al. 2021; Laato et al. 2020). Thus, this thesis presents two approaches showing how basic UASs can be used to reduce the negative influence of fake news. Firstly, basic UASs can provide interventions by warning users of questionable content and providing verified information, but the order in which the intervention elements are displayed influences the perception of fake news. The intervention elements should be displayed after the fake news story to achieve an efficient intervention. Secondly, basic UASs can provide social norms to motivate users to report fake news and thereby stop its spread. However, social norms should be used carefully, as they can backfire and reduce the willingness to report fake news. Second, this dissertation contributes to understanding how users interact with interactive UASs.
Interactive UASs incorporate limited information from the application context but focus on close interaction with the user to achieve a specific goal or behavior (Maedche et al. 2016). Typical goals include more physical activity, a healthier diet, and less tobacco and alcohol consumption to prevent disease and premature death (World Health Organization 2020). To increase goal achievement, previous researchers often utilize digital human representations (DHRs) such as avatars and embodied agents to form a socio-technical relationship between the user and the interactive UAS (Kim and Sundar 2012a; Pfeuffer et al. 2019). However, understanding how the design features of an interactive UAS affect the interaction with the user is crucial, as each design feature has a distinct impact on the user's perception. Based on existing knowledge, this thesis highlights the most widely used design features and analyzes their effects on behavior. The findings reveal important implications for future interactive UAS design. Third, this dissertation contributes to understanding how users interact with intelligent UASs. Intelligent UASs prioritize processing user and contextual information to adapt to the user's needs rather than focusing on an intensive interaction with the user (Maedche et al. 2016). Thus, intelligent UASs with emotional intelligence can provide people with task-oriented and emotional support, making them ideal for situations where interpersonal relationships are neglected, such as crowd working. Crowd workers frequently work independently without any significant interactions with other people (Jäger et al. 2019). In crowd work environments, traditional leader-employee relationships are usually not established, which can have a negative impact on employee motivation and performance (Cavazotte et al. 2012). Thus, this thesis examines the impact of an intelligent UAS with leadership and emotional capabilities on employee performance and enjoyment. 
The leadership capabilities of the intelligent UAS lead to an increase in enjoyment but a decrease in performance. The emotional capabilities of the intelligent UAS reduce the stimulating effect of the leadership characteristics. Fourth, this dissertation contributes to understanding how users interact with anticipating UASs. Anticipating UASs are intelligent and interactive, providing users with task-related and emotional stimuli (Maedche et al. 2016). They also have advanced communication interfaces and can adapt to current situations and predict future events (Knote et al. 2018). Because of these advanced capabilities, anticipating UASs enable collaborative work settings and often use anthropomorphic design cues to make the interaction more intuitive and comfortable (André et al. 2019). However, these anthropomorphic design cues can also raise expectations too high, leading to disappointment and rejection if they are not met (Bartneck et al. 2009; Mori 1970). To create a successful collaborative relationship between anticipating UASs and users, it is important to understand the impact of anthropomorphic design cues on the interaction and decision-making processes. This dissertation presents a theoretical model that explains the interaction between anthropomorphic anticipating UASs and users, together with an experimental procedure for its empirical evaluation. The experiment design lays the groundwork for empirically testing the theoretical model in future research. To sum up, this dissertation contributes to information systems knowledge by improving understanding of the interaction between UASs and users in different application contexts. It develops new theoretical knowledge based on previous research and empirically evaluates user behavior to explain and predict it. In addition, this dissertation generates new knowledge by prototypically developing UASs and provides new insights for different classes of UASs.
These insights can be used by researchers and practitioners to design more user-centric UASs and realize their potential benefits.

    Production Optimization Indexed to the Market Demand Through Neural Networks

    Connectivity, mobility and real-time data analytics are the prerequisites for a new model of intelligent production management that facilitates communication between machines, people and processes, using technology as the main driver. Many works in the literature treat maintenance and production management separately, but there is a link between these areas: maintenance and its actions aim to ensure the smooth operation of equipment and avoid unnecessary downtime in production. With the advent of technology, companies are rushing to solve their problems by adopting technologies that fit the most advanced technological concepts, such as Industry 4.0 and 5.0, which are based on the principle of process automation. This approach brings together database technologies, making it possible to monitor the operation of equipment and to study patterns in data behaviour that can warn of possible failures. The present thesis aims to forecast pulp production indexed to stock market values. The forecast is made using the pulp production variables of the presses and stock exchange variables, supported by artificial intelligence (AI) technologies, with the aim of achieving effective planning. To support efficient production management decisions, algorithms were developed in this thesis and validated with data from five pulp presses, as well as data from other sources, such as steel production and stock exchanges, which were relevant for validating the robustness of the model. This thesis demonstrated the importance of data processing methods and their great relevance to the model input, since they facilitate the process of training and testing the models. The chosen technologies demonstrated good efficiency and versatility in predicting the values of the equipment variables, also demonstrating robustness and optimization in computational processing.
The thesis also presents proposals for future developments, namely the further exploration of these technologies, so that market variables can calibrate production through forecasts supported by these same variables.
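As a small illustration of the data-preparation step that typically precedes training such a forecasting model, the sketch below frames two aligned series (press sensor values and a market index) as supervised pairs of lagged features and the next production value. The function name and window size are hypothetical, not the thesis's actual pipeline.

```python
# Toy supervised framing of two aligned time series for forecasting.
def make_supervised(press_series, market_series, lags=3):
    """Build (features, target) pairs: lagged press and market values
    predict the next press value."""
    X, y = [], []
    for t in range(lags, len(press_series)):
        features = press_series[t - lags:t] + market_series[t - lags:t]
        X.append(features)
        y.append(press_series[t])
    return X, y

press = [10, 12, 11, 13, 14, 15]
market = [100, 101, 99, 102, 103, 104]
X, y = make_supervised(press, market)
print(len(X), len(X[0]), y)  # 3 samples, 6 features each, targets [13, 14, 15]
```

Any regression model, from linear regression to a neural network, can then be trained on these (X, y) pairs.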

    Describing Faces for Identification: Getting the Message, But Not The Picture

    Although humans rely on faces and language for social communication, the role of language in communicating about faces is poorly understood. Describing faces and identifying faces from verbal descriptions are important tasks in social and criminal justice settings. Prior research indicates that people have difficulty relaying face identity to others via verbal description; however, little is known about the process, correlates, or content of communication about faces (hereafter ‘face communication’). In Chapter Two, I investigated face communication accuracy and its relationship with an individual’s perceptual face skill. I also examined the efficacy of a brief training intervention for improving face description ability. I found that individuals could complete face communication tasks with above-chance levels of accuracy, in both interactive and non-interactive conditions, and that abilities in describing faces and using face descriptions for identification were related to an individual’s perceptual face skill. However, training was not effective for improving face description ability. In Chapter Three, I investigated qualitative attributes of face descriptions. I found no evidence of qualitative differences in face descriptions as a function of the describer’s perceptual skill with faces, the identification utility of descriptions, or the describer’s familiarity with the face. In Chapters Two and Three, the reliability of measures may have limited the ability to detect relationships between face communication accuracy and potential correlates of performance. Consequently, in Chapter Four, I examined face communication accuracy when using constrained face descriptions, derived using a rating scale, and the relationship between the identification utility of such descriptions and their reliability (test-retest and multi-rater).
I found that constrained face descriptions were less useful for identification than free descriptions, and that the reliability of a description was unrelated to its identification utility. Together, the findings in this thesis indicate that face communication is very challenging, both for individuals undertaking the task and for researchers seeking to measure performance reliably. Given that the mechanisms contributing to variance in face communication accuracy remain largely elusive, legal stakeholders would be wise to use caution when relying on evidence involving face description.