
619 research outputs found

Robust Algorithms for Detecting Hidden Structure in Biological Data

Author: Sloutsky Roman
Publication venue: Washington University Open Scholarship
Publication date: 15/08/2017

Biological data, such as molecular abundance measurements and protein sequences, harbor complex hidden structure that reflects the underlying biological mechanisms. For example, high-throughput abundance measurements provide a snapshot of the global state of a living cell, while homologous protein sequences encode the residue-level logic of the proteins' function and provide a snapshot of the evolutionary trajectory of the protein family. In this work I describe algorithmic approaches and analysis software I developed for uncovering hidden structure in both kinds of data. Clustering is an unsupervised machine learning technique commonly used to map the structure of data collected in high-throughput experiments, such as quantification of gene expression by DNA microarrays or short-read sequencing. Clustering algorithms always yield a partitioning of the data, but relying on a single partitioning solution can lead to spurious conclusions. In particular, noise in the data can cause objects to fall into the same cluster by chance rather than due to meaningful association. In the first part of this thesis I demonstrate approaches to clustering data robustly in the presence of noise and apply robust clustering to analyze the transcriptional response to injury in a neuron cell. In the second part of this thesis I describe identifying hidden specificity-determining residues (SDPs) from alignments of protein sequences descended through gene duplication from a common ancestor (paralogs) and apply the approach to identify numerous putative SDPs in bacterial transcription factors in the LacI family. Finally, I describe and demonstrate a new algorithm for reconstructing the history of duplications by which paralogs descended from their common ancestor. This algorithm addresses the complexity of such reconstruction due to indeterminate or erroneous homology assignments made by sequence alignment algorithms and to the vast prevalence of divergence through speciation over divergence through gene duplication in protein evolution.
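
The abstract's point that a single partitioning can group objects by chance can be illustrated with a small resampling sketch. This is not the thesis's algorithm: it is one common way to probe cluster stability, in which the data are perturbed with noise many times, re-clustered, and the fraction of runs in which each pair of objects lands in the same cluster is recorded. The function names (`kmeans2`, `co_clustering_frequency`), the two-cluster k-means, the noise level and the toy points are all assumptions made for illustration.

```python
import random


def kmeans2(points, iters=20):
    """Tiny 2-cluster k-means on 2-D points (illustrative helper, not from the thesis).

    Initialisation is deterministic: the first point and the point farthest from it.
    """
    def d2(p, q):
        return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2

    c0 = points[0]
    c1 = max(points, key=lambda p: d2(p, c0))
    centroids = [c0, c1]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point goes to its nearest centroid.
        labels = [0 if d2(p, centroids[0]) <= d2(p, centroids[1]) else 1
                  for p in points]
        # Update step: each centroid moves to the mean of its members.
        for k in (0, 1):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:  # keep the old centroid if a cluster empties
                centroids[k] = (sum(p[0] for p in members) / len(members),
                                sum(p[1] for p in members) / len(members))
    return labels


def co_clustering_frequency(points, runs=200, noise_sd=0.5, seed=0):
    """Fraction of noisy re-clusterings in which each pair shares a cluster."""
    rng = random.Random(seed)
    n = len(points)
    counts = [[0] * n for _ in range(n)]
    for _ in range(runs):
        # Perturb every point with Gaussian noise (assumed noise model).
        noisy = [(x + rng.gauss(0, noise_sd), y + rng.gauss(0, noise_sd))
                 for x, y in points]
        labels = kmeans2(noisy)
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    counts[i][j] += 1
    return [[c / runs for c in row] for row in counts]


if __name__ == "__main__":
    # Two well-separated groups plus one point halfway between them.
    pts = [(0, 0), (1, 0), (0, 1), (10, 10), (11, 10), (10, 11), (5.5, 5.5)]
    freq = co_clustering_frequency(pts)
    print("same group:", freq[0][1])
    print("different groups:", freq[0][3])
    print("ambiguous point vs. first group:", freq[0][6])
```

Pairs whose co-clustering frequency stays near 1 or near 0 across runs are stable, while intermediate values flag objects whose cluster membership in any single run may be due to noise.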

Washington University St. Louis: Open Scholarship

Computational Intelligence in Healthcare

Author: Casalino Gabriella
Castellano Giovanna
Publication venue: country:CHE
Publication date: 01/01/2021

This book is a printed edition of the Special Issue Computational Intelligence in Healthcare that was published in Electronic

Archivio istituzionale della ricerca - Università di Bari

Computational Intelligence in Healthcare

Publication venue: MDPI AG
Publication date: 11/01/2022

The volume of patient health data was estimated to have reached 2314 exabytes by 2020. Traditional data analysis techniques are unsuitable for extracting useful information from such a vast quantity of data. Thus, intelligent data analysis methods that combine human expertise and computational models for accurate and in-depth data analysis are necessary. The technological revolution and medical advances made by combining vast quantities of available data, cloud computing services, and AI-based solutions can provide expert insight and analysis on a mass scale and at a relatively low cost. Computational intelligence (CI) methods, such as fuzzy models, artificial neural networks, evolutionary algorithms, and probabilistic methods, have recently emerged as promising tools for the development and application of intelligent systems in healthcare practice. CI-based systems can learn from data and evolve according to changes in the environment by taking into account the uncertainty characterizing health data, including omics data, clinical data, sensor data, and imaging data. The use of CI in healthcare can improve the processing of such data to develop intelligent solutions for prevention, diagnosis, treatment, and follow-up, as well as for the analysis of administrative processes. The present Special Issue on computational intelligence for healthcare is intended to show the potential and the practical impacts of CI techniques in challenging healthcare applications.

Directory of Open Access Books (DOAB)

1992 NASA/ASEE Summer Faculty Fellowship Program

Author: Chappell Charles R.
Freeman L. Michael
Karr Gerald R.
Six Frank

For the 28th consecutive year, a NASA/ASEE Summer Faculty Fellowship Program was conducted at the Marshall Space Flight Center (MSFC). The program was conducted by the University of Alabama and MSFC during the period June 1, 1992 through August 7, 1992. Operated under the auspices of the American Society for Engineering Education, the MSFC program, as well as those at other centers, was sponsored by the Office of Educational Affairs, NASA Headquarters, Washington, DC. The basic objectives of the programs, which are in their 29th year of operation nationally, are (1) to further the professional knowledge of qualified engineering and science faculty members; (2) to stimulate an exchange of ideas between participants and NASA; (3) to enrich and refresh the research and teaching activities of the participants' institutions; and (4) to contribute to the research objectives of the NASA centers.

NASA Technical Reports Server

Working Notes from the 1992 AAAI Spring Symposium on Practical Approaches to Scheduling and Planning

Author: Drummond Mark
Fox Mark
Tate Austin
Zweben Monte

The symposium presented issues involved in the development of scheduling systems that can deal with resource and time limitations. To qualify, a system must be implemented and tested to some degree on non-trivial problems (ideally, on real-world problems). However, a system need not be fully deployed to qualify. Systems that schedule actions in terms of metric time constraints typically represent and reason about an external numeric clock or calendar and can be contrasted with those systems that represent time purely symbolically. The following topics are discussed: integrating planning and scheduling; integrating symbolic goals and numerical utilities; managing uncertainty; incremental rescheduling; managing limited computation time; anytime scheduling and planning algorithms and systems; dependency analysis and schedule reuse; management of schedule and plan execution; and incorporation of discrete event techniques.

NASA Technical Reports Server

Acta Cybernetica : Volume 19. Number 1.

Publication date: 01/01/2009

University of Szeged

High-Performance Modelling and Simulation for Big Data Applications

Publication venue: Springer Science and Business Media LLC
Publication date: 10/02/2021

This open access book was prepared as a Final Publication of the COST Action IC1406 "High-Performance Modelling and Simulation for Big Data Applications (cHiPSet)" project. Long considered important pillars of the scientific method, Modelling and Simulation have evolved from traditional discrete numerical methods to complex data-intensive continuous analytical optimisations. Resolution, scale, and accuracy have become essential to predict and analyse natural and complex systems in science and engineering. When their level of abstraction rises to give a better discernment of the domain at hand, their representation becomes increasingly demanding of computational and data resources. On the other hand, High Performance Computing typically entails the effective use of parallel and distributed processing units coupled with efficient storage, communication and visualisation systems to underpin complex data-intensive applications in distinct scientific and technical domains. A seamless interaction of High Performance Computing with Modelling and Simulation is therefore arguably required in order to store, compute, analyse, and visualise large data sets in science and engineering. Funded by the European Commission, cHiPSet has provided a dynamic trans-European forum for its members and distinguished guests to openly discuss novel perspectives and topics of interest for these two communities. This cHiPSet compendium presents a set of selected case studies related to healthcare, biological data, computational advertising, multimedia, finance, bioinformatics, and telecommunications.

Directory of Open Access Books (DOAB)

Flood Management in a Complex River Basin with a Real-Time Decision Support System Based on Hydrological Forecasts

Author: García Hernández Javier
Publication venue: Lausanne, EPFL
Publication date: 26/05/2011

During the last decades, the Upper Rhone River basin has been hit by several flood events causing damage in excess of 500 million Swiss Francs. In response, the 3rd Rhone river training project was planned in order to improve flood protection in the Upper Rhone River basin in the Vaud and Valais Cantons. In this framework, the MINERVE forecast system aims to contribute to better flow control during flood events in this catchment area, taking advantage of the existing hydropower multi-reservoir network. This system also fits into the OWARNA national project of the Swiss Federal Office for the Environment by establishing a national platform for natural hazard alarms. The Upper Rhone River basin has a catchment area with high mountains and large glaciers. The surface of the basin is 5521 km² and its elevation varies between 400 and 4634 m a.s.l. Numerous hydropower schemes with large dams and reservoirs are located in the catchment area, influencing the hydrological regime. Their impact during floods can be significant, as appropriate preventive operations can decrease the peak discharges in the Rhone River and its main tributaries, thus reducing the damage. The MINERVE forecast system exploits flow measurements, data from reservoirs and hydropower plants, as well as probabilistic (COSMO-LEPS) and deterministic (COSMO-2 and COSMO-7) numerical weather predictions from MeteoSwiss. The MINERVE hydrological model of the catchment area follows a semi-distributed approach. The basin is split into 239 sub-catchments which are further sub-divided into 500 m elevation bands, for a total of 1050 bands. For each elevation band, precipitation, temperature and potential evapotranspiration are calculated. They are considered in order to describe temperature-driven processes, such as snow and glacier melt, accurately. The hydrological model was implemented in the Routing System software.
The object-oriented programming environment allows user-friendly modelling of the hydrological, hydraulic and operating processes. Numerical meteorological data (observed or predicted) are introduced as input to the model. Over the calibration and validation periods of the model, only observed data (precipitation, temperature and flows) were used. For operational flood forecasting, the observed measurements are used to update the initial conditions of the hydrological model, and the weather forecasts drive the hydrological simulations. Routing System then provides hydrological predictions for the whole catchment area. Subsequently, a warning system was developed specifically for the basin to provide a flood warning report. The warning system predicts the evolution of the hydrological situation at selected main check points in the catchment area. It displays three warning levels during a flood event depending on the respective critical discharge thresholds. Furthermore, the multi-reservoir system is managed in an optimal way in order to limit or avoid damage during floods. A decision support tool called MINDS (MINERVE Interactive Decision Support System) has been developed for real-time decision making based on the hydrological forecasts. This tool defines preventive operation measures for the hydropower plants, such as turbine and bottom outlet releases, that provide optimal water storage during the flood peak. The overall goal of MINDS is thus to retain the inflowing floods in reservoirs and to avoid spillway and turbine operations during the peak flow, taking into account all restrictions and current conditions of the network. Such a reservoir management system can therefore significantly decrease flood damage in the catchment area. The reservoir management optimisation during floods is achieved with deterministic and probabilistic forecasts. The objective function to optimise is defined with a multi-attribute decision making approach.
Then, the optimisation is performed with an iterative greedy algorithm or the SCE-UA (Shuffled Complex Evolution – University of Arizona) algorithm. The developed decision support system combines the high-quality optimisation system with a user-friendly interface. The purpose is to help decision makers by involving them directly in the main steps of the decision-making process and by helping them understand the measures undertaken and their consequences.
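
The threshold-based warning logic mentioned in the abstract (three warning levels at a check point, depending on critical discharge thresholds) can be sketched as below. This is a minimal illustration under stated assumptions, not the MINERVE implementation: the function names, the convention that level 0 means "no warning", the rule that reaching a threshold exactly triggers its level, and the example threshold values are all hypothetical.

```python
from bisect import bisect_right


def warning_level(discharge_m3s, thresholds_m3s):
    """Map a discharge to a level 0..3 given three ascending critical thresholds.

    Level 0 means no threshold has been reached (assumed convention).
    A discharge equal to a threshold counts as having reached it.
    """
    thresholds = list(thresholds_m3s)
    if len(thresholds) != 3 or thresholds != sorted(thresholds):
        raise ValueError("expected three thresholds in ascending order")
    # bisect_right counts how many thresholds are <= the discharge.
    return bisect_right(thresholds, discharge_m3s)


def peak_warning_level(forecast_m3s, thresholds_m3s):
    """Warning level implied by the highest discharge in a forecast series."""
    return warning_level(max(forecast_m3s), thresholds_m3s)


if __name__ == "__main__":
    # Hypothetical critical discharges (m^3/s) for one check point.
    thresholds = [600.0, 800.0, 1000.0]
    forecast = [420.0, 650.0, 870.0, 790.0]
    print(peak_warning_level(forecast, thresholds))
```

With a probabilistic forecast such as an ensemble, the same function could be applied to each member to estimate how often each level is reached; that extension is likewise an assumption rather than something the abstract describes.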

Infoscience - École polytechnique fédérale de Lausanne

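Development of Parameter Estimation Methods and Empirical Equations for the Transient Storage Model for Analyzing Pollutant Mixing in Rivers

Author: 노효섭
Publication venue: Seoul National University Graduate School
Publication date: 01/08/2019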

Thesis (Master's), Seoul National University Graduate School, College of Engineering, Department of Civil and Environmental Engineering, August 2019. 서일원.

Analyses of solute transport and retention mechanisms are essential for managing water quality and river ecosystems. Tracer injection studies conducted to identify solute transport mechanisms report that concentration curves measured in natural streams have a steep rising limb and a long tail. This phenomenon is due to solute exchange between transient storage zones and the main river stream. The transient storage model (TSM) is one of the most widely used models for describing solute transport in natural streams because it takes this exchange process into consideration. Using the model requires calibrating four TSM parameters. Inverse modelling with breakthrough curves (BTCs) measured in tracer injection tests is the general method for TSM parameter calibration. However, it is not feasible to perform a tracer injection test for every parameter calibration. For that reason, empirical formulae based on hydraulic data, which are comparatively easier to obtain, have been proposed for parameter estimation. This study presents two methods for TSM parameter estimation. First, an inverse modelling method is proposed that employs the global optimization framework Shuffled Complex-Self Adaptive Hybrid EvoLution (SC-SAHEL), which incorporates well-known evolutionary algorithms from the water resource management field. Second, empirical equations for the TSM parameters were derived using the Multigene Genetic Programming (MGGP) based symbolic regression library GPTIPS and Principal Components Regression (PCR), giving two empirical equations for each of the four parameters. In terms of general performance, the equations of this study were superior to published empirical equations. The resulting parameter estimation framework and empirical equations are intended for practical use and are expected to be useful for determining TSM parameters, depending on whether tracer test data are available.

Contents:
Chapter 1. Introduction
  1.1 Necessity and Background of Research
  1.2 Objectives
Chapter 2. Theoretical Background
  2.1 Transient Storage Model
    2.1.1 Mechanisms of Transient Storage
    2.1.2 Models Accounting for Transient Storage
      2.1.2.1 The One-Zone Transient Storage Model (1Z-TSM)
      2.1.2.2 The Two-Zone Transient Storage Model (2Z-TSM)
      2.1.2.3 The Continuous Time Random Walk Approach (CTRW)
      2.1.2.4 The Modified Advection Dispersion Model (MADE)
      2.1.2.5 The Fractional Advection Dispersion Equation Model (FADE)
      2.1.2.6 The Multirate Mass Transfer Model (MRMT)
      2.1.2.7 The Advective Storage Path Model (ASP)
      2.1.2.8 The Solute Transport in Rivers Model (STIR)
      2.1.2.9 The Aggregate Dead Zone Model (ADZ)
  2.2 Empirical Equations for Predicting Transient Storage Model Parameters
  2.3 Parameter Estimation
    2.3.1 The SC-SAHEL Framework
      2.3.1.1 Modified Competitive Complex Evolution (MCCE)
      2.3.1.2 Modified Frog Leaping (MFL)
      2.3.1.3 Modified Grey Wolf Optimizer (GWO)
      2.3.1.4 Modified Differential Evolution (DE)
  2.4 Regression Method
    2.4.1 The Multi-Gene Genetic Programming (MGGP)
      2.4.1.1 The Simple Genetic Programming
      2.4.1.2 Scaled Symbolic Regression via Multi-Gene Genetic Programming
    2.4.2 Evolutionary Polynomial Regression (EPR)
      2.4.2.1 Main Flow of EPR Procedure
Chapter 3. Model Development
  3.1 Numerical Model
    3.1.1 Model Validation
  3.2 Merger of TSM-SC-SAHEL
  3.3 Further assessments for the parameter estimation framework
    3.3.1 Tracer Test Description
    3.3.2 Grid Independency of Estimation
    3.3.3 Choice of Optimization Setting
Chapter 4. Development of Formulae for Predicting TSM Parameter
  4.1 Dimensional Analysis
  4.2 Data Collection via Meta Analysis
  4.3 Formulae Development
Chapter 5. Result and Discussion
  5.1 Model Performances
  5.2 Sensitivity Analysis
  5.3 In-stream Application of Empirical Equations
Chapter 6. Conclusion
References
Appendix I. The mean, minimum, and maximum values of the model fitness value and number of evolution using the SC-SAHEL with single-EA and multi-EA
Appendix II. Used dimensionless datasets for development of empirical equations
Abstract in Korean
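
For orientation, the transient storage model referred to in this record is commonly written in the one-zone form below. This is the widely used textbook formulation without lateral inflow, given here as an assumption; the thesis itself may use a different parameterization or notation.

```latex
\begin{align}
\frac{\partial C}{\partial t}
  &= -\frac{Q}{A}\,\frac{\partial C}{\partial x}
     + \frac{1}{A}\,\frac{\partial}{\partial x}\!\left(A D \frac{\partial C}{\partial x}\right)
     + \alpha\,(C_s - C), \\
\frac{d C_s}{d t}
  &= \alpha\,\frac{A}{A_s}\,(C - C_s).
\end{align}
```

Here $C$ and $C_s$ are the solute concentrations in the main channel and in the storage zone, $Q$ is the discharge, $A$ and $A_s$ are the cross-sectional areas of the main channel and the storage zone, $D$ is the dispersion coefficient, and $\alpha$ is the exchange coefficient between the two zones. In this form the four parameters usually calibrated are $D$, $A$, $A_s$ and $\alpha$.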

SNU Open Repository and Archive

P5 eHealth: An Agenda for the Health Technologies of the Future

Publication venue: Springer Science and Business Media LLC

This open access volume focuses on the development of a P5 eHealth, or better, a methodological resource for developing the health technologies of the future, based on patients' personal characteristics and needs as the fundamental guidelines for design. It provides practical guidelines and evidence-based examples on how to design, implement, use and elevate new technologies for healthcare to support the management of incurable, chronic conditions. The volume further discusses the criticalities of eHealth, why it is difficult to employ eHealth from an organizational point of view or why patients do not always accept the technology, and how eHealth interventions can be improved in the future. By dealing with the state of the art in eHealth technologies, this volume is of great interest to researchers in the field of physical and mental healthcare, psychologists, stakeholders and policymakers, as well as technology developers working in the healthcare sector.

OAPEN Library