93,302 research outputs found
Recommended from our members
Expert-augmented machine learning.
Machine learning is proving invaluable across disciplines. However, its success is often limited by the quality and quantity of available data, while its adoption is limited by the level of trust afforded by given models. Human vs. machine performance is commonly compared empirically to decide whether a certain task should be performed by a computer or an expert. In reality, the optimal learning strategy may involve combining the complementary strengths of humans and machines. Here, we present expert-augmented machine learning (EAML), an automated method that guides the extraction of expert knowledge and its integration into machine-learned models. We used a large dataset of intensive-care patient data to derive 126 decision rules that predict hospital mortality. Using an online platform, we asked 15 clinicians to assess the relative risk of the subpopulation defined by each rule compared to the total sample. We compared the clinician-assessed risk to the empirical risk and found that, while clinicians agreed with the data in most cases, there were notable exceptions where they overestimated or underestimated the true risk. Studying the rules with greatest disagreement, we identified problems with the training data, including one miscoded variable and one hidden confounder. Filtering the rules based on the extent of disagreement between clinician-assessed risk and empirical risk, we improved performance on out-of-sample data and were able to train with less data. EAML provides a platform for automated creation of problem-specific priors, which help build robust and dependable machine-learning models in critical applications
Enhancing Energy Production with Exascale HPC Methods
High Performance Computing (HPC) resources have become the key actor for achieving more ambitious challenges in many disciplines. In this step beyond, an explosion on the available parallelism and the use of special purpose
processors are crucial. With such a goal, the HPC4E project applies new exascale HPC techniques to energy industry simulations, customizing them if necessary, and going beyond the state-of-the-art in the required HPC exascale
simulations for different energy sources. In this paper, a general overview of these methods is presented as well as some specific preliminary results.The research leading to these results has received funding from the European Union's Horizon 2020 Programme (2014-2020) under the HPC4E Project (www.hpc4e.eu), grant agreement n° 689772, the Spanish Ministry of
Economy and Competitiveness under the CODEC2 project (TIN2015-63562-R), and
from the Brazilian Ministry of Science, Technology and Innovation through Rede
Nacional de Pesquisa (RNP). Computer time on Endeavour cluster is provided by the
Intel Corporation, which enabled us to obtain the presented experimental results in
uncertainty quantification in seismic imagingPostprint (author's final draft
Planetary Geology: Goals, Future Directions, and Recommendations
Planetary exploration has provided a torrent of discoveries and a recognition that planets are not inert objects. This expanded view has led to the notion of comparative planetology, in which the differences and similarities among planetary objects are assessed. Solar system exploration is undergoing a change from an era of reconnaissance to one of intensive exploration and focused study. Analyses of planetary surfaces are playing a key role in this transition, especially as attention is focused on such exploration goals as returned samples from Mars. To assess how the science of planetary geology can best contribute to the goals of solar system exploration, a workshop was held at Arizona State University in January 1987. The participants discussed previous accomplishments of the planetary geology program, assessed the current studies in planetary geology, and considered the requirements to meet near-term and long-term exploration goals
Artificial intelligence approaches to software engineering
Artificial intelligence approaches to software engineering are examined. The software development life cycle is a sequence of not so well-defined phases. Improved techniques for developing systems have been formulated over the past 15 years, but pressure continues to attempt to reduce current costs. Software development technology seems to be standing still. The primary objective of the knowledge-based approach to software development presented in this paper is to avoid problem areas that lead to schedule slippages, cost overruns, or software products that fall short of their desired goals. Identifying and resolving software problems early, often in the phase in which they first occur, has been shown to contribute significantly to reducing risks in software development. Software development is not a mechanical process but a basic human activity. It requires clear thinking, work, and rework to be successful. The artificial intelligence approaches to software engineering presented support the software development life cycle through the use of software development techniques and methodologies in terms of changing current practices and methods. These should be replaced by better techniques that that improve the process of of software development and the quality of the resulting products. The software development process can be structured into well-defined steps, of which the interfaces are standardized, supported and checked by automated procedures that provide error detection, production of the documentation and ultimately support the actual design of complex programs
Expert-Augmented Machine Learning
Machine Learning is proving invaluable across disciplines. However, its
success is often limited by the quality and quantity of available data, while
its adoption by the level of trust that models afford users. Human vs. machine
performance is commonly compared empirically to decide whether a certain task
should be performed by a computer or an expert. In reality, the optimal
learning strategy may involve combining the complementary strengths of man and
machine. Here we present Expert-Augmented Machine Learning (EAML), an automated
method that guides the extraction of expert knowledge and its integration into
machine-learned models. We use a large dataset of intensive care patient data
to predict mortality and show that we can extract expert knowledge using an
online platform, help reveal hidden confounders, improve generalizability on a
different population and learn using less data. EAML presents a novel framework
for high performance and dependable machine learning in critical applications
Storage Solutions for Big Data Systems: A Qualitative Study and Comparison
Big data systems development is full of challenges in view of the variety of
application areas and domains that this technology promises to serve.
Typically, fundamental design decisions involved in big data systems design
include choosing appropriate storage and computing infrastructures. In this age
of heterogeneous systems that integrate different technologies for optimized
solution to a specific real world problem, big data system are not an exception
to any such rule. As far as the storage aspect of any big data system is
concerned, the primary facet in this regard is a storage infrastructure and
NoSQL seems to be the right technology that fulfills its requirements. However,
every big data application has variable data characteristics and thus, the
corresponding data fits into a different data model. This paper presents
feature and use case analysis and comparison of the four main data models
namely document oriented, key value, graph and wide column. Moreover, a feature
analysis of 80 NoSQL solutions has been provided, elaborating on the criteria
and points that a developer must consider while making a possible choice.
Typically, big data storage needs to communicate with the execution engine and
other processing and visualization technologies to create a comprehensive
solution. This brings forth second facet of big data storage, big data file
formats, into picture. The second half of the research paper compares the
advantages, shortcomings and possible use cases of available big data file
formats for Hadoop, which is the foundation for most big data computing
technologies. Decentralized storage and blockchain are seen as the next
generation of big data storage and its challenges and future prospects have
also been discussed
- …