9,072 research outputs found
A General Spatio-Temporal Clustering-Based Non-local Formulation for Multiscale Modeling of Compartmentalized Reservoirs
Representing the reservoir as a network of discrete compartments with
neighbor and non-neighbor connections is a fast, yet accurate method for
analyzing oil and gas reservoirs. Automatic and rapid detection of coarse-scale
compartments with distinct static and dynamic properties is an integral part of
such high-level reservoir analysis. In this work, we present a hybrid framework
specific to reservoir analysis for automatic detection of clusters in space
using spatial and temporal field data, coupled with a physics-based multiscale
modeling approach. Specifically, we couple a physics-based non-local modeling
framework with data-driven clustering techniques to provide fast and accurate
multiscale modeling of compartmentalized reservoirs. This research also adds to
the literature by presenting a comprehensive treatment of spatio-temporal
clustering for reservoir studies that accounts for the clustering complexities,
the intrinsically sparse and noisy nature of the data, and the interpretability
of the outcome.
Keywords: Artificial Intelligence; Machine Learning; Spatio-Temporal
Clustering; Physics-Based Data-Driven Formulation; Multiscale Modeling
A systematic review of data quality issues in knowledge discovery tasks
The volume of data is growing rapidly because organizations continuously capture data to support better decision-making. The most fundamental challenge is to explore these large volumes of data and extract useful knowledge for future actions through knowledge discovery tasks; however, much of the data is of poor quality. We present a systematic review of data quality issues in knowledge discovery tasks and a case study applied to the agricultural disease known as coffee rust.
Self-adjustable domain adaptation in personalized ECG monitoring integrated with IR-UWB radar
To enhance personalized detection in electrocardiogram (ECG) monitoring systems, deep neural networks (DNNs) are applied to overcome individual differences through periodic retraining. As introduced previously [4], DNNs mitigate individual differences by fusing ECG with impulse radio ultra-wide band (IR-UWB) radar. However, such a DNN-based ECG monitoring system tends to overfit to small personal datasets and is difficult to generalize to newly collected unlabeled data. This paper proposes a self-adjustable domain adaptation (SADA) strategy to prevent overfitting and exploit unlabeled data. First, this paper enlarges the database of ECG and radar data with actual records acquired from 28 testers and expanded by data augmentation. Second, to utilize unlabeled data, SADA combines self-organizing maps with transfer learning to predict labels. Third, SADA integrates one-class classification with domain adaptation algorithms to reduce overfitting. Based on our enlarged database and standard databases, a large dataset of 73200 records and a small one of 1849 records are built to verify our proposal. Results show SADA's effectiveness in predicting labels and an increase in the sensitivity of DNNs of 14.4% compared with existing domain adaptation algorithms.
Laplacian Mixture Modeling for Network Analysis and Unsupervised Learning on Graphs
Laplacian mixture models identify overlapping regions of influence in
unlabeled graph and network data in a scalable and computationally efficient
way, yielding useful low-dimensional representations. By combining Laplacian
eigenspace and finite mixture modeling methods, they provide probabilistic or
fuzzy dimensionality reductions or domain decompositions for a variety of input
data types, including mixture distributions, feature vectors, and graphs or
networks. Provable optimal recovery using the algorithm is analytically shown
for a nontrivial class of cluster graphs. Heuristic approximations for scalable
high-performance implementations are described and empirically tested.
Connections to PageRank and community detection in network analysis demonstrate
the wide applicability of this approach. The origins of fuzzy spectral methods,
beginning with generalized heat or diffusion equations in physics, are reviewed
and summarized. Comparisons to other dimensionality reduction and clustering
methods for challenging unsupervised machine learning problems are also
discussed.
Comment: 13 figures, 35 references
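As a rough illustration of the Laplacian eigenspace idea mentioned in this abstract, the sketch below builds the unnormalized graph Laplacian of a small graph and splits its nodes by the sign of the second eigenvector (the Fiedler vector). This is a generic spectral partitioning example, not the Laplacian mixture model itself: it yields a hard two-way split instead of the probabilistic memberships the abstract describes. The function name `fiedler_partition` and the six-node example graph are assumptions made for illustration.

```python
import numpy as np

def fiedler_partition(adjacency):
    """Split a connected graph in two using the sign of the Fiedler vector.

    Illustrative helper (hypothetical name); expects a symmetric adjacency matrix.
    """
    a = np.asarray(adjacency, dtype=float)
    # Unnormalized graph Laplacian: degree matrix minus adjacency matrix.
    laplacian = np.diag(a.sum(axis=1)) - a
    # eigh returns eigenvalues in ascending order for symmetric matrices.
    eigenvalues, eigenvectors = np.linalg.eigh(laplacian)
    # The eigenvector of the second-smallest eigenvalue is the Fiedler vector.
    fiedler = eigenvectors[:, 1]
    return fiedler > 0

# Assumed toy graph: two triangles (0,1,2) and (3,4,5) joined by the edge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
adjacency = np.zeros((6, 6))
for i, j in edges:
    adjacency[i, j] = adjacency[j, i] = 1.0

labels = fiedler_partition(adjacency)
print(labels)  # one triangle is labelled True, the other False (sign is arbitrary)
```

The overall sign of an eigenvector is arbitrary, so only the grouping of nodes is meaningful, not which group is labelled True.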
Path Similarity Analysis: a Method for Quantifying Macromolecular Pathways
Diverse classes of proteins function through large-scale conformational
changes; sophisticated enhanced sampling methods have been proposed to generate
these macromolecular transition paths. As such paths are curves in a
high-dimensional space, they have been difficult to compare quantitatively, a
prerequisite for, for instance, assessing the quality of different sampling
algorithms. The Path Similarity Analysis (PSA) approach alleviates these
difficulties by utilizing the full information in 3N-dimensional trajectories
in configuration space. PSA employs the Hausdorff or Fréchet path
metrics, adopted from computational geometry, enabling us to quantify path
(dis)similarity, while the new concept of a Hausdorff-pair map permits the
extraction of atomic-scale determinants responsible for path differences.
Combined with clustering techniques, PSA facilitates the comparison of many
paths, including collections of transition ensembles. We use the closed-to-open
transition of the enzyme adenylate kinase (AdK), a commonly used testbed for
the assessment of enhanced sampling algorithms, to examine multiple microsecond
equilibrium molecular dynamics (MD) transitions of AdK in its substrate-free
form alongside transition ensembles from the MD-based dynamic importance
sampling (DIMS-MD) and targeted MD (TMD) methods, and a geometrical targeting
algorithm (FRODA). A Hausdorff pairs analysis of these ensembles revealed, for
instance, that differences in DIMS-MD and FRODA paths were mediated by a set of
conserved salt bridges whose charge-charge interactions are fully modeled in
DIMS-MD but not in FRODA. We also demonstrate how existing trajectory analysis
methods relying on pre-defined collective variables, such as native contacts or
geometric quantities, can be used synergistically with PSA, as well as the
application of PSA to more complex systems such as membrane transporter
proteins.
Comment: 9 figures, 3 tables in the main manuscript; supplementary information
includes 7 texts (S1 Text - S7 Text) and 11 figures (S1 Fig - S11 Fig) (also
available from journal site)
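To make the Hausdorff path metric named in this abstract concrete, here is a minimal sketch that computes the symmetric Hausdorff distance between two discretized paths. It assumes each path is an array of frames and uses plain Euclidean distance between frames; the abstract does not say which point-wise distance PSA uses, so that choice, the function name `hausdorff_distance`, and the toy paths are illustrative assumptions.

```python
import numpy as np

def hausdorff_distance(path_a, path_b):
    """Symmetric Hausdorff distance between two paths of shape (frames, dims).

    Illustrative helper (hypothetical name); Euclidean frame distance is assumed.
    """
    a = np.asarray(path_a, dtype=float)
    b = np.asarray(path_b, dtype=float)
    # Pairwise distances between every frame of a and every frame of b.
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    # Directed distances: the largest nearest-neighbour distance in each direction.
    a_to_b = d.min(axis=1).max()
    b_to_a = d.min(axis=0).max()
    return max(a_to_b, b_to_a)

# Two assumed toy paths: parallel straight lines one unit apart.
path_p = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 0.0]])
path_q = np.array([[0.0, 1.0], [1.0, 1.0], [2.0, 1.0]])
print(hausdorff_distance(path_p, path_q))  # 1.0
```

Unlike the Fréchet metric, this distance ignores the order in which frames are visited, so two paths that trace the same points in opposite directions have a Hausdorff distance of zero.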
Slow slip and the transition from fast to slow fronts in the rupture of frictional interfaces
The failure of the population of micro-junctions forming the frictional
interface between two solids is central to fields ranging from biomechanics to
seismology. This failure is mediated by the propagation along the interface of
various types of rupture fronts, covering a wide range of velocities. Among
them are so-called slow fronts, which are recently discovered fronts much
slower than the materials' sound speeds. Despite intense modelling activity,
the mechanisms underlying slow fronts remain elusive. Here, we introduce a
multi-scale model capable of reproducing both the transition from fast to slow
fronts in a single rupture event and the short-time slip dynamics observed in
recent experiments. We identify slow slip immediately following the arrest of a
fast front as a phenomenon sufficient for the front to propagate further at a
much slower pace. Whether slow fronts are actually observed is controlled both
by the interfacial stresses and by the width of the local distribution of
forces among micro-junctions. Our results show that slow fronts are
qualitatively different from faster fronts. Since the transition from fast to
slow fronts is potentially as generic as slow slip, we anticipate that it might
occur in the wide range of systems in which slow slip has been reported,
including seismic faults.
Comment: 35 pages, 5 primary figures, 6 supporting figures. Post-print version
with improvements from review process included