9,072 research outputs found

    A General Spatio-Temporal Clustering-Based Non-local Formulation for Multiscale Modeling of Compartmentalized Reservoirs

    Full text link
    Representing the reservoir as a network of discrete compartments with neighbor and non-neighbor connections is a fast, yet accurate method for analyzing oil and gas reservoirs. Automatic and rapid detection of coarse-scale compartments with distinct static and dynamic properties is an integral part of such high-level reservoir analysis. In this work, we present a hybrid framework specific to reservoir analysis for an automatic detection of clusters in space using spatial and temporal field data, coupled with a physics-based multiscale modeling approach. In this work a novel hybrid approach is presented in which we couple a physics-based non-local modeling framework with data-driven clustering techniques to provide a fast and accurate multiscale modeling of compartmentalized reservoirs. This research also adds to the literature by presenting a comprehensive work on spatio-temporal clustering for reservoir studies applications that well considers the clustering complexities, the intrinsic sparse and noisy nature of the data, and the interpretability of the outcome. Keywords: Artificial Intelligence; Machine Learning; Spatio-Temporal Clustering; Physics-Based Data-Driven Formulation; Multiscale Modelin

    A systematic review of data quality issues in knowledge discovery tasks

    Get PDF
    Hay un gran crecimiento en el volumen de datos porque las organizaciones capturan permanentemente la cantidad colectiva de datos para lograr un mejor proceso de toma de decisiones. El desafío mas fundamental es la exploración de los grandes volúmenes de datos y la extracción de conocimiento útil para futuras acciones por medio de tareas para el descubrimiento del conocimiento; sin embargo, muchos datos presentan mala calidad. Presentamos una revisión sistemática de los asuntos de calidad de datos en las áreas del descubrimiento de conocimiento y un estudio de caso aplicado a la enfermedad agrícola conocida como la roya del café.Large volume of data is growing because the organizations are continuously capturing the collective amount of data for better decision-making process. The most fundamental challenge is to explore the large volumes of data and extract useful knowledge for future actions through knowledge discovery tasks, nevertheless many data has poor quality. We presented a systematic review of the data quality issues in knowledge discovery tasks and a case study applied to agricultural disease named coffee rust

    Self-adjustable domain adaptation in personalized ECG monitoring integrated with IR-UWB radar

    Get PDF
    To enhance electrocardiogram (ECG) monitoring systems in personalized detections, deep neural networks (DNNs) are applied to overcome individual differences by periodical retraining. As introduced previously [4], DNNs relieve individual differences by fusing ECG with impulse radio ultra-wide band (IR-UWB) radar. However, such DNN-based ECG monitoring system tends to overfit into personal small datasets and is difficult to generalize to newly collected unlabeled data. This paper proposes a self-adjustable domain adaptation (SADA) strategy to prevent from overfitting and exploit unlabeled data. Firstly, this paper enlarges the database of ECG and radar data with actual records acquired from 28 testers and expanded by the data augmentation. Secondly, to utilize unlabeled data, SADA combines self organizing maps with the transfer learning in predicting labels. Thirdly, SADA integrates the one-class classification with domain adaptation algorithms to reduce overfitting. Based on our enlarged database and standard databases, a large dataset of 73200 records and a small one of 1849 records are built up to verify our proposal. Results show SADA\u27s effectiveness in predicting labels and increments in the sensitivity of DNNs by 14.4% compared with existing domain adaptation algorithms

    Proceedings of the 2nd Computer Science Student Workshop: Microsoft Istanbul, Turkey, April 9, 2011

    Get PDF

    Laplacian Mixture Modeling for Network Analysis and Unsupervised Learning on Graphs

    Full text link
    Laplacian mixture models identify overlapping regions of influence in unlabeled graph and network data in a scalable and computationally efficient way, yielding useful low-dimensional representations. By combining Laplacian eigenspace and finite mixture modeling methods, they provide probabilistic or fuzzy dimensionality reductions or domain decompositions for a variety of input data types, including mixture distributions, feature vectors, and graphs or networks. Provable optimal recovery using the algorithm is analytically shown for a nontrivial class of cluster graphs. Heuristic approximations for scalable high-performance implementations are described and empirically tested. Connections to PageRank and community detection in network analysis demonstrate the wide applicability of this approach. The origins of fuzzy spectral methods, beginning with generalized heat or diffusion equations in physics, are reviewed and summarized. Comparisons to other dimensionality reduction and clustering methods for challenging unsupervised machine learning problems are also discussed.Comment: 13 figures, 35 reference

    Path Similarity Analysis: a Method for Quantifying Macromolecular Pathways

    Full text link
    Diverse classes of proteins function through large-scale conformational changes; sophisticated enhanced sampling methods have been proposed to generate these macromolecular transition paths. As such paths are curves in a high-dimensional space, they have been difficult to compare quantitatively, a prerequisite to, for instance, assess the quality of different sampling algorithms. The Path Similarity Analysis (PSA) approach alleviates these difficulties by utilizing the full information in 3N-dimensional trajectories in configuration space. PSA employs the Hausdorff or Fr\'echet path metrics---adopted from computational geometry---enabling us to quantify path (dis)similarity, while the new concept of a Hausdorff-pair map permits the extraction of atomic-scale determinants responsible for path differences. Combined with clustering techniques, PSA facilitates the comparison of many paths, including collections of transition ensembles. We use the closed-to-open transition of the enzyme adenylate kinase (AdK)---a commonly used testbed for the assessment enhanced sampling algorithms---to examine multiple microsecond equilibrium molecular dynamics (MD) transitions of AdK in its substrate-free form alongside transition ensembles from the MD-based dynamic importance sampling (DIMS-MD) and targeted MD (TMD) methods, and a geometrical targeting algorithm (FRODA). A Hausdorff pairs analysis of these ensembles revealed, for instance, that differences in DIMS-MD and FRODA paths were mediated by a set of conserved salt bridges whose charge-charge interactions are fully modeled in DIMS-MD but not in FRODA. We also demonstrate how existing trajectory analysis methods relying on pre-defined collective variables, such as native contacts or geometric quantities, can be used synergistically with PSA, as well as the application of PSA to more complex systems such as membrane transporter proteins.Comment: 9 figures, 3 tables in the main manuscript; supplementary information includes 7 texts (S1 Text - S7 Text) and 11 figures (S1 Fig - S11 Fig) (also available from journal site

    Slow slip and the transition from fast to slow fronts in the rupture of frictional interfaces

    Full text link
    The failure of the population of micro-junctions forming the frictional interface between two solids is central to fields ranging from biomechanics to seismology. This failure is mediated by the propagation along the interface of various types of rupture fronts, covering a wide range of velocities. Among them are so-called slow fronts, which are recently discovered fronts much slower than the materials' sound speeds. Despite intense modelling activity, the mechanisms underlying slow fronts remain elusive. Here, we introduce a multi-scale model capable of reproducing both the transition from fast to slow fronts in a single rupture event and the short-time slip dynamics observed in recent experiments. We identify slow slip immediately following the arrest of a fast front as a phenomenon sufficient for the front to propagate further at a much slower pace. Whether slow fronts are actually observed is controlled both by the interfacial stresses and by the width of the local distribution of forces among micro-junctions. Our results show that slow fronts are qualitatively different from faster fronts. Since the transition from fast to slow fronts is potentially as generic as slow slip, we anticipate that it might occur in the wide range of systems in which slow slip has been reported, including seismic faults.Comment: 35 pages, 5 primary figures, 6 supporting figures. Post-print version with improvements from review process include
    corecore