1,501 research outputs found
Segmentation of Fault Networks Determined from Spatial Clustering of Earthquakes
We present a new method of data clustering applied to earthquake catalogs,
with the goal of reconstructing the seismically active part of fault networks.
We first use an original method to separate clustered events from uncorrelated
seismicity using the distribution of volumes of tetrahedra defined by closest
neighbor events in the original and randomized seismic catalogs. The spatial
disorder of the complex geometry of fault networks is then taken into account
by defining faults as probabilistic anisotropic kernels, whose structures are
motivated by properties of discontinuous tectonic deformation and previous
empirical observations of the geometry of faults and of earthquake clusters at
many spatial and temporal scales. Combining this a priori knowledge with
information theoretical arguments, we propose the Gaussian mixture approach
implemented in an Expectation-Maximization (EM) procedure. A cross-validation
scheme is then used and allows the determination of the number of kernels that
should be used to provide an optimal data clustering of the catalog. This
three-steps approach is applied to a high quality relocated catalog of the
seismicity following the 1986 Mount Lewis () event in California and
reveals that events cluster along planar patches of about 2 km, i.e.
comparable to the size of the main event. The finite thickness of those
clusters (about 290 m) suggests that events do not occur on well-defined
euclidean fault core surfaces, but rather that the damage zone surrounding
faults may be seismically active at depth. Finally, we propose a connection
between our methodology and multi-scale spatial analysis, based on the
derivation of spatial fractal dimension of about 1.8 for the set of hypocenters
in the Mnt Lewis area, consistent with recent observations on relocated
catalogs
Deep filter banks for texture recognition, description, and segmentation
Visual textures have played a key role in image understanding because they
convey important semantics of images, and because texture representations that
pool local image descriptors in an orderless manner have had a tremendous
impact in diverse applications. In this paper we make several contributions to
texture understanding. First, instead of focusing on texture instance and
material category recognition, we propose a human-interpretable vocabulary of
texture attributes to describe common texture patterns, complemented by a new
describable texture dataset for benchmarking. Second, we look at the problem of
recognizing materials and texture attributes in realistic imaging conditions,
including when textures appear in clutter, developing corresponding benchmarks
on top of the recently proposed OpenSurfaces dataset. Third, we revisit classic
texture representations, including bag-of-visual-words and the Fisher vectors,
in the context of deep learning and show that these have excellent efficiency
and generalization properties if the convolutional layers of a deep model are
used as filter banks. We obtain in this manner state-of-the-art performance in
numerous datasets well beyond textures, an efficient method to apply deep
features to image regions, as well as benefit in transferring features from one
domain to another.Comment: 29 pages; 13 figures; 8 table
Breast Tumor Classification Using an Ensemble Machine Learning Method
Breast cancer is the most common cause of death for women worldwide. Thus, the ability of artificial intelligence systems to detect possible breast cancer is very important. In this paper, an ensemble classification mechanism is proposed based on a majority voting mechanism. First, the performance of different state-of-the-art machine learning classification algorithms were evaluated for the Wisconsin Breast Cancer Dataset (WBCD). The three best classifiers were then selected based on their F3 score. F3 score is used to emphasize the importance of false negatives (recall) in breast cancer classification. Then, these three classifiers, simple logistic regression learning, support vector machine learning with stochastic gradient descent optimization and multilayer perceptron network, are used for ensemble classification using a voting mechanism. We also evaluated the performance of hard and soft voting mechanism. For hard voting, majority-based voting mechanism was used and for soft voting we used average of probabilities, product of probabilities, maximum of probabilities and minimum of probabilities-based voting methods. The hard voting (majority-based voting) mechanism shows better performance with 99.42%, as compared to the state-of-the-art algorithm for WBCD
Data Science and Knowledge Discovery
Data Science (DS) is gaining significant importance in the decision process due to a mix of various areas, including Computer Science, Machine Learning, Math and Statistics, domain/business knowledge, software development, and traditional research. In the business field, DS's application allows using scientific methods, processes, algorithms, and systems to extract knowledge and insights from structured and unstructured data to support the decision process. After collecting the data, it is crucial to discover the knowledge. In this step, Knowledge Discovery (KD) tasks are used to create knowledge from structured and unstructured sources (e.g., text, data, and images). The output needs to be in a readable and interpretable format. It must represent knowledge in a manner that facilitates inferencing. KD is applied in several areas, such as education, health, accounting, energy, and public administration. This book includes fourteen excellent articles which discuss this trending topic and present innovative solutions to show the importance of Data Science and Knowledge Discovery to researchers, managers, industry, society, and other communities. The chapters address several topics like Data mining, Deep Learning, Data Visualization and Analytics, Semantic data, Geospatial and Spatio-Temporal Data, Data Augmentation and Text Mining
Data mining in manufacturing: a review based on the kind of knowledge
In modern manufacturing environments, vast amounts of data are collected in database management systems and data warehouses from all involved areas, including product and process design, assembly, materials planning, quality control, scheduling, maintenance, fault detection etc. Data mining has emerged as an important tool for knowledge acquisition from the manufacturing databases. This paper reviews the literature dealing with knowledge discovery and data mining applications in the broad domain of manufacturing with a special emphasis on the type of functions to be performed on the data. The major data mining functions to be performed include characterization and description, association, classification, prediction, clustering and evolution analysis. The papers reviewed have therefore been categorized in these five categories. It has been shown that there is a rapid growth in the application of data mining in the context of manufacturing processes and enterprises in the last 3 years. This review reveals the progressive applications and existing gaps identified in the context of data mining in manufacturing. A novel text mining approach has also been used on the abstracts and keywords of 150 papers to identify the research gaps and find the linkages between knowledge area, knowledge type and the applied data mining tools and techniques
- …