
    A Review of Codebook Models in Patch-Based Visual Object Recognition

    The codebook model-based approach, while ignoring any structural aspect in vision, nonetheless provides state-of-the-art performance on current datasets. The key role of a visual codebook is to map low-level features into a fixed-length vector in histogram space, to which standard classifiers can be directly applied. The discriminative power of the visual codebook determines the quality of the codebook model, whereas the size of the codebook controls the complexity of the model. The construction of a codebook is therefore an important step, and it is usually done by cluster analysis. However, clustering is a process that retains regions of high density in a distribution, so the resulting codebook need not have discriminant properties; clustering is also recognised as a computational bottleneck of such systems. In our recent work, we proposed a resource-allocating codebook that constructs a discriminant codebook in a one-pass design procedure, slightly outperforming more traditional approaches at drastically reduced computing times. In this review we survey several approaches proposed over the last decade, covering their feature detectors, descriptors, codebook construction schemes, choice of classifiers for recognising objects, and the datasets used in evaluating the proposed methods.
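The pipeline the abstract describes — cluster local descriptors into visual words, then histogram each image's descriptors over those words to obtain a fixed-length vector — can be sketched as below. This is a minimal illustration with a toy k-means and random stand-in descriptors; a real system would use detectors/descriptors such as SIFT and a far larger codebook.

```python
import numpy as np

def build_codebook(descriptors, k, iters=20, seed=0):
    """Toy k-means codebook: cluster local descriptors into k visual words."""
    rng = np.random.default_rng(seed)
    centers = descriptors[rng.choice(len(descriptors), k, replace=False)].copy()
    for _ in range(iters):
        # assign each descriptor to its nearest center, then recompute centers
        d = np.linalg.norm(descriptors[:, None] - centers[None], axis=2)
        labels = d.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = descriptors[labels == j].mean(axis=0)
    return centers

def encode(descriptors, codebook):
    """Map a variable-length descriptor set to a fixed-length histogram."""
    d = np.linalg.norm(descriptors[:, None] - codebook[None], axis=2)
    hist = np.bincount(d.argmin(axis=1), minlength=len(codebook))
    return hist / hist.sum()

rng = np.random.default_rng(1)
train_descs = rng.normal(size=(200, 8))   # stand-in for local image descriptors
codebook = build_codebook(train_descs, k=16)
h = encode(rng.normal(size=(30, 8)), codebook)
print(h.shape)  # (16,) -- fixed length regardless of how many descriptors an image has
```

The fixed-length histogram `h` is what a standard classifier (e.g. an SVM) would consume, regardless of how many local features each image produced.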

    An Analytical Performance Evaluation on Multiview Clustering Approaches

    Clustering is one of the many approaches encompassed by machine learning. Given a collection of data points, a clustering algorithm assigns each point to a group; points in the same group should have similar properties and characteristics, while points in different groups should differ markedly. Recent developments in information-gathering technologies have made it possible to collect data from a variety of sources and analyse them from a variety of perspectives; such data are known as multiview data. Conventional clustering algorithms are applied to a single view, yet real-world data are messy and complex and can be clustered in a variety of different ways depending on how they are interpreted. In recent years, Multiview Clustering (MVC) has therefore garnered increasing attention, as it aims to exploit complementary and consensus information derived from the different views. However, the vast majority of currently available systems support only the single-clustering scenario, in which a single clustering is used to partition the data. In light of this, investigation of the multiview data format is necessary. This study centres on multiview clustering and analytically evaluates how well these approaches perform relative to one another.
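As a deliberately simple baseline for the multiview setting described above, each view can be standardised and the views concatenated before an ordinary clustering algorithm is applied (early fusion); dedicated MVC methods instead co-regularise per-view clusterings to exploit consensus. The sketch below uses synthetic two-view data — all names and data are illustrative, not from the study.

```python
import numpy as np

def kmeans(X, k, iters=25, seed=0):
    """Minimal k-means used as the base (single-view) clusterer."""
    rng = np.random.default_rng(seed)
    C = X[rng.choice(len(X), k, replace=False)].copy()
    for _ in range(iters):
        labels = np.linalg.norm(X[:, None] - C[None], axis=2).argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                C[j] = X[labels == j].mean(axis=0)
    return labels

def multiview_kmeans(views, k):
    """Early-fusion baseline: z-score each view, concatenate, then cluster.

    Standardising per view keeps one view's scale from dominating the result.
    """
    Z = [(V - V.mean(0)) / (V.std(0) + 1e-9) for V in views]
    return kmeans(np.hstack(Z), k)

rng = np.random.default_rng(0)
# two "views" of the same 60 objects, drawn from two latent groups
g = np.repeat([0, 1], 30)
view1 = rng.normal(g[:, None] * 3.0, 1.0, size=(60, 4))
view2 = rng.normal(g[:, None] * 5.0, 1.0, size=(60, 6))
labels = multiview_kmeans([view1, view2], k=2)
print(labels.shape)  # (60,): one consensus cluster label per object
```

Early fusion ignores view-specific cluster structure, which is precisely the gap the MVC methods surveyed in the study aim to close.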

    Data Driven Approach To Saltwater Disposal (SWD) Well Location Optimization In North Dakota

    The sharp increase in oil and gas production in the Williston Basin of North Dakota since 2006 has resulted in a significant increase in produced water volumes. The primary mechanism for disposal of produced water is injection into the underground Inyan Kara formation through Class-II Saltwater Disposal (SWD) wells. With the number of SWD wells anticipated to increase from 900 to over 1,400 by 2035, and with localized pressurization and other potential issues that could affect the performance of future oil and SWD wells, a reliable model was needed to select locations of future SWD wells for optimum performance. Since it is uncommon to develop traditional geological and simulation models for SWD wells, this research focused on developing data-driven proxy models based on the CRISP-DM pipeline for understanding SWD well performance and optimizing future well locations. NDIC’s oil and gas division was identified as the primary data source. Significant effort went towards identifying secondary data sources, extracting the required data from primary and secondary sources using web scraping, integrating different data types including spatial data, and creating the final data set. The Orange visual programming application and the Python programming language were used to carry out the required data mining activities. Exploratory Data Analysis and clustering analysis were used to gain a good understanding of the features in the data set and their relationships. Graph Data Science techniques such as Knowledge Graphs and graph-based clustering were used to gain further insights. Machine learning regression algorithms such as Multi-Linear Regression, k-Nearest Neighbors and Random Forest were used to train models to predict the average monthly barrels of saltwater disposed in a well. Model performance was optimized using the RMSE metric, and the Random Forest model was selected as the final model for deployment to predict the performance of a planned SWD well. In addition, a multi-target regression model was trained using a deep neural network to predict water production in oil and gas wells drilled in McKenzie County, North Dakota.
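The regression step of such a workflow can be sketched with scikit-learn: fit a Random Forest on tabular well features and score it with RMSE. The features and response below are synthetic stand-ins (the study used data scraped from NDIC and other sources), so only the workflow, not the numbers, mirrors the thesis.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

# Hypothetical features (e.g. depth, injection pressure, spacing, well age)
rng = np.random.default_rng(42)
n = 500
X = rng.normal(size=(n, 4))
# Hypothetical response: average monthly barrels of saltwater disposed
y = 50_000 + 8_000 * X[:, 0] - 5_000 * X[:, 2] + rng.normal(0, 2_000, n)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)
model = RandomForestRegressor(n_estimators=200, random_state=0).fit(X_tr, y_tr)

# RMSE computed explicitly as sqrt(MSE) on the held-out split
rmse = np.sqrt(mean_squared_error(y_te, model.predict(X_te)))
print(f"test RMSE: {rmse:,.0f} bbl/month")
```

The same held-out RMSE comparison extends naturally to the other candidates the study mentions (Multi-Linear Regression, k-Nearest Neighbors), which is how a final model would be selected for deployment.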

    Unsupervised Representation Learning with Minimax Distance Measures

    We investigate the use of Minimax distances to extract, in a nonparametric way, the features that capture the unknown underlying patterns and structures in the data. We develop a general-purpose and computationally efficient framework for employing Minimax distances with the many machine learning methods that operate on numerical data. We study both computing the pairwise Minimax distances for all pairs of objects and computing the Minimax distances of all objects to/from a fixed (test) object. We first efficiently compute the pairwise Minimax distances between the objects, using the equivalence of Minimax distances over a graph and over a minimum spanning tree constructed on it. Then, we embed the pairwise Minimax distances into a new vector space, such that the squared Euclidean distances in the new space equal the pairwise Minimax distances in the original space. We also study the case of having multiple pairwise Minimax matrices instead of a single one, and propose an embedding that first sums the centered matrices and then performs an eigenvalue decomposition to obtain the relevant features. We then study computing Minimax distances from a fixed (test) object, which can be used for instance in K-nearest neighbor search. As in the all-pairs case, we develop an efficient and general-purpose algorithm that is applicable with any arbitrary base distance measure. Moreover, we investigate in detail the edges selected by the Minimax distances and thereby explore their ability to detect outlier objects. Finally, for each setting, we perform several experiments to demonstrate the effectiveness of our framework.
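The first step of the framework rests on the fact that the Minimax distance between two objects — the smallest achievable "largest edge" over all connecting paths — equals the maximum edge weight on their minimum-spanning-tree path. The self-contained sketch below illustrates this with Prim's algorithm over Euclidean base distances; the paper's algorithm is more general (any base distance), so this is an illustration only.

```python
import numpy as np

def minimax_distances(X):
    """All-pairs Minimax distances via Prim's minimum spanning tree.

    As each vertex v is attached to the tree by its cheapest edge, the
    Minimax distance from v to any tree vertex u is the larger of that
    attachment edge and the Minimax distance from v's attachment point to u.
    """
    n = len(X)
    D = np.linalg.norm(X[:, None] - X[None], axis=2)  # Euclidean base distances
    M = np.zeros((n, n))          # Minimax distances, filled as the tree grows
    in_tree = [0]
    best = D[0].copy()            # cheapest edge from each vertex into the tree
    src = np.zeros(n, dtype=int)  # tree endpoint of that cheapest edge
    while len(in_tree) < n:
        rest = [u for u in range(n) if u not in in_tree]
        v = min(rest, key=lambda u: best[u])   # next vertex to attach
        for u in in_tree:
            M[v, u] = M[u, v] = max(best[v], M[src[v], u])
        in_tree.append(v)
        closer = D[v] < best
        best[closer] = D[v][closer]
        src[closer] = v
    return M

# On a 1-D chain the Minimax distance is the largest hop, not the total span:
X = np.array([[0.0], [1.0], [2.1]])
M = minimax_distances(X)
print(round(float(M[0, 2]), 3))  # 1.1  (max hop on the path 0 -> 1 -> 2)
```

The example makes the contrast with the base metric explicit: the Euclidean distance between the endpoints is 2.1, but their Minimax distance is only 1.1, reflecting the dense chain that connects them.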