Finding the different patterns in buildings data using bag of words representation with clustering
Understanding building operation has become a challenging task
due to the large amount of data recorded in energy-efficient buildings. Even
today, experts rely on visual tools to analyze the data. To make the
task feasible, this paper proposes a method to automatically
detect the different patterns in building data. K-means clustering is used to
automatically identify the ON (operational) cycles of the chiller. In the next
step, the ON cycles are transformed into a symbolic representation using the
Symbolic Aggregate Approximation (SAX) method. The SAX symbols are then
converted to a bag-of-words representation for hierarchical clustering. The
proposed technique is applied to real-life data from an adsorption chiller. The
results of the proposed method and of a dynamic time warping (DTW) approach are
also discussed and compared.
A Study on Variational Component Splitting approach for Mixture Models
The increasing use of mobile devices and the introduction of cloud-based services have resulted in the generation of enormous amounts of data every day. This calls for grouping these data into proper categories. Various clustering techniques have been introduced over the years to learn patterns in data that might better facilitate the classification process. The finite mixture model is one of the crucial methods used for this task. The basic idea of mixture models is to fit the data at hand to an appropriate distribution. The design of mixture models hence involves finding the appropriate parameters of the distribution and estimating the number of clusters in the data. We use a variational component splitting framework to do this, which can simultaneously learn the parameters of the model and estimate the number of components in the model. The variational algorithm helps to overcome the computational complexity of purely Bayesian approaches and the overfitting problems experienced with maximum likelihood approaches, while guaranteeing convergence. The choice of distribution remains the core concern of mixture models in recent research. The efficiency of the Dirichlet family of distributions for this purpose has been demonstrated in recent studies, especially for non-Gaussian data. This led us to study the impact of the variational component splitting approach on mixture models based on several distributions. Hence, our contribution is the application of the variational component splitting approach to design finite mixture models based on inverted Dirichlet, generalized inverted Dirichlet and inverted Beta-Liouville distributions. In addition, we incorporate a simultaneous feature selection approach for the generalized inverted Dirichlet mixture model along with component splitting as another experimental contribution. We evaluate the performance of our models on various real-life applications such as object, scene, texture, speech and video categorization.
Visual Landmark Recognition from Internet Photo Collections: A Large-Scale Evaluation
The task of a visual landmark recognition system is to identify photographed
buildings or objects in query photos and to provide the user with relevant
information on them. With their increasing coverage of the world's landmark
buildings and objects, Internet photo collections are now being used as a
source for building such systems in a fully automatic fashion. This process
typically consists of three steps: clustering large amounts of images by the
objects they depict; determining object names from user-provided tags; and
building a robust, compact, and efficient recognition index. To date,
however, there is little empirical information on how well current approaches
for those steps perform in a large-scale open-set mining and recognition task.
Furthermore, there is little empirical information on how recognition
performance varies for different types of landmark objects and where there is
still potential for improvement. With this paper, we intend to fill these gaps.
Using a dataset of 500k images from Paris, we analyze each component of the
landmark recognition pipeline in order to answer the following questions: How
many and what kinds of objects can be discovered automatically? How can we best
use the resulting image clusters to recognize the object in a query? How can
the object be efficiently represented in memory for recognition? How reliably
can semantic information be extracted? And finally: What are the limiting
factors in the resulting pipeline from query to semantics? We evaluate how
different choices of methods and parameters for the individual pipeline steps
affect overall system performance and examine their effects for different query
categories such as buildings, paintings or sculptures.