691 research outputs found
An Analytical Performance Evaluation on Multiview Clustering Approaches
The concept of machine learning encompasses a wide variety of different approaches, one of which is called clustering. The data points are grouped together in this approach to the problem. Using a clustering method, it is feasible, given a collection of data points, to classify each data point as belonging to a specific group. This can be done if the algorithm is given the collection of data points. In theory, data points that constitute the same group ought to have attributes and characteristics that are equivalent to one another, however data points that belong to other groups ought to have properties and characteristics that are very different from one another. The generation of multiview data is made possible by recent developments in information collecting technologies. The data were collected from à variety of sources and were analysed using a variety of perspectives. The data in question are what are known as multiview data. On a single view, the conventional clustering algorithms are applied. In spite of this, real-world data are complicated and can be clustered in a variety of different ways, depending on how the data are interpreted. In practise, the real-world data are messy. In recent years, Multiview Clustering, often known as MVC, has garnered an increasing amount of attention due to its goal of utilising complimentary and consensus information derived from different points of view. On the other hand, the vast majority of the systems that are currently available only enable the single-clustering scenario, whereby only makes utilization of a single cluster to split the data. This is the case since there is only one cluster accessible. In light of this, it is absolutely necessary to carry out investigation on the multiview data format. The study work is centred on multiview clustering and how well it performs compared to these other strategies
Representation Learning for Words and Entities
This thesis presents new methods for unsupervised learning of distributed
representations of words and entities from text and knowledge bases. The first
algorithm presented in the thesis is a multi-view algorithm for learning
representations of words called Multiview Latent Semantic Analysis (MVLSA). By
incorporating up to 46 different types of co-occurrence statistics for the same
vocabulary of english words, I show that MVLSA outperforms other
state-of-the-art word embedding models. Next, I focus on learning entity
representations for search and recommendation and present the second method of
this thesis, Neural Variational Set Expansion (NVSE). NVSE is also an
unsupervised learning method, but it is based on the Variational Autoencoder
framework. Evaluations with human annotators show that NVSE can facilitate
better search and recommendation of information gathered from noisy, automatic
annotation of unstructured natural language corpora. Finally, I move from
unstructured data and focus on structured knowledge graphs. I present novel
approaches for learning embeddings of vertices and edges in a knowledge graph
that obey logical constraints.Comment: phd thesis, Machine Learning, Natural Language Processing,
Representation Learning, Knowledge Graphs, Entities, Word Embeddings, Entity
Embedding
- …