Query-Specific Subtopic Clustering in Response to Broad Queries
Information Retrieval (IR) refers to obtaining valuable and relevant information from various sources in response to a specific information need. For the textual domain, the most common form of information sources is a collection of textual documents or text corpus. Depending on the scope of the information need, also referred to as the query, the relevant information can span a wide range of topical themes. Hence, the relevant information may often be scattered through multiple documents in the corpus, and each satisfies the information need to varying degrees. Traditional IR systems present the relevant set of documents in the form of a ranking where the rank of a particular document corresponds to its degree of relevance to the query.
If the query is sufficiently specific, the set of relevant documents will be more or less about similar topics. However, they will be much more topically diverse when the query is vague or about a generalized topic, e.g., "Computer Science". In such cases, multiple documents may be of equal importance as each represents a specific facet of the broad topic of the query. Consider, for example, documents related to information retrieval and machine learning for the query "Computer Science". In this case, the decision of how to rank documents from these two subtopics against each other would be ambiguous. Instead, presenting the retrieved results as a set of clusters, where each cluster represents one subtopic, would be more appropriate. Subtopic clustering of search results has been explored in the domain of web search, where users receive relevant clusters of search results in response to their query.
This thesis explores query-specific subtopic clustering that incorporates queries into the clustering framework. We develop a query-specific similarity metric that governs a hierarchical clustering algorithm. The similarity metric is trained to predict whether a pair of relevant documents should also share the same subtopic cluster in the context of the query. Our empirical study shows that direct involvement of the query in the clustering model significantly improves the clustering performance over a state-of-the-art neural approach on two publicly available datasets. Further qualitative studies provide insights into the strengths and limitations of our proposed approach.
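As a rough illustration of how a query-specific similarity can govern hierarchical clustering, the sketch below runs average-linkage agglomerative clustering over a pairwise similarity that takes the query as an input. The similarity function here is a hand-written toy (token overlap with the query's own tokens removed), standing in for the trained metric described above; the function names `query_aware_similarity` and `cluster` are hypothetical and not taken from the thesis.

```python
def query_aware_similarity(query, doc_a, doc_b):
    """Toy stand-in for a trained metric (an assumption, not the thesis model):
    Jaccard overlap of two documents' tokens, ignoring tokens from the query."""
    q = set(query.lower().split())
    a = set(doc_a.lower().split()) - q
    b = set(doc_b.lower().split()) - q
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def cluster(query, docs, num_clusters):
    """Average-linkage agglomerative clustering over document indices."""
    if num_clusters < 1:
        raise ValueError("num_clusters must be at least 1")
    clusters = [[i] for i in range(len(docs))]
    # Precompute the query-conditioned similarity for every document pair.
    sim = {
        (i, j): query_aware_similarity(query, docs[i], docs[j])
        for i in range(len(docs))
        for j in range(i + 1, len(docs))
    }

    def linkage(c1, c2):
        # Mean pairwise similarity between the members of two clusters.
        total = sum(sim[(min(i, j), max(i, j))] for i in c1 for j in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > num_clusters:
        # Merge the pair of clusters with the highest average similarity.
        x, y = max(
            ((x, y) for x in range(len(clusters)) for y in range(x + 1, len(clusters))),
            key=lambda p: linkage(clusters[p[0]], clusters[p[1]]),
        )
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]
    return [sorted(c) for c in clusters]


docs = [
    "computer science ranking documents retrieval",
    "computer science retrieval query ranking",
    "computer science neural network training",
    "computer science training gradient neural",
]
print(cluster("computer science", docs, 2))
```

Because the query tokens are removed before comparison, the terms every relevant document shares with the query do not inflate the similarity, so the retrieval-related and learning-related documents end up in separate clusters.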
In addition to query-specific similarity metrics, this thesis also explores a new supervised clustering paradigm that directly optimizes for a clustering metric. Because clustering metrics are discrete functions, existing supervised clustering approaches find it difficult to use them directly as optimization objectives. We propose a scalable training strategy for document embedding models that directly optimizes for the Rand index, a clustering quality metric. Our method outperforms a strong neural approach and other unsupervised baselines on two publicly available datasets. This suggests that optimizing directly for the clustering outcome indeed yields better document representations suitable for clustering.
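For reference, the Rand index is the fraction of item pairs on which two clusterings agree: pairs placed together in both, or apart in both, divided by the total number of pairs. The sketch below only computes the metric in its standard form; it does not reproduce the training strategy proposed in the thesis, and the function name `rand_index` is chosen here for illustration.

```python
from itertools import combinations


def rand_index(true_labels, pred_labels):
    """Fraction of item pairs on which two clusterings agree."""
    if len(true_labels) != len(pred_labels):
        raise ValueError("label lists must have the same length")
    pairs = list(combinations(range(len(true_labels)), 2))
    if not pairs:
        return 1.0  # fewer than two items: nothing to disagree on
    agree = 0
    for i, j in pairs:
        same_true = true_labels[i] == true_labels[j]
        same_pred = pred_labels[i] == pred_labels[j]
        # A pair counts as agreement if both clusterings group it the same way.
        if same_true == same_pred:
            agree += 1
    return agree / len(pairs)


print(rand_index([0, 0, 1, 1], [1, 1, 0, 0]))  # identical partitions, labels renamed
print(rand_index([0, 0, 1, 1], [0, 1, 0, 1]))
```

The count of agreeing pairs changes in integer steps as cluster assignments change, which is why the metric cannot be used as a gradient-based objective without some relaxation.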
This thesis also studies the generalizability of our findings by incorporating the query-specific clustering approach and our clustering metric-based optimization technique into a single end-to-end supervised clustering model. We also extend our methods to different clustering algorithms to show that our approaches do not depend on any specific clustering algorithm. Such a generalized query-specific clustering model can help to change the way digital information is organized, archived, and presented to the user in a context-aware manner.
Machine Learning Models for Context-Aware Recommender Systems
The mass adoption of the internet has resulted in the exponential growth of products and services on the World Wide Web. An individual consumer, faced with this data deluge, is expected to make reasonable choices while saving time and money. Organizations are facing increased competition, and they are looking for innovative ways to increase revenue and customer loyalty. A business wants to target the right product or service to an individual consumer, and this drives personalized recommendation. Recommender systems, designed to provide personalized recommendations, initially focused only on the user-item interaction. However, these systems evolved to provide context-aware recommendations. Context-aware recommender systems utilize additional context, such as genre for movie recommendation, while recommending items to users. Latent factor methods have been a popular choice for recommender systems. With the resurgence of neural networks, there has also been a trend towards applying deep learning methods to recommender systems.
This study proposes a novel contextual latent factor model that is capable of utilizing context from the dual perspective of both users and items. The proposed model, known as the Group-Aware Latent Factor Model (GLFM), is applied to the event recommendation task. The GLFM model is extensible, and it allows other contextual attributes to be easily incorporated into the model. While latent factor models have been extremely popular for recommender systems, they are unable to model complex non-linear user-item relationships. This has driven interest in applying deep learning methods to recommender systems. This study also proposes another novel method based on the denoising autoencoder architecture, which is referred to as the Attentive Contextual Denoising Autoencoder (ACDA). The ACDA model augments the basic denoising autoencoder with a context-driven attention mechanism to provide personalized recommendations. The ACDA model is applied to the event and movie recommendation tasks.
The effectiveness of the proposed models is demonstrated on real-world datasets from Meetup and MovieLens, and the results are compared against current state-of-the-art baseline methods.
Machine Learning Models for Educational Platforms
Scaling up education online and onlife is presenting numerous key challenges, such as hardly manageable classes, overwhelming content alternatives, and academic dishonesty while interacting remotely. However, thanks to the wider availability of learning-related data and increasingly higher performance computing, Artificial Intelligence has the potential to turn such challenges into an unparalleled opportunity. One of its sub-fields, namely Machine Learning, is enabling machines to receive data and learn for themselves, without being programmed with rules. Bringing this intelligent support to education at large scale has a number of advantages, such as avoiding manual error-prone tasks and reducing the chance that learners engage in misconduct. Planning, collecting, developing, and predicting become essential steps to make this support concrete in real-world education.
This thesis deals with the design, implementation, and evaluation of Machine Learning models in the context of online educational platforms deployed at large scale. Constructing and assessing the performance of intelligent models is a crucial step towards increasing reliability and convenience of such an educational medium. The contributions result in large data sets and high-performing models that capitalize on Natural Language Processing, Human Behavior Mining, and Machine Perception. The model decisions aim to support stakeholders over the instructional pipeline, specifically on content categorization, content recommendation, learners’ identity verification, and learners’ sentiment analysis. Past research in this field often relied on statistical processes hardly applicable at large scale. Through our studies, we explore opportunities and challenges introduced by Machine Learning for the above goals, a relevant and timely topic in literature.
Supported by extensive experiments, our work reveals a clear opportunity in combining human and machine sensing for researchers interested in online education. Our findings illustrate the feasibility of designing and assessing Machine Learning models for categorization, recommendation, authentication, and sentiment prediction in this research area. Our results provide guidelines on model motivation, data collection, model design, and analysis techniques concerning the above application scenarios. Researchers can use our findings to improve data collection on educational platforms, to reduce bias in data and models, to increase model effectiveness, and to increase the reliability of their models, among others. We expect that this thesis can further support the adoption of Machine Learning models in educational platforms, strengthening the role of data as a precious asset. The thesis outputs are publicly available at https://www.mirkomarras.com.
RouteKG: A knowledge graph-based framework for route prediction on road networks
Short-term route prediction on road networks allows us to anticipate the
future trajectories of road users, enabling a plethora of intelligent
transportation applications such as dynamic traffic control or personalized
route recommendation. Despite recent advances in this area, existing methods
focus primarily on learning sequential transition patterns, neglecting the
inherent spatial structural relations in road networks that can affect human
routing decisions. To fill this gap, this paper introduces RouteKG, a novel
Knowledge Graph-based framework for route prediction. Specifically, we
construct a Knowledge Graph on the road network, thereby learning and
leveraging spatial relations, especially moving directions, which are crucial
for human navigation. Moreover, an n-ary tree-based algorithm is introduced to
efficiently generate top-K routes in a batch mode, enhancing scalability and
computational efficiency. To further optimize the prediction performance, a
rank refinement module is incorporated to fine-tune the candidate route
rankings. The model performance is evaluated using two real-world vehicle
trajectory datasets from two Chinese cities, Chengdu and Shanghai, under
various practical scenarios. The results demonstrate a significant improvement
in accuracy over baseline methods. We further validate our model through a case
study that utilizes the pre-trained model as a simulator for real-time traffic
flow estimation at the link level. The proposed RouteKG promises wide-ranging
applications in vehicle navigation, traffic management, and other intelligent
transportation tasks.
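To make the idea of generating top-K routes by expanding a tree of candidate next links more concrete, here is a minimal sketch using a plain beam search over a toy road graph. This is a generic technique swapped in for illustration, not the n-ary tree-based batch algorithm of RouteKG; the transition probabilities, node names, and the function `top_k_routes` are all assumptions made up for this example.

```python
import heapq
import math


def top_k_routes(transitions, start, steps, k):
    """Expand routes from `start` for up to `steps` hops, keeping the k most
    probable partial routes at each depth (a beam search)."""
    beam = [(0.0, [start])]  # each entry: (log-probability, route so far)
    for _ in range(steps):
        candidates = []
        for logp, route in beam:
            # Each reachable next node becomes a child in the route tree.
            for nxt, p in transitions.get(route[-1], {}).items():
                if p > 0:
                    candidates.append((logp + math.log(p), route + [nxt]))
        if not candidates:
            break  # dead end: no route can be extended further
        beam = heapq.nlargest(k, candidates, key=lambda c: c[0])
    return [(route, math.exp(logp)) for logp, route in beam]


# Hypothetical transition probabilities between road segments.
toy_transitions = {
    "A": {"B": 0.7, "C": 0.3},
    "B": {"D": 0.6, "E": 0.4},
    "C": {"D": 0.9, "E": 0.1},
}
for route, prob in top_k_routes(toy_transitions, "A", steps=2, k=2):
    print(route, round(prob, 2))
```

Pruning to k candidates per depth keeps the number of expanded routes bounded, at the cost of possibly discarding a route whose early steps look unlikely.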
Does Negative Sampling Matter? A Review with Insights into its Theory and Applications
Negative sampling has swiftly risen to prominence as a focal point of
research, with wide-ranging applications spanning machine learning, computer
vision, natural language processing, data mining, and recommender systems. This
growing interest raises several critical questions: Does negative sampling
really matter? Is there a general framework that can incorporate all existing
negative sampling methods? In what fields is it applied? Addressing these
questions, we propose a general framework that leverages negative sampling.
Delving into the history of negative sampling, we trace the development of
negative sampling through five evolutionary paths. We dissect and categorize
the strategies used to select negative sample candidates, detailing global,
local, mini-batch, hop, and memory-based approaches. Our review categorizes
current negative sampling methods into five types: static, hard, GAN-based,
Auxiliary-based, and In-batch methods, providing a clear structure for
understanding negative sampling. Beyond detailed categorization, we highlight
the application of negative sampling in various areas, offering insights into
its practical benefits. Finally, we briefly discuss open problems and future
directions for negative sampling.
Comment: 20 pages, 11 figures
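As a small illustration of one of the listed categories, in-batch negative sampling reuses the items that appear elsewhere in the same mini-batch as negatives for each positive pair, so no separate sampling pass is needed. The sketch below is a generic version written for this listing, not code from the review; the function name `in_batch_negatives` and the user and item identifiers are hypothetical.

```python
def in_batch_negatives(batch):
    """For each (user, item) positive pair, treat the items paired with
    other users in the same batch as that user's negatives."""
    samples = []
    for user, pos_item in batch:
        # Skip the user's own pairs and any copy of the positive item,
        # then drop duplicates while preserving batch order.
        negatives = list(dict.fromkeys(
            item for other_user, item in batch
            if other_user != user and item != pos_item
        ))
        samples.append((user, pos_item, negatives))
    return samples


batch = [("u1", "a"), ("u2", "b"), ("u3", "a")]
for user, pos, negs in in_batch_negatives(batch):
    print(user, pos, negs)
```

Note that an item taken from the batch may in fact be one the user likes but has not interacted with in this batch, so some of these negatives can be false negatives.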