Efficient Data Analytics on Augmented Similarity Triplets
Many machine learning methods (classification, clustering, etc.) start with a
known kernel that provides similarity or distance measure between two objects.
Recent work has extended this to situations where the information about objects
is limited to comparisons of distances between three objects (triplets). Humans
find the comparison task much easier than the estimation of absolute
similarities, so this kind of data can be easily obtained using crowd-sourcing.
In this work, we give an efficient method of augmenting the triplets data, by
utilizing additional implicit information inferred from the existing data.
Triplets augmentation improves the quality of kernel-based and kernel-free data
analytics tasks. Secondly, we also propose a novel set of algorithms for common
supervised and unsupervised machine learning tasks based on triplets. These
methods work directly with triplets, avoiding kernel evaluations. Experimental
evaluation on real and synthetic datasets shows that our methods are more
accurate than the current best-known techniques.
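The abstract above does not say which implicit information is inferred, so the following is only a minimal sketch under one assumed rule: per-anchor transitivity. A triplet (a, b, c) is read as "a is closer to b than to c", and from (a, b, c) and (a, c, d) the sketch infers (a, b, d). The function name `augment_triplets` and the tuple encoding are hypothetical, not taken from the paper.

```python
from collections import defaultdict


def augment_triplets(triplets):
    """Return the input triplets plus those implied by per-anchor transitivity.

    Assumption (not from the paper): (a, b, c) means "a is closer to b than to c".
    """
    # For each anchor a, build a directed graph with an edge b -> c.
    closer = defaultdict(lambda: defaultdict(set))
    for a, b, c in triplets:
        closer[a][b].add(c)

    augmented = set()
    for a, graph in closer.items():
        for start in list(graph):
            # Depth-first search collects every object farther from a than start.
            stack, seen = list(graph[start]), set()
            while stack:
                node = stack.pop()
                if node in seen:
                    continue
                seen.add(node)
                stack.extend(graph.get(node, ()))
            # Skip self-comparisons, which can only arise from inconsistent input.
            augmented.update((a, start, far) for far in seen if far != start)
    return augmented
```

Calling `augment_triplets([("x", "p", "q"), ("x", "q", "r")])` returns the two input triplets together with the inferred triplet `("x", "p", "r")`.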
Explicit Reasoning over End-to-End Neural Architectures for Visual Question Answering
Many vision and language tasks require commonsense reasoning beyond
data-driven image and natural language processing. Here we adopt Visual
Question Answering (VQA) as an example task, where a system is expected to
answer a question in natural language about an image. Current state-of-the-art
systems attempted to solve the task using deep neural architectures and
achieved promising performance. However, the resulting systems are generally
opaque and they struggle in understanding questions for which extra knowledge
is required. In this paper, we present an explicit reasoning layer on top of a
set of penultimate neural network based systems. The reasoning layer enables
reasoning and answering questions where additional knowledge is required, and
at the same time provides an interpretable interface to the end users.
Specifically, the reasoning layer adopts a Probabilistic Soft Logic (PSL) based
engine to reason over a basket of inputs: visual relations, the semantic parse
of the question, and background ontological knowledge from word2vec and
ConceptNet. Experimental analysis of the answers and the key evidential
predicates generated on the VQA dataset validates our approach. Comment: 9 pages, 3 figures, AAAI 201
6MapNet: Representing soccer players from tracking data by a triplet network
Although the values of individual soccer players have become astronomical,
subjective judgments still play a big part in the player analysis. Recently,
there have been new attempts to quantitatively grasp players' styles using
video-based event stream data. However, they have some limitations in
scalability due to high annotation costs and sparsity of event stream data. In
this paper, we build a triplet network named 6MapNet that can effectively
capture the movement styles of players using in-game GPS data. Without any
annotation of soccer-specific actions, we use players' locations and velocities
to generate two types of heatmaps. Our subnetworks then map these heatmap pairs
into feature vectors whose similarity corresponds to the actual similarity of
playing styles. The experimental results show that players can be accurately
identified with only a small number of matches by our method. Comment: 12 pages, 4 figures, In 8th Workshop on Machine Learning and Data
Mining for Sports Analytics (MLSA21)
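The abstract above does not state the training objective of the triplet network. As an illustration only, here is a minimal sketch of the standard triplet margin loss, which is an assumption about the setup rather than the authors' confirmed method. The function name `triplet_margin_loss`, the Euclidean distance, and the default margin are hypothetical choices.

```python
import math


def triplet_margin_loss(anchor, positive, negative, margin=1.0):
    """Standard triplet margin loss on plain feature vectors.

    The loss is zero once the anchor is closer to the positive than to the
    negative by at least `margin`. Assumption: Euclidean distance.
    """
    d_pos = math.dist(anchor, positive)  # distance to the same-player sample
    d_neg = math.dist(anchor, negative)  # distance to the other-player sample
    return max(0.0, d_pos - d_neg + margin)
```

With anchor `[0, 0]`, positive `[0, 1]`, and negative `[3, 4]`, the distances are 1 and 5, so the loss is 0; swapping positive and negative gives 5 - 1 + 1 = 5.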
Questionnaire integration system based on question classification and short text semantic textual similarity, A
2018 Fall. Includes bibliographical references. Semantic integration from heterogeneous sources involves a series of NLP tasks. Existing research has focused mainly on measuring two paired sentences. However, when searching for possibly identical texts between two datasets, the sentences are not paired. To avoid pair-wise comparison, this thesis proposes a semantic similarity measuring system equipped with a precategorization module. It applies a hybrid question classification module, which subdivides all texts into coarse categories. The sentences are then paired within these subcategories. The core task is to detect identical texts between two sentences, which relates to the semantic textual similarity task in the NLP field. We built a short text semantic textual similarity measuring module. It combines conventional NLP techniques, including both semantic and syntactic features, with a Recurrent Convolutional Neural Network to form an ensemble model. We also conducted a set of empirical evaluations. The results show that our system possesses a degree of generalization ability, and it performs well on heterogeneous sources.
Visual knowledge representation of conceptual semantic networks
This article presents methods of using visual analysis to represent large amounts of dynamic, ambiguous data stored in a repository of learning objects. These methods are based on the semantic representation of these resources. We use a graphical model represented as a semantic graph. The formalization of the semantic graph has been built intuitively to solve a real problem: browsing and searching for lectures in a vast repository of colleges/courses located at Western Kentucky University. This study combines Formal Concept Analysis (FCA) with Semantic Factoring to decompose complex, vast concepts into their primitives in order to develop knowledge representation for the HyperManyMedia platform. We also argue that the most important factor in building the semantic representation is defining the hierarchical structure and the relationships among concepts and subconcepts. In addition, we investigate the association between concepts using Concept Analysis to generate a lattice graph. Our domain is considered as a graph, which represents the integrated ontology of the HyperManyMedia platform. This approach has been implemented and used by online students at WKU.
A Joint-Reasoning based Disease Q&A System
Medical question answer (QA) assistants respond to lay users' health-related
queries by synthesizing information from multiple sources using natural
language processing and related techniques. They can serve as vital tools to
alleviate issues of misinformation, information overload, and complexity of
medical language, thus addressing lay users' information needs while reducing
the burden on healthcare professionals. QA systems, the engines of such
assistants, have typically used either language models (LMs) or knowledge
graphs (KG), though the approaches could be complementary. LM-based QA systems
excel at understanding complex questions and providing well-formed answers, but
are prone to factual mistakes. KG-based QA systems, which represent facts well,
are mostly limited to answering short-answer questions with pre-created
templates. While a few studies have jointly used LM and KG approaches for
text-based QA, this was done to answer multiple-choice questions. Extant QA
systems also have limitations in terms of automation and performance. We
address these challenges by designing a novel, automated disease QA system
which effectively utilizes both LM and KG techniques through a joint-reasoning
approach to answer disease-related questions appropriate for lay users. Our
evaluation of the system using a range of quality metrics demonstrates its
efficacy over benchmark systems, including the popular ChatGPT. Comment: 36 pages, 6 figures, submitted to TMIS on 14 July 2023 (status: under
review)