Abstract conceptual feature ratings: the role of emotion, magnitude, and other cognitive domains in the organization of abstract conceptual knowledge.
This study harnessed control ratings of the contribution of different types of information (sensation, action, emotion, thought, social interaction, morality, time, space, quantity, and polarity) to 400 individual abstract and concrete verbal concepts. These abstract conceptual feature (ACF) ratings were used to generate a high-dimensional semantic space, from which Euclidean distance measurements between individual concepts were extracted as a metric of the semantic relatedness of those words. The validity of these distances as a marker of semantic relatedness was then tested by evaluating whether they could predict the comprehension performance of a patient with global aphasia on two verbal comprehension tasks. It was hypothesized that if the high-dimensional space generated from ACF control ratings approximates the organization of abstract conceptual space, then words separated by small distances should be more semantically related than words separated by greater distances, and should therefore be more difficult to distinguish for the comprehension-impaired patient, SKO. SKO was significantly worse at identifying targets presented within word pairs with low ACF distances. Response accuracy was not predicted by Latent Semantic Analysis (LSA) cosines, any of the individual feature ratings, or any of the background variables. It is argued that this novel rating procedure provides a window on the semantic attributes of individual abstract concepts, and that multiple cognitive systems may influence the acquisition and organization of abstract conceptual knowledge. More broadly, it is suggested that cognitive models of abstract conceptual knowledge must account for the representation not only of the relationships between abstract concepts but also of the attributes which constitute those individual concepts.
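The distance computation the study describes can be sketched as follows. The ratings below are invented for illustration only; they are not the study's actual ACF norms:

```python
from math import dist  # Euclidean distance (Python 3.8+)

# Hypothetical ratings (0-6 scale) on the ten ACF dimensions:
# sensation, action, emotion, thought, social interaction,
# morality, time, space, quantity, polarity.
acf = {
    "justice":  [1.0, 1.5, 3.0, 5.0, 4.5, 6.0, 1.0, 0.5, 1.0, 4.0],
    "fairness": [0.5, 1.0, 3.5, 5.0, 4.0, 5.5, 0.5, 0.5, 1.5, 4.5],
    "hammer":   [5.5, 6.0, 0.5, 1.0, 0.5, 0.0, 0.5, 2.0, 0.5, 1.0],
}

def acf_distance(w1, w2):
    """Euclidean distance between two concepts in ACF feature space;
    smaller distances mark semantically closer words."""
    return dist(acf[w1], acf[w2])
```

Under this setup, a related pair like justice/fairness lies much closer in the feature space than justice/hammer, which is exactly the property the comprehension experiment exploits.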
The MeSH-gram Neural Network Model: Extending Word Embedding Vectors with MeSH Concepts for UMLS Semantic Similarity and Relatedness in the Biomedical Domain
Eliciting semantic similarity between concepts in the biomedical domain
remains a challenging task. Recent approaches founded on embedding vectors have
gained in popularity as they have proven to efficiently capture semantic
relationships. The underlying idea is that two words with close meanings
appear in similar contexts. In this study, we propose a new neural network
model named MeSH-gram, which relies on a straightforward approach that extends
the skip-gram neural network model by considering MeSH (Medical Subject
Headings) descriptors instead of words. Trained on the publicly available
PubMed MEDLINE corpus,
MeSH-gram is evaluated on reference standards manually annotated for semantic
similarity. MeSH-gram is first compared to skip-gram with vectors of size 300
and with several context window sizes. A deeper comparison is performed with twenty
existing models. Spearman's rank correlations between human scores and
computed similarities show that MeSH-gram outperforms the skip-gram model and
is comparable to the best existing methods, which, however, require more
computation and external resources.
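The word-level similarity both models compute can be illustrated with cosine similarity over embedding vectors. The tiny vectors here are invented stand-ins for trained 300-dimensional skip-gram or MeSH-gram embeddings:

```python
from math import sqrt

def cosine(u, v):
    """Cosine similarity between two dense embedding vectors;
    values near 1 indicate words used in similar contexts."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

# Invented 4-dimensional toy vectors (real embeddings would be
# 300-dimensional, as in the paper's experiments).
emb = {
    "aspirin":   [0.9, 0.1, 0.4, 0.0],
    "ibuprofen": [0.8, 0.2, 0.5, 0.1],
    "galaxy":    [0.0, 0.9, 0.0, 0.8],
}
```

A well-trained model should score the related drug pair far above the unrelated pair, e.g. `cosine(emb["aspirin"], emb["ibuprofen"])` exceeds `cosine(emb["aspirin"], emb["galaxy"])`.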
Finding co-solvers on Twitter, with a little help from Linked Data
In this paper we propose a method for suggesting potential collaborators for solving innovation challenges online, based on their competence, similarity of interests and social proximity with the user. We rely on Linked Data to derive a measure of semantic relatedness that we use to enrich both user profiles and innovation problems with additional relevant topics, thereby improving the performance of co-solver recommendation. We evaluate this approach against state-of-the-art methods for query enrichment based on the distribution of topics in user profiles, and demonstrate its usefulness in recommending collaborators that are both complementary in competence and compatible with the user. Our experiments are grounded using data from the social networking service Twitter.com.
Improving approximation of domain-focused, corpus-based, lexical semantic relatedness
Semantic relatedness is a measure that quantifies the strength of a semantic link between two concepts. Often, it can be efficiently approximated with methods that operate on words, which represent these concepts. Approximating semantic relatedness between texts and concepts represented by these texts is an important part of many text and knowledge processing tasks of crucial importance in many domain-specific scenarios. The problem of most state-of-the-art methods for calculating domain-specific semantic relatedness is their dependence on highly specialized, structured knowledge resources, which makes these methods poorly adaptable for many usage scenarios. On the other hand, the domain knowledge in the fields such as Life Sciences has become more and more accessible, but mostly in its unstructured form - as texts in large document collections, which makes its use more challenging for automated processing.
In this dissertation, three new corpus-based methods for approximating domain-specific textual semantic relatedness are presented and evaluated with a set of standard benchmarks focused on the field of biomedicine. Nonetheless, the proposed measures are general enough to be adapted to other domain-focused scenarios. The evaluation involves comparisons with other relevant state-of-the-art measures for calculating semantic relatedness and the results suggest that the methods presented here perform comparably or better than other approaches.
Additionally, the dissertation presents an experiment in which one of the proposed methods is applied within an ontology matching system, DisMatch. The performance of the system was evaluated externally on the biomedically themed ‘Phenotype’ track of the Ontology Alignment Evaluation Initiative 2016 campaign. The results of the track indicate that the use of distributional semantic relatedness for ontology matching is promising, as the system presented in this thesis stood out in detecting correct mappings that were not detected by any other system participating in the track.
The work presented in the dissertation indicates an improvement over the state of the art, achieved through the domain-adapted use of the distributional principle (i.e. the presented methods are corpus-based and do not require additional resources). The ontology matching experiment showcases the practical implications of the presented theoretical body of work.
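The distributional principle the methods rely on (words occurring in similar contexts are semantically related) can be sketched with simple co-occurrence counting over a corpus; the corpus and window size below are invented for illustration:

```python
from collections import Counter
from math import sqrt

def cooccurrence_vectors(corpus, window=2):
    """For every word, count which words appear within `window`
    positions of it anywhere in the corpus."""
    vecs = {}
    for sent in corpus:
        for i, w in enumerate(sent):
            ctx = sent[max(0, i - window):i] + sent[i + 1:i + 1 + window]
            vecs.setdefault(w, Counter()).update(ctx)
    return vecs

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    num = sum(a[w] * b[w] for w in set(a) & set(b))
    den = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return num / den if den else 0.0

# Toy biomedical-flavoured corpus (invented sentences).
corpus = [
    ["the", "gene", "encodes", "a", "protein"],
    ["the", "protein", "regulates", "the", "gene"],
    ["the", "weather", "is", "sunny", "today"],
]
vecs = cooccurrence_vectors(corpus)
```

Because "gene" and "protein" share contexts while "weather" does not, their co-occurrence vectors come out far more similar; corpus-based methods like those in the dissertation refine this basic idea at scale.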
SimLex-999: Evaluating Semantic Models with (Genuine) Similarity Estimation
We present SimLex-999, a gold standard resource for evaluating distributional
semantic models that improves on existing resources in several important ways.
First, in contrast to gold standards such as WordSim-353 and MEN, it explicitly
quantifies similarity rather than association or relatedness, so that pairs of
entities that are associated but not actually similar [Freud, psychology] have
a low rating. We show that, via this focus on similarity, SimLex-999
incentivizes the development of models with a different, and arguably wider
range of applications than those which reflect conceptual association. Second,
SimLex-999 contains a range of concrete and abstract adjective, noun and verb
pairs, together with an independent rating of concreteness and (free)
association strength for each pair. This diversity enables fine-grained
analyses of the performance of models on concepts of different types, and
consequently greater insight into how architectures can be improved. Further,
unlike existing gold standard evaluations, for which automatic approaches have
reached or surpassed the inter-annotator agreement ceiling, state-of-the-art
models perform well below this ceiling on SimLex-999. There is therefore plenty
of scope for SimLex-999 to quantify future improvements to distributional
semantic models, guiding the development of the next generation of
representation-learning architectures.
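Evaluation against a gold standard like SimLex-999 is typically reported as Spearman's rank correlation between model scores and human ratings. A minimal sketch, using invented pair ratings rather than the actual SimLex-999 data:

```python
def spearman(human, model):
    """Spearman's rank correlation: compare the rank orderings of
    human similarity ratings and model scores (assumes no ties)."""
    def ranks(xs):
        order = sorted(range(len(xs)), key=lambda i: xs[i])
        r = [0] * len(xs)
        for rank, i in enumerate(order):
            r[i] = rank
        return r
    rh, rm = ranks(human), ranks(model)
    n = len(human)
    d2 = sum((a - b) ** 2 for a, b in zip(rh, rm))
    # With untied ranks: rho = 1 - 6 * sum(d^2) / (n * (n^2 - 1))
    return 1 - 6 * d2 / (n * (n ** 2 - 1))

# Invented SimLex-style ratings: genuinely similar pairs score high,
# merely associated pairs (e.g. Freud/psychology) score low.
human = [9.0, 2.0, 9.5, 1.9]      # coast/shore, Freud/psychology, car/automobile, clothes/closet
model = [0.85, 0.30, 0.80, 0.20]  # a hypothetical model's similarity scores
```

Here the model's ranking of the pairs nearly matches the human ranking, so `spearman(human, model)` is high but below 1.0 — the headroom below the inter-annotator ceiling that SimLex-999 is designed to preserve.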