
    Models to represent linguistic linked data

    As the interest of the Semantic Web and computational linguistics communities in linguistic linked data (LLD) keeps increasing and the number of contributions that dwell on LLD rapidly grows, scholars (and linguists in particular) interested in the development of LLD resources sometimes find it difficult to determine which mechanism is suitable for their needs and which challenges have already been addressed. This review seeks to present the state of the art on the models, ontologies and their extensions to represent language resources as LLD by focusing on the nature of the linguistic content they aim to encode. Four basic groups of models are distinguished in this work: models to represent the main elements of lexical resources (group 1), vocabularies developed as extensions to models in group 1 and ontologies that provide more granularity on specific levels of linguistic analysis (group 2), catalogues of linguistic data categories (group 3) and other models such as corpora models or service-oriented ones (group 4). Contributions encompassed in these four groups are described, highlighting their reuse by the community and the modelling challenges that are still to be faced.

    The Compositional Nature of Verb and Argument Representations in the Human Brain

    How does the human brain represent simple compositions of objects, actors, and actions? We had subjects view action sequence videos during neuroimaging (fMRI) sessions and identified lexical descriptions of those videos by decoding (SVM) the brain representations based only on their fMRI activation patterns. As a precursor to this result, we had demonstrated that we could reliably and with high probability decode action labels corresponding to one of six action videos (dig, walk, etc.), again while subjects viewed the action sequence during scanning (fMRI). This result was replicated at two different brain imaging sites with common protocols but different subjects, showing common brain areas, including areas known for episodic memory (PHG, MTL, high-level visual pathways, etc., i.e. the 'what' and 'where' systems, and TPJ, i.e. 'theory of mind'). Given these results, we were also able to successfully show a key aspect of language compositionality based on simultaneous decoding of object class and actor identity. Finally, combining these novel steps in 'brain reading' allowed us to accurately estimate brain representations supporting compositional decoding of a complex event composed of an actor, a verb, a direction, and an object. Comment: 11 pages, 6 figures.

    Testing the robustness of laws of polysemy and brevity versus frequency

    The pioneering research of G.K. Zipf on the relationship between word frequency and other word features led to the formulation of various linguistic laws. Here we focus on a couple of them: the meaning-frequency law, i.e. the tendency of more frequent words to be more polysemous, and the law of abbreviation, i.e. the tendency of more frequent words to be shorter. We evaluate the robustness of these laws in contexts where, to our knowledge, they have not yet been explored. The recovery of the laws in these new conditions provides support for the hypothesis that they originate from abstract mechanisms. Peer Reviewed. Postprint (author's final draft).

    Semantic Matching Using the UMLS


    A framework for automatic semantic video annotation

    The rapidly increasing quantity of publicly available videos has driven research into developing automatic tools for indexing, rating, searching and retrieval. Textual semantic representations, such as tagging, labelling and annotation, are often important factors in the process of indexing any video, because of their user-friendly way of representing the semantics appropriate for search and retrieval. Ideally, this annotation should be inspired by the human cognitive way of perceiving and of describing videos. The difference between the low-level visual contents and the corresponding human perception is referred to as the ‘semantic gap’. Tackling this gap is even harder in the case of unconstrained videos, mainly due to the lack of any previous information about the analyzed video on the one hand, and the huge amount of generic knowledge required on the other. This paper introduces a framework for the Automatic Semantic Annotation of unconstrained videos. The proposed framework utilizes two non-domain-specific layers: low-level visual similarity matching, and an annotation analysis that employs commonsense knowledge bases. A commonsense ontology is created by incorporating multiple-structured semantic relationships. Experiments and black-box tests are carried out on standard video databases for action recognition and video information retrieval. White-box tests examine the performance of the individual intermediate layers of the framework, and the evaluation of the results and the statistical analysis show that integrating visual similarity matching with commonsense semantic relationships provides an effective approach to automated video annotation.

    Exploring the measurement of markedness and its relationship with other linguistic variables

    Antonym pair members can be differentiated by each word's markedness, that distinction attributable to the presence or absence of features at morphological or semantic levels. Morphologically marked words incorporate their unmarked counterpart with additional morphs (e.g., "unlucky" vs. "lucky"); properties used to determine semantically marked words (e.g., "short" vs. "long") are less clearly defined. Despite extensive theoretical scrutiny, the lexical properties of markedness have received scant empirical study. The current paper employs an antonym sequencing approach to measure markedness: establishing markedness probabilities for individual words and evaluating their relationship with other lexical properties (e.g., length, frequency, valence). Regression analyses reveal that markedness probability is, as predicted, related to affixation and also strongly related to valence. Our results support the suggestion that antonym sequence is reflected in discourse, and further analysis demonstrates that markedness probabilities, derived from the antonym sequencing task, reflect the ordering of antonyms within natural language. In line with the Pollyanna Hypothesis, we argue that markedness is closely related to valence; language users demonstrate a tendency to present words evaluated positively ahead of those evaluated negatively if given the choice. Future research should consider the relationship of markedness and valence, and the influence of contextual information in determining which member of an antonym pair is marked or unmarked within discourse.

    Detecting deceptive reviews using argumentation

    The unstoppable rise of social networks and the web is facing a serious challenge: identifying the truthfulness of online opinions and reviews. In this paper we use Argumentation Frameworks (AFs) extracted from reviews and explore whether the use of these AFs can improve the performance of machine learning techniques in detecting deceptive behaviour, resulting from users lying in order to mislead readers. The AFs represent how arguments from reviews relate to arguments from other reviews as well as to arguments about the goodness of the items being reviewed.

    Linking geographic vocabularies through WordNet

    The linked open data (LOD) paradigm has emerged as a promising approach to structuring and sharing geospatial information. One of the major obstacles to this vision lies in the difficulties found in the automatic integration between heterogeneous vocabularies and ontologies that provide the semantic backbone of the growing constellation of open geo-knowledge bases. In this article, we show how to utilize WordNet as a semantic hub to increase the integration of LOD. With this purpose in mind, we devise Voc2WordNet, an unsupervised mapping technique between a given vocabulary and WordNet, combining intensional and extensional aspects of the geographic terms. Voc2WordNet is evaluated against a sample of human-generated alignments with the OpenStreetMap (OSM) Semantic Network, a crowdsourced geospatial resource, and the GeoNames ontology, the vocabulary of a large digital gazetteer. These empirical results indicate that the approach can obtain high precision and recall.