424 research outputs found

    Extracting Terms with EXTra

    The identification and extraction of terms play an important role in many areas of knowledge-based applications, such as automatic indexing, knowledge discovery and management, as well as in computational approaches to terminology and lexicography. In this paper, we present EXTra, a tool designed to extract and calculate the degree of termhood of multiword expressions as a function of the statistical distribution of their parts and of the presence of other sub-terms. This work describes EXTra's algorithm, and provides the results of its evaluation on a task of term extraction from an Italian corpus of documents belonging to the domain of Public Administration.

    Less is MORE: a MultimOdal system for tag REfinement

    With the proliferation of image-based social media, an extremely large amount of multimodal data is being produced. Very often image contents are published together with a set of user-defined metadata such as tags and textual descriptions. Despite being very useful to enhance traditional image retrieval, user-defined tags on social media have proven ineffective for indexing images because they are influenced by the personal experiences of the owners as well as their desire to promote the published contents. To be analyzed and indexed, multimodal data require algorithms able to jointly deal with textual and visual data. This research presents a multimodal approach to the problem of tag refinement, which consists of separating the relevant descriptors (tags) of images from noisy ones. The proposed method exploits both Natural Language Processing (NLP) and Computer Vision (CV) techniques based on deep learning to find a match between the textual information and visual content of social media posts. Textual semantic features are represented with (multilingual) word embeddings, while visual ones are obtained with image classification. The proposed system is evaluated on a manually annotated Italian dataset extracted from Instagram, achieving a weighted F1-score of 68%.

    ItEM: A Vector Space Model to Bootstrap an Italian Emotive Lexicon

    In recent years, computational linguistics has seen a rising interest in subjectivity, opinions, feelings and emotions. Even though great attention has been given to polarity recognition, research in emotion detection has had to rely on small emotion resources. In this paper, we present a methodology to build emotive lexicons by jointly exploiting vector space models and human annotation, and we provide the first results of the evaluation with a crowdsourcing experiment.

    CheckIT!: A Corpus of Expert Fact-checked Claims for Italian

    This paper introduces CheckIT!, a resource of expert fact-checked claims, filling a gap for the development of fact-checking pipelines in Italian. We further investigate the use of three state-of-the-art generative text models to create variations of claims in zero-shot settings as a data-augmentation strategy for the identification of previously fact-checked claims. Our results indicate that models struggle in varying the surface forms of the claims.

    Preface to the Sixth Workshop on Natural Language for Artificial Intelligence (NL4AI)

    Natural Language Processing (NLP) is an important research topic in Artificial Intelligence (AI), as it is the target of different scientific and industrial interests. Natural Language is at the crossroads of Learning, Knowledge Representation, and Cognitive Modeling. Several recent AI achievements have repeatedly shown their beneficial impact on complex inference tasks, with huge application perspectives in linguistic modeling, processing, and inferences. However, Natural Language Understanding is still a rich research topic, whose cross-fertilization spans a number of independent areas such as Cognitive Computing, Robotics, and Human-Computer Interaction. For AI, Natural Languages are the research focus of paradigms and applications but, at the same time, they act as cornerstones of automation, autonomy, and learnability for most intelligent phenomena ranging from Vision to Planning and Social Behaviors. A reflection about such diverse and promising interactions is an important target for current AI studies, fully in the core mission of AI*IA. This workshop, supported by the Special Interest Group on NLP of AI*IA and by the Italian Association of Computational Linguistics (AILC), aims at providing a broad overview of recent activities in the field of Human Language Technologies (HLT) in Italy. In this context, the organization of NL4AI 2021 [1] provided researchers with the opportunity to share experiences and insights about AI applications focused on NLP in several domains. The 2022 edition of NL4AI is co-located with the 21st International Conference of the Italian Association for Artificial Intelligence (AIxIA 2022), taking place on November 30th in Udine, Italy. The program of the meeting is available on the official workshop website. We received 17 submissions, 14 of which were accepted after peer review.

    Text Frame Detector: Slot Filling Based On Domain Knowledge Bases

    In this paper we present a system called Text Frame Detector (TFD) which aims at populating a frame-based ontology in a graph-based structure. Our system organizes textual information into frames, according to a predefined set of semantically informed patterns linking pre-coded information such as named entities, simple and complex terms. Given the semi-automatic expansion of such information with word embeddings, the system can be easily adapted to new domains.