4,698 research outputs found

Peircean semiotics and text linguistic models

Author: van den Hoven P.J.
Publication venue: Nanjing Normal University Press
Publication date: 01/01/2010

Experiments to investigate the utility of nearest neighbour metrics based on linguistically informed features for detecting textual plagiarism

Author: Almquist Per
Karlgren Jussi
Publication date: 01/01/2011

Plagiarism detection is a challenge for linguistic models: most currently implemented models use simple occurrence statistics for linguistic items. In this paper we report two experiments related to plagiarism detection in which we use models of distributional semantics and of sentence stylistics to compare, sentence by sentence, the likelihood of a text being partly plagiarised. The results of the comparison are displayed for visual inspection by a plagiarism assessor.

RISE – Research Institutes of Sweden

Digitala Vetenskapliga Arkivet - Academic Archive On-line

Swedish Institute of Computer Science Publications Database

DSpace at Tartu University Library

On Explaining Multimodal Hateful Meme Detection Models

Author: Chong Wen-Haw
Hee Ming Shan
Lee Roy Ka-Wei
Publication date: 01/04/2022

Hateful meme detection is a new multimodal task that has gained significant traction in academic and industry research communities. Recently, researchers have applied pre-trained visual-linguistic models to perform the multimodal classification task, and some of these solutions have yielded promising results. However, what these visual-linguistic models learn for the hateful meme classification task remains unclear. For instance, it is unclear whether these models are able to capture derogatory or slur references across the modalities (i.e., image and text) of hateful memes. To fill this research gap, this paper proposes three research questions to improve our understanding of visual-linguistic models performing the hateful meme classification task. We found that the image modality contributes more to the hateful meme classification task, and that the visual-linguistic models are able to perform visual-text slur grounding to a certain extent. Our error analysis also shows that the visual-linguistic models have acquired biases, which result in false-positive predictions.

arXiv.org e-Print Archive

Institutional Knowledge at Singapore Management University

Linguistically Grounded Models of Language Change

Author: Poibeau Thierry
Publication date: 01/01/2006

Questions related to the evolution of language have recently attracted a marked increase in interest (Briscoe, 2002). This short paper questions the scientific status of these models and their relation to attested data. We show that one cannot directly model non-linguistic (exogenous) factors, even though they play a crucial role in language evolution. We then examine the relation between linguistic models and attested language data, as well as their contribution to cognitive linguistics.

arXiv.org e-Print Archive

CiteSeerX

eScholarship - University of California

HAL-Paris 13

Automatic extraction of linguistic models for image description

Author: Baturone Castillo María Iluminada
Gersnoviez A.
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/01/2010

This paper describes a methodology for extracting fuzzy models that linguistically describe the low-level features of an image (such as color and texture). The methodology combines grid-based algorithms with clustering and tabular simplification methods to compress image information into a small number of fuzzy rules with high linguistic meaning. All steps of the methodology are carried out with the tools of the Xfuzzy 3 environment, so the fuzzy models can be defined, simplified, tuned and verified automatically. Several examples are included to illustrate the advantages of the methodology.

Repositorio Institucional de la Universidad de Córdoba

Crossref

idUS. Depósito de Investigación Universidad de Sevilla

Large Linguistic Models: Analyzing theoretical linguistic abilities of LLMs

Author: Beguš Gašper
Dąbkowski Maksymilian
Rhodes Ryan
Publication date: 21/08/2023

The performance of large language models (LLMs) has recently improved to the point where the models can perform well on many language tasks. We show here, for the first time, that the models can also generate coherent and valid formal analyses of linguistic data, and we illustrate the vast potential of large language models for analyses of their metalinguistic abilities. LLMs are primarily trained on language data in the form of text; analyzing and evaluating their metalinguistic abilities improves our understanding of their general capabilities and sheds new light on theoretical models in linguistics. In this paper, we probe GPT-4's metalinguistic capabilities by focusing on three subfields of formal linguistics: syntax, phonology, and semantics. We outline a research program for metalinguistic analyses of large language models, propose experimental designs, provide general guidelines, discuss limitations, and offer future directions for this line of research. This line of inquiry also exemplifies behavioral interpretability of deep learning, where models' representations are accessed by explicit prompting rather than through internal representations.

arXiv.org e-Print Archive

Digital Stylometry: Linking Profiles Across Social Networks

Author: Roy Deb
Vosoughi Soroush
Zhou Helen
Publication date: 01/01/2015

There is an ever-growing number of users with accounts on multiple social media and networking sites. Consequently, there is increasing interest in matching user accounts and profiles across different social networks in order to create aggregate profiles of users. In this paper, we present models for Digital Stylometry, a method for matching users through stylometry-inspired techniques. We experimented with linguistic, temporal, and combined temporal-linguistic models for matching user accounts, using standard and novel techniques. Using publicly available data, our best model, a combined temporal-linguistic one, was able to correctly match the accounts of 31% of 5,612 distinct users across Twitter and Facebook.
Comment: SocInfo'15, Beijing, China. In Proceedings of the 7th International Conference on Social Informatics (SocInfo 2015), Beijing, China.

arXiv.org e-Print Archive

CiteSeerX

DSpace@MIT

Crossref

Analysis criteria of logic and linguistic models of natural language sentences

Author: Вавіленкова Анастасія Ігорівна
Publication venue: National Technical University Kharkiv Polytechnic Institute
Publication date: 01/01/2017

The article describes the main text models used today as tools for processing the content of electronic text documents. To analyze content, the author proposes using formal logic and linguistic models, which are based on functional relationships between the principal and subordinate parts of natural language sentences. The aim of the article is to describe criteria for analyzing formal models that can reflect the content of natural language sentences and that are formed using the mathematical apparatus of predicate logic. For this purpose, the study examines the principles of constructing logic and linguistic models of natural language sentences and formulates four criteria of analysis. The first criterion analyzes the number of simple predicates in a logic and linguistic model, which helps to identify the type and composition of a natural language sentence. The second criterion analyzes the cardinality of the set of predicate variables and constants of the model, which affects the number of simple predicates and identifies the type of individual forms of the model. The third criterion focuses on the logical operations used in the model, which makes it possible to analyze the sequence of reasoning expressed in the natural language sentence. The fourth criterion examines the presence of identical components in logic and linguistic models of natural language sentences built from different sets of predicate variables and constants. The described analysis criteria are required to build formal models of electronic text documents using the mathematical apparatus of predicate logic.

Electronic National Technical University "Kharkiv Polytechnic Institute" Institutional Repository (eNTUKhPIIR)

Intelligent fuzzy controller for event-driven real time systems

Author: Grantner Janos
Patyra Marek
Stachowicz Marian S.

Most known linguistic models are essentially static; that is, time is not a parameter in describing the behavior of the object's model. In this paper we present a model for synchronous finite state machines based on fuzzy logic. Such finite state machines can be used to build both event-driven, time-varying, rule-based systems and the control unit section of a fuzzy logic computer. The architecture of a pipelined intelligent fuzzy controller is presented, and the linguistic model is represented by an overall fuzzy relation stored in a single rule memory. A VLSI integrated circuit implementation of the fuzzy controller is suggested. At a clock rate of 30 MHz, the controller can perform 3 MFLIPS on multi-dimensional fuzzy data.

NASA Technical Reports Server

Recognizing People by Body Shape Using Deep Networks of Images and Words

Author: Castillo Carlos D.
Gandi Veda Nandan
Hill Matthew Q.
Jaggernauth Lucas
Metz Thomas M.
Myers Blake A.
O'Toole Alice J.
Publication date: 30/05/2023

Common and important applications of person identification occur at distances and viewpoints in which the face is not visible or is not sufficiently resolved to be useful. We examine body shape as a biometric across distance and viewpoint variation. We propose an approach that combines standard object classification networks with representations based on linguistic (word-based) descriptions of bodies. Algorithms with and without linguistic training were compared on their ability to identify people from body shape in images captured across a large range of distances and views (close-range, 100m, 200m, 270m, 300m, 370m, 400m, 490m, 500m, 600m, and at elevated pitch in images taken by an unmanned aerial vehicle [UAV]). Accuracy, as measured by identity-match ranking and false accept errors in an open-set test, was surprisingly good. For identity ranking, linguistic models were more accurate for close-range images, whereas non-linguistic models fared better at intermediate distances. Fusion of the linguistic and non-linguistic embeddings improved performance at all but the farthest distance. Although the non-linguistic model yielded fewer false accepts at all distances, fusion of the linguistic and non-linguistic models decreased false accepts for all but the UAV images. We conclude that linguistic and non-linguistic representations of body shape can offer complementary identity information that can improve identification in applications of interest.
Comment: 9 pages, 5 figures, 4 tables

arXiv.org e-Print Archive