Training an adaptive dialogue policy for interactive learning of visually grounded word meanings
We present a multi-modal dialogue system for interactive learning of
perceptually grounded word meanings from a human tutor. The system integrates
an incremental, semantic parsing/generation framework - Dynamic Syntax and Type
Theory with Records (DS-TTR) - with a set of visual classifiers that are
learned throughout the interaction and which ground the meaning representations
that it produces. We use this system in interaction with a simulated human
tutor to study the effects of different dialogue policies and capabilities on
the accuracy of learned meanings, learning rates, and efforts/costs to the
tutor. We show that the overall performance of the learning agent is affected
by (1) who takes initiative in the dialogues; (2) the ability to express/use
their confidence level about visual attributes; and (3) the ability to process
elliptical and incrementally constructed dialogue turns. Ultimately, we train
an adaptive dialogue policy which optimises the trade-off between classifier
accuracy and tutoring costs.
Comment: 11 pages, SIGDIAL 2016 Conference
Machine-assisted mixed methods: augmenting humanities and social sciences with artificial intelligence
The increasing capacities of large language models (LLMs) present an
unprecedented opportunity to scale up data analytics in the humanities and
social sciences, augmenting and automating qualitative analytic tasks
previously typically allocated to human labor. This contribution proposes a
systematic mixed methods framework to harness qualitative analytic expertise,
machine scalability, and rigorous quantification, with attention to
transparency and replicability. Sixteen machine-assisted case studies are showcased
as proof of concept. Tasks include linguistic and discourse analysis, lexical
semantic change detection, interview analysis, historical event cause inference
and text mining, detection of political stance, text and idea reuse, genre
composition in literature and film, social network inference, automated
lexicography, missing metadata augmentation, and multimodal visual cultural
analytics. In contrast to the focus on English in the emerging LLM
applicability literature, many examples here deal with scenarios involving
smaller languages and historical texts prone to digitization distortions. In
all but the most difficult tasks requiring expert knowledge, generative LLMs
can demonstrably serve as viable research instruments. LLM (and human)
annotations may contain errors and variation, but the agreement rate can and
should be accounted for in subsequent statistical modeling; a bootstrapping
approach is discussed. The replications among the case studies illustrate how
tasks previously requiring potentially months of team effort and complex
computational pipelines can now be accomplished by an LLM-assisted scholar in
a fraction of the time. Importantly, this approach is not intended to replace,
but to augment researcher knowledge and skills. With these opportunities in
sight, qualitative expertise and the ability to pose insightful questions have
arguably never been more critical.
VendorLink: An NLP approach for Identifying & Linking Vendor Migrants & Potential Aliases on Darknet Markets
The anonymity on the Darknet allows vendors to stay undetected by using
multiple vendor aliases or frequently migrating between markets. Consequently,
illegal markets and their connections are challenging to uncover on the
Darknet. To identify relationships between illegal markets and their vendors,
we propose VendorLink, an NLP-based approach that examines writing patterns to
verify, identify, and link unique vendor accounts across text advertisements
(ads) on seven public Darknet markets. In contrast to existing literature,
VendorLink utilizes the strength of supervised pre-training to perform
closed-set vendor verification, open-set vendor identification, and
low-resource market adaption tasks. Through VendorLink, we uncover (i) 15
migrants and 71 potential aliases in the Alphabay-Dreams-Silk dataset, (ii) 17
migrants and 3 potential aliases in the Valhalla-Berlusconi dataset, and (iii)
75 migrants and 10 potential aliases in the Traderoute-Agora dataset.
Altogether, our approach can help Law Enforcement Agencies (LEA) make more
informed decisions by verifying and identifying migrating vendors and their
potential aliases on existing and Low-Resource (LR) emerging Darknet markets.
- …