LODE: Linking Digital Humanities Content to the Web of Data
Numerous digital humanities projects maintain their data collections in the
form of text, images, and metadata. While data may be stored in many formats,
from plain text to XML to relational databases, the use of the resource
description framework (RDF) as a standardized representation has gained
considerable traction during the last five years. Almost every digital
humanities meeting has at least one session concerned with the topic of digital
humanities, RDF, and linked data. While most existing work in linked data has
focused on improving algorithms for entity matching, the aim of the
LinkedHumanities project is to build digital humanities tools that work "out of
the box," enabling their use by humanities scholars, computer scientists,
librarians, and information scientists alike. With this paper, we report on the
Linked Open Data Enhancer (LODE) framework developed as part of the
LinkedHumanities project. LODE helps non-technical users enrich a
local RDF repository with high-quality data from the Linked Open Data cloud.
LODE links and enhances the local RDF repository without compromising the
quality of the data. In particular, LODE supports the user in the enhancement
and linking process by providing intuitive user-interfaces and by suggesting
high-quality linking candidates using tailored matching algorithms. We hope
that the LODE framework will be useful to digital humanities scholars,
complementing other digital humanities tools.
Deep Multimodal Image-Repurposing Detection
Nefarious actors on social media and other platforms often spread rumors and
falsehoods through images whose metadata (e.g., captions) have been modified to
provide visual substantiation of the rumor/falsehood. This type of modification
is referred to as image repurposing, in which often an unmanipulated image is
published along with incorrect or manipulated metadata to serve the actor's
ulterior motives. We present the Multimodal Entity Image Repurposing (MEIR)
dataset, a substantially more challenging dataset than those previously
available to support research into image repurposing detection. The
new dataset includes location, person, and organization manipulations on
real-world data sourced from Flickr. We also present a novel, end-to-end, deep
multimodal learning model for assessing the integrity of an image by combining
information extracted from the image with related information from a knowledge
base. The proposed method is compared against state-of-the-art techniques on
existing datasets as well as MEIR, where it outperforms existing methods across
the board, with AUC improvement up to 0.23. Comment: To be published at ACM Multimedia 2018 (orals)
Automated census record linking: a machine learning approach
Thanks to the availability of new historical census sources and advances in record linking technology, economic historians are becoming big data genealogists. Linking individuals over time and between databases has opened up new avenues for research into intergenerational mobility, assimilation, discrimination, and the returns to education. To take advantage of these new research opportunities, scholars need to be able to accurately and efficiently match historical records and produce an unbiased dataset of links for downstream analysis. I detail a standard and transparent census matching technique for constructing linked samples that can be replicated across a variety of cases. The procedure applies insights from machine learning classification and text comparison to the well-known problem of record linkage, with a focus on the costs and benefits specific to working with historical data. I begin by extracting a subset of possible matches for each record, and then use training data to tune a matching algorithm that attempts to minimize both false positives and false negatives, taking into account the inherent noise in historical records. To make the procedure precise, I trace its application to an example from my own work, linking children from the 1915 Iowa State Census to their adult selves in the 1940 Federal Census. In addition, I provide guidance on a number of practical questions, including how large the training data needs to be relative to the sample.
This research has been supported by the NSF-IGERT Multidisciplinary Program in Inequality & Social Policy at Harvard University (Grant No. 0333403).
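The two-step procedure in this abstract (first narrow each record to a subset of possible matches, then score those candidates) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and not the author's implementation: the record fields (`id`, `first`, `last`, `birth_year`), the blocking rule (birth year within two years, same surname initial), and the fixed `threshold`, which stands in for the classifier that the abstract says is tuned on training data.

```python
from difflib import SequenceMatcher


def name_similarity(a: str, b: str) -> float:
    """Case-insensitive string similarity in [0, 1] (stdlib difflib ratio)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def candidate_pairs(children, adults, year_window=2):
    """Blocking step: keep only pairs with a close birth year and the same
    surname initial. This rule is a hypothetical choice, not the paper's."""
    for child in children:
        for adult in adults:
            close_year = abs(child["birth_year"] - adult["birth_year"]) <= year_window
            same_initial = child["last"][0].lower() == adult["last"][0].lower()
            if close_year and same_initial:
                yield child, adult


def link_records(children, adults, threshold=0.85):
    """Scoring step: keep the best-scoring candidate per child if it clears
    the threshold. The fixed threshold is a stand-in for a trained model."""
    links = {}
    for child, adult in candidate_pairs(children, adults):
        score = (name_similarity(child["first"], adult["first"])
                 + name_similarity(child["last"], adult["last"])) / 2
        best_so_far = links.get(child["id"], (None, 0.0))[1]
        if score >= threshold and score > best_so_far:
            links[child["id"]] = (adult["id"], score)
    return links


# Made-up example records, not data from either census.
children = [
    {"id": "c1", "first": "John", "last": "Smith", "birth_year": 1905},
    {"id": "c2", "first": "Mary", "last": "Olsen", "birth_year": 1910},
]
adults = [
    {"id": "a1", "first": "Jon", "last": "Smith", "birth_year": 1906},
    {"id": "a2", "first": "Marie", "last": "Olson", "birth_year": 1910},
    {"id": "a3", "first": "Peter", "last": "Smith", "birth_year": 1905},
]
links = link_records(children, adults)
```

With these made-up records, `c1` is linked to `a1`, while `c2` is left unlinked because its best candidate falls below the threshold. Raising the threshold reduces false positives and increases false negatives; lowering it does the opposite.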
A Frame Tracking Model for Memory-Enhanced Dialogue Systems
Recently, resources and tasks were proposed to go beyond state tracking in
dialogue systems. An example is the frame tracking task, which requires
recording multiple frames, one for each user goal set during the dialogue. This
allows a user, for instance, to compare items corresponding to different goals.
This paper proposes a model which takes as input the list of frames created so
far during the dialogue, the current user utterance as well as the dialogue
acts, slot types, and slot values associated with this utterance. The model
then outputs the frame being referenced by each triple of dialogue act, slot
type, and slot value. We show that on the recently published Frames dataset,
this model significantly outperforms a previously proposed rule-based baseline.
In addition, we propose an extensive analysis of the frame tracking task by
dividing it into sub-tasks and assessing their difficulty with respect to our
model.
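The input and output of the frame tracking task described in this abstract can be sketched as follows. The sketch only illustrates the shape of the problem: the `Frame` class, the `reference_frame` function, and the slot names are hypothetical, and the heuristic used here (pick the most recent frame that already holds the slot value, otherwise the active frame) is a toy stand-in, neither the paper's learned model nor its rule-based baseline.

```python
from dataclasses import dataclass, field


@dataclass
class Frame:
    """One user goal: a frame id plus the slot values collected for it."""
    frame_id: int
    slots: dict = field(default_factory=dict)  # slot type -> slot value


def reference_frame(frames, triple, active_id):
    """Return the id of the frame that a (dialogue act, slot type, slot value)
    triple refers to, using a toy heuristic (assumption, not the paper's)."""
    _act, slot_type, slot_value = triple
    # Search from the most recently created frame backwards.
    for frame in reversed(frames):
        if frame.slots.get(slot_type) == slot_value:
            return frame.frame_id
    # No frame mentions this value yet: fall back to the active frame.
    return active_id


# Hypothetical dialogue state: two frames created so far, frame 2 is active.
frames = [
    Frame(1, {"dst_city": "Paris"}),
    Frame(2, {"dst_city": "Rome"}),
]
triples = [
    ("inform", "dst_city", "Paris"),  # refers back to the first goal
    ("inform", "budget", "1000"),     # new information for the active goal
]
predictions = [reference_frame(frames, t, active_id=2) for t in triples]
```

Here `predictions` holds one frame id per triple, which is the output format the abstract describes.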
An Integrated Framework for Sensing Radio Frequency Spectrum Attacks on Medical Delivery Drones
Drone susceptibility to jamming or spoofing attacks of GPS, RF, Wi-Fi, and
operator signals presents a danger to future medical delivery systems. A
detection framework capable of sensing attacks on drones could provide the
capability for active responses. The identification of interference attacks has
applicability in medical delivery, disaster zone relief, and FAA enforcement
against illegal jamming activities. The literature lacks methods that let solo or
swarm-based drones identify radio frequency spectrum attacks. Any
non-delivery specific function, such as attack sensing, added to a drone
involves a weight increase and additional complexity; therefore, the value must
exceed the disadvantages. Medical delivery, high-value cargo, and disaster zone
applications could present a value proposition which overcomes the additional
costs. The paper examines types of attacks against drones and describes a
framework for designing an attack detection system with active response
capabilities for improving the reliability of delivery and other medical
applications. Comment: 7 pages, 1 figure, 5 tables