7,518 research outputs found
Finding Person Relations in Image Data of the Internet Archive
The multimedia content in the World Wide Web is rapidly growing and contains
valuable information for many applications in different domains. For this
reason, the Internet Archive initiative has been gathering billions of
time-versioned web pages since the mid-nineties. However, the huge amount of
data is rarely labeled with appropriate metadata and automatic approaches are
required to enable semantic search. Normally, the textual content of the
Internet Archive is used to extract entities and their possible relations
across domains such as politics and entertainment, whereas image and video
content is usually neglected. In this paper, we introduce a system for person
recognition in image content of web news stored in the Internet Archive. Thus,
the system complements entity recognition in text and allows researchers and
analysts to track media coverage and relations of persons more precisely. Based
on a deep learning face recognition approach, we suggest a system that
automatically detects persons of interest and gathers sample material, which is
subsequently used to identify them in the image data of the Internet Archive.
We evaluate the performance of the face recognition system on an appropriate
standard benchmark dataset and demonstrate the feasibility of the approach with
two use cases
Recommended from our members
Finding the traces of behavioral and cognitive processes in big data and naturally occurring datasets.
Today, people generate and store more data than ever before as they interact with both real and virtual environments. These digital traces of behavior and cognition offer cognitive scientists and psychologists an unprecedented opportunity to test theories outside the laboratory. Despite general excitement about big data and naturally occurring datasets among researchers, three gaps stand in the way of their wider adoption in theory-driven research: the imagination gap, the skills gap, and the culture gap. We outline an approach to bridging these three gaps while respecting our responsibilities to the public as participants in and consumers of the resulting research. To that end, we introduce Data on the Mind ( http://www.dataonthemind.org ), a community-focused initiative aimed at meeting the unprecedented challenges and opportunities of theory-driven research with big data and naturally occurring datasets. We argue that big data and naturally occurring datasets are most powerfully used to supplement-not supplant-traditional experimental paradigms in order to understand human behavior and cognition, and we highlight emerging ethical issues related to the collection, sharing, and use of these powerful datasets
ADVANCES IN KNOWLEDGE DISCOVERY IN DATABASES
The Knowledge Discovery in Databases and Data Mining field proposes the development of methods and techniques for assigning useful meanings for data stored in databases. It gathers researches from many study fields like machine learning, pattern recognition, databases, statistics, artificial intelligence, knowledge acquisition for expert systems, data visualization and grids. While Data Mining represents a set of specific algorithms of finding useful meanings in stored data, Knowledge Discovery in Databases represents the overall process of finding knowledge and includes the Data Mining as one step among others such as selection, pre�processing, transformation and interpretation of mined data. This paper aims to point the most important steps that were made in the Knowledge Discovery in Databases field of study and to show how the overall process of discovering can be improved in the future.
- …