57,021 research outputs found
Finding Person Relations in Image Data of the Internet Archive
The multimedia content in the World Wide Web is rapidly growing and contains
valuable information for many applications in different domains. For this
reason, the Internet Archive initiative has been gathering billions of
time-versioned web pages since the mid-nineties. However, the huge amount of
data is rarely labeled with appropriate metadata and automatic approaches are
required to enable semantic search. Normally, the textual content of the
Internet Archive is used to extract entities and their possible relations
across domains such as politics and entertainment, whereas image and video
content is usually neglected. In this paper, we introduce a system for person
recognition in image content of web news stored in the Internet Archive. Thus,
the system complements entity recognition in text and allows researchers and
analysts to track media coverage and relations of persons more precisely. Based
on a deep learning face recognition approach, we suggest a system that
automatically detects persons of interest and gathers sample material, which is
subsequently used to identify them in the image data of the Internet Archive.
We evaluate the performance of the face recognition system on an appropriate
standard benchmark dataset and demonstrate the feasibility of the approach with
two use cases
- …