Multimodal person recognition for human-vehicle interaction
Next-generation vehicles will undoubtedly feature biometric person recognition as part of an effort to improve the driving experience. However, the limitations of today's technology prevent such systems from operating satisfactorily under adverse conditions. This article proposes a framework for person recognition that successfully combines different biometric modalities, borne out in two case studies.
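As an illustration of the fusion idea, here is a minimal Python sketch of score-level fusion for two modalities (say, face and voice). The weights, threshold, and function names are illustrative assumptions, not the framework from the article.

```python
# Minimal sketch of score-level fusion for two biometric modalities
# (e.g., face and voice). Weights and threshold are illustrative;
# the article's actual fusion rule may differ.

def fuse_scores(face_score: float, voice_score: float,
                w_face: float = 0.6, w_voice: float = 0.4) -> float:
    """Weighted-sum fusion of per-modality match scores in [0, 1]."""
    return w_face * face_score + w_voice * voice_score

def identify(candidates: dict[str, tuple[float, float]],
             threshold: float = 0.5) -> str | None:
    """Return the best-matching identity, or None if no fused score clears the threshold."""
    best_id, best_score = None, threshold
    for person, (face, voice) in candidates.items():
        score = fuse_scores(face, voice)
        if score > best_score:
            best_id, best_score = person, score
    return best_id

# Example: a noisy voice score (adverse cabin conditions) is offset by a strong face score.
print(identify({"alice": (0.9, 0.4), "bob": (0.3, 0.8)}))  # -> "alice"
```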
High-level feature detection from video in TRECVid: a 5-year retrospective of achievements
Successful and effective content-based access to digital video requires fast, accurate and scalable methods to determine the video content automatically. A variety of contemporary approaches to this rely on text taken from speech within the video, or on matching one video frame against others using low-level characteristics like colour, texture, or shapes, or on determining and matching objects appearing within the video. Possibly the most important technique, however, is one which determines the presence or absence of a high-level or semantic feature within a video clip or shot. By utilizing dozens, hundreds or even thousands of such semantic features we can support many kinds of content-based video navigation. Critically however, this depends on being able to determine whether each feature is or is not present in a video clip.
The last 5 years have seen much progress in the development of techniques to determine the presence of semantic features within video. This progress can be tracked in the annual TRECVid benchmarking activity, where dozens of research groups measure the effectiveness of their techniques on common data using an open, metrics-based approach. In this chapter we summarise the work done on the TRECVid high-level feature task, showing the progress made year-on-year. This provides a fairly comprehensive statement of where the state of the art stands on this important task, not just for one research group or one approach, but across the spectrum. We then use this past and ongoing work as a basis for highlighting the trends that are emerging in this area, and the questions which remain to be addressed before we can achieve large-scale, fast and reliable high-level feature detection on video.
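To make the task concrete, here is a hedged sketch of one common approach to per-concept detection: training one binary classifier per semantic feature over precomputed low-level shot descriptors. The synthetic data and SVM choice are assumptions for illustration, not any particular group's TRECVid system.

```python
# Sketch of per-concept ("high-level feature") detection, assuming each
# video shot is already represented by a low-level feature vector
# (e.g., colour/texture histograms). One binary SVM per semantic concept.
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X_train = rng.random((200, 64))      # 200 shots, 64-dim features (synthetic)
y_train = rng.integers(0, 2, 200)    # 1 = concept present in the shot

detector = SVC(kernel="rbf", probability=True)  # one detector per concept
detector.fit(X_train, y_train)

X_test = rng.random((5, 64))
# Rank test shots by confidence that the concept is present.
scores = detector.predict_proba(X_test)[:, 1]
ranking = np.argsort(scores)[::-1]
print(ranking, scores[ranking])
```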
AudioPairBank: Towards A Large-Scale Tag-Pair-Based Audio Content Analysis
Recently, sound recognition has been used to identify sounds such as "car" and "river". However, sounds have nuances that may be better described by adjective-noun pairs such as "slow car", and verb-noun pairs such as "flying insects", which remain underexplored. Therefore, in this work we investigate the relation between audio content and both adjective-noun pairs and verb-noun pairs. Due to the lack of datasets with these kinds of annotations, we collected and processed the AudioPairBank corpus, consisting of a combined total of 1,123 pairs and over 33,000 audio files. One contribution is the previously unavailable documentation of the challenges and implications of collecting audio recordings with this type of label. A second contribution is to show the degree of correlation between the audio content and the labels through sound recognition experiments, which yielded 70% accuracy, thereby also providing a performance benchmark. The results and study in this paper encourage further exploration of the nuances in audio and are meant to complement similar research performed on images and text in multimedia analysis.
Comment: This paper is a revised version of "AudioSentibank: Large-scale Semantic Ontology of Acoustic Concepts for Audio Content Analysis".
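For a concrete starting point, below is a hedged sketch of a tag-pair sound-recognition experiment in this spirit: clips represented by pooled MFCCs and a linear classifier over pair labels. The librosa pipeline, synthetic clips, and labels are assumptions, not the AudioPairBank setup.

```python
# Sketch of tag-pair audio classification: fixed-length MFCC statistics
# per clip, linear classifier over adjective-noun pair labels.
import numpy as np
import librosa
from sklearn.linear_model import LogisticRegression

SR = 22050  # sample rate (Hz)

def clip_features(y: np.ndarray) -> np.ndarray:
    """Mean and std of MFCCs over the clip -> fixed-length vector."""
    mfcc = librosa.feature.mfcc(y=y, sr=SR, n_mfcc=20)
    return np.concatenate([mfcc.mean(axis=1), mfcc.std(axis=1)])

# Synthetic stand-ins for labelled clips; in practice these would be
# loaded from disk, e.g. y, _ = librosa.load("some_clip.wav", sr=SR).
rng = np.random.default_rng(0)
clips = [rng.standard_normal(SR) * a for a in (0.1, 0.5, 0.1, 0.5)]
labels = ["slow car", "fast car", "slow car", "fast car"]  # tag-pair labels

X = np.stack([clip_features(y) for y in clips])
clf = LogisticRegression(max_iter=1000).fit(X, labels)
print(clf.predict(X))
```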
The DRIVE-SAFE project: signal processing and advanced information technologies for improving driving prudence and accidents
In this paper, we describe the DRIVE-SAFE project, whose aim is to create conditions for prudent driving on highways and roadways with the purpose of reducing accidents caused by driver behavior. To achieve these primary goals, critical data is being collected from multimodal sensors (such as cameras and microphones) to build a unique databank on driver behavior. We are developing systems and technologies for analyzing the data and automatically determining potentially dangerous situations (such as driver fatigue or distraction). Based on the findings from these studies, we will propose systems for warning drivers and taking other precautionary measures to avoid accidents once a dangerous situation is detected. To address these issues, a national consortium has been formed, including the Automotive Research Center (OTAM), Koç University, Istanbul Technical University, Sabancı University, Ford A.Ş., Renault A.Ş., and Fiat A.Ş.
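As one concrete example of the kind of dangerous-situation detector described, here is a minimal sketch of a PERCLOS-style drowsiness check over per-frame eye-closure flags from a driver-facing camera. The window size, threshold, and synthetic input are assumptions for illustration, not the project's actual system.

```python
# PERCLOS-style fatigue check: warn when the fraction of recent frames
# with eyes closed exceeds a threshold. Parameters are illustrative.
import random
from collections import deque

class DrowsinessMonitor:
    def __init__(self, window: int = 300, perclos_threshold: float = 0.4):
        self.frames = deque(maxlen=window)   # last `window` frames (~10 s @ 30 fps)
        self.threshold = perclos_threshold

    def update(self, eyes_closed: bool) -> bool:
        """Feed one frame's eye-closure flag; return True when a warning should fire."""
        self.frames.append(eyes_closed)
        if len(self.frames) < self.frames.maxlen:
            return False                     # wait until the window is full
        perclos = sum(self.frames) / len(self.frames)  # fraction of closed-eye frames
        return perclos >= self.threshold

# Synthetic per-frame detector output standing in for a real camera pipeline.
random.seed(1)
monitor = DrowsinessMonitor()
stream = [random.random() < 0.5 for _ in range(600)]
alerts = sum(monitor.update(closed) for closed in stream)
print(f"warnings fired on {alerts} of {len(stream)} frames")
```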
K-Space at TRECVid 2008
In this paper we describe K-Space's participation in TRECVid 2008 in the interactive search task. For 2008 the K-Space group performed one of the largest interactive video information retrieval experiments conducted in a laboratory setting. Three institutions participated in a multi-site, multi-system experiment. In total 36 users took part, 12 each from Dublin City University (DCU, Ireland), University of Glasgow (GU, Scotland) and Centrum Wiskunde & Informatica (CWI, the Netherlands). Three user interfaces were developed: two from DCU, which were also used in 2007, as well as an interface from GU. All interfaces leveraged the same search service. Using a Latin squares arrangement, each user conducted 12 topics, yielding 6 runs per site, 18 in total. We officially submitted 3 of these runs to NIST for evaluation, with an additional expert run using a 4th system. Our submitted runs performed around the median. In this paper we present an overview of the search system utilized, the experimental setup, and a preliminary analysis of our results.
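For context, a Latin squares arrangement counterbalances topic order across users so that every topic appears in every position, controlling for learning and fatigue effects. Below is a minimal sketch of a cyclic 12x12 square matching the 12-users-per-site setup; the exact assignment the experiment used is assumed, not documented here.

```python
# Cyclic Latin square: every topic appears exactly once in each row
# (user) and each column (position), balancing order effects.

def latin_square(n: int) -> list[list[int]]:
    """n x n cyclic Latin square: row i is the topic order for user i."""
    return [[(i + j) % n for j in range(n)] for i in range(n)]

square = latin_square(12)
for user, topics in enumerate(square[:3]):   # first three users per site
    print(f"user {user}: topic order {topics}")
```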
RRL: A Rich Representation Language for the Description of Agent Behaviour in NECA
In this paper, we describe the Rich Representation Language (RRL) which is used in the NECA system. The NECA system generates interactions between two or more animated characters. The RRL is a formal framework for representing the information that is exchanged at the interfaces between the various NECA system modules
K-Space at TRECVid 2007
In this paper we describe K-Space's participation in TRECVid 2007. K-Space participated in two tasks, high-level feature extraction and interactive search. We present our approaches for each of these activities and provide a brief analysis of our results. Our high-level feature submission utilized multi-modal low-level features which included visual, audio and temporal elements. Specific concept detectors (such as face detectors) developed by K-Space partners were also used. We experimented with different machine learning approaches, including logistic regression and support vector machines (SVM). Finally, we also experimented with both early and late fusion for feature combination. This year we also participated in interactive search, submitting 6 runs. We developed two interfaces which both utilized the same retrieval functionality. Our objective was to measure the effect of context, which was supported to different degrees in each interface, on user performance. The first of the two systems was a 'shot'-based interface, where the results from a query were presented as a ranked list of shots. The second interface was 'broadcast'-based, where results were presented as a ranked list of broadcasts. Both systems made use of the outputs of our high-level feature submission as well as low-level visual features.
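To illustrate the early versus late fusion contrast mentioned above, here is a minimal sketch with synthetic visual and audio features: early fusion concatenates modality vectors before training a single classifier, while late fusion trains one classifier per modality and averages their scores. All data, dimensions, and the SVM choice are illustrative assumptions, not the K-Space configuration.

```python
# Early vs. late fusion over two feature modalities, on synthetic data.
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(42)
n = 200
X_visual = rng.random((n, 32))   # e.g., colour/texture descriptors
X_audio = rng.random((n, 16))    # e.g., audio descriptors
y = rng.integers(0, 2, n)        # 1 = concept present

# Early fusion: concatenate modality features, train one classifier.
early = SVC(probability=True).fit(np.hstack([X_visual, X_audio]), y)

# Late fusion: one classifier per modality, then average their scores.
vis_clf = SVC(probability=True).fit(X_visual, y)
aud_clf = SVC(probability=True).fit(X_audio, y)

x_v, x_a = rng.random((1, 32)), rng.random((1, 16))
early_score = early.predict_proba(np.hstack([x_v, x_a]))[0, 1]
late_score = (vis_clf.predict_proba(x_v)[0, 1] +
              aud_clf.predict_proba(x_a)[0, 1]) / 2
print(f"early={early_score:.2f}  late={late_score:.2f}")
```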