4 research outputs found

A Decision Making Framework for Road User Cost Analysis along Freeway Work Zone Projects

Author: Ates Ozan K.
Publication venue: Ohio University / OhioLINK
Publication date: 09/06/2014

OhioLINK Electronic Thesis and Dissertation Center

SPEECH DETECTION ON BROADCAST AUDIO

Authors: Acar Banu Oskay; Ates Tugrul K.; Esen Ersin; Onur Duygu Oskay; Ozan Ezgi Can; Zubari Unal; Çiloğlu Tolga
Publication date: 27/08/2010

Speech boundary detection contributes to the performance of speech-based applications such as speech recognition and speaker recognition. The speech boundary detector implemented in this study works on broadcast audio as a pre-processor module of a keyword spotter. Speech boundary detection is handled in three steps. In the first step, audio data is segmented into homogeneous regions in an unsupervised manner. After an ACTIVITY/NON-ACTIVITY decision is made for each region, ACTIVITY regions are classified as Speech/Non-speech via Gaussian Mixture Model (GMM) based classification. The GMMs are trained using a novel feature, Spectral Flow Direction (SFD), and an improved multi-band harmonicity feature, in addition to the widely used Mel Frequency Cepstral Coefficients (MFCCs).

OpenMETU (Middle East Technical University)

Content Based Copy Detection with Coarse Audio-Visual Fingerprints

Authors: Acar Banu Oskay; Alatan Abdullah Aydın; Ates Tugrul K.; Ciloglu Tolga; Esen Ersin; Ozalp Egemen; Ozan Ezgi C.; Saracoglu Ahmet; Zubari Uenal
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/01/2009

Content-based copy detection (CBCD) emerges as a viable alternative to the active detection methodology of watermarking. The first reason is that media already in circulation cannot be marked; the second is that CBCD can inherently endure various severe attacks that watermarking cannot. Although media content is generally handled independently as visual and audio, in this work both information sources are utilized in a unified framework in which coarse representations of fundamental features are employed. From the copy detection perspective, the number of attacks on audio content is limited compared with the visual case. Therefore audio, if present, is an indispensable part of a robust video copy detection system. In this study, the validity of this statement is demonstrated through various experiments on a large data set.

OpenMETU (Middle East Technical University)

Multimodal concept detection in broadcast media: KavTan

Authors: Acar Banu Oskay; Alatan A. Aydin; Alatan Abdullah Aydın; Arabaci Mehmet Ali; Ates Tugrul K.; ATIL Ilkay; ESEN Ersin; KARADENİZ Talha; Ozan Ezgi Can; Ozkan Savas; SARACOĞLU Ahmet; SELÇUK Sezin; SEVİMLİ Hakan; SEVİNÇ Muge; SOYSAL Medeni; TANKIZ Seda; TEKİN Mashar; Çiloğlu Tolga; ÖNÜR Duygu
Publication venue: Springer Science and Business Media LLC
Publication date: 01/10/2014

Concept detection is an important problem for efficient indexing and retrieval in large video archives. In this work, the KavTan System, which performs high-level semantic classification in one of the largest TV archives of Turkey, is presented. In this system, concept detection is performed using generalized visual and audio concept detection modules that are supported by video text detection, audio keyword spotting, and specialized audio-visual semantic detection components. The performance of the presented framework was assessed objectively over a wide range of semantic concepts (5 high-level, 14 visual, 9 audio, 2 supplementary) using a significant amount of precisely labeled ground truth data. The KavTan System achieves successful high-level concept detection performance in unconstrained TV broadcast by efficiently utilizing multimodal information that is systematically extracted from both the spatial and temporal extents of multimedia data.

OpenMETU (Middle East Technical University)