82 research outputs found

Reply to Drs Kalyoncu, Selçuk and Çöplü

Author: Saraçlar Y.
Publication venue: Published by Elsevier Ltd.
Publication date: 31/05/1998

Elsevier - Publisher Connector

Rethinking classification results based on read speech, or: why improvements do not always transfer to other speaking styles

Author: A Field, A Juneja, A Salomon, AM Abdelatti Ali, B Schölkopf, Barbara Schuppler, C Cortes, CY Espy-Wilson, DMW Powers, F Metze, F Pernkopf, J Frankel, JM Kessens, K Johnson, K Kirchhoff, K Manjunath, KJ Kohler, M Saraçlar, O Scharenborg, P Niyogi, R Ogden, S Chang, S Greenberg, S King, SM Siniscalchi, T Pruthi
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2017

Crossref

TUGraz OPEN Library

Feature analysis for discriminative confidence estimation in spoken term detection

Author: Akbacak, Almuallim, Ayed, Bekkerman, Ben-Bassat, Bergen, Bi, Bishop, Breiman, Can, Chan, Chase, Chen, Cox, Deligne, Dong Wang, Doroteo T. Toledano, Duda, Forman, Furey, Gadde, Gillick, Goldwater, Good, Guyon, Hain, Hall, Hastie, Hauptmann, Hellevik, Jansen, Javier Tejedor, Jiang, José Colás, Kamppari, Kao, Kemp, Kira, Kohavi, Koller, Kononenko, Langley, Liaw, Logan, Mamou, Manos, Mathan, Meng, Moreno, Motlicek, Neti, Ou, Parada, Parlak, Pinto, Rohlicek, Saeys, Saraçlar, Schaaf, Shafran, Simon King, Siu, Stolcke, Sudoh, Sukkar, Szöke, Tejedor, Thambiratmann, Tibshirani, Torkkola, Tusher, Vergyri, Wallace, Wang, Weintraub, Weston, Yu, Zhang
Publication venue: Elsevier BV
Publication date: 01/01/2014

This is the author's version of a work that was accepted for publication in Computer Speech & Language. Changes resulting from the publishing process, such as peer review, editing, corrections, structural formatting, and other quality control mechanisms, may not be reflected in this document. Changes may have been made to this work since it was submitted for publication. A definitive version was subsequently published in Computer Speech & Language, 28, 5, (2014), DOI: 10.1016/j.csl.2013.09.008.

Discriminative confidence estimation based on multi-layer perceptrons (MLPs) and multiple features has shown a significant advantage over the widely used lattice-based confidence in spoken term detection (STD). Although the MLP-based framework can handle any features derived from a multitude of sources, choosing all possible features may lead to overly complex models and hence reduced generality. In this paper, we design an extensive set of features and analyze their contribution to STD individually and as a group. The main goal is to choose a small set of features that are sufficiently informative while keeping the model simple and generalizable. We employ two established models to conduct the analysis: one is linear regression, which targets the most relevant features, and the other is logistic linear regression, which targets the most discriminative features. We find that the most informative features are those derived from diverse sources (ASR decoding, duration and lexical properties) and that the two models deliver highly consistent feature ranks. STD experiments on both English and Spanish data demonstrate significant performance gains with the proposed feature sets.

This work has been partially supported by project PriorSPEECH (TEC2009-14719-C02-01) from the Spanish Ministry of Science and Innovation and by project MAV2VICMR (S2009/TIC-1542) from the Community of Madrid.
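The abstract names logistic regression as the model used to find the most discriminative features. As a rough illustration only, the sketch below ranks features by the absolute weight a logistic regression assigns to them after standardization. This ranking heuristic, the feature names (`lattice_posterior`, `duration`, `noise`) and the synthetic data are all assumptions made for the example; they are not taken from the paper and do not reproduce its procedure or results.

```python
import math
import random


def standardize(columns):
    """Scale each feature column to zero mean and unit variance."""
    scaled = []
    for col in columns:
        mean = sum(col) / len(col)
        var = sum((v - mean) ** 2 for v in col) / len(col)
        std = math.sqrt(var) or 1.0  # guard against constant columns
        scaled.append([(v - mean) / std for v in col])
    return scaled


def fit_logistic(columns, labels, steps=500, lr=0.1):
    """Fit logistic regression by batch gradient descent; return the weights."""
    n = len(labels)
    weights = [0.0] * len(columns)
    bias = 0.0
    for _ in range(steps):
        grad_w = [0.0] * len(columns)
        grad_b = 0.0
        for i in range(n):
            z = bias + sum(w * col[i] for w, col in zip(weights, columns))
            z = max(-30.0, min(30.0, z))  # clamp to keep exp() well behaved
            err = 1.0 / (1.0 + math.exp(-z)) - labels[i]
            grad_b += err
            for j, col in enumerate(columns):
                grad_w[j] += err * col[i]
        bias -= lr * grad_b / n
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
    return weights


def rank_features(names, columns, labels):
    """Return (name, weight) pairs sorted by decreasing absolute weight."""
    weights = fit_logistic(standardize(columns), labels)
    return sorted(zip(names, weights), key=lambda nw: abs(nw[1]), reverse=True)


# Synthetic detections: label 1 = correct hit, label 0 = false alarm.
# The distributions below are invented purely for the demonstration.
random.seed(0)
labels = [1 if random.random() < 0.5 else 0 for _ in range(200)]
lattice_posterior = [random.gauss(0.7 if y else 0.4, 0.15) for y in labels]
duration = [random.gauss(0.45 if y else 0.40, 0.10) for y in labels]
noise = [random.gauss(0.0, 1.0) for _ in labels]

names = ["lattice_posterior", "duration", "noise"]
ranking = rank_features(names, [lattice_posterior, duration, noise], labels)
for name, weight in ranking:
    print(f"{name}: {weight:+.3f}")
```

Standardizing first matters because raw weights depend on each feature's scale; without it, a feature measured in small units could look important only because of its range.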

CiteSeerX

Crossref

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Edinburgh Research Explorer

Biblos-e Archivo

Reply to Drs Kalyoncu, Selçuk and Çöplü

Author: Kalyoncu, Robertson, Saraçlar, Y. Saraçlar
Publication venue: Elsevier BV

Crossref

Multi-stream long short-term memory neural network language model

Author: Arısoy Ebru
Saraçlar Murat
Publication date: 01/01/2015

Ebru Arısoy (MEF Author)

Long Short-Term Memory (LSTM) neural networks are recurrent neural networks that contain memory units that can store contextual information from past inputs for arbitrary amounts of time. A typical LSTM neural network language model is trained by feeding an input sequence, i.e., a stream of words, to the input layer of the network, and the output layer predicts the probability of the next word given the past inputs in the sequence. In this paper we introduce a multi-stream LSTM neural network language model where multiple asynchronous input sequences are fed to the network as parallel streams while predicting the output word sequence. For our experiments, we use a sub-word sequence in addition to a word sequence as the input streams, which allows joint training of the LSTM neural network language model using both information sources.

WOS:000380581600296; Scopus Affiliation ID: 60105072; Conference Proceedings Citation Index - Science; Proceedings Paper; September 2015; YÖK - 2015-1

MEF University Institutional Repository

Processing spoken lectures for personalized and adaptive learning (Kişiselleştirilmiş ve uyarlamalı öğrenme için sözlü dersanlatımlarının işlenmesi)

Author: Arısoy Saraçlar Ebru
Publication venue: The Scientific and Technological Research Council of Turkey (TUBITAK-ULAKBIM) - DIGITAL COMMONS JOURNALS
Publication date: 01/01/2021

Project No: 117E202

This project aimed to develop advanced speech and language processing technologies for personalized and adaptive education and, in particular, to make learning from online lecture videos more efficient. To the best of our knowledge, it is one of the most comprehensive projects in which advanced speech and language processing applications are used in educational technologies, and it has thereby also contributed to the educational technology literature. Over the course of the project, original work was carried out on automatic speech recognition (ASR), spoken retrieval, spoken question answering in particular, and sentiment and opinion analysis. Systems were developed for both English and Turkish lecture videos. In ASR, system adaptation was performed both for systems with a hybrid acoustic model and for systems trained end to end. As a result, ASR systems that perform with high accuracy on lecture videos were developed. Spoken question answering was applied to the lecture video domain for the first time in this project. Original methods were proposed to overcome the challenges of spoken question answering and were applied successfully, which contributed to the spoken question answering literature. In addition, student comments accompanying the lecture videos were classified as positive or negative using sentiment and opinion analysis methods, and whether students understood the topic covered in the video was measured. The novelty of this work lies in using a sentiment and opinion analysis system to analyze students' level of learning in the field of educational technology. As the output of the project, an online learning platform incorporating the ASR and question answering systems was implemented within the university and opened to students in the spring semester of 2020-2021. The system developed in the project has the potential to contribute to the distance education model, which has become an active part of our lives particularly with the COVID-19 pandemic. For this reason, work on effectively integrating the developed system into the distance education process will be carried out in the coming periods.

MEF University Institutional Repository