14 research outputs found
Reducing energy burden in the power sector: Metrics for assessing energy poverty
What does it mean to build an equitable and just power system, and how do metrics used to assess energy justice affect the power industry? Across the world, countries are undergoing an energy transition. In the past, energy system build-out and operations have left some communities more heavily burdened than others (e.g., through higher levels of air pollution or more expensive energy bills).
The 2001 GMTK-based SPINE ASR system
This paper provides a detailed description of the University of Washington automatic speech recognition (ASR) system for the 2001 DARPA SPeech In Noisy Environments (SPINE) task. Our system makes heavy use of the graphical modeling toolkit (GMTK), a general purpose graphical modeling-based ASR system that allows arbitrary parameter tying, flexible deterministic and stochastic dependencies between variables, and a generalized maximum likelihood parameter estimation algorithm. In our SPINE system, GMTK was used for acoustic model training whereas feature extraction, speaker adaptation, and first-pass decoding were performed by HTK. Our integrated GMTK/HTK system demonstrates the relative merits provided by each tool. Novel aspects of our SPINE system include the capturing of correlations among feature vectors via a globally-shared factored sparse inverse covariance matrix and generalized EM training.
Semantic Indexing of Multimedia Content Using Visual, Audio, and Text Cues
We present a learning-based approach to the semantic indexing of multimedia content using cues derived from audio, visual, and text features. We approach the problem by developing a set of statistical models for a predefined lexicon. Novel concepts are then mapped in terms of the concepts in the lexicon. To achieve robust detection of concepts, we exploit features from multiple modalities, namely, audio, video, and text. Concept representations are modeled using Gaussian mixture models (GMM), hidden Markov models (HMM), and support vector machines (SVM). Models such as Bayesian networks and SVMs are used in a late-fusion approach to model concepts that are not explicitly modeled in terms of features. Our experiments indicate promise in the proposed classification and fusion methodologies: our proposed fusion scheme achieves more than 10% relative improvement over the best unimodal concept detector.
IBM research TRECVID-2003 video retrieval system
In this paper we describe our participation in the NIST TRECVID-2003 evaluation. We participated in four tasks of the benchmark: shot boundary detection, high-level feature detection, story segmentation, and search. We describe the different runs we submitted for each track and discuss our performance.