382 research outputs found
Towards a query language for annotation graphs
The multidimensional, heterogeneous, and temporal nature of speech databases
raises interesting challenges for representation and query. Recently,
annotation graphs have been proposed as a general-purpose representational
framework for speech databases. Typical queries on annotation graphs require
path expressions similar to those used in semistructured query languages.
However, the underlying model is rather different from the customary graph
models for semistructured data: the graph is acyclic and unrooted, and both
temporal and inclusion relationships are important. We develop a query language
and describe optimization techniques for an underlying relational
representation.Comment: 8 pages, 10 figure
ATLAS: A flexible and extensible architecture for linguistic annotation
We describe a formal model for annotating linguistic artifacts, from which we
derive an application programming interface (API) to a suite of tools for
manipulating these annotations. The abstract logical model provides for a range
of storage formats and promotes the reuse of tools that interact through this
API. We focus first on ``Annotation Graphs,'' a graph model for annotations on
linear signals (such as text and speech) indexed by intervals, for which
efficient database storage and querying techniques are applicable. We note how
a wide range of existing annotated corpora can be mapped to this annotation
graph model. This model is then generalized to encompass a wider variety of
linguistic ``signals,'' including both naturally occuring phenomena (as
recorded in images, video, multi-modal interactions, etc.), as well as the
derived resources that are increasingly important to the engineering of natural
language processing systems (such as word lists, dictionaries, aligned
bilingual corpora, etc.). We conclude with a review of the current efforts
towards implementing key pieces of this architecture.Comment: 8 pages, 9 figure
A Formal Framework for Linguistic Annotation
`Linguistic annotation' covers any descriptive or analytic notations applied
to raw language data. The basic data may be in the form of time functions --
audio, video and/or physiological recordings -- or it may be textual. The added
notations may include transcriptions of all sorts (from phonetic features to
discourse structures), part-of-speech and sense tagging, syntactic analysis,
`named entity' identification, co-reference annotation, and so on. While there
are several ongoing efforts to provide formats and tools for such annotations
and to publish annotated linguistic databases, the lack of widely accepted
standards is becoming a critical problem. Proposed standards, to the extent
they exist, have focussed on file formats. This paper focuses instead on the
logical structure of linguistic annotations. We survey a wide variety of
existing annotation formats and demonstrate a common conceptual core, the
annotation graph. This provides a formal framework for constructing,
maintaining and searching linguistic annotations, while remaining consistent
with many alternative data structures and file formats.Comment: 49 page
Estimating Speaking Rate by Means of Rhythmicity Parameters
In this paper we present a speech rate estimator based on so-called rhythmicity features derived from a modified version of the short-time energy envelope. To evaluate the new method, it is compared to a traditional speech rate estimator on the basis of semi-automatic segmentation. Speech material from the Alcohol Language Corpus (ALC) covering intoxicated and sober speech of different speech styles provides a statistically sound foundation to test upon. The proposed measure clearly correlates with the semi-automatically determined speech rate and seems to be robust across speech styles and speaker states
TEMPORAL DATA EXTRACTION AND QUERY SYSTEM FOR EPILEPSY SIGNAL ANALYSIS
The 2016 Epilepsy Innovation Institute (Ei2) community survey reported that unpredictability is the most challenging aspect of seizure management. Effective and precise detection, prediction, and localization of epileptic seizures is a fundamental computational challenge. Utilizing epilepsy data from multiple epilepsy monitoring units can enhance the quantity and diversity of datasets, which can lead to more robust epilepsy data analysis tools. The contributions of this dissertation are two-fold. One is the implementation of a temporal query for epilepsy data; the other is the machine learning approach for seizure detection, seizure prediction, and seizure localization. The three key components of our temporal query interface are: 1) A pipeline for automatically extract European Data Format (EDF) information and epilepsy annotation data from cross-site sources; 2) Data quantity monitoring for Epilepsy temporal data; 3) A web-based annotation query interface for preliminary research and building customized epilepsy datasets. The system extracted and stored about 450,000 epilepsy-related events of more than 2,497 subjects from seven institutes up to September 2019. Leveraging the epilepsy temporal events query system, we developed machine learning models for seizure detection, prediction, and localization. Using 135 extracted features from EEG signals, we trained a channel-based eXtreme Gradient Boosting model to detect seizures on 8-second EEG segments. A long-term EEG recording evaluation shows that the model can detect about 90.34% seizures on an existing EEG dataset with 961 hours of data. The model achieved 89.88% accuracy, 92.32% sensitivity, and 84.76% AUC based on the segments evaluation. We also introduced a transfer learning approach consisting of 1) a base deep learning model pre-trained by ImageNet dataset and 2) customized fully connected layers, to train the patient-specific pre-ictal and inter-ictal data from our database. Two convolutional neural network architectures were evaluated using 53 pre-ictal segments and 265 continuous hours of inter-ictal EEG data. The evaluation shows that our model reached 86.79% sensitivity and 3.38% false-positive rate. Another transfer learning model for seizure localization uses a pre-trained ResNext50 structure and was trained with an image augmentation dataset labeling by fingerprint. Our model achieved 88.22% accuracy, 34.99% sensitivity, 1.02% false-positive rate, and 34.3% positive likelihood rate
- âŠ