Patterns versus Characters in Subword-aware Neural Language Modeling
Words in some natural languages can have a composite structure. Elements of
this structure include the root (that could also be composite), prefixes and
suffixes with which various nuances and relations to other words can be
expressed. Thus, in order to build a proper word representation one must take
into account its internal structure. From a corpus of texts we extract a set of
frequent subwords and from the latter set we select patterns, i.e. subwords
which encapsulate information on character n-gram regularities. The selection
is made using the pattern-based Conditional Random Field model with
regularization. Further, for every word we construct a new sequence over an
alphabet of patterns. The new alphabet's symbols capture a stronger local
statistical context than characters do; therefore, they allow better
representations and are better building blocks for word
representation. In the task of subword-aware language modeling, pattern-based
models outperform character-based analogues by 2-20 perplexity points. Also, a
recurrent neural network in which a word is represented as a sum of embeddings
of its patterns is on par with a competitive and significantly more
sophisticated character-based convolutional architecture. Comment: 10 pages
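The "sum of embeddings of its patterns" idea in the abstract above can be illustrated with a minimal sketch. The segmentation of the example word and the two-dimensional embedding values are hypothetical, chosen only for illustration; the abstract does not give the authors' actual patterns or model.

```python
def word_vector(patterns, embeddings):
    """Represent a word as the element-wise sum of its pattern embeddings."""
    dim = len(next(iter(embeddings.values())))
    total = [0.0] * dim
    for pattern in patterns:
        for i, value in enumerate(embeddings[pattern]):
            total[i] += value
    return total


# Hypothetical pattern inventory and embeddings (not taken from the paper).
embeddings = {
    "un": [1.0, 0.0],
    "break": [0.0, 2.0],
    "able": [0.5, 0.5],
}

# Hypothetical segmentation of "unbreakable" into patterns.
vec = word_vector(["un", "break", "able"], embeddings)
```

In a trained model the embeddings would be learned jointly with the language model; here they are fixed so that the summation is easy to follow.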
Structured Prediction of Sequences and Trees using Infinite Contexts
Linguistic structures exhibit a rich array of global phenomena; however,
commonly used Markov models are unable to adequately describe these phenomena
due to their strong locality assumptions. We propose a novel hierarchical model
for structured prediction over sequences and trees which exploits global
context by conditioning each generation decision on an unbounded context of
prior decisions. This builds on the success of Markov models but without
imposing a fixed bound in order to better represent global phenomena. To
facilitate learning of this large and unbounded model, we use a hierarchical
Pitman-Yor process prior which provides a recursive form of smoothing. We
propose prediction algorithms based on A* and Markov Chain Monte Carlo
sampling. Empirical results demonstrate the potential of our model compared to
baseline finite-context Markov models on part-of-speech tagging and syntactic
parsing.
3DQ: Compact Quantized Neural Networks for Volumetric Whole Brain Segmentation
Model architectures have been dramatically increasing in size, improving
performance at the cost of resource requirements. In this paper we propose 3DQ,
a ternary quantization method, applied for the first time to 3D Fully
Convolutional Neural Networks (F-CNNs), enabling 16x model compression while
maintaining performance on par with full precision models. We extensively
evaluate 3DQ on two datasets for the challenging task of whole brain
segmentation. Additionally, we showcase our method's ability to generalize on
two common 3D architectures, namely 3D U-Net and V-Net. Outperforming a variety
of baselines, the proposed method is capable of compressing large 3D models to
a few MBytes, alleviating the storage needs in space critical applications. Comment: Accepted to MICCAI 201
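One reading of the 16x figure (an assumption, not stated in the abstract) is that each 32-bit weight is stored as a 2-bit ternary code. The sketch below uses a simple threshold-based ternarization with one shared scale; the `threshold_ratio` parameter and the scaling rule are illustrative assumptions and not necessarily the scheme used by 3DQ.

```python
def ternarize(weights, threshold_ratio=0.05):
    """Map each weight to -1, 0 or +1 and return the codes with a shared scale.

    Weights whose magnitude is at most threshold_ratio * max|w| become 0.
    The scale is the mean magnitude of the weights that were kept.
    """
    max_abs = max(abs(w) for w in weights)
    delta = threshold_ratio * max_abs
    codes = [0 if abs(w) <= delta else (1 if w > 0 else -1) for w in weights]
    kept = [abs(w) for w, c in zip(weights, codes) if c != 0]
    scale = sum(kept) / len(kept) if kept else 0.0
    return codes, scale


def compression_ratio(full_bits=32, ternary_bits=2):
    """Storage ratio when each weight shrinks from full_bits to ternary_bits."""
    return full_bits / ternary_bits


weights = [0.8, -0.01, -0.6, 0.02, 0.4]  # toy values, not real model weights
codes, scale = ternarize(weights)
```

A dequantized weight is then approximated as `scale * code`, so only the 2-bit codes and a handful of scales need to be stored.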
Label-Dependencies Aware Recurrent Neural Networks
In the last few years, Recurrent Neural Networks (RNNs) have proved effective
on several NLP tasks. Despite such great success, their ability to model
sequence labeling is still limited. This has led research toward solutions
where RNNs are combined with models which already proved effective in this
domain, such as CRFs. In this work we propose a solution far simpler but very
effective: an evolution of the simple Jordan RNN, where labels are re-injected
as input into the network, and converted into embeddings, in the same way as
words. We compare this RNN variant to all the other RNN models, Elman and
Jordan RNN, LSTM and GRU, on two well-known tasks of Spoken Language
Understanding (SLU). Thanks to label embeddings and their combination at the
hidden layer, the proposed variant, which uses more parameters than Elman and
Jordan RNNs but far fewer than LSTM and GRU, is not only more effective than
other RNNs but also outperforms sophisticated CRF models. Comment: 22 pages, 3 figures. Accepted at CICLing 2017 conference. Best
Verifiability, Reproducibility, and Working Description award
Using Regular Languages to Explore the Representational Capacity of Recurrent Neural Architectures
The presence of Long Distance Dependencies (LDDs) in sequential data poses
significant challenges for computational models. Various recurrent neural
architectures have been designed to mitigate this issue. In order to test these
state-of-the-art architectures, there is a growing need for rich benchmarking
datasets. However, one of the drawbacks of existing datasets is the lack of
experimental control with regard to the presence and/or degree of LDDs. This
lack of control limits the analysis of model performance in relation to the
specific challenge posed by LDDs. One way to address this is to use synthetic
data having the properties of subregular languages. The degree of LDDs within
the generated data can be controlled through the k parameter, the length of the
generated strings, and the choice of appropriate forbidden strings. In this
paper, we explore the capacity of different RNN extensions to model LDDs, by
evaluating these models on a sequence of SPk synthesized datasets, where each
subsequent dataset exhibits a greater degree of LDD. Even though SPk are simple
languages, the presence of LDDs does have significant impact on the performance
of recurrent neural architectures, thus making them prime candidates for
benchmarking tasks. Comment: International Conference of Artificial Neural Networks (ICANN) 201
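As a rough illustration of how forbidden strings create long distance dependencies, the sketch below treats an SPk language as the set of strings that contain none of a given set of forbidden subsequences of length k, where the symbols of a subsequence need not be adjacent. The function names and the example forbidden string are hypothetical and are not taken from the paper's datasets.

```python
def contains_subsequence(string, pattern):
    """Return True if pattern occurs in string as a (possibly gapped) subsequence."""
    remaining = iter(string)
    # Each membership test consumes the iterator up to the matched symbol,
    # so the symbols of the pattern must appear in order.
    return all(symbol in remaining for symbol in pattern)


def in_sp_language(string, forbidden):
    """A string is accepted if it contains none of the forbidden subsequences."""
    return not any(contains_subsequence(string, f) for f in forbidden)


forbidden = ["ab"]  # hypothetical k = 2 grammar: no "a" may ever be followed by "b"

accepted = in_sp_language("bbaacc", forbidden)
rejected = in_sp_language("acccccb", forbidden)
```

In the rejected string the offending symbols are six positions apart, so a model must remember the initial "a" across the whole string to reject it. Longer strings stretch that dependency further.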
Dutch parallel corpus: a balanced parallel corpus for Dutch-English and Dutch-French
status: published
A Multi-Armed Bandit to Smartly Select a Training Set from Big Medical Data
With the availability of big medical image data, the selection of an adequate
training set is becoming more important to address the heterogeneity of
different datasets. Simply including all the data not only incurs high
processing costs but can even harm the prediction. We formulate the smart and
efficient selection of a training dataset from big medical image data as a
multi-armed bandit problem, solved by Thompson sampling. Our method assumes
that image features are not available at the time of the selection of the
samples, and therefore relies only on meta information associated with the
images. Our strategy simultaneously exploits data sources with high chances of
yielding useful samples and explores new data regions. For our evaluation, we
focus on the application of estimating the age from a brain MRI. Our results on
7,250 subjects from 10 datasets show that our approach leads to higher accuracy
while only requiring a fraction of the training data. Comment: MICCAI 2017 Proceedings
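The selection strategy described above can be sketched as a Beta-Bernoulli Thompson sampler. This is a minimal illustration under stated assumptions: each arm is a data source, the reward is a binary "sample was useful" signal, and the source names and usefulness probabilities are hypothetical. The abstract does not specify the authors' reward definition or priors.

```python
import random


def thompson_select(stats, rng):
    """Pick the arm with the highest draw from its Beta posterior.

    stats maps each arm to [successes, failures]; Beta(1, 1) is the prior.
    """
    draws = {arm: rng.betavariate(s + 1, f + 1) for arm, (s, f) in stats.items()}
    return max(draws, key=draws.get)


def run_bandit(usefulness, rounds, seed=0):
    """Simulate sample selection; usefulness gives each arm's true reward rate."""
    rng = random.Random(seed)
    stats = {arm: [0, 0] for arm in usefulness}
    picks = {arm: 0 for arm in usefulness}
    for _ in range(rounds):
        arm = thompson_select(stats, rng)
        picks[arm] += 1
        reward = rng.random() < usefulness[arm]  # simulated binary feedback
        stats[arm][0 if reward else 1] += 1
    return picks


# Hypothetical sources: one mostly useful, one mostly not.
picks = run_bandit({"source_a": 0.9, "source_b": 0.1}, rounds=500)
```

Sampling from the posterior, rather than taking its mean, is what lets the procedure keep exploring rarely tried sources while concentrating on those that have proved useful.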
Annotating patient clinical records with syntactic chunks and named entities: the Harvey corpus
The free text notes typed by physicians during patient consultations contain valuable information for the study of disease and treatment. These notes are difficult to process by existing natural language analysis tools since they are highly telegraphic (omitting many words), and contain many spelling mistakes, inconsistencies in punctuation, and non-standard word order. To support information extraction and classification tasks over such text, we describe a de-identified corpus of free text notes, a shallow syntactic and named entity annotation scheme for this kind of text, and an approach to training domain specialists with no linguistic background to annotate the text. Finally, we present a statistical chunking system for such clinical text with a stable learning rate and good accuracy, indicating that the manual annotation is consistent and that the annotation scheme is tractable for machine learning
Test-retest reliability of temporal and spatial gait characteristics measured with an instrumented walkway system (GAITRite®)
BACKGROUND: The purpose of this study was to determine the test-retest reliability of temporal and spatial gait measurements over a one-week period as measured using an instrumented walkway system (GAITRite®). METHODS: Subjects were tested on two occasions one week apart. Measurements were made at preferred and fast walking speeds using the GAITRite® system. Measurements tested included walking speed, step length, stride length, base of support, step time, stride time, swing time, stance time, single and double support times, and toe-in/toe-out angle. RESULTS: Twenty-one healthy subjects participated in this study. The group consisted of 12 men and 9 women, with an average age of 34 years (range: 19–59 years). At preferred walking speed, all gait measurements had ICCs of 0.92 and higher, except base of support, which had an ICC of 0.80. At fast walking speed, all gait measurements had ICCs above 0.89 except base of support (ICC = 0.79). CONCLUSIONS: Spatial-temporal gait measurements demonstrate good to excellent test-retest reliability over a one-week time span.
Consensus statements on the utility of defining ARDS and the utility of past and current definitions of ARDS—protocol for a Delphi study
Introduction: Acute respiratory distress syndrome (ARDS), marked by acute hypoxemia and bilateral pulmonary infiltrates, has been defined in multiple ways since its first description. This Delphi study aims to collect global opinions on the conceptual framework of ARDS, assess the usefulness of components within current and past definitions and investigate the role of subphenotyping. The varied expertise of the panel will provide valuable insights for refining future ARDS definitions and improving clinical management. Methods: A diverse panel of 35–40 experts will be selected based on predefined criteria. Multiple choice questions (MCQs) or 7-point Likert-scale statements will be used in the iterative Delphi rounds to achieve consensus on key aspects related to the utility of definitions and subphenotyping. The Delphi rounds will be continued until a stable agreement or disagreement is achieved for all statements. Analysis: Consensus will be considered reached when a choice in an MCQ or a Likert-scale statement achieves ≥80% of votes for agreement or disagreement. Stability will be checked by non-parametric χ² tests or the Kruskal-Wallis test, starting from the second round of the Delphi process. A p-value ≥0.05 will be used to define stability. Ethics and dissemination: The study will be conducted in full concordance with the principles of the Declaration of Helsinki and will be reported according to CREDES guidance. This study has been granted an ethical approval waiver by the NMC Healthcare Regional Research Ethics Committee, Dubai (NMCHC/CR/DXB/REC/APP/002), owing to the nature of the research. Informed consent will be obtained from all panellists before the start of the Delphi process. The study will be published in a peer-reviewed journal with the authorship agreed as per ICMJE requirements. Trial registration number: NCT06159465
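The ≥80% consensus rule in the Analysis section can be sketched as follows. The mapping of 7-point Likert scores to agreement (5 to 7) and disagreement (1 to 3) is an assumption made for illustration; the protocol text above does not state which scores count toward each side, and the example votes are invented.

```python
def likert_consensus(votes, threshold_percent=80):
    """Classify a statement as 'agreement', 'disagreement' or 'no consensus'.

    Assumption: scores 5-7 count as agreement, 1-3 as disagreement,
    and 4 as neutral. Integer arithmetic avoids floating-point edge cases.
    """
    total = len(votes)
    agree = sum(1 for v in votes if v >= 5)
    disagree = sum(1 for v in votes if v <= 3)
    if agree * 100 >= threshold_percent * total:
        return "agreement"
    if disagree * 100 >= threshold_percent * total:
        return "disagreement"
    return "no consensus"


# Invented votes from a ten-person panel: 8 of 10 score 5 or higher.
result = likert_consensus([7, 6, 6, 5, 5, 5, 6, 7, 4, 2])
```

The stability check between rounds (χ² or Kruskal-Wallis with p ≥ 0.05) is a separate step and is not covered by this sketch.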