Patterns versus Characters in Subword-aware Neural Language Modeling
Words in some natural languages can have a composite structure, whose elements include the root (which may itself be composite) and the prefixes and suffixes that express various nuances and relations to other words. Thus, in order to build a proper word representation, one must take its internal structure into account. From a corpus of texts we extract a set of
frequent subwords, and from this set we select patterns, i.e. subwords that encapsulate information on character n-gram regularities. The selection
is made using the pattern-based Conditional Random Field model with
regularization. Further, for every word we construct a new sequence over an
alphabet of patterns. The new alphabet's symbols capture local statistical context more strongly than individual characters, and therefore they yield better representations and serve as better building blocks for word representation. In the task of subword-aware language modeling, pattern-based
models outperform character-based analogues by 2-20 perplexity points. Also, a
recurrent neural network in which a word is represented as a sum of embeddings
of its patterns is on par with a competitive and significantly more
sophisticated character-based convolutional architecture.
Comment: 10 pages
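The sum-of-pattern-embeddings word representation mentioned in the last sentence can be sketched as follows. The toy pattern vocabulary, the greedy longest-match segmentation, and the embedding size are illustrative assumptions, not the paper's actual pipeline (which selects patterns with a pattern-based CRF):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical pattern vocabulary extracted from a corpus (illustrative only).
patterns = ["un", "break", "able", "ing", "s"]
emb_dim = 8
pattern_emb = {p: rng.normal(size=emb_dim) for p in patterns}

def segment(word, vocab):
    """Greedy longest-match segmentation of a word into known patterns."""
    out, i = [], 0
    while i < len(word):
        for j in range(len(word), i, -1):
            if word[i:j] in vocab:
                out.append(word[i:j])
                i = j
                break
        else:
            i += 1  # skip characters not covered by any pattern
    return out

def word_vector(word):
    """Represent a word as the sum of the embeddings of its patterns."""
    parts = segment(word, pattern_emb)
    return sum((pattern_emb[p] for p in parts), np.zeros(emb_dim))

print(segment("unbreakable", pattern_emb))  # ['un', 'break', 'able']
```

A real model would learn the embeddings jointly with the language model rather than drawing them at random; the sketch only shows the composition step.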
Multiscale Fields of Patterns
We describe a framework for defining high-order image models that can be used
in a variety of applications. The approach involves modeling local patterns in
a multiscale representation of an image. Local properties of a coarsened image
reflect non-local properties of the original image. In the case of binary
images local properties are defined by the binary patterns observed over small
neighborhoods around each pixel. With the multiscale representation we capture
the frequency of patterns observed at different scales of resolution. This
framework leads to expressive priors that depend on a relatively small number
of parameters. For inference and learning we use an MCMC method for block
sampling with very large blocks. We evaluate the approach with two example
applications. One involves contour detection. The other involves binary
segmentation.
Comment: In NIPS 201
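The frequency-of-local-patterns idea can be sketched for binary images. The 2x2 pattern size and the majority-vote coarsening rule below are assumptions for illustration, not necessarily the paper's choices:

```python
import numpy as np

def coarsen(img):
    """Halve resolution by majority vote over non-overlapping 2x2 blocks
    (an assumed coarsening rule; the paper's may differ)."""
    h, w = img.shape
    blocks = img[:h - h % 2, :w - w % 2].reshape(h // 2, 2, w // 2, 2)
    return (blocks.sum(axis=(1, 3)) >= 2).astype(np.uint8)

def pattern_counts(img):
    """Count the 16 possible 2x2 binary patterns in an image."""
    counts = np.zeros(16, dtype=np.int64)
    h, w = img.shape
    for i in range(h - 1):
        for j in range(w - 1):
            code = (img[i, j] | img[i, j + 1] << 1
                    | img[i + 1, j] << 2 | img[i + 1, j + 1] << 3)
            counts[code] += 1
    return counts

def multiscale_counts(img, levels=3):
    """Pattern frequencies at successive scales of resolution."""
    out = []
    for _ in range(levels):
        out.append(pattern_counts(img))
        img = coarsen(img)
    return out
```

The stacked count vectors are the kind of multiscale statistics a prior over binary images could be defined on; learning the weights attached to each pattern is the part the paper's MCMC machinery handles.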
Disease Name Extraction from Clinical Text Using Conditional Random Fields
The aim of the research done in this thesis was to extract disease and disorder names from clinical texts. We utilized Conditional Random Fields (CRFs) as the main method to label diseases and disorders in clinical sentences. We used other tools, such as MetaMap and the Stanford CoreNLP toolkit, to extract crucial features. MetaMap was used to identify names of diseases/disorders that are already in the UMLS Metathesaurus. Other important features, such as lemmatized versions of words and POS tags, were extracted using Stanford CoreNLP. Further features, including the semantic types of words, were extracted directly from the UMLS Metathesaurus. We participated in Task 7 of the SemEval 2014 competition and used its provided data to train and evaluate our system. The training data contained 199 clinical texts, the development data contained 99, and the test data contained 133; these included discharge summaries and echocardiogram, radiology, and ECG reports. We obtained competitive results on the disease/disorder name extraction task. We found through an ablation study that while all features contributed, MetaMap matches, POS tags, and the previous and next words were the most effective features.
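The token-level features described here (lexicon matches, POS tags, neighbouring words) might be assembled for a linear-chain CRF roughly as follows. The feature names and the toy lexicon stand in for real MetaMap/UMLS lookups and are purely illustrative:

```python
# Toy stand-in for a UMLS/MetaMap lexicon lookup (illustrative only).
DISEASE_LEXICON = {"pneumonia", "diabetes", "hypertension"}

def token_features(tokens, pos_tags, i):
    """Feature dict for token i of a sentence, in the style of
    linear-chain CRF taggers (feature names are assumptions)."""
    word = tokens[i]
    return {
        "word.lower": word.lower(),
        "pos": pos_tags[i],
        "in.lexicon": word.lower() in DISEASE_LEXICON,  # MetaMap-style match
        "prev.word": tokens[i - 1].lower() if i > 0 else "<BOS>",
        "next.word": tokens[i + 1].lower() if i + 1 < len(tokens) else "<EOS>",
    }

tokens = ["Patient", "denies", "pneumonia", "."]
pos = ["NN", "VBZ", "NN", "."]
feats = [token_features(tokens, pos, i) for i in range(len(tokens))]
# BIO labels a trained CRF would be expected to predict for this sentence:
# ["O", "O", "B-Disease", "O"]
```

A CRF toolkit would consume one such feature dict per token together with BIO labels at training time.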
Computing a partition function of a generalized pattern-based energy over a semiring
Valued constraint satisfaction problems with ordered variables (VCSPO) are a
special case of Valued CSPs in which variables are totally ordered and soft
constraints are imposed on tuples of variables that do not violate the order.
We study a restriction of VCSPO in which soft constraints are imposed on a segment of adjacent variables and the constraint language consists of (Formula presented.)-valued characteristic functions of predicates. This kind of potential generalizes the so-called pattern-based potentials, which have been applied in many structured prediction tasks.
For a constraint language (Formula presented.) we introduce a closure operator, (Formula presented.), and give examples of constraint languages for which (Formula presented.) is small. If all predicates in (Formula presented.) are Cartesian products, we show that the minimization of a generalized pattern-based potential (or the computation of its partition function) can be carried out in (Formula presented.) time, where (Formula presented.) is the set of variables and (Formula presented.) is the domain set. If, additionally, only non-positive weights of constraints are allowed, the complexity of the minimization task drops to (Formula presented.), where (Formula presented.) is the arity of (Formula presented.). For a general language (Formula presented.) and non-positive weights, the minimization task can be carried out in (Formula presented.) time. We argue that in many natural cases (Formula presented.) is of moderate size, though in the worst case it can blow up and depend exponentially on (Formula presented.).
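Minimization and partition-function computation are two instances of one computation over different semirings, which is the semiring viewpoint this abstract takes. A brute-force reference sketch (the function name, segment-potential interface, and max_len cutoff are assumptions for illustration; the paper's algorithms are far more efficient):

```python
import math
from itertools import product

# A commutative semiring is given by (plus, times, zero, one).
SUM_PRODUCT = (lambda a, b: a + b, lambda a, b: a * b, 0.0, 1.0)  # partition function
MIN_PLUS = (min, lambda a, b: a + b, math.inf, 0.0)               # minimization

def chain_value(n, domain, seg_pot, semiring, max_len=3):
    """Brute-force semiring aggregate over all labelings of a chain of n
    variables, with a potential on every segment of at most max_len adjacent
    variables. A reference sketch only: exponential in n, unlike the
    algorithms discussed in the abstract."""
    plus, times, zero, one = semiring
    total = zero
    for x in product(domain, repeat=n):
        val = one
        # combine the potential of every segment [i, j] of bounded length
        for i in range(n):
            for j in range(i, min(i + max_len, n)):
                val = times(val, seg_pot(i, j, x[i:j + 1]))
        total = plus(total, val)
    return total
```

Under SUM_PRODUCT, seg_pot should return multiplicative factors (e.g. exponentiated negative energies) and chain_value yields the partition function; under MIN_PLUS it should return costs and chain_value yields the minimum energy.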
Associations between dog breed and clinical features of mammary epithelial neoplasia in bitches: an epidemiological study of submissions to a single diagnostic pathology centre between 2008-2021
Mammary cancer is one of the most common neoplasms of dogs, primarily of bitches. While studies have identified differing risks of mammary neoplasia across dog breeds, few have reported associations between breed and clinical features such as the number of neoplastic lesions found in an individual case or the likelihood of lesions being benign or malignant. Such epidemiological studies are essential as a foundation for exploring potential genetic drivers of mammary tumour behaviour. Here, we have examined associations between breed, age and neuter status and the odds of a diagnosis of a mammary epithelial-origin neoplastic lesion (as opposed to any other histopathological diagnosis from a biopsied lesion), as well as the odds of a bitch presenting with a single mammary lesion versus multiple lesions, and the odds that those lesions are benign or malignant. The study population consisted of 129,258 samples from bitches, including 13,401 mammary epithelial neoplasms, submitted for histological assessment to a single histopathology laboratory between 2008 and 2021.
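As a reminder of what an odds ratio means in this setting, a minimal worked example with invented numbers (not taken from the study):

```python
# Hypothetical 2x2 table: breed X vs. all other breeds, malignant vs. benign.
# The counts are invented for illustration, not from the study.
a, b = 30, 70     # breed X: malignant, benign
c, d = 200, 800   # other breeds: malignant, benign

odds_breed_x = a / b          # odds of malignancy within breed X
odds_others = c / d           # odds of malignancy in other breeds
odds_ratio = odds_breed_x / odds_others
print(round(odds_ratio, 2))   # 1.71 -> breed X has ~1.7x the odds of malignancy
```

In practice such ratios would be estimated with a regression model adjusting for age and neuter status rather than from a raw 2x2 table.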
Inference algorithms for pattern-based CRFs on sequence data
We consider conditional random fields (CRFs) with pattern-based potentials defined on a chain. In this model the energy of a string (labeling) x = x_1 ... x_n is the sum of terms over intervals [i, j], where each term is non-zero only if the substring x_i ... x_j equals a prespecified pattern w. Such CRFs can be naturally applied to many sequence tagging problems. We present efficient algorithms for the three standard inference tasks in a CRF, namely computing (i) the partition function, (ii) marginals, and (iii) the MAP labeling. Their complexities are respectively (Formula presented.), (Formula presented.) and (Formula presented.), where L is the combined length of the input patterns, (Formula presented.) is the maximum length of a pattern, and D is the input alphabet. This improves on the previous algorithms of Ye et al. (NIPS, 2009), whose complexities are respectively (Formula presented.), (Formula presented.) and (Formula presented.), where (Formula presented.) is the number of input patterns. In addition, we give an efficient algorithm for sampling and revisit the case of MAP with non-positive weights.
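For intuition, the pattern-based energy and its partition function can be evaluated by brute force. The patterns and weights below are invented, and this enumeration is exponentially slower than the algorithms the abstract describes:

```python
import math
from itertools import product

def partition_function(n, alphabet, pattern_weights):
    """Naive partition function of a pattern-based chain CRF:
    E(x) = sum of w(p) over all intervals [i, j] with x_i ... x_j == p.
    Enumerates all |D|^n labelings, so it serves only as a reference
    against which efficient algorithms could be checked."""
    Z = 0.0
    for x in product(alphabet, repeat=n):
        s = "".join(x)
        energy = 0.0
        for p, w in pattern_weights.items():
            # count (possibly overlapping) occurrences of pattern p
            for i in range(len(s) - len(p) + 1):
                if s[i:i + len(p)] == p:
                    energy += w
        Z += math.exp(-energy)
    return Z

# Illustrative patterns and weights (assumptions, not from the paper):
Z = partition_function(3, "ab", {"ab": 1.0, "aa": -0.5})
```

The efficient algorithms in the paper obtain the same quantity by dynamic programming over pattern prefixes instead of enumerating labelings.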