Patterns versus Characters in Subword-aware Neural Language Modeling
Words in some natural languages can have a composite structure, whose elements include the root (which may itself be composite) and the prefixes and suffixes that express various nuances and relations to other words. Thus, in order to build a proper word representation, one must take its internal structure into account. From a corpus of texts we extract a set of
frequent subwords, and from this set we select patterns, i.e. subwords that encapsulate information on character n-gram regularities. The selection
is made using the pattern-based Conditional Random Field model with
regularization. Further, for every word we construct a new sequence over an
alphabet of patterns. The new alphabet's symbols capture local statistical context more strongly than individual characters, and therefore they yield better representations and serve as better building blocks for word representation. In the task of subword-aware language modeling, pattern-based
models outperform character-based analogues by 2-20 perplexity points. Also, a
recurrent neural network in which a word is represented as a sum of embeddings
of its patterns is on par with a competitive and significantly more
sophisticated character-based convolutional architecture.
Comment: 10 pages
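The sum-of-pattern-embeddings word representation mentioned in the last sentence can be sketched as follows. The toy pattern vocabulary, the greedy longest-match segmentation, and the embedding size are illustrative assumptions, not the paper's actual pipeline (which selects patterns with a pattern-based CRF):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical pattern vocabulary extracted from a corpus (illustrative only).
patterns = ["un", "break", "able", "ing", "s"]
emb_dim = 8
pattern_emb = {p: rng.normal(size=emb_dim) for p in patterns}

def segment(word, vocab):
    """Greedy longest-match segmentation of a word into known patterns."""
    out, i = [], 0
    while i < len(word):
        for j in range(len(word), i, -1):
            if word[i:j] in vocab:
                out.append(word[i:j])
                i = j
                break
        else:
            i += 1  # skip characters not covered by any pattern
    return out

def word_vector(word):
    """Represent a word as the sum of the embeddings of its patterns."""
    parts = segment(word, pattern_emb)
    return sum((pattern_emb[p] for p in parts), np.zeros(emb_dim))

print(segment("unbreakable", pattern_emb))  # ['un', 'break', 'able']
```

A real model would learn the embeddings jointly with the language model rather than drawing them at random; the sketch only shows the composition step.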
Multiscale Fields of Patterns
We describe a framework for defining high-order image models that can be used
in a variety of applications. The approach involves modeling local patterns in
a multiscale representation of an image. Local properties of a coarsened image
reflect non-local properties of the original image. In the case of binary
images local properties are defined by the binary patterns observed over small
neighborhoods around each pixel. With the multiscale representation we capture
the frequency of patterns observed at different scales of resolution. This
framework leads to expressive priors that depend on a relatively small number
of parameters. For inference and learning we use an MCMC method for block
sampling with very large blocks. We evaluate the approach with two example
applications. One involves contour detection. The other involves binary
segmentation.
Comment: In NIPS 201
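The frequency-of-local-patterns idea can be sketched for binary images. The 2x2 pattern size and the majority-vote coarsening rule below are assumptions for illustration, not necessarily the paper's choices:

```python
import numpy as np

def coarsen(img):
    """Halve resolution by majority vote over non-overlapping 2x2 blocks
    (an assumed coarsening rule; the paper's may differ)."""
    h, w = img.shape
    blocks = img[:h - h % 2, :w - w % 2].reshape(h // 2, 2, w // 2, 2)
    return (blocks.sum(axis=(1, 3)) >= 2).astype(np.uint8)

def pattern_counts(img):
    """Count the 16 possible 2x2 binary patterns in an image."""
    counts = np.zeros(16, dtype=np.int64)
    h, w = img.shape
    for i in range(h - 1):
        for j in range(w - 1):
            code = (img[i, j] | img[i, j + 1] << 1
                    | img[i + 1, j] << 2 | img[i + 1, j + 1] << 3)
            counts[code] += 1
    return counts

def multiscale_counts(img, levels=3):
    """Pattern frequencies at successive scales of resolution."""
    out = []
    for _ in range(levels):
        out.append(pattern_counts(img))
        img = coarsen(img)
    return out
```

The stacked count vectors are the kind of multiscale statistics a prior over binary images could be defined on; learning the weights attached to each pattern is the part the paper's MCMC machinery handles.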
Disease Name Extraction from Clinical Text Using Conditional Random Fields
The aim of the research done in this thesis was to extract disease and disorder names from clinical texts. We utilized Conditional Random Fields (CRFs) as the main method to label diseases and disorders in clinical sentences. We used other tools, such as MetaMap and the Stanford CoreNLP toolkit, to extract crucial features. MetaMap was used to identify names of diseases/disorders that are already in the UMLS Metathesaurus. Other important features, such as lemmatized versions of words and POS tags, were extracted using Stanford CoreNLP. Further features, including the semantic types of words, were extracted directly from the UMLS Metathesaurus. We participated in Task 7 of the SemEval 2014 competition and used its provided data to train and evaluate our system. The training data contained 199 clinical texts, the development data contained 99, and the test data contained 133; these included discharge summaries and echocardiogram, radiology, and ECG reports. We obtained competitive results on the disease/disorder name extraction task. We found through an ablation study that while all features contributed, MetaMap matches, POS tags, and the previous and next words were the most effective features.
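The token-level features described here (lexicon matches, POS tags, neighbouring words) might be assembled for a linear-chain CRF roughly as follows. The feature names and the toy lexicon stand in for real MetaMap/UMLS lookups and are purely illustrative:

```python
# Toy stand-in for a UMLS/MetaMap lexicon lookup (illustrative only).
DISEASE_LEXICON = {"pneumonia", "diabetes", "hypertension"}

def token_features(tokens, pos_tags, i):
    """Feature dict for token i of a sentence, in the style of
    linear-chain CRF taggers (feature names are assumptions)."""
    word = tokens[i]
    return {
        "word.lower": word.lower(),
        "pos": pos_tags[i],
        "in.lexicon": word.lower() in DISEASE_LEXICON,  # MetaMap-style match
        "prev.word": tokens[i - 1].lower() if i > 0 else "<BOS>",
        "next.word": tokens[i + 1].lower() if i + 1 < len(tokens) else "<EOS>",
    }

tokens = ["Patient", "denies", "pneumonia", "."]
pos = ["NN", "VBZ", "NN", "."]
feats = [token_features(tokens, pos, i) for i in range(len(tokens))]
# BIO labels a trained CRF would be expected to predict for this sentence:
# ["O", "O", "B-Disease", "O"]
```

A CRF toolkit would consume one such feature dict per token together with BIO labels at training time.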
Computing a partition function of a generalized pattern-based energy over a semiring
Valued constraint satisfaction problems with ordered variables (VCSPO) are a
special case of Valued CSPs in which variables are totally ordered and soft
constraints are imposed on tuples of variables that do not violate the order.
We study a restriction of VCSPO in which soft constraints are imposed on a segment of adjacent variables and the constraint language consists of (Formula presented.)-valued characteristic functions of predicates. This kind of potential generalizes the so-called pattern-based potentials, which have been applied in many structured prediction tasks.
For a constraint language (Formula presented.) we introduce a closure operator, (Formula presented.), and give examples of constraint languages for which (Formula presented.) is small. If all predicates in (Formula presented.) are Cartesian products, we show that the minimization of a generalized pattern-based potential (or the computation of its partition function) can be carried out in (Formula presented.) time, where (Formula presented.) is the set of variables and (Formula presented.) is the domain set. If, additionally, only non-positive weights of constraints are allowed, the complexity of the minimization task drops to (Formula presented.), where (Formula presented.) is the arity of (Formula presented.). For a general language (Formula presented.) and non-positive weights, the minimization task can be carried out in (Formula presented.) time. We argue that in many natural cases (Formula presented.) is of moderate size, though in the worst case it can blow up and depend exponentially on (Formula presented.).
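Minimization and partition-function computation are two instances of one computation over different semirings, which is the semiring viewpoint this abstract takes. A brute-force reference sketch (the function name, segment-potential interface, and max_len cutoff are assumptions for illustration; the paper's algorithms are far more efficient):

```python
import math
from itertools import product

# A commutative semiring is given by (plus, times, zero, one).
SUM_PRODUCT = (lambda a, b: a + b, lambda a, b: a * b, 0.0, 1.0)  # partition function
MIN_PLUS = (min, lambda a, b: a + b, math.inf, 0.0)               # minimization

def chain_value(n, domain, seg_pot, semiring, max_len=3):
    """Brute-force semiring aggregate over all labelings of a chain of n
    variables, with a potential on every segment of at most max_len adjacent
    variables. A reference sketch only: exponential in n, unlike the
    algorithms discussed in the abstract."""
    plus, times, zero, one = semiring
    total = zero
    for x in product(domain, repeat=n):
        val = one
        # combine the potential of every segment [i, j] of bounded length
        for i in range(n):
            for j in range(i, min(i + max_len, n)):
                val = times(val, seg_pot(i, j, x[i:j + 1]))
        total = plus(total, val)
    return total
```

Under SUM_PRODUCT, seg_pot should return multiplicative factors (e.g. exponentiated negative energies) and chain_value yields the partition function; under MIN_PLUS it should return costs and chain_value yields the minimum energy.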
Associations between dog breed and clinical features of mammary epithelial neoplasia in bitches: an epidemiological study of submissions to a single diagnostic pathology centre between 2008-2021
Mammary cancer is one of the most common neoplasms of dogs, primarily of bitches. While studies have identified differing risks of mammary neoplasia across dog breeds, few have reported associations between breed and clinical features such as the number of neoplastic lesions found in an individual case or the likelihood of lesions being benign or malignant. Such epidemiological studies are essential as a foundation for exploring potential genetic drivers of mammary tumour behaviour. Here, we have examined associations between breed, age and neuter status and the odds of a diagnosis of a mammary epithelial-origin neoplastic lesion (as opposed to any other histopathological diagnosis from a biopsied lesion), as well as the odds of a bitch presenting with a single mammary lesion versus multiple lesions, and the odds that those lesions are benign or malignant. The study population consisted of 129,258 samples from bitches, including 13,401 mammary epithelial neoplasms, submitted for histological assessment to a single histopathology laboratory between 2008 and 2021.
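As a reminder of what an odds ratio means in this setting, a minimal worked example with invented numbers (not taken from the study):

```python
# Hypothetical 2x2 table: breed X vs. all other breeds, malignant vs. benign.
# The counts are invented for illustration, not from the study.
a, b = 30, 70     # breed X: malignant, benign
c, d = 200, 800   # other breeds: malignant, benign

odds_breed_x = a / b          # odds of malignancy within breed X
odds_others = c / d           # odds of malignancy in other breeds
odds_ratio = odds_breed_x / odds_others
print(round(odds_ratio, 2))   # 1.71 -> breed X has ~1.7x the odds of malignancy
```

In practice such ratios would be estimated with a regression model adjusting for age and neuter status rather than from a raw 2x2 table.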
Inference algorithms for pattern-based CRFs on sequence data
We consider conditional random fields (CRFs) with pattern-based potentials defined on a chain. In this model the energy of a string (labeling) x = x_1 ... x_n is the sum of terms over intervals [i, j], where each term is non-zero only if the substring x_i ... x_j equals a prespecified pattern w. Such CRFs can be naturally applied to many sequence tagging problems. We present efficient algorithms for the three standard inference tasks in a CRF, namely computing (i) the partition function, (ii) marginals, and (iii) the MAP labeling. Their complexities are respectively (Formula presented.), (Formula presented.) and (Formula presented.), where L is the combined length of the input patterns, (Formula presented.) is the maximum length of a pattern, and D is the input alphabet. This improves on the previous algorithms of Ye et al. (NIPS, 2009), whose complexities are respectively (Formula presented.), (Formula presented.) and (Formula presented.), where (Formula presented.) is the number of input patterns. In addition, we give an efficient algorithm for sampling and revisit the case of MAP with non-positive weights.
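For intuition, the pattern-based energy and its partition function can be evaluated by brute force. The patterns and weights below are invented, and this enumeration is exponentially slower than the algorithms the abstract describes:

```python
import math
from itertools import product

def partition_function(n, alphabet, pattern_weights):
    """Naive partition function of a pattern-based chain CRF:
    E(x) = sum of w(p) over all intervals [i, j] with x_i ... x_j == p.
    Enumerates all |D|^n labelings, so it serves only as a reference
    against which efficient algorithms could be checked."""
    Z = 0.0
    for x in product(alphabet, repeat=n):
        s = "".join(x)
        energy = 0.0
        for p, w in pattern_weights.items():
            # count (possibly overlapping) occurrences of pattern p
            for i in range(len(s) - len(p) + 1):
                if s[i:i + len(p)] == p:
                    energy += w
        Z += math.exp(-energy)
    return Z

# Illustrative patterns and weights (assumptions, not from the paper):
Z = partition_function(3, "ab", {"ab": 1.0, "aa": -0.5})
```

The efficient algorithms in the paper obtain the same quantity by dynamic programming over pattern prefixes instead of enumerating labelings.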