96,600 research outputs found

Annotations on Complex Patterns

Author: Bottoni Paolo
Parisi Presicce Francesco
Publication venue: European Association of Software Science and Technology
Publication date: 01/01/2013

Modelers of systems often want to isolate specific parts of a model to be treated as a whole, for example to protect them from accidental changes, to constrain them to specific policies, or to identify them as instances of a general pattern. In particular, we study here the case in which these parts are annotated with information from some external model. In a previous paper, we have discussed the use of annotations on individual model elements, represented as nodes in a graph; in this paper we model annotation processes involving also annotations themselves or whole configurations. To address the latter problem, we enrich the notion of graph by introducing a third sort of elements, called boxes, encompassing subgraphs, and associate them with annotations, too. We show how annotations on boxes support the modeling of complex policies, adapting the previous constructions for notation-aware rewriting to include boxes. The paper illustrates these concepts on the concrete modeling scenario of an organisation with security and temporal annotations.

Electronic Communications of the EASST (European Association of Software Science and Technology)

Archivio della ricerca- Università di Roma La Sapienza

Using regular expressions to express bowing patterns for string players

Author: Hall C.V.
O'Donnell J.T.
Publication date: 01/01/2009

The study of bowing is critically important for string players. Traditional bowing annotations are a valuable part of orchestral and individual documentation, but they do not help the performer to search a piece for other passages that should be bowed the same way, or to identify alternative bowing styles. We introduce a notation based on regular expressions that describes patterns of notes in the music, as well as the bowing to be applied to the pattern. These expressions support complex bowings, and not just single annotations without musical context. The notation is simpler than general tools for regular expressions used in some software, and is suitable for use by students and musicians. We have developed a music editor that implements the notation and edits documents in Lilypond. The approach has been evaluated by experimenting with the editor on six violin sonatas by Mozart. The experiments demonstrate that the regular expression notation is successful at finding passages and inserting the bowings; that the patterns occur a number of times; and the bowings can be inserted automatically and consistently
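
As a rough illustration of the idea in this abstract, the sketch below searches an encoded passage for a rhythmic pattern and attaches a bowing to each match. The one-letter duration encoding, the function name `find_bowing_spans`, and the bowing label are assumptions made for this example; they are not the authors' notation or editor.

```python
import re

# Hypothetical encoding (not the authors' notation): one letter per note,
# 'q' = quarter note, 'e' = eighth note, 's' = sixteenth note.
def find_bowing_spans(durations, pattern, bowing):
    """Return (start, end, bowing) for every non-overlapping match of pattern."""
    return [(m.start(), m.end(), bowing) for m in re.finditer(pattern, durations)]

passage = "qeessqeeq"
# A quarter note followed by two eighths; the bowing label is illustrative only.
spans = find_bowing_spans(passage, r"qee", "down-up-up")
```

Each returned span gives the note positions where the same bowing could be inserted, which is the kind of consistent search-and-annotate step the abstract describes.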

University of Michigan Library Repository

Enlighten

Identification of the expressome by machine learning on omics data.

Author: Briggs Steven P
Noshay Jaclyn
Sartor Ryan C
Springer Nathan M
Publication venue: eScholarship, University of California
Publication date: 01/09/2019

Accurate annotation of plant genomes remains complex due to the presence of many pseudogenes arising from whole-genome duplication-generated redundancy or the capture and movement of gene fragments by transposable elements. Machine learning on genome-wide epigenetic marks, informed by transcriptomic and proteomic training data, could be used to improve annotations through classification of all putative protein-coding genes as either constitutively silent or able to be expressed. Expressed genes were subclassified as able to express both mRNAs and proteins or only RNAs, and CG gene body methylation was associated only with the former subclass. More than 60,000 protein-coding genes have been annotated in the reference genome of maize inbred B73. About two-thirds of these genes are transcribed and are designated the filtered gene set (FGS). Classification of genes by our trained random forest algorithm was accurate and relied only on histone modifications or DNA methylation patterns within the gene body; promoter methylation was unimportant. Other inbred lines are known to transcribe significantly different sets of genes, indicating that the FGS is specific to B73. We accurately classified the sets of transcribed genes in additional inbred lines, arising from inbred-specific DNA methylation patterns. This approach highlights the potential of using chromatin information to improve annotations of functional genes

eScholarship - University of California

On mining complex sequential data by means of FCA and pattern structures

Author: Buzmakov Aleksey
Egho Elias
Jay Nicolas
Kuznetsov Sergei O.
Napoli Amedeo
Raïssi Chedy
Publication date: 09/04/2015

Nowadays data sets are available in very complex and heterogeneous ways. Mining of such data collections is essential to support many real-world applications ranging from healthcare to marketing. In this work, we focus on the analysis of "complex" sequential data by means of interesting sequential patterns. We approach the problem using the elegant mathematical framework of Formal Concept Analysis (FCA) and its extension based on "pattern structures". Pattern structures are used for mining complex data (such as sequences or graphs) and are based on a subsumption operation, which in our case is defined with respect to the partial order on sequences. We show how pattern structures along with projections (i.e., a data reduction of sequential structures) are able to enumerate more meaningful patterns and increase the computing efficiency of the approach. Finally, we show the applicability of the presented method for discovering and analyzing interesting patient patterns from a French healthcare data set on cancer. The quantitative and qualitative results (with annotations and analysis from a physician) are reported in this use case, which is the main motivation for this work. Keywords: data mining; formal concept analysis; pattern structures; projections; sequences; sequential data. Comment: An accepted publication in International Journal of General Systems. The paper is created in the wake of the conference on Concept Lattice and their Applications (CLA'2013). 27 pages, 9 figures, 3 tables
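
To make the notion of a partial order on sequences concrete, here is a minimal sketch that assumes the order is the plain subsequence relation over lists of items. The paper's actual subsumption operation may be richer (for example, over sequences of itemsets), and the function name `is_subsequence` is chosen only for this example.

```python
def is_subsequence(small, big):
    """True if the items of small occur in big in the same order, gaps allowed."""
    remaining = iter(big)
    # `item in remaining` advances the iterator, so matches must appear in order.
    return all(item in remaining for item in small)

# Illustrative event sequences; the labels are made up for the example.
visits = ["consultation", "surgery", "chemotherapy"]
in_order = is_subsequence(["consultation", "chemotherapy"], visits)
out_of_order = is_subsequence(["chemotherapy", "consultation"], visits)
```

Under this assumption, a pattern subsumes every sequence that contains it as a subsequence, which is what lets shared patterns be ordered from general to specific.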

arXiv.org e-Print Archive

INRIA a CCSD electronic archive server

Conditions, constraints and contracts: on the use of annotations for policy modeling.

Author: Bottoni Paolo Gaspare
Navigli Roberto
Parisi Presicce Francesco
Publication venue: Technical University of Aachen
Publication date: 01/01/2015

Organisational policies express constraints on generation and processing of resources. However, application domains rely on transformation processes, which are in principle orthogonal to policy specifications, and domain rules and policies may evolve in a non-synchronised way. In previous papers, we have proposed annotations as a flexible way to model aspects of some policy, and showed how they could be used to impose constraints on domain configurations, how to derive application conditions on transformations, and how to annotate complex patterns. We extend the approach by: allowing domain model elements to be annotated with collections of elements, which can be collectively applied to individual resources or collections thereof; proposing an original construction to solve the problem of annotations remaining orphan, when annotated resources are consumed; introducing a notion of contract, by which a policy imposes additional pre-conditions and post-conditions on rules for deriving new resources. We discuss a concrete case study of linguistic resources, annotated with information on the licenses under which they can be used. The annotation framework allows forms of reasoning such as identifying conflicts among licenses, enforcing the presence of licenses, or ruling out some modifications of a license configuration.

Archivio della ricerca- Università di Roma La Sapienza

Electronic Communications of the EASST (European Association of Software Science and Technology)

Boosting Drug Named Entity Recognition using an Aggregate Classifier

Author: Ananiadou Sophia
Dowsey Andrew W.
Korkontzelos Ioannis
Piliouras Dimitrios
Publication venue: Elsevier BV
Publication date: 17/06/2015

Objective: Drug named entity recognition (NER) is a critical step for complex biomedical NLP tasks such as the extraction of pharmacogenomic, pharmacodynamic and pharmacokinetic parameters. Large quantities of high-quality training data are almost always a prerequisite for employing supervised machine-learning techniques to achieve high classification performance. However, the human labour needed to produce and maintain such resources is a significant limitation. In this study, we improve the performance of drug NER without relying exclusively on manual annotations. Methods: We perform drug NER using either a small gold-standard corpus (120 abstracts) or no corpus at all. In our approach, we develop a voting system to combine a number of heterogeneous models, based on dictionary knowledge, gold-standard corpora and silver annotations, to enhance performance. To improve recall, we employed genetic programming to evolve 11 regular-expression patterns that capture common drug suffixes and used them as an extra means for recognition. Materials: Our approach uses a dictionary of drug names, i.e. DrugBank, a small manually annotated corpus, i.e. the pharmacokinetic corpus, and a part of the UKPMC database, as raw biomedical text. Gold-standard and silver annotated data are used to train maximum entropy and multinomial logistic regression classifiers. Results: Aggregating drug NER methods, based on gold-standard annotations, dictionary knowledge and patterns, improved on the performance of models trained on gold-standard annotations only, achieving a maximum F-score of 95%. In addition, combining models trained on silver annotations, dictionary knowledge and patterns is shown to achieve performance comparable to that of models trained exclusively on gold-standard data. The main reason appears to be the morphological similarities shared among drug names. Conclusion: We conclude that gold-standard data are not a hard requirement for drug NER. Combining heterogeneous models built on dictionary knowledge can achieve classification performance comparable to that of the best performing model trained on gold-standard annotations.
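
The voting and suffix-pattern ideas in this abstract can be sketched as follows. This is an illustration under stated assumptions: the three suffixes are hand-picked common drug-name endings, not the 11 patterns evolved in the study, and `suffix_tagger` and `majority_vote` are hypothetical names rather than the authors' components.

```python
import re
from collections import Counter

# Hand-picked suffixes for illustration only (not the evolved patterns).
SUFFIX_RE = re.compile(r"\w+(?:cillin|azole|statin)", re.IGNORECASE)

def suffix_tagger(token):
    """Label a token DRUG if it ends in one of the illustrative suffixes."""
    return "DRUG" if SUFFIX_RE.fullmatch(token) else "O"

def majority_vote(labels):
    """Return the most frequent label among the individual model outputs."""
    return Counter(labels).most_common(1)[0][0]

# Pretend a dictionary model and a classifier gave these labels for one token.
token = "amoxicillin"
votes = ["O", "DRUG", suffix_tagger(token)]
final_label = majority_vote(votes)
```

A real aggregate would also need a tie-breaking rule and per-model weighting, which this sketch leaves out.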

Elsevier - Publisher Connector

Edge Hill University Research Information Repository

The University of Manchester - Institutional Repository

Explore Bristol Research

Regular expressions as violin bowing patterns

Author: Cordelia Hall
Dovey M.
Eisenberg A.
John T. O'Donnell
Ng K.
Winget M.
Publication venue: MIT Press - Journals
Publication date: 01/01/2012

String players spend a significant amount of practice time creating and learning bowings. These may be indicated in the music using up-bow and down-bow symbols, but those traditional notations do not capture the complex bowing patterns that are latent within the music. Regular expressions, a mathematical notation for a simple class of formal languages, can describe precisely the bowing patterns that commonly arise in string music. A software tool based on regular expressions enables performers to search for passages that can be handled with similar bowings, and to edit them consistently. A computer-based music editor incorporating bowing patterns has been implemented, using Lilypond to typeset the music. Our approach has been evaluated by using the editor to study ten movements from six violin sonatas by W. A. Mozart. Our experience shows that the editor is successful at finding passages and inserting bowings; that relatively complex patterns occur a number of times; and that the bowings can be inserted automatically and consistently

Crossref

Enlighten

Object Discovery From a Single Unlabeled Image by Mining Frequent Itemset With Multi-scale Features

Author: Guan Qingji
Huang Yaping
Ling Haibin
Pu Mengyang
Zhang Jian
Zhang Runsheng
Zou Qi
Publication date: 08/08/2020

The goal of our work is to discover dominant objects in a very general setting where only a single unlabeled image is given. This is far more challenging than typical co-localization or weakly-supervised localization tasks. To tackle this problem, we propose a simple but effective pattern mining-based method, called Object Location Mining (OLM), which exploits the advantages of data mining and feature representation of pre-trained convolutional neural networks (CNNs). Specifically, we first convert the feature maps from a pre-trained CNN model into a set of transactions, and then discover frequent patterns from the transaction database through pattern mining techniques. We observe that those discovered patterns, i.e., co-occurrence highlighted regions, typically hold appearance and spatial consistency. Motivated by this observation, we can easily discover and localize possible objects by merging relevant meaningful patterns. Extensive experiments on a variety of benchmarks demonstrate that OLM achieves competitive localization performance compared with the state-of-the-art methods. We also evaluate our approach against unsupervised saliency detection methods and achieve competitive results on seven benchmark datasets. Moreover, we conduct experiments on fine-grained classification to show that our proposed method can locate the entire object and its parts accurately, which can significantly improve the classification results.
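
The conversion from feature maps to transactions can be pictured with a toy example. The sketch below assumes that each spatial position becomes one transaction holding the indices of channels whose activation exceeds a threshold, and it counts frequent channel pairs only. The threshold, the pair-only mining, and the function names are assumptions for illustration, not the OLM implementation.

```python
from collections import Counter
from itertools import combinations

def to_transactions(feature_maps, threshold):
    """feature_maps[c][y][x] -> one set of active channel indices per position."""
    channels = len(feature_maps)
    height, width = len(feature_maps[0]), len(feature_maps[0][0])
    return [
        frozenset(c for c in range(channels) if feature_maps[c][y][x] > threshold)
        for y in range(height)
        for x in range(width)
    ]

def frequent_pairs(transactions, min_support):
    """Count channel pairs that co-occur in at least min_support transactions."""
    counts = Counter()
    for items in transactions:
        counts.update(combinations(sorted(items), 2))
    return {pair: n for pair, n in counts.items() if n >= min_support}

# Toy 3-channel, 2x2 activation maps (made-up numbers).
maps = [
    [[1.0, 1.0], [0.0, 1.0]],
    [[1.0, 1.0], [0.0, 0.0]],
    [[0.0, 0.0], [1.0, 0.0]],
]
transactions = to_transactions(maps, threshold=0.5)
pairs = frequent_pairs(transactions, min_support=2)
```

Positions that share a frequent pattern would then be the candidates to merge into an object region.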

arXiv.org e-Print Archive

Combination of DROOL rules and Protégé knowledge bases in the ONTO-H annotation tool

Author: Benjamins R.
Blázquez Juan
Contreras Jesús
Corcho Oscar
Dodero J.M.
García-Silva A.
Millán R
Navas E.
Niño M.
Rodríguez J.
Wert C
Publication venue: Facultad de Informática (UPM)
Publication date: 01/07/2005

ONTO-H is a semi-automatic collaborative tool for the semantic annotation of documents, built as a Protégé 3.0 tab plug-in. Among its multiple functionalities aimed at easing the document annotation process, ONTO-H uses a rule-based system to create cascading annotations from a single drag-and-drop operation from a part of a document into an already existing concept or instance of the domain ontology being used for annotation. It also supports the detection of name conflicts and instance duplications in the creation of the annotations. The rule system runs on top of the open source rule engine DROOLS and is connected to the domain ontology used for annotation by means of an ad-hoc programmed Java proxy.

Archivo Digital UPM

Use of Subimages in Fish Species Identification: A Qualitative Study

Author: Delcambre Lois
Fox Edward
Hallerman Eric
Murthy Uma
Pérez-Quiñones Manuel
Torres Ricardo
Tzy Li Lin
Publication date: 01/03/2011

Many scholarly tasks involve working with subdocuments, or contextualized fine-grain information, i.e., with information that is part of some larger unit. A digital library (DL) facilitates management, access, retrieval, and use of collections of data and metadata through services. However, most DLs do not provide infrastructure or services to support working with subdocuments. Superimposed information (SI) refers to new information that is created to reference subdocuments in existing information resources. We combine this idea of SI with traditional DL services, to define and develop a DL with SI (SI-DL). We explored the use of subimages and evaluated the use of a prototype SI-DL (SuperIDR) in fish species identification, a scholarly task that involves working with subimages. The contexts and strategies of working with subimages in SuperIDR suggest new and enhanced support (SI-DL services) for scholarly tasks that involve working with subimages, including new ways of querying and searching for subimages and associated information. The main contributions of our work are the insights gained from these findings of use of subimages and of SuperIDR (a prototype SI-DL), which lead to recommendations for the design of digital libraries with superimposed information.

Computer Science Technical Reports @Virginia Tech