
    Annotations on Complex Patterns

    Modelers of systems often want to isolate specific parts of a model to be treated as a whole, for example to protect them from accidental changes, to constrain them to specific policies, or to identify them as instances of a general pattern. In particular, we study here the case in which these parts are annotated with information from some external model. In a previous paper, we discussed the use of annotations on individual model elements, represented as nodes in a graph; in this paper we model annotation processes that also involve annotations themselves or whole configurations. To address the latter problem, we enrich the notion of graph by introducing a third sort of element, called boxes, encompassing subgraphs, and associate them with annotations, too. We show how annotations on boxes support the modeling of complex policies, adapting the previous constructions for notation-aware rewriting to include boxes. The paper illustrates these concepts on the concrete modeling scenario of an organisation with security and temporal annotations.

    Using regular expressions to express bowing patterns for string players

    The study of bowing is critically important for string players. Traditional bowing annotations are a valuable part of orchestral and individual documentation, but they do not help the performer to search a piece for other passages that should be bowed the same way, or to identify alternative bowing styles. We introduce a notation based on regular expressions that describes patterns of notes in the music, as well as the bowing to be applied to the pattern. These expressions support complex bowings, and not just single annotations without musical context. The notation is simpler than general tools for regular expressions used in some software, and is suitable for use by students and musicians. We have developed a music editor that implements the notation and edits documents in Lilypond. The approach has been evaluated by experimenting with the editor on six violin sonatas by Mozart. The experiments demonstrate that the regular expression notation is successful at finding passages and inserting the bowings; that the patterns occur a number of times; and that the bowings can be inserted automatically and consistently.

    On mining complex sequential data by means of FCA and pattern structures

    Nowadays data sets are available in very complex and heterogeneous ways. Mining of such data collections is essential to support many real-world applications ranging from healthcare to marketing. In this work, we focus on the analysis of "complex" sequential data by means of interesting sequential patterns. We approach the problem using the elegant mathematical framework of Formal Concept Analysis (FCA) and its extension based on "pattern structures". Pattern structures are used for mining complex data (such as sequences or graphs) and are based on a subsumption operation, which in our case is defined with respect to the partial order on sequences. We show how pattern structures, along with projections (i.e., a data reduction of sequential structures), are able to enumerate more meaningful patterns and increase the computing efficiency of the approach. Finally, we show the applicability of the presented method for discovering and analyzing interesting patient patterns from a French healthcare data set on cancer. The quantitative and qualitative results (with annotations and analysis from a physician) are reported in this use case, which is the main motivation for this work. Keywords: data mining; formal concept analysis; pattern structures; projections; sequences; sequential data. Comment: An accepted publication in International Journal of General Systems. The paper was created in the wake of the conference on Concept Lattices and Their Applications (CLA'2013). 27 pages, 9 figures, 3 tables.

    Conditions, constraints and contracts: on the use of annotations for policy modeling.

    Organisational policies express constraints on generation and processing of resources. However, application domains rely on transformation processes, which are in principle orthogonal to policy specifications, and domain rules and policies may evolve in a non-synchronised way. In previous papers, we have proposed annotations as a flexible way to model aspects of some policy, and showed how they could be used to impose constraints on domain configurations, how to derive application conditions on transformations, and how to annotate complex patterns. We extend the approach by: allowing domain model elements to be annotated with collections of elements, which can be collectively applied to individual resources or collections thereof; proposing an original construction to solve the problem of annotations remaining orphan when annotated resources are consumed; introducing a notion of contract, by which a policy imposes additional pre-conditions and post-conditions on rules for deriving new resources. We discuss a concrete case study of linguistic resources, annotated with information on the licenses under which they can be used. The annotation framework allows forms of reasoning such as identifying conflicts among licenses, enforcing the presence of licenses, or ruling out some modifications of a license configuration.

    Boosting Drug Named Entity Recognition using an Aggregate Classifier

    Objective: Drug named entity recognition (NER) is a critical step for complex biomedical NLP tasks such as the extraction of pharmacogenomic, pharmacodynamic and pharmacokinetic parameters. Large quantities of high quality training data are almost always a prerequisite for employing supervised machine-learning techniques to achieve high classification performance. However, the human labour needed to produce and maintain such resources is a significant limitation. In this study, we improve the performance of drug NER without relying exclusively on manual annotations. Methods: We perform drug NER using either a small gold-standard corpus (120 abstracts) or no corpus at all. In our approach, we develop a voting system to combine a number of heterogeneous models, based on dictionary knowledge, gold-standard corpora and silver annotations, to enhance performance. To improve recall, we employed genetic programming to evolve 11 regular-expression patterns that capture common drug suffixes and used them as an extra means for recognition. Materials: Our approach uses a dictionary of drug names, i.e. DrugBank, a small manually annotated corpus, i.e. the pharmacokinetic corpus, and a part of the UKPMC database, as raw biomedical text. Gold-standard and silver annotated data are used to train maximum entropy and multinomial logistic regression classifiers. Results: Aggregating drug NER methods based on gold-standard annotations, dictionary knowledge and patterns improved on the performance of models trained on gold-standard annotations only, achieving a maximum F-score of 95%. In addition, combining models trained on silver annotations, dictionary knowledge and patterns is shown to achieve performance comparable to that of models trained exclusively on gold-standard data. The main reason appears to be the morphological similarities shared among drug names. Conclusion: We conclude that gold-standard data are not a hard requirement for drug NER. Combining heterogeneous models built on dictionary knowledge can achieve classification performance comparable to that of the best performing model trained on gold-standard annotations.
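
    The voting idea in this abstract can be sketched roughly as follows. This is only an illustration, not the authors' system: the suffix list, the function names (`dictionary_tagger`, `suffix_tagger`, `is_drug`) and the `min_votes` threshold are all assumptions, and the 11 patterns evolved by genetic programming in the paper are not reproduced here.

```python
import re

# Hypothetical suffix pattern; a stand-in for the evolved patterns in the paper.
SUFFIX_PATTERN = re.compile(r"\w+(?:mycin|cillin|azole|statin)$", re.IGNORECASE)


def dictionary_tagger(token, lexicon):
    """Vote 'drug' when the token appears in a drug-name lexicon."""
    return token.lower() in lexicon


def suffix_tagger(token):
    """Vote 'drug' when the token ends in one of the assumed drug suffixes."""
    return SUFFIX_PATTERN.match(token) is not None


def is_drug(token, taggers, min_votes=1):
    """Aggregate heterogeneous taggers: accept when enough of them vote 'drug'."""
    votes = sum(1 for tagger in taggers if tagger(token))
    return votes >= min_votes


# Toy lexicon standing in for a real dictionary such as DrugBank.
lexicon = {"warfarin", "aspirin"}
taggers = [lambda t: dictionary_tagger(t, lexicon), suffix_tagger]
print(is_drug("erythromycin", taggers))  # accepted by the suffix tagger alone
```

    Setting `min_votes=1` favours recall, since any single tagger is enough; raising it trades recall for precision.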

    Regular expressions as violin bowing patterns

    String players spend a significant amount of practice time creating and learning bowings. These may be indicated in the music using up-bow and down-bow symbols, but those traditional notations do not capture the complex bowing patterns that are latent within the music. Regular expressions, a mathematical notation for a simple class of formal languages, can describe precisely the bowing patterns that commonly arise in string music. A software tool based on regular expressions enables performers to search for passages that can be handled with similar bowings, and to edit them consistently. A computer-based music editor incorporating bowing patterns has been implemented, using Lilypond to typeset the music. Our approach has been evaluated by using the editor to study ten movements from six violin sonatas by W. A. Mozart. Our experience shows that the editor is successful at finding passages and inserting bowings; that relatively complex patterns occur a number of times; and that the bowings can be inserted automatically and consistently.
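
    As a rough illustration of searching a passage with a regular expression and attaching bowings to each match, consider the sketch below. The encoding is hypothetical and is not the notation defined in the paper: each note is reduced to one duration letter (`q` quarter, `e` eighth, `s` sixteenth), and `apply_bowing` is an invented helper name.

```python
import re


def apply_bowing(durations, pattern, bowing):
    """Return bow marks (or None) aligned note-by-note with `durations`.

    `durations` is a string with one letter per note, `pattern` a regular
    expression over those letters, and `bowing` the marks to place on the
    successive notes of every match.
    """
    marks = [None] * len(durations)
    for match in re.finditer(pattern, durations):
        for index, mark in zip(range(match.start(), match.end()), bowing):
            marks[index] = mark
    return marks


# An eighth followed by two sixteenths, bowed down-up-up wherever it occurs.
passage = "qessessq"
marks = apply_bowing(passage, r"ess", ["down", "up", "up"])
print(marks)
```

    Because every occurrence of the pattern receives the same marks, the bowing is applied consistently across the passage, which is the property the abstract emphasises.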

    Object Discovery From a Single Unlabeled Image by Mining Frequent Itemset With Multi-scale Features

    The goal of our work is to discover dominant objects in a very general setting where only a single unlabeled image is given. This is far more challenging than typical co-localization or weakly-supervised localization tasks. To tackle this problem, we propose a simple but effective pattern mining-based method, called Object Location Mining (OLM), which exploits the advantages of data mining and the feature representation of pre-trained convolutional neural networks (CNNs). Specifically, we first convert the feature maps from a pre-trained CNN model into a set of transactions, and then discover frequent patterns from the transaction database through pattern mining techniques. We observe that those discovered patterns, i.e., co-occurrence highlighted regions, typically hold appearance and spatial consistency. Motivated by this observation, we can easily discover and localize possible objects by merging relevant meaningful patterns. Extensive experiments on a variety of benchmarks demonstrate that OLM achieves competitive localization performance compared with the state-of-the-art methods. We also evaluate our approach against unsupervised saliency detection methods and achieve competitive results on seven benchmark datasets. Moreover, we conduct experiments on fine-grained classification to show that our proposed method can locate the entire object and its parts accurately, which can significantly improve the classification results.
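
    The pipeline outlined in this abstract (feature maps to transactions, frequent-pattern mining, merging of supporting regions) might be sketched as below. The transaction scheme is an assumption, not the construction used in OLM: here each spatial position becomes one transaction holding the channels whose activation exceeds a threshold, only item pairs are mined, and all function names are invented for illustration.

```python
from collections import Counter
from itertools import combinations


def to_transactions(feature_maps, threshold):
    """Map feature_maps[c][y][x] to {(y, x): set of channels active there}."""
    transactions = {}
    for channel, fmap in enumerate(feature_maps):
        for y, row in enumerate(fmap):
            for x, value in enumerate(row):
                if value > threshold:
                    transactions.setdefault((y, x), set()).add(channel)
    return transactions


def frequent_pairs(transactions, min_support):
    """Return channel pairs that co-occur in at least `min_support` positions."""
    counts = Counter()
    for items in transactions.values():
        for pair in combinations(sorted(items), 2):
            counts[pair] += 1
    return {pair for pair, count in counts.items() if count >= min_support}


def supporting_positions(transactions, patterns):
    """Merge patterns into the sorted list of positions supporting any of them."""
    return sorted(
        position
        for position, items in transactions.items()
        if any(set(pattern) <= items for pattern in patterns)
    )


# Three toy 2x2 "feature maps" standing in for CNN activations.
maps = [
    [[1.0, 1.0], [0.0, 0.0]],
    [[1.0, 1.0], [0.0, 1.0]],
    [[0.0, 0.0], [0.0, 1.0]],
]
transactions = to_transactions(maps, threshold=0.5)
patterns = frequent_pairs(transactions, min_support=2)
print(supporting_positions(transactions, patterns))
```

    A bounding box around the returned positions would give a candidate object location; a practical system would use a proper frequent-itemset miner rather than this pairwise count.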

    Combination of DROOL rules and Protégé knowledge bases in the ONTO-H annotation tool

    ONTO-H is a semi-automatic collaborative tool for the semantic annotation of documents, built as a Protégé 3.0 tab plug-in. Among its multiple functionalities aimed at easing the document annotation process, ONTO-H uses a rule-based system to create cascading annotations from a single drag-and-drop operation from a part of a document into an already existing concept or instance of the domain ontology being used for annotation. It also supports the detection of name conflicts and instance duplications in the creation of the annotations. The rule system runs on top of the open source rule engine DROOLS and is connected to the domain ontology used for annotation by means of an ad-hoc programmed Java proxy.

    Use of Subimages in Fish Species Identification: A Qualitative Study

    Many scholarly tasks involve working with subdocuments, or contextualized fine-grain information, i.e., with information that is part of some larger unit. A digital library (DL) facilitates management, access, retrieval, and use of collections of data and metadata through services. However, most DLs do not provide infrastructure or services to support working with subdocuments. Superimposed information (SI) refers to new information that is created to reference subdocuments in existing information resources. We combine this idea of SI with traditional DL services, to define and develop a DL with SI (SI-DL). We explored the use of subimages and evaluated the use of a prototype SI-DL (SuperIDR) in fish species identification, a scholarly task that involves working with subimages. The contexts and strategies of working with subimages in SuperIDR suggest new and enhanced support (SI-DL services) for scholarly tasks that involve working with subimages, including new ways of querying and searching for subimages and associated information. The main contributions of our work are the insights gained from these findings of use of subimages and of SuperIDR (a prototype SI-DL), which lead to recommendations for the design of digital libraries with superimposed information.