2,267 research outputs found

    Effective pattern discovery for text mining

    Get PDF
    Many data mining techniques have been proposed for mining useful patterns in text documents. However, how to effectively use and update discovered patterns is still an open research issue, especially in the domain of text mining. Since most existing text mining methods adopted term-based approaches, they all suffer from the problems of polysemy and synonymy. Over the years, people have often held the hypothesis that pattern (or phrase) based approaches should perform better than the term-based ones, but many experiments did not support this hypothesis. This paper presents an innovative technique, effective pattern discovery which includes the processes of pattern deploying and pattern evolving, to improve the effectiveness of using and updating discovered patterns for finding relevant and interesting information. Substantial experiments on RCV1 data collection and TREC topics demonstrate that the proposed solution achieves encouraging performance

    Mining Frequent Generalized Patterns for Web Personalization in the presence of Taxonomies

    Get PDF
    The Web is a continuously evolving environment, since its content is updated on a regular basis. As a result, the traditional usage-based approach to generate recommendations that takes as input the navigation paths recorded on the Web page level, is not as effective. Moreover, most of the content available online is either explicitly or implicitly characterized by a set of categories organized in a taxonomy, allowing the page-level navigation patterns to be generalized to a higher, aggregate level. In this direction, the authors present the Frequent Generalized Pattern (FGP) algorithm. FGP takes as input the transaction data and a hierarchy of categories and produces generalized association rules that contain transaction items and/or item categories. The results can be used to generate association rules and subsequently recommendations for the users. The algorithm can be applied to the log files of a typical Web site; however, it can be more helpful in a Web 2.0 application, such as a feed aggregator or a digital library mediator, where content is semantically annotated and the taxonomic nature is more complex, requiring us to extend FGP in a version called FGP+. The authors experimentally evaluate both algorithms using Web log data collected from a newspaper Web site

    Multi-representation Ontology in the Context of Enterprise Information Systems

    Get PDF
    International audienceIn the last decade, ontologies as shared common vocabulary played a major role in many AI applications and informationintegration for heterogeneous, distributed systems. The problems of integrating and developing information systems anddatabases in heterogeneous, distributed environment have been translated in the technical perspectives as system’sinteroperability. Ontologies, however, are foreseen to play a key role in resolving partially the semantic conflicts anddifferences that exist among systems. Domain ontologies, however, are constructed by capturing a set of concepts and theirlinks according to various criteria such as the abstraction paradigm, the granularity scale, interest of user communities, andthe perception of the ontology developer. Thus, different applications of the same domain end up having severalrepresentations of the same real world phenomenon. Multi-representation ontology is an ontology (or ontologies) thatcharacterizes ontological concept by a variable set of properties (static and dynamic) or attributes in several contexts and/ orin several scales of granularity. This paper introduces the formalism used for defining the paradigm of multi-representationontology and shows the manifestation of this paradigm with Enterprise Information Systems

    A cognitive taxonomy of medical errors

    Get PDF
    AbstractObjective. Propose a cognitive taxonomy of medical errors at the level of individuals and their interactions with technology.Design. Use cognitive theories of human error and human action to develop the theoretical foundations of the taxonomy, develop the structure of the taxonomy, populate the taxonomy with examples of medical error cases, identify cognitive mechanisms for each category of medical error under the taxonomy, and apply the taxonomy to practical problems.Measurements. Four criteria were used to evaluate the cognitive taxonomy. The taxonomy should be able (1) to categorize major types of errors at the individual level along cognitive dimensions, (2) to associate each type of error with a specific underlying cognitive mechanism, (3) to describe how and explain why a specific error occurs, and (4) to generate intervention strategies for each type of error.Results. The proposed cognitive taxonomy largely satisfies the four criteria at a theoretical and conceptual level.Conclusion. Theoretically, the proposed cognitive taxonomy provides a method to systematically categorize medical errors at the individual level along cognitive dimensions, leads to a better understanding of the underlying cognitive mechanisms of medical errors, and provides a framework that can guide future studies on medical errors. Practically, it provides guidelines for the development of cognitive interventions to decrease medical errors and foundation for the development of medical error reporting system that not only categorizes errors but also identifies problems and helps to generate solutions. To validate this model empirically, we will next be performing systematic experimental studies

    A survey of self organisation in future cellular networks

    Get PDF
    This article surveys the literature over the period of the last decade on the emerging field of self organisation as applied to wireless cellular communication networks. Self organisation has been extensively studied and applied in adhoc networks, wireless sensor networks and autonomic computer networks; however in the context of wireless cellular networks, this is the first attempt to put in perspective the various efforts in form of a tutorial/survey. We provide a comprehensive survey of the existing literature, projects and standards in self organising cellular networks. Additionally, we also aim to present a clear understanding of this active research area, identifying a clear taxonomy and guidelines for design of self organising mechanisms. We compare strength and weakness of existing solutions and highlight the key research areas for further development. This paper serves as a guide and a starting point for anyone willing to delve into research on self organisation in wireless cellular communication networks

    Putting the Semantics into Semantic Versioning

    Full text link
    The long-standing aspiration for software reuse has made astonishing strides in the past few years. Many modern software development ecosystems now come with rich sets of publicly-available components contributed by the community. Downstream developers can leverage these upstream components, boosting their productivity. However, components evolve at their own pace. This imposes obligations on and yields benefits for downstream developers, especially since changes can be breaking, requiring additional downstream work to adapt to. Upgrading too late leaves downstream vulnerable to security issues and missing out on useful improvements; upgrading too early results in excess work. Semantic versioning has been proposed as an elegant mechanism to communicate levels of compatibility, enabling downstream developers to automate dependency upgrades. While it is questionable whether a version number can adequately characterize version compatibility in general, we argue that developers would greatly benefit from tools such as semantic version calculators to help them upgrade safely. The time is now for the research community to develop such tools: large component ecosystems exist and are accessible, component interactions have become observable through automated builds, and recent advances in program analysis make the development of relevant tools feasible. In particular, contracts (both traditional and lightweight) are a promising input to semantic versioning calculators, which can suggest whether an upgrade is likely to be safe.Comment: to be published as Onward! Essays 202
    • …
    corecore