2,267 research outputs found
Effective pattern discovery for text mining
Many data mining techniques have been proposed for mining useful patterns in text documents. However, how to effectively use and update discovered patterns is still an open research issue, especially in the domain of text mining. Since most existing text mining methods adopted term-based approaches, they all suffer from the problems of polysemy and synonymy. Over the years, people have often held the hypothesis that pattern (or phrase) based approaches should perform better than the term-based ones, but many experiments did not support this hypothesis. This paper presents an innovative technique, effective pattern discovery which includes the processes of pattern deploying and pattern evolving, to improve the effectiveness of using and updating discovered patterns for finding relevant and interesting information. Substantial experiments on RCV1 data collection and TREC topics demonstrate that the proposed solution achieves encouraging performance
Mining Frequent Generalized Patterns for Web Personalization in the presence of Taxonomies
The Web is a continuously evolving environment, since its content is updated on a regular basis. As a result, the traditional usage-based approach to generate recommendations that takes as input the navigation paths recorded on the Web page level, is not as effective. Moreover, most of the content available online is either explicitly or implicitly characterized by a set of categories organized in a taxonomy, allowing the page-level navigation patterns to be generalized to a higher, aggregate level. In this direction, the authors present the Frequent Generalized Pattern (FGP) algorithm. FGP takes as input the transaction data and a hierarchy of categories and produces generalized association rules that contain transaction items and/or item categories. The results can be used to generate association rules and subsequently recommendations for the users. The algorithm can be applied to the log files of a typical Web site; however, it can be more helpful in a Web 2.0 application, such as a feed aggregator or a digital library mediator, where content is semantically annotated and the taxonomic nature is more complex, requiring us to extend FGP in a version called FGP+. The authors experimentally evaluate both algorithms using Web log data collected from a newspaper Web site
Multi-representation Ontology in the Context of Enterprise Information Systems
International audienceIn the last decade, ontologies as shared common vocabulary played a major role in many AI applications and informationintegration for heterogeneous, distributed systems. The problems of integrating and developing information systems anddatabases in heterogeneous, distributed environment have been translated in the technical perspectives as system’sinteroperability. Ontologies, however, are foreseen to play a key role in resolving partially the semantic conflicts anddifferences that exist among systems. Domain ontologies, however, are constructed by capturing a set of concepts and theirlinks according to various criteria such as the abstraction paradigm, the granularity scale, interest of user communities, andthe perception of the ontology developer. Thus, different applications of the same domain end up having severalrepresentations of the same real world phenomenon. Multi-representation ontology is an ontology (or ontologies) thatcharacterizes ontological concept by a variable set of properties (static and dynamic) or attributes in several contexts and/ orin several scales of granularity. This paper introduces the formalism used for defining the paradigm of multi-representationontology and shows the manifestation of this paradigm with Enterprise Information Systems
A cognitive taxonomy of medical errors
AbstractObjective. Propose a cognitive taxonomy of medical errors at the level of individuals and their interactions with technology.Design. Use cognitive theories of human error and human action to develop the theoretical foundations of the taxonomy, develop the structure of the taxonomy, populate the taxonomy with examples of medical error cases, identify cognitive mechanisms for each category of medical error under the taxonomy, and apply the taxonomy to practical problems.Measurements. Four criteria were used to evaluate the cognitive taxonomy. The taxonomy should be able (1) to categorize major types of errors at the individual level along cognitive dimensions, (2) to associate each type of error with a specific underlying cognitive mechanism, (3) to describe how and explain why a specific error occurs, and (4) to generate intervention strategies for each type of error.Results. The proposed cognitive taxonomy largely satisfies the four criteria at a theoretical and conceptual level.Conclusion. Theoretically, the proposed cognitive taxonomy provides a method to systematically categorize medical errors at the individual level along cognitive dimensions, leads to a better understanding of the underlying cognitive mechanisms of medical errors, and provides a framework that can guide future studies on medical errors. Practically, it provides guidelines for the development of cognitive interventions to decrease medical errors and foundation for the development of medical error reporting system that not only categorizes errors but also identifies problems and helps to generate solutions. To validate this model empirically, we will next be performing systematic experimental studies
A survey of self organisation in future cellular networks
This article surveys the literature over the period of the last decade on the emerging field of self organisation as applied to wireless cellular communication networks. Self organisation has been extensively studied and applied in adhoc networks, wireless sensor networks and autonomic computer networks; however in the context of wireless cellular networks, this is the first attempt to put in perspective the various efforts in form of a tutorial/survey. We provide a comprehensive survey of the existing literature, projects and standards in self organising cellular networks. Additionally, we also aim to present a clear understanding of this active research area, identifying a clear taxonomy and guidelines for design of self organising mechanisms. We compare strength and weakness of existing solutions and highlight the key research areas for further development. This paper serves as a guide and a starting point for anyone willing to delve into research on self organisation in wireless cellular communication networks
Putting the Semantics into Semantic Versioning
The long-standing aspiration for software reuse has made astonishing strides
in the past few years. Many modern software development ecosystems now come
with rich sets of publicly-available components contributed by the community.
Downstream developers can leverage these upstream components, boosting their
productivity.
However, components evolve at their own pace. This imposes obligations on and
yields benefits for downstream developers, especially since changes can be
breaking, requiring additional downstream work to adapt to. Upgrading too late
leaves downstream vulnerable to security issues and missing out on useful
improvements; upgrading too early results in excess work. Semantic versioning
has been proposed as an elegant mechanism to communicate levels of
compatibility, enabling downstream developers to automate dependency upgrades.
While it is questionable whether a version number can adequately characterize
version compatibility in general, we argue that developers would greatly
benefit from tools such as semantic version calculators to help them upgrade
safely. The time is now for the research community to develop such tools: large
component ecosystems exist and are accessible, component interactions have
become observable through automated builds, and recent advances in program
analysis make the development of relevant tools feasible. In particular,
contracts (both traditional and lightweight) are a promising input to semantic
versioning calculators, which can suggest whether an upgrade is likely to be
safe.Comment: to be published as Onward! Essays 202
- …