73,761 research outputs found

    Towards a semantic and statistical selection of association rules

    Full text link
    The increasing growth of databases raises an urgent need for more accurate methods to better understand the stored data. In this scope, association rules were extensively used for the analysis and the comprehension of huge amounts of data. However, the number of generated rules is too large to be efficiently analyzed and explored in any further process. Association rules selection is a classical topic to address this issue, yet, new innovated approaches are required in order to provide help to decision makers. Hence, many interesting- ness measures have been defined to statistically evaluate and filter the association rules. However, these measures present two major problems. On the one hand, they do not allow eliminating irrelevant rules, on the other hand, their abun- dance leads to the heterogeneity of the evaluation results which leads to confusion in decision making. In this paper, we propose a two-winged approach to select statistically in- teresting and semantically incomparable rules. Our statis- tical selection helps discovering interesting association rules without favoring or excluding any measure. The semantic comparability helps to decide if the considered association rules are semantically related i.e comparable. The outcomes of our experiments on real datasets show promising results in terms of reduction in the number of rules

    Semantic data mining and linked data for a recommender system in the AEC industry

    Get PDF
    Even though it can provide design teams with valuable performance insights and enhance decision-making, monitored building data is rarely reused in an effective feedback loop from operation to design. Data mining allows users to obtain such insights from the large datasets generated throughout the building life cycle. Furthermore, semantic web technologies allow to formally represent the built environment and retrieve knowledge in response to domain-specific requirements. Both approaches have independently established themselves as powerful aids in decision-making. Combining them can enrich data mining processes with domain knowledge and facilitate knowledge discovery, representation and reuse. In this article, we look into the available data mining techniques and investigate to what extent they can be fused with semantic web technologies to provide recommendations to the end user in performance-oriented design. We demonstrate an initial implementation of a linked data-based system for generation of recommendations

    Optical tomography: Image improvement using mixed projection of parallel and fan beam modes

    Get PDF
    Mixed parallel and fan beam projection is a technique used to increase the quality images. This research focuses on enhancing the image quality in optical tomography. Image quality can be defined by measuring the Peak Signal to Noise Ratio (PSNR) and Normalized Mean Square Error (NMSE) parameters. The findings of this research prove that by combining parallel and fan beam projection, the image quality can be increased by more than 10%in terms of its PSNR value and more than 100% in terms of its NMSE value compared to a single parallel beam

    How much hybridisation does machine translation need?

    Get PDF
    This is the peer reviewed version of the following article: [Costa-jussà, M. R. (2015), How much hybridization does machine translation Need?. J Assn Inf Sci Tec, 66: 2160–2165. doi:10.1002/asi.23517], which has been published in final form at [10.1002/asi.23517]. This article may be used for non-commercial purposes in accordance with Wiley Terms and Conditions for Self-Archiving.Rule-based and corpus-based machine translation (MT)have coexisted for more than 20 years. Recently, bound-aries between the two paradigms have narrowed andhybrid approaches are gaining interest from bothacademia and businesses. However, since hybridapproaches involve the multidisciplinary interaction oflinguists, computer scientists, engineers, and informa-tion specialists, understandably a number of issuesexist.While statistical methods currently dominate researchwork in MT, most commercial MT systems are techni-cally hybrid systems. The research community shouldinvestigate the bene¿ts and questions surrounding thehybridization of MT systems more actively. This paperdiscusses various issues related to hybrid MT includingits origins, architectures, achievements, and frustra-tions experienced in the community. It can be said thatboth rule-based and corpus- based MT systems havebene¿ted from hybridization when effectively integrated.In fact, many of the current rule/corpus-based MTapproaches are already hybridized since they do includestatistics/rules at some point.Peer ReviewedPostprint (author's final draft

    Comprehensive Review of Opinion Summarization

    Get PDF
    The abundance of opinions on the web has kindled the study of opinion summarization over the last few years. People have introduced various techniques and paradigms to solving this special task. This survey attempts to systematically investigate the different techniques and approaches used in opinion summarization. We provide a multi-perspective classification of the approaches used and highlight some of the key weaknesses of these approaches. This survey also covers evaluation techniques and data sets used in studying the opinion summarization problem. Finally, we provide insights into some of the challenges that are left to be addressed as this will help set the trend for future research in this area.unpublishednot peer reviewe

    Sentiment Lexicon Adaptation with Context and Semantics for the Social Web

    Get PDF
    Sentiment analysis over social streams offers governments and organisations a fast and effective way to monitor the publics' feelings towards policies, brands, business, etc. General purpose sentiment lexicons have been used to compute sentiment from social streams, since they are simple and effective. They calculate the overall sentiment of texts by using a general collection of words, with predetermined sentiment orientation and strength. However, words' sentiment often vary with the contexts in which they appear, and new words might be encountered that are not covered by the lexicon, particularly in social media environments where content emerges and changes rapidly and constantly. In this paper, we propose a lexicon adaptation approach that uses contextual as well as semantic information extracted from DBPedia to update the words' weighted sentiment orientations and to add new words to the lexicon. We evaluate our approach on three different Twitter datasets, and show that enriching the lexicon with contextual and semantic information improves sentiment computation by 3.4% in average accuracy, and by 2.8% in average F1 measure
    corecore