
    Information fusion from multiple databases using meta-association rules

    Nowadays, data volume, distribution, and volatility make it difficult to search for global patterns by applying traditional data mining techniques. For data in a distributed environment, a local analysis of each dataset is sometimes adequate, but at other times a global decision requires analyzing the entire data. Association rule discovery methods typically require a single uniform dataset, and handling the entire set of distributed data at once is not possible due to its size. To address the scenarios in which satisfying this requirement is not practical or even feasible, we propose a new method for fusing information, in the form of rules, extracted from multiple datasets. The proposed model produces meta-association rules, i.e. rules in which the antecedent or the consequent may themselves contain rules, for finding joint correlations among trends found individually in each dataset. In this paper, we describe the formulation and the implementation of two alternative frameworks that obtain, respectively, crisp meta-rules and fuzzy meta-rules. We compare our proposal with the information obtained when the datasets are not separated, in order to see the main differences between traditional association rules and meta-association rules. We also compare crisp and fuzzy methods for meta-association rule mining, observing that the fuzzy approach offers several advantages: it is more accurate since it incorporates the strength or validity of the previous information, it produces a more manageable set of rules for human inspection, and it allows contextual information, expressed in a more human-friendly format, to be incorporated into the mining process.
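The crisp variant described in this abstract can be illustrated with a minimal sketch. It assumes that each dataset has already been mined locally and is represented only by the set of rules found in it, so that every dataset acts as one "transaction" whose items are rules. The function name `mine_meta_rules`, the thresholds, and the toy rule strings are hypothetical and are not taken from the paper; only single-rule antecedents and consequents are considered.

```python
from itertools import combinations


def mine_meta_rules(rules_per_dataset, min_support=0.5, min_confidence=0.8):
    """Return crisp meta-rules (antecedent, consequent, support, confidence).

    Each element of rules_per_dataset is the set of rules mined from one
    dataset; a meta-rule "r1 -> r2" says that datasets containing rule r1
    tend to contain rule r2 as well.
    """
    transactions = [frozenset(rules) for rules in rules_per_dataset]
    n = len(transactions)
    items = sorted(set().union(*transactions))
    meta_rules = []
    for a, b in combinations(items, 2):
        # Try both directions of every pair of rules.
        for ante, cons in ((a, b), (b, a)):
            n_ante = sum(1 for t in transactions if ante in t)
            n_both = sum(1 for t in transactions if ante in t and cons in t)
            support = n_both / n
            confidence = n_both / n_ante if n_ante else 0.0
            if support >= min_support and confidence >= min_confidence:
                meta_rules.append((ante, cons, support, confidence))
    return meta_rules


# Toy input: rules (as plain strings) found in four separate datasets.
rules_per_dataset = [
    {"bread=>butter", "milk=>cereal"},
    {"bread=>butter", "milk=>cereal", "beer=>chips"},
    {"bread=>butter", "beer=>chips"},
    {"milk=>cereal"},
]
meta_rules = mine_meta_rules(rules_per_dataset)
```

A fuzzy variant would presumably replace the 0/1 membership of a rule in a dataset with a degree reflecting the rule's strength there; that extension is not sketched here.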

    Combining information seeking services into a meta supply chain of facts

    The World Wide Web has become a vital supplier of information that allows organizations to carry on such tasks as business intelligence, security monitoring, and risk assessments. Having a quick and reliable supply of correct facts from perspective is often mission critical. By following design science guidelines, we have explored ways to recombine facts from multiple sources, each with possibly different levels of responsiveness and accuracy, into one robust supply chain. Inspired by prior research on keyword-based meta-search engines (e.g., metacrawler.com), we have adapted the existing question answering algorithms for the task of analysis and triangulation of facts. We present a first prototype for a meta approach to fact seeking. Our meta engine sends a user's question to several fact seeking services that are publicly available on the Web (e.g., ask.com, brainboost.com, answerbus.com, NSIR, etc.) and analyzes the returned results jointly to identify and present to the user those that are most likely to be factually correct. The results of our evaluation on the standard test sets widely used in prior research support the evidence for the following: 1) the value-added of the meta approach: its performance surpasses the performance of each supplier, 2) the importance of using fact seeking services as suppliers to the meta engine rather than keyword driven search portals, and 3) the resilience of the meta approach: eliminating a single service does not noticeably impact the overall performance. We show that these properties make the meta-approach a more reliable supplier of facts than any of the currently available stand-alone services
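The joint analysis of answers from several suppliers can be sketched as a simple vote. This is an illustrative assumption, not the triangulation algorithm of the paper: each fact seeking service is modelled as a hypothetical Python callable that returns candidate answers, no real Web API is called, and `normalize` and `meta_answer` are names invented for the example.

```python
from collections import Counter


def normalize(answer):
    """Lower-case and collapse whitespace so near-identical answers match."""
    return " ".join(answer.lower().split())


def meta_answer(question, services):
    """Rank candidate answers by how many services returned them."""
    votes = Counter()
    for ask in services:
        try:
            candidates = ask(question)
        except Exception:
            # A failing supplier is skipped so the chain keeps working.
            continue
        # Each service contributes at most one vote per distinct answer.
        for candidate in {normalize(c) for c in candidates}:
            votes[candidate] += 1
    return votes.most_common()


# Stub suppliers standing in for real fact seeking services.
def service_a(question):
    return ["Paris", "Lyon"]


def service_b(question):
    return ["paris"]


def service_c(question):
    raise TimeoutError("service unavailable")


ranking = meta_answer("What is the capital of France?",
                      [service_a, service_b, service_c])
```

Skipping a failed supplier mirrors the resilience property claimed in the abstract: removing one service changes the vote counts but not the mechanism.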

    A Survey on Ear Biometrics

    Recognizing people by their ear has recently received significant attention in the literature. Several reasons account for this trend: first, ear recognition does not suffer from some problems associated with other non-contact biometrics, such as face recognition; second, it is the most promising candidate for combination with the face in the context of multi-pose face recognition; and third, the ear can be used for human recognition in surveillance videos where the face may be occluded completely or in part. Further, the ear appears to degrade little with age. Even though current ear detection and recognition systems have reached a certain level of maturity, their success is limited to controlled indoor conditions. In addition to variation in illumination, other open research problems include hair occlusion, earprint forensics, ear symmetry, ear classification, and ear individuality. This paper provides a detailed survey of research conducted in ear detection and recognition. It provides an up-to-date review of the existing literature, revealing the current state of the art not only for those who are working in this area but also for those who might exploit this new approach. Furthermore, it offers insights into some unsolved ear recognition problems as well as ear databases available for researchers.

    Integrating gene and protein expression data with genome-scale metabolic networks to infer functional pathways

    This article has been made available through the Brunel Open Access Publishing Fund. Copyright @ 2013 Pey et al.; licensee BioMed Central Ltd. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. Background: The study of cellular metabolism in the context of high-throughput -omics data has allowed us to decipher novel mechanisms of importance in biotechnology and health. To continue with this progress, it is essential to efficiently integrate experimental data into metabolic modeling. Results: We present here an in-silico framework to infer relevant metabolic pathways for a particular phenotype under study based on its gene/protein expression data. This framework is based on the Carbon Flux Path (CFP) approach, a mixed-integer linear program that expands classical path finding techniques by considering additional biophysical constraints. In particular, the objective function of the CFP approach is amended to account for gene/protein expression data and influence obtained paths. This approach is termed integrative Carbon Flux Path (iCFP). We show that gene/protein expression data also influences the stoichiometric balancing of CFPs, which provides a more accurate picture of active metabolic pathways. This is illustrated in both a theoretical and real scenario. Finally, we apply this approach to find novel pathways relevant in the regulation of acetate overflow metabolism in Escherichia coli. As a result, several targets which could be relevant for better understanding of the phenomenon leading to impaired acetate overflow are proposed. Conclusions: A novel mathematical framework that determines functional pathways based on gene/protein expression data is presented and validated. We show that our approach is able to provide new insights into complex biological scenarios such as acetate overflow in Escherichia coli. Basque Government

    An information assistant system for the prevention of tunnel vision in crisis management

    In the crisis management environment, tunnel vision is a set of biases in decision makers’ cognitive processes which often leads to incorrect understanding of the real crisis situation, biased perception of information, and improper decisions. The tunnel vision phenomenon is a consequence of both the challenges of the task and the natural limitations of human cognitive processes. An information assistant system is proposed with the purpose of preventing tunnel vision. The system serves as a platform for monitoring the ongoing crisis event. All information goes through the system before it arrives at the user. The system enhances the data quality, reduces the data quantity, and presents the crisis information in a manner that prevents or repairs the user’s cognitive overload. While working with such a system, the users (crisis managers) are expected to be more likely to stay aware of the actual situation, stay open-minded to possibilities, and make proper decisions.

    A survey on Data Extraction and Data Duplication Detection

    Text mining, also known as Intelligent Text Analysis, is an important research area. It is very difficult to focus on the most appropriate information due to the high dimensionality of data. Feature extraction is one of the important techniques in data reduction for discovering the most important features. Processing massive amounts of data stored in an unstructured form is a challenging task. Several pre-processing methods and algorithms are needed to extract useful features from huge amounts of data. When dealing with a collection of text documents, it is also very important to filter out duplicate data. Once duplicates are deleted, it is recommended to replace the removed duplicates. This paper reviews the literature on duplicate detection and data fusion (remove and replace duplicates). The survey presents existing text mining techniques to extract relevant features, detect duplicates, and replace the duplicate data in order to deliver fine-grained knowledge to the user.
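Duplicate filtering in a collection of text documents can be sketched with one common technique, Jaccard similarity over word shingles. The survey itself does not prescribe this method; the function names, the shingle size, and the threshold below are assumptions chosen for illustration, and the sketch only removes near-duplicates without performing the replacement (fusion) step.

```python
def shingles(text, k=2):
    """Return the set of k-word shingles of a text, ignoring case."""
    words = text.lower().split()
    if len(words) < k:
        return {tuple(words)}
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets; two empty sets count as identical."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def deduplicate(docs, threshold=0.6):
    """Keep a document only if it is not too similar to one already kept."""
    kept = []
    for doc in docs:
        doc_shingles = shingles(doc)
        if all(jaccard(doc_shingles, shingles(other)) < threshold
               for other in kept):
            kept.append(doc)
    return kept


docs = [
    "the quick brown fox jumps",
    "The quick brown fox jumps",
    "a completely different sentence here",
]
unique_docs = deduplicate(docs)
```

The pairwise comparison is quadratic in the number of kept documents, which is acceptable for a sketch; large collections would typically need an indexing or hashing scheme instead.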

    Data analytics 2016: proceedings of the fifth international conference on data analytics
