    Efficient Discovery of Ontology Functional Dependencies

    Poor data quality has become a pervasive issue due to the increasing complexity and size of modern datasets. Constraint-based data cleaning techniques rely on integrity constraints as a benchmark to identify and correct errors. Data values that do not satisfy the given set of constraints are flagged as dirty, and data updates are made to re-align the data and the constraints. However, many errors require user input to resolve, because domain expertise defines specific terminology and relationships. For example, in pharmaceuticals, 'Advil' is-a brand name for 'ibuprofen', a relationship that can be captured in a pharmaceutical ontology. While functional dependencies (FDs) have traditionally been used in existing data cleaning solutions to model syntactic equivalence, they are not able to model broader relationships (e.g., is-a) defined by an ontology. In this paper, we take a first step towards extending the set of data quality constraints used in data cleaning by defining and discovering Ontology Functional Dependencies (OFDs). We lay out theoretical and practical foundations for OFDs, including a set of sound and complete axioms, and a linear inference procedure. We then develop effective algorithms for discovering OFDs, and a set of optimizations that efficiently prune the search space. Our experimental evaluation using real data shows the scalability and accuracy of our algorithms. Comment: 12 pages
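
    The contrast between a syntactic FD and an ontology-aware one can be illustrated with a minimal sketch. It is not the paper's definition or discovery algorithm: the function `holds_fd`, the toy table, and the `brand_to_generic` mapping (a stand-in for an ontology lookup) are all assumptions made for illustration.

```python
from collections import defaultdict


def holds_fd(rows, lhs, rhs, canonical=None):
    """Return True if the dependency lhs -> rhs holds over rows.

    canonical is an optional dict mapping a value to a representative
    term. It is a hypothetical, highly simplified stand-in for an
    ontology lookup; without it the check is purely syntactic.
    """
    canonical = canonical or {}
    seen = defaultdict(set)
    for row in rows:
        key = tuple(row[attr] for attr in lhs)
        value = row[rhs]
        # Replace the value by its representative term, if one is known.
        seen[key].add(canonical.get(value, value))
    # The dependency holds when every left-hand key maps to one value.
    return all(len(values) == 1 for values in seen.values())


# Toy data, invented for this example.
rows = [
    {"symptom": "headache", "drug": "Advil"},
    {"symptom": "headache", "drug": "ibuprofen"},
    {"symptom": "nausea", "drug": "ondansetron"},
]
brand_to_generic = {"Advil": "ibuprofen"}

syntactic = holds_fd(rows, ["symptom"], "drug")
with_ontology = holds_fd(rows, ["symptom"], "drug", brand_to_generic)
print(syntactic, with_ontology)
```

    Under plain equality the two 'headache' rows disagree, so the dependency is reported as violated; once 'Advil' is mapped to 'ibuprofen' the same rows agree.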

    Mining local staircase patterns in noisy data

    Most traditional biclustering algorithms identify biclusters with little or no overlap. In this paper, we introduce the problem of identifying staircases of biclusters. Such staircases may be indicative of causal relationships between columns and cannot easily be identified by existing biclustering algorithms. Our formalization relies on a scoring function based on the Minimum Description Length principle. Furthermore, we propose a first algorithm for identifying staircase biclusters, based on a combination of local search and constraint programming. Experiments show that the approach is promising.

    A Review of the Mass Measurement Techniques proposed for the Large Hadron Collider

    We review the methods which have been proposed for measuring masses of new particles at the Large Hadron Collider, paying particular attention to the kinematical techniques suitable for extracting mass information when invisible particles are expected. Comment: 72 pages - in form to be published in JPhys

    FairFuzz: Targeting Rare Branches to Rapidly Increase Greybox Fuzz Testing Coverage

    In recent years, fuzz testing has proven itself to be one of the most effective techniques for finding correctness bugs and security vulnerabilities in practice. One particular fuzz testing tool, American Fuzzy Lop or AFL, has become popular thanks to its ease-of-use and bug-finding power. However, AFL remains limited in the depth of program coverage it achieves, in particular because it does not consider which parts of program inputs should not be mutated in order to maintain deep program coverage. We propose an approach, FairFuzz, that helps alleviate this limitation in two key steps. First, FairFuzz automatically prioritizes inputs exercising rare parts of the program under test. Second, it automatically adjusts the mutation of inputs so that the mutated inputs are more likely to exercise these same rare parts of the program. We conduct an evaluation on real-world programs against state-of-the-art versions of AFL, thoroughly repeating experiments to get good measures of variability. We find that on certain benchmarks FairFuzz shows significant coverage increases after 24 hours compared to state-of-the-art versions of AFL, while on others it achieves high program coverage at a significantly faster rate.
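
    The first of the two steps, prioritizing inputs that exercise rare parts of the program, can be sketched roughly as follows. This is an illustration only, not FairFuzz's or AFL's actual implementation: the function `rarest_branch_inputs`, the branch labels, and the rule "rare means hit by the fewest inputs" are assumptions, and the second step (adjusting mutations) is not modelled.

```python
from collections import Counter


def rarest_branch_inputs(coverage):
    """Pick the inputs that exercise the least-hit branches.

    coverage maps an input name to the set of branch ids it exercises.
    Returns (rare_branches, chosen_inputs). Treating the branches hit
    by the fewest inputs as "rare" is an assumption of this sketch.
    """
    # Count how many inputs hit each branch.
    hits = Counter(b for branches in coverage.values() for b in branches)
    fewest = min(hits.values())
    rare = {b for b, n in hits.items() if n == fewest}
    # Keep only the inputs that reach at least one rare branch.
    chosen = sorted(
        name for name, branches in coverage.items() if branches & rare
    )
    return rare, chosen


# Hypothetical coverage data for three seed inputs.
coverage = {
    "seed_a": {"b1", "b2"},
    "seed_b": {"b1", "b2", "b3"},
    "seed_c": {"b1"},
}
rare, chosen = rarest_branch_inputs(coverage)
print(rare, chosen)
```

    Here only `seed_b` reaches branch `b3`, so it is the one input selected for further mutation.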

    Discovering the Higgs with Low Mass Muon Pairs

    Many models of electroweak symmetry breaking have an additional light pseudoscalar. If the Higgs boson can decay to a new pseudoscalar, LEP searches for the Higgs can be significantly altered and the Higgs can be as light as 86 GeV. Discovering the Higgs boson in these models is challenging when the pseudoscalar is lighter than 10 GeV because it decays dominantly into tau leptons. In this paper, we discuss discovering the Higgs in a subdominant decay mode where one of the pseudoscalars decays to a pair of muons. This search allows for potential discovery of a cascade-decaying Higgs boson with the complete Tevatron data set or early data at the LHC. Comment: 10 pages, 7 figures

    Psychological Climate and Work Attitudes: The Importance of Telling the Right Story

    In this field study, the authors explore how choosing one context over another influences both research results and implications. Using both quantitative and qualitative data, the authors examine context from both an organizational and a business-unit perspective by studying relationships between five psychological climate variables and outcomes of job satisfaction, affective commitment, and intent to leave. Results show different contextual influences between the organization and two business units, suggesting that different bundles of psychological climate variables yield similar outcomes depending on the context studied. These results bolster the contention that researchers need to identify the right context in field research.