97,610 research outputs found

    Artifact Lifecycle Discovery

    Artifact-centric modeling is a promising approach for modeling business processes based on so-called business artifacts: key entities that drive the company's operations and whose lifecycles define the overall business process. While artifact-centric modeling shows significant advantages, the overwhelming majority of existing process mining methods cannot be applied directly, as they are tailored to discovering monolithic process models. This paper addresses the problem by proposing a chain of methods that can be applied to discover artifact lifecycle models in the Guard-Stage-Milestone notation. We decompose the problem in such a way that a wide range of existing (non-artifact-centric) process discovery and analysis methods can be reused flexibly. The methods presented in this paper are implemented as software plug-ins for ProM, a generic open-source framework and architecture for implementing process mining tools.
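    The decomposition idea can be illustrated with a small sketch: split a flat event log into per-artifact sub-logs, each of which an ordinary (non-artifact-centric) discovery algorithm can then process. The event-tuple format and helper below are hypothetical simplifications, not the paper's actual ProM plug-in API.

```python
from collections import defaultdict

def split_log_by_artifact(event_log):
    """Partition a flat event log into one sub-log per artifact type.

    Hypothetical event format: (case_id, artifact_type, artifact_id,
    activity). Each resulting sub-log can be handed to a conventional
    process discovery algorithm, mirroring the decomposition idea.
    """
    sublogs = defaultdict(lambda: defaultdict(list))
    for case_id, artifact_type, artifact_id, activity in event_log:
        # Group events by artifact instance within each artifact type.
        sublogs[artifact_type][(case_id, artifact_id)].append(activity)
    return {t: dict(traces) for t, traces in sublogs.items()}

log = [
    ("c1", "Order",   "o1", "create"),
    ("c1", "Order",   "o1", "ship"),
    ("c1", "Invoice", "i1", "send"),
]
sublogs = split_log_by_artifact(log)
# sublogs["Order"] now holds the Order lifecycle traces, ready for discovery.
```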

    Discovery of Linguistic Relations Using Lexical Attraction

    This work has been motivated by two long-term goals: to understand how humans learn language and to build programs that can understand language. Using a representation that makes the relevant features explicit is a prerequisite for successful learning and understanding. Therefore, I chose to represent relations between individual words explicitly in my model. Lexical attraction is defined as the likelihood of such relations. I introduce a new class of probabilistic language models named lexical attraction models, which can represent long-distance relations between words, and I formalize this new class of models using information theory. Within the framework of lexical attraction, I developed an unsupervised language acquisition program that learns to identify linguistic relations in a given sentence. The only explicitly represented linguistic knowledge in the program is lexical attraction. There is no initial grammar or lexicon built in, and the only input is raw text. Learning and processing are interdigitated: the processor uses the regularities detected by the learner to impose structure on the input, and this structure enables the learner to detect higher-level regularities. Using this bootstrapping procedure, the program was trained on 100 million words of Associated Press material and was able to achieve 60% precision and 50% recall in finding relations between content words. Using knowledge of lexical attraction, the program can identify the correct relations in syntactically ambiguous sentences such as "I saw the Statue of Liberty flying over New York."
    Comment: dissertation, 56 pages
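    Lexical attraction is formalized information-theoretically; as a rough illustration, a pointwise-mutual-information (PMI) score over co-occurring word pairs captures the same intuition of how strongly two words attract each other. The estimator below (all unordered pairs within a sentence, raw PMI over a toy corpus) is an illustrative assumption, not the dissertation's exact model.

```python
from collections import Counter
from itertools import combinations
import math

def pmi_pairs(sentences):
    """Estimate a PMI-style lexical-attraction score for word pairs.

    A generic pointwise-mutual-information sketch; the pairing scheme
    (every unordered pair within a sentence) is an illustrative choice.
    """
    word_counts, pair_counts = Counter(), Counter()
    n_words = n_pairs = 0
    for sent in sentences:
        words = sent.lower().split()
        word_counts.update(words)
        n_words += len(words)
        for a, b in combinations(words, 2):
            pair_counts[frozenset((a, b))] += 1
            n_pairs += 1
    scores = {}
    for pair, c in pair_counts.items():
        if len(pair) < 2:
            continue  # skip degenerate pairs of a word with itself
        a, b = tuple(pair)
        p_pair = c / n_pairs
        p_a, p_b = word_counts[a] / n_words, word_counts[b] / n_words
        scores[pair] = math.log2(p_pair / (p_a * p_b))
    return scores

corpus = [
    "the dog chased the cat",
    "the dog barked at the mailman",
    "the cat slept",
]
scores = pmi_pairs(corpus)
# Pairs that co-occur more often than chance receive higher scores.
```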

    A Physics-Based Approach to Unsupervised Discovery of Coherent Structures in Spatiotemporal Systems

    Given that observational and numerical climate data are being produced at ever more prodigious rates, increasingly sophisticated and automated analysis techniques have become essential. Deep learning is quickly becoming a standard approach for such analyses and, while great progress is being made, major challenges remain. Unlike commercial applications in which deep learning has led to surprising successes, scientific data is highly complex and typically unlabeled. Moreover, interpretability and detecting new mechanisms are key to scientific discovery. To enhance discovery we present a complementary physics-based, data-driven approach that exploits the causal nature of spatiotemporal data sets generated by local dynamics (e.g. hydrodynamic flows). We illustrate how novel patterns and coherent structures can be discovered in cellular automata and outline the path from them to climate data.
    Comment: 4 pages, 1 figure; http://csc.ucdavis.edu/~cmg/compmech/pubs/ci2017_Rupe_et_al.ht

    Quantifying the search for solid Li-ion electrolyte materials by anion: a data-driven perspective

    We compile data and machine-learned models of solid Li-ion electrolyte performance to assess the state of materials discovery efforts and build new insights for future efforts. Candidate electrolyte materials must satisfy several requirements, chief among them fast ionic conductivity and robust electrochemical stability. Considering these two requirements, we find new evidence to suggest that optimization of the sulfides for fast ionic conductivity and wide electrochemical stability may be more likely than optimization of the oxides, and that the oft-overlooked chlorides and bromides may be particularly promising families for Li-ion electrolytes. We also find that the nitrides and phosphides appear to be the most promising material families for electrolytes stable against Li-metal anodes. Furthermore, the spread of the existing data in performance space suggests that fast-conducting materials that are stable against both Li metal and a >4 V cathode are exceedingly rare, by approximately an order of magnitude or more, and that a multiple-electrolyte architecture is a more likely path to successfully realizing a solid-state Li-metal battery. Our model is validated by its reproduction of well-known trends that have emerged from the limited existing data in recent years, namely that the electronegativity of the lattice anion correlates with ionic conductivity and electrochemical stability. In this work, we leverage the existing data to make solid electrolyte performance trends quantitative for the first time, building a roadmap to complement material discovery efforts around desired material performance.
    Comment: Main text is 41 pages with 3 figures and 2 tables; attached supplemental information is 8 pages with 3 figures

    From Frequency to Meaning: Vector Space Models of Semantics

    Computers understand very little of the meaning of human language. This profoundly limits our ability to give instructions to computers, the ability of computers to explain their actions to us, and the ability of computers to analyse and process text. Vector space models (VSMs) of semantics are beginning to address these limits. This paper surveys the use of VSMs for semantic processing of text. We organize the literature on VSMs according to the structure of the matrix in a VSM. There are currently three broad classes of VSMs, based on term-document, word-context, and pair-pattern matrices, yielding three classes of applications. We survey a broad range of applications in these three categories and take a detailed look at a specific open source project in each category. Our goal in this survey is to show the breadth of applications of VSMs for semantics, to provide a new perspective on VSMs for those who are already familiar with the area, and to provide pointers into the literature for those who are less familiar with the field.
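    The term-document class of VSMs can be sketched in a few lines: build a count matrix over a document collection, then compare documents with cosine similarity over their column (here, row) vectors. The toy corpus and helper names below are illustrative; real systems typically add weighting such as tf-idf.

```python
import math
from collections import Counter

def term_document_matrix(docs):
    """Build a raw-count matrix from tokenized documents.

    Rows are document vectors over a shared sorted vocabulary
    (the transpose of the conventional term-by-document layout).
    """
    vocab = sorted({w for doc in docs for w in doc})
    matrix = []
    for doc in docs:
        counts = Counter(doc)
        matrix.append([counts.get(w, 0) for w in vocab])
    return vocab, matrix

def cosine(u, v):
    """Cosine similarity between two count vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

docs = [
    "ship ocean voyage".split(),
    "ship boat ocean".split(),
    "tree forest leaf".split(),
]
vocab, m = term_document_matrix(docs)
# The two sea-related documents are closer to each other than to the forest one.
assert cosine(m[0], m[1]) > cosine(m[0], m[2])
```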