Artifact Lifecycle Discovery
Artifact-centric modeling is a promising approach for modeling business
processes based on so-called business artifacts: key entities that drive the
company's operations and whose lifecycles define the overall business process.
While artifact-centric modeling shows significant advantages, the overwhelming
majority of existing process mining methods cannot be applied (directly) as
they are tailored to discover monolithic process models. This paper addresses
the problem by proposing a chain of methods that can be applied to discover
artifact lifecycle models in Guard-Stage-Milestone notation. We decompose the
problem in such a way that a wide range of existing (non-artifact-centric)
process discovery and analysis methods can be reused in a flexible manner. The
methods presented in this paper are implemented as software plug-ins for ProM,
a generic open-source framework and architecture for implementing process
mining tools.
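To make the Guard-Stage-Milestone (GSM) notion concrete, here is a minimal, hypothetical sketch (not the paper's ProM plug-ins): an artifact carries data attributes, and each lifecycle stage opens when its guard condition holds and closes when its milestone is achieved. All names and conditions below are invented for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict

@dataclass
class Stage:
    name: str
    guard: Callable[[dict], bool]      # condition that opens the stage
    milestone: Callable[[dict], bool]  # condition that closes the stage
    open: bool = False
    achieved: bool = False

@dataclass
class Artifact:
    data: Dict[str, object] = field(default_factory=dict)
    stages: Dict[str, Stage] = field(default_factory=dict)

    def on_event(self, updates: dict) -> None:
        """Apply an event's data updates, then re-evaluate guards and milestones."""
        self.data.update(updates)
        for s in self.stages.values():
            if not s.open and not s.achieved and s.guard(self.data):
                s.open = True
            if s.open and s.milestone(self.data):
                s.open, s.achieved = False, True

# Example: an Order artifact whose "Shipping" stage opens once the order
# is paid and closes once a tracking number is recorded.
order = Artifact()
order.stages["Shipping"] = Stage(
    "Shipping",
    guard=lambda d: d.get("paid", False),
    milestone=lambda d: "tracking_no" in d,
)
order.on_event({"paid": True})            # guard fires -> stage opens
order.on_event({"tracking_no": "X123"})   # milestone fires -> stage closes
```

In lifecycle discovery, the point is to learn such guard and milestone conditions from event logs rather than hand-coding them as above.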
Discovery of Linguistic Relations Using Lexical Attraction
This work has been motivated by two long term goals: to understand how humans
learn language and to build programs that can understand language. Using a
representation that makes the relevant features explicit is a prerequisite for
successful learning and understanding. Therefore, I chose to represent
relations between individual words explicitly in my model. Lexical attraction
is defined as the likelihood of such relations. I introduce a new class of
probabilistic language models named lexical attraction models which can
represent long distance relations between words and I formalize this new class
of models using information theory.
Within the framework of lexical attraction, I developed an unsupervised
language acquisition program that learns to identify linguistic relations in a
given sentence. The only explicitly represented linguistic knowledge in the
program is lexical attraction. There is no initial grammar or lexicon built in
and the only input is raw text. Learning and processing are interdigitated. The
processor uses the regularities detected by the learner to impose structure on
the input. This structure enables the learner to detect higher level
regularities. Using this bootstrapping procedure, the program was trained on
100 million words of Associated Press material and was able to achieve 60%
precision and 50% recall in finding relations between content-words. Using
knowledge of lexical attraction, the program can identify the correct relations
in syntactically ambiguous sentences such as ``I saw the Statue of Liberty
flying over New York.''
Comment: dissertation, 56 pages
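Lexical attraction, defined above as the likelihood of a relation between two words, is closely related to pointwise mutual information. The following sketch (a simplification on toy data, not the dissertation's full model) scores adjacent word pairs by how much more often they co-occur than chance predicts.

```python
import math
from collections import Counter

# Toy corpus for illustration; the dissertation used 100M words of AP text.
text = "the cat sat on the mat the cat ate"
words = text.split()

unigrams = Counter(words)                 # word frequencies
bigrams = Counter(zip(words, words[1:]))  # adjacent-pair frequencies
n_words = len(words)
n_pairs = len(words) - 1

def lexical_attraction(w1: str, w2: str) -> float:
    """Pointwise mutual information: log2( p(w1,w2) / (p(w1) * p(w2)) )."""
    p_pair = bigrams[(w1, w2)] / n_pairs
    p1 = unigrams[w1] / n_words
    p2 = unigrams[w2] / n_words
    return math.log2(p_pair / (p1 * p2))

score = lexical_attraction("the", "cat")  # positive: pair attracts
```

A positive score marks a pair that co-occurs more often than independence would predict; the full model extends such scores to long-distance relations rather than only adjacent words.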
A Physics-Based Approach to Unsupervised Discovery of Coherent Structures in Spatiotemporal Systems
Given that observational and numerical climate data are being produced at
ever more prodigious rates, increasingly sophisticated and automated analysis
techniques have become essential. Deep learning is quickly becoming a standard
approach for such analyses and, while great progress is being made, major
challenges remain. Unlike commercial applications in which deep learning has
led to surprising successes, scientific data is highly complex and typically
unlabeled. Moreover, interpretability and detecting new mechanisms are key to
scientific discovery. To enhance discovery we present a complementary
physics-based, data-driven approach that exploits the causal nature of
spatiotemporal data sets generated by local dynamics (e.g. hydrodynamic flows).
We illustrate how novel patterns and coherent structures can be discovered in
cellular automata and outline the path from them to climate data.
Comment: 4 pages, 1 figure;
http://csc.ucdavis.edu/~cmg/compmech/pubs/ci2017_Rupe_et_al.ht
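Cellular automata of the kind the abstract refers to can be generated in a few lines. The sketch below (illustrative only, not the authors' discovery method; the rule number and grid size are arbitrary choices) evolves an elementary cellular automaton, producing the sort of spatiotemporal field in which coherent structures are sought.

```python
def step(state, rule=110):
    """One synchronous update of an elementary CA with periodic boundaries.

    Each cell's next value is the bit of `rule` indexed by the 3-cell
    neighborhood (left, center, right) read as a binary number.
    """
    n = len(state)
    return [
        (rule >> (state[(i - 1) % n] << 2 | state[i] << 1 | state[(i + 1) % n])) & 1
        for i in range(n)
    ]

# Evolve a single seeded cell for a few steps; `history` is the
# spatiotemporal data (rows = time, columns = space).
state = [0] * 11
state[5] = 1
history = [state]
for _ in range(5):
    state = step(state)
    history.append(state)
```

The resulting space-time field is the raw input from which structures such as domains and propagating defects would be identified.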
Quantifying the search for solid Li-ion electrolyte materials by anion: a data-driven perspective
We compile data and machine learned models of solid Li-ion electrolyte
performance to assess the state of materials discovery efforts and build new
insights for future efforts. Candidate electrolyte materials must satisfy
several requirements, chief among them fast ionic conductivity and robust
electrochemical stability. Considering these two requirements, we find new
evidence to suggest that optimization of the sulfides for fast ionic
conductivity and wide electrochemical stability may be more likely than
optimization of the oxides, and that the oft-overlooked chlorides and bromides
may be particularly promising families for Li-ion electrolytes. We also find
that the nitrides and phosphides appear to be the most promising material
families for electrolytes stable against Li-metal anodes. Furthermore, the
spread of the existing data in performance space suggests that fast conducting
materials that are stable against both Li metal and a >4V cathode are
exceedingly rare, and that a multiple-electrolyte architecture is, by
approximately an order of magnitude or more, the more likely path to realizing
a solid-state Li metal battery. Our model is validated by its reproduction of
well-known trends that have emerged from the limited existing data in recent
years, namely that the electronegativity of the lattice anion correlates with
ionic conductivity and electrochemical stability. In this work, we leverage the
existing data to make solid electrolyte performance trends quantitative for the
first time, building a roadmap to complement material discovery efforts around
desired material performance.
Comment: Main text is 41 pages with 3 figures and 2 tables; attached
supplemental information is 8 pages with 3 figures
From Frequency to Meaning: Vector Space Models of Semantics
Computers understand very little of the meaning of human language. This
profoundly limits our ability to give instructions to computers, the ability of
computers to explain their actions to us, and the ability of computers to
analyse and process text. Vector space models (VSMs) of semantics are beginning
to address these limits. This paper surveys the use of VSMs for semantic
processing of text. We organize the literature on VSMs according to the
structure of the matrix in a VSM. There are currently three broad classes of
VSMs, based on term-document, word-context, and pair-pattern matrices, yielding
three classes of applications. We survey a broad range of applications in these
three categories and we take a detailed look at a specific open source project
in each category. Our goal in this survey is to show the breadth of
applications of VSMs for semantics, to provide a new perspective on VSMs for
those who are already familiar with the area, and to provide pointers into the
literature for those who are less familiar with the field.
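The term-document class of VSMs surveyed above can be sketched in a few lines (a toy illustration, not the survey's reference implementations): each document becomes a column of a term-document count matrix, and similarity between documents is the cosine of the angle between their column vectors.

```python
import math
from collections import Counter

# Toy documents; in practice the matrix is built from a large corpus.
docs = [
    "the cat sat on the mat",
    "the dog sat on the log",
    "semantics of vector space models",
]
vocab = sorted({w for d in docs for w in d.split()})

# Term-document count matrix: rows are terms, columns are documents.
matrix = [[Counter(d.split())[t] for d in docs] for t in vocab]

def column(j):
    """Document j as a vector over the whole vocabulary."""
    return [row[j] for row in matrix]

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

sim_01 = cosine(column(0), column(1))  # overlapping vocabulary -> high
sim_02 = cosine(column(0), column(2))  # disjoint vocabulary -> zero
```

The word-context and pair-pattern classes differ only in what the rows and columns index (words vs. contexts, word pairs vs. patterns); the vector-comparison machinery is the same.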