Guard-Function-Constraint-Based Refinement Method to Generate Dynamic Behaviors of Workflow Net with Table
To model complex workflow systems with databases and detect data-flow errors such as data inconsistency, we defined the Workflow Net with Table (WFT-net) model in our previous work. We used a Petri net to describe the control flows and data flows of a workflow system, and labeled transitions with abstract table operation statements to simulate database operations. We also proposed a data refinement method to construct the state reachability graph of a WFT-net, and used the graph to verify some properties. However, this data refinement method has a defect: it does not consider the constraint relations between guard functions, so its state reachability graph may contain pseudo states. To overcome this problem, we propose a new data refinement method that takes these constraint relations into account, which guarantees the correctness of the state reachability graph. We also develop the related algorithms and a tool, and illustrate the usefulness and effectiveness of our method through examples.
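The pseudo-state problem described in this abstract can be illustrated with a minimal sketch. It is not the WFT-net algorithm from the paper: the toy net, the guard names `g1` and `g2`, the assumption that the guarded data never changes, and the helper functions are all hypothetical. The sketch only shows how ignoring a constraint between two guards (here, that they are complementary) lets a reachability search visit a state that no real execution can reach.

```python
from itertools import product

# Hypothetical toy net: (transition, input place, output place, guard).
# g1 and g2 stand for complementary conditions, e.g. "x > 0" and "x <= 0".
# Assumption: no transition changes x, so a guard keeps its truth value.
TRANSITIONS = [
    ("t1", "p0", "p1", "g1"),
    ("t2", "p0", "p2", "g2"),
    ("t3", "p1", "p3", "g1"),
    ("t4", "p1", "p4", "g2"),
]
GUARDS = ("g1", "g2")


def satisfiable(known, constraint):
    """Return True if some full truth assignment extends `known`
    and satisfies `constraint`."""
    free = [g for g in GUARDS if g not in known]
    for values in product([True, False], repeat=len(free)):
        full = dict(known)
        full.update(zip(free, values))
        if constraint(full):
            return True
    return False


def reachable_places(start, constraint):
    """Depth-first search over (place, guards known to be true) states."""
    initial = (start, frozenset())
    seen = {initial}
    stack = [initial]
    while stack:
        place, known_items = stack.pop()
        for _name, src, dst, guard in TRANSITIONS:
            if src != place:
                continue
            new_known = dict(known_items)
            new_known[guard] = True  # firing requires the guard to hold
            if not satisfiable(new_known, constraint):
                continue  # prune states that contradict the constraint
            state = (dst, frozenset(new_known.items()))
            if state not in seen:
                seen.add(state)
                stack.append(state)
    return {place for place, _ in seen}


# Without constraints, every guard is treated as independent.
naive = reachable_places("p0", lambda a: True)
# With the constraint, g1 and g2 can never hold at the same time.
refined = reachable_places("p0", lambda a: a["g1"] != a["g2"])

print(sorted(naive))
print(sorted(refined))
```

In this toy net the unconstrained search reaches `p4` by firing `t1` (which needs `g1`) and then `t4` (which needs `g2`), a combination the constraint rules out; the constrained search prunes that state.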
A proposal for supporting text interpretation process by means of NLP and Software Engineering Techniques
ABSTRACT: This article presents a proposal for assisting the text interpretation process. The proposal is based on the automatic generation, from the text, of a conceptual schema used in software engineering called the Entity-Relationship (ER) diagram. The article also shows the usefulness of the ER diagram in the text interpretation process, as well as the natural language processing and software engineering techniques used to derive it automatically. The results show how the ER diagram can be a valuable tool for supporting the interpretation process, thanks to the inferences that are made automatically through it. This work is one of the results of the Master's research “Método para el reconocimiento de operaciones del diagrama de clases a partir de grafos conceptuales” (“Method for recognizing class diagram operations from conceptual graphs”), completed at the Universidad Nacional de Colombia under the supervision of the Grupo de Investigación en Ingeniería de Software (Software Engineering Research Group).
Doctor of Philosophy
Visualization has emerged as an effective means to quickly obtain insight from raw data. While simple computer programs can generate simple visualizations, and while there has been constant progress in sophisticated algorithms and techniques for generating insightful pictorial descriptions of complex data, the process of building visualizations remains a major bottleneck in data exploration. In this thesis, we present the main design and implementation aspects of VisTrails, a system designed around the idea of transparently capturing the exploration process that leads to a particular visualization. In particular, VisTrails explores the idea of provenance management in visualization systems: keeping extensive metadata about how the visualizations were created and how they relate to one another. This thesis presents the provenance data model in VisTrails, which can be easily adopted by existing visualization systems and libraries. This lightweight model entirely captures the exploration process of the user, and it can be seen as an electronic analogue of the scientific notebook. The provenance metadata collected during the creation of pipelines can be reused to suggest similar content in related visualizations and guide semi-automated changes. This thesis presents the idea of building visualizations by analogy in a system that allows users to change many visualizations at once, without requiring them to interact with the visualization specifications. It then proposes techniques to help users construct pipelines by consensus, automatically suggesting completions based on a database of previously created pipelines. By presenting these predictions in a carefully designed interface, users can create visualizations and other data products more efficiently because they can augment their normal work patterns with the suggested completions. VisTrails leverages the workflow specifications to identify and avoid redundant operations.
This optimization is especially useful while exploring multiple visualizations. When variations of the same pipeline need to be executed, substantial speedups can be obtained by caching the results of overlapping subsequences of the pipelines. We present the design decisions behind the execution engine, and how it easily supports the execution of arbitrary third-party modules. The workflow specifications also facilitate the reproduction of previous results. We describe an infrastructure that makes the workflows a complete description of the computational processes, including the information needed to identify and install required system libraries. In an environment where effective visualization and data analysis tasks combine many different software packages, this infrastructure can mean the difference between being able to replicate published results and getting lost in a sea of software dependencies and missing libraries. The thesis concludes with a discussion of the system architecture, design decisions and lessons learned in VisTrails. This discussion is meant to clarify the issues present in creating a system based around a provenance tracking engine, and should help implementors decide how best to incorporate these notions into their own systems.
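The caching idea mentioned in this abstract, reusing the results of overlapping pipeline subsequences, can be sketched as follows. This is a minimal illustration under stated assumptions, not the VisTrails execution engine: pipelines are assumed to be linear lists of steps, and the `MODULES` table and `CachingExecutor` class are hypothetical names invented for the example.

```python
# Hypothetical module table: each module takes (input, parameter).
MODULES = {
    "load": lambda _inp, n: list(range(n)),
    "scale": lambda data, k: [x * k for x in data],
    "sum": lambda data, _param: sum(data),
}


class CachingExecutor:
    """Runs linear pipelines, caching the result of every prefix."""

    def __init__(self):
        self.cache = {}    # maps a tuple of steps (a prefix) to its result
        self.executed = 0  # number of modules actually run

    def run(self, pipeline):
        result = None
        prefix = ()
        for step in pipeline:
            prefix = prefix + (step,)
            if prefix in self.cache:
                # An identical prefix was computed before: reuse its result.
                result = self.cache[prefix]
                continue
            name, param = step
            result = MODULES[name](result, param)
            self.executed += 1
            self.cache[prefix] = result
        return result


executor = CachingExecutor()
# Two variations of the same pipeline that share the "load" step.
a = executor.run([("load", 4), ("scale", 2), ("sum", None)])
b = executor.run([("load", 4), ("scale", 3), ("sum", None)])
print(a, b, executor.executed)
```

Keying the cache on the whole prefix rather than on a single step means a step is reused only when everything upstream of it is identical, which is what makes the reuse safe in this sketch. Here the second run skips `load` and executes only the two steps that differ, so five modules run instead of six.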
Structural Diversity of Biological Ligands and their Binding Sites in Proteins
The phenomenon of molecular recognition, which underpins almost all biological processes, is dynamic,
complex and subtle. Establishing an interaction between a pair of molecules involves mutual structural
rearrangements guided by a highly convoluted energy landscape, the accurate mapping of which continues
to elude us. The analysis of interactions between proteins and small molecules has been a focus of intense
interest for many years, offering as it does the promise of increased insight into many areas of biology, and
the potential for greatly improved drug design methodologies. Computational methods for predicting which
types of ligand a given protein may bind, and what conformation two molecules will adopt once paired, are
particularly sought after.
The work presented in this thesis aims to quantify the amount of structural variability observed in the ways
in which proteins interact with ligands. This diversity is considered from two perspectives: to what extent
ligands bind to different proteins in distinct conformations, and the degree to which binding sites specific for
the same ligand have different atomic structures.
The first study could be of value to approaches which aim to predict the bound pose of a ligand, since
by cataloguing the range of conformations previously observed, it may be possible to better judge the
biological likelihood of a newly predicted molecular arrangement. The findings show that several common
biological ligands exhibit considerable conformational diversity when bound to proteins. Although binding
in predominantly extended conformations, the analysis presented here highlights several cases in which the
biological requirements of a given protein force its ligand to adopt a highly compact form. Comparing the
conformational diversity observed within several protein families, the hypothesis that homologous proteins
tend to bind ligands in a similar arrangement is generally upheld, but several families are identified in which
this is demonstrably not the case.
Consideration of diversity in the binding site itself, on the other hand, may be useful in guiding methods
which search for binding sites in uncharacterised protein structures: identifying those regions of known sites
which are less variable could help to focus the search only on the most important features. Analysis of the
diversity of a non-redundant dataset of adenine binding sites shows that a small number of key interactions are
conserved, with the majority of the fragment environment being highly variable. Just as ligand conformation
varies between protein families, so the degree of binding site diversity is observed to be significantly higher
in some families than others.
Taken together, the results of this work suggest that the repertoire of strategies produced by nature for the
purposes of molecular recognition is extremely extensive. Moreover, the importance of a given ligand
conformation or pattern of interaction appears to vary greatly depending on the function of the particular
group of proteins studied. As such, it is proposed that diversity analysis may form a significant part of future
large-scale studies of ligand-protein interactions.