70 research outputs found
Incremental materialization of object-oriented views
We present an approach to handle incremental materialization of object-oriented views. Queries that define views are implemented as methods that are invoked to compute corresponding views. To avoid computation from scratch each time a view is accessed, we introduce some deferred update algorithms that reflect for a view only related modifications introduced into the database while that view was inactive. A view is updated by considering modifications performed within all classes along the inheritance and class-composition subhierarchies rooted at every class used in deriving that view. To each class, we add a modification list to keep one modification tuple per view dependent on that class. Such a tuple acts as a reference point that marks the start of the next update to the corresponding view. © 1999 Elsevier Science B.V. All rights reserved
View maintenance in object-oriented databases
in this paper, we present a model that facilitates view maintenance within object-oriented databases. For that purpose, we differentiate between two categories of classes, base classes and brother classes. While the former constitute the actual database, the latter are introduced to hold virtual database, i.e., views derived from base classes. To achieve incremental view update, we introduce a modification list into each base class. A series of algorithms are developed to serve the purpose. Finally it happened that, view maintenance within object-oriented databases subsumes that within the nested and hence conventional relational models
Object-oriented query language facilitating construction of new objects
In object-oriented database systems, messages can be used to manipulate the database; however, a query language is still a required component of any kind of database system. In the paper, we describe a query language for object-oriented databases where both objects as well as behaviour defined in them are handled. Not only existing objects are manipulated; the introduction of new relationships and new objects constructed out of existing ones is also facilitated. The operations supported in the described query language subsumes those of the relational algebra aiming at a more powerful query language than the relational algebra. Among the additional operators, there is an operator that handles the application of an aggregate function on objects in an operand while still having the result possessing the characteristics of an operand. The result of a query as well as the operands are considered to have a pair of sets, a set of objects and a set of message expressions; where a message expression is a sequence of messages. A message expression handles both stored and derived values and hence provides a full computational power without having an embedded query language with impedance mismatch. Therefore the closure property is maintained by having the result of a query possessing the characteristics of an operand. Furthermore, we define a set of objects and derive a set of message expressions for every class; hence any class can be an operand. Moreover, the result of a query has the characteristics of a class and its superclass/subclass relationships with the operands are established to make it persistent. © 1993
Model inference for spreadsheets
Many errors in spreadsheet formulas can be avoided if spreadsheets are built automati-
cally from higher-level models that can encode and enforce consistency constraints in the generated
spreadsheets. Employing this strategy for legacy spreadsheets is dificult, because the model has
to be reverse engineered from an existing spreadsheet and existing data must be transferred into
the new model-generated spreadsheet.
We have developed and implemented a technique that automatically infers relational schemas
from spreadsheets. This technique uses particularities from the spreadsheet realm to create better
schemas. We have evaluated this technique in two ways: First, we have demonstrated its appli-
cability by using it on a set of real-world spreadsheets. Second, we have run an empirical study
with users. The study has shown that the results produced by our technique are comparable to
the ones developed by experts starting from the same (legacy) spreadsheet data.
Although relational schemas are very useful to model data, they do not t well spreadsheets as
they do not allow to express layout. Thus, we have also introduced a mapping between relational
schemas and ClassSheets. A ClassSheet controls further changes to the spreadsheet and safeguards
it against a large class of formula errors. The developed tool is a contribution to spreadsheet
(reverse) engineering, because it lls an important gap and allows a promising design method
(ClassSheets) to be applied to a huge collection of legacy spreadsheets with minimal effort.We would like to thank Orlando Belo for his help on running and analyzing the empirical study. We would also like to thank Paulo Azevedo for his help in conducting the statistical analysis of our empirical study. We would also like to thank the anonymous reviewers for their suggestions which helped us to improve the paper. This work is funded by ERDF - European Regional Development Fund through the COMPETE Programme (operational programme for competitiveness) and by National Funds through the FCT - Fundacao para a Ciencia e a Tecnologia (Portuguese Foundation for Science and Technology) within project FCOMP-01-0124-FEDER-010048. The first author was also supported by FCT grant SFRH/BPD/73358/2010
Representative transcript sets for evaluating a translational initiation sites predictor
<p>Abstract</p> <p>Background</p> <p>Translational initiation site (TIS) prediction is a very important and actively studied topic in bioinformatics. In order to complete a comparative analysis, it is desirable to have several benchmark data sets which can be used to test the effectiveness of different algorithms. An ideal benchmark data set should be reliable, representative and readily available. Preferably, proteins encoded by members of the data set should also be representative of the protein population actually expressed in cellular specimens.</p> <p>Results</p> <p>In this paper, we report a general algorithm for constructing a reliable sequence collection that only includes mRNA sequences whose corresponding protein products present an average profile of the general protein population of a given organism, with respect to three major structural parameters. Four representative transcript collections, each derived from a model organism, have been obtained following the algorithm we propose. Evaluation of these data sets shows that they are reasonable representations of the spectrum of proteins obtained from cellular proteomic studies. Six state-of-the-art predictors have been used to test the usefulness of the construction algorithm that we proposed. Comparative study which reports the predictors' performance on our data set as well as three other existing benchmark collections has demonstrated the actual merits of our data sets as benchmark testing collections.</p> <p>Conclusion</p> <p>The proposed data set construction algorithm has demonstrated its property of being a general and widely applicable scheme. Our comparison with published proteomic studies has shown that the expression of our data set of transcripts generates a polypeptide population that is representative of that obtained from evaluation of biological specimens. Our data set thus represents "real world" transcripts that will allow more accurate evaluation of algorithms dedicated to identification of TISs, as well as other translational regulatory motifs within mRNA sequences. The algorithm proposed by us aims at compiling a redundancy-free data set by removing redundant copies of homologous proteins. The existence of such data sets may be useful for conducting statistical analyses of protein sequence-structure relations. At the current stage, our approach's focus is to obtain an "average" protein data set for any particular organism without posing much selection bias. However, with the three major protein structural parameters deeply integrated into the scheme, it would be a trivial task to extend the current method for obtaining a more selective protein data set, which may facilitate the study of some particular protein structure.</p
Risk factors for endometrial cancer in black and white women: a pooled analysis from the epidemiology of endometrial cancer consortium (E2C2)
Endometrial cancer (EC) is the most common gynecologic cancer in the United States. Over the last decade, the incidence rate has been increasing, with a larger increase among blacks. The aim of this study was to compare risk factors for EC in black and white women
A comprehensive approach for validating p53 binding site predictions
8th International Conference on Information Technology (2017 : Amman; Jordan)Predicting the locations of Response Elements (RE) has received considerable attention in the field of gene sequence analysis and bioinformatics. Protein53 (p53) has a prominent role in the cell cycle and cancer prevention; it functions as a transcription factor and binds with p53 REs in the DNA. The identification of p53 response elements enlightens the unknown functions and characteristics of p53 besides the genes containing binding sites. In this work, we have proposed an algorithm for validating the prediction of the possible p53 binding sites in the human genome, by incorporating the recent findings on the p53 REs into our suggested profile hidden Markov model (PHMM). We constructed two PHMMs and the results described in this paper are very promising. In the experiments, we have used the p53 REs data reported by Riley et al. [21]. © 2017 IEEE
- …