A practical mandatory access control model for XML databases
A practical mandatory access control (MAC) model for XML databases is presented in this paper. The
label type and label access policy can be defined according to the requirements of different applications. In order to
preserve the integrity of data in XML databases, a constraint between a read-access rule and a write-access rule in
label access policy is introduced. Rules for label assignment and propagation are presented to alleviate the workload
of label assignments. Furthermore, a solution for resolving conflicts in label assignments is proposed. Rules for
update-related operations, rules for exceptional privileges of ordinary users and the administrator are also proposed
to preserve the security of operations in XML databases. The MAC model proposed in this study has been
implemented in an XML database. Test results demonstrate that our approach provides reasonable and scalable
performance.
On Fine-Grained Access Control for XML
Fine-grained access control for XML is about controlling access to XML documents at the granularity of individual elements or attributes. This thesis addresses two problems related to XML access controls. The first is efficient, secure evaluation of XPath expressions. We present a technique that secures path expressions by means of query modification, and we show that the query modification algorithm is correct under a language-independent semantics for secure query evaluation. The second problem is to provide a compact, yet useful, representation of the access matrix. Since determining a user's privilege directly from access control policies can be extremely inefficient, materializing the access matrix---the net effect of the access control policies---is a common approach to speed up the authorization decision making. The fine-grained nature of XML access controls, however, makes the space cost of matrix materialization a significant issue. We present a codebook-based technique that records access matrices compactly. Our experimental study shows that the codebook approach exhibits significant space savings over other storage schemes, such as the access control list and the compressed accessibility map. The solutions to the above two problems provide a foundation for the development of an efficient mechanism that enforces fine-grained access controls for XML databases in the cases of query access
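The codebook idea mentioned in the abstract above can be illustrated with a minimal sketch. It assumes that a codebook simply stores each distinct access row once and lets every node point to its row by a small integer code; the function names, paths, and three-user layout are hypothetical and are not taken from the thesis.

```python
from typing import Dict, List, Tuple

AccessRow = Tuple[bool, ...]  # one flag per user: may this user read the node?


def build_codebook(matrix: Dict[str, AccessRow]) -> Tuple[List[AccessRow], Dict[str, int]]:
    """Store each distinct access row once and give every node a code for its row."""
    codebook: List[AccessRow] = []
    code_of: Dict[AccessRow, int] = {}
    node_codes: Dict[str, int] = {}
    for node, row in matrix.items():
        if row not in code_of:
            code_of[row] = len(codebook)
            codebook.append(row)
        node_codes[node] = code_of[row]
    return codebook, node_codes


def can_access(codebook: List[AccessRow], node_codes: Dict[str, int],
               node: str, user_index: int) -> bool:
    """Look up the node's code, then the user's flag in the shared row."""
    return codebook[node_codes[node]][user_index]


# Hypothetical access matrix: four nodes, three users.
matrix = {
    "/hospital": (True, True, True),
    "/hospital/patient": (True, True, False),
    "/hospital/patient/name": (True, True, False),
    "/hospital/patient/diagnosis": (True, False, False),
}
codebook, node_codes = build_codebook(matrix)
```

Here four nodes share three distinct rows, so only three rows are stored. The saving grows when many nodes have identical access rights, which is the correlation such schemes rely on.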
Query Evaluation in the Presence of Fine-grained Access Control
Access controls are mechanisms to enhance security by protecting
data from unauthorized accesses. In contrast to traditional access
controls that grant access rights at the granularity of the whole
tables or views, fine-grained access controls specify access
controls at finer granularity, e.g., individual nodes in XML
databases and individual tuples in relational databases.
While there is a voluminous literature on specifying and modeling
fine-grained access controls, less work has been done to address
the performance issues of database systems with fine-grained
access controls. This thesis addresses the performance issues of
fine-grained access controls and proposes corresponding solutions.
In particular, the following issues are addressed: effective
storage of massive access controls, efficient query planning for
secure query evaluation, and accurate cardinality estimation for
access controlled data.
Because fine-grained access controls specify access rights from
each user to each piece of data in the system, they are
effectively a massive matrix whose size is the product of the
number of users and the size of the data. Therefore, fine-grained
access controls require a very compact encoding to be feasible.
The proposed storage system in this thesis achieves an
unprecedented level of compactness by leveraging the high
correlation of access controls found in real system data. This
correlation comes from two sides: the structural similarity of
access rights between data, and the similarity of access patterns
from different users. This encoding can be embedded into a
linearized representation of XML data such that a query evaluation
framework is able to compute the answer to the access controlled
query with minimal disk I/O to the access controls.
Query optimization is a crucial component for database systems.
This thesis proposes an intelligent query plan caching mechanism
that has lower amortized cost for query planning in the presence
of fine-grained access controls. The rationale behind this query
plan caching mechanism is that the queries, customized by
different access controls from different users, may share common
upper-level join trees in their optimal query plans. Since join
plan generation is an expensive step in query optimization,
reusing the upper-level join trees will reduce the cost of query
optimization significantly. The proposed caching mechanism is able to match
efficient query plans to access controlled query plans with
minimal runtime cost.
In case of a query plan cache miss, the optimizer needs to
optimize an access controlled query from scratch. This depends on
accurate cardinality estimation on the size of the intermediate
query results. This thesis proposes a novel sampling scheme that
has better accuracy than traditional cardinality estimation
techniques
Rule-based Methodologies for the Specification and Analysis of Complex Computing Systems
From the origins of hardware and software to the present day, the complexity
of computing systems has posed a problem that computer scientists, engineers
and programmers have had to face. As a result of this effort, important
research areas have emerged and matured. In this dissertation we address
some of the current lines of research related to the analysis and
verification of complex computing systems using formal methods and
domain-specific languages.
In this thesis we focus on distributed systems, with a particular interest in
Web systems and biological systems. The first part of the thesis is devoted
to security aspects and related techniques, specifically software
certification. First, we study systems for controlling access to resources and propose
a language for specifying access control policies that are tightly
coupled to knowledge bases and that provide a description sensitive
to the semantics of the resources or elements being accessed. We have also developed
a novel framework for Code-Carrying Theory, a software certification
methodology whose goal is to ensure the secure delivery of code
in a distributed environment. Our framework is based on a system for
transforming rewriting theories by means of fold/unfold operations.
The second part of this thesis concentrates on the analysis and verification of
Web systems and biological systems. We propose a language for information filtering
that enables information retrieval from large data stores. This
language uses semantic information obtained from remote ontologies
to refine the filtering process. We also study validation methods for
checking the consistency of web content with respect to syntactic
and semantic properties. Another of our contributions is a language that
makes it possible to define and automatically check semantic and syntactic constraints
on the static content of a Web system. Finally, we also consider
biological systems and focus on a formalism based on rewriting logic
for modeling and analyzing quantitative aspects of biological processes.
To evaluate the effectiveness of all the proposed methodologies, we have paid
special attention to developing prototypes, which have been implemented using
rule-based languages.
Baggi, M. (2010). Rule-based Methodologies for the Specification and Analysis of Complex Computing Systems [Unpublished doctoral thesis]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/8964
A compact and scalable encoding for updating XML based on node labeling schemes
The eXtensible Markup Language (XML) has been adopted as the new standard for data exchange on the World Wide Web. As the rate of adoption increases, there is an ever more pressing need to store, query and update XML in its native format, thereby eliminating the overhead of parsing and transforming XML in and out of various data formats. However, the hierarchical, ordered and semi-structured properties of the tree structure underlying the XML data model present many challenges to updating XML. In particular, many of the tree labeling schemes were designed to solve a particular problem or provide a particular feature, often at the expense of other important features. In this dissertation, we identify the core properties that are representative of the desirable characteristics of a good dynamic labeling scheme for XML. We focus on four features central to the outstanding problems in existing dynamic labeling schemes: a compact label encoding, scalability, deleted node label reuse and a label storage scheme for binary-encoded bit-string node labels. At present there is no dynamic labeling scheme that integrates support for all four features. We present a novel compact and scalable adaptive encoding method that keeps the growth rate of label size highly constrained under arbitrary node insertion and deletion scenarios and that scales efficiently. We deploy our encoding method in two novel dynamic labeling schemes for XML that can completely avoid node relabeling, process frequent skewed insertions gracefully and reuse deleted node labels
Compressing Labels of Dynamic XML Data using Base-9 Scheme and Fibonacci Encoding
The flexibility and self-describing nature of XML has made it the most common mark-up language used for data representation over the Web. XML data is naturally modelled as a tree, where the structural tree information can be encoded into labels via XML labelling scheme in order to permit answers to queries without the need to access original XML files. As the transmission of XML data over the Internet has become vibrant, it has also become necessary to have an XML labelling scheme that supports dynamic XML data. For a large-scale and frequently updated XML document, existing dynamic XML labelling schemes still suffer from high growth rates in terms of their label size, which can result in overflow problems and/or ambiguous data/query retrievals.
This thesis considers the compression of XML labels. A novel XML labelling scheme, named “Base-9”, has been developed to generate labels that are as compact as possible and yet provide efficient support for queries to both static and dynamic XML data. A Fibonacci prefix-encoding method has been used for the first time to store Base-9’s XML labels in a compressed format, with the intention of minimising the storage space without degrading XML querying performance. The thesis also investigates the compression of XML labels using various existing prefix-encoding methods. This investigation has resulted in the proposal of a novel prefix-encoding method named “Elias-Fibonacci of order 3”, which has achieved the fastest encoding time of all prefix-encoding methods studied in this thesis, whereas Fibonacci encoding was found to require the minimum storage.
Unlike current XML labelling schemes, the new Base-9 labelling scheme ensures the generation of short labels even after large, frequent, skewed insertions. The advantages of such short labels as those generated by the combination of applying the Base-9 scheme and the use of Fibonacci encoding in terms of storing, updating, retrieving and querying XML data are supported by the experimental results reported herein
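As an illustration of the prefix-encoding idea discussed above, the following is a minimal sketch of the standard Fibonacci code for positive integers. It is not the thesis's Base-9 scheme or its "Elias-Fibonacci of order 3" method; the function names are hypothetical. Each codeword writes the Zeckendorf representation of the number, lowest Fibonacci term first, and appends a final 1, so every codeword ends in "11" and codewords can be concatenated without separators.

```python
from typing import List


def fib_encode(n: int) -> str:
    """Return the Fibonacci codeword of a positive integer as a bit string."""
    if n < 1:
        raise ValueError("Fibonacci coding is defined for positive integers only")
    fibs = [1, 2]
    while fibs[-1] <= n:
        fibs.append(fibs[-1] + fibs[-2])
    fibs.pop()  # drop the first Fibonacci number that exceeds n
    bits = ["0"] * len(fibs)
    remaining = n
    # Greedy choice from the largest term down yields the Zeckendorf form,
    # which never uses two consecutive Fibonacci numbers.
    for i in range(len(fibs) - 1, -1, -1):
        if fibs[i] <= remaining:
            bits[i] = "1"
            remaining -= fibs[i]
    return "".join(bits) + "1"  # the extra 1 creates the "11" terminator


def fib_decode_stream(bits: str) -> List[int]:
    """Decode a concatenation of Fibonacci codewords back into integers."""
    values: List[int] = []
    fibs = [1, 2]
    total, idx, prev = 0, 0, "0"
    for b in bits:
        if b == "1" and prev == "1":  # terminator reached: emit and reset
            values.append(total)
            total, idx, prev = 0, 0, "0"
            continue
        while idx >= len(fibs):
            fibs.append(fibs[-1] + fibs[-2])
        if b == "1":
            total += fibs[idx]
        idx += 1
        prev = b
    return values


encoded = "".join(fib_encode(n) for n in [1, 4, 11])
decoded = fib_decode_stream(encoded)
```

Small values get short codewords (1 is encoded as "11", 4 as "1011"), which is why a code of this kind suits label components that are usually small.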
Content Based Image Retrieval (CBIR) in Remote Clinical Diagnosis and Healthcare
Content-Based Image Retrieval (CBIR) locates, retrieves and displays images
similar to one given as a query, using a set of features. It requires accessible
data in medical archives and from medical equipment, so that meaning can be inferred
after some processing. A case that is similar in some sense to the target image can aid
clinicians. CBIR complements text-based retrieval and improves evidence-based
diagnosis, administration, teaching, and research in healthcare. It facilitates
visual/automatic diagnosis and decision-making in real-time remote
consultation/screening, store-and-forward tests, home care assistance and
overall patient surveillance. Metrics help compare visual data and improve
diagnosis. Specially designed architectures can benefit from the application
scenario. The use of CBIR calls for file storage standardization, querying procedures,
efficient image transmission, realistic databases, global availability, access
simplicity, and Internet-based structures. This chapter highlights important
and complex aspects required to handle visual content in healthcare.
Comment: 28 pages, 6 figures, book chapter from "Encyclopedia of E-Health and
Telemedicine"
Mining a Small Medical Data Set by Integrating the Decision Tree and t-test
Although several researchers have used statistical methods to prove that aspiration followed by the injection of 95% ethanol left in situ (retention) is an effective treatment for ovarian endometriomas, very few discuss the different conditions that could generate different recovery rates for the patients. Therefore, this study combines statistical methods and decision tree techniques to analyze the postoperative status of ovarian endometriosis patients under different conditions. Since our collected data set is small, containing only 212 records, we use all of these data as the training data. Therefore, instead of using a resultant tree to generate rules directly, we first use the value of each node as a cut point to generate all possible rules from the tree. Then, once all possible rules have been generated, we verify them with a t-test to discover useful descriptive rules. Experimental results show that our approach can find some new and interesting knowledge about recurrent ovarian endometriomas under different conditions.
Indexing collections of XML documents with arbitrary links
In recent years, the popularity of XML has increased significantly. XML is the extensible markup language of the World Wide Web Consortium (W3C). XML is used to represent data in many areas, such as traditional database management systems, e-business environments, and the World Wide Web. XML data, unlike relational and object-oriented data, has no fixed schema that is known in advance and stored separately from the data. XML data is self-describing and can model heterogeneity more naturally than relational or object-oriented data models. Moreover, XML data usually has XLinks or XPointers to data in other documents (i.e., global links). In addition to XLink or XPointer links, the XML standard allows internal links to be added between different elements in the same XML document using the ID/IDREF attributes. The rise in popularity of XML has generated much interest in query processing over graph-structured data. In order to facilitate efficient evaluation of path expressions, structured indexes have been proposed. However, most variants of structured indexes ignore global or internal document references. They assume a tree-like structure of XML documents that contain no such global and internal links. Extending these indexes to work with large XML graphs consisting of global or internal document links firstly requires a lot of computing power for the creation process. Secondly, it also requires a great deal of space in which to store the indexes. As the latter shows, the efficient evaluation of ancestor-descendant queries over arbitrary graphs with long paths is indeed a complex issue. This thesis proposes the HID index (2-Hop cover path Index based on DAG), which is based on the concept of a two-hop cover for a directed graph. The algorithms proposed for HID index creation, in effect, scale down the original graph size substantially. As a result, a directed acyclic graph (DAG) with a smaller number of nodes and edges emerges.
This reduces the number of computing steps required for building the index. In addition, computing time and space are reduced as well. The index also permits efficient evaluation of ancestor-descendant relationships. Moreover, the proposed index has an advantage over other comparable indexes: it is optimized for descendant-or-self queries on arbitrary graphs with link relationships, a task that would stress any index structure. Our experiments with real-life XML data show that the HID index provides better performance than other indexes
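The two-hop cover concept that the abstract above builds on can be sketched as follows. This is a generic pruned-labeling construction for a small DAG, not the HID index creation algorithm, and all names and the example graph are hypothetical. Each node keeps a set of hops it can reach (`l_out`) and a set of hops that reach it (`l_in`); node u reaches node v exactly when the two sets share a hop.

```python
from collections import deque
from typing import Dict, List, Set, Tuple

Labels = Dict[str, Set[str]]


def reaches(l_in: Labels, l_out: Labels, u: str, v: str) -> bool:
    """Descendant-or-self test: u reaches v if some hop lies on a path between them."""
    return not l_out[u].isdisjoint(l_in[v])


def build_two_hop_labels(graph: Dict[str, List[str]]) -> Tuple[Labels, Labels]:
    """Build 2-hop labels for a DAG; every node must appear as a key of `graph`."""
    reverse: Dict[str, List[str]] = {n: [] for n in graph}
    for u, successors in graph.items():
        for v in successors:
            reverse[v].append(u)
    l_in: Labels = {n: set() for n in graph}
    l_out: Labels = {n: set() for n in graph}

    for hop in graph:
        # Forward search: record `hop` in l_in of each descendant not yet covered.
        queue, seen = deque([hop]), {hop}
        while queue:
            w = queue.popleft()
            if w != hop and reaches(l_in, l_out, hop, w):
                continue  # already answered by earlier hops, so prune here
            l_in[w].add(hop)
            for nxt in graph[w]:
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
        # Backward search: record `hop` in l_out of each ancestor not yet covered.
        queue, seen = deque([hop]), {hop}
        while queue:
            w = queue.popleft()
            if w != hop and reaches(l_in, l_out, w, hop):
                continue
            l_out[w].add(hop)
            for prv in reverse[w]:
                if prv not in seen:
                    seen.add(prv)
                    queue.append(prv)
    return l_in, l_out


# Hypothetical document graph: element "d" is shared through a link from "b" and "c".
graph = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": ["e"], "e": []}
l_in, l_out = build_two_hop_labels(graph)
```

With these labels, a query such as `reaches(l_in, l_out, "a", "e")` is answered by a single set intersection instead of a graph traversal.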