
    Trust management for the World Wide Web

    Thesis (M. Eng.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 1997. Includes bibliographical references (leaves 62-[63]). By Yang-hua Chu. M.Eng.

    A framework for domain-specific modeling on graph databases

    Software complexity increases all the time: systems become larger and more complex. Modeling is a central part of software engineering for tackling the challenges of complexity. However, a prominent challenge facing model-driven software development is the scalability of modeling tools as models grow in size. Some initiatives have started exploring modeling while storing models in a graph database. In this thesis, we present NMF, a framework for creating and editing MDE models in a Neo4j database, lifted to the abstraction level of the modeling language.

    Spreadsheet-driven web applications

    Creating and publishing read-write-compute web applications requires programming skills beyond what most end users possess. But many end users know how to make spreadsheets that act as simple information management applications, some even with computation. We present a system for creating basic web applications using such spreadsheets in place of a server and using HTML to describe the client UI. Authors connect the two by placing spreadsheet references inside HTML attributes. Data computation is provided by spreadsheet formulas. The result is a reactive read-write-compute web page without a single line of JavaScript code. Nearly all of the fifteen HTML novices we studied were able to connect HTML to spreadsheets using our method with minimal instruction. We draw conclusions from their experience and discuss future extensions to this programming model.

    Layout Optimization for Distributed Relational Databases Using Machine Learning

    A common problem when running Web-based applications is how to scale up the database. The solution usually involves having a skilled database administrator determine how to spread the database tables across computers that work in parallel. Laying out database tables across multiple machines so that they act together as a single efficient database is hard. Automated methods are needed to reduce the time database administrators spend creating optimal configurations. We consider four operators that together define a search space of possible database layouts: 1) denormalizing, 2) horizontally partitioning, 3) vertically partitioning, and 4) fully replicating. Textbooks offer general advice that is useful for extreme cases; for instance, you should fully replicate a table if the ratio of inserts to selects is close to zero. But even this seemingly obvious rule will not necessarily lead to a speedup once you take into account that some nodes might be a bottleneck. Complex interactions among the four operators make it even harder to predict the best course of action. Instead of relying on best practices for database layout, we need a system that collects empirical data on when these four operators are effective. We have implemented a state-based search technique that tries different operators, and we then used the empirically measured data to see whether any speedup occurred. We recognize that the cost of creating the physical database layout is potentially large, but it is necessary because we want to know the ground truth about what is effective and under what conditions. After creating a dataset in which these four operators have been applied to produce different databases, we can employ machine learning to induce rules that help govern the physical design of the database across an arbitrary number of computer nodes. This learning process, in turn, allows the database placement algorithm to improve over time as it trains on a set of examples. The algorithm tries to answer two questions: 1) What is a good database layout for a particular application given a query workload? and 2) Can the algorithm automatically improve its recommendations by using machine-learned rules to generalize when it makes sense to apply each of these operators? There has been considerable research on parallelizing databases in which large amounts of data are shipped from one node to another to answer a single query. Because the cost of shipping data back and forth can be high, in this work we assume that it may be more efficient to create a database layout in which each query can be answered by a single node. This assumption requires that all incoming query templates be known beforehand. The requirement is easily satisfied for a Web-based application, because users typically interact with the system through a web interface such as web forms. In this setting, unseen queries are not necessarily answerable without first reconstructing the data on a single machine. Prior knowledge of the exact query templates allows us to select the best possible database table placements across multiple nodes. A web site provider trying to improve the efficiency of a Web-based application may be willing to accept the inconvenience of not being able to answer arbitrary queries in exchange for a system that runs more efficiently.
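    The state-based search over layouts described in this abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: the greedy hill-climbing strategy, the `Layout` representation, and the `toy_cost` function are all assumptions made for illustration. In the thesis the cost comes from empirical measurement of physically built layouts; here a made-up callable stands in for that measurement.

```python
from typing import Callable, FrozenSet, Iterable, Iterator, Tuple

# The four layout operators named in the abstract.
OPERATORS = ("denormalize", "horizontal_partition", "vertical_partition", "replicate")

Step = Tuple[str, str]       # (operator, table) -- hypothetical representation
Layout = FrozenSet[Step]     # a layout is the set of steps applied so far


def neighbors(layout: Layout, tables: Iterable[str]) -> Iterator[Layout]:
    """Yield every layout reachable by applying one more operator to one table."""
    for table in tables:
        for op in OPERATORS:
            step = (op, table)
            if step not in layout:
                yield layout | {step}


def hill_climb(tables, measure_cost: Callable[[Layout], float], max_steps: int = 10):
    """Greedy search: repeatedly move to the cheapest neighbor until none improves."""
    tables = list(tables)
    current: Layout = frozenset()
    current_cost = measure_cost(current)
    for _ in range(max_steps):
        candidates = [(measure_cost(n), sorted(n), n) for n in neighbors(current, tables)]
        if not candidates:
            break
        # Ties on cost are broken by the sorted step list, so the result is deterministic.
        best_cost, _, best = min(candidates, key=lambda c: (c[0], c[1]))
        if best_cost >= current_cost:
            break  # no neighbor is strictly better: local optimum
        current, current_cost = best, best_cost
    return current, current_cost


def toy_cost(layout: Layout) -> float:
    """Made-up stand-in for a measured workload time (assumption, not real data)."""
    helpful = {("replicate", "countries"), ("horizontal_partition", "orders")}
    cost = 100.0
    if ("replicate", "countries") in layout:
        cost -= 30.0
    if ("horizontal_partition", "orders") in layout:
        cost -= 40.0
    # Every other step only adds overhead in this toy model.
    cost += 5.0 * sum(1 for step in layout if step not in helpful)
    return cost
```

    Calling `hill_climb(["orders", "countries"], toy_cost)` explores the layouts one operator at a time. A real system would replace `toy_cost` with timings gathered from the running database, which is the expensive part the abstract refers to.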

    Exploring run-time reduction in programming codes via query optimization and caching

    Object-oriented programming languages have raised the level of abstraction by supporting explicit first-class query constructs in program code. These query constructs allow programmers to express operations on collections more abstractly than by realizing them in loops or through provided libraries. Join optimization techniques from the field of database technology support efficient realizations of such language constructs. However, existing techniques, such as query optimization in the Java Query Language (JQL), incur run-time overhead. Alongside programming languages that support first-class query constructs, the use of annotations has also increased in the software engineering community recently. Annotations are a common means of attaching metadata to source code. Object-oriented languages such as C# provide attributes, and Java has its own annotation constructs, which allow developers to include metadata in program code. This work introduces a series of query optimization approaches to reduce the run time of programs involving explicit queries over collections. The proposed approaches rely on histograms to estimate the selectivity of predicates and joins in order to construct query plans. Annotations in the source code are also used to gather the metadata required for selectivity estimation of numerical as well as string-valued predicates and joins in the queries. Several cache heuristics are proposed that effectively cache the results of repeated queries in program code. The cached query results are incrementally kept up to date after update operations on the collections. --Abstract, page iv
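    Histogram-based selectivity estimation, as mentioned in this abstract, can be sketched as follows. This is a generic textbook-style illustration, not the thesis's code: the equi-width bucketing, the uniform-within-bucket assumption, and all function names are assumptions introduced here.

```python
def build_histogram(values, num_buckets):
    """Build an equi-width histogram; returns (lowest value, bucket width, counts)."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / num_buckets or 1.0  # avoid zero width when all values are equal
    counts = [0] * num_buckets
    for v in values:
        # Clamp so the maximum value falls into the last bucket.
        idx = min(int((v - lo) / width), num_buckets - 1)
        counts[idx] += 1
    return lo, width, counts


def estimate_selectivity_less_than(hist, threshold):
    """Estimate the fraction of elements satisfying `value < threshold`.

    Values are assumed to be uniformly spread inside each bucket.
    """
    lo, width, counts = hist
    total = sum(counts)
    if total == 0:
        return 0.0
    matched = 0.0
    for i, count in enumerate(counts):
        bucket_lo = lo + i * width
        bucket_hi = bucket_lo + width
        if threshold >= bucket_hi:
            matched += count  # whole bucket qualifies
        elif threshold > bucket_lo:
            matched += count * (threshold - bucket_lo) / width  # partial bucket
    return matched / total


def order_predicates(selectivities):
    """Order predicates most-selective first, so later filters see fewer elements."""
    return sorted(selectivities, key=selectivities.get)
```

    A query planner could evaluate the predicate with the lowest estimated selectivity first, since it discards the most elements early. How the thesis combines such estimates with annotations and join ordering is not specified in the abstract.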

    CazDataProvider: a solution to the object-relational mismatch

    Master's dissertation in Informatics Engineering. Today, most software applications require mechanisms to store information persistently. For decades, Relational Database Management Systems (RDBMSs) have been the most common technology for providing efficient and reliable persistence. Because of the object-relational paradigm mismatch, object-oriented applications that store data in relational databases have to deal with Object Relational Mapping (ORM) problems. Since the emergence of new ORM frameworks, there has been an attempt to lure developers toward a radical paradigm shift. However, developers still often have trouble finding the best persistence mechanism for their applications, especially when they have to deal with legacy database systems. The aim of this dissertation is to discuss the persistence problem in object-oriented applications and find the best solutions. The main focus lies on ORM limitations, patterns, technologies and alternatives. The project supporting this dissertation was implemented at Cachapuz under the Project Global Weighting Solutions (GWS). Essentially, the objectives of GWS centred on finding the optimal persistence layer for CazFramework, mostly providing database interoperability with close-to-Structured Query Language (SQL) querying. Accordingly, this work provides analyses of ORM patterns, frameworks, and alternatives to ORM such as Object-Oriented Database Management Systems (OODBMSs). It also describes the implementation of CazDataProvider, a .NET library providing database interoperability and dynamic query features. Finally, there is a performance comparison of all the technologies discussed in this dissertation. The result of this dissertation provides guidance for adopting the best persistence technology or implementing the most suitable ORM architectures.