
    A Review of Trusted Broker Architectures for Data Sharing

    Sharing data across organizational boundaries must strike a balance between the competing data quality dimensions of access and security. Without access to data, it cannot be used and, consequently, is of no value. At the same time, uncontrolled access to data, especially sensitive personal data, can have dire legal and ethical consequences. This paper discusses the trade-offs between security and access for three styles of trusted broker architectures, in the hope of providing guidance for organizations trying to implement data sharing systems. (Naval Postgraduate School Acquisition Research Program)

    Methods to Measure Importance of Data Attributes to Consumers of Information Products

    Errors in data sources of information product (IP) manufacturing systems can degrade overall IP quality as perceived by consumers. Data defects from inputs propagate throughout the IP manufacturing process. Information Quality (IQ) research has focused on improving the quality of inputs to mitigate error propagation and ensure an IP will be fit for use by consumers. However, the feedback loop from IP consumers to IP producers is often incomplete, since the overall quality of the IP is determined not solely by the quality of its inputs but by the IP's fitness for use as a whole. It remains uncertain that high-quality inputs directly correlate with a high-quality IP. The methods proposed in this paper investigate the effects of intentionally decreasing, or disrupting, the quality of inputs and measure consumers' evaluations against an undisrupted IP; the paper also proposes scenarios illustrating the advantage of these methods over traditional survey methods. Fitness for use may then be increased in future IP revisions using those attributes deemed "important" by consumers.
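    The disruption-based measurement described above can be illustrated with a small sketch. Everything below is hypothetical, since the paper's actual procedure and rating scales are not given here: we blank out a fraction of one attribute's values to simulate a quality disruption, and treat the drop in mean consumer evaluation as a proxy for that attribute's importance.

    ```python
    import random

    def disrupt(values, rate=0.2, seed=0):
        """Blank out a fraction of an attribute's values to simulate
        an intentional quality disruption of one IP input."""
        rng = random.Random(seed)
        return [None if rng.random() < rate else v for v in values]

    def importance(baseline_evals, disrupted_evals):
        """Proxy importance score: the drop in mean consumer evaluation
        when the attribute is disrupted (a larger drop suggests the
        attribute matters more to consumers)."""
        mean = lambda xs: sum(xs) / len(xs)
        return mean(baseline_evals) - mean(disrupted_evals)
    ```

    Repeating this per attribute and comparing the scores would rank attributes by perceived importance without relying on traditional survey self-reports.
    
    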

    CoDoSA: A Lightweight, XML-Based Framework for Integrating Unstructured Textual Information

    One of the most fundamental dimensions of information quality is access. For many organizations, a large part of their information assets is locked away in Unstructured Textual Information (UTI) in the form of email, letters, contracts, call notes, and spreadsheets. In addition to internal UTI, there is also a wealth of publicly available UTI on websites, in newspapers, in courthouse records, and in other sources that can add value when combined with internally managed information. This paper describes a system called Compressed Document Set Architecture (CoDoSA), designed to facilitate the integration of UTI into a structured database environment where it can be more readily accessed and manipulated. The CoDoSA Framework comprises an XML-based metadata standard and an associated Application Program Interface (API). The paper further describes how CoDoSA can facilitate the storage and management of information during the ETL (Extract, Transform, and Load) process used to integrate UTI. It also explains how CoDoSA promotes higher information quality by providing several features that simplify the governance of metadata standards and the enforcement of data quality constraints across different UTI applications and development teams. In addition, CoDoSA provides a mechanism for inserting semantic tags into captured UTI, tags that can be used in later steps to drive semantic-mediated queries and processes.
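    The abstract does not reproduce the CoDoSA metadata standard itself, but the idea of a compressed document wrapped in an XML envelope carrying semantic tags can be sketched as follows. The element and attribute names here are hypothetical illustrations, not the actual CoDoSA schema.

    ```python
    import base64
    import xml.etree.ElementTree as ET
    import zlib

    def wrap_uti(doc_id: str, text: str, tags: dict) -> str:
        """Wrap an unstructured text document in a CoDoSA-style XML
        envelope (hypothetical schema): a compressed payload plus
        semantic-tag metadata for later query mediation."""
        root = ET.Element("codosaDoc", id=doc_id)
        meta = ET.SubElement(root, "metadata")
        for key, value in tags.items():  # semantic tags, e.g. {"type": "contract"}
            ET.SubElement(meta, "tag", name=key).text = value
        payload = ET.SubElement(root, "payload", encoding="zlib+base64")
        payload.text = base64.b64encode(zlib.compress(text.encode())).decode()
        return ET.tostring(root, encoding="unicode")

    def unwrap_uti(xml_str: str) -> str:
        """Recover the original text from the envelope."""
        root = ET.fromstring(xml_str)
        payload = root.find("payload")
        return zlib.decompress(base64.b64decode(payload.text)).decode()
    ```

    Keeping the tags outside the compressed payload is what would let an ETL step query or route documents on their semantics without decompressing them.
    
    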

    Critical Cultural Success Factors for Achieving High Quality Information in an Organization

    While information and data quality practitioners are in general agreement that social, cultural, and organizational factors are the most important in determining the success or failure of an organization's data quality programs, there is little to no existing research quantifying these factors. In this research we build on both our previous research and others' to distill and clarify the cultural factors that are the Critical Cultural Success Factors (CCSFs) for successful Information and Data Quality programs in an organization. Using the Delphi method to gain consensus from a group of experts, we distilled fourteen factors down to six and clarified the definitions of those six. We begin by explaining how these CCSFs fit into Organizational Learning Theory, and we plan ultimately to define a new system dynamics model incorporating them so that organizations and information quality practitioners can positively affect the success of information and data quality programs.

    Theme-weighted Ranking of Keywords from Text Documents using Phrase Embeddings

    Keyword extraction is a fundamental task in natural language processing that facilitates mapping of documents to a concise set of representative single- and multi-word phrases. Keywords are primarily extracted from text documents using supervised and unsupervised approaches. In this paper, we present an unsupervised technique that combines a theme-weighted personalized PageRank algorithm with neural phrase embeddings for extracting and ranking keywords. We also introduce an efficient way of processing text documents and training phrase embeddings using existing techniques. We share an evaluation dataset, derived from an existing dataset, that is used for choosing the underlying embedding model. Evaluations of ranked keyword extraction are performed on two benchmark datasets comprising short abstracts (Inspec) and long scientific papers (SemEval 2010), and the technique is shown to produce better results than state-of-the-art systems. (Preprint of a paper accepted in the Proceedings of the 1st IEEE International Conference on Multimedia Information Processing and Retrieval.)
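    The core ranking step the abstract names, a personalized PageRank whose teleport distribution is biased toward theme-relevant phrases, can be sketched in pure Python. The graph construction and theme weights below are simplified assumptions: the paper builds its graph from phrase co-occurrence and derives theme weights from phrase embeddings, neither of which is reproduced here.

    ```python
    def personalized_pagerank(graph, theme_weights, damping=0.85, iters=50):
        """Personalized PageRank by power iteration.
        graph: dict phrase -> list of co-occurring phrases (symmetric edges).
        theme_weights: dict phrase -> theme relevance; the teleport
        distribution is biased toward high-weight (theme-relevant) phrases.
        Returns phrases sorted by rank, highest first."""
        nodes = list(graph)
        total = sum(theme_weights.get(n, 0.0) for n in nodes) or 1.0
        teleport = {n: theme_weights.get(n, 0.0) / total for n in nodes}
        rank = {n: 1.0 / len(nodes) for n in nodes}
        for _ in range(iters):
            new = {}
            for n in nodes:
                # Mass flowing in from each neighbor, split uniformly
                # over that neighbor's own edges.
                incoming = sum(rank[m] / len(graph[m]) for m in nodes if n in graph[m])
                new[n] = (1 - damping) * teleport[n] + damping * incoming
            rank = new
        return sorted(nodes, key=rank.get, reverse=True)
    ```

    With a uniform teleport vector this reduces to ordinary PageRank; the theme weighting is what pulls thematically central phrases to the top of the keyword ranking.
    
    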

    An Empirical Study of Extreme Programming

    Extreme Programming (XP) is a drastic departure from traditional software development processes, in which a complete planning cycle usually precedes any design and implementation work. We report results from an empirical study of two object-oriented systems that were developed using a process similar to XP. In particular, we used two metrics, System Design Instability (SDI) and Class Implementation Instability (CII), to track the design evolution. We found that both systems experienced a significant increase in the number of classes in the middle of the process. The new stories introduced at the beginning of each cycle may change the existing design unpredictably. The CII metric seems to give a good indication of project completeness.
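    The abstract does not give the SDI or CII formulas, but an instability measure of this kind can be sketched under an assumed definition: the fraction of classes in the current iteration that are new or whose implementation changed since the previous iteration. This is an illustrative assumption, not the paper's actual metric.

    ```python
    def class_implementation_instability(prev: dict, curr: dict) -> float:
        """Assumed CII-style measure: fraction of classes in the current
        iteration that are new or whose implementation changed.
        prev / curr map class name -> a hash of that class's source code."""
        if not curr:
            return 0.0
        changed = sum(1 for name, h in curr.items() if prev.get(name) != h)
        return changed / len(curr)
    ```

    Plotting such a value per iteration is how an instability curve can be tracked over a project's life; a value trending toward zero would signal the stabilization the paper associates with project completeness.
    
    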

    Software Metrics and Object-Oriented System Evolution

    The Object-Oriented (OO) system design process can be quantitatively measured by metrics. Results from an empirical study of two OO systems that employed some of these metrics are discussed.

    A Curriculum for a Master of Science in Information Quality

    The first Master of Science in Information Quality (IQ) degree has been designed and is being offered to prepare students for careers in industry and government as well as for advanced graduate studies. The curriculum is guided by the Model Curriculum and Guidelines for Graduate Degree Programs in Information Systems, which are endorsed by the Association for Computing Machinery and the Association for Information Systems. The curriculum integrates two key educational innovations: (1) an interdisciplinary approach to curriculum design, and (2) a balance between theoretical rigor and practical relevance. In response to demand from industry, the curriculum aims to educate students who can lead the effort to solve current and future information quality problems. As such, problem-based learning is balanced with foundation-building learning to effectively deliver the intellectual content of the curriculum. Much of the individual course content is based on accumulated research results and practices developed over the last two decades. The curriculum is designed to balance information quality theory with industry best practices using modern tools and technology, and it includes the skill sets that are critical to succeed as an IQ professional. Since IQ is an interdisciplinary field, the curriculum draws upon total quality management, databases, core IQ knowledge, change management, project management, and IQ policy and strategy. The courses are delivered using case studies, hands-on laboratories, theory building, and team projects to enhance the student's learning experience. Upon completing the program, students will be equipped with sufficient breadth and depth in the IQ field to solve real-world problems and pursue further studies.

    An Empirical Study of Java Design Efficiency in a Client-Server Database System

    This study presents performance comparisons among four Java design architectures in a client-server database system. We found that among the four designs, native JDBC-ODBC bridge, servlet, XML parser, and serialized objects, the last is the most efficient in terms of response time. The four architectures are provided as Java source code for reference.
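    The paper's measurements were taken against Java architectures, which are not reproduced here; purely as an illustration of how such response-time comparisons are typically set up, the sketch below times the round trip of two serialization strategies over an in-memory payload. The payload, repetition count, and choice of strategies are all assumptions for the example.

    ```python
    import json
    import pickle
    import time

    def time_roundtrip(serialize, deserialize, payload, reps=200):
        """Average round-trip (serialize + deserialize) time in seconds,
        measured with a monotonic high-resolution clock."""
        start = time.perf_counter()
        for _ in range(reps):
            deserialize(serialize(payload))
        return (time.perf_counter() - start) / reps

    # A toy result set standing in for rows fetched from a database.
    rows = [{"id": i, "name": f"row{i}"} for i in range(100)]

    t_json = time_roundtrip(json.dumps, json.loads, rows)
    t_pickle = time_roundtrip(pickle.dumps, pickle.loads, rows)
    ```

    Averaging over many repetitions, as done here, is the usual way to smooth out clock resolution and scheduling noise when comparing alternative designs.
    
    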