
    Cost-benefit Analysis of Web Bag in a Web Warehouse

    Sets and bags are closely related structures and have been studied in relational databases. A bag differs from a set in that it is sensitive to the number of times an element occurs, while a set is not. In this paper, we introduce the concept of a Web bag in the context of a World Wide Web warehouse called WHOWEDA (WareHouse Of WEb DAta), which we are currently building. Informally, a Web bag is a Web table that allows multiple occurrences of identical Web tuples. A Web bag helps one to discover useful knowledge from a Web table, such as visible documents or Web sites (i.e. documents/sites which can be reached by many paths), luminous documents (i.e. documents with many outgoing links), and luminous paths (i.e. frequently traversed paths). In this paper, we provide a cost-benefit analysis of materializing Web bags as compared to Web tables with distinct Web tuples.
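The set/bag distinction this abstract relies on can be sketched in a few lines. The example below is illustrative only (it does not use WHOWEDA's actual Web table operators, which are not shown in this excerpt); it uses Python's `collections.Counter` as a bag to show how duplicate-preserving semantics expose "visible" documents, i.e. documents reached by many paths, which set semantics would hide.

```python
from collections import Counter

# Hypothetical query result: the terminal document of each traversed path.
# A set discards duplicates; a bag (multiset) keeps multiplicities.
path_endpoints = ["doc_a", "doc_b", "doc_a", "doc_c", "doc_a", "doc_b"]

as_set = set(path_endpoints)      # {'doc_a', 'doc_b', 'doc_c'} - counts lost
as_bag = Counter(path_endpoints)  # Counter({'doc_a': 3, 'doc_b': 2, 'doc_c': 1})

# Bag semantics reveal "visible" documents: those reachable via many paths.
visible = [doc for doc, n in as_bag.items() if n >= 2]
print(sorted(visible))  # ['doc_a', 'doc_b']
```

The cost-benefit question the paper studies is precisely this trade-off: the bag stores more (every duplicate tuple) in exchange for knowledge the set representation cannot recover.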

    Modeling views in the layered view model for XML using UML

    In data engineering, view formalisms provide flexibility to users and user applications by allowing them to extract and elaborate data from stored data sources. Meanwhile, since its introduction, Extensible Markup Language (XML) has fast emerged as the dominant standard for storing, describing, and interchanging data among various web and heterogeneous data sources. In combination with XML Schema, XML provides rich facilities for defining and constraining user-defined data semantics and properties, a feature that is unique to XML. In this context, it is interesting to investigate traditional database features, such as view models and view design techniques, for XML. However, traditional view formalisms are strongly coupled to the data language and its syntax, so supporting views over semi-structured data models proves difficult. Therefore, in this paper we propose a Layered View Model (LVM) for XML with conceptual and schemata extensions. Our work here is three-fold: first, we propose an approach that separates the implementation and conceptual aspects of views, providing a clear separation of concerns and thus allowing the analysis and design of views to be decoupled from their implementation. Second, we define representations to express and construct these views at the conceptual level. Third, we define a view transformation methodology for XML views in the LVM, which carries out automated transformation to a view schema and a view query expression in an appropriate query language. To validate and apply the LVM concepts, methods, and transformations developed, we also propose a view-driven application development framework with the flexibility to develop web and database applications for XML at varying levels of abstraction.
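To ground what an XML view computes, here is a minimal sketch, assuming a hypothetical catalog document (the paper's LVM notation and transformation machinery are not reproduced here). It uses Python's standard-library `xml.etree.ElementTree` to materialize a derived XML tree from a source document, the kind of result a view query expression would produce.

```python
import xml.etree.ElementTree as ET

# Hypothetical source document (not from the paper).
source = """
<catalog>
  <book genre="db"><title>Web Warehousing</title><price>40</price></book>
  <book genre="xml"><title>XML Views</title><price>30</price></book>
  <book genre="db"><title>Data Models</title><price>55</price></book>
</catalog>
"""

root = ET.fromstring(source)

# A "view" materialized as a new XML tree: only db-genre books,
# restructured to expose just title and price.
view = ET.Element("db-books")
for book in root.findall("book[@genre='db']"):
    entry = ET.SubElement(view, "entry")
    entry.set("title", book.findtext("title"))
    entry.set("price", book.findtext("price"))

print(ET.tostring(view, encoding="unicode"))
```

The point of a layered view model is that this selection/restructuring logic would be specified once at the conceptual level and translated automatically into a concrete query expression, rather than hand-coded as above.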

    Warehouse Redesign for Bay State Milling Corp.

    The goal of the project was to increase the efficiency of BSM's inventory management system at its Clifton facility. The team evaluated the procedures, diagnosed the problems, and built recommendations to address them. Through the provided data and on-site observation, we discovered that the current inventory layout was inefficient for the facility's inventory flow system, that the current policy regarding customer practices was too lenient, and that BSM does not collect metrics on inventory movement. We then developed short-term, intermediate, and long-term recommendations to address the problems we identified.

    Impliance: A Next Generation Information Management Appliance

    …ably successful in building a large market and adapting to the changes of the last three decades, its impact on the broader market of information management is surprisingly limited. If we were to design an information management system from scratch, based upon today's requirements and hardware capabilities, would it look anything like today's database systems?" In this paper, we introduce Impliance, a next-generation information management system consisting of hardware and software components integrated to form an easy-to-administer appliance that can store, retrieve, and analyze all types of structured, semi-structured, and unstructured information. We first summarize the trends that will shape information management for the foreseeable future. Those trends imply three major requirements for Impliance: (1) to be able to store, manage, and uniformly query all data, not just structured records; (2) to be able to scale out as the volume of this data grows; and (3) to be simple and robust in operation. We then describe four key ideas that are uniquely combined in Impliance to address these requirements, namely: (a) integrating software and off-the-shelf hardware into a generic information appliance; (b) automatically discovering, organizing, and managing all data, unstructured as well as structured, in a uniform way; (c) achieving scale-out by exploiting simple, massively parallel processing; and (d) virtualizing compute and storage resources to unify, simplify, and streamline the management of Impliance. Impliance is an ambitious, long-term effort to define simpler, more robust, and more scalable information systems for tomorrow's enterprises.
    This article is published under a Creative Commons License Agreement (http://creativecommons.org/licenses/by/2.5/). You may copy, distribute, display, and perform the work, make derivative works, and make commercial use of the work, but you must attribute the work to the author and CIDR 2007.
    3rd Biennial Conference on Innovative Data Systems Research (CIDR), January 7-10, 2007, Asilomar, California, USA

    Can Dairy Manure be Profitably Composted in Maine?

    Manure contains many important nutrients that are vital to the growth of crops. When this material is applied to fields in an inappropriate manner or in quantities too large for the soil to handle, it leads to pollution in the form of leaching and runoff, which contaminates ground and surface waters. An average cow produces one hundred pounds of manure per day (18 tons per year). Composted manure could provide farmers with a more environmentally friendly alternative to traditional manure management practices. A review of the composting literature determined that a wide variety of markets do exist for composted dairy manure. The cost of producing the raw compost product was calculated, along with the cost of bagging and transporting compost. It was determined that bulk compost could not be profitably transported to market, but that bagged compost can be.

    Graph BI & analytics: current state and future challenges

    In an increasingly competitive market, making well-informed decisions requires the analysis of a wide range of heterogeneous, large, and complex data. This paper focuses on the emerging field of graph warehousing. Graphs are widespread structures that offer great expressive power; they are used for modeling highly complex and interconnected domains and for efficiently solving emerging big data applications. This paper presents the current status and open challenges of graph BI and analytics, and motivates the need for new warehousing frameworks aware of the topological nature of graphs. We survey the topics of graph modeling, management, processing, and analysis in graph warehouses, then conclude by discussing future research directions and positioning them within a unified architecture of a graph BI and analytics framework.
    Peer reviewed. Postprint (author's final draft).

    Expedite requests in Raytheon's North Texas supply chain

    Thesis (M.B.A.)--Massachusetts Institute of Technology, Sloan School of Management; and (S.M.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science; in conjunction with the Leaders for Manufacturing Program at MIT, 2006. Includes bibliographical references (p. 69-70).
    In December 2004, a manager at Raytheon Company articulated, in the form of an LFM (Leaders for Manufacturing) internship proposal, his belief that someone should do something about the amounts of time and money that Raytheon's North Texas plants spent handling expedite requests: requests that someone provide goods or services more quickly than normal. This thesis attempts to summarize the thoughts, learnings, initiatives, and outcomes associated with the ensuing effort. In particular, a large section of the paper is devoted to a case study of the most involved initiative: the devising and implementation of a new dispatching method in one small but central operation, within an organization with a long history of processing things first in, first out. While for the project team the compelling factor was achieving a specific dollar impact, the reader of this paper will probably be more interested in the methodology than in Raytheon's ROI. Research for this thesis was conducted during a six-month internship with Raytheon Company's Space and Airborne Systems Supply Chain Management group in McKinney, TX, and Dallas, TX. The internship was affiliated with the Massachusetts Institute of Technology's Leaders for Manufacturing (LFM) Program.
    By Scott K. Hiroshige. S.M. and M.B.A.

    Reduce, Reuse, Recycle

    The goal of the project was to reduce Nypro's impact on the environment by facilitating the company's recent Reduce-Reuse-Recycle initiative. Through multiple site visits, extensive interviews, and process analyses, the project team identified numerous internal opportunities within Nypro facilities in China, as well as external opportunities at selected Nypro suppliers. The opportunities of highest interest were then evaluated in terms of monetary savings and reductions in carbon emissions and waste materials.

    Reverse Logistics Optimization

    The purpose of the Reverse Logistics Optimization project is to achieve best-in-class logistics and repair capabilities by integrating with an insurance provider as a reverse logistics return center, handling remorse returns across all channels for a cellular carrier. Product recovery, which comprises return, refurbishment, and repair processes, requires an efficient reverse logistics network. One of the main characteristics of the reverse logistics network problem is uncertainty, which further amplifies its complexity: capacities, demands, and quantities of returned products are all uncertain parameters. The goal is to expedite the repair process, leverage dynamic handling of blind receipts, and improve reverse logistics planning (the Repair Plan and Kitting Plan) in order to optimize cost and open new opportunities for revenue creation. Integration and control are expected to reduce costs substantially by optimizing the return, refurbishment, and repair processes. Nearly all organizations anticipate increased management attention to reverse logistics, and the results indicate substantial potential for third-party service providers, including decision-support tools in this area. The paper also analyzes the underlying reasons for generally poor reverse logistics performance and calls attention to issues that need improvement.

    Survey of Parallel Processing on Big Data

    No doubt we are entering the big data era. Datasets have grown from small to super-large scale, which brings us benefits but also challenges: they become more and more difficult to handle with traditional data processing methods. Many companies have started to invest in parallel processing frameworks and systems for their products, because serial methods cannot feasibly handle big data problems. Parallel database systems, MapReduce, Hadoop, Pig, Hive, Spark, and Twister are some examples of these products. Many of these frameworks and systems can handle different kinds of big data problems, but none of them covers all big data issues; how to use existing parallel frameworks and systems wisely for large-scale data becomes the biggest challenge. We investigate and analyze the performance of parallel processing for big data, review various parallel processing architectures and frameworks and their capabilities for large-scale data, and present the potential challenges of multiple techniques according to the characteristics of big data. Finally, we present possible solutions for those challenges.
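The map/reduce pattern underlying several of the frameworks named above (MapReduce, Hadoop, Spark) can be sketched in miniature. This is a toy illustration using Python's standard-library `multiprocessing`, not any of those frameworks' actual APIs: each worker counts words in one chunk of the input (map), and the per-chunk counts are then merged (reduce).

```python
from collections import Counter
from multiprocessing import Pool

def map_count(chunk):
    """Map step: count words in one chunk of the input."""
    return Counter(chunk.split())

def reduce_counts(partials):
    """Reduce step: merge the per-chunk counters into a global count."""
    total = Counter()
    for c in partials:
        total += c
    return total

if __name__ == "__main__":
    chunks = ["big data big", "data processing", "big processing data"]
    with Pool(2) as pool:              # chunks are counted in parallel
        partials = pool.map(map_count, chunks)
    totals = reduce_counts(partials)
    print(totals["big"], totals["data"])  # 3 3
```

The same decomposition is what lets real systems scale out: the map step is embarrassingly parallel across chunks, and only the small per-chunk summaries travel to the reduce step.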