
    3D Partition-Based Clustering for Supply Chain Data Management

    Supply Chain Management (SCM) is the management of the flow of products and goods from their point of origin to the point of consumption. The information gathered during SCM is massive and complex because it spans several processes, such as procurement, product development and commercialization, physical distribution, outsourcing, and partnerships. For practical applications, SCM datasets must be managed and maintained to serve the three main categories of users: distributors, customers, and suppliers. To manage these datasets, a data constellation structure is used to accommodate the data in a spatial database. However, this arrangement creates problems in a geospatial database; in particular, performance deteriorates during query operations. We argue that a more practical hierarchical tree structure is required for efficient SCM processing. Moreover, a three-dimensional approach is needed because SCM datasets involve multi-level locations such as shop lots and residential apartments. The 3D R-Tree has been increasingly used for 3D geospatial database management due to its simplicity and extensibility, but it suffers from serious overlaps between nodes. In this paper, we propose partition-based clustering for the construction of a hierarchical tree structure. Several datasets are tested with the proposed method, and the percentage of overlapping nodes and the volume coverage are computed and compared with the original 3D R-Tree and other practical approaches. The experiments demonstrate that the hierarchical structure produced by the proposed partition-based clustering preserves minimal overlap and coverage. Query performance was tested on an SCM dataset of 300,000 points, and the results are presented. The paper also discusses the outlook of the structure for future work.
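    The overlap and coverage figures mentioned in the abstract are geometric measures over the 3D bounding boxes of tree nodes. As a rough illustration only (not the paper's exact formulation), the sketch below computes the percentage of overlapping node pairs and a simple volume-coverage total for a set of 3D boxes; the box coordinates and the coverage definition are assumptions made for the example.

```python
# Illustrative sketch (not the paper's formulation): pairwise overlap and a
# simple volume-coverage total for 3D bounding boxes, the kind of metrics
# used to compare node layouts in a 3D R-Tree.
from itertools import combinations

def volume(box):
    (x0, y0, z0), (x1, y1, z1) = box
    return max(0.0, x1 - x0) * max(0.0, y1 - y0) * max(0.0, z1 - z0)

def overlap_volume(a, b):
    lo = tuple(max(a[0][i], b[0][i]) for i in range(3))
    hi = tuple(min(a[1][i], b[1][i]) for i in range(3))
    return volume((lo, hi))

def overlap_stats(boxes):
    pairs = list(combinations(boxes, 2))
    overlapping = sum(1 for a, b in pairs if overlap_volume(a, b) > 0.0)
    return {
        "overlapping_pairs_pct": 100.0 * overlapping / len(pairs) if pairs else 0.0,
        "total_coverage": sum(volume(b) for b in boxes),
    }

# Three example node boxes: the first two overlap, the third is disjoint.
nodes = [((0, 0, 0), (2, 2, 2)), ((1, 1, 1), (3, 3, 3)), ((5, 5, 5), (6, 6, 6))]
print(overlap_stats(nodes))  # 1 of 3 pairs overlaps -> ~33.3%
```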

    A Query-Driven Spatial Data Warehouse Conceptual Schema For Disaster Management

    Malaysia has experienced various types of disasters. Such events cause billions of US dollars in losses and pose great challenges to the government to provide better disaster management; indeed, disaster management is an important global problem. The National Security Council's (NSC) Directive No. 20, which outlines Malaysia's policy on disaster and relief management, demonstrates the government's efforts and initiatives to respond efficiently to disasters. In this regard, decision making is a key factor for organizational success. Positive outcomes depend on available data that can be manipulated to provide information to the decision maker, who faces the difficult and complex task of anticipating upcoming events and analyzing multiple parameters. Disaster management involves multiple sources of data collection at various levels as well as a wide array of stakeholders, so access to heterogeneous spatial data is challenging. It is crucial to address this problem in terms of data distribution, query operation, and analysis, because each source, level, and stakeholder involved has its own preferences regarding format, structure, syntax, and schema. The main purpose of this research is to support the complex decision-making process during disaster management by enriching the body of knowledge on spatial data warehousing, particularly conceptual schema design. The major research problems identified are the heterogeneity of spatial resource data models, the most appropriate approach to schema design, and the degree to which the schema depends on the given tools. These problems must be addressed because they are the main roadblocks to accessing and retrieving information. The existence of heterogeneous data sources and restricted access to relevant information during a disaster raises several issues in spatial data warehouse design, which can be grouped into three considerations: the need for guidelines and formalism; a schema generation model and schema design framework; and a generalized schema. Four strategies are designed to address these problems: identifying relevant requirements, creating a conceptual design framework, deriving an appropriate schema, and refining the proposed method. User queries are prioritized in the conceptual design framework. Outputs from the formalization process are used with a schema algorithm to derive a generalized schema. The conceptual model framework is taken as representative of a potential application or system developed to design a conceptual schema from the problematic heterogeneous data under a restricted approach to the corresponding query formalisms. In the schema derivation phase, the conceptual schema produced by the proposed framework is presented along with the final conceptual schema. The design is then incorporated into a tool to run an experiment demonstrating that queries from a heterogeneous context can drive context-appropriate conceptual schema design in a generic way. Such results surpass the capabilities of a restricted design approach and can potentially answer relevant queries in less time.
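    To make the "query-driven" idea concrete, the hypothetical sketch below turns one structured user query into a toy fact/dimension (star-style) conceptual schema with a spatial dimension. All names, the query structure, and the derivation rule are illustrative assumptions, not the schema algorithm developed in the thesis.

```python
# Hypothetical sketch: seed a star-style conceptual schema from a user query.
# Everything here (field names, query shape, derivation rule) is illustrative.
from dataclasses import dataclass, field

@dataclass
class Dimension:
    name: str
    attributes: list
    spatial: bool = False

@dataclass
class FactSchema:
    name: str
    measures: list
    dimensions: list = field(default_factory=list)

def schema_from_query(query: dict) -> FactSchema:
    """Derive a minimal fact/dimension schema from a structured user query."""
    fact = FactSchema(name=query["subject"], measures=query["measures"])
    for axis in query["group_by"]:
        fact.dimensions.append(
            Dimension(name=axis,
                      attributes=[axis + "_id", axis + "_name"],
                      spatial=axis in query.get("spatial_axes", []))
        )
    return fact

# Example query: "total affected population and damage cost per district per month"
query = {
    "subject": "flood_incident",
    "measures": ["affected_population", "damage_cost"],
    "group_by": ["district", "month"],
    "spatial_axes": ["district"],
}
print(schema_from_query(query))
```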

    A survey on big data indexing strategies

    The operations of the Internet have led to significant growth and accumulation of data, known as Big Data. Individuals and organizations that use this data neither anticipated nor were prepared for this data explosion; hence, the available solutions cannot meet the processing needs of the growing heterogeneous data, which results in inefficient information retrieval and search query results. Indexing strategies that can support this need must be designed. A survey of various indexing strategies and how they are used to solve Big Data management issues can serve both as a guide for choosing the strategy best suited to a problem and as a basis for designing more efficient indexing strategies. The aim of this study is to explore the characteristics of the indexing strategies used for Big Data manageability, covering some of the strengths and weaknesses of the B-tree, the R-tree, and others. The paper covers some popular indexing strategies used for Big Data management and exposes the potential of each by exploring their properties in ways related to problem solving.
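    As a toy illustration of why different data call for different index structures, the sketch below contrasts an ordered one-dimensional lookup (the idea behind a B-tree) with a bounding-box window query (the idea behind an R-tree). It is a didactic example only, not a production index, and all data values are invented.

```python
# Toy contrast between a one-dimensional ordered index (B-tree idea) and a
# spatial index (R-tree idea): the former answers key-range queries, the
# latter answers bounding-box window queries.
import bisect

# Ordered index: keys kept sorted, so a range query is two binary searches.
keys = sorted([17, 3, 42, 8, 23, 91, 55])
lo, hi = bisect.bisect_left(keys, 10), bisect.bisect_right(keys, 60)
print("range 10..60:", keys[lo:hi])          # [17, 23, 42, 55]

# Spatial lookup: each record has a bounding box (xmin, ymin, xmax, ymax);
# a window query keeps records whose box intersects the query window.
# A real R-tree prunes whole subtrees instead of scanning every record.
records = {"a": (0, 0, 2, 2), "b": (5, 5, 7, 7), "c": (1, 1, 3, 3)}

def intersects(box, window):
    return not (box[2] < window[0] or box[0] > window[2] or
                box[3] < window[1] or box[1] > window[3])

window = (0, 0, 2, 2)
print("window hits:", [k for k, b in records.items() if intersects(b, window)])
```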

    Evaluation of Deep Convolutional Nets for Document Image Classification and Retrieval

    This paper presents a new state-of-the-art for document image classification and retrieval, using features learned by deep convolutional neural networks (CNNs). In object and scene analysis, deep neural nets are capable of learning a hierarchical chain of abstraction from pixel inputs to concise and descriptive representations. The current work explores this capacity in the realm of document analysis, and confirms that this representation strategy is superior to a variety of popular hand-crafted alternatives. Experiments also show that (i) features extracted from CNNs are robust to compression, (ii) CNNs trained on non-document images transfer well to document analysis tasks, and (iii) enforcing region-specific feature-learning is unnecessary given sufficient training data. This work also makes available a new labelled subset of the IIT-CDIP collection, containing 400,000 document images across 16 categories, useful for training new CNNs for document analysis
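    A minimal sketch of the feature-extraction recipe described above: a CNN pretrained on natural images is used as a fixed feature extractor for document pages, and a lightweight classifier is then trained on those features. The choice of torchvision's ResNet-18 and the 512-dimensional output here are assumptions for illustration; the paper's own network architectures and training details are given in the full text.

```python
# Hedged sketch of CNN transfer features for document images (ResNet-18 is an
# illustrative stand-in, not the paper's network).
import torch
from torchvision import models, transforms
from PIL import Image

weights = models.ResNet18_Weights.DEFAULT
backbone = models.resnet18(weights=weights)
backbone.fc = torch.nn.Identity()      # drop the ImageNet classifier head
backbone.eval()

preprocess = weights.transforms()      # resizing/normalization the backbone expects

def document_features(path: str) -> torch.Tensor:
    """Return a 512-d feature vector for one document image."""
    img = Image.open(path).convert("RGB")
    with torch.no_grad():
        return backbone(preprocess(img).unsqueeze(0)).squeeze(0)

# A linear classifier (e.g. logistic regression or a single nn.Linear layer)
# trained on these vectors then predicts the document category.
```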

    A Multi-Agent Architecture for Distributed Domain-Specific Information Integration

    On both the public Internet and private intranets, there is a vast amount of data available that is owned and maintained by different organizations distributed all around the world. These data resources are rich and recent; however, information gathering and knowledge discovery from them, in a particular knowledge domain, confront major difficulties. The objective of this article is to introduce an autonomous methodology for domain-specific information gathering and integration from multiple distributed sources.

    Many-Task Computing and Blue Waters

    This report discusses many-task computing (MTC) generically and in the context of the proposed Blue Waters system, which is planned to be the largest NSF-funded supercomputer when it begins production use in 2012. The aim of this report is to inform the BW project about MTC, including understanding aspects of MTC applications that can be used to characterize the domain and understanding the implications of these aspects for middleware and policies. Many MTC applications do not neatly fit the stereotypes of high-performance computing (HPC) or high-throughput computing (HTC) applications. Like HTC applications, by definition MTC applications are structured as graphs of discrete tasks, with explicit input and output dependencies forming the graph edges. However, MTC applications have significant features that distinguish them from typical HTC applications. In particular, different engineering constraints for hardware and software must be met in order to support these applications. HTC applications have traditionally run on platforms such as grids and clusters, through either workflow systems or parallel programming systems. MTC applications, in contrast, will often demand a short time to solution, may be communication intensive or data intensive, and may comprise very short tasks. Therefore, hardware and software for MTC must be engineered to support the additional communication and I/O and must minimize task dispatch overheads. The hardware of large-scale HPC systems, with its high degree of parallelism and support for intensive communication, is well suited for MTC applications. However, HPC systems often lack a dynamic resource-provisioning feature, are not ideal for task communication via the file system, and have an I/O system that is not optimized for MTC-style applications. Hence, additional software support is likely to be required to gain full benefit from the HPC hardware.
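    The task-graph structure ascribed to MTC applications above can be sketched as follows: tasks with explicit input and output files are dispatched as soon as all of their inputs exist. The task names, file names, and the scheduling loop below are illustrative assumptions, not the Blue Waters middleware.

```python
# Illustrative sketch of an MTC-style task graph: edges are explicit
# input/output file dependencies, and a task runs once its inputs exist.
from collections import deque

tasks = {
    "split":  {"inputs": ["raw.dat"],          "outputs": ["p1.dat", "p2.dat"]},
    "sim_1":  {"inputs": ["p1.dat"],           "outputs": ["r1.dat"]},
    "sim_2":  {"inputs": ["p2.dat"],           "outputs": ["r2.dat"]},
    "reduce": {"inputs": ["r1.dat", "r2.dat"], "outputs": ["summary.dat"]},
}

def run(tasks, available):
    """Dispatch every task whose inputs are available, in dependency order."""
    pending, done, stalls = deque(tasks), [], 0
    while pending:
        name = pending.popleft()
        spec = tasks[name]
        if all(f in available for f in spec["inputs"]):
            done.append(name)                  # a real system would launch the task here
            available.update(spec["outputs"])
            stalls = 0
        else:
            pending.append(name)               # inputs not ready yet; retry later
            stalls += 1
            if stalls > len(pending):          # cycled through everything with no progress
                raise RuntimeError("unsatisfiable dependencies")
    return done

print(run(tasks, available={"raw.dat"}))       # ['split', 'sim_1', 'sim_2', 'reduce']
```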