101 research outputs found

    Frequent itemset mining in big data with effective single scan algorithms

    Get PDF
    © 2013 IEEE. This paper considers frequent itemsets mining in transactional databases. It introduces a new accurate single scan approach for frequent itemset mining (SSFIM), a heuristic as an alternative approach (EA-SSFIM), as well as a parallel implementation on Hadoop clusters (MR-SSFIM). EA-SSFIM and MR-SSFIM target sparse and big databases, respectively. The proposed approach (in all its variants) requires only one scan to extract the candidate itemsets, and it has the advantage to generate a fixed number of candidate itemsets independently from the value of the minimum support. This accelerates the scan process compared with existing approaches while dealing with sparse and big databases. Numerical results show that SSFIM outperforms the state-of-the-art FIM approaches while dealing with medium and large databases. Moreover, EA-SSFIM provides similar performance as SSFIM while considerably reducing the runtime for large databases. The results also reveal the superiority of MR-SSFIM compared with the existing HPC-based solutions for FIM using sparse and big databases

    A multithreaded hybrid framework for mining frequent itemsets

    Get PDF
    Mining frequent itemsets is an area of data mining that has beguiled several researchers in recent years. Varied data structures such as Nodesets, DiffNodesets, NegNodesets, N-lists, and Diffsets are among a few that were employed to extract frequent items. However, most of these approaches fell short either in respect of run time or memory. Hybrid frameworks were formulated to repress these issues that encompass the deployment of two or more data structures to facilitate effective mining of frequent itemsets. Such an approach aims to exploit the advantages of either of the data structures while mitigating the problems of relying on either of them alone. However, limited efforts have been made to reinforce the efficiency of such frameworks. To address these issues this paper proposes a novel multithreaded hybrid framework comprising of NegNodesets and N-list structure that uses the multicore feature of today’s processors. While NegNodesets offer a concise representation of itemsets, N-lists rely on List intersection thereby speeding up the mining process. To optimize the extraction of frequent items a hash-based algorithm has been designed here to extract the resultant set of frequent items which further enhances the novelty of the framework

    Digital imaging technology assessment: Digital document storage project

    Get PDF
    An ongoing technical assessment and requirements definition project is examining the potential role of digital imaging technology at NASA's STI facility. The focus is on the basic components of imaging technology in today's marketplace as well as the components anticipated in the near future. Presented is a requirement specification for a prototype project, an initial examination of current image processing at the STI facility, and an initial summary of image processing projects at other sites. Operational imaging systems incorporate scanners, optical storage, high resolution monitors, processing nodes, magnetic storage, jukeboxes, specialized boards, optical character recognition gear, pixel addressable printers, communications, and complex software processes

    Fifth NASA Goddard Conference on Mass Storage Systems and Technologies

    Get PDF
    This document contains copies of those technical papers received in time for publication prior to the Fifth Goddard Conference on Mass Storage Systems and Technologies held September 17 - 19, 1996, at the University of Maryland, University Conference Center in College Park, Maryland. As one of an ongoing series, this conference continues to serve as a unique medium for the exchange of information on topics relating to the ingestion and management of substantial amounts of data and the attendant problems involved. This year's discussion topics include storage architecture, database management, data distribution, file system performance and modeling, and optical recording technology. There will also be a paper on Application Programming Interfaces (API) for a Physical Volume Repository (PVR) defined in Version 5 of the Institute of Electrical and Electronics Engineers (IEEE) Reference Model (RM). In addition, there are papers on specific archives and storage products

    Sixth Goddard Conference on Mass Storage Systems and Technologies Held in Cooperation with the Fifteenth IEEE Symposium on Mass Storage Systems

    Get PDF
    This document contains copies of those technical papers received in time for publication prior to the Sixth Goddard Conference on Mass Storage Systems and Technologies which is being held in cooperation with the Fifteenth IEEE Symposium on Mass Storage Systems at the University of Maryland-University College Inn and Conference Center March 23-26, 1998. As one of an ongoing series, this Conference continues to provide a forum for discussion of issues relevant to the management of large volumes of data. The Conference encourages all interested organizations to discuss long term mass storage requirements and experiences in fielding solutions. Emphasis is on current and future practical solutions addressing issues in data management, storage systems and media, data acquisition, long term retention of data, and data distribution. This year's discussion topics include architecture, tape optimization, new technology, performance, standards, site reports, vendor solutions. Tutorials will be available on shared file systems, file system backups, data mining, and the dynamics of obsolescence

    Cyber Security

    Get PDF
    This open access book constitutes the refereed proceedings of the 16th International Annual Conference on Cyber Security, CNCERT 2020, held in Beijing, China, in August 2020. The 17 papers presented were carefully reviewed and selected from 58 submissions. The papers are organized according to the following topical sections: access control; cryptography; denial-of-service attacks; hardware security implementation; intrusion/anomaly detection and malware mitigation; social network security and privacy; systems security

    Cyber Security

    Get PDF
    This open access book constitutes the refereed proceedings of the 16th International Annual Conference on Cyber Security, CNCERT 2020, held in Beijing, China, in August 2020. The 17 papers presented were carefully reviewed and selected from 58 submissions. The papers are organized according to the following topical sections: access control; cryptography; denial-of-service attacks; hardware security implementation; intrusion/anomaly detection and malware mitigation; social network security and privacy; systems security

    Building regulatory compliant storage systems

    Full text link
    In the past decade, informational records have become entirely digital. These include financial statements, health care records, student records, private consumer information and other sensitive data. Because of the delicate nature of the data these records contain, Congress and the courts have begun to recognize the importance of properly storing and securing electronic records. Examples of legislation in-clude the Health Insurance Portability and Accountabilit

    A new framework for a technological perspective of knowledge management

    Get PDF
    Rapid change is a defining characteristic of our modern society. This has huge impact on society, governments, and businesses. Businesses are forced to fundamentally transform themselves to survive in a challenging economy. Transformation implies change in the way business is conducted, in the way people perform their contribution to the organisation, and in the way the organisation perceives and manages its vital assets – which increasingly are built around the key assets of intellectual capital and knowledge. The latest management tool and realisation of how to respond to the challenges of the economy in the new millennium, is the idea of "knowledge management" (KM). In this study we have focused on synthesising the many confusing points of view about the subject area, such as: a. different focus points or perspectives; b. different definitions and positioning of the subject; as well as c. a bewildering number of definitions of what knowledge is and what KM entails. There exists a too blurred distinction in popular-magazine-like sources about this area between subjects and concepts such as: knowledge versus information versus data; the difference between information management and knowledge management; tools available to tackle the issues in this field of study and practice; and the role technology plays versus the huge hype from some journalists and within the vendor community. Today there appears to be a lack of a coherent set of frameworks to abstract, comprehend, and explain this subject area; let alone to build successful systems and technologies with which to apply KM. The study is comprised of two major parts: 1. In the first part the study investigates the concepts, elements, drivers, and challenges related to KM. A set of models for comprehending these issues and notions is contributed as we considered intellectual capital, organizational learning, communities of practice, and best practices. 2. The second part focuses on the technology perspective of KM. Although KM is primarily concerned with non-technical issues this study concentrates on the technical issues and challenges. A new technology framework for KM is proposed to position and relate the different KM technologies as well as the two key applications of KM, namely knowledge portals and knowledge discovery (including text mining). It is concluded that KM and related concepts and notions need to be understood firmly as well as effectively positioned and employed to support the modern business organisation in its quest to survive and grow. The main thesis is that KM technology is a necessary but insufficient prerequisite and a key enabler for successful KM in a rapidly changing business environment.Thesis (PhD (Computer Science))--University of Pretoria, 2010.Computer Scienceunrestricte

    Technology for the Future: In-Space Technology Experiments Program, part 2

    Get PDF
    The purpose of the Office of Aeronautics and Space Technology (OAST) In-Space Technology Experiments Program In-STEP 1988 Workshop was to identify and prioritize technologies that are critical for future national space programs and require validation in the space environment, and review current NASA (In-Reach) and industry/ university (Out-Reach) experiments. A prioritized list of the critical technology needs was developed for the following eight disciplines: structures; environmental effects; power systems and thermal management; fluid management and propulsion systems; automation and robotics; sensors and information systems; in-space systems; and humans in space. This is part two of two parts and contains the critical technology presentations for the eight theme elements and a summary listing of critical space technology needs for each theme
    • …
    corecore