90,390 research outputs found

    Set processing in a network environment

    Get PDF
    A combination of a local network, a mass storage system, and an autonomous set processor serving as a data/storage management machine is described. Its characteristics include: content-accessible data bases usable from all connected devices; efficient storage/access of large data bases; simple and direct programming with data manipulation and storage management handled by the set processor; simple data base design and entry from source representation to set processor representation with no predefinition necessary; capability available for user sort/order specification; significant reduction in tape/disk pack storage and mounts; flexible environment that allows upgrading hardware/software configuration without causing major interruptions in service; minimal traffic on data communications network; and improved central memory usage on large processors

    Expert System for UNIX System Reliability and Availability Enhancement

    Get PDF
    Highly reliable and available systems are critical to the airline industry. However, most off-the-shelf computer operating systems and hardware do not have built-in fault tolerant mechanisms, the UNIX workstation is one example. In this research effort, we have developed a rule-based Expert System (ES) to monitor, command, and control a UNIX workstation system with hot-standby redundancy. The ES on each workstation acts as an on-line system administrator to diagnose, report, correct, and prevent certain types of hardware and software failures. If a primary station is approaching failure, the ES coordinates the switch-over to a hot-standby secondary workstation. The goal is to discover and solve certain fatal problems early enough to prevent complete system failure from occurring and therefore to enhance system reliability and availability. Test results show that the ES can diagnose all targeted faulty scenarios and take desired actions in a consistent manner regardless of the sequence of the faults. The ES can perform designated system administration tasks about ten times faster than an experienced human operator. Compared with a single workstation system, our hot-standby redundancy system downtime is predicted to be reduced by more than 50 percent by using the ES to command and control the system

    A Parallel Algorithm for Exact Bayesian Structure Discovery in Bayesian Networks

    Full text link
    Exact Bayesian structure discovery in Bayesian networks requires exponential time and space. Using dynamic programming (DP), the fastest known sequential algorithm computes the exact posterior probabilities of structural features in O(2(d+1)n2n)O(2(d+1)n2^n) time and space, if the number of nodes (variables) in the Bayesian network is nn and the in-degree (the number of parents) per node is bounded by a constant dd. Here we present a parallel algorithm capable of computing the exact posterior probabilities for all n(n1)n(n-1) edges with optimal parallel space efficiency and nearly optimal parallel time efficiency. That is, if p=2kp=2^k processors are used, the run-time reduces to O(5(d+1)n2nk+k(nk)d)O(5(d+1)n2^{n-k}+k(n-k)^d) and the space usage becomes O(n2nk)O(n2^{n-k}) per processor. Our algorithm is based the observation that the subproblems in the sequential DP algorithm constitute a nn-DD hypercube. We take a delicate way to coordinate the computation of correlated DP procedures such that large amount of data exchange is suppressed. Further, we develop parallel techniques for two variants of the well-known \emph{zeta transform}, which have applications outside the context of Bayesian networks. We demonstrate the capability of our algorithm on datasets with up to 33 variables and its scalability on up to 2048 processors. We apply our algorithm to a biological data set for discovering the yeast pheromone response pathways.Comment: 32 pages, 12 figure

    ARM Wrestling with Big Data: A Study of Commodity ARM64 Server for Big Data Workloads

    Full text link
    ARM processors have dominated the mobile device market in the last decade due to their favorable computing to energy ratio. In this age of Cloud data centers and Big Data analytics, the focus is increasingly on power efficient processing, rather than just high throughput computing. ARM's first commodity server-grade processor is the recent AMD A1100-series processor, based on a 64-bit ARM Cortex A57 architecture. In this paper, we study the performance and energy efficiency of a server based on this ARM64 CPU, relative to a comparable server running an AMD Opteron 3300-series x64 CPU, for Big Data workloads. Specifically, we study these for Intel's HiBench suite of web, query and machine learning benchmarks on Apache Hadoop v2.7 in a pseudo-distributed setup, for data sizes up to 20GB20GB files, 5M5M web pages and 500M500M tuples. Our results show that the ARM64 server's runtime performance is comparable to the x64 server for integer-based workloads like Sort and Hive queries, and only lags behind for floating-point intensive benchmarks like PageRank, when they do not exploit data parallelism adequately. We also see that the ARM64 server takes 13rd\frac{1}{3}^{rd} the energy, and has an Energy Delay Product (EDP) that is 5071%50-71\% lower than the x64 server. These results hold promise for ARM64 data centers hosting Big Data workloads to reduce their operational costs, while opening up opportunities for further analysis.Comment: Accepted for publication in the Proceedings of the 24th IEEE International Conference on High Performance Computing, Data, and Analytics (HiPC), 201

    Asymptotic Analysis of Plausible Tree Hash Modes for SHA-3

    Get PDF
    Discussions about the choice of a tree hash mode of operation for a standardization have recently been undertaken. It appears that a single tree mode cannot address adequately all possible uses and specifications of a system. In this paper, we review the tree modes which have been proposed, we discuss their problems and propose remedies. We make the reasonable assumption that communicating systems have different specifications and that software applications are of different types (securing stored content or live-streamed content). Finally, we propose new modes of operation that address the resource usage problem for the three most representative categories of devices and we analyse their asymptotic behavior

    Semantic Compression for Edge-Assisted Systems

    Full text link
    A novel semantic approach to data selection and compression is presented for the dynamic adaptation of IoT data processing and transmission within "wireless islands", where a set of sensing devices (sensors) are interconnected through one-hop wireless links to a computational resource via a local access point. The core of the proposed technique is a cooperative framework where local classifiers at the mobile nodes are dynamically crafted and updated based on the current state of the observed system, the global processing objective and the characteristics of the sensors and data streams. The edge processor plays a key role by establishing a link between content and operations within the distributed system. The local classifiers are designed to filter the data streams and provide only the needed information to the global classifier at the edge processor, thus minimizing bandwidth usage. However, the better the accuracy of these local classifiers, the larger the energy necessary to run them at the individual sensors. A formulation of the optimization problem for the dynamic construction of the classifiers under bandwidth and energy constraints is proposed and demonstrated on a synthetic example.Comment: Presented at the Information Theory and Applications Workshop (ITA), February 17, 201

    Enhancing an Embedded Processor Core with a Cryptographic Unit for Performance and Security

    Get PDF
    We present a set of low-cost architectural enhancements to accelerate the execution of certain arithmetic operations common in cryptographic applications on an extensible embedded processor core. The proposed enhancements are generic in the sense that they can be beneficially applied in almost any RISC processor. We implemented the enhancements in form of a cryptographic unit (CU) that offers the programmer an extended instruction set. The CU features a 128-bit wide register file and datapath, which enables it to process 128-bit words and perform 128-bit loads/stores. We analyze the speed-up factors for some arithmetic operations and public-key cryptographic algorithms obtained through these enhancements. In addition, we evaluate the hardware overhead (i.e. silicon area) of integrating the CU into an embedded RISC processor. Our experimental results show that the proposed architectural enhancements allow for a significant performance gain for both RSA and ECC at the expense of an acceptable increase in silicon area. We also demonstrate that the proposed enhancements facilitate the protection of cryptographic algorithms against certain types of side-channel attacks and present an AES implementation hardened against cache-based attacks as a case study
    corecore