12,280 research outputs found
FLASH: Randomized Algorithms Accelerated over CPU-GPU for Ultra-High Dimensional Similarity Search
We present FLASH (\textbf{F}ast \textbf{L}SH \textbf{A}lgorithm for
\textbf{S}imilarity search accelerated with \textbf{H}PC), a similarity search
system for ultra-high dimensional datasets on a single machine, that does not
require similarity computations and is tailored for high-performance computing
platforms. By leveraging a LSH style randomized indexing procedure and
combining it with several principled techniques, such as reservoir sampling,
recent advances in one-pass minwise hashing, and count based estimations, we
reduce the computational and parallelization costs of similarity search, while
retaining sound theoretical guarantees.
We evaluate FLASH on several real, high-dimensional datasets from different
domains, including text, malicious URL, click-through prediction, social
networks, etc. Our experiments shed new light on the difficulties associated
with datasets having several million dimensions. Current state-of-the-art
implementations either fail on the presented scale or are orders of magnitude
slower than FLASH. FLASH is capable of computing an approximate k-NN graph,
from scratch, over the full webspam dataset (1.3 billion nonzeros) in less than
10 seconds. Computing a full k-NN graph in less than 10 seconds on the webspam
dataset, using brute-force (), will require at least 20 teraflops. We
provide CPU and GPU implementations of FLASH for replicability of our results
Impliance: A Next Generation Information Management Appliance
ably successful in building a large market and adapting to the changes of the
last three decades, its impact on the broader market of information management
is surprisingly limited. If we were to design an information management system
from scratch, based upon today's requirements and hardware capabilities, would
it look anything like today's database systems?" In this paper, we introduce
Impliance, a next-generation information management system consisting of
hardware and software components integrated to form an easy-to-administer
appliance that can store, retrieve, and analyze all types of structured,
semi-structured, and unstructured information. We first summarize the trends
that will shape information management for the foreseeable future. Those trends
imply three major requirements for Impliance: (1) to be able to store, manage,
and uniformly query all data, not just structured records; (2) to be able to
scale out as the volume of this data grows; and (3) to be simple and robust in
operation. We then describe four key ideas that are uniquely combined in
Impliance to address these requirements, namely the ideas of: (a) integrating
software and off-the-shelf hardware into a generic information appliance; (b)
automatically discovering, organizing, and managing all data - unstructured as
well as structured - in a uniform way; (c) achieving scale-out by exploiting
simple, massive parallel processing, and (d) virtualizing compute and storage
resources to unify, simplify, and streamline the management of Impliance.
Impliance is an ambitious, long-term effort to define simpler, more robust, and
more scalable information systems for tomorrow's enterprises.Comment: This article is published under a Creative Commons License Agreement
(http://creativecommons.org/licenses/by/2.5/.) You may copy, distribute,
display, and perform the work, make derivative works and make commercial use
of the work, but, you must attribute the work to the author and CIDR 2007.
3rd Biennial Conference on Innovative Data Systems Research (CIDR) January
710, 2007, Asilomar, California, US
Parallel Architectures for Planetary Exploration Requirements (PAPER)
The Parallel Architectures for Planetary Exploration Requirements (PAPER) project is essentially research oriented towards technology insertion issues for NASA's unmanned planetary probes. It was initiated to complement and augment the long-term efforts for space exploration with particular reference to NASA/LaRC's (NASA Langley Research Center) research needs for planetary exploration missions of the mid and late 1990s. The requirements for space missions as given in the somewhat dated Advanced Information Processing Systems (AIPS) requirements document are contrasted with the new requirements from JPL/Caltech involving sensor data capture and scene analysis. It is shown that more stringent requirements have arisen as a result of technological advancements. Two possible architectures, the AIPS Proof of Concept (POC) configuration and the MAX Fault-tolerant dataflow multiprocessor, were evaluated. The main observation was that the AIPS design is biased towards fault tolerance and may not be an ideal architecture for planetary and deep space probes due to high cost and complexity. The MAX concepts appears to be a promising candidate, except that more detailed information is required. The feasibility for adding neural computation capability to this architecture needs to be studied. Key impact issues for architectural design of computing systems meant for planetary missions were also identified
An Overview on Application of Machine Learning Techniques in Optical Networks
Today's telecommunication networks have become sources of enormous amounts of
widely heterogeneous data. This information can be retrieved from network
traffic traces, network alarms, signal quality indicators, users' behavioral
data, etc. Advanced mathematical tools are required to extract meaningful
information from these data and take decisions pertaining to the proper
functioning of the networks from the network-generated data. Among these
mathematical tools, Machine Learning (ML) is regarded as one of the most
promising methodological approaches to perform network-data analysis and enable
automated network self-configuration and fault management. The adoption of ML
techniques in the field of optical communication networks is motivated by the
unprecedented growth of network complexity faced by optical networks in the
last few years. Such complexity increase is due to the introduction of a huge
number of adjustable and interdependent system parameters (e.g., routing
configurations, modulation format, symbol rate, coding schemes, etc.) that are
enabled by the usage of coherent transmission/reception technologies, advanced
digital signal processing and compensation of nonlinear effects in optical
fiber propagation. In this paper we provide an overview of the application of
ML to optical communications and networking. We classify and survey relevant
literature dealing with the topic, and we also provide an introductory tutorial
on ML for researchers and practitioners interested in this field. Although a
good number of research papers have recently appeared, the application of ML to
optical networks is still in its infancy: to stimulate further work in this
area, we conclude the paper proposing new possible research directions
Design and Service Provisioning Methods for Optical Networks in 5G and Beyond Scenarios
Network operators are deploying 5G while also considering the evolution towards 6G. They consider different enablers and address various challenges. One trend in the 5G deployment is network densification, i.e., deploying many small cell sites close to the users, which need a well-designed transport network (TN). The choice of the TN technology and the location for processing the 5G protocol stack functions are critical to contain capital and operational expenditures. Furthermore, it is crucial to ensure the resiliency of the TN infrastructure in case of a failure in nodes and/or links while the resource efficiency is maximized.Operators are also interested in 5G networks with flexibility and scalability features. In this context, one main question is where to deploy network functions so that the connectivity and compute resources are utilized efficiently while meeting strict service latency and availability requirements. Off-loading compute resources to large and central data centers (DCs) has some advantages, i.e., better utilization of compute resources at a lower cost. A backup path can be added to address service availability requirements when using compute off-loading strategies. This might impact the service blocking ratio and limit operators’ profit. The importance of this trade-off becomes more critical with the emergence of new 6G verticals.This thesis proposes novel methods to address the issues outlined above. To address the challenge of cost-efficient TN deployment, the thesis introduces a framework to study the total cost of ownership (TCO), latency, and reliability performance of a set of TN architectures for high-layer and low-layer functional split options. The architectural options are fiber- or microwave-based. To address the strict availability requirement, the thesis proposes a resource-efficient protection strategy against single node/link failure of the midhaul segment. The method selects primary and backup DCs for each aggregation node (i.e., nodes to which cell sites are connected) while maximizing the sharing of backup resources. Finally, to address the challenge of resource efficiency while provisioning services, the thesis proposes a backup-enhanced compute off-loading strategy (i.e., resource-efficient provisioning (REP)). REP selects a DC, a connectivity path, and (optionally) a backup path for each service request with the aim of minimizing resource usage while the service latency and availability requirements are met.Our results of the techno-economic assessment of the TN options reveal that, in some cases, microwave can be a good substitute for fiber technology. Several factors, including the geo-type, functional split option, and the cost of fiber trenching and microwave equipment, influence the effectiveness of the microwave. The considered architectures show similar latency and reliability performance and meet the 5G service requirements. The thesis also shows that a protection strategy based on shared connectivity and compute resources can lead to significant cost savings compared to benchmarks based on dedicated backup resources. Finally, the thesis shows that the proposed backup-enhanced compute off-loading strategy offers advantages in service blocking ratio and profit gain compared to a conventional off-loading approach that does not add a backup path. Benefits are even more evident considering next-generation services, e.g., expected on the market in 3 to 5 years, as the demand for services with stringent latency and availability will increase
Is There Light at the Ends of the Tunnel? Wireless Sensor Networks for Adaptive Lighting in Road Tunnels
Existing deployments of wireless sensor networks (WSNs) are often conceived as stand-alone monitoring tools. In this paper, we report instead on a deployment where the WSN is a key component of a closed-loop control system for adaptive lighting in operational road tunnels. WSN nodes along the tunnel walls report light readings to a control station, which closes the loop by setting the intensity of lamps to match a legislated curve. The ability to match dynamically the lighting levels to the actual environmental conditions improves the tunnel safety and reduces its power consumption. The use of WSNs in a closed-loop system, combined with the real-world, harsh setting of operational road tunnels, induces tighter requirements on the quality and timeliness of sensed data, as well as on the reliability and lifetime of the network. In this work, we test to what extent mainstream WSN technology meets these challenges, using a dedicated design that however relies on wellestablished techniques. The paper describes the hw/sw architecture we devised by focusing on the WSN component, and analyzes its performance through experiments in a real, operational tunnel
Real-Time Fault Detection and Diagnosis System for Analog and Mixed-Signal Circuits of Acousto-Magnetic EAS Devices
© 2015 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.The paper discusses fault diagnosis of the electronic circuit board, part of acousto-magnetic electronic article surveillance detection devices. The aim is that the end-user can run the fault diagnosis in real time using a portable FPGA-based platform so as to gain insight into the failures that have occurred.Peer reviewe
- …