Breaking Computational Barriers to Perform Time Series Pattern Mining at Scale and at the Edge
Uncovering repeated behavior in time series is an important problem in many domains such as medicine, geophysics, meteorology, and many more. With the continuing surge of smart/embedded devices generating time series data, there is an ever-growing need to perform analysis on datasets of increasing size. Additionally, there is an increasing need for analysis on low-power edge devices, due to latency problems inherent to the speed of light and the sheer amount of data being recorded. The matrix profile has proven to be a tool highly suitable for pattern mining in time series; however, a naive approach to computing the matrix profile makes it impossible to use effectively in both the cloud and at the edge. This dissertation shows how, through the use of GPUs and machine learning, the matrix profile can be computed more feasibly, both at cloud scale and at sensor scale. In addition, it illustrates why both of these types of computation are important and what new insights they can provide to practitioners working with time series data.
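To make the cost of the naive approach concrete, here is a minimal brute-force sketch of a matrix profile. It illustrates the general technique only and is not the dissertation's GPU or machine-learning method. The function names, the z-normalized Euclidean distance, and the exclusion zone of half the window length are assumptions chosen for this example.

```python
import math


def znorm(window):
    """Z-normalize one subsequence (assumed distance preprocessing)."""
    n = len(window)
    mean = sum(window) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in window) / n)
    if std == 0:
        return [0.0] * n  # constant window: map to all zeros
    return [(x - mean) / std for x in window]


def naive_matrix_profile(series, m):
    """Brute-force matrix profile for window length m.

    For every subsequence, find the distance to (and index of) its nearest
    non-trivial neighbor. Runs in O(n^2 * m) time.
    """
    n_sub = len(series) - m + 1
    if m < 2 or n_sub < 2:
        raise ValueError("need m >= 2 and at least two subsequences")
    subs = [znorm(series[i:i + m]) for i in range(n_sub)]
    excl = m // 2  # assumed exclusion zone to skip trivial self-matches
    profile = [math.inf] * n_sub
    index = [-1] * n_sub
    for i in range(n_sub):
        for j in range(n_sub):
            if abs(i - j) <= excl:
                continue
            d = math.dist(subs[i], subs[j])
            if d < profile[i]:
                profile[i], index[i] = d, j
    return profile, index


# Toy series in which the shape [0, 1, 3, 1] occurs at positions 0 and 8.
series = [0, 1, 3, 1, 0, 7, 2, 9, 0, 1, 3, 1, 0, 4]
profile, index = naive_matrix_profile(series, 4)
print(index[0], profile[0])
```

The nested loop over all pairs of subsequences is what makes this version quadratic in the series length, which is why it does not scale to large datasets or constrained devices.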
An Experimental Study of Reduced-Voltage Operation in Modern FPGAs for Neural Network Acceleration
We empirically evaluate an undervolting technique, i.e., underscaling the
circuit supply voltage below the nominal level, to improve the power-efficiency
of Convolutional Neural Network (CNN) accelerators mapped to Field Programmable
Gate Arrays (FPGAs). Undervolting below a safe voltage level can lead to timing
faults due to excessive circuit latency increase. We evaluate the
reliability-power trade-off for such accelerators. Specifically, we
experimentally study the reduced-voltage operation of multiple components of
real FPGAs, characterize the corresponding reliability behavior of CNN
accelerators, propose techniques to minimize the drawbacks of reduced-voltage
operation, and combine undervolting with architectural CNN optimization
techniques, i.e., quantization and pruning. We investigate the effect of
environmental temperature on the reliability-power trade-off of such
accelerators. We perform experiments on three identical samples of modern
Xilinx ZCU102 FPGA platforms with five state-of-the-art image classification
CNN benchmarks. This approach allows us to study the effects of our
undervolting technique for both software and hardware variability. We achieve
more than 3X power-efficiency (GOPs/W) gain via undervolting. 2.6X of this gain
is the result of eliminating the voltage guardband region, i.e., the safe
voltage region below the nominal level that is set by the FPGA vendor to ensure
correct functionality in worst-case environmental and circuit conditions. 43%
of the power-efficiency gain is due to further undervolting below the
guardband, which comes at the cost of accuracy loss in the CNN accelerator. We
evaluate an effective frequency underscaling technique that prevents this
accuracy loss, and find that it reduces the power-efficiency gain from 43% to
25%.
Comment: To appear at the DSN 2020 conference
KOIOS: Top-k Semantic Overlap Set Search
We study the top-k set similarity search problem using semantic overlap.
While vanilla overlap requires exact matches between set elements, semantic
overlap allows elements that are syntactically different but semantically
related to increase the overlap. The semantic overlap is the maximum matching
score of a bipartite graph, where an edge weight between two set elements is
defined by a user-defined similarity function, e.g., cosine similarity between
embeddings. Common techniques like token indexes fail for semantic search since
similar elements may be unrelated at the character level. Further, verifying
candidates is expensive (cubic versus linear for syntactic overlap), calling
for highly selective filters. We propose KOIOS, the first exact and efficient
algorithm for semantic overlap search. KOIOS leverages sophisticated filters to
minimize the number of required graph-matching calculations. Our experiments
show that for medium to large sets less than 5% of the candidate sets need
verification, and more than half of those sets are further pruned without
requiring the expensive graph matching. We show the efficiency of our algorithm
on four real datasets and demonstrate the improved result quality of semantic
over vanilla set similarity search.
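The definition of semantic overlap as a maximum bipartite matching score can be illustrated with a small sketch. This is not the KOIOS algorithm and includes none of its filters. It enumerates all matchings by brute force, which is only feasible for tiny sets; a practical verifier would use a cubic-time method such as the Hungarian algorithm. The names `semantic_overlap`, `toy_sim`, and `TOY_SIM`, and the similarity values, are hypothetical, and the similarity function is assumed to be non-negative.

```python
from itertools import permutations


def semantic_overlap(query, candidate, sim):
    """Maximum matching score between two sets, by brute-force enumeration.

    Each element is matched at most once. Assumes sim(a, b) >= 0, so matching
    every element of the smaller set is never worse than leaving one out.
    """
    q, c = list(query), list(candidate)
    if len(q) <= len(c):
        pairings = (zip(q, perm) for perm in permutations(c, len(q)))
    else:
        pairings = (zip(perm, c) for perm in permutations(q, len(c)))
    return max((sum(sim(a, b) for a, b in pairs) for pairs in pairings),
               default=0.0)


# Hypothetical similarity scores standing in for embedding cosine similarity.
TOY_SIM = {
    frozenset(("car", "automobile")): 0.75,
    frozenset(("big", "large")): 0.5,
}


def toy_sim(a, b):
    """Exact matches score 1.0; listed pairs use TOY_SIM; all others 0.0."""
    if a == b:
        return 1.0
    return TOY_SIM.get(frozenset((a, b)), 0.0)


query = ["car", "big", "city"]
candidate = ["automobile", "large", "city"]

vanilla = len(set(query) & set(candidate))            # exact matches only
semantic = semantic_overlap(query, candidate, toy_sim)
print(vanilla, semantic)
```

In this toy case the vanilla overlap counts only the shared element "city", while the semantic overlap also credits the two related pairs.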
Structure-Grounded Pretraining for Text-to-SQL
Learning to capture text-table alignment is essential for tasks like
text-to-SQL. A model needs to correctly recognize natural language references
to columns and values and to ground them in the given database schema. In this
paper, we present a novel weakly supervised Structure-Grounded pretraining
framework (StruG) for text-to-SQL that can effectively learn to capture
text-table alignment based on a parallel text-table corpus. We identify a set
of novel prediction tasks: column grounding, value grounding and column-value
mapping, and leverage them to pretrain a text-table encoder. Additionally, to
evaluate different methods under more realistic text-table alignment settings,
we create a new evaluation set, Spider-Realistic, based on the Spider dev set with
explicit mentions of column names removed, and adopt eight existing text-to-SQL
datasets for cross-database evaluation. STRUG brings significant improvement
over BERT-LARGE in all settings. Compared with existing pretraining methods
such as GRAPPA, STRUG achieves similar performance on Spider, and outperforms
all baselines on more realistic sets. All the code and data used in this work
are publicly available at https://aka.ms/strug.
Comment: Accepted to NAACL 2021. Please contact the first author for questions
regarding the Spider-Realistic dataset