Computing Double Precision Euclidean Distances using GPU Tensor Cores
Tensor cores (TCs) are a type of Application-Specific Integrated Circuit
(ASIC) and are a recent addition to Graphics Processing Unit (GPU)
architectures. As such, TCs are purposefully designed to greatly improve the
performance of Matrix Multiply-Accumulate (MMA) operations. While TCs are
heavily studied for machine learning and closely related fields, where their
high efficiency is undeniable, MMA operations are not unique to these fields.
More generally, any computation that can be expressed as MMA operations can
leverage TCs, and potentially benefit from their higher computational
throughput compared to other general-purpose cores, such as CUDA cores on
Nvidia GPUs. In this paper, we propose the first double precision (FP64)
Euclidean distance calculation algorithm, which is expressed as MMA operations
to leverage TCs on Nvidia GPUs, rather than the more commonly used CUDA cores.
To show that the Euclidean distance can be accelerated in a real-world
application, we evaluate our proposed TC algorithm on the distance similarity
self-join problem, as the most computationally intensive part of the algorithm
consists of computing distances in a multi-dimensional space. We find that the
performance gain from using the tensor core algorithm over the CUDA core
algorithm depends weakly on the dataset size and distribution, but is strongly
dependent on data dimensionality. Overall, TCs are a compelling alternative to
CUDA cores, particularly when the data dimensionality is low, as we
achieve average and maximum speedups against a
state-of-the-art GPU distance similarity self-join algorithm. Furthermore,
because this paper is among the first to explore the use of TCs for FP64
general-purpose computation, future research is promising. Comment: Accepted for publication
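The reason Euclidean distance maps onto MMA hardware is the standard identity ||a - b||^2 = ||a||^2 + ||b||^2 - 2 a·b, where the cross terms for all point pairs form one matrix multiply. A minimal NumPy sketch (not the paper's FP64 tensor-core implementation) of this decomposition:

```python
import numpy as np

def pairwise_sq_dists(A, B):
    """Squared Euclidean distances between rows of A (n x d) and B (m x d),
    via ||a-b||^2 = ||a||^2 + ||b||^2 - 2 a.b. The A @ B.T term is the
    Matrix Multiply-Accumulate that tensor cores would accelerate."""
    a2 = np.sum(A * A, axis=1)[:, None]   # (n, 1) squared row norms
    b2 = np.sum(B * B, axis=1)[None, :]   # (1, m) squared row norms
    d2 = a2 + b2 - 2.0 * (A @ B.T)        # (n, m) all pairwise distances
    return np.maximum(d2, 0.0)            # clamp tiny negatives from rounding

# Usage: distance matrix between two small point sets.
A = np.random.rand(4, 3)
B = np.random.rand(5, 3)
D = pairwise_sq_dists(A, B)
```

Casting the computation this way trades d subtractions per pair for one large matrix product, which is exactly the shape of work TCs are built for.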
Fast Knowledge Graph Completion using Graphics Processing Units
Knowledge graphs can be used in many areas related to data semantics, such as
question-answering systems and knowledge-based systems. However, currently
constructed knowledge graphs need to be complemented with missing relations;
this task is called knowledge graph completion. To add new relations to an
existing knowledge graph using knowledge graph embedding models, we have to
evaluate N^2 * R vector operations, where N is the number of entities and R is
the number of relation types, which is very costly.
In this paper, we provide an efficient knowledge graph completion framework
on GPUs to get new relations using knowledge graph embedding vectors. In the
proposed framework, we first define "transformable to a metric space" and then
provide a method to transform the knowledge graph completion problem into the
similarity join problem for a model which is "transformable to a metric space".
After that, to efficiently process the similarity join problem, we derive
formulas using the properties of a metric space. Based on the formulas, we
develop a fast knowledge graph completion algorithm. Finally, we experimentally
show that our framework can efficiently process the knowledge graph completion
problem.
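For a concrete sense of the reduction to a similarity join, consider TransE as an illustrative example (the abstract does not name a model; TransE's score ||h + r - t|| is a metric distance, so it plausibly fits the paper's "transformable to a metric space" condition). Completing (h, r, ?) then becomes a nearest-neighbor query over the entity embeddings:

```python
import numpy as np

def complete_tails(E, R, h_idx, r_idx, k=3):
    """TransE-style tail prediction for (h, r, ?): since the score
    ||h + r - t|| is a metric distance, completion reduces to finding the
    entities nearest to the translated head -- i.e., a similarity join
    between the query set {h + r} and the entity embedding table."""
    query = E[h_idx] + R[r_idx]               # translated head vector
    d = np.linalg.norm(E - query, axis=1)     # distance to every entity
    return np.argsort(d)[:k]                  # k most plausible tails

# Usage with random embeddings (stand-ins for a trained model).
rng = np.random.default_rng(0)
E = rng.normal(size=(100, 16))   # 100 entity embeddings
R = rng.normal(size=(10, 16))    # 10 relation-type embeddings
candidates = complete_tails(E, R, h_idx=5, r_idx=2)
```

Metric-space properties (e.g., the triangle inequality) then allow the join to prune candidate entities without computing every distance, which is what the derived formulas exploit.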
DeepJoin: Joinable Table Discovery with Pre-trained Language Models
Due to its usefulness in data enrichment for data analysis tasks, joinable
table discovery has become an important operation in data lake management.
Existing approaches target equi-joins, the most common way of combining tables
for creating a unified view, or semantic joins, which tolerate misspellings and
different formats to deliver more join results. They are either exact solutions
whose running time is linear in the sizes of query column and target table
repository or approximate solutions lacking precision. In this paper, we
propose Deepjoin, a deep learning model for accurate and efficient joinable
table discovery. Our solution is an embedding-based retrieval, which employs a
pre-trained language model (PLM) and is designed as one framework serving both
equi- and semantic joins. We propose a set of contextualization options to
transform column contents to a text sequence. The PLM reads the sequence and is
fine-tuned to embed columns to vectors such that columns are expected to be
joinable if they are close to each other in the vector space. Since the output
of the PLM is fixed in length, the subsequent search procedure becomes
independent of the column size. With a state-of-the-art approximate nearest
neighbor search algorithm, the search time is logarithmic in the repository
size. To train the model, we devise the techniques for preparing training data
as well as data augmentation. The experiments on real datasets demonstrate that
by training on a small subset of a corpus, Deepjoin generalizes to large
datasets and its precision consistently outperforms other approximate
solutions. Deepjoin is even more accurate than an exact solution to semantic
joins when evaluated with labels from experts. Moreover, when equipped with a
GPU, Deepjoin is up to two orders of magnitude faster than existing solutions.
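The retrieval pipeline described above (contextualize a column as text, embed it to a fixed-length vector, rank by vector similarity) can be sketched as follows. The contextualization format and the embedder here are illustrative assumptions: a toy character-hash embedding stands in for the fine-tuned PLM, and a linear scan stands in for the approximate nearest-neighbor index.

```python
import numpy as np

def contextualize(col_name, values):
    """One contextualization option: flatten a column into a text sequence
    the model can read (this exact format is an assumption, not the paper's)."""
    return col_name + ": " + ", ".join(map(str, values))

def embed(text, dim=64):
    """Stand-in for the fine-tuned PLM encoder. Any text -> fixed-length
    vector map keeps the search independent of column size; here a toy
    character-hash embedding, NOT the paper's model."""
    v = np.zeros(dim)
    for i, ch in enumerate(text):
        v[(i * 31 + ord(ch)) % dim] += 1.0
    return v / (np.linalg.norm(v) or 1.0)   # unit-normalize for cosine

def top_joinable(query_col, repo, k=2):
    """Embedding-based retrieval: rank repository columns by cosine
    similarity to the query column (linear scan in place of an ANN index)."""
    q = embed(contextualize(*query_col))
    scored = [(float(q @ embed(contextualize(n, vs))), n) for n, vs in repo]
    return [n for _, n in sorted(scored, reverse=True)[:k]]
```

Because every column compresses to one fixed-length vector, the repository can be indexed once and queried in time logarithmic in its size with an ANN structure.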
Breaking Computational Barriers to Perform Time Series Pattern Mining at Scale and at the Edge
Uncovering repeated behavior in time series is an important problem in many domains, such as medicine, geophysics, and meteorology. With the continuing surge of smart/embedded devices generating time series data, there is an ever-growing need to perform analysis on datasets of increasing size. Additionally, there is an increasing need for analysis on low-power edge devices, due both to latency limits inherent to the speed of light and to the sheer amount of data being recorded. The matrix profile has proven to be a tool highly suitable for pattern mining in time series; however, a naive approach to computing the matrix profile makes it impossible to use effectively either in the cloud or at the edge. This dissertation shows how, through the use of GPUs and machine learning, the matrix profile can be computed feasibly both at cloud scale and at sensor scale. In addition, it illustrates why both of these types of computation are important and what new insights they can provide to practitioners working with time series data.
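To make the "naive approach" cost concrete: the matrix profile records, for every length-m subsequence, the distance to its nearest non-trivial match, which is O(n^2 m) if computed directly. A minimal sketch of that brute-force baseline (plain Euclidean distance stands in for the z-normalized distance typically used; this is the cost the GPU work removes, not the dissertation's algorithm):

```python
import numpy as np

def matrix_profile(ts, m):
    """Brute-force matrix profile: for each length-m subsequence, the
    Euclidean distance to its nearest neighbor outside an exclusion zone
    around itself (to skip trivial self-matches). O(n^2 m) overall."""
    n = len(ts) - m + 1
    subs = np.array([ts[i:i + m] for i in range(n)])   # all subsequences
    mp = np.full(n, np.inf)
    for i in range(n):
        d = np.linalg.norm(subs - subs[i], axis=1)     # distances to all
        d[max(0, i - m // 2): i + m // 2 + 1] = np.inf  # exclusion zone
        mp[i] = d.min()                                 # nearest neighbor
    return mp

# Usage: a repeated motif shows up as near-zero profile values.
ts = np.array([0, 1, 2, 3, 9, 0, 1, 2, 3, 5], dtype=float)
mp = matrix_profile(ts, m=4)
```

Low values in `mp` mark subsequences with a close match elsewhere (motifs); high values mark discords, which is what makes the profile useful for pattern mining.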