61 research outputs found
Main-memory near-data acceleration with concurrent host access
Processing-in-memory (PIM) is attractive for applications that exhibit low temporal locality and low arithmetic intensity. By bringing computation close to data, PIMs utilize proximity to overcome the bandwidth bottleneck of a main memory bus. Unlike discrete accelerators, such as GPUs, PIMs can potentially accelerate within main memory so that the overhead for loading data from main memory to processor/accelerator memories can be saved. There is a set of challenges for realizing processing in the main memory of conventional CPUs, including: (1) mitigating contention/interference between the CPU and PIM as both access the same shared memory devices, and (2) sharing the same address space between the CPU and PIM for efficient in-place acceleration. In this dissertation, I present solutions to these challenges that achieve high PIM performance without significantly affecting CPU performance (up to 2.4% degradation). Another major contribution is that I identify killer applications that cannot be effectively accelerated with discrete accelerators. I introduce two compelling use cases in the AI domain for the main-memory accelerators where the unique advantage of a PIM over other acceleration schemes can be leveraged.
Electrical and Computer Engineering
A Distributed ADMM Approach to Non-Myopic Path Planning for Multi-Target Tracking
This paper investigates non-myopic path planning of mobile sensors for
multi-target tracking. Such problems involve high computational complexity
and/or require high-level decision making. Existing works tackle
these issues by heuristically assigning targets to each sensing agent and
solving the split problem for each agent. However, such heuristic methods
reduce target estimation performance because they do not consider how the
target state estimates change over time. In this work, we sidestep the
task-assignment problem by reformulating the general non-myopic planning
problem as a distributed optimization problem with respect to targets. By
combining the alternating direction method of multipliers (ADMM) with a local
trajectory optimization method, we solve the problem and induce consensus
(i.e., high-level decisions) automatically among the targets. In addition, we
propose a modified receding-horizon control (RHC) scheme and an edge-cutting
method for efficient real-time operation. The proposed algorithm is validated
through simulations in various scenarios.
Comment: Copyright 2019 IEEE. Personal use of this material is permitted.
Permission from IEEE must be obtained for all other uses, in any current or
future media, including reprinting/republishing this material for advertising
or promotional purposes, creating new collective works, for resale or
redistribution to servers or lists, or reuse of any copyrighted component of
this work in other work
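The consensus mechanism behind ADMM, mentioned in the abstract above, can be illustrated on a toy problem. The following is a minimal sketch of generic global-consensus ADMM in scaled form, not the paper's planning algorithm: each local agent holds a scalar cost 0.5 * (x - a_i)^2, and the function name `consensus_admm`, the values in `local_values`, and the choice of `rho` are all illustrative assumptions.

```python
def consensus_admm(local_values, rho=1.0, iterations=100):
    """Toy global-consensus ADMM (scaled form).

    Minimises sum_i 0.5 * (x - a_i)^2 by giving every agent its own copy x_i
    and forcing all copies to agree with a shared variable z.
    Hypothetical illustration only; not the algorithm from the paper.
    """
    n = len(local_values)
    x = [0.0] * n   # local copies, one per agent
    u = [0.0] * n   # scaled dual variables, one per agent
    z = 0.0         # shared consensus variable
    for _ in range(iterations):
        # Local step: closed-form minimiser of
        # 0.5 * (x - a_i)^2 + (rho / 2) * (x - z + u_i)^2
        x = [(a + rho * (z - ui)) / (1.0 + rho) for a, ui in zip(local_values, u)]
        # Consensus step: average of the local copies plus their duals
        z = sum(xi + ui for xi, ui in zip(x, u)) / n
        # Dual step: accumulate each agent's disagreement with the consensus
        u = [ui + xi - z for xi, ui in zip(x, u)]
    return z, x


if __name__ == "__main__":
    z, x = consensus_admm([1.0, 4.0, 7.0])
    print(z, x)
```

For this quadratic toy problem the consensus value approaches the average of the local values, which is the minimiser of the summed cost. In a planning setting the local step would be replaced by a trajectory optimization, which this sketch does not attempt.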
A study on the differences in the perceived importance of jet fighter performance improvement factors
The rapid advancement in software-based technology has significantly shortened product life cycles, leading to the proliferation of new products. However, the high initial investment makes it practically impossible for armed forces to rapidly replace existing weapons systems with new ones due to technological obsolescence. A more realistic alternative is to focus on performance improvements (or weapon upgrades) in existing systems. The challenge lies in making the right upgrades with the right technology at the right cost and time given the limited defense budget. Unfortunately, weapons upgrade decisions have mostly been based on costs and politically considered budget allocations to different branches of the armed forces rather than by considering a comprehensive range of decision factors. In light of the escalating national security threats, it is necessary to maximize the cost-effectiveness of weapons upgrade projects and effectively address rising national security challenges. The objective of this study is to develop a performance improvement Decision Index that quantifies the opinions of field-operating experts. Field experts are believed to possess the necessary expertise to select the appropriate fighter types, technologies, and upgrade timings, making it beneficial to factor in their opinions to determine what, how, and when to upgrade. Specifically, this study aims to establish weighted values for major decision factors regarding fighter performance improvement programs in the Republic of Korea Air Force. To achieve this, we collected survey data from 134 active-duty pilots and maintenance, operations, and repair (MRO) personnel from major fighter wings of the Republic of Korea Air Force and analyzed the data using the Fuzzy-AHP (Analytical Hierarchy Process). 
The analysis results indicate that the highest weighted value is given to the “relative (fighter) performance” against hostile nations, followed by “operating rate,” “durability,” “performance improvement cycle,” and “budget.” Furthermore, this study identified perceptual differences among field experts, particularly between pilots and MRO personnel, regarding the importance of relative performance, budget, performance improvement intervals, and operating rates of different fighter types. The proposed performance improvement index aims to provide a quantitative tool that incorporates field experts’ opinions into the decision-making process to upgrade weapons, facilitating balanced decisions and departing from a policymaker-centered approach. This balanced approach to weapons upgrade decisions will contribute to maximizing cost-effectiveness and, eventually, enhancing combat readiness.
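The abstract does not say which Fuzzy-AHP variant was used. As a minimal sketch, assuming the common geometric-mean approach (often attributed to Buckley) with triangular fuzzy numbers and centroid defuzzification, factor weights could be derived as follows. The function name `fuzzy_ahp_weights` and the pairwise judgments are hypothetical and are not the study's survey data.

```python
import math


def fuzzy_ahp_weights(matrix):
    """Crisp weights from a pairwise matrix of triangular fuzzy numbers (l, m, u).

    Assumed variant: fuzzy geometric mean per row, fuzzy normalisation,
    centroid defuzzification, then renormalisation so the weights sum to 1.
    """
    n = len(matrix)
    # Fuzzy geometric mean of each row, component by component
    geo = [
        tuple(math.prod(cell[i] for cell in row) ** (1.0 / n) for i in range(3))
        for row in matrix
    ]
    sum_l = sum(g[0] for g in geo)
    sum_m = sum(g[1] for g in geo)
    sum_u = sum(g[2] for g in geo)
    # Fuzzy weight: divide by the fuzzy total (lower bound uses the upper sum)
    fuzzy = [(l / sum_u, m / sum_m, u / sum_l) for l, m, u in geo]
    # Centroid defuzzification of each triangular weight
    crisp = [(l + m + u) / 3.0 for l, m, u in fuzzy]
    total = sum(crisp)
    return [c / total for c in crisp]


# Hypothetical judgments for three factors, in this order:
# relative performance, operating rate, budget.
# The reciprocal of (l, m, u) is (1/u, 1/m, 1/l).
one = (1.0, 1.0, 1.0)
judgments = [
    [one, (1.0, 2.0, 3.0), (3.0, 4.0, 5.0)],
    [(1 / 3, 1 / 2, 1.0), one, (1.0, 2.0, 3.0)],
    [(1 / 5, 1 / 4, 1 / 3), (1 / 3, 1 / 2, 1.0), one],
]

if __name__ == "__main__":
    print(fuzzy_ahp_weights(judgments))
```

With these made-up judgments the first factor receives the largest weight and the third the smallest, mirroring the kind of ranking the study reports without reproducing its numbers.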
Make Prompts Adaptable: Bayesian Modeling for Vision-Language Prompt Learning with Data-Dependent Prior
Recent Vision-Language Pretrained (VLP) models have become the backbone for
many downstream tasks, but they are used as frozen models without learning.
Prompt learning is a method to improve the pre-trained VLP model by adding a
learnable context vector to the inputs of the text encoder. In a few-shot
learning scenario of the downstream task, MLE training can lead the context
vector to over-fit dominant image features in the training data. This
overfitting can potentially harm the generalization ability, especially in the
presence of a distribution shift between the training and test dataset. This
paper presents a Bayesian framework for prompt learning, which can
alleviate overfitting in few-shot learning applications and increase
the adaptability of prompts on unseen instances. Specifically, modeling
a data-dependent prior enhances the adaptability of text features for both seen
and unseen image features without a performance trade-off between them.
Based on the Bayesian framework, we utilize the Wasserstein Gradient Flow in
the estimation of our target posterior distribution, which enables our prompt
to be flexible in capturing the complex modes of image features. We demonstrate
the effectiveness of our method on benchmark datasets across several experiments,
showing statistically significant improvements in performance compared to
existing methods. The code is available at https://github.com/youngjae-cho/APP.
Comment: Accepted to AAAI-202
DeepVM: Integrating Spot and On-Demand VMs for Cost-Efficient Deep Learning Clusters in the Cloud
Distributed Deep Learning (DDL), as a paradigm, dictates the use of GPU-based
clusters as the optimal infrastructure for training large-scale Deep Neural
Networks (DNNs). However, the high cost of such resources makes them
inaccessible to many users. Public cloud services, particularly Spot Virtual
Machines (VMs), offer a cost-effective alternative, but their unpredictable
availability poses a significant challenge to the crucial checkpointing process
in DDL. To address this, we introduce DeepVM, a novel solution that recommends
cost-effective cluster configurations by intelligently balancing the use of
Spot and On-Demand VMs. DeepVM leverages a four-stage process that analyzes
instance performance using the FLOPP (FLoating-point Operations Per Price)
metric, performs architecture-level analysis with linear programming, and
identifies the optimal configuration for the user-specific needs. Extensive
simulations and real-world deployments in the AWS environment demonstrate that
DeepVM consistently outperforms other policies, reducing training costs and
overall makespan. By enabling cost-effective checkpointing with Spot VMs,
DeepVM opens up DDL to a wider range of users and facilitates a more efficient
training of complex DNNs.
Comment: 14 pages, 8 figures
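The abstract names the FLOPP (FLoating-point Operations Per Price) metric but does not give its exact formula. A minimal sketch, assuming FLOPP is peak throughput divided by hourly price, shows how candidate instances might be ranked. The `Instance` class, the instance names, and all numbers are hypothetical and are not taken from DeepVM or from any cloud price list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instance:
    name: str              # hypothetical instance label
    tflops: float          # assumed peak throughput in TFLOP/s
    price_per_hour: float  # assumed hourly price in USD


def flopp(instance):
    """Throughput per unit price (assumed definition of FLOPP)."""
    if instance.price_per_hour <= 0:
        raise ValueError("price_per_hour must be positive")
    return instance.tflops / instance.price_per_hour


def rank_by_flopp(instances):
    """Return the instances sorted from best to worst value for money."""
    return sorted(instances, key=flopp, reverse=True)


# Made-up candidates: one cheap on-demand VM, one large on-demand VM,
# and the same large VM at a discounted spot price.
candidates = [
    Instance("gpu-small-ondemand", tflops=10.0, price_per_hour=0.5),
    Instance("gpu-large-ondemand", tflops=40.0, price_per_hour=4.0),
    Instance("gpu-large-spot", tflops=40.0, price_per_hour=1.2),
]

if __name__ == "__main__":
    for inst in rank_by_flopp(candidates):
        print(inst.name, round(flopp(inst), 2))
```

A ranking like this covers only the instance-level comparison. The linear-programming stage and the handling of spot interruptions described in the abstract are outside the scope of this sketch.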
Least squares estimation of acoustic reflection coefficient
EThOS - Electronic Theses Online Service (GB, United Kingdom)
Least squares estimation of acoustic reflection coefficient
The work presented in this thesis develops further the two-microphone transfer function method used for the measurement of the acoustic reflection coefficient of a porous material in an impedance tube. With the use of a least squares solution, the measurement of the transfer functions between multiple microphones can be used to produce an optimal estimation of the reflection coefficient. The advantage of using this technique is to extend the frequency range of broadband measurements. The limitations of using the two-microphone transfer function method are analysed in terms of the microphone separations that dictate the upper frequency limit of measurements, and it is shown how the measurement of multiple transfer functions can assist in extending the frequency range. Least squares estimation with multiple transfer functions is also applied to free-field measurements based on an image source model of the reflection process. The use of an image source model is found to give good results when used with the least squares solution for measurement of the reflection coefficient at normal incidence. Results at oblique incidence seem more difficult to measure accurately in practice because of the precision required in locating microphones. The use of a reflection model that is associated with plane wave decomposition is also introduced, although this needs a numerical approach in order to enable the application of least squares estimation. The numerical process is demonstrated in a simulation that suggests this technique may ultimately be of practical use.
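The least squares idea for the impedance tube case can be sketched under simplifying assumptions: lossless plane waves, the sample surface at x = 0, microphones at known distances x_m from it, and an exp(+jωt) time convention. This is not the thesis's exact formulation. It fits the complex pressures directly as p(x) = A·exp(jkx) + B·exp(−jkx) and returns R = B/A. Dividing every pressure by a reference microphone signal (that is, using transfer functions) scales A and B equally and leaves R unchanged. The function name and all numerical values are illustrative.

```python
import cmath
import math


def estimate_reflection(pressures, positions, freq_hz, c=343.0):
    """Least squares estimate of the reflection coefficient R = B / A.

    Model (assumed): p(x) = A*exp(1j*k*x) + B*exp(-1j*k*x), with x the
    distance from the sample surface, A the incident and B the reflected wave.
    Solves the 2x2 normal equations of the overdetermined system directly.
    """
    k = 2.0 * math.pi * freq_hz / c
    n = len(positions)
    # Off-diagonal entries of the normal matrix M^H M
    s12 = sum(cmath.exp(-2j * k * x) for x in positions)
    s21 = s12.conjugate()
    # Right-hand side M^H p
    b1 = sum(cmath.exp(-1j * k * x) * p for x, p in zip(positions, pressures))
    b2 = sum(cmath.exp(1j * k * x) * p for x, p in zip(positions, pressures))
    det = n * n - s12 * s21
    if abs(det) < 1e-9:
        # Happens, for example, when two microphones are spaced by a
        # multiple of half a wavelength.
        raise ValueError("microphone layout is singular at this frequency")
    a = (b1 * n - s12 * b2) / det   # incident amplitude (Cramer's rule)
    b = (n * b2 - s21 * b1) / det   # reflected amplitude
    return b / a


if __name__ == "__main__":
    # Synthetic check with a made-up reflection coefficient
    r_true = 0.5 - 0.3j
    freq, c = 1000.0, 343.0
    k = 2.0 * math.pi * freq / c
    mics = [0.05, 0.08, 0.13]
    p = [cmath.exp(1j * k * x) + r_true * cmath.exp(-1j * k * x) for x in mics]
    print(estimate_reflection(p, mics, freq, c))
```

The singular case in the code corresponds to the separation limit discussed in the abstract: with only two microphones spaced by a multiple of half a wavelength the system cannot be solved, while additional microphones at other spacings keep it well conditioned over a wider frequency range.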
Development of village appraisal system for constructing ecovillages
University of Tokyo (東京大学)