A Good Score Does not Lead to A Good Generative Model
Score-based Generative Models (SGMs) are a leading method in generative
modeling, renowned for their ability to generate high-quality samples from
complex, high-dimensional data distributions. The method enjoys empirical
success and is supported by rigorous theoretical convergence properties. In
particular, it has been shown that SGMs can generate samples from a
distribution that is close to the ground-truth if the underlying score function
is learned well, suggesting the success of SGM as a generative model. We
provide a counter-example in this paper. Through the sample complexity
argument, we provide one specific setting where the score function is learned
well. Yet, SGMs in this setting can only output samples that are Gaussian
blurrings of training data points, mimicking the effects of kernel density
estimation. This finding resonates with a series of recent findings revealing that
SGMs can demonstrate a strong memorization effect and fail to generate
Overview of Sensing Attacks on Autonomous Vehicle Technologies and Impact on Traffic Flow
While perception systems in Connected and Autonomous Vehicles (CAVs), which
encompass both communication technologies and advanced sensors, promise to
significantly reduce human driving errors, they also expose CAVs to various
cyberattacks. These include both communication and sensing attacks, which
potentially jeopardize not only individual vehicles but also overall traffic
safety and efficiency. While much research has focused on communication
attacks, sensing attacks, which are equally critical, have garnered less
attention. To address this gap, this study offers a comprehensive review of
potential sensing attacks and their impact on target vehicles, focusing on
commonly deployed sensors in CAVs such as cameras, LiDAR, Radar, ultrasonic
sensors, and GPS. Based on this review, we discuss the feasibility of
integrating hardware-in-the-loop experiments with microscopic traffic
simulations. We also design baseline scenarios to analyze the macro-level
impact of sensing attacks on traffic flow. This study aims to bridge the
research gap between individual vehicle sensing attacks and broader macroscopic
impacts, thereby laying the foundation for future systemic understanding and
mitigation
Sequencing-enabled Hierarchical Cooperative CAV On-ramp Merging Control with Enhanced Stability and Feasibility
This paper develops a sequencing-enabled hierarchical connected automated
vehicle (CAV) cooperative on-ramp merging control framework. The proposed
framework consists of a two-layer design: the upper level control sequences the
vehicles to harmonize the traffic density across mainline and on-ramp segments
while enhancing lower-level control efficiency through a mixed-integer linear
programming formulation. Subsequently, the lower-level control employs a
longitudinal distributed model predictive control (MPC) supplemented by a
virtual car-following (CF) concept to ensure asymptotic local stability, l_2
norm string stability, and safety. Proofs of asymptotic local stability and l_2
norm string stability are mathematically derived. Compared to other prevalent
asymptotically locally stable MPC controllers, the proposed distributed MPC
controller greatly expands the initial feasible set. Additionally, an auxiliary
lateral control is developed to maintain lane-keeping and merging smoothness
while accommodating ramp geometric curvature. To validate the proposed
framework, multiple numerical experiments are conducted. Results indicate that
our upper-level controller notably outperforms a distance-based
sequencing method. Furthermore, the lower-level control effectively ensures
smooth acceleration, safe merging with adequate spacing, adherence to proven
longitudinal local and string stability, and rapid regulation of lateral
deviations
Beyond 1D and oversimplified kinematics: A generic analytical framework for surrogate safety measures
This paper presents a generic analytical framework tailored for surrogate
safety measures (SSMs) that is versatile across various highway geometries,
capable of encompassing vehicle dynamics of differing dimensionality and
fidelity, and suitable for dynamic, real-world environments. The framework
incorporates a generic vehicle movement model, accommodating a spectrum of
scenarios with varying degrees of complexity and dimensionality, facilitating
the prediction of future vehicle trajectories. It establishes a generic
mathematical criterion to denote potential collisions, characterized by the
spatial overlap between a vehicle and any other entity. A collision risk is
present if the collision criterion is met at any non-negative time point, with
the minimum threshold representing the remaining time to collision. The
framework's proficiency spans from conventional one-dimensional (1D) SSMs to
extended multi-dimensional, high-fidelity SSMs. Its validity is corroborated
through simulation experiments that assess the precision of the framework when
linearization is performed on the vehicle movement model. The outcomes showcase
remarkable accuracy in predicting vehicle trajectories and the time remaining
before potential collisions occur. The necessity of higher-dimensional and
higher-fidelity SSMs is highlighted through a comparison of conventional 1D
SSMs and extended three-dimensional (3D) SSMs. The results show that using 1D
SSMs instead of 3D SSMs can be off by 300% for non-critical Time-to-Collision (TTC)
values and by about 20% for critical TTC values (below 1.5 seconds). Furthermore,
the framework's practical application is demonstrated through a case study that
actively evaluates all potential conflicts, underscoring its effectiveness in
dynamic, real-world traffic situations
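The collision criterion described above (spatial overlap at some non-negative time, with the smallest such time taken as the time to collision) can be sketched for two simple cases. This is a minimal illustration, not the paper's framework: it assumes constant velocities, models vehicles as discs in the 2D case, and the function names `ttc_1d` and `ttc_2d_discs` are hypothetical.

```python
import math


def ttc_1d(gap, v_follower, v_leader):
    """Conventional 1D TTC: time for the follower to close a bumper-to-bumper
    gap at constant speeds. Returns math.inf if the gap is not closing."""
    if gap <= 0:
        return 0.0  # already in contact
    closing_speed = v_follower - v_leader
    if closing_speed <= 0:
        return math.inf
    return gap / closing_speed


def ttc_2d_discs(p1, v1, r1, p2, v2, r2):
    """Smallest t >= 0 at which two constant-velocity discs overlap,
    i.e. |(p2 - p1) + (v2 - v1) * t| <= r1 + r2. Returns math.inf if never."""
    dx, dy = p2[0] - p1[0], p2[1] - p1[1]
    dvx, dvy = v2[0] - v1[0], v2[1] - v1[1]
    reach = r1 + r2
    c = dx * dx + dy * dy - reach * reach
    if c <= 0:
        return 0.0  # criterion already met at t = 0
    a = dvx * dvx + dvy * dvy
    if a == 0:
        return math.inf  # no relative motion
    b = 2.0 * (dx * dvx + dy * dvy)
    disc = b * b - 4.0 * a * c
    if disc < 0:
        return math.inf  # paths never come within reach
    t = (-b - math.sqrt(disc)) / (2.0 * a)  # first contact time
    return t if t >= 0 else math.inf


# Same longitudinal situation, with and without a lateral offset.
same_lane = ttc_2d_discs((0, 0), (20, 0), 1.0, (32, 0), (10, 0), 1.0)
offset_lane = ttc_2d_discs((0, 0), (20, 0), 1.0, (32, 3), (10, 0), 1.0)
```

In this toy setup the 1D measure and the 2D measure agree when both vehicles share a lane, but a 3 m lateral offset removes the conflict entirely in 2D, which a purely longitudinal TTC cannot express.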
FedCBO: reaching group consensus in clustered Federated learning through Consensus-Based Optimization
Federated learning is an important framework in modern machine learning that
seeks to integrate the training of learning models from multiple users, each
user having their own local data set, in a way that is sensitive to data
privacy and to communication loss constraints. In clustered federated learning,
one assumes an additional unknown group structure among users, and the goal is
to train models that are useful for each group, rather than simply training a
single global model for all users. In this paper, we propose a novel solution
to the problem of clustered federated learning that is inspired by ideas in
consensus-based optimization (CBO). Our new CBO-type method is based on a
system of interacting particles that is oblivious to group memberships. Our
model is motivated by rigorous mathematical reasoning, including a mean field
analysis describing the large number of particles limit of our particle system,
as well as convergence guarantees for the simultaneous global optimization of
general non-convex objective functions (corresponding to the loss functions of
each cluster of users) in the mean-field regime. Experimental results
demonstrate the efficacy of our FedCBO algorithm compared to other
state-of-the-art methods and help validate our methodological and theoretical
work
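For readers unfamiliar with consensus-based optimization, the sketch below shows one step of a standard single-objective CBO particle system: particles drift toward a Gibbs-weighted consensus point and are perturbed by noise scaled by their distance to it. It is not the FedCBO algorithm itself, which handles several cluster objectives at once; the function names and the parameters `alpha`, `lam`, `sigma`, and `dt` are assumptions chosen for illustration.

```python
import math
import random


def consensus_point(particles, f, alpha):
    """Gibbs-weighted average of particles; lower f(x) means a larger weight."""
    values = [f(x) for x in particles]
    f_min = min(values)  # shift for numerical stability; weights are unchanged up to scale
    weights = [math.exp(-alpha * (v - f_min)) for v in values]
    total = sum(weights)
    dim = len(particles[0])
    return [sum(w * x[d] for w, x in zip(weights, particles)) / total
            for d in range(dim)]


def cbo_step(particles, f, alpha=30.0, lam=1.0, sigma=0.5, dt=0.05, rng=random):
    """One Euler-Maruyama step: drift toward consensus plus isotropic noise
    whose magnitude is proportional to the distance from the consensus point."""
    c = consensus_point(particles, f, alpha)
    updated = []
    for x in particles:
        dist = math.sqrt(sum((xi - ci) ** 2 for xi, ci in zip(x, c)))
        updated.append([
            xi - lam * dt * (xi - ci) + sigma * math.sqrt(dt) * dist * rng.gauss(0.0, 1.0)
            for xi, ci in zip(x, c)
        ])
    return updated


# Toy usage on a non-convex objective (illustrative only).
def objective(x):
    return x[0] ** 2 + 1.0 - math.cos(3.0 * x[0])


rng = random.Random(0)
swarm = [[rng.uniform(-3.0, 3.0)] for _ in range(50)]
for _ in range(200):
    swarm = cbo_step(swarm, objective, rng=rng)
```

Because the noise term vanishes as particles approach the consensus point, the swarm tends to collapse onto a single location; the method uses only objective evaluations, not gradients.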
GPT4AIGChip: Towards Next-Generation AI Accelerator Design Automation via Large Language Models
The remarkable capabilities and intricate nature of Artificial Intelligence
(AI) have dramatically escalated the imperative for specialized AI
accelerators. Nonetheless, designing these accelerators for various AI
workloads remains both labor- and time-intensive. While existing design
exploration and automation tools can partially alleviate the need for extensive
human involvement, they still demand substantial hardware expertise, posing a
barrier to non-experts and stifling AI accelerator development. Motivated by
the astonishing potential of large language models (LLMs) for generating
high-quality content in response to human language instructions, we embark on
this work to examine the possibility of harnessing LLMs to automate AI
accelerator design. Through this endeavor, we develop GPT4AIGChip, a framework
intended to democratize AI accelerator design by leveraging human natural
languages instead of domain-specific languages. Specifically, we first perform
an in-depth investigation into LLMs' limitations and capabilities for AI
accelerator design, thus aiding our understanding of our current position and
garnering insights into LLM-powered automated AI accelerator design.
Furthermore, drawing inspiration from the above insights, we develop a
framework called GPT4AIGChip, which features an automated demo-augmented
prompt-generation pipeline utilizing in-context learning to guide LLMs towards
creating high-quality AI accelerator design. To our knowledge, this work is the
first to demonstrate an effective pipeline for LLM-powered automated AI
accelerator generation. Accordingly, we anticipate that our insights and
framework can serve as a catalyst for innovations in next-generation
LLM-powered design automation tools.
Comment: Accepted by ICCAD 202
Instant-3D: Instant Neural Radiance Field Training Towards On-Device AR/VR 3D Reconstruction
Neural Radiance Field (NeRF) based 3D reconstruction is highly desirable for
immersive Augmented and Virtual Reality (AR/VR) applications, but achieving
instant (i.e., < 5 seconds) on-device NeRF training remains a challenge. In
this work, we first identify the inefficiency bottleneck: the need to
interpolate NeRF embeddings up to 200,000 times from a 3D embedding grid during
each training iteration. To alleviate this, we propose Instant-3D, an
algorithm-hardware co-design acceleration framework that achieves instant
on-device NeRF training. Our algorithm decomposes the embedding grid
representation in terms of color and density, enabling computational redundancy
to be squeezed out by adopting different (1) grid sizes and (2) update
frequencies for the color and density branches. Our hardware accelerator
further reduces the dominant memory accesses for embedding grid interpolation
by (1) mapping multiple nearby points' memory read requests into one during the
feed-forward process, (2) merging embedding grid updates from the same sliding
time window during back-propagation, and (3) fusing different computation cores
to support the different grid sizes needed by the color and density branches of
the Instant-3D algorithm. Extensive experiments validate the effectiveness of
Instant-3D, achieving a large training time reduction of 41x - 248x while
maintaining the same reconstruction quality. Excitingly, Instant-3D has enabled
instant 3D reconstruction for AR/VR, requiring a reconstruction time of only
1.6 seconds per scene and meeting the AR/VR power consumption constraint of 1.9
W.
Comment: Accepted by ISCA'2
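To make the interpolation bottleneck concrete, the sketch below shows a single embedding lookup from a dense 3D grid. It assumes trilinear interpolation, a common choice for grid-based NeRFs; the abstract does not state the interpolation scheme, and the function name and grid layout are hypothetical. Under that assumption each lookup reads eight grid vertices, so hundreds of thousands of lookups per iteration translate into heavy memory traffic.

```python
def trilinear_interpolate(grid, x, y, z):
    """Interpolate a feature vector at continuous grid coordinates (x, y, z).

    grid[i][j][k] is a list of floats stored at integer vertex (i, j, k);
    each grid dimension must have at least two vertices.
    """
    nx, ny, nz = len(grid), len(grid[0]), len(grid[0][0])
    # Clamp the query into the grid so every neighbor index is valid.
    x = min(max(x, 0.0), nx - 1.0)
    y = min(max(y, 0.0), ny - 1.0)
    z = min(max(z, 0.0), nz - 1.0)
    i0 = min(int(x), nx - 2)
    j0 = min(int(y), ny - 2)
    k0 = min(int(z), nz - 2)
    tx, ty, tz = x - i0, y - j0, z - k0

    dim = len(grid[0][0][0])
    out = [0.0] * dim
    # Eight vertex reads per query point: the memory accesses in question.
    for di in (0, 1):
        wx = tx if di else 1.0 - tx
        for dj in (0, 1):
            wy = ty if dj else 1.0 - ty
            for dk in (0, 1):
                wz = tz if dk else 1.0 - tz
                weight = wx * wy * wz
                vertex = grid[i0 + di][j0 + dj][k0 + dk]
                for ch in range(dim):
                    out[ch] += weight * vertex[ch]
    return out


# A 2x2x2 grid whose single feature is the linear function i + 2j + 4k.
demo_grid = [[[[float(i + 2 * j + 4 * k)] for k in range(2)]
              for j in range(2)] for i in range(2)]
value = trilinear_interpolate(demo_grid, 0.5, 0.25, 0.75)
```

Nearby query points often fall in the same grid cell and therefore share the same eight vertices, which is the kind of redundancy that merging read requests can exploit.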
Gen-NeRF: Efficient and Generalizable Neural Radiance Fields via Algorithm-Hardware Co-Design
Novel view synthesis is an essential functionality for enabling immersive
experiences in various Augmented- and Virtual-Reality (AR/VR) applications, for
which generalizable Neural Radiance Fields (NeRFs) have gained increasing
popularity thanks to their cross-scene generalization capability. Despite their
promise, the real-device deployment of generalizable NeRFs is bottlenecked by
their prohibitive complexity due to the required massive memory accesses to
acquire scene features, causing their ray marching process to be
memory-bounded. To this end, we propose Gen-NeRF, an algorithm-hardware
co-design framework dedicated to generalizable NeRF acceleration, which for the
first time enables real-time generalizable NeRFs. On the algorithm side,
Gen-NeRF integrates a coarse-then-focus sampling strategy, leveraging the fact
that different regions of a 3D scene contribute differently to the rendered
pixel, to enable sparse yet effective sampling. On the hardware side, Gen-NeRF
highlights an accelerator micro-architecture to maximize the data reuse
opportunities among different rays by making use of their epipolar geometric
relationship. Furthermore, our Gen-NeRF accelerator features a customized
dataflow to enhance data locality during point-to-hardware mapping and an
optimized scene feature storage strategy to minimize memory bank conflicts.
Extensive experiments validate the effectiveness of our proposed Gen-NeRF
framework in enabling real-time and generalizable novel view synthesis.
Comment: Accepted by ISCA 202
Identification of an eight-LncRNA signature as the prognostic LncRNA markers in hepatocellular carcinoma patients
Long non-coding RNA (lncRNA) signatures have been reported to be useful for predicting cancer prognosis. In the present study, we constructed a lncRNA model to predict the survival outcomes of hepatocellular carcinoma (HCC) patients. Using the transcriptome data from TCGA HCC samples, we found that NEAT1 and MALAT1 were highly expressed in HCC and other tumor subtypes compared to the adjacent normal tissues. Based on the LASSO Cox regression model, we identified an eight-lncRNA signature that significantly correlated with the overall survival and disease-free survival of the HCC training group. The prognostic value of this signature was validated using the test group. Further analysis suggested that this signature was associated with clinicopathological parameters such as vascular tumor invasion, pathological stage, and tumor grade. Integrated functional analysis showed that these eight lncRNAs were involved in the cell cycle, metabolic process, and immune response. In conclusion, we constructed an applicable eight-lncRNA signature that is robust and reliable for the prognosis of HCC. This signature may provide an efficient clinical prediction for HCC patients, and further study is required to better uncover the function of the identified lncRNAs