Geo2SigMap: High-Fidelity RF Signal Mapping Using Geographic Databases
Radio frequency (RF) signal mapping, which is the process of analyzing and
predicting the RF signal strength and distribution across specific areas, is
crucial for cellular network planning and deployment. Traditional approaches to
RF signal mapping rely either on statistical models built from measurement
data, which offer low complexity but often lack accuracy, or on ray tracing tools,
which provide enhanced precision for the target area but suffer from increased
computational complexity. Recently, machine learning (ML) has emerged as a
data-driven method for modeling RF signal propagation, which leverages models
trained on synthetic datasets to perform RF signal mapping in "unseen" areas.
In this paper, we present Geo2SigMap, an ML-based framework for efficient and
high-fidelity RF signal mapping using geographic databases. First, we develop
an automated framework that seamlessly integrates three open-source tools:
OpenStreetMap (geographic databases), Blender (computer graphics), and Sionna
(ray tracing), enabling the efficient generation of large-scale 3D building
maps and ray tracing models. Second, we propose a cascaded U-Net model, which
is pre-trained on synthetic datasets and employed to generate detailed RF
signal maps, leveraging environmental information and sparse measurement data.
Finally, we evaluate the performance of Geo2SigMap via a real-world measurement
campaign, where three types of user equipment (UE) collect over 45,000 data
points related to cellular information from six LTE cells operating in the
citizens broadband radio service (CBRS) band. Our results show that Geo2SigMap
achieves an average root-mean-square-error (RMSE) of 6.04 dB for predicting the
reference signal received power (RSRP) at the UE, representing an average RMSE
improvement of 3.59 dB compared to existing methods.
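For intuition, here is a minimal PyTorch sketch of the cascaded-U-Net idea: a first U-Net maps environmental input (e.g., a building map) to a coarse signal map, and a second U-Net refines it using sparse measurements. The layer sizes, input channels, and the way measurements are injected are illustrative assumptions, not the paper's exact architecture.

```python
import torch
import torch.nn as nn

def conv_block(cin, cout):
    return nn.Sequential(
        nn.Conv2d(cin, cout, 3, padding=1), nn.ReLU(inplace=True),
        nn.Conv2d(cout, cout, 3, padding=1), nn.ReLU(inplace=True))

class TinyUNet(nn.Module):
    """Minimal one-level U-Net, used only to illustrate the cascade."""
    def __init__(self, cin, cout=1):
        super().__init__()
        self.enc = conv_block(cin, 32)
        self.down = nn.MaxPool2d(2)
        self.mid = conv_block(32, 64)
        self.up = nn.ConvTranspose2d(64, 32, 2, stride=2)
        self.dec = conv_block(64, 32)
        self.head = nn.Conv2d(32, cout, 1)

    def forward(self, x):
        e = self.enc(x)
        m = self.mid(self.down(e))
        d = self.dec(torch.cat([self.up(m), e], dim=1))
        return self.head(d)

class CascadedUNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.stage1 = TinyUNet(cin=1)  # building map -> coarse signal map
        self.stage2 = TinyUNet(cin=3)  # building map + coarse map + sparse RSRP

    def forward(self, building_map, sparse_rsrp):
        coarse = self.stage1(building_map)
        return self.stage2(torch.cat([building_map, coarse, sparse_rsrp], dim=1))

model = CascadedUNet()
out = model(torch.randn(1, 1, 128, 128), torch.randn(1, 1, 128, 128))
print(out.shape)  # torch.Size([1, 1, 128, 128])
```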
Examining the Effects of a Bike and E-Bike Lending Program on Commuting Behavior
In 2015, Google added a new transportation demand management (TDM) program to increase bike commuting to its two main campuses in Mountain View and Sunnyvale, California. An initial survey of employees indicated that lack of bike ownership and worry about maintenance were the primary barriers to bicycling. With this information, Google began a program that loaned high-quality electric-assisted and conventional bicycles to interested employees for a period of six months at no cost. This research evaluates the effectiveness of the program at changing travel behavior to the corporate campuses using self-reported and smartphone-integrated travel data. With over 1,000 bikes in its inventory, the lending program at Google is one of the largest, if not the largest, employer-sponsored bike and e-bike lending programs in North America. Thus, the evaluation of this program is a critical first step toward understanding how bike lending can influence travel behavior in North American suburban contexts.
CluCDD: Contrastive Dialogue Disentanglement via Clustering
A huge number of multi-participant dialogues happen online every day, which
leads to difficulty in understanding the nature of dialogue dynamics for both
humans and machines. Dialogue disentanglement aims at separating an entangled
dialogue into detached sessions, thus increasing the readability of long
disordered dialogue. Previous studies mainly focus on two-step methods that
perform message-pair classification followed by clustering, which cannot
guarantee the overall clustering quality of a dialogue. To address this
challenge, we
propose a simple yet effective model named CluCDD, which aggregates utterances
by contrastive learning. More specifically, our model pulls utterances in the
same session together and pushes away utterances in different ones. Then a
clustering method is adopted to generate predicted clustering labels.
Comprehensive experiments conducted on the Movie Dialogue dataset and IRC
dataset demonstrate that our model achieves a new state-of-the-art result.
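As an illustration of the pull-together/push-apart objective followed by clustering, here is a minimal sketch; the specific supervised-contrastive loss form and the use of KMeans are assumptions for clarity, not necessarily CluCDD's exact choices.

```python
import torch
import torch.nn.functional as F
from sklearn.cluster import KMeans

def session_contrastive_loss(emb, session_ids, tau=0.1):
    """emb: (N, d) utterance embeddings; session_ids: (N,) session labels."""
    z = F.normalize(emb, dim=1)
    sim = z @ z.t() / tau                                   # pairwise cosine sims
    n = z.size(0)
    self_mask = torch.eye(n, dtype=torch.bool, device=z.device)
    # Positives: utterances in the same session (excluding the anchor itself).
    pos = (session_ids.unsqueeze(0) == session_ids.unsqueeze(1)) & ~self_mask
    sim = sim.masked_fill(self_mask, float('-inf'))
    log_prob = sim - torch.logsumexp(sim, dim=1, keepdim=True)
    return -log_prob[pos].mean()  # maximize likelihood of same-session pairs

# Toy usage: 12 utterances from 3 sessions.
emb = torch.randn(12, 64, requires_grad=True)
sessions = torch.tensor([0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2])
loss = session_contrastive_loss(emb, sessions)
loss.backward()

# After training, a clustering step assigns predicted session labels.
labels = KMeans(n_clusters=3, n_init=10).fit_predict(emb.detach().numpy())
```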
Toward Real-World Light Field Super-Resolution
Deep learning has opened up new possibilities for light field
super-resolution (SR), but existing methods trained on synthetic datasets with
simple degradations (e.g., bicubic downsampling) suffer from poor performance
when applied to complex real-world scenarios. To address this problem, we
introduce LytroZoom, the first real-world light field SR dataset capturing
paired low- and high-resolution light fields of diverse indoor and outdoor
scenes using a Lytro ILLUM camera. Additionally, we propose the Omni-Frequency
Projection Network (OFPNet), which decomposes the omni-frequency components and
iteratively enhances them through frequency projection operations to address
spatially variant degradation processes present in all frequency components.
Experiments demonstrate that models trained on LytroZoom outperform those
trained on synthetic datasets and are generalizable to diverse content and
devices. Quantitative and qualitative evaluations verify the superiority of
OFPNet. We believe this work will inspire future research in real-world light
field SR.
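As a rough illustration of frequency decomposition in the spirit of OFPNet (the actual omni-frequency projection operations differ; all layer choices below are hypothetical), this sketch splits an input into low- and high-frequency components via a Gaussian blur and enhances each separately before recombining.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def gaussian_kernel(size=5, sigma=1.0):
    coords = torch.arange(size, dtype=torch.float32) - size // 2
    g = torch.exp(-coords ** 2 / (2 * sigma ** 2))
    k = torch.outer(g, g)
    return (k / k.sum()).view(1, 1, size, size)

class FrequencySplitSR(nn.Module):
    def __init__(self, channels=1):
        super().__init__()
        # Depthwise Gaussian blur extracts the low-frequency component.
        self.register_buffer('blur', gaussian_kernel().repeat(channels, 1, 1, 1))
        self.low_branch = nn.Conv2d(channels, channels, 3, padding=1)
        self.high_branch = nn.Conv2d(channels, channels, 3, padding=1)
        self.channels = channels

    def forward(self, x):
        low = F.conv2d(x, self.blur, padding=2, groups=self.channels)
        high = x - low                    # residual high-frequency detail
        return self.low_branch(low) + self.high_branch(high)

sr = FrequencySplitSR()
y = sr(torch.randn(1, 1, 64, 64))
print(y.shape)  # torch.Size([1, 1, 64, 64])
```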
An intelligent mobile application testing experience report
Artificial intelligence applications provide tremendous opportunities to improve human life and drive innovation. AI systems and applications that operate in a real-world environment have to handle an effectively infinite set of feasible scenarios. Conventional approaches to testing AI applications allow only limited testing, do not take the different operating contexts into consideration, and may lead to insufficient validation and characterization. Therefore, to ensure the robustness, certainty, and reliability of AI applications, the authors applied a classification-based AI software testing framework and 3D decision tables to generate test cases. Moreover, the authors compared the quality assurance metrics (accuracy, correctness, reliability, and consistency) of AI and non-AI functions in an AI mobile application scenario. Our results indicate and confirm that complete validation of AI functions is not possible with conventional testing methods, but that the proposed AI software testing strategy, based on a classification framework and 3D decision tables, is effective.
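A hedged sketch of how a 3D decision table can drive test-case generation: each combination along three condition dimensions becomes one test case. The three dimensions below are hypothetical placeholders, not the authors' actual table.

```python
import itertools

# Hypothetical condition dimensions of a 3D decision table.
lighting   = ["bright", "dim", "dark"]     # dimension 1: environment
motion     = ["static", "slow", "fast"]    # dimension 2: device movement
input_kind = ["text", "voice", "image"]    # dimension 3: input modality

def generate_test_cases():
    """Full-factorial expansion of the table: 3 * 3 * 3 = 27 test cases."""
    for i, (light, move, kind) in enumerate(
            itertools.product(lighting, motion, input_kind), start=1):
        yield {"id": i, "lighting": light, "motion": move, "input": kind,
               "expected": "app responds correctly within latency budget"}

for case in generate_test_cases():
    print(case)
```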
Spiking Neural Network for Ultra-low-latency and High-accurate Object Detection
Spiking Neural Networks (SNNs) have garnered widespread interest for their
energy efficiency and brain-inspired event-driven properties. While recent
methods like Spiking-YOLO have extended SNNs to the more challenging task of
object detection, they often suffer from high latency and low detection
accuracy, making them difficult to deploy on latency-sensitive mobile
platforms. Furthermore, conversion from Artificial Neural Networks (ANNs) to
SNNs struggles to preserve the complete structure of the ANN, resulting in
poor feature representation and high conversion errors. To address
these challenges, we propose two methods: timesteps compression and
spike-time-dependent integrated (STDI) coding. The former reduces the timesteps
required in ANN-SNN conversion by compressing information, while the latter
sets a time-varying threshold to expand the information holding capacity. We
also present an SNN-based ultra-low-latency, high-accuracy object detection
model (SUHD) that achieves state-of-the-art performance on nontrivial datasets
such as PASCAL VOC and MS COCO, with a remarkable roughly 750x reduction in
timesteps and a 30% mean average precision (mAP) improvement over Spiking-YOLO
on the MS COCO dataset. To the best of our knowledge, SUHD is the deepest
spike-based object detection model to date to achieve lossless conversion at
such ultra-low timestep counts.
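To illustrate the time-varying threshold idea behind STDI coding (the mechanics here are assumed for exposition, not the paper's exact scheme), this toy integrate-and-fire neuron lowers its firing threshold each timestep, letting later spikes carry finer-grained information so fewer timesteps are needed.

```python
import torch

def if_neuron_time_varying(inputs, base_threshold=1.0, decay=0.5):
    """inputs: (T, N) input currents over T timesteps; returns (T, N) spikes."""
    T, N = inputs.shape
    v = torch.zeros(N)                          # membrane potential
    spikes = torch.zeros(T, N)
    for t in range(T):
        theta = base_threshold * (decay ** t)   # threshold shrinks each step
        v = v + inputs[t]                       # integrate input current
        fired = (v >= theta).float()
        spikes[t] = fired
        v = v - theta * fired                   # soft reset by fired threshold
    return spikes

spk = if_neuron_time_varying(torch.rand(4, 8) * 0.6)
print(spk.sum(dim=0))  # spike counts per neuron across 4 timesteps
```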
Latency-aware Unified Dynamic Networks for Efficient Image Recognition
Dynamic computation has emerged as a promising avenue to enhance the
inference efficiency of deep networks. It allows selective activation of
computational units, leading to a reduction in unnecessary computations for
each input sample. However, the actual efficiency of these dynamic models can
deviate from theoretical predictions. This mismatch arises from: 1) the lack of
a unified approach due to fragmented research; 2) the focus on algorithm design
over critical scheduling strategies, especially in CUDA-enabled GPU contexts;
and 3) challenges in measuring practical latency, given that most libraries
cater to static operations. Addressing these issues, we unveil the
Latency-Aware Unified Dynamic Networks (LAUDNet), a framework that integrates
three primary dynamic paradigms: spatially adaptive computation, dynamic layer
skipping, and dynamic channel skipping. To bridge the theoretical and practical
efficiency gap, LAUDNet merges algorithmic design with scheduling optimization,
guided by a latency predictor that accurately gauges dynamic operator latency.
We've tested LAUDNet across multiple vision tasks, demonstrating its capacity
to notably reduce the latency of models like ResNet-101 by over 50% on
platforms such as V100, RTX3090, and TX2 GPUs. Notably, LAUDNet stands out in
balancing accuracy and efficiency. Code is available at:
https://www.github.com/LeapLabTHU/LAUDNet
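As a concrete illustration of one of the three paradigms, here is a minimal sketch of per-sample dynamic layer skipping; the gate design is a generic assumption and not LAUDNet's actual operator or scheduling strategy.

```python
import torch
import torch.nn as nn

class SkippableBlock(nn.Module):
    """Residual block whose body can be skipped entirely per input sample."""
    def __init__(self, channels):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1))
        self.gate = nn.Sequential(              # tiny per-sample skip decision
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(channels, 1), nn.Sigmoid())

    def forward(self, x):
        p = self.gate(x)                        # (B, 1) execution probability
        if self.training:
            # Differentiable soft gating during training.
            return x + p.view(-1, 1, 1, 1) * self.body(x)
        # Hard skip at inference: run the body only where the gate fires,
        # which is where the real (not just theoretical) latency savings come from.
        keep = p.squeeze(1) > 0.5
        out = x.clone()
        if keep.any():
            out[keep] = x[keep] + self.body(x[keep])
        return out

block = SkippableBlock(16).eval()
y = block(torch.randn(2, 16, 32, 32))
print(y.shape)  # torch.Size([2, 16, 32, 32])
```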
kTrans: Knowledge-Aware Transformer for Binary Code Embedding
Binary Code Embedding (BCE) has important applications in various reverse
engineering tasks such as binary code similarity detection, type recovery,
control-flow recovery and data-flow analysis. Recent studies have shown that
the Transformer model can comprehend the semantics of binary code to support
downstream tasks. However, existing models overlooked the prior knowledge of
assembly language. In this paper, we propose a novel Transformer-based
approach, namely kTrans, to generate knowledge-aware binary code embedding. By
feeding explicit knowledge as additional inputs to the Transformer, and fusing
implicit knowledge with a novel pre-training task, kTrans provides a new
perspective on incorporating domain knowledge into a Transformer framework. We
inspect the generated embeddings with outlier detection and visualization, and
also apply kTrans to 3 downstream tasks: Binary Code Similarity Detection
(BCSD), Function Type Recovery (FTR) and Indirect Call Recognition (ICR).
Evaluation results show that kTrans can generate high-quality binary code
embeddings, and outperforms state-of-the-art (SOTA) approaches on downstream
tasks by 5.2%, 6.8%, and 12.6% respectively. kTrans is publicly available at:
https://github.com/Learner0x5a/kTrans-release
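For intuition, here is a minimal sketch of feeding explicit knowledge as additional Transformer inputs: per-instruction knowledge ids (the opcode and operand-type fields below are illustrative assumptions) get their own embeddings, which are summed with the token embedding before encoding. kTrans's actual knowledge types, fusion, and pre-training task are richer than this.

```python
import torch
import torch.nn as nn

class KnowledgeAwareEncoder(nn.Module):
    def __init__(self, vocab=10000, n_opcodes=1500, n_operand_types=32, d=256):
        super().__init__()
        self.tok = nn.Embedding(vocab, d)
        self.opcode = nn.Embedding(n_opcodes, d)         # explicit knowledge 1
        self.operand = nn.Embedding(n_operand_types, d)  # explicit knowledge 2
        layer = nn.TransformerEncoderLayer(d, nhead=8, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=4)

    def forward(self, tokens, opcode_ids, operand_ids):
        # Sum-fuse token embeddings with explicit-knowledge embeddings.
        x = self.tok(tokens) + self.opcode(opcode_ids) + self.operand(operand_ids)
        h = self.encoder(x)                  # (B, L, d) contextual embeddings
        return h.mean(dim=1)                 # simple pooled function embedding

enc = KnowledgeAwareEncoder()
emb = enc(torch.randint(0, 10000, (2, 64)),
          torch.randint(0, 1500, (2, 64)),
          torch.randint(0, 32, (2, 64)))
print(emb.shape)  # torch.Size([2, 256])
```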