A complete catalogue of broad-line AGNs and double-peaked emission lines from MaNGA integral-field spectroscopy of 10K galaxies: stellar population of AGNs, supermassive black holes, and dual AGNs
We analyse the integral-field spectroscopy data for the galaxies in the final
data release of the MaNGA survey. We identify 188 galaxies whose emission
lines cannot be described by single Gaussian components. These galaxies can be
classified into (1) 38 galaxies with both broad lines and broad [O III] λ5007
lines, (2) 101 galaxies with broad lines but no broad [O III] λ5007 lines, and
(3) 49 galaxies with double-peaked narrow emission lines. Most of the
broad-line galaxies are classified as Active Galactic Nuclei (AGN) from their
line ratios. The catalogue helps us further understand AGN-galaxy coevolution
through the stellar populations of the broad-line host galaxies and the
relation between the broad-line properties and the dynamical properties of the
hosts. The stellar population properties (including mass, age, and
metallicity) of the broad-line host galaxies suggest that there is no
significant difference between narrow-line Seyfert 2 galaxies and Type-1 AGN
with broad lines. We use the broad-line width and luminosity to estimate the
black hole masses in these galaxies, and test the black hole-host galaxy
scaling relation in Type-1 AGN host galaxies. Furthermore, we find three
dual-AGN candidates supported by radio images from the VLA FIRST survey. This
sample may be useful for further studies of AGN activity and feedback
processes.
Comment: 21 pages, 17 figures, LaTeX. Accepted by MNRAS
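The abstract does not spell out the single-epoch virial estimator behind these
black hole masses. Assuming the broad line in question is Hα (a common choice
in MaNGA broad-line studies, though an assumption here), one widely used
calibration (Greene & Ho 2005) has the form

  M_{\rm BH} \approx 2.0 \times 10^{6}
    \left( L_{\rm H\alpha} / 10^{42}\,{\rm erg\,s^{-1}} \right)^{0.55}
    \left( {\rm FWHM}_{\rm H\alpha} / 10^{3}\,{\rm km\,s^{-1}} \right)^{2.06}
    M_{\odot},

where the broad-line luminosity traces the broad-line-region size and the FWHM
traces the virial velocity. Whether the paper adopts this particular
calibration is not stated in the abstract.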
A Data Middleware for Obtaining Trusted Price Data for Blockchain
As a trusted middleware connecting the blockchain to the real world, a
blockchain oracle can supply trusted real-time price information to on-chain
financial applications such as payment, settlement, and asset valuation.
However, current oracle schemes face a trade-off between security and service
quality during node selection, and the implicit interest relationships in
financial applications create a significant conflict of interest between task
publishers and executors, which reduces the willingness of both parties to
participate and weakens system security. This paper therefore proposes a node
selection scheme that anonymously selects high-reputation nodes to participate
in tasks, ensuring both the security and the service quality of the selected
nodes. The paper also details the interests and behavioral motives of all
parties in the payment-settlement and asset-valuation scenarios. Under the
assumption of rational participants, an incentive mechanism based on a
Stackelberg game is proposed; it reaches an equilibrium in which task
publishers and executors each pursue their own interests, thereby protecting
the interests of all types of users and increasing their willingness to
participate. Finally, we verify the security of the proposed scheme through a
security analysis. Experimental results show that the proposed scheme reduces
the variance of the obtained price data by about 55% while maintaining
security and satisfying the interests of all parties.
Comment: 12 pages, 8 figures
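As a rough illustration of the Stackelberg structure described above (not the
paper's actual utility model), the sketch below solves a leader-follower
reward game by backward induction: the publisher picks a reward over a grid,
and each executor best-responds with a closed-form effort under an assumed
linear-quadratic utility. All function names and parameter values are
hypothetical.

# Backward-induction sketch of a Stackelberg game between a task publisher
# (leader, sets the reward) and executors (followers, choose effort).
# The linear-quadratic utilities and all parameter values are assumptions.
import numpy as np

def follower_effort(reward, cost_coef):
    # Follower maximizes reward * e - cost_coef * e**2,
    # giving the best response e* = reward / (2 * cost_coef).
    return reward / (2.0 * cost_coef)

def leader_utility(reward, cost_coefs, value_per_effort=3.0):
    # Leader values aggregate effort but pays `reward` per unit of effort.
    efforts = np.array([follower_effort(reward, c) for c in cost_coefs])
    return (value_per_effort - reward) * efforts.sum()

cost_coefs = [0.8, 1.0, 1.5]              # heterogeneous executor costs (assumed)
rewards = np.linspace(0.01, 3.0, 300)     # leader's strategy grid
best = max(rewards, key=lambda r: leader_utility(r, cost_coefs))
print(f"equilibrium reward ~ {best:.2f}")
print("executor efforts:", [round(follower_effort(best, c), 3) for c in cost_coefs])

The equilibrium reward balances the value the publisher derives from aggregate
effort against the payments it makes, which is the qualitative mechanism the
abstract relies on.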
FedDCT: A Dynamic Cross-Tier Federated Learning Scheme in Wireless Communication Networks
With the rapid proliferation of Internet of Things (IoT) devices and growing
public concern about data privacy, Federated Learning (FL) has gained
significant attention as a privacy-preserving machine learning paradigm. FL
enables the training of a global model among clients without exposing local
data. However, when a federated learning system runs over wireless
communication networks, limited wireless resources, client heterogeneity, and
network transmission failures degrade its performance and accuracy. In this
study, we propose a novel dynamic cross-tier FL scheme, named FedDCT, to
improve training accuracy and performance in wireless communication networks.
We utilize a tiering algorithm that dynamically divides clients into different
tiers according to specific indicators and assigns a specific timeout
threshold to each tier to reduce the required training time. To improve the
accuracy of the model without increasing the training time, we introduce a
cross-tier client selection algorithm that effectively selects the tiers and
participants. Simulation experiments show that our scheme makes the model
converge faster and achieve higher accuracy in wireless communication
networks.
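The tiering indicators and timeout rule are not specified in the abstract; the
sketch below illustrates one plausible reading, grouping clients by observed
round latency via quantiles and giving each tier a timeout slightly above its
slowest member. The quantile split, the 1.2x margin, and all names are
assumptions for illustration.

# Illustrative sketch of dynamic client tiering by observed round latency and
# per-tier timeout assignment. The actual FedDCT indicators and thresholds are
# not given in the abstract; the quantile split and margin below are assumed.
import numpy as np

def assign_tiers(latencies, num_tiers=3):
    """Split clients into tiers by latency quantiles (fast -> slow)."""
    edges = np.quantile(latencies, np.linspace(0, 1, num_tiers + 1))
    tiers = np.clip(np.searchsorted(edges, latencies, side="right") - 1,
                    0, num_tiers - 1)
    return tiers, edges

def tier_timeouts(latencies, tiers, margin=1.2):
    """Give each tier a timeout slightly above its slowest observed client."""
    return {int(t): margin * latencies[tiers == t].max() for t in np.unique(tiers)}

rng = np.random.default_rng(0)
latencies = rng.lognormal(mean=1.0, sigma=0.6, size=20)   # simulated round times (s)
tiers, _ = assign_tiers(latencies)
print(tier_timeouts(latencies, tiers))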
Robo3D: Towards Robust and Reliable 3D Perception against Corruptions
The robustness of 3D perception systems under natural corruptions from
environments and sensors is pivotal for safety-critical applications. Existing
large-scale 3D perception datasets often contain data that are meticulously
cleaned. Such configurations, however, cannot reflect the reliability of
perception models during the deployment stage. In this work, we present
Robo3D, the first comprehensive benchmark for probing the robustness of 3D
detectors and segmentors under out-of-distribution scenarios against natural
corruptions that occur in real-world environments. Specifically, we consider
eight corruption types stemming from adverse weather conditions, external
disturbances, and internal sensor failure. We uncover that, although promising
results have been progressively achieved on standard benchmarks,
state-of-the-art 3D perception models remain vulnerable to corruptions. We
draw key observations on the use of data representations, augmentation
schemes, and training strategies that can severely affect model performance.
To pursue better robustness, we propose a density-insensitive training
framework along with a simple yet flexible voxelization strategy to enhance
model resiliency. We hope our benchmark and approach can inspire future
research on designing more robust and reliable 3D perception models. Our
robustness benchmark suite is publicly available.
Comment: 33 pages, 26 figures, 26 tables; code at
https://github.com/ldkong1205/Robo3D; project page at
https://ldkong.com/Robo3
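For intuition, the toy snippet below simulates two corruption styles in the
spirit of those mentioned above: random point dropping (as with sensor failure
or external disturbance) and Gaussian jitter (as with measurement noise). The
severity values are illustrative and do not reproduce the Robo3D corruption
protocol.

# Toy point-cloud corruptions: random dropping and Gaussian jitter.
# Severity values are illustrative, not the benchmark's settings.
import numpy as np

def drop_points(points, drop_ratio=0.3, rng=None):
    rng = rng or np.random.default_rng()
    keep = rng.random(len(points)) > drop_ratio   # keep each point with prob 1 - drop_ratio
    return points[keep]

def jitter_points(points, sigma=0.02, rng=None):
    rng = rng or np.random.default_rng()
    return points + rng.normal(scale=sigma, size=points.shape)

cloud = np.random.default_rng(0).uniform(-50, 50, size=(100_000, 3))  # fake LiDAR scan
corrupted = jitter_points(drop_points(cloud, 0.3), 0.02)
print(cloud.shape, "->", corrupted.shape)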
LoGoNet: Towards Accurate 3D Object Detection with Local-to-Global Cross-Modal Fusion
LiDAR-camera fusion methods have shown impressive performance in 3D object
detection. Recent advanced multi-modal methods mainly perform global fusion,
where image features and point cloud features are fused across the whole scene.
Such practice lacks fine-grained region-level information, yielding suboptimal
fusion performance. In this paper, we present the novel Local-to-Global fusion
network (LoGoNet), which performs LiDAR-camera fusion at both local and global
levels. Concretely, the Global Fusion (GoF) of LoGoNet is built upon previous
literature, while we exclusively use point centroids to more precisely
represent the position of voxel features, thus achieving better cross-modal
alignment. As for the Local Fusion (LoF), we first divide each proposal into
uniform grids and then project these grid centers to the images. The image
features around the projected grid points are sampled to be fused with
position-decorated point cloud features, maximally utilizing the rich
contextual information around the proposals. The Feature Dynamic Aggregation
(FDA) module is further proposed to achieve information interaction between
these locally and globally fused features, thus producing more informative
multi-modal features. Extensive experiments on both Waymo Open Dataset (WOD)
and KITTI datasets show that LoGoNet outperforms all state-of-the-art 3D
detection methods. Notably, LoGoNet ranks 1st on Waymo 3D object detection
leaderboard and obtains 81.02 mAPH (L2) detection performance. It is noteworthy
that, for the first time, the detection performance on three classes surpasses
80 APH (L2) simultaneously. Code will be available at
https://github.com/sankin97/LoGoNet.
Comment: Accepted by CVPR 2023
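A minimal sketch of the local-fusion step described above: project 3D grid
centers through a pinhole camera model and bilinearly sample image features at
the projected locations. The camera intrinsics, feature shapes, and function
names are assumptions, not LoGoNet's actual implementation.

# Sketch: project 3D grid centers into the image plane and bilinearly sample
# image features there. Simplified pinhole model; shapes/values are assumed.
import torch
import torch.nn.functional as F

def sample_image_features(img_feat, centers_3d, intrinsics):
    """img_feat: (1, C, H, W); centers_3d: (N, 3) in camera frame; intrinsics: (3, 3)."""
    _, _, H, W = img_feat.shape
    uvw = centers_3d @ intrinsics.T                   # pinhole projection
    uv = uvw[:, :2] / uvw[:, 2:3].clamp(min=1e-6)     # pixel coordinates
    # Normalize pixel coordinates to [-1, 1] for grid_sample.
    grid = torch.stack([uv[:, 0] / (W - 1) * 2 - 1,
                        uv[:, 1] / (H - 1) * 2 - 1], dim=-1).view(1, 1, -1, 2)
    sampled = F.grid_sample(img_feat, grid, align_corners=True)  # (1, C, 1, N)
    return sampled.squeeze(2).squeeze(0).T            # (N, C), one feature per center

img_feat = torch.randn(1, 64, 48, 160)                # toy image feature map
centers = torch.rand(32, 3) * torch.tensor([20.0, 5.0, 40.0]) \
          + torch.tensor([-10.0, -2.5, 1.0])          # grid centers with z > 0
K = torch.tensor([[400.0, 0.0, 80.0], [0.0, 400.0, 24.0], [0.0, 0.0, 1.0]])
print(sample_image_features(img_feat, centers, K).shape)   # torch.Size([32, 64])

Points that project outside the image are zero-padded by grid_sample, which is
one simple way to handle out-of-view grid centers in such a sketch.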
Rethinking Range View Representation for LiDAR Segmentation
LiDAR segmentation is crucial for autonomous driving perception. Recent
trends favor point- or voxel-based methods as they often yield better
performance than the traditional range view representation. In this work, we
unveil several key factors in building powerful range view models. We observe
that the "many-to-one" mapping, semantic incoherence, and shape deformation are
possible impediments against effective learning from range view projections. We
present RangeFormer -- a full-cycle framework comprising novel designs across
network architecture, data augmentation, and post-processing -- that better
handles the learning and processing of LiDAR point clouds from the range view.
We further introduce a Scalable Training from Range view (STR) strategy that
trains on arbitrary low-resolution 2D range images, while still maintaining
satisfactory 3D segmentation accuracy. We show that, for the first time, a
range view method is able to surpass the point, voxel, and multi-view fusion
counterparts in the competing LiDAR semantic and panoptic segmentation
benchmarks, i.e., SemanticKITTI, nuScenes, and ScribbleKITTI.
Comment: ICCV 2023; 24 pages, 10 figures, 14 tables; Webpage at
https://ldkong.com/RangeForme
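For readers unfamiliar with the range view, the sketch below performs a
standard spherical projection of a LiDAR point cloud into a 2D range image and
makes the "many-to-one" mapping visible: points falling into the same cell
overwrite one another. The image size and vertical field of view are assumed
values, not the paper's settings.

# Standard spherical (range-view) projection of a point cloud.
# H, W and the vertical field of view are assumed values.
import numpy as np

def range_projection(points, H=64, W=2048, fov_up=3.0, fov_down=-25.0):
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    depth = np.linalg.norm(points, axis=1)
    yaw = np.arctan2(y, x)
    pitch = np.arcsin(z / np.maximum(depth, 1e-6))
    fov_up, fov_down = np.radians(fov_up), np.radians(fov_down)
    u = 0.5 * (1.0 - yaw / np.pi) * W                         # column from azimuth
    v = (1.0 - (pitch - fov_down) / (fov_up - fov_down)) * H  # row from elevation
    u = np.clip(np.floor(u), 0, W - 1).astype(int)
    v = np.clip(np.floor(v), 0, H - 1).astype(int)
    image = np.full((H, W), -1.0)
    image[v, u] = depth       # later points overwrite earlier ones (many-to-one)
    return image

cloud = np.random.default_rng(0).normal(scale=15.0, size=(120_000, 3))
print((range_projection(cloud) >= 0).mean())   # fraction of filled range-image cells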
CLIP2Scene: Towards Label-efficient 3D Scene Understanding by CLIP
Contrastive Language-Image Pre-training (CLIP) achieves promising results in
2D zero-shot and few-shot learning. Despite its impressive performance in 2D,
applying CLIP to aid learning in 3D scene understanding has yet to be
explored. In this paper, we make the first attempt to investigate how CLIP
knowledge benefits 3D scene understanding. We propose CLIP2Scene, a simple yet
effective framework that transfers CLIP knowledge from 2D image-text
pre-trained models to a 3D point cloud network. We show that the pre-trained 3D
network yields impressive performance on various downstream tasks, i.e.,
annotation-free semantic segmentation and fine-tuning with labelled data.
Specifically, built upon CLIP, we design a Semantic-driven Cross-modal
Contrastive Learning framework that pre-trains a 3D network via semantic and
spatial-temporal consistency regularization. For the former, we first leverage
CLIP's text semantics to select the positive and negative point samples and
then employ the contrastive loss to train the 3D network. In terms of the
latter, we force the consistency between the temporally coherent point cloud
features and their corresponding image features. We conduct experiments on
SemanticKITTI, nuScenes, and ScanNet. For the first time, our pre-trained
network achieves annotation-free 3D semantic segmentation with 20.8% and 25.08%
mIoU on nuScenes and ScanNet, respectively. When fine-tuned with 1% or 100%
labelled data, our method significantly outperforms other self-supervised
methods, with improvements of 8% and 1% mIoU, respectively. Furthermore, we
demonstrate its generalizability to cross-domain datasets. Code is publicly
available at https://github.com/runnanchen/CLIP2Scene.
Comment: CVPR 2023
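A minimal sketch of the semantic-driven contrastive idea, assuming positives
are defined by matching each point to the CLIP text embedding of its class
(the paper's exact positive/negative selection and loss details may differ):

# InfoNCE-style contrastive loss between 3D point features and class-text
# embeddings. The pairing strategy and temperature are assumptions.
import torch
import torch.nn.functional as F

def point_text_contrastive(point_feat, text_feat, labels, tau=0.07):
    """point_feat: (N, D); text_feat: (K, D), one embedding per class; labels: (N,)."""
    point_feat = F.normalize(point_feat, dim=-1)
    text_feat = F.normalize(text_feat, dim=-1)
    logits = point_feat @ text_feat.T / tau      # (N, K) cosine similarities / tau
    return F.cross_entropy(logits, labels)       # positive = matching class text

points = torch.randn(1024, 512)                  # toy 3D point features
texts = torch.randn(16, 512)                     # e.g., CLIP text embeddings, 16 classes
labels = torch.randint(0, 16, (1024,))
print(point_text_contrastive(points, texts, labels).item())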
UniSeg: A Unified Multi-Modal LiDAR Segmentation Network and the OpenPCSeg Codebase
Point-, voxel-, and range-views are three representative forms of point
clouds. All of them have accurate 3D measurements but lack color and texture
information. RGB images are a natural complement to these point cloud views,
and fully exploiting their comprehensive information enables more robust
perception. In this paper, we present a unified multi-modal LiDAR segmentation
network, termed UniSeg, which leverages the information of RGB images and three
views of the point cloud, and accomplishes semantic segmentation and panoptic
segmentation simultaneously. Specifically, we first design the Learnable
cross-Modal Association (LMA) module to automatically fuse voxel-view and
range-view features with image features, which fully utilize the rich semantic
information of images and are robust to calibration errors. Then, the enhanced
voxel-view and range-view features are transformed to the point space, where
three views of point cloud features are further fused adaptively by the
Learnable cross-View Association module (LVA). Notably, UniSeg achieves
promising results on three public benchmarks, i.e., SemanticKITTI, nuScenes,
and the Waymo Open Dataset (WOD); it ranks 1st in two challenges of two
benchmarks: the LiDAR semantic segmentation challenge of nuScenes and the
panoptic segmentation challenge of SemanticKITTI. Besides, we construct the
OpenPCSeg
codebase, which is the largest and most comprehensive outdoor LiDAR
segmentation codebase. It contains most of the popular outdoor LiDAR
segmentation algorithms and provides reproducible implementations. The
OpenPCSeg codebase will be made publicly available at
https://github.com/PJLab-ADG/PCSeg.
Comment: ICCV 2023; 21 pages; 9 figures; 18 tables; Code at
https://github.com/PJLab-ADG/PCSeg
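As a rough sketch of what a learnable cross-modal association could look like,
the snippet below fuses LiDAR-view tokens with image tokens via cross-attention
and a residual projection; the head count, dimensions, and module name are
assumptions rather than UniSeg's actual LMA design.

# Generic cross-attention fusion: LiDAR-view tokens attend to image tokens.
# A sketch of the general idea only; not UniSeg's LMA module.
import torch
import torch.nn as nn

class CrossModalFusion(nn.Module):
    def __init__(self, dim=128):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, num_heads=4, batch_first=True)
        self.proj = nn.Linear(dim, dim)

    def forward(self, lidar_feat, image_feat):
        # lidar_feat: (B, N, D) voxel/range tokens; image_feat: (B, M, D) image tokens.
        fused, _ = self.attn(query=lidar_feat, key=image_feat, value=image_feat)
        return lidar_feat + self.proj(fused)     # residual fusion with the LiDAR stream

fusion = CrossModalFusion(128)
out = fusion(torch.randn(2, 4096, 128), torch.randn(2, 900, 128))
print(out.shape)   # torch.Size([2, 4096, 128])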