448 research outputs found
Collect-and-Distribute Transformer for 3D Point Cloud Analysis
Although remarkable advancements have been made recently in point cloud
analysis through the exploration of transformer architecture, it remains
challenging to effectively learn local and global structures within point
clouds. In this paper, we propose a new transformer architecture equipped with
a collect-and-distribute mechanism to communicate short- and long-range
contexts of point clouds, which we refer to as CDFormer. Specifically, we first
utilize self-attention to capture short-range interactions within each local
patch, and the updated local features are then collected into a set of proxy
reference points from which we can extract long-range contexts. Afterward, we
distribute the learned long-range contexts back to local points via
cross-attention. To address the position clues for short- and long-range
contexts, we also introduce context-aware position encoding to facilitate
position-aware communications between points. We perform experiments on four
popular point cloud datasets, namely ModelNet40, ScanObjectNN, S3DIS, and
ShapeNetPart, for classification and segmentation. Results show the
effectiveness of the proposed CDFormer, delivering several new state-of-the-art
performances on point cloud classification and segmentation tasks. The code is
available at https://github.com/haibo-qiu/CDFormer.
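The collect-and-distribute steps described in this abstract can be sketched in a few lines. This is an illustrative sketch only, not the authors' implementation: it assumes points are already grouped into patches, uses dot-product attention without learned projections, uses mean pooling as the "collect" step, and omits the context-aware position encoding. All function names are hypothetical.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    total = sum(es)
    return [e / total for e in es]

def attend(queries, keys, values):
    """Scaled dot-product attention with no learned projections (assumption)."""
    d = len(keys[0])
    out = []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in keys]
        w = softmax(scores)
        out.append([sum(wi * v[j] for wi, v in zip(w, values))
                    for j in range(len(values[0]))])
    return out

def residual(a, b):
    """Element-wise sum of two equally shaped feature lists."""
    return [[x + y for x, y in zip(r, s)] for r, s in zip(a, b)]

def collect_and_distribute(patches):
    """patches: list of patches, each a list of per-point feature vectors."""
    # 1. Short-range: self-attention inside each local patch.
    local = [residual(p, attend(p, p, p)) for p in patches]
    # 2. Collect: one proxy point per patch (mean pooling is an assumption).
    proxies = [[sum(col) / len(p) for col in zip(*p)] for p in local]
    # 3. Long-range: self-attention among the proxy points.
    proxies = residual(proxies, attend(proxies, proxies, proxies))
    # 4. Distribute: cross-attention, local points query the proxies.
    return [residual(p, attend(p, proxies, proxies)) for p in local]
```

Every point keeps its own feature vector, so the output has the same patch and point layout as the input; only the content of each feature is updated with short- and long-range context.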
Taming Fat-Tailed ("Heavier-Tailed" with Potentially Infinite Variance) Noise in Federated Learning
A key assumption in most existing works on FL algorithms' convergence
analysis is that the noise in stochastic first-order information has a finite
variance. Although this assumption covers all light-tailed (i.e.,
sub-exponential) and some heavy-tailed noise distributions (e.g., log-normal,
Weibull, and some Pareto distributions), it fails for many fat-tailed noise
distributions (i.e., "heavier-tailed" with potentially infinite variance)
that have been empirically observed in the FL literature. To date, it remains
unclear whether one can design convergent algorithms for FL systems that
experience fat-tailed noise. This motivates us to fill this gap in this paper
by proposing an algorithmic framework called FAT-Clipping (federated
averaging with two-sided learning rates and clipping), which
contains two variants: FAT-Clipping per-round (FAT-Clipping-PR) and
FAT-Clipping per-iteration (FAT-Clipping-PI). Specifically, for the largest
moment order at which the fat-tailed noise in FL still has a bounded moment,
we establish convergence rates for both variants in the strongly-convex and
general non-convex settings, with the rates expressed in terms of the
numbers of clients and communication rounds. Moreover, at the
expense of more clipping operations compared to FAT-Clipping-PR,
FAT-Clipping-PI further enjoys a linear speedup effect with respect to the
number of local updates at each client and is lower-bound-matching (i.e.,
order-optimal). Collectively, our results advance the understanding of
designing efficient algorithms for FL systems that exhibit fat-tailed
first-order oracle information.
Comment: Published as a conference paper at NeurIPS 202
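The difference between the two variants can be illustrated with a small sketch. It is a reading inferred from the variant names and the abstract, not the authors' code: it assumes the per-round variant clips each client's accumulated update once per round, the per-iteration variant clips every local stochastic gradient, and the server applies its own learning rate to the averaged update (the "two-sided" rates). All names and parameters are hypothetical.

```python
import math

def clip(vec, threshold):
    """Rescale vec so its Euclidean norm is at most threshold."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm <= threshold:
        return list(vec)
    return [x * threshold / norm for x in vec]

def local_update(w, grad_fn, local_steps, local_lr, threshold, per_iteration):
    """Run local SGD from w and return the client's (clipped) model change."""
    x = list(w)
    for _ in range(local_steps):
        g = grad_fn(x)
        if per_iteration:
            # Per-iteration variant (assumed): clip every stochastic gradient.
            g = clip(g, threshold)
        x = [xi - local_lr * gi for xi, gi in zip(x, g)]
    delta = [xi - wi for xi, wi in zip(x, w)]
    if not per_iteration:
        # Per-round variant (assumed): clip the accumulated update once.
        delta = clip(delta, threshold)
    return delta

def federated_round(w, client_grad_fns, local_steps, local_lr, global_lr,
                    threshold, per_iteration):
    """One communication round: average client updates, apply the server rate."""
    deltas = [local_update(w, fn, local_steps, local_lr, threshold, per_iteration)
              for fn in client_grad_fns]
    m = len(deltas)
    avg = [sum(d[j] for d in deltas) / m for j in range(len(w))]
    return [wi + global_lr * ai for wi, ai in zip(w, avg)]
```

Each element of `client_grad_fns` maps a parameter list to a stochastic gradient. To experiment with fat-tailed noise, one option is to add a symmetrized `random.paretovariate(1.5)` term to each gradient coordinate, since a Pareto distribution with shape 1.5 has infinite variance.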
Learning Composite Representations for Point Cloud Scene Understanding
In recent years, there has been significant interest in 3D point clouds from academia and industry. Unlike 2D images, point clouds consist of 3D points with Cartesian coordinates, offering an accurate, view-invariant representation of real-world scenes. Understanding 3D point cloud scenes is crucial for applications like autonomous driving, robotics, and AR/VR. However, comprehending these scenes is challenging due to their complex spatial structures and objects at various scales. Previous methods have often focused on learning representations that capture only local details or a specific scale, leading to suboptimal results.
This thesis aims to advance learning composite representations for scene understanding by integrating multiple clues. It explores composite learning schemes from two angles: the data side and the model side. From the data perspective, it suggests projecting 3D point clouds into various 2D views and using multi-view feature fusion to learn composite representations. An end-to-end trainable geometric flow network is introduced to achieve this, enabling the learning and fusion of multi-view representations. The model side investigates the design of local basic operators and a global framework. For local operators, the thesis introduces the collect-and-distribute block, which captures short- and long-range contexts simultaneously, effectively learning composite representations that incorporate sufficient contextual information. For the global framework, it addresses the challenge of multi-scale objects by employing high-resolution architectures that maintain high resolutions throughout the network and facilitate communication of multiple resolution features, efficiently learning multi-scale composite representations. Extensive experiments and analysis on popular benchmarks demonstrate the effectiveness of these approaches.
Soliton collisions in Bose-Einstein condensates with current-dependent interactions
We study general collisions between chiral solitons in Bose-Einstein
condensates subject to combined attractive and current-dependent interatomic
interactions. A simple analysis based on the linear superposition of the
solitons allows us to determine the relevant time and space scales of the
dynamics, which is illustrated by extensive numerical simulations. By varying
the differential amplitude, the relative phase, the average velocity, and the
relative velocity of the solitons, we characterize the different dynamical
regimes that give rise to oscillatory and interference phenomena. Apart from
the known inelastic character of the collisions, we show that the chiral
dynamics involves an amplitude reduction with respect to the case of regular
solitons. To compare with feasible ultracold gas experiments, the influence of
harmonic confinement is analyzed in both the emergence and the interaction of
chiral solitons.
Comment: 15 pages, 12 figures
Hybrid synchronization in coupled ultracold atomic gases
We study the time evolution of two coupled many-body quantum systems, one of which is assumed to be Bose condensed. Specifically, we consider two ultracold atomic clouds each populating two localized single-particle states, i.e., a two-component bosonic Josephson junction. The cold atom cloud can retain its coherence when coupled to the condensate and displays synchronization with the latter, differing from usual entrainment. We term this effect between the ultracold and the condensed clouds hybrid synchronization. The onset of synchronization, which we observe in the evolution of average properties of both gases when increasing their coupling, is found to be related to the many-body properties of the quantum gas, e.g., the condensed fraction and quantum fluctuations of the particle number differences. We discuss the effects of different initial preparations and the influence of unequal particle numbers for the two clouds, and we explore the dependence on the initial quantum state, e.g., coherent state, squeezed state, and Fock state, finding essentially the same phenomenology in all cases.
This work was supported by China Scholarship Council, the National Natural Science Foundation of China (Grants No. 11104217, No. 11205121, and No. 11402199). We acknowledge also partial financial support from the MINECO (Spain) Grants No. FIS2011-24154, No. FIS2014-54672-P, and No. FIS2014-60343-P; the Generalitat de Catalunya Grant No. 2014SGR-401; and European Union project QuProCS (Grant No. 641277). B.J.-D. is supported by the Ramón y Cajal program.
Peer Reviewed