Learning Representation for Clustering via Prototype Scattering and Positive Sampling
Existing deep clustering methods rely on either contrastive or non-contrastive
representation learning for the downstream clustering task. Contrastive
methods, thanks to negative pairs, learn uniform representations for
clustering; those negative pairs, however, may inevitably lead to the class
collision issue and consequently compromise clustering performance.
Non-contrastive methods, on the other hand, avoid the class collision issue,
but the resulting non-uniform representations may cause the collapse of
clustering. To enjoy the best of both worlds, this paper
presents a novel end-to-end deep clustering method with prototype scattering
and positive sampling, termed ProPos. Specifically, we first maximize the
distance between prototypical representations, named prototype scattering loss,
which improves the uniformity of representations. Second, we align one
augmented view of an instance with the sampled neighbors of another view --
assumed to form a truly positive pair in the embedding space -- to improve
within-cluster compactness, termed positive sampling alignment. The strengths
of ProPos are an avoided class collision issue, uniform representations,
well-separated clusters, and within-cluster compactness. We optimize ProPos in
an end-to-end expectation-maximization framework, and extensive experimental
results demonstrate that it achieves competitive performance on moderate-scale
clustering benchmark datasets and establishes new state-of-the-art performance
on large-scale datasets. Source code is available
at \url{https://github.com/Hzzone/ProPos}.
Comment: Accepted by TPAMI 202
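The two losses described in the abstract can be sketched in a few lines of NumPy. This is a minimal illustration of the ideas, not the paper's implementation: the temperature, the exact loss forms, and all function names are assumptions.

```python
import numpy as np

def l2_normalize(x, axis=-1, eps=1e-12):
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)

def prototype_scattering_loss(prototypes, tau=0.5):
    """Push L2-normalized cluster prototypes apart on the unit sphere.

    A generic contrastive-style uniformity term: the log-sum-exp of pairwise
    cosine similarities shrinks as prototypes scatter. Temperature `tau` is
    an illustrative hyperparameter, not the paper's setting.
    """
    p = l2_normalize(prototypes)
    sim = p @ p.T / tau                 # pairwise cosine similarities
    np.fill_diagonal(sim, -np.inf)      # exclude self-similarity
    return np.mean(np.log(np.sum(np.exp(sim), axis=1)))

def positive_sampling_alignment(view1, neighbors):
    """Align one view's embeddings with neighbors sampled around the other view."""
    d = l2_normalize(view1) - l2_normalize(neighbors)
    return np.mean(np.sum(d ** 2, axis=1))
```

As a sanity check, orthogonal prototypes (maximally scattered) yield a lower scattering loss than near-identical ones, and perfectly aligned positive pairs yield zero alignment loss.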
Modification of the AdaBoost-based Detector for Partially Occluded Faces
While face detection seems a solved problem under general conditions, most state-of-the-art systems degrade rapidly when faces are partially occluded by other objects. This paper presents a solution for detecting partially occluded faces by suitably modifying the AdaBoost-based face detector. Our basic idea is that the weak classifiers in the AdaBoost-based face detector, each corresponding to a Haar-like feature, are inherently a patch-based model. Therefore, one can divide the whole face region into multiple patches and map those weak classifiers to the patches. The weak classifiers belonging to each patch are re-formed into a new classifier that determines whether the patch is a valid (unoccluded) face patch. Finally, we combine all of the valid face patches, assigning different weights to the patches, to make the final decision on whether the input subwindow is a face. The experimental results show that the proposed method is promising for the detection of occluded faces.
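The patch-based reformulation can be sketched as follows. This is a hypothetical sketch of the grouping-and-voting scheme, assuming each weak classifier already outputs a signed score and a patch assignment; the thresholds, weights, and function names are illustrative, not the paper's actual implementation.

```python
import numpy as np

def patch_scores(weak_scores, patch_ids, n_patches):
    """Sum the weak-classifier scores that belong to each face patch."""
    scores = np.zeros(n_patches)
    for s, p in zip(weak_scores, patch_ids):
        scores[p] += s
    return scores

def detect_occluded_face(weak_scores, patch_ids, n_patches,
                         patch_thresh=0.0, weights=None, face_thresh=0.5):
    """Declare a face if the weighted fraction of valid (unoccluded) patches is high.

    Each patch's grouped classifier fires when its summed score exceeds
    `patch_thresh`; the patch votes are then combined with per-patch weights.
    """
    scores = patch_scores(np.asarray(weak_scores, dtype=float),
                          patch_ids, n_patches)
    valid = scores > patch_thresh           # per-patch validity decision
    w = np.ones(n_patches) if weights is None else np.asarray(weights, dtype=float)
    return float(np.dot(valid, w) / w.sum()) >= face_thresh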
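The patch-based reformulation can be sketched as follows. This is a hypothetical sketch of the grouping-and-voting scheme, assuming each weak classifier already outputs a signed score and a patch assignment; the thresholds, weights, and function names are illustrative, not the paper's actual implementation.

```python
import numpy as np

def patch_scores(weak_scores, patch_ids, n_patches):
    """Sum the weak-classifier scores that belong to each face patch."""
    scores = np.zeros(n_patches)
    for s, p in zip(weak_scores, patch_ids):
        scores[p] += s
    return scores

def detect_occluded_face(weak_scores, patch_ids, n_patches,
                         patch_thresh=0.0, weights=None, face_thresh=0.5):
    """Declare a face if the weighted fraction of valid (unoccluded) patches is high.

    Each patch's grouped classifier fires when its summed score exceeds
    `patch_thresh`; the patch votes are then combined with per-patch weights.
    """
    scores = patch_scores(np.asarray(weak_scores, dtype=float),
                          patch_ids, n_patches)
    valid = scores > patch_thresh           # per-patch validity decision
    w = np.ones(n_patches) if weights is None else np.asarray(weights, dtype=float)
    return float(np.dot(valid, w) / w.sum()) >= face_thresh
```

With equal weights and two patches, one occluded (negative-scoring) patch still allows a face decision, which is the point of the modification.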
New mixed adaptive detection algorithm for moving target with big data
To address the difficulties that complex backgrounds, illumination changes, shadows, and other factors pose for traditional methods of detecting a walking person, this paper proposes a new adaptive detection algorithm that mixes a Gaussian Mixture Model (GMM), an edge detection algorithm, and a continuous frame-difference algorithm. In the time domain, the new algorithm uses the GMM to model and update the background. In the spatial domain, it uses a hybrid detection algorithm, mixing edge detection, continuous frame difference, and the GMM, to obtain the initial contour of the moving target with big data, and then obtains the final moving target. The algorithm not only adapts to the illumination gradients and background disturbances occurring in the scene, but also solves problems such as inaccurate target detection, incomplete edge detection, cavitation, and ghosting that often appear in traditional algorithms. Experimental results show that the algorithm offers good real-time performance and robustness; it is easy to implement and accurately detects the moving target with big data.
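The fusion of the three cues can be sketched as below. This is a deliberately simplified illustration: a single running-average background model stands in for the full GMM, the edge detector is a crude gradient threshold rather than a proper operator, and all thresholds and names are assumptions.

```python
import numpy as np

def update_background(bg, frame, alpha=0.05):
    """Running-average background update (time domain); a GMM would keep
    several such models per pixel."""
    return (1 - alpha) * bg + alpha * frame

def background_mask(bg, curr, thresh=15):
    """Pixels that differ from the background model: candidate foreground."""
    return np.abs(curr.astype(int) - bg.astype(int)) > thresh

def frame_difference_mask(prev, curr, thresh=15):
    """Pixels that changed between consecutive frames."""
    return np.abs(curr.astype(int) - prev.astype(int)) > thresh

def edge_mask(frame, thresh=10):
    """Crude gradient-magnitude edges (spatial domain)."""
    gy, gx = np.gradient(frame.astype(float))
    return np.hypot(gx, gy) > thresh

def mixed_detection(bg, prev, curr):
    """Fuse the cues: foreground that is supported by motion or edges."""
    return background_mask(bg, curr) & (
        frame_difference_mask(prev, curr) | edge_mask(curr)
    )
```

Requiring the background-subtraction mask to agree with at least one of the other cues is one plausible way such a mix suppresses ghosting and holes left by any single detector.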
Learning to Distill Global Representation for Sparse-View CT
Sparse-view computed tomography (CT) -- using a small number of projections
for tomographic reconstruction -- enables much lower radiation dose to patients
and accelerated data acquisition. The reconstructed images, however, suffer
from strong artifacts, greatly limiting their diagnostic value. Current trends
for sparse-view CT turn to the raw data for better information recovery. The
resultant dual-domain methods, nonetheless, suffer from secondary artifacts,
especially in ultra-sparse view scenarios, and their generalization to other
scanners/protocols is greatly limited. A crucial question arises: have the
image post-processing methods reached the limit? Our answer is not yet. In this
paper, we stick to image post-processing methods due to their great flexibility
and propose a global representation (GloRe) distillation framework for
sparse-view CT, termed GloReDi. First, we propose to learn GloRe with Fourier
convolution, so that each element in GloRe has an image-wide receptive field.
Second, unlike
methods that only use the full-view images for supervision, we propose to
distill GloRe from intermediate-view reconstructed images that are readily
available but not explored in previous literature. The success of GloRe
distillation is attributed to two key components: representation directional
distillation to align the GloRe directions, and band-pass-specific contrastive
distillation to gain clinically important details. Extensive experiments
demonstrate the superiority of the proposed GloReDi over the state-of-the-art
methods, including dual-domain ones. The source code is available at
https://github.com/longzilicart/GloReDi.
Comment: ICCV 202
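The core property of Fourier convolution that the abstract relies on, namely that every output element depends on every input pixel, can be shown with a minimal spectral filter. This sketch omits the learnable weights and the surrounding network; the function name and the per-frequency filter are assumptions for illustration.

```python
import numpy as np

def spectral_filter(image, weights):
    """Multiply the 2-D real FFT of `image` element-wise by `weights`,
    then invert. Because each frequency bin mixes all pixels, every output
    element has an image-wide receptive field, unlike a small spatial kernel.
    """
    spec = np.fft.rfft2(image)
    return np.fft.irfft2(spec * weights, s=image.shape)
```

With all-ones weights the filter is the identity, which makes the round trip easy to verify; a learned network would instead parameterize `weights` per frequency and channel.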