Open-World Weakly-Supervised Object Localization
While remarkable success has been achieved in weakly-supervised object
localization (WSOL), current frameworks are not capable of locating objects of
novel categories in open-world settings. To address this issue, we are the
first to introduce a new weakly-supervised object localization task called
OWSOL (Open-World Weakly-Supervised Object Localization). During training, all
labeled data comes from known categories, while the unlabeled data contains both
known and novel categories. To handle such data, we propose a novel paradigm
of contrastive representation co-learning using both labeled and unlabeled data
to generate a complete G-CAM (Generalized Class Activation Map) for object
localization, without requiring bounding box annotation. As no class
label is available for the unlabeled data, we conduct clustering over the full
training set and design a novel multiple semantic centroids-driven contrastive
loss for representation learning. We re-organize two widely used datasets,
i.e., ImageNet-1K and iNatLoc500, and propose OpenImages150 to serve as
evaluation benchmarks for OWSOL. Extensive experiments demonstrate that the
proposed method can surpass all baselines by a large margin. We believe that
this work can shift closed-set localization towards the open-world setting
and serve as a foundation for subsequent works. Code will be released at
https://github.com/ryylcc/OWSOL
Taming Self-Supervised Learning for Presentation Attack Detection: De-Folding and De-Mixing
Biometric systems are vulnerable to Presentation Attacks (PA) performed using
various Presentation Attack Instruments (PAIs). Even though there are numerous
Presentation Attack Detection (PAD) techniques based on both deep learning and
hand-crafted features, generalizing PAD to unknown PAIs remains a
challenging problem. In this work, we empirically show that the initialization
of the PAD model is a crucial factor for generalization, a point rarely
discussed in the community. Based on this observation, we propose a
self-supervised learning-based method, denoted as DF-DM. Specifically, DF-DM is
based on a global-local view coupled with De-Folding and De-Mixing to derive
the task-specific representation for PAD. During De-Folding, the proposed
technique learns region-specific features that represent samples as local
patterns by explicitly minimizing a generative loss. De-Mixing, in turn, drives
detectors to obtain instance-specific features with global information for a
more comprehensive representation by minimizing an interpolation-based
consistency loss. Extensive experimental results show that the proposed method can
achieve significant improvements in terms of both face and fingerprint PAD in
more complicated and hybrid datasets when compared with state-of-the-art
methods. When trained on CASIA-FASD and Idiap Replay-Attack, the proposed
method achieves an 18.60% Equal Error Rate (EER) on OULU-NPU and MSU-MFSD,
outperforming the baseline by 9.54%. The source code of the proposed
technique is available at https://github.com/kongzhecn/dfdm.
Comment: Accepted by IEEE Transactions on Neural Networks and Learning Systems (TNNLS)
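As a rough illustration of the Equal Error Rate (EER) metric quoted in the abstract above, the sketch below computes an EER from two lists of detector scores. It is not taken from the DF-DM code: the function name, the convention that a higher score means "more likely bona fide", and the threshold sweep over observed scores are all assumptions made for this example.

```python
def equal_error_rate(bona_fide_scores, attack_scores):
    """Return (eer, threshold) for a score-based detector.

    Assumption: higher score = more likely a bona fide presentation.
    The EER is approximated at the observed threshold where the false
    acceptance rate and the false rejection rate are closest.
    """
    thresholds = sorted(set(bona_fide_scores) | set(attack_scores))
    best = None  # (gap, eer, threshold)
    for thr in thresholds:
        # Attacks scored at or above the threshold are wrongly accepted.
        far = sum(s >= thr for s in attack_scores) / len(attack_scores)
        # Bona fide samples scored below the threshold are wrongly rejected.
        frr = sum(s < thr for s in bona_fide_scores) / len(bona_fide_scores)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            best = (gap, (far + frr) / 2, thr)
    return best[1], best[2]
```

For example, `equal_error_rate([0.9, 0.8, 0.7, 0.3], [0.1, 0.2, 0.4, 0.6])` gives an EER of 0.25 at threshold 0.6, where one of four attacks is accepted and one of four bona fide samples is rejected.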
Scene Consistency Representation Learning for Video Scene Segmentation
A long-term video, such as a movie or TV show, is composed of various scenes,
each of which represents a series of shots sharing the same semantic story.
Spotting the correct scene boundary from the long-term video is a challenging
task, since a model must understand the storyline of the video to figure out
where a scene starts and ends. To this end, we propose an effective
Self-Supervised Learning (SSL) framework to learn better shot representations
from unlabeled long-term videos. More specifically, we present an SSL scheme to
achieve scene consistency, while exploring considerable data augmentation and
shuffling methods to boost the model generalizability. Instead of explicitly
learning the scene boundary features as in the previous methods, we introduce a
vanilla temporal model with less inductive bias to verify the quality of the
shot features. Our method achieves state-of-the-art performance on the task
of Video Scene Segmentation. Additionally, we suggest a fairer and more
reasonable benchmark to evaluate the performance of Video Scene Segmentation
methods. The code is made available.
Comment: Accepted to CVPR 202
Analysis of index gases of coal spontaneous combustion using Fourier transform infrared spectrometer
Analysis of the index gases of coal for the prevention of spontaneous combustion is of great importance for the enhancement of coal mine safety. In this work, a Fourier Transform Infrared Spectrometer (FTIRS) is used to analyze the index gases of coal in real time to monitor spontaneous combustion conditions. Both the instrument parameters and the analysis method are first introduced by combining characteristics of the absorption spectra of the target analytes with the analysis requirements. Next, more than ten sets of gas mixtures containing ten components (CH4, C2H6, C3H8, iso-C4H10, n-C4H10, C2H4, C3H6, C2H2, CO, and CO2) are analyzed with a Spectrum Two FTIRS made by Perkin Elmer. The testing results show that the detection limit of most analytes is less than 2 × 10⁻⁶. All the detection limits meet the monitoring requirements of coal spontaneous combustion in China, which means that FTIRS may be an ideal instrument and that the analysis method used in this paper is sufficient for on-line and even in situ spontaneous combustion gas monitoring, since FTIRS has many advantages such as fast analysis, being maintenance-free, and good safety.
Chalcogenide Glass-on-Graphene Photonics
Two-dimensional (2-D) materials are of tremendous interest to integrated
photonics given their singular optical characteristics spanning light emission,
modulation, saturable absorption, and nonlinear optics. To harness their
optical properties, these atomically thin materials are usually attached onto
prefabricated devices via a transfer process. In this paper, we present a new
route for 2-D material integration with planar photonics. Central to this
approach is the use of chalcogenide glass, a multifunctional material which can
be directly deposited and patterned on a wide variety of 2-D materials and can
simultaneously function as the light guiding medium, a gate dielectric, and a
passivation layer for 2-D materials. Besides claiming improved fabrication
yield and throughput compared to the traditional transfer process, our
technique also enables unconventional multilayer device geometries optimally
designed for enhancing light-matter interactions in the 2-D layers.
Capitalizing on this facile integration method, we demonstrate a series of
high-performance glass-on-graphene devices including ultra-broadband on-chip
polarizers, energy-efficient thermo-optic switches, as well as graphene-based
mid-infrared (mid-IR) waveguide-integrated photodetectors and modulators.
Study on prediction and optimization of gas–solid erosion on S-Zorb reactor distribution plate
Adsorption desulfurization of catalytic gasoline (S Zorb) is an important desulfurization measure that is performed to meet the environmental protection requirements before the final product oil is sold in the market. The desulfurization reactor is a gas–solid two-phase flow environment composed of high-temperature and high-pressure hydrogen-oil mixed gas and sorbent particles; erosion prominently occurs on the reactor distribution plate. This study selects typical gas–solid two-phase flow conditions and defines the erosion mechanism of the gas–solid two-phase flow environment for the plastic material E347. Moreover, an S Zorb desulfurization reactor model is constructed, and the CFD-DEM model is adopted to predict the wall erosion characteristics in a gas–solid two-phase flow environment; typical erosion laws are obtained via calculations. The erosion laws under the influence of variable parameters are studied based on an orthogonal test. The orthogonal test results show the best parameter combination, which yields a maximum erosion rate and a high erosion area that are 29.9% and 17.3% lower, respectively, than the existing values. Moreover, an optimum scheme of the inner structure parameters of the reactor is determined for reducing the erosion rate and area.
A Semi-supervised Learning Application for Hand Posture Classification
The rapid growth of HCI applications results in increased data size and complexity. To handle this, advanced machine learning techniques and data analysis solutions are used to prepare and process data patterns. However, the cost of data pre-processing, labelling, and classification can increase significantly if the dataset is huge, complex, and unlabelled. This paper proposes a data pre-processing approach and a semi-supervised learning technique to prepare and classify a large Motion Capture Hand Postures dataset. It builds the solutions via Tri-training and Co-forest techniques and compares them to identify the best-fitted approach for hand posture classification. According to the results, Co-forest outperforms Tri-training in terms of accuracy, precision, recall, and F1-score.
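The four comparison metrics named in the abstract above can be sketched for a multi-class problem as follows. This is an illustrative example only: the abstract does not say how the per-class scores were averaged, so macro-averaging is an assumption here, and the function name `classification_scores` is hypothetical.

```python
from collections import Counter


def classification_scores(y_true, y_pred):
    """Accuracy plus macro-averaged precision, recall and F1.

    Assumption: macro-averaging (unweighted mean over classes).
    """
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class p, but it was wrong
            fn[t] += 1  # true class t was missed

    def safe_div(a, b):
        return a / b if b else 0.0

    precisions = [safe_div(tp[c], tp[c] + fp[c]) for c in labels]
    recalls = [safe_div(tp[c], tp[c] + fn[c]) for c in labels]
    f1s = [safe_div(2 * p * r, p + r) for p, r in zip(precisions, recalls)]
    n = len(labels)
    return {
        "accuracy": safe_div(sum(tp.values()), len(y_true)),
        "precision": safe_div(sum(precisions), n),
        "recall": safe_div(sum(recalls), n),
        "f1": safe_div(sum(f1s), n),
    }
```

Calling the function once per classifier on the same held-out labels would give directly comparable scores for the two techniques.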
Possible structural polymorphism in Al-bearing magnesium silicate post-perovskite
In the present study, we summarize indications for the existence of kinked post-perovskite structures in the MAS system. X-ray diffraction data and Raman spectra of aluminous magnesium metasilicate post-perovskite are inconsistent with the CaIrO3 structure. Instead, the observations are consistent with structures intermediate between the perovskite and the CaIrO3 structure. Ab initio calculations show that the enthalpies of the kinked structures are slightly higher than that of the CaIrO3 structure at 0 K. Finite temperature, minor element chemistry, kinetics of phase transformation, and actual stress regime are plausible reasons for the observed differences between the present and the previously reported post-perovskite phases.
Digitized Construction of Iontronic Pressure Sensor with Self-Defined Configuration and Widely Regulated Performance
Flexible pressure sensors are essential components for wearable smart devices and intelligent systems. Significant progress has been made in this area, with reports of excellent sensor performance and fascinating sensor functionalities. Nevertheless, geometrical and morphological engineering of pressure sensors is usually neglected, although it is significant for practical applications. Here, we present a digitized manufacturing methodology to construct a new class of iontronic pressure sensors with optionally defined configurations and widely modulated performance. These pressure sensors are composed of self-defined electrode patterns prepared by a screen printing method and highly tunable pressure-sensitive microstructures fabricated using 3D printed templates. Importantly, the iontronic pressure sensors employ an iontronic capacitive sensing mechanism based on mechanically regulating the electrical double layer at the electrolyte/electrode interfaces. The resultant pressure sensors exhibit high sensitivity (58 kPa⁻¹), fast response/recovery time (45 ms/75 ms), a low detection limit (6.64 Pa), and good repeatability (2000 cycles). Moreover, our pressure sensors show remarkable tunability and adaptability in device configuration and performance, which is challenging to achieve via conventional manufacturing processes. The promising applications of these iontronic pressure sensors in monitoring various human physiological activities, fabricating flexible electronic skin, and resolving the force variation during manipulation of an object with a robotic hand are successfully demonstrated.