Hierarchical Video Frame Sequence Representation with Deep Convolutional Graph Network
High-accuracy video label prediction (classification) models are attributed
to large-scale data. These data can be frame feature sequences extracted by a
pre-trained convolutional neural network, which improves the efficiency of
creating models. Unsupervised solutions such as feature average pooling, a
simple, label-independent, parameter-free method, have limited ability to
represent the video, while supervised methods such as RNNs can greatly
improve recognition accuracy. However, because videos are usually long
and there are hierarchical relationships between frames across events in the
video, the performance of RNN-based models decreases. In this paper, we
propose a novel video classification method based on a deep convolutional
graph neural network (DCGN). The proposed method exploits the
hierarchical structure of the video and performs multi-level feature
extraction on the video frame sequence through the graph network, obtaining a
video representation that reflects the event semantics hierarchically. We test our
model on the YouTube-8M Large-Scale Video Understanding dataset, and the result
outperforms RNN-based benchmarks. Comment: ECCV 201
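The feature average pooling baseline mentioned in the abstract above can be illustrated with a minimal sketch. The function name `average_pool` and the toy two-dimensional frame features are assumptions made for illustration; they do not come from the paper.

```python
from typing import List, Sequence


def average_pool(frame_features: Sequence[Sequence[float]]) -> List[float]:
    """Collapse per-frame feature vectors into one video-level vector.

    Hypothetical helper: each inner sequence is the feature vector of one frame.
    """
    if not frame_features:
        raise ValueError("expected at least one frame")
    num_frames = len(frame_features)
    dim = len(frame_features[0])
    # Average each feature dimension over all frames; no parameters are learned.
    return [sum(frame[i] for frame in frame_features) / num_frames
            for i in range(dim)]


# Toy input: three frames with two features each (made-up values).
frames = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
video_vector = average_pool(frames)
reversed_vector = average_pool(list(reversed(frames)))
```

Because the mean ignores frame order, `video_vector` and `reversed_vector` are identical here, which is one way to see why such pooling cannot capture the temporal or hierarchical structure the abstract refers to.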
9-Ethyl-9H-carbazole-3-carbaldehyde
The title molecule, C15H13NO, approximates a planar conformation except for the alkyl chain (ethyl group) bonded to the N atom with a maximum deviation from the least-squares plane through the 15 planar atoms of 0.120 (2) Å for the O atom. The distance of the formyl O atom from the plane of the carbazole ring is 0.227 (2) Å. The N—C bond lengths in the central ring are significantly different, reflecting the electron-withdrawing properties of the aldehyde group. As a consequence, charge transfer may occur from the carbazole N atom to the substituted benzene ring
Impairment of pulmonary function and changes in the right cardiac structure of pneumoconiotic coal workers in China
Introduction: Information on the changes of pulmonary function and the right cardiac structure in patients with coal worker’s pneumoconiosis in China is very scarce. This study was performed to clarify the changes of pulmonary function and right cardiac structure in patients with coal worker’s pneumoconiosis in China. Material and methods: Pulmonary function, pulmonary artery systolic pressure, and the right cardiac structure were evaluated by spirometry and color Doppler echocardiography. Results: The pulmonary artery systolic pressure of patients with coal worker’s pneumoconiosis was increased with disease severity. Patients with coal worker’s pneumoconiosis also exhibited an impaired pulmonary function and altered right cardiac structure compared with control subjects. A significant linear correlation of the variables of pulmonary ventilation and diffusion function with the indicators of the right cardiac structure was found in patients with coal worker’s pneumoconiosis in China. Conclusions: This study elucidated a deterioration of pulmonary function and right cardiac structure in patients with coal worker’s pneumoconiosis in China.
Thermal conductivity of deformed carbon nanotubes
We investigate the thermal conductivity of four types of deformed carbon
nanotubes using the nonequilibrium molecular dynamics method. We report
that various deformations have different influences on the thermal properties of
carbon nanotubes. For bent carbon nanotubes, the thermal conductivity is
independent of the bending angle. However, the thermal conductivity increases
slightly with XY-distortion and decreases rapidly with Z-distortion. The thermal
conductivity does not change with the screw ratio before the carbon
nanotubes break but decreases sharply beyond the critical screw ratio. Comment: 6figure
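For context, nonequilibrium molecular dynamics studies commonly estimate thermal conductivity from Fourier's law. The abstract does not state the expression the authors used, so the form below is an assumption, with J the steady-state heat flux along the tube axis x and \partial T / \partial x the resulting temperature gradient.

```latex
\kappa = -\frac{J}{\partial T / \partial x}
```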
Model Inversion Attack via Dynamic Memory Learning
Model Inversion (MI) attacks aim to recover the private training data from
the target model, which has raised security concerns about the deployment of
DNNs in practice. Recent advances in generative adversarial models have
rendered them particularly effective in MI attacks, primarily due to their
ability to generate high-fidelity and perceptually realistic images that
closely resemble the target data. In this work, we propose a novel Dynamic
Memory Model Inversion Attack (DMMIA) to leverage historically learned
knowledge, which interacts with samples (during the training) to induce diverse
generations. DMMIA constructs two types of prototypes to inject the information
about historically learned knowledge: Intra-class Multicentric Representation
(IMR) representing target-related concepts by multiple learnable prototypes,
and Inter-class Discriminative Representation (IDR) characterizing the
memorized samples as learned prototypes to capture more privacy-related
information. As a result, our DMMIA has a more informative representation,
which brings more diverse and discriminative generated results. Experiments on
multiple benchmarks show that DMMIA performs better than state-of-the-art MI
attack methods
Robust Automatic Speech Recognition via WavAugment Guided Phoneme Adversarial Training
Developing a practically robust automatic speech recognition (ASR) system is
challenging since the model should not only maintain the original performance
on clean samples, but also achieve consistent efficacy under small volume
perturbations and large domain shifts. To address this problem, we propose a
novel WavAugment Guided Phoneme Adversarial Training (wapat). wapat uses
adversarial examples in phoneme space as augmentation to make the model
invariant to minor fluctuations in phoneme representation and preserve the
performance on clean samples. In addition, wapat utilizes the phoneme
representation of augmented samples to guide the generation of adversaries,
which helps to find more stable and diverse gradient-directions, resulting in
improved generalization. Extensive experiments demonstrate the effectiveness of
wapat on the End-to-end Speech Challenge Benchmark (ESB). Notably, SpeechLM-wapat
outperforms the original model by 6.28% WER reduction on ESB, achieving the new
state-of-the-art
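The word error rate (WER) quoted above is conventionally the word-level edit distance between a reference transcript and a hypothesis, divided by the number of reference words. The sketch below shows that standard definition; the function name `word_error_rate` and the example sentences are illustrative assumptions, and the abstract does not say whether its 6.28% figure is an absolute or a relative reduction.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + substitution_cost))
        prev = curr
    return prev[-1] / len(ref)


# Made-up example: one substitution ("sat" -> "sit") and one deleted "the".
wer = word_error_rate("the cat sat on the mat", "the cat sit on mat")
```

In this toy case two edits against six reference words give a WER of one third.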
COCO-O: A Benchmark for Object Detectors under Natural Distribution Shifts
Practical object detection applications can lose their effectiveness on image
inputs with natural distribution shifts. This problem has led the research
community to pay more attention to the robustness of detectors under
Out-Of-Distribution (OOD) inputs. Existing works construct datasets to
benchmark the detector's OOD robustness for a specific application scenario,
e.g., Autonomous Driving. However, these datasets lack universality and are
hard to benchmark general detectors built on common tasks such as COCO. To give
a more comprehensive robustness assessment, we introduce
COCO-O(ut-of-distribution), a test dataset based on COCO with 6 types of
natural distribution shifts. COCO-O has a large distribution gap with training
data and results in a significant 55.7% relative performance drop on a Faster
R-CNN detector. We leverage COCO-O to conduct experiments on more than 100
modern object detectors to investigate if their improvements are credible or
just over-fitting to the COCO test set. Unfortunately, most classic detectors
from early years do not exhibit strong OOD generalization. We further study the
effect on robustness of recent breakthroughs in detector architecture design,
augmentation, and pre-training techniques. Some empirical findings are revealed:
1) Compared with the detection head or neck, the backbone is the most important part
for robustness; 2) An end-to-end detection transformer design brings no
enhancement, and may even reduce robustness; 3) Large-scale foundation models
have made a great leap in robust object detection. We hope COCO-O can
provide a rich testbed for robustness studies of object detection. The dataset
will be available at
https://github.com/alibaba/easyrobust/tree/main/benchmarks/coco_o. Comment: To appear in ICCV2023,
https://github.com/alibaba/easyrobust/tree/main/benchmarks/coco_
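A relative performance drop such as the 55.7% figure in the COCO-O abstract is usually computed as the loss in score under shift divided by the in-distribution score. The sketch below assumes that reading; the function name `relative_drop` and the scores passed to it are hypothetical and are not results from the paper.

```python
def relative_drop(in_distribution_score: float, shifted_score: float) -> float:
    """Percentage of the in-distribution score that is lost under shift."""
    if in_distribution_score <= 0:
        raise ValueError("in-distribution score must be positive")
    return (in_distribution_score - shifted_score) / in_distribution_score * 100.0


# Hypothetical scores: 40.0 on the original test set, 20.0 on the shifted set.
drop_percent = relative_drop(40.0, 20.0)
```

With these made-up numbers the detector loses half of its original score, so `drop_percent` is 50.0.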