Supervised Contrastive Learning with Nearest Neighbor Search for Speech Emotion Recognition
Speech Emotion Recognition (SER) is a challenging task due to limited data
and blurred boundaries of certain emotions. In this paper, we present a
comprehensive approach to improve the SER performance throughout the model
lifecycle, including pre-training, fine-tuning, and inference stages. To
address the data scarcity issue, we utilize a pre-trained model, wav2vec2.0.
During fine-tuning, we propose a novel loss function that combines
cross-entropy loss with supervised contrastive learning loss to improve the
model's discriminative ability. This approach increases the inter-class
distances and decreases the intra-class distances, mitigating the issue of
blurred boundaries. Finally, to leverage the improved distances, we propose an
interpolation method at the inference stage that combines the model prediction
with the output from a k-nearest neighbors model. Our experiments on IEMOCAP
demonstrate that our proposed methods outperform current state-of-the-art
results. Comment: Accepted by Interspeech 2023, poste
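The fine-tuning loss and the inference-time interpolation described in the abstract above can be sketched roughly as follows. This is a minimal pure-Python illustration under stated assumptions, not the authors' implementation: the weights `alpha` and `lam`, the temperature `tau`, the cosine-similarity neighbor search, the majority-vote k-NN distribution, and all function names are hypothetical choices made for the example.

```python
import math
from collections import Counter

def _normalize(v):
    n = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / n for x in v]

def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def softmax(logits):
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    s = sum(exps)
    return [e / s for e in exps]

def cross_entropy(logits_batch, labels):
    # Mean negative log-likelihood of the true class.
    return -sum(math.log(softmax(l)[y])
                for l, y in zip(logits_batch, labels)) / len(labels)

def supcon_loss(embeddings, labels, tau=0.1):
    # Supervised contrastive loss: same-label samples are positives,
    # every other sample in the batch appears in the denominator.
    z = [_normalize(e) for e in embeddings]
    total, anchors = 0.0, 0
    for i in range(len(z)):
        others = [j for j in range(len(z)) if j != i]
        positives = [j for j in others if labels[j] == labels[i]]
        if not positives:
            continue  # an anchor without positives contributes nothing
        sims = {j: _dot(z[i], z[j]) / tau for j in others}
        m = max(sims.values())
        log_denom = m + math.log(sum(math.exp(s - m) for s in sims.values()))
        total += -sum(sims[p] - log_denom for p in positives) / len(positives)
        anchors += 1
    return total / anchors if anchors else 0.0

def combined_loss(logits_batch, embeddings, labels, alpha=0.5, tau=0.1):
    # Assumed form: a convex combination of the two losses.
    return ((1 - alpha) * cross_entropy(logits_batch, labels)
            + alpha * supcon_loss(embeddings, labels, tau))

def knn_probs(query, bank, bank_labels, num_classes, k=3):
    # Class distribution from the k most cosine-similar stored embeddings.
    q = _normalize(query)
    ranked = sorted(range(len(bank)),
                    key=lambda j: -_dot(q, _normalize(bank[j])))
    top = ranked[:k]
    votes = Counter(bank_labels[j] for j in top)
    return [votes.get(c, 0) / len(top) for c in range(num_classes)]

def interpolate(model_probs, neighbor_probs, lam=0.5):
    # Inference-time mix of classifier output and k-NN output.
    return [lam * n + (1 - lam) * m
            for m, n in zip(model_probs, neighbor_probs)]
```

In this sketch, embeddings whose same-class members sit close together yield a lower `supcon_loss` than embeddings whose classes are interleaved, which is the property the k-NN interpolation then relies on at inference.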
A Novel Method for the Absolute Pose Problem with Pairwise Constraints
Absolute pose estimation is a fundamental problem in computer vision, and it
is a typical parameter estimation problem, meaning that efforts to solve it
will always suffer from outlier-contaminated data. Conventionally, for a fixed
dimensionality d and the number of measurements N, a robust estimation problem
cannot be solved faster than O(N^d). Furthermore, it is almost impossible to
remove d from the exponent of the runtime of a globally optimal algorithm.
However, absolute pose estimation is a geometric parameter estimation problem,
and thus has special constraints. In this paper, we consider pairwise
constraints and propose a globally optimal algorithm for solving the absolute
pose estimation problem. The proposed algorithm has a linear complexity in the
number of correspondences at a given outlier ratio. Concretely, we first
decouple the rotation and the translation subproblems by utilizing the pairwise
constraints, and then we solve the rotation subproblem using the
branch-and-bound algorithm. Lastly, we estimate the translation based on the
known rotation by using another branch-and-bound algorithm. The advantages of
our method are demonstrated via thorough testing on both synthetic and
real-world data. Comment: 10 pages, 7 figure
THE IMPACT OF TEAM RANKING ON TEAM LENDING PERFORMANCE: AN EMPIRICAL STUDY ON KIVA
Prosocial crowdfunding platforms such as Kiva puzzle researchers regarding what motivates online peers to lend for free, and how voluntary online participation could be organized to create great social goods. A common practice of prosocial lending websites is to enable self-organizing teams. In this paper, we are interested in the impact of team ranking, and thus team reputation, on team lending performance. Contradictory predictions can be derived depending on the theoretical lens. Social identity theory suggests that a better ranking strengthens individual identification and promotes lending participation, whereas the economic theory of public goods indicates that a good ranking may trigger a crowd-out effect. To empirically explore the relationship between team ranking and team performance, we collected data from Kiva, the largest prosocial crowdfunding platform. Kiva enables lenders to form teams, and teams are ranked monthly on both lending performance and member recruitment. Our data analysis suggests that appearing on the top ranking list leads to a reduction in future team lending, indicating that a good team rank triggers the crowd-out effect. Meanwhile, salience on the member recruitment list does not show any significant impact on lending performance. Our finding suggests that team reputation may not promote identification in this context.
Blind Inpainting with Object-aware Discrimination for Artificial Marker Removal
Medical images often contain artificial markers added by doctors, which can
negatively affect the accuracy of AI-based diagnosis. To address this issue and
recover the missing visual contents, inpainting techniques are highly needed.
However, existing inpainting methods require manual mask input, limiting their
application scenarios. In this paper, we introduce a novel blind inpainting
method that automatically completes visual contents without specifying masks
for target areas in an image. Our proposed model includes a mask-free
reconstruction network and an object-aware discriminator. The reconstruction
network consists of two branches that predict the corrupted regions with
artificial markers and simultaneously recover the missing visual contents. The
object-aware discriminator relies on the powerful recognition capabilities of
the dense object detector to ensure that the markers of reconstructed images
cannot be detected in any local regions. As a result, the reconstructed image
is as close as possible to the clean one. Our proposed method is
evaluated on different medical image datasets, covering multiple imaging
modalities such as ultrasound (US), magnetic resonance imaging (MRI), and
electron microscopy (EM), demonstrating that our method is effective and robust
against various unknown missing region patterns.
LCSCNet: Linear Compressing Based Skip-Connecting Network for Image Super-Resolution
In this paper, we develop a concise but efficient network architecture called
linear compressing based skip-connecting network (LCSCNet) for image
super-resolution. Compared with two representative network architectures with
skip connections, ResNet and DenseNet, a linear compressing layer is designed
in LCSCNet for skip connection, which connects former feature maps and
distinguishes them from newly-explored feature maps. In this way, the proposed
LCSCNet enjoys the merits of the distinguishing feature treatment of DenseNet and
the parameter-economic form of ResNet. Moreover, to better exploit hierarchical
information from both low and high levels of various receptive fields in deep
models, inspired by gate units in LSTM, we also propose an adaptive
element-wise fusion strategy with multi-supervised training. Experimental
results in comparison with state-of-the-art algorithms validate the
effectiveness of LCSCNet. Comment: Accepted by IEEE Transactions on Image Processing (IEEE-TIP
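The linear compressing skip connection described in the abstract above can be illustrated with a toy sketch. It works on the channel vector of a single pixel, so a 1x1 convolution reduces to a plain matrix-vector product. The function names, weight shapes, and the DenseNet-style contrast function are assumptions made for illustration and are not taken from the paper; the real network operates on full spatial feature maps with learned convolutions.

```python
def linear_compress(features, weights):
    # Linear map across channels with no activation:
    # a per-pixel stand-in for a 1x1 convolution.
    return [sum(w * f for w, f in zip(row, features)) for row in weights]

def explore(features, weights):
    # Stand-in for a nonlinear layer producing newly explored features.
    return [max(0.0, v) for v in linear_compress(features, weights)]

def lcsc_block(features, compress_w, explore_w):
    # Former features are linearly compressed to a fixed width, then
    # concatenated with new features so the two groups stay separate.
    former = linear_compress(features, compress_w)
    new = explore(features, explore_w)
    return former + new

def dense_block(features, explore_w):
    # DenseNet-style contrast: former features are kept uncompressed,
    # so the channel count grows with every block.
    return features + explore(features, explore_w)

# Hypothetical weights: 4 input channels, 2 compressed + 2 new channels.
compress_w = [[0.25, 0.25, 0.25, 0.25], [0.5, 0.0, 0.0, 0.0]]
explore_w = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]]

x = [1.0, -2.0, 0.5, 3.0]
lcsc_out, dense_out = x, x
for _ in range(3):
    lcsc_out = lcsc_block(lcsc_out, compress_w, explore_w)
    # Slice to 4 inputs only so the same toy weights can be reused.
    dense_out = dense_out + explore(dense_out[:4], explore_w)
```

After three stacked blocks the compressed variant still carries 4 channels while the dense variant has grown to 10, which is one way to read the abstract's claim of combining DenseNet-like feature separation with ResNet-like parameter economy.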