Masked Supervised Learning for Semantic Segmentation
Self-attention is of vital importance in semantic segmentation as it enables
modeling of long-range context, which translates into improved performance. We
argue that it is equally important to model short-range context, especially in
cases where the regions of interest are small and ambiguous and there is an
imbalance between the semantic classes. To this
end, we propose Masked Supervised Learning (MaskSup), an effective single-stage
learning paradigm that models both short- and long-range context, capturing the
contextual relationships between pixels via random masking. Experimental
results demonstrate the competitive performance of MaskSup against strong
baselines in both binary and multi-class segmentation tasks on three standard
benchmark datasets, particularly at handling ambiguous regions and retaining
better segmentation of minority classes with no added inference cost. In
addition to segmenting target regions even when large portions of the input are
masked, MaskSup is also generic and can be easily integrated into a variety of
semantic segmentation methods. We also show that the proposed method is
computationally efficient, improving the mean intersection-over-union (mIoU)
by 10% while requiring fewer learnable
parameters.
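For readers unfamiliar with the metric cited above, the following is a minimal sketch of how mIoU is commonly computed from flattened label maps. The function name `mean_iou` and the choice to skip classes absent from both prediction and ground truth are assumptions made for illustration; the abstract does not specify the exact evaluation protocol.

```python
def mean_iou(pred, target, num_classes):
    """Mean intersection-over-union over classes that occur in pred or target.

    pred and target are flat sequences of integer class labels, one per pixel.
    Skipping classes absent from both is an assumed convention.
    """
    ious = []
    for c in range(num_classes):
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union > 0:
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else 0.0


# Toy example: six pixels, three classes.
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 0]
score = mean_iou(pred, target, num_classes=3)  # per-class IoU: 1/3, 2/3, 1/2
```

Because every class contributes equally to the average regardless of pixel count, the metric is sensitive to how well minority classes are segmented.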
Iterative Graph Filtering Network for 3D Human Pose Estimation
Graph convolutional networks (GCNs) have proven to be an effective approach
for 3D human pose estimation. By naturally modeling the skeleton structure of
the human body as a graph, GCNs are able to capture the spatial relationships
between joints and learn an efficient representation of the underlying pose.
However, most GCN-based methods use a shared weight matrix, making it
challenging to accurately capture the different and complex relationships
between joints. In this paper, we introduce an iterative graph filtering
framework for 3D human pose estimation, which aims to predict the 3D joint
positions given a set of 2D joint locations in images. Our approach builds upon
the idea of iteratively solving graph filtering with Laplacian regularization
via the Gauss-Seidel iterative method. Motivated by this iterative solution, we
design a Gauss-Seidel network (GS-Net) architecture, which makes use of weight
and adjacency modulation, skip connection, and a pure convolutional block with
layer normalization. Adjacency modulation facilitates the learning of edges
that go beyond the inherent connections of body joints, resulting in an
adjusted graph structure that reflects the human skeleton, while skip
connections help maintain crucial information from the input layer's initial
features as the network depth increases. We evaluate our proposed model on two
standard benchmark datasets, and compare it with a comprehensive set of strong
baseline methods for 3D human pose estimation. Our experimental results
demonstrate that our approach outperforms the baseline methods on both
datasets, achieving state-of-the-art performance. Furthermore, we conduct
ablation studies to analyze the contributions of different components of our
model architecture and show that the skip connection and adjacency modulation
help improve the model performance.
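The iterative solution mentioned in the abstract can be illustrated with a minimal sketch. It assumes the graph filtering problem takes the common form (I + alpha * L) x = y, where L = D - A is the unnormalized graph Laplacian of a graph without self-loops; the paper's exact Laplacian, its learned weights, and the GS-Net layers are not reproduced here. The names `gauss_seidel_filter` and `alpha` are hypothetical.

```python
def gauss_seidel_filter(adj, y, alpha=1.0, iters=100):
    """Approximately solve (I + alpha * L) x = y with Gauss-Seidel sweeps.

    adj is a symmetric adjacency matrix (list of lists) with a zero diagonal,
    and L = D - A is the assumed (unnormalized) Laplacian.
    """
    n = len(y)
    deg = [sum(row) for row in adj]
    x = list(y)  # start from the input signal
    for _ in range(iters):
        for i in range(n):
            # Gauss-Seidel: x[j] already holds the newest value for j < i.
            neighbor_sum = sum(adj[i][j] * x[j] for j in range(n) if j != i)
            x[i] = (y[i] + alpha * neighbor_sum) / (1.0 + alpha * deg[i])
    return x


# Path graph 0 - 1 - 2 with an impulse on node 0.
adj = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
y = [1.0, 0.0, 0.0]
x = gauss_seidel_filter(adj, y, alpha=1.0)  # converges to [0.625, 0.25, 0.125]
```

Each update mixes the node's own input `y[i]` with its neighbors' current estimates, which loosely mirrors the roles that the skip connection and the adjacency term play in the architecture described above.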
Graph Fairing Convolutional Networks for Anomaly Detection
Graph convolution is a fundamental building block for many deep neural
networks on graph-structured data. In this paper, we introduce a simple, yet
very effective graph convolutional network with skip connections for
semi-supervised anomaly detection. The proposed layerwise propagation rule of
our model is theoretically motivated by the concept of implicit fairing in
geometry processing, and comprises a graph convolution module for aggregating
information from immediate node neighbors and a skip connection module for
combining layer-wise neighborhood representations. This propagation rule is
derived from the iterative solution of the implicit fairing equation via the
Jacobi method. In addition to capturing information from distant graph nodes
through skip connections between the network's layers, our approach exploits
both the graph structure and node features for learning discriminative node
representations. These skip connections are integrated by design in our
proposed network architecture. The effectiveness of our model is demonstrated
through extensive experiments on five benchmark datasets, achieving better or
comparable anomaly detection results against strong baseline methods. We also
demonstrate through an ablation study that the skip connection helps improve
model performance.
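To make the connection between the Jacobi method and a propagation rule with skip connections concrete, here is a minimal sketch. It assumes the implicit fairing equation has the form (I + lam * L) x = x0 with L = D - A the unnormalized Laplacian of a graph without self-loops; the paper's actual Laplacian normalization and learnable weight matrices are omitted. The names `jacobi_fairing` and `lam` are hypothetical.

```python
def jacobi_fairing(adj, x0, lam=0.5, iters=200):
    """Approximately solve (I + lam * L) x = x0 with Jacobi iterations.

    adj is a symmetric adjacency matrix (list of lists) with a zero diagonal.
    """
    n = len(x0)
    deg = [sum(row) for row in adj]
    x = list(x0)
    for _ in range(iters):
        # Jacobi: the whole new iterate is built from the previous one.
        x = [
            (
                lam * sum(adj[i][j] * x[j] for j in range(n))  # neighbor aggregation
                + x0[i]                                        # skip term from the input
            )
            / (1.0 + lam * deg[i])
            for i in range(n)
        ]
    return x


# Two connected nodes with an impulse on node 0.
adj = [[0, 1], [1, 0]]
x0 = [1.0, 0.0]
x = jacobi_fairing(adj, x0, lam=0.5)  # converges to [0.75, 0.25]
```

The update has two parts: an aggregation over immediate neighbors and a term that reinjects the original input at every step. This is the structure that motivates pairing a graph convolution module with a skip connection module.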
Determinants of Liquidity Risk in Islamic Banks: A Panel Study
This paper investigates the determinants of Islamic bank liquidity using a panel of 60 Islamic banks in MENA and Southeast Asian countries. The study period, 2004 to 2012, covers the subprime crisis. The analysis shows that liquidity risk depends on idiosyncratic factors such as bank profitability, the capital adequacy ratio and the investment ratio. While the bank profitability indicator (return on assets, ROA) positively affects exposure to liquidity shortage, the capital adequacy ratio (CAR) and the bank's investment ratio have statistically significant negative relationships with the liquidity risk measure. Bank size, however, does not matter, probably because both small and large Islamic banks have difficulty managing their liquidity risk. The real growth rate of gross domestic product has a negative but insignificant association with liquidity risk. Islamic banks should improve their profit-and-loss-sharing investment in order to reduce their liquidity risk. Moreover, it is critical to reinforce the instruments of liquidity risk management. Keywords: Islamic bank, Investment, Liquidity risk, Management, Return on Assets, Capital rati
Learning to recognize occluded and small objects with partial inputs
Recognizing multiple objects in an image is challenging due to occlusions,
and becomes even more so when the objects are small. While promising, existing
multi-label image recognition models do not explicitly learn context-based
representations, and hence struggle to correctly recognize small and occluded
objects. Intuitively, recognizing occluded objects requires knowledge of
partial input, and hence context. Motivated by this intuition, we propose
Masked Supervised Learning (MSL), a single-stage, model-agnostic learning
paradigm for multi-label image recognition. The key idea is to learn
context-based representations using a masked branch and to model label
co-occurrence using label consistency. Experimental results demonstrate the
simplicity, applicability and more importantly the competitive performance of
MSL against previous state-of-the-art methods on standard multi-label image
recognition benchmarks. In addition, we show that MSL is robust to random
masking and demonstrate its effectiveness in recognizing non-masked objects.
Code and pretrained models are available on GitHub.
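The random masking that feeds a masked branch can be sketched as follows. This is only an illustration: the patch size, masking ratio, and zero fill value are assumptions, since the abstract does not state how masks are generated, and the function name `random_patch_mask` is hypothetical.

```python
import random


def random_patch_mask(image, patch=2, ratio=0.5, seed=None):
    """Return a copy of a 2D image with a random subset of patches zeroed.

    image is a list of rows; patch, ratio, and the zero fill are assumed
    settings chosen for illustration.
    """
    rng = random.Random(seed)
    h, w = len(image), len(image[0])
    # Top-left corners of a non-overlapping patch grid.
    corners = [(r, c) for r in range(0, h, patch) for c in range(0, w, patch)]
    k = int(round(ratio * len(corners)))
    masked = [row[:] for row in image]  # leave the input untouched
    for r, c in rng.sample(corners, k):
        for i in range(r, min(r + patch, h)):
            for j in range(c, min(c + patch, w)):
                masked[i][j] = 0
    return masked


image = [[1] * 4 for _ in range(4)]
masked = random_patch_mask(image, patch=2, ratio=0.5, seed=0)
```

In a two-branch setup, the original image and its masked copy would both be passed through the model, so that predictions on the partial input must rely on the surrounding context.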