Alexis de Tocqueville; chronicler of the American democratic experiment
[Abstract]: The purpose of this work is to develop an interactive tool that helps botanists extract the vein system, together with its hierarchical properties, with as little user interaction as possible. In this paper, we present a new venation extraction method using independent component analysis (ICA). The popular and efficient FastICA algorithm is applied to patches of leaf images to learn a set of linear basis functions, or features, for the images, and the basis functions are then used as the pattern map for vein extraction. In our experiments, the training sets are randomly generated from different leaf images. Experimental results demonstrate that ICA is a promising technique for extracting leaf veins and the edges of objects. ICA can therefore play an important role in automatically identifying living plants.
An Algorithm Combining Statistics-based and Rules-based for Chunk Identification of Chinese Sentences
PACLIC 20 / Wuhan, China / 1-3 November, 200
Progression-Guided Temporal Action Detection in Videos
We present a novel framework, Action Progression Network (APN), for temporal
action detection (TAD) in videos. The framework locates actions in videos by
detecting the action evolution process. To encode the action evolution, we
quantify a complete action process into 101 ordered stages (0%, 1%, ...,
100%), referred to as action progressions. We then train a neural network to
recognize the action progressions. The framework detects action boundaries by
detecting complete action processes in the videos, e.g., a video segment whose
detected action progressions closely follow the sequence 0%, 1%, ..., 100%.
The framework offers three major advantages: (1) Our neural networks are
trained end-to-end, in contrast to conventional methods that optimize modules
separately; (2) The APN is trained using action frames exclusively, enabling
models to be trained on action classification datasets and robust to videos
with temporal background styles differing from those in training; (3) Our
framework effectively avoids detecting incomplete actions and excels in
detecting long-lasting actions due to the fine-grained and explicit encoding of
the temporal structure of actions. Leveraging these advantages, the APN
achieves competitive performance and significantly surpasses its counterparts
in detecting long-lasting actions. With an IoU threshold of 0.5, the APN
achieves a mean Average Precision (mAP) of 58.3% on the THUMOS14 dataset and
98.9% mAP on the DFMAD70 dataset.
Comment: Under Review. Code available at https://github.com/makecent/AP
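The abstract's two measurable ideas, a segment whose progressions follow the sequence 0%, ..., 100% and matching at a temporal IoU threshold, can be illustrated with a minimal sketch. This is not the APN implementation: the function names `ramp_error` and `temporal_iou`, the linear-ramp assumption, and the sample values are all hypothetical and chosen only for illustration.

```python
def temporal_iou(seg_a, seg_b):
    """Intersection over union of two (start, end) time intervals."""
    (a0, a1), (b0, b1) = seg_a, seg_b
    inter = max(0.0, min(a1, b1) - max(a0, b0))
    union = (a1 - a0) + (b1 - b0) - inter
    return inter / union if union > 0 else 0.0


def ramp_error(progressions):
    """Mean absolute gap between per-frame progressions (in percent)
    and an ideal linear ramp from 0 to 100 over the same frames.

    Assumption: a complete action advances linearly in time; the
    abstract does not state how closeness to the sequence is scored.
    """
    n = len(progressions)
    if n < 2:
        raise ValueError("need at least two frames")
    ideal = [100.0 * i / (n - 1) for i in range(n)]
    return sum(abs(p - q) for p, q in zip(progressions, ideal)) / n


# Hypothetical per-frame predictions for two candidate segments.
complete = [0, 25, 50, 75, 100]   # runs through the whole action
truncated = [0, 10, 20, 30, 40]   # stops well before the action ends

print(ramp_error(complete))
print(ramp_error(truncated))
print(temporal_iou((2.0, 8.0), (4.0, 10.0)))
```

Under this scoring, a segment that covers only part of an action yields a larger error than a complete one, which is one simple way to picture why incomplete actions can be rejected.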
MR brain image segmentation based on self-organizing map network
Magnetic resonance imaging (MRI) is an advanced medical imaging technique that provides rich information about human soft tissue anatomy. The goal of magnetic resonance (MR) image segmentation is to accurately identify the principal tissue structures in these image volumes. A new unsupervised MR image segmentation method based on a self-organizing feature map (SOFM) network is presented. The algorithm incorporates spatial constraints by using a Markov Random Field (MRF) model. The MRF term introduces a prior distribution with clique potentials and thus improves the segmentation results without requiring extra data samples in the training set or a complicated network structure. The simulation results demonstrate that the proposed algorithm is promising.
A CNN Model for Human Parsing Based on Capacity Optimization
Although state-of-the-art performance has been achieved in pixel-specific tasks such as saliency prediction and depth estimation, convolutional neural networks (CNNs) still perform unsatisfactorily in human parsing, where semantic information of detailed regions needs to be perceived under variations in viewpoint, pose, and occlusion. In this paper, we propose to improve the robustness of human parsing modules by introducing a depth-estimation module. A novel scheme is proposed for the integration of a depth-estimation module and a human-parsing module. The robustness of the overall model is improved with the automatically obtained depth labels. As another major concern, computational efficiency is also discussed. Our proposed human parsing module with 24 layers can achieve performance similar to that of the baseline CNN model with over 100 layers. The number of parameters in the overall model is smaller than that in the baseline model. Furthermore, we propose to reduce the computational burden by replacing a conventional CNN layer with a stack of simplified sub-layers, which further reduces the overall number of trainable parameters. Experimental results show that the integration of the two modules improves human parsing without additional human labeling. The proposed model outperforms the benchmark solutions, and the capacity of our model is better matched to the complexity of the task.
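The abstract does not say what the simplified sub-layers are. As a rough illustration of how replacing one convolutional layer with a stack of simpler ones can cut trainable parameters, the sketch below compares a standard convolution with a depthwise-separable factorization, a well-known example of such a replacement. The choice of factorization, the function names, and the layer sizes are assumptions, and bias terms are ignored.

```python
def conv_params(k, c_in, c_out):
    """Weights in a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out


def separable_params(k, c_in, c_out):
    """Weights in a depthwise k x k convolution followed by a
    1 x 1 pointwise convolution (bias ignored)."""
    return k * k * c_in + c_in * c_out


# Hypothetical layer: 3 x 3 kernel, 64 input channels, 128 output channels.
standard = conv_params(3, 64, 128)
stacked = separable_params(3, 64, 128)
print(standard, stacked, round(standard / stacked, 2))
```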
Document Image Recognition Based on Template Matching of Component Block Projections
Document Image Recognition (DIR), a very useful technique in office automation and digital library applications, aims to find the most similar template for an input document image in a prestored data set of template document images.
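The matching task described above can be pictured with a minimal sketch based on projection profiles. It is an illustration only: it projects the whole image rather than individual component blocks, it assumes binary images of identical size, and the names `projections`, `profile_distance`, `best_template`, and the sample templates are hypothetical.

```python
def projections(image):
    """Row sums and column sums of a binary image (list of equal-length rows)."""
    rows = [sum(r) for r in image]
    cols = [sum(c) for c in zip(*image)]
    return rows, cols


def profile_distance(a, b):
    """L1 distance between the projection profiles of two same-size images."""
    rows_a, cols_a = projections(a)
    rows_b, cols_b = projections(b)
    return (sum(abs(x - y) for x, y in zip(rows_a, rows_b))
            + sum(abs(x - y) for x, y in zip(cols_a, cols_b)))


def best_template(image, templates):
    """Name of the stored template whose profiles are closest to the input."""
    return min(templates, key=lambda name: profile_distance(image, templates[name]))


# Hypothetical 4 x 4 templates: ink along the top row vs. down the left column.
templates = {
    "header_form": [[1, 1, 1, 1], [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]],
    "sidebar_form": [[1, 0, 0, 0], [1, 0, 0, 0], [1, 0, 0, 0], [1, 0, 0, 0]],
}
query = [[1, 1, 1, 0], [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]

print(best_template(query, templates))
```

Projection profiles reduce each image to two short vectors, so comparing an input against many stored templates stays cheap. Scale, skew, and block segmentation are not handled here.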