A decision forest based feature selection framework for action recognition from RGB-Depth cameras
In this paper, we present an action recognition framework
leveraging data mining capabilities of random decision forests trained on
kinematic features. We describe human motion via a rich collection of
kinematic feature time-series computed from the skeletal representation
of the body in motion. We discriminatively optimize a random decision
forest model over this collection to identify the most effective subset
of features, localized both in time and space. Later, we train a support
vector machine classifier on the selected features. This approach improves
upon the baseline performance obtained using the whole feature set with
significantly fewer features (one tenth of the original). On the
MSRC-12 dataset (12 classes), our method achieves 94% accuracy. On
the WorkoutSU-10 dataset, collected by our group (10 physical exercise
classes), the accuracy is 98%. The approach can also be used to provide
insights into the spatiotemporal dynamics of human actions.
The Role of the Superior Order GLCM in the Characterization and Recognition of the Liver Tumors from Ultrasound Images
Hepatocellular carcinoma (HCC) is the most frequent malignant liver tumor. It often looks similar to the cirrhotic parenchyma on which it evolves and to benign liver tumors. The gold standard for HCC diagnosis is the needle biopsy, but this is an invasive, dangerous method. We aim to develop computerized, noninvasive techniques for the automatic diagnosis of HCC, based on information obtained from ultrasound images. Texture is an important property of internal organ tissue, able to provide subtle information about the pathology. We previously defined the textural model of HCC, consisting of the exhaustive set of relevant textural features appropriate for HCC characterization and of the specific values of these features. In this work, we analyze the role that the superior order Grey Level Cooccurrence Matrices (GLCM) and their associated parameters play in improving HCC characterization and automatic diagnosis. We also determine the spatial relations between pixels that lead to the highest performance for the third, fifth and seventh order GLCM. The following classes are considered: HCC, the cirrhotic liver parenchyma on which it evolves, and benign liver tumors.
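As a rough illustration of the idea behind a higher-order co-occurrence matrix, the sketch below counts tuples of grey levels found at a fixed set of pixel offsets. This is one common generalization of the second-order GLCM and is only an assumption about what "superior order" means here; the abstract does not give the authors' exact definition. The names `cooccurrence_counts`, `normalise` and `energy`, and the toy image, are hypothetical.

```python
from collections import Counter
from typing import Dict, List, Sequence, Tuple

Offset = Tuple[int, int]  # (row shift, column shift) relative to the anchor pixel


def cooccurrence_counts(image: List[List[int]],
                        offsets: Sequence[Offset]) -> Counter:
    """Count grey-level tuples observed at `offsets` from every anchor pixel.

    With two offsets this is an ordinary (non-symmetric) GLCM; with n offsets
    it is an n-th order co-occurrence structure, stored sparsely as a Counter.
    Anchors whose offsets fall outside the image are skipped.
    """
    rows = len(image)
    cols = len(image[0]) if rows else 0
    counts: Counter = Counter()
    for r in range(rows):
        for c in range(cols):
            levels = []
            for dr, dc in offsets:
                rr, cc = r + dr, c + dc
                if not (0 <= rr < rows and 0 <= cc < cols):
                    break  # this anchor does not fit inside the image
                levels.append(image[rr][cc])
            else:
                counts[tuple(levels)] += 1
    return counts


def normalise(counts: Counter) -> Dict[tuple, float]:
    """Turn raw counts into joint probabilities."""
    total = sum(counts.values())
    return {k: v / total for k, v in counts.items()} if total else {}


def energy(probs: Dict[tuple, float]) -> float:
    """Sum of squared probabilities (angular second moment), valid for any order."""
    return sum(p * p for p in probs.values())


# Toy 2x3 image with two grey levels (illustrative data only).
image = [[0, 0, 1],
         [1, 1, 0]]
second_order = cooccurrence_counts(image, [(0, 0), (0, 1)])
third_order = cooccurrence_counts(image, [(0, 0), (0, 1), (0, 2)])
```

The choice of offsets plays the role of the "spatial relations between the pixels" mentioned in the abstract: changing them changes which neighbourhoods are compared, and texture descriptors such as `energy` are then computed on the normalised counts.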
A machine learning approach for layout inference in spreadsheets
Spreadsheet applications are among the most widely used tools for content generation and presentation in industry and on the Web. In spite of this success, no comprehensive approach exists to automatically extract and reuse the richness of data maintained in this format. The biggest obstacle is the lack of awareness about the structure of the data in spreadsheets, which could otherwise provide the means to automatically understand and extract knowledge from these files. In this paper, we propose a classification approach to discover the layout of tables in spreadsheets. To this end, we focus on the cell level, considering a wide range of features not covered before by related work. We evaluated the performance of our classifiers on a large dataset covering three different corpora from various domains. Finally, our work includes a novel technique for detecting and repairing incorrectly classified cells in a post-processing step. The experimental results show that our approach delivers very high accuracy, bringing us a crucial step closer to automatic table extraction.
A random forest system combination approach for error detection in digital dictionaries
When digitizing a print bilingual dictionary, whether via optical character
recognition or manual entry, it is inevitable that errors are introduced into
the electronic version that is created. We investigate automating the process
of detecting errors in an XML representation of a digitized print dictionary
using a hybrid approach that combines rule-based, feature-based, and language
model-based methods. We investigate combining methods and show that using
random forests is a promising approach. We find that in isolation, unsupervised
methods rival the performance of supervised methods. Random forests typically
require training data so we investigate how we can apply random forests to
combine individual base methods that are themselves unsupervised without
requiring large amounts of training data. Experiments reveal empirically that a
relatively small amount of data is sufficient and can potentially be further
reduced through specific selection criteria.
Comment: 9 pages, 7 figures, 10 tables; appeared in Proceedings of the
Workshop on Innovative Hybrid Approaches to the Processing of Textual Data,
April 201