36,868 research outputs found
Extremely randomized trees
This paper proposes anew tree-based ensemble method for supervised classification and regression problems. It essentially consists of randomizing strongly both attribute and cut-point choice while splitting a tree node. In the extreme case, it builds totally randomized trees whose structures are independent of the output values of the learning sample. The strength of the randomization can be tuned to problem specifics by the appropriate choice of a parameter. We evaluate the robustness of the default choice of this parameter, and we also provide insight on how to adjust it in particular situations. Besides accuracy, the main strength of the resulting algorithm is computational efficiency. A bias/variance analysis of the Extra-Trees algorithm is also provided as well as a geometrical and a kernel characterization of the models induced.Peer reviewe
Unsupervised extremely randomized trees
International audienceIn this paper we present a method to compute dissimilarities on unlabeled data, based on extremely randomized trees. This method, Unsupervised Extremely Randomized Trees, is used jointly with a novel randomized labeling scheme we describe here, and that we call AddCl3. Unlike existing methods such as AddCl1 and AddCl2, no synthetic instances are generated, thus avoiding an increase in the size of the dataset. The empirical study of this method shows that Unsupervised Extremely Randomized Trees with AddCl3 provides competitive results regarding the quality of resulting clusterings, while clearly outperforming previous similar methods in terms of running time
Grand Challenge: Real-time Destination and ETA Prediction for Maritime Traffic
In this paper, we present our approach for solving the DEBS Grand Challenge
2018. The challenge asks to provide a prediction for (i) a destination and the
(ii) arrival time of ships in a streaming-fashion using Geo-spatial data in the
maritime context. Novel aspects of our approach include the use of ensemble
learning based on Random Forest, Gradient Boosting Decision Trees (GBDT),
XGBoost Trees and Extremely Randomized Trees (ERT) in order to provide a
prediction for a destination while for the arrival time, we propose the use of
Feed-forward Neural Networks. In our evaluation, we were able to achieve an
accuracy of 97% for the port destination classification problem and 90% (in
mins) for the ETA prediction
Ensembles of extremely randomized trees and some generic applications
peer reviewedIn this paper we present a new tree-based ensemble method called “Extra-Trees”. This algorithm averages predictions of trees obtained by partitioning the inputspace with randomly generated splits, leading to significant improvements of precision, and various algorithmic advantages, in particular reduced computational complexity and scalability. We also discuss two generic applications of this algorithm, namely for time-series classification and for the automatic inference of near-optimal sequential decision policies from experimental data
Automated multimodal volume registration based on supervised 3D anatomical landmark detection
We propose a new method for automatic 3D multimodal registration based on anatomical landmark detection. Landmark detectors are learned independantly in the two imaging modalities using Extremely Randomized Trees and multi-resolution voxel windows. A least-squares fitting algorithm is then used for rigid registration based on the landmark positions as predicted by these detectors in the two imaging modalities. Experiments are carried out with this method on a dataset of pelvis CT and CBCT scans related to 45 patients. On this dataset, our fully automatic approach yields results very competitive with respect to a manually assisted state-of-the-art rigid registration algorithm
Random subwindows and extremely randomized trees for image classification in cell biology
Background: With the improvements in biosensors and high-throughput image acquisition technologies, life science laboratories are able to perform an increasing number of experiments that involve the generation of a large amount of images at different imaging modalities/scales. It stresses the need for computer vision methods that automate image classification tasks. Results: We illustrate the potential of our image classification method in cell biology by evaluating it on four datasets of images related to protein distributions or subcellular localizations, and red-blood cell shapes. Accuracy results are quite good without any specific pre-processing neither domain knowledge incorporation. The method is implemented in Java and available upon request for evaluation and research purpose. Conclusion: Our method is directly applicable to any image classification problems. We foresee the use of this automatic approach as a baseline method and first try on various biological image classification problems
- …