The effect of google drive distance and duration in residential property in Sydney, Australia
© 2016 by World Scientific Publishing Co. Pte. Ltd. Predicting the market value of a residential property accurately, without inspection by a professional valuer, could benefit a variety of organizations and people. Building an Automated Valuation Model could be beneficial if it is sufficiently accurate. This paper examined 47 machine learning models (linear and non-linear). These models were fitted on 1967 records of units from 19 suburbs of Sydney, Australia. The main aim of this paper is to compare the performance of these techniques on this data set and to investigate the effect of spatial information on valuation accuracy. The results demonstrated that the tree models eXtreme Gradient Boosting Linear, eXtreme Gradient Boosting Tree and Random Forest, in that order, performed best among the techniques, and that spatial information such as drive distance and duration to the CBD increases predictive model performance significantly.
Preparing, restructuring, and augmenting a French treebank: lexicalised parsers or coherent treebanks?
We present the Modified French Treebank (MFT), a completely revamped French Treebank, derived from the Paris 7 Treebank
(P7T), which is cleaner, more coherent, has several transformed structures, and introduces new linguistic analyses. To determine the effect of these changes, we
investigate how the MFT fares in statistical parsing. Probabilistic parsers trained on the MFT training set (currently 3800 trees) already perform better than their counterparts trained on five times the P7T data (18,548 trees), providing an extreme example of the importance of data quality over quantity in statistical parsing. Moreover,
regression analysis on the learning curve of parsers trained on the MFT leads to the prediction that parsers trained on the full projected 18,548 tree MFT training set
will far outscore their counterparts trained on the full P7T. These analyses also show how problematic data can lead to problematic conclusions: in particular, we find that
lexicalisation in the probabilistic parsing of French is probably not as crucial as was once thought (Arun and Keller, 2005).
Deep Extreme Multi-label Learning
Extreme multi-label learning (XML) or classification has been a practical and
important problem since the boom of big data. The main challenge lies in the
exponential label space, which involves 2^L possible label sets, especially
when the label dimension L is huge, e.g., in the millions for Wikipedia labels.
This paper is motivated to better explore the label space by originally
establishing an explicit label graph. Meanwhile, deep learning has been
widely studied and used in various classification problems including
multi-label classification; however, it has not been properly introduced to XML,
where the label space can run into the millions. In this paper, we propose
a practical deep embedding method for extreme multi-label classification, which
harvests the ideas of non-linear embedding and graph priors-based label space
modeling simultaneously. Extensive experiments on public datasets for XML show
that our method performs competitively against state-of-the-art results.
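
To make the scale of the exponential label space concrete, here is a small worked calculation. The symbol $L$ for the number of labels and the values $L = 20$ and $L = 10^{6}$ are illustrative assumptions of this note, not figures from the paper's experiments.

```latex
\[
  |\mathcal{Y}| = 2^{L}, \qquad
  L = 20 \;\Rightarrow\; 2^{20} = 1{,}048{,}576, \qquad
  L = 10^{6} \;\Rightarrow\; 2^{10^{6}} \approx 9.9 \times 10^{301{,}029}.
\]
```

Enumerating candidate label sets is therefore infeasible at this scale, which is consistent with the abstract's interest in a compact, embedded representation of the label space.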
Tree Echo State Networks
In this paper we present the Tree Echo State Network (TreeESN) model, generalizing the paradigm of Reservoir Computing to tree structured data. TreeESNs exploit an untrained generalized recursive reservoir, exhibiting extreme efficiency for learning in structured domains. In addition, we highlight throughout the paper other characteristics of the approach: First, we discuss the Markovian characterization of reservoir dynamics, extended to the case of tree domains, that is implied by the contractive setting of the TreeESN state transition function. Second, we study two types of state mapping functions to map the tree structured state of TreeESN into a fixed-size feature representation for classification or regression tasks. The critical role of the relation between the choice of the state mapping function and the Markovian characterization of the task is analyzed and experimentally investigated on both artificial and real-world tasks. Finally, experimental results on benchmark and real-world tasks show that the TreeESN approach, in spite of its efficiency, can achieve results comparable with state-of-the-art, although more complex, neural and kernel based models for tree structured data.
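
The following is a minimal sketch of the idea the abstract describes: an untrained recursive reservoir applied bottom-up over a tree, followed by a state mapping that yields a fixed-size vector. It is not the authors' implementation. The names `Node`, `make_reservoir`, `encode`, `root_mapping` and `mean_mapping` are hypothetical, the two mappings shown (root state, and mean over all node states) are illustrative choices since the abstract does not name them, and the rescaling rule used to keep the transition contractive is one simple sufficient condition assumed here. The trained readout is omitted.

```python
import math
import random


class Node:
    """A tree node carrying a numeric label vector and a list of children."""

    def __init__(self, label, children=()):
        self.label = list(label)
        self.children = list(children)


def make_reservoir(n_in, n_res, max_children, sigma=0.9, seed=0):
    """Draw random, untrained input and recurrent weights.

    Assumption: the recurrent matrix is rescaled so that
    max_children * ||w_hat||_inf == sigma < 1, which is sufficient for the
    tanh state transition to be contractive in the child states.
    """
    rng = random.Random(seed)
    w_in = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_res)]
    w_hat = [[rng.uniform(-1, 1) for _ in range(n_res)] for _ in range(n_res)]
    inf_norm = max(sum(abs(v) for v in row) for row in w_hat)
    scale = sigma / (max_children * inf_norm)
    w_hat = [[v * scale for v in row] for row in w_hat]
    return w_in, w_hat


def encode(node, w_in, w_hat, states):
    """Compute the state of `node` bottom-up and record every node state.

    state(n) = tanh(w_in @ label(n) + sum over children c of w_hat @ state(c))
    """
    n_res = len(w_hat)
    child_sum = [0.0] * n_res
    for child in node.children:
        child_state = encode(child, w_in, w_hat, states)
        child_sum = [a + b for a, b in zip(child_sum, child_state)]
    state = []
    for i in range(n_res):
        pre = sum(w * u for w, u in zip(w_in[i], node.label))
        pre += sum(w * s for w, s in zip(w_hat[i], child_sum))
        state.append(math.tanh(pre))
    states.append(state)
    return state


def root_mapping(tree, w_in, w_hat):
    """Fixed-size representation: the state of the root node."""
    return encode(tree, w_in, w_hat, [])


def mean_mapping(tree, w_in, w_hat):
    """Fixed-size representation: the mean of all node states."""
    states = []
    encode(tree, w_in, w_hat, states)
    return [sum(col) / len(states) for col in zip(*states)]
```

Either vector could then be fed to a trained linear readout for classification or regression; only that readout would be fitted, while `w_in` and `w_hat` stay fixed.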