4,256 research outputs found

    Mass appraisal of residential apartments: An application of Random forest for valuation and a CART-based approach for model diagnostics

    To the best of the authors' knowledge, this is the first attempt to use Random forest as a technique for residential estate mass appraisal. In the empirical study using data on residential apartments, the method performed better than techniques such as CHAID, CART, KNN, multiple regression analysis, Artificial Neural Networks (MLP and RBF) and Boosted Trees. An approach is introduced for automatically detecting segments where a model significantly underperforms and segments with systematically under- or overestimated predictions. This segmentation approach is applicable to various expert systems including, but not limited to, those used for mass appraisal. Keywords: Random forest, mass appraisal, CART, model diagnostics, real estate, automatic valuation model
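
The segment-detection idea in this abstract can be illustrated with a single CART-style regression split on signed prediction errors. This is only a minimal sketch, not the authors' procedure: the feature (floor area), the data values, and the function names `sse` and `best_split` are all hypothetical, and a real diagnostic would grow a full tree over many features.

```python
from statistics import mean


def sse(values):
    """Sum of squared deviations from the mean (the CART regression impurity)."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values)


def best_split(feature, residuals):
    """Find the one threshold on `feature` that best separates the residuals.

    Returns (threshold, mean residual below, mean residual above).
    Assumes the feature takes at least two distinct values.
    """
    pairs = sorted(zip(feature, residuals))
    best = None
    for k in range(1, len(pairs)):
        if pairs[k - 1][0] == pairs[k][0]:
            continue  # cannot split between identical feature values
        left = [r for _, r in pairs[:k]]
        right = [r for _, r in pairs[k:]]
        cost = sse(left) + sse(right)
        if best is None or cost < best[0]:
            threshold = (pairs[k - 1][0] + pairs[k][0]) / 2
            best = (cost, threshold, mean(left), mean(right))
    return best[1:]


# Hypothetical data: floor area and signed error (predicted minus actual price).
area = [30, 35, 40, 45, 50, 90, 95, 100, 110, 120]
error = [1.0, -2.0, 0.5, -1.0, 1.5, -14.0, -16.0, -15.0, -17.0, -13.0]

threshold, mean_small, mean_large = best_split(area, error)
print(threshold, mean_small, mean_large)
```

In this made-up data the split isolates the large apartments as a segment whose predictions are systematically too low, while the small apartments show no consistent bias.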

    Modeling manufacturing processes using a genetic programming-based fuzzy regression with detection of outliers

    Fuzzy regression (FR) has been demonstrated as a promising technique for modeling manufacturing processes where availability of data is limited. FR can yield only linear-type models, which have a higher degree of fuzziness, and it ignores higher order or interaction terms and the influence of outliers, all of which usually exist in manufacturing process data. Genetic programming (GP), on the other hand, can be used to generate models with higher order and interaction terms, but it cannot address the fuzziness of the manufacturing process data. In this paper, genetic programming-based fuzzy regression (GP-FR), which combines the advantages of the two approaches to overcome the deficiencies of the commonly used existing modeling methods, is proposed in order to model manufacturing processes. GP-FR uses GP to generate model structures based on a tree representation, which can represent interaction and higher order terms of models, and it uses an FR generator based on fuzzy regression to determine outliers in experimental data sets. It determines the contribution and fuzziness of each term in the model by using experimental data excluding the outliers. To evaluate the effectiveness of GP-FR in modeling manufacturing processes, it was used to model a non-linear system and an epoxy dispensing process. The results were compared with those based on two commonly used FR methods, Tanaka's FR and Peters' FR. The prediction accuracy of the models developed based on GP-FR was shown to be better than that of models based on the other two FR methods.

    Treatment of Model Error in Calibration by Robust and Fuzzy Procedures

    Animal sensory systems are optimally adapted to those features typically encountered in natural surrounds, thus allowing neurons with limited bandwidth to encode challengingly large input ranges. Natural scenes are not random, and peripheral visual systems in vertebrates and insects have evolved to respond efficiently to their typical spatial statistics. The mammalian visual cortex is also tuned to natural spatial statistics, but less is known about coding in higher order neurons in insects. To redress this we here record intracellularly from a higher order visual neuron in the hoverfly. We show that the cSIFE neuron, which is inhibited by stationary images, is maximally inhibited when the slope constant of the amplitude spectrum is close to the mean in natural scenes. The behavioural optomotor response is also strongest to images with naturalistic image statistics. Our results thus reveal a close coupling between the inherent statistics of natural scenes and higher order visual processing in insects. Supplementary information available for this article at http://www.nature.com/ncomms/2015/151006/ncomms9522/suppinfo/ncomms9522_S1.html

    Using Fuzzy Linear Regression to Estimate Relationship between Forest Fires and Meteorological Conditions

    Each year, millions of hectares of forest land are destroyed by fires, causing great financial loss and ecological damage. In this paper, our aim is to study the effect of the variation of meteorological conditions on the total burned area in hectares, by using fuzzy linear regression analysis based on Tanaka's approaches. The total burned area is considered a dependent variable. Air temperature (in °C), relative humidity (in %), wind speed (in km/h) and rainfall (in mm/m²) are considered to be independent variables. The relationship between input and output data is estimated using data provided in data mining literature. In our study, we apply fuzzy regression, using crisp/fuzzy input data and fuzzy output data expressed in linguistic terms.

    Vol. 15, No. 1 (Full Issue)

    A weighted goal programming approach to fuzzy linear regression with quasi type-2 fuzzy input-output data

    This study attempts to develop a regression model when both input data and output data are quasi type-2 fuzzy numbers. To estimate the crisp parameters of the regression model, a linear programming model is proposed based on goal programming. To handle the outlier problem, an omission approach is proposed. This approach examines the behavior of value changes in the objective function of the proposed model when observations are omitted. In order to illustrate the proposed model, some numerical examples are presented. The applicability of the proposed method is tested on a real data set on soil science. The predictive performance of the model is examined by cross-validation. Publisher's Version
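
The omission approach described in this abstract (watching how the objective value changes when each observation is left out) can be sketched on crisp data. This is an illustration under stated assumptions, not the paper's model: an ordinary least-squares line stands in for the goal-programming LP, the total absolute deviation stands in for its objective, and the data and function names (`fit_line`, `objective`, `omission_scores`) are hypothetical.

```python
from statistics import mean


def fit_line(xs, ys):
    """Least-squares fit y = a + b*x (a stand-in for the paper's LP model)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def objective(xs, ys):
    """Total absolute deviation of the line fitted to exactly these points."""
    a, b = fit_line(xs, ys)
    return sum(abs(y - (a + b * x)) for x, y in zip(xs, ys))


def omission_scores(xs, ys):
    """Drop in the objective when each observation is omitted in turn."""
    full = objective(xs, ys)
    scores = []
    for i in range(len(xs)):
        xs_i = xs[:i] + xs[i + 1:]
        ys_i = ys[:i] + ys[i + 1:]
        scores.append(full - objective(xs_i, ys_i))
    return scores


# Hypothetical data; the observation at index 3 is a planted outlier.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [2.1, 3.9, 6.2, 20.0, 9.8, 12.1]

scores = omission_scores(xs, ys)
suspect = max(range(len(scores)), key=scores.__getitem__)
print(suspect, scores)
```

An observation whose omission lowers the objective far more than any other is a candidate outlier; how large a drop counts as "far more" is a judgment the sketch leaves open.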

    Laplacian Mixture Modeling for Network Analysis and Unsupervised Learning on Graphs

    Laplacian mixture models identify overlapping regions of influence in unlabeled graph and network data in a scalable and computationally efficient way, yielding useful low-dimensional representations. By combining Laplacian eigenspace and finite mixture modeling methods, they provide probabilistic or fuzzy dimensionality reductions or domain decompositions for a variety of input data types, including mixture distributions, feature vectors, and graphs or networks. Provable optimal recovery using the algorithm is analytically shown for a nontrivial class of cluster graphs. Heuristic approximations for scalable high-performance implementations are described and empirically tested. Connections to PageRank and community detection in network analysis demonstrate the wide applicability of this approach. The origins of fuzzy spectral methods, beginning with generalized heat or diffusion equations in physics, are reviewed and summarized. Comparisons to other dimensionality reduction and clustering methods for challenging unsupervised machine learning problems are also discussed. Comment: 13 figures, 35 references