4,466 research outputs found

    Identifying Real Estate Opportunities using Machine Learning

    Full text link
    The real estate market is exposed to many fluctuations in prices because of existing correlations with many variables, some of which cannot be controlled or might even be unknown. Housing prices can increase rapidly (or in some cases, also drop very fast), yet the numerous listings available online where houses are sold or rented are not likely to be updated that often. In some cases, individuals interested in selling a house (or apartment) might include it in some online listing, and forget about updating the price. In other cases, some individuals might be interested in deliberately setting a price below the market price in order to sell the home faster, for various reasons. In this paper, we aim at developing a machine learning application that identifies opportunities in the real estate market in real time, i.e., houses that are listed with a price substantially below the market price. This program can be useful for investors interested in the housing market. We have focused in a use case considering real estate assets located in the Salamanca district in Madrid (Spain) and listed in the most relevant Spanish online site for home sales and rentals. The application is formally implemented as a regression problem that tries to estimate the market price of a house given features retrieved from public online listings. For building this application, we have performed a feature engineering stage in order to discover relevant features that allows for attaining a high predictive performance. Several machine learning algorithms have been tested, including regression trees, k-nearest neighbors, support vector machines and neural networks, identifying advantages and handicaps of each of them.Comment: 24 pages, 13 figures, 5 table

    European exchange trading funds trading with locally weighted support vector regression

    Get PDF
    In this paper, two different Locally Weighted Support Vector Regression (wSVR) algorithms are generated and applied to the task of forecasting and trading five European Exchange Traded Funds. The trading application covers the recent European Monetary Union debt crisis. The performance of the proposed models is benchmarked against traditional Support Vector Regression (SVR) models. The Radial Basis Function, the Wavelet and the Mahalanobis kernel are explored and tested as SVR kernels. Finally, a novel statistical SVR input selection procedure is introduced based on a principal component analysis and the Hansen, Lunde, and Nason (2011) model confidence test. The results demonstrate the superiority of the wSVR models over the traditional SVRs and of the v-SVR over the ε-SVR algorithms. We note that the performance of all models varies and considerably deteriorates in the peak of the debt crisis. In terms of the kernels, our results do not confirm the belief that the Radial Basis Function is the optimum choice for financial series

    A literature survey of methods for analysis of subjective language

    Get PDF
    Subjective language is used to express attitudes and opinions towards things, ideas and people. While content and topic centred natural language processing is now part of everyday life, analysis of subjective aspects of natural language have until recently been largely neglected by the research community. The explosive growth of personal blogs, consumer opinion sites and social network applications in the last years, have however created increased interest in subjective language analysis. This paper provides an overview of recent research conducted in the area

    A survey on online active learning

    Full text link
    Online active learning is a paradigm in machine learning that aims to select the most informative data points to label from a data stream. The problem of minimizing the cost associated with collecting labeled observations has gained a lot of attention in recent years, particularly in real-world applications where data is only available in an unlabeled form. Annotating each observation can be time-consuming and costly, making it difficult to obtain large amounts of labeled data. To overcome this issue, many active learning strategies have been proposed in the last decades, aiming to select the most informative observations for labeling in order to improve the performance of machine learning models. These approaches can be broadly divided into two categories: static pool-based and stream-based active learning. Pool-based active learning involves selecting a subset of observations from a closed pool of unlabeled data, and it has been the focus of many surveys and literature reviews. However, the growing availability of data streams has led to an increase in the number of approaches that focus on online active learning, which involves continuously selecting and labeling observations as they arrive in a stream. This work aims to provide an overview of the most recently proposed approaches for selecting the most informative observations from data streams in the context of online active learning. We review the various techniques that have been proposed and discuss their strengths and limitations, as well as the challenges and opportunities that exist in this area of research. Our review aims to provide a comprehensive and up-to-date overview of the field and to highlight directions for future work
    • …
    corecore