
    A survey of outlier detection methodologies

    Outlier detection has been used for centuries to detect and, where appropriate, remove anomalous observations from data. Outliers arise from mechanical faults, changes in system behaviour, fraudulent behaviour, human error, instrument error, or simply natural deviation in populations. Their detection can identify system faults and fraud before they escalate with potentially catastrophic consequences, and can identify errors, removing their contaminating effect on the data set and thereby purifying the data for processing. The original outlier detection methods were arbitrary, but principled and systematic techniques are now used, drawn from the full gamut of computer science and statistics. In this paper we present a survey of contemporary techniques for outlier detection, identify their respective motivations, and distinguish their advantages and disadvantages in a comparative review.
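As a concrete illustration of the simplest kind of principled technique a survey like this covers, here is a minimal z-score detector: flag observations more than k standard deviations from the mean. The data and the threshold k=2 are illustrative, not taken from the paper.

```python
from statistics import mean, stdev

def zscore_outliers(data, k=2.0):
    """Return indices of points more than k standard deviations from the mean."""
    mu, sigma = mean(data), stdev(data)
    if sigma == 0:
        return []
    return [i for i, x in enumerate(data) if abs(x - mu) / sigma > k]

# A run of sensor readings with one instrument error.
readings = [10.1, 9.8, 10.3, 10.0, 9.9, 55.0, 10.2]
print(zscore_outliers(readings))  # the faulty reading at index 5
```

Note the classic weakness such surveys discuss: the outlier itself inflates the mean and standard deviation (masking), which is why more robust variants based on the median are often preferred.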

    Managing Uncertainty: A Case for Probabilistic Grid Scheduling

    Grid technology is evolving into a global, service-orientated architecture: a universal platform for delivering future high-demand computational services. Strong adoption of the Grid and the utility computing concept is leading to an increasing number of Grid installations running a wide range of applications of different size and complexity. In this paper we address the problem of delivering deadline/economy-based scheduling in a heterogeneous application environment, using statistical properties of historical job executions and their associated meta-data. This approach is motivated by a study of six months of computational load generated by Grid applications in a multi-purpose Grid cluster serving a community of twenty e-Science projects. The observed job statistics, resource utilisation, and user behaviour are discussed in the context of the management approaches and models most suitable for supporting a probabilistic and autonomous scheduling architecture.
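A hypothetical sketch of the core idea behind probabilistic deadline scheduling: use the empirical distribution of an application's historical runtimes to estimate the probability that a new job finishes before its deadline. The function name and the runtime figures are invented for illustration, not the paper's actual model.

```python
def deadline_probability(history, deadline):
    """Estimate P(runtime <= deadline) from historical runtimes (empirical CDF)."""
    if not history:
        return 0.0
    return sum(1 for t in history if t <= deadline) / len(history)

# Illustrative historical runtimes (seconds) for one application class.
past_runtimes = [120, 95, 150, 110, 300, 130]
p = deadline_probability(past_runtimes, deadline=140)
print(p)  # 4 of 6 past runs finished within 140 s
```

A scheduler could then admit a job only when this probability clears a service-level threshold, or price it accordingly in an economy-based model.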

    The Distribution of Stock Return Volatility

    We exploit direct model-free measures of daily equity return volatility and correlation obtained from high-frequency intraday transaction prices on individual stocks in the Dow Jones Industrial Average over a five-year period to confirm, solidify and extend existing characterizations of stock return volatility and correlation. We find that the unconditional distributions of the variances and covariances for all thirty stocks are leptokurtic and highly skewed to the right, while the logarithmic standard deviations and correlations all appear approximately Gaussian. Moreover, the distributions of returns scaled by the realized standard deviations are also Gaussian. Furthermore, the realized logarithmic standard deviations and correlations all show strong temporal dependence and appear to be well described by long-memory processes, consistent with our documentation of remarkably precise scaling laws under temporal aggregation. Our results also show that positive returns have less impact on future variances and correlations than negative returns of the same absolute magnitude, although the economic importance of this asymmetry is minor. Finally, there is strong evidence that equity volatilities and correlations move together, diminishing the benefits of diversification when the market is most volatile. By explicitly incorporating each of these stylized facts, our findings set the stage for improved high-dimensional volatility modeling and out-of-sample forecasting, which in turn hold promise for better decision making in practical situations of risk management, portfolio allocation, and asset pricing.
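The "model-free" construction behind these results can be sketched in a few lines: the daily realized variance is the sum of squared high-frequency intraday log-returns, and the paper studies the (approximately Gaussian) logarithm of the resulting standard deviation. The price series below is illustrative.

```python
import math

def realized_variance(intraday_prices):
    """Sum of squared intraday log-returns over one trading day."""
    returns = [math.log(p1 / p0)
               for p0, p1 in zip(intraday_prices, intraday_prices[1:])]
    return sum(r * r for r in returns)

prices = [100.0, 100.5, 99.8, 100.2, 100.1]  # e.g. 5-minute marks, illustrative
rv = realized_variance(prices)
log_std = 0.5 * math.log(rv)  # log realized standard deviation
```

Summing over finer and finer intervals makes this an increasingly accurate estimate of that day's integrated variance, which is what lets the paper treat volatility as effectively observable rather than latent.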

    Support vector machine for the prediction of heating energy use

    Prediction of a building's heating energy use is very important for adequate energy planning. In this paper the daily district heating use of one university campus was predicted using a support vector machine model. The support vector machine is an artificial intelligence method that has recently been shown to achieve comparable, or even better, prediction results than the more widely used artificial neural networks. The proposed model was trained and tested on real, measured data. The model's accuracy was compared with the results of previously published models (various neural networks and their ensembles) on the same database. The results showed that the support vector machine model can achieve better results than the individual neural networks, and also better than the conventional and multistage ensembles. It is expected that this theoretically well-grounded methodology will find wider application, especially in prediction tasks.
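Two ingredients distinguish support vector regression from an ordinary neural-network fit: a kernel that measures similarity between inputs, and the epsilon-insensitive loss, which ignores errors inside a tolerance tube. A minimal sketch of both; the Gaussian kernel, gamma, and the 2 MWh tube are illustrative assumptions, not the paper's settings.

```python
import math

def rbf_kernel(x1, x2, gamma=0.1):
    """Gaussian (RBF) kernel: similarity between two feature values."""
    return math.exp(-gamma * (x1 - x2) ** 2)

def epsilon_insensitive_loss(y_true, y_pred, epsilon=2.0):
    """Errors inside the epsilon tube cost nothing; beyond it, cost grows linearly."""
    return max(0.0, abs(y_true - y_pred) - epsilon)

# A heating-load prediction 1 MWh off is "close enough" under a 2 MWh tube,
# while a 5 MWh miss is penalised only for the 3 MWh outside the tube.
print(epsilon_insensitive_loss(80.0, 81.0))  # 0.0
print(epsilon_insensitive_loss(80.0, 85.0))  # 3.0
```

This loss is what gives SVR its sparse set of support vectors: training days predicted within the tube contribute nothing to the model.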

    Deep Neural Networks for No-Reference and Full-Reference Image Quality Assessment

    We present a deep neural network-based approach to image quality assessment (IQA). The network is trained end-to-end and comprises ten convolutional layers and five pooling layers for feature extraction, and two fully connected layers for regression, which makes it significantly deeper than related IQA models. Unique features of the proposed architecture are that: 1) with slight adaptations it can be used in a no-reference (NR) as well as in a full-reference (FR) IQA setting, and 2) it allows for joint learning of local quality and local weights, i.e., the relative importance of local quality to the global quality estimate, in a unified framework. Our approach is purely data-driven and does not rely on hand-crafted features or other prior domain knowledge about the human visual system or image statistics. We evaluate the proposed approach on the LIVE, CSIQ, and TID2013 databases as well as the LIVE In the Wild Image Quality Challenge database, and show superior performance to state-of-the-art NR and FR IQA methods. Finally, cross-database evaluation shows a high ability to generalize between different databases, indicating high robustness of the learned features.
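The weighted pooling idea in point 2 reduces to a small formula: each patch contributes a local quality score and a positive learned weight, and the global estimate is the weight-normalised average. A minimal sketch with invented scores and weights (in the actual network both are regression outputs):

```python
def weighted_quality(local_scores, local_weights, eps=1e-8):
    """Global quality as the weight-normalised average of patch qualities."""
    num = sum(q * w for q, w in zip(local_scores, local_weights))
    den = sum(local_weights) + eps  # eps guards against all-zero weights
    return num / den

patch_scores = [0.9, 0.4, 0.8]   # per-patch quality predictions
patch_weights = [1.0, 3.0, 1.0]  # learned importance of each patch
print(weighted_quality(patch_scores, patch_weights))
```

The heavily weighted middle patch dominates the estimate, which is how the network can let salient, distorted regions drive the global score.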

    Nonrational Actors and Financial Market Behavior

    The insights of descriptive decision theorists and psychologists, we believe, have much to contribute to our understanding of financial market macrophenomena. We propose an analytic agenda that distinguishes those individual idiosyncrasies that prove consequential at the macro-level from those that are neutralized by market processes such as poaching. We discuss five behavioral traits - barn-door closing, expert/reliance effects, status quo bias, framing, and herding - that we employ in explaining financial flows. Patterns in flows to mutual funds, to new equities, across national boundaries, as well as movements in debt-equity ratios are shown to be consistent with deviations from rationality.

    What Causes My Test Alarm? Automatic Cause Analysis for Test Alarms in System and Integration Testing

    Driven by new software development processes and testing in clouds, system and integration testing nowadays tends to produce an enormous number of alarms. Such test alarms place an almost unbearable burden on software testing engineers, who must manually analyze the causes of these alarms. The causes are critical because they determine which stakeholders are responsible for fixing the bugs detected during testing. In this paper, we present a novel approach that aims to relieve this burden by automating the procedure. Our approach, called the Cause Analysis Model, exploits information retrieval techniques to efficiently infer test alarm causes from test logs. We have developed a prototype and evaluated our tool on two industrial datasets with more than 14,000 test alarms. Experiments on the two datasets show that our tool achieves accuracies of 58.3% and 65.8%, respectively, outperforming the baseline algorithms by up to 13.3%. Our algorithm is also extremely efficient, spending about 0.1 s per cause analysis. Based on these results, our industrial partner, a world-leading information and communication technology company, has deployed the tool; it has achieved an average accuracy of 72% after two months of running, nearly three times more accurate than a previous strategy based on regular expressions.
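The information-retrieval idea can be sketched as nearest-neighbour matching: represent each test log as a bag of terms and assign a new alarm the cause label of the most similar historical log. The logs, cause labels, and function names below are invented for illustration and are not the paper's actual model or data.

```python
import math
from collections import Counter

def cosine(a: Counter, b: Counter):
    """Cosine similarity between two bag-of-words term-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def infer_cause(new_log, labelled_logs):
    """labelled_logs: list of (log_text, cause) pairs; return the best match's cause."""
    query = Counter(new_log.lower().split())
    best = max(labelled_logs,
               key=lambda item: cosine(query, Counter(item[0].lower().split())))
    return best[1]

history = [
    ("connection refused by host timeout", "environment issue"),
    ("assertion failed expected 3 got 4", "product bug"),
]
print(infer_cause("socket timeout connection refused", history))
```

The reported 0.1 s per analysis is plausible for this family of techniques, since scoring a query against indexed historical logs is cheap compared with rerunning or debugging the test.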