
    Visual and interactive exploration of point data

    Point data, such as Unit Postcodes (UPC), can provide very detailed information at fine scales of resolution. For instance, socio-economic attributes are commonly assigned to UPC. Hence, they can be represented as points and observed at the postcode level. Using UPC as a common field allows the concatenation of variables from disparate data sources that can potentially support sophisticated spatial analysis. However, visualising UPC in urban areas has at least three limitations. First, at small scales UPC occurrences can be very dense, making their visualisation as points difficult. On the other hand, patterns in the associated attribute values are often hardly recognisable at large scales. Secondly, UPC can be used as a common field to allow the concatenation of highly multivariate data sets with an associated postcode. Finally, socio-economic variables assigned to UPC (such as the ones used here) can be non-Normal in their distributions as a result of a large presence of zero values and high variances, which constrain their analysis using traditional statistics. This paper discusses a Point Visualisation Tool (PVT), a proof-of-concept system developed to visually explore point data. Various well-known visualisation techniques were implemented to enable their interactive and dynamic interrogation. PVT provides multiple representations of point data to facilitate the understanding of the relations between attributes or variables as well as their spatial characteristics. Brushing between alternative views is used to link several representations of a single attribute, as well as to simultaneously explore more than one variable. PVT’s functionality shows how the use of visual techniques embedded in an interactive environment enables the exploration of large amounts of multivariate point data.

    Functional Data Analysis in Electronic Commerce Research

    This paper describes opportunities and challenges of using functional data analysis (FDA) for the exploration and analysis of data originating from electronic commerce (eCommerce). We discuss the special data structures that arise in the online environment and why FDA is a natural approach for representing and analyzing such data. The paper reviews several FDA methods and motivates their usefulness in eCommerce research by providing a glimpse into new domain insights that they allow. We argue that the wedding of eCommerce with FDA leads to innovations both in statistical methodology, due to the challenges and complications that arise in eCommerce data, and in online research, by being able to ask (and subsequently answer) new research questions that classical statistical methods are not able to address, and also by expanding on research questions beyond the ones traditionally asked in the offline environment. We describe several applications originating from online transactions which are new to the statistics literature, and point out statistical challenges accompanied by some solutions. We also discuss some promising future directions for joint research efforts between researchers in eCommerce and statistics. Comment: Published at http://dx.doi.org/10.1214/088342306000000132 in the Statistical Science (http://www.imstat.org/sts/) by the Institute of Mathematical Statistics (http://www.imstat.org

    Auto-tuning Distributed Stream Processing Systems using Reinforcement Learning

    Fine tuning distributed systems is considered to be a craftsmanship, relying on intuition and experience. This becomes even more challenging when the systems need to react in near real time, as streaming engines have to do to maintain pre-agreed service quality metrics. In this article, we present an automated approach that builds on a combination of supervised and reinforcement learning methods to recommend the most appropriate lever configurations based on previous load. With this, streaming engines can be automatically tuned without requiring a human to determine the right way and proper time to deploy them. This opens the door to new configurations that are not being applied today since the complexity of managing these systems has surpassed the abilities of human experts. We show how reinforcement learning systems can find substantially better configurations in less time than their human counterparts and adapt to changing workloads

    Community Detection and Growth Potential Prediction from Patent Citation Networks

    The scoring of patents is useful for technology management analysis. This creates a need for citation network clustering and prediction of future citations for practical patent scoring. In this paper, we propose a community detection method using Node2vec. To analyze growth potential, we compare three time series analysis methods: Long Short-Term Memory (LSTM), the ARIMA model, and the Hawkes process. In our experiments, we could find common technical points in the clusters produced by Node2vec. Furthermore, we found that the prediction accuracy of the ARIMA model was higher than that of the other models. Comment: arXiv admin note: text overlap with arXiv:1607.00653 by other author

    Proceedings of the 2011 New York Workshop on Computer, Earth and Space Science

    The purpose of the New York Workshop on Computer, Earth and Space Sciences is to bring together the New York area's finest Astronomers, Statisticians, Computer Scientists, Space and Earth Scientists to explore potential synergies between their respective fields. The 2011 edition (CESS2011) was a great success, and we would like to thank all of the presenters and participants for attending. This year was also special as it included authors from the upcoming book titled "Advances in Machine Learning and Data Mining for Astronomy". Over two days, the latest advanced techniques used to analyze the vast amounts of information now available for the understanding of our universe and our planet were presented. These proceedings attempt to provide a small window into what the current state of research is in this vast interdisciplinary field and we'd like to thank the speakers who spent the time to contribute to this volume. Comment: Author lists modified. 82 pages. Workshop Proceedings from CESS 2011 in New York City, Goddard Institute for Space Studie