
    Mining Top-k Closed Co-location Patterns

    A spatial co-located event set is a set of spatial events that are frequently observed together in nearby geographic space. Spatial co-location patterns can give useful information in many application domains such as business, ecology, public health and criminology

    Subjectively Interesting Subgroup Discovery on Real-valued Targets

    Deriving insights from high-dimensional data is one of the core problems in data mining. The difficulty mainly stems from the fact that there are exponentially many variable combinations to potentially consider, and infinitely many if we consider weighted combinations, even when restricted to linear ones. Hence, an obvious question is whether we can automate the search for interesting patterns and visualizations. In this paper, we consider the setting where a user wants to learn as efficiently as possible about real-valued attributes. For example, a user may want to understand the distribution of crime rates in different geographic areas in terms of other (numerical, ordinal and/or categorical) variables that describe the areas. We introduce a method to find subgroups in the data that are maximally informative (in the formal Information Theoretic sense) with respect to one or more real-valued target attributes. The subgroup descriptions are in terms of a succinct set of arbitrarily-typed other attributes. The approach is based on the Subjective Interestingness framework FORSIED to enable the use of prior knowledge when finding the most informative non-redundant patterns, and hence the method also supports iterative data mining. Comment: 12 pages, 10 figures, 2 tables, conference submissio

    Geo-Spotting: Mining Online Location-based Services for Optimal Retail Store Placement

    The problem of identifying the optimal location for a new retail store has been the focus of past research, especially in the field of land economy, due to its importance in the success of a business. Traditional approaches to the problem have factored in demographics, revenue and aggregated human flow statistics from nearby or remote areas. However, the acquisition of relevant data is usually expensive. With the growth of location-based social networks, fine-grained data describing user mobility and the popularity of places has recently become attainable. In this paper we study the predictive power of various machine learning features on the popularity of retail stores in the city through the use of a dataset collected from Foursquare in New York. The features we mine are based on two general signals: geographic, where features are formulated according to the types and density of nearby places, and user mobility, which includes transitions between venues or the incoming flow of mobile users from distant areas. Our evaluation suggests that the best performing features are common across the three different commercial chains considered in the analysis, although variations may exist too, as explained by heterogeneities in the way retail facilities attract users. We also show that performance improves significantly when combining multiple features in supervised learning algorithms, suggesting that the retail success of a business may depend on multiple factors. Comment: Proceedings of the 19th ACM SIGKDD international conference on Knowledge discovery and data mining, Chicago, 2013, Pages 793-80

    Testing Interestingness Measures in Practice: A Large-Scale Analysis of Buying Patterns

    Understanding customer buying patterns is of great interest to the retail industry and has been shown to benefit a wide variety of goals ranging from managing stocks to implementing loyalty programs. Association rule mining is a common technique for extracting correlations such as "people in the South of France buy rosé wine" or "customers who buy pâté also buy salted butter and sour bread." Unfortunately, sifting through a high number of buying patterns is not useful in practice, because of the predominance of popular products in the top rules. As a result, a number of "interestingness" measures (over 30) have been proposed to rank rules. However, there is no agreement on which measures are more appropriate for retail data. Moreover, since pattern mining algorithms output thousands of association rules for each product, the ability of an analyst to rely on ranking measures to identify the most interesting ones is crucial. In this paper, we develop CAPA (Comparative Analysis of PAtterns), a framework that provides analysts with the ability to compare the outcome of interestingness measures applied to buying patterns in the retail industry. We report on how we used CAPA to compare 34 measures applied to over 1,800 stores of Intermarché, one of the largest food retailers in France
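
As a rough illustration of what ranking rules by an interestingness measure involves, the sketch below computes three widely used measures (support, confidence and lift) for a single rule. The abstract does not list CAPA's 34 measures, so the choice of these three, the function name `rule_measures` and the toy baskets are all assumptions made for illustration only.

```python
def rule_measures(transactions, antecedent, consequent):
    """Support, confidence and lift of the rule antecedent -> consequent.

    Hypothetical helper for illustration; not taken from CAPA.
    """
    antecedent, consequent = frozenset(antecedent), frozenset(consequent)
    baskets = [frozenset(t) for t in transactions]
    n = len(baskets)

    # Count baskets containing the antecedent, the consequent, and both.
    n_a = sum(antecedent <= b for b in baskets)
    n_c = sum(consequent <= b for b in baskets)
    n_ac = sum((antecedent | consequent) <= b for b in baskets)

    support = n_ac / n
    confidence = n_ac / n_a if n_a else 0.0
    # Lift compares the rule's confidence with the consequent's base rate.
    lift = confidence / (n_c / n) if n_c else 0.0
    return {"support": support, "confidence": confidence, "lift": lift}


# Toy data (made up), echoing the abstract's example rule.
baskets = [
    {"pate", "salted butter", "bread"},
    {"pate", "salted butter"},
    {"bread", "milk"},
    {"pate", "milk"},
]
measures = rule_measures(baskets, {"pate"}, {"salted butter"})
print(measures)
```

A lift above 1 indicates that the consequent appears with the antecedent more often than its overall frequency would predict, which is one way to demote rules that merely reflect popular products.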

    Population Density-based Hospital Recommendation with Mobile LBS Big Data

    Difficulty in obtaining medical treatment is one of the major livelihood issues in China. Since patients lack prior knowledge about the spatial distribution and capacity of hospitals, some hospitals have abnormally high or sporadic population densities. This paper presents a new model for estimating the spatiotemporal population density in each hospital based on location-based service (LBS) big data, which would be beneficial for guiding and dispersing outpatients. To improve the estimation accuracy, several approaches are proposed to denoise the LBS data and classify people by detecting their various behaviors. In addition, a long short-term memory (LSTM) based deep learning method is presented to predict the trend of population density. Using Baidu's large-scale LBS log database, we apply the proposed model to 113 hospitals in Beijing, P. R. China, and construct an online hospital recommendation system that provides users with a ranked list of hospitals based on real-time population density information and basic hospital information such as hospital level and distance. We also mine several interesting patterns from these LBS logs using our proposed system

    Unravelling the relative contributions of climate change and ground disturbance to subsurface temperature perturbations: Case studies from Tyneside, UK

    When assessing subsurface urban heat islands (UHIs) it is important to distinguish between localized effects of land-use change and the impacts of global climate change. However, few investigations have successfully unraveled the two influences. We have investigated borehole temperature records from the urban centres of Gateshead and Newcastle upon Tyne in northeast England, to ascertain the effects on subsurface temperatures of climate change and changes in ground conditions due to historic coal mining and more recent urban development. The latter effects are shown to be substantial, albeit with significant variations on a very local scale. Significant subsurface UHIs are indeed evident in both urban centres, estimated as 2.0 °C in Newcastle and 4.5 °C in Gateshead, the former value being comparable to the 1.9 °C atmospheric UHI previously measured for the Tyneside conurbation as a whole. We interpret these substantial subsurface UHIs as a consequence of the region’s long history of urban and industrial development and associated surface energy use, possibly supplemented in Gateshead by the thermal effect of trains braking in an adjacent shallow railway tunnel. We also show that a large proportion of the expected conductive heat flux from the Earth’s interior beneath both Gateshead and Newcastle becomes entrained by groundwater flow and transported elsewhere, through former mineworkings in which the rocks have become ‘permeabilised’ during the region’s long history of coal mining. Discharge of groundwater at a nearby minewater pumping station, Kibblesworth, has a heat flux that we estimate as ∼7.5 MW; it thus ‘captures’ the equivalent of roughly two thirds of the geothermal heat flux through a >100 km² surrounding region.
    Modelling of the associated groundwater flow regime provides first-order estimates of the hydraulic transport properties of ‘permeabilised’ Carboniferous Coal Measures rocks, comprising permeability ∼3 × 10⁻¹¹ m² or ∼30 darcies, hydraulic conductivity ∼2 × 10⁻⁴ m s⁻¹, and transmissivity ∼2 × 10⁻³ m² s⁻¹ or ∼200 m² day⁻¹; these are very high values, comparable to what one might expect for karstified Carboniferous limestone. Furthermore, the large-magnitude subsurface UHIs create significant downward components of conductive heat flow in the shallow subsurface, which are supplemented by downward heat transport by groundwater movement towards the flow network through the former mineworkings. The warm water in these workings has thus been heated, in part, by heat drawn from the shallow subsurface, as well as by heat flowing from the Earth’s interior. Similar conductive heat flow and groundwater flow responses are expected in other urban former coalfield regions of Britain; knowledge of the processes involved may facilitate their use as heat stores and may also contribute to UHI mitigation
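
The paired values quoted in this abstract can be related through standard unit conversions, sketched below. The darcy and seconds-per-day factors are standard; the water density and viscosity used to turn permeability into hydraulic conductivity are assumed values for cool groundwater (roughly 10 °C) and are not given in the abstract, so that step is only an order-of-magnitude check.

```python
DARCY_M2 = 9.869233e-13   # 1 darcy expressed in m^2
SECONDS_PER_DAY = 86_400


def permeability_to_darcies(k_m2):
    """Convert intrinsic permeability from m^2 to darcies."""
    return k_m2 / DARCY_M2


def hydraulic_conductivity(k_m2, density=1000.0, viscosity=1.3e-3, g=9.81):
    """K = k * rho * g / mu, in m/s.

    density (kg/m^3) and viscosity (Pa s) are ASSUMED values for water
    near 10 degrees C; the abstract does not state them.
    """
    return k_m2 * density * g / viscosity


def per_second_to_per_day(value):
    """Convert a rate per second (e.g. m^2/s) to a rate per day."""
    return value * SECONDS_PER_DAY


k = 3e-11  # m^2, permeability quoted in the abstract
print(permeability_to_darcies(k))   # about 30 darcies
print(hydraulic_conductivity(k))    # about 2e-4 m/s under the assumed fluid properties
print(per_second_to_per_day(2e-3))  # about 170 m^2/day, i.e. of order 200
```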

    Mining Mid-level Features for Action Recognition Based on Effective Skeleton Representation

    Recently, mid-level features have shown promising performance in computer vision. Mid-level features learned by incorporating class-level information are potentially more discriminative than traditional low-level local features. In this paper, an effective method is proposed to extract mid-level features from Kinect skeletons for 3D human action recognition. Firstly, the orientations of limbs connected by two skeleton joints are computed and each orientation is encoded into one of 27 states indicating the spatial relationship of the joints. Secondly, limbs are combined into parts and the limbs' states are mapped into part states. Finally, frequent pattern mining is employed to mine the most frequent and relevant (discriminative, representative and non-redundant) states of parts over several consecutive frames. These parts are referred to as Frequent Local Parts or FLPs. The FLPs allow us to build a powerful bag-of-FLP-based action representation. This new representation yields state-of-the-art results on MSR DailyActivity3D and MSR ActionPairs3D
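
The abstract does not say how a limb orientation is mapped to one of 27 states. One plausible reading, shown here purely as an assumption, is that each of the three coordinate differences between the two joints is quantized to negative, near-zero or positive, giving 3 × 3 × 3 = 27 combinations. The function name `limb_state` and the `threshold` parameter are hypothetical.

```python
from itertools import product


def limb_state(joint_a, joint_b, threshold=0.05):
    """Encode the direction from joint_a to joint_b as an integer in 0..26.

    ASSUMED scheme: each axis difference becomes a base-3 digit
    (0 = negative, 1 = near zero, 2 = positive). The threshold for
    "near zero" is a made-up value, not taken from the paper.
    """
    state = 0
    for a, b in zip(joint_a, joint_b):
        diff = b - a
        if abs(diff) <= threshold:
            digit = 1
        elif diff > 0:
            digit = 2
        else:
            digit = 0
        state = state * 3 + digit
    return state


# Every sign combination of (dx, dy, dz) maps to a distinct state.
origin = (0.0, 0.0, 0.0)
all_states = {limb_state(origin, d) for d in product((-1.0, 0.0, 1.0), repeat=3)}
print(len(all_states))
```

Under this reading, a sequence of per-frame state tuples for a body part could then be treated as items for a frequent pattern miner, which is the step the abstract describes next.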