Mining Top-k Closed Co-location Patterns
A spatial co-located event set is a set of spatial events that are frequently observed together in nearby geographic space. Spatial co-location patterns can provide useful information in many application domains, such as business, ecology, public health and criminology.
Subjectively Interesting Subgroup Discovery on Real-valued Targets
Deriving insights from high-dimensional data is one of the core problems in
data mining. The difficulty mainly stems from the fact that there are
exponentially many variable combinations to potentially consider, and there are
infinitely many if we consider weighted combinations, even for linear
combinations. Hence, an obvious question is whether we can automate the search
for interesting patterns and visualizations. In this paper, we consider the
setting where a user wants to learn as efficiently as possible about
real-valued attributes, for example to understand the distribution of crime
rates in different geographic areas in terms of other (numerical, ordinal
and/or categorical) variables that describe the areas. We introduce a method to
find subgroups in the data that are maximally informative (in the formal
information-theoretic sense) with respect to one or more real-valued
target attributes. The subgroup descriptions are in terms of a succinct set of
arbitrarily-typed other attributes. The approach is based on the Subjective
Interestingness framework FORSIED to enable the use of prior knowledge when
finding the most informative non-redundant patterns, and hence the method also
supports iterative data mining. Comment: 12 pages, 10 figures, 2 tables, conference submission
Geo-Spotting: Mining Online Location-based Services for Optimal Retail Store Placement
The problem of identifying the optimal location for a new retail store has
been the focus of past research, especially in the field of land economy, due
to its importance in the success of a business. Traditional approaches to the
problem have factored in demographics, revenue and aggregated human flow
statistics from nearby or remote areas. However, the acquisition of relevant
data is usually expensive. With the growth of location-based social networks,
fine grained data describing user mobility and popularity of places has
recently become attainable.
In this paper we study the predictive power of various machine learning
features on the popularity of retail stores in the city through the use of a
dataset collected from Foursquare in New York. The features we mine are based
on two general signals: geographic, where features are formulated according to
the types and density of nearby places, and user mobility, which includes
transitions between venues or the incoming flow of mobile users from distant
areas. Our evaluation suggests that the best performing features are common
across the three different commercial chains considered in the analysis,
although variations may exist too, as explained by heterogeneities in the way
retail facilities attract users. We also show that performance improves
significantly when combining multiple features in supervised learning
algorithms, suggesting that the retail success of a business may depend on
multiple factors. Comment: Proceedings of the 19th ACM SIGKDD International Conference on
Knowledge Discovery and Data Mining, Chicago, 2013, Pages 793-80
Testing Interestingness Measures in Practice: A Large-Scale Analysis of Buying Patterns
Understanding customer buying patterns is of great interest to the retail
industry and has been shown to benefit a wide variety of goals ranging from managing
stocks to implementing loyalty programs. Association rule mining is a common
technique for extracting correlations such as "people in the South of France
buy rosé wine" or "customers who buy pâté also buy salted butter and sour
bread." Unfortunately, sifting through a high number of buying patterns is not
useful in practice, because of the predominance of popular products in the top
rules. As a result, a number of "interestingness" measures (over 30) have been
proposed to rank rules. However, there is no agreement on which measures are
more appropriate for retail data. Moreover, since pattern mining algorithms
output thousands of association rules for each product, the ability of an
analyst to rely on ranking measures to identify the most interesting ones is
crucial. In this paper, we develop CAPA (Comparative Analysis of PAtterns), a
framework that provides analysts with the ability to compare the outcome of
interestingness measures applied to buying patterns in the retail industry. We
report on how we used CAPA to compare 34 measures applied to over 1,800 stores
of Intermarché, one of the largest food retailers in France.
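As a minimal sketch of what an "interestingness" measure computes, the Python snippet below evaluates three textbook measures (support, confidence and lift) for a single association rule. It is not CAPA and does not reproduce the 34 measures compared in the paper; the function name `rule_measures` and the toy baskets are hypothetical and chosen only for illustration.

```python
def rule_measures(transactions, antecedent, consequent):
    """Support, confidence and lift of the rule antecedent -> consequent.

    Illustrative helper only (hypothetical name, not part of CAPA).
    """
    a, c = set(antecedent), set(consequent)
    baskets = [set(t) for t in transactions]
    n = len(baskets)
    n_a = sum(1 for t in baskets if a <= t)         # baskets containing the antecedent
    n_c = sum(1 for t in baskets if c <= t)         # baskets containing the consequent
    n_ac = sum(1 for t in baskets if (a | c) <= t)  # baskets containing both

    support = n_ac / n
    confidence = n_ac / n_a if n_a else 0.0
    # Lift compares the rule's confidence with the consequent's base rate.
    lift = confidence / (n_c / n) if n_c else 0.0
    return {"support": support, "confidence": confidence, "lift": lift}


# Toy data, invented for this example.
baskets = [
    {"pate", "butter", "bread"},
    {"pate", "butter"},
    {"butter", "bread"},
    {"wine"},
]
measures = rule_measures(baskets, {"pate"}, {"butter"})
print(measures)
```

A lift above 1 indicates that the consequent appears more often with the antecedent than its overall popularity would predict, which is one way to discount rules dominated by popular products.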
Population Density-based Hospital Recommendation with Mobile LBS Big Data
The difficulty of getting medical treatment is one of the major livelihood issues
in China. Since patients lack prior knowledge about the spatial distribution
and the capacity of hospitals, some hospitals have abnormally high or sporadic
population densities. This paper presents a new model for estimating the
spatiotemporal population density in each hospital based on location-based
service (LBS) big data, which would be beneficial to guiding and dispersing
outpatients. To improve the estimation accuracy, several approaches are
proposed to denoise the LBS data and classify people by detecting their various
behaviors. In addition, a long short-term memory (LSTM) based deep learning model is
presented to predict the trend of population density. Using Baidu's
large-scale LBS log database, we apply the proposed model to 113 hospitals in
Beijing, P. R. China, and construct an online hospital recommendation system
that provides users with a ranked list of hospitals based on real-time
population density information and basic hospital information such as
hospital level and distance. We also mine several interesting
patterns from these LBS logs using our proposed system.
Unravelling the relative contributions of climate change and ground disturbance to subsurface temperature perturbations: Case studies from Tyneside, UK
When assessing subsurface urban heat islands (UHIs) it is important to distinguish between localized effects of land-use change and the impacts of global climate change. However, few investigations have successfully unraveled the two influences. We have investigated borehole temperature records from the urban centres of Gateshead and Newcastle upon Tyne in northeast England, to ascertain the effects on subsurface temperatures of climate change and changes in ground conditions due to historic coal mining and more recent urban development. The latter effects are shown to be substantial, albeit with significant variations on a very local scale. Significant subsurface UHIs are indeed evident in both urban centres, estimated as 2.0 °C in Newcastle and 4.5 °C in Gateshead, the former value being comparable to the 1.9 °C atmospheric UHI previously measured for the Tyneside conurbation as a whole. We interpret these substantial subsurface UHIs as a consequence of the region’s long history of urban and industrial development and associated surface energy use, possibly supplemented in Gateshead by the thermal effect of trains braking in an adjacent shallow railway tunnel. We also show that a large proportion of the expected conductive heat flux from the Earth’s interior beneath both Gateshead and Newcastle becomes entrained by groundwater flow and transported elsewhere, through former mineworkings in which the rocks have become ‘permeabilised’ during the region’s long history of coal mining. Discharge of groundwater at a nearby minewater pumping station, Kibblesworth, has a heat flux that we estimate as ∼7.5 MW; it thus ‘captures’ the equivalent of roughly two thirds of the geothermal heat flux through a >100 km² surrounding region.
Modelling of the associated groundwater flow regime provides first-order estimates of the hydraulic transport properties of ‘permeabilised’ Carboniferous Coal Measures rocks, comprising permeability ∼3 × 10⁻¹¹ m² or ∼30 darcies, hydraulic conductivity ∼2 × 10⁻⁴ m s⁻¹, and transmissivity ∼2 × 10⁻³ m² s⁻¹ or ∼200 m² day⁻¹; these are very high values, comparable to what one might expect for karstified Carboniferous limestone. Furthermore, the large-magnitude subsurface UHIs create significant downward components of conductive heat flow in the shallow subsurface, which are supplemented by downward heat transport by groundwater movement towards the flow network through the former mineworkings. The warm water in these workings has thus been heated, in part, by heat drawn from the shallow subsurface, as well as by heat flowing from the Earth’s interior. Similar conductive heat flow and groundwater flow responses are expected in other urban former coalfield regions of Britain; knowledge of the processes involved may facilitate their use as heat stores and may also contribute to UHI mitigation.
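The unit conversions quoted in this abstract can be checked with a few lines of Python. This is only a rough consistency sketch, not the authors' groundwater model: the water density, gravitational acceleration and viscosity below are assumed values (roughly cool groundwater), and the standard relation K = kρg/μ is used to relate permeability to hydraulic conductivity.

```python
DARCY_IN_M2 = 9.869233e-13   # one darcy expressed in square metres
SECONDS_PER_DAY = 86_400


def permeability_in_darcies(k_m2):
    """Convert intrinsic permeability from m^2 to darcies."""
    return k_m2 / DARCY_IN_M2


def hydraulic_conductivity(k_m2, density=1000.0, gravity=9.81, viscosity=1.3e-3):
    """K = k * rho * g / mu, in m/s.

    density (kg/m^3), gravity (m/s^2) and viscosity (Pa s) are ASSUMED
    values for cool groundwater; the abstract does not state them.
    """
    return k_m2 * density * gravity / viscosity


k = 3e-11                 # permeability quoted in the abstract, m^2
transmissivity = 2e-3     # transmissivity quoted in the abstract, m^2/s

darcies = permeability_in_darcies(k)                        # about 30 darcies
conductivity = hydraulic_conductivity(k)                    # about 2e-4 m/s
transmissivity_per_day = transmissivity * SECONDS_PER_DAY   # 172.8 m^2/day

print(f"{darcies:.1f} darcies")
print(f"{conductivity:.2e} m/s")
print(f"{transmissivity_per_day:.1f} m^2/day")
```

Under these assumptions the three results agree with the rounded figures in the abstract (∼30 darcies, ∼2 × 10⁻⁴ m s⁻¹, ∼200 m² day⁻¹).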
Mining Mid-level Features for Action Recognition Based on Effective Skeleton Representation
Recently, mid-level features have shown promising performance in computer
vision. Mid-level features learned by incorporating class-level information are
potentially more discriminative than traditional low-level local features. In
this paper, an effective method is proposed to extract mid-level features from
Kinect skeletons for 3D human action recognition. Firstly, the orientations of
limbs connected by two skeleton joints are computed and each orientation is
encoded into one of the 27 states indicating the spatial relationship of the
joints. Secondly, limbs are combined into parts and the limb states are
mapped into part states. Finally, frequent pattern mining is employed to mine
the most frequent and relevant (discriminative, representative and
non-redundant) states of parts across several consecutive frames. These parts are
referred to as Frequent Local Parts or FLPs. The FLPs allow us to build
a powerful bag-of-FLP-based action representation. This new representation yields
state-of-the-art results on MSR DailyActivity3D and MSR ActionPairs3D.
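The abstract does not say how a limb orientation is mapped to one of 27 states. One plausible reading, sketched below in Python, is that each of the three components of the limb's unit direction vector is quantised to negative, near zero or positive, giving 3 × 3 × 3 = 27 combinations. The function name `limb_state`, the threshold of 0.3 and the base-3 numbering are assumptions made for illustration, not the paper's definition.

```python
import math


def limb_state(joint_a, joint_b, threshold=0.3):
    """Quantise the direction from joint_a to joint_b into a state in 0..26.

    ASSUMED scheme: each axis of the unit direction vector becomes a
    base-3 digit (0 = negative, 1 = near zero, 2 = positive).
    """
    direction = [b - a for a, b in zip(joint_a, joint_b)]
    norm = math.sqrt(sum(x * x for x in direction))
    if norm == 0.0:
        return 13  # coincident joints: all three digits are "near zero"

    state = 0
    for x in direction:
        u = x / norm
        if u > threshold:
            digit = 2
        elif u < -threshold:
            digit = 0
        else:
            digit = 1
        state = state * 3 + digit
    return state


# Hypothetical joint coordinates (x, y, z) in metres.
up = limb_state((0.0, 0.0, 0.0), (0.0, 1.0, 0.0))
diagonal = limb_state((0.0, 0.0, 0.0), (1.0, 1.0, 1.0))
opposite = limb_state((1.0, 1.0, 1.0), (0.0, 0.0, 0.0))
print(up, diagonal, opposite)
```

Sequences of such discrete states per limb are the kind of input a frequent pattern miner can work with, which is what the later stages described in the abstract build on.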