A Statistical Toolbox For Mining And Modeling Spatial Data
Most data mining projects in spatial economics start by evaluating a set of attribute variables on a sample of spatial entities, looking for the existence and strength of spatial autocorrelation using Moran’s and Geary’s coefficients. The adequacy of these coefficients is rarely challenged, even though many users seem likely to make mistakes and foster confusion when reporting on their properties. My paper begins with a critical appraisal of the classical definition and rationale of these indices. I argue that, while intuitively founded, they are plagued by an inconsistency in their conception. I then propose a principled small change leading to corrected spatial autocorrelation coefficients, which strongly simplifies their relationship and opens the way to an augmented toolbox of statistical methods for dimension reduction and data visualization that is also useful for modeling purposes. A second section presents a formal framework, adapted from recent work in statistical learning, which gives theoretical support to the definition of the corrected spatial autocorrelation coefficients. More specifically, the multivariate data mining methods presented here are easily implemented in existing free software and make it practical to exploit the proposed corrections in spatial data analysis. From a mathematical point of view, their asymptotic behavior, already studied in a series of papers by Belkin & Niyogi, suggests that they are robust and have limited sensitivity to the Modifiable Areal Unit Problem (MAUP), properties that are valuable in exploratory spatial data analysis.
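For readers unfamiliar with the two indices named in this abstract, the following is a minimal sketch of the classical (textbook) Moran's I and Geary's C, not the corrected coefficients the paper proposes. The function names, the dense list-of-lists weight matrix `w`, and the toy chain of four spatial units are assumptions made for illustration only.

```python
def morans_i(x, w):
    """Classical Moran's I for values x and spatial weight matrix w (w[i][j] >= 0)."""
    n = len(x)
    mean = sum(x) / n
    z = [v - mean for v in x]                      # deviations from the mean
    s0 = sum(sum(row) for row in w)                # sum of all weights
    cross = sum(w[i][j] * z[i] * z[j] for i in range(n) for j in range(n))
    return (n / s0) * cross / sum(v * v for v in z)


def gearys_c(x, w):
    """Classical Geary's C for values x and spatial weight matrix w."""
    n = len(x)
    mean = sum(x) / n
    s0 = sum(sum(row) for row in w)
    sq_diff = sum(w[i][j] * (x[i] - x[j]) ** 2 for i in range(n) for j in range(n))
    return ((n - 1) / (2 * s0)) * sq_diff / sum((v - mean) ** 2 for v in x)


# Hypothetical example: four units on a line, each adjacent to its neighbours.
chain = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
smooth = [1.0, 2.0, 3.0, 4.0]        # neighbours have similar values
alternating = [1.0, -1.0, 1.0, -1.0]  # neighbours have opposite values

print(morans_i(smooth, chain), gearys_c(smooth, chain))
print(morans_i(alternating, chain), gearys_c(alternating, chain))
```

Under these classical definitions, positive autocorrelation shows up as a Moran's I above its expectation and a Geary's C below 1, while the alternating pattern gives the opposite signs. The two indices are normalised differently (n versus n - 1, cross-products versus squared differences), which is one reason their relationship is not immediately transparent.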
KBGAN: Adversarial Learning for Knowledge Graph Embeddings
We introduce KBGAN, an adversarial learning framework to improve the
performances of a wide range of existing knowledge graph embedding models.
Because knowledge graphs typically only contain positive facts, sampling useful
negative training examples is a non-trivial task. Replacing the head or tail
entity of a fact with a uniformly randomly selected entity is a conventional
method for generating negative facts, but the majority of the generated
negative facts can be easily discriminated from positive facts, and will
contribute little towards the training. Inspired by generative adversarial
networks (GANs), we use one knowledge graph embedding model as a negative
sample generator to assist the training of our desired model, which acts as the
discriminator in GANs. This framework is independent of the concrete form of
generator and discriminator, and therefore can utilize a wide variety of
knowledge graph embedding models as its building blocks. In experiments, we
adversarially train two translation-based models, TransE and TransD, each with
assistance from one of the two probability-based models, DistMult and ComplEx.
We evaluate the performances of KBGAN on the link prediction task, using three
knowledge base completion datasets: FB15k-237, WN18 and WN18RR. Experimental
results show that adversarial training substantially improves the performances
of target embedding models under various settings. Comment: To appear at NAACL HLT 201
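The conventional baseline that this abstract contrasts itself with, replacing the head or tail of a fact with a uniformly random entity, can be sketched as follows. This is not KBGAN itself, only the uniform sampler it aims to improve on. The function name `corrupt_uniform`, the `max_tries` guard, the rejection of candidates that are already known facts, and the toy facts are all assumptions for illustration.

```python
import random


def corrupt_uniform(triple, entities, known_facts, rng, max_tries=100):
    """Return a negative triple by replacing the head or the tail of `triple`
    with an entity drawn uniformly at random.

    Candidates that are already known positive facts are rejected
    (an assumption of this sketch; some setups skip this filtering).
    """
    head, relation, tail = triple
    for _ in range(max_tries):
        candidate = rng.choice(entities)
        if rng.random() < 0.5:
            negative = (candidate, relation, tail)   # corrupt the head
        else:
            negative = (head, relation, candidate)   # corrupt the tail
        if negative not in known_facts:
            return negative
    raise RuntimeError("could not sample a negative triple")


# Hypothetical toy knowledge graph.
entities = ["paris", "france", "berlin", "germany", "rome", "italy"]
facts = {
    ("paris", "capital_of", "france"),
    ("berlin", "capital_of", "germany"),
    ("rome", "capital_of", "italy"),
}

rng = random.Random(0)
positive = ("paris", "capital_of", "france")
negatives = [corrupt_uniform(positive, entities, facts, rng) for _ in range(5)]
print(negatives)
```

Many negatives produced this way, such as a country placed in the head slot of `capital_of`, are trivially implausible. That is the weakness the abstract describes: such examples are easy to discriminate from positive facts and contribute little to training.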
Exploring Student Check-In Behavior for Improved Point-of-Interest Prediction
With the availability of vast amounts of user visitation history on
location-based social networks (LBSN), the problem of Point-of-Interest (POI)
prediction has been extensively studied. However, much of the research has been
conducted solely on voluntary checkin datasets collected from social apps such
as Foursquare or Yelp. While these data contain rich information about
recreational activities (e.g., restaurants, nightlife, and entertainment),
information about more prosaic aspects of people's lives is sparse. This not
only limits our understanding of users' daily routines, but more importantly
the modeling assumptions developed based on characteristics of recreation-based
data may not be suitable for richer check-in data. In this work, we present an
analysis of education "check-in" data using WiFi access logs collected at
Purdue University. We propose a heterogeneous graph-based method to encode the
correlations between users, POIs, and activities, and then jointly learn
embeddings for the vertices. We compare our method with previous
state-of-the-art POI prediction methods, and show that the assumptions made by
previous methods significantly degrade performance on our data with dense(r)
activity signals. We also show how our learned embeddings could be used to
identify similar students (e.g., for friend suggestions). Comment: published in KDD'1
Conditional t-SNE: Complementary t-SNE embeddings through factoring out prior information
Dimensionality reduction and manifold learning methods such as t-Distributed
Stochastic Neighbor Embedding (t-SNE) are routinely used to map
high-dimensional data into a 2-dimensional space to visualize and explore the
data. However, two dimensions are typically insufficient to capture all
structure in the data, the salient structure is often already known, and it is
not obvious how to extract the remaining information in a similarly effective
manner. To fill this gap, we introduce \emph{conditional t-SNE} (ct-SNE), a
generalization of t-SNE that discounts prior information from the embedding in
the form of labels. To achieve this, we propose a conditioned version of the
t-SNE objective, obtaining a single, integrated, and elegant method. ct-SNE has
one extra parameter over t-SNE; we investigate its effects and show how to
efficiently optimize the objective. Factoring out prior knowledge allows
complementary structure to be captured in the embedding, providing new
insights. Qualitative and quantitative empirical results on synthetic and
(large) real data show ct-SNE is effective and achieves its goal.
PlaNet - Photo Geolocation with Convolutional Neural Networks
Is it possible to build a system to determine the location where a photo was
taken using just its pixels? In general, the problem seems exceptionally
difficult: it is trivial to construct situations where no location can be
inferred. Yet images often contain informative cues such as landmarks, weather
patterns, vegetation, road markings, and architectural details, which in
combination may allow one to determine an approximate location and occasionally
an exact location. Websites such as GeoGuessr and View from your Window suggest
that humans are relatively good at integrating these cues to geolocate images,
especially en-masse. In computer vision, the photo geolocation problem is
usually approached using image retrieval methods. In contrast, we pose the
problem as one of classification by subdividing the surface of the earth into
thousands of multi-scale geographic cells, and train a deep network using
millions of geotagged images. While previous approaches only recognize
landmarks or perform approximate matching using global image descriptors, our
model is able to use and integrate multiple visible cues. We show that the
resulting model, called PlaNet, outperforms previous approaches and even
attains superhuman levels of accuracy in some cases. Moreover, we extend our
model to photo albums by combining it with a long short-term memory (LSTM)
architecture. By learning to exploit temporal coherence to geolocate uncertain
photos, we demonstrate that this model achieves a 50% performance improvement
over the single-image model.
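The idea of posing geolocation as classification can be illustrated by turning a latitude/longitude pair into a discrete class label. The sketch below uses a fixed, uniform latitude/longitude grid, which is a deliberate simplification and not the multi-scale cell partition described in the abstract. The function names and the `cell_deg` parameter are hypothetical, and `cell_deg` is assumed to divide 180 evenly.

```python
def latlon_to_cell(lat, lon, cell_deg=10.0):
    """Map a coordinate to the integer id of its grid cell (the class label)."""
    if not (-90.0 <= lat <= 90.0 and -180.0 <= lon <= 180.0):
        raise ValueError("coordinate out of range")
    n_rows = int(180 / cell_deg)
    n_cols = int(360 / cell_deg)
    # Clamp so that lat == 90 and lon == 180 fall into the last row/column.
    row = min(int((lat + 90.0) / cell_deg), n_rows - 1)
    col = min(int((lon + 180.0) / cell_deg), n_cols - 1)
    return row * n_cols + col


def cell_center(cell_id, cell_deg=10.0):
    """Return the (lat, lon) centre of a cell, usable as the predicted location."""
    n_cols = int(360 / cell_deg)
    row, col = divmod(cell_id, n_cols)
    return (-90.0 + (row + 0.5) * cell_deg, -180.0 + (col + 0.5) * cell_deg)


# A classifier would be trained to predict `label` from the image pixels;
# at inference time the predicted cell is mapped back to a coordinate.
label = latlon_to_cell(48.8566, 2.3522)   # roughly Paris
print(label, cell_center(label))
```

A uniform grid wastes classes on oceans and is too coarse in densely photographed areas, which is a plausible motivation for the multi-scale cells mentioned in the abstract.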
Strategies for embedding eLearning in traditional universities: drivers and barriers
This paper addresses the question: how can elearning be embedded in traditional universities so that it contributes to the transformation of the university? The paper examines elearning strategies in higher education, locating the institutional context within the broader framework of national and international policy drivers which link elearning with the achievement of strategic goals such as widening access to lifelong learning, and upskilling for the knowledge and information society. The focus will be on traditional universities i.e. universities whose main form of teaching is on-campus and face-to-face, rather than on open and distance teaching universities, which face different strategic issues in implementing elearning.
Reports on the adoption of elearning in traditional universities indicate extensive use of elearning to improve the quality of learning for on-campus students, but this has not yet translated into a significant increase in opportunities for lifelong learners in the workforce and those unable to attend on-campus. One vision of the future of universities is that ‘Virtualisation and remote working technologies will enable us to study at any university in the world, from home’. However, this paper will point out that realising this vision of ubiquitous and lifelong access to higher education requires a fully articulated elearning strategy that aims to have a ‘transformative’ rather than just a ‘sustaining’ effect on the teaching functions carried out in traditional universities. In other words, rather than just helping universities to improve their teaching, elearning should transform how universities currently teach. To achieve this transformation, universities will have to introduce strategies and policies which implement flexible academic frameworks, innovative pedagogical approaches, new forms of assessment, cross-institutional accreditation and credit transfer agreements, institutional collaboration in development and delivery, and, most crucially, a commitment to equivalence of access for students on and off campus.
The insights in this paper are drawn from an action research case study involving both qualitative and quantitative approaches, utilising interviews, surveys and focus groups with stakeholders, in addition to comparative research on international best practice. The paper will review the drivers and rationales at international, national and institutional level which are leading to the development of elearning strategies, before outlining the outcomes of a case study of elearning strategy development in a traditional Irish university. This study examined the drivers and barriers which increase or decrease motivation to engage in elearning, and provides some insights into the challenges of embedding elearning in higher education. While recognising the desirability of reaching out to new students and engaging in innovative pedagogical approaches, many academic staff continue to prefer traditional lectures, and are sceptical about the potential for student learning in online settings. Extrinsic factors in terms of lack of time and support serve to decrease motivation and there are also fears of loss of academic control to central administration.
The paper concludes with some observations on how university elearning strategies must address staff concerns through capacity building, awareness raising and the establishment of effective support structures for embedding elearning.