    The SpatialCIM methodology for spatial document coverage disambiguation and the entity recognition process aided by linguistic techniques.

    Abstract. Users increasingly take the geographical location of documents into account during information retrieval. However, conventional information retrieval systems based on keyword matching do not consider which words can represent geographical entities that are spatially related to other entities in the document. This paper presents the SpatialCIM methodology, which is based on three steps: pre-processing, data expansion, and disambiguation. In the pre-processing step, entity recognition is carried out with the support of the Rembrandt tool. Additionally, the paper compares the performance of the Rembrandt tool in discovering location entities in texts against that of a controlled vocabulary of Brazilian geographic locations. The comparison uses a set of geographically labeled news items, in Portuguese, covering sugar cane cultivation. The results showed an F-measure increase for the Rembrandt tool from 45% in the non-disambiguated process to 50% after disambiguation, and from 35% to 38% using the controlled vocabulary. The results also showed that the Rembrandt tool has only a minimal difference between precision and recall, although the controlled vocabulary always has the highest recall values.
    GeoDoc 2012, PAKDD 2012
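    The F-measure reported above combines precision and recall into one score. The following is a minimal sketch of that computation for a set-based entity recognition evaluation. The function names and the place names in the example are hypothetical illustrations and are not taken from the paper or from the Rembrandt tool.

```python
def precision_recall(predicted: set, gold: set) -> tuple:
    """Precision and recall of predicted location entities against gold labels."""
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


def f_measure(precision: float, recall: float, beta: float = 1.0) -> float:
    """Weighted harmonic mean of precision and recall (F1 when beta == 1)."""
    denominator = beta ** 2 * precision + recall
    if denominator == 0:
        return 0.0  # no correct predictions at all
    return (1 + beta ** 2) * precision * recall / denominator


# Hypothetical example: locations labeled by hand vs. locations found by a tool.
gold_locations = {"Piracicaba", "Campinas", "Sertaozinho", "Ribeirao Preto"}
found_locations = {"Piracicaba", "Campinas", "Santos"}

p, r = precision_recall(found_locations, gold_locations)
print(f"precision={p:.3f} recall={r:.3f} F1={f_measure(p, r):.3f}")
```

    A tool with high recall but low precision, as the abstract reports for the controlled vocabulary, can still end up with a lower F-measure than a tool whose precision and recall are balanced.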

    A Geo-Statistical Approach for Crime hot spot Prediction

    Crime hot spot prediction is a challenging task at present. Effective models are needed that can handle large crime datasets and predict future crime locations. Spatio-temporal data mining is very useful for dealing with geographical crime data. In this paper, a spatial clustering technique based on sparse matrix analysis is used for a serial crime prediction model. First, the crime data are preprocessed through various distribution techniques. The spatial clustering technique based on sparse matrix analysis is then applied to four years of time series data, from 2010 to 2014, for major cities of India such as Delhi, Mumbai, Kolkata, and Chennai, to find the hot spot locations for the next year. After that, three clustering techniques are used to group similar crime incidents. Finally, the cluster results obtained from the original and proposed datasets are compared. The main objective of this research is to apply a crime prediction technique that forecasts and detects future crime locations and their probability.

    Introducing an annotated bibliography on temporal and evolution aspects in the World Wide Web

    Prediction of peptides binding to MHC class I and II alleles by temporal motif mining

    Background: MHC (Major Histocompatibility Complex) is a key player in the immune response of most vertebrates. The computational prediction of whether a given antigenic peptide will bind to a specific MHC allele is important in the development of vaccines for emerging pathogens, in the creation of possibilities for controlling immune response, and in applications of immunotherapy. One of the problems that make this computational prediction difficult is the detection of the binding core region in peptides, coupled with the presence of bulges and loops that cause variations in the total sequence length. Most machine learning methods require the sequences to be of the same length to successfully discover the binding motifs, ignoring the length variance in both the motif mining and prediction steps. To overcome this limitation, we propose the use of time-based motif mining methods that work position-independently. Results: The prediction method was tested on a benchmark set of 28 different alleles for MHC class I and 27 different alleles for MHC class II. The obtained results are comparable to the state-of-the-art methods for both MHC classes, surpassing the published results for some alleles. The average prediction AUC values are 0.897 for class I and 0.858 for class II. Conclusions: Temporal motif mining using partial periodic patterns can capture information about the sequences well enough to predict the binding of the peptides and is comparable to state-of-the-art methods in the literature. Unlike neural networks or matrix-based predictors, our proposed method does not depend on peptide length and can work with both short and long fragments. This advantage allows better use of the available training data and the prediction of peptides of uncommon lengths.

    Mining Exceptional Social Behaviour

    Essentially, our lives are made of social interactions. These can be recorded through personal gadgets as well as through sensors attached to people for research purposes. In particular, such sensors may record the real-time location of people. This location data can then be used to infer interactions, which may be translated into behavioural patterns. In this paper, we focus on the automatic discovery of exceptional social behaviour from spatio-temporal data. For that, we propose a method for Exceptional Behaviour Discovery (EBD). The proposed method combines Subgroup Discovery and Network Science techniques for finding social behaviour that deviates from the norm. In particular, it transforms movement and demographic data into attributed social interaction networks, and returns descriptive subgroups. We applied the proposed method to two real datasets containing location data from children playing in the school playground. Our results indicate that this is a valid approach which is able to obtain meaningful knowledge from the data. This work has been partially supported by the German Research Foundation (DFG) project “MODUS” (under grant AT 88/4-1). Furthermore, the research leading to these results has received funding (JG) from ESRC grant ES/N006577/1. This work was financed by the project Kids First, project number 68639.

    Improved partitioning technique for density cube-based spatio-temporal clustering method

    This work proposes a novel partitioning technique on the density-cube-based data model for the spatio-temporal clustering method. This work further adapts this clustering approach to spatio-temporal data. We have compared IMSTAGRID, the proposed algorithm, to the ST-DBSCAN, AGRID+, and ST-AGRID algorithms and have found that the IMSTAGRID algorithm improves the data partitioning technique and the interval expansion technique and is able to achieve uniformity in the spatial and temporal dimensional values. Three spatio-temporal data sets have been used in this experiment: a storm data set and two synthetic data sets (synthetic data set 1 and synthetic data set 2). The storm data set and synthetic data set 2 were comparable in terms of the scattering of the data points, while synthetic data set 1 contained clustered data. The performance of the IMSTAGRID clustering method was measured via a silhouette analysis, and its results surpassed those of the other algorithms investigated; the silhouette index was 0.970 for synthetic data set 2 and 0.993 for synthetic data set 1. The IMSTAGRID algorithm also outperformed the baseline algorithms (ST-DBSCAN, AGRID+, and ST-AGRID) in labeling accuracy for the storm data set, with 82.68% for IMSTAGRID against 38.36%, 76.13%, and 78.66% for the baselines, respectively.
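    The silhouette analysis mentioned above scores how well each point sits inside its assigned cluster. The following is a minimal sketch of the standard silhouette coefficient, assuming points are (x, y, t) tuples compared with plain Euclidean distance. It does not reproduce the IMSTAGRID evaluation; the function name, the sample points, and the distance choice are assumptions made for illustration.

```python
from collections import defaultdict
from math import dist  # Euclidean distance, Python 3.8+


def silhouette_score(points, labels):
    """Mean silhouette coefficient over all points (ranges from -1 to 1)."""
    clusters = defaultdict(list)
    for index, label in enumerate(labels):
        clusters[label].append(index)
    if len(clusters) < 2:
        raise ValueError("silhouette analysis needs at least two clusters")

    scores = []
    for i, label in enumerate(labels):
        own = clusters[label]
        if len(own) == 1:
            scores.append(0.0)  # convention for singleton clusters
            continue
        # a: mean distance to the other members of the point's own cluster
        a = sum(dist(points[i], points[j]) for j in own if j != i) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(dist(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != label
        )
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return sum(scores) / len(scores)


# Hypothetical (x, y, t) samples forming two well-separated groups.
samples = [(0, 0, 0), (0, 1, 0), (10, 0, 5), (10, 1, 5)]
print(silhouette_score(samples, [0, 0, 1, 1]))  # close to 1: compact clusters
print(silhouette_score(samples, [0, 1, 0, 1]))  # negative: poor assignment
```

    Values near 1, such as the 0.970 and 0.993 reported in the abstract, indicate compact and well-separated clusters. For real spatio-temporal data the spatial and temporal axes would normally be rescaled first, since an unscaled Euclidean distance mixes different units.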

    Literature Review on Temporal, Spatial, and Spatiotemporal Data Models

    This paper reviews papers on temporal databases, spatial databases, and spatio-temporal databases