874 research outputs found

    Multi-Source Spatial Entity Extraction and Linkage

    Answering skyline queries over incomplete data with crowdsourcing (Extended Abstract)

    Coping with new Challenges in Clustering and Biomedical Imaging

    Recent years have seen a tremendous increase in data acquisition in scientific fields such as molecular biology, bioinformatics, and biomedicine. Novel methods are therefore needed for the automatic processing and analysis of this large amount of data. Data mining is the process of applying methods like clustering or classification to large databases in order to uncover hidden patterns. Clustering is the task of partitioning the points of a data set into distinct groups so as to maximize intra-cluster similarity and minimize inter-cluster similarity. In contrast to unsupervised learning such as clustering, classification is a supervised learning problem that aims to predict the group membership of data objects on the basis of rules learned from a training set in which the group membership is known. Specialized methods have been proposed for hierarchical and partitioning clustering. However, these methods suffer from several drawbacks. In the first part of this work, new clustering methods are proposed that cope with problems of conventional clustering algorithms. ITCH (Information-Theoretic Cluster Hierarchies) is a hierarchical clustering method based on a hierarchical variant of the Minimum Description Length (MDL) principle, which finds hierarchies of clusters without requiring input parameters. As ITCH may converge only to a local optimum, we propose GACH (Genetic Algorithm for Finding Cluster Hierarchies), which combines the benefits of genetic algorithms with information theory. In this way the search space is explored more effectively. Furthermore, we propose INTEGRATE, a novel clustering method for data with mixed numerical and categorical attributes. Supported by the MDL principle, our method integrates the information provided by heterogeneous numerical and categorical attributes and thus naturally balances the influence of both sources of information. A competitive evaluation illustrates that INTEGRATE is more effective than existing clustering methods for mixed-type data. Besides clustering methods for single data objects, we provide a solution for clustering different data sets that are represented by their skylines. The skyline operator is a well-established database primitive for finding database objects that minimize two or more attributes with an unknown weighting between these attributes. In this thesis, we define a similarity measure, called SkyDist, for comparing skylines of different data sets that can be integrated directly into data mining tasks such as clustering or classification. The experiments show that SkyDist in combination with different clustering algorithms can give useful insights into many applications. In the second part, we focus on the analysis of high-resolution magnetic resonance images (MRI) that are clinically relevant and may allow for the early detection and diagnosis of several diseases. In particular, we propose a framework for the classification of Alzheimer's disease in MR images that combines the data mining steps of feature selection, clustering, and classification. As a result, a set of highly selective features discriminating patients with Alzheimer's disease from healthy people has been identified. However, the analysis of high-dimensional MR images is extremely time-consuming. Therefore, we developed JGrid, a scalable distributed computing solution designed to allow for large-scale analysis of MRI and thus an optimized prediction of diagnosis. In another study, we apply efficient algorithms for motif discovery to task-fMRI scans in order to identify patterns in the brain that are characteristic of patients with somatoform pain disorder. We find groups of brain compartments that occur frequently within the brain networks and discriminate well between healthy and diseased people.

    Service selection and transactional management for web service composition

    [no abstract]

    Off-Season Tourists and the Cultural Offer of a Mass-Tourism Destination: The Case of Rimini

    This paper assesses the potential implications for off-season tourism of enhancing the cultural offer of Rimini, a popular Italian seaside holiday destination. Rimini, a city of about 130,000 people, hosts a total of around 12 million overnight stays, 10 million of which are concentrated in the summer months. Over the last twenty years or so, Rimini has pursued a policy of deseasoning, which mainly pivots around business tourism (a new fair quarter and important conference venues have been built) and cultural tourism (the city has been investing in both its cultural heritage and art exhibitions). The assessment is carried out through discrete choice experiments submitted to a sample of about 800 off-season tourists, that is, tourists who visited Rimini outside the summer months. Since tourism can be viewed as a composite good whose overall utility depends on the arrangement of its component characteristics, the choice experiments make it possible to disentangle the importance that tourists attach to different levels of the holiday's characteristics and their willingness to pay for them. The choice model incorporates as attributes a number of possible changes to actual tourism features (which are also the subject of public debate), including them in hypothetical alternative "holiday packages". The conditional logit analysis of the choice experiments can highlight the potential synergies and trade-offs between cultural and business tourism. Moreover, the methodology and the structure of the questionnaire allow a partial comparison of our findings with results from two previous studies carried out in Rimini, on summer tourists and on residents respectively. This comparison highlights synergies and trade-offs between off-season tourists, summer tourists, and residents. Keywords: tourism demand; cultural tourism; business tourism; conditional logit; urban planning; choice experiments
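The conditional logit model mentioned in this abstract can be illustrated with a minimal sketch. It assumes the standard textbook form, in which each holiday package has a linear utility and the choice probability is a softmax over utilities; the attribute names (`price`, `art_exhibition`) and the coefficient values are hypothetical and are not estimates from the paper.

```python
import math

def utility(attributes, coefficients):
    """Linear utility: sum of coefficient * attribute level."""
    return sum(coefficients[name] * value for name, value in attributes.items())

def choice_probabilities(utilities):
    """Conditional logit choice probabilities (softmax over utilities)."""
    highest = max(utilities)  # subtract the maximum for numerical stability
    weights = [math.exp(u - highest) for u in utilities]
    total = sum(weights)
    return [w / total for w in weights]

def willingness_to_pay(coefficients, attribute, price_name="price"):
    """Marginal willingness to pay: minus the attribute/price coefficient ratio."""
    return -coefficients[attribute] / coefficients[price_name]

# Hypothetical coefficients, for illustration only.
coefficients = {"price": -0.01, "art_exhibition": 0.5}

# Two hypothetical holiday packages: the second costs more but adds an exhibition.
packages = [
    {"price": 100, "art_exhibition": 0},
    {"price": 120, "art_exhibition": 1},
]

probabilities = choice_probabilities([utility(p, coefficients) for p in packages])
wtp = willingness_to_pay(coefficients, "art_exhibition")
```

Under these assumed coefficients the second package has the higher utility, so it receives the larger choice probability, and the ratio of coefficients gives the implied willingness to pay for the exhibition.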

    Landscapes and regional development: What are the links?

    Despite increasing interest in rural landscapes, advances in information technology, and transportation improvements, rural areas generally continue to lag behind urban ones with respect to many socioeconomic indicators. Those rural areas that experience significant growth, however, are either located close to metropolitan areas or offer outstanding amenities that attract population and firms. Landscapes, as amenities, are defined as location-specific features that enhance the attractiveness of a given location. The empirical connection between amenities and regional growth has been established, but supply and demand issues of amenities and how their presence might lead to increased development still need clarification. This survey paper deals with several issues. First, amenities and landscapes and their characteristics are defined and described, particularly as economic goods. Second, supply and demand factors for amenities are presented. Third, the links between amenities, landscapes, and regional development are explained, via both impact mechanisms and institutional arrangements. Last, key public policy and further research issues are outlined. Keywords: landscape, amenities, rural area, regional development, economic development.

    Recommendation Support for Multi-Attribute Databases

    Mining and Managing User-Generated Content and Preferences

    In this thesis, we present techniques to manage the results of expressive queries, such as skyline queries, and to mine online content that has been generated by users. Given the numerous scenarios and applications where content mining can be applied, we focus on two cases in particular: review mining and social media analysis. More specifically, we focus on preference queries, where users can query a set of items, each associated with a set of attributes. For each attribute, users can specify whether they prefer to minimize or maximize it, e.g., "minimize price", "maximize performance", etc. Such queries are also known as "Pareto-optimal" or "skyline" queries. A drawback of this query type is that the result may become too large for the user to inspect manually. We propose an approach that addresses this issue by selecting a set of diverse skyline results. We provide a formal definition of skyline diversification and present efficient techniques to return such a set of points. The result can then be ranked according to established quality criteria. We also propose an alternative scheme for ranking skyline results, following an information retrieval approach.
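The skyline (Pareto-optimal) query described in this abstract can be sketched as follows. This is a naive quadratic scan under the usual dominance definition, not the diversification or ranking technique proposed in the thesis; the function names and the laptop data are hypothetical.

```python
def dominates(a, b, maximize):
    """True if a is at least as good as b on every attribute and
    strictly better on at least one. maximize[i] is True when larger
    values are preferred for attribute i, False when smaller are."""
    strictly_better = False
    for x, y, prefer_larger in zip(a, b, maximize):
        if not prefer_larger:
            x, y = -x, -y  # turn a "minimize" attribute into "maximize"
        if x < y:
            return False
        if x > y:
            strictly_better = True
    return strictly_better

def skyline(points, maximize):
    """Return the points that no other point dominates (naive O(n^2) scan)."""
    return [p for p in points
            if not any(dominates(q, p, maximize) for q in points)]

# Hypothetical items as (price, performance): minimize price, maximize performance.
laptops = [(900, 7), (1200, 9), (1000, 6), (1500, 9), (800, 5)]
result = skyline(laptops, maximize=(False, True))
# (1000, 6) is dominated by (900, 7); (1500, 9) is dominated by (1200, 9).
```

Even in this small example three of the five items survive, which hints at why the abstract is concerned with skyline results that grow too large to inspect by hand.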

    Eco-town: An integrated modeling framework for simulating the effects of urban morphology on sustainable development

    The spatial structure of a city is a key determinant of its socioeconomic well-being, and there is growing interest in models that investigate the relation between spatial structure and sustainable urban development. This dissertation aims to examine the role of urban spatial structure in the social, economic, and environmental dimensions of sustainability by developing an integrated modeling framework. In particular, this framework bridges the design of the urban built environment and sustainable development at the city-region level by simulating and measuring the effect of changes in urban spatial structure on the stock of various asset forms, including natural, human, and physical capital. The proposed methodology consists of an integrated modeling framework through which various spatial configurations of the selected urban facilities are simulated and the simulation outputs are evaluated in terms of sustainability. The framework consists of four components: (i) a spatial database, (ii) a land suitability analysis, (iii) a spatial optimization model that combines optimal facility location and optimal shopping frequency models, and (iv) a sustainability assessment. The Genuine Progress Indicator (GPI) sustainability metric is employed to evaluate simulation results and reveal the direction and magnitude of effects. The modeling framework was applied to the study area of Morgantown, West Virginia, for the case of locating food and beverage stores. The simulation results were generalized into a set of monocentric, polycentric, and decentralized scenarios in order to measure the change in the GPI level due to changes in the spatial configuration of food and beverage stores. The results show that even a modest change in the spatial configuration of an urban facility (food and beverage stores) can significantly change the urban sustainability level as measured by GPI.