8 research outputs found

    Urban Hourly Water Demand Prediction Using Human Mobility Data

    The efficient management of a water supply system requires precise water demand forecasts as inputs. This paper compares existing prediction methods and improves their performance by integrating human-related factors with water consumption in an urban area. Furthermore, a framework for processing and transforming mobility data into time series is presented. Results show that using human mobility data improves forecasting accuracy, which reaches 87.6%.
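
The abstract does not describe how the framework turns mobility data into time series. As a minimal sketch of one plausible step, assuming the raw data are timestamped visit events, the code below buckets events into hourly counts that could sit alongside hourly water demand. The function name `hourly_counts` and the sample timestamps are hypothetical and not taken from the paper.

```python
from collections import Counter
from datetime import datetime, timedelta


def hourly_counts(events, start, hours):
    """Count events per hour over the window [start, start + hours)."""
    counts = Counter()
    for ts in events:
        # Truncate each timestamp to the start of its hour.
        bucket = ts.replace(minute=0, second=0, microsecond=0)
        counts[bucket] += 1
    # Emit a dense series: hours without any event get an explicit 0.
    return [counts.get(start + timedelta(hours=h), 0) for h in range(hours)]


# Hypothetical visit timestamps (illustrative only).
events = [
    datetime(2020, 1, 1, 8, 5),
    datetime(2020, 1, 1, 8, 40),
    datetime(2020, 1, 1, 10, 15),
]
series = hourly_counts(events, datetime(2020, 1, 1, 8), 4)
print(series)
```

Emitting explicit zeros keeps the mobility series aligned hour by hour with a demand series, which most forecasting methods expect.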

    Classification of Three Volatiles Using a Single-Type eNose with Detailed Class-Map Visualization

    The use of electronic noses (eNoses) as analysis tools is growing in popularity; however, the lack of a comprehensive, visual representation of how the different classes are organized and distributed largely complicates the interpretation of the classification results, thus reducing their practicality. The new contributions of this paper are the assessment of the multivariate classification performance of a custom, low-cost eNose composed of 16 single-type (identical) MOX gas sensors for the classification of three volatiles, along with a proposal to improve the visual interpretation of the classification results by means of generating a detailed 2D class-map representation based on the inverse of the orthogonal linear transformation obtained from a PCA and LDA analysis. The results showed that this single-type eNose implementation was able to perform multivariate classification, while the class-map visualization summarized the learned features and how these features may affect the performance of the classification, simplifying the interpretation and understanding of the eNose results.

    Mining app reviews to support software engineering

    The thesis studies how mining app reviews can support software engineering. App reviews, short user reviews of an app in app stores, provide a potentially rich source of information to help software development teams maintain and evolve their products. Exploiting this information is, however, difficult due to the large number of reviews and the difficulty in extracting useful actionable information from short informal texts. A variety of app review mining techniques have been proposed to classify reviews and to extract information such as feature requests, bug descriptions, and user sentiments, but the usefulness of these techniques in practice is still unknown. Research in this area has grown rapidly, resulting in a large number of scientific publications (at least 182 between 2010 and 2020), but almost no independent evaluation or description of how diverse techniques fit together to support specific software engineering tasks has been performed so far. The thesis presents a series of contributions to address these limitations. We first report the findings of a systematic literature review in app review mining, exposing the breadth and limitations of research in this area. Using findings from the literature review, we then present a reference model that relates features of app review mining tools to specific software engineering tasks supporting requirements engineering, software maintenance and evolution. We then present two additional contributions extending previous evaluations of app review mining techniques. We present a novel independent evaluation of opinion mining techniques using an annotated dataset created for our experiment. Our evaluation finds lower effectiveness than initially reported by the techniques' authors. A final part of the thesis evaluates approaches to searching for app reviews pertinent to a particular feature. The findings show that a general-purpose search technique is more effective than the state-of-the-art purpose-built app review mining techniques, and suggest their usefulness for requirements elicitation. Overall, the thesis contributes to improving the empirical evaluation of app review mining techniques and their application in software engineering practice. Researchers and developers of future app mining tools will benefit from the novel reference model, detailed experiment designs, and publicly available datasets presented in the thesis.

    Analytics of human presence and movement behaviour within specific environments

    The vast amounts of detailed information generated by Wi-Fi and other mobile communication technologies provide an invaluable opportunity to study different aspects of the presence and movement behaviours of people within a given environment; for example, a university campus, an organisation's office complex, or a city centre. Utilising such data, this thesis studies three main aspects of human presence and movement behaviours: spatio-temporal movement (where and when do people move), user identification (how to uniquely identify people from their historical presence and movement records), and social grouping (how do people interact). Previous research has predominantly studied at most two of these three aspects. Conversely, we investigate all three aspects in order to develop a coherent view of human presence and movement behaviour within selected environments. More specifically, we create stochastic models for movement prediction and user identification. We also devise a set of clustering models for the detection of social groups within a given environment. The thesis makes the following contributions: 1. Proposes a family of predictive models that allows for inference of locations through a collaborative mechanism which does not require the profiling of individual users. These prediction models utilise suffix trees as their core underlying data structure, where predictions about a specific individual are computed over an aggregate model incorporating the collective record of observed behaviours of multiple users. 2. Defines a mobility fingerprint as a profile constructed from the user's historical mobility traces. The proposed method for constructing such a profile is a principled and scalable implementation of a variable length Markov model based on n-grams. 3. Proposes density-based clustering methods that discover social groups by analysing activity traces of mobile users as they move around, from one location to another, within an observed environment. We utilise two large collections of mobility traces for the evaluation of the proposed models reported herein: a GPS data set from Nokia and an Eduroam network log from Birkbeck, University of London.
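
The idea of predicting a location from an aggregate, variable-order model can be illustrated with a small sketch. The code below is a simplified stand-in, not the thesis's method: it pools n-gram counts from several users' traces in a plain dictionary (rather than a suffix tree) and backs off from the longest matching context to shorter ones. The function names, the `max_order` parameter, and the sample traces are all hypothetical.

```python
from collections import Counter, defaultdict


def train(traces, max_order=2):
    """Count which location follows each context of length 0..max_order."""
    model = defaultdict(Counter)
    for trace in traces:
        for i, nxt in enumerate(trace):
            for k in range(max_order + 1):
                if i - k < 0:
                    break  # not enough history for a context this long
                context = tuple(trace[i - k:i])
                model[context][nxt] += 1
    return model


def predict(model, history, max_order=2):
    """Back off from the longest matching context to shorter ones."""
    for k in range(min(max_order, len(history)), -1, -1):
        context = tuple(history[len(history) - k:])
        if context in model:
            return model[context].most_common(1)[0][0]
    return None  # the model has seen no data at all


# Hypothetical traces pooled from several users (no per-user profile).
traces = [
    ["lab", "cafe", "library"],
    ["lab", "cafe", "library"],
    ["office", "cafe", "lab"],
]
model = train(traces)
print(predict(model, ["lab", "cafe"]))
print(predict(model, ["gym", "cafe"]))
```

The second call has never seen "gym", so the prediction falls back to the shorter context ("cafe",), which is where pooling observations across users helps.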

    Application of a Single-Type eNose to Discriminate the Brewed Aroma of One Caffeinated and Decaffeinated Encapsulated Espresso Coffee Type

    This paper assesses a custom single-type electronic nose (eNose) applied to differentiate the complex aromas generated by the caffeinated and decaffeinated versions of one encapsulated espresso coffee mixture type. The eNose used is composed of 16 single-type (identical) metal–oxide semiconductor (MOX) gas sensors based on microelectromechanical systems (MEMS). This eNose proposal takes advantage of the small but inherent sensing variability of MOX gas sensors in order to provide a multisensorial description of volatiles or aromas. Results have shown that the information provided by this eNose, processed using LDA, is able to successfully discriminate the complex aromas of one caffeinated and decaffeinated encapsulated espresso coffee type.

    Assessing over Time Performance of an eNose Composed of 16 Single-Type MOX Gas Sensors Applied to Classify Two Volatiles

    This paper assesses the over time performance of a custom electronic nose (eNose) composed of an array of commercial low-cost and single-type miniature metal-oxide (MOX) semiconductor gas sensors. The eNose uses 16 BME680 versatile sensor devices, each including an embedded non-selective MOX gas sensor that was originally proposed to measure the total volatile organic compounds (TVOC) in the air. This custom eNose has been used previously to detect ethanol and acetone, obtaining initial promising classification results that worsened over time because of sensor drift. The current paper assesses the over time performance of different classification methods applied to process the information gathered from the eNose. The best classification results have been obtained when applying a linear discriminant analysis (LDA) to the normalized conductance of the sensing layer of the 16 MOX gas sensors available in the eNose. The LDA procedure by itself has reduced the influence of drift in the classification performance of this single-type eNose during an evaluation period of three months.

    On Pattern Mining in Graph Data to Support Decision-Making

    In recent years, graph data models have become increasingly important in both research and industry. Their core is a generic data structure of things (vertices) and connections among those things (edges). Rich graph models such as the property graph model promise an extraordinary analytical power because relationships can be evaluated without knowledge about a domain-specific database schema. This dissertation studies the usage of graph models for data integration and data mining of business data. Although a typical company's business data implicitly describes a graph, it is usually stored in multiple relational databases. Therefore, we propose the first semi-automated approach to transform data from multiple relational databases into a single graph whose vertices represent domain objects and whose edges represent their mutual relationships. This transformation is the base of our conceptual framework BIIIG (Business Intelligence with Integrated Instance Graphs). We further propose a graph-based approach to data integration, which is executed after the transformation. In established data mining approaches, interrelated input data is mostly represented by tuples of measure values and dimension values. In the context of graphs, these values must be attached to the graph structure, and aggregated measure values are graph attributes. Since the latter was not supported by any existing model, we proposed the use of collections of property graphs. They act as the data structure of the novel Extended Property Graph Model (EPGM). The model supports vertices and edges that may appear in different graphs as well as graph properties. Further on, we proposed some operators that benefit from this data structure, for example, graph-based aggregation of measure values. A primitive operation of graph pattern mining is frequent subgraph mining (FSM). However, existing algorithms provided no support for directed multigraphs. We extended the popular gSpan algorithm to overcome this limitation. Some patterns might not be frequent while their generalizations are. Generalized graph patterns can be mined by attaching vertices to taxonomies. We proposed a novel approach to Generalized Multidimensional Frequent Subgraph Mining (GM-FSM), in particular the first solution to generalized FSM that supports not only directed multigraphs but also multiple dimensional taxonomies. In scenarios that compare patterns of different categories, e.g., fraud or not, FSM is not sufficient since pattern frequencies may differ by category. Further on, determining all pattern frequencies without frequency pruning is not an option due to the computational complexity of FSM. Thus, we developed Characteristic Subgraph Mining (CSM), an FSM extension that extracts patterns that are characteristic for a specific category according to a user-defined interestingness function. Parts of this work were done in the context of GRADOOP, a framework for distributed graph analytics. To make the primitive operation of frequent subgraph mining available to this framework, we developed Distributed In-Memory gSpan (DIMSpan), a frequent subgraph miner that is tailored to the characteristics of shared-nothing clusters and distributed dataflow systems. Finally, the results of use case evaluations in cooperation with a large-scale enterprise are presented. This includes a report of practical experiences gained in implementation and application of the proposed algorithms.
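
To make the notion of pattern frequency concrete, the sketch below counts support for the smallest possible patterns, single labelled edges, across a collection of directed multigraphs and keeps those that meet a minimum support. It is only the first step of frequent subgraph mining, not gSpan or DIMSpan, and it assumes each graph is given as a list of (source label, edge label, target label) triples. The function name and the sample business graphs are hypothetical.

```python
from collections import Counter


def frequent_edge_patterns(graphs, min_support):
    """Return single-edge patterns occurring in at least min_support graphs."""
    support = Counter()
    for edges in graphs:
        # set(): a pattern counts once per graph, even when a multigraph
        # contains several parallel edges with the same labels.
        support.update(set(edges))
    return {p: s for p, s in support.items() if s >= min_support}


# Hypothetical graph collection; each edge is (source, edge label, target).
graphs = [
    [("Customer", "places", "Order"),
     ("Customer", "places", "Order"),   # parallel edge in a multigraph
     ("Order", "contains", "Product")],
    [("Customer", "places", "Order"),
     ("Order", "billedBy", "Invoice")],
    [("Order", "contains", "Product")],
]
frequent = frequent_edge_patterns(graphs, min_support=2)
print(frequent)
```

Full FSM algorithms grow such frequent patterns edge by edge and prune any extension whose support drops below the threshold, which is the frequency pruning the abstract refers to.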

    Geographic information extraction from texts

    A large volume of unstructured texts, containing valuable geographic information, is available online. This information – provided implicitly or explicitly – is useful not only for scientific studies (e.g., spatial humanities) but also for many practical applications (e.g., geographic information retrieval). Although considerable progress has been achieved in geographic information extraction from texts, there are still unsolved challenges and issues, ranging from methods, systems, and data, to applications and privacy. Therefore, this workshop will provide a timely opportunity to discuss recent advances, new ideas, and concepts, and also to identify research gaps in geographic information extraction.