
    Pattern Mining and Sense-Making Support for Enhancing the User Experience

    While data mining techniques such as frequent itemset and sequence mining are well established as powerful pattern discovery tools in domains from science and medicine to business, a drawback is the lack of support for interactive exploration of the large numbers of patterns generated with diverse parameter settings and of the relationships among the mined patterns. To enhance the user experience, real-time query turnaround times and improved support for interactive mining are desired. There is also an increasing interest in applying data mining solutions to mobile data. Patterns mined over mobile data may enable context-aware applications ranging from automating frequently repeated tasks to providing personalized recommendations. Overall, this dissertation addresses three problems that limit the utility of data mining, namely, (a) lack of interactive exploration tools for mined patterns, (b) insufficient support for mining localized patterns, and (c) high computational mining requirements prohibiting mining of patterns on smaller compute units such as a smartphone. This dissertation develops interactive frameworks for the guided exploration of mined patterns and their relationships. Contributions include the PARAS preprocessing and indexing framework, which enables analysts to gain key insights into rule relationships in a parameter space view through compact storage of rules that supports query-time reconstruction of complete rulesets. Contributions also include the visual rule exploration framework FIRE, which presents an interactive dual view of the parameter space and the rule space that together enable enhanced sense-making of rule relationships. This dissertation also supports the online mining of localized association rules computed on data subsets by selectively deploying alternative execution strategies that leverage a multidimensional itemset-based data partitioning index. Finally, we designed OLAPH, an on-device context-aware service that learns phone usage patterns over mobile context data such as app usage, location, call and SMS logs to provide device intelligence. Concepts introduced for modeling mobile data as sequences include compressing context logs to intervaled context events, adding generalized time features, and identifying meaningful sequences via filter expressions.
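
The abstract does not say how context logs are compressed into intervaled context events. One plausible reading, sketched here under stated assumptions, is run-length merging: consecutive samples carrying the same context value collapse into a single interval. The names `IntervalEvent` and `compress_log` are hypothetical and do not come from OLAPH, and the sketch assumes the samples arrive sorted by timestamp.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass(frozen=True)
class IntervalEvent:
    """Hypothetical interval record: one context value held over a time span."""
    value: str
    start: int  # timestamp of the first sample in the run
    end: int    # timestamp of the last sample in the run


def compress_log(samples: Iterable[Tuple[int, str]]) -> List[IntervalEvent]:
    """Collapse consecutive samples with the same value into one interval.

    Assumes samples are (timestamp, value) pairs sorted by timestamp.
    """
    events: List[IntervalEvent] = []
    for ts, value in samples:
        if events and events[-1].value == value:
            # Same context as the previous sample: extend the open interval.
            last = events[-1]
            events[-1] = IntervalEvent(value, last.start, ts)
        else:
            # Context changed: start a new interval at this sample.
            events.append(IntervalEvent(value, ts, ts))
    return events


# Illustrative location log, sampled every 5 time units.
log = [(0, "home"), (5, "home"), (10, "home"), (15, "work"), (20, "work"), (25, "home")]
intervals = compress_log(log)
```

Six raw samples become three intervals here, which suggests why such a representation could suit sequence mining on a device with limited compute.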

    Computing at massive scale: Scalability and dependability challenges

    Large-scale Cloud systems and big data analytics frameworks are now widely used for practical services and applications. However, with the increase of data volume, together with the heterogeneity of workloads and resources, and the dynamic nature of massive user requests, the uncertainties and complexity of resource management and service provisioning increase dramatically, often resulting in poor resource utilization, vulnerable system dependability, and user-perceived performance degradations. In this paper we report our latest understanding of the current and future challenges in this particular area, and discuss both existing and potential solutions to the problems, especially those concerned with system efficiency, scalability and dependability. We first introduce a data-driven analysis methodology for characterizing the resource and workload patterns and tracing performance bottlenecks in a massive-scale distributed computing environment. We then examine and analyze several fundamental challenges and the solutions we are developing to tackle them, including for example incremental but decentralized resource scheduling, incremental messaging communication, rapid system failover, and request handling parallelism. We integrate these solutions with our data analysis methodology in order to establish an engineering approach that facilitates the optimization, tuning and verification of massive-scale distributed systems. We aim to develop and offer innovative methods and mechanisms for future computing platforms that will provide strong support for new big data and IoE (Internet of Everything) applications.

    A Nine Month Progress Report on an Investigation into Mechanisms for Improving Triple Store Performance

    This report considers the requirement for fast, efficient, and scalable triple stores as part of the effort to produce the Semantic Web. It summarises relevant information in the major background field of Database Management Systems (DBMS), and provides an overview of the techniques currently in use amongst the triple store community. The report concludes that for individuals and organisations to be willing to provide large amounts of information as openly-accessible nodes on the Semantic Web, storage and querying of the data must be cheaper and faster than it is currently. Experiences from the DBMS field can be used to maximise triple store performance, and suggestions are provided for lines of investigation in areas of storage, indexing, and query optimisation. Finally, work packages are provided describing expected timetables for further study of these topics.

    Keeping in touch – A benefit of public holidays using time use diary data

    This paper argues that public holidays facilitate the co-ordination of leisure time but do not constrain the annual amount of leisure. Public holidays therefore have benefits both in the utility of leisure on holidays and (by enabling people to maintain social contacts more easily) in increasing the utility of leisure on normal weekdays and weekends. The paper uses the variation in public holidays across German Länder, based on more than 37,000 individual diary records from the current German Time Use Survey of 2001-02, to illustrate the positive association between more public holidays and social life on normal weekdays and weekends. These benefits are additional to the other, direct benefits of public holidays.
    Keywords: Public holidays, social contacts, social leisure time, time allocation, time use diaries, German Time Budget Survey 2001/02

    Measuring and predicting adaptation in multidimensional activity-travel patterns

    2+131hlm.;24c

    Exploring time diaries using semi-automated activity pattern extraction

    Identifying patterns of activities in time diaries, in order to understand the variety of daily life in terms of combinations of activities performed by individuals in different groups, is of interest in time use research. So far, activity patterns have mostly been identified by visually inspecting representations of activity data or by using sequence comparison methods, such as sequence alignment, in order to cluster similar data and then extract representative patterns from these clusters. Both of these methods are sensitive to data size: purely visual methods become too cluttered and sequence comparison methods become too time consuming. Furthermore, the patterns identified by both methods represent mostly general trends of activity in a population, while details and unexpected features hidden in the data are often never revealed. We have implemented an algorithm that searches the time diaries and automatically extracts all activity patterns meeting user-defined criteria of what constitutes a valid pattern of interest for the user’s research question. Amongst the many criteria which can be applied are a time window containing the pattern, minimum and maximum occurrences of the pattern, and the number of people that perform it. The extracted activity patterns can then be interactively filtered, visualized and analyzed to reveal interesting insights. Exploration of the results of each pattern search may result in new hypotheses which can be subsequently explored by altering the search criteria. To demonstrate the value of the presented approach we consider and discuss sequential activity patterns at a population level, from a single day perspective.
    Keywords: Time-geography, diaries, everyday life, activity patterns, visualization, data mining, sequential pattern mining
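
The criteria named in this abstract (a time window, minimum and maximum occurrences, and a number of people) can be illustrated with a deliberately simplified sketch. It is not the authors' algorithm, which the abstract does not specify. It assumes a pattern is a run of consecutive diary episodes of fixed length, and the function name `extract_patterns`, its parameters, and the diary layout are all hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple

Episode = Tuple[int, str]  # (start minute of the day, activity label)


def extract_patterns(
    diaries: Dict[str, List[Episode]],
    length: int,
    window: int,
    min_occ: int,
    max_occ: int,
    min_people: int,
) -> Dict[Tuple[str, ...], Tuple[int, int]]:
    """Return {pattern: (occurrences, people)} for patterns meeting all criteria.

    Simplifying assumption: a pattern is `length` consecutive episodes whose
    start times span at most `window` minutes.
    """
    occurrences: Dict[Tuple[str, ...], int] = defaultdict(int)
    people: Dict[Tuple[str, ...], Set[str]] = defaultdict(set)
    for person, episodes in diaries.items():
        for i in range(len(episodes) - length + 1):
            chunk = episodes[i:i + length]
            # Time-window criterion: skip runs that are spread too widely.
            if chunk[-1][0] - chunk[0][0] > window:
                continue
            pattern = tuple(activity for _, activity in chunk)
            occurrences[pattern] += 1
            people[pattern].add(person)
    # Occurrence and head-count criteria are applied after counting.
    return {
        p: (n, len(people[p]))
        for p, n in occurrences.items()
        if min_occ <= n <= max_occ and len(people[p]) >= min_people
    }


# Three illustrative single-day diaries.
diaries = {
    "p1": [(420, "sleep"), (450, "eat"), (480, "travel"), (510, "work")],
    "p2": [(400, "sleep"), (430, "eat"), (470, "travel"), (500, "work")],
    "p3": [(480, "eat"), (600, "travel"), (630, "work")],
}
result = extract_patterns(diaries, length=2, window=60,
                          min_occ=2, max_occ=10, min_people=3)
```

Under these settings only the pair ("travel", "work") survives, because it is the only two-episode run performed by all three people within a 60-minute window. Lowering `min_people` to 2 would also admit ("sleep", "eat") and ("eat", "travel"), which mirrors the iterative tightening and loosening of search criteria that the abstract describes.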

    Harmonising extended measures of parental childcare in the time-diary surveys of four countries – Proximity versus responsibility

    Measures of childcare drawn from time-diary data are commonly based on the specific childcare activities a parent engages in throughout the day. This emphasis on activities has been criticised as it ignores the large quantity of time parents spend supervising their children. In order to provide more accurate estimates of childcare that incorporate supervisory childcare, researchers have turned to extended measures of care based on being i) in proximity to children or ii) responsible for children. There has been debate about the extent to which these approaches each measure the same aspect of childcare. In addition, it is thought they may be sensitive to the way surveys have been designed, which can affect the extent to which they can be compared cross-nationally. We argue that measures of proximity and responsibility are conceptually interchangeable, and demonstrate that they can be harmonised and compared cross-nationally. Finally, we suggest ways in which these extended measures of childcare can be made increasingly comparable cross-nationally.
    Keywords: Time-diary data, measurement of parental childcare, cross-national harmonisation of measures of childcare, time geography