
    Interactive Exploration of Temporal Event Sequences

    Life can often be described as a series of events. These events contain rich information that, when put together, can reveal history, expose facts, or lead to discoveries. Many leading organizations therefore collect databases of event sequences: Electronic Medical Records (EMRs), transportation incident logs, student progress reports, web logs, sports logs, and so on. Heavy investments were made in data collection and storage, but difficulties still arise when it comes to making use of the collected data. Analyzing millions of event sequences is a non-trivial task that is gaining attention and, given its complexity, requires better support. I therefore aimed to use information visualization techniques to support exploratory data analysis (an approach to analyzing data to formulate hypotheses worth testing) for event sequences. Working with domain experts who were analyzing event sequences, I identified two important scenarios that guided my dissertation. First, I explored how to provide an overview of multiple event sequences. Lengthy reports often have an executive summary; unfortunately, no equivalent existed for event sequences. I therefore designed LifeFlow, a compact overview visualization that summarizes multiple event sequences, along with interaction techniques that support users' exploration. Second, I examined how to support users in querying for event sequences when they are uncertain about what they are looking for. To support this task, I developed similarity measures (the M&M measure 1-2) and user interfaces (Similan 1-2) for querying event sequences based on similarity, allowing users to search for event sequences that are similar to a query. I then ran a controlled experiment comparing exact-match and similarity-search interfaces, and learned the advantages and disadvantages of both.
    These lessons inspired me to develop Flexible Temporal Search (FTS), which combines the benefits of both interfaces: FTS gives confident, countable results and also ranks results by similarity. I continued to work with domain experts as partners, involving them in the iterative design and constantly using their feedback to guide my research directions. As the research progressed, several short-term user studies were conducted to evaluate particular features of the user interfaces, and both quantitative and qualitative results were reported. To address the limitations of short-term evaluations, I also conducted multi-dimensional in-depth long-term case studies with domain experts in various fields to evaluate deeper benefits, validate the generalizability of the ideas, and demonstrate the practicability of this research in non-laboratory environments. The experience from these long-term studies was distilled into a set of design guidelines for temporal event sequence exploration. My contributions are: LifeFlow, a visualization that compactly displays summaries of multiple event sequences, along with interaction techniques for users' exploration; similarity measures (the M&M measure 1-2) and similarity search interfaces (Similan 1-2) for querying event sequences; Flexible Temporal Search (FTS), a hybrid query approach that combines the benefits of exact match and similarity search; and case study evaluations that result in a process model and a set of design guidelines for temporal event sequence exploration. Finally, this research has revealed new directions for exploring event sequences.
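As a rough illustration of similarity-based querying over event sequences (this is not the actual M&M measure, whose definition is in the dissertation; every name and weight below is invented), a toy measure might match events by type and discount matched pairs by their time difference:

```python
def sequence_similarity(query, target, time_penalty=0.1):
    """Toy similarity between two event sequences.

    Each sequence is a list of (event_type, timestamp) pairs. Matched
    events of the same type contribute a score discounted by their time
    difference; unmatched events incur a fixed penalty. Illustrative
    stand-in only, not the M&M measure from the dissertation.
    """
    score = 0.0
    used = set()
    for q_type, q_time in query:
        # Greedily match each query event to the nearest unused target
        # event of the same type.
        candidates = [
            (abs(q_time - t_time), i)
            for i, (t_type, t_time) in enumerate(target)
            if t_type == q_type and i not in used
        ]
        if candidates:
            dt, i = min(candidates)
            used.add(i)
            score += 1.0 / (1.0 + time_penalty * dt)
        else:
            score -= 0.5  # query event with no match
    score -= 0.5 * (len(target) - len(used))  # unmatched target events
    return score

# Hypothetical medical-record sequences: a perfect match outranks a
# sequence that is missing the "icu" event.
q = [("admit", 0), ("icu", 2)]
```

Ranking candidate sequences by such a score is what lets a similarity-search interface return near misses instead of only exact matches.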

    Lessons learned: structuring knowledge codification and abstraction to provide meaningful information for learning

    Purpose – To increase the spread and reuse of lessons learned (LLs), the purpose of this paper is to develop a standardised information structure to facilitate concise capture of the critical elements needed to engage secondary learners and help them apply lessons to their contexts. Design/methodology/approach – Three workshops with industry practitioners, an analysis of over 60 actual lessons from private and public sector organisations and seven practitioner interviews provided evidence of actual practice. Design science was used to develop a repeatable/consistent information model of LL content/structure. Workshop analysis and theory provided the coding template. Situation theory and normative analysis were used to define the knowledge and rule logic to standardise fields. Findings – Comparing evidence from practice against theoretical prescriptions in the literature highlighted important enhancements to the standard LL model. These were a consistent/concise rule and context structure, appropriate emotional language, and reuse and control criteria to ensure lessons were transferrable and reusable in new situations. Research limitations/implications – Findings are based on a limited sample. Long-term benefits of standardisation and use need further research. A larger sample/longitudinal usage study is planned. Practical implications – The implementation of the LL structure was well-received in one government user site and other industry user sites are pending. Practitioners validated the design logic for improving capture and reuse of lessons to render them easily translatable to a new learner's context. Originality/value – The new LL structure is uniquely grounded in user needs, developed from existing best practice and is an original application of normative and situation theory to provide consistent rule logic for context/content structure.
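The paper's actual information model is derived from workshops and situation theory; as a hedged sketch only, a standardised lesson-learned record with context, rule logic, and reuse criteria could be represented as a typed record (all field names and values here are hypothetical, not the paper's template):

```python
from dataclasses import dataclass, field

@dataclass
class LessonLearned:
    """Illustrative record for a standardised lesson-learned entry.

    Field names are hypothetical; they mirror the kinds of elements the
    paper argues for (context, concise rule logic, reuse criteria).
    """
    title: str
    context: str            # situation in which the lesson arose
    rule: str               # concise "if condition, then action" logic
    applicability: str      # criteria for transfer to new situations
    controls: list = field(default_factory=list)  # reuse/control checks

# A made-up example entry.
ll = LessonLearned(
    title="Confirm stakeholder sign-off before design freeze",
    context="Public-sector infrastructure project, design phase",
    rule="If a design milestone nears, then obtain written sign-off",
    applicability="Any project with formal stage gates",
    controls=["review the sign-off log at each gate"],
)
```

Standardising the fields is what makes lessons comparable across projects and lets a secondary learner judge quickly whether a lesson transfers to their own context.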

    Cooperation between expert knowledge and data mining discovered knowledge: Lessons learned

    Expert systems are built from knowledge traditionally elicited from the human expert. It is precisely knowledge elicitation from the expert that is the bottleneck in expert system construction. On the other hand, a data mining system, which automatically extracts knowledge, needs expert guidance on the successive decisions to be made in each of the system phases. In this context, expert knowledge and data mining discovered knowledge can cooperate, maximizing their individual capabilities: data mining discovered knowledge can be used as a complementary source of knowledge for the expert system, whereas expert knowledge can be used to guide the data mining process. This article summarizes different examples of systems where there is cooperation between expert knowledge and data mining discovered knowledge, and reports our experience of such cooperation gathered from a medical diagnosis project called Intelligent Interpretation of Isokinetics Data, which we developed. From that experience, a series of lessons were learned throughout project development. Some of these lessons are generally applicable and others pertain exclusively to certain project types.

    What’s wrong with the minimal conception of innateness in cognitive science?

    One of the classic debates in cognitive science is between nativism and empiricism about the development of psychological capacities. In principle, the debate is empirical. However, in practice nativist hypotheses have also been challenged for relying on an ill-defined, or even unscientific, notion of innateness as that which is "not learned". Here this minimal conception of innateness is defended on four fronts. First, it is argued that the minimal conception is crucial to understanding the nativism-empiricism debate, when properly construed. Second, various objections to the minimal conception (that it risks overgeneralization, lacks an account of learning, frustrates genuine explanations of psychological development, and fails to unify different notions of innateness across the sciences) are rebutted. Third, it is argued that the minimal conception avoids the shortcomings of primitivism, the prominent view that innate capacities are those that are not acquired via a psychological process in development. And fourth, the minimal conception undermines some attempts to identify innateness with a natural kind. In short, we have little reason to reject, and good reason to accept, the minimal conception of innateness in cognitive science.

    Transfer Learning for Speech and Language Processing

    Transfer learning is a vital technique that generalizes models trained for one setting or task to other settings or tasks. For example, in speech recognition, an acoustic model trained for one language can be used to recognize speech in another language with little or no re-training data. Transfer learning is closely related to multi-task learning (cross-lingual vs. multilingual) and has traditionally been studied under the name of `model adaptation'. Recent advances in deep learning show that transfer learning becomes much easier and more effective with high-level abstract features learned by deep models, and that the `transfer' can be conducted not only between data distributions and data types, but also between model structures (e.g., shallow nets and deep nets) or even model types (e.g., Bayesian models and neural models). This review paper summarizes some recent prominent research in this direction, particularly for speech and language processing. We also report some results from our group and highlight the potential of this very interesting research field.
    Comment: 13 pages, APSIPA 201
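The common deep-transfer recipe the abstract alludes to (keep the high-level feature layers learned on a source task, re-initialise and retrain only the task-specific output head) can be sketched with a toy two-layer model. The shapes and names are illustrative only, not from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

def make_model(n_in, n_hidden, n_out):
    """A toy two-layer network: shared feature layer + task head."""
    return {
        "features": rng.normal(size=(n_in, n_hidden)),
        "head": rng.normal(size=(n_hidden, n_out)),
    }

def transfer(source_model, n_out_new):
    """Reuse the source task's feature layer; re-initialise the head.

    Mirrors the usual recipe: high-level features learned on one task
    (e.g. one language's acoustics) are kept, and only the
    task-specific head is trained for the new task.
    """
    n_hidden = source_model["features"].shape[1]
    return {
        "features": source_model["features"].copy(),      # shared
        "head": rng.normal(size=(n_hidden, n_out_new)),   # new head
    }

# Hypothetical dimensions: 40 acoustic features, 16 hidden units,
# 50 source-language targets transferred to 30 target-language targets.
src = make_model(n_in=40, n_hidden=16, n_out=50)
tgt = transfer(src, n_out_new=30)
```

Because the feature layer is reused, the new task needs far fewer parameters trained from scratch, which is why little or no re-training data can suffice.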

    On the Value of Out-of-Distribution Testing: An Example of Goodhart's Law

    Out-of-distribution (OOD) testing is increasingly popular for evaluating a machine learning system's ability to generalize beyond the biases of a training set. OOD benchmarks are designed to present a different joint distribution of data and labels between training and test time. VQA-CP has become the standard OOD benchmark for visual question answering, but we discovered three troubling practices in its current use. First, most published methods rely on explicit knowledge of the construction of the OOD splits. They often rely on "inverting" the distribution of labels, e.g. answering mostly 'yes' when the common training answer is 'no'. Second, the OOD test set is used for model selection. Third, a model's in-domain performance is assessed after retraining it on in-domain splits (VQA v2) that exhibit a more balanced distribution of labels. These three practices defeat the objective of evaluating generalization and call into question the value of methods specifically designed for this dataset. We show that embarrassingly simple methods, including one that generates answers at random, surpass the state of the art on some question types. We provide short- and long-term solutions to avoid these pitfalls and realize the benefits of OOD evaluation.
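The first troubling practice (exploiting knowledge that the OOD split roughly inverts the training label distribution) can be illustrated with a deliberately construction-aware baseline. This is a sketch of the pitfall, not a method from the paper, and all names and counts are invented:

```python
from collections import Counter

def inverted_answer(question_type, train_answer_counts):
    """Answer with the LEAST common training answer for this question
    type, exploiting knowledge of how VQA-CP's OOD split was built.

    This is the kind of construction-aware shortcut the paper flags as
    defeating the purpose of OOD evaluation: it "generalizes" to the
    test split without understanding the question at all.
    """
    counts = train_answer_counts[question_type]
    return min(counts, key=counts.get)

# A made-up question type whose training answers are mostly "no";
# inverting therefore answers "yes" regardless of the image.
train_counts = {"is there a": Counter({"no": 900, "yes": 100})}
```

That such a question-blind rule can score well is exactly why the authors argue the OOD test set must not inform method design or model selection.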