473 research outputs found

    The design and study of pedagogical paper recommendation

    For learners engaged in senior-level courses, tutors often want to pick some articles as supplementary reading materials for them each week. Unlike researchers ‘Googling’ papers from the Internet, tutors, when making recommendations, should consider the course syllabus and their assessment of learners along many dimensions. As such, simply ‘Googling’ articles from the Internet is far from enough. That is, a model of each individual learner, including their learning interests, knowledge, goals, etc., should be considered when making paper recommendations, since the suitability of a paper for a learner should be calculated as the sum of its fitness along these dimensions, that is, of how appropriate it is for helping the learner in general. This type of recommender is called a Pedagogical Paper Recommender. In this thesis, we propose a set of recommendation methods for a Pedagogical Paper Recommender and study the various important issues surrounding it. Experimental studies confirm that making recommendations to learners in social learning environments is not the same as making recommendations to users in commercial environments such as Amazon.com. In such learning environments, learners are willing to accept items that are not interesting, yet meet their learning goals in some way or another; learners’ overall impression of each paper is not solely dependent on the interestingness of the paper, but also on other factors, such as the degree to which the paper can help to meet their ‘cognitive’ goals. It is also observed that most of the recommendation methods are scalable. Although the degree of this scalability is still unclear, we conjecture that those methods are consistent for up to 50 papers in terms of recommendation accuracy. The experiments conducted so far and the suggestions made on the adoption of recommendation methods are based on the data we have collected during one semester of a course. 
Therefore, the generality of the results needs to undergo further validation before more certain conclusions can be drawn. These follow-up studies should (ideally) be performed over more semesters on the same course or related courses with more newly added papers. Then, some open issues can be further investigated. Despite these weaknesses, this study has been able to reach the research goals set out for the proposed pedagogical paper recommender, an idea which, although it sounds intuitive, has unfortunately been largely ignored in the research community. Finding a ‘good’ paper is not trivial: it is not about the simple fact that the user will either accept the recommended items or not; rather, it is a multiple-step process that typically entails the users navigating the paper collections, understanding the recommended items, seeing what others like or dislike, and making decisions. Therefore, a future research goal that follows from this study is to design for different kinds of social navigation in order to study their respective impacts on user behavior, and how, over time, user behavior feeds back to influence system performance.

    Text mining with exploitation of user's background knowledge: discovering novel association rules from text

    The goal of text mining is to find interesting and non-trivial patterns or knowledge from unstructured documents. Both objective and subjective measures have been proposed in the literature to evaluate the interestingness of discovered patterns. However, objective measures alone are insufficient because such measures do not consider the knowledge and interests of the users. Subjective measures require explicit input of user expectations, which is difficult or even impossible to obtain in text mining environments. This study proposes a user-oriented text-mining framework and applies it to the problem of discovering novel association rules from documents. The developed system, uMining, consists of two major components: a background knowledge developer and a novel association rule miner. The background knowledge developer learns a user's background knowledge by extracting keywords from documents already known to the user (background documents) and developing a concept hierarchy to organize popular keywords. The novel association rule miner discovers association rules among noun phrases extracted from relevant documents (target documents) and compares the rules with the background knowledge to predict the novelty of each rule to the particular user (user-oriented novelty). The user-oriented novelty measure is defined as the semantic distance between the antecedent and the consequent of a rule in the background knowledge. It consists of two components: occurrence distance and connection distance. The former considers the co-occurrences of two keywords in the background documents: the more co-occurrences, the shorter the distance. The latter considers the common connections of the two keywords with others in the concept hierarchy. It is defined as the length of the path connecting the two keywords in the concept hierarchy: the longer the path, the longer the distance. The user-oriented novelty measure is evaluated from two perspectives: novelty prediction accuracy and usefulness indication power. 
The results show that the user-oriented novelty measure outperforms the WordNet novelty measure and the compared objective measures in terms of predicting novel rules and identifying useful rules.
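
The two distance components described in the abstract above can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration, not the uMining implementation: the abstract does not give formulas, so the occurrence distance is taken to be `1 / (1 + co-occurrence count)`, the concept hierarchy is assumed to be a tree stored as a child-to-parent mapping, and the two components are combined with a hypothetical `weight` parameter.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(background_docs):
    """Count how often each unordered keyword pair shares a background document."""
    counts = Counter()
    for keywords in background_docs:
        for a, b in combinations(sorted(set(keywords)), 2):
            counts[(a, b)] += 1
    return counts


def occurrence_distance(a, b, counts):
    """Assumed form: more co-occurrences give a shorter distance."""
    key = tuple(sorted((a, b)))
    return 1.0 / (1.0 + counts[key])


def path_to_root(node, parent):
    """List the nodes from `node` up to the root of the hierarchy."""
    path = [node]
    while node in parent:
        node = parent[node]
        path.append(node)
    return path


def connection_distance(a, b, parent):
    """Number of edges on the path joining a and b in the (assumed) tree."""
    steps_from_a = {n: i for i, n in enumerate(path_to_root(a, parent))}
    for steps_from_b, n in enumerate(path_to_root(b, parent)):
        if n in steps_from_a:  # first shared ancestor
            return steps_from_a[n] + steps_from_b
    raise ValueError("keywords are not connected in the hierarchy")


def novelty(a, b, counts, parent, weight=0.5):
    """Hypothetical combination: weighted sum of the two distances."""
    return (weight * occurrence_distance(a, b, counts)
            + (1.0 - weight) * connection_distance(a, b, parent))


# Toy background knowledge (invented for the example).
docs = [["gene", "protein"], ["gene", "protein", "cell"], ["market"]]
hierarchy = {"gene": "biology", "protein": "biology", "biology": "science",
             "market": "economics", "economics": "science"}
counts = cooccurrence_counts(docs)
print(novelty("gene", "protein", counts, hierarchy))
print(novelty("gene", "market", counts, hierarchy))
```

Under these assumptions, a rule linking "gene" and "market" scores as more novel than one linking "gene" and "protein", because the pair never co-occurs in the background documents and sits further apart in the hierarchy.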

    Evaluating Human-Language Model Interaction

    Many real-world applications of language models (LMs), such as writing assistance and code autocomplete, involve human-LM interaction. However, most benchmarks are non-interactive in that a model produces output without human involvement. To evaluate human-LM interaction, we develop a new framework, Human-AI Language-based Interaction Evaluation (HALIE), that defines the components of interactive systems and dimensions to consider when designing evaluation metrics. Compared to standard, non-interactive evaluation, HALIE captures (i) the interactive process, not only the final output; (ii) the first-person subjective experience, not just a third-party assessment; and (iii) notions of preference beyond quality (e.g., enjoyment and ownership). We then design five tasks to cover different forms of interaction: social dialogue, question answering, crossword puzzles, summarization, and metaphor generation. With four state-of-the-art LMs (three variants of OpenAI's GPT-3 and AI21 Labs' Jurassic-1), we find that better non-interactive performance does not always translate to better human-LM interaction. In particular, we highlight three cases where the results from non-interactive and interactive metrics diverge and underscore the importance of human-LM interaction for LM evaluation. Comment: Authored by the Center for Research on Foundation Models (CRFM) at the Stanford Institute for Human-Centered Artificial Intelligence (HAI).

    Feeling the landscape: six psychological studies into landscape experience

    The six studies in this dissertation investigate a number of practical and theoretical questions concerning the experience of landscape. Landscape experience is defined as a dynamic process, the result of interactions between culturally and biologically determined general determinants of experience. The studies test a number of different psychological theories, and together they demonstrate the importance of psychological research into landscape experience. It is the application of methodologies and theoretical perspectives from psychology that has made it possible to arrive at the insights into the interaction between humans and landscape that are the result of this study.

    Information Reliability on the Social Web - Models and Applications in Intelligent User Interfaces

    The Social Web is undergoing continued evolution, changing the paradigm of information production, processing and sharing. Information sources have shifted from institutions to individual users, vastly increasing the amount of information available online. To overcome the information overload problem, modern filtering algorithms have enabled people to find relevant information in efficient ways. However, noisy, false and otherwise useless information remains a problem. We believe that the concept of information reliability needs to be considered along with information relevance to adapt filtering algorithms to today's Social Web. This approach helps to improve information search and discovery and can also improve user experience by communicating aspects of information reliability. This thesis first shows the results of a cross-disciplinary study into perceived reliability by reporting on a novel user experiment. This is followed by a discussion of modeling, validating, and communicating information reliability, including its various definitions across disciplines. A selection of important reliability attributes such as source credibility, competence, influence and timeliness are examined through different case studies. Results show that perceived reliability of information can vary greatly across contexts. Finally, recent studies on visual analytics, including algorithm explanations and interactive interfaces, are discussed with respect to their impact on the perception of information reliability in a range of application domains.

    Semantically-guided evolutionary knowledge discovery from texts

    This thesis proposes a new approach for structured knowledge discovery from texts which considers the mining process itself, the evaluation of this knowledge by the model, and the human assessment of the quality of the outcome. This is achieved by integrating Natural-Language technology and Genetic Algorithms to produce explanatory novel hypotheses. Natural-Language techniques are specifically used to extract genre-based information from text documents. Additional semantic and rhetorical information for generating training data and for feeding a semi-structured Latent Semantic Analysis process is also captured. The discovery process is modeled by a semantically-guided Genetic Algorithm which uses training data to guide the search and optimization process. A number of novel criteria to evaluate the quality of the new knowledge are proposed. Consequently, new genetic operations suitable for text mining are designed, and techniques for Evolutionary Multi-Objective Optimization are adapted for the model to trade off between different criteria in the hypotheses. Domain experts were used in an experiment to assess the quality of the hypotheses produced by the model so as to establish their effectiveness in terms of novel and interesting knowledge. The assessment showed encouraging results for the discovered knowledge and for the correlation between the model and the human opinions.

    Inferring interestingness in online social networks

    Information sharing and user-generated content on the Internet have given rise to the increased presence of uninteresting and ‘noisy’ information in media streams on many online social networks. Although there is a lot of ‘interesting’ information also shared amongst users, the noise increases the cognitive burden on users trying to identify what is interesting and may increase the chance of missing content that is useful or important. Additionally, users on such platforms are generally limited to receiving information only from those that they are directly linked to on the social graph, meaning that users exist within distinct content ‘bubbles’, further limiting the chance of receiving interesting and relevant information from outside of the immediate social circle. In this thesis, Twitter is used as a platform for researching methods for deriving ‘interestingness’ through popularity as given by the mechanism of retweeting, which allows information to be propagated further between users on Twitter’s social graph. Retweet behaviours are studied, and features, such as those surrounding Tweet audience, information redundancy, and propagation depth through path-length, are uncovered to help relate retweet action to the underlying social graph and the communities it represents. This culminates in research into a methodology for assigning scores to Tweets based on their ‘quality’, which is validated and shown to perform well in various situations.

    Exploring the role of late-occurring nonspecific retroactive interference and interest on recall

    A thesis submitted in partial fulfilment of the requirements of the University of Wolverhampton for the degree of Doctor of Philosophy. Any form of post-encoding distraction, known as Nonspecific Retroactive Interference (NRI), may cause forgetting (Keppel, 1968; Wixted, 2004). However, recent experiments have not always found evidence for NRI and its effect may be very mild. NRI was tested across five experiments which aimed to take the epistemological approach of cognitive memory and forgetting research, and to incorporate the educational psychology domain of motivated learning through interest development. This enabled the exploration of factors which may affect NRI-based forgetting, including wakeful rest, mind wandering (MW), and various forms of interest. Verbal memory was tested within a short-term (five-minute retention intervals) learning and recall setting by comparing conditions where NRI (usually elicited by spot-the-difference tasks) was present or absent. This project carefully manipulated the role of prior tasks, measurements of interest and MW (depending on conceptualisation), and the NRI task. As a result, the thesis was able to explore the role of fatigue vs. cumulative similarity interference and the reliability of NRI effects, and to provide a cognitive explanation of interest-based learning. The results demonstrated that (1) Overall effects of NRI were more reliable than first hypothesised. (2) Interest is separate from NRI within this paradigm as it increases recall during the encoding phase, with interesting facts being retained more, but experiencing a similar susceptibility to interference as less interesting facts. (3) Subjective interest increases recall, with dispositional individual interest modulating the amount of situational interest evoked by the stimuli. (4) MW decreases recall but any interaction with interest requires further exploration. 
(5) Recall was consistently worse if the NRI condition was late-occurring, and there was limited evidence for a fatigue explanation. It is put forward that NRI is a low-level form of diversion interference which can accumulate with similarity-based PI, and potentially cognitive load.

    Exploratory search in time-oriented primary data

    In a variety of research fields, primary data that describes scientific phenomena in an original condition is obtained. Time-oriented primary data, in particular, is an indispensable data type, derived from complex measurements depending on time. Today, time-oriented primary data is collected at rates that exceed the domain experts’ abilities to seek valuable information undiscovered in the data. It is widely accepted that the magnitudes of uninvestigated data will disclose tremendous knowledge in data-driven research, provided that domain experts are able to gain insight into the data. Domain experts involved in data-driven research urgently require analytical capabilities. In scientific practice, predominant activities are the generation and validation of hypotheses. In analytical terms, these activities are often expressed in confirmatory and exploratory data analysis. Ideally, analytical support would combine the strengths of both types of activities. Exploratory search (ES) is a concept that seamlessly includes information-seeking behaviors ranging from search to exploration. ES supports domain experts in both gaining an understanding of huge and potentially unknown data collections and the drill-down to relevant subsets, e.g., to validate hypotheses. As such, ES combines predominant tasks of domain experts applied to data-driven research. For the design of useful and usable ES systems (ESS), data scientists have to incorporate different sources of knowledge and technology. Of particular importance is the state of the art in interactive data visualization and data analysis. Research in these factors is at the heart of Information Visualization (IV) and Visual Analytics (VA). Approaches in IV and VA provide meaningful visualization and interaction designs, allowing domain experts to perform the information-seeking process in an effective and efficient way. 
Today, best-practice ESS almost exclusively exist for textual data content, e.g., put into practice in digital libraries to facilitate the reuse of digital documents. For time-oriented primary data, ES mainly remains at a theoretical state. Motivation and Problem Statement. This thesis is motivated by two main assumptions. First, we expect that ES will have a tremendous impact on data-driven research for many research fields. In this thesis, we focus on time-oriented primary data, as a complex and important data type for data-driven research. Second, we assume that research conducted in IV and VA will particularly facilitate ES. For time-oriented primary data, however, novel concepts and techniques are required that enhance the design and the application of ESS. In particular, we observe a lack of methodological research in ESS for time-oriented primary data. In addition, the size, the complexity, and the quality of time-oriented primary data hamper the content-based access, as well as the design of visual interfaces for gaining an overview of the data content. Furthermore, the question arises how ESS can incorporate techniques for seeking relations between data content and metadata to foster data-driven research. Overarching challenges for data scientists are to create usable and useful designs, urgently requiring the involvement of the targeted user group and support techniques for choosing meaningful algorithmic models and model parameters. Throughout this thesis, we will resolve these challenges from conceptual, technical, and systemic perspectives. In turn, domain experts can benefit from novel ESS as a powerful analytical support to conduct data-driven research. Concepts for Exploratory Search Systems (Chapter 3). We postulate concepts for the ES in time-oriented primary data. Based on a survey of analysis tasks supported in IV and VA research, we present a comprehensive selection of tasks and techniques relevant for search and exploration activities. 
The assembly guides data scientists in the choice of meaningful techniques presented in IV and VA. Furthermore, we present a reference workflow for the design and the application of ESS for time-oriented primary data. The workflow divides the data processing and transformation process into four steps, and thus divides the complexity of the design space into manageable parts. In addition, the reference workflow describes how users can be involved in the design. The reference workflow is the framework for the technical contributions of this thesis. Visual-Interactive Preprocessing of Time-Oriented Primary Data (Chapter 4). We present a visual-interactive system that enables users to construct workflows for preprocessing time-oriented primary data. In this way, we introduce a means of providing content-based access. Based on a rich set of preprocessing routines, users can create individual solutions for data cleansing, normalization, segmentation, and other preprocessing tasks. In addition, the system supports the definition of time series descriptors and time series distance measures. Guidance concepts support users in assessing the workflow generalizability, which is important for large data sets. The execution of the workflows transforms time-oriented primary data into feature vectors, which can subsequently be used for downstream search and exploration techniques. We demonstrate the applicability of the system in usage scenarios and case studies. Content-Based Overviews (Chapter 5). We introduce novel guidelines and techniques for the design of content-based overviews. The three key factors are the creation of meaningful data aggregates, the visual mapping of these aggregates into the visual space, and the view transformation providing layouts of these aggregates in the display space. For each of these steps, we characterize important visualization and interaction design parameters allowing the involvement of users. 
We introduce guidelines supporting data scientists in choosing meaningful solutions. In addition, we present novel visual-interactive quality assessment techniques enhancing the choice of algorithmic model and model parameters. Finally, we present visual interfaces enabling users to formulate visual queries of the time-oriented data content. In this way, we provide means of combining content-based exploration with content-based search. Relation Seeking Between Data Content and Metadata (Chapter 6). We present novel visual interfaces enabling domain experts to seek relations between data content and metadata. These interfaces can be integrated into ESS to bridge analytical gaps between the data content and attached metadata. In three different approaches, we focus on different types of relations and define algorithmic support to guide users towards the most interesting relations. Furthermore, each of the three approaches comprises individual visualization and interaction designs, enabling users to explore both the data and the relations in an efficient and effective way. We demonstrate the applicability of our interfaces with usage scenarios, each conducted together with domain experts. The results confirm that our techniques are beneficial for seeking relations between data content and metadata, particularly for data-centered research. Case Studies - Exploratory Search Systems (Chapter 7). In two case studies, we put our concepts and techniques into practice. We present two ESS constructed in design studies with real users, real ES tasks, and real time-oriented primary data collections. The web-based VisInfo ESS is a digital library system facilitating the visual access to time-oriented primary data content. A content-based overview enables users to explore large collections of time series measurements and serves as a baseline for content-based queries by example. In addition, VisInfo provides a visual interface for querying time-oriented data content by sketch. 
A result visualization combines different views of the data content and metadata with faceted search functionality. The MotionExplorer ESS supports domain experts in human motion analysis. Two content-based overviews enhance the exploration of large collections of human motion capture data from two perspectives. MotionExplorer provides a search interface, allowing domain experts to query human motion sequences by example. Retrieval results are depicted in a visual-interactive view enabling the exploration of variations of human motions. Field study evaluations performed for both ESS confirm the applicability of the systems in the environment of the involved user groups. The systems yield a significant improvement of both the effectiveness and the efficiency in the day-to-day work of the domain experts. As such, both ESS demonstrate how large collections of time-oriented primary data can be reused to enhance data-centered research. In essence, our contributions cover the entire time series analysis process starting from accessing raw time-oriented primary data, processing and transforming time series data, to visual-interactive analysis of time series. We present visual search interfaces providing content-based access to time-oriented primary data. In a series of novel exploration-support techniques, we facilitate both gaining an overview of large and complex time-oriented primary data collections and seeking relations between data content and metadata. Throughout this thesis, we introduce VA as a means of designing effective and efficient visual-interactive systems. Our VA techniques empower data scientists to choose appropriate models and model parameters, as well as to involve users in the design. With both principles, we support the design of usable and useful interfaces which can be included into ESS. 
In this way, our contributions bridge the gap between search systems requiring exploration support and exploratory data analysis systems requiring visual querying capability. In the ESS presented in two case studies, we prove that our techniques and systems support data-driven research in an efficient and effective way.
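
The preprocessing idea summarized in the abstract above (cleansing, normalization, and segmentation, followed by a descriptor that turns a time series into a feature vector and a distance measure over such vectors) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the system from the thesis: the function names are hypothetical, gaps are assumed to be encoded as `None`, and the descriptor used here is the mean of each segment (commonly known as Piecewise Aggregate Approximation), chosen for brevity rather than taken from the source.

```python
from statistics import fmean, pstdev


def fill_gaps(series):
    """Cleansing: forward-fill missing values (None); drop leading gaps."""
    cleaned, last = [], None
    for value in series:
        if value is None:
            if last is None:
                continue
            value = last
        cleaned.append(value)
        last = value
    return cleaned


def z_normalize(series):
    """Normalization: zero mean and unit variance (all zeros if constant)."""
    mean, std = fmean(series), pstdev(series)
    if std == 0:
        return [0.0 for _ in series]
    return [(value - mean) / std for value in series]


def segment(series, n_segments):
    """Segmentation: contiguous chunks of near-equal length."""
    n = len(series)
    if n < n_segments:
        raise ValueError("series is shorter than the number of segments")
    bounds = [i * n // n_segments for i in range(n_segments + 1)]
    return [series[bounds[i]:bounds[i + 1]] for i in range(n_segments)]


def to_feature_vector(series, n_segments=4):
    """Descriptor (assumed): mean of each segment of the normalized series."""
    prepared = z_normalize(fill_gaps(series))
    return [fmean(chunk) for chunk in segment(prepared, n_segments)]


def euclidean(u, v):
    """Distance measure between two feature vectors of equal length."""
    return sum((a - b) ** 2 for a, b in zip(u, v)) ** 0.5


# Two series with the same shape at different scales map to (nearly) the same vector.
small = to_feature_vector([1, 2, 3, 4])
large = to_feature_vector([10, 20, 30, 40])
print(euclidean(small, large))
```

Because normalization happens before the descriptor is computed, the distance reflects differences in shape rather than in offset or scale, which is one reason such steps are typically placed ahead of content-based search.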

    Sentiment Analysis in Social Streams

    In this chapter, we review and discuss the state of the art on sentiment analysis in social streams, such as web forums, microblogging systems, and social networks, aiming to clarify how user opinions, affective states, and intended emotional effects are extracted from user-generated content, how they are modeled, and how they could finally be exploited. We explain why sentiment analysis tasks are more difficult for social streams than for other textual sources, and entail going beyond classic text-based opinion mining techniques. We show, for example, that social streams may use vocabularies and expressions that exist outside the mainstream of standard, formal languages, and may reflect complex dynamics in the opinions and sentiments expressed by individuals and communities.