11 research outputs found

    A Transdisciplinary Approach to Construct Search and Integration

    Get PDF
    Human behaviors play a leading role in many critical areas, including the adoption of information systems, the prevention of disease, and educational achievement. Research in the behavioral sciences has grown explosively during the past decade, and behavioral scientists now recognize that, given this ever-expanding volume of research, it is impossible to find and incorporate all relevant inter-related disciplinary knowledge. Unfortunately, because of inconsistent language and construct proliferation across disciplines, this excellent but disconnected research has not been utilized fully or effectively to address problems of human health or other areas. This paper introduces a newly developed, cutting-edge technology, the Inter-Nomological Network (INN), which for the first time gives behavioral scientists an integrating tool for building effectively upon prior research. We expect INN to provide the first step in moving the behavioral sciences into an era of integrated science. INN is based on Latent Semantic Analysis (LSA), a theory of language use with associated automatic computerized text-analysis capabilities.
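    As a rough illustration of the LSA machinery the abstract refers to (the matrix, vocabulary, and rank below are invented for the example, not taken from the paper), a truncated SVD of a term-document matrix projects documents into a low-rank semantic space where related usage patterns cluster:

```python
import numpy as np

# Toy term-document matrix: rows = terms, columns = documents.
# (Illustrative counts only; a real LSA pipeline builds this from a corpus.)
A = np.array([
    [2.0, 0.0, 1.0, 0.0],  # "anxiety"
    [1.0, 0.0, 2.0, 0.0],  # "stress"
    [0.0, 3.0, 0.0, 1.0],  # "adoption"
    [0.0, 1.0, 0.0, 2.0],  # "usage"
])

# LSA core step: truncated SVD projects documents into a shared
# low-rank "semantic" space where related vocabulary collapses
# onto common dimensions.
U, s, Vt = np.linalg.svd(A, full_matrices=False)
k = 2
doc_vecs = (np.diag(s[:k]) @ Vt[:k]).T  # one k-dimensional vector per document

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Documents 0 and 2 share vocabulary, so they land close together
# in the reduced space, while document 1 stays far from both.
print(cosine(doc_vecs[0], doc_vecs[2]) > cosine(doc_vecs[0], doc_vecs[1]))
```

    In a real LSA pipeline the matrix is built from a large corpus and k is in the hundreds; the principle, comparing documents in the reduced space rather than on raw term overlap, is the same.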

    Clustering and its Application in Requirements Engineering

    Get PDF
    Large-scale software systems challenge almost every activity in the software development life-cycle, including tasks related to eliciting, analyzing, and specifying requirements. Fortunately, many of these complexities can be addressed by clustering the requirements in order to create abstractions that are meaningful to human stakeholders. For example, the requirements elicitation process can be supported by dynamically clustering incoming stakeholders' requests into themes. Cross-cutting concerns, which have a significant impact on architectural design, can be identified through the use of fuzzy clustering techniques and metrics designed to detect when a theme cross-cuts the dominant decomposition of the system. Finally, traceability techniques, required in critical software projects by many regulatory bodies, can be automated and enhanced by cluster-based information retrieval methods. Unfortunately, despite a significant body of work describing document clustering techniques, there is almost no prior work that directly addresses the challenges, constraints, and nuances of requirements clustering. As a result, the effectiveness of software engineering tools and processes that depend on requirements clustering is severely limited. This report directly addresses the problem of clustering requirements by surveying standard clustering techniques and discussing their application to the requirements clustering process.
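    As an illustrative sketch of the theme-clustering idea (not a technique taken from the report; the requirement statements, the TF-IDF weighting, and the greedy similarity threshold are all hypothetical choices), incoming requirements can be vectorized and assigned to the first sufficiently similar theme:

```python
import math
from collections import Counter

# Hypothetical incoming requirement statements.
reqs = [
    "The system shall encrypt user passwords",
    "User passwords shall be stored encrypted",
    "The UI shall display a progress bar during upload",
    "A progress indicator shall be shown while files upload",
]

def tfidf_vectors(docs):
    """Represent each document as a sparse dict of TF-IDF weights."""
    tokenized = [doc.lower().split() for doc in docs]
    df = Counter(t for toks in tokenized for t in set(toks))
    n = len(docs)
    return [
        {t: c * math.log(n / df[t]) for t, c in Counter(toks).items()}
        for toks in tokenized
    ]

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def cluster(docs, threshold=0.1):
    """Greedy single-pass clustering: join the first theme whose seed
    is similar enough, otherwise start a new theme."""
    vecs = tfidf_vectors(docs)
    themes = []  # list of (seed_vector, member_indices)
    for i, v in enumerate(vecs):
        for seed, members in themes:
            if cosine(seed, v) >= threshold:
                members.append(i)
                break
        else:
            themes.append((v, [i]))
    return [members for _, members in themes]

print(cluster(reqs))  # two themes: password storage vs. upload progress
```

    The hypothetical threshold stands in for the harder question the report raises: generic document-clustering defaults rarely suit the short, constrained text of requirements.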

    Context-sensitive information retrieval

    Full text link
    Thesis digitized by the Direction des bibliothèques de l'Université de Montréal.

    Recommender systems in industrial contexts

    Full text link
    This thesis consists of four parts:
    - An analysis of the core functions of, and prerequisites for, recommender systems in an industrial context: we identify four core functions for recommendation systems: Help to Decide, Help to Compare, Help to Explore, Help to Discover. The implementation of these functions has implications for the choices at the heart of algorithmic recommender systems.
    - A state of the art covering the main techniques used in automated recommendation systems: the two most commonly used algorithmic methods, K-Nearest-Neighbor (KNN) methods and fast matrix factorization methods, are detailed. The state of the art also presents purely content-based methods, hybridization techniques, and the classical performance metrics used to evaluate recommender systems. It then gives an overview of several systems from both academia and industry (Amazon, Google ...).
    - An analysis of the performance and implications of a recommendation system developed during this thesis: this system, Reperio, is a hybrid recommender engine using KNN methods. We study the performance of the KNN methods, including the impact of the similarity functions used, and then examine the performance of the KNN method in critical use cases, namely cold-start situations.
    - A methodology for analyzing the performance of recommender systems in an industrial context: this methodology assesses the added value of algorithmic strategies and recommendation systems according to their core functions.
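    The KNN approach at the heart of engines like Reperio can be sketched as follows. This is a generic user-based variant with invented ratings, not the thesis's actual implementation, and the similarity function and neighborhood size k are precisely the design choices whose impact the thesis studies:

```python
import numpy as np

# Toy user-item rating matrix (0 = unrated); data is hypothetical.
R = np.array([
    [5, 4, 0, 1],
    [4, 5, 1, 0],
    [1, 0, 5, 4],
    [0, 1, 4, 5],
], dtype=float)

def cosine_sim(a, b):
    mask = (a > 0) & (b > 0)  # compare only co-rated items
    if not mask.any():
        return 0.0
    return float(a[mask] @ b[mask] /
                 (np.linalg.norm(a[mask]) * np.linalg.norm(b[mask])))

def predict(R, user, item, k=2):
    """Similarity-weighted average over the k most similar users
    who actually rated the target item."""
    raters = [u for u in range(R.shape[0]) if u != user and R[u, item] > 0]
    sims = sorted(((cosine_sim(R[user], R[u]), u) for u in raters),
                  reverse=True)[:k]
    num = sum(s * R[u, item] for s, u in sims)
    den = sum(abs(s) for s, _ in sims)
    return num / den if den else 0.0

# User 0's nearest neighbour (user 1) rated item 2 low, so the
# predicted rating stays low.
print(round(predict(R, user=0, item=2), 2))
```

    A cold-start user has few or no co-rated items, so every similarity collapses toward zero and the prediction degenerates, which is why the cold-start use cases mentioned above require special treatment.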

    Genre, academic writing and e-learning: An integrated tertiary level Taiwan-based study

    Get PDF
    The research reported here has two main areas of focus: online learning and the teaching of academic writing to learners of English as an additional language. At its core is a study involving an intensive genre-centered writing course conducted in a tertiary educational institution in Taiwan and delivered in three modes: face-to-face, fully online, and blended. That study, preceded by a pilot study conducted in New Zealand, involved a writing course focused on cognitive genres (e.g. argument) that have been identified as fundamental to academic writing. It included model texts (constructed in segments, with accompanying discussion of their language and structure) and writing exercises. Analysis of post-course questionnaires and focus-group discussions revealed a high level of satisfaction with the course. Analysis of pre-test and post-test writing tasks against a wide range of criteria provided evidence of improvement in the writing of course participants in a range of areas. Although those in the blended and face-to-face modes were the most positive about the advantages of the course, they did not necessarily outperform online group members in terms of improvement in writing. Also included are two questionnaire-based surveys of samples of teachers of English in tertiary-level educational institutions in Taiwan. The first investigated attitudes and practices in relation to the integration of instructional technology into teaching. Although the vast majority of survey participants believed it important to incorporate instructional technology into their teaching, this was not necessarily reflected in their more specific beliefs and practices.
    Very few reported having spent more than a few hours attending instructional-technology workshops; more than half indicated that very little or none of the interaction in their language classes was computer-mediated; only approximately one third reported having used a learning platform in the six weeks prior to the survey; and over one third reported never having used a learning platform. The second questionnaire-based survey investigated attitudes and practices in relation to the teaching and assessment of writing. Although survey participants were familiar with process-centered approaches to the teaching of writing, they appeared to be much less familiar with genre-centered approaches. Using model texts as a way of introducing, demonstrating, and explaining language in use seemed to be the exception rather than the rule. Additionally, although they reported spending a considerable amount of time grading and commenting on their students' writing, most indicated that they did not design grading criteria relating specifically to course content, and many of the sample comments on student writing that they provided were of a type unlikely to help students improve their writing. Overall, the study provides evidence that a genre-centered academic writing course can be associated with a high level of student satisfaction and can lead to demonstrable improvement in student writing. However, it also demonstrates that teachers of English at tertiary level in Taiwan are generally unfamiliar with this sort of approach and that many of them are not yet ready to provide their students with options in terms of delivery modes.

    From Bugs to Decision Support – Leveraging Historical Issue Reports in Software Evolution

    Get PDF
    Software developers in large projects work in complex information landscapes and staying on top of all relevant software artifacts is an acknowledged challenge. As software systems often evolve over many years, a large number of issue reports is typically managed during the lifetime of a system, representing the units of work needed for its improvement, e.g., defects to fix, requested features, or missing documentation. Efficient management of incoming issue reports requires the successful navigation of the information landscape of a project. In this thesis, we address two tasks involved in issue management: Issue Assignment (IA) and Change Impact Analysis (CIA). IA is the early task of allocating an issue report to a development team, and CIA is the subsequent activity of identifying how source code changes affect the existing software artifacts. While IA is fundamental in all large software projects, CIA is particularly important to safety-critical development. Our solution approach, grounded on surveys of industry practice as well as scientific literature, is to support navigation by combining information retrieval and machine learning into Recommendation Systems for Software Engineering (RSSE). While the sheer number of incoming issue reports might challenge the overview of a human developer, our techniques instead benefit from the availability of ever-growing training data. We leverage the volume of issue reports to develop accurate decision support for software evolution. We evaluate our proposals both by deploying an RSSE in two development teams, and by simulation scenarios, i.e., we assess the correctness of the RSSEs' output when replaying the historical inflow of issue reports. In total, more than 60,000 historical issue reports are involved in our studies, originating from the evolution of five proprietary systems for two companies. 
    Our results show that RSSEs for both IA and CIA can help developers navigate large software projects, in terms of locating development teams and software artifacts. Finally, we discuss how to support the transfer of our results to industry, focusing on addressing the context dependency of our tool support by systematically tuning parameters to a specific operational setting.

    Exploitation de connaissances sémantiques externes dans les représentations vectorielles en recherche documentaire

    Get PDF
    The work presented in this thesis deals with several problems met in information retrieval (IR), a task that can be summarized as identifying, in a collection of "documents", the subset of documents carrying a sought piece of information, i.e., those relevant to a request expressed by a user. In the case of textual documents, to which we limit ourselves in this thesis, a significant part of the difficulty lies in the ambiguity inherent in human languages. Interaction with the user is also addressed in our work, through the study of a tool enabling natural-language access to a database. Finally, some techniques permitting the visualization of large collections of documents are presented. In this document we first describe the principal models of IR, highlighting their relations to manual techniques of IR and document retrieval developed during past centuries. We present the principle of document indexing, which allows documents to be represented in a multidimensional space, and the use of this representation by a vector-space model. After reviewing the principal improvements made to vector-space retrieval systems in recent years, including collection preprocessing, indexing mechanisms, and measures of similarity between documents, we detail some recent uses of additional semantic resources (semantic dictionaries, thesauri, networks, ontologies) reported in the scientific literature for the indexing task. We then present in more detail the principle of semantic indexing of textual documents using a thesaurus, which consists of integrating into the documents' representation space at least part of the informational content of hierarchical semantic resources. We propose a general framework for describing and positioning the various possible techniques for semantic indexing, adapting, where possible, the specificity of the descriptions derived from the semantic resources to the data to be represented.
    We use this framework to describe three families of criteria usable for semantic indexing, each with its own characteristics, and for each family we give the specific algorithms for computing the criteria. The first two families allow us to consider several criteria already known in feature selection; we show, unfortunately, that many of these criteria are in fact not very effective for the task at hand. The third family allows us to introduce a completely new criterion, the Minimum Redundancy Cut criterion (MRC), built on information theory, which yields index terms whose probabilities of occurrence in the document collection are as well balanced as possible. Finally, we treat the case of a semantic index that is independent of the data (chosen statically), allowing the level of generality of the index terms to be parameterized. Some of the criteria proposed for semantic indexing have been evaluated empirically. To judge their relevance, we used a well-known vector-space system (the SMART IR system) and measured the retrieval performance obtained on several reference collections. These collections were indexed on the basis of the criterion under study, taking into account the strongly structuring semantic relation of hypernymy/hyponymy (the "is-a" relation) given by two different semantic resources. By comparing these results with the performance of a traditional indexing (using word lemmas as the representation space), we show on the one hand the relevance of semantic indexing for IR and on the other the quality of the proposed criterion (MRC). Concerning human-machine interaction, we present a general framework for building, relatively quickly and systematically, mixed-initiative systems that give the human user a large (and natural) latitude in the control of the dialogue.
    This framework is usable in typical database search applications (where the database is hidden from the user, but the user knows exactly which information they wish to find) as well as in advice-giving applications, in which users do not necessarily have a precise idea of their needs and use the system not only to specify their wishes but also to obtain a set of propositions as a final result. We particularly stress the techniques that make the system robust, able to cope with speech-recognizer failures. Concerning the visualization of large collections of textual data, we present an application of correspondence analysis (which highlights similarities and oppositions among groups of entities built from additional features present in the database) to patent data. In addition, we propose a method (based on the bootstrap replication principle) for determining a confidence interval for the relative positions of the various groups, making it possible to judge immediately the reliability of visually apparent similarities or oppositions.
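    The core of the thesaurus-based semantic indexing described above can be sketched in a few lines. The toy thesaurus and documents are hypothetical, and real systems weight the contribution of each hypernym level rather than counting it fully:

```python
# Minimal sketch of thesaurus-based semantic indexing: each term in a
# document also activates its hypernym ("is-a") ancestors, so documents
# using different words for related concepts share index dimensions.
# The tiny thesaurus and documents below are invented for illustration.

HYPERNYMS = {
    "dog": ["canine", "animal"],
    "cat": ["feline", "animal"],
    "car": ["vehicle"],
}

def semantic_index(tokens):
    """Bag of words expanded with hypernym ancestors."""
    index = {}
    for t in tokens:
        for term in [t] + HYPERNYMS.get(t, []):
            index[term] = index.get(term, 0) + 1
    return index

def overlap(a, b):
    """Shared index mass between two expanded representations."""
    return sum(min(a[t], b[t]) for t in a if t in b)

d1 = semantic_index(["dog", "barks"])
d2 = semantic_index(["cat", "sleeps"])
d3 = semantic_index(["car", "drives"])

# Without expansion d1 and d2 share no terms; with hypernyms
# they meet at "animal", while d3 remains unrelated.
print(overlap(d1, d2), overlap(d1, d3))
```

    Choosing which hypernym levels to admit, and with what weight, is exactly the selection problem the criteria families above (including MRC) are designed to solve.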

    Seventh Biennial Report: June 2003 - March 2005

    No full text

    An Investigation of Autism Support Groups on Facebook

    Get PDF
    Autism-affected users, such as autism patients, caregivers, parents, family members, and researchers, currently seek informational and social support from communities on social media. To reveal the information needs of autism-affected users, this study centers on users' interactions and information sharing within autism communities on social media, aiming to understand how autism-affected users utilize support groups on Facebook. A systematic method was proposed to aid the data analysis, comprising social network analysis, topic modeling, sentiment analysis, and inferential analysis. Social network analysis was adopted to reveal the interaction patterns appearing in the groups, and topic modeling was employed to uncover the discussion themes that users were concerned with in their daily lives. Sentiment analysis helped characterize the emotional content that users expressed in the groups, and inferential analysis was applied to compare the similarities and differences among the autism support groups found on Facebook. This study collected user-generated content from five sampled support groups (an awareness group, a treatment group, a parents group, a research group, and a local support group) on Facebook. Findings show that the discussion topics varied across groups. Influential users in each Facebook support group were identified through analysis of the interaction network; the results indicated that these users not only attracted more attention from other group members but also led the discussion topics in the group. In addition, the analysis showed that autism support groups on Facebook offered a supportive emotional atmosphere for group members. The findings revealed the characteristics of user interactions and information exchange in autism support groups on social media.
    Theoretically, the findings demonstrate the significance of social media for autism-affected users. A distinctive contribution of this study is its identification of support groups on Facebook as a source of informational, social, and emotional support for autism-affected users. The methodology applied here presents a systematic approach to evaluating information exchange in health-related support groups on social media, and it investigates the potential role of technology in the social lives of autism-affected users. The outcomes of this study can contribute to improving online intervention programs by highlighting effective communication approaches.
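    As a minimal sketch of the social network analysis step (the interaction data and the choice of in-degree centrality are illustrative assumptions, not the study's actual measures), influential users can be surfaced by counting who receives the most replies:

```python
from collections import Counter

# Hypothetical interaction edges: (commenter, post_author) pairs,
# e.g. extracted from who replies to whom in a support group.
interactions = [
    ("ann", "bob"), ("carl", "bob"), ("dana", "bob"),
    ("bob", "ann"), ("carl", "ann"),
    ("dana", "carl"),
]

def influential_users(edges, top=2):
    """Rank users by in-degree: those who receive the most replies
    are treated as the influential members of the network."""
    indegree = Counter(target for _, target in edges)
    return [user for user, _ in indegree.most_common(top)]

print(influential_users(interactions))
```

    Real analyses typically combine several centrality measures and much larger graphs, but the principle of reading influence off the interaction network is the same.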

    Proceedings of the Seventh Italian Conference on Computational Linguistics CLiC-it 2020

    Get PDF
    On behalf of the Program Committee, a very warm welcome to the Seventh Italian Conference on Computational Linguistics (CLiC-it 2020). This edition of the conference is held in Bologna and organised by the University of Bologna. The CLiC-it conference series is an initiative of the Italian Association for Computational Linguistics (AILC) which, after six years of activity, has clearly established itself as the premier national forum for research and development in the fields of Computational Linguistics and Natural Language Processing, where leading researchers and practitioners from academia and industry meet to share their research results, experiences, and challenges.