1,525 research outputs found

    Improving requirements with NLP techniques

    Elaborating “good” requirements specifications is an important factor for the success of a software project. Requirements are normally expressed using textual descriptions in natural language, but not without problems. Some requirements documentation techniques, such as use case specifications, often focus on functionality and leave many concerns understated in the text and scattered across several documents. These concerns, commonly known as crosscutting or architecturally-relevant concerns, often come from business goals or quality attributes that must be clearly identified by analysts and developers, as they can have a far-reaching effect on the development process. Not treating these concerns at early development stages can lead to poor design solutions that become difficult (and costly) to fix afterwards. Unfortunately, searching for concerns in textual requirements is a difficult and time-consuming task for analysts, because requirements are often poorly modularized and text is duplicated across documents. (Paragraph extracted from the text as a summary.) Sociedad Argentina de Informática e Investigación Operativa (SADIO)

    From Bugs to Decision Support – Leveraging Historical Issue Reports in Software Evolution

    Software developers in large projects work in complex information landscapes, and staying on top of all relevant software artifacts is an acknowledged challenge. As software systems often evolve over many years, a large number of issue reports is typically managed during the lifetime of a system, representing the units of work needed for its improvement, e.g., defects to fix, requested features, or missing documentation. Efficient management of incoming issue reports requires the successful navigation of the information landscape of a project. In this thesis, we address two tasks involved in issue management: Issue Assignment (IA) and Change Impact Analysis (CIA). IA is the early task of allocating an issue report to a development team, and CIA is the subsequent activity of identifying how source code changes affect the existing software artifacts. While IA is fundamental in all large software projects, CIA is particularly important to safety-critical development. Our solution approach, grounded in surveys of industry practice as well as scientific literature, is to support navigation by combining information retrieval and machine learning into Recommendation Systems for Software Engineering (RSSE). While the sheer number of incoming issue reports might challenge the overview of a human developer, our techniques instead benefit from the availability of ever-growing training data. We leverage the volume of issue reports to develop accurate decision support for software evolution. We evaluate our proposals both by deploying an RSSE in two development teams, and by simulation scenarios, i.e., we assess the correctness of the RSSEs' output when replaying the historical inflow of issue reports. In total, more than 60,000 historical issue reports are involved in our studies, originating from the evolution of five proprietary systems for two companies.
Our results show that RSSEs for both IA and CIA can help developers navigate large software projects, in terms of locating development teams and software artifacts. Finally, we discuss how to support the transfer of our results to industry, focusing on addressing the context dependency of our tool support by systematically tuning parameters to a specific operational setting.

    BlogForever D2.6: Data Extraction Methodology

    This report outlines an inquiry into the area of web data extraction, conducted within the context of blog preservation. The report reviews theoretical advances and practical developments for implementing data extraction. The inquiry is extended through an experiment that demonstrates the effectiveness and feasibility of implementing some of the suggested approaches. More specifically, the report discusses an approach based on unsupervised machine learning that employs the RSS feeds and HTML representations of blogs. It outlines the possibilities of extracting semantics available in blogs and demonstrates the benefits of exploiting available standards such as microformats and microdata. The report proceeds to propose a methodology for extracting and processing blog data to further inform the design and development of the BlogForever platform.

    Intelligent Information Access to Linked Data - Weaving the Cultural Heritage Web

    The subject of the dissertation is ALAP, an information alignment experiment involving two cultural heritage information systems: the Perseus Digital Library and Arachne. In modern societies, information integration is gaining importance for many tasks such as business decision making or even catastrophe management. It is beyond doubt that the information available in digital form can offer users new ways of interaction. In the humanities and cultural heritage communities, too, more and more information is being published online. But in many situations the way that information has been made publicly available is disruptive to the research process due to its heterogeneity and distribution. Therefore, integrated information will be a key factor in pursuing successful research, and the need for information alignment is widely recognized. ALAP is an attempt to integrate information from Perseus and Arachne, not only on a schema level, but also by performing entity resolution. To that end, technical peculiarities and philosophical implications of the concepts of identity and co-reference are discussed. Multiple approaches to information integration and entity resolution are discussed and evaluated. The methodology that is used to implement ALAP is mainly rooted in the fields of information retrieval and knowledge discovery. First, an exploratory analysis was performed on both information systems to get a first impression of the data. After that, (semi-)structured information from both systems was extracted and normalized. Then, a clustering algorithm was used to reduce the number of needed entity comparisons. Finally, a thorough matching was performed on the different clusters. ALAP helped with identifying challenges and highlighted the opportunities that arise during the attempt to align cultural heritage information systems.
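    The normalize, cluster, then match sequence described in this abstract can be illustrated with a minimal sketch. This is not the dissertation's implementation: the three-character blocking key, the use of `difflib.SequenceMatcher` as the similarity measure, the 0.85 threshold, and all function names are assumptions chosen only to show why grouping records first reduces the number of pairwise comparisons.

```python
from collections import defaultdict
from difflib import SequenceMatcher
from itertools import product


def normalize(name: str) -> str:
    """Lowercase and collapse whitespace so trivially different spellings agree."""
    return " ".join(name.lower().split())


def block_key(name: str) -> str:
    """Hypothetical blocking key: first three characters of the normalized name."""
    return normalize(name)[:3]


def match(source_a, source_b, threshold=0.85):
    """Return (a, b, score) pairs whose similarity meets the threshold.

    Only records that share a block key are compared, so the full
    |A| x |B| comparison is avoided (an assumed stand-in for the
    clustering step mentioned in the abstract).
    """
    blocks = defaultdict(lambda: ([], []))
    for rec in source_a:
        blocks[block_key(rec)][0].append(rec)
    for rec in source_b:
        blocks[block_key(rec)][1].append(rec)

    matches = []
    for left, right in blocks.values():
        for a, b in product(left, right):
            score = SequenceMatcher(None, normalize(a), normalize(b)).ratio()
            if score >= threshold:
                matches.append((a, b, round(score, 2)))
    return matches
```

    A crude prefix key like this trades recall for speed: records whose names differ in the first few characters never meet, which is why a real system would likely need a more forgiving clustering criterion.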

    A computational academic integrity framework

    The growing scope and changing nature of academic programmes pose a challenge to the integrity of traditional testing and examination protocols. The aim of this thesis is to introduce an alternative to the traditional approaches to academic integrity, bridging the anonymity gap and empowering instructors and academic administrators with new ways of maintaining academic integrity that preserve privacy, minimize disruption to the learning process, and promote accountability, accessibility and efficiency. This work aims to initiate a paradigm shift in academic integrity practices. Research in the area of learner identity and authorship assurance is important because the award of course credits to unverified entities is detrimental to institutional credibility and public safety. This thesis builds upon the notion of learner identity consisting of two distinct layers (a physical layer and a behavioural layer), where the criteria of identity and authorship must both be confirmed to maintain a reasonable level of academic integrity. To pursue this goal in an organized fashion, this thesis has the following three sections: (a) theoretical, (b) empirical, and (c) pragmatic.
