350 research outputs found

    Humour support and emotive stance in comments on Korean TV Drama

    Viewers on viki.com comment on Korean television drama series while watching: they produce timed comments tied to the timecode of the audiovisual stream. Among the functions these comments have in the community, the expression of emotive stance is central. Importantly, this includes humour support encoded in a variety of linguistic and paralinguistic ways. Our study identifies a range of humour support indicators, which allow us to find comments that are responses to humour. Accordingly, our study explores how commenters make use of the affordances of the Viki timed comment feature to linguistically and paralinguistically encode their humorous reactions to fictional events and to previous comments. We do this both quantitatively, based on a multilingual corpus of all 320,118 timed comments that accompany five randomly selected Korean dramas (80 episodes in total), and qualitatively, based on the in-depth analysis of two episodes. What we contribute is a typology and the distribution of humour support indicators used in a novel genre of technology-mediated communication, as well as insights into how the viewing community collectively does humour support. Finally, we also present the semi-automatic detection of humour support as a viable strategy to objectively identify humour-relevant scenes in Korean TV drama.
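
    As a rough illustration of how such indicator-based detection could be implemented, the sketch below flags timed comments that contain laughter tokens or face emoji and then groups the flagged comments into candidate humour-relevant time windows. The indicator patterns, comment format, and thresholds are illustrative assumptions for this sketch, not the study's actual typology or pipeline.

        import re
        from dataclasses import dataclass

        # Hypothetical humour support indicators; the study's typology is richer
        # and multilingual.
        LAUGHTER_RE = re.compile(r"\b(?:a?ha(?:ha)+|lo+l|lmao|k{3,})\b", re.IGNORECASE)
        EMOJI_RE = re.compile("[\U0001F600-\U0001F64F]")  # emoticon block

        @dataclass
        class TimedComment:
            episode: str
            timecode: float  # seconds into the audiovisual stream
            text: str

        def is_humour_support(comment: TimedComment) -> bool:
            """Flag a comment as a humour support response if it contains a
            laughter token or a face emoji (linguistic/paralinguistic cues)."""
            return bool(LAUGHTER_RE.search(comment.text) or EMOJI_RE.search(comment.text))

        def humour_relevant_windows(comments, window=10.0, min_hits=5):
            """Group flagged comments by timecode; windows with many humour
            support responses point to candidate humour-relevant scenes."""
            hits = sorted(c.timecode for c in comments if is_humour_support(c))
            scenes, start, count = [], None, 0
            for t in hits:
                if start is None or t - start > window:
                    if count >= min_hits:
                        scenes.append((start, start + window))
                    start, count = t, 1
                else:
                    count += 1
            if count >= min_hits:
                scenes.append((start, start + window))
            return scenes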

    Improving Software Project Health Using Machine Learning

    In recent years, systems that would previously live on different platforms have been integrated under a single umbrella. The increased use of GitHub, which offers pull requests, issue tracking and version history, and its integration with other solutions such as Gerrit or Travis, as well as the response from competitors, created development environments that favour agile methodologies by increasingly automating non-coding tasks: automated build systems, automated issue triaging, etc. In essence, source-code hosting platforms shifted to continuous integration/continuous delivery (CI/CD) as a service. This facilitated a shift in development paradigms: adherents of agile methodology can now adopt a CI/CD infrastructure more easily. This has also created large, publicly accessible sources of source code together with related project artefacts: GHTorrent and similar datasets now offer programmatic access to the whole of GitHub. Project health encompasses traceability, documentation, adherence to coding conventions, and tasks that reduce maintenance costs and increase accountability but may not directly impact features. Overfocus on health can slow velocity (new feature delivery), so the Agile Manifesto suggests developers should travel light: forgo tasks focused on project health in favour of higher feature velocity. Obviously, injudiciously following this suggestion can undermine a project's chances for success. Simultaneously, this shift to CI/CD has allowed the proliferation of Natural Language, or Natural Language and Formal Language, textual artefacts that are programmatically accessible: GitHub and its competitors allow API access to their infrastructure to enable the creation of CI/CD bots. This suggests that approaches from Natural Language Processing and Machine Learning are now feasible and indeed desirable. This thesis aims to (semi-)automate tasks for this new paradigm and its attendant infrastructure by bringing to the foreground the relevant NLP and ML techniques. Under this umbrella, I focus on three synergistic tasks from this domain: (1) improving issue-pull-request traceability, which can aid existing systems to automatically curate the issue backlog as pull requests are merged; (2) untangling commits in a version history, which can aid the aforementioned traceability task as well as improve the usability of determining a fault-introducing commit or cherry-picking via tools such as git bisect; (3) mixed-text parsing, to allow better API mining and open new avenues for project-specific code-recommendation tools.
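
    As a hedged illustration of the first task, issue-pull-request traceability can be framed as ranking open issues by their textual similarity to a pull request. The sketch below uses TF-IDF cosine similarity as a simple stand-in for the models developed in the thesis; the issue data, function name, and threshold are invented for the example.

        from sklearn.feature_extraction.text import TfidfVectorizer
        from sklearn.metrics.pairwise import cosine_similarity

        def link_pr_to_issues(pr_text, issues, threshold=0.3):
            """Rank issues by TF-IDF cosine similarity to a pull request's title
            and description, returning candidate links above a threshold.

            `issues` maps an issue id to its title and body; the threshold is an
            arbitrary illustrative value, not a tuned one."""
            ids = list(issues)
            corpus = [issues[i] for i in ids] + [pr_text]
            matrix = TfidfVectorizer(stop_words="english").fit_transform(corpus)
            sims = cosine_similarity(matrix[-1], matrix[:-1]).ravel()
            ranked = sorted(zip(ids, sims), key=lambda pair: pair[1], reverse=True)
            return [(issue_id, score) for issue_id, score in ranked if score >= threshold]

        # Toy usage:
        issues = {
            "#101": "Crash when parsing empty config file",
            "#102": "Add dark mode to settings page",
        }
        print(link_pr_to_issues("Fix parser crash on empty config", issues))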

    Semantic discovery and reuse of business process patterns

    Patterns currently play an important role in modern information systems (IS) development, but their use has mainly been restricted to the design and implementation phases of the development lifecycle. Given the increasing significance of business modelling in IS development, patterns have the potential to provide a viable solution for promoting the reusability of recurrent generalized models in the very early stages of development. As a statement of research in progress, this paper focuses on business process patterns and proposes an initial methodological framework for the discovery and reuse of business process patterns within the IS development lifecycle. The framework borrows ideas from the domain engineering literature and proposes the use of semantics to drive both the discovery of patterns and their reuse.

    Propelling the Potential of Enterprise Linked Data in Austria. Roadmap and Report

    In times of digital transformation and considering the potential of the data-driven economy, it is crucial not only that data is made available and data sources can be trusted, but also that data integrity can be guaranteed, that necessary privacy and security mechanisms are in place, and that data and access comply with policies and legislation. In many cases, complex and interdisciplinary questions cannot be answered by a single dataset, and thus it is necessary to combine data from multiple disparate sources. However, because most data today is locked up in isolated silos, it cannot be used to its fullest potential. The core challenge for most organisations and enterprises with regard to data exchange and integration is to be able to combine data from internal and external data sources in a manner that supports both day-to-day operations and innovation. Linked Data is a promising data publishing and integration paradigm that builds upon standard web technologies. It supports the publishing of structured data in a semantically explicit and interlinked manner such that it can be easily connected, and consequently becomes more interoperable and useful. The PROPEL project - Propelling the Potential of Enterprise Linked Data in Austria - surveyed technological challenges, entrepreneurial opportunities, and open research questions on the use of Linked Data in a business context and developed a roadmap and a set of recommendations for policy makers, industry, and the research community. Shifting away from a predominantly academic perspective and an exclusive focus on open data, the project looked at Linked Data as an emerging disruptive technology that enables efficient enterprise data management in the rising data economy. Current market forces provide many opportunities, but also present several data and information management challenges. Given that Linked Data enables advanced analytics and decision-making, it is particularly suitable for addressing today's data and information management challenges. In our research, we identified a variety of highly promising use cases for Linked Data in an enterprise context. Examples of promising application domains include "customization and customer relationship management", "automatic and dynamic content production, adaption and display", "data search, information retrieval and knowledge discovery", as well as "data and information exchange and integration". The analysis also revealed broad potential across a large spectrum of industries whose structural and technological characteristics align well with Linked Data characteristics and principles: energy, retail, finance and insurance, government, health, transport and logistics, telecommunications, media, tourism, engineering, and research and development rank among the most promising industries for the adoption of Linked Data principles. In addition to approaching the subject from an industry perspective, we also examined the topics and trends emerging from the research community in the field of Linked Data and the Semantic Web. Although our analysis revolved around a vibrant and active community composed of academia and leading companies involved in semantic technologies, we found that industry needs and research discussions are somewhat misaligned.
Whereas foundation technologies such as knowledge representation, data creation/publishing/sharing, data management and system engineering are highly represented in scientific papers, specific topics such as recommendations, or cross-topics such as machine learning or privacy and security, are only marginally present. Topics such as big/large data and the internet of things are (still) on an upward trajectory in terms of attention. In contrast, topics that are very relevant for industry, such as application-oriented topics or those that relate to security, privacy and robustness, are not attracting much attention. When it comes to standardisation efforts, we identified a clear need for a more in-depth analysis of the effectiveness of existing standards, the degree of coverage they provide with respect to the foundations they belong to, and the suitability of alternative standards that do not fall under the core Semantic Web umbrella. Taking into consideration market forces, the sector analysis of Linked Data potential, the demand-side analysis, and the current technological status, it is clear that Linked Data has a lot of potential for enterprises and can act as a key driver of technological, organisational, and economic change. However, in order to ensure a solid foundation for Enterprise Linked Data, there is a need for: greater awareness of the potential of Linked Data in enterprises, lowering of entrance barriers via education and training, better alignment between industry demands and research activities, and greater support for technology transfer from universities to companies. The PROPEL roadmap recommends concrete measures in order to propel the adoption of Linked Data in Austrian enterprises. These measures are structured around five fields of activity: "awareness and education", "technological innovation, research gaps, standardisation", "policy and legal", and "funding". Key short-term recommendations include the clustering of existing activities in order to raise visibility on an international level, the funding of key topics that are underrepresented by the community, and the setup of joint projects. In the medium term, we recommend strengthening existing academic and private education efforts via certification and establishing flagship projects based on national use cases that can serve as blueprints for transnational initiatives. This requires not only financial support, but also infrastructure support, such as data and services on top of which solutions can be built. In the long term, we recommend cooperation with international funding schemes to establish and foster a European-level agenda, as well as the setup of centres of excellence.
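
    The integration paradigm outlined above can be made concrete with a small sketch: structured data published by two hypothetical sources is loaded into a single RDF graph and queried together with SPARQL. The vocabulary, URIs, and data are invented for this example, and the rdflib library is used purely for illustration rather than representing any PROPEL tooling.

        from rdflib import Graph

        # Two snippets standing in for data from separate enterprise sources.
        CRM_DATA = """
        @prefix ex: <http://example.org/> .
        @prefix foaf: <http://xmlns.com/foaf/0.1/> .
        ex:customer42 foaf:name "Ada Lovelace" ; ex:segment ex:Premium .
        """

        ORDERS_DATA = """
        @prefix ex: <http://example.org/> .
        ex:order7 ex:placedBy ex:customer42 ; ex:total 199.0 .
        """

        graph = Graph()
        graph.parse(data=CRM_DATA, format="turtle")
        graph.parse(data=ORDERS_DATA, format="turtle")  # integration is simply loading into one graph

        # A question spanning both sources: which premium customers placed orders, and for how much?
        QUERY = """
        PREFIX ex: <http://example.org/>
        PREFIX foaf: <http://xmlns.com/foaf/0.1/>
        SELECT ?name ?total WHERE {
            ?customer ex:segment ex:Premium ; foaf:name ?name .
            ?order ex:placedBy ?customer ; ex:total ?total .
        }
        """
        for name, total in graph.query(QUERY):
            print(name, total)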

    Application of Digital Forensic Science to Electronic Discovery in Civil Litigation

    Following changes to the Federal Rules of Civil Procedure in 2006 dealing with the role of Electronically Stored Information, digital forensics is becoming necessary to the discovery process in civil litigation. The development of case law interpreting the rule changes since their enactment defines how digital forensics can be applied to the discovery process, the scope of discovery, and the duties imposed on parties. Herein, pertinent cases are examined to determine what trends exist and how they affect the field. These observations buttress case studies involving discovery failures in large corporate contexts, along with insights into the technical reasons those discovery failures occurred and continue to occur. The state of the art in the legal industry for handling Electronically Stored Information is slow, inefficient, and extremely expensive. These failings exacerbate discovery failures by making the discovery process more burdensome than necessary. In addressing this problem, weaknesses of existing approaches are identified, and new tools are presented which cure these defects. By drawing on open source libraries, components, and other support, the presented tools exceed the performance of existing solutions by between one and two orders of magnitude. The transparent standards embodied in the open source movement allow for clearer defensibility of discovery practice sufficiency, whereas existing approaches entail difficult-to-verify closed-source solutions. Legacy industry practices of numbering documents with Bates numbers inhibit efficient parallel and distributed processing of electronic data into paginated forms. The failures inherent in legacy numbering systems are identified, and a new system is provided which eliminates these inhibitors while simultaneously better modeling the nature of electronic data that does not lend itself to pagination; such non-paginated data includes databases and other file types which are machine readable but not human readable in format. In toto, this dissertation provides a broad treatment of digital forensics applied to electronic discovery, an analysis of current failures in the industry, and a suite of tools which address the weaknesses, problems, and failures identified.
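
    The numbering problem described above can be made concrete with a brief sketch: sequential Bates numbers require a shared counter, and hence a serial pass over the collection, whereas an identifier derived from each document's content can be assigned independently and therefore in parallel. The hash-based scheme below is purely illustrative and is not the dissertation's actual system; the identifier format and the example path are assumptions.

        import hashlib
        from concurrent.futures import ProcessPoolExecutor
        from pathlib import Path

        def content_id(path):
            """Derive a document identifier from the file's contents. Unlike a
            sequential Bates number, no shared counter is needed, so every file
            can be labelled independently and in parallel."""
            digest = hashlib.sha256(Path(path).read_bytes()).hexdigest()
            return str(path), "DOC-" + digest[:16]

        def label_collection(root):
            """Label every file under `root` using a pool of worker processes."""
            files = [p for p in Path(root).rglob("*") if p.is_file()]
            with ProcessPoolExecutor() as pool:
                return dict(pool.map(content_id, files))

        if __name__ == "__main__":
            # Hypothetical path to a collection of electronically stored information.
            print(label_collection("/cases/example_matter/esi"))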

    Conceptualising digital nomadic practice: evidence from a technology-intensive firm

    This thesis studies how individuals use digital media to manage connectivity and accomplish work across digital and physical spaces in modern organisations, ultimately conceptualising this new type of work as a new digital nomadic practice. Increased digitisation and the need for more flexible work styles have pressured organisations to adopt new digital media and to redesign their workplaces. Existing research provides some theoretical understanding of this phenomenon; however, it is scattered across multiple disciplines and lacks a broader, all-encompassing view of the concept. This study addresses this gap with deeper and more holistic theoretical engagement in order to better capture and explain new work practices within organisations today. Exploring the salient aspects of digital nomadic practices, the study builds on the emergent literature on connectivity to understand the ways and means of staying connected. It also draws on the technology adoption and affordance literature to review how individuals use the capabilities of multiple digital media that provide the potential for a particular action. Overall, the study aims to i) understand how individuals conduct their work practices in physical and digital spaces, ii) identify how individuals use digital media to stay connected, and iii) understand how individuals manage connectivity. It draws on a single case study of a multinational IT organisation in the UK. The research follows a qualitative approach and an inductively driven strategy. The study focuses on the dimensions of connectivity and digital media use, and follows digital nomads' work 'within and between' the digital and physical spaces. The findings of this exploratory case study show that digital nomads use new digital media in ways that preclude them from being overly connected and allow them to manage connectivity across multiple levels: operational, social, and organisational. It examined digital media choice by drawing on a theory of nested affordances in order to capture media choice in a dynamic way, as it happens at different levels, with digital media coexisting and providing combinations of various affordances. These findings contribute to knowledge of how individuals choose digital media to manage their connectivity in digital and physical spaces, and particularly inform the study of digital media adoption and technological affordances.

    24th Nordic Conference on Computational Linguistics (NoDaLiDa)
