Natural Language Processing in-and-for Design Research
We review the scholarly contributions that utilise Natural Language
Processing (NLP) methods to support the design process. Using a heuristic
approach, we collected 223 articles published in 32 journals from 1991 to the present. We present state-of-the-art NLP in-and-for design research
by reviewing these articles according to the type of natural language text
sources: internal reports, design concepts, discourse transcripts, technical
publications, consumer opinions, and others. Upon summarizing and identifying
the gaps in these contributions, we utilise an existing design innovation
framework to identify the applications that are currently being supported by
NLP. We then propose a few methodological and theoretical directions for future
NLP in-and-for design research.
Leveraging Text-to-Scene Generation for Language Elicitation and Documentation
Text-to-scene generation systems take input in the form of a natural language text and output a 3D scene illustrating the meaning of that text. A major benefit of text-to-scene generation is that it allows users to create custom 3D scenes without requiring them to have a background in 3D graphics or knowledge of specialized software packages. This makes text-to-scene generation useful in scenarios ranging from creative applications to education. The primary goal of this thesis is to explore how we can use text-to-scene generation in a new way: as a tool to facilitate the elicitation and formal documentation of language. In particular, we use text-to-scene generation (a) to assist field linguists studying endangered languages; (b) to provide a cross-linguistic framework for formally modeling spatial language; and (c) to collect language data using crowdsourcing. As a side effect of these goals, we also explore the problem of multilingual text-to-scene generation, that is, systems for generating 3D scenes from languages other than English.
The contributions of this thesis are the following. First, we develop a novel tool suite (the WordsEye Linguistics Tools, or WELT) that uses the WordsEye text-to-scene system to assist field linguists with eliciting and documenting endangered languages. WELT allows linguists to create custom elicitation materials and to document semantics in a formal way. We test WELT with two endangered languages, Nahuatl and Arrernte. Second, we explore the question of how to learn a syntactic parser for WELT. We show that an incremental learning method using a small number of annotated dependency structures can produce reasonably accurate results. We demonstrate that using a parser trained in this way can significantly decrease the time it takes an annotator to label a new sentence with dependency information. Third, we develop a framework that generates 3D scenes from spatial and graphical semantic primitives. We incorporate this system into the WELT tools for creating custom elicitation materials, allowing users to directly manipulate the underlying semantics of a generated scene. Fourth, we introduce a deep semantic representation of spatial relations and use this to create a new resource, SpatialNet, which formally declares the lexical semantics of spatial relations for a language. We demonstrate how SpatialNet can be used to support multilingual text-to-scene generation. Finally, we show how WordsEye and the semantic resources it provides can be used to facilitate elicitation of language using crowdsourcing.
Production and use of documentation in scientific software development
Software is becoming ubiquitous in science. The success of the application of scientific software depends on effective communication about what the software does and how it operates. Documentation captures the communication about the software. For that reason, practices around scientific software documentation need to be better understood. This thesis presents four qualitative empirical studies that look in depth at the production and use of documentation of scientific software. Together, the studies provide evidence emphasising the importance of documentation and show the handshake between written documentation and the informal, ephemeral information exchange that happens within the community.
Four reasons behind the obstacles to producing effective scientific software documentation are identified: 1) insufficient resources; 2) a lack of incentives for researchers; 3) the influence of the community of practice; and 4) the necessity of keeping up with the regular advancements of science. Benefits of the process of producing documentation are also identified: 1) aiding reasoning; 2) supporting the reproducibility of science; and 3) in certain contexts, expanding the community of users and developers around the software. The latter is investigated through a case study of documentation ‘crowdsourcing’.
The research reveals that there is a spectrum of users with differing needs with respect to documentation. This, in turn, requires different approaches to addressing their needs. The research shows that the view of what constitutes documentation must be broad in order to recognise how wide a range of resources (e.g., formal documents, email, online fora, comments in the source code) is actually used in communicating knowledge about scientific software. Much of the information about the software resides within the community of practice (and may not be documented). These observations are of practical use for those producing documentation in different contexts of scientific software development, for example providing guidance about engaging a community in ‘crowdsourcing’ documentation.
The influence of capacity and attitudes in the use of water quality citizen science and volunteer benthic monitoring in the freshwater management activities of Ontario’s Conservation Authorities
The contribution of non-experts to environmental management has been significant and continues to flourish through their participation in citizen science. Despite its growth as an interdisciplinary field of enquiry, there are many gaps in our understanding of the role that citizen science may play in the future of environmental management. In Ontario, Canada, due to funding cuts and infrastructural changes over the past two decades, the provincial government’s ability to monitor changes in freshwater resources has been severely limited. This has resulted in downloading water monitoring to municipalities through their conservation authorities (CAs), which are watershed-based, quasi-governmental water management agencies. The public has been supplementing monitoring efforts through the thousands of hours they have devoted to water quality citizen science, including volunteer benthic monitoring (VBM). Through their watershed-based structure, their mandate to involve the community in their work, their activities managing freshwater, and their collaborations with various stakeholders, CAs seem like the ideal organizations to connect the public with the decision makers within the municipalities that manage local freshwater resources. However, their use of citizen science, particularly in benthic monitoring, is rare, with most of their data being collected in-house by paid expert staff. By conducting 44 interviews with individuals from CAs and citizen science groups, participating in monitoring, collecting documents published by both groups, and administering a survey among all 36 CAs, I examined the influence of both CA capacity and attitudes in limiting the use of volunteer benthic monitoring by CAs in their freshwater management decisions. Twenty-nine CAs participated in the survey to some extent, although for 24 of these CAs, only one or two questionnaires were submitted (a total of 67 questionnaires completed).
While the CAs’ capacity through their organizational dynamics (human resources, flexibility, collaborations) generally supports the use of VBM, they lack the financial and human resources to fully support this form of citizen science. This, along with the attitude that volunteers are not capable of collecting credible monitoring information, makes the widespread adoption of VBM by CAs unlikely. Despite these findings, there is still the potential for CAs to successfully adopt certain types of water quality citizen science that are not as financially and human-resource intensive as VBM, and that have a broader appeal to a variety of volunteer types.
Reification as the key to augmenting software development: an object is worth a thousand words
Software development has become more and more pervasive, with influence in almost every human activity. To be able to fit in so many different scenarios and constantly implement new features, software developers have adopted methodologies with tight development cycles, sometimes with more than one release per day. With the constant growth of modern software projects and the consequent expansion of development teams, understanding all the components of a system becomes a task too big to handle. In this context, understanding the cause of an error or identifying its source is not an easy task, and correcting the erroneous behavior can lead to unexpected downtime of vital services. Being able to keep track of software defects, usually referred to as bugs, is crucial in the development of a project and in containing maintenance costs. For this purpose, the correctness and completeness of the information available have a great impact on the time required to understand and solve a problem. In this thesis we present an overview of the current techniques commonly used to report software defects. We show why we believe that the state of the art needs to be improved, and present a set of approaches and tools to collect data from software failures, model it, and turn it into actionable knowledge. Our goal is to show that data generated from errors can have a great impact on daily software development, and how it can be employed to augment the development environment to assist software engineers in building and maintaining software systems.
Relevance is in the Eye of the Beholder: Design Principles for the Extraction of Context-Aware Information
Since the 1970s, many approaches to representing domains have been suggested. Each approach maintains the assumption that the information about the objects represented in the Information System (IS) is specified and verified by domain experts and potential users. Yet, as more IS are developed to support a larger diversity of users such as customers, suppliers, and members of the general public (as in many multi-user online systems), analysts can no longer rely on a stable, single group of people for the complete specification of domains, to the extent that prior research has questioned the efficacy of conceptual modeling in these heterogeneous settings. We formulated principles for identifying basic classes in a domain. These classes can guide conceptual modeling, database design, and user interface development in a wide variety of traditional and emergent domains. Moreover, we used a case study of a large foster organization to study how unstructured data entry practices result in differences in how information is collected across organizational units. We used institutional theory to show how institutional elements enacted by individuals can generate new practices that can be adopted over time as best practices. We analyzed free-text notes to prioritize potential cases of psychotropic drug use, our tactical need. We showed that too much flexibility in how data can be entered into the system results in different styles, which tend to be homogeneous within organizational units but not across them. Theories in psychology help explain the implications of the level of specificity and the inferential utility of the text encoded in the unstructured note.
Managing episodic volunteers in free/libre/open source software communities
We draw on the concept of episodic volunteering (EV) from the general volunteering literature to identify practices for managing EV in free/libre/open source software (FLOSS) communities. Infrequent but ongoing participation is widespread, but the practices that community managers are using to manage EV, and their concerns about EV, have not been previously documented. We conducted a policy Delphi study involving 24 FLOSS community managers from 22 different communities. Our panel identified 16 concerns related to managing EV in FLOSS, which we ranked by prevalence. We also describe 65 practices for managing EV in FLOSS. Almost three-quarters of these practices are used by at least three community managers. We report these practices using a systematic presentation that includes context, relationships between practices, and concerns that they address. These findings provide a coherent framework that can help FLOSS community managers to better manage episodic contributors.
Designing AI-Based Systems for Qualitative Data Collection and Analysis
With the continuously increasing impact of information systems (IS) on private and professional life, it has become crucial to integrate users in the IS development process. One of the critical reasons for failed IS projects is the inability to accurately meet user requirements, resulting from an incomplete or inaccurate collection of requirements during the requirements elicitation (RE) phase. While interviews are the most effective RE technique, they face several challenges that make them a questionable fit for the numerous, heterogeneous, and geographically distributed users of contemporary IS.
Three significant challenges limit the involvement of a large number of users in IS development processes today. Firstly, there is a lack of tool support to conduct interviews with a wide audience. While initial studies show promising results in utilizing text-based conversational agents (chatbots) as interviewer substitutes, we lack design knowledge for designing AI-based chatbots that leverage established interviewing techniques in the context of RE. By successfully applying chatbot-based interviewing, vast amounts of qualitative data can be collected. Secondly, there is a need to provide tool support enabling the analysis of large amounts of qualitative interview data. Once again, while modern technologies, such as machine learning (ML), promise a remedy, concrete implementations of automated analysis for unstructured qualitative data lag behind the promise. There is a need to design interactive ML (IML) systems for supporting the coding process of qualitative data, which center around simple interaction formats to teach the ML system, and transparent and understandable suggestions to support data analysis. Thirdly, while organizations rely on online feedback to inform requirements without explicitly conducting RE interviews (e.g., from app stores), we know little about the demographics of who is giving feedback and what motivates them to do so. Using online feedback as a requirements source risks capturing only the concerns and desires of vocal user groups.
With this thesis, I tackle these three challenges in two parts. In part I, I address the first and the second challenge by presenting and evaluating two innovative AI-based systems: a chatbot for requirements elicitation and an IML system to semi-automate qualitative coding. In part II, I address the third challenge by presenting results from a large-scale study on IS feedback engagement. With both parts, I contribute prescriptive knowledge for designing AI-based qualitative data collection and analysis systems and help to establish a deeper understanding of the coverage of existing data collected from online sources. Besides providing concrete artifacts, architectures, and evaluations, I demonstrate the application of a chatbot interviewer to understand user values in smartphones and provide guidance for extending feedback coverage from underrepresented IS user groups.