5 research outputs found

    Towards a corpus for credibility assessment in software practitioner blog articles

    Blogs are a source of grey literature and are widely adopted by software practitioners for disseminating opinion and experience. Analysing such articles can provide useful insights into the state-of-practice for software engineering research. However, there are challenges in identifying higher quality content from the large quantity of articles available. Credibility assessment can help in identifying quality content, though there is a lack of existing corpora. Credibility is typically measured through a series of conceptual criteria, with 'argumentation' and 'evidence' being two important criteria. We create a corpus labelled for argumentation and evidence that can aid the credibility community. The corpus consists of articles from the blog of a single software practitioner and is publicly available. Three annotators label the corpus with a series of conceptual credibility criteria, reaching an agreement of 0.82 (Fleiss' Kappa). We present a preliminary analysis of the corpus by using it to investigate the identification of claim sentences (one of our ten labels). We train four systems (BERT, KNN, Decision Tree and SVM) using three feature sets (Bag of Words, Topic Modelling and InferSent), achieving an F1 score of 0.64 using InferSent and a Linear SVM. Our preliminary results are promising, indicating that the corpus can help future studies in detecting the credibility of grey literature. Future research will investigate the degree to which the sentence level annotations can infer the credibility of the overall document.
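The agreement figure quoted in the abstract above is a Fleiss' Kappa. As a rough illustration of how such a score is computed for a fixed number of annotators per item, here is a minimal sketch. The function name `fleiss_kappa`, the label names, and the toy data are assumptions made for illustration; they are not taken from the paper or its corpus.

```python
from collections import Counter


def fleiss_kappa(ratings, categories):
    """Fleiss' kappa for items each labelled by the same number of annotators.

    ratings: list of items; each item is a list of labels, one per annotator.
    categories: list of all possible labels.
    """
    n_items = len(ratings)
    n_raters = len(ratings[0])
    if any(len(item) != n_raters for item in ratings):
        raise ValueError("every item needs the same number of ratings")

    # How many annotators chose each category, per item.
    counts = [Counter(item) for item in ratings]

    # Observed agreement: share of agreeing annotator pairs, averaged over items.
    per_item = [
        (sum(c[cat] ** 2 for cat in categories) - n_raters)
        / (n_raters * (n_raters - 1))
        for c in counts
    ]
    p_observed = sum(per_item) / n_items

    # Chance agreement from the overall proportion of each category.
    proportions = [
        sum(c[cat] for c in counts) / (n_items * n_raters) for cat in categories
    ]
    p_expected = sum(p * p for p in proportions)

    if p_expected == 1.0:
        raise ValueError("kappa is undefined when only one category is ever used")
    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical toy data: three annotators label four sentences.
toy_ratings = [
    ["claim", "claim", "claim"],
    ["other", "other", "other"],
    ["claim", "claim", "other"],
    ["other", "other", "other"],
]
kappa = fleiss_kappa(toy_ratings, ["claim", "other"])
print(f"Fleiss' kappa on toy data: {kappa:.2f}")
```

The score is 1 when all annotators agree on every item and falls to 0 or below when agreement is no better than chance.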

    Extending a corpus for assessing the credibility of software practitioner blog articles using meta-knowledge

    Practitioner-written grey literature, such as blog articles, has value in software engineering research. Such articles provide insight into practice that is often not visible to research. However, a high quantity and varying quality are two major challenges in utilising such material. Quality is defined as an aggregate of a document's relevance to the consumer and its credibility. Credibility is often assessed through a series of conceptual criteria that are specific to a particular user group. For researchers, previous work has found 'argumentation' and 'evidence' to be two important criteria. In this paper, we extend a previously developed corpus by annotating at broader granularity. We then investigate whether the original annotations (sentence level) can infer these new annotations (article level). Our preliminary results show that sentence-level annotations infer the overall credibility of an article with an F1 score of 91%. These results indicate that the corpus can help future studies in detecting the credibility of practitioner-written grey literature.

    "Because Some Sighted People, They Don't Know What the Heck You're Talking About:" A Study of Blind TikTokers' Infrastructuring Work to Build Independence

    There has been extensive research on the experiences of individuals with visual impairments on text- and image-based social media platforms, such as Facebook and Twitter. However, little is known about the experiences of visually impaired users on short-video platforms like TikTok. To bridge this gap, we conducted an interview study with 30 BlindTokers (the nickname of blind TikTokers). Our study aimed to explore the various activities of BlindTokers on TikTok, including everyday entertainment, professional development, and community engagement. The widespread usage of TikTok among participants demonstrated that they considered TikTok and its associated experiences as the infrastructure for their activities. Additionally, participants reported experiencing breakdowns in this infrastructure due to accessibility issues. They had to carry out infrastructuring work to resolve the breakdowns. Blind users' various practices on TikTok also foregrounded their perceptions of independence. We then discussed blind users' nuanced understanding of the TikTok-mediated independence; we also critically examined BlindTokers' infrastructuring work for such independence. Comment: Accepted at CSCW'24, 29 pages, 2 figures, and 2 tables.

    Informing the design of spoken conversational search: Perspective paper

    We conducted a laboratory-based observational study where pairs of people performed search tasks communicating verbally. Examination of the discourse allowed commonly used interactions to be identified for Spoken Conversational Search (SCS). We compared the interactions to existing models of search behaviour. We find that SCS is more complex and interactive than traditional search. This work enhances our understanding of different search behaviours and proposes research opportunities for an audio-only search system. Future work will focus on creating models of search behaviour for SCS and evaluating these against actual SCS systems.

    Spoken conversational search: audio-only interactive information retrieval

    Speech-based web search where no keyboard or screens are available to present search engine results is becoming ubiquitous, mainly through the use of mobile devices and intelligent assistants such as Apple's HomePod, Google Home, or Amazon Alexa. Currently, these intelligent assistants do not maintain a lengthy information exchange. They do not track context or present information suitable for an audio-only channel, and do not interact with the user in a multi-turn conversation. Understanding how users would interact with such an audio-only interaction system in multi-turn information seeking dialogues, and what users expect from these new systems, are unexplored in search settings. In particular, the knowledge on how to present search results over an audio-only channel and which interactions take place in this new search paradigm is crucial to incorporate while producing usable systems. Thus, constructing insight into the conversational structure of information seeking processes provides researchers and developers opportunities to build better systems while creating a research agenda and directions for future advancements in Spoken Conversational Search (SCS). Such insight has been identified as crucial in the growing SCS area. At the moment, limited understanding has been acquired for SCS, for example how the components interact, how information should be presented, or how task complexity impacts the interactivity or discourse behaviours. We aim to address these knowledge gaps. This thesis outlines the breadth of SCS and forms a manifesto advancing this highly interactive search paradigm with new research directions including prescriptive notions for implementing identified challenges. 
    We investigate SCS through quantitative and qualitative designs: (i) log and crowdsourcing experiments investigating different interaction and results presentation styles, and (ii) the creation and analysis of the first SCS dataset and annotation schema through designing and conducting an observational study of information seeking dialogues. We propose new research directions and design recommendations based on the triangulation of three different datasets and methods: the log analysis to identify practical challenges and limitations of existing systems while informing our future observational study; the crowdsourcing experiment to validate a new experimental setup for future search engine results presentation investigations; and the observational study to establish the SCS dataset (SCSdata), form the first Spoken Conversational Search Annotation Schema (SCoSAS), and study interaction behaviours for different task complexities. Our principal contributions are based on our observational study for which we developed a novel methodology utilising a qualitative design. We show that existing information seeking models may be insufficient for the new SCS search paradigm because they inadequately capture meta-discourse functions and the system's role as an active agent. Thus, the results indicate that SCS systems have to support the user through discourse functions and be actively involved in the users' search process. This suggests that interactivity between the user and system is necessary to overcome the increased complexity which has been imposed upon the user and system by the constraints of the audio-only communication channel. We then present the first schematic model for SCS which is derived from the SCoSAS through the qualitative analysis of the SCSdata. In addition, we demonstrate the applicability of our dataset by investigating the effect of task complexity on interaction and discourse behaviour.
    Lastly, we present SCS design recommendations and outline new research directions for SCS. The implications of our work are practical, conceptual, and methodological. The practical implications include the development of the SCSdata, the SCoSAS, and SCS design recommendations. The conceptual implications include the development of a schematic SCS model which identifies the need for increased interactivity and pro-activity to overcome the audio-imposed complexity in SCS. The methodological implications include the development of the crowdsourcing framework, and techniques for developing and analysing SCS datasets. In summary, we believe that our findings can guide researchers and developers to help improve existing interactive systems which are less constrained, such as mobile search, as well as more constrained systems such as SCS systems.