550 research outputs found
A survey of the use of crowdsourcing in software engineering
The term 'crowdsourcing' was initially introduced in 2006 to describe an emerging distributed problem-solving model by online workers. Since then it has been widely studied and practiced to support software engineering. In this paper we provide a comprehensive survey of the use of crowdsourcing in software engineering, seeking to cover all literature on this topic. We first review the definitions of crowdsourcing and derive our definition of Crowdsourcing Software Engineering together with its taxonomy. Then we summarise industrial crowdsourcing practice in software engineering and corresponding case studies. We further analyse the software engineering domains, tasks and applications for crowdsourcing and the platforms and stakeholders involved in realising Crowdsourced Software Engineering solutions. We conclude by exposing trends, open issues and opportunities for future research on Crowdsourced Software Engineering
Exploring Automatic Diagnosis of COVID-19 from Crowdsourced Respiratory Sound Data
Audio signals generated by the human body (e.g., sighs, breathing, heart, digestion, vibration sounds) have routinely been used by clinicians as indicators to diagnose disease or assess disease pro- gression. Until recently, such signals were usually collected through manual auscultation at scheduled visits. Research has now started to use digital technology to gather bodily sounds (e.g., from dig- ital stethoscopes) for cardiovascular or respiratory examination, which could then be used for automatic analysis. Some initial work shows promise in detecting diagnostic signals of COVID-19 from voice and coughs. In this paper we describe our data analysis over a large-scale crowdsourced dataset of respiratory sounds collected to aid diagnosis of COVID-19. We use coughs and breathing to under- stand how discernible COVID-19 sounds are from those in asthma or healthy controls. Our results show that even a simple binary machine learning classifier is able to classify correctly healthy and COVID-19 sounds. We also show how we distinguish a user who tested positive for COVID-19 and has a cough from a healthy user with a cough, and users who tested positive for COVID-19 and have a cough from users with asthma and a cough. Our models achieve an AUC of above 80% across all tasks. These results are preliminary and only scratch the surface of the potential of this type of data and audio-based machine learning. This work opens the door to further investigation of how automatically analysed respiratory patterns could be used as pre-screening signals to aid COVID-19 diagnosis.ER
Data Collection and Machine Learning Methods for Automated Pedestrian Facility Detection and Mensuration
Large-scale collection of pedestrian facility (crosswalks, sidewalks, etc.) presence data is vital to the success of efforts to improve pedestrian facility management, safety analysis, and road network planning. However, this kind of data is typically not available on a large scale due to the high labor and time costs that are the result of relying on manual data collection methods. Therefore, methods for automating this process using techniques such as machine learning are currently being explored by researchers. In our work, we mainly focus on machine learning methods for the detection of crosswalks and sidewalks from both aerial and street-view imagery. We test data from these two viewpoints individually and with an ensemble method that we refer to as our “dual-perspective prediction model”. In order to obtain this data, we developed a data collection pipeline that combines crowdsourced pedestrian facility location data with aerial and street-view imagery from Bing Maps. In addition to the Convolutional Neural Network used to perform pedestrian facility detection using this data, we also trained a segmentation network to measure the length and width of crosswalks from aerial images. In our tests with a dual-perspective image dataset that was heavily occluded in the aerial view but relatively clear in the street view, our dual-perspective prediction model was able to increase prediction accuracy, recall, and precision by 49%, 383%, and 15%, respectively (compared to using a single perspective model based on only aerial view images). In our tests with satellite imagery provided by the Mississippi Department of Transportation, we were able to achieve accuracies as high as 99.23%, 91.26%, and 93.7% for aerial crosswalk detection, aerial sidewalk detection, and aerial crosswalk mensuration, respectively. The final system that we developed packages all of our machine learning models into an easy-to-use system that enables users to process large batches of imagery or examine individual images in a directory using a graphical interface. Our data collection and filtering guidelines can also be used to guide future research in this area by establishing standards for data quality and labelling
Continuous, Evolutionary and Large-Scale: A New Perspective for Automated Mobile App Testing
Mobile app development involves a unique set of challenges including device
fragmentation and rapidly evolving platforms, making testing a difficult task.
The design space for a comprehensive mobile testing strategy includes features,
inputs, potential contextual app states, and large combinations of devices and
underlying platforms. Therefore, automated testing is an essential activity of
the development process. However, current state of the art of automated testing
tools for mobile apps poses limitations that has driven a preference for manual
testing in practice. As of today, there is no comprehensive automated solution
for mobile testing that overcomes fundamental issues such as automated oracles,
history awareness in test cases, or automated evolution of test cases.
In this perspective paper we survey the current state of the art in terms of
the frameworks, tools, and services available to developers to aid in mobile
testing, highlighting present shortcomings. Next, we provide commentary on
current key challenges that restrict the possibility of a comprehensive,
effective, and practical automated testing solution. Finally, we offer our
vision of a comprehensive mobile app testing framework, complete with research
agenda, that is succinctly summarized along three principles: Continuous,
Evolutionary and Large-scale (CEL).Comment: 12 pages, accepted to the Proceedings of 33rd IEEE International
Conference on Software Maintenance and Evolution (ICSME'17
Taming Android Fragmentation through Lightweight Crowdsourced Testing
Android fragmentation refers to the overwhelming diversity of Android devices
and OS versions. These lead to the impossibility of testing an app on every
supported device, leaving a number of compatibility bugs scattered in the
community and thereby resulting in poor user experiences. To mitigate this, our
fellow researchers have designed various works to automatically detect such
compatibility issues. However, the current state-of-the-art tools can only be
used to detect specific kinds of compatibility issues (i.e., compatibility
issues caused by API signature evolution), i.e., many other essential types of
compatibility issues are still unrevealed. For example, customized OS versions
on real devices and semantic changes of OS could lead to serious compatibility
issues, which are non-trivial to be detected statically. To this end, we
propose a novel, lightweight, crowdsourced testing approach, LAZYCOW, to fill
this research gap and enable the possibility of taming Android fragmentation
through crowdsourced efforts. Specifically, crowdsourced testing is an emerging
alternative to conventional mobile testing mechanisms that allow developers to
test their products on real devices to pinpoint platform-specific issues.
Experimental results on thousands of test cases on real-world Android devices
show that LAZYCOW is effective in automatically identifying and verifying
API-induced compatibility issues. Also, after investigating the user experience
through qualitative metrics, users' satisfaction provides strong evidence that
LAZYCOW is useful and welcome in practice
Multi-objective Search-based Mobile Testing
Despite the tremendous popularity of mobile applications, mobile testing still relies heavily on manual testing. This thesis presents mobile test automation approaches based on multi-objective search. We introduce three approaches: Sapienz (for native Android app testing), Octopuz (for hybrid/web JavaScript app testing) and Polariz (for using crowdsourcing to support search-based mobile testing). These three approaches represent the primary scientific and technical contributions of the thesis. Since crowdsourcing is, itself, an emerging research area, and less well understood than search-based software engineering, the thesis also provides the first comprehensive survey on the use of crowdsourcing in software testing (in particular) and in software engineering (more generally). This survey represents a secondary contribution. Sapienz is an approach to Android testing that uses multi-objective search-based testing to automatically explore and optimise test sequences, minimising their length, while simultaneously maximising their coverage and fault revelation. The results of empirical studies demonstrate that Sapienz significantly outperforms both the state-of-the-art technique Dynodroid and the widely-used tool, Android Monkey, on all three objectives. When applied to the top 1,000 Google Play apps, Sapienz found 558 unique, previously unknown crashes. Octopuz reuses the Sapienz multi-objective search approach for automated JavaScript testing, aiming to investigate whether it replicates the Sapienz’ success on JavaScript testing. Experimental results on 10 real-world JavaScript apps provide evidence that Octopuz significantly outperforms the state of the art (and current state of practice) in automated JavaScript testing. Polariz is an approach that combines human (crowd) intelligence with machine (computational search) intelligence for mobile testing. It uses a platform that enables crowdsourced mobile testing from any source of app, via any terminal client, and by any crowd of workers. It generates replicable test scripts based on manual test traces produced by the crowd workforce, and automatically extracts from these test traces, motif events that can be used to improve search-based mobile testing approaches such as Sapienz
Crowd Intelligence Enhances Automated Mobile Testing
We show that information extracted from crowd-based testing can enhance automated mobile testing. We introduce Polariz, which generates replicable test scripts from crowd-based testing, extracting cross-app `motif' events: automatically-inferred reusable higher-level event sequences composed of lower-level observed event actions. Our empirical study used 434 crowd workers from Mechanical Turk to perform 1,350 testing tasks on 9 popular Google Play apps, each with at least 1 million user installs. The findings reveal that the crowd was able to achieve 60.5% unique activity coverage and proved to be complementary to automated search-based testing in 5 out of the 9 subjects studied. Our leave-one-out evaluation demonstrates that coverage attainment can be improved (6 out of 9 cases, with no disimprovement on the remaining 3) by combining crowd-based and search-based testing
GenQ: Automated Question Generation to Support Caregivers While Reading Stories with Children
When caregivers ask open--ended questions to motivate dialogue with children,
it facilitates the child's reading comprehension skills.Although there is scope
for use of technological tools, referred here as "intelligent tutoring
systems", to scaffold this process, it is currently unclear whether existing
intelligent systems that generate human--language like questions is beneficial.
Additionally, training data used in the development of these automated question
generation systems is typically sourced without attention to demographics, but
people with different cultural backgrounds may ask different questions. As a
part of a broader project to design an intelligent reading support app for
Latinx children, we crowdsourced questions from Latinx caregivers and
noncaregivers as well as caregivers and noncaregivers from other demographics.
We examine variations in question--asking within this dataset mediated by
individual, cultural, and contextual factors. We then design a system that
automatically extracts templates from this data to generate open--ended
questions that are representative of those asked by Latinx caregivers
- …