
    Exploring Outliers in Crowdsourced Ranking for QoE

    Outlier detection is a crucial part of robust evaluation in crowdsourceable assessment of Quality of Experience (QoE) and has attracted much attention in recent years. In this paper, we propose simple and fast algorithms for outlier detection and robust QoE evaluation based on the nonconvex optimization principle. Several iterative procedures are designed, with or without knowledge of the number of outliers in the samples. Theoretical analysis shows that such procedures reach statistically good estimates under mild conditions. Finally, experimental results on simulated and real-world crowdsourcing datasets show that the proposed algorithms match the performance of the Huber-LASSO approach in robust ranking, with a speed-up of nearly 8 times without prior knowledge of the number of outliers and nearly 90 times with it. The proposed methodology therefore provides a set of helpful tools for robust QoE evaluation with crowdsourcing data.
    Comment: accepted by ACM Multimedia 2017 (Oral presentation). arXiv admin note: text overlap with arXiv:1407.763
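
    For the known-outlier-number case, an iterative procedure of the kind the abstract describes can be sketched as alternating a least-squares fit of item scores on pairwise comparisons with hard-thresholding that keeps the largest residuals as outliers. This is a minimal sketch consistent with the description above, not the authors' exact algorithm; the toy data are invented.

    ```python
    import numpy as np

    def robust_rank(pairs, y, n_items, n_outliers, n_iters=50):
        """Alternate least-squares score fitting with hard-thresholding of
        the largest residuals as outliers (sketch; n_outliers is known)."""
        # Incidence matrix X: row t has +1 at pairs[t][0], -1 at pairs[t][1],
        # so X @ s predicts the score difference s_i - s_j for each pair.
        m = len(pairs)
        X = np.zeros((m, n_items))
        for t, (i, j) in enumerate(pairs):
            X[t, i], X[t, j] = 1.0, -1.0
        gamma = np.zeros(m)  # estimated outlier magnitudes per comparison
        for _ in range(n_iters):
            # 1) fit scores to the comparisons with current outliers removed
            s, *_ = np.linalg.lstsq(X, y - gamma, rcond=None)
            # 2) keep only the n_outliers largest residuals as outliers
            r = y - X @ s
            gamma = np.zeros(m)
            top = np.argsort(np.abs(r))[-n_outliers:]
            gamma[top] = r[top]
        return s, np.flatnonzero(gamma)

    # Toy usage: 4 items, one corrupted comparison (the last one).
    pairs = [(0, 1), (1, 2), (2, 3), (0, 3), (1, 3)]
    y = np.array([1.0, 1.0, 1.0, 3.0, -5.0])
    scores, outliers = robust_rank(pairs, y, n_items=4, n_outliers=1)
    ```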

    A survey of the use of crowdsourcing in software engineering

    The term 'crowdsourcing' was introduced in 2006 to describe an emerging distributed problem-solving model driven by online workers. Since then it has been widely studied and practised in support of software engineering. In this paper we provide a comprehensive survey of the use of crowdsourcing in software engineering, seeking to cover all literature on the topic. We first review definitions of crowdsourcing and derive our definition of Crowdsourced Software Engineering together with its taxonomy. We then summarise industrial crowdsourcing practice in software engineering and the corresponding case studies. We further analyse the software engineering domains, tasks and applications to which crowdsourcing has been applied, and the platforms and stakeholders involved in realising Crowdsourced Software Engineering solutions. We conclude by exposing trends, open issues and opportunities for future research on Crowdsourced Software Engineering.

    Attribute Learning for Image/Video Understanding

    For the past decade, computer vision research has achieved increasing success in visual recognition, including object detection and video classification. Nevertheless, these achievements still cannot meet the urgent needs of image and video understanding. The recent rapid development of social media sharing has created a huge demand for automatic media classification and annotation techniques. In particular, such media data usually capture very complex social activities of a group of people (e.g. a YouTube video of a wedding reception) and are recorded by consumer devices with poor visual quality. It is therefore extremely challenging to automatically understand such a large number of complex image and video categories, especially when these categories have never been seen before. One way to understand categories with no or few examples is transfer learning, which transfers knowledge across related domains, tasks, or distributions. In particular, lifelong learning, which aims at transferring information to tasks without any observed data, has recently become popular. In computer vision, transfer learning often takes the form of attribute learning. The key underpinning idea of attribute learning is to exploit transfer learning via intermediate-level semantic representations: attributes. Semantic attributes are most commonly used as a semantically meaningful bridge between low-level feature data and higher-level class concepts, since they can be used both descriptively (e.g., 'has legs') and discriminatively (e.g., 'cats have it but dogs do not').

    Previous work has proposed many different attribute learning models for image and video understanding. However, several intrinsic limitations and problems exist in previous attribute learning work. The limitations discussed in this thesis include the restrictions of user-defined attributes, projection domain-shift problems, prototype sparsity problems, the inability to combine multiple semantic representations, and noisy annotations of relative attributes. To tackle these limitations, this thesis explores attribute learning for image and video understanding from three aspects. Firstly, to overcome the limitations of user-defined attributes, a framework for learning latent attributes is presented for automatic classification and annotation of unstructured group social activity in videos, which enables attribute learning for understanding complex multimedia data with sparse and incomplete labels. We investigate the learning of latent attributes for content-based understanding, which aims to model and predict classes and tags relevant to objects, sounds and events: anything likely to be used by humans to describe or search for media. Secondly, we propose a transductive multi-view embedding hypergraph label propagation framework and solve three inherent limitations of most previous attribute learning work, namely the projection domain-shift problem, the prototype sparsity problem and the inability to combine multiple semantic representations. We explore the manifold structure of the data distributions of different views projected onto the same embedding space via label propagation on a graph. Thirdly, a novel framework for robust learning is presented to effectively learn relative attributes from extremely noisy and sparse annotations.

    Relative attributes are increasingly learned from pairwise comparisons collected via crowdsourcing tools, which are more economical and scalable than conventional laboratory-based data annotation. However, a major challenge in adopting a crowdsourcing strategy is the detection and pruning of outliers. We therefore propose a principled way to identify annotation outliers by formulating relative attribute prediction as a unified robust learning-to-rank problem, tackling outlier detection and relative attribute prediction jointly. In summary, this thesis studies and solves the key challenges and limitations of attribute learning for image/video understanding, and shows that addressing these challenges yields better performance than previous methods.
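
    As a sketch of what a unified robust learning-to-rank formulation might look like, the code below alternates a ridge fit of a linear ranking function with soft-thresholding of the residuals, so grossly inconsistent pairwise annotations are absorbed into a sparse outlier vector. The inputs D (feature differences per annotated pair) and y (crowd labels) are hypothetical, and the thesis' actual model may differ.

    ```python
    import numpy as np

    def soft_threshold(r, lam):
        return np.sign(r) * np.maximum(np.abs(r) - lam, 0.0)

    def robust_rank_attribute(D, y, lam=0.5, ridge=1e-2, n_iters=100):
        """D[t] = x_i - x_j for annotated pair t; y[t] = +1 if item i was
        judged stronger on the attribute, -1 otherwise. Alternates a ridge
        fit of the ranking weights with soft-thresholding of residuals, so
        pairs with large residuals land in a sparse outlier vector e."""
        m, d = D.shape
        e = np.zeros(m)
        A = D.T @ D + ridge * np.eye(d)          # fixed normal-equations matrix
        for _ in range(n_iters):
            w = np.linalg.solve(A, D.T @ (y - e))  # fit weights, outliers fixed
            e = soft_threshold(y - D @ w, lam)     # sparse outlier update
        return w, np.flatnonzero(e)                # weights + suspected outliers
    ```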

    Analysis of the Impact of Performance on Apps Retention

    The relentless expansion of mobile technologies has driven a swift increase in smartphones with high computational power, whose sophisticated sensing and communication capabilities provide the foundations for apps with PC-like functionality on the move. Indeed, apps are nowadays almost everywhere, and their number has grown exponentially, with the Apple App Store, Google Play and other mobile app marketplaces offering millions of apps to users. In this scenario, it is common to find several apps providing similar functionality, yet only a fraction of these applications survive long-term in app stores. Retention is a metric widely used to quantify the lifespan of mobile apps: higher app retention corresponds to higher adoption and level of engagement. While existing scientific studies have analysed mobile users' behaviour and support the existence of factors that influence app retention, a quantification of how these factors affect long-term usage is still missing. In this thesis, we contribute to these studies by quantifying and modelling one of the critical factors that affect app retention: performance. We analyse performance through two key related variables, network connectivity and battery consumption, by combining two large-scale crowdsensed datasets: the first includes measurements of network quality, the second of app usage and energy consumption. Our results show the benefits of data fusion in introducing richer contexts that cannot be discovered when analysing the data sources individually. We also demonstrate that high variations in these variables, together and individually, affect the likelihood of long-term app usage, but also that retention is regulated by what users consider reasonable standards of performance, meaning that improving latency and energy consumption does not guarantee higher retention. To provide further insights, we develop a model that predicts retention from performance-related variables. Its accuracy allows generalising the effect of performance on long-term usage across categories, locations and moderating variables.
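
    As an illustration of the kind of retention model the abstract mentions, the sketch below fits a logistic classifier to performance features. The features, data and coefficients are entirely synthetic stand-ins for the crowdsensed measurements; the thesis' actual model and variables may differ.

    ```python
    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics import roc_auc_score
    from sklearn.model_selection import train_test_split

    # Hypothetical per-(user, app) features: median network latency,
    # latency variability, energy drain per session, sessions in week 1.
    rng = np.random.default_rng(0)
    X = rng.normal(size=(1000, 4))

    # Synthetic target: "retained after 30 days", made more likely when
    # latency and energy drain are low (placeholder ground truth only).
    logit = -0.8 * X[:, 0] - 0.5 * X[:, 2] + 0.3 * X[:, 3]
    y = rng.random(1000) < 1.0 / (1.0 + np.exp(-logit))

    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
    model = LogisticRegression().fit(X_tr, y_tr)
    print("AUC:", roc_auc_score(y_te, model.predict_proba(X_te)[:, 1]))
    ```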

    Blind Image Quality Assessment: Exploiting New Evaluation and Design Methodologies

    The great content diversity of real-world digital images poses a grand challenge to assessing their perceptual quality automatically, accurately and in a timely manner. In this thesis, we focus on blind image quality assessment (BIQA), which predicts image quality with no access to the pristine-quality counterpart. We first establish a large-scale IQA database, the Waterloo Exploration Database. It contains 4,744 pristine natural images and 94,880 distorted images, the largest in the IQA field. Instead of collecting subjective opinions for each image, which is extremely difficult, we present three test criteria for evaluating objective BIQA models: the pristine/distorted image discriminability test (D-test), the listwise ranking consistency test (L-test), and the pairwise preference consistency test (P-test). Moreover, we propose a general psychophysical methodology, which we name the group MAximum Differentiation (gMAD) competition method, for comparing computational models of perceptually discriminable quantities. We apply gMAD to the field of IQA and compare 16 objective IQA models of diverse properties. Careful investigation of selected stimuli sheds light on how to improve existing models and how to develop next-generation IQA models. The gMAD framework is extensible, allowing future IQA models to be added to the competition.

    We then explore novel approaches to BIQA from two different perspectives. First, we show that a vast amount of reliable training data in the form of quality-discriminable image pairs (DIPs) can be obtained automatically at low cost, and we extend a pairwise learning-to-rank (L2R) algorithm to learn BIQA models from millions of DIPs. Second, we propose a multi-task deep neural network for BIQA. It consists of two sub-networks, a distortion identification network and a quality prediction network, which share their early layers. In the first stage, we train the distortion identification sub-network, for which large-scale training samples are readily available. In the second stage, starting from the pre-trained early layers and the outputs of the first sub-network, we train the quality prediction sub-network using a variant of stochastic gradient descent. Extensive experiments on four benchmark IQA databases demonstrate that the two proposed approaches outperform state-of-the-art BIQA models. The robustness of the learned models is also significantly improved, as confirmed by the gMAD competition methodology.
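
    The two-stage, shared-early-layers design described above can be sketched as a small PyTorch model. This is an illustrative sketch, not the thesis' architecture: all layer sizes, and the choice of 24 distortion classes, are invented for the example.

    ```python
    import torch
    import torch.nn as nn

    class MultiTaskBIQA(nn.Module):
        """Two sub-networks sharing early convolutional layers: one
        classifies the distortion type, the other regresses a quality
        score that also conditions on the distortion outputs."""
        def __init__(self, n_distortions=24):
            super().__init__()
            self.shared = nn.Sequential(  # shared early layers
                nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
                nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
                nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            )
            self.distortion_head = nn.Linear(64, n_distortions)  # stage 1
            self.quality_head = nn.Sequential(                   # stage 2
                nn.Linear(64 + n_distortions, 32), nn.ReLU(),
                nn.Linear(32, 1),
            )

        def forward(self, x):
            h = self.shared(x)
            d = self.distortion_head(h)
            # The quality head sees both the shared features and the
            # distortion-identification outputs, as described above.
            q = self.quality_head(torch.cat([h, d.softmax(dim=1)], dim=1))
            return d, q

    distortion_logits, quality = MultiTaskBIQA()(torch.randn(2, 3, 224, 224))
    ```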

    Multi-objective Search-based Mobile Testing

    Despite the tremendous popularity of mobile applications, mobile testing still relies heavily on manual testing. This thesis presents mobile test automation approaches based on multi-objective search. We introduce three approaches: Sapienz (for native Android app testing), Octopuz (for hybrid/web JavaScript app testing) and Polariz (for using crowdsourcing to support search-based mobile testing). These three approaches represent the primary scientific and technical contributions of the thesis. Since crowdsourcing is itself an emerging research area, less well understood than search-based software engineering, the thesis also provides the first comprehensive survey of the use of crowdsourcing in software testing (in particular) and in software engineering (more generally). This survey represents a secondary contribution.

    Sapienz is an approach to Android testing that uses multi-objective search-based testing to automatically explore and optimise test sequences, minimising their length while simultaneously maximising their coverage and fault revelation. The results of empirical studies demonstrate that Sapienz significantly outperforms both the state-of-the-art technique Dynodroid and the widely used tool Android Monkey on all three objectives. When applied to the top 1,000 Google Play apps, Sapienz found 558 unique, previously unknown crashes.

    Octopuz reuses the Sapienz multi-objective search approach for automated JavaScript testing, investigating whether it replicates the success of Sapienz on JavaScript. Experimental results on 10 real-world JavaScript apps provide evidence that Octopuz significantly outperforms the state of the art (and current state of practice) in automated JavaScript testing.

    Polariz is an approach that combines human (crowd) intelligence with machine (computational search) intelligence for mobile testing. It uses a platform that enables crowdsourced mobile testing from any source of app, via any terminal client, and by any crowd of workers. It generates replicable test scripts based on manual test traces produced by the crowd workforce, and automatically extracts from these traces motif events that can be used to improve search-based mobile testing approaches such as Sapienz.
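
    At the heart of any such multi-objective search is Pareto-dominance selection over the three objectives (coverage, fault revelation, sequence length). The sketch below shows that selection step in isolation; it is a minimal illustration, not Sapienz's implementation, and the fitness tuples and event names are invented.

    ```python
    def dominates(a, b):
        """a, b are (coverage, faults, -length) tuples: higher is better
        on every objective, so sequence length is negated."""
        return (all(x >= y for x, y in zip(a, b))
                and any(x > y for x, y in zip(a, b)))

    def pareto_front(population):
        """Return the non-dominated test sequences: no other candidate is
        at least as good on all objectives and strictly better on one."""
        return [p for p in population
                if not any(dominates(q["fitness"], p["fitness"])
                           for q in population if q is not p)]

    # Toy usage: each candidate is a test sequence with measured objectives.
    population = [
        {"seq": ["tap", "swipe"],       "fitness": (0.42, 1, -2)},
        {"seq": ["tap", "tap", "type"], "fitness": (0.55, 1, -3)},
        {"seq": ["swipe"],              "fitness": (0.30, 0, -1)},
    ]
    print([p["seq"] for p in pareto_front(population)])
    ```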

    Adaptivity of 3D web content in web-based virtual museums: a quality of service and quality of experience perspective

    The 3D Web emerged as an agglomeration of technologies that brought the third dimension to the World Wide Web. Its forms span from systems with limited 3D capabilities to complete and complex Web-based virtual worlds. The advent of the 3D Web gave museums great opportunities by providing an innovative medium for disseminating collections' information and associated interpretations in the form of digital artefacts and virtual reconstructions, leading to a revolutionary new way of curating, preserving and disseminating cultural heritage and thereby reaching a wider audience. This audience consumes 3D Web material on a myriad of devices (mobile devices, tablets and personal computers) and network regimes (WiFi, 4G, 3G, etc.). Choreographing and presenting 3D Web components across all these heterogeneous platforms and network regimes presents a significant challenge yet to be overcome. The challenge is to achieve a good user Quality of Experience (QoE) across all these platforms, which means that different levels of media fidelity may be appropriate; servers hosting those media types therefore need to adapt to the capabilities of a wide range of networks and devices. To achieve this, the research contributes the design and implementation of Hannibal, an adaptive QoS- and QoE-aware engine that allows Web-Based Virtual Museums to deliver the best possible user experience across those platforms. To ensure effective adaptivity of 3D content, this research furthers the understanding of the 3D Web in terms of Quality of Service (QoS), through empirical investigations of how 3D Web components perform and where their bottlenecks lie, and in terms of QoE, through studies of the subjective perception of fidelity of 3D digital heritage artefacts. The results of these experiments inform the design and implementation of Hannibal.
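
    A QoS- and QoE-aware engine of this kind must, at minimum, map measured network conditions and device capability to a fidelity level for the 3D content it serves. The sketch below shows one plausible such policy; the thresholds, device tiers and level names are invented and are not Hannibal's actual rules.

    ```python
    # Hypothetical level-of-detail table: (min_bandwidth_mbps, level) pairs,
    # checked from the most to the least demanding network requirement.
    LOD_LEVELS = [(8.0, "high"), (2.0, "medium"), (0.0, "low")]

    def select_lod(bandwidth_mbps, device_tier):
        """Pick the highest mesh/texture fidelity the network supports,
        then cap it by device capability (QoS-driven adaptation sketch)."""
        cap = {"desktop": "high", "tablet": "medium", "phone": "medium"}
        order = ["low", "medium", "high"]
        by_network = next(lvl for thr, lvl in LOD_LEVELS
                          if bandwidth_mbps >= thr)
        allowed = cap.get(device_tier, "low")
        return min(by_network, allowed, key=order.index)

    print(select_lod(12.0, "phone"))  # -> "medium": device caps the fidelity
    ```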