
    Data Analysis in Multimedia Quality Assessment: Revisiting the Statistical Tests

    Assessment of multimedia quality relies heavily on subjective evaluation, typically performed by human subjects in the form of preferences or continuous ratings. Such data are crucial for the analysis of different multimedia processing algorithms as well as for the validation of objective (computational) quality-assessment methods. To that end, statistical testing provides a theoretical framework for drawing meaningful inferences and making well-grounded conclusions and recommendations. While parametric tests (such as the t-test and ANOVA) and error estimates such as confidence intervals are popular and widely used in the community, there appears to be a certain degree of confusion in the application of such tests. Specifically, the assumptions of normality and homogeneity of variance are often not well understood. Therefore, the main goal of this paper is to revisit these assumptions from a theoretical perspective and, in the process, provide useful insights into their practical implications. Experimental results on both simulated and real data are presented to support the arguments made. A software implementation of the recommendations is also made publicly available, in order to achieve the goal of reproducible research.
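    The assumption checks the abstract refers to can be sketched as follows. This is a minimal illustration (not the paper's released software): the ratings are synthetic, and the Shapiro-Wilk/Levene pre-tests and the non-parametric fallback are one common workflow, assumed here for concreteness.

    ```python
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(0)
    # Hypothetical subjective ratings for two processing methods (illustrative data)
    ratings_a = rng.normal(3.5, 0.8, size=30)
    ratings_b = rng.normal(3.9, 0.8, size=30)

    # Check the parametric assumptions the paper revisits:
    # normality (Shapiro-Wilk, per group) and homogeneity of variance (Levene)
    _, p_norm_a = stats.shapiro(ratings_a)
    _, p_norm_b = stats.shapiro(ratings_b)
    _, p_var = stats.levene(ratings_a, ratings_b)

    if min(p_norm_a, p_norm_b) > 0.05 and p_var > 0.05:
        # assumptions plausible -> two-sample t-test
        stat, p = stats.ttest_ind(ratings_a, ratings_b)
    else:
        # assumptions violated -> non-parametric alternative
        stat, p = stats.mannwhitneyu(ratings_a, ratings_b)

    print(round(p, 4))
    ```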

    Measurement errors in body size of sea scallops (Placopecten magellanicus) and their effect on stock assessment models

    Body-size measurement errors are usually ignored in stock assessments, but may be important when body-size data (e.g., from visual surveys) are imprecise. We used experiments and models to quantify measurement errors and their effects on assessment models for sea scallops (Placopecten magellanicus). Errors in size data obscured modes from strong year classes and increased the frequency and magnitude of the largest and smallest sizes, potentially biasing growth, mortality, and biomass estimates. Modeling techniques developed for errors in age data proved useful for errors in size data. In terms of goodness of model fit to the assessment data, it was more important to accommodate variance than bias. Models that accommodated size errors fitted size data substantially better. We recommend experimental quantification of errors along with a modeling approach that accommodates measurement errors, because a direct algebraic approach was not robust and because error parameters were difficult to estimate in our assessment model. The importance of measurement errors depends on many factors and should be evaluated on a case-by-case basis.
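    The effect described above, measurement error obscuring year-class modes and widening the size distribution, can be reproduced in a few lines. The size-class means and error standard deviation below are invented for illustration; the paper estimates error magnitudes experimentally.

    ```python
    import numpy as np

    rng = np.random.default_rng(1)
    # Hypothetical true shell heights (mm): two strong year classes -> bimodal
    true_sizes = np.concatenate([
        rng.normal(80, 3, 2000),
        rng.normal(110, 3, 2000),
    ])
    # Additive measurement error (sd assumed here, not from the paper)
    observed = true_sizes + rng.normal(0, 6, true_sizes.size)

    bins = np.arange(60, 131, 5)
    h_true, _ = np.histogram(true_sizes, bins)
    h_obs, _ = np.histogram(observed, bins)

    # Error flattens the modes (tallest observed bin is lower than tallest true bin)
    # and inflates the spread, pushing mass into the size extremes
    print(h_true.max() > h_obs.max(), observed.std() > true_sizes.std())
    ```

    This is exactly why naive use of observed size frequencies can bias growth and mortality estimates: the observed composition is a smeared version of the true one.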

    No Modes left behind: Capturing the data distribution effectively using GANs

    Generative adversarial networks (GANs), while very versatile in realistic image synthesis, are still sensitive to the input distribution. Given a dataset with an imbalanced distribution, the networks are susceptible to missing modes and failing to capture the full data distribution. While various methods have been proposed to improve the training of GANs, these have not addressed the challenge of covering the full data distribution: specifically, a generator is not penalized for missing a mode, so such methods remain susceptible to mode dropping. In this paper, we propose a simple approach that combines an encoder-based objective with novel loss functions for the generator and discriminator, improving the solution in terms of capturing missing modes. We validate that the proposed method yields substantial improvements through detailed analysis on toy and real datasets. The quantitative and qualitative results demonstrate that the proposed method mitigates the problem of missing modes and improves the training of GANs.
    Comment: accepted to the AAAI 2018 conference

    Subjective Annotation for a Frame Interpolation Benchmark using Artefact Amplification

    Current benchmarks for optical flow algorithms evaluate the estimation either directly, by comparing the predicted flow fields with the ground truth, or indirectly, by using the predicted flow fields for frame interpolation and then comparing the interpolated frames with the actual frames. In the latter case, objective quality measures such as the mean squared error are typically employed. However, it is well known that for image quality assessment, the actual quality experienced by the user cannot be fully deduced from such simple measures. Hence, we conducted a subjective quality assessment crowdsourcing study for the interpolated frames provided by one of the optical flow benchmarks, the Middlebury benchmark. We collected forced-choice paired comparisons between interpolated images and the corresponding ground truth. To increase the sensitivity of observers when judging minute differences in paired comparisons, we introduced a new method to the field of full-reference quality assessment, called artefact amplification. From the crowdsourcing data, we reconstructed absolute quality scale values according to Thurstone's model. As a result, we obtained a re-ranking of the 155 participating algorithms with respect to the visual quality of the interpolated frames. This re-ranking not only shows the necessity of visual quality assessment as another evaluation metric for optical flow and frame interpolation benchmarks; the results also provide the ground truth for designing novel image quality assessment (IQA) methods dedicated to the perceptual quality of interpolated images. As a first step, we proposed such a new full-reference method, called WAE-IQA. By weighting the local differences between an interpolated image and its ground truth, WAE-IQA performed slightly better than the currently best FR-IQA approach from the literature.
    Comment: arXiv admin note: text overlap with arXiv:1901.0536
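    Reconstructing absolute quality scores from forced-choice paired comparisons via Thurstone's model (Case V) can be sketched as below. The win-count matrix is invented for illustration, and the simple z-score averaging shown is one standard estimator; the paper's actual reconstruction from crowdsourced data may differ in detail.

    ```python
    import numpy as np
    from scipy.stats import norm

    # Hypothetical win-count matrix from forced-choice paired comparisons:
    # wins[i, j] = times condition i was preferred over condition j
    wins = np.array([
        [ 0, 18, 25],
        [12,  0, 20],
        [ 5, 10,  0],
    ], dtype=float)

    trials = wins + wins.T  # comparisons performed per pair
    with np.errstate(divide="ignore", invalid="ignore"):
        p = np.where(trials > 0, wins / trials, 0.5)  # preference probabilities

    # Thurstone Case V: scale difference = inverse normal CDF of preference
    # probability; clip to avoid infinite z-scores for unanimous preferences
    z = norm.ppf(np.clip(p, 0.01, 0.99))
    scale = z.mean(axis=1)  # average z against all opponents -> scale value

    print(np.argsort(scale)[::-1])  # ranking of conditions, best first
    ```

    Averaging each row of z-scores exploits the Case V assumption of equal, uncorrelated discriminal dispersions, which reduces scale estimation to a simple mean over opponents.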