Data Analysis in Multimedia Quality Assessment: Revisiting the Statistical Tests
Assessment of multimedia quality relies heavily on subjective evaluation,
typically performed by human subjects in the form of preferences or continuous
ratings. Such data are crucial for the analysis of different multimedia
processing algorithms as well as for the validation of objective
(computational) methods for the same purpose. To that end, statistical testing
provides a theoretical framework for drawing meaningful inferences and making
well-grounded conclusions and recommendations. While parametric tests (such as
the t-test and ANOVA) and error estimates such as confidence intervals are
popular and widely used in the community, there appears to be a certain degree
of confusion in their application. Specifically, the assumptions of normality
and homogeneity of variance are often not well understood. The main goal of
this paper is therefore to revisit these assumptions from a theoretical
perspective and, in the process, provide useful insights into their practical
implications. Experimental results on both simulated and real data are
presented to support the arguments made. Software implementing the
recommendations is also made publicly available, in order to achieve the goal
of reproducible research.
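The workflow the abstract describes, checking normality and homogeneity of variance before choosing a parametric test, can be sketched in Python with `scipy.stats`. This is a minimal illustration on simulated ratings, not the paper's published software; the rating distributions and the 0.05 threshold are assumptions for the example.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
# Simulated subjective ratings from two hypothetical test conditions
ratings_a = rng.normal(loc=3.5, scale=0.8, size=30)
ratings_b = rng.normal(loc=4.0, scale=0.8, size=30)

# Check the normality assumption for each group (Shapiro-Wilk)
_, p_norm_a = stats.shapiro(ratings_a)
_, p_norm_b = stats.shapiro(ratings_b)

# Check homogeneity of variance across groups (Levene's test)
_, p_levene = stats.levene(ratings_a, ratings_b)

if p_norm_a > 0.05 and p_norm_b > 0.05 and p_levene > 0.05:
    # Assumptions plausible: classic two-sample t-test
    stat, p = stats.ttest_ind(ratings_a, ratings_b)
else:
    # Variance assumption doubtful: Welch's t-test drops it
    stat, p = stats.ttest_ind(ratings_a, ratings_b, equal_var=False)
```

Failing the normality checks would instead suggest a non-parametric alternative such as `stats.mannwhitneyu`.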
Measurement errors in body size of sea scallops (Placopecten magellanicus) and their effect on stock assessment models
Body-size measurement errors are usually ignored in stock assessments but may
be important when body-size data (e.g., from visual surveys) are imprecise. We
used experiments and models to quantify measurement errors and their effects on
assessment models for sea scallops (Placopecten magellanicus). Errors in size
data obscured modes from strong year classes and increased the frequency and
size of the largest and smallest sizes, potentially biasing growth, mortality,
and biomass estimates. Modeling techniques for errors in age data proved useful
for errors in size data. In terms of goodness of model fit to the assessment
data, it was more important to accommodate variance than bias; models that
accommodated size errors fitted the size data substantially better. We
recommend experimental quantification of errors along with a modeling approach
that accommodates measurement errors, because a direct algebraic approach was
not robust and because error parameters were difficult to estimate in our
assessment model. The importance of measurement errors depends on many factors
and should be evaluated on a case-by-case basis.
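The central effect described here, measurement error obscuring year-class modes and inflating the tails of the size distribution, is easy to reproduce with a simulation. The sizes, mode locations, and error standard deviation below are illustrative assumptions, not values from the study.

```python
import numpy as np

rng = np.random.default_rng(1)
# True shell heights (mm): two strong year classes form two clear modes
true_sizes = np.concatenate([
    rng.normal(90, 4, 5000),   # younger year class
    rng.normal(120, 4, 5000),  # older year class
])
# Additive measurement error, e.g. from an imprecise visual survey
measured = true_sizes + rng.normal(0, 8, true_sizes.size)

bins = np.arange(60, 151, 2)
true_hist, _ = np.histogram(true_sizes, bins)
meas_hist, _ = np.histogram(measured, bins)
# The measured distribution is wider: modes blur together and
# extreme (largest/smallest) sizes become more frequent
```

Plotting `true_hist` against `meas_hist` shows the two modes merging into a broad hump, which is the pattern the paper reports biasing growth and mortality estimates.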
No Modes left behind: Capturing the data distribution effectively using GANs
Generative adversarial networks (GANs), while very versatile in realistic
image synthesis, are still sensitive to the input distribution. Given a set of
data with an imbalanced distribution, the networks are susceptible to missing
modes and failing to capture the data distribution. While various methods have
been tried to improve the training of GANs, they have not addressed the
challenge of covering the full data distribution: specifically, a generator is
not penalized for missing a mode, and we show that such methods therefore
remain susceptible to not capturing the full data distribution.
In this paper, we propose a simple approach that combines an encoder-based
objective with novel loss functions for the generator and discriminator,
improving the solution in terms of capturing missing modes. We validate that
the proposed method yields substantial improvements through detailed analysis
on toy and real datasets. The quantitative and qualitative results demonstrate
that the proposed method alleviates the problem of missing modes and improves
the training of GANs.
Comment: accepted to the AAAI 2018 conference
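Mode dropping of the kind this abstract targets is commonly diagnosed on toy Gaussian-mixture data by counting how many target modes receive generated samples. The sketch below shows such a mode-coverage check; the ring of 8 modes, the collapsed "generator" samples, and the `covered_modes` helper are all hypothetical illustrations, not the paper's method.

```python
import numpy as np

rng = np.random.default_rng(2)
# Toy target: 8 Gaussian modes arranged on a ring of radius 2
angles = np.linspace(0, 2 * np.pi, 8, endpoint=False)
modes = np.stack([np.cos(angles), np.sin(angles)], axis=1) * 2.0

# Hypothetical generator output that has collapsed onto 3 of the 8 modes
samples = modes[rng.integers(0, 3, 1000)] + rng.normal(0, 0.05, (1000, 2))

def covered_modes(samples, modes, radius=0.2, min_count=10):
    """Count modes with at least `min_count` samples within `radius`."""
    d = np.linalg.norm(samples[:, None, :] - modes[None, :, :], axis=-1)
    counts = (d < radius).sum(axis=0)
    return int((counts >= min_count).sum())

n_covered = covered_modes(samples, modes)  # 3: five modes are missed
```

An encoder-based objective of the kind proposed in the paper aims to push `n_covered` toward the full mode count by penalizing the generator for regions of data space it cannot reconstruct.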
Subjective Annotation for a Frame Interpolation Benchmark using Artefact Amplification
Current benchmarks for optical flow algorithms evaluate the estimation either
directly, by comparing the predicted flow fields with the ground truth, or
indirectly, by using the predicted flow fields for frame interpolation and then
comparing the interpolated frames with the actual frames. In the latter case,
objective quality measures such as the mean squared error are typically
employed. However, it is well known that for image quality assessment, the
actual quality experienced by the user cannot be fully deduced from such simple
measures. Hence, we conducted a subjective quality assessment crowdsourcing
study for the interpolated frames provided by one of the optical flow
benchmarks, the Middlebury benchmark. We collected forced-choice paired
comparisons between interpolated images and the corresponding ground truth. To
increase the sensitivity of observers when judging minute differences in paired
comparisons, we introduced a new method to the field of full-reference quality
assessment, called artefact amplification. From the crowdsourcing data, we
reconstructed absolute quality scale values according to Thurstone's model. As
a result, we obtained a re-ranking of the 155 participating algorithms with
respect to the visual quality of the interpolated frames. This re-ranking not
only shows the necessity of visual quality assessment as another evaluation
metric for optical flow and frame interpolation benchmarks; the results also
provide the ground truth for designing novel image quality assessment (IQA)
methods dedicated to the perceptual quality of interpolated images. As a first
step, we propose such a new full-reference method, called WAE-IQA. By weighting
the local differences between an interpolated image and its ground truth,
WAE-IQA performed slightly better than the currently best FR-IQA approach from
the
literature.
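Reconstructing absolute quality scale values from forced-choice paired comparisons via Thurstone's model (Case V) amounts to converting empirical win probabilities into z-scores and averaging. The sketch below illustrates that computation on a small hypothetical win-count matrix; it is not the paper's actual pipeline or data.

```python
import numpy as np
from scipy.stats import norm

# Hypothetical win counts: wins[i, j] = times stimulus i preferred over j
wins = np.array([
    [ 0, 42, 55],
    [18,  0, 40],
    [ 5, 20,  0],
])
trials = wins + wins.T  # total comparisons per pair

# Empirical preference probabilities, clipped away from 0/1 so that
# the inverse normal CDF stays finite
p = np.where(trials > 0, wins / np.maximum(trials, 1), 0.5)
p = np.clip(p, 0.01, 0.99)
np.fill_diagonal(p, 0.5)

# Thurstone Case V: scale value = row mean of the z-scored probabilities
z = norm.ppf(p)
scale = z.mean(axis=1)
ranking = np.argsort(-scale)  # best stimulus first
```

Applied to the crowdsourced comparison counts, the same row-mean construction yields one scale value per algorithm, which is what the re-ranking of the 155 entries is based on.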