
    Methods for the assessment and prediction of Quality of Experience, preference and visual discomfort in multimedia applications, with a focus on stereoscopic 3DTV

    Multimedia technology aims to improve people's viewing experience, seeking better immersiveness and naturalness. The development of HDTV, 3DTV, and Ultra HDTV are recent illustrative examples of this trend. The Quality of Experience (QoE) in multimedia encompasses multiple perceptual dimensions. For instance, in 3DTV, three primary dimensions have been identified in the literature: image quality, depth quality and visual comfort. In this thesis, focusing on 3DTV, two basic questions about QoE are studied. One is "how can QoE be assessed subjectively while accounting for its multidimensional nature?". The other is dedicated to one particular dimension: "what induces visual discomfort and how can it be predicted?". In the first part, the challenges of subjective QoE assessment are introduced, and a possible solution called "Paired Comparison" is analyzed. To overcome the drawbacks of the Paired Comparison method, a new formalism based on a set of optimized paired-comparison designs is proposed and evaluated through different subjective experiments. The test results verified the efficiency and robustness of this new formalism. An application focusing on the evaluation of factors influencing 3D QoE is then described. In the second part, the influence of 3D motion on visual discomfort is studied. An objective visual discomfort model is proposed; evaluated against subjective data collected through a paired-comparison experiment, the model showed high correlation with the subjective data obtained under various experimental conditions. Finally, a physiological study on the relationship between visual discomfort and eye blinking rate is presented.
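The paired-comparison experiments described above yield win counts between pairs of stimuli, which must then be converted into scale values. As a minimal illustration only (the Bradley-Terry model below is one standard scaling choice, not necessarily the formalism this thesis proposes), preference strengths can be estimated from a win-count matrix like so:

```python
import numpy as np

def bradley_terry(wins, n_iter=200):
    """Estimate Bradley-Terry strengths from a pairwise win-count matrix.

    wins[i][j] = number of times stimulus i was preferred over stimulus j.
    Uses the standard minorization-maximization update; strengths are
    normalised to sum to 1.
    """
    wins = np.asarray(wins, dtype=float)
    n = wins.shape[0]
    p = np.ones(n) / n
    for _ in range(n_iter):
        total = wins + wins.T                 # comparisons per pair
        w = wins.sum(axis=1)                  # total wins per stimulus
        denom = (total / (p[:, None] + p[None, :])).sum(axis=1)
        p = w / denom
        p /= p.sum()
    return p

# Three stimuli: A beats B 8/10, B beats C 7/10, A beats C 9/10
wins = [[0, 8, 9],
        [2, 0, 7],
        [1, 3, 0]]
scores = bradley_terry(wins)
print(scores)  # strengths ordered A > B > C
```

The resulting strengths give an interval-scale quality ordering of the stimuli from relative judgments alone, which is the basic appeal of paired comparison over direct rating.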

    Performance and Evaluation in Computed Tomographic Colonography Screening for Colorectal Cancer

    Each year over 20,000 people die from colorectal cancer (CRC). However, despite causing the second highest number of cancer deaths, CRC is not only curable if detected early but can be prevented by population screening. The detection and removal of pre-malignant polyps in the colon prevents cancer from ever developing. As such, screening of the at-risk population (those over 45-50 years) confers protection against CRC incidence and mortality. Although the principles and benefit of screening are well established, the adequate provision of screening is a complex process requiring robust healthcare infrastructure, evidence-based quality assurance and resources. The success of any screening programme is dependent on the accuracy of the screening investigations deployed and sufficiently high uptake by the target population. In England, the Bowel Cancer Screening Programme (BCSP) delivers screening via initial stool testing to triage patients for the endoscopic procedure, colonoscopy, or the radiological investigation CT colonography (CTC) in some patients. There has been considerable investment in colonoscopy accreditation processes which contribute to high quality services, suitable access for patients and a competent endoscopy workforce. The performance of colonoscopists in the BCSP is tightly monitored and regulated; however, the same is not true for CTC. Comparatively, there has been little investment in CTC services, and in fact there is no mandatory accreditation or centralised training. Instead, CTC reporting radiologists must learn ad hoc on the job, or at self-funded commercial workshops. This inevitably leads to variability in quality and expertise, inequity in service provision, and could negatively impact patient outcomes. To address this disparity and develop evidence-based training, one must determine what factors affect the performance of CTC reporting radiologists, what CTC training is necessary, and what training works. 
This thesis investigates these topics and is structured as follows: Section A reviews the background literature, describing the public health burden of CRC and the role of screening. Aspects of CTC screening and its role in the BCSP are explored. The importance of performance monitoring and the value of accreditation are examined, and the disparity between CTC, colonoscopy and other imaging-based screening programmes is discussed. Section B expands on radiologist performance by determining the post-imaging CRC (or interval cancer) rate through systematic review and meta-analysis. Factors contributing to the interval cancer rate are evaluated, and an observational study assessing factors affecting CTC accuracy is presented. The impact of CTC training is assessed via a structured review and best principles for training delivery are discussed. Section C presents a multicentre, cluster-randomised controlled trial developed from the data and understanding described in Sections A and B. Section D summarises the thesis and discusses future recommendations and research
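A pooled interval cancer rate of the kind produced in Section B typically comes from a random-effects meta-analysis of per-study proportions. As a hedged sketch (the event counts below are hypothetical, and the thesis's exact model may differ), DerSimonian-Laird pooling on the logit scale looks like:

```python
import math

def pool_rates_dl(events, totals):
    """Pool per-study event rates with a DerSimonian-Laird random-effects
    model on the logit scale (a common choice for proportion meta-analysis).
    Returns the back-transformed pooled rate."""
    y, v = [], []
    for e, n in zip(events, totals):
        p = (e + 0.5) / (n + 1.0)            # 0.5 continuity correction
        y.append(math.log(p / (1 - p)))       # logit-transformed rate
        v.append(1.0 / (e + 0.5) + 1.0 / (n - e + 0.5))
    w = [1.0 / vi for vi in v]                # fixed-effect weights
    ybar = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)
    # DerSimonian-Laird between-study variance tau^2
    q = sum(wi * (yi - ybar) ** 2 for wi, yi in zip(w, y))
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (len(y) - 1)) / c)
    w_star = [1.0 / (vi + tau2) for vi in v]  # random-effects weights
    pooled = sum(wi * yi for wi, yi in zip(w_star, y)) / sum(w_star)
    return 1.0 / (1.0 + math.exp(-pooled))    # back to a proportion

# Hypothetical interval-cancer counts from three screening cohorts
rate = pool_rates_dl(events=[4, 7, 2], totals=[1000, 2500, 800])
print(f"pooled rate: {rate:.4f}")
```

The random-effects weighting down-weights no study to zero but widens uncertainty when between-study heterogeneity (tau^2) is present.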

    A deep evaluator for image retargeting quality by geometrical and contextual interaction

    An image is compressed or stretched when displayed across devices, which strongly affects perceived quality. To address this problem, a variety of image retargeting methods have been proposed. However, evaluating the results of different retargeting methods is a critical issue. Since subjective evaluation cannot be applied at scale in practical systems, we frame the problem as accurate objective quality evaluation. Currently, most image retargeting quality assessment algorithms use simple regression as the final step to obtain the evaluation result, which does not correspond to how perception is simulated in the human visual system (HVS). In this paper, a deep quality evaluator for image retargeting based on a segmented stacked autoencoder (SAE) is proposed. With the help of regularization, the designed deep learning framework mitigates overfitting. The main contribution of this framework is to simulate the perception of retargeted images in the HVS. Specifically, it trains two separate SAE models based on geometrical shape and content matching. A weighting scheme is then used to combine the scores obtained from the two models. Experimental results on three well-known databases show that our method achieves better performance than traditional methods in evaluating different image retargeting results
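The building block of such an evaluator is an autoencoder trained to reconstruct its input features. As an illustrative sketch only (toy random data, a single tanh layer, and plain gradient descent rather than the paper's regularized segmented SAE):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: 200 feature vectors of dimension 16, stand-ins for the
# geometric-shape / content-matching descriptors an SAE would consume.
X = rng.normal(size=(200, 16))

# One autoencoder layer (16 -> 8 -> 16) trained by plain gradient descent.
W1 = rng.normal(scale=0.1, size=(16, 8))
W2 = rng.normal(scale=0.1, size=(8, 16))
loss_init = float(np.mean((np.tanh(X @ W1) @ W2 - X) ** 2))

lr = 0.01
for _ in range(500):
    H = np.tanh(X @ W1)             # encoder activation
    Xhat = H @ W2                   # linear decoder
    err = Xhat - X                  # reconstruction error
    gW2 = H.T @ err / len(X)        # gradient w.r.t. decoder weights
    gH = err @ W2.T * (1 - H ** 2)  # backprop through tanh
    gW1 = X.T @ gH / len(X)
    W1 -= lr * gW1
    W2 -= lr * gW2

loss_final = float(np.mean((np.tanh(X @ W1) @ W2 - X) ** 2))
print(f"reconstruction MSE: {loss_init:.3f} -> {loss_final:.3f}")
```

In a stacked configuration, the hidden code `H` of one trained layer becomes the input of the next; the paper's two SAE branches would each be such a stack, with their output scores merged by a learned weighting.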

    Temporal Dynamics of Decision-Making during Motion Perception in the Visual Cortex

    How does the brain make decisions? Speed and accuracy of perceptual decisions covary with certainty in the input, and correlate with the rate of evidence accumulation in parietal and frontal cortical "decision neurons." A biophysically realistic model of interactions within and between Retina/LGN and cortical areas V1, MT, MST, and LIP, gated by basal ganglia, simulates dynamic properties of decision-making in response to ambiguous visual motion stimuli used by Newsome, Shadlen, and colleagues in their neurophysiological experiments. The model clarifies how brain circuits that solve the aperture problem interact with a recurrent competitive network with self-normalizing choice properties to carry out probabilistic decisions in real time. Some scientists claim that perception and decision-making can be described using Bayesian inference or related general statistical ideas, which estimate the optimal interpretation of the stimulus given priors and likelihoods. However, such concepts do not propose the neocortical mechanisms that enable perception and decision-making. The present model explains behavioral and neurophysiological decision-making data without an appeal to Bayesian concepts and, unlike other existing models of these data, generates perceptual representations and choice dynamics in response to the experimental visual stimuli. Quantitative model simulations include the time course of LIP neuronal dynamics, as well as behavioral accuracy and reaction time properties, during both correct and error trials at different levels of input ambiguity in both fixed-duration and reaction-time tasks. Model MT/MST interactions compute the global direction of random dot motion stimuli, while model LIP computes the stochastic perceptual decision that leads to a saccadic eye movement.
    National Science Foundation (SBE-0354378, IIS-02-05271); Office of Naval Research (N00014-01-1-0624); National Institutes of Health (R01-DC-02852)
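The behavioral signatures described above (speed and accuracy covarying with input certainty) are often summarized, in reduced form, by bounded evidence accumulation. A minimal sketch of that reduced picture, not the article's biophysical circuit model:

```python
import random

def simulate_trials(coherence, n_trials=2000, threshold=10.0, seed=1):
    """Accumulate noisy evidence until it reaches a +/- threshold.
    `coherence` is the drift toward the correct (upper) bound.
    Returns (accuracy, mean reaction time in steps)."""
    rng = random.Random(seed)
    n_correct, rts = 0, []
    for _ in range(n_trials):
        x, t = 0.0, 0
        while abs(x) < threshold:
            x += coherence + rng.gauss(0.0, 1.0)  # drift plus noise
            t += 1
        if x >= threshold:          # upper bound = correct choice
            n_correct += 1
        rts.append(t)
    return n_correct / n_trials, sum(rts) / len(rts)

acc_low, rt_low = simulate_trials(coherence=0.05)   # ambiguous motion
acc_high, rt_high = simulate_trials(coherence=0.4)  # coherent motion
print(acc_low, rt_low, acc_high, rt_high)
```

Higher coherence yields faster and more accurate choices, the same qualitative pattern the full Retina/LGN-V1-MT/MST-LIP model reproduces mechanistically.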

    Clutter Detection and Removal in 3D Scenes with View-Consistent Inpainting

    Removing clutter from scenes is essential in many applications, ranging from privacy-concerned content filtering to data augmentation. In this work, we present an automatic system that removes clutter from 3D scenes and inpaints the result with coherent geometry and texture. We propose techniques for its two key components: 3D segmentation from shared properties and 3D inpainting, both of which are important problems. The definition of 3D scene clutter (frequently-moving objects) is not well captured by commonly-studied object categories in computer vision. To tackle the lack of well-defined clutter annotations, we group noisy fine-grained labels, leverage virtual rendering, and impose an instance-level area-sensitive loss. Once clutter is removed, we inpaint geometry and texture in the resulting holes by merging inpainted RGB-D images. This requires novel voting and pruning strategies that guarantee multi-view consistency across individually inpainted images for mesh reconstruction. Experiments on the ScanNet and Matterport datasets show that our method outperforms baselines for clutter segmentation and 3D inpainting, both visually and quantitatively.
    Comment: 18 pages. ICCV 2023. Project page: https://weify627.github.io/clutter

    Application and Test of Web-based Adaptive Polyhedral Conjoint Analysis

    In response to the need for more rapid and iterative feedback on customer preferences, researchers are developing new web-based conjoint analysis methods that adapt the design of conjoint questions based on a respondent's answers to previous questions. Adapting within a respondent is a difficult dynamic optimization problem and until recently adaptive conjoint analysis (ACA) was the dominant method available for addressing this adaptation. In this paper we apply and test a new polyhedral method that uses "interior-point" math programming techniques. This method is benchmarked against both ACA and an efficient non-adaptive design (Fixed). Over 300 respondents were randomly assigned to different experimental conditions and were asked to complete a web-based conjoint exercise. The conditions varied based on the design of the conjoint exercise. Respondents in one group completed a conjoint exercise designed using the ACA method, respondents in another group completed an exercise designed using the Fixed method, and the remaining respondents completed an exercise designed using the polyhedral method. Following the conjoint exercise respondents were given $100 and allowed to make a purchase from a Pareto choice set of five new-to-the-market laptop computer bags. The respondents received their chosen bag together with the difference in cash between the price of their chosen bag and the $100. We compare the methods on both internal and external validity. Internal validity is evaluated by comparing how well the different conjoint methods predict several holdout conjoint questions. External validity is evaluated by comparing how well the conjoint methods predict the respondents' selections from the choice sets of five bags. 
The results reveal a remarkable level of consistency across the two validation tasks. The polyhedral method was consistently more accurate than both the ACA and Fixed methods. However, even better performance was achieved by combining (post hoc) different components of each method to create a range of hybrid methods. Additional analyses evaluate the robustness of the predictions and explore alternative estimation methods such as Hierarchical Bayes. At the time of the test, the bags were prototypes. Based, in part, on the results of this study these bags are now commercially available.
    The Sloan School of Management, the Center for Innovation in Product Development at MIT and the EBusiness Center at MIT
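Both validation criteria above reduce to a hit-rate computation: predict each respondent's choice from estimated part-worth utilities and compare it with the observed selection. A sketch with hypothetical part-worths and holdout choice sets (all values below are illustrative, not the study's data):

```python
# Hypothetical part-worth utilities for three attributes of a laptop bag,
# as a conjoint method might estimate them; each inner list holds the
# utilities of that attribute's levels.
partworths = [
    [0.0, 1.2, 0.4],   # attribute 1 levels
    [0.0, 0.8],        # attribute 2 levels
    [0.0, -0.5, 0.3],  # attribute 3 levels
]

def utility(profile):
    """Total utility of a profile given as per-attribute level indices."""
    return sum(partworths[a][lvl] for a, lvl in enumerate(profile))

def hit_rate(choice_sets, actual_choices):
    """Fraction of holdout tasks where the highest-utility profile
    matches the respondent's actual selection."""
    hits = sum(
        max(range(len(cs)), key=lambda i: utility(cs[i])) == chosen
        for cs, chosen in zip(choice_sets, actual_choices)
    )
    return hits / len(choice_sets)

# Two holdout choice sets of three profiles each, with observed choices.
sets = [
    [(1, 1, 2), (0, 0, 0), (2, 1, 1)],
    [(2, 0, 2), (1, 1, 0), (0, 1, 1)],
]
rate = hit_rate(sets, actual_choices=[0, 1])
print(rate)  # -> 1.0 for this toy data
```

Internal validity uses holdout conjoint questions as the choice sets; external validity uses the real five-bag purchase task.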

    Geometric Structure Extraction and Reconstruction

    Geometric structure extraction and reconstruction is a long-standing problem in research communities including computer graphics, computer vision, and machine learning. Within different communities, it can be interpreted as different subproblems such as skeleton extraction from the point cloud, surface reconstruction from multi-view images, or manifold learning from high dimensional data. All these subproblems are building blocks of many modern applications, such as scene reconstruction for AR/VR, object recognition for robotic vision and structural analysis for big data. Despite its importance, the extraction and reconstruction of a geometric structure from real-world data are ill-posed, where the main challenges lie in the incompleteness, noise, and inconsistency of the raw input data. To address these challenges, three studies are conducted in this thesis: i) a new point set representation for shape completion, ii) a structure-aware data consolidation method, and iii) a data-driven deep learning technique for multi-view consistency. In addition to theoretical contributions, the algorithms we proposed significantly improve the performance of several state-of-the-art geometric structure extraction and reconstruction approaches, validated by extensive experimental results

    Change blindness: eradication of gestalt strategies

    Arrays of eight texture-defined rectangles were used as stimuli in a one-shot change blindness (CB) task where there was a 50% chance that one rectangle would change orientation between two successive presentations separated by an interval. CB was eliminated by cueing the target rectangle in the first stimulus, reduced by cueing in the interval, and unaffected by cueing in the second presentation. This supports the idea that a representation was formed that persisted through the interval before being 'overwritten' by the second presentation (Landman et al., 2003, Vision Research 43, 149–164). Another possibility is that participants used some kind of grouping or Gestalt strategy. To test this we changed the spatial position of the rectangles in the second presentation by shifting them along imaginary spokes (by ±1 degree) emanating from the central fixation point. There was no significant difference in performance between this and the standard task [F(1,4)=2.565, p=0.185]. This suggests two things: (i) Gestalt grouping is not used as a strategy in these tasks, and (ii) it gives further weight to the argument that objects may be stored in and retrieved from a pre-attentional store during this task