
    Revisiting Visual Question Answering Baselines

    Visual question answering (VQA) is an interesting learning setting for evaluating the abilities and shortcomings of current systems for image understanding. Many of the recently proposed VQA systems include attention or memory mechanisms designed to support "reasoning". For multiple-choice VQA, nearly all of these systems train a multi-class classifier on image and question features to predict an answer. This paper questions the value of these common practices and develops a simple alternative model based on binary classification. Instead of treating answers as competing choices, our model receives the answer as input and predicts whether or not an image-question-answer triplet is correct. We evaluate our model on the Visual7W Telling and the VQA Real Multiple Choice tasks, and find that even simple versions of our model perform competitively. Our best model achieves state-of-the-art performance on the Visual7W Telling task and compares surprisingly well with the most complex systems proposed for the VQA Real Multiple Choice task. We explore variants of the model and study its transferability between both datasets. We also present an error analysis of our model that suggests a key problem of current VQA systems lies in the lack of visual grounding of concepts that occur in the questions and answers. Overall, our results suggest that the performance of current VQA systems is not significantly better than that of systems designed to exploit dataset biases.
    Comment: European Conference on Computer Visio

    Ordinal coding of image microstructure


    Joint induction of shape features and tree classifiers


    Moment instabilities in multidimensional systems with noise

    We present a systematic study of moment evolution in multidimensional stochastic difference systems, focusing on characterizing systems whose low-order moments diverge in the neighborhood of a stable fixed point. We consider systems with a simple, dominant eigenvalue and stationary, white noise. When the noise is small, we obtain general expressions for the approximate asymptotic distribution and moment Lyapunov exponents. In the case of larger noise, the second moment is calculated using a different approach, which gives an exact result for some types of noise. We analyze the dependence of the moments on the system's dimension, relevant system properties, the form of the noise, and the magnitude of the noise. We determine a critical value for noise strength, as a function of the unperturbed system's convergence rate, above which the second moment diverges and large fluctuations are likely. Analytical results are validated by numerical simulations. We show that our results cannot be extended to the continuous time limit except in certain special cases.
    Comment: 21 pages, 15 figure
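The idea of a critical noise strength can be illustrated with a one-dimensional toy case. The sketch below is not the paper's multidimensional analysis: it assumes the scalar map x_{n+1} = (a + sigma*xi_n) x_n with independent, zero-mean, unit-variance noise xi_n, for which E[x_{n+1}^2] = (a^2 + sigma^2) E[x_n^2]. The mean then decays whenever |a| < 1, while the second moment diverges once sigma exceeds sqrt(1 - a^2). All function names are hypothetical.

```python
import math
import random


def second_moment_factor(a, sigma):
    """Per-step growth factor of E[x_n^2] for x_{n+1} = (a + sigma*xi_n) * x_n."""
    return a * a + sigma * sigma


def critical_noise(a):
    """Noise strength above which the second moment diverges (toy scalar case)."""
    if abs(a) >= 1:
        raise ValueError("the unperturbed map must be stable: |a| < 1")
    return math.sqrt(1.0 - a * a)


def simulate_second_moment(a, sigma, steps, samples, seed=0):
    """Monte Carlo estimate of E[x_steps^2] starting from x_0 = 1, Gaussian noise."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(samples):
        x = 1.0
        for _ in range(steps):
            x *= a + sigma * rng.gauss(0.0, 1.0)
        total += x * x
    return total / samples


a = 0.9                                   # assumed convergence rate of the noiseless map
sigma_c = critical_noise(a)               # sqrt(1 - 0.81)
exact = second_moment_factor(a, 0.3) ** 5
estimate = simulate_second_moment(a, 0.3, steps=5, samples=20000)
print(f"critical sigma = {sigma_c:.4f}, exact = {exact:.4f}, estimate = {estimate:.4f}")
```

Below the critical value the growth factor is less than one and the second moment decays. Above it the factor exceeds one, so rare large excursions dominate the average even though typical trajectories still shrink.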

    3D Model based stereo reconstruction using coupled Markov random fields


    A Solution to the Galactic Foreground Problem for LISA

    Low frequency gravitational wave detectors, such as the Laser Interferometer Space Antenna (LISA), will have to contend with large foregrounds produced by millions of compact galactic binaries in our galaxy. While these galactic signals are interesting in their own right, the unresolved component can obscure other sources. The science yield for the LISA mission can be improved if the brighter and more isolated foreground sources can be identified and regressed from the data. Since the signals overlap with one another we are faced with a "cocktail party" problem of picking out individual conversations in a crowded room. Here we present and implement an end-to-end solution to the galactic foreground problem that is able to resolve tens of thousands of sources from across the LISA band. Our algorithm employs a variant of the Markov Chain Monte Carlo (MCMC) method, which we call the Blocked Annealed Metropolis-Hastings (BAM) algorithm. Following a description of the algorithm and its implementation, we give several examples ranging from searches for a single source to searches for hundreds of overlapping sources. Our examples include data sets from the first round of Mock LISA Data Challenges.
    Comment: 19 pages, 27 figure

    Multi-State Image Restoration by Transmission of Bit-Decomposed Data

    We report on the restoration of a gray-scale image that is decomposed into binary form before transmission. We assume that a gray-scale image expressed by a set of Q-Ising spins is first decomposed into Ising (binary) spins by threshold division: from each Q-Ising spin we produce (Q-1) binary Ising spins using the function F(\sigma_i - m) = 1 if the input data \sigma_i \in \{0,\ldots,Q-1\} satisfies \sigma_i \geq m and 0 otherwise, where m \in \{1,\ldots,Q-1\} is the threshold value. The effects of noise differ from the case where the raw Q-Ising values are sent. We investigate whether it is more effective to transmit the binary data or to send the raw Q-Ising values. Using the mean-field model, we first analyze the performance of our method quantitatively. We then obtain the static and dynamical properties of restoration using the bit-decomposed data. To investigate what kind of original picture is efficiently restored by our method, the standard image in two dimensions is simulated by mean-field annealing, and we compare the performance of our method with that using the Q-Ising form. We show that our method is more efficient than the one using the Q-Ising form when the original picture has large regions in which nearest-neighbor pixels take close values.
    Comment: latex 24 pages using REVTEX, 10 figures, 4 table
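The threshold division described in this abstract can be sketched in a few lines. The example covers only the noise-free decomposition and its inverse, not the restoration method itself. It assumes pixel values in {0, ..., Q-1} and relies on the fact that the number of thresholds m with sigma >= m equals sigma, so summing the bits recovers the gray level. The function names `decompose` and `recompose` are hypothetical.

```python
def decompose(sigma, q):
    """Return the Q-1 threshold bits F(sigma - m) for m = 1, ..., Q-1."""
    if not 0 <= sigma < q:
        raise ValueError("pixel value must lie in {0, ..., Q-1}")
    return [1 if sigma >= m else 0 for m in range(1, q)]


def recompose(bits):
    """Invert the noise-free decomposition: the gray level is the number of set bits."""
    return sum(bits)


Q = 4                                     # assumed number of gray levels
image = [[0, 3, 1],
         [2, 2, 0]]

# Each pixel becomes Q-1 binary values, one per threshold.
planes = [[decompose(pixel, Q) for pixel in row] for row in image]
restored = [[recompose(bits) for bits in row] for row in planes]

print(planes[0][1])       # bits for the pixel with value 3
print(restored == image)
```

Once transmission noise flips individual bits, the received bit vector need not have the form 1...10...0 produced by `decompose`, which is one reason the noise acts differently on this representation than on raw Q-level values.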

    Smart sensing systems for in-home health status and emotional well-being monitoring during COVID-19

    The COVID-19 pandemic has restricted the mobility of the population. Experts have proposed several solutions to reduce the number of infections by treating and monitoring patients in the comfort of their own homes. A new research direction has been identified: healthcare smart sensing systems that can provide medical diagnosis, surveillance, and treatment partially or totally remotely. Wearable smart sensing solutions are becoming widely accepted, including for home health status monitoring. Pervasive computing and wearable solutions feature frequently in current projects and are expected to feature in future developments, particularly in the pandemic context, which forces people to remain mostly at home. For wearable devices, textile design, computer science, and smart materials are the three major development directions. The latest developments associated with the monitoring of health status and emotional well-being are presented and discussed in this chapter.

    Geodesics of Random Riemannian Metrics

    We analyze the disordered Riemannian geometry resulting from random perturbations of the Euclidean metric. We focus on geodesics, the paths traced out by a particle traveling in this quenched random environment. By taking the point of view of the particle, we show that the law of its observed environment is absolutely continuous with respect to the law of the random metric, and we provide an explicit form for its Radon-Nikodym derivative. We use this result to prove a "local Markov property" along an unbounded geodesic, demonstrating that it eventually encounters any type of geometric phenomenon. We also develop in this paper some general results on conditional Gaussian measures. Our Main Theorem states that a geodesic chosen with random initial conditions (chosen independently of the metric) is almost surely not minimizing. To demonstrate this, we show that a minimizing geodesic is guaranteed to eventually pass over a certain "bump surface," which locally has constant positive curvature. By using Jacobi fields, we show that this is sufficient to destabilize the minimizing property.
    Comment: 55 pages. Supplementary material at arXiv:1206.494

    Interpretable by Design: Learning Predictors by Composing Interpretable Queries

    There is a growing concern about typically opaque decision-making with high-performance machine learning algorithms. Providing an explanation of the reasoning process in domain-specific terms can be crucial for adoption in risk-sensitive domains such as healthcare. We argue that machine learning algorithms should be interpretable by design and that the language in which these interpretations are expressed should be domain- and task-dependent. Consequently, we base our model's prediction on a family of user-defined and task-specific binary functions of the data, each having a clear interpretation to the end-user. We then minimize the expected number of queries needed for accurate prediction on any given input. As the solution is generally intractable, following prior work, we choose the queries sequentially based on information gain. However, in contrast to previous work, we need not assume the queries are conditionally independent. Instead, we leverage a stochastic generative model (VAE) and an MCMC algorithm (Unadjusted Langevin) to select the most informative query about the input based on previous query-answers. This enables the online determination of a query chain of whatever depth is required to resolve prediction ambiguities. Finally, experiments on vision and NLP tasks demonstrate the efficacy of our approach and its superiority over post-hoc explanations.
    Comment: 29 pages, 14 figures. Accepted as a Regular Paper in Transactions on Pattern Analysis and Machine Intelligenc
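Sequential query selection by information gain can be sketched on a toy table. This is not the paper's method: the abstract describes a VAE with Unadjusted Langevin sampling, whereas the sketch below swaps in exact counting over a small labelled table so that the greedy step is easy to follow. The data, query names, and function names are all hypothetical.

```python
import math
from collections import Counter


def label_entropy(rows):
    """Shannon entropy (bits) of the labels in rows, each row being (x, label)."""
    counts = Counter(label for _, label in rows)
    total = len(rows)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(rows, query):
    """Expected drop in label entropy after observing the binary answer query(x)."""
    yes = [r for r in rows if query(r[0])]
    no = [r for r in rows if not query(r[0])]
    conditional = sum(len(part) / len(rows) * label_entropy(part)
                      for part in (yes, no) if part)
    return label_entropy(rows) - conditional


def predict(x, rows, queries):
    """Ask the most informative remaining query until the label is unambiguous."""
    remaining = dict(queries)
    asked = []
    while remaining and label_entropy(rows) > 0:
        name = max(remaining, key=lambda n: information_gain(rows, remaining[n]))
        if information_gain(rows, remaining[name]) <= 0:
            break                              # no remaining query helps
        query = remaining.pop(name)
        answer = query(x)
        asked.append((name, answer))
        # Condition on the answer: keep only rows consistent with the history.
        rows = [r for r in rows if query(r[0]) == answer]
    label = Counter(label for _, label in rows).most_common(1)[0][0]
    return label, asked


# Hypothetical toy data: attribute dictionaries with class labels.
data = [
    ({"wings": True, "swims": False}, "bird"),
    ({"wings": True, "swims": True}, "bird"),
    ({"wings": False, "swims": True}, "fish"),
    ({"wings": False, "swims": False}, "cat"),
]
queries = {
    "has wings?": lambda x: x["wings"],
    "swims?": lambda x: x["swims"],
}

print(predict({"wings": True, "swims": False}, data, queries))
print(predict({"wings": False, "swims": True}, data, queries))
```

The chain length adapts to the input: one query settles the first example, while the second needs two. Because each step conditions on the full answer history, no conditional independence between queries is assumed, though exact counting of this kind only scales to very small tables.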