
    CL-CrossVQA: A Continual Learning Benchmark for Cross-Domain Visual Question Answering

    Visual Question Answering (VQA) is a multidisciplinary research task. Producing the right answer requires an understanding of the visual content of images and of the natural language questions, as well as commonsense reasoning over the information contained in the image and world knowledge. Recently, large-scale Vision-and-Language Pre-trained Models (VLPMs) have become the mainstream approach to VQA tasks due to their superior performance. The standard practice is to fine-tune large-scale VLPMs, pre-trained on huge general-domain datasets, using domain-specific VQA datasets. In reality, however, the application domain can change over time, requiring VLPMs to continually learn and adapt to new domains without forgetting previously acquired knowledge. Most existing continual learning (CL) research concentrates on unimodal tasks, whereas a more practical application scenario, i.e., CL on cross-domain VQA, has not been studied. Motivated by this, we introduce CL-CrossVQA, a rigorous Continual Learning benchmark for Cross-domain Visual Question Answering, through which we conduct extensive experiments on 4 VLPMs, 4 CL approaches, and 5 VQA datasets from different domains. In addition, by probing the forgetting phenomenon of the intermediate layers, we provide insights into how model architecture affects CL performance, why CL approaches can help mitigate forgetting in VLPMs to some extent, and how to design CL approaches suitable for VLPMs in this challenging continual learning environment. To facilitate future work on CL for cross-domain VQA, we will release our datasets and code. Comment: 10 pages, 6 figures

    Visual7W: Grounded Question Answering in Images

    We have seen great progress in basic perceptual tasks such as object recognition and detection. However, AI models still fail to match humans in high-level vision tasks because they lack the capacity for deeper reasoning. Recently, the new task of visual question answering (QA) has been proposed to evaluate a model's capacity for deep image understanding. Previous works have established a loose, global association between QA sentences and images. In practice, however, many questions and answers relate to local regions in the images. We establish a semantic link between textual descriptions and image regions by object-level grounding. It enables a new type of QA with visual answers, in addition to the textual answers used in previous work. We study the visual QA tasks in a grounded setting with a large collection of 7W multiple-choice QA pairs. Furthermore, we evaluate human performance and several baseline models on the QA tasks. Finally, we propose a novel LSTM model with spatial attention to tackle the 7W QA tasks. Comment: CVPR 201

    Silent reading of direct versus indirect speech activates voice-selective areas in the auditory cortex

    In human communication, direct speech (e.g., Mary said: “I'm hungry”) is perceived to be more vivid than indirect speech (e.g., Mary said [that] she was hungry). However, for silent reading, the representational consequences of this distinction are still unclear. Although many of us share the intuition of an “inner voice,” particularly during silent reading of direct speech statements in text, there has been little direct empirical confirmation of this experience so far. Combining fMRI with eye tracking in human volunteers, we show that silent reading of direct versus indirect speech engenders differential brain activation in voice-selective areas of the auditory cortex. This suggests that readers are indeed more likely to engage in perceptual simulations (or spontaneous imagery) of the reported speaker's voice when reading direct speech as opposed to meaning-equivalent indirect speech statements, as part of a more vivid representation of the former. Our results may be interpreted in line with embodied cognition and form a starting point for more sophisticated interdisciplinary research on the nature of auditory mental simulation during reading.

    The economic value of intermediate vocational education and qualifications


    Does early success and market dominance help or hinder future innovative capability?

    Thesis (S.M.M.O.T.)--Massachusetts Institute of Technology, Sloan School of Management, Management of Technology Program, 2007. Includes bibliographical references (p. 65). Many successful innovative companies are acquired and become absorbed into larger, more structured organizations. The innovation capabilities of the company change in the new environment depending on the extent to which they are nurtured or overridden. This thesis looks at one particular story of such an acquisition and follows its progress after it has been formally integrated into the acquiring company. More than five years after the acquisition, the company's innovation is struggling, perhaps even more so in recent years. This thesis looks at the underlying causes of that struggle: the inflexibility of the larger, more structured organization and the resistance of the acquired company, due to its earlier success, to adapting itself to the rigours of a larger company. The thesis strives to answer the question: "Does early success and market dominance help or hinder future innovative capability?" The author worked with the acquired company as a management consultant for a four-year period beginning shortly after the company was acquired. The culture then was strikingly positive and very enjoyable to work in. It had an almost magnetic draw. During the years that followed, that culture began to be eroded and the atmosphere changed palpably as people struggled with the manner in which new systems and structures were being established. There was constantly a sense of being imposed on by the parent company rather than being supported. That eventually took its toll on people, and in recent years some key employees have left. Having stepped back from the organization, the author continues to reflect on what could have been done differently, along with what can be done today to retain and restore some of that strong company creativity and innovativeness. The author's underlying purpose for doing this thesis, in addition to answering the research questions, is to reaffirm the belief that profitable, successful businesses and strongly held values can and should coexist. by Sinead E. O'Flanagan. S.M.M.O.T.