519 research outputs found

    Domain Generalization by Solving Jigsaw Puzzles

    Human adaptability relies crucially on the ability to learn and merge knowledge from both supervised and unsupervised learning: parents point out a few important concepts, and children then fill in the gaps on their own. This is particularly effective because supervised learning can never be exhaustive, so learning autonomously makes it possible to discover invariances and regularities that help generalization. In this paper we propose to apply a similar approach to the task of object recognition across domains: our model learns the semantic labels in a supervised fashion, and broadens its understanding of the data by learning from self-supervised signals how to solve a jigsaw puzzle on the same images. This secondary task helps the network learn the concepts of spatial correlation while acting as a regularizer for the classification task. Multiple experiments on the PACS, VLCS, Office-Home and digits datasets confirm our intuition and show that this simple method outperforms previous domain generalization and adaptation solutions. An ablation study further illustrates the inner workings of our approach. Comment: Accepted at CVPR 2019 (oral)
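
As an illustration of the self-supervised jigsaw signal described in this abstract, the sketch below cuts an image into a grid of tiles, shuffles them with a permutation drawn from a fixed set, and uses the index of that permutation as the auxiliary label. The 3x3 grid, the size of the permutation set, and all function names are assumptions made for this example and are not taken from the abstract; the classifier and the joint training loss are omitted.

```python
import random


def build_perm_set(n_tiles, n_perms, rng):
    """Build a small set of distinct tile permutations (hypothetical helper).

    Index 0 is the identity, i.e. the unshuffled image.
    """
    perms = [tuple(range(n_tiles))]
    while len(perms) < n_perms:
        candidate = list(range(n_tiles))
        rng.shuffle(candidate)
        if tuple(candidate) not in perms:
            perms.append(tuple(candidate))
    return perms


def split_into_tiles(image, grid=3):
    """Split a square image (list of pixel rows) into grid*grid tiles, row-major."""
    step = len(image) // grid
    tiles = []
    for r in range(grid):
        for c in range(grid):
            rows = image[r * step:(r + 1) * step]
            tiles.append([row[c * step:(c + 1) * step] for row in rows])
    return tiles


def assemble(tiles, grid=3):
    """Reassemble row-major tiles into a single image."""
    step = len(tiles[0])
    image = []
    for r in range(grid):
        for y in range(step):
            row = []
            for c in range(grid):
                row.extend(tiles[r * grid + c][y])
            image.append(row)
    return image


def make_jigsaw_sample(image, perm_set, rng, grid=3):
    """Return (shuffled_image, label); label is the index of the permutation used."""
    label = rng.randrange(len(perm_set))
    tiles = split_into_tiles(image, grid)
    shuffled = [tiles[i] for i in perm_set[label]]
    return assemble(shuffled, grid), label


# Usage: a toy 6x6 "image" whose pixel values encode their position.
rng = random.Random(0)
image = [[r * 6 + c for c in range(6)] for r in range(6)]
perms = build_perm_set(n_tiles=9, n_perms=5, rng=rng)  # set size is an assumption
shuffled, label = make_jigsaw_sample(image, perms, rng)
```

In a setup of this kind, a network would receive `shuffled` and be trained to predict `label` alongside the ordinary class label, so the puzzle task acts as an extra training signal on the same images.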

    Investigating human-perceptual properties of "shapes" using 3D shapes and 2D fonts

    Shapes are generally used to convey meaning. They are used in video games, films and other multimedia in diverse ways. 3D shapes may be destined for virtual scenes or represent objects to be constructed in the real world. Fonts add character to an otherwise plain block of text, allowing the writer to make important points more visually prominent or distinct from other text. They can indicate the structure of a document at a glance. Rather than studying shapes through traditional geometric shape descriptors, we provide alternative methods to describe and analyse shapes through the lens of human perception. This is done via the concepts of Schelling Points and Image Specificity. Schelling Points are choices people make when they aim to match what they expect others to choose but cannot communicate with others to determine an answer. We study whole mesh selections in this setting, where Schelling Meshes are the most frequently selected shapes. The key idea behind Image Specificity is that different images evoke different descriptions, but ‘Specific’ images yield more consistent descriptions than others. We apply Specificity to 2D fonts. We show that each concept can be learned, and we predict Specificity for fonts and Schelling Meshes for 3D shapes using a depth image-based convolutional neural network. Results are shown for a range of fonts and 3D shapes, and we demonstrate that font Specificity and the Schelling Meshes concept are useful for visualisation, clustering, and search applications. Overall, we find that each concept represents similarities between shapes of its respective type, even when there are discontinuities between the shape geometries themselves. The ‘context’ of these similarities lies in some kind of abstract or subjective meaning which is consistent among different people.

    NSL-BP: A Meta Classifier Model Based Prediction of Amazon Product Reviews

    In machine learning, predicting product ratings from semantic analysis of consumer reviews is a relevant topic. Amazon is one of the most popular online retailers, with millions of customers purchasing and reviewing products. Many research projects in the literature address rating prediction for a given review. In this research project, we introduce a novel approach that improves the accuracy of machine-learning rating prediction by processing the review text. We trained our model using several methods and propose a combined model that predicts a product's rating from the content of a given review. First, using k-means and LDA, we cluster the products and topics so that similar products and reviews are grouped together, which makes their ratings easier to predict. We trained low, neutral, and high models based on clusters and topics of products. Then, by adopting a stacking ensemble model, we combine Naïve Bayes, Logistic Regression, and SVM to predict the ratings. These models are combined into a two-level stack. We call this newly introduced model the NSL model and compare its prediction performance with that of other state-of-the-art methods.

    Concise and Effective Network for 3D Human Modeling from Orthogonal Silhouettes

    In this paper, we revisit the problem of 3D human modeling from two orthogonal silhouettes of individuals (i.e., front and side views). Different from our prior work \cite{wang2003virtual}, a supervised learning approach based on a convolutional neural network (CNN) is investigated to solve the problem by establishing a mapping function that can effectively extract features from two silhouettes and fuse them into coefficients in the shape space of human bodies. A new CNN structure is proposed in our work to extract not only the discriminative features of front and side views but also their mixed features for the mapping function. 3D human models with high accuracy are synthesized from coefficients generated by the mapping function. Existing CNN approaches for 3D human modeling usually learn a large number of parameters (from 8.5M to 355.4M) from two binary images. In contrast, we investigate a new network architecture and conduct the samples on silhouettes as input. As a consequence, more accurate models can be generated by our network with only 2.4M coefficients. The training of our network is conducted on samples obtained by augmenting a publicly accessible dataset. Transfer learning using datasets with a smaller number of scanned models is applied to our network so that it can generate results with gender-oriented (or geographical) patterns.

    Neural and visual correlates of perceptual decision making in adult dyslexia

    Humans have to make decisions based on visual information numerous times every day, for example, judging whether it is a friend or simply a nice stranger who is waving at us from the other side of the street, or whether the content of a contract we are about to sign is correct. In particular, perceptual decisions based on good reading comprehension might disadvantage people affected by the specific learning disorder dyslexia, characterised by impairments in reading and writing. In recent years, neuroscience has begun to uncover the neural basis of these impairments in children and adults. However, it remains unknown what neural differences might underlie impaired processing of the physical properties of written words, such as font type and style. The current thesis sought to characterise the neural and oculomotor temporal correlates of font-modulated reading comprehension while also probing a more fundamental deficit in non-linguistic sensory perceptual decision making in adult dyslexia by using a combination of electrophysiological and eye-tracking methods. The first of our three studies (Chapter 2) investigated the impact of italics, a commonly used font style for highlighting important content, on reading comprehension in a sentence reading lexical decision task. Overall, the performance of dyslexics was worse than that of non-dyslexics. Cluster-based event-related potential (ERP) analysis revealed that brain responses within the first 300 ms following the target (decision) word differed in amplitude and spatial distribution between dyslexics and non-dyslexics when processing italicised text. The two ERP components we observed within this period showed a dissociation in peak time, spatial profile, and their ability to predict behavioural performance. These findings emphasise the importance of choosing font style carefully to optimise word processing and reading comprehension by dyslexics.
Based on these differences, our second study (Chapter 3) asked whether a specific dyslexia font can be used to alleviate difficulties with reading comprehension in adult dyslexia, and what effects such a font has on cognitive and oculomotor mechanisms. Using standardised texts coupled with validated comprehension questions, we demonstrated that reading comprehension across all participants was better on trials presented in the dyslexia font OpenDyslexic compared to those presented in the traditional Times New Roman font. These benefits were larger among dyslexics. Conversely, participants’ reading speed was unaffected by OpenDyslexic. Our eye-tracking data showed increases in visual search intensity and ease of visual processing on OpenDyslexic trials in the form of decreases in median fixation duration and fixation to saccade ratio, as well as a smaller number of falsely programmed forward saccades among dyslexics. These findings provide empirical evidence for the efficacy of OpenDyslexic in longer texts and its ability to improve the visual reading strategy. Finally, recent evidence has shown that adults with dyslexia exhibit obvious fundamental deficits spanning multiple sensory systems when performing simple perceptual decision tasks, such as integrating beeps and flashes. These deficits extend beyond the well-established linguistic difficulties. In particular, dyslexics’ reading impairments are believed to be a consequence of deficient integration of congruent audio-visual information. However, it remains unclear whether dyslexic adults exhibit similar impairments when integrating audio-visual evidence in a non-linguistic perceptual decision task with noisy real-world objects. To address this question, and informed by our previous work in non-dyslexics, we used a linear multivariate discriminant analysis to investigate the extent to which audio-visual integration affects early sensory evidence encoding (‘early’) or later decision-related stages (‘late’) in dyslexia.
We found increased decision accuracy and slower response times during audio-visual trials for both groups. However, overall, dyslexics showed worse performance than non-dyslexics. When comparing audio-visual to visual trials, we observed that dyslexics exhibited an increase in the magnitude of an EEG component situated between the early and late processing stages. Conversely, non-dyslexics exhibited increased component amplitudes for a later post-sensory EEG component, consistent with a post-sensory influence of audio-visual integration. Our results suggest that adult dyslexics benefit from congruent audio-visual evidence of noisy perceptual stimuli to a similar extent but rely on a different neural process to achieve these improvements. In conclusion, our results provide novel insights into the neural dynamics and the visual and cognitive mechanisms underlying adult dyslexics’ perceptual decision making. They further offer empirical evidence and practical suggestions for easily implementable applications that can improve text comprehension for everyone.

    Putting the Semantics into Semantic Versioning

    The long-standing aspiration for software reuse has made astonishing strides in the past few years. Many modern software development ecosystems now come with rich sets of publicly-available components contributed by the community. Downstream developers can leverage these upstream components, boosting their productivity. However, components evolve at their own pace. This imposes obligations on and yields benefits for downstream developers, especially since changes can be breaking, requiring additional downstream work to adapt to. Upgrading too late leaves downstream vulnerable to security issues and missing out on useful improvements; upgrading too early results in excess work. Semantic versioning has been proposed as an elegant mechanism to communicate levels of compatibility, enabling downstream developers to automate dependency upgrades. While it is questionable whether a version number can adequately characterize version compatibility in general, we argue that developers would greatly benefit from tools such as semantic version calculators to help them upgrade safely. The time is now for the research community to develop such tools: large component ecosystems exist and are accessible, component interactions have become observable through automated builds, and recent advances in program analysis make the development of relevant tools feasible. In particular, contracts (both traditional and lightweight) are a promising input to semantic versioning calculators, which can suggest whether an upgrade is likely to be safe. Comment: to be published as Onward! Essays 202
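
To make the versioning convention concrete, the sketch below reads two MAJOR.MINOR.PATCH version strings and reports what kind of upgrade the numbers declare. It reflects only the declared compatibility level, which is exactly the signal the abstract cautions may not match actual behaviour; it is not one of the semantic version calculators the essay calls for. The function names are hypothetical, and pre-release and build-metadata suffixes are deliberately not handled.

```python
from typing import NamedTuple


class Version(NamedTuple):
    major: int
    minor: int
    patch: int


def parse(text: str) -> Version:
    """Parse a plain 'MAJOR.MINOR.PATCH' string (no pre-release or build tags)."""
    major, minor, patch = (int(part) for part in text.split("."))
    return Version(major, minor, patch)


def classify_upgrade(current: str, candidate: str) -> str:
    """Return which version component changes when moving to `candidate`."""
    old, new = parse(current), parse(candidate)
    if new <= old:
        return "not-an-upgrade"
    if new.major != old.major:
        return "major"  # declared as potentially breaking
    if new.minor != old.minor:
        return "minor"  # declared as backward-compatible additions
    return "patch"      # declared as backward-compatible fixes


def is_declared_compatible(current: str, candidate: str) -> bool:
    """True if the version numbers alone claim the upgrade is non-breaking.

    Versions with major == 0 are treated as unstable, following the
    convention that anything may change during initial development.
    """
    if parse(current).major == 0:
        return False
    return classify_upgrade(current, candidate) in ("minor", "patch")
```

A dependency bot built on this check would accept an upgrade from `1.4.2` to `1.9.0` automatically and hold back `2.0.0` for review. Whether the accepted upgrade is really safe depends on the upstream maintainers having numbered the release correctly.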

    Altered Alpha Oscillatory Power Dynamics Underlie Difficulties with Cognitive Flexibility

    Cognitive flexibility is an important mental faculty, but there are certain populations that experience reduced flexibility, which may be associated with altered neural activity. Rumination is when an individual becomes mentally stuck on a thought and experiences difficulty shifting their attention away from the ruminative thought, demonstrating reduced cognitive flexibility. In a similar manner, individuals diagnosed with substance use disorder show varying degrees of attentional bias towards drug related stimuli. The drug cues capture attention, and it is difficult for these individuals to shift attention away from thoughts related to drug cues. Both populations experience difficulty shifting attention when they experience highly salient thoughts (high automatic constraints). Here we suggest and demonstrate that reduced cognitive flexibility in these populations is associated with altered activity of alpha oscillations, as alpha oscillations play an important role in supporting cognitive flexibility. In our first study, we assess the relationship between trait tendency to ruminate and resting state alpha power in left frontal and parietal located electrodes. Individuals higher in trait rumination exhibit higher alpha power in left frontal located electrodes. This finding suggests that higher alpha power may contribute to mental inflexibility associated with rumination. In our second study, we assess the relationship between attentional bias towards drug cues and alpha power in two conditions: during an emotional version of the Stroop task, when automatic constraints on thought are high, and during a probabilistic reversal learning task, when drug cues are absent and automatic constraints are therefore low but flexibility is still required. The emotional version of the Stroop task includes traditional congruent and incongruent word meanings as well as drug related and neutral word meanings.
Participants in this study were long-term nicotine smokers, so the emotional stimuli were smoking related. The probabilistic reversal learning task instructs participants to choose one of two presented stimuli on each trial. The stimuli have different probabilities of reward or punishment. If the participant chooses the stimulus with the higher probability of reward several trials in a row, the reward probabilities reverse, and the participant must adapt to the new reward contingencies. Participants demonstrate the traditional Stroop effect of lower accuracy and slower reaction time during incongruent trials compared to congruent trials. Additionally, participants show a slowed reaction time during drug trials compared to neutral trials, suggesting attentional bias during drug trials. Greater attentional bias is associated with higher alpha power in left frontal electrodes during drug trials. No significant relationship between attentional bias and alpha power was found during the probabilistic reversal learning task. Together, these results suggest higher alpha power in left frontal regions may contribute to mental inflexibility prompted by attentional bias when automatic constraints are high, but when automatic constraints are low, flexibility may not be reduced. Taken together, these results reveal a relationship between reduced cognitive flexibility when salient stimuli or thoughts are present and altered alpha power dynamics, which may offer new avenues for behavioral intervention to improve cognitive flexibility.
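
The reversal rule of the probabilistic reversal learning task described above can be sketched as a small simulation. The reward probabilities (0.8 and 0.2), the streak length that triggers a reversal, and the class and method names are assumptions chosen for illustration; the abstract does not state the values used in the study.

```python
import random


class ReversalTask:
    """Two-stimulus task whose reward probabilities swap after a streak
    of choices of the currently better stimulus (illustrative sketch)."""

    def __init__(self, p_high=0.8, p_low=0.2, streak_to_reverse=5, rng=None):
        # probs[i] is the reward probability of stimulus i (assumed values).
        self.probs = [p_high, p_low]
        self.streak_to_reverse = streak_to_reverse
        self.streak = 0
        self.reversals = 0
        self.rng = rng or random.Random()

    def best_stimulus(self):
        """Index of the stimulus with the higher reward probability."""
        return 0 if self.probs[0] > self.probs[1] else 1

    def choose(self, stimulus):
        """Run one trial; return True for reward, False for punishment."""
        rewarded = self.rng.random() < self.probs[stimulus]
        if stimulus == self.best_stimulus():
            self.streak += 1
        else:
            self.streak = 0
        if self.streak >= self.streak_to_reverse:
            # Contingencies reverse: the other stimulus is now the better one.
            self.probs.reverse()
            self.streak = 0
            self.reversals += 1
        return rewarded


# Usage: picking stimulus 0 three times in a row triggers a reversal.
task = ReversalTask(streak_to_reverse=3, rng=random.Random(1))
for _ in range(3):
    task.choose(0)
```

After the loop the better stimulus is stimulus 1, so a participant who keeps choosing stimulus 0 is now mostly punished and has to switch, which is the flexibility demand the task is meant to probe.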

    IDEAS-1997-2021-Final-Programs

    This document records the final program for each of the 26 meetings of the International Database and Engineering Application Symposium from 1997 through 2021. These meetings were organized in various locations on three continents. Most of the papers published during these years are in the digital libraries of IEEE (1997-2007) or ACM (2008-2021).