Domain Generalization by Solving Jigsaw Puzzles
Human adaptability relies crucially on the ability to learn and merge
knowledge from both supervised and unsupervised learning: the parents point out
a few important concepts, but then the children fill in the gaps on their own.
This is particularly effective because supervised learning can never be
exhaustive, and thus learning autonomously allows one to discover invariances and
regularities that help to generalize. In this paper we propose to apply a
similar approach to the task of object recognition across domains: our model
learns the semantic labels in a supervised fashion, and broadens its
understanding of the data by learning from self-supervised signals how to solve
a jigsaw puzzle on the same images. This secondary task helps the network to
learn the concepts of spatial correlation while acting as a regularizer for the
classification task. Multiple experiments on the PACS, VLCS, Office-Home and
digits datasets confirm our intuition and show that this simple method
outperforms previous domain generalization and adaptation solutions. An
ablation study further illustrates the inner workings of our approach.
Comment: Accepted at CVPR 2019 (oral).
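The jigsaw pretext task described in this abstract can be sketched in a few lines. The code below is an illustrative reconstruction, not the paper's implementation: the function names, the 3x3 grid, and the loss weight `alpha` are assumptions. An image is split into tiles, the tiles are shuffled by one of a fixed set of permutations, and the index of the applied permutation becomes the self-supervised label; the jigsaw loss is added to the classification loss as a regularizer.

```python
import numpy as np

def make_jigsaw_example(image, permutations, rng):
    """Split a square image into a 3x3 grid, shuffle the tiles by a randomly
    chosen permutation, and return the shuffled tiles plus the permutation
    index (the self-supervised pretext label)."""
    h, w = image.shape
    th, tw = h // 3, w // 3
    tiles = [image[r * th:(r + 1) * th, c * tw:(c + 1) * tw]
             for r in range(3) for c in range(3)]
    perm_idx = int(rng.integers(len(permutations)))
    shuffled = [tiles[i] for i in permutations[perm_idx]]
    return shuffled, perm_idx

def joint_loss(class_loss, jigsaw_loss, alpha=0.7):
    # The jigsaw head regularizes the classifier: its loss is simply added
    # with a weight (alpha is a hypothetical value, not from the paper).
    return class_loss + alpha * jigsaw_loss
```

In practice the permutation set is a small subset of all 9! tile orderings chosen to be maximally distinct, so the auxiliary head solves an ordinary classification problem over permutation indices.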
Investigating human-perceptual properties of "shapes" using 3D shapes and 2D fonts
Shapes are generally used to convey meaning. They are used in video games, films and other multimedia, in diverse ways. 3D shapes may be destined for virtual scenes or represent objects to be constructed in the real world. Fonts add character to an otherwise plain block of text, allowing the writer to make important points more visually prominent or distinct from other text. They can indicate the structure of a document at a glance. Rather than studying shapes through traditional geometric shape descriptors, we provide alternative methods to describe and analyse shapes through the lens of human perception. This is done via the concepts of Schelling Points and Image Specificity. Schelling Points are choices people make when they aim to match what they expect others to choose but cannot communicate with others to determine an answer. We study whole-mesh selections in this setting, where Schelling Meshes are the most frequently selected shapes. The key idea behind Image Specificity is that different images evoke different descriptions, but "specific" images yield more consistent descriptions than others. We apply Specificity to 2D fonts. We show that each concept can be learned and predicted for fonts and 3D shapes, respectively, using a depth image-based convolutional neural network. Results are shown for a range of fonts and 3D shapes, and we demonstrate that font Specificity and the Schelling Meshes concept are useful for visualisation, clustering, and search applications. Overall, we find that each concept represents similarities between its respective type of shape, even when there are discontinuities between the shape geometries themselves. The "context" of these similarities lies in some kind of abstract or subjective meaning which is consistent among different people.
NSL-BP: A Meta Classifier Model Based Prediction of Amazon Product Reviews
In machine learning, product rating prediction based on semantic analysis of consumers' reviews is a relevant topic. Amazon is one of the most popular online retailers, with millions of customers purchasing and reviewing products. In the literature, many research projects work on predicting the rating of a given review. In this research project, we introduce a novel approach that enhances the accuracy of rating prediction by processing the review text with machine learning methods. We trained our model using several methods, and we propose a combined model to predict the ratings of products corresponding to a given review content. First, using k-means and LDA, we cluster the products and topics, so that grouping the same kinds of products and reviews together makes the ratings easier to predict. We trained low, neutral, and high models based on the clusters and topics of products. Then, by adopting a stacking ensemble model, we combine Naïve Bayes, Logistic Regression, and SVM to predict the ratings, combining these models into a two-level stack. We call this newly introduced model the NSL model and compare its prediction performance with state-of-the-art methods.
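The two-level stack described above can be illustrated with a minimal sketch. The base scorers and the meta rule below are toy stand-ins (assumptions for illustration), not the abstract's actual Naïve Bayes / Logistic Regression / SVM pipeline: level 1 produces one score per base model, and level 2 combines those scores into the final rating.

```python
def stack_predict(base_models, meta_model, x):
    """Two-level stacking: each base model scores the input (level 1),
    then the meta model combines those scores into a final prediction
    (level 2)."""
    level1_scores = [m(x) for m in base_models]
    return meta_model(level1_scores)

# Hypothetical usage: three toy base scorers stand in for trained
# classifiers, and the meta model averages their ratings.
base_models = [lambda review: 4.0, lambda review: 5.0, lambda review: 3.0]
meta_model = lambda scores: round(sum(scores) / len(scores))
prediction = stack_predict(base_models, meta_model, "great product")
```

In a real pipeline the meta model would itself be trained on held-out level-1 predictions rather than being a fixed average.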
Concise and Effective Network for 3D Human Modeling from Orthogonal Silhouettes
In this paper, we revisit the problem of 3D human modeling from two
orthogonal silhouettes of individuals (i.e., front and side views). Different
from our prior work [wang2003virtual], a supervised learning approach
based on a convolutional neural network (CNN) is investigated to solve
the problem by establishing a mapping function that can effectively extract
features from two silhouettes and fuse them into coefficients in the shape
space of human bodies. A new CNN structure is proposed in our work to extract not
only the discriminative features of the front and side views but also their mixed
features for the mapping function. 3D human models with high accuracy are
synthesized from coefficients generated by the mapping function. Existing CNN
approaches for 3D human modeling usually learn a large number of parameters
(from 8.5M to 355.4M) from two binary images. Differently, we investigate a
new network architecture and take samples on the silhouettes as input. As a
consequence, more accurate models can be generated by our network with only
2.4M coefficients. The training of our network is conducted on samples
obtained by augmenting a publicly accessible dataset. Transfer learning
using datasets with a smaller number of scanned models is applied to our
network to enable the function of generating results with gender-oriented (or
geographical) patterns.
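The fusion idea described above can be sketched numerically. This is a hypothetical simplification, not the paper's architecture: view-specific feature vectors from the two silhouettes, plus a "mixed" interaction term, are mapped by learned linear weights into shape-space coefficients.

```python
import numpy as np

def fuse_silhouettes(front_feat, side_feat, w_front, w_side, w_mixed):
    """Combine front-view and side-view features with their mixed
    (elementwise-product) features into shape coefficients.
    The elementwise product as the 'mixed' term is an assumption made
    for illustration only."""
    mixed = front_feat * side_feat
    return w_front @ front_feat + w_side @ side_feat + w_mixed @ mixed
```

In the actual network the three weight matrices would be the final layer of a CNN trained end-to-end; the sketch only shows how the three feature streams contribute to one coefficient vector.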
Neural and visual correlates of perceptual decision making in adult dyslexia
Humans have to make decisions based on visual information numerous times every day: for example, judging whether it is a friend or simply a nice stranger who is waving at us from the other side of the street, or whether the content of a contract we are about to sign is correct. In particular, perceptual decisions based on good reading comprehension might disadvantage people affected by the specific learning disorder dyslexia, which is characterised by impairments in reading and writing. In recent years, neuroscience has begun to uncover the neural basis of these impairments in children and adults. However, it remains unknown what neural differences might underlie impaired processing of the physical properties of written words, such as font type and style.
The current thesis sought to characterise the neural and oculomotor temporal correlates of font-modulated reading comprehension, while also probing a more fundamental deficit in non-linguistic sensory perceptual decision making in adult dyslexia, using a combination of electrophysiological and eye-tracking methods. The first of our three studies (Chapter 2) investigated the impact of italics (a commonly used font style for highlighting important content) on reading comprehension in a sentence reading lexical decision task. Overall, the performance of dyslexics was worse than that of non-dyslexics. Cluster-based event-related potential (ERP) analysis revealed that brain responses within the first 300 ms following the target (decision) word differed in amplitude and spatial distribution between dyslexics and non-dyslexics when processing italicised text. The two ERP components we observed within this period showed a dissociation in peak time, spatial profile, and their ability to predict behavioural performance. These findings emphasise the importance of choosing font style carefully to optimise word processing and reading comprehension by dyslexics.
Based on these differences, our second study (Chapter 3) asked whether a specific dyslexia font can be used to alleviate difficulties with reading comprehension in adult dyslexia, and what effects such a font has on cognitive and oculomotor mechanisms. Using standardised texts coupled with validated comprehension questions, we demonstrated that reading comprehension across all participants was better on trials presented in the dyslexia font OpenDyslexic compared to those presented in traditional Times New Roman font. These benefits were larger among dyslexics. Conversely, participants' reading speed was unaffected by OpenDyslexic. Our eye-tracking data showed increases in visual search intensity and ease of visual processing on OpenDyslexic trials in the form of decreases in median fixation duration and fixation to saccade ratio, as well as a smaller number of falsely programmed forward saccades among dyslexics. These findings provide empirical evidence for the efficacy of OpenDyslexic in longer texts and its ability to improve the visual reading strategy.
Finally, recent evidence has shown that adults with dyslexia exhibit fundamental deficits spanning multiple sensory systems when performing simple perceptual decision tasks, such as integrating beeps and flashes. These deficits extend beyond the well-established linguistic difficulties. In particular, dyslexics' reading impairments are believed to be a consequence of deficient integration of congruent audio-visual information. However, it remains unclear whether dyslexic adults exhibit similar impairments when integrating audio-visual evidence in a non-linguistic perceptual decision task with noisy real-world objects. To address this question, and informed by our previous work in non-dyslexics, we used a linear multivariate discriminant analysis to investigate the extent to which audio-visual integration affects early sensory evidence encoding ("early") or later decision-related stages ("late") in dyslexia. We found increased decision accuracy and slower response times during audio-visual trials for both groups. However, overall, dyslexics showed worse performance than non-dyslexics. When comparing audio-visual to visual trials, we observed that dyslexics exhibited an increase in the magnitude of an EEG component situated between the early and late processing stages. Conversely, non-dyslexics exhibited increased component amplitudes for a later post-sensory EEG component, consistent with a post-sensory influence of audio-visual integration. Our results suggest that adult dyslexics benefit from congruent audio-visual evidence of noisy perceptual stimuli to a similar extent but rely on a different neural process to achieve these improvements.
In conclusion, our results provide novel insights into the neural dynamics, visual and cognitive mechanisms underlying adult dyslexics' perceptual decision making. They further offer empirical evidence and practical suggestions for easily implementable applications that can improve text comprehension by everyone.
Putting the Semantics into Semantic Versioning
The long-standing aspiration for software reuse has made astonishing strides
in the past few years. Many modern software development ecosystems now come
with rich sets of publicly-available components contributed by the community.
Downstream developers can leverage these upstream components, boosting their
productivity.
However, components evolve at their own pace. This imposes obligations on,
and yields benefits for, downstream developers, especially since changes can be
breaking and require additional downstream work to adapt. Upgrading too late
leaves downstream developers vulnerable to security issues and missing out on useful
improvements; upgrading too early results in excess work. Semantic versioning
has been proposed as an elegant mechanism to communicate levels of
compatibility, enabling downstream developers to automate dependency upgrades.
While it is questionable whether a version number can adequately characterize
version compatibility in general, we argue that developers would greatly
benefit from tools such as semantic version calculators to help them upgrade
safely. The time is now for the research community to develop such tools: large
component ecosystems exist and are accessible, component interactions have
become observable through automated builds, and recent advances in program
analysis make the development of relevant tools feasible. In particular,
contracts (both traditional and lightweight) are a promising input to semantic
versioning calculators, which can suggest whether an upgrade is likely to be
safe.
Comment: to be published as Onward! Essays 202
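The compatibility convention the essay builds on can be sketched as a toy "semantic version calculator". This is an illustrative sketch of plain semver rules, not the essay's proposed tooling: under strict semver, an upgrade within the same major version is declared compatible (with the common caveat that 0.x minor bumps may break), while real calculators would additionally consult contracts and program analysis.

```python
def parse(version):
    """Parse a 'major.minor.patch' string into an integer triple."""
    major, minor, patch = (int(part) for part in version.split("."))
    return major, minor, patch

def upgrade_is_declared_safe(current, candidate):
    """True if semver conventions alone claim the candidate version is a
    backward-compatible upgrade of the current version."""
    cur, cand = parse(current), parse(candidate)
    if cand <= cur:
        return False          # not an upgrade at all
    if cur[0] == 0:           # 0.x versions: minor bumps may break
        return cand[0] == 0 and cand[1] == cur[1]
    return cand[0] == cur[0]  # stable: same major => declared compatible
```

The essay's argument is precisely that this number-only check is too weak on its own, which is why contract-aware calculators are worth building.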
Computational Toxinology
Venoms are complex mixtures of biological macromolecules and other compounds that are used for predatory and defensive purposes by hundreds of thousands of known species worldwide. Throughout human history, venoms and venom components have been used to treat a vast array of illnesses, causing them to be of great clinical, economic, and academic interest to the drug discovery and toxinology communities. In spite of major computational advances that facilitate data-driven drug discovery, most therapeutic venom effects are still discovered via tedious trial-and-error, or simply by accident. In this dissertation, I describe a body of work that aims to establish a new subdiscipline of translational bioinformatics, which I name "computational toxinology".
To accomplish this goal, I present three integrated components that span a wide range of informatics techniques: (1) VenomKB, (2) VenomSeq, and (3) VenomKB's Semantic API. To provide a platform for structuring, representing, retrieving, and integrating venom data relevant to drug discovery, VenomKB provides a database-backed web application and knowledge base for computational toxinology. VenomKB is structured according to a fully-featured ontology of venoms, and provides data aggregated from many popular web resources. VenomSeq is a biotechnology workflow that is designed to generate new high-throughput sequencing data for incorporation into VenomKB. Specifically, we expose human cells to controlled doses of crude venoms, conduct RNA-Sequencing, and build profiles of differential gene expression, which we then compare to publicly-available differential expression data for known diseases and drugs with known effects, and use those comparisons to hypothesize ways that the venoms could act in a therapeutic manner as well. These data are then integrated into VenomKB, where they can be effectively retrieved and evaluated using existing data and known therapeutic associations. VenomKB's Semantic API further develops this functionality by providing an intelligent, powerful, and user-friendly interface for querying the complex underlying data in VenomKB in a way that reflects the intuitive, human-understandable meaning of those data. The Semantic API is designed to cater to the needs of advanced users as well as laypersons and bench scientists without previous expertise in computational biology and semantic data analysis.
In each chapter of the dissertation, I describe how we evaluated these three components through various approaches. We demonstrate the utility of VenomKB and the Semantic API by testing a number of practical use-cases for each, designed to highlight their ability to rediscover existing knowledge as well as to suggest potential areas for future exploration. We use statistics and data science techniques to evaluate VenomSeq on 25 diverse species of venomous animals, and propose biologically feasible explanations for significant findings. In evaluating the Semantic API, I show how observations on VenomSeq data can be interpreted and placed into the context of past research by members of the larger toxinology community.
Computational toxinology is a toolbox designed to be used by multiple stakeholders (toxinologists, computational biologists, and systems pharmacologists, among others) to improve the return rate of clinically significant findings from manual experimentation. It aims to achieve this goal by enabling access to data, providing means for easy validation of results, and suggesting specific hypotheses that are preliminarily supported by rigorous inferential statistics. All components of the research I describe are open-access and publicly available, to improve reproducibility and encourage widespread adoption.
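The VenomSeq-style comparison described above, matching a venom's differential-expression signature against known drug signatures, can be sketched numerically. The function names and data are hypothetical illustrations, not VenomSeq code: one common convention scores signatures with cosine similarity, where a strongly positive score suggests a drug-like effect and a strongly negative score suggests the venom may reverse a signature.

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine similarity between two differential-expression vectors
    (one entry per gene, positive = up-regulated)."""
    a, b = np.asarray(a, float), np.asarray(b, float)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def rank_matches(venom_signature, drug_signatures):
    """Return (drug, similarity) pairs sorted from most to least similar."""
    scores = {name: cosine_similarity(venom_signature, sig)
              for name, sig in drug_signatures.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
```

Real pipelines typically use rank-based connectivity scores over thousands of genes, but the ranking step is conceptually the same.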
Altered Alpha Oscillatory Power Dynamics Underlie Difficulties with Cognitive Flexibility
Cognitive flexibility is an important mental faculty, but certain populations experience reduced flexibility, which may be associated with altered neural activity. Rumination occurs when an individual becomes mentally stuck on a thought and has difficulty shifting attention away from it, demonstrating reduced cognitive flexibility. In a similar manner, individuals diagnosed with substance use disorder show varying degrees of attentional bias towards drug-related stimuli. The drug cues capture attention, and it is difficult for these individuals to shift attention away from thoughts related to drug cues. Both populations experience difficulty shifting attention when they experience highly salient thoughts (high automatic constraints). Here we suggest and demonstrate that reduced cognitive flexibility in these populations is associated with altered activity of alpha oscillations, as alpha oscillations play an important role in supporting cognitive flexibility. In our first study, we assess the relationship between trait tendency to ruminate and resting-state alpha power in left frontal and parietal electrodes. Individuals higher in trait rumination exhibit higher alpha power in left frontal electrodes. This finding suggests that higher alpha power may contribute to the mental inflexibility associated with rumination. In our second study, we assess the relationship between attentional bias towards drug cues and alpha power in two settings: during an emotional version of the Stroop task, when automatic constraints on thought are high, and during a probabilistic reversal learning task, when drug cues are absent and automatic constraints are therefore low, but flexibility is required. The emotional version of the Stroop task includes traditional congruent and incongruent word meanings as well as drug-related and neutral word meanings.
Participants in this study were long-term nicotine smokers, so the emotional stimuli were smoking related. The probabilistic reversal learning task instructs participants to choose one of two presented stimuli on each trial. The stimuli have different probabilities of reward or punishment. If the participant chooses the stimulus with the higher probability of reward several trials in a row, the reward probabilities reverse, and the participant must adapt to the new reward contingencies. Participants demonstrate the traditional Stroop effect of lower accuracy and slower reaction time during incongruent trials compared to congruent trials. Additionally, participants show slowed reaction times during drug trials compared to neutral trials, suggesting attentional bias during drug trials. Greater attentional bias is associated with higher alpha power in left frontal electrodes during drug trials. No significant relationship between attentional bias and alpha power during the probabilistic reversal learning task was revealed. Together, these results suggest that higher alpha power in left frontal regions may contribute to mental inflexibility prompted by attentional bias when automatic constraints are high, but that when automatic constraints are low, flexibility may not be reduced. Altogether, these results reveal a relationship between reduced cognitive flexibility in the presence of salient stimuli or thoughts and altered alpha power dynamics, which may offer new avenues for behavioral intervention to improve cognitive flexibility.
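The alpha-band power measure underlying both studies can be illustrated with a minimal computation. This is a generic FFT periodogram sketch, not the dissertation's analysis pipeline: the function reports the fraction of a single channel's spectral power that falls in the canonical 8-12 Hz alpha band.

```python
import numpy as np

def alpha_power_fraction(signal, fs):
    """Fraction of total spectral power in the 8-12 Hz alpha band,
    computed from a simple FFT periodogram of one EEG channel."""
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / fs)
    psd = np.abs(np.fft.rfft(signal)) ** 2   # periodogram (unnormalized)
    in_band = (freqs >= 8) & (freqs <= 12)
    return float(psd[in_band].sum() / psd.sum())
```

Published analyses usually use Welch's method with windowing and average over electrodes, but the band-fraction idea is the same.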
IDEAS-1997-2021-Final-Programs
This document records the final program for each of the 26 meetings of the International Database Engineering and Applications Symposium from 1997 through 2021. These meetings were organized in various locations on three continents. Most of the papers published during these years are in the digital libraries of IEEE (1997-2007) or ACM (2008-2021).