86 research outputs found

    General Greedy De-bias Learning

    Neural networks often make predictions that rely on spurious correlations in the datasets rather than on the intrinsic properties of the task of interest, and so degrade sharply on out-of-distribution (OOD) test data. Existing de-bias learning frameworks try to capture specific dataset bias through annotations, but they fail to handle complicated OOD scenarios. Others implicitly identify the dataset bias with specially designed low-capability biased models or losses, but they degrade when the training and testing data are from the same distribution. In this paper, we propose a General Greedy De-bias learning framework (GGD), which greedily trains the biased models and the base model. The base model is encouraged to focus on examples that are hard to solve with biased models, thus remaining robust against spurious correlations in the test stage. GGD largely improves models' OOD generalization ability on various tasks, but sometimes over-estimates the bias level and degrades on the in-distribution test. We further re-analyze the ensemble process of GGD and introduce Curriculum Regularization, inspired by curriculum learning, which achieves a good trade-off between in-distribution and out-of-distribution performance. Extensive experiments on image classification, adversarial question answering, and visual question answering demonstrate the effectiveness of our method. GGD can learn a more robust base model both with task-specific biased models that use prior knowledge and with a self-ensemble biased model that uses no prior knowledge. Comment: This work has been submitted to IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible.

    An Effective Method to Measure Disease Similarity Using Gene and Phenotype Associations

    Motivation: To create controlled vocabularies for shared use across biomedical domains, the bioinformatics community has created a large number of biomedical ontologies, such as the Disease Ontology (DO) and the Human Phenotype Ontology (HPO). Quantitative measures of the associations among diseases could help researchers gain deep insight into human diseases, since similar diseases usually have similar molecular origins or similar phenotypes; such measures help reveal the common attributes of diseases and improve the corresponding diagnoses and treatment plans. Several approaches have been proposed in the past few years to measure disease similarity using a particular biomedical ontology, but for a newly discovered disease or a disease with little related genetic information in the Disease Ontology (i.e., a disease with fewer disease-gene associations), these approaches usually ignore the joint computation of disease similarity that integrates gene and phenotype associations. Results: In this paper we propose a novel method called GPSim to effectively deduce the semantic similarity of diseases. In particular, GPSim calculates the similarity by jointly utilizing gene, disease and phenotype associations extracted from multiple biomedical ontologies and databases. We also explore phenotypic factors that affect the evaluation performance, such as the depth of HPO terms and the number of phenotypic associations. A final experimental evaluation assesses the performance of GPSim and shows its advantages over previous approaches.

    Composite Adversarial Attacks

    Adversarial attack is a technique for deceiving Machine Learning (ML) models, and it provides a way to evaluate adversarial robustness. In practice, attack algorithms are manually selected and tuned by human experts to break an ML system. However, manual selection of attackers tends to be sub-optimal, leading to a mistaken assessment of model security. In this paper, a new procedure called Composite Adversarial Attack (CAA) is proposed for automatically searching for the best combination of attack algorithms and their hyper-parameters from a candidate pool of 32 base attackers. We design a search space where an attack policy is represented as an attacking sequence, i.e., the output of the previous attacker is used as the initialization input for its successors. The multi-objective NSGA-II genetic algorithm is adopted for finding the strongest attack policy with minimum complexity. The experimental results show that CAA beats 10 top attackers on 11 diverse defenses with less elapsed time (6× faster than AutoAttack), and achieves the new state of the art on $l_{\infty}$, $l_{2}$ and unrestricted adversarial attacks. Comment: To appear in AAAI 2021, code will be released later.
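
The attacking-sequence idea in the abstract above (each attacker starts from the previous attacker's output) can be illustrated with a minimal sketch. This is not the CAA implementation: the toy attackers, the names `make_step_attacker` and `run_policy`, and the fixed `eps` budget are all assumptions made for illustration, and no search over policies is performed.

```python
from typing import Callable, List, Sequence

Vector = List[float]
# Hypothetical attacker signature: (original input, current candidate) -> new candidate.
Attacker = Callable[[Vector, Vector], Vector]


def clip_to_linf_ball(original: Vector, candidate: Vector, eps: float) -> Vector:
    """Project the candidate back into the l-infinity ball of radius eps around the original."""
    return [min(max(c, o - eps), o + eps) for o, c in zip(original, candidate)]


def make_step_attacker(step: float, eps: float) -> Attacker:
    """Toy stand-in for a base attacker: shift every coordinate by `step`, then project."""
    def attack(original: Vector, current: Vector) -> Vector:
        moved = [c + step for c in current]
        return clip_to_linf_ball(original, moved, eps)
    return attack


def run_policy(original: Vector, policy: Sequence[Attacker]) -> Vector:
    """Apply attackers in order; each one is initialized with its predecessor's output."""
    current = list(original)
    for attacker in policy:
        current = attacker(original, current)
    return current


# An assumed three-step policy with a shared perturbation budget of 1.0.
policy = [
    make_step_attacker(0.5, eps=1.0),
    make_step_attacker(0.25, eps=1.0),
    make_step_attacker(0.75, eps=1.0),
]
adversarial = run_policy([0.0, 1.0], policy)
print(adversarial)
```

In a real setting each attacker would query the model under attack, and the order and hyper-parameters of the sequence would be what the search procedure optimizes.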

    Knowledge-guided Pairwise Reconstruction Network for Weakly Supervised Referring Expression Grounding

    Weakly supervised referring expression grounding (REG) aims at localizing the referential entity in an image according to a linguistic query, where the mapping between the image region (proposal) and the query is unknown in the training stage. In referring expressions, people usually describe a target entity in terms of its relationship with other contextual entities as well as its visual attributes. However, previous weakly supervised REG methods rarely pay attention to the relationship between the entities. In this paper, we propose a knowledge-guided pairwise reconstruction network (KPRN), which models the relationship between the target entity (subject) and the contextual entity (object) and also grounds these two entities. Specifically, we first design a knowledge extraction module to guide the proposal selection of subject and object. The prior knowledge is obtained in the form of semantic similarities between each proposal and the subject/object. Second, guided by such knowledge, we design the subject and object attention module to construct the subject-object proposal pairs. The subject attention excludes the unrelated proposals from the candidate proposals. The object attention selects the most suitable proposal as the contextual proposal. Third, we introduce a pairwise attention and an adaptive weighting scheme to learn the correspondence between these proposal pairs and the query. Finally, a pairwise reconstruction module is used to measure the grounding for weakly supervised learning. Extensive experiments on four large-scale datasets show that our method outperforms existing state-of-the-art methods by a large margin. Comment: Accepted by ACMMM 2019. arXiv admin note: text overlap with arXiv:1908.1056

    Long-term trends and drivers of aerosol pH in eastern China

    Aerosol acidity plays a key role in regulating the chemistry and toxicity of atmospheric aerosol particles. The trend of aerosol pH and its drivers is crucial for understanding the multiphase formation pathways of aerosols. Here, we report the first trend analysis of aerosol pH from 2011 to 2019 in eastern China, calculated with the ISORROPIA model based on observed gas and aerosol compositions. The implementation of the Air Pollution Prevention and Control Action Plan led to −35.8 %, −37.6 %, −9.6 %, −81.0 % and 1.2 % changes of PM2.5, SO₄²⁻, NHx, non-volatile cations (NVCs) and NO₃⁻, respectively, in the Yangtze River Delta (YRD) region during this period. In contrast to these drastic changes in aerosol composition, aerosol pH showed a minor change of −0.24 over the 9 years. Besides the multiphase buffer effect, the opposite effects from the changes of SO₄²⁻ and non-volatile cations played key roles in determining this minor pH trend, contributing changes of +0.38 and −0.35, respectively. Seasonal variations in aerosol pH were mainly driven by temperature, while the diurnal variations were driven by both temperature and relative humidity. In the future, SO2, NOx and NH3 emissions are expected to be further reduced by 86.9 %, 74.9 % and 41.7 % in 2050 according to the best health effect pollution control scenario (SSP1-26-BHE). The corresponding aerosol pH in eastern China is estimated to increase by ∼0.19, resulting in partitioning ratios that are 0.04 lower for NO₃⁻ and 0.12 lower for NH₄⁺, which suggests that NH3 and NOx emission controls are effective in mitigating haze pollution in eastern China.

    Carbon-contacted single molecule electrical junctions

    A fully metal-free molecular junction (MJ) has been built by using an electrochemically etched carbon fibre STM tip as the top electrode and graphene as the bottom electrode. The corresponding conductance values for 1,n-alkanediamine and 1,n-alkanedithiol (n = 2, 4, 6, 8 and 10) have been measured using the STM-I(s) technique. The tunnelling decay constant of the alkanediamine and alkanedithiol junctions with these carbon contacts is much lower than the corresponding metal-contacted junctions of 0.24 and 0.38 per –CH₂ unit, but the junction conductance with these carbon contacts is also lower. The carbon fibre tip can be considered a good candidate as an electrode. Compared with a gold tip, the carbon fibre tip leads to correspondingly lower molecular junction conductance.
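
A tunnelling decay constant of the kind quoted above is commonly extracted by assuming the standard exponential length dependence G = A·exp(−βn) and fitting ln G against the number of –CH₂ units n. The sketch below shows that fit with ordinary least squares. The function name `fit_decay_constant`, the prefactor and the decay value used to generate the data are assumptions; the conductances are synthetic, not measurements from the paper.

```python
import math
from typing import Sequence, Tuple


def fit_decay_constant(lengths: Sequence[float],
                       conductances: Sequence[float]) -> Tuple[float, float]:
    """Least-squares fit of ln G = ln A - beta * n; returns (beta, ln A)."""
    ys = [math.log(g) for g in conductances]
    count = len(lengths)
    mean_x = sum(lengths) / count
    mean_y = sum(ys) / count
    sxx = sum((x - mean_x) ** 2 for x in lengths)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(lengths, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return -slope, intercept  # beta is the negative slope


# Chain lengths from the abstract; conductances are SYNTHETIC, generated with
# an arbitrary beta of 0.3 per unit and an arbitrary prefactor of 1e-3.
lengths = [2, 4, 6, 8, 10]
synthetic_conductances = [1e-3 * math.exp(-0.3 * n) for n in lengths]

beta, ln_prefactor = fit_decay_constant(lengths, synthetic_conductances)
print(f"beta = {beta:.3f} per unit, A = {math.exp(ln_prefactor):.2e}")
```

With real data the points scatter around the line, so the fitted β carries an uncertainty that this sketch does not estimate.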

    Fast and straightforward analysis approach of charge transport data in single molecule junctions

    In this study, we introduce an efficient data sorting algorithm that includes filters for noisy signals and conductance mapping for analyzing the most dominant conductance group and sub-population groups. The capacity of our data analysis process has also been corroborated on real experimental data sets of Au-1,6-hexanedithiol-Au and Au-1,8-octanedithiol-Au molecular junctions. The fully automated and unsupervised program requires less than one minute on a standard PC to sort the data and generate histograms. The resulting one-dimensional and two-dimensional log histograms give conductance values in good agreement with previous studies. Our algorithm is a straightforward, fast and user-friendly tool for single molecule charge transport data analysis. We also analyze the data in the form of a conductance map, which can offer evidence for diversity in molecular conductance. The code for automatic data analysis is openly available, well-documented and ready to use, thereby offering a useful new tool for single molecule electronics.
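
As a rough illustration of the one-dimensional log histogram mentioned above, the sketch below bins log10 of conductance samples and reports the most populated bin. It is not the authors' program: the function names, the bin range, the crude rule that drops non-positive samples as noise, and the sample values are all assumptions, and conductance is assumed to be expressed in units of the conductance quantum.

```python
import math
from typing import List, Sequence, Tuple


def log_conductance_histogram(conductances: Sequence[float],
                              low: float = -6.0,
                              high: float = 0.0,
                              bins: int = 6) -> Tuple[List[float], List[int]]:
    """Histogram of log10(G) over [low, high); returns (bin edges, counts)."""
    width = (high - low) / bins
    counts = [0] * bins
    for g in conductances:
        if g <= 0:  # crude noise filter: log10 is undefined for these samples
            continue
        x = math.log10(g)
        if low <= x < high:
            counts[int((x - low) / width)] += 1
    edges = [low + i * width for i in range(bins + 1)]
    return edges, counts


def dominant_bin(edges: Sequence[float], counts: Sequence[int]) -> Tuple[float, float]:
    """Return the (lower, upper) log10 edges of the most populated bin."""
    i = max(range(len(counts)), key=counts.__getitem__)
    return edges[i], edges[i + 1]


# Made-up samples, including two invalid readings that the filter discards.
samples = [3e-4, 2.5e-4, 4e-4, 5e-2, -1.0, 0.0, 2e-6]
edges, counts = log_conductance_histogram(samples)
print(counts, dominant_bin(edges, counts))
```

A two-dimensional version would bin each sample by both log conductance and tip displacement, which is what allows sub-populations to be separated.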