27 research outputs found

    Augmenting Knowledge Transfer across Graphs

    Given a resource-rich source graph and a resource-scarce target graph, how can we effectively transfer knowledge across graphs and ensure good generalization performance? In many high-impact domains (e.g., brain networks and molecular graphs), collecting and annotating data is prohibitively expensive and time-consuming, which makes domain adaptation an attractive option for alleviating the label-scarcity issue. In light of this, state-of-the-art methods focus on deriving domain-invariant graph representations that minimize the domain discrepancy. However, it has recently been shown that a small domain discrepancy loss does not always guarantee good generalization performance, especially in the presence of disparate graph structures and label distribution shifts. In this paper, we present TRANSNET, a generic learning framework for augmenting knowledge transfer across graphs. In particular, we introduce a novel notion named the trinity signal, which can naturally formulate various graph signals at different granularities (e.g., node attributes, edges, and subgraphs). With that, we further propose a domain unification module together with a trinity-signal mixup scheme to jointly minimize the domain discrepancy and augment knowledge transfer across graphs. Finally, comprehensive empirical results show that TRANSNET outperforms all existing approaches on seven benchmark datasets by a significant margin.
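    The mixup scheme mentioned in this abstract can be illustrated with a generic, feature-level sketch. The actual trinity-signal formulation is not detailed here; `feature_mixup`, the Beta parameter, and the toy vectors below are all illustrative assumptions, not TRANSNET's implementation:

```python
import numpy as np

def feature_mixup(x_source, x_target, y_source, y_target, alpha=0.2, rng=None):
    """Generic mixup between a source-graph and a target-graph example.

    Interpolates both feature vectors and (one-hot) label vectors with a
    Beta-sampled coefficient, producing an augmented cross-domain example.
    """
    rng = rng or np.random.default_rng(0)
    lam = rng.beta(alpha, alpha)            # mixing coefficient in [0, 1]
    x_mix = lam * x_source + (1.0 - lam) * x_target
    y_mix = lam * y_source + (1.0 - lam) * y_target
    return x_mix, y_mix, lam

# Example: mix one source node feature with one target node feature.
xs, xt = np.array([1.0, 0.0]), np.array([0.0, 1.0])
ys, yt = np.array([1.0, 0.0]), np.array([0.0, 1.0])
x_mix, y_mix, lam = feature_mixup(xs, xt, ys, yt)
```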

    Dynamic Transfer Learning across Graphs

    Transferring knowledge across graphs plays a pivotal role in many high-stakes domains, ranging from transportation networks to e-commerce networks and from neuroscience to finance. To date, the vast majority of existing works assume that both the source and target domains are sampled from a universal, stationary distribution. However, many real-world systems are intrinsically dynamic, with the underlying domains evolving over time. To bridge the gap, we shift the problem to the dynamic setting and ask: given label-rich source graphs and label-scarce target graphs observed over the previous T timestamps, how can we effectively characterize the evolving domain discrepancy and optimize the generalization performance of the target domain at the incoming timestamp T+1? To answer this question, we propose, for the first time, a generalization bound under the setting of dynamic transfer learning across graphs, which implies that the generalization performance is dominated by the domain evolution and the domain discrepancy between the source and target domains. Inspired by these theoretical results, we propose DyTrans, a novel generic framework for improving knowledge transferability across dynamic graphs. In particular, we start with a transformer-based temporal encoding module to model the temporal information of the evolving domains; we then design a dynamic domain unification module to efficiently learn domain-invariant representations across the source and target domains. Finally, extensive experiments on various real-world datasets demonstrate the effectiveness of DyTrans in transferring knowledge from dynamic source domains to dynamic target domains.
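    The transformer-based temporal encoding mentioned above presumably builds on standard sinusoidal position encodings. A minimal sketch of such an encoding for integer timestamps follows; the dimension and base frequency are illustrative defaults, not DyTrans's actual choices:

```python
import math

def temporal_encoding(t, dim=8):
    """Standard sinusoidal encoding of a timestamp t (transformer-style).

    Interleaves sin/cos pairs at geometrically decreasing frequencies so
    nearby timestamps get similar vectors and distant ones diverge.
    """
    enc = []
    for i in range(dim // 2):
        freq = 1.0 / (10000 ** (2 * i / dim))
        enc.append(math.sin(t * freq))
        enc.append(math.cos(t * freq))
    return enc

e0 = temporal_encoding(0)   # all sin terms are 0, all cos terms are 1
e5 = temporal_encoding(5)   # a distinct vector for timestamp 5
```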

    Last-Layer Fairness Fine-tuning is Simple and Effective for Neural Networks

    As machine learning is deployed ubiquitously across applications in modern data science, algorithmic fairness has become a great concern, and a variety of fairness criteria have been proposed. Among them, imposing fairness constraints during learning, i.e., in-processing fair training, has been a popular approach because, in contrast to post-processing methods, it does not require access to sensitive attributes at test time. Although imposing fairness constraints has been studied extensively for classical machine learning models, the effect these techniques have on deep neural networks is still unclear. Recent research has shown that adding fairness constraints to the objective function leads to severe over-fitting to the fairness criteria in large models, and how to solve this challenge is an important open question. To address it, we leverage the wisdom and power of pre-training and fine-tuning and develop a simple but novel framework to train fair neural networks in an efficient and inexpensive way. We conduct comprehensive experiments on two popular image datasets with state-of-the-art architectures under different fairness notions to show that last-layer fine-tuning is sufficient for promoting fairness in deep neural networks. Our framework brings new insights into representation learning for training fair neural networks.
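    Last-layer fine-tuning of this kind can be sketched with a frozen feature extractor and a single trainable linear head penalized by a demographic-parity gap. Everything below (the random-projection "body", the toy data, the squared-gap penalty) is an illustrative assumption, not the paper's actual setup:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: inputs, binary labels, and a binary sensitive attribute.
X = rng.normal(size=(200, 10))
y = (X[:, 0] > 0).astype(float)
s = (X[:, 1] > 0).astype(float)

# Frozen "pre-trained" body: a fixed random projection stands in for the
# feature extractor, which is never updated during fine-tuning.
W_frozen = rng.normal(size=(10, 4))
Z = np.tanh(X @ W_frozen)

w = np.zeros(4)          # last-layer weights: the only trainable parameters
lam, lr = 1.0, 0.5       # fairness penalty weight and learning rate

def group_grad(Zg, pg):
    # Gradient of a group's mean sigmoid prediction with respect to w.
    return (Zg * (pg * (1.0 - pg))[:, None]).mean(axis=0)

for _ in range(300):
    p = 1.0 / (1.0 + np.exp(-Z @ w))
    gap = p[s == 1].mean() - p[s == 0].mean()    # demographic-parity gap
    grad_ce = Z.T @ (p - y) / len(y)             # logistic-loss gradient
    grad_fair = 2.0 * gap * (group_grad(Z[s == 1], p[s == 1])
                             - group_grad(Z[s == 0], p[s == 0]))
    w -= lr * (grad_ce + lam * grad_fair)        # objective: CE + lam * gap**2
```

    Only `w` changes during training; the body's weights `W_frozen` stay fixed, which is what makes this regime cheap.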

    Large expert-curated database for benchmarking document similarity detection in biomedical literature search

    Document recommendation systems for locating relevant literature have mostly relied on methods developed a decade ago. This is largely due to the lack of a large offline gold-standard benchmark of relevant documents covering a variety of research fields, against which newly developed literature search techniques could be compared, improved, and translated into practice. To overcome this bottleneck, we have established the RElevant LIterature SearcH (RELISH) consortium, consisting of more than 1500 scientists from 84 countries, who have collectively annotated the relevance of over 180,000 PubMed-listed articles with regard to their respective seed (input) article(s). The majority of annotations were contributed by highly experienced, original authors of the seed articles. The collected data cover 76% of all unique PubMed Medical Subject Headings descriptors. No systematic biases were observed across different experience levels, research fields, or time spent on annotations. More importantly, annotations of the same document pairs contributed by different scientists were highly concordant. We further show that the three representative baseline methods used to generate recommended articles for evaluation (Okapi Best Matching 25, Term Frequency-Inverse Document Frequency, and PubMed Related Articles) had similar overall performance. Additionally, we found that these methods each tend to produce distinct collections of recommended articles, suggesting that a hybrid method may be required to capture all relevant articles. The established database server, located at https://relishdb.ict.griffith.edu.au, is freely available for downloading the annotation data and for blind testing of new methods. We expect this benchmark to be useful for stimulating the development of powerful new techniques for title- and title/abstract-based search engines for relevant articles in biomedical research.
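    Of the three baselines, Okapi BM25 is easy to sketch from its standard formula; the toy documents and the default parameter values below are illustrative, not the benchmark's configuration:

```python
import math

def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Okapi BM25 relevance score of each tokenized document for the query."""
    N = len(docs)
    avgdl = sum(len(d) for d in docs) / N
    scores = []
    for doc in docs:
        score = 0.0
        for term in query:
            tf = doc.count(term)                         # term frequency
            df = sum(1 for d in docs if term in d)       # document frequency
            idf = math.log((N - df + 0.5) / (df + 0.5) + 1.0)
            norm = 1.0 - b + b * len(doc) / avgdl        # length normalization
            score += idf * tf * (k1 + 1) / (tf + k1 * norm)
        scores.append(score)
    return scores

docs = [["gene", "expression", "cancer"],
        ["protein", "folding", "simulation"],
        ["cancer", "gene", "therapy", "gene"]]
scores = bm25_scores(["gene", "cancer"], docs)
# The document sharing no query terms scores 0; the one repeating "gene" ranks highest.
```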

    A Highly Active Pd On Ni-B Bimetallic Catalyst For Liquid-Phase Hydrodechlorination Of 4-Chlorophenol Under Mild Conditions

    A Pd on Ni-B (Pd/Ni-B) bimetallic catalyst was prepared and tested in the hydrodechlorination (HDC) of 4-chlorophenol (4-CP). The catalysts were synthesized by a replacement method and treated at different temperatures (298-673 K). The results showed that the catalyst treated at 473 K achieved complete dechlorination of 200 ppm 4-CP at pH 8 within 30 min and retained its high activity over the first four recycles. The introduction of Pd greatly promoted the catalytic HDC efficiency of Ni-B, and the high dispersion of the Pd species ensured the high activity of the Pd/Ni-B catalysts. The catalytic HDC reaction of 4-CP followed pseudo-first-order kinetics, and the kinetic data were obtained. Graphical Abstract: A series of Pd/Ni-B catalysts were prepared by a replacement method and treated at different temperatures. These catalysts were characterized and applied to the hydrodechlorination of 4-chlorophenol, and complete degradation of the chlorophenol could be realized within minutes. © 2011 Springer Science+Business Media, LLC
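    Pseudo-first-order kinetics means C(t) = C0·exp(-kt), i.e. ln(C0/C) = kt. A minimal sketch of estimating the rate constant k by fitting a line through the origin follows; the concentration data and the value of k are synthetic and hypothetical, not the paper's measurements:

```python
import math

# Pseudo-first-order model: C(t) = C0 * exp(-k t), so ln(C0 / C) = k t.
k_true, C0 = 0.15, 200.0            # hypothetical rate constant (min^-1), ppm
times = [0, 5, 10, 15, 20, 25, 30]  # sampling times in minutes
conc = [C0 * math.exp(-k_true * t) for t in times]

# Least-squares fit of ln(C0/C) = k t (a line constrained through the origin).
ys = [math.log(C0 / c) for c in conc]
k_est = sum(t * y for t, y in zip(times, ys)) / sum(t * t for t in times)
# With noise-free synthetic data, k_est recovers k_true.
```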

    Genetic Evolution of Mycobacterium abscessus Conferring Clarithromycin Resistance during Long-Term Antibiotic Therapy

    Objectives. Clarithromycin is recommended as the core agent for treating M. abscessus infections, which usually require a treatment course of at least one year, facilitating the development of resistance. This study aimed to identify the mechanism underlying the in vivo development of clarithromycin resistance in clinical M. abscessus isolates. Methods. M. abscessus isolates from patients with lung infections undergoing long-term antibiotic therapy were longitudinally collected and sequenced. PFGE DNA fingerprinting was used to confirm the genetic relationships of the isolates. Whole-genome comparative analysis was performed to identify the genetic determinants that confer clarithromycin resistance. Results. Three pairs of initially clarithromycin-susceptible and subsequently clarithromycin-resistant M. abscessus isolates were obtained. We found that the clarithromycin-resistant isolates emerged relatively rapidly, after 4-16 months of antibiotic therapy. PFGE DNA fingerprinting showed that the clarithromycin-resistant isolates were identical to the initial clarithromycin-susceptible ones. Whole-genome sequencing and bioinformatics analysis identified several genetic alterations in the clarithromycin-resistant isolates, including in genes encoding an efflux pump/transporter, an integral membrane component, and tetR- and lysR-family transcriptional regulators. Conclusion. We identified genes likely encoding new factors contributing to the clarithromycin-resistance phenotype of M. abscessus, which can be useful for predicting clarithromycin resistance in M. abscessus.

    De Novo Biosynthesis of Caffeic Acid from Glucose by Engineered Saccharomyces cerevisiae

    Caffeic acid is a plant phenolic compound possessing extensive pharmacological activities. Here, we found that p-coumaric acid 3-hydroxylase from Arabidopsis thaliana was capable of hydroxylating p-coumaric acid to form caffeic acid in Saccharomyces cerevisiae. We then introduced a combined caffeic acid biosynthetic pathway into S. cerevisiae and obtained 0.183 mg L-1 of caffeic acid from glucose. Next, we improved tyrosine biosynthesis in S. cerevisiae by blocking the pathway flux to aromatic alcohols and eliminating the tyrosine-induced feedback inhibition, raising caffeic acid production to 2.780 mg L-1. Finally, the medium was optimized, and the highest caffeic acid production obtained was 11.432 mg L-1 in YPD medium containing 4% glucose. This study opens a route to producing caffeic acid from glucose in S. cerevisiae and establishes a platform for the biosynthesis of caffeic acid-derived metabolites.

    Comparative transcriptome analysis of genomic region deletion strain with enhanced l-tyrosine production in Saccharomyces cerevisiae

    Objective To determine the effect of a large genomic region deletion in a Saccharomyces cerevisiae strain on tyrosine yield and to identify new genetic modification targets through transcriptome analysis. Results TAL (tyrosine ammonia-lyase) was used to convert tyrosine to p-coumaric acid (p-CA) in order to quantify tyrosine yield. The S. cerevisiae mutant strain NK14, carrying a deletion of a 23.8 kb genomic region, was found to have a p-CA production of 10.3 mg L-1, whereas the wild-type strain BY4741 produced 1.06 mg L-1. Analysis of growth patterns and stress tolerance showed that the deletion did not affect the growth phenotype of NK14. Transcriptome analysis suggested that, compared to BY4741, genes related to glycolysis (ENO2, TKL1) and the tyrosine pathway (ARO1, ARO2, ARO4, ARO7, TYR1) were upregulated in NK14 to different degrees. Besides genes related to the tyrosine biosynthetic pathway, amino acid transporters (AVT6, VBA5, THI72) and the transcription factor ARO80 also showed changes in transcription levels. Conclusions We developed a strain with improved tyrosine yield and identified new genetic modification candidates for tyrosine production.