Augmenting Knowledge Transfer across Graphs
Given a resource-rich source graph and a resource-scarce target graph, how
can we effectively transfer knowledge across graphs and ensure a good
generalization performance? In many high-impact domains (e.g., brain networks
and molecular graphs), collecting and annotating data is prohibitively
expensive and time-consuming, which makes domain adaptation an attractive
option to alleviate the label scarcity issue. In light of this, the
state-of-the-art methods focus on deriving domain-invariant graph
representations that minimize the domain discrepancy. However, it has recently
been shown that a small domain discrepancy loss may not always guarantee a good
generalization performance, especially in the presence of disparate graph
structures and label distribution shifts. In this paper, we present TRANSNET, a
generic learning framework for augmenting knowledge transfer across graphs. In
particular, we introduce a novel notion named trinity signal that can naturally
formulate various graph signals at different granularities (e.g., node
attributes, edges, and subgraphs). With that, we further propose a domain
unification module together with a trinity-signal mixup scheme to jointly
minimize the domain discrepancy and augment the knowledge transfer across
graphs. Finally, comprehensive empirical results show that TRANSNET outperforms
all existing approaches on seven benchmark datasets by a significant margin.
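The trinity-signal mixup scheme builds on the generic mixup idea of convexly combining two training examples and their labels. A minimal sketch of that underlying idea, in pure Python (the function name, feature shapes, and Beta parameter are illustrative assumptions, not TRANSNET's actual implementation):

```python
import random

def mixup(x1, y1, x2, y2, alpha=0.2):
    """Convexly combine two feature vectors and their (one-hot) labels.

    lam is drawn from Beta(alpha, alpha), as in standard mixup. TRANSNET
    applies the same principle to signals at several granularities (node
    attributes, edges, subgraphs) across the source and target graphs.
    """
    lam = random.betavariate(alpha, alpha)
    x = [lam * a + (1 - lam) * b for a, b in zip(x1, x2)]
    y = [lam * a + (1 - lam) * b for a, b in zip(y1, y2)]
    return x, y

# Example: mix a source-graph node feature with a target-graph node feature.
x_src, y_src = [1.0, 0.0, 2.0], [1.0, 0.0]
x_tgt, y_tgt = [0.0, 1.0, 1.0], [0.0, 1.0]
x_mix, y_mix = mixup(x_src, y_src, x_tgt, y_tgt)
```

Because the combination is convex, each mixed coordinate stays between the two inputs, and a mixed one-hot label remains a valid probability distribution.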
Dynamic Transfer Learning across Graphs
Transferring knowledge across graphs plays a pivotal role in many high-stakes
domains, ranging from transportation networks to e-commerce networks, from
neuroscience to finance. To date, the vast majority of existing works assume
both source and target domains are sampled from a universal and stationary
distribution. However, many real-world systems are intrinsically dynamic, where
the underlying domains are evolving over time. To bridge the gap, we propose to
shift the problem to the dynamic setting and ask: given the label-rich source
graphs and the label-scarce target graphs observed at the previous T timestamps,
how can we effectively characterize the evolving domain discrepancy and
optimize the generalization performance of the target domain at the incoming
timestamp T+1? To answer this question, we propose, for the first time, a
generalization bound under the setting of dynamic transfer learning across
graphs, which implies that the generalization performance is dominated by domain
evolution and domain discrepancy between source and target domains. Inspired by
the theoretical results, we propose a novel generic framework DyTrans to
improve knowledge transferability across dynamic graphs. In particular, we
start with a transformer-based temporal encoding module to model temporal
information of the evolving domains; then, we further design a dynamic domain
unification module to efficiently learn domain-invariant representations across
the source and target domains. Finally, extensive experiments on various
real-world datasets demonstrate the effectiveness of DyTrans in transferring
knowledge from dynamic source domains to dynamic target domains.
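A transformer-based temporal encoding can be illustrated with the standard sinusoidal position encoding applied to timestamps. This is a minimal sketch under the assumption that DyTrans uses a transformer-style encoding of time; the dimension and the 10000 base are the usual transformer defaults, not values from the paper:

```python
import math

def temporal_encoding(t, d_model=8):
    """Standard transformer sinusoidal encoding of a (real-valued) timestamp t."""
    enc = []
    for i in range(0, d_model, 2):
        freq = 1.0 / (10000 ** (i / d_model))
        enc.append(math.sin(t * freq))  # even index: sine channel
        enc.append(math.cos(t * freq))  # odd index: cosine channel
    return enc

# Encode the T observed timestamps 0..T-1 of an evolving domain.
T = 5
encodings = [temporal_encoding(t) for t in range(T)]
```

Each timestamp maps to a fixed-length vector, so snapshots from different timestamps can be fed to the same attention layers while remaining distinguishable by time.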
Last-Layer Fairness Fine-tuning is Simple and Effective for Neural Networks
As machine learning has been deployed ubiquitously across applications in
modern data science, algorithmic fairness has become a great concern and
a variety of fairness criteria have been proposed. Among them, imposing
fairness constraints during learning, i.e., in-processing fair training, has
been a popular type of training method because it does not require access to
sensitive attributes at test time, in contrast to post-processing methods.
Although imposing fairness constraints has been studied extensively for
classical machine learning models, the effect these techniques have on deep
neural networks is still unclear. Recent research has shown that adding
fairness constraints to the objective function leads to severe over-fitting to
fairness criteria in large models, and how to solve this challenge is an
important open question. To address this challenge, we leverage the wisdom and
power of pre-training and fine-tuning and develop a simple but novel framework
to train fair neural networks in an efficient and inexpensive way. We conduct
comprehensive experiments on two popular image datasets with state-of-the-art
architectures under different fairness notions to show that last-layer
fine-tuning is sufficient for promoting fairness of deep neural networks.
Our framework brings new insights into representation learning in training fair
neural networks.
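The idea of last-layer fairness fine-tuning can be sketched with plain gradient descent on a linear head over frozen (pretrained) features, penalizing the demographic-parity gap. Everything here (the squared-gap penalty, the toy data, the hyperparameters) is an illustrative assumption, not the paper's method:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def finetune_fair_head(feats, labels, groups, lam=1.0, lr=0.1, steps=200):
    """Train only a linear head on frozen features, minimizing mean logistic
    loss plus lam * (gap)^2, where gap is the difference between the two
    groups' mean predicted scores (a demographic-parity surrogate)."""
    d = len(feats[0])
    w, b = [0.0] * d, 0.0
    n0 = sum(1 for g in groups if g == 0)
    n1 = len(groups) - n0
    for _ in range(steps):
        preds = [sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) for x in feats]
        gap = (sum(p for p, g in zip(preds, groups) if g == 0) / n0
               - sum(p for p, g in zip(preds, groups) if g == 1) / n1)
        gw, gb = [0.0] * d, 0.0
        for x, y, p, g in zip(feats, labels, preds, groups):
            sign = 1.0 / n0 if g == 0 else -1.0 / n1
            # d(loss)/d(logit): logistic term plus fairness-penalty term
            dz = (p - y) / len(feats) + 2.0 * lam * gap * sign * p * (1.0 - p)
            for i, xi in enumerate(x):
                gw[i] += dz * xi
            gb += dz
        w = [wi - lr * gi for wi, gi in zip(w, gw)]
        b -= lr * gb
    return w, b

# Toy data: the single frozen feature is perfectly aligned with the group.
feats = [[1.0], [1.0], [-1.0], [-1.0]]
labels = [1, 1, 0, 0]
groups = [0, 0, 1, 1]
w_plain, b_plain = finetune_fair_head(feats, labels, groups, lam=0.0)
w_fair, b_fair = finetune_fair_head(feats, labels, groups, lam=5.0)
```

Only `w` and `b` are updated, mirroring the fact that fine-tuning the last layer alone is cheap: the backbone's features are computed once and reused.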
Large expert-curated database for benchmarking document similarity detection in biomedical literature search
Document recommendation systems for locating relevant literature have mostly relied on methods developed a decade ago. This is largely due to the lack of a large offline gold-standard benchmark of relevant documents that covers a variety of research fields, such that newly developed literature-search techniques can be compared, improved and translated into practice. To overcome this bottleneck, we have established the RElevant LIterature SearcH consortium, consisting of more than 1500 scientists from 84 countries, who have collectively annotated the relevance of over 180 000 PubMed-listed articles with regard to their respective seed (input) article(s). The majority of annotations were contributed by highly experienced, original authors of the seed articles. The collected data cover 76% of all unique PubMed Medical Subject Headings descriptors. No systematic biases were observed across different experience levels, research fields or time spent on annotations. More importantly, annotations of the same document pairs contributed by different scientists were highly concordant. We further show that the three representative baseline methods used to generate recommended articles for evaluation (Okapi Best Matching 25, Term Frequency-Inverse Document Frequency and PubMed Related Articles) had similar overall performance. Additionally, we found that these methods each tend to produce distinct collections of recommended articles, suggesting that a hybrid method may be required to capture all relevant articles. The established database server located at https://relishdb.ict.griffith.edu.au is freely available for the downloading of annotation data and the blind testing of new methods. We expect that this benchmark will be useful for stimulating the development of new powerful techniques for title and title/abstract-based search engines for relevant articles in biomedical research.
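Of the three baselines, TF-IDF with cosine similarity is the simplest to sketch. A minimal stdlib-only illustration (the consortium's actual implementations, tokenization, and weighting variants are not specified here):

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """TF-IDF weights for a small corpus of tokenized documents.

    Uses raw term frequency times log(N / document frequency); terms that
    occur in every document get weight zero.
    """
    n = len(docs)
    df = Counter(t for doc in docs for t in set(doc))
    vecs = []
    for doc in docs:
        tf = Counter(doc)
        vecs.append({t: tf[t] * math.log(n / df[t]) for t in tf})
    return vecs

def cosine(u, v):
    """Cosine similarity between two sparse (dict) vectors."""
    dot = sum(u[t] * v.get(t, 0.0) for t in u)
    nu = math.sqrt(sum(x * x for x in u.values()))
    nv = math.sqrt(sum(x * x for x in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

docs = ["gene expression in yeast".split(),
        "yeast gene regulation".split(),
        "deep learning for images".split()]
vecs = tfidf_vectors(docs)
# The first two documents share terms, so they score higher than the third.
```

BM25 refines this scheme with term-frequency saturation and document-length normalization, which is one reason the benchmark compares the two separately.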
A Highly Active Pd on Ni-B Bimetallic Catalyst for Liquid-Phase Hydrodechlorination of 4-Chlorophenol under Mild Conditions
A Pd on Ni-B (Pd/Ni-B) bimetallic catalyst was prepared and tested in the hydrodechlorination (HDC) of 4-chlorophenol (4-CP). The catalysts were synthesized by a replacement method and treated at different temperatures (298-673 K). The results showed that the catalyst treated at 473 K could achieve complete dechlorination of 200 ppm 4-CP at pH 8 within 30 min and retained high activity over the first four recycles. The introduction of Pd greatly promoted the catalytic HDC efficiency of Ni-B, and the high dispersion of Pd species ensured the high activity of the Pd/Ni-B catalysts. The catalytic HDC reaction of 4-CP followed pseudo-first-order kinetics, and the kinetic data were obtained. Graphical Abstract: A series of Pd/Ni-B catalysts were prepared by a replacement method and treated at different temperatures. These catalysts were characterized and applied to the hydrodechlorination reaction of 4-chlorophenol, and complete degradation of chlorophenols could be realized in a few minutes.
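Pseudo-first-order kinetics means ln(C0/C) = k·t, so the rate constant k can be recovered from concentration-time data by a linear fit through the origin. A minimal sketch with made-up illustrative data (not the paper's measurements):

```python
import math

def pseudo_first_order_k(times, concs):
    """Least-squares fit of ln(C0/C) = k*t through the origin; returns k."""
    c0 = concs[0]
    ys = [math.log(c0 / c) for c in concs]
    return sum(t * y for t, y in zip(times, ys)) / sum(t * t for t in times)

# Illustrative data: concentration halving roughly every 5 min.
times = [0, 5, 10, 15, 20]        # min
concs = [200, 100, 50, 25, 12.5]  # ppm
k = pseudo_first_order_k(times, concs)  # k = ln(2)/5 ≈ 0.139 min^-1
```

A linear ln(C0/C)-versus-t plot (constant k) is what justifies calling the reaction pseudo-first-order in 4-CP.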
Genetic Evolution of Mycobacterium abscessus Conferring Clarithromycin Resistance during Long-Term Antibiotic Therapy
Objectives. Clarithromycin is recommended as the core agent for treating M. abscessus infections, which usually require a treatment course of at least one year, facilitating the development of resistance. This study aimed to identify the underlying mechanism of in vivo development of clarithromycin resistance in M. abscessus clinical isolates. Methods. M. abscessus isolates from patients with lung infections during long-term antibiotic therapy were longitudinally collected and sequenced. PFGE DNA fingerprinting was used to confirm the genetic relationships of the isolates. Whole-genome comparative analysis was performed to identify the genetic determinants that confer clarithromycin resistance. Results. Three pairs of initially clarithromycin-susceptible and subsequently clarithromycin-resistant M. abscessus isolates were obtained. We found that the clarithromycin-resistant isolates emerged relatively rapidly, after 4–16 months of antibiotic therapy. PFGE DNA fingerprinting showed that the clarithromycin-resistant isolates were identical to the initial clarithromycin-susceptible ones. Whole-genome sequencing and bioinformatics analysis identified several genetic alterations in the clarithromycin-resistant isolates, including genes encoding efflux pumps/transporters, integral membrane components, and tetR- and lysR-family transcriptional regulators. Conclusion. We identified genes likely encoding new factors contributing to the clarithromycin-resistance phenotype of M. abscessus, which can be useful for predicting clarithromycin resistance in M. abscessus.
De Novo Biosynthesis of Caffeic Acid from Glucose by Engineered Saccharomyces cerevisiae
Caffeic acid is a plant phenolic compound possessing extensive pharmacological activities. Here, we identified that p-coumaric acid 3-hydroxylase from Arabidopsis thaliana was capable of hydroxylating p-coumaric acid to form caffeic acid in Saccharomyces cerevisiae. Then, we introduced a combined caffeic acid biosynthetic pathway into S. cerevisiae and obtained 0.183 mg L-1 caffeic acid from glucose. Next, we improved tyrosine biosynthesis in S. cerevisiae by blocking the pathway flux to aromatic alcohols and eliminating the tyrosine-induced feedback inhibition, resulting in caffeic acid production of 2.780 mg L-1. Finally, the medium was optimized, and the highest caffeic acid production obtained was 11.432 mg L-1 in YPD medium containing 4% glucose. This study opens a route to produce caffeic acid from glucose in S. cerevisiae and establishes a platform for the biosynthesis of caffeic acid-derived metabolites.
Comparative transcriptome analysis of genomic region deletion strain with enhanced l-tyrosine production in Saccharomyces cerevisiae
Objective. To determine the effect of a large genomic region deletion in a Saccharomyces cerevisiae strain on tyrosine yield and to identify new genetic modification targets through transcriptome analysis. Results. TAL was used to produce p-coumaric acid (p-CA) from tyrosine to quantify tyrosine yield. The S. cerevisiae mutant strain NK14, carrying a deletion of a 23.8 kb genomic region, was identified to have p-CA production of 10.3 mg L-1, while the wild-type strain BY4741 had p-CA production of 1.06 mg L-1. Analysis of growth patterns and stress tolerance showed that the deletion did not affect the growth phenotype of NK14. Transcriptome analysis suggested that, compared to BY4741, genes related to glycolysis (ENO2, TKL1) and the tyrosine pathway (ARO1, ARO2, ARO4, ARO7, TYR1) were upregulated in NK14 at different levels. Besides genes related to the tyrosine biosynthetic pathway, amino acid transporters (AVT6, VBA5, THI72) and the transcription factor ARO80 also showed changes in transcription levels. Conclusions. We developed a strain with improved tyrosine yield and identified new genetic modification candidates for tyrosine production.