59 research outputs found
Rethinking Semi-supervised Learning with Language Models
Semi-supervised learning (SSL) is a popular setting aiming to effectively
utilize unlabelled data to improve model performance in downstream natural
language processing (NLP) tasks. Currently, there are two popular approaches to
make use of unlabelled data: Self-training (ST) and Task-adaptive pre-training
(TAPT). ST uses a teacher model to assign pseudo-labels to the unlabelled data,
while TAPT continues pre-training on the unlabelled data before fine-tuning. To
the best of our knowledge, the effectiveness of TAPT in SSL tasks has not been
systematically studied, and no previous work has directly compared TAPT and ST
in terms of their ability to utilize the pool of unlabelled data. In this
paper, we provide an extensive empirical study comparing five state-of-the-art
ST approaches and TAPT across various NLP tasks and data sizes, including in-
and out-of-domain settings. Surprisingly, we find that TAPT is a strong and
more robust SSL learner, even when using just a few hundred unlabelled samples
or in the presence of domain shifts, compared to more sophisticated ST
approaches, and tends to bring greater improvements in SSL than in
fully-supervised settings. Our further analysis demonstrates the risks of using
ST approaches when the size of labelled or unlabelled data is small or when
domain shifts exist. We offer a fresh perspective for future SSL research,
suggesting the use of unsupervised pre-training objectives over dependency on
pseudo-labels.
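As a rough illustration of the pseudo-labelling step that self-training relies on, the sketch below uses a toy 1-D nearest-centroid classifier as the teacher. It is not one of the five ST approaches compared in the paper, and it does not cover TAPT. The function names, the `min_margin` confidence threshold, and the data are all hypothetical.

```python
from statistics import mean


def fit_centroids(points, labels):
    """Fit a toy 1-D nearest-centroid model: one mean value per class."""
    return {
        c: mean(p for p, l in zip(points, labels) if l == c)
        for c in set(labels)
    }


def predict(centroids, x):
    """Return (label, margin); margin is the distance gap to the runner-up class.

    Assumes at least two classes are present.
    """
    ranked = sorted(centroids, key=lambda c: abs(x - centroids[c]))
    best, second = ranked[0], ranked[1]
    margin = abs(x - centroids[second]) - abs(x - centroids[best])
    return best, margin


def self_train(labelled_x, labelled_y, unlabelled_x, min_margin=1.0):
    """One round of self-training: the teacher pseudo-labels, the student refits."""
    teacher = fit_centroids(labelled_x, labelled_y)
    xs, ys = list(labelled_x), list(labelled_y)
    for x in unlabelled_x:
        label, margin = predict(teacher, x)
        # Keep only confident pseudo-labels (hypothetical threshold).
        if margin >= min_margin:
            xs.append(x)
            ys.append(label)
    # The student is trained on labelled plus pseudo-labelled data.
    return fit_centroids(xs, ys)


student = self_train([0.0, 10.0], ["neg", "pos"], [1.0, 2.0, 9.0, 5.2])
print(student)
```

In this toy run the ambiguous point 5.2 is skipped because its margin falls below the threshold. A wrong but confident pseudo-label would instead be absorbed into the student, which is the kind of risk the abstract attributes to ST when data is scarce or shifted.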
Ampere-hour-scale soft-package potassium-ion hybrid capacitors enabling 6-minute fast-charging
Extreme fast charging of Ampere-hour (Ah)-scale electrochemical energy storage devices, with charging times of less than 10 minutes, is desired to promote widespread adoption. However, this target is difficult to achieve in conventional Li-ion batteries because of their inherent reaction mechanism and the safety hazards at high current densities. In this work, we report 1 Ah soft-package potassium-ion hybrid supercapacitors (PIHCs), which combine the high energy density of battery-type negative electrodes with the high power density of capacitor-type positive electrodes. The PIHC consists of a defect-rich, high-specific-surface-area N-doped carbon nanotube-based positive electrode, a negative electrode based on spacing-expanded carbon nanotubes inlaid with MnO quantum dots, a carbonate-based non-aqueous electrolyte, and a binder-free and current-collector-free cell design. Through optimization of the cell configuration, electrodes, and electrolyte, the full cells (1 Ah) exhibit a cell voltage of up to 4.8 V and a high full-cell-level specific energy of 140 Wh kg⁻¹ (based on the whole mass of the device) with a full charge in 6 minutes. A capacity retention of 88% after 200 cycles at 10 C (10 A) and a voltage retention of 99% at 25 ± 1 °C are also demonstrated.
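The relation between the quoted figures (1 Ah, 10 C, 10 A, 6 minutes) can be checked with the standard C-rate definition. The sketch below assumes an idealized constant-current charge. The abstract does not state that the 6-minute charge was performed at exactly 10 C, so that link is an assumption, and the helper names are hypothetical.

```python
def charge_current_a(capacity_ah: float, c_rate: float) -> float:
    """Current in amperes at a given C-rate: I = C-rate x capacity."""
    return capacity_ah * c_rate


def nominal_charge_minutes(c_rate: float) -> float:
    """Idealized constant-current charge time: 60 minutes divided by the C-rate."""
    return 60.0 / c_rate


current = charge_current_a(capacity_ah=1.0, c_rate=10.0)
minutes = nominal_charge_minutes(c_rate=10.0)
print(f"{current} A, {minutes} min")
```

Under these assumptions a 1 Ah cell at 10 C draws 10 A and charges in a nominal 6 minutes, consistent with the values reported in the abstract.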
Innovative use of industrially produced steel slag powders in asphalt mixture to replace mineral fillers
Using steel slag to partially replace the natural aggregate in asphalt mixture, in order to produce high-performance asphalt mixture, has gained significant interest in recent years as a value-added option for recycling steel slag. However, the poor homogeneity of the material properties of steel slag aggregates remains a concern for this recycling approach. In this study, an innovative method of using industrially produced steel slag powder (SSP) to replace the mineral filler in asphalt mixture was proposed to address this concern. Five fillers were evaluated: four SSP fillers, obtained by grinding different steel slag aggregates on an industrial production line, and one conventional limestone powder (LP) filler. The chemical compositions and micro-morphologies of the SSPs were first characterized to evaluate the material homogeneity and to gain insight into the advantages of using SSPs as fillers. Then, asphalt mixtures with the different fillers were designed and produced, and their moisture stability, rutting resistance, and low-temperature crack resistance were characterized. It was found that the industrially produced SSPs possessed homogeneous properties and improved the compatibility between filler particles and asphalt binder. In addition, the asphalt mixtures with SSP fillers showed better resistance to moisture damage, permanent deformation, and low-temperature cracking (in terms of fracture energy) than the asphalt mixture with LP filler. It was therefore concluded that using SSPs to replace mineral fillers in asphalt mixture provides a reliable and value-added solution for recycling steel slag.
Targeting Thioredoxin System with an Organosulfur Compound, Diallyl Trisulfide (DATS), Attenuates Progression and Metastasis of Triple-Negative Breast Cancer (TNBC)
Background/Aims: Metastasis is the leading cause of high mortality in triple-negative breast cancer (TNBC) patients. Cancer cells are adept at using the thioredoxin (Trx) system as an efficient antioxidant system to counteract oxidative damage, which facilitates metastasis. Here, we identified an organosulfur compound, DATS, isolated from garlic, that inhibits the expression of Trx-1 and the enzyme activity of Trx reductase in breast cancer cells. Methods: A tissue microarray of breast cancer patients and immunohistochemistry were used to analyze the role of Trx-1 in breast cancer metastasis. A spontaneous metastasis model and an experimental metastasis model, combined with HE staining and immunohistochemistry, were used to verify the in vivo anti-metastatic effect of DATS as well as its regulation of thioredoxin. Western blot, immunofluorescence, redox state assessment, and detection of enzyme activity were employed to determine the effect of DATS on the thioredoxin system. Trx-1 siRNA interference was used to obtain conclusive evidence that Trx-1 was the target of DATS. Results: In agreement with the reduced nuclear translocation of Trx-1 from the cytoplasm caused by DATS, the production of the reduced form of Trx-1 was dramatically decreased. Furthermore, in vivo, DATS administration significantly suppressed spontaneous and experimental metastasis in nude mice. Delivery of DATS also resulted in decreased expression of Trx-1, the direct target, as well as of NF-κB and MMP2/9 in primary tumor and lung tissue. Notably, the effects of DATS on the expression of downstream metastasis-associated genes were mediated by Trx-1, as demonstrated by the combined use of DATS and Trx-1 siRNA. Conclusion: Collectively, the present study indicates that targeting the Trx system with DATS may provide a promising strategy for treating metastasis of TNBC.
Rank-based Molecular Prognosis and Network-guided Biomarker Discovery for Breast Cancer
Breast cancer is the second most common cancer worldwide and the leading cause of cancer death among women. Improving cancer prognosis has been one of the problems of primary interest in the pursuit of better clinical management and treatment decision-making for cancer patients. With the rapid advancement of genomic profiling technologies in the past decades, the easy availability of a substantial amount of genomic data for medical research has motivated the currently popular trend of using computational tools, especially machine learning in the era of data science, to discover molecular biomarkers that improve prognosis. This thesis follows two lines of approach intended to address two major challenges arising in genomic data analysis for breast cancer prognosis from the methodological standpoint of machine learning: rank-based approaches for improved molecular prognosis and network-guided approaches for enhanced biomarker discovery. Furthermore, the methodologies developed and investigated in this thesis, pertaining respectively to learning with rank data and learning on graphs, make a significant contribution to several branches of machine learning, with applications including, but not limited to, cancer biology and social choice theory.
- …