548 research outputs found

    Lysine acetyltransferase Gcn5-B regulates the expression of crucial genes in Toxoplasma and its function is regulated through lysine acetylation

    Indiana University-Purdue University Indianapolis (IUPUI)
    Histone acetylation has been linked to developmental changes in gene expression and is a validated drug target of apicomplexan parasites, but little is known about the roles of individual histone modifying enzymes and how they are recruited to target genes. The protozoan parasite Toxoplasma gondii (phylum Apicomplexa) is unusual among invertebrates in possessing two GCN5-family lysine acetyltransferases (KATs). While GCN5a is required for gene expression in response to alkaline stress, this KAT is dispensable for parasite proliferation in normal culture conditions. In contrast, GCN5b cannot be disrupted, suggesting it is essential for Toxoplasma viability. To further explore the function of GCN5b, we generated clonal parasites expressing an inducible HA-tagged form of GCN5b containing a point mutation that ablates enzymatic activity (E703G). Stabilization of this dominant-negative form of GCN5b was mediated through ligand binding to a destabilization domain (dd) fused to the protein. Induced accumulation of the ddHAGCN5b(E703G) protein led to a rapid arrest in parasite replication. Growth arrest was accompanied by a decrease in histone H3 acetylation at specific lysine residues as well as reduced expression of GCN5b target genes in GCN5b(E703G) parasites, which were identified using chromatin immunoprecipitation coupled with microarray hybridization (ChIP-chip). We also demonstrate that GCN5b interacts with AP2-domain proteins, which are plant-like transcription factors in Apicomplexa. The interactions between GCN5b, AP2IX-7, and AP2X-8 were confirmed by reciprocal co-immunoprecipitation and revealed a “core complex” that includes the co-activator ADA2-A, TFIID subunits, LEO1 polymerase-associated factor (Paf1) subunit, and RRM proteins.
    The dominant-negative phenotype of ddHAGCN5b(E703G) parasites, considered with the proteomics and ChIP-chip data, indicates that GCN5b plays a central role in transcriptional and chromatin remodeling complexes. We conclude that GCN5b has a non-redundant and indispensable role in regulating gene expression required during the Toxoplasma lytic cycle.

    A Note on "Efficient Task-Specific Data Valuation for Nearest Neighbor Algorithms"

    Data valuation is a growing research field that studies the influence of individual data points on machine learning (ML) models. Data Shapley, inspired by cooperative game theory and economics, is an effective method for data valuation. However, it is well known that the Shapley value (SV) can be computationally expensive. Fortunately, Jia et al. (2019) showed that for K-Nearest Neighbors (KNN) models, the computation of Data Shapley is surprisingly simple and efficient. In this note, we revisit the work of Jia et al. (2019) and propose a more natural and interpretable utility function that better reflects the performance of KNN models. We derive the corresponding calculation procedure for the Data Shapley of KNN classifiers/regressors with the new utility functions. Our new approach, dubbed soft-label KNN-SV, achieves the same time complexity as the original method. We further provide an efficient approximation algorithm for soft-label KNN-SV based on locality sensitive hashing (LSH). Our experimental results demonstrate that soft-label KNN-SV outperforms the original method on most datasets in the task of mislabeled data detection, making it a better baseline for future work on data valuation.
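
To make the "surprisingly simple" computation concrete, here is a minimal sketch of the recursion attributed to Jia et al. (2019) for an unweighted KNN classifier and a single test point, checked against a brute-force Shapley value. It does not implement the soft-label variant proposed in the note. The function names, the `match` encoding (1 if the i-th nearest training point carries the test label, else 0), and the utility (matching labels among the K nearest points, divided by K) are assumptions made for illustration.

```python
from itertools import combinations
from math import comb


def knn_utility(subset, match, k):
    """Assumed utility: matching labels among the k nearest points in subset, divided by k.

    subset holds indices into the training set sorted by distance to the test
    point, so smaller indices are nearer neighbours.
    """
    nearest = sorted(subset)[:k]
    return sum(match[i] for i in nearest) / k


def shapley_brute_force(match, k):
    """Exact Shapley values from the definition (exponential time, tiny inputs only)."""
    n = len(match)
    values = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            weight = 1.0 / (n * comb(n - 1, size))
            for s in combinations(others, size):
                gain = knn_utility(s + (i,), match, k) - knn_utility(s, match, k)
                values[i] += weight * gain
    return values


def shapley_knn_recursion(match, k):
    """Recursion over distance-sorted points, starting from the farthest one."""
    n = len(match)
    values = [0.0] * n
    values[n - 1] = match[n - 1] / n
    for i in range(n - 2, -1, -1):
        rank = i + 1  # 1-based rank of point i in the distance ordering
        values[i] = values[i + 1] + (match[i] - match[i + 1]) / k * min(k, rank) / rank
    return values
```

After sorting the training points by distance, the recursion is a single linear pass, whereas the brute-force version enumerates every subset and is only practical for a handful of points.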

    Data Banzhaf: A Robust Data Valuation Framework for Machine Learning

    Data valuation has wide use cases in machine learning, including improving data quality and creating economic incentives for data sharing. This paper studies the robustness of data valuation to noisy model performance scores. In particular, we find that the inherent randomness of the widely used stochastic gradient descent can cause existing data value notions (e.g., the Shapley value and the Leave-one-out error) to produce inconsistent data value rankings across different runs. To address this challenge, we introduce the concept of safety margin, which measures the robustness of a data value notion. We show that the Banzhaf value, a famous value notion that originated in the cooperative game theory literature, achieves the largest safety margin among all semivalues (a class of value notions that satisfy crucial properties entailed by ML applications and include the famous Shapley value and Leave-one-out error). We propose an algorithm to efficiently estimate the Banzhaf value based on the Maximum Sample Reuse (MSR) principle. Our evaluation demonstrates that the Banzhaf value outperforms the existing semivalue-based data value notions on several ML tasks such as learning with weighted samples and noisy label detection. Overall, our study suggests that when the underlying ML algorithm is stochastic, the Banzhaf value is a promising alternative to the other semivalue-based data value schemes given its computational advantage and ability to robustly differentiate data quality. Comment: AISTATS 2023 Ora
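
As a rough illustration of the sample-reuse idea, the sketch below computes the Banzhaf value exactly from its definition and estimates it in the MSR style: subsets are drawn by including each point independently with probability 1/2, each utility evaluation is reused for every point, and the estimate for a point is the mean utility of sampled subsets containing it minus the mean utility of those that do not. The function names and the generic `utility` callback are hypothetical, and this is a sketch under those assumptions rather than the paper's implementation.

```python
import random
from itertools import combinations


def banzhaf_exact(n, utility):
    """Banzhaf value from the definition: average marginal gain over all subsets."""
    values = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            for s in combinations(others, size):
                base = frozenset(s)
                total += utility(base | {i}) - utility(base)
        values.append(total / 2 ** (n - 1))
    return values


def banzhaf_msr(n, utility, num_samples, seed=0):
    """MSR-style estimate: every sampled subset updates the statistics of all n points."""
    rng = random.Random(seed)
    sum_in, count_in = [0.0] * n, [0] * n
    sum_out, count_out = [0.0] * n, [0] * n
    for _ in range(num_samples):
        subset = frozenset(i for i in range(n) if rng.random() < 0.5)
        score = utility(subset)  # one evaluation, reused for every point
        for i in range(n):
            if i in subset:
                sum_in[i] += score
                count_in[i] += 1
            else:
                sum_out[i] += score
                count_out[i] += 1
    estimates = []
    for i in range(n):
        if count_in[i] == 0 or count_out[i] == 0:
            estimates.append(float("nan"))  # too few samples to form both means
        else:
            estimates.append(sum_in[i] / count_in[i] - sum_out[i] / count_out[i])
    return estimates
```

For an additive toy utility such as `lambda s: sum(w[i] for i in s)`, the exact Banzhaf value of point `i` is `w[i]`, which gives a simple sanity check for the estimator.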

    Transformer Based Multi-Grained Features for Unsupervised Person Re-Identification

    Multi-grained features extracted from convolutional neural networks (CNNs) have demonstrated their strong discrimination ability in supervised person re-identification (Re-ID) tasks. Inspired by them, this work investigates how to extract multi-grained features from a pure transformer network to address the unsupervised Re-ID problem, which is label-free but much more challenging. To this end, we build a dual-branch network architecture based upon a modified Vision Transformer (ViT). The local tokens output by each branch are reshaped and then uniformly partitioned into multiple stripes to generate part-level features, while the global tokens of the two branches are averaged to produce a global feature. Further, based upon offline-online associated camera-aware proxies (O2CAP), a top-performing unsupervised Re-ID method, we define offline and online contrastive learning losses with respect to both global and part-level features to conduct unsupervised learning. Extensive experiments on three person Re-ID datasets show that the proposed method outperforms state-of-the-art unsupervised methods by a considerable margin, greatly narrowing the gap to supervised counterparts. Code will be available soon at https://github.com/RikoLi/WACV23-workshop-TMGF. Comment: Accepted by WACVW 2023, 3rd Workshop on Real-World Surveillance: Applications and Challenge
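
The stripe partitioning described above can be sketched in a few lines of plain Python. The abstract does not say how tokens within a stripe are aggregated, so the average pooling here is an assumption, as are the row-major token order, the horizontal stripe orientation, and all function names.

```python
def mean_vector(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    dim = len(vectors[0])
    return [sum(v[d] for v in vectors) / len(vectors) for d in range(dim)]


def stripe_features(local_tokens, grid_h, grid_w, num_stripes):
    """Split a row-major grid of local tokens into horizontal stripes and pool each.

    Assumption: each part-level feature is the average of the tokens in its stripe.
    """
    if len(local_tokens) != grid_h * grid_w:
        raise ValueError("token count does not match the grid size")
    if grid_h % num_stripes != 0:
        raise ValueError("grid height must be divisible by the number of stripes")
    rows_per_stripe = grid_h // num_stripes
    features = []
    for s in range(num_stripes):
        start = s * rows_per_stripe * grid_w
        end = start + rows_per_stripe * grid_w
        features.append(mean_vector(local_tokens[start:end]))
    return features


def global_feature(global_token_a, global_token_b):
    """Average the global tokens of the two branches."""
    return mean_vector([global_token_a, global_token_b])
```

In a real model the tokens would be tensors and the pooling a batched operation; the list-based version only shows the indexing.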

    Novel Adaptive Sampling Algorithm for POD-Based Non-Intrusive Reduced Order Model

    The proper orthogonal decomposition (POD) based reduced-order model (ROM) has been an effective tool for flow field prediction in the engineering industry. ROM performance is highly sensitive to how samples are selected in the design space for POD basis construction. Adaptive sampling can significantly reduce the number of samples needed to achieve the required model accuracy. In this work, we propose a novel adaptive sampling algorithm, called the conjunction sampling strategy, which is based on proven strategies. The conjunction sampling strategy is demonstrated on airfoil flow field prediction within the transonic regime. We test the robustness of the proposed strategy by running 10 trials for each strategy. Results show that the conjunction sampling strategy consistently achieves higher predictive accuracy compared with Latin hypercube sampling (LHS) and existing strategies. Specifically, under the same computational budget (40 training samples in total), the conjunction strategy reduced the L2 error by 56.7% compared with LHS. In addition, the conjunction strategy reduced the standard deviation of L2 errors by 62.1% with a 2.6% increase in the mean error compared with the best existing strategy.
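
For reference, the LHS baseline mentioned above can be sketched as follows. This is a generic Latin hypercube sampler on the unit hypercube, not the conjunction strategy itself, whose details the abstract does not give. The function name and the seed handling are assumptions; mapping the unit-cube points onto actual design-space bounds is left to the caller.

```python
import random


def latin_hypercube(num_samples, num_dims, seed=0):
    """Draw num_samples points in [0, 1)^num_dims with one point per stratum per axis."""
    rng = random.Random(seed)
    samples = [[0.0] * num_dims for _ in range(num_samples)]
    for d in range(num_dims):
        # Shuffle the strata independently for each dimension.
        strata = list(range(num_samples))
        rng.shuffle(strata)
        for i, stratum in enumerate(strata):
            # Place the point uniformly at random inside its stratum.
            samples[i][d] = (stratum + rng.random()) / num_samples
    return samples
```

Calling `latin_hypercube(40, 2)` would mirror the 40-sample budget quoted in the abstract for a hypothetical two-parameter design space.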