The Bayesian two-sample t-test
In this article we show how the pooled-variance two-sample t-statistic arises from a Bayesian formulation of the two-sided point null testing problem, with emphasis on teaching. We identify a reasonable and useful prior giving a closed-form Bayes factor that can be written in terms of the distribution of the two-sample t-statistic under the null and alternative hypotheses, respectively. This provides a Bayesian motivation for the two-sample t-statistic, which has heretofore been buried as a special case of more complex linear models, or given only roughly via analytic or Monte Carlo approximations. The resulting formulation of the Bayesian test is easy to apply in practice, and also easy to teach in an introductory course that emphasizes Bayesian methods. The priors are easy to use and simple to elicit, and the posterior probabilities are easily computed using available software, in some cases using spreadsheets.
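For readers who want the statistic itself in front of them, the following is a minimal sketch of the classical pooled-variance two-sample t-statistic that the abstract refers to. It does not reproduce the article's prior or Bayes factor; the function name `pooled_t_statistic` and the sample data are illustrative assumptions, not taken from the article.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t_statistic(x, y):
    """Return (t, degrees_of_freedom) for two independent samples.

    Hypothetical helper for illustration: it assumes both groups share a
    common variance, which is what "pooled-variance" means.
    """
    n1, n2 = len(x), len(y)
    df = n1 + n2 - 2
    # Pooled variance: sample variances weighted by their degrees of freedom.
    pooled_var = ((n1 - 1) * variance(x) + (n2 - 1) * variance(y)) / df
    # Standard error of the difference in sample means.
    std_err = sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean(x) - mean(y)) / std_err, df


# Made-up data, purely for illustration.
t, df = pooled_t_statistic([1.0, 2.0, 3.0], [2.0, 3.0, 4.0])
print(round(t, 4), df)
```

The statistic and its degrees of freedom are the inputs to the sampling distributions of t under the null and alternative hypotheses that the abstract mentions.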
Conspiracy, Religion, and the Public Sphere: The Discourses of Far-Right Counterpublics in the U.S. and South Korea
Much research within the noncritical perspective on the public sphere has been quantitative. To strengthen the argument for an ideologically disinterested approach to the study of publicity and counterpublicity, we use ethnomethodological discourse analysis to analyze how far-right movements claim counterpublicity, or "do being a counterpublic." Specifically, we study the U.S. pundit Alex Jones and a prayer meeting of South Korean Evangelical Christians. For each, we considered how they created a shared discourse and attempted to change mainstream discourse while claiming to be marginalized and different from the mainstream. Across these two case studies, the strategies for "doing being a counterpublic" are similar, even though they use different organizing symbols: conspiracy in the U.S. versus religion in Korea. These case studies show that the functionalist perspective yields benefits for understanding how publicity and counterpublicity are negotiated among various groups of activist citizens.
Coupling multiple views of relations for recommendation
© Springer International Publishing Switzerland 2015. Learning user/item relations is a key issue in recommender systems, and existing methods mostly measure the user/item relation from one particular aspect, e.g., historical ratings. However, the relations between users/items can be influenced by multifaceted factors, so any single type of measure gives only a partial view of them. It is therefore more advisable to integrate measures from different aspects to estimate the underlying user/item relation. Furthermore, the estimate of the underlying user/item relation should be optimal for the current task. To this end, we propose a novel model that couples multiple relations measured on different aspects, and determines the optimal user/item relations by learning the optimal way of integrating these relation measures. Specifically, the matrix factorization model is extended in this paper by considering the relations between latent factors of different users/items. Experiments are conducted, and our method shows good performance and outperforms other baseline methods.
Link Mining for Kernel-based Compound-Protein Interaction Predictions Using a Chemogenomics Approach
Virtual screening (VS) is widely used during computational drug discovery to
reduce costs. Chemogenomics-based virtual screening (CGBVS) can be used to
predict new compound-protein interactions (CPIs) from known CPI network data
using several methods, including machine learning and data mining. Although
CGBVS facilitates highly efficient and accurate CPI prediction, it has poor
performance for prediction of new compounds for which CPIs are unknown. The
pairwise kernel method (PKM) is a state-of-the-art CGBVS method and shows high
accuracy for prediction of new compounds. In this study, on the basis of link
mining, we improved the PKM by combining link indicator kernel (LIK) and
chemical similarity and evaluated the accuracy of these methods. The proposed
method obtained an average area under the precision-recall curve (AUPR) value
of 0.562, which was higher than that achieved by the conventional Gaussian
interaction profile (GIP) method (0.425), and the calculation time was only
increased by a few percent.
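As a reminder of what the AUPR metric measures, the sketch below computes average precision, one common step-wise estimate of the area under the precision-recall curve. The abstract does not say which estimator the authors used, so this choice, the function name `average_precision`, and the toy labels and scores are all assumptions for illustration only.

```python
def average_precision(labels, scores):
    """Step-wise estimate of the area under the precision-recall curve.

    labels: 1 for a true interaction, 0 otherwise (hypothetical toy data).
    scores: predicted scores, higher meaning more likely to interact.
    Tied scores are not treated specially in this simple sketch.
    """
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    total_positives = sum(labels)
    if total_positives == 0:
        raise ValueError("at least one positive label is required")
    hits = 0
    precision_sum = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            # Precision measured at the rank of each retrieved positive.
            precision_sum += hits / rank
    return precision_sum / total_positives


# Two positives ranked 1st and 3rd out of four candidates.
print(average_precision([1, 0, 1, 0], [0.9, 0.8, 0.7, 0.1]))
```

A value of 1.0 means every true interaction is ranked above every non-interaction, which is why the metric suits interaction data where positives are rare.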
Tailoring Tryptophan Synthase TrpB for Selective Quaternary Carbon Bond Formation
We previously engineered the β-subunit of tryptophan synthase (TrpB), which catalyzes the condensation of L-serine and indole to L-tryptophan, to synthesize a range of noncanonical amino acids from L-serine and indole derivatives or other nucleophiles. Here we employ directed evolution to engineer TrpB to accept 3-substituted oxindoles and form C–C bonds leading to new quaternary stereocenters. Initially, the variants that could use 3-substituted oxindoles preferentially formed N–C bonds on N1 of the substrate. Protecting N1 encouraged evolution toward C-alkylation, which persisted when protection was removed. Six generations of directed evolution resulted in TrpB Pf_(quat) with a 400-fold improvement in activity for alkylation of 3-substituted oxindoles and the ability to selectively form a new, all-carbon quaternary stereocenter at the γ-position of the amino acid products. The enzyme can also alkylate and form all-carbon quaternary stereocenters on structurally similar lactones and ketones, where it exhibits excellent regioselectivity for the tertiary carbon. The configurations of the γ-stereocenters of two of the products were determined via microcrystal electron diffraction (MicroED), and we report the MicroED structure of a small molecule obtained using the Falcon III direct electron detector. Highly thermostable and expressed at >500 mg/L E. coli culture, TrpB Pf_(quat) offers an efficient, sustainable, and selective platform for the construction of diverse noncanonical amino acids bearing all-carbon quaternary stereocenters.
A Comprehensive Benchmark Framework for Active Learning Methods in Entity Matching
Entity Matching (EM) is a core data cleaning task, aiming to identify
different mentions of the same real-world entity. Active learning is one way to
address the challenge of scarce labeled data in practice, by dynamically
collecting the necessary examples to be labeled by an Oracle and refining the
learned model (classifier) upon them. In this paper, we build a unified active
learning benchmark framework for EM that allows users to easily combine
different learning algorithms with applicable example selection algorithms. The
goal of the framework is to enable concrete guidelines for practitioners as to
what active learning combinations will work well for EM. Towards this, we
perform comprehensive experiments on publicly available EM datasets from
product and publication domains to evaluate active learning methods, using a
variety of metrics including EM quality, #labels and example selection
latencies. Our most surprising result finds that active learning with fewer
labels can learn a classifier of comparable quality as supervised learning. In
fact, for several of the datasets, we show that there is an active learning
combination that beats the state-of-the-art supervised learning result. Our
framework also includes novel optimizations that improve the quality of the
learned model by roughly 9% in terms of F1-score and reduce example selection
latencies by up to 10x without affecting the quality of the model.
Comment: accepted for publication in ACM SIGMOD 2020, 15 pages
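To make the idea of an example selection algorithm concrete, here is a minimal sketch of uncertainty sampling, a widely used selection strategy in active learning. The abstract does not list which selectors the framework includes, so this is an assumed example; the function name `select_most_uncertain` and the probabilities are hypothetical.

```python
def select_most_uncertain(match_probabilities, k):
    """Pick the k unlabeled pairs whose predicted match probability is
    closest to 0.5, i.e. where a binary classifier is least certain.

    match_probabilities: one predicted probability per unlabeled record
    pair (hypothetical classifier output). Returns indices into that list,
    most uncertain first.
    """
    by_uncertainty = sorted(
        range(len(match_probabilities)),
        key=lambda i: abs(match_probabilities[i] - 0.5),
    )
    return by_uncertainty[:k]


# Toy pool of four candidate pairs; the selected ones would be sent to the
# Oracle for labeling before the classifier is retrained.
print(select_most_uncertain([0.9, 0.55, 0.1, 0.48], k=2))
```

In a full loop, the labeled pairs are added to the training set, the model is refit, and selection repeats until the labeling budget is spent.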
Numerous proteins with unique characteristics are degraded by the 26S proteasome following monoubiquitination
The "canonical" proteasomal degradation signal is a substrate-anchored polyubiquitin chain. However, a handful of proteins were shown to be targeted following monoubiquitination. In this study, we established, in both human and yeast cells, a systematic approach for the identification of monoubiquitination-dependent proteasomal substrates. The cellular wild-type polymerizable ubiquitin was replaced with ubiquitin that cannot form chains. Using proteomic analysis, we screened for substrates that are nevertheless degraded under these conditions compared with those that are stabilized, and therefore require polyubiquitination for their degradation. For randomly sampled representative substrates, we confirmed that their cellular stability is in agreement with our screening prediction. Importantly, the two groups display unique features: monoubiquitinated substrates are smaller than the polyubiquitinated ones, are enriched in specific pathways, and, in humans, are structurally less disordered. We suggest that monoubiquitination-dependent degradation is more widespread than assumed previously, and plays key roles in various cellular processes.
A human cell atlas of fetal gene expression
The gene expression program underlying the specification of human cell types is of fundamental interest. We generated human cell atlases of gene expression and chromatin accessibility in fetal tissues. For gene expression, we applied three-level combinatorial indexing to >110 samples representing 15 organs, ultimately profiling ~4 million single cells. We leveraged the literature and other atlases to identify and annotate hundreds of cell types and subtypes, both within and across tissues. Our analyses focused on organ-specific specializations of broadly distributed cell types (such as blood, endothelial, and epithelial), sites of fetal erythropoiesis (which notably included the adrenal gland), and integration with mouse developmental atlases (such as conserved specification of blood cells). These data represent a rich resource for the exploration of in vivo human gene expression in diverse tissues and cell types.
MiRNA-Mediated Control of HLA-G Expression and Function
HLA-G is a non-classical HLA class-Ib molecule expressed mainly by the extravillous cytotrophoblasts (EVT) of the placenta. The expression of HLA-G on these fetal cells protects the EVT cells from immune rejection and is therefore important for a healthy pregnancy. The mechanisms controlling HLA-G expression are largely unknown. Here we demonstrate that miR-148a and miR-152 down-regulate HLA-G expression by binding its 3′UTR and that this down-regulation of HLA-G affects LILRB1 recognition and, consequently, abolishes the LILRB1-mediated inhibition of NK cell killing. We further demonstrate that the C/G polymorphism at position +3142 of the HLA-G 3′UTR has no effect on the miRNA targeting of HLA-G. We show that in the placenta both miR-148a and miR-152 are expressed at relatively low levels compared to other healthy tissues, and that the mRNA levels of HLA-G are particularly high; we therefore suggest that this might enable the tissue-specific expression of HLA-G.