Bayesian Robust Tensor Factorization for Incomplete Multiway Data
We propose a generative model for robust tensor factorization in the presence
of both missing data and outliers. The objective is to explicitly infer the
underlying low-CP-rank tensor capturing the global information and a sparse
tensor capturing the local information (also considered as outliers), thus
providing the robust predictive distribution over missing entries. The
low-CP-rank tensor is modeled by multilinear interactions between multiple
latent factors on which the column sparsity is enforced by a hierarchical
prior, while the sparse tensor is modeled by a hierarchical view of the Student-t
distribution that associates an individual hyperparameter with each element
independently. For model learning, we develop an efficient closed-form
variational inference under a fully Bayesian treatment, which can effectively
prevent the overfitting problem and scales linearly with data size. In contrast
to existing related works, our method can perform model selection automatically
and implicitly, without the need for parameter tuning. More specifically, it can
discover the ground-truth CP rank and automatically adapt the sparsity-
inducing priors to various types of outliers. In addition, the tradeoff between
the low-rank approximation and the sparse representation can be optimized in
the sense of maximum model evidence. The extensive experiments and comparisons
with many state-of-the-art algorithms on both synthetic and real-world datasets
demonstrate the superiority of our method from several perspectives.
Comment: in IEEE Transactions on Neural Networks and Learning Systems, 201
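As a minimal illustration of the observation model described in this abstract (the low-CP-rank tensor plus a sparse outlier tensor), the numpy sketch below generates such data directly; the dimensions, rank, and factor matrices are made-up values, and the paper's variational inference itself is not shown.

```python
import numpy as np

# Illustrative sketch of the observation model only, not the paper's
# variational inference. Dimensions, rank, and factors are made up.
rng = np.random.default_rng(0)
I, J, K, R = 8, 9, 10, 3                     # tensor dimensions and CP rank

# Latent factor matrices of the CP model
A = rng.standard_normal((I, R))
B = rng.standard_normal((J, R))
C = rng.standard_normal((K, R))

# Low-CP-rank tensor: sum of R rank-one terms a_r o b_r o c_r
L = np.einsum('ir,jr,kr->ijk', A, B, C)

# Sparse outlier tensor: a few large-magnitude entries
S = np.zeros((I, J, K))
idx = rng.choice(I * J * K, size=10, replace=False)
S.flat[idx] = 10.0 * rng.standard_normal(10)

X = L + S                                    # observed tensor = low-rank + sparse
```

The mode-1 unfolding of L has rank exactly R, which is the structure the hierarchical column-sparsity prior is designed to recover.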
Towards Efficient and Accurate Approximation: Tensor Decomposition Based on Randomized Block Krylov Iteration
Efficient and accurate low-rank approximation (LRA) methods are of great
significance for large-scale data analysis. Randomized tensor decompositions
have emerged as powerful tools to meet this need, but most existing methods
perform poorly in the presence of noise interference. Inspired by the
remarkable performance of randomized block Krylov iteration (rBKI) in reducing
the effect of tail singular values, this work designs an rBKI-based Tucker
decomposition (rBKI-TK) for accurate approximation, together with a
hierarchical tensor ring decomposition based on rBKI-TK for efficient
compression of large-scale data. In addition, the error bound between the
deterministic LRA and the randomized LRA is studied. Numerical experiments
demonstrate the efficiency, accuracy, and scalability of the proposed methods in
both data compression and denoising.
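The matrix-level randomized block Krylov iteration that rBKI-TK builds on can be sketched as follows; the function name `rbki_lra`, the block size, and the iteration count are illustrative assumptions, not the paper's exact algorithm.

```python
import numpy as np

# Matrix-level sketch of randomized block Krylov iteration (rBKI) for
# low-rank approximation; name and parameters are illustrative.
def rbki_lra(A, rank, block=None, iters=3, seed=0):
    m, n = A.shape
    block = block or rank
    rng = np.random.default_rng(seed)
    Omega = rng.standard_normal((n, block))   # random starting block
    blocks, Y = [], A @ Omega
    for _ in range(iters):
        blocks.append(Y)
        Y = A @ (A.T @ Y)                     # next Krylov block (A A^T)^q A Omega
    Q, _ = np.linalg.qr(np.hstack(blocks))    # orthonormal Krylov basis
    U, s, Vt = np.linalg.svd(Q.T @ A, full_matrices=False)
    return (Q @ U[:, :rank]) * s[:rank] @ Vt[:rank]

# Exactly rank-5 test matrix: the approximation should be near-exact.
rng = np.random.default_rng(1)
A = rng.standard_normal((60, 5)) @ rng.standard_normal((5, 40))
A_hat = rbki_lra(A, rank=5)
```

In the Tucker setting, this routine would be applied to each mode unfolding to obtain the factor matrices; the Krylov blocks are what suppress the influence of the tail singular values that plain randomized sketching struggles with.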
Effect of coffee consumption on thyroid function: NHANES 2007-2012 and Mendelian randomization
Background: Coffee is one of the most consumed beverages worldwide, but its effects on the thyroid are unknown. This study aims to examine the association between coffee and thyroid function.
Methods: Participant data (≥ 20 years, n = 6578) for the observational study were obtained from NHANES 2007-2012. Analysis was performed using weighted linear regression models and multiple logistic regression models. Genetic datasets for hyperthyroidism and hypothyroidism were obtained from the IEU database and contained 462,933 European samples. Mendelian randomization (MR) was used for the analysis, with inverse variance weighting (IVW) as the main method.
Results: In the model adjusted for other covariates, participants who drank 2-4 cups of coffee per day had significantly lower TSH concentrations than non-coffee drinkers (b = -0.23, 95% CI: -0.30, -0.16), but no statistically significant changes in TT4, FT4, TT3, or FT3. In addition, participants who drank <2 cups of coffee per day showed a lower risk of subclinical hypothyroidism (OR = 0.60, 95% CI: 0.41, 0.88). Both the observational and MR analyses indicated that coffee consumption has no effect on the risk of hyperthyroidism or hypothyroidism.
Conclusions: Our study showed that drinking <2 cups of coffee per day reduced the risk of subclinical hypothyroidism and drinking 2-4 cups per day reduced serum TSH concentrations. Coffee consumption was not associated with the risk of hyperthyroidism or hypothyroidism.
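The IVW method named in the abstract combines per-variant Wald ratios weighted by their inverse variances; the sketch below is the textbook fixed-effect formula, and every number in it is hypothetical, not taken from this study.

```python
import numpy as np

# Textbook fixed-effect inverse-variance-weighted (IVW) MR estimate from
# per-variant summary statistics. All numbers below are hypothetical and
# NOT taken from this study.
def ivw_estimate(beta_exp, beta_out, se_out):
    w = beta_exp**2 / se_out**2                         # per-variant weight
    beta = np.sum(w * beta_out / beta_exp) / np.sum(w)  # weighted Wald ratios
    se = np.sqrt(1.0 / np.sum(w))
    return beta, se

beta_exp = np.array([0.10, 0.15, 0.08])     # SNP effect on exposure (coffee)
beta_out = np.array([0.020, 0.030, 0.016])  # SNP effect on outcome
se_out = np.array([0.01, 0.01, 0.01])       # SE of outcome effect estimates
beta, se = ivw_estimate(beta_exp, beta_out, se_out)
```

With these made-up inputs every Wald ratio equals 0.2, so the pooled estimate is 0.2; in practice the ratios differ and the weights determine how each variant contributes.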
Transformed Low-Rank Parameterization Can Help Robust Generalization for Tensor Neural Networks
Achieving efficient and robust multi-channel data learning is a challenging
task in data science. By exploiting low-rankness in the transformed domain,
i.e., transformed low-rankness, tensor Singular Value Decomposition (t-SVD) has
achieved extensive success in multi-channel data representation and has
recently been extended to function representation such as Neural Networks with
t-product layers (t-NNs). However, it remains unclear how t-SVD
theoretically affects the learning behavior of t-NNs. This paper is the first
to answer this question by deriving the upper bounds of the generalization
error of both standard and adversarially trained t-NNs. It reveals that the
t-NNs compressed by exact transformed low-rank parameterization can achieve a
sharper adversarial generalization bound. In practice, although t-NNs rarely
have exactly transformed low-rank weights, our analysis further shows that by
adversarial training with gradient flow (GF), the over-parameterized t-NNs with
ReLU activations are trained with implicit regularization towards transformed
low-rank parameterization under certain conditions. We also establish
adversarial generalization bounds for t-NNs with approximately transformed
low-rank weights. Our analysis indicates that the transformed low-rank
parameterization can promisingly enhance robust generalization for t-NNs.
Comment: 46 pages, accepted to NeurIPS 2023. We have corrected several typos in the first version (arXiv:2303.00196).
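The t-product underlying t-SVD, and hence the t-product layers discussed above, can be computed slice-wise in the Fourier domain; below is a minimal numpy sketch assuming real third-order tensors and the standard unnormalized DFT along the third mode.

```python
import numpy as np

# Minimal FFT-based t-product, the building block of t-SVD and t-product
# layers; assumes real third-order tensors and the DFT along axis 2.
def t_product(A, B):
    Af = np.fft.fft(A, axis=2)
    Bf = np.fft.fft(B, axis=2)
    # multiply matching frontal slices in the Fourier domain
    Cf = np.einsum('ijk,jlk->ilk', Af, Bf)
    return np.real(np.fft.ifft(Cf, axis=2))

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 3, 5))
B = rng.standard_normal((3, 2, 5))
C = t_product(A, B)                 # result has shape (4, 2, 5)
```

A t-NN layer replaces the matrix product of a dense layer with this t-product, which is why "low-rankness in the transformed domain" (few nonzero singular tubes after the FFT) compresses the layer.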
Guaranteed Robust Tensor Completion via ∗L-SVD with Applications to Remote Sensing Data
This paper conducts a rigorous analysis of the problem of robust tensor completion, which aims to recover an unknown three-way tensor from incomplete observations corrupted simultaneously by gross sparse outliers and small dense noise arising from causes such as sensor dead pixels, communication loss, electromagnetic interference, and cloud shadows. To estimate the underlying tensor, a new penalized least squares estimator is first formulated by exploiting the low-rankness of the signal tensor within the framework of the tensor ∗L-Singular Value Decomposition (∗L-SVD) and leveraging the sparse structure of the outlier tensor. Then, an algorithm based on the Alternating Direction Method of Multipliers (ADMM) is designed to compute the estimator efficiently. Statistically, a non-asymptotic upper bound on the estimation error is established and further proved to be optimal (up to a log factor) in a minimax sense. Simulation studies on synthetic data demonstrate that the proposed error bound predicts the scaling behavior of the estimation error with the problem parameters (i.e., the tubal rank of the underlying tensor, the sparsity of the outliers, and the number of uncorrupted observations). Both the effectiveness and efficiency of the proposed algorithm are evaluated through experiments on robust completion of seven different types of remote sensing data.
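An ADMM scheme of this kind alternates two standard proximal operators: entrywise soft-thresholding for the sparse outlier part and singular value thresholding for the low-rank part (applied slice-wise in the transformed domain in the ∗L-SVD setting). The matrix-level sketch below illustrates the two operators, not the paper's exact tensor updates.

```python
import numpy as np

# The two proximal operators used in such an ADMM scheme; matrix-level
# illustration, not the paper's exact tensor updates.
def soft_threshold(X, tau):
    """Prox of the l1 norm: shrink each entry toward zero by tau."""
    return np.sign(X) * np.maximum(np.abs(X) - tau, 0.0)

def svt(M, tau):
    """Singular value thresholding: prox of the nuclear norm."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return (U * np.maximum(s - tau, 0.0)) @ Vt

# Small singular values and small entries are eliminated, which is how the
# low-rank signal and the sparse outliers get separated.
M = np.diag([5.0, 0.1])
M_lowrank = svt(M, 1.0)
E_sparse = soft_threshold(np.array([3.0, -0.5, 1.0]), 1.0)
```

Each ADMM iteration applies these operators to the current residuals and then updates the dual variables, so the per-iteration cost is dominated by the SVDs.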
Five New Terpenes with Cytotoxic Activity from <i>Pestalotiopsis</i> sp.
Five new compounds, Pestalotis A–E (1–5), comprising three monoterpene-lactone compounds (1–3), one tetrahydrobenzofuran derivative (4), and one sesquiterpene (5), were isolated from the EtOAc extract of Pestalotiopsis sp. The structures of the new compounds were elucidated by analysis of their NMR, HRMS, and ECD spectra, and the absolute configurations were established by comparison of experimental and calculated ECD spectra. All compounds were tested for antitumor activity against the SW-480, LoVo, HuH-7, and MCF-7 cell lines. The results showed that compounds 2 and 4 exhibited potent antitumor activity against the SW-480, LoVo, and HuH-7 cell lines. Furthermore, compound 4 was assessed against HuH-7 cells, and the results indicated that the rate of apoptosis was dose-dependent.