Minimax Optimal Rate for Parameter Estimation in Multivariate Deviated Models
We study maximum likelihood estimation (MLE) in the multivariate deviated
model, where the data are generated from a density function built from a known
function and a density function whose parameters are unknown and must be
estimated. The challenges in deriving the convergence rate of the MLE come
mainly from two issues: (1) the interaction between the known function and the
density function; (2) the deviated proportion can go to the extreme points of
its range as the sample size tends to infinity. To address these challenges, we
develop the distinguishability condition to capture the linear independence
relation between the known function and the density function. We then provide
comprehensive convergence rates of the MLE via the rate at which the deviated
proportion vanishes to zero as well as the distinguishability of the two
functions. Comment: Dat Do and Huy Nguyen contributed equally to this work; 38
pages, 20 figures
A General Theory for Softmax Gating Multinomial Logistic Mixture of Experts
The mixture-of-experts (MoE) model incorporates the power of multiple submodels
via gating functions to achieve greater performance in numerous regression and
classification applications. From a theoretical perspective, while there have
been previous attempts to comprehend the behavior of that model under the
regression settings through the convergence analysis of maximum likelihood
estimation in the Gaussian MoE model, such analysis under the setting of a
classification problem has remained missing in the literature. We close this
gap by establishing the convergence rates of density estimation and parameter
estimation in the softmax gating multinomial logistic MoE model. Notably, when
part of the expert parameters vanish, these rates are shown to be slower than
polynomial rates owing to an inherent interaction between the softmax gating
and expert functions via partial differential equations. To address this issue,
we propose using a novel class of modified softmax gating functions which
transform the input before delivering it to the gating functions. As a
result, the previous interaction disappears and the parameter estimation rates
are significantly improved. Comment: 36 pages
Statistical Perspective of Top-K Sparse Softmax Gating Mixture of Experts
Top-K sparse softmax gating mixture of experts has been widely used for
scaling up massive deep-learning architectures without increasing the
computational cost. Despite its popularity in real-world applications, the
theoretical understanding of that gating function has remained an open problem.
The main challenge comes from the structure of the top-K sparse softmax gating
function, which partitions the input space into multiple regions with distinct
behaviors. By focusing on a Gaussian mixture of experts, we establish
theoretical results on the effects of the top-K sparse softmax gating function
on both density and parameter estimations. Our results hinge upon defining
novel loss functions among parameters to capture different behaviors of the
input regions. When the true number of experts is known, we
demonstrate that the convergence rates of density and parameter estimations are
both parametric on the sample size. However, when that number becomes unknown
and the true model is over-specified by a Gaussian mixture with more experts
than the truth, our findings suggest that the number of experts selected from
the top-K sparse softmax gating function must exceed the total cardinality of a
certain number of Voronoi cells associated with the true parameters to
guarantee the convergence of the density estimation. Moreover, while the
density estimation rate remains parametric under this setting, the parameter
estimation rates become substantially slow due to an intrinsic interaction
between the softmax gating and expert functions. Comment: 35 pages
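As a rough illustration of the gating function discussed in the abstract above, the following sketch keeps only the K largest gating scores and applies a softmax over them, so every other expert receives weight zero. The function name `top_k_softmax_gate` and the example scores are hypothetical and not taken from the paper; in a real model the scores would typically be computed from the input.

```python
import numpy as np

def top_k_softmax_gate(logits, k):
    """Softmax restricted to the k largest scores; all other weights are zero.

    Illustrative sketch only: `logits` stands in for the gating scores,
    which are passed in directly here instead of being computed from an input.
    """
    logits = np.asarray(logits, dtype=float)
    top = np.argsort(logits)[-k:]                # indices of the k largest scores
    weights = np.zeros_like(logits)
    shifted = logits[top] - logits[top].max()    # subtract max for numerical stability
    weights[top] = np.exp(shifted) / np.exp(shifted).sum()
    return weights

# Hypothetical scores for four experts, keeping the top two.
gate = top_k_softmax_gate([2.0, 0.5, 1.0, -1.0], k=2)
```

Because the set of selected experts changes with the scores, the gate behaves differently on different regions of the input space, which is the partitioning effect the abstract refers to.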
Hierarchical Sliced Wasserstein Distance
Sliced Wasserstein (SW) distance has been widely used in different
application scenarios since it can be scaled to a large number of supports
without suffering from the curse of dimensionality. The value of sliced
Wasserstein distance is the average of transportation cost between
one-dimensional representations (projections) of original measures that are
obtained by Radon Transform (RT). Despite its efficiency in the number of
supports, estimating the sliced Wasserstein distance requires a relatively large number
of projections in high-dimensional settings. Therefore, for applications where
the number of supports is relatively small compared with the dimension, e.g.,
several deep learning applications where the mini-batch approaches are
utilized, the complexities from matrix multiplication of Radon Transform become
the main computational bottleneck. To address this issue, we propose to derive
projections by linearly and randomly combining a smaller number of projections
which are named bottleneck projections. We explain the usage of these
projections by introducing Hierarchical Radon Transform (HRT) which is
constructed by applying Radon Transform variants recursively. We then formulate
the approach into a new metric between measures, named Hierarchical Sliced
Wasserstein (HSW) distance. By proving the injectivity of HRT, we derive the
metricity of HSW. Moreover, we investigate the theoretical properties of HSW
including its connection to SW variants and its computational and sample
complexities. Finally, we compare the computational cost and generative quality
of HSW with the conventional SW on the task of deep generative modeling using
various benchmark datasets including CIFAR10, CelebA, and Tiny ImageNet. Comment: 28 pages, 7 figures, 3 tables
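For orientation, here is a minimal Monte Carlo sketch of the conventional sliced Wasserstein distance described in the abstract above (not the hierarchical variant the paper proposes). It assumes two point clouds with the same number of equally weighted supports; the function name, parameters, and sample data are hypothetical.

```python
import numpy as np

def sliced_wasserstein(x, y, n_projections=100, p=2, seed=0):
    """Monte Carlo estimate of the sliced Wasserstein-p distance.

    Assumes x and y have shape (n_points, dim) with equal n_points and
    uniform weights, so 1-D optimal transport reduces to matching sorted values.
    """
    rng = np.random.default_rng(seed)
    dim = x.shape[1]
    # Draw random projection directions on the unit sphere.
    theta = rng.normal(size=(n_projections, dim))
    theta /= np.linalg.norm(theta, axis=1, keepdims=True)
    # Project the supports to one dimension; this matrix product is the
    # projection step whose cost grows with dim and n_projections.
    x_proj = np.sort(x @ theta.T, axis=0)
    y_proj = np.sort(y @ theta.T, axis=0)
    # Average transport cost per projection, then average over projections.
    cost = np.mean(np.abs(x_proj - y_proj) ** p, axis=0)
    return np.mean(cost) ** (1.0 / p)

rng = np.random.default_rng(1)
x = rng.normal(size=(64, 8))
y = rng.normal(loc=1.0, size=(64, 8))
sw_same = sliced_wasserstein(x, x)   # identical clouds
sw_diff = sliced_wasserstein(x, y)   # shifted cloud
```

With a small mini-batch (64 supports here) and many projections, the projection step dominates the work, which is the regime the abstract identifies as the bottleneck.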
Pre-treatment potential of electro-coagulation process using aluminum and titanium electrodes for instant coffee processing wastewater
This study investigated the potential of the electrocoagulation (EC) process using Al-Al and Al-Ti electrodes for the pre-treatment of instant coffee processing wastewater. The effects of various operating conditions, including cell voltage, treatment time, inter-electrode distance, solution pH, solution conductivity and agitation speed, on the removal of chemical oxygen demand (COD) and color were considered. The maximum removals of COD and color were 87% and 99%, respectively, corresponding to COD and color in the effluents of 359-384 mg/L and 58-101 Pt-Co. The biodegradability of the treated wastewater was significantly improved, since BOD5/COD increased from an initial value of 0.42 to 0.65 after treatment. Neither mixing nor electrolyte addition was recommended. Moreover, the COD removal kinetics during the EC process appeared to follow the first-order kinetic model. The operating costs were also determined as a reference for cost assessment of the treatment.
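For reference, a first-order kinetic model of the kind mentioned in the abstract above is usually written as follows. The symbols are assumptions introduced here for illustration and do not come from the study: $C(t)$ is the COD concentration at treatment time $t$, $C_0$ the initial concentration, and $k$ the first-order rate constant.

```latex
\frac{dC}{dt} = -k\,C
\quad\Longrightarrow\quad
C(t) = C_0\,e^{-kt},
\qquad
\ln\frac{C_0}{C(t)} = k\,t,
\qquad
\text{removal}(t) = 1 - \frac{C(t)}{C_0} = 1 - e^{-kt}.
```

Under this model, a plot of $\ln(C_0/C(t))$ against $t$ is a straight line through the origin with slope $k$, which is the usual way such a fit is checked.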
Myosin-II proteins are involved in the growth, morphogenesis, and virulence of the human pathogenic fungus Mucor circinelloides
Mucormycosis is an emerging lethal invasive fungal infection. The infection caused by fungi belonging to the order Mucorales has been reported recently as one of the most common fungal infections among COVID-19 patients. The lack of understanding of the pathogens, particularly at the molecular level, is one of the reasons for the difficulties in managing the infection. Myosin is a diverse superfamily of actin-based motor proteins that have various cellular roles. Four families of myosin motors have been found in filamentous fungi, including myosin I, II, V, and fungus-specific chitin synthase with myosin motor domains. Our previous study on Mucor circinelloides, a common pathogen of mucormycosis, showed that the Myo5 protein (ID 51513) belonging to the myosin type V family had a critical impact on the growth and virulence of this fungus. In this study, to investigate the roles of myosin II proteins in M. circinelloides, silencing phenotypes and null mutants corresponding to myosin II encoding genes, designated mcmyo2A (ID 149958) and mcmyo2B (ID 136314), respectively, were generated. Those mutant strains featured a significantly reduced growth rate and impaired sporulation in comparison with the wild-type strain. Notably, the disruption of mcmyo2A led to an almost complete lack of sporulation. Both mutant strains displayed abnormally short, septate, and inflated hyphae with the presence of yeast-like cells and an unusual accumulation of pigment-filled vesicles. In vivo virulence assays of myosin-II mutant strains performed in the invertebrate model Galleria mellonella indicated that the mcmyo2A-knockout strain was avirulent, while the pathogenesis of the mcmyo2B null mutant was unaltered despite the low growth rate and impaired sporulation. The findings suggest that the myosin II proteins make critical contributions to the polarized growth, septation, morphology, pigment transportation, and pathogenesis of M. circinelloides. The findings also implicate the myosin family as a potential target for future therapy to treat mucormycosis.
Ultrasonic-Assisted Cathodic Plasma Electrolysis Approach for Producing of Graphene Nanosheets
In this chapter, we review the production of graphene by the ultrasonic-assisted cathodic plasma electrolysis approach, which combines conventional electrolysis and plasma at ambient pressure and moderate temperature. Firstly, we review the techniques for electrochemical preparation of graphene. Then, we briefly describe the plasma electrolysis approach for producing graphene. The mechanism, advantages, and disadvantages of this technique are discussed in detail.
IncepSE: Leveraging InceptionTime's performance with Squeeze and Excitation mechanism in ECG analysis
Our study focuses on the potential for modifications of Inception-like
architecture within the electrocardiogram (ECG) domain. To this end, we
introduce IncepSE, a novel network characterized by strategic architectural
incorporation that leverages the strengths of both InceptionTime and channel
attention mechanisms. Furthermore, we propose a training setup that employs
stabilization techniques aimed at tackling the formidable challenges of the
severely imbalanced PTB-XL dataset and gradient corruption. By this means, we
set a new high for supervised deep learning models across the majority of
tasks. Our model consistently surpasses InceptionTime by substantial margins
relative to other state-of-the-art models in this domain, notably a 0.013
AUROC improvement in the "all" task, while also mitigating the inherent
dataset fluctuations during training.
CHARACTERIZATION AND ADSORPTION CAPACITY OF AMINE-SIO2 MATERIAL FOR NITRATE AND PHOSPHATE REMOVAL
Amine-SiO2 material was synthesized and applied as a novel adsorbent for nitrate and phosphate removal from aqueous solution. Amine-SiO2 was characterized using TGA, FTIR, BET, and SEM analyses. Results showed that the nitrate and phosphate adsorption capacities of Amine-SiO2 were 1.14 and 4.16 times, respectively, those of a commercial anion exchange resin (Akualite A420). In addition, Amine-SiO2 had good durability, with stable performance after at least 10 regeneration cycles, indicating that this material is very promising for future commercialization as an adsorbent for water treatment.