118 research outputs found

A General Theory for Softmax Gating Multinomial Logistic Mixture of Experts

Author: Akbarian Pedram
Ho Nhat
Nguyen Huy
Nguyen TrungTin
Publication venue
Publication date: 22/10/2023
Field of study

Mixture-of-experts (MoE) model incorporates the power of multiple submodels via gating functions to achieve greater performance in numerous regression and classification applications. From a theoretical perspective, while there have been previous attempts to comprehend the behavior of that model under the regression settings through the convergence analysis of maximum likelihood estimation in the Gaussian MoE model, such analysis under the setting of a classification problem has remained missing in the literature. We close this gap by establishing the convergence rates of density estimation and parameter estimation in the softmax gating multinomial logistic MoE model. Notably, when part of the expert parameters vanish, these rates are shown to be slower than polynomial rates owing to an inherent interaction between the softmax gating and expert functions via partial differential equations. To address this issue, we propose using a novel class of modified softmax gating functions which transform the input values before delivering them to the gating functions. As a result, the previous interaction disappears and the parameter estimation rates are significantly improved. Comment: 36 pages

arXiv.org e-Print Archive
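
To make the model class named in the abstract above concrete, here is a minimal sketch of the conditional class probabilities in a softmax gating multinomial logistic MoE: a softmax over linear gate scores mixes the softmax outputs of linear experts. The function names, the `(weight_vector, bias)` layout, and the linear form of the gates and experts are illustrative assumptions, not taken from the paper, and the modified gating functions it proposes are not shown.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of real scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def moe_class_probs(x, gates, experts):
    """Class probabilities p(y | x) of a softmax-gated multinomial logistic MoE.

    gates:   one (weight_vector, bias) pair per expert (hypothetical layout)
    experts: one list per expert, holding a (weight_vector, bias) pair per class
    """
    # Softmax gating: mixing weight of each expert given the input x.
    gate_weights = softmax([dot(w, x) + b for w, b in gates])
    n_classes = len(experts[0])
    probs = [0.0] * n_classes
    for g, expert in zip(gate_weights, experts):
        # Multinomial logistic expert: softmax over per-class linear scores.
        class_probs = softmax([dot(w, x) + b for w, b in expert])
        for c in range(n_classes):
            probs[c] += g * class_probs[c]
    return probs
```

Because both the gate and each expert output a probability vector, the mixture is itself a probability vector over classes for any input.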

Minimax Optimal Rate for Parameter Estimation in Multivariate Deviated Models

Author: Do Dat
Ho Nhat
Nguyen Huy
Nguyen Khai
Publication venue
Publication date: 29/10/2023
Field of study

We study the maximum likelihood estimation (MLE) in the multivariate deviated model where the data are generated from the density function $(1-\lambda^{\ast})h_{0}(x)+\lambda^{\ast}f(x|\mu^{\ast}, \Sigma^{\ast})$ in which $h_{0}$ is a known function, $\lambda^{\ast} \in [0,1]$ and $(\mu^{\ast}, \Sigma^{\ast})$ are unknown parameters to estimate. The main challenges in deriving the convergence rate of the MLE mainly come from two issues: (1) The interaction between the function $h_{0}$ and the density function $f$; (2) The deviated proportion $\lambda^{\ast}$ can go to the extreme points of $[0,1]$ as the sample size tends to infinity. To address these challenges, we develop the distinguishability condition to capture the linear independent relation between the function $h_{0}$ and the density function $f$. We then provide comprehensive convergence rates of the MLE via the vanishing rate of $\lambda^{\ast}$ to zero as well as the distinguishability of two functions $h_{0}$ and $f$. Comment: Dat Do and Huy Nguyen contributed equally to this work; 38 pages, 20 figures

arXiv.org e-Print Archive

Statistical Perspective of Top-K Sparse Softmax Gating Mixture of Experts

Author: Akbarian Pedram
Ho Nhat
Nguyen Huy
Yan Fanqi
Publication venue
Publication date: 24/09/2023
Field of study

Top-K sparse softmax gating mixture of experts has been widely used for scaling up massive deep-learning architectures without increasing the computational cost. Despite its popularity in real-world applications, the theoretical understanding of that gating function has remained an open problem. The main challenge comes from the structure of the top-K sparse softmax gating function, which partitions the input space into multiple regions with distinct behaviors. By focusing on a Gaussian mixture of experts, we establish theoretical results on the effects of the top-K sparse softmax gating function on both density and parameter estimations. Our results hinge upon defining novel loss functions among parameters to capture different behaviors of the input regions. When the true number of experts $k_{\ast}$ is known, we demonstrate that the convergence rates of density and parameter estimations are both parametric on the sample size. However, when $k_{\ast}$ becomes unknown and the true model is over-specified by a Gaussian mixture of $k$ experts where $k > k_{\ast}$, our findings suggest that the number of experts selected from the top-K sparse softmax gating function must exceed the total cardinality of a certain number of Voronoi cells associated with the true parameters to guarantee the convergence of the density estimation. Moreover, while the density estimation rate remains parametric under this setting, the parameter estimation rates become substantially slow due to an intrinsic interaction between the softmax gating and expert functions. Comment: 35 pages

arXiv.org e-Print Archive
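
As a rough illustration of the gating function discussed in the abstract above, the sketch below keeps the K largest gating scores, applies a softmax to them, and assigns zero weight to every other expert. The function name, the tie-breaking rule, and the use of raw scores as input are assumptions made for illustration; the paper's exact formulation may differ.

```python
import math


def top_k_softmax_gate(scores, k):
    """Gating weights: softmax over the k largest scores, zero for all other experts."""
    if not 1 <= k <= len(scores):
        raise ValueError("k must be between 1 and the number of experts")
    # Indices of the k largest scores; ties are broken in favor of the lower index.
    top = sorted(range(len(scores)), key=lambda i: (-scores[i], i))[:k]
    # Subtract the maximum selected score for numerical stability.
    m = max(scores[i] for i in top)
    exps = {i: math.exp(scores[i] - m) for i in top}
    total = sum(exps.values())
    return [exps[i] / total if i in exps else 0.0 for i in range(len(scores))]
```

Which experts receive nonzero weight depends on the ordering of the scores, so the set of active experts changes from one region of the input space to another; this is the partitioning behavior the abstract refers to.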

Hierarchical Sliced Wasserstein Distance

Author: Ho Nhat
Nguyen Huy
Nguyen Khai
Nguyen Tan
Ren Tongzheng
Rout Litu
Publication venue
Publication date: 29/09/2022
Field of study

Sliced Wasserstein (SW) distance has been widely used in different application scenarios since it can be scaled to a large number of supports without suffering from the curse of dimensionality. The value of sliced Wasserstein distance is the average of transportation cost between one-dimensional representations (projections) of original measures that are obtained by Radon Transform (RT). Despite its efficiency in the number of supports, estimating the sliced Wasserstein requires a relatively large number of projections in high-dimensional settings. Therefore, for applications where the number of supports is relatively small compared with the dimension, e.g., several deep learning applications where the mini-batch approaches are utilized, the complexities from matrix multiplication of Radon Transform become the main computational bottleneck. To address this issue, we propose to derive projections by linearly and randomly combining a smaller number of projections which are named bottleneck projections. We explain the usage of these projections by introducing Hierarchical Radon Transform (HRT) which is constructed by applying Radon Transform variants recursively. We then formulate the approach into a new metric between measures, named Hierarchical Sliced Wasserstein (HSW) distance. By proving the injectivity of HRT, we derive the metricity of HSW. Moreover, we investigate the theoretical properties of HSW including its connection to SW variants and its computational and sample complexities. Finally, we compare the computational cost and generative quality of HSW with the conventional SW on the task of deep generative modeling using various benchmark datasets including CIFAR10, CelebA, and Tiny ImageNet. Comment: 28 pages, 7 figures, 3 tables

arXiv.org e-Print Archive
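
The abstract above describes the conventional sliced Wasserstein distance as an average of one-dimensional transport costs over random projections. A minimal Monte Carlo sketch of that conventional SW estimate (not the hierarchical HSW variant the paper proposes) could look as follows. It assumes two empirical measures with the same number of equally weighted support points; the function names, defaults, and seeding are illustrative choices.

```python
import math
import random


def random_direction(dim, rng):
    """Draw a direction uniformly on the unit sphere by normalizing a Gaussian vector."""
    while True:
        v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        norm = math.sqrt(sum(c * c for c in v))
        if norm > 0.0:
            return [c / norm for c in v]


def wasserstein_1d_pow(a, b, p):
    """p-th power of the Wasserstein-p distance between two equal-size 1-D samples."""
    # In one dimension, optimal transport matches sorted points in order.
    return sum(abs(x - y) ** p for x, y in zip(sorted(a), sorted(b))) / len(a)


def sliced_wasserstein(xs, ys, n_projections=100, p=2, seed=0):
    """Monte Carlo estimate of the sliced Wasserstein-p distance between point sets."""
    if len(xs) != len(ys):
        raise ValueError("this sketch assumes equally sized samples")
    rng = random.Random(seed)
    dim = len(xs[0])
    total = 0.0
    for _ in range(n_projections):
        theta = random_direction(dim, rng)
        # Project every support point onto the random direction.
        px = [sum(t * c for t, c in zip(theta, x)) for x in xs]
        py = [sum(t * c for t, c in zip(theta, y)) for y in ys]
        total += wasserstein_1d_pow(px, py, p)
    return (total / n_projections) ** (1.0 / p)
```

Each projection costs one inner product per support point, so the projection step grows with the number of projections times the dimension; this is the cost the abstract identifies as the bottleneck when the dimension is large relative to the number of supports.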

Myosin-II proteins are involved in the growth, morphogenesis, and virulence of the human pathogenic fungus Mucor circinelloides

Author: Huy Nhat Chu
Mai Ngoc Le
Phuong Anh Nguyen
Trung Anh Trieu
Publication venue: 'Frontiers Media SA'
Publication date: 01/12/2022
Field of study

Mucormycosis is an emerging lethal invasive fungal infection. The infection caused by fungi belonging to the order Mucorales has been reported recently as one of the most common fungal infections among COVID-19 patients. The lack of understanding of pathogens, particularly at the molecular level, is one of the reasons for the difficulties in the management of the infection. Myosin is a diverse superfamily of actin-based motor proteins that have various cellular roles. Four families of myosin motors have been found in filamentous fungi, including myosin I, II, V, and fungus-specific chitin synthase with myosin motor domains. Our previous study on Mucor circinelloides, a common pathogen of mucormycosis, showed that the Myo5 protein (ID 51513) belonging to the myosin type V family had a critical impact on the growth and virulence of this fungus. In this study, to investigate the roles of myosin II proteins in M. circinelloides, silencing phenotypes and null mutants corresponding to myosin II encoding genes, designated mcmyo2A (ID 149958) and mcmyo2B (ID 136314), respectively, were generated. Those mutant strains featured a significantly reduced growth rate and impaired sporulation in comparison with the wild-type strain. Notably, the disruption of mcmyo2A led to an almost complete lack of sporulation. Both mutant strains displayed abnormally short, septate, and inflated hyphae with the presence of yeast-like cells and an unusual accumulation of pigment-filled vesicles. In vivo virulence assays of myosin-II mutant strains performed in the invertebrate model Galleria mellonella indicated that the mcmyo2A-knockout strain was avirulent, while the pathogenesis of the mcmyo2B null mutant was unaltered despite the low growth rate and impaired sporulation. The findings suggest that the myosin II proteins contribute critically to the polarized growth, septation, morphology, pigment transportation, and pathogenesis of M. circinelloides. The findings also implicate the myosin family as a potential target for future therapy to treat mucormycosis.

Directory of Open Access Journals

Pre-treatment potential of electro-coagulation process using aluminum and titanium electrodes for instant coffee processing wastewater

Author: Anh Nguyen Thi Ngoc
Duc Nguyen Duc Dat
Hoan Nguyen Xuan
Nguyen Nhat Huy
Phan Hoang Quang Huy
Que Nguyen Thi
Thuy Nguyen Thi
Publication venue: 'IAIN Surakarta'
Publication date: 23/12/2019
Field of study

This study aimed to investigate the potential of the electrocoagulation (EC) process using Al-Al and Al-Ti electrodes for the pre-treatment of instant coffee processing wastewater. The effects of various operating conditions, including cell voltage, treatment time, inter-electrode distance, solution pH, solution conductivity, and agitation speed, on the removal of chemical oxygen demand (COD) and color were considered. The maximum removals of COD and color were 87% and 99%, respectively, corresponding to effluent COD and color of 359-384 mg/L and 58-101 Pt-Co. The biodegradability of the treated wastewater was significantly improved, since BOD5/COD increased from an initial value of 0.42 to 0.65 after treatment. Neither mixing nor the addition of electrolyte was recommended. Moreover, the COD removal kinetics during the EC process appeared to follow the first-order kinetic model. The operating costs were also determined as a reference for cost assessment of the treatment.

Sustinere: Journal of Environment and Sustainability

Ultrasonic-Assisted Cathodic Plasma Electrolysis Approach for Producing of Graphene Nanosheets

Author: Dung Nguyen Quoc
Huy Nguyen Nhat
Van Hao Pham
Van Thanh Dang
Van Truong Nguyen
Publication venue: 'IntechOpen'
Publication date: 21/09/2019
Field of study

In this chapter, we review the production of graphene by the ultrasonic-assisted cathodic plasma electrolysis approach, which combines conventional electrolysis and plasma at ambient pressure and moderate temperature. First, we review the techniques for the electrochemical preparation of graphene. Then, we briefly describe the plasma electrolysis approach for producing graphene. The mechanism, advantages, and disadvantages of this technique are discussed in detail.

IntechOpen

Crossref

Regularization for a nonlinear backward parabolic problem with continuous spectrum operator

Author: Nhat Nguyen Do Minh
Trong Dang Duc
Tuan Nguyen Huy
Publication venue: 'Mathematical Notes'
Publication date: 01/01/2013
Field of study

Repository of the Academy's Library

CHARACTERIZATION AND ADSORPTION CAPACITY OF AMINE-SIO2 MATERIAL FOR NITRATE AND PHOSPHATE REMOVAL

Author: Hang Le Ngoc
Huy Nguyen Nhat
Thanh Nguyen Trung
Thich Le Tri
Toan Phan Phuoc
Publication venue: 'Publishing House for Science and Technology, Vietnam Academy of Science and Technology'
Publication date: 01/07/2019
Field of study

Amine-SiO2 material was synthesized and applied as a novel adsorbent for nitrate and phosphate removal from aqueous solution. Amine-SiO2 was characterized using TGA, FTIR, BET, and SEM analyses. Results showed that the nitrate and phosphate adsorption capacities of Amine-SiO2 were 1.14 and 4.16 times higher, respectively, than those of a commercial anion exchange resin (Akualite A420). In addition, Amine-SiO2 had good durability, with stable performance after at least 10 regeneration cycles, indicating that this material is very promising for future commercialization as an adsorbent for water treatment.

Vietnam Academy of Science and Technology: Journals Online

Application of the cut-off projection to solve a backward heat conduction problem in a two-slab composite system

Author: Hung Tran The
Khoa Vo Anh
Minh Mach Nguyet
Truong Mai Thanh Nhat
Tuan Nguyen Huy
Publication venue
Publication date: 30/03/2018
Field of study

The main goal of this paper is to apply the cut-off projection to solve a one-dimensional backward heat conduction problem in a two-slab system with perfect contact. In a constructive manner, we commence by demonstrating that the Fourier-based solution contains drastic growth due to the high-frequency nature of the Fourier series. Such instability calls for studying the projection method, from which the cut-off approach is derived consistently. In the theoretical framework, the first two objectives are to construct the regularized problem and prove its stability for each noise level. Our second interest is estimating the error in -norm. Another supplementary objective is computing the eigen-elements. All in all, this paper can be considered a preliminary attempt to solve the heating/cooling of a two-slab composite system backward in time. Several numerical tests are provided to corroborate the qualitative analysis. Peer reviewed

arXiv.org e-Print Archive

Crossref

Helsingin yliopiston digitaalinen arkisto