Non-Standard Vietnamese Word Detection and Normalization for Text-to-Speech
Converting written texts into their spoken forms is an essential problem in
any text-to-speech (TTS) system. However, building an effective text
normalization solution for a real-world TTS system faces two main challenges:
(1) the semantic ambiguity of non-standard words (NSWs), e.g., numbers, dates,
ranges, scores, abbreviations, and (2) transforming NSWs such as URLs, email
addresses, hashtags, and contact names into pronounceable syllables. In this
paper, we propose a new two-phase normalization approach to deal with these
challenges. First, a model-based tagger is designed to detect NSWs. Then,
depending on NSW types, a rule-based normalizer expands those NSWs into their
final verbal forms. We conducted three empirical experiments for NSW detection
using Conditional Random Fields (CRFs), BiLSTM-CNN-CRF, and BERT-BiGRU-CRF
models on a manually annotated dataset including 5819 sentences extracted from
Vietnamese news articles. In the second phase, we propose a forward
lexicon-based maximum matching algorithm to split hashtags, emails, URLs,
and contact names. The experimental results of the tagging phase show that the
average F1 scores of the BiLSTM-CNN-CRF and CRF models are above 90.00%,
reaching the highest F1 of 95.00% with the BERT-BiGRU-CRF model. Overall, our
approach has low sentence error rates, at 8.15% with CRF and 7.11% with
BiLSTM-CNN-CRF taggers, and only 6.67% with the BERT-BiGRU-CRF tagger.
Comment: The 14th International Conference on Knowledge and Systems
Engineering (KSE 2022)
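The forward lexicon-based maximum matching step mentioned in the abstract can be illustrated with a minimal sketch. It assumes a greedy left-to-right scan that always takes the longest lexicon entry at the current position and falls back to a single character when nothing matches. The function name `forward_max_match`, the toy `LEXICON`, the lowercasing, and the single-character fallback are all illustrative assumptions, not details taken from the paper.

```python
def forward_max_match(text, lexicon, max_len=None):
    """Greedy left-to-right segmentation (illustrative sketch).

    At each position, take the longest lexicon entry that matches;
    if none matches, emit a single character and move on.
    """
    text = text.lower()  # assumption: case-insensitive lookup
    if max_len is None:
        max_len = max((len(w) for w in lexicon), default=1)
    tokens = []
    i = 0
    while i < len(text):
        match = None
        # Try the longest candidate first, then shrink the window.
        for j in range(min(len(text), i + max_len), i, -1):
            if text[i:j] in lexicon:
                match = text[i:j]
                break
        if match is None:
            match = text[i]  # unknown character: emit it on its own
        tokens.append(match)
        i += len(match)
    return tokens


# Hypothetical toy lexicon, not from the paper.
LEXICON = {"viet", "nam", "vietnam", "news", "new", "today"}

print(forward_max_match("VietnamNewsToday", LEXICON))
```

Because the scan is greedy, "vietnam" is preferred over "viet" + "nam" and "news" over "new"; a real system would presumably use a Vietnamese syllable lexicon and type-specific rules for the remaining characters.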
Level-Based Analysis of the Univariate Marginal Distribution Algorithm
Estimation of Distribution Algorithms (EDAs) are stochastic heuristics that
search for optimal solutions by learning and sampling from probabilistic
models. Despite their popularity in real-world applications, there is little
rigorous understanding of their performance. Even for the Univariate Marginal
Distribution Algorithm (UMDA) -- a simple population-based EDA assuming
independence between decision variables -- the optimisation time on the linear
problem OneMax was until recently undetermined. The incomplete theoretical
understanding of EDAs is mainly due to a lack of appropriate analytical tools.
We show that the recently developed level-based theorem for non-elitist
populations, combined with anti-concentration results, yields upper bounds on the
expected optimisation time of the UMDA. This approach results in the bound
on two problems, LeadingOnes and
BinVal, for population sizes , where and
are parameters of the algorithm. We also prove that the UMDA with
population sizes optimises
OneMax in expected time , and for larger population
sizes , in expected time
. The facility and generality of our arguments
suggest that this is a promising approach to derive bounds on the expected
optimisation time of EDAs.
Comment: To appear in Algorithmica Journal
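For readers unfamiliar with the UMDA, the following is a minimal sketch of the algorithm on OneMax, not the authors' code. It assumes the common formulation: sample `lam` bitstrings from independent per-bit marginals, keep the `mu` fittest, and set each marginal to the frequency of 1s among them, clamped to the borders [1/n, 1 - 1/n]. The names `umda`, `mu`, `lam`, `target`, and `max_gens`, and the choice of borders, are assumptions made for illustration.

```python
import random


def onemax(bits):
    """OneMax fitness: the number of 1-bits."""
    return sum(bits)


def umda(n, mu, lam, fitness, target, max_gens=10_000, seed=0):
    """UMDA sketch (assumes n >= 2 and mu <= lam).

    Returns (best_individual, fitness_evaluations), or (None, evaluations)
    if the target fitness is not reached within max_gens generations.
    """
    rng = random.Random(seed)
    p = [0.5] * n                    # marginal probability of a 1 per position
    lo, hi = 1.0 / n, 1.0 - 1.0 / n  # borders keep both bit values reachable
    for gen in range(1, max_gens + 1):
        # Sample lam individuals from the product distribution.
        pop = [[1 if rng.random() < p[i] else 0 for i in range(n)]
               for _ in range(lam)]
        pop.sort(key=fitness, reverse=True)
        if fitness(pop[0]) >= target:
            return pop[0], gen * lam
        # Re-estimate each marginal from the mu fittest individuals.
        selected = pop[:mu]
        p = [min(hi, max(lo, sum(x[i] for x in selected) / mu))
             for i in range(n)]
    return None, max_gens * lam


best, evals = umda(n=20, mu=10, lam=40, fitness=onemax, target=20)
print(best, evals)
```

The returned evaluation count is the quantity that the abstract's optimisation-time bounds refer to in expectation; a single seeded run like this only illustrates the mechanics and says nothing about the asymptotic bounds.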
Law on Corporate Governance - The Development Trends of the World and the Problems Posed to Vietnam
In recent years, following the trend of extensive international integration, prestigious international organizations such as the OECD, the World Bank, and the IFC, as well as individual countries, have been trying to develop effective legal regulations and principles on corporate governance. In general, these rules influence one another and therefore share certain similarities: all emphasize the importance of independence, transparency, and accountability in corporate governance. By examining the development trends of corporate governance law worldwide, the article offers valuable lessons for improving corporate governance law in Vietnam.
A SOCIAL SURVEY ON COMMUNITY RESPONSE TO ROAD TRAFFIC NOISE IN HANOI
Joint Research on Environmental Science and Technology for the Earth
M^2UNet: MetaFormer Multi-scale Upsampling Network for Polyp Segmentation
Polyp segmentation has recently garnered significant attention, and multiple
methods have been formulated to achieve commendable outcomes. However, these
techniques often struggle with the complex polyp
foreground and its surrounding regions because of the nature of the convolution
operation. Besides, most existing methods fail to exploit the potential
information from multiple decoder stages. To address these challenges, we suggest
combining MetaFormer, introduced as a baseline for integrating CNN and
Transformer, with the UNet framework and incorporating our Multi-scale Upsampling
block (MU). This simple module makes it possible to combine multi-level
information by exploring multiple receptive field paths of the shallow decoder
stage and then adding the result to the higher stage to aggregate a better feature
representation, which is essential in medical image segmentation. Taken
together, we propose the MetaFormer Multi-scale Upsampling Network (M^2UNet) for
the polyp segmentation task. Extensive experiments on five benchmark datasets
demonstrate that our method achieves competitive performance compared with
several previous methods.
Fabrication and evaluation of some electrochemical properties of screen-printed electrodes for use in electrochemical analysis
Three types of conductive inks, including Ceres, Acheson carbon inks, and Ag/AgCl ink, were utilized to fabricate screen-printed electrodes (SPEs) on a 0.4 mm thick polyethylene terephthalate substrate using a screen-printing technique. To enhance the electrical conductivity, the printed electrodes were cured at 80°C for 90 minutes. The basic electrochemical properties of the self-made SPEs using these conductive inks were determined, evaluated, and compared with commercial SPEs from Metrohm. Although the electroactive surface areas of the self-made SPEs were not significantly different from those of the commercial SPEs, the heterogeneous electron transfer rates on the surfaces of self-made SPEs using Ceres and Acheson inks were inferior to those of the commercial SPEs. However, after preconditioning by applying a potential of +1.2 V for 180 s in a 2 M Na2CO3 solution, the electrochemical properties of the self-made SPEs, including the active surface areas and heterogeneous electron transfer rates, were significantly improved and became better than those of the commercial SPEs.