107 research outputs found

    Non-Standard Vietnamese Word Detection and Normalization for Text-to-Speech

    Converting written texts into their spoken forms is an essential problem in any text-to-speech (TTS) system. However, building an effective text normalization solution for a real-world TTS system faces two main challenges: (1) the semantic ambiguity of non-standard words (NSWs), e.g., numbers, dates, ranges, scores, and abbreviations, and (2) transforming NSWs such as URLs, email addresses, hashtags, and contact names into pronounceable syllables. In this paper, we propose a new two-phase normalization approach to deal with these challenges. First, a model-based tagger is designed to detect NSWs. Then, depending on the NSW type, a rule-based normalizer expands those NSWs into their final verbal forms. We conducted three empirical experiments for NSW detection using Conditional Random Fields (CRFs), BiLSTM-CNN-CRF, and BERT-BiGRU-CRF models on a manually annotated dataset of 5819 sentences extracted from Vietnamese news articles. In the second phase, we propose a forward lexicon-based maximum matching algorithm to split hashtags, emails, URLs, and contact names. The experimental results of the tagging phase show that the average F1 scores of the BiLSTM-CNN-CRF and CRF models are above 90.00%, reaching the highest F1 of 95.00% with the BERT-BiGRU-CRF model. Overall, our approach has low sentence error rates: 8.15% with the CRF tagger, 7.11% with the BiLSTM-CNN-CRF tagger, and only 6.67% with the BERT-BiGRU-CRF tagger. Comment: The 14th International Conference on Knowledge and Systems Engineering (KSE 2022)
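
The forward lexicon-based maximum matching step mentioned in the abstract above can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `forward_max_match`, the toy lexicon, and the single-character fallback for unmatched text are all assumptions made for illustration. The idea is to scan left to right and, at each position, take the longest substring found in the lexicon.

```python
def forward_max_match(text, lexicon, max_len=None):
    """Greedy left-to-right segmentation (hypothetical helper).

    At each position, take the longest substring present in `lexicon`.
    If nothing matches, emit a single character (an assumed fallback).
    """
    if max_len is None:
        # Longest lexicon entry bounds the search window.
        max_len = max((len(w) for w in lexicon), default=1)
    tokens = []
    i = 0
    while i < len(text):
        match = None
        # Try the longest candidate first, then shrink the window.
        for j in range(min(len(text), i + max_len), i, -1):
            if text[i:j] in lexicon:
                match = text[i:j]
                break
        if match is None:
            match = text[i]  # assumed fallback: one character
        tokens.append(match)
        i += len(match)
    return tokens


# Toy lexicon, invented for this example only.
toy_lexicon = {"viet", "nam", "new", "news"}
segments = forward_max_match("vietnamnews", toy_lexicon)
```

With the toy lexicon, the string `vietnamnews` is split at the longest match from each position, so `news` is preferred over `new`. A real system would draw the lexicon from a syllable or word dictionary of the target language.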

    Level-Based Analysis of the Univariate Marginal Distribution Algorithm

    Estimation of Distribution Algorithms (EDAs) are stochastic heuristics that search for optimal solutions by learning and sampling from probabilistic models. Despite their popularity in real-world applications, there is little rigorous understanding of their performance. Even for the Univariate Marginal Distribution Algorithm (UMDA), a simple population-based EDA assuming independence between decision variables, the optimisation time on the linear problem OneMax was until recently undetermined. The incomplete theoretical understanding of EDAs is mainly due to a lack of appropriate analytical tools. We show that the recently developed level-based theorem for non-elitist populations, combined with anti-concentration results, yields upper bounds on the expected optimisation time of the UMDA. This approach results in the bound $\mathcal{O}(n\lambda\log \lambda+n^2)$ on two problems, LeadingOnes and BinVal, for population sizes $\lambda>\mu=\Omega(\log n)$, where $\mu$ and $\lambda$ are parameters of the algorithm. We also prove that the UMDA with population sizes $\mu\in \mathcal{O}(\sqrt{n}) \cap \Omega(\log n)$ optimises OneMax in expected time $\mathcal{O}(\lambda n)$, and for larger population sizes $\mu=\Omega(\sqrt{n}\log n)$, in expected time $\mathcal{O}(\lambda\sqrt{n})$. The facility and generality of our arguments suggest that this is a promising approach to derive bounds on the expected optimisation time of EDAs. Comment: To appear in Algorithmica Journal
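
For readers unfamiliar with the UMDA, the following is a minimal sketch of the algorithm on OneMax, not the exact variant analysed in the paper. It assumes truncation selection of the best `mu` out of `lam` sampled bit strings and restricts each marginal probability to the interval [1/n, 1 - 1/n]; these margins are a common convention but are an assumption here, as are the function names, the `max_gens` budget, and the parameter values.

```python
import random


def onemax(x):
    """Number of one-bits in the bit string x."""
    return sum(x)


def umda(n, lam, mu, max_gens=1000, seed=0):
    """Sketch of the UMDA on OneMax (requires n >= 2, mu <= lam).

    Returns (best fitness seen, generations used, final marginals).
    """
    rng = random.Random(seed)
    lo, hi = 1.0 / n, 1.0 - 1.0 / n   # assumed margins on the marginals
    p = [0.5] * n                     # independent marginal per bit
    best = 0
    for gen in range(1, max_gens + 1):
        # Sample lam individuals from the product distribution p.
        pop = [[1 if rng.random() < q else 0 for q in p] for _ in range(lam)]
        pop.sort(key=onemax, reverse=True)
        best = max(best, onemax(pop[0]))
        if best == n:
            return best, gen, p
        # Truncation selection: keep the mu fittest individuals.
        selected = pop[:mu]
        # New marginal = frequency of ones among the selected, clamped.
        p = [min(max(sum(ind[i] for ind in selected) / mu, lo), hi)
             for i in range(n)]
    return best, max_gens, p


best, gens, marginals = umda(n=20, lam=100, mu=20)
```

The model is just the vector of marginals, which reflects the independence assumption between decision variables noted in the abstract. The number of fitness evaluations is `lam` times the number of generations, which is the quantity the stated bounds concern.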

    Law on Corporate Governance - The Development Trends of the World and the Problems Posed to Vietnam

    In recent years, following the trend of extensive international integration, prestigious international organizations such as the OECD, the World Bank, and the IFC, as well as individual countries, have been trying to develop effective legal regulations and principles on corporate governance. In general, these rules influence one another, so they share certain similarities: all emphasize the importance of independence, transparency, and accountability in corporate governance. By examining the development trends of corporate governance law worldwide, the article draws valuable lessons for improving corporate governance law in Vietnam.

    A SOCIAL SURVEY ON COMMUNITY RESPONSE TO ROAD TRAFFIC NOISE IN HANOI

    Joint Research on Environmental Science and Technology for the Earth

    M^2UNet: MetaFormer Multi-scale Upsampling Network for Polyp Segmentation

    Polyp segmentation has recently garnered significant attention, and multiple methods have been formulated that achieve commendable outcomes. However, these techniques often have difficulty with the complex polyp foreground and its surrounding regions because of the nature of the convolution operation. In addition, most existing methods fail to exploit the potential information from multiple decoder stages. To address this challenge, we suggest combining MetaFormer, introduced as a baseline for integrating CNN and Transformer, with the UNet framework and incorporating our Multi-scale Upsampling block (MU). This simple module makes it possible to combine multi-level information by exploring multiple receptive field paths of the shallow decoder stage and then adding the result to the higher stage to aggregate a better feature representation, which is essential in medical image segmentation. Taken together, we propose the MetaFormer Multi-scale Upsampling Network (M^2UNet) for the polyp segmentation task. Extensive experiments on five benchmark datasets demonstrate that our method achieves competitive performance compared with several previous methods.

    Fabrication and evaluation of some electrochemical properties of screen-printed electrodes for use in electrochemical analysis

    Three types of conductive inks, namely Ceres and Acheson carbon inks and Ag/AgCl ink, were used to fabricate screen-printed electrodes (SPEs) on a 0.4 mm thick polyethylene terephthalate substrate using a screen-printing technique. To enhance the electrical conductivity, the printed electrodes were cured at 80°C for 90 minutes. The basic electrochemical properties of the self-made SPEs using these conductive inks were determined, evaluated, and compared with commercial SPEs from Metrohm. Although the electroactive surface areas of the self-made SPEs were not significantly different from those of the commercial SPEs, the heterogeneous electron transfer rates on the surfaces of self-made SPEs using Ceres and Acheson inks were inferior to those of the commercial SPEs. However, after preconditioning by applying a potential of +1.2 V for 180 s in a 2 M Na2CO3 solution, the electrochemical properties of the self-made SPEs, including the active surface areas and heterogeneous electron transfer rates, improved significantly and became better than those of the commercial SPEs.