
    Non-Standard Vietnamese Word Detection and Normalization for Text-to-Speech

    Converting written text into its spoken form is an essential problem in any text-to-speech (TTS) system. However, building an effective text normalization solution for a real-world TTS system faces two main challenges: (1) the semantic ambiguity of non-standard words (NSWs), e.g., numbers, dates, ranges, scores, and abbreviations, and (2) transforming NSWs such as URLs, email addresses, hashtags, and contact names into pronounceable syllables. In this paper, we propose a new two-phase normalization approach to deal with these challenges. First, a model-based tagger is designed to detect NSWs. Then, depending on the NSW type, a rule-based normalizer expands those NSWs into their final verbal forms. We conducted three empirical experiments for NSW detection using Conditional Random Fields (CRFs), BiLSTM-CNN-CRF, and BERT-BiGRU-CRF models on a manually annotated dataset of 5819 sentences extracted from Vietnamese news articles. In the second phase, we propose a forward lexicon-based maximum matching algorithm to split hashtags, email addresses, URLs, and contact names into their component words. The experimental results of the tagging phase show that the average F1 scores of the CRF and BiLSTM-CNN-CRF models are above 90.00%, while the BERT-BiGRU-CRF model reaches the highest F1 of 95.00%. Overall, our approach has low sentence error rates: 8.15% with the CRF tagger, 7.11% with the BiLSTM-CNN-CRF tagger, and only 6.67% with the BERT-BiGRU-CRF tagger.
    Comment: The 14th International Conference on Knowledge and Systems Engineering (KSE 2022)
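As a rough illustration of the forward lexicon-based maximum matching mentioned in the abstract above, the sketch below greedily takes the longest lexicon entry at each position, scanning left to right. The function name, the single-character fallback, and the toy lexicon are assumptions made for this example; the paper's actual lexicon and tie-breaking rules are not given in the abstract.

```python
def forward_max_match(text, lexicon, max_len=None):
    """Greedy left-to-right segmentation using the longest lexicon match."""
    if max_len is None:
        # Longest entry bounds how far ahead we need to look.
        max_len = max((len(w) for w in lexicon), default=1)
    tokens = []
    i = 0
    while i < len(text):
        match = None
        # Try the longest candidate first, then shrink.
        for j in range(min(len(text), i + max_len), i, -1):
            if text[i:j] in lexicon:
                match = text[i:j]
                break
        if match is None:
            # Assumed fallback: emit one character when nothing matches.
            match = text[i]
        tokens.append(match)
        i += len(match)
    return tokens


# Hypothetical toy lexicon (unaccented syllables), for illustration only.
lexicon = {"ha", "noi", "hanoi", "mua", "thu"}
print(forward_max_match("hanoimuathu", lexicon))  # ['hanoi', 'mua', 'thu']
```

Because the match is greedy, "hanoi" is preferred over "ha" + "noi"; a real system would likely need a curated lexicon to keep such choices pronounceable.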

    Metaheuristic for Solving the Delivery Man Problem with Drone

    The Delivery Man Problem with Drone (DMPD) is a variant of the Delivery Man Problem (DMP). The objective of DMP is to minimize the sum of customers' waiting times. In DMP, a single truck delivers materials to customers, whereas in DMPD the delivery is completed by a truck and a drone working together. Using a drone is useful when a truck cannot reach some customers in particular circumstances such as narrow roads or natural disasters. For NP-hard problems, a metaheuristic is a natural approach to solving medium to large-sized instances. In this paper, a metaheuristic algorithm is proposed. Initially, a solution without a drone is created. It is then passed to a split procedure that converts the DMP solution into a DMPD solution. After that, the solution is improved by a combination of Variable Neighborhood Search (VNS) and Tabu Search (TS). To explore new regions of the solution space, diversification is applied. The proposed algorithm balances diversification and intensification to keep the search from getting trapped in local optima. The experimental simulations show that the proposed algorithm reaches good solutions fast, even for large instances.
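The DMP objective described above, the sum of customers' waiting times, can be sketched for the truck-only case as follows. This is a minimal illustration under stated assumptions: the route starts at a depot with index 0, travel times come from a hypothetical matrix `dist`, and the drone is ignored entirely.

```python
def total_waiting_time(route, dist):
    """Sum of arrival times at each customer along a single truck route.

    route: visiting order starting at the depot, e.g. [0, 1, 2]
    dist:  square matrix of travel times (hypothetical data)
    """
    elapsed = 0  # time at which the current customer is reached
    total = 0    # accumulated waiting time over all customers
    for prev, curr in zip(route, route[1:]):
        elapsed += dist[prev][curr]
        total += elapsed
    return total


dist = [
    [0, 2, 4],
    [2, 0, 1],
    [4, 1, 0],
]
print(total_waiting_time([0, 1, 2], dist))  # 5
print(total_waiting_time([0, 2, 1], dist))  # 9
```

The two orders cover the same travel distance up to the last customer only by coincidence of the data; the objective still differs because early legs are counted once for every customer served after them.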

    An Effective Metaheuristic for Multiple Traveling Repairman Problem with Distance Constraints

    The Multiple Traveling Repairman Problem with Distance Constraints (MTRPD) is an extension of the NP-hard Multiple Traveling Repairman Problem. In MTRPD, a fleet of identical vehicles is dispatched to serve a set of customers under the following constraints. First, each vehicle's travel distance is limited by a threshold. Second, each customer must be visited exactly once. Our goal is to find the visiting order that minimizes the sum of waiting times. To solve MTRPD, we propose combining the Insertion Heuristic (IH), Variable Neighborhood Search (VNS), and Tabu Search (TS) algorithms into an effective two-phase metaheuristic that includes a construction phase and an improvement phase. In the former phase, IH is used to create an initial solution. In the latter phase, we use VNS to generate various neighborhoods, while TS is mainly employed to keep the search from getting trapped in cycles. By doing so, our algorithm helps the search escape local optima. In addition, we introduce a novel neighborhood structure and a constant-time operation that efficiently calculates the cost of each neighboring solution. To show the efficiency of our proposed metaheuristic algorithm, we experiment extensively on benchmark instances. The results show that our algorithm can find the optimal solutions for all instances with up to 50 vertices in a fraction of a second. Moreover, for instances with 60 to 80 vertices, almost all solutions found fall within 0.9%-1.1% of the lower bounds on the optimal solutions in a reasonable duration. For instances with a larger number of vertices, the algorithm reaches good-quality solutions fast. Finally, in comparison with state-of-the-art metaheuristics, our proposed algorithm can find better solutions.
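The two MTRPD constraints and the objective stated above can be checked for a candidate solution with a short evaluator. This is only a sketch under assumptions not confirmed by the abstract: every route starts at depot 0, the return trip to the depot is not counted, customers are numbered 1..n, and the distance limit applies to the length traveled along each route. It recomputes the cost from scratch and is not the constant-time operation the paper introduces.

```python
def evaluate_mtrpd(routes, dist, max_distance, n_customers):
    """Return the total waiting time of a solution, or None if infeasible.

    routes:       one visiting order per vehicle, each starting at depot 0
    dist:         square matrix of travel distances (hypothetical data)
    max_distance: per-vehicle travel distance threshold
    n_customers:  customers are assumed to be numbered 1..n_customers
    """
    # Constraint: each customer is visited exactly once across all routes.
    visited = [v for route in routes for v in route[1:]]
    if sorted(visited) != list(range(1, n_customers + 1)):
        return None

    total = 0
    for route in routes:
        elapsed = 0
        for prev, curr in zip(route, route[1:]):
            elapsed += dist[prev][curr]
            total += elapsed  # waiting time of customer `curr`
        # Constraint: the vehicle's travel distance stays within the limit.
        if elapsed > max_distance:
            return None
    return total


dist = [
    [0, 2, 4],
    [2, 0, 1],
    [4, 1, 0],
]
print(evaluate_mtrpd([[0, 1], [0, 2]], dist, max_distance=5, n_customers=2))  # 6
print(evaluate_mtrpd([[0, 1], [0, 2]], dist, max_distance=3, n_customers=2))  # None
```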

    VnCoreNLP: A Vietnamese Natural Language Processing Toolkit

    We present an easy-to-use and fast toolkit, namely VnCoreNLP, a Java NLP annotation pipeline for Vietnamese. VnCoreNLP supports key natural language processing (NLP) tasks including word segmentation, part-of-speech (POS) tagging, named entity recognition (NER), and dependency parsing, and obtains state-of-the-art (SOTA) results for these tasks. We release VnCoreNLP to provide rich linguistic annotations to facilitate research work on Vietnamese NLP. VnCoreNLP is open-source and available at: https://github.com/vncorenlp/VnCoreNLP
    Comment: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Demonstrations, NAACL 2018, to appear

    User Scheduling and Power Allocation for Precoded Multi-Beam High Throughput Satellite Systems With Individual Quality of Service Constraints

    For extensive coverage areas, multi-beam high throughput satellite (HTS) communication is a promising technology that plays a crucial role in delivering broadband services to many users with diverse Quality of Service (QoS) requirements. This paper focuses on multi-beam HTS systems where all beams reuse the same spectrum. In particular, we propose a novel user scheduling and power allocation design capable of guaranteeing the individual QoS requirements while maximizing the system throughput under a limited power budget. Precoding is employed in the forward link to mitigate mutual interference among the users in multiple-access scenarios over different coherence time intervals. The combinatorial optimization structure of user scheduling makes the global optimum extremely costly to obtain, even when only a small number of users fit into a time slot. Therefore, we propose a heuristic algorithm yielding a good trade-off between performance and computational complexity, applicable to a static operation framework of geostationary (GEO) satellite networks. Although the power allocation optimization is a signomial program, non-convex in its standard form, the solution can be lower bounded by the global optimum of a geometric program with a hidden convex structure. A local solution to the joint user scheduling and power allocation problem is consequently obtained by a successive optimization approach. Numerical results demonstrate the effectiveness of our algorithms on GEO satellite networks by providing better QoS satisfaction combined with outstanding overall system throughput.