Non-Standard Vietnamese Word Detection and Normalization for Text-to-Speech
Converting written texts into their spoken forms is an essential problem in
any text-to-speech (TTS) system. However, building an effective text
normalization solution for a real-world TTS system faces two main challenges:
(1) the semantic ambiguity of non-standard words (NSWs), e.g., numbers, dates,
ranges, scores, abbreviations, and (2) transforming NSWs into pronounceable
syllables, such as URLs, email addresses, hashtags, and contact names. In this
paper, we propose a new two-phase normalization approach to deal with these
challenges. First, a model-based tagger is designed to detect NSWs. Then,
depending on NSW types, a rule-based normalizer expands those NSWs into their
final verbal forms. We conducted three empirical experiments for NSW detection
using Conditional Random Fields (CRFs), BiLSTM-CNN-CRF, and BERT-BiGRU-CRF
models on a manually annotated dataset including 5819 sentences extracted from
Vietnamese news articles. In the second phase, we propose a forward
lexicon-based maximum matching algorithm to split hashtags, email addresses,
URLs, and contact names. The experimental results of the tagging phase show that the
average F1 scores of the BiLSTM-CNN-CRF and CRF models are above 90.00%,
reaching the highest F1 of 95.00% with the BERT-BiGRU-CRF model. Overall, our
approach has low sentence error rates, at 8.15% with CRF and 7.11% with
BiLSTM-CNN-CRF taggers, and only 6.67% with the BERT-BiGRU-CRF tagger.
Comment: The 14th International Conference on Knowledge and Systems
Engineering (KSE 2022)
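As a rough illustration of the forward lexicon-based maximum matching idea mentioned in this abstract, the sketch below greedily takes the longest lexicon entry at each position, scanning left to right. It is not the authors' implementation: the function name `forward_max_match`, the single-character fallback for unmatched text, and the toy lexicon are all assumptions made for this example.

```python
def forward_max_match(text, lexicon, max_len=None):
    """Greedy left-to-right longest-match segmentation (illustrative sketch)."""
    if max_len is None:
        # Longest lexicon entry bounds how far ahead we need to look.
        max_len = max((len(word) for word in lexicon), default=1)
    tokens = []
    i = 0
    while i < len(text):
        match = None
        # Try the longest candidate first, then shrink it one character at a time.
        for j in range(min(len(text), i + max_len), i, -1):
            if text[i:j] in lexicon:
                match = text[i:j]
                break
        if match is None:
            # Assumption: unknown material is emitted one character at a time.
            match = text[i]
        tokens.append(match)
        i += len(match)
    return tokens


# Hypothetical toy lexicon of unaccented syllables, for illustration only.
LEXICON = {"ha", "noi", "hanoi", "mua", "thu"}
print(forward_max_match("hanoimuathu", LEXICON))
```

With this toy lexicon, the hashtag body `hanoimuathu` is split into `hanoi`, `mua`, and `thu`, because the longer entry `hanoi` is preferred over `ha` at the first position.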
Metaheuristic for Solving the Delivery Man Problem with Drone
The Delivery Man Problem with Drone (DMPD) is a variant of the Delivery Man Problem (DMP). The objective of DMP is to minimize the sum of customers' waiting times. In DMP, only a truck delivers materials to customers, whereas in DMPD deliveries are completed by a truck and a drone working together. Using a drone is useful when a truck cannot reach some customers in particular circumstances such as narrow roads or natural disasters. For NP-hard problems, a metaheuristic is a natural approach to solving medium- to large-sized instances. In this paper, a metaheuristic algorithm is proposed. Initially, a solution without a drone is created. Then, it is passed to a split procedure that converts the DMP solution into a DMPD solution. After that, the solution is improved by a combination of Variable Neighborhood Search (VNS) and Tabu Search (TS). To explore new regions of the solution space, diversification is applied. The proposed algorithm balances diversification and intensification to prevent the search from getting trapped in local optima. The experimental simulations show that the proposed algorithm reaches good solutions fast, even for large instances.
An Effective Metaheuristic for Multiple Traveling Repairman Problem with Distance Constraints
The Multiple Traveling Repairman Problem with Distance Constraints (MTRPD) is an extension of the NP-hard Multiple Traveling Repairman Problem. In MTRPD, a fleet of identical vehicles is dispatched to serve a set of customers under the following constraints. First, each vehicle's travel distance is limited by a threshold. Second, each customer must be visited exactly once. Our goal is to find the visiting order that minimizes the sum of waiting times. To solve MTRPD, we propose to combine the Insertion Heuristic (IH), Variable Neighborhood Search (VNS), and Tabu Search (TS) algorithms into an effective two-phase metaheuristic that includes a construction phase and an improvement phase. In the former phase, IH is used to create an initial solution. In the latter phase, we use VNS to generate various neighborhoods, while TS is employed mainly to prevent the search from getting trapped in cycles. By doing so, our algorithm helps the search escape local optima. In addition, we introduce a novel neighborhood structure and a constant-time operation that make calculating the cost of each neighboring solution efficient. To show the efficiency of our proposed metaheuristic algorithm, we experiment extensively on benchmark instances. The results show that our algorithm can find the optimal solutions for all instances with up to 50 vertices in a fraction of a second. Moreover, for instances with 60 to 80 vertices, almost all solutions found fall within 0.9%-1.1% of the lower bounds on the optimal solutions in a reasonable duration. For instances with a larger number of vertices, the algorithm reaches good-quality solutions fast. Moreover, in comparison with the state-of-the-art metaheuristics, our proposed algorithm can find better solutions.
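To make the MTRPD objective and constraints described in this abstract concrete, here is a minimal sketch that evaluates a candidate solution. It assumes that travel time equals distance, that every vehicle starts at a depot indexed 0, and that a route's distance is measured along its path without a return leg; the abstract specifies none of these details, and the names `route_stats` and `evaluate` are hypothetical. It recomputes the cost from scratch and does not reproduce the constant-time neighbor evaluation the abstract mentions.

```python
def route_stats(route, dist, depot=0):
    """Return (sum of waiting times, travel distance) for one vehicle route."""
    elapsed = 0   # time at which the current customer is reached
    waiting = 0   # accumulated waiting time over the route's customers
    prev = depot
    for customer in route:
        elapsed += dist[prev][customer]
        waiting += elapsed  # each customer waits until the vehicle arrives
        prev = customer
    return waiting, elapsed


def evaluate(routes, dist, max_dist, depot=0):
    """Total waiting time of a solution, or None if a distance limit is violated."""
    visited = sorted(c for route in routes for c in route)
    expected = [v for v in range(len(dist)) if v != depot]
    if visited != expected:
        raise ValueError("each customer must be visited exactly once")
    total = 0
    for route in routes:
        waiting, travelled = route_stats(route, dist, depot)
        if travelled > max_dist:
            return None  # this vehicle exceeds its distance threshold
        total += waiting
    return total


# Hypothetical 4-vertex instance: vertex 0 is the depot, 1..3 are customers.
DIST = [
    [0, 2, 4, 3],
    [2, 0, 1, 5],
    [4, 1, 0, 2],
    [3, 5, 2, 0],
]
print(evaluate([[1, 2], [3]], DIST, max_dist=5))  # feasible: prints 8
print(evaluate([[1, 2], [3]], DIST, max_dist=2))  # infeasible: prints None
```

In the first call, customers 1 and 2 are reached at times 2 and 3 and customer 3 at time 3, giving a total waiting time of 8. A local search such as VNS with TS would call an evaluation like this (or an incremental variant of it) on each neighboring solution.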
VnCoreNLP: A Vietnamese Natural Language Processing Toolkit
We present an easy-to-use and fast toolkit, namely VnCoreNLP---a Java NLP
annotation pipeline for Vietnamese. Our VnCoreNLP supports key natural language
processing (NLP) tasks including word segmentation, part-of-speech (POS)
tagging, named entity recognition (NER) and dependency parsing, and obtains
state-of-the-art (SOTA) results for these tasks. We release VnCoreNLP to
provide rich linguistic annotations to facilitate research work on Vietnamese
NLP. Our VnCoreNLP is open-source and available at:
https://github.com/vncorenlp/VnCoreNLP
Comment: Proceedings of the 2018 Conference of the North American Chapter of
the Association for Computational Linguistics: Demonstrations, NAACL 2018, to
appear
User Scheduling and Power Allocation for Precoded Multi-Beam High Throughput Satellite Systems With Individual Quality of Service Constraints
For extensive coverage areas, multi-beam high throughput satellite (HTS) communication is a promising technology that plays a crucial role in delivering broadband services to many users with diverse Quality of Service (QoS) requirements. This paper focuses on multi-beam HTS systems where all beams reuse the same spectrum. In particular, we propose a novel user scheduling and power allocation design capable of guaranteeing the individual QoS requirements while maximizing the system throughput under a limited power budget. Precoding is employed in the forward link to mitigate mutual interference among the users in multiple-access scenarios over different coherence time intervals. The combinatorial structure of user scheduling makes the global optimum extremely costly to obtain, even when only a small number of users fit into a time slot. Therefore, we propose a heuristic algorithm yielding a good trade-off between performance and computational complexity, applicable to a static operation framework of geostationary (GEO) satellite networks. Although the power allocation optimization is a signomial program, non-convex in its standard form, the solution can be lower bounded by the global optimum of a geometric program with a hidden convex structure. A local solution to the joint user scheduling and power allocation problem is consequently obtained by a successive optimization approach. Numerical results demonstrate the effectiveness of our algorithms on GEO satellite networks by providing better QoS satisfaction combined with outstanding overall system throughput.