R Peak Determination Using a WDFR Algorithm and Adaptive Threshold
Determining the position of the R peak in the ECG signal helps physicians not only to know the heart rate per minute but also to monitor the patient's health with respect to heart disease. This paper proposes a system to accurately determine the R peak position in the ECG signal. The system consists of a pre-processing block that filters out noise using a WDFR algorithm and highlights the amplitude of the R peak, and an adaptive threshold that is then calculated to determine the R peak. In this research, the MIT-BIH ECG dataset with 48 records is used to evaluate the system. The results for the SEN, +P, DER and ACC quality parameters are 99.70%, 99.59%, 0.70% and 99.31%, respectively. The performance of the proposed R peak determination system is very high, and the system can be applied to determine the R peak in ECG measuring devices in practice.
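The abstract does not spell out the WDFR details, but the general recipe it describes (an adaptive amplitude threshold applied to a denoised, R-peak-enhanced signal, evaluated with the standard beat-detection measures) can be sketched as follows. This is a minimal illustration, not the paper's algorithm; function names, thresholds and the metric formulas from TP/FP/FN counts are assumptions or standard conventions.

    import numpy as np

    def detect_r_peaks(ecg, fs, threshold_factor=0.5, refractory_s=0.2):
        """Illustrative adaptive-threshold R-peak detector (not the paper's WDFR).
        Assumes `ecg` has already been denoised / R-peak-enhanced by pre-processing."""
        threshold = threshold_factor * np.max(ecg[: 2 * fs])  # initial threshold from first 2 s
        refractory = int(refractory_s * fs)                   # minimum spacing between beats
        peaks, last_peak = [], -refractory
        for i in range(1, len(ecg) - 1):
            is_local_max = ecg[i] > ecg[i - 1] and ecg[i] >= ecg[i + 1]
            if is_local_max and ecg[i] > threshold and i - last_peak > refractory:
                peaks.append(i)
                last_peak = i
                # Adapt the threshold toward the running peak amplitude.
                threshold = 0.125 * ecg[i] + 0.875 * threshold
        return np.array(peaks)

    def beat_detection_metrics(tp, fp, fn):
        """Commonly used beat-detection measures corresponding to SEN, +P, DER, ACC."""
        sen = tp / (tp + fn)          # sensitivity
        ppv = tp / (tp + fp)          # positive predictivity (+P)
        der = (fp + fn) / (tp + fn)   # detection error rate
        acc = tp / (tp + fp + fn)     # accuracy
        return sen, ppv, der, acc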
Dynamic Wavelength Routing in All-Optical Mesh Networks
Wavelength-division multiplexing (WDM) offers the capability to handle the increasing demand of network traffic in a manner that takes advantage of already deployed optical fibers. Lightpaths are optical connections carried end-to-end over a wavelength on each intermediate link. Wavelengths are the main resource in WDM networks. Due to the inherent channel constraints, a dynamic control mechanism is required to use this resource efficiently and maximize lightpath connections. In this paper, we investigate a class of adaptive routing called dynamic wavelength routing (DWR), in which no wavelength converters (WCs) are used in the network. The objective is to maximize wavelength utilization and reduce the blocking probability in an arbitrary network. The approach comprises two sub-algorithms: the least congestion with least nodal-degree routing algorithm (LCLNR) and the dynamic two-end wavelength routing algorithm (DTWR). We demonstrate that DWR significantly improves the blocking performance and achieves results as good as placing sparse WCs in the network.
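The constraint that makes this problem hard without WCs is wavelength continuity: a lightpath must use the same wavelength on every link of its route. The paper's LCLNR and DTWR algorithms are not reproduced in the abstract; the sketch below only illustrates the continuity check with a first-fit wavelength assignment, with data structures and names chosen for illustration.

    def first_fit_wavelength(route_links, free_wavelengths):
        """Return the lowest-indexed wavelength free on every link of the route,
        or None if the lightpath must be blocked (no wavelength converters assumed).

        route_links: list of link identifiers along the candidate path
        free_wavelengths: dict mapping link id -> set of free wavelength indices
        """
        # Wavelength-continuity constraint: the same wavelength must be free
        # on every link of the route, end to end.
        common = set.intersection(*(free_wavelengths[link] for link in route_links))
        return min(common) if common else None

    # Example: a 3-hop route where only wavelength 2 is free end to end.
    free = {"A-B": {0, 2, 3}, "B-C": {1, 2}, "C-D": {2, 3}}
    print(first_fit_wavelength(["A-B", "B-C", "C-D"], free))  # -> 2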
Okapi: Instruction-tuned Large Language Models in Multiple Languages with Reinforcement Learning from Human Feedback
A key technology for the development of large language models (LLMs) is instruction tuning, which helps align the models' responses with human expectations to realize impressive learning abilities. Two major approaches for instruction tuning are supervised fine-tuning (SFT) and reinforcement learning from human feedback (RLHF), which are currently applied to produce the best commercial LLMs (e.g., ChatGPT). To improve the accessibility of LLMs for research and development efforts, various instruction-tuned open-source LLMs have also been introduced recently, e.g., Alpaca and Vicuna, to name a few. However, existing open-source LLMs have only been instruction-tuned for English and a few popular languages, thus hindering their impact and accessibility for many other languages in the world. Among the few very recent works exploring instruction tuning for LLMs in multiple languages, SFT has been used as the only approach. This has left a significant gap for RLHF-based fine-tuned LLMs in diverse languages and raised important questions about how RLHF can boost the performance of multilingual instruction tuning. To overcome this issue, we present Okapi, the first system with instruction-tuned LLMs based on RLHF for multiple languages. Okapi introduces instruction and response-ranked data in 26 diverse languages to facilitate the experiments and development of future multilingual LLM research. We also present benchmark datasets to enable the evaluation of generative LLMs in multiple languages. Our experiments demonstrate the advantages of RLHF for multilingual instruction tuning over SFT across different base models and datasets. Our framework and resources are released at https://github.com/nlp-uoregon/Okapi
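The response-ranked data mentioned above is the ingredient that standard RLHF pipelines use to train a reward model before the RL fine-tuning stage. As a minimal sketch (not necessarily Okapi's exact implementation; reward_model is a placeholder scoring network), the usual pairwise ranking loss looks like this:

    import torch
    import torch.nn.functional as F

    def reward_ranking_loss(reward_model, chosen_batch, rejected_batch):
        """Standard pairwise RLHF reward-model loss: the reward of the preferred
        (higher-ranked) response should exceed that of the rejected one."""
        r_chosen = reward_model(chosen_batch)      # shape: (batch,)
        r_rejected = reward_model(rejected_batch)  # shape: (batch,)
        # -log sigmoid(r_chosen - r_rejected), averaged over the batch
        return -F.logsigmoid(r_chosen - r_rejected).mean()

The trained reward model then scores the policy's generations during RL (e.g., PPO-style) fine-tuning, which is how ranked multilingual responses translate into improved instruction following.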
CulturaX: A Cleaned, Enormous, and Multilingual Dataset for Large Language Models in 167 Languages
The driving factors behind the development of large language models (LLMs) with impressive learning capabilities are their colossal model sizes and extensive training datasets. Along with the progress in natural language processing, LLMs have frequently been made accessible to the public to foster deeper investigation and applications. However, the training datasets for these LLMs, especially the recent state-of-the-art models, are often not fully disclosed. Creating training data for high-performing LLMs involves extensive cleaning and deduplication to ensure the necessary level of quality. This lack of transparency around training data has hampered research on attributing and addressing hallucination and bias issues in LLMs, hindering replication efforts and further advancements in the community. These challenges become even more pronounced in multilingual learning scenarios, where the available multilingual text datasets are often inadequately collected and cleaned. Consequently, there is a lack of open-source and readily usable datasets to effectively train LLMs in multiple languages. To overcome this issue, we present CulturaX, a substantial multilingual dataset with 6.3 trillion tokens in 167 languages, tailored for LLM development. Our dataset undergoes meticulous cleaning and deduplication through a rigorous multi-stage pipeline to achieve the best quality for model training, including language identification, URL-based filtering, metric-based cleaning, document refinement, and data deduplication. CulturaX is fully released to the public on HuggingFace to facilitate research and advancements in multilingual LLMs: https://huggingface.co/datasets/uonlp/CulturaX
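As a rough illustration of how such a corpus might be consumed and further filtered, the sketch below streams one language split with the HuggingFace datasets library and applies a toy metric-based filter in the spirit of the pipeline stages listed above. It assumes per-language configurations named by language code and a text field per record, and the thresholds are illustrative, not those used to build CulturaX.

    from datasets import load_dataset

    # Stream one language split without downloading the whole corpus up front
    # (assumes per-language configs named by language code, e.g. "vi", "en").
    stream = load_dataset("uonlp/CulturaX", "vi", split="train", streaming=True)

    def keep_document(doc, min_words=50, max_non_alpha_ratio=0.3):
        """Toy metric-based filter; thresholds are illustrative only."""
        text = doc["text"]
        non_alpha = sum(1 for ch in text if not (ch.isalpha() or ch.isspace()))
        return (len(text.split()) >= min_words
                and non_alpha / max(len(text), 1) <= max_non_alpha_ratio)

    for doc in stream.take(1000):
        if keep_document(doc):
            ...  # pass the document on to tokenization / LLM training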