Enriching Biomedical Knowledge for Vietnamese Low-resource Language Through Large-Scale Translation

Author: Chau Lam D.
Dang Tai
Phan Long
Phan Vy
Tran Hieu
Trinh Trieu H.
Publication date: 26/10/2022

Biomedical data and benchmarks are highly valuable yet very limited in low-resource languages such as Vietnamese. In this paper, we use a state-of-the-art English-Vietnamese translation model to translate and produce both pretraining and supervised data in the biomedical domain. Thanks to such large-scale translation, we introduce ViPubmedT5, a pretrained Encoder-Decoder Transformer model trained on 20 million translated abstracts from the high-quality public PubMed corpus. ViPubmedT5 demonstrates state-of-the-art results on two different biomedical benchmarks in summarization and acronym disambiguation. Further, we release ViMedNLI, a new NLP task in Vietnamese translated from MedNLI using the recently released En-Vi translation model and carefully refined by human experts, with evaluations of existing methods against ViPubmedT5.

arXiv.org e-Print Archive

Text segmentation techniques: A critical review

Author: Pak Irina *
Teh Phoey Lee *
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 22/11/2017

Text segmentation is widely used for processing text. It is a method of splitting a document into smaller parts, which are usually called segments. Each segment carries its own relevant meaning. Segments are categorized as words, sentences, topics, phrases or any other information unit, depending on the task of the text analysis. This study presents various reasons for using text segmentation in different analysis approaches. We categorized the types of documents and languages used. The main contribution of this study includes a summary of 50 research papers and an illustration of the past decade (January 2007 to January 2017) of research that applied text segmentation as the main approach for analysing text. Results revealed the popularity of using text segmentation in different languages. In addition, the “word” seems to be the most practical and usable segment, as it is a smaller unit than the phrase, sentence or line.

Sunway Institutional Repository

Developing collaborative partnerships with culturally and linguistically diverse families during the IEP process

Author: Ferguson D.
Newman L.
Ruiz Soto A. G.
Ryndak D.L.
Sauer J.
Trainor A. A.
Turnbull A.
Wolfe K.
Publication venue: 'SAGE Publications'
Publication date: 01/08/2017

Family participation in the special education process has been federally mandated for 40 years, and educators recognize that effective collaboration with their students’ families leads to improved academic and social outcomes for students. However, while some family-school relationships are positive and collaborative, many are not, particularly for culturally and linguistically diverse (CLD) families. This article provides research-based practice guidelines for teachers who seek to improve their practices when working with CLD families who have children served by special education.

Crossref

Boston University Institutional Repository (OpenBU)

Intimate Partner Violence in Immigrant and Refugee Communities: Challenges, Promising Practices and Recommendations

Author: Michael Runner
Mieko Yoshihama
Steve Novick
Publication venue: Robert Wood Johnson Foundation
Publication date: 03/03/2009

Reviews research on intimate partner violence in immigrant and refugee communities and examines victims' needs, challenges for agencies, and promising practices for prevention. Makes recommendations for funders, service providers, and policy makers.

IssueLab

Special Libraries, April 1962

Author: Special Libraries Association
Publication venue: SJSU ScholarWorks
Publication date: 01/04/1962

Volume 53, Issue 4

SJSU ScholarWorks

The Case for Developing and Deploying an Open Source Electronic Logistics Management Information System

Publication venue: Program for Appropriate Technology in Health (PATH)
Publication date: 12/12/2011

Summarizes efforts to strengthen health information systems in low- and lower-middle-income countries, including development of common requirements. Outlines models for collaboration among stakeholders, national leaders, and health information users.

IssueLab

The Hmong Medical Corpus: a biomedical corpus for a minority language

Author: White Nathan
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2022

Biomedical communication is an area that increasingly benefits from natural language processing (NLP) work. Biomedical named entity recognition (NER) in particular provides a foundation for advanced NLP applications, such as automated medical question-answering and translation services. However, while a large body of biomedical documents is available in an array of languages, most work in biomedical NER remains in English, with the remainder in official national or regional languages. Minority languages so far remain an underexplored area. The Hmong language, a minority language with sizable populations in several countries and without official status anywhere, represents an exceptional challenge for effective communication in medical contexts. Taking advantage of the large number of government-produced medical information documents in Hmong, we have developed the first named entity-annotated biomedical corpus for a resource-poor minority language. The Hmong Medical Corpus contains 100,535 tokens with 4,554 named entities (NEs) of three UMLS semantic types: diseases/syndromes, signs/symptoms, and body parts/organs/organ components. Furthermore, a subset of the corpus is annotated for word position and parts of speech, representing the first such gold-standard dataset publicly available for Hmong. The methodology presented provides a readily reproducible approach for the creation of biomedical NE-annotated corpora for other resource-poor languages.

ResearchOnline at James Cook University