2023 Projects Day Booklet
Natural language processing
Beginning with the basic issues of NLP, this chapter aims to chart the major research activities in this area since the last ARIST chapter in 1996 (Haas, 1996), including: (i) natural language text processing systems - text summarization, information extraction, information retrieval, etc., including domain-specific applications; (ii) natural language interfaces; (iii) NLP in the context of the WWW and digital libraries; and (iv) evaluation of NLP systems
A mixed-method triangular approach to best practices in combating plagiarism and impersonation in online bachelor’s degree programs
This study examines the phenomenon of plagiarism and impersonation in online course assignments. Technological advancements, coupled with lower costs and greater accessibility, have made online courses and programs a practical option for higher-education students. Unfortunately, rising online enrollment and advancing technology have also increased students' opportunities to commit plagiarism and impersonation in online course assignments, potentially compromising the academic integrity of online degree programs. The study examines the various plagiarism and impersonation practices available to students. Through a systematic review of the literature, the researcher compiles a list of 20 best practices for combating plagiarism and impersonation in online course assignments. A Delphi method approach is then employed, drawing on the expertise of professors who teach in fully online bachelor's degree programs. The 20 best practices established through the literature review are narrowed to ten via an ordinal-ranking questionnaire administered in a two-round format. The questionnaires are distributed by e-mail; the researcher obtains the addresses by identifying professors who teach in fully online bachelor's degree programs. The first-round e-mail contains the consent form, the original set of 20 best practices, and a link to the Qualtrics ranking survey. The second-round e-mail contains the 15 best practices retained from the first round and a link to the ranking survey. After the second round, the ten best practices for reducing plagiarism and impersonation in online assignments are established. To further validate the ten best practices, the researcher interviews 10 professors who participated in the original Delphi study.
The original consent form includes a link for participants to access if they choose to take part in the interview. After verifying a professor's intent to participate, a consent form is obtained. The interviews are conducted and recorded virtually via Zoom, and the recordings are deleted once they are transcribed. By establishing these ten best practices for reducing plagiarism and impersonation in online assignments, the study can potentially benefit all online degree programs
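The two-round narrowing step above can be sketched as a simple rank aggregation: each expert submits an ordinal ranking, rank positions are summed per practice, and the lowest totals advance. This is a minimal illustrative sketch; the practice names and expert rankings are hypothetical, not the study's actual items.

```python
from collections import defaultdict

def aggregate_rankings(rankings, keep):
    """Sum each practice's rank positions across experts (lower = better)
    and keep the top `keep` practices for the next Delphi round."""
    totals = defaultdict(int)
    for expert_ranking in rankings:
        for position, practice in enumerate(expert_ranking, start=1):
            totals[practice] += position
    return sorted(totals, key=totals.get)[:keep]

# Three experts each rank four candidate practices, best to worst.
round_one = [
    ["proctoring", "plagiarism-checker", "oral-defense", "honor-code"],
    ["plagiarism-checker", "proctoring", "honor-code", "oral-defense"],
    ["proctoring", "oral-defense", "plagiarism-checker", "honor-code"],
]
shortlist = aggregate_rankings(round_one, keep=2)
```

In a real Delphi run, the `keep` parameter would step from 20 down to 15 and then to 10 across rounds, with the shortlist redistributed to the panel between rounds.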
Datasets for Large Language Models: A Comprehensive Survey
This paper embarks on an exploration into the Large Language Model (LLM)
datasets, which play a crucial role in the remarkable advancements of LLMs. The
datasets serve as the foundational infrastructure analogous to a root system
that sustains and nurtures the development of LLMs. Consequently, examination
of these datasets emerges as a critical topic in research. In order to address
the current lack of a comprehensive overview and thorough analysis of LLM
datasets, and to gain insights into their current status and future trends,
this survey consolidates and categorizes the fundamental aspects of LLM
datasets from five perspectives: (1) Pre-training Corpora; (2) Instruction
Fine-tuning Datasets; (3) Preference Datasets; (4) Evaluation Datasets; (5)
Traditional Natural Language Processing (NLP) Datasets. The survey sheds light
on the prevailing challenges and points out potential avenues for future
investigation. Additionally, a comprehensive review of the existing available
dataset resources is also provided, including statistics from 444 datasets,
covering 8 language categories and spanning 32 domains. Information from 20
dimensions is incorporated into the dataset statistics. The total data size
surveyed surpasses 774.5 TB for pre-training corpora and 700M instances for
other datasets. We aim to present the entire landscape of LLM text datasets,
serving as a comprehensive reference for researchers in this field and
contributing to future studies. Related resources are available at:
https://github.com/lmmlzn/Awesome-LLMs-Datasets.
Comment: 181 pages, 21 figures
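The survey's five-way categorization can be pictured as a simple grouping over a dataset catalog. This is an illustrative sketch only; the dataset names below are hypothetical placeholders, not entries from the survey.

```python
# The five perspectives used by the survey to organize LLM datasets.
DATASET_CATEGORIES = (
    "pre-training corpus",
    "instruction fine-tuning",
    "preference",
    "evaluation",
    "traditional NLP",
)

def categorize(catalog):
    """Group (name, category) dataset records by survey category."""
    grouped = {c: [] for c in DATASET_CATEGORIES}
    for name, category in catalog:
        grouped[category].append(name)
    return grouped

# Hypothetical catalog entries for illustration.
catalog = [
    ("commoncrawl-subset", "pre-training corpus"),
    ("instruct-pairs-v1", "instruction fine-tuning"),
    ("pairwise-prefs", "preference"),
]
by_category = categorize(catalog)
```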
A comparison of statistical machine learning methods in heartbeat detection and classification
In health care, patients with heart problems require quick responsiveness in a clinical setting or in the operating theatre. Toward that end, automated classification of heartbeats is vital, as some heartbeat irregularities are time-consuming to detect. Analysis of electrocardiogram (ECG) signals is therefore an active area of research. The methods proposed in the literature depend on the structure of a heartbeat cycle. In this paper, we use interval- and amplitude-based features, together with a few samples from the ECG signal, as a feature vector. We studied a variety of classification algorithms, focusing especially on a type of arrhythmia known as the ventricular ectopic beat (VEB). We compare the performance of the classifiers against algorithms proposed in the literature and make recommendations regarding features, sampling rate, and choice of classifier for a real-time clinical setting. The extensive study is based on the MIT-BIH arrhythmia database. Our main contributions are the evaluation of existing classifiers over a range of sampling rates, the recommendation of a detection methodology to employ in a practical setting, and the extension of the notion of a mixture of experts to a larger class of algorithms
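The idea of classifying beats from interval- and amplitude-based features can be sketched with a minimal nearest-centroid classifier. This is a hedged illustration, not the paper's actual pipeline, and the feature values below are synthetic placeholders rather than MIT-BIH data (a premature ventricular beat typically shows a short pre-beat RR interval and a compensatory long post-beat interval).

```python
def centroid(rows):
    """Per-class mean feature vector."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def classify(beat, centroids):
    """Assign the beat to the class whose centroid is nearest (Euclidean)."""
    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(centroids, key=lambda label: dist2(beat, centroids[label]))

# Each feature vector: [pre-RR interval (s), post-RR interval (s), R amplitude].
# Synthetic training examples for two beat classes.
train = {
    "normal": [[0.80, 0.80, 1.0], [0.82, 0.79, 1.1], [0.78, 0.81, 0.9]],
    "VEB":    [[0.50, 1.10, 1.4], [0.52, 1.08, 1.5], [0.48, 1.12, 1.3]],
}
centroids = {label: centroid(rows) for label, rows in train.items()}
label = classify([0.51, 1.09, 1.45], centroids)
```

A real system would use many more features and a stronger classifier, but the structure is the same: extract a fixed-length feature vector per beat, then score it against learned class models.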
A Closer Look into Recent Video-based Learning Research: A Comprehensive Review of Video Characteristics, Tools, Technologies, and Learning Effectiveness
People increasingly use videos on the Web as a source for learning. To
support this way of learning, researchers and developers are continuously
developing tools, proposing guidelines, analyzing data, and conducting
experiments. However, it is still not clear what characteristics a video should
have to be an effective learning medium. In this paper, we present a
comprehensive review of 257 articles on video-based learning for the period
from 2016 to 2021. One of the aims of the review is to identify the video
characteristics that have been explored by previous work. Based on our
analysis, we suggest a taxonomy which organizes the video characteristics and
contextual aspects into eight categories: (1) audio features, (2) visual
features, (3) textual features, (4) instructor behavior, (5) learner
activities, (6) interactive features (quizzes, etc.), (7) production style, and
(8) instructional design. Also, we identify four representative research
directions: (1) proposals of tools to support video-based learning, (2) studies
with controlled experiments, (3) data analysis studies, and (4) proposals of
design guidelines for learning videos. We find that the most explored
characteristics are textual features followed by visual features, learner
activities, and interactive features. Text of transcripts, video frames, and
images (figures and illustrations) are most frequently used by tools that
support learning through videos. The learner activity is heavily explored
through log files in data analysis studies, and interactive features have been
frequently scrutinized in controlled experiments. We complement our review by
contrasting research findings that investigate the impact of video
characteristics on the learning effectiveness, report on tasks and technologies
used to develop tools that support learning, and summarize trends in design
guidelines for producing learning videos
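The review's finding about the most explored characteristics amounts to a tally over the eight-category taxonomy. A minimal sketch, assuming each reviewed article is coded with the taxonomy categories it explores (the article codings below are hypothetical):

```python
from collections import Counter

# The review's eight-category taxonomy of video characteristics.
TAXONOMY = [
    "audio features", "visual features", "textual features",
    "instructor behavior", "learner activities",
    "interactive features", "production style", "instructional design",
]

def most_explored(articles):
    """Rank taxonomy categories by how many articles explore them."""
    counts = Counter()
    for categories in articles:
        counts.update(set(categories) & set(TAXONOMY))
    return [category for category, _ in counts.most_common()]

# Hypothetical codings of three reviewed articles.
articles = [
    ["textual features", "visual features"],
    ["textual features", "learner activities"],
    ["interactive features", "textual features"],
]
ranking = most_explored(articles)
```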