259,205 research outputs found
Halo: Estimation and Reduction of Hallucinations in Open-Source Weak Large Language Models
Large Language Models (LLMs) have revolutionized Natural Language Processing
(NLP). Although convenient for research and practical applications, open-source
LLMs with fewer parameters often suffer from severe hallucinations compared to
their larger counterparts. This paper focuses on measuring and reducing
hallucinations in BLOOM 7B, a representative of such weaker open-source LLMs
that are publicly available for research and commercial applications. We
introduce HaloCheck, a lightweight BlackBox knowledge-free framework designed
to quantify the severity of hallucinations in LLMs. Additionally, we explore
techniques like knowledge injection and teacher-student approaches to alleviate
hallucinations in low-parameter LLMs. Our experiments effectively demonstrate
the reduction of hallucinations in challenging domains for these LLMs
An overview of computer-based natural language processing
Computer based Natural Language Processing (NLP) is the key to enabling humans and their computer based creations to interact with machines in natural language (like English, Japanese, German, etc., in contrast to formal computer languages). The doors that such an achievement can open have made this a major research area in Artificial Intelligence and Computational Linguistics. Commercial natural language interfaces to computers have recently entered the market and future looks bright for other applications as well. This report reviews the basic approaches to such systems, the techniques utilized, applications, the state of the art of the technology, issues and research requirements, the major participants and finally, future trends and expectations. It is anticipated that this report will prove useful to engineering and research managers, potential users, and others who will be affected by this field as it unfolds
A Breadth of NLP Applications
The Center for Natural Language Processing (CNLP) was founded in September 1999 in the School of Information Studies, the “Original Information School”, at Syracuse University. CNLP’s mission is to advance the development of human-like, language understanding software capabilities for government, commercial, and consumer applications. The Center conducts both basic and applied research, building on its recognized capabilities in Natural Language Processing. The Center’s seventeen employees are a mix of doctoral students in information science or computer engineering, software engineers, linguistic analysts, and research engineers
Pathfinding in Games
Commercial games can be an excellent testbed to artificial intelligence (AI) research, being a middle ground between synthetic, highly abstracted academic benchmarks, and more intricate problems from real life. Among the many AI techniques and problems relevant to games, such as learning, planning, and natural language processing, pathfinding stands out as one of the most common applications of AI research to games. In this document we survey recent work in pathfinding in games. Then we identify some challenges and potential directions for future work. This chapter summarizes the discussions held in the pathfinding workgroup
Are fairness metric scores enough to assess discrimination biases in machine learning?
This paper presents novel experiments shedding light on the shortcomings of
current metrics for assessing biases of gender discrimination made by machine
learning algorithms on textual data. We focus on the Bios dataset, and our
learning task is to predict the occupation of individuals, based on their
biography. Such prediction tasks are common in commercial Natural Language
Processing (NLP) applications such as automatic job recommendations. We address
an important limitation of theoretical discussions dealing with group-wise
fairness metrics: they focus on large datasets, although the norm in many
industrial NLP applications is to use small to reasonably large linguistic
datasets for which the main practical constraint is to get a good prediction
accuracy. We then question how reliable are different popular measures of bias
when the size of the training set is simply sufficient to learn reasonably
accurate predictions. Our experiments sample the Bios dataset and learn more
than 200 models on different sample sizes. This allows us to statistically
study our results and to confirm that common gender bias indices provide
diverging and sometimes unreliable results when applied to relatively small
training and test samples. This highlights the crucial importance of variance
calculations for providing sound results in this field.Comment: Accepted for publication at Third Workshop on Trustworthy Natural
Language Processing, ACL 202
- …