126 research outputs found

Tamil-Llama: A New Tamil Language Model Based on Llama 2

Author: Balachandran Abhinand
Publication date: 09/11/2023

Language modeling has witnessed remarkable advancements in recent years, with Large Language Models (LLMs) like ChatGPT setting unparalleled benchmarks in human-like text generation. However, a prevailing limitation is the underrepresentation of languages like Tamil in these cutting-edge models, leading to suboptimal performance in diverse linguistic contexts. This paper addresses this lacuna, enhancing the open-source LLaMA model with an addition of 16,000 Tamil tokens, aiming to achieve superior text generation and comprehension in the Tamil language. We strategically employ the LoRA methodology for efficient model training on a comprehensive Tamil corpus, ensuring computational feasibility and model robustness. Moreover, we introduce a Tamil-translated version of the Alpaca dataset and a subset of the OpenOrca dataset tailored for instruction fine-tuning. Our results showcase significant performance improvements in Tamil text generation, with potential implications for the broader landscape of LLMs in Indian languages. We further underscore our commitment to open research by making our models, datasets, and code publicly accessible, fostering further innovations in language modeling.

Comment: 19 pages, 10 figures

arXiv.org e-Print Archive

Emerging Evaluation Paradigms in Natural Language Understanding: A Case Study in Machine Reading Comprehension

Author: Schlegel Viktor
Publication date: 31/12/2021

The University of Manchester - Institutional Repository

Deep Understanding of Technical Documents : Automated Generation of Pseudocode from Digital Diagrams & Analysis/Synthesis of Mathematical Formulas

Author: Gkorgkolis Nikolaos
Publication venue: CORE Scholar
Publication date: 01/01/2022

The technical document is an entity that consists of several essential and interconnected parts, often referred to as modalities. Despite the extensive attention that certain parts have already received, such as the textual information, there are several aspects that are severely under-researched. Two such modalities are the utility of diagram images and the deep automated understanding of mathematical formulas. Inspired by existing holistic approaches to the deep understanding of technical documents, we develop a novel formal scheme for the modelling of digital diagram images. This extends to a generative framework that allows for the creation of artificial images and their annotation. We contribute to the field with the creation of a novel synthetic dataset and its generation mechanism. We propose the conversion of the pseudocode generation problem to an image captioning task and provide a family of techniques based on adaptive image partitioning. We address the mathematical formulas’ semantic understanding by conducting an evaluative survey of the field, published in May 2021. We then propose a formal synthesis framework that utilizes formula graphs as metadata, reaching for novel valuable formulas. The synthesis framework is validated by a deep geometric learning mechanism that outsources formula data to simulate the missing a priori knowledge. We close with the proof of concept, the description of the overall pipeline, and our future aims.

CORE

Semantic Systems. The Power of AI and Knowledge Graphs

Publication venue: Springer Science and Business Media LLC

This open access book constitutes the refereed proceedings of the 15th International Conference on Semantic Systems, SEMANTiCS 2019, held in Karlsruhe, Germany, in September 2019. The 20 full papers and 8 short papers presented in this volume were carefully reviewed and selected from 88 submissions. They cover topics such as: web semantics and linked (open) data; machine learning and deep learning techniques; semantic information management and knowledge integration; terminology, thesaurus and ontology management; data mining and knowledge discovery; semantics in blockchain and distributed ledger technologies.

OAPEN Library

Transformer Neural Networks for Automated Story Generation

Author: Araz Kemal
Publication venue: Dublin Institute of Technology
Publication date: 01/01/2020

Over the last two decades, Artificial Intelligence (AI) has proved its use on tasks such as image recognition, natural language processing, and automated driving. As described by Moore’s law, computational power has increased rapidly over the past few decades (Moore, 1965), making it possible to use techniques that were previously too computationally expensive. These techniques, which include Deep Learning (DL), changed the field of AI and outperformed other models in many fields, some of which are mentioned above. However, natural language generation, especially for creative tasks, requires artificial intelligence models to have not only a precise understanding of the given input but also the ability to be creative, fluent, and coherent within the content. One of these tasks is automated story generation, which has been an open research area since the early days of artificial intelligence. This study investigates whether the transformer network can outperform the state-of-the-art model for automated story generation. A large dataset was gathered from Reddit’s WRITING PROMPTS subforum and processed by the transformer network in order to compare the transformer network and the state-of-the-art model on perplexity and two human evaluation metrics. It was found that the transformer network cannot outperform the state-of-the-art model and that, even though it generated viable and novel stories, it did not pay much attention to the prompts of the generated stories. The results also implied that a better automated evaluation metric is needed to assess the performance of story generation models.

Arrow@TUDublin

Deep Neural Networks for Visual Bridge Inspections and Defect Visualisation in Civil Engineering

Author: Bush Julia, Corradi Tadeo, Ninic Jelena, Thermou Georgia
Publication venue: Universitätsverlag der Technischen Universität Berlin
Publication date: 03/01/2022

University of Birmingham Research Portal

EG-ICE 2021 Workshop on Intelligent Computing in Engineering

Publication date: 03/01/2022

The 28th EG-ICE International Workshop 2021 brings together international experts working at the interface between advanced computing and modern engineering challenges. Many engineering tasks require open-world resolutions to support multi-actor collaboration, coping with approximate models, providing effective engineer-computer interaction, searching in multi-dimensional solution spaces, accommodating uncertainty, including specialist domain knowledge, performing sensor-data interpretation, and dealing with incomplete knowledge. While results from computer science provide much initial support for resolution, adaptation is unavoidable and, most importantly, feedback from addressing engineering challenges drives fundamental computer-science research. Competence and knowledge transfer goes both ways.

Directory of Open Access Books (DOAB)

SENTIMENT AND BEHAVIORAL ANALYSIS IN EDISCOVERY

Author: Krishnan Sundar
Publication date: 24/08/2022

A suspect or person-of-interest during legal case review or forensic evidence review can exhibit signs of their individual personality through the digital evidence collected for the case. Such personality traits of interest can be analytically harvested for case investigators or case reviewers. However, manual review of evidence for such flags can take time and contribute to increased costs. This study focuses on certain use-case scenarios of behavior and sentiment analysis as a critical requirement for a legal case’s success. This study aims to quicken the review and analysis phase and offers a software prototype as a proof-of-concept. The study starts with building and storing Electronically Stored Information (ESI) datasets for three separate fictitious legal cases using publicly available data such as emails, Facebook posts, tweets, text messages, and a few custom MS Word documents. The next step of this study leverages statistical algorithms and automation to propose approaches towards identifying human sentiments and behavior, such as evidence of financial fraud behavior and evidence of sexual harassment behavior of a suspect or person-of-interest, from the case ESI. The last stage of the study automates these approaches via custom software and presents a user interface for eDiscovery teams and digital forensic investigators.

Scholarly Works @ SHSU (Sam Houston State University)

EG-ICE 2021 Workshop on Intelligent Computing in Engineering

OAPEN Library