Text-to-picture tools, systems, and approaches: a survey
Text-to-picture systems attempt to facilitate high-level, user-friendly communication between humans and computers while promoting understanding of natural language. These systems interpret a natural language text and transform it into a visual format as pictures or images that are either static or dynamic. In this paper, we aim to identify current difficulties and the main problems faced by prior systems, and in particular, we seek to investigate the feasibility of automatic visualization of Arabic story text through multimedia. Hence, we analyzed a number of well-known text-to-picture systems, tools, and approaches. We showed their constituent steps, such as knowledge extraction, mapping, and image layout, as well as their performance and limitations. We also compared these systems based on a set of criteria, mainly natural language processing, natural language understanding, and input/output modalities. Our survey showed that currently emerging techniques in natural language processing tools and computer vision have made promising advances in analyzing general text and understanding images and videos. Furthermore, important remarks and findings have been deduced from these prior works, which would help in developing an effective text-to-picture system for learning and educational purposes. © 2019, The Author(s). This work was made possible by NPRP grant #10-0205-170346 from the Qatar National Research Fund (a member of Qatar Foundation). The statements made herein are solely the responsibility of the authors.
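The constituent steps named in this abstract (knowledge extraction, mapping, image layout) can be illustrated with a minimal sketch. All names here (the `ASSETS` index, file names, canvas size) are hypothetical placeholders, not part of any surveyed system:

```python
import re

# Hypothetical word-to-image index; a real system would use a large
# curated asset library or a generative model.
ASSETS = {
    "cat": "cat.png",
    "tree": "tree.png",
    "house": "house.png",
}

def extract_keywords(text):
    """Knowledge extraction: keep only words with a known visual asset."""
    words = re.findall(r"[a-z]+", text.lower())
    return [w for w in words if w in ASSETS]

def layout(keywords, canvas_width=300):
    """Image layout: place mapped images left-to-right on a canvas,
    returning (image_file, x, y) placements."""
    step = canvas_width // max(len(keywords), 1)
    return [(ASSETS[w], i * step, 0) for i, w in enumerate(keywords)]

placements = layout(extract_keywords("The cat sat near the tree."))
```

A real pipeline would replace the keyword lookup with semantic-role labeling and the fixed grid with spatial-relation reasoning, but the three-stage structure is the same.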
Sign language lexical recognition with Propositional Dynamic Logic
This paper explores the use of Propositional Dynamic Logic (PDL) as a suitable formal framework for describing Sign Language (SL), the language of deaf people, in the context of natural language processing. SLs are visual, complete, standalone languages which are just as expressive as oral languages. Signs in SL usually correspond to sequences of highly specific body postures interleaved with movements, which make reference to real-world objects, characters, or situations. Here we propose a formal representation of SL signs that will help us analyze automatically collected hand-tracking data from French Sign Language (FSL) video corpora. We further show how such a representation could help us with the design of computer-aided SL verification tools, which in turn would bring us closer to the development of an automatic recognition system for these languages.
Towards a Generation of Artificially Intelligent Strategy Tools: The SWOT Bot
Strategy tools are widely used to inform the complex and unstructured decision-making of firms. Although software has evolved to support strategy analysis, such digital strategy tools still require heavy manual work, especially at the data input and processing levels, making their use time-intensive, costly, and susceptible to biases. This design research presents the "SWOT Bot", a digital strategy tool that exploits recent advances in natural language processing (NLP) to perform a SWOT (strengths, weaknesses, opportunities, threats) analysis. Our artifact uses a feed reader, an NLP pipeline, and a visual interface to automatically extract information from a text corpus (e.g., analyst reports) and present it to the user. We argue that the SWOT Bot reduces time and adds objectivity to strategy analyses while allowing the human-in-the-loop to focus on value-adding tasks. Besides providing a functioning prototype, our work provides three general design principles for the development of next-generation digital strategy tools.
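The extraction stage of such a tool can be sketched in miniature. This is not the SWOT Bot's actual NLP pipeline; the cue lexicon and category names below are hypothetical, and a real system would use trained models rather than keyword matching:

```python
# Hypothetical cue lexicon standing in for a learned classifier.
CUES = {
    "strength": ["market leader", "strong brand", "patent"],
    "weakness": ["high cost", "debt", "turnover"],
    "opportunity": ["growing market", "new segment", "expansion"],
    "threat": ["competitor", "regulation", "lawsuit"],
}

def swot_bucket(sentence):
    """Assign a sentence to the first SWOT category whose cue it contains."""
    s = sentence.lower()
    for category, cues in CUES.items():
        if any(cue in s for cue in cues):
            return category
    return None

def analyze(corpus):
    """Group sentences from a text corpus (e.g., analyst reports) into
    the four SWOT buckets, leaving unmatched sentences out."""
    report = {k: [] for k in CUES}
    for sentence in corpus:
        bucket = swot_bucket(sentence)
        if bucket:
            report[bucket].append(sentence)
    return report
```

The human-in-the-loop role described in the abstract would then be reviewing and curating these automatically proposed buckets rather than reading the full corpus.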
DisKnow: a social-driven disaster support knowledge extraction system
This research is aimed at creating and presenting DisKnow, a data extraction system with the capability of filtering and abstracting tweets, to improve community resilience and decision-making in disaster scenarios. Nowadays most people act as human sensors, exposing detailed information regarding occurring disasters on social media. Through a pipeline of natural language processing (NLP) tools for text processing, convolutional neural networks (CNNs) for classifying and extracting disasters, and knowledge graphs (KGs) for presenting connected insights, it is possible to generate real-time visual information about such disasters and affected stakeholders, and to improve the crisis management process by disseminating that information to relevant authorities and the population alike. DisKnow has proved to be on par with state-of-the-art disaster extraction systems, and it contributes a way to easily manage and present such happenings.
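The three-stage pipeline described (text processing, classification, knowledge graph) can be sketched as follows. This is an illustrative stand-in, not DisKnow's implementation: the keyword classifier below replaces the CNN stage, and all names are hypothetical:

```python
import re

def preprocess(tweet):
    """Text-processing stage: lowercase and strip URLs and @-mentions."""
    tweet = re.sub(r"https?://\S+|@\w+", "", tweet.lower())
    return tweet.strip()

# Toy keyword classifier standing in for the CNN classification stage.
DISASTER_TERMS = {"flood", "earthquake", "wildfire", "hurricane"}

def classify(text):
    """Return the disaster type mentioned in the text, if any."""
    return next((t for t in DISASTER_TERMS if t in text), None)

def update_graph(graph, tweet, disaster, location):
    """Knowledge-graph stage: link disaster -> location -> reports."""
    graph.setdefault(disaster, {}).setdefault(location, []).append(tweet)
    return graph

graph = {}
raw = "@user Flood on Main Street, water rising fast https://t.co/x"
text = preprocess(raw)
kind = classify(text)
if kind:
    update_graph(graph, text, kind, "Main Street")
```

The nested-dict graph here is a simplification; a real KG would use typed nodes and edges so that downstream queries ("all reports near this location") stay efficient.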
Towards Collaborative Generative AI for Vision-and-Language Studies
In recent years, the field of vision-and-language studies has witnessed significant advancements, aiming to bridge the gap between visual perception and linguistic understanding. These studies have explored various approaches to enhance the capabilities of AI systems in generating natural language or visual content, understanding multimodal scenarios, and conducting commonsense reasoning. Despite these advancements, there remains a crucial need for further progress to enable more collaborative and comprehensive interactions between vision and language modalities. This dissertation addresses this need through three primary contributions. First, I introduce the concept of machine imagination for natural language processing studies. Specifically, I present the use of visual information generated by machines for the automatic evaluation of natural language generation and natural language understanding. Second, I explore the utilization of large language models (LLMs) to enhance the performance of vision and multimodal tasks. In particular, I examine the effectiveness of applying LLMs for prompt editing in text-to-image generation, compositional layout planning and generation, and vision-and-language navigation. Third, I outline my contributions to publicly available open-source vision-and-language research. Specifically, we introduce Multimodal C4, a large-scale multimodal dataset containing interleaved images and text, which we used to train the large-scale multimodal model OpenFlamingo. Additionally, we introduce VisIT-Bench, a public benchmark for evaluating instruction-following vision-language models in real-world applications. This dissertation aims to push the boundaries of vision-and-language integration, providing new insights and tools for developing more sophisticated AI systems capable of seamless multimodal interactions.
New Methods and Tools for the World Wide Web Search
Explosive growth of the World Wide Web, as well as its heterogeneity, calls for powerful and easy-to-use search tools capable of providing the user with a moderate number of relevant answers. This paper presents an analysis of key aspects of recently developed Web search methods and tools: visual representation of subject trees, interactive user interfaces, linguistic approaches, image search, ranking and grouping of search results, database search, and scientific information retrieval. Current trends in Web search include topics such as exploiting the Web's hyperlinking structure, natural language processing, software agents, the influence of the XML markup language on search efficiency, and WAP search engines.
Explainability of Vision Transformers: A Comprehensive Review and New Perspectives
Transformers have had a significant impact on natural language processing and have recently demonstrated their potential in computer vision, showing promising results over convolutional neural networks in fundamental computer vision tasks. However, the scientific community has not fully grasped the inner workings of vision transformers, nor the basis for their decision-making, which underscores the importance of explainability methods. Understanding how these models arrive at their decisions not only improves their performance but also builds trust in AI systems. This study explores different explainability methods proposed for vision transformers and presents a taxonomy for organizing them according to their motivations, structures, and application scenarios. In addition, it provides a comprehensive review of evaluation criteria that can be used for comparing explanation results, as well as explainability tools and frameworks. Finally, the paper highlights essential but unexplored aspects that can enhance the explainability of vision transformers, and promising directions are suggested for future investigation.
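One widely cited attention-based explainability technique for transformers, attention rollout, gives a flavor of the methods such a taxonomy covers. This is a generic sketch of the technique, not an implementation from the surveyed work: it folds residual connections into each layer's attention map and multiplies the maps across layers to estimate how much each input token influences each output position.

```python
def rollout(attentions):
    """Attention rollout: for each layer's attention matrix A, account
    for the residual connection with 0.5*A + 0.5*I, renormalize rows to
    keep them distributions, and multiply the results across layers."""
    n = len(attentions[0])
    result = [[float(i == j) for j in range(n)] for i in range(n)]  # identity
    for layer in attentions:
        mixed = [[0.5 * layer[i][j] + 0.5 * (i == j) for j in range(n)]
                 for i in range(n)]
        mixed = [[v / sum(row) for v in row] for row in mixed]
        # result = mixed @ result (plain-Python matrix product)
        result = [[sum(mixed[i][k] * result[k][j] for k in range(n))
                   for j in range(n)] for i in range(n)]
    return result
```

For a vision transformer, the resulting row for the [CLS] token can be reshaped over the image patches to produce a relevance heatmap.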
From ChatGPT-3 to GPT-4: A Significant Advancement in AI-Driven NLP Tools
Recent improvements in Natural Language Processing (NLP) have led to the creation of powerful language models such as the Chat Generative Pre-training Transformer (ChatGPT), Google's BARD, and Ernie, which have proven highly capable at many different language tasks. But as language tasks get more complicated, even more advanced NLP tools are essential. In this study, researchers look at how the latest versions of the GPT language model (GPT-4 and GPT-5) can help with these advancements. The research method for this paper is based on a narrative analysis of the literature, which makes use of secondary data gathered from previously published studies, including articles, websites, blogs, and visual and numerical facts. Findings of this study revealed that GPT-4 improves on the model's training data, computation speed, the quality of the answers it provides, and its overall performance. This study also shows that GPT-4 does much better than GPT-3.5 at translating languages, answering questions, and sentiment analysis. The study provides a solid basis for building even more advanced NLP tools and programmes such as GPT-5, and it will help AI and LLM researchers, NLP developers, and academicians explore this particular field of study further. As this is the first research of its kind comparing two NLP tools, the researchers suggest quantitative research in the near future to validate these findings.
GeoAnnotator: A Collaborative Semi-Automatic Platform for Constructing Geo-Annotated Text Corpora
Ground-truth datasets are essential for the training and evaluation of any automated algorithm. As such, gold-standard annotated corpora underlie most advances in natural language processing (NLP). However, only a few relatively small (geo-)annotated datasets are available for geoparsing, i.e., the automatic recognition and geolocation of place references in unstructured text. The creation of geoparsing corpora that include both the recognition of place names in text and the matching of those names to toponyms in a geographic gazetteer (a process we call geo-annotation) is a laborious, time-consuming, and expensive task. The field lacks efficient geo-annotation tools to support corpus building and lacks design guidelines for the development of such tools. Here, we present the iterative design of GeoAnnotator, a web-based, semi-automatic, and collaborative visual analytics platform for geo-annotation. GeoAnnotator facilitates collaborative, multi-annotator creation of large corpora of geo-annotated text by providing computationally generated pre-annotations that can be improved by human-annotator users. The resulting corpora can be used for improving and benchmarking geoparsing algorithms as well as various other spatial language-related methods. Further, the iterative design process and the resulting design decisions can be used in annotation platforms tailored for other application domains of NLP.
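The pre-annotation idea at the heart of this workflow can be sketched minimally. This is not GeoAnnotator's code; the mini-gazetteer and field names below are hypothetical, and a real system would query a service such as GeoNames and handle ambiguous toponyms:

```python
# Hypothetical mini-gazetteer: place name -> (lat, lon).
GAZETTEER = {
    "paris": (48.8566, 2.3522),
    "springfield": (39.7817, -89.6501),
}

def pre_annotate(text):
    """Propose (span, toponym, coordinates) pre-annotations for a text,
    flagged unconfirmed so a human annotator can accept or correct them."""
    annotations = []
    lowered = text.lower()
    for name, coords in GAZETTEER.items():
        start = lowered.find(name)
        if start != -1:
            annotations.append({"span": (start, start + len(name)),
                                "name": name, "coords": coords,
                                "confirmed": False})  # awaits review
    return annotations
```

The `confirmed` flag captures the semi-automatic division of labor: the machine proposes, the annotator disposes, and only confirmed annotations enter the gold-standard corpus.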