412,710 research outputs found
Position-Enhanced Visual Instruction Tuning for Multimodal Large Language Models
Recently, Multimodal Large Language Models (MLLMs) that enable Large Language
Models (LLMs) to interpret images through visual instruction tuning have
achieved significant success. However, existing visual instruction tuning
methods only utilize image-language instruction data to align the language and
image modalities, lacking a more fine-grained cross-modal alignment. In this
paper, we propose Position-enhanced Visual Instruction Tuning (PVIT), which
extends the functionality of MLLMs by integrating an additional region-level
vision encoder. This integration promotes a more detailed comprehension of
images for the MLLM. In addition, to efficiently achieve a fine-grained
alignment between the vision modules and the LLM, we design multiple data
generation strategies to construct an image-region-language instruction
dataset. Finally, we present both quantitative experiments and qualitative
analysis that demonstrate the superiority of the proposed model. Code and data
will be released at https://github.com/PVIT-official/PVIT
Primary students' spatial visualization and spatial orientation: an evidence base for instruction
This paper reports on the performance of 58 students aged 11 to 12 on a spatial visualization task and a spatial orientation task. The students completed these tasks and explained their thinking during individual interviews. The qualitative data were analysed to inform pedagogical content knowledge for spatial activities. The study revealed that "matching" or "matching and eliminating" were the typical strategies that students employed on these spatial tasks. However, errors in making associations between parts of the same or different shapes were noted. Students also experienced general difficulties with visual memory and with using language to explain their thinking. The students' specific difficulties in spatial visualization related to obscured items, the perspective used, and the placement and orientation of shapes.
SVIQUEL: A Spatial Visual Query and Exploration Language
The need to analyze and query spatial data is becoming increasingly important with the advent of applications such as Geographic Information Systems, Image Databases, and Remote Sensing. The focus of our research is to support spatial data analysis by developing a direct manipulation environment to visually query as well as browse spatial data and to review the visual results for trend analysis. In this paper, we present a visual query language (SVIQUEL), which allows us to specify the relative spatial position (both topology and direction) between objects using direct manipulation. This query language builds upon the notion of dynamic query filters and significantly extends them to support integrated querying of both topological and directional types of spatial data. In order to facilitate continuous querying, as required by a direct manipulation environment, we designed an integrated neighborhood model for both kinds of spatial relationships (topology and direction). Our spatial query palette, SVIQUEL, allows us to query over any of the continuous sets of neighboring values. SVIQUEL is complemented by a Spatial Query Disambiguation diagram (SQUAD), which gives qualitative visual representations of the quantitative query. This increases the utility of the system for spatial browsing of data with no particular query in mind. Mapping functions between the quantitative SVIQUEL and the qualitative SQUAD have been developed. The resulting tight coupling between SVIQUEL and SQUAD allows users to work with either qualitative query specifications or at a quantitative level of detail, depending on their particular needs, as well as to freely switch between the two while working in a continuous data exploration mode.
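To make the two kinds of spatial relationship named in the abstract above concrete, the following is a minimal Python sketch that classifies the topological and directional relation between two axis-aligned boxes. It is an illustration only, not the SVIQUEL neighborhood model: the `Box` type, the function names `topology` and `direction`, the relation labels, and the use of box centers for direction are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Hypothetical axis-aligned box; y grows toward north. Assumes non-zero area."""
    xmin: float
    ymin: float
    xmax: float
    ymax: float


def topology(a: Box, b: Box) -> str:
    """Coarse topological relation of box a with respect to box b."""
    # No shared point at all.
    if a.xmax < b.xmin or b.xmax < a.xmin or a.ymax < b.ymin or b.ymax < a.ymin:
        return "disjoint"
    if a == b:
        return "equal"
    # Boundaries meet but interiors do not overlap.
    if a.xmax == b.xmin or b.xmax == a.xmin or a.ymax == b.ymin or b.ymax == a.ymin:
        return "touch"
    if a.xmin >= b.xmin and a.xmax <= b.xmax and a.ymin >= b.ymin and a.ymax <= b.ymax:
        return "inside"
    if b.xmin >= a.xmin and b.xmax <= a.xmax and b.ymin >= a.ymin and b.ymax <= a.ymax:
        return "contains"
    return "overlap"


def direction(a: Box, b: Box) -> str:
    """Direction of a relative to b, judged from the box centers (an assumption)."""
    ax, ay = (a.xmin + a.xmax) / 2, (a.ymin + a.ymax) / 2
    bx, by = (b.xmin + b.xmax) / 2, (b.ymin + b.ymax) / 2
    ns = "north" if ay > by else "south" if ay < by else ""
    ew = "east" if ax > bx else "west" if ax < bx else ""
    return (ns + ew) or "same"


if __name__ == "__main__":
    a, b = Box(0, 0, 1, 1), Box(2, 2, 3, 3)
    print(topology(a, b), direction(a, b))
```

A query such as "objects that touch and lie to the west of the reference object" would then combine both predicates; how SVIQUEL itself orders neighboring relations is not specified in the abstract and is not modeled here.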
Visualising text-based data: Identifying the potential of visual knowledge production through design practice
An increase in the availability of digitised data coupled with the development of digital tools has enabled humanities scholars to visualise data in ways that were previously difficult, if not impossible. While digitisation has led to an increase in the use of methods that chart, graph and map text-based data, opportunities for visual methods that are non-aggregative remain underdeveloped. In this paper we use "Writing Rights", a collaborative project between design and humanities scholars that examines the process of writing the "Déclaration des Droits de l'Homme et du Citoyen" (1789), to explore this issue. Through a series of visual experiments we discuss how the production of knowledge is enacted textually, within the written language, and graphically with the visual arrangement of the text. We argue that by drawing on the domain expertise of design, with its commitment to the semantic potential of the visual, practices that more wholly account for the qualitative nature of humanities data can be developed.
ANALYSIS OF THE USE OF AUDIO-VISUAL LEARNING MEDIA TO IMPROVE THE INDONESIAN LANGUAGE LEARNING OUTCOMES OF ELEMENTARY SCHOOL STUDENTS
This study aims to evaluate the effect of using audio-visual learning media on student learning outcomes in Indonesian language learning at SD Negeri 1 Maron. The Indonesian language has an important role in daily life in Indonesia, so good Indonesian language skills are needed, especially for elementary school students. Audio-visual learning media are considered a tool that can help students understand learning materials more easily and enjoyably. However, many teachers still do not use these media optimally in the learning process. Therefore, this study was conducted to determine the extent to which the use of audio-visual learning media can improve the Indonesian language learning outcomes of elementary school students. The research was qualitative and used the case study method. It involved homeroom teachers and 4th grade students of SD Negeri 1 Maron as research subjects. Data were collected through interviews, observations, and documentation. The results showed that the use of audio-visual media in learning can help students understand the learning material better and attract students' attention. Teachers also stated that the use of audio-visual media can increase students' interest in learning Indonesian. Thus, this research contributes to improving the quality of Indonesian language education in primary schools through the use of audio-visual media.
Valley: Video Assistant with Large Language model Enhanced abilitY
Large language models (LLMs), with their remarkable conversational
capabilities, have demonstrated impressive performance across various
applications and have emerged as formidable AI assistants. In view of this, it
raises an intuitive question: Can we harness the power of LLMs to build
multimodal AI assistants for visual applications? Recently, several multi-modal
models have been developed for this purpose. They typically pre-train an
adaptation module to align the semantics of the vision encoder and language
model, followed by fine-tuning on instruction-following data. However, despite
the success of this pipeline in image and language understanding, its
effectiveness in joint video and language understanding has not been widely
explored. In this paper, we aim to develop a novel multi-modal foundation model
capable of comprehending video, image, and language within a general framework.
To achieve this goal, we introduce Valley, a Video Assistant with Large
Language model Enhanced abilitY. Valley consists of an LLM, a temporal
modeling module, a visual encoder, and a simple projection module designed to
bridge visual and textual modes. To empower Valley with video comprehension and
instruction-following capabilities, we construct a video instruction dataset
and adopt a two-stage tuning procedure to train it. Specifically, we employ
ChatGPT to facilitate the construction of task-oriented conversation data
encompassing various tasks, including multi-shot captions, long video
descriptions, action recognition, causal relationship inference, etc.
Subsequently, we adopt a pre-training-then-instructions-tuned pipeline to align
visual and textual modalities and improve the instruction-following capability
of Valley. Qualitative experiments demonstrate that Valley has the potential to
function as a highly effective video assistant that can make complex video
understanding scenarios easy.
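The component chain described in the abstract above (per-frame visual features, a temporal modeling module, and a projection into the language model's embedding space) can be sketched in a few lines of plain Python. This is a minimal sketch under stated assumptions, not Valley's implementation: mean pooling stands in for the temporal module, the projection is a single linear layer, and the names `temporal_mean_pool` and `project` are hypothetical.

```python
from typing import List

Vector = List[float]


def temporal_mean_pool(frame_features: List[Vector]) -> Vector:
    """Collapse per-frame feature vectors into one video-level vector.

    Mean pooling is an assumption for this sketch; the abstract does not
    say how the temporal modeling module aggregates frames.
    """
    if not frame_features:
        raise ValueError("need at least one frame")
    n = len(frame_features)
    dim = len(frame_features[0])
    return [sum(frame[i] for frame in frame_features) / n for i in range(dim)]


def project(feature: Vector, weight: List[Vector], bias: Vector) -> Vector:
    """Linear projection; weight has shape (out_dim, in_dim)."""
    return [
        sum(w * x for w, x in zip(row, feature)) + b
        for row, b in zip(weight, bias)
    ]


if __name__ == "__main__":
    # Two frames with 2-dimensional features (toy values).
    frames = [[1.0, 2.0], [3.0, 4.0]]
    video_feature = temporal_mean_pool(frames)
    # Project from 2 visual dimensions to 3 "text embedding" dimensions.
    weight = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    bias = [0.0, 0.0, 1.0]
    print(project(video_feature, weight, bias))
```

In a real system the projected vectors would be fed to the LLM alongside text token embeddings; the toy dimensions here are chosen only to keep the arithmetic easy to follow.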
Learning Style Profile of Japanese Language Study Program Students
The purpose of this study was to determine the learning style profile of students of the STBA YAPARI-ABA Japanese Language Studies Program in Bandung. The method used is a qualitative descriptive method, with observation and questionnaires as data collection instruments. The respondents were 100 Japanese language study program students from 4 levels of study, selected by random sampling. The result of this study is a mapping of student learning styles from level 1 to level 4. Although a tendency in learning style can be observed in each dimension, in several dimensions respondents showing no tendency form the dominant group. Viewed by study level, the profile combinations differ. Level 1 has the learning style profile sensing - visual - reflective - sequential - introverted - inductive, level 2 has the profile sensing - visual - active - global - extroverted - inductive, while levels 3 and 4 share the same profile, namely sensing - visual - reflective - global - introverted - deductive. In conclusion, the learning style profile of students of the STBA YAPARI-ABA Bandung Japanese Language Study Program follows a sensing - visual - reflective - global - introverted - inductive pattern.
Keywords: Learning Style, Japanese Language Students, Profile
Evidence-Based Dialogue Maps as a research tool to evaluate the quality of school pupils' scientific argumentation
This pilot study focuses on the potential of Evidence-based Dialogue Mapping as a participatory action research tool to investigate young teenagers' scientific argumentation. Evidence-based Dialogue Mapping is a technique for representing graphically an argumentative dialogue through Questions, Ideas, Pros, Cons and Data. Our research objective is to better understand the usage of Compendium, a Dialogue Mapping software tool, as both (1) a learning strategy to scaffold school pupils' argumentation and (2) a method to investigate the quality of their argumentative essays. The participants were a science teacher-researcher, a knowledge mapping researcher and 20 pupils, 12-13 years old, in a summer science course for "gifted and talented" children in the UK. This study draws on multiple data sources: discussion forum, science teacher-researcher's and pupils' Dialogue Maps, pupil essays, and reflective comments about the uses of mapping for writing. Through qualitative analysis of two case studies, we examine the role of Evidence-based Dialogue Maps as a mediating tool in scientific reasoning: as conceptual bridges for linking and making knowledge intelligible; as support for the linearisation task of generating a coherent document outline; as a reflective aid to rethinking reasoning in response to teacher feedback; and as a visual language for making arguments tangible via cartographic conventions.
CREATING VISUAL ASSETS OF KOREAN VOCABULARY CARD GAME NOLJA
The Korean vocabulary card game Nolja is designed to support Korean vocabulary learning in an interactive way and is aimed at women aged 15-25 years. However, as a new game, Nolja has serious design problems: the placement of illustrations on the player cards makes the game finish too quickly, the cards have no tools to assess language proficiency, the typeface on the cards is not legible enough, and the strokes in the illustrations are too thin. Therefore, it is necessary to design visual assets for the Nolja vocabulary card game. Visual asset design for Nolja includes game theme design, illustration, the educational game system, layout, and content arrangement. This design aims to produce a Nolja card game that is interactive, educational, suitable for women aged 15-25 years who are motivated to learn, and able to convey Korean language subject matter well. The design is carried out with qualitative methods: interviews with experts and extreme users in games and Korean language learning, and a survey distributed to 100 members of the target market. It is also supported by literature studies of books, journals, and articles from the Internet. The design produces visual assets for the Nolja vocabulary card game with a stronger educational game system and a design style that matches the target market's wishes, so that it can be well received by the market.
Keywords: Korean language, Visual learning media, Flashcards, Picture card games, Visual assets