Crowdsourcing Multiple Choice Science Questions
We present a novel method for obtaining high-quality, domain-targeted
multiple choice questions from crowd workers. Generating these questions can be
difficult without trading away originality, relevance or diversity in the
answer options. Our method addresses these problems by leveraging a large
corpus of domain-specific text and a small set of existing questions. It
produces model suggestions for document selection and answer distractor choice,
which aid the human question generation process. With this method we have
assembled SciQ, a dataset of 13.7K multiple choice science exam questions
(Dataset available at http://allenai.org/data.html). We demonstrate that the
method produces in-domain questions by providing an analysis of this new
dataset and by showing that humans cannot distinguish the crowdsourced
questions from original questions. When using SciQ as additional training data
alongside existing questions, we observe accuracy improvements on real science exams.
Comment: accepted for the Workshop on Noisy User-generated Text (W-NUT) 201
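A crowdsourced item of this kind can be represented as a question plus a correct answer and several distractors. The sketch below is a minimal illustration, assuming hypothetical field names rather than the official SciQ schema:

```python
import random
from dataclasses import dataclass

@dataclass
class MCQuestion:
    """One multiple-choice science item (field names are illustrative,
    not the released SciQ schema)."""
    question: str
    correct_answer: str
    distractors: list

    def shuffled_options(self, seed: int = 0) -> list:
        """Return the answer options in a deterministic shuffled order."""
        options = [self.correct_answer] + list(self.distractors)
        random.Random(seed).shuffle(options)
        return options

    def is_correct(self, choice: str) -> bool:
        return choice == self.correct_answer

# Hypothetical example item in the style described by the abstract.
q = MCQuestion(
    question="What type of organism is commonly used in the preparation "
             "of foods such as cheese and yogurt?",
    correct_answer="mesophilic organisms",
    distractors=["protozoa", "gymnosperms", "viruses"],
)
opts = q.shuffled_options()
assert len(opts) == 4 and q.correct_answer in opts
```

Keeping the correct answer and distractors as separate fields, as above, is what lets the distractor-suggestion step described in the abstract plug into the pipeline independently of question writing.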
DramaQA: Character-Centered Video Story Understanding with Hierarchical QA
Despite recent progress in computer vision and natural language processing,
video understanding intelligence remains hard to achieve due to the
intrinsic difficulty of understanding stories in video. Moreover, there is no
established metric for evaluating the degree of video understanding. In this
paper, we propose a novel video question answering (Video QA) task, DramaQA,
for comprehensive understanding of video stories. DramaQA focuses on two
perspectives: 1) hierarchical QAs as an evaluation metric based on the
cognitive developmental stages of human intelligence, and 2) character-centered
video annotations to model local coherence of the story. Our dataset is built
upon the TV drama "Another Miss Oh" and contains 16,191 QA pairs from 23,928
video clips of various lengths, with each QA pair belonging to one of four
difficulty levels. We provide 217,308 annotated images with rich
character-centered annotations, including visual bounding boxes, behaviors, and
emotions of main characters, as well as coreference-resolved scripts. Additionally,
we provide analyses of the dataset as well as a Dual Matching Multistream model
which effectively learns character-centered representations of video to answer
questions about the video. We plan to release our dataset and model
publicly for research purposes and expect that our work will provide a new
perspective on video story understanding research.
Comment: 21 pages, 10 figures, submitted to ECCV 202
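Because each QA pair carries one of four difficulty levels, a natural way to report results on such a benchmark is accuracy broken down per level. The sketch below assumes a hypothetical record layout (the keys and sample questions are illustrative, not the released DramaQA format):

```python
from collections import defaultdict

# Hypothetical per-item records for a hierarchical Video QA benchmark;
# field names and contents are illustrative only.
qa_items = [
    {"clip_id": "c001", "level": 1, "correct": True},
    {"clip_id": "c002", "level": 3, "correct": False},
    {"clip_id": "c003", "level": 1, "correct": True},
]

def accuracy_by_level(items):
    """Aggregate model accuracy per difficulty level (1-4 in DramaQA)."""
    totals = defaultdict(int)
    hits = defaultdict(int)
    for item in items:
        totals[item["level"]] += 1
        hits[item["level"]] += int(item["correct"])
    return {lvl: hits[lvl] / totals[lvl] for lvl in totals}

print(accuracy_by_level(qa_items))  # {1: 1.0, 3: 0.0}
```

Reporting per-level accuracy in this way is what lets a hierarchical QA set act as the evaluation metric the abstract describes: a model's scores across levels indicate which cognitive stage of story understanding it has reached.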
Multimodal meaning making with culturally responsive images: designing tasks for 6th-8th grade special education students
Thesis (M.A.) University of Alaska Fairbanks, 2019
The following study describes the patterns that emerged from collaborative tasks among middle school students within a special education intervention class in rural Alaska. The study integrated the multiliteracies pedagogy, as well as multimodalities and task-based language teaching. The tasks utilized culturally appropriate illustrations to promote collaborative discussion throughout a structured set of five tasks. The research aims to answer the following question: How do sixth through eighth grade students co-construct meaning when doing tasks that incorporate culturally appropriate images? Three students native to the community participated in this study over a two-month period. The tasks were designed around culturally relevant illustrations, allowing students to use their funds of knowledge as they collaborated to complete the tasks. The data collection included field notes, class artifacts, video and audio recordings, and student interviews. The data presented multimodal events in which students utilized their semiotic resources and funds of knowledge to make meaning during each task. The analysis revealed telling incidents of multimodal meaning-making moments where culturally relevant resources supported the application of funds of knowledge. The analysis also uncovered critical insights into the task design variables that can impact the final outcome and product of a task. As a result, I encourage the use of open-ended tasks addressing multimodal teaching to encourage culturally relevant meaning-making moments, particularly within special education settings.
Facilitating Students’ Higher Order Thinking Skills (HOTS) Through Three-Phase Reading Activity at EIS Class
The purpose of this research was to show how to facilitate students’ higher-order thinking skills (HOTS) in reading activities in an English for Islamic Studies class and to acknowledge students’ opinions on the activity. This was descriptive qualitative research, and its data were collected through observation and interviews. The data were then analyzed using interactive data analysis: data collection, data reduction, data display, and conclusion drawing. The findings showed that the way to facilitate students’ higher-order thinking skills (HOTS) is to propose not only lower-level questions but also higher-level questions. To prevent the overuse of either lower-level or higher-level questions, the lecturer employs the question-answer relationships (QAR) framework in each reading phase. The interviews with students indicated that the students thought the activity benefited them. In addition, the creative response questions in the post-reading phase enabled them to make full use of their minds (schemata), proposing solutions to the problems found in the passage without fear of making mistakes, as there is no wrong or right answer to such questions.
LAMM: Language-Assisted Multi-Modal Instruction-Tuning Dataset, Framework, and Benchmark
Large language models have become a potential pathway toward achieving
artificial general intelligence. Recent works on multi-modal large language
models (MLLMs) have demonstrated their effectiveness in handling visual modalities.
In this work, we extend the research on MLLMs to point clouds and present the
LAMM-Dataset and LAMM-Benchmark for 2D image and 3D point cloud understanding.
We also establish an extensible framework to facilitate the extension of MLLMs
to additional modalities. Our main contribution is three-fold: 1) We present
the LAMM-Dataset and LAMM-Benchmark, which cover almost all high-level vision
tasks for 2D and 3D vision. Extensive experiments validate the effectiveness of
our dataset and benchmark. 2) We demonstrate the detailed methods of
constructing instruction-tuning datasets and benchmarks for MLLMs, which will
enable future research on MLLMs to scale up and extend to other domains, tasks,
and modalities faster. 3) We provide a preliminary MLLM training
framework optimized for the extension to new modalities. We also provide baseline
models, comprehensive experimental observations, and analysis to accelerate future
research. Codes and datasets are now available at https://github.com/OpenLAMM/LAMM.
Comment: 37 pages, 33 figures. Code available at
https://github.com/OpenLAMM/LAMM ; Project page: https://openlamm.github.io
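Instruction-tuning data for an MLLM typically pairs a modality-specific input with a human/assistant conversation. The sketch below is a minimal illustration of such a record and a loader-side validity check; the keys, file name, and modality labels are assumptions for illustration, not the actual LAMM-Dataset format:

```python
# Hypothetical multi-modal instruction-tuning record; the schema shown
# here is illustrative only, not the released LAMM-Dataset layout.
sample = {
    "id": "sample_000001",
    "modality": "pointcloud",      # e.g. "image" or "pointcloud"
    "source": "scene_0001.ply",    # hypothetical input file
    "conversations": [
        {"from": "human", "value": "How many chairs are in this scene?"},
        {"from": "assistant", "value": "There are four chairs."},
    ],
}

def validate(record: dict) -> bool:
    """Check the fields an instruction-tuning data loader would rely on."""
    assert record["modality"] in {"image", "pointcloud"}
    assert record["conversations"], "record must contain at least one turn"
    assert record["conversations"][0]["from"] == "human"
    assert all({"from", "value"} <= set(turn) for turn in record["conversations"])
    return True

assert validate(sample)
```

Keeping the modality tag and input reference separate from the conversation, as above, is one simple way to let the same training loop handle 2D images and 3D point clouds, which is the kind of extensibility across modalities the abstract emphasizes.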