272 research outputs found

    Comparative analysis between a respeaking captioning system and a captioning system without human intervention

    People living with deafness or hearing impairment have limited access to information broadcast live on television. Live closed captioning is currently an active area of study; to our knowledge, no system developed thus far produces high-quality captions without using scripts or human interaction. This paper presents a comparative analysis of the quality of captions generated for four Spanish news programs by two captioning systems: a semiautomatic system based on respeaking (the system currently used by a Spanish TV station) and an automatic system without human interaction proposed and developed by the authors. The analysis is conducted by measuring and comparing the accuracy, latency and speed of the captions generated by both captioning systems. The captions generated by the proposed automatic system showed higher quality in terms of accuracy (Word Error Rate, WER, between 3.76% and 7.29%) and caption latency (approximately 4 s), at a speed that is acceptable for accessing the information. We contribute a first study focused on the development and analysis of an automatic captioning system without human intervention, with promising quality results. These results reinforce the importance of continuing to study such automatic systems.
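    By way of illustration, a word error rate like the one reported above can be computed with a standard word-level Levenshtein alignment. The sketch below is my own illustration, not the authors' evaluation tool, and the example sentences are invented.

```python
# Illustrative only: a standard word-level Levenshtein WER, not the authors' evaluation tool.
def word_error_rate(reference: str, hypothesis: str) -> float:
    ref, hyp = reference.split(), hypothesis.split()
    # d[i][j] = minimum edits to turn the first i reference words into the first j hypothesis words
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            sub = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,        # deletion
                          d[i][j - 1] + 1,        # insertion
                          d[i - 1][j - 1] + sub)  # substitution / match
    return d[len(ref)][len(hyp)] / max(len(ref), 1)

# Toy usage with invented Spanish caption snippets: one substitution in four words -> 25% WER.
print(f"WER: {word_error_rate('el tiempo para hoy', 'el tiempo de hoy') * 100:.2f}%")
```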

    Streaming Automatic Speech Recognition with Hybrid Architectures and Deep Neural Network Models

    Thesis by compendium. Over the last decade, the media have experienced a revolution, turning away from conventional TV in favor of on-demand platforms. In addition, this media revolution has changed not only the way entertainment is conceived but also how learning is conducted. Indeed, on-demand educational platforms have also proliferated and now provide educational resources on diverse topics. These new ways of distributing content have come along with requirements to improve accessibility, particularly in relation to hearing difficulties and language barriers. Here lies the opportunity for automatic speech recognition (ASR) to meet these requirements by providing high-quality automatic captioning. Automatic captioning provides a sound basis for diminishing the accessibility gap, especially for live or streaming content. To this end, streaming ASR must work under strict real-time conditions, providing captions as fast as possible while working with limited context. However, this limited context usually leads to a quality degradation compared with pre-recorded or offline content. This thesis is aimed at developing low-latency streaming ASR with quality similar to offline ASR. More precisely, it describes the path followed from an initial hybrid offline system to an efficient streaming-adapted system. The first step is to perform a single recognition pass using a state-of-the-art neural-network-based language model; in conventional multi-pass systems, this model is often deferred to the second or a later pass because of its computational complexity. As with the language model, the neural acoustic model is also adapted to work with limited context. The adaptation and integration of these models is thoroughly described and assessed using fully fledged streaming systems on well-known academic benchmarks and challenging real-world tasks. In brief, it is shown that the proposed adaptation of the language and acoustic models allows the streaming-adapted system to reach the accuracy of the initial offline system with low latency. Jorge Cano, J. (2022). Streaming Automatic Speech Recognition with Hybrid Architectures and Deep Neural Network Models [Doctoral thesis]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/191001
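    As a rough illustration of the streaming constraint discussed in the abstract (captions emitted as audio arrives, with only a bounded amount of left context), the following sketch processes audio in fixed-size chunks and keeps a limited context window for each decoding step. The chunk size, the context bound, and the dummy recognize_chunk function are hypothetical placeholders, not the thesis system.

```python
from collections import deque

CHUNK_SAMPLES = 16000      # hypothetical: 1 s of 16 kHz audio per chunk
MAX_CONTEXT_CHUNKS = 5     # hypothetical bound on left context kept in memory

def recognize_chunk(context_audio, new_chunk):
    """Placeholder for a single acoustic + language model pass over the limited context."""
    return f"<{len(new_chunk)} samples decoded with {len(context_audio)} samples of context>"

def streaming_captioner(audio_chunks):
    """Emit a partial caption as soon as each chunk arrives, using bounded left context."""
    context = deque(maxlen=MAX_CONTEXT_CHUNKS)  # oldest chunks are dropped automatically
    for chunk in audio_chunks:
        left_context = [sample for c in context for sample in c]
        yield recognize_chunk(left_context, chunk)
        context.append(chunk)

# Toy usage with silent (zero) audio standing in for a live stream.
stream = ([0] * CHUNK_SAMPLES for _ in range(3))
for caption in streaming_captioner(stream):
    print(caption)
```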

    21st Century Teenagers and Young Adults who are Deaf or Hard of Hearing: Outcomes and Possibilities

    The purpose of this study was to document the demographics, characteristics, and long-term outcomes of teenagers and young adults who are deaf or hard of hearing (DHH) and who all attended the Moog Center for Deaf Education for preschool and/or a portion of elementary school. Data were obtained via an online survey designed to assess the educational, personal, and professional outcomes of individuals who were currently in high school or beyond. Survey questions asked about the lives of these individuals after leaving the Moog Center. Participants included 108 individuals who were DHH, ranging from 15 to 32 years of age. Results indicate that this group of individuals attained high levels of achievement in terms of educational attainment, employment experience, social involvement, and communication competence, which demonstrates what is possible for individuals who are DHH.

    Image Description using Deep Neural Networks

    Current research in computer vision and machine learning has demonstrated impressive abilities at detecting and recognizing objects in natural images. State-of-the-art results for object detection, classification and localization in the ImageNet challenges report a top-5 validation error of 3.08% for classification, while similar classification experiments run by trained humans report an error rate of about 5.1%. While some might argue that human accuracy is a function of training time, it can be said with confidence that automated classification models are at least as good as trained humans at this task. The ability of these models to analyze and describe complex images, however, is still an active area of research. Image description is a good starting point for imparting artificial intelligence to machines by allowing them to analyze and describe complex visual scenes. This thesis introduces a generic, end-to-end trainable Fusion-based Recurrent Multi-Modal (FRMM) architecture to address multi-modal applications. FRMM allows each input modality to be independent in terms of architecture, parameters and length of input sequences. FRMM image description models seamlessly blend convolutional neural network feature descriptors with sequential language data in a recurrent framework. In addition to introducing FRMMs, this work also analyzes the impact of varying activation functions and vocabulary size. The Flickr8k, Flickr30K and MSCOCO datasets have been used for training and testing, demonstrating state-of-the-art description results.
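    The abstract describes blending convolutional image features with sequential language data in a recurrent decoder. The sketch below shows one common way such a fusion can be wired up in PyTorch; the layer sizes and the SimpleFusionCaptioner class are illustrative assumptions of mine, not the FRMM architecture itself.

```python
import torch
import torch.nn as nn

class SimpleFusionCaptioner(nn.Module):
    """Toy captioning decoder: a precomputed CNN feature vector conditions an LSTM language model."""

    def __init__(self, feat_dim=2048, embed_dim=256, hidden_dim=512, vocab_size=10000):
        super().__init__()
        self.img_proj = nn.Linear(feat_dim, embed_dim)     # map CNN features into the word-embedding space
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, vocab_size)

    def forward(self, img_feats, captions):
        # Prepend the projected image feature as a pseudo-token before the word embeddings.
        img_token = self.img_proj(img_feats).unsqueeze(1)          # (B, 1, E)
        words = self.embed(captions)                               # (B, T, E)
        inputs = torch.cat([img_token, words], dim=1)              # (B, T+1, E)
        hidden, _ = self.lstm(inputs)
        return self.out(hidden)                                    # per-step vocabulary logits

# Toy usage: a batch of 2 image feature vectors and 5-word caption prefixes.
model = SimpleFusionCaptioner()
logits = model(torch.randn(2, 2048), torch.randint(0, 10000, (2, 5)))
print(logits.shape)  # torch.Size([2, 6, 10000])
```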

    Case Study in Modeling Accessibility for Online Instruction.

    Ph.D. Thesis, University of Hawaiʻi at Mānoa, 2017.

    Learning a Narrative Text Generator from Visual Stories via Latent Embedding

    Doctoral dissertation, Seoul National University, College of Engineering, Department of Electrical and Computer Engineering, February 2019. Advisor: 장병탁. The ability to understand stories is essential to what makes humans unique among primates and other animals. Story understanding is crucial for AI agents that are to live with people in everyday life and understand their context. However, most research on story AI focuses on automated story generation based on manually designed closed worlds, which are widely used for computational authoring. Machine learning techniques applied to story corpora face the same problems as natural language processing in general, such as omitted details and missing commonsense knowledge. Since the remarkable success of deep learning in computer vision has increased interest in research bridging vision and language, vision-grounded story data can potentially improve the performance of story understanding and narrative text generation. Assume that AI agents are placed in an environment in which sensory information arrives through a camera. Such agents observe their surroundings, translate them into a story in natural language, and predict the following event, or several subsequent events, sequentially. This dissertation studies the related problems of learning stories and generating narrative text from image streams or videos. The first problem is to generate a narrative text from a sequence of ordered images. As a solution, we introduce GLAC Net (Global-Local Attention Cascading Network), which translates image sequences into narrative paragraphs using an encoder-decoder framework in a sequence-to-sequence setting. It uses convolutional neural networks to extract information from images and recurrent neural networks for text generation. We introduce visual cue encoders with stacked bidirectional LSTMs, where the outputs of all layers are aggregated into contextualized image vectors to extract visual clues. The coherence of the generated text is further improved by conveying (cascading) the information of the previous sentence to the next sentence serially in the decoders. We evaluate its performance on the Visual Storytelling (VIST) dataset; it outperforms other state-of-the-art results and achieves the best total score, as well as the best score on all six aspects, in the human evaluation of the visual storytelling challenge. The second problem is to predict the following events or narrative texts given the earlier parts of a story; prediction should be possible at any step of a story of arbitrary length. We propose recurrent event retrieval models as a solution. They train a context accumulation function and two embedding functions that bring the cumulative context at the current time close to the next probable events in a latent space. The cumulative context is updated with each new event as input using bilinear operations, and next-event candidates can then be retrieved with the updated cumulative context. Evaluated on the Story Cloze Test, they show competitive performance and the best results in the open-ended generation setting; we also demonstrate working examples in an interactive setting. The third problem concerns composite representation learning of semantics and order for video stories. We embed each episode as a trajectory-like sequence of events in the latent space and propose ViStoryNet to regenerate video stories from these embeddings (a story-completion task). We convert event sentences into thought vectors and train functions that embed successive events close to each other so that episodes form trajectories. Bidirectional LSTMs are trained as sequence models, and GRU-based decoders generate event sentences. We test the approach on the PororoQA dataset and observe that most episodes take the form of trajectories. When used to complete blocked-out parts of stories, the regenerated stories are not perfect but broadly similar to the originals. Taken together, these results can be applied to AI agents that sense their living area with cameras, describe the situation as stories, infer unobserved parts, and predict the future story.
    Contents: Chapter 1, Introduction; Chapter 2, Background and Related Work; Chapter 3, Visual Storytelling via Global-local Attention Cascading Networks; Chapter 4, Common Space Learning on Cumulative Contexts and the Next Events: Recurrent Event Retrieval Models; Chapter 5, ViStoryNet: Order Embedding of Successive Events and the Networks for Story Regeneration; Chapter 6, Concluding Remarks.
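    The recurrent event retrieval idea above (accumulate a context vector with a bilinear update, then retrieve the nearest next-event embeddings in a latent space) can be sketched as follows. The dimensions, the random untrained parameter tensor, and the cosine-similarity retrieval are my own simplifying assumptions, not the trained model from the dissertation.

```python
import numpy as np

rng = np.random.default_rng(0)
DIM = 64
W = rng.normal(scale=0.1, size=(DIM, DIM, DIM))   # hypothetical bilinear tensor (untrained)

def accumulate(context, event):
    """Bilinear update of the cumulative context with a new event embedding."""
    updated = np.einsum('i,ijk,k->j', context, W, event)
    return updated / (np.linalg.norm(updated) + 1e-8)

def next_event_candidates(context, event_bank, k=3):
    """Return indices of the k candidate events closest to the context (cosine similarity)."""
    bank = event_bank / np.linalg.norm(event_bank, axis=1, keepdims=True)
    scores = bank @ (context / (np.linalg.norm(context) + 1e-8))
    return np.argsort(-scores)[:k]

# Toy usage: accumulate two observed events, then rank 100 candidate next events.
events = rng.normal(size=(100, DIM))
ctx = np.ones(DIM) / np.sqrt(DIM)
for observed in events[:2]:
    ctx = accumulate(ctx, observed)
print(next_event_candidates(ctx, events))
```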

    An Evaluation of the Impact of Professional Development on Accessibility to Online Courses by Students With Special Needs at a Regional Four-Year Public Institution of Higher Education in West Texas

    Evaluation research, including qualitative and quantitative data, was used in this study to determine the impact of professional development on the accessibility of online courses for students with disabilities. The study focused on online courses and course content. Data collection took place in phases and included self-paced online professional development and one-on-one support sessions, a pre- and post-survey, and a focus group interview. The study took place at a regional four-year public institution of higher education in West Texas. Change theory emerged as the primary theoretical lens guiding the research as the study unfolded.

    Automatic Image Captioning with Style

    This thesis connects two core topics in machine learning, vision and language. The problem of choice is image caption generation: automatically constructing natural language descriptions of image content. Previous research into image caption generation has focused on generating purely descriptive captions; I focus on generating visually relevant captions with a distinct linguistic style. Captions with style have the potential to ease communication and add a new layer of personalisation. First, I consider naming variations in image captions and propose a method for predicting context-dependent names that takes into account visual and linguistic information. This method makes use of a large-scale image caption dataset, which I also use to explore naming conventions, reporting them for hundreds of animal classes. Next I propose the SentiCap model, which relies on recent advances in artificial neural networks to generate visually relevant image captions with positive or negative sentiment. To balance descriptiveness and sentiment, the SentiCap model dynamically switches between two recurrent neural networks, one tuned for descriptive words and one for sentiment words. As the first published model for generating captions with sentiment, SentiCap has influenced a number of subsequent works. I then investigate the sub-task of modelling styled sentences without images. The specific task chosen is sentence simplification: rewriting news article sentences to make them easier to understand. For this task I design a neural sequence-to-sequence model that can work with limited training data, using novel adaptations for word copying and sharing word embeddings. Finally, I present SemStyle, a system for generating visually relevant image captions in the style of an arbitrary text corpus. A shared term space allows a neural network for vision and content planning to communicate with a network for styled language generation. SemStyle achieves competitive results in human and automatic evaluations of descriptiveness and style. As a whole, this thesis presents two complete systems for styled caption generation that are the first of their kind and demonstrate, for the first time, that automatic style transfer for image captions is achievable. Contributions also include novel ideas for object naming and sentence simplification. This thesis opens up inquiries into highly personalised image captions; large-scale visually grounded concept naming; and, more generally, styled text generation with content control.
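    As a rough illustration of the switching idea behind SentiCap (two recurrent streams, one tuned for descriptive words and one for sentiment words, mixed at each decoding step), the snippet below combines two per-word distributions with a switch probability. The vocabulary, the probabilities, and the gate value are invented for illustration; the real model learns the switch jointly with both RNNs.

```python
import numpy as np

vocab = ["a", "dog", "runs", "happily", "lovely", "park"]

# Hypothetical next-word distributions at one decoding step.
p_descriptive = np.array([0.10, 0.35, 0.30, 0.05, 0.05, 0.15])  # stream tuned for descriptive words
p_sentiment   = np.array([0.05, 0.10, 0.05, 0.40, 0.35, 0.05])  # stream tuned for sentiment words
gate = 0.3  # assumed probability of taking the sentiment stream at this step

# Word-level mixture: switch (softly) between the two recurrent streams.
p_mixed = (1 - gate) * p_descriptive + gate * p_sentiment
print(vocab[int(np.argmax(p_mixed))])          # most likely next word under the mixture
print(dict(zip(vocab, np.round(p_mixed, 3))))
```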

    A Mixed-Methods Study Examining the Difference Between Closed Captioning and Lexile Levels

    This experimental mixed-methods study explores what happens to student Lexile scores when students use closed captioning. Since the emergence of closed captioning tools in the 1980s, closed captioning has become more mainstream and easier to access today than at any other time in history (Rickelman et al., 1991). Thus, by harnessing this technology and bringing it into the classroom setting, the researcher of this study hopes to provide new approaches for educators who want to improve their students' Lexile levels, while also incorporating the SAMR model within our increasingly technology-focused classrooms (Crompton & Burke, 2018). The quantitative data analysis procedures in this experimental study consisted of two-sample t-tests comparing the iReady Lexile scores of the participants [n=38] with those of students in the researched district [n=810] who were not using closed captioning. The researcher required participants to complete a baseline iReady test to determine their preexisting Lexile levels. After the study, participants, both in the researched district and in the study itself, were required to complete an iReady post-test to determine their respective Lexile growth in four areas of reading: overall growth, vocabulary, comprehension of literary text, and comprehension of informational text. The independent variable in this study was the use of the enabled closed captioning tool found on the participants' devices. The dependent variable was the Lexile scores computed using the iReady Lexile exam. The researcher collected the qualitative data using a variety of observational logs, personal interviews, and pre- and post-surveys disseminated to students using the Qualtrics system. Once these data were collected, theming and phenomenological analysis were used to identify themes and student emotions/reactions that emerged throughout the study. The themes that emerged from participants in the study included a belief that closed captioning increased their Lexile levels, no perceived effect on vocabulary, and enjoyment of using closed captioning.
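    The core quantitative comparison described above (two-sample t-tests between the Lexile growth of study participants and non-participating district students) can be reproduced in outline with SciPy. The score arrays below are synthetic stand-ins, not the study's data, and the means and spreads are assumed values.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

# Synthetic Lexile growth scores: 38 study participants vs. a sample of district students.
participants = rng.normal(loc=75, scale=40, size=38)    # hypothetical mean growth of 75L
district     = rng.normal(loc=60, scale=40, size=810)   # hypothetical mean growth of 60L

# Welch's two-sample t-test (does not assume equal variances between the groups).
t_stat, p_value = stats.ttest_ind(participants, district, equal_var=False)
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")
```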