Improving Introductory Computer Science Education with DRaCO
Today, many introductory computer science courses rely heavily on a specific programming language to convey fundamental programming concepts. For beginning students, the cognitive capacity required to operate with the syntactic forms of this language may overwhelm their ability to formulate a solution to a problem.
We recognize that introductory computer science courses can be more effective if they convey fundamental concepts without requiring students to focus on the syntax of a programming language. To achieve this, we propose a new teaching method based on the Design Recipe and Code Outlining (DRaCO) processes. Our new pedagogy capitalizes on the algorithmic intuitions of novice students and provides a tool for students to externalize their intuitions using techniques they are already familiar with, rather than the syntax of a specific programming language. We validate the effectiveness of our new pedagogy by integrating it into an existing CS1 course at California Polytechnic State University, San Luis Obispo. We find that our newly proposed pedagogy shows strong potential to improve students’ ability to program.
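The abstract does not include a concrete example of code outlining, but the idea can be illustrated as follows: a student first writes the algorithm as a plain-language outline, then fills in the syntax underneath each step. This is a hypothetical illustration, not material from the paper, and the task and function name are invented.

```python
# Hypothetical illustration of the code-outlining idea: the numbered
# comments are the student's plain-language outline, written before
# any syntax; the code under each comment is filled in afterwards.

def count_vowels(text):
    # Step 1 (outline): decide what counts as a vowel.
    vowels = set("aeiou")
    # Step 2 (outline): look at each character of the text.
    total = 0
    for ch in text.lower():
        # Step 3 (outline): if the character is a vowel, count it.
        if ch in vowels:
            total += 1
    # Step 4 (outline): report the final count.
    return total
```

The outline stands on its own as pseudocode, so the student's intuition is captured before any language-specific syntax is required.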
Explain-then-Translate: An Analysis on Improving Program Translation with Self-generated Explanations
This work explores the use of self-generated natural language explanations as an intermediate step for code-to-code translation with language models. Across three types of explanations and 19 programming languages constructed from the MultiPL-E dataset, we find the explanations to be particularly effective in the zero-shot case, improving performance by 12% on average. Improvements with natural language explanations are particularly pronounced on difficult programs. We release our dataset, code, and canonical solutions in all 19 languages.
Comment: 9 pages, 4 figures, 5 tables, 48 pages total. To be published in EMNLP Findings 202
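The two-stage scheme described in the abstract can be sketched as a pair of prompts: first ask the model to explain the source program, then translate with the explanation in context. The prompt templates and the `ask_model` callback below are illustrative assumptions, not the paper's actual prompts or API.

```python
# Sketch of the "Explain-then-Translate" pipeline: stage 1 obtains a
# natural-language explanation of the source program; stage 2 feeds
# both the source and the explanation back for translation, so the
# model can rely on intent rather than surface syntax alone.

def explain_then_translate(source_code, src_lang, tgt_lang, ask_model):
    # Stage 1: self-generated explanation of the source program.
    explain_prompt = (
        f"Explain, in plain English, what this {src_lang} program does:\n"
        f"{source_code}\n"
    )
    explanation = ask_model(explain_prompt)

    # Stage 2: translate, conditioning on source + explanation.
    translate_prompt = (
        f"{src_lang} program:\n{source_code}\n"
        f"Explanation:\n{explanation}\n"
        f"Rewrite the program in {tgt_lang}:\n"
    )
    return ask_model(translate_prompt)
```

Here `ask_model` is whatever function calls the language model; in the zero-shot setting the two prompts above are the entire interaction.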
Indoor Risks Assessment Using Video Captioning
The progress of automatic scene analysis techniques for homes and the development of ambient assisted living systems is vital to help different kinds of people, such as the elderly or visually impaired individuals, who require special care in their daily lives. In this bachelor’s thesis we develop a study of the most promising techniques in the Video Captioning and scene analysis field, and we propose a Deep Learning pipeline aimed at performing Risks Assessment on input videos using the knowledge acquired during the study. This can potentially be applied to create systems that help the aforementioned people. Moreover, we propose different evaluation architectures to test each of the stages involved in the Risks Assessment pipeline in order to observe its effectiveness and limitations. In this work we introduce SwinBERT, a powerful and recent Video Captioning model, complemented with YOLOv7, a model aimed at the Object Recognition task, for the analysis of home scenes. Moreover, we use various lexical transformations and linguistic models to maximize the semantic similarity between the descriptions generated and the objects detected, aligning them with the annotations provided by the datasets used. This approach allows us to achieve more accurate matches from a human perspective. In the experiments we highlight the use of the large-scale Charades dataset, which was created to provide vast data for visual analysis while preserving the naturalness and spontaneity of household and daily activities.
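The alignment step in the abstract — matching words in a generated caption against detector labels — can be sketched with a simple lexical similarity measure. The use of `difflib.SequenceMatcher` and the threshold value are assumptions for illustration; the thesis relies on richer lexical transformations and linguistic models.

```python
from difflib import SequenceMatcher

# Minimal sketch of caption/object alignment: match words from a
# generated caption to detector labels by lexical similarity.
# Threshold and similarity metric are illustrative assumptions.

def match_objects(caption, detected_labels, threshold=0.8):
    matches = {}
    for word in caption.lower().split():
        word = word.strip(".,!?")
        for label in detected_labels:
            # ratio() is in [0, 1]; 1.0 means an exact lexical match.
            score = SequenceMatcher(None, word, label.lower()).ratio()
            if score >= threshold:
                matches[word] = label
    return matches
```

A semantic variant would replace the character-level ratio with word-embedding similarity, which tolerates synonyms ("fridge" vs. "refrigerator") that a lexical measure misses.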
In search of greener grass : finding the path from English hegemony to multilingualism
This thesis investigates the meaning of English language hegemony as I, the researcher, have experienced it. Using an autoethnographic method, I recount stories of multilingual language learning that uncover the themes of hegemony (Gramsci, 1992), unilateral power (Loomer, 1976) and privilege as they relate to the English language in the world today. These stories are drawn from a lifetime of language learning in different multilingual environments: from informal language learning in the home, through formal education in different languages throughout childhood and adolescence, to adult experiences of language learning as an English language teacher and member of a bilingual household.
With the narrative material as a basis, I highlight the interrelated concepts of hegemony, unilateral power and privilege in these experiences of language learning. I take a critical stance in my investigation and analysis of the hegemony, unilateral power and privilege that the English language enjoys at the expense of other languages. I examine the meaning of these concepts and how they have affected my understanding of language as a native English speaker, language learner and English language teacher, in Canada and abroad.
As an alternative to the hegemony of English, I propose a counter-hegemonic approach: learning about language and culture in relationship with others in communities where linguistic diversity and multilingualism are genuinely accepted, and not merely perceived, as valuable. I suggest that multilingualism and language learning are vital for native English speakers to understand alternative perspectives of our world, and to experience a transformation in their grasp of linguistic and cultural diversity.
Automatic Image Captioning with Style
This thesis connects two core topics in machine learning, vision and language. The problem of choice is image caption generation: automatically constructing natural language descriptions of image content. Previous research into image caption generation has focused on generating purely descriptive captions; I focus on generating visually relevant captions with a distinct linguistic style. Captions with style have the potential to ease communication and add a new layer of personalisation.
First, I consider naming variations in image captions, and propose a method for predicting context-dependent names that takes into account visual and linguistic information. This method makes use of a large-scale image caption dataset, which I also use to explore naming conventions and report naming conventions for hundreds of animal classes. Next I propose the SentiCap model, which relies on recent advances in artificial neural networks to generate visually relevant image captions with positive or negative sentiment. To balance descriptiveness and sentiment, the SentiCap model dynamically switches between two recurrent neural networks, one tuned for descriptive words and one for sentiment words. As the first published model for generating captions with sentiment, SentiCap has influenced a number of subsequent works. I then investigate the sub-task of modelling styled sentences without images. The specific task chosen is sentence simplification: rewriting news article sentences to make them easier to understand. For this task I design a neural sequence-to-sequence model that can work with limited training data, using novel adaptations for word copying and sharing word embeddings. Finally, I present SemStyle, a system for generating visually relevant image captions in the style of an arbitrary text corpus. A shared term space allows a neural network for vision and content planning to communicate with a network for styled language generation. SemStyle achieves competitive results in human and automatic evaluations of descriptiveness and style.
As a whole, this thesis presents two complete systems for styled caption generation that are first of their kind and demonstrate, for the first time, that automatic style transfer for image captions is achievable. Contributions also include novel ideas for object naming and sentence simplification. This thesis opens up inquiries into highly personalised image captions; large scale visually grounded concept naming; and more generally, styled text generation with content control.
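SentiCap's switching mechanism — at each decoding step, choosing between a descriptive stream and a sentiment stream — can be illustrated with a toy mixing function. The gate values and word distributions below are made up for illustration; the real model uses two recurrent networks and a learned switching variable.

```python
# Toy sketch of SentiCap-style switching: at each decoding step a
# gate mixes a descriptive word distribution with a sentiment word
# distribution. gate=0 is purely descriptive, gate=1 purely sentiment.

def mix_step(p_descriptive, p_sentiment, gate):
    vocab = set(p_descriptive) | set(p_sentiment)
    return {
        w: (1 - gate) * p_descriptive.get(w, 0.0)
           + gate * p_sentiment.get(w, 0.0)
        for w in vocab
    }
```

Because the mixture is convex, the result remains a valid probability distribution whenever both inputs are; a high gate value at adjective positions is what lets sentiment words like "adorable" surface without derailing the description.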
Knowledge and Reasoning for Image Understanding
Image Understanding is a long-established discipline in computer vision, which encompasses a body of advanced image processing techniques that are used to locate (“where”), characterize and recognize (“what”) objects, regions, and their attributes in the image. However, the notion of “understanding” (and the goal of artificially intelligent machines) goes beyond factual recall of the recognized components and includes reasoning and thinking beyond what can be seen (or perceived). Understanding is often evaluated by asking questions of increasing difficulty. Thus, the expected functionalities of an intelligent Image Understanding system can be expressed in terms of the functionalities required to answer questions about an image. Answering questions about images requires primarily three components: Image Understanding, question (natural language) understanding, and reasoning based on knowledge. Any question asking beyond what can be directly seen requires modeling of commonsense (or background/ontological/factual) knowledge and reasoning.
Knowledge and reasoning have seen scarce use in image understanding applications. In this thesis, we demonstrate the utility of incorporating background knowledge and using explicit reasoning in image understanding applications. We first present a comprehensive survey of previous work that utilized background knowledge and reasoning in understanding images. This survey outlines the limited use of commonsense knowledge in high-level applications. We then present a set of vision and reasoning-based methods to solve several applications and show that these approaches benefit, in terms of accuracy and interpretability, from the explicit use of knowledge and reasoning. We propose novel knowledge representations of images, knowledge acquisition methods, and a new implementation of an efficient probabilistic logical reasoning engine that can utilize publicly available commonsense knowledge to solve applications such as visual question answering and image puzzles. Additionally, we identify the need for new datasets that explicitly require external commonsense knowledge to solve. We propose the new task of Image Riddles, which requires a combination of vision and reasoning based on ontological knowledge, and we collect a sufficiently large dataset to serve as an ideal testbed for vision and reasoning research. Lastly, we propose end-to-end deep architectures that can combine vision, knowledge and reasoning modules together and achieve large performance boosts over state-of-the-art methods.
Doctoral Dissertation, Computer Science, 201
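The combination of detected visual facts with background knowledge through explicit rules can be sketched as simple forward chaining. The facts, rules, and deterministic fixpoint loop below are illustrative only; the thesis describes a probabilistic logical reasoning engine over large commonsense knowledge bases.

```python
# Toy sketch of explicit reasoning over visual facts: repeatedly fire
# rules whose premises are all satisfied, until no new fact is derived
# (a fixpoint). Facts and rules are invented for illustration.

def forward_chain(facts, rules):
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if set(premises) <= facts and conclusion not in facts:
                facts.add(conclusion)
                changed = True
    return facts

# Visual facts (e.g. from a detector) + ontological background rules.
facts = {"sees(dog)", "sees(leash)"}
rules = [
    ({"sees(dog)", "sees(leash)"}, "activity(dog_walking)"),
    ({"activity(dog_walking)"}, "location(outdoors)"),
]
```

Note how the second rule fires only because the first one derived `activity(dog_walking)` — the answer "outdoors" is not directly visible in the image, which is exactly the kind of question that motivates external knowledge.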