Model Checking Parse Trees
Parse trees are fundamental syntactic structures in both computational
linguistics and compiler construction. We argue in this paper that, in both
fields, there are good incentives for model-checking sets of parse trees for
some word according to a context-free grammar. We put forward the adequacy of
propositional dynamic logic (PDL) on trees in these applications, and study as
a sanity check the complexity of the corresponding model-checking problem:
although complete for exponential time in the general case, we find natural
restrictions on grammars for our applications and establish complexities
ranging from nondeterministic polynomial time to polynomial space in the
relevant cases.
Comment: 21 + x pages
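To make the idea concrete, here is a minimal sketch (not the paper's formalism) of checking a simple branching property over a parse tree, loosely in the spirit of evaluating a PDL-style formula on tree models. The toy grammar, tree, and property below are invented for illustration.

```python
# Hypothetical sketch: a parse tree as nested nodes, and a recursive check
# of the property "every node labeled `parent_label` has at least one child
# labeled `child_label`". The grammar and tree are illustrative assumptions,
# not taken from the paper.
from dataclasses import dataclass, field
from typing import List

@dataclass
class Node:
    label: str
    children: List["Node"] = field(default_factory=list)

def check_child_property(node: Node, parent_label: str, child_label: str) -> bool:
    """Return True iff every `parent_label` node in the tree has a
    `child_label` child (checked recursively over the whole tree)."""
    if node.label == parent_label:
        if not any(c.label == child_label for c in node.children):
            return False
    return all(check_child_property(c, parent_label, child_label)
               for c in node.children)

# Parse tree of "a b" under a toy grammar S -> A B, A -> a, B -> b
tree = Node("S", [Node("A", [Node("a")]), Node("B", [Node("b")])])
print(check_child_property(tree, "S", "A"))  # True: S has an A child
print(check_child_property(tree, "S", "C"))  # False: S has no C child
```

A real PDL model checker evaluates arbitrary formulas with path programs rather than one hard-coded property, but the recursive traversal over tree structure is the same basic shape.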
Sparkles: Unlocking Chats Across Multiple Images for Multimodal Instruction-Following Models
Large language models exhibit enhanced zero-shot performance on various tasks
when fine-tuned with instruction-following data. Multimodal
instruction-following models extend these capabilities by integrating both text
and images. However, existing models such as MiniGPT-4 face challenges in
maintaining dialogue coherence in scenarios involving multiple images. A
primary reason is the lack of a specialized dataset for this critical
application. To bridge these gaps, we present SparklesChat, a multimodal
instruction-following model for open-ended dialogues across multiple images. To
support the training, we introduce SparklesDialogue, the first
machine-generated dialogue dataset tailored for word-level interleaved
multi-image and text interactions. Furthermore, we construct SparklesEval, a
GPT-assisted benchmark for quantitatively assessing a model's conversational
competence across multiple images and dialogue turns. Our experiments validate
the effectiveness of SparklesChat in understanding and reasoning across
multiple images and dialogue turns. Specifically, SparklesChat outperformed
MiniGPT-4 on established vision-and-language benchmarks, including the BISON
binary image selection task and the NLVR2 visual reasoning task. Moreover,
SparklesChat scored 8.56 out of 10 on SparklesEval, substantially exceeding
MiniGPT-4's score of 3.91 and nearing GPT-4's score of 9.26. Qualitative
evaluations further demonstrate SparklesChat's generality in handling
real-world applications. All resources will be available at
https://github.com/HYPJUDY/Sparkles
Computational modeling of semantic change
In this chapter we provide an overview of computational modeling for semantic
change using large and semi-large textual corpora. We aim to provide a key for
the interpretation of relevant methods and evaluation techniques, and also
provide insights into important aspects of the computational study of semantic
change. We discuss the pros and cons of different classes of models with
respect to the properties of the data from which one wishes to model semantic
change, and which avenues are available to evaluate the results.
Comment: This chapter is submitted to the Routledge Handbook of Historical
Linguistics, 2nd Edition
CODE-SWITCHING OF ENGLISH IN THE ENTERTAINMENT PROGRAM OF SARAH SECHAN SHOW IN NET TV
The main purpose of the study was to identify the code-switching used by Sarah Sechan in the entertainment program Sarah Sechan Show on NET TV, episodes "Joe Taslim" and "Coky Sitohang", and to determine the factors that influence the presenter's use of code-switching with her guest. This is a qualitative study based on participant observation, recording, and note-taking techniques. The researchers found two kinds of code-switching: metaphorical code-switching and situational code-switching.
Key words: Code-switching, Entertainment program, Sarah Sechan
Non-indexical contextualism, relativism and retraction
It is commonly held that retraction data, if they exist, show that assessment relativism is preferable to non-indexical contextualism. I argue that this is not the case. Whether retraction data have the suggested probative force depends on substantive questions about the proper treatment of tense and location. One's preferred
account in these domains should determine whether one accepts assessment relativism or non-indexical contextualism
Language learning motivation : current insights and implications
The issue of learner motivation has long exercised researchers and practitioners in the field of
language education. However, it is only within the past decade or so that we have witnessed
productive interaction between the interests of researchers and teachers. Up until the early
1990s, research interest focused primarily on describing, measuring and classifying language
learner motivation and exploring its role in theoretical models of the language learning
process. The findings from such research offered little to teachers concerned with the practical
question of how to motivate their learners and keep them motivated. Moreover, this research
agenda was powerfully shaped by social-psychological perspectives on learner attitudes to
target language cultures and people (Gardner 1985; Gardner and Lambert 1972), while
motivational influences and processes within the social environment of the language
classroom remained relatively unexplored. In a seminal critique of the social-psychological
tradition, Crookes and Schmidt (1991) set forth a new agenda for research on a more
"practitioner-validated" classroom-based concept of language learning motivation. The need
to establish closer links between theory and practice and to develop what Dörnyei
(2001a:103) has called more "education-friendly" approaches to language learning motivation
research stimulated an unprecedented wave of discussion during the mid-1990s (for a detailed
summary, see Dörnyei 1998), and has considerably reshaped the direction of theory and
research in the field
Popularity Prediction of Reddit Texts
Popularity prediction is a useful technique for marketers to anticipate the success of marketing campaigns, to build recommendation systems that suggest new products to consumers, and to develop targeted advertising. Researchers likewise use popularity prediction to measure how popularity changes within a community or within a given timespan. In this paper, I explore ways to predict the popularity of posts on reddit.com, which is a blend of news aggregator and community forum. I frame popularity prediction as a text classification problem and attempt to solve it by first identifying topics in the text and then classifying whether the topics identified are more characteristic of popular or unpopular texts. This classifier is then used to label unseen texts as popular or not depending on the topics found in these new posts. I explore the use of Latent Dirichlet Allocation and term frequency-inverse document frequency for topic identification, and naïve Bayes classifiers and support vector machines for classification. The relation between topics and popularity is dynamic -- topics in Reddit communities can wax and wane in popularity. Despite the inherent variability, the methods explored in the paper are effective, showing prediction accuracy between 60% and 75%. The study contributes to the field in various ways. For example, it provides novel data for research and development, not only for text classification but also for the study of the relation between topics and popularity in general. The study also helps us better understand different topic identification and classification methods by illustrating their effectiveness on real-life data from a fast-changing and multi-purpose website
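The described pipeline (term-weighted text features feeding a classifier that labels posts popular or unpopular) can be sketched in a few lines. This is a minimal illustration of the TF-IDF + naïve Bayes variant, not the paper's code; the toy posts and labels are invented.

```python
# Hedged sketch of the described pipeline: TF-IDF features + a naive Bayes
# classifier deciding popular (1) vs. unpopular (0). Toy data for illustration.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

posts = [
    "breaking news about a huge announcement",
    "ask me anything about space travel",
    "minor question about an obscure setting",
    "small update nobody asked for",
]
labels = [1, 1, 0, 0]  # 1 = popular, 0 = unpopular (invented labels)

clf = make_pipeline(TfidfVectorizer(), MultinomialNB())
clf.fit(posts, labels)

# An unseen post is labeled by the terms it shares with the training data
print(clf.predict(["huge breaking announcement about space"])[0])  # 1 (popular)
```

In the study, the same two-stage idea is applied at the topic level (e.g. LDA topics rather than raw term weights), and support vector machines are explored as an alternative classifier.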
Exploring Effectiveness of GPT-3 in Grammatical Error Correction: A Study on Performance and Controllability in Prompt-Based Methods
Large-scale pre-trained language models such as GPT-3 have shown remarkable
performance across various natural language processing tasks. However, applying
prompt-based methods with GPT-3 for Grammatical Error Correction (GEC) tasks
and their controllability remains underexplored. Controllability in GEC is
crucial for real-world applications, particularly in educational settings,
where the ability to tailor feedback according to learner levels and specific
error types can significantly enhance the learning process. This paper
investigates the performance and controllability of prompt-based methods with
GPT-3 for GEC tasks in zero-shot and few-shot settings. We explore the impact
of task instructions and examples on GPT-3's output, focusing on controlling
aspects such as minimal edits, fluency edits, and learner levels. Our findings
demonstrate that GPT-3 could effectively perform GEC tasks, outperforming
existing supervised and unsupervised approaches. We also show that GPT-3
could achieve controllability when appropriate task instructions and examples
are given.
Comment: Accepted in BEA 202
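A prompt-based few-shot GEC setup with a controllability instruction can be sketched as plain prompt assembly. The instruction wording and the example pairs below are invented for illustration and are not the paper's prompts.

```python
# Hypothetical sketch of a few-shot GEC prompt with a "minimal edits"
# control instruction; the wording and examples are assumptions, not the
# paper's actual prompts.

def build_gec_prompt(sentence: str, examples: list[tuple[str, str]]) -> str:
    """Assemble instruction + few-shot pairs + the target sentence."""
    lines = ["Correct the grammar with minimal edits, preserving the "
             "writer's wording wherever possible."]
    for src, tgt in examples:
        lines.append(f"Input: {src}")
        lines.append(f"Output: {tgt}")
    lines.append(f"Input: {sentence}")
    lines.append("Output:")
    return "\n".join(lines)

few_shot = [
    ("He go to school.", "He goes to school."),
    ("She have two cat.", "She has two cats."),
]
prompt = build_gec_prompt("They was happy yesterday.", few_shot)
print(prompt)
```

The resulting string would then be sent to the model; swapping the instruction (e.g. "rewrite fluently" instead of "minimal edits", or naming a learner level) is what the controllability experiments vary.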