On L-functions and the 1-Level Density
We begin with the classical study of the Riemann zeta function and Dirichlet L-functions, including a full exposition of one of the most useful ways of exploiting their connection with primes: explicit formulae. We then introduce statistics of low-lying zeros of Dirichlet L-functions, discussing prior results of Fiorilli and Miller (2015) on the 1-level density of Dirichlet L-functions and their achievement in surpassing the prediction of the powerful Ratios Conjecture. Finally, we present our original work partially generalizing these results to the case of Hecke L-functions over imaginary quadratic fields.
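For orientation, the kind of explicit formula the abstract refers to can be illustrated by the classical von Mangoldt form (a standard statement, not drawn from this abstract), which links the Chebyshev function $\psi(x) = \sum_{n \le x} \Lambda(n)$ to the nontrivial zeros $\rho$ of $\zeta(s)$:

```latex
\psi(x) \;=\; x \;-\; \sum_{\rho} \frac{x^{\rho}}{\rho} \;-\; \log(2\pi) \;-\; \tfrac{1}{2}\log\!\left(1 - x^{-2}\right),
\qquad x > 1,\ x \text{ not a prime power.}
```

The sum over zeros on the right is what makes statistics of low-lying zeros, such as the 1-level density, directly informative about the distribution of primes.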
Representation of convex geometries of convex dimension 3 by spheres
A convex geometry is a closure system satisfying the anti-exchange property.
This paper, following the work of Adaricheva and Bolat (2019) and the Polymath
REU (2020), continues the investigation of representations of convex geometries
with small convex dimension by convex shapes on the plane and in spaces of
higher dimension. In particular, we answer in the negative the question raised
by the Polymath REU (2020): whether every convex geometry of convex dimension 3
is representable by circles on the plane. We show there are geometries of
convex dimension 3 that cannot be represented by spheres in any dimension, and
this connects to posets not representable by spheres from the paper of Felsner,
Fishburn and Trotter (1999). On the positive side, we use a result of Kincses
(2015) to show that every finite poset is an ellipsoid order.
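The anti-exchange property mentioned above is standard (stated here for orientation, not spelled out in the abstract): a closure operator $\varphi$ on a set $X$ is anti-exchange when, for every $A \subseteq X$,

```latex
x, y \notin \varphi(A),\quad x \neq y,\quad x \in \varphi(A \cup \{y\})
\;\implies\; y \notin \varphi(A \cup \{x\}).
```

Intuitively, this is the opposite of the matroid exchange axiom: adding $y$ can pull $x$ into the closure only in a one-directional way, mirroring how a point of a convex set in the plane can lie "beyond" another.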
Large Language Models (GPT) Struggle to Answer Multiple-Choice Questions about Code
We analyzed the effectiveness of three generative pre-trained transformer (GPT)
models in answering multiple-choice question (MCQ) assessments, often involving
short snippets of code, from introductory and intermediate programming courses
at the postsecondary level. This emerging technology stirs countless
discussions of its potential uses (e.g., exercise generation, code explanation)
as well as misuses in programming education (e.g., cheating). However, the
capabilities of GPT models and their limitations to reason about and/or analyze
code in educational settings have been under-explored. We evaluated several of
OpenAI's GPT models on formative and summative MCQ assessments from three
Python courses (530 questions). We found that MCQs containing code snippets are
not answered as successfully as those that only contain natural language. While
questions that require filling in a blank in the code or completing a natural
language statement about the snippet are handled rather successfully, MCQs that
require analysis and/or reasoning about the code (e.g., what is true/false
about the snippet, or what is its output) appear to be the most challenging.
These findings can be leveraged by educators to adapt their instructional
practices and assessments in programming courses, so that GPT becomes a
valuable assistant for a learner as opposed to a source of confusion and/or
potential hindrance in the learning process.
Thrilled by Your Progress! Large Language Models (GPT-4) No Longer Struggle to Pass Assessments in Higher Education Programming Courses
This paper studies recent developments in large language models' (LLM)
abilities to pass assessments in introductory and intermediate Python
programming courses at the postsecondary level. The emergence of ChatGPT
resulted in heated debates about its potential uses (e.g., exercise generation,
code explanation) as well as misuses in programming classes (e.g., cheating).
Recent studies show that while the technology performs surprisingly well on
diverse sets of assessment instruments employed in typical programming classes,
its performance is usually not sufficient to pass the courses. The release of
GPT-4 brought notable improvements in the capabilities related to handling
assessments originally designed for human test-takers. This study provides a
necessary analysis in the context of this ongoing transition towards mature
generative AI systems. Specifically, we report the performance of GPT-4,
comparing it to the previous generations of GPT models, on three Python courses
with assessments ranging from simple multiple-choice questions (no code
involved) to complex programming projects with code bases distributed into
multiple files (599 exercises overall). Additionally, we analyze the
assessments that were not handled well by GPT-4 to understand the current
limitations of the model, as well as its capabilities to leverage feedback
provided by an auto-grader. We found that the GPT models evolved from
completely failing typical programming class assessments (the original
GPT-3) to confidently passing the courses with no human involvement (GPT-4).
While we identified certain limitations in GPT-4's handling of MCQs and coding
exercises, the rate of improvement across the recent generations of GPT models
strongly suggests their potential to handle almost any type of assessment
widely used in higher education programming courses. These findings could be
leveraged by educators and institutions to adapt the design of programming
assessments as well as to fuel the necessary discussions about how programming
classes should be updated to reflect the recent technological developments.
This study provides evidence that programming instructors need to prepare for a
world in which there is an easy-to-use widely accessible technology that can be
utilized by learners to collect passing scores, with no effort whatsoever, on
what today counts as viable programming knowledge and skills assessments.
Can Generative Pre-trained Transformers (GPT) Pass Assessments in Higher Education Programming Courses?
We evaluated the capability of generative pre-trained transformers (GPT) to
pass assessments in introductory and intermediate Python programming courses at
the postsecondary level. Discussions of potential uses (e.g., exercise
generation, code explanation) and misuses (e.g., cheating) of this emerging
technology in programming education have intensified, but to date there has not
been a rigorous analysis of the models' capabilities in the realistic context
of a full-fledged programming course with a diverse set of assessment
instruments. We evaluated GPT on three Python courses that employ assessments
ranging from simple multiple-choice questions (no code involved) to complex
programming projects with code bases distributed into multiple files (599
exercises overall). Further, we studied if and how successfully GPT models
leverage feedback provided by an auto-grader. We found that the current models
are not capable of passing the full spectrum of assessments typically involved
in a Python programming course (<70% on even entry-level modules). Yet, it is
clear that a straightforward application of these easily accessible models
could enable a learner to obtain a non-trivial portion of the overall available
score (>55%) in introductory and intermediate courses alike. While the models
exhibit remarkable capabilities, including correcting solutions based on
the auto-grader's feedback, some limitations exist (e.g., poor handling of
exercises requiring complex chains of reasoning steps). These findings can be
leveraged by instructors wishing to adapt their assessments so that GPT becomes
a valuable assistant for a learner as opposed to an end-to-end solution.
MultiZoo & MultiBench: A Standardized Toolkit for Multimodal Deep Learning
Learning multimodal representations involves integrating information from
multiple heterogeneous sources of data. In order to accelerate progress towards
understudied modalities and tasks while ensuring real-world robustness, we
release MultiZoo, a public toolkit consisting of standardized implementations
of > 20 core multimodal algorithms and MultiBench, a large-scale benchmark
spanning 15 datasets, 10 modalities, 20 prediction tasks, and 6 research areas.
Together, these provide an automated end-to-end machine learning pipeline that
simplifies and standardizes data loading, experimental setup, and model
evaluation. To enable holistic evaluation, we offer a comprehensive methodology
to assess (1) generalization, (2) time and space complexity, and (3) modality
robustness. MultiBench paves the way towards a better understanding of the
capabilities and limitations of multimodal models, while ensuring ease of use,
accessibility, and reproducibility. Our toolkits are publicly available, will
be regularly updated, and welcome input from the community. Published in JMLR
Open Source Software 2023; code available at
https://github.com/pliang279/MultiBenc
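To make the kind of "core multimodal algorithm" such toolkits standardize concrete, here is a minimal late-fusion sketch: each modality is encoded separately and the features are concatenated before a shared prediction head. This is a self-contained NumPy illustration of the general technique, not MultiZoo's actual API; the encoder, dimensions, and weights are all hypothetical stand-ins.

```python
import numpy as np

rng = np.random.default_rng(0)

def encode(x, w):
    """Toy per-modality encoder: a linear map followed by ReLU."""
    return np.maximum(x @ w, 0.0)

# Two modalities with different input dimensions (e.g., text=300, audio=74).
text = rng.normal(size=(8, 300))   # batch of 8 text feature vectors
audio = rng.normal(size=(8, 74))   # batch of 8 audio feature vectors

# Hypothetical weights: one encoder per modality, one shared head.
w_text = rng.normal(size=(300, 32))
w_audio = rng.normal(size=(74, 32))
w_head = rng.normal(size=(64, 2))  # head over the concatenated 32+32 features

# Late fusion: concatenate per-modality embeddings, then predict jointly.
fused = np.concatenate([encode(text, w_text), encode(audio, w_audio)], axis=1)
logits = fused @ w_head            # class scores, one row per example

print(logits.shape)                # (8, 2)
```

Standardizing exactly this kind of plumbing (per-modality encoders, a fusion step, a shared head) across 20+ algorithms and 15 datasets is what lets a benchmark compare fusion strategies fairly.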
FibroDB: Expression Analysis of Protein-Coding and Long Non-Coding RNA Genes in Fibrosis
Most long non-coding RNAs (lncRNAs) are expressed at lower levels than protein-coding genes, and their expression is often restricted to specific cell types, certain time points during development, and various stress and disease conditions. To revisit this long-held concept, we focused on fibroblasts, a common cell type in various organs and tissues. Using fibroblasts and changes in their expression profiles during fibrosis as a model system, we show that the overall expression level of lncRNA genes is significantly lower than that of protein-coding genes. Furthermore, we identified lncRNA genes whose expression is upregulated during fibrosis. Using dermal fibroblasts as a model, we performed loss-of-function experiments and show that knockdown of the lncRNAs LINC00622 and LINC01711 results in gene expression changes associated with cellular and inflammatory responses, respectively. Since there are no lncRNA databases focused on fibroblasts and fibrosis, we built a web application, FibroDB, to further promote functional and mechanistic studies of fibrotic lncRNAs.