INRIA, a CCSD electronic archive server
Evaluating and leveraging large language models in clinical pharmacology and therapeutics assessment: From exam takers to exam shapers
Aims: In medical education, the ability of large language models (LLMs) to match human performance raises questions about their potential as educational tools. This study evaluates LLMs' performance on Clinical Pharmacology and Therapeutics (CPT) exams, comparing their results to those of medical students and exploring their ability to identify poorly formulated multiple-choice questions (MCQs). Methods: ChatGPT-4 Omni, Gemini Advanced, Le Chat and DeepSeek R1 were tested on local CPT exams (third year of bachelor's degree, first/second year of master's degree) and the European Prescribing Exam (EuroPE+). The exams included MCQs and open-ended questions assessing knowledge and prescribing skills. LLM results were analysed using the same scoring system as students. A confusion matrix was used to evaluate the ability of ChatGPT and Gemini to identify ambiguous/erroneous MCQs. Results: LLMs achieved comparable or superior results to medical students across all levels. For local exams, LLMs outperformed M1 students and matched L3 and M2 students. In EuroPE+, LLMs significantly outperformed students in both the knowledge and prescribing skills sections. All LLM errors in EuroPE+ were genuine (100%), whereas local exam errors were frequently due to ambiguities or correction flaws (24.3%). When both ChatGPT and Gemini provided the same incorrect answer to an MCQ, the specificity for detecting ambiguous questions was 92.9%, with a negative predictive value of 85.5%. Conclusion: LLMs demonstrate capabilities comparable to or exceeding those of medical students in CPT exams. Their ability to flag potentially flawed MCQs highlights their value not only as educational tools but also as quality-control instruments in exam preparation.
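The specificity and negative predictive value reported above follow from a standard 2x2 confusion matrix. A minimal sketch, with hypothetical counts chosen only to roughly reproduce the reported 92.9% and 85.5% (they are not the study's data):

```python
# Specificity and NPV from a 2x2 confusion matrix, where "positive" means
# an MCQ flagged as ambiguous/erroneous (both LLMs gave the same wrong answer).

def specificity(tn: int, fp: int) -> float:
    """True-negative rate: well-formed MCQs correctly left unflagged."""
    return tn / (tn + fp)

def npv(tn: int, fn: int) -> float:
    """Negative predictive value: share of unflagged MCQs that are well formed."""
    return tn / (tn + fn)

# Hypothetical counts, not the study's data:
tn, fp, fn = 65, 5, 11
print(f"specificity = {specificity(tn, fp):.3f}")  # specificity = 0.929
print(f"NPV = {npv(tn, fn):.3f}")                  # NPV = 0.855
```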
Curvature-Guided Optimal Transport for Rigid Point Cloud Registration
The rigid registration of pairs of point sets is a fundamental step for many downstream tasks, including shape analysis, reconstruction and localization. There has been growing interest in the use of Optimal Transport (OT) for point cloud registration problems. However, these techniques face limited adoption due to scalability issues, which render them impractical, and due to their sensitivity to the missing data commonly encountered in real-world scans. We consider how geometric information may be incorporated into an OT registration framework for improved accuracy and scalability. In this work, we guide mini-batch selection by binning shape features based on local curvature estimates. We demonstrate that our method achieves better results than other OT-based methods and is comparable to the state of the art in terms of successful registrations.
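The curvature-guided batching idea can be illustrated generically. A sketch, assuming a standard PCA-based "surface variation" curvature estimate and quantile binning (the paper's actual feature and binning scheme may differ):

```python
import numpy as np

def surface_variation(points: np.ndarray, k: int = 10) -> np.ndarray:
    """Per-point curvature proxy: smallest PCA eigenvalue of the
    k-nearest-neighbour covariance, normalized by the eigenvalue sum."""
    curv = np.empty(len(points))
    for i, p in enumerate(points):
        d = np.linalg.norm(points - p, axis=1)
        nbrs = points[np.argsort(d)[:k]]          # k nearest (includes p itself)
        lam = np.linalg.eigvalsh(np.cov(nbrs.T))  # ascending eigenvalues
        curv[i] = lam[0] / max(lam.sum(), 1e-12)
    return curv

def curvature_bins(points: np.ndarray, n_bins: int = 4, k: int = 10) -> np.ndarray:
    """Assign each point a bin index so OT mini-batches can be drawn per bin."""
    curv = surface_variation(points, k)
    edges = np.quantile(curv, np.linspace(0, 1, n_bins + 1)[1:-1])
    return np.digitize(curv, edges)

rng = np.random.default_rng(0)
pts = rng.normal(size=(200, 3))
bins = curvature_bins(pts)        # one bin index in [0, 3] per point
```

Drawing each mini-batch from a single bin keeps flat regions from dominating the transport plan, which is the intuition behind the curvature guidance described above.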
Analyse exploratoire des traces numériques clavier pour la prédiction des niveaux d'apprenants (Exploratory analysis of keystroke logs for predicting learner levels)
This paper describes a series of metrics developed to analyze the writing strategies of learners of English and to classify learner essays according to their level of English, expressed in terms of CEFR levels. It presents a typology of keystroke-log metrics for analysing the writing strategies of different learner profiles, applied to a CEFR-level prediction task.
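For illustration, two common keystroke-log measures from the writing-process literature, mean inter-key interval and long-pause rate, can be computed like this (a hedged sketch; the paper's metric typology is richer, and the timestamps below are hypothetical):

```python
def keystroke_metrics(timestamps_ms, pause_ms=2000):
    """Mean inter-key interval (ms) and share of gaps >= pause_ms."""
    gaps = [b - a for a, b in zip(timestamps_ms, timestamps_ms[1:])]
    mean_iki = sum(gaps) / len(gaps)
    pause_rate = sum(g >= pause_ms for g in gaps) / len(gaps)
    return mean_iki, pause_rate

log = [0, 180, 420, 650, 3100, 3300]   # hypothetical key-press times (ms)
iki, rate = keystroke_metrics(log)
print(round(iki), rate)                # 660 0.2
```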
Joint Deep Missing Value Imputation and Clustering of Satellite Image Time Series
Unsupervised classification of satellite image time series (SITS) is prominent in numerous multitemporal remote sensing applications. However, when optical images are concerned, reconstructing missing values becomes pivotal due to the impact of cloud cover on the input SITS. This task is usually addressed as a pre-processing step before feeding the time series to a clustering model. In this paper, we propose a pixelwise SITS clustering algorithm that integrates missing value imputation jointly with the clustering task. Moreover, the clustering process is designed to jointly perform both representation learning and cluster assignment, enabling the proposed model to tackle three tasks simultaneously (missing value imputation, representation learning, and cluster assignment) while being trained in an end-to-end manner. The experimental results show that the proposed model performs well compared to other models that address time series imputation and clustering independently. Furthermore, a visualization analysis suggests that the proposed model learns both the imputation and the clustering effectively despite being trained simultaneously.
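The joint imputation-and-clustering principle can be caricatured without deep networks: a k-means-style loop in which missing entries are re-imputed from the current centroids while assignments are refit. This is only a hedged analogy to the paper's end-to-end model, not its architecture:

```python
import numpy as np

def joint_kmeans_impute(X, mask, k=3, iters=20, seed=0):
    """X: (n, t) series with NaNs where mask is False; returns (imputed X, labels)."""
    rng = np.random.default_rng(seed)
    Xf = np.where(mask, X, np.nanmean(X, axis=0))     # initial column-mean fill
    C = Xf[rng.choice(len(Xf), k, replace=False)]     # initial centroids
    for _ in range(iters):
        d = ((Xf[:, None, :] - C[None]) ** 2).sum(-1) # squared distances (n, k)
        z = d.argmin(1)                               # cluster assignment
        Xf = np.where(mask, X, C[z])                  # impute from own centroid
        C = np.stack([Xf[z == j].mean(0) if (z == j).any() else C[j]
                      for j in range(k)])             # centroid update
    return Xf, z

rng = np.random.default_rng(1)
X = np.concatenate([rng.normal(0, 0.1, (20, 5)), rng.normal(5, 0.1, (20, 5))])
mask = rng.random(X.shape) > 0.2                      # ~20% missing
Xf, z = joint_kmeans_impute(np.where(mask, X, np.nan), mask, k=2)
```

Here each step of imputation uses the current clustering and vice versa, the same coupling the paper realizes with learned representations.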
Brain-wide decoding of numbers and letters: Converging evidence from multivariate fMRI analysis and probabilistic meta-analysis
Does genicular artery embolization compromise future knee surgery in patients with knee osteoarthritis? A strategic call to the community
Fast Inference with Kronecker-Sparse Matrices
Kronecker-sparse (KS) matrices, whose supports are Kronecker products of identity and all-ones blocks, underpin the structure of Butterfly and Monarch matrices and offer the promise of more efficient models. However, existing GPU kernels for KS matrix multiplication suffer from high data-movement costs, with up to 50% of time spent on memory-bound tensor permutations. We propose a fused, output-stationary GPU kernel that eliminates these overheads, reducing global memory traffic threefold. Across 600 KS patterns, our kernel achieves a median FP32 speedup of 1.4x and lowers energy consumption by 15%. A simple heuristic based on KS pattern parameters predicts when our method outperforms existing ones. We release all code at [github.com/PascalCarrivain/ksmm](https://github.com/PascalCarrivain/ksmm), including a PyTorch-compatible *KSLinear* layer, and demonstrate FP32 end-to-end latency reductions of up to 22% in ViT-S/16 and 16% in GPT-2 medium.
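The support structure described above is easy to materialize. A sketch, assuming a parametrization of the form identity ⊗ all-ones block ⊗ identity commonly used for butterfly-style factors (the parameter names a, b, c, d are illustrative):

```python
import numpy as np

def ks_support(a: int, b: int, c: int, d: int) -> np.ndarray:
    """Kronecker-sparse support: I_a ⊗ 1_{b×c} ⊗ I_d."""
    return np.kron(np.kron(np.eye(a), np.ones((b, c))), np.eye(d))

S = ks_support(2, 3, 4, 2)
print(S.shape)        # (12, 16)
print(int(S.sum()))   # 48 nonzeros out of 192 entries
```

The fixed block pattern is what a fused kernel can exploit: nonzero positions are known at compile time, so permutations can be folded into the multiply.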
Cryptography from Lossy Reductions: Towards OWFs from ETH, and Beyond
One-way functions (OWFs) form the foundation of modern cryptography, yet their unconditional existence remains a major open question. In this work, we study this question by exploring its relation to lossy reductions, i.e., reductions that satisfy an information-loss condition for all distributions over inputs of a given size. Our main result is that either OWFs exist or any lossy reduction for any promise problem runs in time essentially matching the infimum of the runtimes of all (worst-case) solvers of that problem on instances of the same size. In fact, our result requires a milder condition, namely reductions that are lossy for sparse uniform distributions (which we call mild-lossiness). It also extends to f-reductions as long as f is a non-constant permutation-invariant Boolean function, which includes And-, Or-, Majority-, Parity-, Modulo- and Threshold-reductions. Additionally, we show that worst-case to average-case Karp reductions and randomized encodings are special cases of mildly-lossy reductions, and we improve the runtime bound above when these mappings are considered. Restricting to weak fine-grained OWFs, this runtime can be improved further. Our results thus provide sufficient conditions under which (fine-grained) OWFs exist assuming the Exponential Time Hypothesis (ETH). Conversely, if (fine-grained) OWFs do not exist, we obtain impossibility results for instance compressions (Harnik and Naor, FOCS 2006) and instance randomizations under the ETH. Finally, we partially extend these findings to the quantum setting: the existence of a pure quantum mildly-lossy reduction within the stated runtime implies the existence of one-way state generators.
Learning a neural solver for parametric PDEs to enhance physics-informed methods
Physics-informed deep learning often faces optimization challenges due to the complexity of solving partial differential equations (PDEs), which involve exploring large solution spaces, require numerous iterations, and can lead to unstable training. These challenges arise particularly from the ill-conditioning of the optimization problem caused by the differential terms in the loss function. To address these issues, we propose learning a solver, i.e., solving PDEs using a physics-informed iterative algorithm trained on data. Our method learns to condition a gradient descent algorithm that automatically adapts to each PDE instance, significantly accelerating and stabilizing the optimization process and enabling faster convergence of physics-aware models. Furthermore, while traditional physics-informed methods solve for a single PDE instance, our approach extends to parametric PDEs. Specifically, we integrate the physical loss gradient with the PDE parameters, allowing our method to solve over a distribution of PDE parameters, including coefficients, initial conditions, and boundary conditions. We demonstrate the effectiveness of our approach through empirical experiments on multiple datasets, comparing both training-time and test-time optimization performance. The code is available at https://github.com/2ailesB/neural-parametric-solver.
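The benefit of conditioning the gradient can be seen on a toy ill-conditioned quadratic standing in for a physics loss. A hedged sketch (the authors learn the conditioning from data; here an idealized diagonal preconditioner plays that role):

```python
import numpy as np

A = np.diag([100.0, 1.0])             # ill-conditioned "PDE operator"

def loss_grad(u):
    return A @ u                      # gradient of the quadratic 0.5 * u^T A u

P = np.diag(1.0 / np.diag(A))         # idealized learned preconditioner

u_gd, u_pre = np.array([1.0, 1.0]), np.array([1.0, 1.0])
for _ in range(20):
    u_gd = u_gd - 0.01 * loss_grad(u_gd)        # plain gradient descent
    u_pre = u_pre - 0.5 * P @ loss_grad(u_pre)  # conditioned update
# The conditioned iterate converges orders of magnitude faster:
print(np.linalg.norm(u_gd), np.linalg.norm(u_pre))
```

Plain gradient descent must use a step size limited by the largest eigenvalue and so crawls along the flat direction; the preconditioned update equalizes the curvature, which is the effect the learned solver aims for across a family of PDE instances.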
Evaluating the Energy Profile of Tasks Managed by Build Automation Tools in Continuous Integration Workflows: The Case of Apache Maven and Gradle
There is a growing interest in the energy impact of computing-related activities, which is expected to increase in the coming years. Modern software development usually relies on Continuous Integration (CI) to support short iterations, where code changes are integrated on a daily basis. The implementation of CI workflows usually relies on build automation tools (e.g., Apache Maven or Gradle) to automate activities such as testing or compiling. The wide adoption of CI practices raises concerns about the impact of such tasks, which usually run in cloud environments where the underlying hardware and its associated energy consumption remain intangible to developers. To better understand the energy footprint of modern software development, it is essential to investigate the energy profile of the tasks managed by Apache Maven and Gradle. To achieve this goal, we performed a large-scale study with 1167 CI workflows implemented through GitHub Actions and mined from popular Java projects hosted on GitHub. After executing them locally, in a controlled environment, we analyzed in depth the energy profile of 183,355 tasks managed by Apache Maven and Gradle. These tasks represent a quarter of the total energy consumption associated with the CI workflows. We found that tasks from workflows of small-sized projects do not necessarily consume less energy than tasks from workflows of medium-sized and large-sized projects. We also found that testing-related tasks consume the most energy, and that the larger the project, the higher the percentage of energy consumption related to testing. Moreover, tasks of different categories have different profiles regarding energy consumption per task and per unit of time.