Multimodal Benchmarking for NCAA Basketball

Abstract

We present the first multimodal, multitask benchmark for NCAA basketball, combining structured statistical features with large language model (LLM)-generated game summaries across 19,739 games spanning four NCAA Division I seasons (2021-2025). We evaluate three model families (XGBoost, deep neural networks, and Transformers) under tabular-only and early-fusion settings to measure the impact of LLM-derived textual embeddings. To assess practical utility, we simulate fixed-stake and Kelly criterion-based betting strategies using historical bookmaker odds, analyzing both profitability and downside risk via Monte Carlo simulation. Our results show that XGBoost with early fusion achieves the highest return on investment and the lowest risk of loss. This work is, to our knowledge, the first to integrate LLM-generated narrative data with structured inputs for calibrated forecasting in sports, offering a reproducible benchmark for multimodal decision-making under uncertainty.
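
To make the betting evaluation concrete, the following is a minimal sketch of how fixed-stake and Kelly staking could be compared by Monte Carlo simulation. It is not the authors' implementation: the function names, the starting bankroll of 100 units, the 1-unit fixed stake, and the use of decimal odds are all assumptions made for illustration. It also assumes that each game's outcome is drawn from the model's own predicted win probability, which holds only if the forecasts are well calibrated.

```python
import random


def kelly_fraction(p_win, decimal_odds):
    """Kelly stake as a fraction of bankroll; 0 when there is no positive edge."""
    b = decimal_odds - 1.0  # net profit per unit staked on a win
    if b <= 0:
        return 0.0
    return max(0.0, (b * p_win - (1.0 - p_win)) / b)


def simulate_bankroll(bets, strategy, start=100.0, fixed_stake=1.0, rng=None):
    """Play through a sequence of (p_win, decimal_odds) bets once.

    Assumption: outcomes are sampled from p_win, i.e. the model is calibrated.
    """
    rng = rng or random.Random()
    bankroll = start
    for p_win, odds in bets:
        if strategy == "kelly":
            stake = bankroll * kelly_fraction(p_win, odds)
        else:  # fixed-stake: same amount each bet, capped by what is left
            stake = min(fixed_stake, bankroll)
        if stake <= 0:
            continue
        if rng.random() < p_win:
            bankroll += stake * (odds - 1.0)
        else:
            bankroll -= stake
    return bankroll


def risk_of_loss(bets, strategy, n_runs=2000, start=100.0, seed=0):
    """Share of simulated seasons that end below the starting bankroll."""
    rng = random.Random(seed)
    losing_runs = sum(
        simulate_bankroll(bets, strategy, start=start, rng=rng) < start
        for _ in range(n_runs)
    )
    return losing_runs / n_runs


if __name__ == "__main__":
    # Hypothetical slate: 200 bets with a 55% win probability at decimal odds 2.0.
    slate = [(0.55, 2.0)] * 200
    for name in ("fixed", "kelly"):
        print(name, risk_of_loss(slate, name))
```

Return on investment can be derived from the same runs by dividing the final profit by the total amount staked. With historical data, the sampled outcomes would be replaced or complemented by the recorded game results.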

Last time updated on 25/08/2025

This paper was published in DigitalCommons@UConn.
