LEGOBench: Scientific Leaderboard Generation Benchmark
The ever-increasing volume of paper submissions makes it difficult to stay
informed about the latest state-of-the-art research. To address this challenge,
we introduce LEGOBench, a benchmark for evaluating systems that generate
scientific leaderboards. LEGOBench is curated from 22 years of preprint
submission data on arXiv and more than 11k machine learning leaderboards on the
PapersWithCode portal. We present four graph-based and two language model-based
leaderboard generation task configurations. We evaluate popular encoder-only
scientific language models as well as decoder-only large language models across
these task configurations. State-of-the-art models showcase significant
performance gaps in automatic leaderboard generation on LEGOBench. The code is
available on GitHub (https://github.com/lingo-iitgn/LEGOBench) and the
dataset is hosted on OSF
(https://osf.io/9v2py/?view_only=6f91b0b510df498ba01595f8f278f94c).
Spider4SPARQL: A Complex Benchmark for Evaluating Knowledge Graph Question Answering Systems
With the recent spike in the number and availability of Large Language Models
(LLMs), it has become increasingly important to provide large and realistic
benchmarks for evaluating Knowledge Graph Question Answering (KGQA) systems. So
far the majority of benchmarks rely on pattern-based SPARQL query generation
approaches. The subsequent natural language (NL) question generation is
conducted through crowdsourcing or other automated methods, such as rule-based
paraphrasing or NL question templates. Although some of these datasets are of
considerable size, their pitfall lies in their pattern-based generation
approaches, which do not always generalize well to the vague and linguistically
diverse questions asked by humans in real-world contexts.
In this paper, we introduce Spider4SPARQL - a new SPARQL benchmark dataset
featuring 9,693 previously existing manually generated NL questions and 4,721
unique, novel, and complex SPARQL queries of varying complexity. In addition to
the NL/SPARQL pairs, we also provide their corresponding 166 knowledge graphs
and ontologies, which cover 138 different domains. Our complex benchmark
enables novel ways of evaluating the strengths and weaknesses of modern KGQA
systems. We evaluate the benchmark with state-of-the-art KGQA systems as well as
LLMs, which achieve only up to 45% execution accuracy, demonstrating that
Spider4SPARQL is a challenging benchmark for future research.
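The abstract reports execution accuracy but does not define it. The sketch below shows one common reading of the metric: a predicted query counts as correct when the rows it returns match the rows returned by the gold query, ignoring row order. The function names and the toy result rows are hypothetical and are not taken from the Spider4SPARQL code; the benchmark's own scorer may differ (for example, in how it treats row order or duplicates).

```python
from collections import Counter
from typing import Hashable, Iterable, Sequence

# A result row is modelled as a tuple of hashable values (assumption).
Row = tuple[Hashable, ...]


def same_result(predicted: Iterable[Row], gold: Iterable[Row]) -> bool:
    """Compare two query results as multisets of rows, ignoring row order."""
    return Counter(predicted) == Counter(gold)


def execution_accuracy(pairs: Sequence[tuple[list[Row], list[Row]]]) -> float:
    """Fraction of (predicted, gold) result pairs that match."""
    if not pairs:
        return 0.0
    hits = sum(same_result(pred, gold) for pred, gold in pairs)
    return hits / len(pairs)


# Toy results standing in for rows returned by a SPARQL endpoint (hypothetical).
results = [
    ([("Alice",), ("Bob",)], [("Bob",), ("Alice",)]),  # same rows, other order
    ([("Paris",)], [("Berlin",)]),                      # wrong answer
]
accuracy = execution_accuracy(results)
print(accuracy)
```

Comparing results rather than query strings means that two differently written queries still count as equivalent when they return the same answer.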
Tailored gamification and serious game framework based on fuzzy logic for saving energy in connected thermostats
Connected thermostats (CTs) often save less energy than predicted because consumers may not know how to use them and may not be engaged in saving energy. Additionally, several models perform contrary to consumers’ expectations and are thus not used as intended. As a result, CTs save less energy and are underused in households. This paper reviews aspects of gamification and serious games focused on engaging consumers. A gamification and serious games framework for saving energy is proposed, tailored by a fuzzy logic system to motivate connected thermostat consumers. This intelligent framework can customize the gamification and serious game strategy for each consumer, because the fuzzy logic systems can be adapted to each consumer's requirements. The framework is designed to teach, engage, and motivate consumers while helping them save electrical energy when using their thermostats. The proposed framework is described, along with a mockup that can be run on a cellphone. Although this framework is designed to be implemented in CTs, it can be translated to other energy devices in smart homes.
ProphNet: Efficient Agent-Centric Motion Forecasting with Anchor-Informed Proposals
Motion forecasting is a key module in an autonomous driving system. Due to
the heterogeneous nature of multi-sourced input, multimodality in agent
behavior, and low latency required by onboard deployment, this task is
notoriously challenging. To cope with these difficulties, this paper proposes a
novel agent-centric model with anchor-informed proposals for efficient
multimodal motion prediction. We design a modality-agnostic strategy to
concisely encode the complex input in a unified manner. We generate diverse
proposals, fused with anchors bearing goal-oriented scene context, to induce
multimodal prediction that covers a wide range of future trajectories. Our
network architecture is highly uniform and succinct, leading to an efficient
model amenable for real-world driving deployment. Experiments reveal that our
agent-centric network compares favorably with the state-of-the-art methods in
prediction accuracy, while achieving scene-centric level inference latency.
Comment: CVPR 2023 (Highlight)
Open Research Knowledge Graph
As we mark the fifth anniversary of the alpha release of the Open Research
Knowledge Graph (ORKG), it is both timely and exhilarating to celebrate the significant
strides made in this pioneering project. We designed this book as a tribute
to the evolution and achievements of the ORKG and as a practical guide encapsulating
its essence in a form that resonates with both the general reader and the
specialist.
The ORKG has opened a new era in the way scholarly knowledge is curated, managed,
and disseminated. By transforming vast arrays of unstructured narrative text
into structured, machine-processable knowledge, the ORKG has emerged as an
essential service with sophisticated functionalities. Over the past five years, our
team has developed the ORKG into a vibrant platform that enhances the accessibility
and visibility of scientific research. This book serves as a non-technical guide
and a comprehensive reference for new and existing users that outlines the
ORKG’s approach, technologies, and its role in revolutionizing scholarly communication.
By elucidating how the ORKG facilitates the collection, enhancement, and
sharing of knowledge, we invite readers to appreciate the value and potential of
this groundbreaking digital tool presented in a tangible form.
Looking ahead, we are thrilled to announce the upcoming unveiling of promising
new features and tools at the fifth-year celebration of the ORKG’s alpha release.
These innovations are set to redefine the boundaries of machine assistance enabled
by research knowledge graphs. Among these enhancements, you can expect
more intuitive interfaces that simplify the user experience, and enhanced machine learning
models that improve the automation and accuracy of data curation.
We also included a glossary tailored to clarifying key terms and concepts associated
with the ORKG to ensure that all readers, regardless of their technical background,
can fully engage with and understand the content presented. This book
transcends the boundaries of a typical technical report. We crafted this as an inspiration
for future applications, a testament to the ongoing evolution in scholarly
communication that invites further collaboration and innovation. Let this book serve
as both your guide and invitation to explore the ORKG as it continues to grow and
shape the landscape of scientific inquiry and communication.
OpenAssistant Conversations -- Democratizing Large Language Model Alignment
Aligning large language models (LLMs) with human preferences has proven to
drastically improve usability and has driven rapid adoption as demonstrated by
ChatGPT. Alignment techniques such as supervised fine-tuning (SFT) and
reinforcement learning from human feedback (RLHF) greatly reduce the required
skill and domain knowledge to effectively harness the capabilities of LLMs,
increasing their accessibility and utility across various domains. However,
state-of-the-art alignment techniques like RLHF rely on high-quality human
feedback data, which is expensive to create and often remains proprietary. In
an effort to democratize research on large-scale alignment, we release
OpenAssistant Conversations, a human-generated, human-annotated assistant-style
conversation corpus consisting of 161,443 messages in 35 different languages,
annotated with 461,292 quality ratings, resulting in over 10,000 complete and
fully annotated conversation trees. The corpus is a product of a worldwide
crowd-sourcing effort involving over 13,500 volunteers. Models trained on
OpenAssistant Conversations show consistent improvements on standard benchmarks
over respective base models. We release our code and data under a fully
permissive licence.
Comment: Published in NeurIPS 2023 Datasets and Benchmarks
Unsupervised learning for vascular heterogeneity assessment of glioblastoma based on magnetic resonance imaging: The Hemodynamic Tissue Signature
The future of medical imaging is linked to Artificial Intelligence (AI). The manual analysis of medical images is nowadays an arduous, error-prone and often unaffordable task for humans, which has caught the attention of the Machine Learning (ML) community. Magnetic Resonance Imaging (MRI) provides a wide variety of rich representations of the morphology and behavior of lesions that are completely inaccessible without a risky invasive intervention. Nevertheless, harnessing the powerful but often latent information contained in MRI acquisitions is a very complicated task, which requires intelligent computational analysis techniques.
Central nervous system tumors are among the most critical diseases studied through MRI. Specifically, glioblastoma represents a major challenge, as it remains a lethal cancer that, to date, lacks a satisfactory therapy. Of the entire set of characteristics that make glioblastoma so aggressive, one aspect that has been widely studied is its vascular heterogeneity. The strong vascular proliferation of glioblastomas, as well as their robust angiogenesis and extensive microvasculature heterogeneity, have been held responsible for the high lethality of the neoplasm.
This thesis focuses on the research and development of the Hemodynamic Tissue Signature (HTS) method: an unsupervised ML approach to describe the vascular heterogeneity of glioblastomas by means of perfusion MRI analysis. The HTS builds on the concept of habitats. A habitat is defined as a sub-region of the lesion with a particular MRI profile describing a specific physiological behavior. The HTS method delineates four habitats within the glioblastoma: the HAT habitat, as the most perfused region of the enhancing tumor; the LAT habitat, as the region of the enhancing tumor with a lower angiogenic profile; the IPE habitat, as the non-enhancing region adjacent to the tumor with elevated perfusion indexes; and the VPE habitat, as the remaining edema of the lesion with the lowest perfusion profile. The research and development of the HTS method has generated a number of contributions in this thesis.
First, to verify that unsupervised learning methods can reliably extract MRI patterns that describe the heterogeneity of a lesion, several unsupervised learning methods were compared on the task of high-grade glioma segmentation. Second, a Bayesian unsupervised learning algorithm from the family of Spatially Varying Finite Mixture Models is proposed. The algorithm integrates a Markov Random Field prior density weighted by the probabilistic Non-Local Means function, to encode the idea that neighboring pixels tend to belong to the same semantic object. Third, the HTS method to describe the vascular heterogeneity of glioblastomas is presented. The HTS method has been applied to real cases, both in a local single-center cohort of patients and in an international retrospective cohort of more than 180 patients from 7 European centers. A comprehensive evaluation of the method was conducted to measure the prognostic potential of the HTS habitats. Finally, the technology developed in this thesis has been integrated into an online open-access platform for academic use. The ONCOhabitats platform is hosted at https://www.oncohabitats.upv.es and provides two main services: 1) glioblastoma tissue segmentation, and 2) vascular heterogeneity assessment of glioblastomas by means of the HTS method.
The results of this thesis have been published in ten scientific contributions, including top-ranked journals and conferences in the areas of Medical Informatics, Statistics and Probability, Radiology & Nuclear Medicine and Machine Learning. An industrial patent registered in Spain, Europe and the USA was also issued. Finally, the original ideas conceived in this thesis led to the foundation of ONCOANALYTICS CDX, a company operating under the business model of companion diagnostics for pharmaceutical compounds.
In this regard, I would like to thank the various institutions and research funding bodies that have contributed to the development of this thesis. In particular, I would like to thank the Universitat Politècnica de València, where I have developed my entire academic and scientific career, as well as the Ministerio de Ciencia e Innovación, the Ministerio de Economía y Competitividad, the European Commission, the EIT Health Programme and the Caixa Impulse foundation.
Juan Albarracín, J. (2020). Unsupervised learning for vascular heterogeneity assessment of glioblastoma based on magnetic resonance imaging: The Hemodynamic Tissue Signature [Doctoral thesis]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/149560
Translating Natural Language Queries to SQL Using the T5 Model
This paper presents the development process of a natural language to SQL
model using the T5 model as the basis. The models, developed in August 2022 for
an online transaction processing system and a data warehouse, achieve 73% and
84% exact match accuracy, respectively. These models, in conjunction with other
work completed in the research project, were implemented for several companies
and used successfully on a daily basis. The approach used in the model
development could be implemented in a similar fashion for other database
environments and with a more powerful pre-trained language model.
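The abstract quotes exact match accuracy without saying how queries are compared. As a minimal sketch, assuming the metric is string equality after light normalization, it could be computed as below. The normalization rule (collapsing whitespace and dropping a trailing semicolon), the function names, and the sample queries are all assumptions for illustration; the paper's own scoring procedure may differ.

```python
import re


def normalize_sql(query: str) -> str:
    """Collapse runs of whitespace and drop a trailing semicolon (assumed rule)."""
    collapsed = re.sub(r"\s+", " ", query.strip())
    return collapsed.rstrip(";").strip()


def exact_match_accuracy(predictions: list[str], references: list[str]) -> float:
    """Fraction of predictions equal to their reference after normalization."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        return 0.0
    matches = sum(
        normalize_sql(pred) == normalize_sql(ref)
        for pred, ref in zip(predictions, references)
    )
    return matches / len(references)


# Hypothetical model outputs and gold queries.
predicted = [
    "SELECT name FROM customers;",
    "SELECT COUNT(*) FROM orders",
    "SELECT id FROM items WHERE price > 10",
]
gold = [
    "SELECT  name\nFROM customers",
    "SELECT COUNT(*) FROM orders",
    "SELECT id FROM items WHERE price >= 10",
]
score = exact_match_accuracy(predicted, gold)
print(round(score, 3))
```

Exact match is stricter than execution-based scoring: a prediction that is semantically equivalent to the reference but written differently still counts as a miss.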