XQ2P: Efficient XQuery P2P Time Series Processing
In this demonstration, we propose a model for the management of XML time
series (TS), using the new XQuery 1.1 window operator. We argue that
centralized computation is slow, and demonstrate XQ2P, our prototype of
efficient XQuery P2P TS computation in the context of financial analysis of
large data sets (>1M values).
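As a rough illustration of the kind of windowed time-series computation the abstract refers to, here is a minimal Python sketch of a sliding-window mean. It is not XQuery and not the XQ2P prototype. The function name `sliding_window_mean` and the sample prices are assumptions made only for illustration.

```python
from collections import deque


def sliding_window_mean(values, size):
    """Return the mean of every full sliding window of `size` consecutive values."""
    if size <= 0:
        raise ValueError("window size must be positive")
    window = deque(maxlen=size)  # holds the most recent `size` values
    running = 0.0                # sum of the values currently in the window
    means = []
    for value in values:
        if len(window) == size:
            running -= window[0]  # this value is evicted by the append below
        window.append(value)
        running += value
        if len(window) == size:
            means.append(running / size)
    return means


if __name__ == "__main__":
    # Hypothetical closing prices, used only to exercise the function.
    prices = [10.0, 11.0, 12.0, 13.0, 14.0]
    print(sliding_window_mean(prices, 3))
```

Keeping a running sum means each new value costs constant work regardless of window size, which matters once a series reaches the million-value scale the abstract mentions.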
XML content warehousing: Improving sociological studies of mailing lists and web data
In this paper, we present the guidelines for an XML-based approach for the
sociological study of Web data such as the analysis of mailing lists or
databases available online. The use of an XML warehouse is a flexible solution
for storing and processing this kind of data. We propose an implemented
solution and show possible applications with our case study of profiles of
experts involved in W3C standard-setting activity. We illustrate the
sociological use of semi-structured databases by presenting our XML Schema for
mailing-list warehousing. An XML Schema allows data sources to be added or
cross-referenced freely, without modifying existing data sets, while allowing possible
structural evolution. We also show that the existence of hidden data implies
increased complexity for traditional SQL users. XML content warehousing allows
both exhaustive warehousing and recursive queries over content, with
far less dependence on the initial storage. We finally present the possibility
of exporting the data stored in the warehouse to commonly-used advanced
software devoted to sociological analysis.
The WebStand Project
In this paper we present the state of advancement of the French ANR WebStand
project. The objective of this project is to construct a customizable XML-based
warehouse platform to acquire, transform, analyze, store, query and export data
from the web, in particular mailing lists, with the final intention of using
this data to perform sociological studies focused on social groups of the World
Wide Web, with a specific emphasis on the temporal aspects of this data. We are
currently using this system to analyze the standardization process of the W3C,
through its social network of standard setters.
The supply chain for electric car batteries is changing the world's geopolitics
The rising demand for electric vehicles is changing the geopolitical landscape, as the world pivots away from fossil fuels towards the materials critical to the EV supply chain. As manufacturers and countries race to secure the supply of raw materials for EV batteries, new opportunities and geopolitical risks are emerging. Benjamin Jones, Viet Nguyen-Tien, Robert Elliott and Gavin Harper write about the implications of the race for battery-critical resources.
Softmax Probabilities (Mostly) Predict Large Language Model Correctness on Multiple-Choice Q&A
Although large language models (LLMs) perform impressively on many tasks,
overconfidence remains a problem. We hypothesized that on multiple-choice Q&A
tasks, wrong answers would be associated with smaller maximum softmax
probabilities (MSPs) compared to correct answers. We comprehensively evaluate
this hypothesis on ten open-source LLMs and five datasets, and find strong
evidence for our hypothesis among models which perform well on the original Q&A
task. For the six LLMs with the best Q&A performance, the AUROC derived from
the MSP was better than random chance with p < 10^{-4} in 59/60 instances.
Among those six LLMs, the average AUROC ranged from 60% to 69%. Leveraging
these findings, we propose a multiple-choice Q&A task with an option to abstain
and show that performance can be improved by selectively abstaining based on
the MSP of the initial model response. We also run the same experiments with
pre-softmax logits instead of softmax probabilities and find similar (but not
identical) results.
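The quantities named in this abstract can be made concrete with a minimal Python sketch: the maximum softmax probability (MSP) of a set of answer logits, the AUROC of the MSP as a predictor of correctness, and a score under selective abstention. This is a sketch under stated assumptions, not the authors' code. The function names, the toy data, and the scoring rule (+1 for a correct answer, a penalty for a wrong one, 0 for abstaining) are all hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def max_softmax_probability(logits):
    """MSP: the probability assigned to the model's top answer option."""
    return max(softmax(logits))


def auroc(scores, labels):
    """Probability that a correct answer outscores a wrong one (ties count half)."""
    pos = [s for s, ok in zip(scores, labels) if ok]
    neg = [s for s, ok in zip(scores, labels) if not ok]
    if not pos or not neg:
        raise ValueError("need at least one correct and one wrong answer")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def score_with_abstention(msps, labels, threshold, wrong_penalty=1.0):
    """Assumed scoring rule: +1 correct, -wrong_penalty wrong, 0 when abstaining."""
    total = 0.0
    for msp, ok in zip(msps, labels):
        if msp < threshold:
            continue  # abstain: the model is not confident enough to answer
        total += 1.0 if ok else -wrong_penalty
    return total


if __name__ == "__main__":
    # Toy logits for four-option questions; the values are invented.
    logit_rows = [[4.0, 0.5, 0.2, 0.1], [1.0, 0.9, 0.8, 0.7], [3.0, 0.1, 0.0, 0.2]]
    correct = [True, False, True]
    msps = [max_softmax_probability(row) for row in logit_rows]
    print(auroc(msps, correct))
    print(score_with_abstention(msps, correct, threshold=0.5))
```

An AUROC of 0.5 corresponds to random chance and 1.0 to perfect separation of correct from wrong answers, so the 60% to 69% range reported above indicates a real but moderate signal.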