315 research outputs found

    Spatial-temporal data mining procedure: LASR

    This paper is concerned with the statistical development of our spatial-temporal data mining procedure, LASR (pronounced "laser"). LASR is the abbreviation for Longitudinal Analysis with Self-Registration of large-p-small-n data. It was motivated by a study of "Neuromuscular Electrical Stimulation" (NMES) experiments, where the data are noisy and heterogeneous, might not align from one session to another, and involve a large number of multiple comparisons. The three main components of LASR are: (1) data segmentation for separating heterogeneous data and for distinguishing outliers, (2) automatic approaches for spatial and temporal data registration, and (3) statistical smoothing mapping for identifying "activated" regions based on false-discovery-rate controlled p-maps and movies. Each of the components is of interest in its own right. As a statistical ensemble, the idea of LASR is applicable to other types of spatial-temporal data sets beyond those from the NMES experiments. Comment: Published at http://dx.doi.org/10.1214/074921706000000707 in the IMS Lecture Notes--Monograph Series (http://www.imstat.org/publications/lecnotes.htm) by the Institute of Mathematical Statistics (http://www.imstat.org
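The abstract does not say which false-discovery-rate procedure LASR uses. As a rough illustration of what FDR control over a map of p-values involves, the sketch below applies the standard Benjamini-Hochberg step-up rule, which is swapped in here as an assumption. The function name, the level `q`, and the example p-values are hypothetical and not taken from the paper.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Mark which hypotheses are rejected at FDR level q (Benjamini-Hochberg)."""
    m = len(p_values)
    # Indices of the p-values sorted in ascending order.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest 1-based rank k with p_(k) <= k * q / m.
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * q / m:
            cutoff_rank = rank
    # Reject every hypothesis whose rank is at or below the cutoff.
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= cutoff_rank:
            rejected[idx] = True
    return rejected


# Hypothetical p-values, one per location in a small "p-map".
pixel_p_values = [0.001, 0.008, 0.039, 0.041, 0.042,
                  0.06, 0.074, 0.205, 0.212, 0.216]
activated = benjamini_hochberg(pixel_p_values, q=0.05)
step_up_example = benjamini_hochberg([0.01, 0.04, 0.03, 0.045], q=0.05)
```

In the second call all four hypotheses are rejected even though only two p-values fall below their own rank threshold. This is the step-up behaviour: the cutoff is set by the largest rank that passes.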

    Wardrop Equilibrium Can Be Boundedly Rational: A New Behavioral Theory of Route Choice

    As one of the most fundamental concepts in transportation science, Wardrop equilibrium (WE) has always had a relatively weak behavioral underpinning. To strengthen this foundation, one must reckon with bounded rationality in human decision-making processes, such as the lack of accurate information, limited computing power, and sub-optimal choices. This retreat from behavioral perfectionism in the literature, however, was typically accompanied by a conceptual modification of WE. Here we show that giving up perfect rationality need not force a departure from WE. On the contrary, WE can be reached with global stability in a routing game played by boundedly rational travelers. We achieve this result by developing a day-to-day (DTD) dynamical model that mimics how travelers gradually adjust their route valuations, and hence their choice probabilities, based on past experiences. Our model, called cumulative logit (CULO), resembles the classical DTD models but makes a crucial change: whereas the classical models assume routes are valued based on the cost averaged over historical data, ours values the routes based on the cost accumulated. To describe route choice behaviors, the CULO model uses only two parameters, one accounting for the rate at which the future route cost is discounted in the valuation relative to the past ones and the other describing the sensitivity of route choice probabilities to valuation differences. We prove that the CULO model always converges to WE, regardless of the initial point, as long as the behavioral parameters satisfy certain mild conditions. Our theory thus upholds WE's role as a benchmark in transportation systems analysis. It also resolves the theoretical challenge posed by Harsanyi's instability problem by explaining why equally good routes at WE are selected with different probabilities.
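To make the "accumulated cost plus logit choice" idea concrete, here is a minimal sketch of a day-to-day dynamic on two parallel routes. It is not the paper's CULO specification: the update rule, the linear cost functions, and the parameter names `r` (logit sensitivity) and `eta` (weight on each new day's cost) are all assumptions chosen for illustration.

```python
import math


def route_costs(shares):
    """Hypothetical linear congestion costs on two parallel routes."""
    return [1.0 + 2.0 * shares[0], 2.0 + 1.0 * shares[1]]


def logit_choice(valuations, r):
    """Logit probabilities: a lower accumulated cost gives a higher probability."""
    base = min(valuations)  # shift by the minimum for numerical stability
    weights = [math.exp(-r * (v - base)) for v in valuations]
    total = sum(weights)
    return [w / total for w in weights]


def simulate(days=500, r=1.0, eta=0.1):
    """Accumulate route costs day by day and re-choose with the logit rule."""
    valuations = [0.0, 0.0]
    shares = logit_choice(valuations, r)
    for _ in range(days):
        costs = route_costs(shares)
        # Key assumption: costs are added up, not averaged over history.
        valuations = [v + eta * c for v, c in zip(valuations, costs)]
        shares = logit_choice(valuations, r)
    return shares, route_costs(shares)


final_shares, final_costs = simulate()
```

For these assumed costs the Wardrop equilibrium puts two thirds of the flow on the first route, where both routes cost 7/3. In this toy run the shares settle near that split, so the two equally costly routes end up chosen with different probabilities.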

    Look Before You Leap: An Exploratory Study of Uncertainty Measurement for Large Language Models

    The recent performance leap of Large Language Models (LLMs) opens up new opportunities across numerous industrial applications and domains. However, erroneous generations, such as false predictions, misinformation, and hallucination made by LLMs, have also raised severe concerns about the trustworthiness of LLMs, especially in safety-, security- and reliability-sensitive scenarios, potentially hindering real-world adoption. While uncertainty estimation has shown its potential for interpreting the prediction risks made by general machine learning (ML) models, little is known about whether and to what extent it can help explore an LLM's capabilities and counteract its undesired behavior. To bridge the gap, in this paper, we initiate an exploratory study on the risk assessment of LLMs through the lens of uncertainty. In particular, we experiment with twelve uncertainty estimation methods and four LLMs on four prominent natural language processing (NLP) tasks to investigate to what extent uncertainty estimation techniques could help characterize the prediction risks of LLMs. Our findings validate the effectiveness of uncertainty estimation for revealing LLMs' uncertain/non-factual predictions. In addition to general NLP tasks, we conduct extensive experiments with four LLMs for code generation on two datasets. We find that uncertainty estimation can potentially uncover buggy programs generated by LLMs. Insights from our study shed light on future design and development for reliable LLMs, facilitating further research toward enhancing the trustworthiness of LLMs. Comment: 20 pages, 4 figures
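The abstract does not name the twelve uncertainty estimation methods. One widely used family of measures scores a generation by the entropy of the model's next-token distributions, and the sketch below illustrates that idea only. Whether it matches any method in the study is an assumption, and the probability vectors are hypothetical rather than output from a real model.

```python
import math


def token_entropy(probs):
    """Shannon entropy (in nats) of one next-token probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def sequence_uncertainty(step_distributions):
    """Aggregate per-token entropies over a generated sequence."""
    entropies = [token_entropy(d) for d in step_distributions]
    return {
        "mean_entropy": sum(entropies) / len(entropies),
        "max_entropy": max(entropies),
    }


# Hypothetical distributions over a 4-token vocabulary at three decoding steps.
confident_generation = [
    [0.97, 0.01, 0.01, 0.01],
    [0.90, 0.05, 0.03, 0.02],
    [0.99, 0.005, 0.003, 0.002],
]
uncertain_generation = [
    [0.40, 0.30, 0.20, 0.10],
    [0.25, 0.25, 0.25, 0.25],
    [0.50, 0.30, 0.10, 0.10],
]
confident_score = sequence_uncertainty(confident_generation)
uncertain_score = sequence_uncertainty(uncertain_generation)
```

A higher mean or maximum entropy flags a generation for closer inspection. It does not by itself show that the output is wrong.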

    Ammonia Nitrogen Pollution Characteristics of Natural Rainfall in Urban Business District in Southern China: A Case Study of Chengdu City

    In this work, Chengdu was chosen as a representative southern Chinese city. The characteristics of ammonia nitrogen (NH3-N) pollution in natural rainfall were analyzed by measuring NH3-N concentrations in 15 natural rainfall events from April to September 2017. The influence of ammonia emissions from building toilet vents on NH3-N pollution in rainfall was investigated, and the variation of total NH3-N pollutants and its influencing factors were described. The results showed that the average NH3-N concentration was highest in the first rainfall, reaching 18.2 mg/L. In the subsequent 14 rainfalls, the average concentration was between 2.0 and 5.0 mg/L, which exceeds the Grade V limit (≤2 mg/L) of the Environmental Quality Standards for Surface Water (GB 3838-2002), making rainfall an important source of NH3-N pollution in water. The NH3-N concentration in natural rainfall decreased as the distance between the sampling point and the toilet vent increased, indicating that ammonia discharged from toilet exhaust is a major source of NH3-N pollution in the urban atmosphere. The main factors affecting total NH3-N pollutants in natural precipitation include rainfall intensity, rainfall duration and drought days. The total amount of NH3-N pollutants in surface runoff is less than that in natural rainfall.

    Modeling lightcurves for improved classification of astronomical objects

    Many synoptic surveys are observing large parts of the sky multiple times. The resulting time series of light measurements, called lightcurves, provide a wonderful window to the dynamic nature of the Universe. However, there are many significant challenges in analyzing these lightcurves. We describe a modeling-based approach using Gaussian process regression for generating critical measures for the classification of such lightcurves. This method has key advantages over other popular nonparametric regression methods in its ability to deal with censoring, a mixture of sparsely and densely sampled curves, the presence of annual gaps caused by objects not being visible throughout the year from a given position on Earth, and known but variable measurement errors. We demonstrate that our approach performs better by showing it has a higher correct classification rate than past methods popular in astronomy. Finally, we provide future directions for use in sky surveys that are getting even bigger by the day.
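As a rough illustration of how Gaussian process regression can absorb irregular sampling, gaps, and known but variable measurement errors, the sketch below computes a GP posterior mean with a per-observation noise variance on the kernel diagonal. It is not the paper's model: the squared-exponential kernel, the zero prior mean, the fixed hyperparameters, and the toy data are assumptions, and censoring is not handled.

```python
import math


def rbf(t1, t2, length_scale=1.0):
    """Squared-exponential kernel with unit amplitude (an assumed choice)."""
    return math.exp(-0.5 * ((t1 - t2) / length_scale) ** 2)


def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def gp_posterior_mean(times, values, errors, query_times, length_scale=1.0):
    """Posterior mean of a zero-mean GP with heteroscedastic Gaussian noise."""
    n = len(times)
    # Each point's known error enters as its own variance on the diagonal.
    cov = [[rbf(times[i], times[j], length_scale)
            + (errors[i] ** 2 if i == j else 0.0)
            for j in range(n)] for i in range(n)]
    alpha = solve(cov, values)
    return [sum(rbf(tq, times[i], length_scale) * alpha[i] for i in range(n))
            for tq in query_times]


# Toy mean-subtracted lightcurve with an observing gap between t=3 and t=8.
times = [0.0, 1.0, 2.0, 3.0, 8.0, 9.0, 10.0]
values = [0.5, 0.9, 0.4, -0.3, -0.8, -0.2, 0.6]
errors = [0.01, 0.01, 0.5, 0.01, 0.01, 0.3, 0.01]
fitted = gp_posterior_mean(times, values, errors, [1.0, 2.0, 100.0])
```

Points with small errors are nearly interpolated, noisy points are shrunk toward what their neighbours imply, and predictions far from any data fall back to the zero prior mean.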