
    Statistical Challenges in Online Controlled Experiments: A Review of A/B Testing Methodology

    The rise of internet-based services and products in the late 1990s brought about an unprecedented opportunity for online businesses to engage in large-scale, data-driven decision making. Over the past two decades, organizations such as Airbnb, Alibaba, Amazon, Baidu, Booking, Alphabet's Google, LinkedIn, Lyft, Meta's Facebook, Microsoft, Netflix, Twitter, Uber, and Yandex have invested tremendous resources in online controlled experiments (OCEs) to assess the impact of innovation on their customers and businesses. Running OCEs at scale has presented a host of challenges requiring solutions from many domains. In this paper, we review challenges that require new statistical methodologies to address them. In particular, we discuss the practice and culture of online experimentation, as well as its statistics literature, placing the current methodologies within their relevant statistical lineages and providing illustrative examples of OCE applications. Our goal is to raise academic statisticians' awareness of these new research opportunities and to increase collaboration between academia and the online industry.
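At their simplest, many OCE readouts reduce to a two-sample comparison of a metric between control and treatment. As a hedged illustration of that baseline analysis (not a method from the review; all counts are invented), a pooled two-proportion z-test on a conversion metric:

```python
from math import sqrt

def two_proportion_ztest(conv_a, n_a, conv_b, n_b):
    """Pooled two-proportion z-test comparing control (a) vs. treatment (b)
    conversion rates; returns the z-statistic."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    p_pool = (conv_a + conv_b) / (n_a + n_b)          # pooled rate under H0
    se = sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    return (p_b - p_a) / se                            # compare to e.g. +/-1.96

# Invented example: 2.0% vs. 2.2% conversion on 50k users per arm.
z = two_proportion_ztest(conv_a=1000, n_a=50000, conv_b=1100, n_b=50000)
```

At the scale the abstract describes, much of the reviewed methodology goes beyond this baseline (variance reduction, sequential testing, interference), but the two-sample comparison remains the conceptual core.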

    Upgrading of the decision-making process for system development

    The planning horizon in system development is long, and a planner must take into account potential changes to generation capacity and demand over that horizon. The uncertainties that affect system development are greater than those that apply, for example, in system operations. It is therefore important to build up a credible set of operating states that is informed by the various uncertainties and adequately represents the range of conditions that might reasonably be expected to arise. The next step in the system development process is to assess these operating states for system operability. If a power system is not operable in some (probable) operating states, this identifies a potential need for investment in the system. However, given the number and range of uncertainties relevant to system development, pragmatic approaches must also be developed to allow their assessment. In this report, a comprehensive system development approach is presented.
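The screening loop the abstract describes, enumerate candidate operating states, then flag those that fail an operability check as potential investment needs, can be sketched as follows. This is an assumed, deliberately minimal illustration (the report's actual operability assessment is far richer than a single supply-vs-demand check, and all numbers are invented):

```python
from itertools import product

# Assumed scenario levels (MW) spanning the uncertain range.
demand_levels = [800, 1000, 1200]
generation_levels = [900, 1100, 1300]

# Each (demand, generation) pair is one candidate operating state.
operating_states = list(product(demand_levels, generation_levels))

# Flag states where available generation cannot cover demand:
# these are the operability gaps that point at a potential investment need.
gaps = [(d, g) for d, g in operating_states if g < d]
```

A real study would weight each state by its probability and apply full network and security constraints rather than a single energy-balance test.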

    Lost in optimisation of water distribution systems? A literature review of system design

    This is the final version of the article, available from MDPI via the DOI in this record. Optimisation of water distribution system design is a well-established research field, which has been extremely productive since the end of the 1980s. Its primary focus is to minimise the cost of a proposed pipe network infrastructure. This paper systematically reviews articles published over the past three decades that are relevant to the design of new water distribution systems and to the strengthening, expansion and rehabilitation of existing systems, inclusive of design timing, parameter uncertainty, water quality, and operational considerations. It identifies trends and limits in the field, and provides future research directions. Uniquely, this review also presents, in tabular form, comprehensive information from over one hundred and twenty publications, including optimisation model formulations, solution methodologies used, and other important details.
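The cost-minimisation problem at the heart of this field, in its smallest possible form, is choosing the cheapest discrete pipe diameter that still satisfies a hydraulic constraint. A hedged single-pipe sketch using the standard Hazen-Williams head-loss formula (the candidate diameters and unit costs below are invented, not from the review):

```python
def head_loss(L, Q, C, D):
    """Hazen-Williams head loss (SI units): L, D in m; Q in m^3/s;
    C is the dimensionless roughness coefficient."""
    return 10.67 * L * Q**1.852 / (C**1.852 * D**4.87)

# Assumed catalogue: diameter (m) -> cost per metre (arbitrary currency).
candidates = {0.10: 25.0, 0.15: 38.0, 0.20: 55.0, 0.25: 75.0}

L, Q, C, h_max = 1000.0, 0.02, 130.0, 10.0  # assumed pipe data and head-loss limit

# Keep only diameters meeting the head-loss limit, then pick the cheapest.
feasible = {D: cost for D, cost in candidates.items() if head_loss(L, Q, C, D) <= h_max}
best_D = min(feasible, key=feasible.get)
```

For a real network this becomes a combinatorial problem over every pipe simultaneously, coupled through a hydraulic solver, which is why the reviewed literature leans so heavily on metaheuristic optimisation.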

    Adaptive swarm optimisation assisted surrogate model for pipeline leak detection and characterisation.

    Pipelines are often subject to leakage due to ageing, corrosion and weld defects. Pipeline leakage is difficult to avoid because the sources of leaks are diverse. Various pipeline leakage detection methods, including fibre optics, pressure point analysis and numerical modelling, have been proposed over the last few decades. One major issue with these methods is distinguishing the leak signal without raising false alarms. Considering that the data obtained by these traditional methods are digital in nature, machine learning models have been adopted to improve the accuracy of pipeline leakage detection. However, most of these methods rely on a large training dataset, and experimental data for accurate model training are difficult to obtain. Reasons include the huge cost of an experimental setup covering all possible scenarios, poor accessibility to remote pipelines, and labour-intensive experiments. Moreover, datasets constructed from data acquired in laboratory or field tests are usually imbalanced, as leakage data samples are generated from artificial leaks. Computational fluid dynamics (CFD) offers the benefit of detailed and accurate pipeline leakage modelling, which may be difficult to obtain experimentally or with an analytical approach. However, CFD simulation is typically time-consuming and computationally expensive, limiting its applicability to real-time applications. To alleviate the high computational cost of CFD modelling, this study proposed a novel data sampling optimisation algorithm, called the Adaptive Particle Swarm Optimisation Assisted Surrogate Model (PSOASM), to select simulation scenarios systematically, in an adaptive and optimised manner. The algorithm was designed to place a new sample in poorly sampled regions of the parameter space of parametrised leakage scenarios, regions that uniform sampling methods may easily miss.
    This was achieved using two criteria: the population density of the training dataset and the model prediction fitness value. The model prediction fitness value was used to enhance the global exploration capability of the surrogate model, while the population density of training data samples improved the local accuracy of the surrogate model. The proposed PSOASM was compared with four conventional sequential sampling approaches and tested on six benchmark functions commonly used in the literature. Different machine learning algorithms were explored with the developed model, and the effect of the initial sample size on surrogate model performance was evaluated. Next, pipeline leakage detection analysis, with particular emphasis on a multiphase flow system, was carried out to find the flow field parameters that provide pertinent indicators for pipeline leakage detection and characterisation. Plausible leak scenarios which may occur in the field were simulated for a gas-liquid pipeline using a three-dimensional RANS CFD model. The perturbation of the pertinent flow field indicators under different leak scenarios is reported, which is expected to improve the understanding of multiphase flow behaviour induced by leaks. The simulation results were validated against the latest experimental and numerical data reported in the literature. The proposed surrogate model was later applied to pipeline leak detection and characterisation. The CFD modelling results showed that fluid flow parameters are pertinent indicators in pipeline leak detection. It was observed that upstream pipeline pressure could serve as a critical indicator for detecting leakage, even if the leak size is small, whereas the downstream flow rate is the dominant leakage indicator if flow rate monitoring is chosen for leak detection.
    The results also reveal that when two leaks of different sizes co-occur in a single pipe, detecting the smaller leak becomes difficult if its size is below 25% of the larger leak's size. However, in the event of a double leak of equal dimensions, the leak closer to the upstream end of the pipe is easier to detect. The results from all the analyses demonstrate the PSOASM algorithm's superiority over the well-known sequential sampling schemes used for comparison. The test results show that the PSOASM algorithm can be applied to pipeline leak detection with limited training datasets, and it provides a general framework for improving computational efficiency through adaptive surrogate modelling in various real-life applications.
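The two sampling criteria the abstract names, low local population density and the surrogate's prediction fitness, can be sketched as a scored candidate search. This is a hedged illustration of the general density-plus-fitness idea only, not the authors' PSOASM (which uses particle swarm optimisation rather than the simple candidate scan below); the weighting scheme and all data are assumptions:

```python
import random
from math import dist

def next_sample(existing, candidates, fitness, w_density=0.5):
    """Pick the next simulation scenario by combining two scores:
    sparsity (distance to the nearest existing sample, favouring poorly
    sampled regions) and a surrogate fitness term (favouring exploration
    of promising regions)."""
    def score(c):
        sparsity = min(dist(c, x) for x in existing)   # far from data = undersampled
        return w_density * sparsity + (1 - w_density) * fitness(c)
    return max(candidates, key=score)

# Invented 2-D parameter space of leakage scenarios.
random.seed(0)
existing = [(random.random(), random.random()) for _ in range(5)]
candidates = [(random.random(), random.random()) for _ in range(50)]

# With w_density=1.0 the rule is purely density-driven: pick the candidate
# farthest from all existing samples.
chosen = next_sample(existing, candidates, fitness=lambda c: 0.0, w_density=1.0)
```

Each chosen scenario would then be run through the (expensive) CFD model, the surrogate retrained, and the scores recomputed, which is where the adaptivity comes from.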

    Toward an Understanding of the Progenitors of Gamma-Ray Bursts

    The various possibilities for the progenitors of gamma-ray bursts (GRBs) manifest in differing observable properties. Through deep spectroscopic and high-resolution imaging observations of some GRB hosts, I demonstrate that well-localized long-duration GRBs are connected with otherwise normal star-forming galaxies at moderate redshifts of order unity. I test various progenitor scenarios by examining the offset distribution of GRBs about their apparent hosts, making extensive use of ground-based optical data from Keck and Palomar and space-based imaging from the Hubble Space Telescope. The offset distribution appears to be inconsistent with the coalescing neutron star binary hypothesis but statistically consistent with a population of progenitors that closely traces the ultraviolet light of galaxies. This is naturally explained by bursts which originate from the collapse of massive stars. This claim is further supported by the unambiguous detections of emission "bumps" which can be explained as supernovae occurring at approximately the same time as the associated GRB; if true, GRB 980326 and GRB 011121 provide strong observational evidence connecting cosmological GRBs to high-redshift supernovae and implicate massive stars as the progenitors of some long-duration GRBs. Interestingly, most alternative models of these bumps require wind-stratified circumburst media; this, too, implicates massive stars. In addition to this work, I also constructed the Jacobs Camera (JCAM), a dual-beam optical camera for the Palomar 200-inch Telescope designed to follow up rapid GRB localizations (abridged).
    Comment: Ph.D. thesis, Caltech. 196 pages including low-resolution figures. Abstract to be published in PASP, February 2003. Defended April 1, 2002. A high-resolution PDF version may be found at http://www-cfa.harvard.edu/~jbloom/thesis.htm