Laterally constrained low-rank seismic data completion via cyclic-shear transform
A crucial step in seismic data processing consists in reconstructing the
wavefields at spatial locations where faulty or absent sources and/or receivers
result in missing data. Several developments in seismic acquisition and
interpolation strive to restore signals fragmented by sampling limitations;
still, seismic data frequently remain poorly sampled in the source, receiver,
or both coordinates. An intrinsic limitation of real-life dense acquisition
systems, which are often exceedingly expensive, is that they remain unable to
circumvent various physical and environmental obstacles, ultimately hindering a
proper recording scheme. In many situations, when the preferred reconstruction
method fails to render the actual continuous signals, subsequent imaging
studies are negatively affected by sampling artefacts. A recent alternative
builds on low-rank completion techniques to deliver superior restoration
results on seismic data, paving the way for data kernel compression that can
potentially unlock multiple modern processing methods so far prohibited in 3D
field scenarios. In this work, we propose a novel transform domain that reveals
the low-rank character of seismic data while avoiding the inherent matrix
enlargement introduced when the data are sorted in the midpoint-offset domain.
We also develop a robust extension of the current matrix completion framework
that accounts for lateral physical constraints, ensuring a degree of proximity
similarity among neighbouring points. Our strategy successfully interpolates
missing sources and receivers simultaneously in synthetic and field data.
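The low-rank completion machinery this abstract builds on can be illustrated with a minimal singular value thresholding (SVT) sketch. Everything below — the SVT iteration, the parameters, and the random toy matrix — is a generic illustration of low-rank matrix completion, not the authors' transform-domain method:

```python
import numpy as np

def svt_complete(M_obs, mask, tau=5.0, step=1.0, iters=200):
    """Iterative singular value thresholding for matrix completion.

    M_obs : observed matrix with missing entries set to 0
    mask  : 1 where an entry is observed, 0 where it is missing
    """
    Y = np.zeros_like(M_obs)
    X = Y
    for _ in range(iters):
        U, s, Vt = np.linalg.svd(Y, full_matrices=False)
        s = np.maximum(s - tau, 0.0)        # soft-threshold the singular values
        X = (U * s) @ Vt                    # current low-rank estimate
        Y = Y + step * mask * (M_obs - X)   # push agreement on observed entries
    return X

# Toy example: a 30x30 rank-2 matrix with roughly 40% of entries missing.
rng = np.random.default_rng(0)
A = rng.standard_normal((30, 2)) @ rng.standard_normal((2, 30))
mask = (rng.random(A.shape) < 0.6).astype(float)
A_hat = svt_complete(A * mask, mask)
rel_err = np.linalg.norm(A_hat - A) / np.linalg.norm(A)
```

The practical point, which also motivates the paper's transform, is that such completion only works in a domain where the fully sampled data are genuinely low rank.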
Voltage sag state estimation based on l_1-norm minimization methods in radial electric power distribution system
Voltage sags have a high impact on proper equipment operation and on the continuity of end-user processes. The resulting economic losses are a growing problem for utilities, regulators, and final customers of electric energy, so new mathematical methods for diagnosing voltage sags are needed. In this sense, state estimation methods seek to determine the frequency, or number, of voltage sags that an end-user experiences. In this research area, optimization problems have been formulated based on techniques such as singular value decomposition, voltage profile curve fitting, and voltage sag source location. The results of these approaches can be inaccurate when pre-fault currents, non-zero fault impedances, and unbalanced conditions are considered. We show that the results of the singular value decomposition method are inaccurate under these realistic fault conditions. We also propose a new mathematical formulation of the voltage sag state estimation problem based on ℓ1-norm minimization. The proposed method is applied to and validated on the IEEE 33-node test distribution network; only voltage sags caused by faults in the distribution network are considered. The results show a remarkable improvement over the singular value decomposition method and provide an innovative tool for voltage sag state estimation in radial electric power distribution systems.
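An ℓ1-norm minimization of the kind the abstract describes can be recast as a linear program and solved with off-the-shelf tools. The sketch below is a generic illustration under an assumed linear measurement model, not the paper's exact network formulation; the matrix sizes and sparse "fault state" vector are invented for the example:

```python
import numpy as np
from scipy.optimize import linprog

def l1_min(A, b):
    """min ||x||_1  s.t.  A x = b, recast as an LP with bounds t >= |x|."""
    m, n = A.shape
    c = np.concatenate([np.zeros(n), np.ones(n)])  # minimize sum(t)
    A_ub = np.block([[ np.eye(n), -np.eye(n)],     #  x - t <= 0
                     [-np.eye(n), -np.eye(n)]])    # -x - t <= 0
    b_ub = np.zeros(2 * n)
    A_eq = np.hstack([A, np.zeros((m, n))])
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b,
                  bounds=[(None, None)] * n + [(0, None)] * n)
    return res.x[:n]

# Underdetermined system (10 equations, 30 unknowns) with a sparse solution,
# loosely mimicking few monitored buses and many candidate fault states.
rng = np.random.default_rng(1)
A = rng.standard_normal((10, 30))
x_true = np.zeros(30)
x_true[[3, 17]] = [2.0, -1.5]
x_hat = l1_min(A, A @ x_true)
```

The attraction of the ℓ1 objective over least squares here is its preference for sparse solutions, which matches the physical expectation that only a few faults are active at once.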
Tensor Networks for Dimensionality Reduction and Large-Scale Optimizations. Part 2 Applications and Future Perspectives
Part 2 of this monograph builds on the introduction to tensor networks and
their operations presented in Part 1. It focuses on tensor network models for
super-compressed higher-order representation of data/parameters and related
cost functions, while providing an outline of their applications in machine
learning and data analytics. A particular emphasis is on the tensor train (TT)
and Hierarchical Tucker (HT) decompositions, and their physically meaningful
interpretations which reflect the scalability of the tensor network approach.
Through a graphical approach, we also elucidate how, by virtue of the
underlying low-rank tensor approximations and sophisticated contractions of
core tensors, tensor networks have the ability to perform distributed
computations on otherwise prohibitively large volumes of data/parameters,
thereby alleviating or even eliminating the curse of dimensionality. The
usefulness of this concept is illustrated over a number of applied areas,
including generalized regression and classification (support tensor machines,
canonical correlation analysis, higher order partial least squares),
generalized eigenvalue decomposition, Riemannian optimization, and in the
optimization of deep neural networks. Part 1 and Part 2 of this work can be
used either as stand-alone separate texts, or indeed as a conjoint
comprehensive review of the exciting field of low-rank tensor networks and
tensor decompositions.
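The tensor train (TT) decomposition emphasized above can be computed by the standard TT-SVD procedure: sweep through the modes, unfold, and truncate an SVD at each step. The following is a minimal sketch of that textbook algorithm (not code from the monograph):

```python
import numpy as np

def tt_decompose(T, eps=1e-12):
    """TT-SVD: factor a d-way tensor into a chain of 3-way cores
    via sequential truncated SVDs of its unfoldings."""
    dims, d = T.shape, T.ndim
    cores, r = [], 1
    W = T.reshape(dims[0], -1)
    for k in range(d - 1):
        W = W.reshape(r * dims[k], -1)
        U, s, Vt = np.linalg.svd(W, full_matrices=False)
        rank = max(1, int(np.sum(s > eps * s[0])))   # drop negligible modes
        cores.append(U[:, :rank].reshape(r, dims[k], rank))
        W = s[:rank, None] * Vt[:rank]               # carry the rest forward
        r = rank
    cores.append(W.reshape(r, dims[-1], 1))
    return cores

def tt_reconstruct(cores):
    """Contract the chain of cores back into a full tensor."""
    T = cores[0]
    for G in cores[1:]:
        T = np.tensordot(T, G, axes=(-1, 0))
    return T.squeeze(axis=(0, -1))

T = np.random.default_rng(2).standard_normal((4, 5, 6))
cores = tt_decompose(T)
```

For tensors with genuinely low TT ranks, the cores store far fewer numbers than the full array — the "super-compression" the text refers to.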
Integrated load and state estimation using domestic smart meter measurements
The UK Government is promoting the decarbonisation of the power sector. The electrification of transport and heating, installation of distributed generators, development of smart grids and creation of an electricity and gas smart metering system are in progress.
Higher penetrations of distributed generation and low carbon loads may lead to operational difficulties in distribution networks. Therefore, increased real-time monitoring and control becomes a necessary requirement. Distribution network operators will have available to them smart meter measurements to facilitate safe and cost-effective operation of distribution networks. This thesis investigates the application of smart meter measurements to extend the observability of distribution networks.
Three main aspects were covered in this work:
1. The development of a cluster analysis algorithm to extract consumption patterns from smart meter measurements.
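The thesis does not specify its clustering algorithm in this excerpt, but the idea of extracting consumption patterns from smart meter readings can be sketched with plain k-means on daily load profiles. The synthetic "morning peak" and "evening peak" households below are illustrative assumptions:

```python
import numpy as np

def kmeans(X, k, iters=50):
    """Plain k-means on the rows of X (one row = one daily load profile)."""
    # Naive deterministic init: k profiles spread evenly through the data.
    centers = X[np.linspace(0, len(X) - 1, k).astype(int)].copy()
    labels = np.zeros(len(X), dtype=int)
    for _ in range(iters):
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)              # nearest-center assignment
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return labels, centers

# Synthetic half-hourly profiles (48 readings/day): 20 households with a
# morning consumption peak, 20 with an evening peak.
rng = np.random.default_rng(3)
t = np.arange(48)
morning = np.exp(-0.5 * ((t - 16) / 3.0) ** 2)
evening = np.exp(-0.5 * ((t - 36) / 3.0) ** 2)
X = np.vstack([morning + 0.05 * rng.standard_normal(48) for _ in range(20)]
              + [evening + 0.05 * rng.standard_normal(48) for _ in range(20)])
labels, centers = kmeans(X, 2)
```

The recovered cluster centers then serve as typical consumption patterns, which is the kind of pseudo-measurement that can feed a distribution-system state estimator.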
Ground-based synthetic aperture radar (GBSAR) interferometry for deformation monitoring
Ph.D. thesis. Ground-based synthetic aperture radar (GBSAR), together with interferometry, represents a powerful tool for deformation monitoring. GBSAR has inherent flexibility, allowing data to be collected with adjustable temporal resolutions in either continuous or discontinuous mode. The goal of this research is to develop a framework to effectively utilise GBSAR for deformation monitoring in both modes, with an emphasis on accuracy, robustness, and real-time capability.
To achieve this goal, advanced Interferometric SAR (InSAR) processing algorithms have been proposed to address existing issues in conventional interferometry for GBSAR deformation monitoring. The proposed interferometric algorithms include a new non-local method for the accurate estimation of coherence and interferometric phase, a new approach to selecting coherent pixels with the aim of maximising the density of selected pixels and optimizing the reliability of time series analysis, and a rigorous model for the correction of atmospheric and repositioning errors.
On the basis of these algorithms, two complete interferometric processing chains have been developed: one for continuous and the other for discontinuous GBSAR deformation monitoring. The continuous chain is able to process infinite incoming images in real time and extract the evolution of surface movements through temporally coherent pixels. The discontinuous chain integrates additional automatic coregistration of images and correction of repositioning errors between different campaigns.
Successful deformation monitoring applications have been completed, including three continuous (a dune, a bridge, and a coastal cliff) and one discontinuous (a hillside), demonstrating the feasibility and effectiveness of the presented algorithms and chains for high-accuracy GBSAR interferometric measurement. Significant deformation signals were detected in the three continuous applications and no deformation in the discontinuous one. The achieved results are justified quantitatively via a defined precision indicator for the time series estimation and validated qualitatively via a priori knowledge of the observed sites.
Funding: China Scholarship Council (CSC), Newcastle University.
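For context, the conventional coherence estimator that the thesis's non-local method improves upon is a boxcar average of the interferometric cross-product. A minimal sketch of that baseline estimator, on synthetic complex speckle (my own toy data, not the thesis's algorithm):

```python
import numpy as np
from scipy.ndimage import uniform_filter

def local_coherence(s1, s2, win=5):
    """Sample coherence of two complex SAR images over a sliding boxcar
    window: |<s1 s2*>| / sqrt(<|s1|^2> <|s2|^2>)."""
    cross = s1 * np.conj(s2)
    num = uniform_filter(cross.real, win) + 1j * uniform_filter(cross.imag, win)
    den = np.sqrt(uniform_filter(np.abs(s1) ** 2, win) *
                  uniform_filter(np.abs(s2) ** 2, win))
    return np.abs(num) / np.maximum(den, 1e-12)

# Two perfectly correlated speckle images differing only by a constant
# interferometric phase: estimated coherence should be ~1 everywhere.
rng = np.random.default_rng(4)
s = rng.standard_normal((64, 64)) + 1j * rng.standard_normal((64, 64))
gamma = local_coherence(s, s * np.exp(1j * 0.7))
```

The known weakness of this boxcar estimator — blurring across deformation boundaries and biased coherence in small windows — is precisely what motivates the non-local estimation proposed in the thesis.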
Statistical Methods for Analyzing Large Scale Biological Data
With the development of high-throughput biomedical technologies in recent years, the size of a typical biological dataset is increasing at a fast pace, especially in the genomics, proteomics and metabolomics literatures. Typically, these large datasets contain a huge amount of information on each subject, where the number of subjects can range from small to often extremely large. The challenges of analyzing these large datasets are twofold, namely the problem of high-dimensionality, and the heavy computational burden associated with analyzing them. The goal of this dissertation is to develop statistical and computational methods to address some of these challenges in order to provide researchers with analytical tools that are scalable to handle these large datasets, as well as able to solve the issues arising from high-dimensionality.
In Chapter II, we study the asymptotic behaviors of principal component analysis (PCA) in high-dimensional data under the generalized spiked population model. We propose a series of methods for the consistent estimation of the population eigenvalues, angles between the sample and population eigenvectors, correlation coefficients between the sample and population principal component (PC) scores, and the shrinkage-bias adjustment for the predicted PC scores.
In Chapter III, we investigate the over-fitting problem of partial least squares (PLS) regression with high-dimensional predictors, which can result in the predicted and observed outcomes being almost identical, even when the outcome is independent of the predictor. We further discuss a shrinkage-bias problem similar to the shrinkage-bias in high-dimensional PCA, and propose a two-stage PLS (TPLS) method that can address both of these problems.
In Chapter IV, we focus on large-scale genome-wide and phenome-wide association studies (GWASs and PheWASs) of electronic health record (EHR) or biobank-based binary phenotypes. Because of the severe case-control imbalance in most EHR or biobank-based binary phenotypes, existing methods cannot analyze them in a scalable and accurate way. We develop a computationally efficient single-variant test that is roughly 100 times faster than the state-of-the-art Firth's test and provides well-calibrated p values even for phenotypes with extremely unbalanced case-control ratios. Furthermore, our test can adjust for non-genetic covariates and retains power similar to that of Firth's test.
In Chapter V, we show that due to the severe case-control imbalance in most of the biobank-based binary phenotypes, applying the traditional Z-score-based method to meta-analyze the association results across multiple biobank-based association studies, can result in conservative or anti-conservative p values. We propose two alternative meta-analysis methods that can provide well-calibrated meta-analysis p values, even when the individual studies are extremely unbalanced in their case-control ratios. Our first method involves sharing an approximation of the distribution of the score test statistic from each study using cubic Hermite splines, and the second method involves sharing the overall genotype counts from each study.
In summary, the purpose of this dissertation is to develop statistical and computational methods that can efficiently handle the ever-growing size of modern biological datasets, and to facilitate researchers by addressing some of the problems associated with the high dimensionality of these datasets, as well as by reducing the heavy computational burden of analyzing them.
PhD, Biostatistics, University of Michigan, Horace H. Rackham School of Graduate Studies. https://deepblue.lib.umich.edu/bitstream/2027.42/146022/1/deyrnk_1.pd
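The eigenvalue inconsistency that Chapter II addresses is easy to reproduce numerically. The sketch below uses a one-spike toy model with parameters of my own choosing (not the dissertation's estimators) to show that when p/n is not small, the top sample eigenvalue systematically overshoots the population spike:

```python
import numpy as np

# Spiked population model: identity covariance plus one spiked eigenvalue.
rng = np.random.default_rng(5)
n, p, spike = 2000, 2000, 10.0      # gamma = p/n = 1: high-dimensional regime
X = rng.standard_normal((n, p))
X[:, 0] *= np.sqrt(spike)           # variance `spike` along the first coordinate
S = X.T @ X / n                     # sample covariance matrix
lam = np.linalg.eigvalsh(S)[-1]     # top sample eigenvalue

# Classical spiked-model asymptotics predict that the top sample eigenvalue
# converges to spike * (1 + gamma / (spike - 1)), not to spike itself, so
# the naive PCA estimate is biased upward and needs correction.
gamma = p / n
pred = spike * (1 + gamma / (spike - 1))
```

Consistent estimation of the population eigenvalues, eigenvector angles, and PC-score shrinkage, as developed in Chapter II, amounts to inverting bias relationships of this kind.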