3,532 research outputs found
Towards provably efficient quantum algorithms for large-scale machine-learning models
Large machine learning models are revolutionary technologies of artificial intelligence whose bottlenecks include huge computational expenses, power, and time used both in the pre-training and fine-tuning process. In this work, we show that fault-tolerant quantum computing could possibly provide provably efficient resolutions for generic (stochastic) gradient descent algorithms, scaling as O(T^2 × polylog(n)), where n is the size of the models and T is the number of iterations in the training, as long as the models are both sufficiently dissipative and sparse, with small learning rates. Based on earlier efficient quantum algorithms for dissipative differential equations, we find and prove that similar algorithms work for (stochastic) gradient descent, the primary algorithm for machine learning. In practice, we benchmark instances of large machine learning models from 7 million to 103 million parameters. We find that, in the context of sparse training, a quantum enhancement is possible at the early stage of learning after model pruning, motivating a sparse parameter download and re-upload scheme. Our work shows solidly that fault-tolerant quantum algorithms could potentially contribute to most state-of-the-art, large-scale machine-learning problems.
Energy storage design and integration in power systems by system-value optimization
Energy storage can play a crucial role in decarbonising power systems by balancing
power and energy in time. Wider power system benefits that arise from these
balancing technologies include lower grid expansion, renewable curtailment, and
average electricity costs. However, with the proliferation of new energy storage
technologies, it becomes increasingly difficult to identify which technologies are
economically viable and how to design and integrate them effectively.
Using large-scale energy system models in Europe, the dissertation shows that solely
relying on Levelized Cost of Storage (LCOS) metrics for technology assessments can
mislead and that traditional system-value methods raise important questions about
how to assess multiple energy storage technologies. Further, the work introduces a
new complementary system-value assessment method called the market-potential
method, which provides a systematic deployment analysis for assessing multiple
storage technologies under competition. However, integrating energy storage in
system models can lead to the unintended storage cycling effect, which occurs in
approximately two-thirds of models and significantly distorts results. The thesis
finds that traditional approaches to deal with the issue, such as multi-stage optimization
or mixed integer linear programming approaches, are either ineffective
or computationally inefficient. A new approach is suggested that only requires
appropriate model parameterization with variable costs while keeping the model
convex to reduce the risk of misleading results.
In addition, to enable energy storage assessments and energy system research around
the world, the thesis extended the geographical scope of an existing European open-source
model to global coverage. The newly built energy system model ‘PyPSA-Earth’
is thereby demonstrated and validated in Africa. Using PyPSA-Earth, the thesis
assesses for the first time the system value of 20 energy storage technologies across
multiple scenarios in a representative future power system in Africa. The results offer
insights into approaches for assessing multiple energy storage technologies under
competition in large-scale energy system models. In particular, the dissertation
addresses extreme cost uncertainty through a comprehensive scenario tree and finds
that, apart from lithium and hydrogen, only seven energy storage technologies are
optimization-relevant. The work also discovers that a heterogeneous storage design
can increase power system benefits and that some energy storage technologies are more important
than others. Finally, in contrast to traditional methods that only consider single
energy storage, the thesis finds that optimizing multiple energy storage options
tends to significantly reduce total system costs by up to 29%.
The presented research findings have the potential to inform decision-making processes
for the sizing, integration, and deployment of energy storage systems in
decarbonized power systems, contributing to a paradigm shift in scientific methodology
and advancing efforts towards a sustainable future.
Backpropagation Beyond the Gradient
Automatic differentiation is a key enabler of deep learning: previously, practitioners were limited to models
for which they could manually compute derivatives. Now, they can create sophisticated models with almost
no restrictions and train them using first-order, i.e. gradient, information. Popular libraries like PyTorch
and TensorFlow compute this gradient efficiently, automatically, and conveniently with a single line of
code. Under the hood, reverse-mode automatic differentiation, or gradient backpropagation, powers the
gradient computation in these libraries. Their entire design centers around gradient backpropagation.
These frameworks are specialized around one specific task—computing the average gradient in a mini-batch.
This specialization often complicates the extraction of other information like higher-order statistical moments
of the gradient, or higher-order derivatives like the Hessian. It limits practitioners and researchers to methods
that rely on the gradient. Arguably, this hampers the field from exploring the potential of higher-order
information and there is evidence that focusing solely on the gradient has not led to significant recent
advances in deep learning optimization.
To advance algorithmic research and inspire novel ideas, information beyond the batch-averaged gradient
must be made available at the same level of computational efficiency, automation, and convenience.
This thesis presents approaches to simplify experimentation with rich information beyond the gradient
by making it more readily accessible. We present an implementation of these ideas as an extension to the
backpropagation procedure in PyTorch. Using this newly accessible information, we demonstrate possible use
cases by (i) showing how it can inform our understanding of neural network training by building a diagnostic
tool, and (ii) enabling novel methods to efficiently compute and approximate curvature information.
First, we extend gradient backpropagation for sequential feedforward models to Hessian backpropagation
which enables computing approximate per-layer curvature. This perspective unifies recently proposed block-diagonal
curvature approximations. Like gradient backpropagation, the computation of these second-order
derivatives is modular, and therefore simple to automate and extend to new operations.
Based on the insight that rich information beyond the gradient can be computed efficiently and at the
same time, we extend the backpropagation in PyTorch with the BackPACK library. It provides efficient and
convenient access to statistical moments of the gradient and approximate curvature information, often at a
small overhead compared to computing just the gradient.
Next, we showcase the utility of such information to better understand neural network training. We build
the Cockpit library that visualizes what is happening inside the model during training through various
instruments that rely on BackPACK’s statistics. We show how Cockpit provides a meaningful statistical
summary report to the deep learning engineer to identify bugs in their machine learning pipeline, guide
hyperparameter tuning, and study deep learning phenomena.
Finally, we use BackPACK’s extended automatic differentiation functionality to develop ViViT, an approach
to efficiently compute curvature information, in particular curvature noise. It uses the low-rank structure
of the generalized Gauss-Newton approximation to the Hessian and addresses shortcomings in existing
curvature approximations. Through monitoring curvature noise, we demonstrate how ViViT’s information
helps in understanding challenges to make second-order optimization methods work in practice.
This work develops new tools to experiment more easily with higher-order information in complex deep
learning models. These tools have impacted works on Bayesian applications with Laplace approximations,
out-of-distribution generalization, differential privacy, and the design of automatic differentiation
systems. They constitute one important step towards developing and establishing more efficient deep
learning algorithms.
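The quantities this abstract calls "information beyond the batch-averaged gradient" can be illustrated with a minimal, library-free sketch. It computes per-sample gradients for a one-dimensional linear model with squared loss, then their mean (the usual mini-batch gradient) and per-coordinate variance (a second statistical moment). The function names and toy data are assumptions for illustration only; this is not the BackPACK API and says nothing about how the thesis implements these quantities.

```python
from statistics import fmean


def per_sample_gradients(w, b, xs, ys):
    """Gradient of 0.5 * (w*x + b - y)**2 w.r.t. (w, b), one tuple per sample."""
    grads = []
    for x, y in zip(xs, ys):
        residual = w * x + b - y
        grads.append((residual * x, residual))
    return grads


def gradient_moments(grads):
    """Return the batch-averaged gradient and its per-coordinate population variance."""
    dims = range(len(grads[0]))
    mean = [fmean([g[d] for g in grads]) for d in dims]
    var = [fmean([(g[d] - mean[d]) ** 2 for g in grads]) for d in dims]
    return mean, var


# Toy data (hypothetical): targets follow y = 2x + 1, the model starts at w = b = 0.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
grads = per_sample_gradients(0.0, 0.0, xs, ys)
mean, var = gradient_moments(grads)
print("batch-averaged gradient:", mean)
print("per-coordinate gradient variance:", var)
```

A framework that returns only the averaged gradient discards the individual entries of `grads`, which is why the variance cannot be recovered afterwards without extra machinery.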
LIPIcs, Volume 251, ITCS 2023, Complete Volume
2023-2024 Catalog
The 2023-2024 Governors State University Undergraduate and Graduate Catalog is a comprehensive listing of current information regarding degree requirements, course offerings, and undergraduate and graduate rules and regulations.
Development, Implementation, and Optimization of a Modern, Subsonic/Supersonic Panel Method
In the early stages of aircraft design, engineers consider many different design concepts, examining the trade-offs between different component arrangements and sizes, thrust and power requirements, etc. Because so many different designs are considered, it is best in the early stages of design to use simulation tools that are fast; accuracy is secondary. A common simulation tool for early design and analysis is the panel method. Panel methods were first developed in the 1950s and 1960s with the advent of modern computers. Despite being reasonably accurate and very fast, their development was abandoned in the late 1980s in favor of more complex and accurate simulation methods. The panel methods developed in the 1980s are still in use by aircraft designers today because of their accuracy and speed. However, they are cumbersome to use and limited in applicability. The purpose of this work is to reexamine panel methods in a modern context. In particular, this work focuses on the application of panel methods to supersonic aircraft (a supersonic aircraft is one that flies faster than the speed of sound). Various aspects of the panel method, including the distributions of the unknown flow variables on the surface of the aircraft and efficiently solving for these unknowns, are discussed. Trade-offs between alternative formulations are examined and recommendations are given. This work also serves to bring together, clarify, and condense much of the literature previously published regarding panel methods so as to assist future developers of panel methods.
Accurate quantum transport modelling and epitaxial structure design of high-speed and high-power In0.53Ga0.47As/AlAs double-barrier resonant tunnelling diodes for 300-GHz oscillator sources
Terahertz (THz) wave technology is envisioned as an appealing and conceivable solution in the context of several potential high-impact applications, including sixth generation (6G) and beyond consumer-oriented ultra-broadband multi-gigabit wireless data-links, as well as high-resolution imaging, radar, and spectroscopy apparatuses employable in biomedicine, industrial processes, security/defence, and material science. Despite the technological challenges posed by the THz gap, recent scientific advancements suggest the practical viability of THz systems. However, the development of transmitters (Tx) and receivers (Rx) based on compact semiconductor devices operating at THz frequencies is urgently needed to meet the performance requirements of emerging THz applications.
Although there are several promising candidates, including high-speed III-V transistors and photo-diodes, resonant tunnelling diode (RTD) technology offers a compact and high-performance option in many practical scenarios. However, the main weakness of the technology is currently the low output power capability of RTD THz Tx, which is mainly caused by underdeveloped and non-optimal device and circuit design approaches. Indeed, indium phosphide (InP) RTD devices can nowadays deliver only up to around 1 mW of radio-frequency (RF) power at around 300 GHz. In the context of THz wireless data-links, this severely impacts the Tx performance, limiting communication distance and data transfer capabilities which, at the current time, are of the order of a few tens of gigabits per second below around 1 m.
However, recent research studies suggest that several milliwatts of output power are required to achieve bit-rate capabilities of several tens of gigabits per second and beyond, and to reach several metres of communication distance in common operating conditions. Currently, the short-term target is set to 5−10 mW of output power at around 300 GHz carrier waves, which would allow bit-rates in excess of 100 Gb/s, as well as wireless communications well above 5 m distance, in first-stage short-range scenarios. In order to reach it, maximisation of the RTD high-frequency RF power capability is of utmost importance. Despite that, reliable epitaxial structure design approaches, as well as accurate physical-based numerical simulation tools, aimed at RF power maximisation in the 300 GHz-band are lacking at the current time.
This work aims at proposing practical solutions to address the aforementioned issues. First, a physical-based simulation methodology was developed to accurately and reliably simulate the static current-voltage (IV) characteristic of indium gallium arsenide/aluminium arsenide (InGaAs/AlAs) double-barrier RTD devices. The approach relies on the non-equilibrium Green’s function (NEGF) formalism implemented in the Silvaco Atlas technology computer-aided design (TCAD) simulation package, requires a low computational budget, and allows the correct modelling of In0.53Ga0.47As/AlAs RTD devices, which are pseudomorphically-grown on lattice-matched to InP substrates, and are commonly employed in oscillators working at around 300 GHz. By selecting the appropriate physical models, and by retrieving the correct materials parameters, together with a suitable discretisation of the associated heterostructure spatial domain through finite-elements, it is shown, by comparing simulation data with experimental results, that the developed numerical approach can reliably compute several quantities of interest that characterise the DC IV curve negative differential resistance (NDR) region, including peak current, peak voltage, and voltage swing, all of which are key parameters in RTD oscillator design.
The demonstrated simulation approach was then used to study the impact of epitaxial structure design parameters, including those characterising the double-barrier quantum well, as well as emitter and collector regions, on the electrical properties of the RTD device. In particular, a comprehensive simulation analysis was conducted, and the retrieved output trends were discussed based on the heterostructure band diagram, transmission coefficient energy spectrum, charge distribution, and DC current density-voltage (JV) curve. General design guidelines aimed at enhancing the RTD device maximum RF power gain capability are then deduced and discussed.
To validate the proposed epitaxial design approach, an In0.53Ga0.47As/AlAs double-barrier RTD epitaxial structure providing several milliwatts of RF power was designed by employing the developed simulation methodology, and experimentally investigated through the microfabrication of RTD devices and subsequent high-frequency characterisation up to 110 GHz. The analysis, which included fabrication optimisation, reveals an expected RF power performance of up to around 5 mW and 10 mW at 300 GHz for 25 μm² and 49 μm² RTD devices, respectively, which is up to five times higher than the current state-of-the-art. Finally, in order to prove the practical employability of the proposed RTDs in oscillator circuits realised employing low-cost photo-lithography, both coplanar waveguide and microstrip inductive stubs are designed through a full three-dimensional electromagnetic simulation analysis.
In summary, this work makes an important contribution to the rapidly evolving field of THz RTD technology, and demonstrates the practical feasibility of realising 300-GHz high-power RTD devices, which will underpin the future development of Tx systems capable of the power levels required in the forthcoming THz applications.
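To show why the peak current and voltage swing of the NDR region matter for output power, the sketch below uses a common first-order estimate of the maximum RF power of an RTD, P_max ≈ (3/16)·ΔV·ΔI, which follows from a cubic approximation of the NDR region. The abstract does not state which estimate the thesis uses, so treat the formula choice as an assumption; the function name and the example values are hypothetical and are not taken from the thesis.

```python
def rtd_max_rf_power(delta_v, delta_i):
    """First-order maximum RF power estimate (in watts) for an RTD.

    delta_v: peak-to-valley voltage swing of the NDR region, in volts.
    delta_i: peak-to-valley current difference of the NDR region, in amperes.
    Assumes a cubic approximation of the IV curve in the NDR region.
    """
    return 3.0 / 16.0 * delta_v * delta_i


# Hypothetical device: 0.8 V voltage swing and 20 mA current difference.
power_w = rtd_max_rf_power(0.8, 0.020)
print(f"estimated maximum RF power: {power_w * 1e3:.2f} mW")
```

Under this estimate, output power grows with the product of the two swings, so an epitaxial design can raise it by widening either the voltage or the current span of the NDR region.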
SUTMS - Unified Threat Management Framework for Home Networks
Home networks were initially designed for web browsing and non-business-critical applications. As infrastructure improved, internet broadband costs decreased, and home internet usage shifted to e-commerce and business-critical applications. Today’s home computers host personally identifiable information and financial data and act as a bridge to corporate networks via remote access technologies like VPN. The expansion of remote work and the transition to cloud computing have broadened the attack surface for potential threats. Home networks have become an extension of critical networks and services, and hackers can gain access to corporate data by compromising devices attached to broadband routers. All these challenges underline the importance of home-based Unified Threat Management (UTM) systems. There is a need for a unified threat management framework developed specifically for home and small networks to address emerging security challenges. In this research, the proposed Smart Unified Threat Management (SUTMS) framework serves as a comprehensive solution for implementing home network security, incorporating firewall, anti-bot, intrusion detection, and anomaly detection engines into a unified system. SUTMS is able to provide 99.99% accuracy with 56.83% memory improvements. IPS stands out as the most resource-intensive UTM service; SUTMS successfully reduces the performance overhead of IDS by integrating it with the flow detection module. The artifact employs flow analysis to identify network anomalies and categorizes encrypted traffic according to its abnormalities. SUTMS can be scaled by introducing optional functions, i.e., routing and smart logging (utilizing Apriori algorithms). The research also tackles one of the limitations identified by SUTMS through the introduction of a second artifact called Secure Centralized Management System (SCMS).
SCMS is a lightweight asset management platform with built-in security intelligence that can seamlessly integrate with a cloud for real-time updates.
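The abstract mentions smart logging based on Apriori algorithms without giving details. As a rough illustration of that technique, here is a minimal sketch of Apriori frequent-itemset mining over sets of log event labels. The function name, the event labels, and the support threshold are hypothetical; nothing here reflects how SUTMS itself implements logging.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support count} for every itemset seen in at least min_support transactions."""
    transactions = [frozenset(t) for t in transactions]
    # Level 1 candidates: every single item that appears anywhere.
    candidates = {frozenset([item]) for t in transactions for item in t}
    frequent = {}
    k = 1
    while candidates:
        # Count how many transactions contain each candidate itemset.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        level = {c: n for c, n in counts.items() if n >= min_support}
        frequent.update(level)
        # Join step: merge frequent k-itemsets into (k+1)-itemsets.
        k += 1
        joined = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: keep a candidate only if all of its (k-1)-subsets are frequent.
        candidates = {
            c for c in joined
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        }
    return frequent


# Hypothetical log windows, each a set of event labels observed together.
logs = [
    {"port_scan", "dns_burst", "blocked"},
    {"port_scan", "blocked"},
    {"dns_burst", "blocked"},
    {"port_scan", "dns_burst", "blocked"},
]
frequent = apriori(logs, min_support=3)
for itemset, count in sorted(frequent.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), count)
```

The prune step is what keeps Apriori tractable: an itemset is counted only if all of its smaller subsets were already frequent, so rare event combinations are discarded before any pass over the logs.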