Complex Noise-Resistant Zeroing Neural Network for Computing Complex Time-Dependent Lyapunov Equation
The complex time-dependent Lyapunov equation (CTDLE), an important tool for the stability analysis of control systems, has been extensively employed in mathematics and engineering. Recurrent neural networks (RNNs) have been reported as an effective method for solving the CTDLE. In previous work, zeroing neural networks (ZNNs) were established to find the accurate solution of the time-dependent Lyapunov equation (TDLE) under noise-free conditions. However, noise is inevitable in actual implementation. To suppress the interference of various noises in practical applications, this paper proposes a complex noise-resistant ZNN (CNRZNN) model and employs it to solve the CTDLE. Additionally, the convergence and robustness of the CNRZNN model are analyzed and proved theoretically. For verification and comparison, three experiments and the existing noise-tolerant ZNN (NTZNN) model are introduced to investigate the effectiveness, convergence and robustness of the CNRZNN model. Compared with the NTZNN model, the CNRZNN model is more general and more robust. Specifically, the NTZNN model is a special case of the CNRZNN model, and the residual error of the CNRZNN model converges rapidly and stably to order 10⁻⁵ when solving the CTDLE under complex linear noises, far below the order 10⁻¹ of the NTZNN model. Analogously, under complex quadratic noises, the residual error of the CNRZNN model converges quickly and stably to 2‖A‖_F/ζ³, while the residual error of the NTZNN model diverges.
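The abstract does not reproduce the model equations. As a hedged sketch drawn from the general ZNN literature (the gains γ, λ and the use of A^H for the conjugate transpose are assumptions, not taken from this paper), a noise-tolerant zeroing design for the Lyapunov error typically takes the form:

```latex
% Error function for the complex time-dependent Lyapunov equation
% A^H(t) X(t) + X(t) A(t) = -Q(t):
E(t) = A^{H}(t)\,X(t) + X(t)\,A(t) + Q(t)

% Noise-tolerant zeroing dynamics: an exponential-decay term plus an
% integral term that absorbs constant and slowly varying additive noise
\dot{E}(t) = -\gamma\,E(t) \;-\; \lambda \int_{0}^{t} E(\tau)\,\mathrm{d}\tau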
A novel quaternion linear matrix equation solver through zeroing neural networks with applications to acoustic source tracking
Due to its significance in science and engineering, the time-varying linear matrix equation (LME) problem has received a lot of attention from scholars. For this reason, this study addresses the problem of finding the minimum-norm least-squares solution of the time-varying quaternion LME (ML-TQ-LME). This is accomplished using the zeroing neural network (ZNN) technique, which has achieved considerable success in tackling time-varying problems. In light of that, two new ZNN models are introduced to solve the ML-TQ-LME problem for time-varying quaternion matrices of arbitrary dimension. Two simulation experiments and two practical acoustic source tracking applications show that the models perform superbly.
Zeroing neural networks for computing quaternion linear matrix equation with application to color restoration of images
The importance of quaternions in a variety of fields, such as physics, engineering and computer science, renders the effective solution of the time-varying quaternion matrix linear equation (TV-QLME) an equally important and interesting task. Zeroing neural networks (ZNNs) have seen great success in solving TV problems in the real and complex domains, while a quaternion or a matrix of quaternions may be readily represented as either a complex or a real matrix of magnified size. On that account, three new ZNN models are developed and the TV-QLME is solved directly in the quaternion domain as well as indirectly in the complex and real domains for matrices of arbitrary dimension. The models perform admirably in four simulation experiments and two practical applications concerning color restoration of images.
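The indirect real-domain route mentioned above relies on the standard real representation of quaternion matrices. Below is a minimal sketch, not the paper's models: the component-block layout and the equation AX = B are illustrative assumptions, solved with ordinary least squares rather than a neural dynamic.

```python
import numpy as np

def quat_to_real(A):
    """4m x 4n real representation of an m x n quaternion matrix given as a
    tuple of four real component matrices (A0 + A1*i + A2*j + A3*k)."""
    A0, A1, A2, A3 = A
    return np.block([
        [A0, -A1, -A2, -A3],
        [A1,  A0, -A3,  A2],
        [A2,  A3,  A0, -A1],
        [A3, -A2,  A1,  A0],
    ])

def solve_quat_lme(A, B):
    """Solve the quaternion linear matrix equation A X = B by passing through
    the real domain: stack B's components, solve the magnified real system,
    and split the solution back into quaternion components."""
    RA = quat_to_real(A)
    sol, *_ = np.linalg.lstsq(RA, np.vstack(B), rcond=None)
    return tuple(np.split(sol, 4, axis=0))
```

The representation is multiplicative (the real image of a product is the product of the real images), which is why the first block column of the real system carries exactly the four components of X.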
Applying fixed point techniques for obtaining a positive definite solution to nonlinear matrix equations
In this manuscript, the concept of rational-type multivalued F-contraction mappings is investigated. In addition, some nice fixed point results are obtained using this concept in the setting of MM-spaces and ordered MM-spaces. Our findings extend, unify, and generalize a large body of work along the same lines. Moreover, to support and strengthen our results, non-trivial and extensive examples are presented. Ultimately, the theoretical results are applied to obtain a positive definite solution to nonlinear matrix equations.
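The specific nonlinear matrix equation is not given in the abstract. As an illustrative sketch of how a fixed-point result translates into a computation, here is a plain fixed-point iteration on the classical equation X = Q + AᵀX⁻¹A (an assumed example equation, which admits a positive definite solution for positive definite Q):

```python
import numpy as np

def solve_nme(A, Q, tol=1e-12, max_iter=500):
    """Fixed-point iteration for X = Q + A.T @ inv(X) @ A, starting from
    X0 = Q. For positive definite Q the iterates stay positive definite,
    and for small enough ||A|| the map is a contraction."""
    X = Q.copy()
    for _ in range(max_iter):
        X_new = Q + A.T @ np.linalg.inv(X) @ A
        if np.linalg.norm(X_new - X) < tol:
            return X_new
        X = X_new
    return X
```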
Visual Steering for One-Shot Deep Neural Network Synthesis
Recent advancements in the area of deep learning have shown the effectiveness of very large neural networks in several applications. However, as these deep neural networks continue to grow in size, it becomes more and more difficult to configure their many parameters to obtain good results. Presently, analysts must experiment with many different configurations and parameter settings, which is labor-intensive and time-consuming. On the other hand, the capacity of fully automated techniques for neural network architecture search is limited without the domain knowledge of human experts. To deal with the problem, we formulate the task of neural network architecture optimization as a graph space exploration, based on the one-shot architecture search technique. In this approach, a super-graph of all candidate architectures is trained in one shot and the optimal neural network is identified as a sub-graph. In this paper, we present a framework that allows analysts to effectively build the solution sub-graph space and guide the network search by injecting their domain knowledge. Starting with the network architecture space composed of basic neural network components, analysts are empowered to effectively select the most promising components via our one-shot search scheme. Applying this technique in an iterative manner allows analysts to converge to the best performing neural network architecture for a given application. During the exploration, analysts can use their domain knowledge aided by cues provided from a scatterplot visualization of the search space to edit different components and guide the search for faster convergence. We designed our interface in collaboration with several deep learning researchers and its final effectiveness is evaluated with a user study and two case studies.
Comment: 9 pages, submitted to IEEE Transactions on Visualization and Computer Graphics, 202
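As a toy illustration of the one-shot idea described above (not the paper's system: the candidate "operations", the single-path sampling rule, and the scalar weights are all simplifying assumptions), a super-graph of four candidate operations can be trained with shared per-path weights and the best sub-graph picked afterwards by validation score:

```python
import numpy as np

# Synthetic task: fit y = sin(x); the "architecture search" is choosing
# which single operation (sub-graph) best maps x to y.
rng = np.random.default_rng(0)
x_tr = rng.uniform(-np.pi, np.pi, 400)
y_tr = np.sin(x_tr)
x_val = rng.uniform(-np.pi, np.pi, 200)
y_val = np.sin(x_val)

ops = {"linear": lambda t: t, "square": lambda t: t ** 2,
       "sin": np.sin, "cos": np.cos}
w = {k: 0.0 for k in ops}          # super-graph weights, one per candidate op
names = list(ops)

for _ in range(4000):
    k = names[rng.integers(len(names))]   # sample one path (sub-graph)
    i = rng.integers(len(x_tr))
    err = w[k] * ops[k](x_tr[i]) - y_tr[i]
    w[k] -= 0.01 * err * ops[k](x_tr[i])  # SGD step on that path only

# Rank every sub-graph with the weights trained inside the super-graph.
scores = {k: float(np.mean((w[k] * ops[k](x_val) - y_val) ** 2)) for k in ops}
best = min(scores, key=scores.get)
```

The point of the sketch is the two-phase structure: one training run over the shared super-graph, then a cheap ranking of sub-graphs without retraining.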
A Unified Framework for Gradient-based Hyperparameter Optimization and Meta-learning
Machine learning algorithms and systems are progressively becoming part of our societies, leading to a growing need to build a vast multitude of accurate, reliable and interpretable models which should possibly exploit similarities among tasks. Automating segments of machine learning itself seems a natural step towards delivering increasingly capable systems able to perform well in both the big-data and the few-shot learning regimes. Hyperparameter optimization (HPO) and meta-learning (MTL) constitute two building blocks of this growing effort. We explore these two topics under a unifying perspective, presenting a mathematical framework linked to bilevel programming that captures existing similarities and translates into procedures of practical interest rooted in algorithmic differentiation. We discuss the derivation, applicability and computational complexity of these methods and establish several approximation properties for a class of objective functions of the underlying bilevel programs. In HPO, these algorithms generalize and extend previous work on gradient-based methods. In MTL, the resulting framework subsumes classic and emerging strategies and provides a starting basis from which to build and analyze novel techniques. A series of examples and numerical simulations offer insight and highlight some limitations of these approaches. Experiments on larger-scale problems show the potential gains of the proposed methods in real-world applications. Finally, we develop two extensions of the basic algorithms apt to optimize a class of discrete hyperparameters (graph edges) in an application to relational learning, and to tune online learning rate schedules for training neural network models, an old but crucially important issue in machine learning.
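The gradient-based HPO machinery can be illustrated with a forward-mode hypergradient on a toy problem. This is a minimal sketch under assumed quadratic losses, not the paper's general algorithm: the derivative of the validation loss with respect to the learning rate is propagated alongside the unrolled inner optimization, and the learning rate is then tuned by outer gradient descent.

```python
import numpy as np

def hypergrad_lr(eta, w0=0.0, T=20):
    """Forward-mode hypergradient of a validation loss w.r.t. the learning
    rate, through T unrolled inner SGD steps on toy quadratics."""
    inner_grad = lambda w: w - 2.0   # training loss 0.5*(w-2)^2
    outer_grad = lambda w: w - 1.0   # validation loss 0.5*(w-1)^2
    w, z = w0, 0.0                   # z tracks dw/d(eta)
    for _ in range(T):
        g = inner_grad(w)
        # w_{t+1} = w_t - eta*g ;  z_{t+1} = z_t - g - eta*g'(w_t)*z_t
        w, z = w - eta * g, z - g - eta * z   # inner_grad'(w) = 1 here
    return outer_grad(w) * z, w

eta = 0.05
for _ in range(200):                 # outer gradient descent on eta
    h, w_final = hypergrad_lr(eta)
    eta -= 0.002 * h
```

For this pair of quadratics the optimal rate can be checked in closed form: validation loss is minimized when (1 - η)^T = 0.5, i.e. η = 1 - 0.5^(1/T).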
Continuous learning of analytical and machine learning rate of penetration (ROP) models for real-time drilling optimization
Oil and gas operators strive to reach hydrocarbon reserves by drilling wells in the safest and fastest possible manner, providing indispensable energy to society at reduced costs while maintaining environmental sustainability. Real-time drilling optimization consists of selecting operational drilling parameters that maximize a desirable measure of drilling performance. Drilling optimization efforts often aspire to improve drilling speed, commonly referred to as rate of penetration (ROP). ROP is a function of the forces and moments applied to the bit, in addition to mud, formation, bit and hydraulic properties. Three operational drilling parameters may be constantly adjusted at surface to influence ROP towards a drilling objective: weight on bit (WOB), drillstring rotational speed (RPM), and drilling fluid (mud) flow rate. In the traditional, analytical approach to ROP modeling, inflexible equations relate WOB, RPM, flow rate and/or other measurable drilling parameters to ROP and empirical model coefficients are computed for each rock formation to best fit field data. Over the last decade, enhanced data acquisition technology and widespread cheap computational power have driven a surge in applications of machine learning (ML) techniques to ROP prediction. Machine learning algorithms leverage statistics to uncover relations between any prescribed inputs (features/predictors) and the quantity of interest (response). The biggest advantage of ML algorithms over analytical models is their flexibility in model form. With no set equation, ML models permit segmentation of the drilling operational parameter space. However, increased model complexity diminishes interpretability of how an adjustment to the inputs will affect the output. There is no single ROP model applicable in every situation. This study investigates all stages of the drilling optimization workflow, with emphasis on real-time continuous model learning. 
Sensors constantly record data as wells are drilled, and it is postulated that ROP models can be retrained in real time to adapt to changing drilling conditions. Cross-validation is assessed as a methodology to select the best performing ROP model for each drilling optimization interval in real time. Constrained by rig equipment and operational limitations, drilling parameters are optimized in intervals using the most accurate ROP model determined by cross-validation. Dynamic-range and full-range training data segmentation techniques contest the classical lithology-dependent approach to ROP modeling. Spatial proximity and parameter similarity sample weighting expand data partitioning capabilities during model training. The prescribed ROP modeling and drilling parameter optimization scenarios are evaluated according to model performance, ROP improvements and computational expense.
Petroleum and Geosystems Engineering
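The cross-validation-based model selection described above can be sketched as follows. This is a toy on synthetic data: the power-law form, the 1-NN alternative, and all constants are illustrative assumptions, not the study's models.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 240
wob = rng.uniform(5, 35, n)              # weight on bit (synthetic units)
rpm = rng.uniform(60, 180, n)            # rotary speed
rop = 3.0 * wob ** 0.8 * rpm ** 0.5 * np.exp(rng.normal(0, 0.05, n))
X = np.column_stack([wob, rpm])

def fit_power_law(Xt, yt):
    """Analytical-style model: log ROP = log a + b log WOB + c log RPM."""
    A = np.column_stack([np.ones(len(yt)), np.log(Xt)])
    coef, *_ = np.linalg.lstsq(A, np.log(yt), rcond=None)
    return lambda Xn: np.exp(coef[0] + np.log(Xn) @ coef[1:])

def fit_1nn(Xt, yt):
    """Flexible ML-style model: 1-nearest-neighbour on standardized inputs."""
    mu, sd = Xt.mean(0), Xt.std(0)
    Zt = (Xt - mu) / sd
    def predict(Xn):
        Zn = (Xn - mu) / sd
        d = ((Zn[:, None, :] - Zt[None, :, :]) ** 2).sum(-1)
        return yt[d.argmin(1)]
    return predict

def cv_rmse(fit, X, y, k=5):
    """Mean held-out RMSE over k folds."""
    idx = np.arange(len(y))
    errs = []
    for f in np.array_split(idx, k):
        tr = np.setdiff1d(idx, f)
        pred = fit(X[tr], y[tr])(X[f])
        errs.append(np.sqrt(np.mean((pred - y[f]) ** 2)))
    return float(np.mean(errs))

scores = {"power-law": cv_rmse(fit_power_law, X, rop),
          "1-NN": cv_rmse(fit_1nn, X, rop)}
best = min(scores, key=scores.get)      # model used for the next interval
```

In the real-time setting, the same comparison would be rerun as each new interval of sensor data arrives, so the selected model can change with drilling conditions.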
Reinforcement Learning and Planning for Preference Balancing Tasks
Robots are often highly non-linear dynamical systems with many degrees of freedom, making motion problems computationally challenging to solve. One solution has been reinforcement learning (RL), which learns through experimentation to automatically perform the near-optimal motions that complete a task. However, high-dimensional problems and task formulation often prove challenging for RL. We address these problems with PrEference Appraisal Reinforcement Learning (PEARL), which solves Preference Balancing Tasks (PBTs). PBTs define a problem as a set of preferences that the system must balance to achieve a goal. The method is appropriate for acceleration-controlled systems with continuous state spaces and either discrete or continuous action spaces with unknown system dynamics. We show that PEARL learns a sub-optimal policy on a subset of states and actions, and transfers the policy to the expanded domain to produce a more refined plan on a class of robotic problems. We establish convergence to task goal conditions, and even when preconditions are not verifiable, show that this is a valuable method to use before other, more expensive approaches. Evaluation is done on several robotic problems, such as Aerial Cargo Delivery, Multi-Agent Pursuit, Rendezvous, and Inverted Flying Pendulum, both in simulation and experimentally. Additionally, PEARL is leveraged outside of robotics as an array sorting agent. The results demonstrate high accuracy and fast learning times on a large set of practical applications.
Mean-field models in network inference and epidemic control
Systems comprised of agents and pairwise interactions between agents can be studied through the lens of network theory. As a general framework, network theory has applications in various disciplines, including statistical physics, economics, and biology. The interplay between the contact structure of a population and epidemic spreading is one of the most studied research areas in epidemiology, where network-based research has offered many breakthroughs in recent years. Since an individual-based description is computationally intractable, as state spaces scale exponentially with the number of agents modelled, many mathematical approximations have been developed to describe the system in terms of low-dimensional aggregate statistics, such as the average number of infected people. This thesis is focused on the application of such approximation techniques, in particular the well-known mean-field models, to two key problems in epidemiology: inference and epidemic control.
In the first part of this work, the theme is the inference of network properties from the observation of outbreaks at a population level. Typically, the readily available information during an outbreak is (daily) case counts. With this in mind, a new mean-field-like model is introduced to approximate epidemics on networks via Birth-and-Death processes, whose rates are random variables that depend implicitly on the structure of the underlying network and the disease dynamics. By using Bayesian model selection, it is possible to recover the most likely underlying network class from datasets that consist only of discrete-time observations of a single epidemic. Further, having a description in terms of Birth-and-Death processes makes it possible to study the large-N limit of the process as a one-dimensional Fokker-Planck equation, which implies an even greater reduction in dimensionality.
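A homogeneous-mixing instance of the Birth-and-Death viewpoint can be sketched with an exact Gillespie simulation of an SIS epidemic, where infections are "births" and recoveries "deaths" of the infected count. This is a minimal sketch: the thesis's network-dependent random rates are replaced here by fixed mean-field rates.

```python
import numpy as np

def gillespie_sis(N=500, beta=0.3, gamma=0.1, I0=5, t_max=200.0, seed=1):
    """Exact stochastic simulation of an SIS epidemic as a one-dimensional
    Birth-and-Death process on I, the number of infected individuals."""
    rng = np.random.default_rng(seed)
    t, I = 0.0, I0
    ts, Is = [t], [I]
    while t < t_max and I > 0:
        birth = beta * I * (N - I) / N      # infection ("birth") rate
        death = gamma * I                   # recovery ("death") rate
        total = birth + death
        t += rng.exponential(1.0 / total)   # time to next event
        I += 1 if rng.random() < birth / total else -1
        ts.append(t)
        Is.append(I)
    return np.array(ts), np.array(Is)
```

With these rates the deterministic mean-field fixed point is I* = N(1 - γ/β), and the stochastic trajectory fluctuates around it once the epidemic takes off.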
In the second part of this thesis, more standard mean-field models are adopted to perform epidemic control. The aim is to reduce the burden of an outbreak on a target population. Intervention policies may consist of one-time interventions, either to minimise the epidemic peak or the final size, or to maximise the average time to infection. Homogeneous mixing models are a convenient tool to showcase how interventions that achieve such goals can be optimised. A network perspective is introduced to study so-called disease-induced herd immunity: in principle, epidemics act like targeted vaccinations, preferentially immunising higher-risk individuals. This means that the herd-immunity threshold might be reached at lower levels than those derived from homogeneous mixing models, which might be relevant for epidemic control. However, it is shown that the magnitude of this effect depends heavily on both the topology of the contact network and the way non-pharmaceutical interventions are modelled. Finally, epidemic response can be thought of as a feedback process: social distancing policies might be deployed depending on the observed epidemic curve, rather than being pre-determined from theoretical arguments. In this case, the goal is to maintain the epidemic at manageable levels throughout its course, by tailoring interventions that aim to be as minimally disruptive as possible. This possibility is investigated on a high-dimensional network model, by deriving a feedback-loop control action that at its core is based on a mean-field approximation.
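The feedback idea in the closing sentences can be sketched on the simplest possible model: an SIR ODE with a threshold rule. The model, threshold, and reduction factor below are illustrative assumptions, far simpler than the thesis's high-dimensional network controller.

```python
import numpy as np

def sir_feedback(beta0=0.4, gamma=0.1, N=1.0, I0=1e-3,
                 threshold=0.02, reduction=0.6, days=300, dt=0.1):
    """SIR dynamics (Euler-integrated) with a feedback rule: while the
    observed prevalence I exceeds a threshold, social distancing scales
    the transmission rate down by `reduction`. Returns peak prevalence."""
    S, I = N - I0, I0
    peak = I
    for _ in range(int(days / dt)):
        beta = beta0 * (1 - reduction) if I > threshold else beta0
        dS = -beta * S * I / N
        dI = beta * S * I / N - gamma * I
        S += dS * dt
        I += dI * dt
        peak = max(peak, I)
    return peak

peak_fb = sir_feedback()                  # with feedback control
peak_free = sir_feedback(reduction=0.0)   # uncontrolled baseline
```

Because the intervention switches on only when prevalence is high, it keeps the epidemic near the threshold instead of suppressing it outright, which is the "as minimally disruptive as possible" trade-off in miniature.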