Penalized Orthogonal Iteration for Sparse Estimation of Generalized Eigenvalue Problem
We propose a new algorithm for sparse estimation of eigenvectors in
generalized eigenvalue problems (GEP). The GEP arises in a number of modern
data-analytic situations and statistical methods, including principal component
analysis (PCA), multiclass linear discriminant analysis (LDA), canonical
correlation analysis (CCA), sufficient dimension reduction (SDR) and invariant
coordinate selection. We propose to modify the standard generalized orthogonal
iteration with a sparsity-inducing penalty for the eigenvectors. To achieve
this goal, we generalize the equation-solving step of orthogonal iteration to a
penalized convex optimization problem. The resulting algorithm, called
penalized orthogonal iteration, provides accurate estimation of the true
eigenspace, when it is sparse. Also proposed is a computationally more
efficient alternative, which works well for PCA and LDA problems. Numerical
studies reveal that the proposed algorithms are competitive, and that our
tuning procedure works well. We demonstrate applications of the proposed
algorithm to obtain sparse estimates for PCA, multiclass LDA, CCA and SDR.
Supplementary materials are available online
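The loop described in the abstract, replacing the linear-solve step of generalized orthogonal iteration with a sparsity-inducing step and then re-orthonormalizing, can be sketched as follows. This is a minimal illustrative sketch, not the authors' implementation: the penalized convex solve is approximated here by a simple soft-thresholding (proximal) step, and the function name, matrices and parameters are assumptions.

```python
import numpy as np

def penalized_orthogonal_iteration(A, B, k, lam=0.05, n_iter=50, seed=0):
    """Sketch of sparse top-k eigenvector estimation for A v = lambda B v."""
    p = A.shape[0]
    rng = np.random.default_rng(seed)
    # Start from a random orthonormal basis of a k-dimensional subspace
    V, _ = np.linalg.qr(rng.standard_normal((p, k)))
    for _ in range(n_iter):
        # Standard generalized orthogonal iteration step: solve B U = A V
        U = np.linalg.solve(B, A @ V)
        # Sparsity-inducing step: soft-threshold entries, a stand-in
        # for the penalized convex optimization solved in the paper
        U = np.sign(U) * np.maximum(np.abs(U) - lam, 0.0)
        # Re-orthonormalize the iterate via reduced QR
        V, _ = np.linalg.qr(U)
    return V
```

With a well-separated leading eigenspace, the iterate converges to an orthonormal basis whose small entries are driven to zero before each orthonormalization.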
Analysis of group evolution prediction in complex networks
In a world in which acceptance and identification with social
communities are highly desired, the ability to predict the evolution of groups
over time is a vital but very complex research problem. Therefore, we
propose a new, adaptable, generic and multi-stage method for Group Evolution
Prediction (GEP) in complex networks that facilitates reasoning about the
future states of recently discovered groups. The modular design of GEP
enabled us to carry out extensive and versatile empirical studies on many
real-world complex and social networks, analysing the impact of numerous setups
and parameters such as time window type and size, group detection method,
evolution chain length, and prediction model. Additionally, many new
predictive features reflecting the state of a group at a given time have been
identified and tested. Other research problems, such as enriching learning
evolution chains with external data, have been analysed as well
Hadoop performance modeling and job optimization for big data analytics
This thesis was submitted for the award of Doctor of Philosophy and was awarded by Brunel University London. Big data has gained momentum in both academia and industry. The MapReduce model has emerged as a major computing model in support of big data analytics. Hadoop, an open source implementation of the MapReduce model, has been widely taken up by the community. Cloud service providers such as Amazon EC2 now support Hadoop user applications. However, a key challenge is that the cloud service providers do not have a resource provisioning mechanism to satisfy user jobs with deadline requirements. Currently, it is solely the user's responsibility to estimate the required amount of resources for a job running in a public cloud. This thesis presents a Hadoop performance model that accurately estimates the execution duration of a job and further provisions the required amount of resources for the job to be completed within a deadline. The proposed model employs Locally Weighted Linear Regression (LWLR) to estimate the execution time of a job and the Lagrange multiplier technique for resource provisioning to satisfy user jobs with a given deadline. The performance of the proposed model is extensively evaluated both on an in-house Hadoop cluster and on the Amazon EC2 cloud. Experimental results show that the proposed model is highly accurate in job execution estimation and that jobs are completed within the required deadlines when following the resource provisioning scheme of the proposed model. In addition, the Hadoop framework has over 190 configuration parameters, some of which have significant effects on the performance of a Hadoop job. Manually setting the optimum values for these parameters is a challenging and time-consuming task. This thesis presents optimization work that enhances the performance of Hadoop by automatically tuning its parameter values.
It employs the Gene Expression Programming (GEP) technique to build an objective function that represents the performance of a job and the correlations among the configuration parameters. For the purpose of optimization, Particle Swarm Optimization (PSO) is employed to automatically find optimal or near-optimal configuration settings. The performance of the proposed work is intensively evaluated on a Hadoop cluster, and the experimental results show that it enhances the performance of Hadoop significantly compared with the default settings.
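Locally Weighted Linear Regression, as used above for execution-time estimation, fits a separate weighted least-squares model around each query point, with training points weighted by their distance to the query. A minimal sketch of the technique (the Gaussian kernel, bandwidth `tau`, and function names are illustrative assumptions, not the thesis's code):

```python
import numpy as np

def lwlr_predict(x_query, X, y, tau=1.0):
    """Predict y at x_query by fitting a locally weighted linear model."""
    # Gaussian kernel weights: points near the query count more
    w = np.exp(-np.sum((X - x_query) ** 2, axis=1) / (2.0 * tau ** 2))
    # Design matrix with an intercept column
    Xb = np.hstack([np.ones((X.shape[0], 1)), X])
    W = np.diag(w)
    # Solve the weighted normal equations for the local coefficients
    theta = np.linalg.solve(Xb.T @ W @ Xb, Xb.T @ W @ y)
    return theta[0] + theta[1:] @ x_query
```

Because a fresh local fit is made per query, LWLR can track nonlinear relationships (such as execution time versus input size) without choosing a global model form.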
A multi-objective framework for long-term generation expansion planning with variable renewables
The growing importance of operational flexibility in generation expansion planning with increased integration of variable renewables has been regularly highlighted in recent research. Yet, operational flexibility has been largely overlooked in order to reduce the prohibitive problem size that results when operational details at small timescales are included in this long-term exercise. In this work, we present a multi-objective optimization framework that effectively and tractably incorporates flexibility screening of candidate generation portfolios in long-term generation expansion planning. Operational flexibility is considered as a separate objective along with the traditional economic and environmental objectives. The ability of the proposed methodology to provide valuable insights into the correlations between flexibility, total costs and carbon emissions is demonstrated using a case study. The results clearly reveal that omission of flexibility from the framework gives rise to deficient generation mixes that are unable to match the more frequent and steeper variations in net load. A high-level evaluation of the flexibility needed in generation portfolios to balance net loads with different degrees of variability is also provided. Finally, a procedure is proposed to support the decision-making process for selecting the most appropriate investment plan among the many solution options provided by the multi-objective optimization framework
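Selecting among the many candidate plans produced by such a multi-objective framework typically starts from the set of non-dominated (Pareto-optimal) solutions. A minimal sketch of that filtering step, assuming every objective (e.g. cost, emissions, a flexibility deficit) is to be minimized; this is a generic illustration, not the paper's solver:

```python
def pareto_front(points):
    """Return the non-dominated subset of a list of objective tuples.

    A point p is dominated if some other point q is no worse in every
    objective and strictly better in at least one (q <= p and q != p).
    """
    front = []
    for i, p in enumerate(points):
        dominated = any(
            all(q[k] <= p[k] for k in range(len(p))) and q != p
            for j, q in enumerate(points)
            if j != i
        )
        if not dominated:
            front.append(p)
    return front
```

Decision-making procedures such as the one proposed in the paper then rank or screen the points on this front rather than the full solution set.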
Methodology for identifying alternative solutions in a population based data generation approach applied to synthetic biology
This thesis was submitted for the award of Doctor of Philosophy and was awarded by Brunel University London. Design is an essential component of sustainable development. Computational modelling has
become a useful technique that facilitates the design of complex systems. The variables that characterise
a complex system are encoded into a computational model using mathematical concepts,
and through simulation these variables, alone or in combination, are modified to observe
the changes in the outcome. This allows researchers to make predictions about how the real
system being studied will respond to those changes. The ultimate goal of any
design process is to come up with the best design: as resources are limited, to minimize cost
and resource consumption, and to maximize performance, profit and efficiency. To optimize
means to find the best solution, the best compromise among several conflicting demands subject
to predefined requirements. Therefore, computational optimization, modelling and simulation
form an integral part of modern design practice.
This thesis defines a data-analytics-driven methodology that enables the identification of
alternative solutions in computational design by analysing the generational history of the
population-based heuristic search used to generate the designs. While optimisation focuses on
obtaining the optimal solution, this methodology focuses on alternative solutions that are
suboptimal in fitness, or solutions with similar fitness but different structures. When the optimal
design solution is less robust, alternative solutions can offer sufficiently good accuracy and an
achievable resource requirement. The main advantage of the methodology is that it exploits the
exploration of the solution space during a single run by also considering suboptimal
solutions, which are usually neglected in the search for an optimal one. The history of the
heuristic search is analysed for the emergence of alternative solutions and the evolution of a solution.
By examining how an initial solution converts into an optimal solution, core design patterns are
identified, and these are used to improve the design process. Further, this method limits the
number of runs of the heuristic search, as more of the solution space is covered. The methodology is
generic because it can be applied to any instance where a population-based heuristic search is used
to generate optimal designs. The applicability of the methodology is demonstrated using
three case studies from mathematics (building a mathematical function for a set target) and
biology (obtaining alternative designs for genome-scale metabolic models [GEM] and DNA walker
circuits). In each case a different heuristic search method was used: gene expression programming
(mathematical expressions), genetic algorithms (GEM models) and simulated annealing
(DNA walker circuits). Descriptive analytics, visual analytics and clustering were mainly used to build the data-analytics-driven approach to identifying alternative solutions. This data-analytics-driven
methodology is useful in optimising the computational design of complex systems
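The core idea, keeping near-optimal solutions from the search history whose structure differs from the best one found, can be sketched as below. This is an illustrative reconstruction under assumed representations (binary solution vectors, positive fitness values, Hamming distance as the structural measure), not the thesis's actual pipeline:

```python
import numpy as np

def alternative_solutions(history, fitness, tol=0.05, min_dist=2):
    """Filter a search history for structurally distinct near-optimal solutions.

    history: list of binary solution vectors seen across generations
    fitness: parallel list of fitness values (assumed positive; higher is better)
    tol:     relative fitness tolerance around the best solution
    min_dist: minimum Hamming distance from the best solution's structure
    """
    best = max(fitness)
    best_sol = np.array(history[int(np.argmax(fitness))])
    alternatives = []
    for sol, f in zip(history, fitness):
        near_optimal = f >= best * (1.0 - tol)
        # Structural difference measured as Hamming distance to the best design
        different = int(np.sum(np.array(sol) != best_sol)) >= min_dist
        if near_optimal and different:
            alternatives.append(sol)
    return alternatives
```

In this sketch, raising `tol` widens the fitness band (more alternatives) while raising `min_dist` demands greater structural novelty; clustering the surviving solutions, as the thesis does with its analytics pipeline, would then group them into distinct design families.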
Renewable electricity generation and transmission network developments in light of public opposition: Insights from Ireland. ESRI Working Paper No. 653 March 2020
This paper analyses how people's attitudes towards onshore wind power and overhead transmission lines affect the cost-optimal
development of electricity generation mixes under a high renewable energy policy. For that purpose, we use a power
systems generation and transmission expansion planning model, combined with information on public attitudes towards energy
infrastructure on the island of Ireland. Overall, households have a positive attitude towards onshore wind power but their
willingness to accept wind farms near their homes tends to be low. Opposition to overhead transmission lines is even greater. This
can lead to a substantial increase in the costs of expanding the power system. In the Irish case, costs escalate by more than 4.3%
when public opposition is factored into the constrained optimisation of power generation and grid expansion planning across the
island. This is mainly driven by the compounded effects of higher capacity investments in more expensive technologies such as
offshore wind and solar photovoltaic to compensate for lower levels of onshore wind generation and grid reinforcements. The
results also reveal the effect of public opposition on the value of onshore wind, via shadow prices. The higher the level of public
opposition, the higher the shadow value of onshore wind. This effect starkly differs across regions: regions with more wind
resources or closer to major demand centres have the highest shadow prices. These shadow costs can guide policy makers when
designing incentive mechanisms to garner public support for onshore wind installations