k-Means Clustering via the Frank-Wolfe Algorithm
Abstract. We show that k-means clustering is a matrix factorization problem. Seen from this point of view, k-means clustering can be computed using alternating least squares techniques, and we show how the constrained optimization steps involved in this procedure can be solved efficiently using the Frank-Wolfe algorithm.
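The matrix factorization view above can be illustrated with a minimal sketch. It assumes the data are the columns of X, the centroids are the columns of M, and Z is a k x n assignment matrix. The function names are hypothetical, and the Frank-Wolfe step solves a relaxed problem in which each column of Z lies in the probability simplex. It is not necessarily the exact procedure of the paper.

```python
import numpy as np

def one_hot_assignments(labels, k):
    """Build the k x n indicator matrix Z with Z[c, j] = 1 iff point j is in cluster c."""
    Z = np.zeros((k, labels.size))
    Z[labels, np.arange(labels.size)] = 1.0
    return Z

def update_centroids(X, Z):
    """Least-squares step M = X Z^T (Z Z^T)^+; for one-hot Z these are the cluster means."""
    return X @ Z.T @ np.linalg.pinv(Z @ Z.T)

def frank_wolfe_assignments(X, M, n_iter=200):
    """Approximately minimise ||X - M Z||_F^2 over Z with columns in the simplex (relaxation)."""
    k, n = M.shape[1], X.shape[1]
    Z = np.full((k, n), 1.0 / k)                      # feasible start: uniform columns
    for t in range(n_iter):
        G = 2.0 * M.T @ (M @ Z - X)                   # gradient of the objective w.r.t. Z
        S = np.zeros_like(Z)
        S[np.argmin(G, axis=0), np.arange(n)] = 1.0   # best simplex vertex for each column
        Z += 2.0 / (t + 2.0) * (S - Z)                # convex combination keeps Z feasible
    return Z
```

Alternating `update_centroids` with an assignment step gives an alternating least squares loop. Taking the largest entry of each relaxed column is one possible way to recover hard cluster labels.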
Optimizing persistent currents in a ring-shaped Bose-Einstein condensate using machine learning
We demonstrate a method for generating persistent currents in Bose-Einstein
condensates by using a Gaussian process learner to experimentally control the
stirring of the superfluid. The learner optimizes four different outcomes of
the stirring process: (O.I) targeting and (O.II) maximization of the persistent
current winding number; and (O.III) targeting and (O.IV) maximization with time
constraints. The learner optimizations are determined based on the achieved
winding number and the number of spurious vortices introduced by stirring. We
find that the learner is successful in optimizing the stirring protocols,
although the optimal stirring profiles vary significantly depending on
the choice of cost function and scenario. These results suggest that stirring
is robust and persistent currents can be reliably generated through a variety
of stirring approaches.
Comment: 11 pages, 8 figures, 1 table
Design and operation of energy systems under uncertainty: a comparison between deterministic and stochastic approach
The aim of this thesis is to study the design and operation phases of an energy system under uncertainty. In particular, the results should explain whether modelling the uncertainty associated with global solar irradiance and air temperature helps improve design choices, such as component sizes.
Introducing uncertainty matters for many reasons, such as climate change, unexpected market conditions, the evolution of energy demand, and interactive planning. Many studies highlight the advantages of uncertainty analysis over traditional deterministic approaches. For the design and operation of an energy system, climate conditions, the price of energy carriers and energy demand are the main uncertain parameters.
In this work, only climate conditions are considered as a source of uncertainty, to see how much they can affect design solutions. To address the problem, a residential multi-energy system is considered: the analysis is framed as if conducted in the year 2010, seeking the best solution for the “future”, the period 2010-2020, using historical data from the period 2005-2009. Deterministic and two-stage stochastic models of this system are developed to compare the optimised solutions with the reference one for the period 2010-2020.
First, the relevance of air temperature in the clustering process is discussed: this parameter is rarely considered in the literature, but it improves the quality of the dataset representation. In fact, clustering on global solar irradiance alone yields, on average, 10% fewer well-positioned elements than clustering on both irradiance and air temperature.
Then, attention turns to different methods for generating the representative days that form the optimisation period, to see which is most suitable for the design phase of an energy system. Clustering techniques are compared with average seasonal and monthly profiles. Generation of seasonal clusters is also discussed. Average profiles prove to be the worst, with relative errors of up to 13% in the objective function with respect to the reference solution. Annual clusters are better than seasonal ones when the number of representative days is low (4 or 8) or high (28).
Finally, an innovative two-step clustering procedure for generating scenarios for representative days is presented. The idea is to assign a set of scenarios to each representative day. However, the solutions obtained are too conservative, which is consistent with stochastic programming theory but entails higher total costs.
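The selection of representative days by clustering can be sketched as follows. This is an illustration under stated assumptions, not the thesis's procedure. Each day is assumed to be a row of hourly irradiance and temperature values, plain Lloyd-style k-means is used, and the function name `representative_days` is hypothetical. Each cluster is represented by the real day closest to its centroid, weighted by the number of days in the cluster.

```python
import numpy as np

def representative_days(profiles, k, n_iter=50, seed=0):
    """Cluster daily profiles with k-means; return (representative day indices, weights).

    profiles: (n_days, n_features) array, e.g. 24 irradiance + 24 temperature values per day.
    """
    rng = np.random.default_rng(seed)
    # Standardise each feature so irradiance and temperature carry comparable weight.
    std = profiles.std(axis=0)
    X = (profiles - profiles.mean(axis=0)) / np.where(std > 0, std, 1.0)
    centroids = X[rng.choice(len(X), size=k, replace=False)]   # random initial centroids
    for _ in range(n_iter):
        dist = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        for c in range(k):
            if np.any(labels == c):        # keep the old centroid if a cluster empties
                centroids[c] = X[labels == c].mean(axis=0)
    # Final assignment against the last centroids.
    dist = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    labels = dist.argmin(axis=1)
    days, weights = [], []
    for c in range(k):
        members = np.flatnonzero(labels == c)
        if members.size == 0:
            continue
        days.append(int(members[dist[members, c].argmin()]))   # real day nearest the centroid
        weights.append(int(members.size))                       # number of days it stands for
    return days, weights
```

Calling `representative_days(profiles, 4)` on a year of data would return up to four day indices whose weights sum to the number of days. Dropping the temperature columns from `profiles` gives the irradiance-only variant that the abstract compares against.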
Computational Intelligence Techniques for OES Data Analysis
Semiconductor manufacturers are forced by market demand to continually
deliver lower cost and faster devices. This results in complex industrial processes
that, with continuous evolution, aim to improve quality and reduce
costs. Plasma etching processes have been identified as a critical part of the
production of semiconductor devices. It is therefore important to have good
control over plasma etching but this is a challenging task due to the complex
physics involved.
Optical Emission Spectroscopy (OES) measurements can be collected
non-intrusively during wafer processing and are being used more and more
in semiconductor manufacturing as they provide real time plasma chemical
information. However, the use of OES measurements is challenging due to
their complexity, high dimension and the presence of many redundant variables.
The development of advanced analysis algorithms for virtual metrology,
anomaly detection and variables selection is fundamental in order to
effectively use OES measurements in a production process.
This thesis focuses on computational intelligence techniques for OES data
analysis in semiconductor manufacturing presenting both theoretical results
and industrial application studies. To begin with, a spectrum alignment
algorithm is developed to align OES measurements from different sensors.
Then supervised variables selection algorithms are developed. These are defined
as improved versions of the LASSO estimator with a view to selecting
a more stable set of variables and achieving better prediction performance in virtual
metrology applications. After this, the focus of the thesis moves to the unsupervised
variables selection problem. The Forward Selection Component
Analysis (FSCA) algorithm is improved with the introduction of computationally
efficient implementations and different refinement procedures. Nonlinear
extensions of FSCA are also proposed. Finally, the fundamental topic
of anomaly detection is investigated and an unsupervised variables selection
algorithm tailored to anomaly detection is developed. In addition, it is shown
how OES data can be effectively used for semi-supervised anomaly detection
in a semiconductor manufacturing process.
The developed algorithms open up opportunities for the effective use of
OES data for advanced process control. All the developed methodologies
require minimal user intervention and provide easy-to-interpret models. This
makes them practical for engineers to use during production for process monitoring
and for in-line detection and diagnosis of process issues, thereby resulting
in an overall improvement in production performance.
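The unsupervised variable selection idea mentioned above can be illustrated with a minimal sketch of greedy forward selection by explained variance, in the spirit of FSCA. It assumes the rows of X are samples and the columns are variables (for example, OES wavelengths). The function name is hypothetical, and the efficient implementations, refinement procedures and nonlinear extensions developed in the thesis are not reproduced here.

```python
import numpy as np

def forward_selection(X, n_select):
    """Greedily pick columns of X that explain the most variance across all columns.

    Assumes n_select does not exceed the rank of the centred data.
    """
    R = X - X.mean(axis=0)                  # centred residual matrix
    total = (R ** 2).sum()                  # total variance (up to a constant factor)
    selected, explained = [], []
    for _ in range(n_select):
        norms = (R ** 2).sum(axis=0)
        scores = np.zeros(R.shape[1])
        ok = norms > 1e-12 * total          # skip columns that are already explained
        # Variance of R captured by regressing every column on candidate column j.
        scores[ok] = ((R.T @ R[:, ok]) ** 2).sum(axis=0) / norms[ok]
        j = int(scores.argmax())
        r = R[:, [j]]
        R = R - r @ (r.T @ R) / norms[j]    # deflate: remove what column j explains
        selected.append(j)
        explained.append(1.0 - (R ** 2).sum() / total)
    return selected, explained
```

The returned `explained` list gives the cumulative fraction of variance captured after each selection, which is one way to decide how many variables to keep.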