363 research outputs found

    Cluster-aware Semi-supervised Learning: Relational Knowledge Distillation Provably Learns Clustering

    Despite the empirical success and practical significance of (relational) knowledge distillation that matches (the relations of) features between teacher and student models, the corresponding theoretical interpretations remain limited for various knowledge distillation paradigms. In this work, we take an initial step toward a theoretical understanding of relational knowledge distillation (RKD), with a focus on semi-supervised classification problems. We start by casting RKD as spectral clustering on a population-induced graph unveiled by a teacher model. Via a notion of clustering error that quantifies the discrepancy between the predicted and ground truth clusterings, we illustrate that RKD over the population provably leads to low clustering error. Moreover, we provide a sample complexity bound for RKD with limited unlabeled samples. For semi-supervised learning, we further demonstrate the label efficiency of RKD through a general framework of cluster-aware semi-supervised learning that assumes low clustering errors. Finally, by unifying data augmentation consistency regularization into this cluster-aware framework, we show that despite the common effect of learning accurate clusterings, RKD facilitates a "global" perspective through spectral clustering, whereas consistency regularization focuses on a "local" perspective via expansion
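The abstract above refers to a clustering error that measures the discrepancy between predicted and ground-truth clusterings. A minimal sketch of one common, permutation-invariant way to compute such an error follows. The function name `clustering_error` and the choice of a minimum over label permutations are assumptions made for illustration; the paper's exact definition may differ.

```python
from itertools import permutations


def clustering_error(pred, truth, k):
    """Fraction of misassigned points, minimised over relabelings of the k clusters.

    Hypothetical helper: cluster ids are assumed to be integers in range(k).
    Brute force over k! permutations, so only suitable for small k.
    """
    n = len(truth)
    best = n
    for perm in permutations(range(k)):
        # perm[p] is the ground-truth label that predicted cluster p is mapped to
        mismatches = sum(1 for p, t in zip(pred, truth) if perm[p] != t)
        best = min(best, mismatches)
    return best / n


# Predicted ids are a relabeling of the truth except for the last point
err = clustering_error([1, 1, 0, 0, 2, 2], [0, 0, 1, 1, 2, 1], k=3)
```

Because cluster ids are arbitrary, the error is taken over the best matching of predicted to true labels rather than by comparing ids directly.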

    Traction force microscopy with optimized regularization and automated Bayesian parameter selection for comparing cells

    Adherent cells exert traction forces onto their environment, which allows them to migrate, to maintain tissue integrity, and to form complex multicellular structures. This traction can be measured in a perturbation-free manner with traction force microscopy (TFM). In TFM, traction is usually calculated via the solution of a linear system, which is complicated by undersampled input data, acquisition noise, and large condition numbers for some methods. Therefore, standard TFM algorithms employ either data filtering or regularization. However, these approaches require a manual selection of filter or regularization parameters and consequently exhibit a substantial degree of subjectiveness. This shortcoming is particularly serious when cells in different conditions are to be compared because optimal noise suppression needs to be adapted for every situation, which invariably results in systematic errors. Here, we systematically test the performance of new methods from computer vision and Bayesian inference for solving the inverse problem in TFM. We compare two classical schemes, L1- and L2-regularization, with three previously untested schemes, namely Elastic Net regularization, Proximal Gradient Lasso, and Proximal Gradient Elastic Net. Overall, we find that Elastic Net regularization, which combines L1 and L2 regularization, outperforms all other methods with regard to accuracy of traction reconstruction. Next, we develop two methods, Bayesian L2 regularization and Advanced Bayesian L2 regularization, for automatic, optimal L2 regularization. Using artificial data and experimental data, we show that these methods enable robust reconstruction of traction without requiring a difficult selection of regularization parameters specifically for each data set. Thus, Bayesian methods can mitigate the considerable uncertainty inherent in comparing cellular traction forces
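The abstract above compares regularization schemes for a linear inverse problem, including a proximal gradient Elastic Net. A minimal sketch of that general technique on a toy system follows, assuming the objective 0.5*||Ax - b||^2 + lam1*||x||_1 + 0.5*lam2*||x||^2. The function names, the fixed step size, and the toy matrix are illustrative assumptions and are not taken from the paper's implementation.

```python
def soft_threshold(v, tau):
    """Proximal operator of tau * |v| (scalar soft-thresholding)."""
    if v > tau:
        return v - tau
    if v < -tau:
        return v + tau
    return 0.0


def elastic_net_prox_grad(A, b, lam1, lam2, step, n_iter=500):
    """Minimise 0.5*||A x - b||^2 + lam1*||x||_1 + 0.5*lam2*||x||^2.

    Hypothetical helper using plain lists. `step` is assumed to be at most
    1 / (largest eigenvalue of A^T A) so that the iteration converges.
    """
    m, n = len(A), len(A[0])
    x = [0.0] * n
    for _ in range(n_iter):
        # Residual r = A x - b
        r = [sum(A[i][j] * x[j] for j in range(n)) - b[i] for i in range(m)]
        # Gradient of the smooth data term: A^T r
        g = [sum(A[i][j] * r[i] for i in range(m)) for j in range(n)]
        # Gradient step, then the elastic-net prox: soft-threshold, then shrink
        x = [
            soft_threshold(x[j] - step * g[j], step * lam1) / (1.0 + step * lam2)
            for j in range(n)
        ]
    return x


# Toy problem with A = identity, so the minimiser is soft_threshold(b, lam1) / (1 + lam2)
x_hat = elastic_net_prox_grad([[1.0, 0.0], [0.0, 1.0]], [3.0, 0.2],
                              lam1=0.5, lam2=1.0, step=0.5)
```

The L1 term drives small components exactly to zero while the L2 term shrinks the remaining ones, which is the combination the abstract credits for the Elastic Net's accuracy.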

    A Data-Driven Approach for Generating Vortex Shedding Regime Maps for an Oscillating Cylinder

    Recent developments in wind energy extraction methods from vortex-induced vibration (VIV) have fueled research into vortex shedding behaviour. The vortex shedding map is vital for the consistent use of normalized amplitude and wavelength to validate the predictive power of forced vibration experiments. However, there is a lack of demonstrated methods for generating this map at Reynolds numbers feasible for energy generation, owing to the high computational cost and complex dynamics. Data-driven methods address the limitations of traditional experimental vortex shedding map generation, which requires large amounts of data and intensive supervision and is therefore unsuitable for many applications and Reynolds numbers. This thesis presents a data-driven approach for generating vortex shedding maps of a cylinder undergoing forced vibration that requires less data and supervision while accurately extracting the underlying vortex structure patterns. The quantitative analysis in this dissertation requires the univariate time series signatures of local fluid flow measurements in the wake of an oscillating cylinder experiencing forced vibration. The datasets were extracted from a 2-dimensional computational fluid dynamics (CFD) simulation of a cylinder oscillating at various normalized amplitude and wavelength parameters, conducted at two discrete Reynolds numbers of 4000 and 10,000. First, the validity of clustering local flow measurements was demonstrated by proposing a vortex shedding mode classification strategy using the supervised machine learning models random forest and k-nearest neighbour, which achieved 99.3% and 99.8% classification accuracy, respectively, using the velocity sensors oriented transverse to the predominant flow. Next, the dataset of local flow measurements of the transverse component of velocity was used to develop the procedure for generating vortex shedding maps using unsupervised clustering techniques. The clustering task was conducted on subsequences of repeated patterns extracted from the whole time series using the novel matrix profile method. The vortex shedding map was validated by reproducing a benchmark map produced at a low Reynolds number. The method was then extended to a higher Reynolds number case of vortex shedding and demonstrated the insight gained into the underlying dynamical regimes of the physical system. The proposed multi-step clustering methods, denoted Hybrid Method B, combining Density-Based Spatial Clustering of Applications with Noise (DBSCAN) and Agglomerative algorithms, and Hybrid Method C, combining k-Means and Agglomerative algorithms, demonstrated the ability to extract meaningful clusters from more complex vortex structures that become increasingly indistinguishable. The data-driven methods yield exceptional performance and versatility, which significantly improves the map generation method while reducing the data input and supervision required
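The abstract above extracts repeated subsequences from a time series with the matrix profile before clustering them. A minimal brute-force sketch of the matrix profile idea follows: for every window, it records the z-normalised Euclidean distance to its nearest non-overlapping neighbour. The function names, the exclusion zone of half a window, and the toy periodic signal are assumptions for illustration; practical implementations use much faster algorithms, and this is not the thesis's code.

```python
import math


def znorm(seq):
    """Z-normalise a window; a constant window maps to all zeros."""
    mean = sum(seq) / len(seq)
    std = math.sqrt(sum((v - mean) ** 2 for v in seq) / len(seq))
    if std == 0:
        return [0.0] * len(seq)
    return [(v - mean) / std for v in seq]


def matrix_profile(series, m):
    """Return (profile, index) for window length m, by brute force in O(n^2 * m)."""
    n = len(series) - m + 1
    subs = [znorm(series[i:i + m]) for i in range(n)]
    excl = m // 2  # assumed exclusion zone, to skip trivial self-matches
    profile, index = [], []
    for i in range(n):
        best, best_j = math.inf, -1
        for j in range(n):
            if abs(i - j) <= excl:
                continue
            d = math.sqrt(sum((a - b) ** 2 for a, b in zip(subs[i], subs[j])))
            if d < best:
                best, best_j = d, j
        profile.append(best)
        index.append(best_j)
    return profile, index


# Toy periodic signal: every window has an exact repeat one period (4 samples) away
signal = [0.0, 1.0, 0.0, -1.0] * 6
profile, index = matrix_profile(signal, m=4)
```

Low profile values mark windows that recur elsewhere in the series, which is what makes them candidates for the repeated patterns that are later clustered.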

    Reduced Order Models for Hydrodynamic Analysis of Pipelines based on Modal Analysis and Machine Learning

    Bluff bodies are used extensively in engineering. For instance, marine risers, jumpers, umbilicals, and bundle flowlines are examples of circular bluff bodies used in offshore engineering. They can be designed to be fixed or flexibly supported, and in single or tandem configurations. Subsea structures are subjected to external hydrodynamic loads such as cyclic loads, vibrations, and high pressure. For example, one of the commonly observed damaging flow features associated with the hydrodynamics of flexibly supported bluff bodies is vortex shedding. Therefore, the instantaneous flow structures and the hydrodynamic forces acting on bluff bodies at operational conditions need to be investigated to prevent degradation mechanisms and increase the service life of subsea structures. Modern techniques are now used to analyse flow field data with high efficiency. For example, neural networks (NNs) trained with massive experimental or numerical simulation data to predict the spatial-temporal evolution of the dominant coherent structures of the flow field and the structural behaviour can be considered an alternative to conventional computational fluid dynamics (CFD) simulations. In the present thesis, numerical investigations of the flow around cylindrical bluff bodies in the upper transition Reynolds number regime (Re = 3.6 × 10^6) are performed. Two pipeline operational conditions are considered. The first is a tandem configuration of two stationary pipelines subjected to steady flow. The second is a pipeline undergoing vortex-induced vibrations (VIV) in a steady current. The two-dimensional (2D) Unsteady Reynolds-Averaged Navier-Stokes (URANS) equations combined with the standard k-ω SST turbulence model are solved. The open source CFD toolbox OpenFOAM v2012 is employed to perform the simulations. Reduced Order Models (ROMs), which can provide a low-dimensional representation of the simulation data with reduced computational time and cost, are designed. Dynamic mode decomposition (DMD) and proper orthogonal decomposition (POD) techniques are implemented for the first and second cases, respectively. In addition, the ROMs for the VIV cylinder case are developed further by implementing a long short-term memory neural network (LSTM-NN). The neural network based model makes it possible to predict the dominant hydrodynamic characteristics of the flow around cylindrical bluff bodies subjected to a high Reynolds number flow at future time instances with a reduced computational cost

    Simultaneous Behavior Onset Detection and Task Classification for Patients with Parkinson Disease Using Subthalamic Nucleus Local Field Potentials

    This thesis aims to develop methods for behavior onset detection in patients with Parkinson's disease (PD), as well as to investigate models for classifying the different behavioral tasks performed by PD patients. The detection is based on recorded Local Field Potentials (LFP) of the Subthalamic nucleus (STN), captured during the Deep Brain Stimulation (DBS) process. One main part of this work is dedicated to researching various properties and features of the STN LFP signals under several patients' behavior conditions. Features based on temporal and time-frequency analysis of the signals are developed and implemented. The features are evaluated and compared on several patients' data during a classification process, using onset windows of preprocessed signals. Another part of this research concentrates on automated onset detection of behavioral tasks for patients with PD using the LFP signals collected during DBS implantation surgeries. Using time-frequency signal processing methods, features are extracted and clustered in the feature space for onset detection. Then, a supervised model is employed that uses Discrete Hidden Markov Models (DHMM) to specify the onset location of the behavior in the LFP signal. Finally, a method for simultaneous onset detection and task classification for patients with PD is presented, which classifies the tasks into motor, language, and combined motor and language behaviors, using LFP signals collected during DBS implantation surgeries. Again, time-frequency signal processing methods are applied, and features are extracted and clustered in the feature space. The features extracted from the automatically detected onsets are used to classify the behavior task into predefined categories. DHMM is merged with SVM in a two-layer classifier to boost the behavior classification rate to 84%, and the presented methodology is justified using the experimental results

    A new hybrid algorithm for multi‐objective reactive power planning via facts devices and renewable wind resources

    The power system planning problem, considering the system loss function, the voltage profile function, the cost function of FACTS (flexible alternating current transmission system) devices, and the stability function, is investigated in this paper. With the growth of electronic technologies, FACTS devices have improved stability and made reactive power (RP) planning more reliable. In addition, in modern power systems, renewable resources have an inevitable effect on power system planning. Wind resources therefore complicate the planning problem because of conflicting functions and non-linear constraints. This conflict stems from the stochastic nature of the cost, loss, and voltage functions, which cannot be summarized in a single function. A multi-objective hybrid algorithm that combines particle swarm optimization (PSO) and the virus colony search (VCS) is proposed to solve this problem while considering the linear and non-linear constraints. VCS is a new optimization method based on viruses' search function to destroy host cells and cause the penetration of the best virus into a cell for reproduction. In the proposed model, PSO is used to enhance local and global search. In addition, non-dominated sorting under the Pareto criterion is used to sort the data. The optimization results on different scenarios reveal that the proposed hybrid algorithm can improve parameters such as convergence time, index of voltage stability, and absolute magnitude of voltage deviation, and that it can reduce the total transmission line losses. In addition, the presence of wind resources has a positive effect on these measures
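The abstract above mentions sorting candidate solutions with a non-dominated sort under the Pareto criterion. A minimal sketch of that general technique follows, assuming all objectives are minimised. The function names and the two-objective example values (labelled here as a loss and a voltage deviation) are hypothetical and are not results from the paper.

```python
def dominates(a, b):
    """True if a is no worse than b in every objective and strictly better in one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def non_dominated_sort(points):
    """Split points into successive Pareto fronts, returned as lists of indices.

    Front 0 holds the points no other point dominates; front 1 holds the points
    that become non-dominated once front 0 is removed, and so on.
    """
    remaining = set(range(len(points)))
    fronts = []
    while remaining:
        front = sorted(
            i for i in remaining
            if not any(dominates(points[j], points[i]) for j in remaining if j != i)
        )
        fronts.append(front)
        remaining -= set(front)
    return fronts


# Hypothetical (loss, voltage deviation) pairs for five candidate plans
candidates = [(1.0, 5.0), (2.0, 3.0), (4.0, 1.0), (3.0, 4.0), (5.0, 5.0)]
fronts = non_dominated_sort(candidates)  # [[0, 1, 2], [3], [4]]
```

Ranking by fronts avoids collapsing the conflicting objectives into a single weighted score, which suits problems where no one function summarizes the trade-off.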