
    Adaptive Ensemble of Classifiers with Regularization for Imbalanced Data Classification

    The dynamic ensemble selection of classifiers is an effective approach for classifying label-imbalanced data. However, the technique is prone to overfitting, owing to its lack of regularization methods and its dependence on local geometry. In this study, focusing on binary imbalanced data classification, a novel dynamic ensemble method, namely adaptive ensemble of classifiers with regularization (AER), is proposed to overcome these limitations. The method addresses the overfitting problem through implicit regularization. Specifically, it leverages the properties of stochastic gradient descent to obtain the solution with the minimum norm, thereby achieving regularization; furthermore, it interpolates the ensemble weights by exploiting the global geometry of the data to further prevent overfitting. According to our theoretical proofs, the seemingly complicated AER paradigm, in addition to its regularization capabilities, can actually reduce the asymptotic time and memory complexities of several other algorithms. We evaluate the proposed AER method on seven benchmark imbalanced datasets from the UCI machine learning repository and one artificially generated GMM-based dataset with five variations. The results show that the proposed algorithm outperforms the major existing algorithms on multiple metrics in most cases, and two hypothesis tests (McNemar's and Wilcoxon tests) further verify the statistical significance. In addition, the proposed method has other desirable properties, such as particular advantages in dealing with highly imbalanced data, and it pioneers research on regularization for dynamic ensemble methods. Comment: Major revision; change of authors due to contribution.
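The implicit-regularization property that the abstract relies on can be illustrated on a toy problem. The sketch below is not the AER algorithm; it only shows, under simple assumptions, that stochastic gradient descent started from zero on an underdetermined least-squares problem settles on the minimum-norm solution among all exact fits. The function name `sgd_least_squares`, the learning rate, the step count, and the example system are all hypothetical choices made for illustration.

```python
import random


def sgd_least_squares(A, b, lr=0.1, steps=5000, seed=0):
    """SGD on 0.5 * (a_i . w - b_i)^2, one random row per step, from w = 0.

    Every update adds a multiple of a row of A, so w never leaves the row
    space of A. That is why the limit is the minimum-norm exact solution.
    """
    rng = random.Random(seed)  # seeded so the run is reproducible
    w = [0.0] * len(A[0])
    for _ in range(steps):
        i = rng.randrange(len(A))
        row = A[i]
        err = sum(a * x for a, x in zip(row, w)) - b[i]
        w = [x - lr * err * a for x, a in zip(w, row)]
    return w


# Hypothetical system: two equations, three unknowns, infinitely many fits.
A = [[1.0, 1.0, 0.0],
     [0.0, 1.0, 1.0]]
b = [2.0, 2.0]

w = sgd_least_squares(A, b)

# Both (2, 0, 2) and (2/3, 4/3, 2/3) satisfy A w = b exactly; the second one
# is the minimum-norm solution A^T (A A^T)^{-1} b.
min_norm = [2.0 / 3.0, 4.0 / 3.0, 2.0 / 3.0]
print(w, min_norm)
```

Starting from a nonzero initial point would generally lead to a different exact fit, so the zero initialization is part of the assumption here.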

    Scalar Quantization as Sparse Least Square Optimization

    Quantization can be used to form new vectors/matrices with shared values close to the original. In recent years, the popularity of scalar quantization for value-sharing applications has been soaring, as it has proved highly useful in reducing the complexity of neural networks. Existing clustering-based quantization techniques, while well developed, have multiple drawbacks, including dependence on the random seed, empty or out-of-range clusters, and high time complexity for a large number of clusters. To overcome these problems, this paper examines scalar quantization from a new perspective, namely sparse least square optimization. Specifically, inspired by the properties of sparse least square regression, several quantization algorithms based on $l_1$ least square are proposed. In addition, similar schemes with $l_1 + l_2$ and $l_0$ regularization are proposed. Furthermore, to compute quantization results with a given number of values/clusters, this paper designs an iterative method and a clustering-based method, both built on sparse least square. The paper shows that the latter method is mathematically equivalent to an improved version of the k-means clustering-based quantization algorithm, although the two algorithms originated from different intuitions. The proposed algorithms were tested with three types of data, and their computational performance, including information loss, time consumption, and the distribution of the values of the sparse vectors, was compared and analyzed. The paper offers a new perspective on quantization, and the proposed algorithms can outperform existing methods, especially in some bit-width reduction scenarios where the required post-quantization resolution (number of values) is not significantly lower than the original number.
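For context, the clustering-based baseline that the abstract contrasts against can be sketched in a few lines. This is a minimal 1-D k-means (Lloyd) quantizer, not the sparse least square method proposed in the paper. The function name `kmeans_scalar_quantize`, the evenly spaced initialization (chosen here to sidestep the random-seed dependence the abstract mentions), and the sample values are all assumptions made for illustration.

```python
def kmeans_scalar_quantize(values, k, iters=50):
    """Quantize scalars to at most k shared values with 1-D k-means."""
    lo, hi = min(values), max(values)
    # Deterministic start: k centers evenly spaced over the value range.
    centers = [lo + (hi - lo) * (i + 0.5) / k for i in range(k)]
    for _ in range(iters):
        buckets = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda c: abs(v - centers[c]))
            buckets[nearest].append(v)
        # An empty cluster keeps its old center instead of being dropped.
        updated = [sum(bk) / len(bk) if bk else centers[i]
                   for i, bk in enumerate(buckets)]
        if updated == centers:
            break
        centers = updated
    quantized = [min(centers, key=lambda c: abs(v - c)) for v in values]
    return quantized, centers


# Hypothetical weights: eight values collapsed onto three shared values.
weights = [0.10, 0.12, 0.11, 0.90, 0.95, 0.88, 0.50, 0.52]
quantized, centers = kmeans_scalar_quantize(weights, k=3)
print(centers)
print(quantized)
```

Each quantized entry is the center nearest to the original value, so the information loss can be measured as the squared distance between `weights` and `quantized`.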

    Integrating Wildfires Propagation Prediction Into Early Warning of Electrical Transmission Line Outages

    Wildfires can pose a significant danger to electrical transmission lines and cause considerable losses to power grids and nearby residents. Previous studies on preventing wildfire damage to electrical transmission lines mostly analyze wildfires and power system security independently, owing to the disciplinary differences between them, and cannot satisfy the power grid's requirement for active and timely responses. In this paper, we design an integrated wildfire early warning system framework for power grids that combines wildfire prediction with early warning of line outage probability. First, the proposed model simulates the spatiotemporal process of wildfires via a geographic cellular automata model and predicts when and where wildfires first enter the security buffer of an electrical transmission line. It is developed for transmission lines operating under various conditions of topography, vegetation, and wind and, especially, with multiple ignition points. Second, we propose a line outage model (LOM), based on wildfire prediction and the breakdown mechanisms of the air gap, to predict the time-varying breakdown probability and the most vulnerable poles at the scale of the whole line. Finally, to demonstrate the validity and rationality of the proposed system, a case study for a 500-kV transmission line near Miyi county, China, is presented, and the results under various wildfire situations are studied and compared. By integrating wildfire prediction into the LOM and raising alarms on the whole-line breakdown probability over time, this paper makes a significant contribution to early warning systems for preventing wildfire damage to transmission lines, illustrating the related breakdown mechanisms at the line operation level rather than in laboratory experiments only. Meanwhile, the implementation of the cellular automata model under comprehensive environmental conditions and the simulation of the breakdown probability for the 500-kV transmission line could serve as references for other studies in the community.
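The idea of predicting when and where a fire first enters a line's security buffer can be sketched with a toy cellular automaton. This is a deliberately simplified illustration, not the geographic model described in the abstract: it ignores topography, wind, and vegetation type, spreads deterministically to the four neighbouring cells each step, and treats a cell as either burnable or not. The function name `first_arrival_step`, the grid, the ignition points, and the buffer cells are all hypothetical.

```python
UNBURNED, BURNING, BURNED = 0, 1, 2


def first_arrival_step(fuel, ignitions, buffer_cells, max_steps=1000):
    """Return (step, cells) when fire first burns inside the buffer, else None.

    fuel[r][c] is True where a cell can burn. A burning cell ignites its
    unburned, burnable 4-neighbours on the next step and then burns out.
    """
    rows, cols = len(fuel), len(fuel[0])
    state = [[UNBURNED] * cols for _ in range(rows)]
    for r, c in ignitions:  # several ignition points may start at once
        state[r][c] = BURNING
    for step in range(max_steps + 1):
        burning = [(r, c) for r in range(rows) for c in range(cols)
                   if state[r][c] == BURNING]
        hits = sorted(cell for cell in burning if cell in buffer_cells)
        if hits:
            return step, hits
        if not burning:
            return None  # the fire died out before reaching the buffer
        nxt = [row[:] for row in state]
        for r, c in burning:
            nxt[r][c] = BURNED
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                nr, nc = r + dr, c + dc
                if (0 <= nr < rows and 0 <= nc < cols
                        and state[nr][nc] == UNBURNED and fuel[nr][nc]):
                    nxt[nr][nc] = BURNING
        state = nxt
    return None


# Hypothetical 3 x 5 landscape with one non-burnable cell acting as a firebreak.
fuel = [[True] * 5 for _ in range(3)]
fuel[0][2] = False
buffer_cells = {(0, 4), (1, 4), (2, 4)}  # cells next to the transmission line
result = first_arrival_step(fuel, [(0, 0), (2, 0)], buffer_cells)
print(result)  # (4, [(2, 4)])
```

In this example the fire from the lower ignition point reaches the buffer first, because the firebreak forces the upper fire to detour. A more realistic model would replace the deterministic spread rule with probabilities that depend on slope, wind, and fuel type.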

    Three-Dimensional Printing Multi-Drug Delivery Core/Shell Fiber Systems with Designed Release Capability

    Hydrogel systems able to control the delivery of multiple drugs have gained increasing interest for localized disease treatment and tissue engineering applications. In this study, a triple-drug-loaded model based on a core/shell fiber system (CFS) was fabricated through the co-axial 3D printing of hydrogel inks. A CFS with drug 1 loaded in the core, drug 2 in the shell, and drug 3 in the hollow channel was printed on a rotating collector using a co-axial nozzle. Doxorubicin (DOX), as the model drug, was loaded in the core, shell, and channel parts of the CFS, representing drugs 1, 2, and 3, respectively. Drug 2 achieved the fastest release, while drug 3 showed the slowest release, which indicated that three drugs printed at different locations in the CFS can achieve sequential triple-drug release. Moreover, the release rate and sustained duration of each drug could be controlled by the unique core/shell helical structure, the concentration of alginate gels, the cross-linking density, the size and number of the open orifices in the fibers, and the CFS. Additionally, near-infrared (NIR) laser-responsive or pH-responsive drug release could also be realized by introducing photo-thermal materials or a pH-sensitive polymer into this system. Finally, the drug-loaded system showed effective localized cancer therapy in vitro and in vivo. Therefore, the prepared CFS showed potential for disease treatment and tissue engineering through sequential or stimulus-responsive release of multiple drugs.

    Global Analysis of Influencing Forces of Fire Activity: the Threshold Relationships between Vegetation and Fire

    Many large-scale fire studies considered the relationships between fire and its influencing factors as smooth. However, the responses of fire activity to influencing factors could be abrupt on the global scale, because the hysteretic responses of vegetation to fire and vegetation types are discrete. This study examined the climatic, vegetation, anthropogenic, lightning, and topographic factors driving variations in global fire density, and discussed the thresholds of vegetation on fire activity. Fire density was developed from 7 years of Moderate Resolution Imaging Spectroradiometer (MODIS) active fire data to represent global fire activity, and nine typical influencing variables were selected. The random forest regression tree method was used to identify the relative importance of, and relationships between, fire and the influencing variables. The patterns of global fire density were captured well by the model (78.33% of the variance was explained), and the related thresholds were identified. Climatic factors played a primary role in determining global fire density. Agricultural land use and topographic roughness were not identified as the most important factors, probably due to the large scale we considered. Three intervals of tree density were identified to have distinct levels of fire density. Intermediate tree density (9%-53%) was associated with the highest fire density, but both low and high percentages of tree cover were associated with low fire density (7.0 vs. 1.3/0.9 counts per 100 km² per year). This study could provide further insights into the threshold effects of influencing factors on fire activity, and contribute to advances in fire modeling and vegetation distribution studies.

    A Comparative Study of Genetic Responses to Short- and Long-Term Habitat Fragmentation in a Distylous Herb Hedyotis chyrsotricha (Rubiaceae)

    The genetic effects of habitat fragmentation are complex and are influenced by both species traits and landscape features. For plants with strong seed or pollen dispersal capabilities, the question of whether the genetic erosion of an isolated population becomes stronger or is counterbalanced by sufficient gene flow across landscapes as the timescales of fragmentation increase has been less studied. In this study, we compared the population structure and genetic diversity of a distylous herb, Hedyotis chrysotricha (Rubiaceae), in two contrasting island systems of southeast China. Based on RAD-Seq data, our results showed that populations from the artificially created Thousand-Island Lake (TIL) harbored significantly higher levels of genetic diversity than those from the Holocene-dated Zhoushan Archipelago (ZA) (π = 0.247 vs. 0.208, HO = 0.307 vs. 0.256, HE = 0.228 vs. 0.190), while genetic differences between island and mainland populations were significant in neither the TIL region nor the ZA region. A certain level of population substructure was found in TIL populations, and the level of gene flow among TIL populations was also lower than in ZA populations (m = 0.019 vs. 0.027). Overall, our comparative study revealed that genetic erosion has not become much stronger for the island populations of either the TIL or ZA regions. Our results emphasized that the matrix of water in the island system may facilitate the seed (fruit) dispersal of H. chrysotricha, thus maintaining population connectivity and providing ongoing resilience to the effects of habitat fragmentation over thousands of years.