
    A Socio-Informatic Approach to Automated Account Classification on Social Media

    Automated accounts on social media have become increasingly problematic. We propose a key feature that, combined with existing methods, improves machine learning algorithms for bot detection. Including the proposed feature improves classification performance. Comment: International Conference on Social Media and Societ

    explorase: Multivariate Exploratory Analysis and Visualization for Systems Biology

    The datasets being produced by high-throughput biological experiments, such as microarrays, have forced biologists to turn to sophisticated statistical analysis and visualization tools in order to understand their data. We address the particular need for an open-source exploratory data analysis tool that applies numerical methods in coordination with interactive graphics to the analysis of experimental data. The software package, known as explorase, provides a graphical user interface (GUI) on top of the R platform for statistical computing and the GGobi software for multivariate interactive graphics. The GUI is designed for use by biologists, many of whom are unfamiliar with the R language. It displays metadata about experimental design and biological entities in tables that are sortable and filterable. There are menu shortcuts to the analysis methods implemented in R, including graphical interfaces to linear modeling tools. The GUI is linked to data plots in GGobi through a brush tool that simultaneously colors rows in the entity information table and points in the GGobi plots.

    Quantitive analysis of electric vehicle flexibility : a data-driven approach

    Electric vehicle (EV) flexibility indicates to what extent the charging load can be coordinated (i.e., to flatten the load curve or to utilize renewable energy resources). However, such flexibility is neither well analyzed nor effectively quantified in the literature. In this paper we fill this gap: we offer an extensive analysis of the flexibility characteristics of 390k EV charging sessions and propose measures to quantify how that flexibility is exploited. Our contributions include: (1) characterization of EV charging behavior by clustering the arrival and departure time combinations, which leads to the identification of types of EV charging behavior, (2) in-depth analysis of the characteristics of the charging sessions in each behavioral cluster and investigation of the influence of weekdays and seasonal changes on those characteristics, including arrival, sojourn and idle times, and (3) measures and an algorithm to quantitatively analyze how much flexibility (in terms of duration and amount) is used at various times of the day, for two representative scenarios. Understanding the characteristics of that flexibility (e.g., amount, time and duration of availability) and when it is used (in terms of both duration and amount) helps to develop more realistic price and incentive schemes in DR algorithms, either to exploit the offered flexibility efficiently or to estimate when to stimulate additional flexibility. (C) 2017 Elsevier Ltd. All rights reserved.
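The sojourn and idle times mentioned in this abstract can be illustrated with a minimal sketch. The definitions used here are assumptions for illustration, not taken from the paper: sojourn time is taken as departure minus arrival, and idle time as the part of the sojourn during which the vehicle is connected but not charging. The `Session` class and its field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Session:
    """One charging session; all times in hours (hypothetical layout)."""
    arrival_h: float    # arrival time, hours since midnight
    departure_h: float  # departure time, may exceed 24 for overnight stays
    charging_h: float   # time actually spent charging


def sojourn_time(s: Session) -> float:
    # Assumed definition: total time the vehicle stays connected.
    return s.departure_h - s.arrival_h


def idle_time(s: Session) -> float:
    # Assumed definition: connected time not spent charging, i.e. the
    # window within which the charging load could be shifted.
    return sojourn_time(s) - s.charging_h


sessions = [
    Session(arrival_h=8.0, departure_h=17.0, charging_h=3.0),   # daytime stay
    Session(arrival_h=18.5, departure_h=31.0, charging_h=4.5),  # overnight stay
]

for s in sessions:
    print(f"sojourn={sojourn_time(s):.1f} h, idle={idle_time(s):.1f} h")
```

Under these assumptions, a longer idle time means more room to delay or spread the charging load.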

    Optimal Cell Clustering and Activation for Energy Saving in Load-Coupled Wireless Networks

    Optimizing activation and deactivation of base station transmissions provides an instrument for improving energy efficiency in cellular networks. In this paper, we study optimal cell clustering and scheduling of activation duration for each cluster, with the objective of minimizing the sum energy, subject to a time constraint on delivering the users' traffic demand. The cells within a cluster are simultaneously in transmission mode when the cluster is activated and in napping mode when it is deactivated. Our optimization framework accounts for the coupling relation among cells due to mutual interference. Thus, the users' achievable rates in a cell depend on the cluster composition. On the theoretical side, we provide a mathematical formulation and structural characterization of the energy-efficient cell clustering and scheduling optimization problem, and prove its NP-hardness. On the algorithmic side, we first show how column generation facilitates problem solving, and then present our notion of local enumeration as a flexible and effective means for dealing with the trade-off between optimality and the combinatorial nature of cluster formation, as well as for gauging the deviation from optimality. Numerical results demonstrate that our solutions achieve more than 60% energy saving over existing schemes, and that the solutions we obtain are within a few percent of the global optimum. Comment: Revision, IEEE Transactions on Wireless Communication

    Sparsest factor analysis for clustering variables: a matrix decomposition approach

    We propose a new procedure for sparse factor analysis (FA) such that each variable loads on only one common factor. Thus, the loading matrix has a single nonzero element in each row and zeros elsewhere. Such a loading matrix is the sparsest possible for a given number of variables and common factors. For this reason, the proposed method is named sparsest FA (SSFA). It may also be called FA-based variable clustering, since the variables loading on the same common factor can be classified into a cluster. In SSFA, all model parts of FA (common factors, their correlations, loadings, unique factors, and unique variances) are treated as fixed unknown parameter matrices, and their least squares function is minimized through a specific data matrix decomposition. A useful feature of the algorithm is that the matrix of common factor scores is re-parameterized using QR decomposition in order to estimate factor correlations efficiently. A simulation study shows that the proposed procedure can exactly identify the true sparsest models. Real data examples demonstrate the usefulness of the variable clustering performed by SSFA.
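The link between a sparsest loading matrix and variable clusters can be shown with a small sketch. It does not implement the SSFA estimation procedure. It only takes a loading matrix that already has one nonzero entry per row and reads the clusters off it. The function name `clusters_from_loadings` and the example values are hypothetical.

```python
from typing import Dict, List


def clusters_from_loadings(loadings: List[List[float]]) -> Dict[int, List[int]]:
    """Group variables (rows) by the single factor (column) they load on.

    Raises ValueError if a row does not have exactly one nonzero entry,
    i.e. if the matrix is not of the sparsest form described above.
    """
    clusters: Dict[int, List[int]] = {}
    for var, row in enumerate(loadings):
        nonzero = [j for j, value in enumerate(row) if value != 0.0]
        if len(nonzero) != 1:
            raise ValueError(f"row {var} has {len(nonzero)} nonzero loadings")
        clusters.setdefault(nonzero[0], []).append(var)
    return clusters


# Illustrative 5-variable, 3-factor loading matrix (made-up values).
example_loadings = [
    [0.8, 0.0, 0.0],
    [0.7, 0.0, 0.0],
    [0.0, 0.9, 0.0],
    [0.0, 0.0, -0.6],
    [0.0, 0.5, 0.0],
]

print(clusters_from_loadings(example_loadings))
```

Each key is a factor index and each value lists the variables assigned to that factor, which is the sense in which the abstract describes the method as variable clustering.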