1,524 research outputs found

    Threshold Computation to Discover Cluster Structure: A New Approach

    Cluster members are decided based on how close they are to each other. Cluster compactness plays an important role in forming better-quality clusters. The ICNBCF incremental clustering algorithm computes a closeness factor between every two data series. To decide the members of a cluster, one more decisive factor is needed for comparison: a threshold. Internal cluster evaluation measures such as variance and the Dunn index provide the required decisive factor. In the initial phase of ICNBCF, this decisive factor was set manually by investigating the formed closeness factors. With values generated by internal evaluation measure formulae, this process can be automated. This paper presents a detailed study of various evaluation measures for use with the new incremental clustering algorithm ICNBCF.
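    As an illustration of one internal measure named in the abstract above, the following is a minimal sketch of the standard Dunn index (smallest between-cluster distance divided by largest cluster diameter). It assumes data series are represented as fixed-length numeric tuples compared with Euclidean distance; it does not reproduce the ICNBCF closeness factor, and the function name `dunn_index` is a hypothetical choice.

```python
import math
from itertools import combinations


def dunn_index(clusters):
    """Dunn index: smallest between-cluster distance over largest cluster diameter."""
    # Largest distance between two points of the same cluster (the diameter).
    max_diameter = max(
        (math.dist(p, q) for c in clusters for p, q in combinations(c, 2)),
        default=0.0,
    )
    # Smallest distance between points that belong to two different clusters.
    min_separation = min(
        math.dist(p, q)
        for a, b in combinations(clusters, 2)
        for p in a
        for q in b
    )
    if max_diameter == 0.0:
        # Every cluster is a single point (or duplicates): separation is unbounded.
        return math.inf
    return min_separation / max_diameter


# Toy data (assumed for illustration): two tight, well-separated clusters.
clusters = [[(0.0, 0.0), (0.0, 1.0)], [(5.0, 0.0), (5.0, 1.0)]]
score = dunn_index(clusters)
```

    A higher value indicates compact, well-separated clusters, so such a score could plausibly serve as the automatically computed decisive factor the abstract describes.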

    Adapting RGB pose estimation to new domains

    2019 Spring. Includes bibliographical references. Many multi-modal human-computer interaction (HCI) systems interact with users in real time by estimating the user's pose. Generally, they estimate human poses using depth sensors such as the Microsoft Kinect. For multi-modal HCI interfaces to gain traction in the real world, however, it would be better for pose estimation to be based on data from RGB cameras, which are more common and less expensive than depth sensors. This has motivated research into pose estimation from RGB images. Convolutional Neural Networks (CNNs) represent the state of the art in this literature, for example [1–5] and [6]. These systems estimate 2D human poses from RGB images. A problem with current CNN-based pose estimators is that they require large amounts of labeled data for training. If the goal is to train an RGB pose estimator for a new domain, the cost of collecting and, more importantly, labeling data can be prohibitive. A common solution is to train on publicly available pose data sets, but then the trained system is not tailored to the domain. We propose using RGB+D sensors to collect domain-specific data in the lab, and then training the RGB pose estimator using skeletons automatically extracted from the RGB+D data. This paper presents a case study of adapting the RMPE pose estimation network [4] to the domain of the DARPA Communicating with Computers (CWC) program [7], as represented by the EGGNOG data set [8]. We chose RMPE because it predicts both joint locations and Part Affinity Fields (PAFs) in real time. Our adaptation of RMPE, trained on automatically labeled data, outperforms the original RMPE on the EGGNOG data set.

    Initiation of CME event observed on November 3, 2010: Multi-wavelength Perspective

    One of the major unsolved problems in solar physics is that of CME initiation. In this paper, we study the initiation of a flare-associated CME that occurred on 2010 November 03, using multi-wavelength observations recorded by the Atmospheric Imaging Assembly (AIA) on board the Solar Dynamics Observatory (SDO) and by the Reuven Ramaty High Energy Solar Spectroscopic Imager (RHESSI). We report the observation of an inflow structure, first in 304 Å images and, a few seconds later, in 1600 Å images. This inflow structure was detected as one of the legs of the CME. We also observed a non-thermal compact source concurrent and nearly co-spatial with the brightening and movement of the inflow structure. The appearance of this compact non-thermal source, the brightening and movement of the inflow structure, and the subsequent outward movement of the CME structure in the corona led us to conclude that the CME initiation was caused by magnetic reconnection. Comment: 17 pages, 9 figures, accepted for publication in The Astrophysical Journal

    LOCKS: User Differentially Private and Federated Optimal Client Sampling

    With changes in privacy laws, there is often a hard requirement for client data to remain on the device rather than being sent to the server. Therefore, most processing happens on the device, and only an altered element is sent to the server. Such mechanisms are developed by leveraging differential privacy and federated learning. Differential privacy adds noise to the client outputs and thus deteriorates the quality of each iteration. This distributed setting adds a layer of complexity and additional communication and performance overhead. These costs are additive per round, so we need to reduce the number of iterations. In this work, we provide an analytical framework for studying the convergence guarantees of gradient-based distributed algorithms. We show that our private algorithm minimizes the expected gradient variance by approximately $d^2$ rounds, where $d$ is the dimensionality of the model. We discuss and suggest novel ways to improve the convergence rate to minimize the overhead using Importance Sampling (IS) and gradient diversity. Finally, we provide alternative frameworks that might be better suited to exploit client sampling techniques like IS and gradient diversity.
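    To make the idea of importance-sampled client selection concrete, here is a minimal sketch of the textbook importance-sampling estimator of the mean client update. It is not the LOCKS algorithm: it adds no differential-privacy noise, and the function name `importance_sampled_mean`, the sampling-with-replacement scheme, and the toy updates are all assumptions made for illustration.

```python
import random


def importance_sampled_mean(updates, probs, m, rng):
    """Unbiased estimate of the mean update from m clients drawn with replacement.

    updates: one vector (list of floats) per client.
    probs:   sampling probability per client; assumed positive and summing to 1.
    """
    n = len(updates)
    dim = len(updates[0])
    picks = rng.choices(range(n), weights=probs, k=m)
    estimate = [0.0] * dim
    for i in picks:
        # Reweight by 1 / (n * p_i) so the expectation equals the plain mean,
        # then average over the m draws.
        weight = 1.0 / (n * probs[i] * m)
        for j in range(dim):
            estimate[j] += weight * updates[i][j]
    return estimate


# Toy one-dimensional updates (assumed). Sampling proportionally to the
# update magnitude is the classic variance-minimizing choice.
updates = [[1.0], [2.0], [5.0]]
total = sum(abs(u[0]) for u in updates)
probs = [abs(u[0]) / total for u in updates]
estimate = importance_sampled_mean(updates, probs, m=4, rng=random.Random(0))
true_mean = sum(u[0] for u in updates) / len(updates)
```

    In this toy case every reweighted draw equals the true mean, so the estimate has zero variance regardless of which clients are picked; with real, multi-dimensional updates the variance is reduced rather than eliminated.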

    On Stability and Similarity of Network Embeddings

    Machine learning on graphs has become an active research area due to the prevalence of graph-structured data in the real world. Many real-world applications can be modeled with graphs. Modern application domains include web-scale social networks [26], recommender systems, knowledge graphs, and biological or protein networks. However, there are various challenges. First, the graphs generated from such applications are often large. Moreover, in some scenarios, the complete graph is not available, e.g., for privacy reasons. Thus, it becomes impractical to perform network analysis or compute various graph measures. Hence, graph sampling becomes an important task. Sampling is often the first step in handling any type of massive data. The same applies to graphs, which has led to many graph sampling techniques. Sampling techniques include node-based (e.g., Random Node Sampling), edge-based (e.g., Random Edge Sampling), and traversal-based (e.g., Random Walk Sampling) methods. Graphs are often analyzed by first embedding (i.e., representing) them in some matrix or vector form with some number of dimensions. Various graph embedding methods have been developed to convert raw graph data into high-dimensional vectors while preserving intrinsic graph properties [3]. The embedding methods operate at the node level, the edge level [28], a hybrid of the two, or the graph level. This thesis focuses on graph-level embeddings, which allow calculating the similarity between two graphs. With the knowledge of embedding and sampling methods, the natural questions to ask are: 1) What is a good sampling size to ensure embeddings are similar enough to that of the original graph? 2) Do results depend on the sampling method? 3) Do they depend on the embedding method? 4) As we have embeddings, can we find some similarity between the original graph and the sample? 5) How do we decide if the sample is good or not? How do we decide if the embedding is good or not? Essentially, if we have an embedding method and a sampling strategy, can we find the smallest sampling size that will give an ε-similar embedding to that of the original graph? We try to answer the above questions in the thesis and give a new perspective on graph sampling. The experiments are conducted on graphs with thousands of edges and nodes. The datasets include graphs from social networks, autonomous systems, peer-to-peer networks, and collaboration networks. Two sampling methods are targeted, namely Random Node Sampling and Random Edge Sampling. Euclidean distance is used as the similarity metric. Experiments are carried out on the Graph2vec and Spectral Features (SF) graph embedding methods. Univariate analysis is performed to determine a minimum sample size; for example, a 40% sample is the minimum needed for 80% similarity. We also design a regression model that predicts similarity for a given sampling size and graph properties. Finally, we analyze the stability of the embedding methods, where we find that, e.g., Graph2vec is a stable embedding method.
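    The experimental loop described in the abstract above (sample the graph, embed both graphs, compare by Euclidean distance) can be sketched as follows. This is only an illustration under stated assumptions: `degree_histogram_embedding` is a hypothetical stand-in for a graph-level embedding and is neither Graph2vec nor Spectral Features, the toy graph is invented, and node sampling is assumed to keep the induced subgraph.

```python
import math
import random


def random_node_sample(edges, fraction, rng):
    """Keep a random fraction of nodes and the edges among them (induced subgraph)."""
    nodes = sorted({u for edge in edges for u in edge})
    kept = set(rng.sample(nodes, max(1, round(fraction * len(nodes)))))
    return [(u, v) for u, v in edges if u in kept and v in kept]


def random_edge_sample(edges, fraction, rng):
    """Keep a random fraction of edges."""
    return rng.sample(edges, max(1, round(fraction * len(edges))))


def degree_histogram_embedding(edges, bins=4):
    """Hypothetical stand-in embedding: normalized histogram of node degrees."""
    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
    hist = [0.0] * bins
    for d in degree.values():
        # Degrees of `bins` or more share the last bucket.
        hist[min(d, bins) - 1] += 1.0
    total = sum(hist)
    return [h / total for h in hist] if total else hist


rng = random.Random(0)
# Toy graph (assumed): a 20-node ring plus a few chords.
edges = [(i, (i + 1) % 20) for i in range(20)]
edges += [(i, (i + 5) % 20) for i in range(0, 20, 2)]
full = degree_histogram_embedding(edges)

for fraction in (0.2, 0.4, 0.8):
    sample = random_edge_sample(edges, fraction, rng)
    distance = math.dist(full, degree_histogram_embedding(sample))
    print(f"edge sample {fraction:.0%}: distance to original = {distance:.3f}")
```

    Repeating the loop over many random seeds and both samplers, and recording the smallest fraction whose distance stays below a chosen ε, would mirror the univariate analysis the abstract mentions.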