    Supporting Regularized Logistic Regression Privately and Efficiently

    As one of the most popular statistical and machine learning models, logistic regression with regularization has found wide adoption in biomedicine, social sciences, information technology, and so on. These domains often involve data of human subjects that are subject to strict privacy regulations. Increasing concerns over data privacy make it more and more difficult to coordinate and conduct large-scale collaborative studies, which typically rely on cross-institution data sharing and joint analysis. Our work here focuses on safeguarding regularized logistic regression, a machine learning model that is widely used in various disciplines yet has not been investigated from a data security and privacy perspective. We consider a common use scenario of multi-institution collaborative studies, such as in the form of research consortia or networks as widely seen in genetics, epidemiology, social sciences, etc. To make our privacy-enhancing solution practical, we demonstrate a non-conventional and computationally efficient method leveraging distributed computing and strong cryptography to provide comprehensive protection over individual-level and summary data. Extensive empirical evaluation on several studies validated the privacy guarantees, efficiency and scalability of our proposal. We also discuss the practical implications of our solution for large-scale studies and applications from various disciplines, including genetic and biomedical studies, smart grid, network analysis, etc.

    Crowd-sourcing evaluation of automatically acquired, morphologically related word groupings

    The automatic discovery and clustering of morphologically related words is an important problem with several practical applications. This paper describes the evaluation of word clusters carried out through crowd-sourcing techniques for the Maltese language. The hybrid (Semitic-Romance) nature of Maltese morphology, together with the fact that no large-scale lexical resources are available for Maltese, makes this an interesting and challenging problem. Peer-reviewed.

    Modelling curved-layered printing paths for fabricating large-scale construction components

    In this paper, a non-conventional way of additive manufacturing, curved-layered printing, has been applied to a large-scale construction process. Despite the number of research works on Curved Layered Fused Deposition Modelling (CLFDM) over the last decade, few practical applications have been reported. An alternative method adopting the CLFDM principle, which generates a curved-layered printing path, was developed using a single scripting environment called Grasshopper, a plugin of Rhinoceros®. The method was evaluated with the 3D Concrete Printing process developed at Loughborough University. The evaluation of the method, including the results of simulation and printing, revealed three principal benefits compared with existing flat-layered printing paths, which are particularly beneficial to large-scale AM techniques: (i) better surface quality, (ii) shorter printing time and (iii) higher surface strengths.

    TF-Ranking: Scalable TensorFlow Library for Learning-to-Rank

    Learning-to-Rank deals with maximizing the utility of a list of examples presented to the user, with items of higher relevance being prioritized. It has several practical applications such as large-scale search, recommender systems, document summarization and question answering. While there is widespread support for classification and regression based learning, support for learning-to-rank in deep learning has been limited. We propose TensorFlow Ranking, the first open source library for solving large-scale ranking problems in a deep learning framework. It is highly configurable and provides easy-to-use APIs to support different scoring mechanisms, loss functions and evaluation metrics in the learning-to-rank setting. Our library is developed on top of TensorFlow and can thus fully leverage the advantages of this platform. For example, it is highly scalable, both in training and in inference, and can be used to learn ranking models over massive amounts of user activity data, which can include heterogeneous dense and sparse features. We empirically demonstrate the effectiveness of our library in learning ranking functions for large-scale search and recommendation applications in Gmail and Google Drive. We also show that ranking models built using our model scale well for distributed training, without significant impact on metrics. The proposed library is available to the open source community, with the hope that it facilitates further academic research and industrial applications in the field of learning-to-rank. Comment: KDD 201

    Decentralized Data Fusion and Active Sensing with Mobile Sensors for Modeling and Predicting Spatiotemporal Traffic Phenomena

    The problem of modeling and predicting spatiotemporal traffic phenomena over an urban road network is important to many traffic applications such as detecting and forecasting congestion hotspots. This paper presents a decentralized data fusion and active sensing (D2FAS) algorithm for mobile sensors to actively explore the road network to gather and assimilate the most informative data for predicting the traffic phenomenon. We analyze the time and communication complexity of D2FAS and demonstrate that it can scale well with a large number of observations and sensors. We provide a theoretical guarantee on its predictive performance to be equivalent to that of a sophisticated centralized sparse approximation for the Gaussian process (GP) model: the computation of such a sparse approximate GP model can thus be parallelized and distributed among the mobile sensors (in a Google-like MapReduce paradigm), thereby achieving efficient and scalable prediction. We also theoretically guarantee its active sensing performance that improves under various practical environmental conditions. Empirical evaluation on real-world urban road network data shows that our D2FAS algorithm is significantly more time-efficient and scalable than state-of-the-art centralized algorithms while achieving comparable predictive performance. Comment: 28th Conference on Uncertainty in Artificial Intelligence (UAI 2012), Extended version with proofs, 13 pages

    Facilitating the transition to an inverter-dominated power system: experimental evaluation of a non-intrusive add-on predictive controller

    The transition to an inverter-dominated power system is expected with the large-scale integration of distributed energy resources (DER). To improve the dynamic response of DERs already installed within such a system, a non-intrusive add-on controller, referred to as SPAACE (set point automatic adjustment with correction enabled), has been proposed in the literature. Extensive simulation-based analysis and supporting mathematical foundations have helped establish its theoretical prevalence. This paper establishes the practical real-world relevance of SPAACE via a rigorous performance evaluation utilizing a high-fidelity hardware-in-the-loop systems test bed. A comprehensive methodological approach to the evaluation with several practical measures has been undertaken, and the performance of SPAACE subject to representative scenarios has been assessed. With the evaluation undertaken, the fundamental hypothesis of SPAACE for real-world applications has been proven, i.e., improvements in dynamic performance can be achieved without access to the internal controller. Furthermore, based on the quantitative analysis, observations and recommendations are reported. These provide guidance for future potential users of the approach in their efforts to accelerate the transition to an inverter-dominated power system.