Supporting Regularized Logistic Regression Privately and Efficiently

Author: Li Wenfa
Liu Hongzhe
Xie Wei
Yang Peng
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 30/09/2015

As one of the most popular statistical and machine learning models, logistic regression with regularization has found wide adoption in biomedicine, social sciences, information technology, and so on. These domains often involve data on human subjects that are subject to strict privacy regulations. Increasing concerns over data privacy make it more and more difficult to coordinate and conduct large-scale collaborative studies, which typically rely on cross-institution data sharing and joint analysis. Our work here focuses on safeguarding regularized logistic regression, a machine learning model that is widely used in various disciplines yet has not been investigated from a data security and privacy perspective. We consider a common use scenario of multi-institution collaborative studies, such as the research consortia or networks widely seen in genetics, epidemiology, social sciences, etc. To make our privacy-enhancing solution practical, we demonstrate a non-conventional and computationally efficient method leveraging distributed computing and strong cryptography to provide comprehensive protection over individual-level and summary data. Extensive empirical evaluation on several studies validated the privacy guarantees, efficiency and scalability of our proposal. We also discuss the practical implications of our solution for large-scale studies and applications from various disciplines, including genetic and biomedical studies, smart grid, network analysis, etc.
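For readers unfamiliar with the model this abstract refers to, the following is a minimal, non-private sketch of regularized logistic regression fitted by plain gradient descent. It does not reproduce the paper's privacy-preserving protocol, which the abstract does not detail. The choice of an L2 penalty, the function names (`sigmoid`, `loss`, `gradient`, `fit`) and the hyperparameters `lam`, `lr` and `steps` are all assumptions made for illustration.

```python
import math


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def loss(w, X, y, lam):
    # Mean negative log-likelihood plus an L2 penalty (lam / 2) * ||w||^2.
    # The L2 form is an assumption; the abstract only says "regularization".
    total = 0.0
    for xi, yi in zip(X, y):
        p = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)))
        p = min(max(p, 1e-12), 1.0 - 1e-12)  # guard against log(0)
        total -= yi * math.log(p) + (1 - yi) * math.log(1.0 - p)
    return total / len(X) + 0.5 * lam * sum(wj * wj for wj in w)


def gradient(w, X, y, lam):
    # Gradient of the loss above: mean of (p - y) * x, plus lam * w.
    n = len(X)
    g = [lam * wj for wj in w]
    for xi, yi in zip(X, y):
        err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi))) - yi
        for j, xj in enumerate(xi):
            g[j] += err * xj / n
    return g


def fit(X, y, lam=0.1, lr=0.5, steps=500):
    # Plain (non-private, single-machine) gradient descent from zero weights.
    w = [0.0] * len(X[0])
    for _ in range(steps):
        g = gradient(w, X, y, lam)
        w = [wj - lr * gj for wj, gj in zip(w, g)]
    return w


if __name__ == "__main__":
    # Toy data: the first column is a constant bias feature.
    X = [[1.0, -2.0], [1.0, -1.0], [1.0, 1.0], [1.0, 2.0]]
    y = [0, 0, 1, 1]
    w = fit(X, y)
    print("weights:", w, "loss:", loss(w, X, y, 0.1))
```

In a multi-institution setting like the one described, each site would hold only part of `X` and `y`; how gradients or summaries are combined securely is exactly what the paper addresses and is out of scope for this sketch.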

arXiv.org e-Print Archive

Directory of Open Access Journals

PubMed Central

Crowd-sourcing evaluation of automatically acquired, morphologically related word groupings

Author: Borg Claudia
Gatt Albert
Ninth International Conference on Language Resources and Evaluation
Publication venue: European Language Resources Association
Publication date: 01/01/2014

The automatic discovery and clustering of morphologically related words is an important problem with several practical applications. This paper describes the evaluation of word clusters carried out through crowd-sourcing techniques for the Maltese language. The hybrid (Semitic-Romance) nature of Maltese morphology, together with the fact that no large-scale lexical resources are available for Maltese, makes this an interesting and challenging problem.

OAR@UM

Modelling curved-layered printing paths for fabricating large-scale construction components

Author: Daniel Piker
Philip J. Valentine
Richard Buswell
Simon Austin
Sungwoo Lim
Xavier De Kestelier
Publication date: 03/06/2016

In this paper, a non-conventional approach to additive manufacturing, curved-layered printing, is applied to a large-scale construction process. Despite the number of research works on Curved Layered Fused Deposition Modelling (CLFDM) over the last decade, few practical applications have been reported. An alternative method adopting the CLFDM principle, which generates a curved-layered printing path, was developed using a single scripting environment called Grasshopper, a plugin for Rhinoceros®. The method was evaluated with the 3D Concrete Printing process developed at Loughborough University. The evaluation, including the results of simulation and printing, revealed three principal benefits compared with existing flat-layered printing paths, which are particularly beneficial to large-scale AM techniques: (i) better surface quality, (ii) shorter printing time and (iii) higher surface strengths.

Loughborough University Institutional Repository

TF-Ranking: Scalable TensorFlow Library for Learning-to-Rank

Author: Anil Rohan
Bendersky Michael
Bruch Sebastian
Golbandi Nadav
Li Cheng
Najork Marc
Pasumarthi Rama Kumar
Pfeifer Jan
Wang Xuanhui
Wolf Stephan
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 17/05/2019

Learning-to-Rank deals with maximizing the utility of a list of examples presented to the user, with items of higher relevance being prioritized. It has several practical applications such as large-scale search, recommender systems, document summarization and question answering. While there is widespread support for classification and regression based learning, support for learning-to-rank in deep learning has been limited. We propose TensorFlow Ranking, the first open source library for solving large-scale ranking problems in a deep learning framework. It is highly configurable and provides easy-to-use APIs to support different scoring mechanisms, loss functions and evaluation metrics in the learning-to-rank setting. Our library is developed on top of TensorFlow and can thus fully leverage the advantages of this platform. For example, it is highly scalable, both in training and in inference, and can be used to learn ranking models over massive amounts of user activity data, which can include heterogeneous dense and sparse features. We empirically demonstrate the effectiveness of our library in learning ranking functions for large-scale search and recommendation applications in Gmail and Google Drive. We also show that ranking models built using our model scale well for distributed training, without significant impact on metrics. The proposed library is available to the open source community, with the hope that it facilitates further academic research and industrial applications in the field of learning-to-rank. Comment: KDD 201
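To make "maximizing the utility of a list" concrete, the sketch below computes normalized discounted cumulative gain (NDCG), one commonly used ranking metric. The abstract does not name the metrics the library supports, so the choice of NDCG is an assumption, and the functions `dcg` and `ndcg` are hypothetical helpers written in plain Python, not part of the TensorFlow Ranking API.

```python
import math


def dcg(relevances, k=None):
    # Discounted cumulative gain: gain (2^rel - 1) discounted by log2(rank + 1),
    # with ranks starting at 1 (hence i + 2 for a zero-based index i).
    rels = relevances[:k] if k is not None else relevances
    return sum((2 ** r - 1) / math.log2(i + 2) for i, r in enumerate(rels))


def ndcg(scores, labels, k=None):
    # Rank items by descending model score, then compare the resulting DCG
    # with the DCG of the ideal ordering (items sorted by true relevance).
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    ranked = [labels[i] for i in order]
    ideal = sorted(labels, reverse=True)
    best = dcg(ideal, k)
    return dcg(ranked, k) / best if best > 0 else 0.0


if __name__ == "__main__":
    labels = [2, 1, 0]  # graded relevance of three documents
    print(ndcg([3.0, 2.0, 1.0], labels))  # scores agree with relevance order
    print(ndcg([1.0, 2.0, 3.0], labels))  # scores reverse the relevance order
```

A ranking that places the most relevant items first scores higher than one that buries them, which is the sense in which higher-relevance items are "prioritized".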

arXiv.org e-Print Archive

Crossref

Decentralized Data Fusion and Active Sensing with Mobile Sensors for Modeling and Predicting Spatiotemporal Traffic Phenomena

Author: Chen Jie
Dolan John M.
Jaillet Patrick
Low Kian Hsiang
Oran Ali
Sukhatme Gaurav S.
Tan Colin Keng-Yan
Publication date: 01/01/2012

The problem of modeling and predicting spatiotemporal traffic phenomena over an urban road network is important to many traffic applications such as detecting and forecasting congestion hotspots. This paper presents a decentralized data fusion and active sensing (D2FAS) algorithm for mobile sensors to actively explore the road network to gather and assimilate the most informative data for predicting the traffic phenomenon. We analyze the time and communication complexity of D2FAS and demonstrate that it can scale well with a large number of observations and sensors. We provide a theoretical guarantee on its predictive performance to be equivalent to that of a sophisticated centralized sparse approximation for the Gaussian process (GP) model: The computation of such a sparse approximate GP model can thus be parallelized and distributed among the mobile sensors (in a Google-like MapReduce paradigm), thereby achieving efficient and scalable prediction. We also theoretically guarantee its active sensing performance that improves under various practical environmental conditions. Empirical evaluation on real-world urban road network data shows that our D2FAS algorithm is significantly more time-efficient and scalable than state-of-the-art centralized algorithms while achieving comparable predictive performance. Comment: 28th Conference on Uncertainty in Artificial Intelligence (UAI 2012), Extended version with proofs, 13 page

arXiv.org e-Print Archive

CiteSeerX

DSpace@MIT

ScholarBank@NUS

Facilitating the transition to an inverter dominated power system : experimental evaluation of a non-intrusive add-on predictive controller

Author: Burt Graeme M.
Guillo-Sansano Efren
Mehrizi-Sani Ali
Syed Mazheruddin H.
Publication venue: 'MDPI AG'
Publication date: 16/08/2020

The transition to an inverter-dominated power system is expected with the large-scale integration of distributed energy resources (DER). To improve the dynamic response of DERs already installed within such a system, a non-intrusive add-on controller referred to as SPAACE (set point automatic adjustment with correction enabled) has been proposed in the literature. Extensive simulation-based analysis and supporting mathematical foundations have helped establish its theoretical prevalence. This paper establishes the practical real-world relevance of SPAACE via a rigorous performance evaluation utilizing a high fidelity hardware-in-the-loop systems test bed. A comprehensive methodological approach to the evaluation with several practical measures has been undertaken, and the performance of SPAACE subject to representative scenarios has been assessed. With the evaluation undertaken, the fundamental hypothesis of SPAACE for real-world applications has been proven, i.e., improvements in dynamic performance can be achieved without access to the internal controller. Furthermore, based on the quantitative analysis, observations and recommendations are reported. These provide guidance for future potential users of the approach in their efforts to accelerate the transition to an inverter-dominated power system.

Multidisciplinary Digital Publishing Institute

University of Strathclyde Institutional Repository