Supporting Regularized Logistic Regression Privately and Efficiently
As one of the most popular statistical and machine learning models, logistic
regression with regularization has found wide adoption in biomedicine, social
sciences, information technology, and so on. These domains often involve data
of human subjects that are contingent upon strict privacy regulations.
Increasing concerns over data privacy make it more and more difficult to
coordinate and conduct large-scale collaborative studies, which typically rely
on cross-institution data sharing and joint analysis. Our work here focuses on
safeguarding regularized logistic regression, a machine learning model that is
widely used across disciplines yet has not been investigated
from a data security and privacy perspective. We consider a common use scenario
of multi-institution collaborative studies, such as in the form of research
consortia or networks as widely seen in genetics, epidemiology, social
sciences, etc. To make our privacy-enhancing solution practical, we demonstrate
a non-conventional and computationally efficient method leveraging distributed
computing and strong cryptography to provide comprehensive protection over
individual-level and summary data. Extensive empirical evaluation on several
studies validated the privacy guarantees, efficiency and scalability of our
proposal. We also discuss the practical implications of our solution for
large-scale studies and applications from various disciplines, including
genetic and biomedical studies, smart grid, network analysis, etc.
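The abstract above does not give the protocol itself, so the following is only a minimal sketch of the underlying idea: the gradient of an L2-regularized logistic regression objective decomposes into per-institution sums, so each site can compute its contribution locally and only aggregates need to be combined. The function names, the `sites` layout, and the hyperparameters are assumptions made for illustration, and the cryptographic protection of the exchanged aggregates that the paper describes is deliberately omitted.

```python
import math

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def local_gradient(w, X, y):
    """Gradient of the unregularized log-loss, summed over one site's rows."""
    g = [0.0] * len(w)
    for xi, yi in zip(X, y):
        p = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)))
        for j, xj in enumerate(xi):
            g[j] += (p - yi) * xj
    return g

def fit(sites, dim, lam=0.1, lr=0.5, steps=500):
    """Gradient descent on the L2-regularized objective.

    sites: list of (X, y) pairs, one per hypothetical institution.
    Only the per-site gradient sums are combined; raw rows stay local.
    (Assumption: in a real deployment these sums would be protected,
    e.g. by encryption or secret sharing, which is not shown here.)
    """
    n = sum(len(y) for _, y in sites)
    w = [0.0] * dim
    for _ in range(steps):
        total = [0.0] * dim
        for X, y in sites:
            g = local_gradient(w, X, y)
            total = [t + gj for t, gj in zip(total, g)]
        # Average data gradient plus the gradient of (lam / 2) * ||w||^2.
        w = [wj - lr * (tj / n + lam * wj) for wj, tj in zip(w, total)]
    return w
```

Fitting on data split across sites gives the same weights (up to floating-point rounding) as fitting on the pooled data, which is what makes the distributed formulation attractive.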
Crowd-sourcing evaluation of automatically acquired, morphologically related word groupings
The automatic discovery and clustering of morphologically related words is an important problem with several practical
applications. This paper describes the evaluation of word clusters carried out through crowd-sourcing techniques for the
Maltese language. The hybrid (Semitic-Romance) nature of Maltese morphology, together with the fact that no large-scale
lexical resources are available for Maltese, make this an interesting and challenging problem.
peer-reviewed
Modelling curved-layered printing paths for fabricating large-scale construction components
In this paper, a non-conventional approach to additive manufacturing, curved-layered printing, is applied to a large-scale
construction process. Despite the number of research works on Curved Layered Fused Deposition Modelling (CLFDM)
over the last decade, few practical applications have been reported. An alternative method adopting the CLFDM principle,
which generates a curved-layered printing path, was developed in a single scripting environment, Grasshopper,
a plugin for Rhinoceros®. The method was evaluated with the 3D Concrete Printing process developed at Loughborough
University. The evaluation of the method, including the results of simulation and printing, revealed three principal benefits
compared with existing flat-layered printing paths, which are particularly beneficial to large-scale AM techniques: (i)
better surface quality, (ii) shorter printing time and (iii) higher surface strengths.
TF-Ranking: Scalable TensorFlow Library for Learning-to-Rank
Learning-to-Rank deals with maximizing the utility of a list of examples
presented to the user, with items of higher relevance being prioritized. It has
several practical applications such as large-scale search, recommender systems,
document summarization and question answering. While there is widespread
support for classification and regression based learning, support for
learning-to-rank in deep learning has been limited. We propose TensorFlow
Ranking, the first open source library for solving large-scale ranking problems
in a deep learning framework. It is highly configurable and provides
easy-to-use APIs to support different scoring mechanisms, loss functions and
evaluation metrics in the learning-to-rank setting. Our library is developed on
top of TensorFlow and can thus fully leverage the advantages of this platform.
For example, it is highly scalable, both in training and in inference, and can
be used to learn ranking models over massive amounts of user activity data,
which can include heterogeneous dense and sparse features. We empirically
demonstrate the effectiveness of our library in learning ranking functions for
large-scale search and recommendation applications in Gmail and Google Drive.
We also show that ranking models built using our library scale well for
distributed training, without significant impact on metrics. The proposed
library is available to the open source community, with the hope that it
facilitates further academic research and industrial applications in the field
of learning-to-rank.
Comment: KDD 201
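To make the "loss functions and evaluation metrics" of the learning-to-rank setting concrete, here is a small library-independent sketch of one common pairwise loss and one common list metric. It deliberately does not use the TF-Ranking API; the function names are hypothetical, and the gain and discount choices in `dcg` are one widespread convention rather than anything stated in the abstract.

```python
import math

def dcg(relevances):
    # Discounted cumulative gain with (2^rel - 1) gain and log2 position discount.
    return sum((2 ** r - 1) / math.log2(i + 2) for i, r in enumerate(relevances))

def ndcg(scores, labels, k=None):
    """NDCG of the ordering obtained by sorting items by descending score."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    ranked = [labels[i] for i in order][:k]
    ideal = sorted(labels, reverse=True)[:k]
    best = dcg(ideal)
    return dcg(ranked) / best if best > 0 else 0.0

def pairwise_logistic_loss(scores, labels):
    """Mean of log(1 + exp(-(s_i - s_j))) over pairs with labels[i] > labels[j]."""
    losses = [
        math.log1p(math.exp(-(scores[i] - scores[j])))
        for i in range(len(labels))
        for j in range(len(labels))
        if labels[i] > labels[j]
    ]
    return sum(losses) / len(losses) if losses else 0.0
```

A scoring function that places more relevant items higher yields a lower pairwise loss and a higher NDCG, which is the behaviour a ranking model is trained to achieve.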
Decentralized Data Fusion and Active Sensing with Mobile Sensors for Modeling and Predicting Spatiotemporal Traffic Phenomena
The problem of modeling and predicting spatiotemporal traffic phenomena over
an urban road network is important to many traffic applications such as
detecting and forecasting congestion hotspots. This paper presents a
decentralized data fusion and active sensing (D2FAS) algorithm for mobile
sensors to actively explore the road network to gather and assimilate the most
informative data for predicting the traffic phenomenon. We analyze the time and
communication complexity of D2FAS and demonstrate that it can scale well with a
large number of observations and sensors. We provide a theoretical guarantee that
its predictive performance is equivalent to that of a sophisticated
centralized sparse approximation for the Gaussian process (GP) model: The
computation of such a sparse approximate GP model can thus be parallelized and
distributed among the mobile sensors (in a Google-like MapReduce paradigm),
thereby achieving efficient and scalable prediction. We also provide a theoretical
guarantee on its active sensing performance, which improves under various practical
environmental conditions. Empirical evaluation on real-world urban road network
data shows that our D2FAS algorithm is significantly more time-efficient and
scalable than state-of-the-art centralized algorithms while achieving
comparable predictive performance.
Comment: 28th Conference on Uncertainty in Artificial Intelligence (UAI 2012),
Extended version with proofs, 13 pages
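The idea of gathering "the most informative data" with a Gaussian process can be illustrated with a generic greedy rule: visit the candidate location where the GP posterior variance is largest. This is a sketch of that textbook heuristic on scalar inputs, not the D2FAS algorithm or its sparse, decentralized approximation; the kernel, the noise level, and all function names are assumptions chosen for illustration.

```python
import math

def rbf(a, b, length=1.0):
    # Squared-exponential kernel on scalar inputs.
    return math.exp(-0.5 * ((a - b) / length) ** 2)

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x

def posterior_variance(x_star, observed, noise=1e-2):
    """GP posterior variance at x_star given the observed input locations."""
    if not observed:
        return rbf(x_star, x_star)
    K = [[rbf(a, b) + (noise if i == j else 0.0) for j, b in enumerate(observed)]
         for i, a in enumerate(observed)]
    k_star = [rbf(x_star, a) for a in observed]
    alpha = solve(K, k_star)  # alpha = K^{-1} k_star
    return rbf(x_star, x_star) - sum(ks * al for ks, al in zip(k_star, alpha))

def next_location(candidates, observed):
    # Greedy active sensing: go where the model is currently least certain.
    return max(candidates, key=lambda c: posterior_variance(c, observed))
```

Because the posterior variance of a GP depends only on where observations were taken and not on their values, this rule can be evaluated before any measurement is made at the candidate location.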
Facilitating the transition to an inverter dominated power system: experimental evaluation of a non-intrusive add-on predictive controller
The transition to an inverter-dominated power system is expected with the large-scale integration of distributed energy resources (DER). To improve the dynamic response of DERs already installed within such a system, a non-intrusive add-on controller, referred to as SPAACE (set point automatic adjustment with correction enabled), has been proposed in the literature. Extensive simulation-based analysis and supporting mathematical foundations have helped establish its theoretical prevalence. This paper establishes the practical real-world relevance of SPAACE via a rigorous performance evaluation utilizing a high-fidelity hardware-in-the-loop systems test bed. A comprehensive methodological approach to the evaluation with several practical measures has been undertaken, and the performance of SPAACE subject to representative scenarios has been assessed. With the evaluation undertaken, the fundamental hypothesis of SPAACE for real-world applications has been proven, i.e., improvements in dynamic performance can be achieved without access to the internal controller. Furthermore, based on the quantitative analysis, observations and recommendations are reported. These provide guidance for future potential users of the approach in their efforts to accelerate the transition to an inverter-dominated power system.