Diversity-aware k-median: Clustering with fair center representation
We introduce a novel problem for diversity-aware clustering. We assume that
the potential cluster centers belong to a set of groups defined by protected
attributes, such as ethnicity, gender, etc. We then ask to find a minimum-cost
clustering of the data into $k$ clusters so that a specified minimum number of
cluster centers are chosen from each group. We thus require that all groups are
represented in the clustering solution as cluster centers, according to
specified requirements. More precisely, we are given a set of clients $C$, a
set of facilities $\pazocal{F}$, a collection $\mathcal{F} = \{F_1, \dots, F_t\}$
of facility groups $F_i \subseteq \pazocal{F}$, budget $k$, and a set of
lower-bound thresholds $R = \{r_1, \dots, r_t\}$, one for each group in
$\mathcal{F}$. The diversity-aware $k$-median problem asks to find a set $S$
of $k$ facilities in $\pazocal{F}$ such that $|S \cap F_i| \geq r_i$, that
is, at least $r_i$ centers in $S$ are from group $F_i$, and the $k$-median cost
$\sum_{c \in C} \min_{s \in S} d(c, s)$ is minimized. We show that in the
general case where the facility groups may overlap, the diversity-aware
$k$-median problem is NP-hard, fixed-parameter intractable, and inapproximable
to any multiplicative factor. On the other hand, when the facility groups are
disjoint, approximation algorithms can be obtained by reduction to the
matroid median and red-blue median problems. Experimentally, we
evaluate our approximation methods for the tractable cases, and present a
relaxation-based heuristic for the theoretically intractable case, which can
provide high-quality and efficient solutions for real-world datasets.
Comment: To appear in ECML-PKDD 202
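The problem statement above can be illustrated with a minimal sketch. This is not the paper's approximation algorithm or its relaxation-based heuristic; it is an exhaustive search that only works on tiny instances. The function name, the toy data, and the one-dimensional distance are all assumptions made for illustration.

```python
from itertools import combinations


def diversity_aware_k_median(clients, facilities, groups, k, r, dist):
    """Brute-force search over all size-k facility sets (tiny instances only).

    groups[i] is the facility group F_i and r[i] its lower-bound threshold.
    """
    best_cost, best_set = float("inf"), None
    for candidate in combinations(facilities, k):
        chosen = set(candidate)
        # Feasibility: at least r[i] chosen centers must come from group F_i.
        if any(len(chosen & groups[i]) < r[i] for i in range(len(groups))):
            continue
        # k-median cost: each client pays the distance to its nearest center.
        cost = sum(min(dist(c, s) for s in chosen) for c in clients)
        if cost < best_cost:
            best_cost, best_set = cost, chosen
    return best_cost, best_set


# Hypothetical toy instance: points on a line, two facility groups.
clients = [0, 1, 2, 10, 11]
facilities = [0, 2, 10, 20]
groups = [{0, 2, 10}, {20}]
thresholds = [1, 1]
cost, centers = diversity_aware_k_median(
    clients, facilities, groups, 2, thresholds, lambda a, b: abs(a - b)
)
print(cost, sorted(centers))
```

In this toy instance the thresholds force facility 20 into the solution even though it is far from every client, which shows how the representation constraint can raise the clustering cost.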
SARS-CoV-2 and the incredible tale of the dying monkeys
Where did the SARS-CoV-2 come from? Did it appear suddenly, out of nowhere, fully equipped to infect us? Or is it a virus from a bat or pangolin that suddenly jumped species to infect us? How common is it for a microbe to jump host species? And why would a microbe make such a jump
Fair Column Subset Selection
We consider the problem of fair column subset selection. In particular, we
assume that two groups are present in the data, and the chosen column subset
must provide a good approximation for both, relative to their respective best
rank-k approximations. We show that this fair setting introduces significant
challenges: in order to extend known results, one cannot do better than the
trivial solution of simply picking twice as many columns as the original
methods. We adopt a known approach based on deterministic leverage-score
sampling, and show that merely sampling a subset of appropriate size becomes
NP-hard in the presence of two groups. Whereas finding a subset of two times
the desired size is trivial, we provide an efficient algorithm that achieves
the same guarantees with essentially 1.5 times that size. We validate our
methods through an extensive set of experiments on real-world data
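One plausible reading of "a good approximation for both, relative to their respective best rank-k approximations" can be sketched as follows. The sketch assumes the two groups are row blocks `A` and `B` sharing the same columns, and it scores a column subset by the worse of the two relative Frobenius errors. The function names and this exact objective are assumptions for illustration, not the paper's algorithm, and the code requires NumPy.

```python
import numpy as np


def relative_error(G, cols, k):
    """Error of projecting G onto its chosen columns, relative to the best rank-k error."""
    C = G[:, cols]
    # Project G onto the column space spanned by the selected columns.
    residual = G - C @ np.linalg.pinv(C) @ G
    # Best rank-k error is the norm of the discarded singular values.
    singular_values = np.linalg.svd(G, compute_uv=False)
    best_rank_k = np.sqrt(np.sum(singular_values[k:] ** 2))
    return np.linalg.norm(residual, "fro") / best_rank_k


def fair_objective(A, B, cols, k):
    """Worst relative error across the two groups for one column subset."""
    return max(relative_error(A, cols, k), relative_error(B, cols, k))


# Hypothetical random data: two groups with 5 shared columns.
rng = np.random.default_rng(0)
A = rng.standard_normal((6, 5))
B = rng.standard_normal((8, 5))
print(fair_objective(A, B, [0, 1], k=2))
```

With exactly k columns the ratio cannot drop below 1, since the projection has rank at most k. Selecting more columns, as in the 2k and 1.5k settings the abstract mentions, is what allows both groups to be approximated well at once.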
Featuring of Electricity Consumption Behavior towards Big-Data Applications
There is growing interest in discerning the behaviors of electricity users in both the residential and commercial sectors. With the advent of high-resolution time-series power demand data through advanced metering, large volumes of smart meter data give load-serving entities the opportunity to improve their knowledge of customers' electricity consumption behavior via load profiling. This paper implements a novel approach for clustering electricity consumption behavior dynamics. First, symbolic aggregate approximation (SAX) is applied to each individual customer's load data to reduce the scale of the data set, and a time-based Markov model is applied to model the dynamics of electricity consumption, transforming the large set of load curves into several state transition matrices. A density-based clustering technique, CFSFDP, is then performed to discover the typical dynamics of electricity consumption and segment customers into different groups
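The first two steps of this pipeline can be sketched as below: SAX turns a load curve into a short symbol sequence, and a transition matrix summarizes how the symbols follow one another. This is a simplified illustration, not the paper's implementation. It uses a single time-homogeneous first-order Markov chain rather than the time-based model the abstract mentions, it omits the CFSFDP clustering step, and the function names and toy load curve are assumptions.

```python
from bisect import bisect_right
from statistics import NormalDist, mean, pstdev


def sax(series, n_segments, alphabet_size):
    """Symbolic aggregate approximation; assumes len(series) is divisible by n_segments."""
    mu, sigma = mean(series), pstdev(series)
    z = [(x - mu) / sigma if sigma else 0.0 for x in series]
    seg = len(z) // n_segments
    # Piecewise aggregate approximation: average each equal-length segment.
    paa = [mean(z[i * seg:(i + 1) * seg]) for i in range(n_segments)]
    # Breakpoints split the standard normal into equiprobable regions.
    cuts = [NormalDist().inv_cdf(j / alphabet_size) for j in range(1, alphabet_size)]
    return [bisect_right(cuts, v) for v in paa]


def transition_matrix(symbols, alphabet_size):
    """Row-normalized counts of first-order transitions between symbols."""
    counts = [[0] * alphabet_size for _ in range(alphabet_size)]
    for a, b in zip(symbols, symbols[1:]):
        counts[a][b] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        matrix.append([c / total if total else 0.0 for c in row])
    return matrix


# Hypothetical load curve alternating between low and high demand.
load = [1, 1, 5, 5, 1, 1, 5, 5]
symbols = sax(load, n_segments=4, alphabet_size=2)
matrix = transition_matrix(symbols, alphabet_size=2)
print(symbols, matrix)
```

Each customer's transition matrix then acts as a compact feature describing consumption dynamics, which a density-based method such as CFSFDP could cluster.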
Sleep Health IQP
In this project, the concepts of Behavior Change Support Systems (BCSS) and responsive design were reviewed, and these theories and methodologies were applied to the development of a phone app intended to help improve sleep health among a college undergraduate population. The phone app provides reminders and useful information, allows personalization of the features and interface, and provides feedback and goal-tracking features. These aspects of the app were used to try to project expertise while still promoting similarity, facilitating self-monitoring, and providing reminders, as these are defined within the context of BCSS. This app serves as the foundation for testing the robustness of BCSS and addressing the problem of poor sleep health among WPI undergraduates
A prospective randomized comparative study of the efficacy of sustained release vaginal insert versus intracervical gel in primigravidae at term pregnancy
Background: Induction of labour is the intentional initiation of labour before spontaneous onset for the purpose of delivery of the fetoplacental unit. Failure of induction is responsible for an increased incidence of caesarean delivery. This study was performed to assess and compare the clinical effects of a sustained release vaginal insert versus intracervical gel in primiparous women with term pregnancy in terms of improvement of Bishop's score, induction delivery interval, incidence of hyperstimulation, and maternal and neonatal outcomes. Methods: A total of 100 consecutive term pregnant women who underwent labour induction for fetal or maternal indications were divided randomly into two groups: Group A, sustained release vaginal insert, and Group B, intracervical gel. Informed consent was taken from each patient. Results: A statistically significant increase in final Bishop's score (p=0.008) and hyperstimulation (p=0.04) was seen in the vaginal insert group compared with the intracervical gel group, while there were no statistically significant differences between the groups in maternal outcomes, neonatal outcomes, or need for oxytocin augmentation. Conclusions: In this study, the insert did not improve the induction delivery interval or the rate of successful induction, nor did it have any advantage in terms of neonatal outcome, although it did improve the Bishop's score. Its advantages were single application, fewer per vaginal examinations, longer duration of action, and immediate retrieval in case of hyperstimulation. Its main drawback remained the need to maintain a cold chain, without which its efficacy decreases. Another significant observation was the dropout rate of the insert (16%)