142 research outputs found
Trust- and Distrust-Based Recommendations for Controversial Reviews
Recommender systems that incorporate a social trust network among their users have the potential to make more personalized recommendations compared to traditional collaborative filtering systems, provided they succeed in utilizing the additional trust and distrust information to their advantage. We compare the performance of several well-known trust-enhanced techniques for recommending controversial reviews from Epinions.com, and provide the first experimental study of using distrust in the recommendation process.
Born to trade: a genetically evolved keyword bidder for sponsored search
In sponsored search auctions, advertisers choose a set of keywords based on products they wish to market. They bid for advertising slots that will be displayed on the search results page when a user submits a query containing the keywords that the advertiser selected. Deciding how much to bid is a real challenge: if the bid is too low with respect to the bids of other advertisers, the ad might not get displayed in a favorable position; a bid that is too high on the other hand might not be profitable either, since the attracted number of conversions might not be enough to compensate for the high cost per click.
In this paper we propose a genetically evolved keyword bidding strategy that decides how much to bid for each query based on historical data such as the position obtained on the previous day. In light of the fact that our approach does not implement any particular expert knowledge on keyword auctions, it did remarkably well in the Trading Agent Competition at IJCAI 2009.
Privacy-preserving scoring of tree ensembles: a novel framework for AI in healthcare
Machine Learning (ML) techniques now impact a wide variety of domains. Highly regulated industries such as healthcare and finance have stringent compliance and data governance policies around data sharing. Advances in secure multiparty computation (SMC) for privacy-preserving machine learning (PPML) can help transform these regulated industries by allowing ML computations over encrypted data with personally identifiable information (PII). Yet very little of SMC-based PPML has been put into practice so far. In this paper we present the very first framework for privacy-preserving classification of tree ensembles with application in healthcare. We first describe the underlying cryptographic protocols that enable a healthcare organization to send encrypted data securely to an ML scoring service and obtain encrypted class labels without the scoring service actually seeing that input in the clear. We then describe the deployment challenges we solved to integrate these protocols in a cloud-based scalable risk-prediction platform with multiple ML models for healthcare AI. Included are system internals, and evaluations of our deployment for supporting physicians to drive better clinical outcomes in an accurate, scalable, and provably secure manner. To the best of our knowledge, this is the first such applied framework with SMC-based privacy-preserving machine learning for healthcare.
Electrochemical tuning and mechanical resilience of single wall carbon nanotubes
Single-wall carbon nanotubes (SWNTs) are fascinating systems exhibiting many novel physical properties. In this paper, we give a brief review of the structural, electronic, vibrational, and mechanical properties of carbon nanotubes. In situ resonance Raman scattering of SWNTs investigated under electrochemical biasing demonstrates that the intensity of the radial breathing mode varies significantly in a nonmonotonic manner as a function of the cathodic bias voltage, but does not change appreciably under anodic bias. These results can be quantitatively understood in terms of the changes in the energy gaps between the 1D van Hove singularities in the electron density of states, arising possibly due to the alterations in the overlap integral of π bonds between the p-orbitals of the adjacent carbon atoms. In the second part of this paper, we review our high-pressure X-ray diffraction results, which show that the triangular lattice of the carbon nanotube bundles continues to persist up to ~10 GPa. The lattice is seen to relax just before the phase transformation, which is observed at ~10 GPa. Further, our results display the reversibility of the 2D lattice symmetry even after compression up to 13 GPa, well beyond the 5 GPa value observed recently. These experimental results explicitly validate the predicted remarkable mechanical resilience of the nanotubes.
Computing fuzzy rough approximations in large scale information systems
Rough set theory is a popular and powerful machine learning tool. It is especially suitable for dealing with information systems that exhibit inconsistencies, i.e. objects that have the same values for the conditional attributes but a different value for the decision attribute. In line with the emerging granular computing paradigm, rough set theory groups objects together based on the indiscernibility of their attribute values. Fuzzy rough set theory extends rough set theory to data with continuous attributes, and detects degrees of inconsistency in the data. Key to this is turning the indiscernibility relation into a gradual relation, acknowledging that objects can be similar to a certain extent. In very large datasets with millions of objects, computing the gradual indiscernibility relation (or in other words, the soft granules) is very demanding, both in terms of runtime and in terms of memory. It is however required for the computation of the lower and upper approximations of concepts in the fuzzy rough set analysis pipeline. Current non-distributed implementations in R are limited by memory capacity. For example, we found that a state-of-the-art non-distributed implementation in R could not handle 30,000 rows and 10 attributes on a node with 62GB of memory. This is clearly insufficient to scale fuzzy rough set analysis to massive datasets. In this paper we present a parallel and distributed solution based on the Message Passing Interface (MPI) to compute fuzzy rough approximations in very large information systems. Our results show that our parallel approach scales with problem size to information systems with millions of objects. To the best of our knowledge, no other parallel and distributed solutions have been proposed so far in the literature for this problem.
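To make the quantities in this abstract concrete, the following is a minimal single-machine sketch of fuzzy rough lower and upper approximations. It is not the authors' MPI implementation. The specific choices are assumptions made for illustration: a similarity relation that averages per-attribute closeness, the Łukasiewicz implicator for the lower approximation, and the minimum t-norm for the upper approximation. The function names are hypothetical.

```python
from typing import List, Sequence, Tuple


def similarity(x: Sequence[float], y: Sequence[float], ranges: Sequence[float]) -> float:
    """Gradual indiscernibility in [0, 1]: mean per-attribute closeness (assumed form)."""
    parts = [1.0 - abs(a - b) / r if r > 0 else 1.0 for a, b, r in zip(x, y, ranges)]
    return sum(parts) / len(parts)


def fuzzy_rough_approximations(
    rows: Sequence[Sequence[float]], membership: Sequence[float]
) -> Tuple[List[float], List[float]]:
    """Return (lower, upper) approximation degrees of a concept for every object.

    lower(x) = min over y of I(R(x, y), A(y)), with I(a, b) = min(1, 1 - a + b)
    upper(x) = max over y of T(R(x, y), A(y)), with T(a, b) = min(a, b)
    """
    n_attrs = len(rows[0])
    ranges = [max(r[j] for r in rows) - min(r[j] for r in rows) for j in range(n_attrs)]
    lower, upper = [], []
    for x in rows:
        lo, up = 1.0, 0.0
        # Similarities are computed on the fly, so the n x n relation is never
        # stored; the pairwise loop itself is still quadratic in the number of rows.
        for y, a_y in zip(rows, membership):
            r = similarity(x, y, ranges)
            lo = min(lo, min(1.0, 1.0 - r + a_y))  # Lukasiewicz implicator
            up = max(up, min(r, a_y))              # minimum t-norm
        lower.append(lo)
        upper.append(up)
    return lower, upper


# Hypothetical toy data: one attribute, three objects, crisp concept {object 0}.
toy_rows = [[0.0], [0.1], [1.0]]
toy_concept = [1.0, 0.0, 0.0]
toy_lower, toy_upper = fuzzy_rough_approximations(toy_rows, toy_concept)
```

In the toy data, object 0 belongs to the concept but is very similar to object 1, which does not, so its lower approximation degree is small. That is the "degree of inconsistency" the abstract refers to. The quadratic number of object pairs is what makes the computation demanding at the scale the abstract describes.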
NPRL: Nightly Profile Representation Learning for Early Sepsis Onset Prediction in ICU Trauma Patients
Sepsis is a syndrome that develops in response to the presence of infection.
It is characterized by severe organ dysfunction and is one of the leading
causes of mortality in Intensive Care Units (ICUs) worldwide. These
complications can be reduced through early application of antibiotics, hence
the ability to anticipate the onset of sepsis early is crucial to the survival
and well-being of patients. Current machine learning algorithms deployed inside
medical infrastructures have demonstrated poor performance and are insufficient
for anticipating sepsis onset early. In recent years, deep learning
methodologies have been proposed to predict sepsis, but some fail to capture
the time of onset (e.g., classifying patients' entire visits as developing
sepsis or not) and others are unrealistic to be deployed into medical
facilities (e.g., creating training instances using a fixed time to onset where
the time of onset needs to be known a priori). Therefore, in this paper, we
first propose a novel but realistic prediction framework that predicts each
morning whether sepsis onset will occur within the next 24 hours with the help
of the most recent data collected at night, when patient-provider ratios are higher
due to cross-coverage, resulting in limited observation of each patient.
However, as we increase the prediction frequency to daily, the number of negative
instances increases while the number of positive ones remains the same.
As a result, we face a severe class imbalance problem, which makes it hard for a
machine learning model to capture rare sepsis cases. To address this problem, we propose to
do nightly profile representation learning (NPRL) for each patient. We prove
that NPRL can theoretically alleviate the rare event problem. Our empirical
study using data from a level-1 trauma center further demonstrates the
effectiveness of our proposal.
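The class imbalance argument in this abstract can be illustrated with a small counting sketch. The cohort below is hypothetical and not taken from the paper. The labeling rule is also an assumption: one instance per morning, positive only on the day of onset, with no further instances after onset.

```python
from typing import List, Optional


def daily_labels(stay_days: int, onset_day: Optional[int] = None) -> List[int]:
    """One label per morning of the stay; 1 only on the (assumed) day of sepsis onset."""
    last_day = onset_day if onset_day is not None else stay_days
    return [1 if day == onset_day else 0 for day in range(1, last_day + 1)]


# Hypothetical cohort: (length of stay in days, onset day or None if no sepsis).
cohort = [(10, 10), (7, None), (14, None)]

labels = [label for stay, onset in cohort for label in daily_labels(stay, onset)]
per_visit_positive_rate = sum(1 for _, onset in cohort if onset is not None) / len(cohort)
daily_positive_rate = sum(labels) / len(labels)
```

With per-visit labeling, one of the three hypothetical patients is positive. With daily labeling, the same single positive sits among 31 instances, which shows how more frequent prediction dilutes the positive class.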
Multi-Subset Approach to Early Sepsis Prediction
Sepsis is a life-threatening organ malfunction caused by the host's inability
to fight infection, which can lead to death without proper and immediate
treatment. Therefore, early diagnosis and medical treatment of sepsis in
critically ill populations at high risk for sepsis and sepsis-associated
mortality are vital to providing the patient with rapid therapy. Studies show
that advancing sepsis detection by 6 hours leads to earlier administration of
antibiotics, which is associated with improved mortality. However, clinical
scores like Sequential Organ Failure Assessment (SOFA) are not applicable for
early prediction, while machine learning algorithms can help capture the
progressing pattern for early prediction. Therefore, we aim to develop a
machine learning algorithm that predicts sepsis onset 6 hours before it is
suspected clinically. Although some machine learning algorithms have been
applied to sepsis prediction, many of them did not consider the fact that six
hours is not a small gap. To overcome this large-gap challenge, we explore a
multi-subset approach in which the likelihood of sepsis occurring earlier than
6 hours is output from a previous subset and fed to the target subset as
additional features. Moreover, we use hourly sampled data such as vital signs
in an observation window to derive a temporal change trend as a further aid,
which is often ignored by previous studies. Our empirical study shows
that both the multi-subset approach to alleviating the 6-hour gap and the added
temporal trend features can help improve the performance of sepsis-related
early prediction.
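The two ideas in this abstract, a temporal trend derived from an observation window and a likelihood passed from a previous subset as an extra feature, can be sketched as follows. This is an illustrative sketch only. The abstract does not say how the trend is computed, so the least-squares slope used here is an assumption, and the function names and feature layout are hypothetical.

```python
from typing import List, Sequence


def trend_slope(values: Sequence[float]) -> float:
    """Least-squares slope of hourly samples against their hour index (assumed trend measure)."""
    n = len(values)
    if n < 2:
        return 0.0
    t_mean = (n - 1) / 2
    v_mean = sum(values) / n
    num = sum((t - t_mean) * (v - v_mean) for t, v in enumerate(values))
    den = sum((t - t_mean) ** 2 for t in range(n))
    return num / den


def build_features(window: Sequence[float], prior_likelihood: float) -> List[float]:
    """Feature vector for the target subset: latest value, trend, and the
    likelihood produced by a model trained on a previous subset."""
    return [window[-1], trend_slope(window), prior_likelihood]


# Hypothetical hourly heart-rate window and a hypothetical upstream likelihood.
features = build_features([80.0, 82.0, 84.0, 86.0], prior_likelihood=0.3)
```

Here the trend feature captures that the vital sign is rising by two units per hour, information that the latest value alone does not carry.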