27 research outputs found

    MedLens: Improve mortality prediction via medical signs selecting and regression interpolation

    Monitoring the health status of patients and predicting mortality in advance is vital for providing patients with timely care and treatment. Large numbers of medical signs from electronic health records (EHR) are fed into advanced machine learning models to make predictions. However, the data-quality problem of the original clinical signs is rarely discussed in the literature. Based on an in-depth measurement of the missing rate and correlation score across various medical signs and a large number of patient hospital admission records, we discovered that the overall missing rate is extremely high and that a large number of useless signs can hurt the performance of prediction models. We concluded that improving data quality alone can improve the baseline accuracy of different prediction algorithms. We designed MEDLENS, with an automatic statistics-based approach for selecting vital medical signs and a flexible interpolation approach for time series with a high missing rate. After improving the data quality of the original medical signs, MEDLENS applies ensemble classifiers to boost accuracy and reduce computation overhead at the same time. It achieves an AUC-ROC of 0.96 and an AUC-PR of 0.81, which exceeds the previous benchmark.
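The two data-quality steps named in the abstract above (selecting signs by missing rate, then interpolating the gaps) can be illustrated with a minimal sketch. Everything specific here is an assumption for illustration and is not taken from MEDLENS itself: the 0.5 threshold, the sign names, the use of `None` for missing readings, and plain linear interpolation standing in for the paper's regression interpolation. The correlation-score criterion is omitted.

```python
def missing_rate(series):
    """Fraction of readings that are missing (None) in one sign's time series."""
    return sum(v is None for v in series) / len(series)


def select_signs(records, max_missing=0.5):
    """Keep only signs whose missing rate is at most max_missing.

    The threshold is a hypothetical value chosen for this example.
    """
    return {name: s for name, s in records.items()
            if missing_rate(s) <= max_missing}


def interpolate(series):
    """Fill None gaps linearly between the nearest known neighbours.

    Gaps at either end copy the nearest known value. This is a simple
    stand-in, not the regression interpolation used by the paper.
    """
    known = [i for i, v in enumerate(series) if v is not None]
    if not known:
        return list(series)
    filled = list(series)
    for i, v in enumerate(series):
        if v is not None:
            continue
        left = max((k for k in known if k < i), default=None)
        right = min((k for k in known if k > i), default=None)
        if left is None:
            filled[i] = series[right]
        elif right is None:
            filled[i] = series[left]
        else:
            t = (i - left) / (right - left)
            filled[i] = series[left] + t * (series[right] - series[left])
    return filled


# Hypothetical admission record: heart rate is mostly present, lactate is not.
records = {
    "heart_rate": [80.0, None, 82.0, 84.0],
    "lactate": [None, None, None, 1.2],
}
kept = select_signs(records)
cleaned = {name: interpolate(s) for name, s in kept.items()}
```

In this toy record, `lactate` is dropped because three of its four readings are missing, and the single gap in `heart_rate` is filled from its neighbours.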

    An Integrative Paradigm for Enhanced Stroke Prediction: Synergizing XGBoost and xDeepFM Algorithms

    Stroke prediction plays a crucial role in preventing and managing this debilitating condition. In this study, we address the challenge of stroke prediction using a comprehensive dataset and propose an ensemble model that combines the XGBoost and xDeepFM algorithms. Our work aims to improve upon existing stroke prediction models by achieving higher accuracy and robustness. Through rigorous experimentation, we validate the effectiveness of our ensemble model using the AUC metric. By comparing our findings with those of other models in the field, we gain valuable insights into the merits and drawbacks of various approaches. This, in turn, contributes to the progress of machine learning and deep learning techniques in the domain of stroke prediction.

    Understanding or Manipulation: Rethinking Online Performance Gains of Modern Recommender Systems

    Recommender systems are expected to be assistants that help human users find relevant information automatically without explicit queries. As recommender systems evolve, increasingly sophisticated learning techniques are applied and have achieved better performance in terms of user engagement metrics such as clicks and browsing time. The increase in measured performance, however, can have two possible attributions: a better understanding of user preferences, and a more proactive ability to exploit human bounded rationality to seduce users into over-consumption. A natural follow-up question is whether current recommendation algorithms are manipulating user preferences. If so, can we measure the manipulation level? In this paper, we present a general framework for benchmarking the degree of manipulation of recommendation algorithms, in both slate recommendation and sequential recommendation scenarios. The framework consists of four stages: initial preference calculation, training data collection, algorithm training and interaction, and metrics calculation that involves two proposed metrics. We benchmark some representative recommendation algorithms on both synthetic and real-world datasets under the proposed framework. We observe that a high online click-through rate does not necessarily mean a better understanding of users' initial preferences, and may instead result from prompting users to choose more documents they initially did not favor. Moreover, we find that the training data have a notable impact on the degree of manipulation, and algorithms with more powerful modeling abilities are more sensitive to such impact. The experiments also verify the usefulness of the proposed metrics for measuring the degree of manipulation. We advocate that future recommendation algorithm studies should be treated as an optimization problem with constrained user preference manipulation.
    Comment: 33 pages, 11 figures, 4 tables, ACM Transactions on Information Systems

    Large-scale Interactive Recommendation with Tree-structured Policy Gradient

    Reinforcement learning (RL) has recently been introduced to interactive recommender systems (IRS) because it learns from dynamic interactions and plans for long-run performance. Because an IRS typically has thousands of items to recommend (i.e., thousands of actions), most existing RL-based methods fail to handle such a large discrete action space and thus become inefficient. Existing work that tries to deal with the large discrete action space by using the deep deterministic policy gradient framework suffers from the inconsistency between the continuous action representation (the output of the actor network) and the real discrete action. To avoid such inconsistency and achieve high efficiency and recommendation effectiveness, in this paper, we propose a Tree-structured Policy Gradient Recommendation (TPGR) framework, where a balanced hierarchical clustering tree is built over the items and picking an item is formulated as seeking a path from the root to a certain leaf of the tree. Extensive experiments on carefully designed environments based on two real-world datasets demonstrate that our model provides superior recommendation performance and significant efficiency improvement over state-of-the-art methods.
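The idea in the abstract above of picking an item by walking a path from the root to a leaf of a balanced tree can be illustrated with a minimal sketch. It makes several assumptions that are not from the paper: items are placed on leaves by their integer index rather than by hierarchical clustering, the tree is perfectly balanced with `branching ** depth` leaves, and `uniform_policy` is a hypothetical placeholder for the learned per-node policies.

```python
import random


def item_to_path(item_id, branching, depth):
    """Return the child index taken at each level, root first."""
    path = []
    for _ in range(depth):
        path.append(item_id % branching)
        item_id //= branching
    return path[::-1]


def path_to_item(path, branching):
    """Inverse of item_to_path: fold the child indices back into an item id."""
    item_id = 0
    for child in path:
        item_id = item_id * branching + child
    return item_id


def uniform_policy(branching):
    """Hypothetical stand-in for a learned policy at each non-leaf node."""
    def policy(prefix):
        # A real policy would condition on the user state and on `prefix`.
        return [1.0 / branching] * branching
    return policy


def sample_item(policy, branching, depth, rng):
    """Walk from root to leaf, sampling one child per level."""
    path = []
    for _ in range(depth):
        probs = policy(tuple(path))
        child = rng.choices(range(branching), weights=probs, k=1)[0]
        path.append(child)
    return path_to_item(path, branching), path


BRANCHING, DEPTH = 3, 3  # 27 items, but only 3 choices per decision
rng = random.Random(0)
item, path = sample_item(uniform_policy(BRANCHING), BRANCHING, DEPTH, rng)
```

With this layout, each decision is a choice among `branching` children instead of among all `branching ** depth` items, which is the source of the efficiency gain the abstract refers to.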

    U-rank: Utility-oriented Learning to Rank with Implicit Feedback

    Learning to rank with implicit feedback is one of the most important tasks in many real-world information systems where the objective is some specific utility, e.g., clicks and revenue. However, we point out that existing methods based on the probabilistic ranking principle do not necessarily achieve the highest utility. To this end, we propose a novel ranking framework called U-rank that directly optimizes the expected utility of the ranking list. With a position-aware deep click-through rate prediction model, we address the attention bias considering both query-level and item-level features. Due to the item-specific attention bias modeling, the optimization for expected utility corresponds to a maximum weight matching on the item-position bipartite graph. We base the optimization of this objective on an efficient Lambdaloss framework, which is supported by both theoretical and empirical analysis. We conduct extensive experiments for both web search and recommender systems over three benchmark datasets and two proprietary datasets, where the performance gain of U-rank over state-of-the-art methods is demonstrated. Moreover, our proposed U-rank has been deployed on a large-scale commercial recommender, and a large improvement over the production baseline has been observed in online A/B testing.
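The statement in the abstract above that expected-utility optimization corresponds to a maximum weight matching on the item-position bipartite graph can be illustrated with a small sketch. The weights below are invented for illustration, and the brute-force search over permutations is a stand-in for a proper assignment solver; it is not how U-rank is trained, since the abstract states that training uses a Lambdaloss framework.

```python
from itertools import permutations


def best_assignment(weights):
    """Find the item-to-position assignment with the highest total weight.

    weights[i][k] is the expected utility of showing item i at position k.
    Returns (best_value, order), where order[k] is the item at position k.
    Brute force is only practical for a handful of items.
    """
    n_items = len(weights)
    n_positions = len(weights[0])
    best_value, best_order = float("-inf"), None
    for order in permutations(range(n_items), n_positions):
        value = sum(weights[item][pos] for pos, item in enumerate(order))
        if value > best_value:
            best_value, best_order = value, order
    return best_value, best_order


# Hypothetical expected utilities. Item 0 loses little when moved down,
# while item 1 loses most of its utility below the top position.
weights = [
    [0.30, 0.25, 0.20],  # item 0
    [0.28, 0.10, 0.02],  # item 1
    [0.05, 0.04, 0.03],  # item 2
]
value, order = best_assignment(weights)  # order is (1, 0, 2)
```

Sorting by top-position score would place item 0 first, for a total of 0.43. The matching instead places item 1 first, for a total of 0.56, because the attention decay differs per item. This is the kind of gap between score-sorted ranking and utility-optimal ranking that the abstract points to.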

    How Can Recommender Systems Benefit from Large Language Models: A Survey

    Recommender systems (RS) play an important role in matching users' information needs for Internet applications. In natural language processing (NLP) domains, large language models (LLMs) have shown astonishing emergent abilities (e.g., instruction following, reasoning), thus giving rise to the promising research direction of adapting LLMs to RS for performance enhancements and user experience improvements. In this paper, we conduct a comprehensive survey on this research direction from an application-oriented view. We first summarize existing research works from two orthogonal perspectives: where and how to adapt LLMs to RS. For the "WHERE" question, we discuss the roles that LLMs could play in different stages of the recommendation pipeline, i.e., feature engineering, feature encoder, scoring/ranking function, and pipeline controller. For the "HOW" question, we investigate the training and inference strategies, resulting in two fine-grained taxonomy criteria, i.e., whether to tune LLMs or not, and whether to involve a conventional recommendation model (CRM) for inference. Detailed analysis and general development trajectories are provided for both questions. Then, we highlight key challenges in adapting LLMs to RS from three aspects, i.e., efficiency, effectiveness, and ethics. Finally, we summarize the survey and discuss the future prospects. We also actively maintain a GitHub repository for papers and other related resources in this rising direction: https://github.com/CHIANGEL/Awesome-LLM-for-RecSys.
    Comment: 15 pages; 3 figures; summarization table in appendix

    Adaptive optimal output regulation for wheel-legged robot Ollie: A data-driven approach

    The dynamics of a robot may vary during operation due to both internal and external factors, such as non-ideal motor characteristics and unmodeled loads, which can lead to deteriorating control performance and even instability. In this paper, an adaptive optimal output regulation (AOOR)-based controller is designed for the wheel-legged robot Ollie to deal with possible model uncertainties and disturbances in a data-driven approach. We test the AOOR-based controller by forcing the robot to stand still, which is a conventional index for judging the balance controller of two-wheel robots. With online training on a small amount of data, the resulting AOOR achieves optimal control performance and stabilizes the robot within a small displacement in extensive experiments under different working conditions. Finally, the robot balances a rolling cylindrical bottle on its top with the balance control using the AOOR, but it fails with the initial controller. Experimental results demonstrate that the AOOR-based controller is effective and highly robust to model uncertainties and external disturbances.

    A functional variant in the Stearoyl-CoA desaturase gene promoter enhances fatty acid desaturation in pork

    There is growing public concern about reducing saturated fat intake. Stearoyl-CoA desaturase (SCD) is the lipogenic enzyme responsible for the biosynthesis of oleic acid (18:1) by desaturating stearic acid (18:0). Here we describe a total of 18 mutations in the promoter and 3′ non-coding region of the pig SCD gene and provide evidence that allele T at AY487830:g.2228T>C in the promoter region enhances fat desaturation (the ratio 18:1/18:0 in muscle increases from 3.78 to 4.43 in opposite homozygotes) without affecting fat content (18:0+18:1, intramuscular fat content, and backfat thickness). No mutations that could affect the functionality of the protein were found in the coding region. First, we proved in a purebred Duroc line that the C-T-A haplotype of the 3 single nucleotide polymorphisms (SNPs) (g.2108C>T; g.2228T>C; g.2281A>G) of the promoter region was additively associated with enhanced 18:1/18:0 both in muscle and subcutaneous fat, but not in liver. We show that this association was consistent over a 10-year period of overlapping generations and, in line with these results, that the C-T-A haplotype displayed greater SCD mRNA expression in muscle. The effect of this haplotype was validated both internally, by comparing opposite homozygote siblings, and externally, by using experimental Duroc-based crossbreds. Second, the g.2281A>G and the g.2108C>T SNPs were excluded as causative mutations using new and previously published data, restricting the causality to the g.2228T>C SNP, the last source of genetic variation within the haplotype. This mutation is positioned in the core sequence of several putative transcription factor binding sites, so that there are several plausible mechanisms by which allele T enhances 18:1/18:0 and, consequently, the proportion of monounsaturated to saturated fat.
    This research was supported by grants from the Spanish Ministry of Science and Innovation (AGL2009-09779 and AGL2012-33529). RRF is the recipient of a PhD scholarship from the Spanish Ministry of Science and Innovation (BES-2010-034607). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.