458 research outputs found

    Real Time Web Search Framework for Performing Efficient Retrieval of Data

    With the rapidly growing amount of information on the internet, real-time systems are one of the key strategies for coping with information overload and helping users find highly relevant information. Real-time events and domain-specific information are important knowledge base references on the Web that are frequently accessed by millions of users. A real-time system is vital to a product, and to be reliable a technique must address several challenges, e.g. short data life-cycles, heterogeneous user interests, strict time constraints, and context-dependent article relevance. Since real-time data have only a short time to live, real-time models have to be continuously adapted, ensuring that real-time data are always up to date. The focus of this manuscript is the design of a real-time web search approach that aggregates several web search algorithms at query time to tune search results for relevancy. We learn a context-aware delegation algorithm that chooses the best real-time algorithm for each query request. The evaluation showed that the proposed approach outperforms the traditional models, in that it allows us to adapt to the specific properties of the considered real-time resources. In the experiments, we found that it is highly relevant for the most recently searched queries, consistent in its performance, and resilient to the drawbacks faced by other algorithms.

    FLINT: A Platform for Federated Learning Integration

    Cross-device federated learning (FL) has been well-studied from algorithmic, system scalability, and training speed perspectives. Nonetheless, moving from centralized training to cross-device FL for millions or billions of devices presents many risks, including performance loss, developer inertia, poor user experience, and unexpected application failures. In addition, the corresponding infrastructure, development costs, and return on investment are difficult to estimate. In this paper, we present a device-cloud collaborative FL platform that integrates with an existing machine learning platform, providing tools to measure real-world constraints, assess infrastructure capabilities, evaluate model training performance, and estimate system resource requirements to responsibly bring FL into production. We also present a decision workflow that leverages the FL-integrated platform to comprehensively evaluate the trade-offs of cross-device FL and share our empirical evaluations of business-critical machine learning applications that impact hundreds of millions of users. Comment: Preprint for MLSys 202

    Building K-Anonymous User Cohorts with Consecutive Consistent Weighted Sampling (CCWS)

    To retrieve personalized campaigns and creatives while protecting user privacy, digital advertising is shifting from member-based identity to cohort-based identity. Under such an identity regime, an accurate and efficient cohort building algorithm is desired to group users with similar characteristics. In this paper, we propose a scalable K-anonymous cohort building algorithm called consecutive consistent weighted sampling (CCWS). The proposed method combines the spirit of the (p-powered) consistent weighted sampling and hierarchical clustering, so that K-anonymity is ensured by enforcing a lower bound on the size of cohorts. Evaluations on a LinkedIn dataset consisting of >70M users and ads campaigns demonstrate that CCWS achieves substantial improvements over several hashing-based methods including sign random projections (SignRP), minwise hashing (MinHash), as well as the vanilla CWS.

    Policy Implications of User-Generated Data Network Effects

    User-generated data (UGD) network effects are an exciting and novel economic force. They upset conventional market competition dynamics, and they lead to the formation of dominant data platforms with market power that spans different and seemingly unrelated markets. This article explains that UGD network effects are a blessing and a curse. They provide dominant data platforms with the opportunity to generate welfare-enhancing efficiencies as well as welfare-reducing anticompetitive harms. After exploring the economic opportunities and social threats, this article explores the implications of UGD network effects on competition policy. Drawing on traditional network effects theory, this article proposes and critically examines a host of remedial approaches for policymakers to consider. These remedies include modernized public utility-style regulation, open access policies, and adjusted standards for anti-monopolization and merger scrutiny.

    Fairness in Recommendation: Foundations, Methods and Applications

    As one of the most pervasive applications of machine learning, recommender systems play an important role in assisting human decision making. The satisfaction of users and the interests of platforms are closely related to the quality of the generated recommendation results. However, as a highly data-driven system, a recommender system can be affected by data or algorithmic bias and thus generate unfair results, which could weaken reliance on the systems. As a result, it is crucial to address the potential unfairness problems in recommendation settings. Recently, there has been growing attention to fairness considerations in recommender systems, with more and more literature on approaches to promote fairness in recommendation. However, the studies are rather fragmented and lack a systematic organization, making it difficult for new researchers to enter the domain. This motivates us to provide a systematic survey of existing works on fairness in recommendation. This survey focuses on the foundations of the fairness in recommendation literature. It first presents a brief introduction to fairness in basic machine learning tasks such as classification and ranking in order to provide a general overview of fairness research, and introduces the more complex situations and challenges that need to be considered when studying fairness in recommender systems. After that, the survey introduces fairness in recommendation with a focus on the taxonomies of current fairness definitions, the typical techniques for improving fairness, and the datasets for fairness studies in recommendation. The survey also discusses the challenges and opportunities in fairness research with the hope of promoting the fair recommendation research area and beyond. Comment: Accepted by ACM Transactions on Intelligent Systems and Technology (TIST)

    A reference architecture for big data systems

    For decades, applying new IT technologies in organizations has been a major concern for business. Big data is a new concept that excites business. Accessing more data and being able to analyze big data requires new big data platforms. However, reference architectures for big data systems remain limited. In this paper, based on existing reference architectures of big data systems, we propose a new high-level abstract reference architecture and related reference architecture notations that better express the overall architecture. The new reference architecture is verified using one existing case and an additional new use case.

    A novel approach towards skill-based search and services of Open Educational Resources

    Ha, K.-H., Niemann, K., Schwertel, U., Holtkamp, P., Pirkkalainen, H., Börner, D. et al. (2011). A novel approach towards skill-based search and services of Open Educational Resources. In E. Garcia-Barriocanal, A. Öztürk, & M. C. Okur (Eds.), Metadata and Semantics Research: 5th International Conference MTSR 2011 (pp. 312-323), Izmir, Turkey, October 12-14, 2011. Springer.
    Open educational resources (OER) have a high potential to address the growing need for training materials in management education and training. Today, a high number of OER in management are already available in a large number of repositories. However, users face barriers as they have to search repository by repository with different interfaces to retrieve the appropriate learning content. In addition, the use of search criteria related to skills, such as learning objectives and skill-levels, is not generally supported. The European co-funded project OpenScout addresses these barriers by intelligently connecting leading European OER repositories and providing federated, skill-based search and retrieval web services. On top of this content federation the project supports users with easy-to-apply tools that will accelerate the (re-)use of open content.

    Don's Conference Notes--An NFAIS Workshop and A Charleston Preconference Seminar
