11,096 research outputs found

    Learning and Transferring IDs Representation in E-commerce

    Full text link
    Many machine intelligence techniques are developed in E-commerce and one of the most essential components is the representation of IDs, including user ID, item ID, product ID, store ID, brand ID, category ID etc. The classical encoding based methods (like one-hot encoding) are inefficient in that it suffers sparsity problems due to its high dimension, and it cannot reflect the relationships among IDs, either homogeneous or heterogeneous ones. In this paper, we propose an embedding based framework to learn and transfer the representation of IDs. As the implicit feedbacks of users, a tremendous amount of item ID sequences can be easily collected from the interactive sessions. By jointly using these informative sequences and the structural connections among IDs, all types of IDs can be embedded into one low-dimensional semantic space. Subsequently, the learned representations are utilized and transferred in four scenarios: (i) measuring the similarity between items, (ii) transferring from seen items to unseen items, (iii) transferring across different domains, (iv) transferring across different tasks. We deploy and evaluate the proposed approach in Hema App and the results validate its effectiveness.Comment: KDD'18, 9 page

    The Graphical Lasso: New Insights and Alternatives

    Full text link
    The graphical lasso \citep{FHT2007a} is an algorithm for learning the structure in an undirected Gaussian graphical model, using â„“1\ell_1 regularization to control the number of zeros in the precision matrix {\B\Theta}={\B\Sigma}^{-1} \citep{BGA2008,yuan_lin_07}. The {\texttt R} package \GL\ \citep{FHT2007a} is popular, fast, and allows one to efficiently build a path of models for different values of the tuning parameter. Convergence of \GL\ can be tricky; the converged precision matrix might not be the inverse of the estimated covariance, and occasionally it fails to converge with warm starts. In this paper we explain this behavior, and propose new algorithms that appear to outperform \GL. By studying the "normal equations" we see that, \GL\ is solving the {\em dual} of the graphical lasso penalized likelihood, by block coordinate ascent; a result which can also be found in \cite{BGA2008}. In this dual, the target of estimation is \B\Sigma, the covariance matrix, rather than the precision matrix \B\Theta. We propose similar primal algorithms \PGL\ and \DPGL, that also operate by block-coordinate descent, where \B\Theta is the optimization target. We study all of these algorithms, and in particular different approaches to solving their coordinate sub-problems. We conclude that \DPGL\ is superior from several points of view.Comment: This is a revised version of our previous manuscript with the same name ArXiv id: http://arxiv.org/abs/1111.547

    Learned Cardinalities: Estimating Correlated Joins with Deep Learning

    Get PDF
    We describe a new deep learning approach to cardinality estimation. MSCN is a multi-set convolutional network, tailored to representing relational query plans, that employs set semantics to capture query features and true cardinalities. MSCN builds on sampling-based estimation, addressing its weaknesses when no sampled tuples qualify a predicate, and in capturing join-crossing correlations. Our evaluation of MSCN using a real-world dataset shows that deep learning significantly enhances the quality of cardinality estimation, which is the core problem in query optimization.Comment: CIDR 2019. https://github.com/andreaskipf/learnedcardinalitie

    Development of Economic Water Usage Sensor and Cyber-Physical Systems Co-Simulation Platform for Home Energy Saving

    Get PDF
    In this thesis, two Cyber-Physical Systems (CPS) approaches were considered to reduce residential building energy consumption. First, a flow sensor was developed for residential gas and electric storage water heaters. The sensor utilizes unique temperature changes of tank inlet and outlet pipes upon water draw to provide occupant hot water usage. Post processing of measured pipe temperature data was able to detect water draw events. Conservation of energy was applied to heater pipes to determine relative internal water flow rate based on transient temperature measurements. Correlations between calculated flow and actual flow were significant at a 95% confidence level. Using this methodology, a CPS water heater controller can activate existing residential storage water heaters according to occupant hot water demand. The second CPS approach integrated an open-source building simulation tool, EnergyPlus, into a CPS simulation platform developed by the National Institute of Standards and Technology (NIST). The NIST platform utilizes the High Level Architecture (HLA) co-simulation protocol for logical timing control and data communication. By modifying existing EnergyPlus co-simulation capabilities, NIST’s open-source platform was able to execute an uninterrupted simulation between a residential house in EnergyPlus and an externally connected thermostat controller. The developed EnergyPlus wrapper for HLA co-simulation can allow active replacement of traditional real-time data collection for building CPS development. As such, occupant sensors and simple home CPS product can allow greater residential participation in energy saving practices, saving up to 33% on home energy consumption nationally

    AgentCF: Collaborative Learning with Autonomous Language Agents for Recommender Systems

    Full text link
    Recently, there has been an emergence of employing LLM-powered agents as believable human proxies, based on their remarkable decision-making capability. However, existing studies mainly focus on simulating human dialogue. Human non-verbal behaviors, such as item clicking in recommender systems, although implicitly exhibiting user preferences and could enhance the modeling of users, have not been deeply explored. The main reasons lie in the gap between language modeling and behavior modeling, as well as the incomprehension of LLMs about user-item relations. To address this issue, we propose AgentCF for simulating user-item interactions in recommender systems through agent-based collaborative filtering. We creatively consider not only users but also items as agents, and develop a collaborative learning approach that optimizes both kinds of agents together. Specifically, at each time step, we first prompt the user and item agents to interact autonomously. Then, based on the disparities between the agents' decisions and real-world interaction records, user and item agents are prompted to reflect on and adjust the misleading simulations collaboratively, thereby modeling their two-sided relations. The optimized agents can also propagate their preferences to other agents in subsequent interactions, implicitly capturing the collaborative filtering idea. Overall, the optimized agents exhibit diverse interaction behaviors within our framework, including user-item, user-user, item-item, and collective interactions. The results show that these agents can demonstrate personalized behaviors akin to those of real-world individuals, sparking the development of next-generation user behavior simulation

    QueRIE: Collaborative Database Exploration

    Get PDF
    Interactive database exploration is a key task in information mining. However, users who lack SQL expertise or familiarity with the database schema face great difficulties in performing this task. To aid these users, we developed the QueRIE system for personalized query recommendations. QueRIE continuously monitors the user’s querying behavior and finds matching patterns in the system’s query log, in an attempt to identify previous users with similar information needs. Subsequently, QueRIE uses these “similar” users and their queries to recommend queries that the current user may find interesting. In this work we describe an instantiation of the QueRIE framework, where the active user’s session is represented by a set of query fragments. The recorded fragments are used to identify similar query fragments in the previously recorded sessions, which are in turn assembled in potentially interesting queries for the active user. We show through experimentation that the proposed method generates meaningful recommendations on real-life traces from the SkyServer database and propose a scalable design that enables the incremental update of similarities, making real-time computations on large amounts of data feasible. Finally, we compare this fragment-based instantiation with our previously proposed tuple-based instantiation discussing the advantages and disadvantages of each approach
    • …
    corecore