Learning and Transferring IDs Representation in E-commerce
Many machine intelligence techniques are developed in E-commerce and one of
the most essential components is the representation of IDs, including user ID,
item ID, product ID, store ID, brand ID, category ID, etc. Classical
encoding-based methods (such as one-hot encoding) are inefficient: they
suffer from sparsity because of their high dimensionality, and they cannot reflect the
relationships among IDs, whether homogeneous or heterogeneous. In this
paper, we propose an embedding-based framework to learn and transfer the
representation of IDs. As implicit feedback from users, a tremendous amount
of item ID sequences can be easily collected from the interactive sessions. By
jointly using these informative sequences and the structural connections among
IDs, all types of IDs can be embedded into one low-dimensional semantic space.
Subsequently, the learned representations are utilized and transferred in four
scenarios: (i) measuring the similarity between items, (ii) transferring from
seen items to unseen items, (iii) transferring across different domains, (iv)
transferring across different tasks. We deploy and evaluate the proposed
approach in the Hema App, and the results validate its effectiveness.
Comment: KDD'18, 9 pages
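Scenario (i) above, measuring the similarity between items, can be illustrated with a minimal sketch. It assumes item embeddings are already available as plain vectors and compares them with cosine similarity; the abstract does not state which similarity measure the authors use, and the names `cosine_similarity`, `most_similar`, and the toy vectors in `item_embeddings` are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def most_similar(item_id, embeddings, top_k=2):
    """Return the top_k (other_id, score) pairs closest to item_id, best first."""
    query = embeddings[item_id]
    scored = [
        (other_id, cosine_similarity(query, vector))
        for other_id, vector in embeddings.items()
        if other_id != item_id
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Toy embeddings, invented for illustration only (assumption, not data from the paper).
item_embeddings = {
    "item_a": [1.0, 0.0, 0.0],
    "item_b": [0.9, 0.1, 0.0],
    "item_c": [0.0, 1.0, 0.0],
}

if __name__ == "__main__":
    print(most_similar("item_a", item_embeddings, top_k=2))
```

Because all ID types share one embedding space in the described framework, the same comparison could in principle be applied between, say, an item vector and a brand vector, though the abstract does not spell out how that is done.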
The Graphical Lasso: New Insights and Alternatives
The graphical lasso \citep{FHT2007a} is an algorithm for learning the
structure in an undirected Gaussian graphical model, using $\ell_1$
regularization to control the number of zeros in the precision matrix
$\Theta = \Sigma^{-1}$ \citep{BGA2008,yuan_lin_07}. The R
package glasso \citep{FHT2007a} is popular, fast, and allows one to efficiently
build a path of models for different values of the tuning parameter.
Convergence of glasso can be tricky; the converged precision matrix might not be
the inverse of the estimated covariance, and occasionally it fails to converge
with warm starts. In this paper we explain this behavior, and propose new
algorithms that appear to outperform glasso.
By studying the "normal equations" we see that glasso is solving the
dual of the graphical lasso penalized likelihood by block coordinate ascent,
a result which can also be found in \cite{BGA2008}.
In this dual, the target of estimation is $\Sigma$, the covariance matrix,
rather than the precision matrix $\Theta$. We propose similar primal
algorithms, p-glasso and dp-glasso, that also operate by block coordinate descent,
where $\Theta$ is the optimization target. We study all of these algorithms,
and in particular different approaches to solving their coordinate
sub-problems. We conclude that dp-glasso is superior from several points of view.
Comment: This is a revised version of our previous manuscript with the same
name ArXiv id: http://arxiv.org/abs/1111.547
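For orientation, the penalized likelihood and its dual mentioned in this abstract are commonly written as below. This is a sketch in standard notation rather than a quotation from the paper: the sample covariance $S$, the tuning parameter $\lambda$, the dimension $p$, and the dual variable $\Gamma$ are symbols assumed here, and the penalty is assumed to apply to every entry of $\Theta$, including the diagonal.

```latex
% Primal problem: the optimization target is the precision matrix \Theta.
\min_{\Theta \succ 0} \;
  -\log\det\Theta + \operatorname{tr}(S\Theta) + \lambda \,\|\Theta\|_1

% "Normal equations" (stationarity condition of the primal problem):
-\Theta^{-1} + S + \lambda\,\Gamma = 0,
\qquad \Gamma_{ij} \in \operatorname{sign}(\Theta_{ij})

% Dual problem: with W = S + \tilde\Gamma, the target is the covariance matrix.
\max_{\|\tilde\Gamma\|_\infty \le \lambda} \;
  \log\det\bigl(S + \tilde\Gamma\bigr) + p
```

At the optimum $W = S + \tilde\Gamma = \Theta^{-1}$ with $\tilde\Gamma = \lambda\,\Gamma$, which is why an algorithm that updates $W$ block by block is working on the covariance rather than on the precision matrix.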
Learned Cardinalities: Estimating Correlated Joins with Deep Learning
We describe a new deep learning approach to cardinality estimation. MSCN is a
multi-set convolutional network, tailored to representing relational query
plans, that employs set semantics to capture query features and true
cardinalities. MSCN builds on sampling-based estimation, addressing its
weaknesses when no sampled tuples qualify a predicate, and in capturing
join-crossing correlations. Our evaluation of MSCN using a real-world dataset
shows that deep learning significantly enhances the quality of cardinality
estimation, which is the core problem in query optimization.
Comment: CIDR 2019. https://github.com/andreaskipf/learnedcardinalitie
Development of Economic Water Usage Sensor and Cyber-Physical Systems Co-Simulation Platform for Home Energy Saving
In this thesis, two Cyber-Physical Systems (CPS) approaches were considered to reduce residential building energy consumption. First, a flow sensor was developed for residential gas and electric storage water heaters. The sensor uses the distinctive temperature changes of the tank inlet and outlet pipes during a water draw to estimate occupant hot water usage. Post-processing of the measured pipe temperature data was able to detect water draw events. Conservation of energy was applied to the heater pipes to determine the relative internal water flow rate from transient temperature measurements. Correlations between calculated flow and actual flow were significant at a 95% confidence level. Using this methodology, a CPS water heater controller can activate existing residential storage water heaters according to occupant hot water demand. The second CPS approach integrated an open-source building simulation tool, EnergyPlus, into a CPS simulation platform developed by the National Institute of Standards and Technology (NIST). The NIST platform uses the High Level Architecture (HLA) co-simulation protocol for logical timing control and data communication. By modifying existing EnergyPlus co-simulation capabilities, NIST's open-source platform was able to execute an uninterrupted simulation between a residential house in EnergyPlus and an externally connected thermostat controller. The developed EnergyPlus wrapper for HLA co-simulation can serve as an active replacement for traditional real-time data collection in building CPS development. As such, occupant sensors and simple home CPS products can allow greater residential participation in energy saving practices, saving up to 33% on home energy consumption nationally.
AgentCF: Collaborative Learning with Autonomous Language Agents for Recommender Systems
Recently, LLM-powered agents have emerged as believable human proxies,
owing to their remarkable decision-making capability.
However, existing studies mainly focus on simulating human dialogue. Human
non-verbal behaviors, such as item clicking in recommender systems, implicitly
exhibit user preferences and could enhance the modeling of users, yet they
have not been deeply explored. The main reasons lie in the gap between language
modeling and behavior modeling, as well as the limited understanding that LLMs have of
user-item relations.
To address this issue, we propose AgentCF for simulating user-item
interactions in recommender systems through agent-based collaborative
filtering. We creatively consider not only users but also items as agents, and
develop a collaborative learning approach that optimizes both kinds of agents
together. Specifically, at each time step, we first prompt the user and item
agents to interact autonomously. Then, based on the disparities between the
agents' decisions and real-world interaction records, user and item agents are
prompted to reflect on and adjust the misleading simulations collaboratively,
thereby modeling their two-sided relations. The optimized agents can also
propagate their preferences to other agents in subsequent interactions,
implicitly capturing the collaborative filtering idea. Overall, the optimized
agents exhibit diverse interaction behaviors within our framework, including
user-item, user-user, item-item, and collective interactions. The results show
that these agents can demonstrate personalized behaviors akin to those of
real-world individuals, sparking the development of next-generation user
behavior simulation.
QueRIE: Collaborative Database Exploration
Interactive database exploration is a key task in information mining. However, users who lack SQL expertise or familiarity with the database schema face great difficulties in performing this task. To aid these users, we developed the QueRIE system for personalized query recommendations. QueRIE continuously monitors the user's querying behavior and finds matching patterns in the system's query log, in an attempt to identify previous users with similar information needs. Subsequently, QueRIE uses these "similar" users and their queries to recommend queries that the current user may find interesting. In this work we describe an instantiation of the QueRIE framework in which the active user's session is represented by a set of query fragments. The recorded fragments are used to identify similar query fragments in previously recorded sessions, which are in turn assembled into potentially interesting queries for the active user. We show through experimentation that the proposed method generates meaningful recommendations on real-life traces from the SkyServer database, and we propose a scalable design that enables the incremental update of similarities, making real-time computations on large amounts of data feasible. Finally, we compare this fragment-based instantiation with our previously proposed tuple-based instantiation, discussing the advantages and disadvantages of each approach.
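The fragment-based idea described in this abstract can be sketched roughly as follows. This is an illustrative toy, not the QueRIE algorithm: it assumes each session is a set of fragment strings, uses Jaccard similarity between sessions (the abstract does not name the similarity measure), and stops at ranking candidate fragments instead of assembling full queries. The function names and the example fragments are hypothetical.

```python
def jaccard(a, b):
    """Jaccard similarity of two sets; defined as 0.0 when both are empty."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def recommend_fragments(active, past_sessions, top_k=3):
    """Rank fragments the active session lacks, weighted by session similarity."""
    scores = {}
    for session in past_sessions:
        weight = jaccard(active, session)
        if weight == 0.0:
            continue  # sessions with no shared fragments contribute nothing
        for fragment in session - active:
            scores[fragment] = scores.get(fragment, 0.0) + weight
    # Highest score first; ties broken alphabetically for a stable order.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [fragment for fragment, _ in ranked[:top_k]]


# Invented example sessions (assumption, not taken from the SkyServer traces).
active_session = {"FROM PhotoObj", "WHERE ra > 180"}
past_sessions = [
    {"FROM PhotoObj", "WHERE ra > 180", "SELECT objID"},
    {"FROM PhotoObj", "WHERE dec < 0"},
    {"FROM SpecObj", "SELECT z"},
]

if __name__ == "__main__":
    print(recommend_fragments(active_session, past_sessions))
```

Keeping a running score per fragment hints at why incremental updates are attractive: when a new session is logged, only the scores it touches need to change.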