1,197 research outputs found
Improving package recommendations through query relaxation
Recommendation systems aim to identify items that are likely to be of
interest to users. In many cases, users are interested in package
recommendations as collections of items. For example, a dietitian may wish to
derive a dietary plan as a collection of recipes that is nutritionally
balanced, and a travel agent may want to produce a vacation package as a
coordinated collection of travel and hotel reservations. Recent work has
explored extending recommendation systems to support packages of items. These
systems need to solve complex combinatorial problems, enforcing various
properties and constraints defined on sets of items. Introducing constraints on
packages makes recommendation queries harder to evaluate, but also harder to
express: Queries that are under-specified produce too many answers, whereas
queries that are over-specified frequently miss interesting solutions.
In this paper, we study query relaxation techniques that target package
recommendation systems. Our work offers three key insights: First, even when
the original query result is not empty, relaxing constraints can produce
preferable solutions. Second, a solution due to relaxation can only be
preferred if it improves some property specified by the query. Third,
relaxation should not treat all constraints as equals: some constraints are
more important to the users than others. Our contributions are threefold: (a)
we define the problem of deriving package recommendations through query
relaxation, (b) we design and experimentally evaluate heuristics that relax
query constraints to derive interesting packages, and (c) we present a crowd
study that evaluates the sensitivity of real users to different kinds of
constraints and demonstrates that query relaxation is a powerful tool in
diversifying package recommendations.
ANSWERING WHY-NOT QUESTIONS ON REVERSE SKYLINE QUERIES OVER INCOMPLETE DATA
Preference-based queries have recently received considerable attention from researchers and data users. One of the most popular is the skyline query, which returns the subset of records that are not dominated by any other record. The reverse skyline query was developed as an extension of the skyline query: it identifies the query points whose own skyline results include a given record. Furthermore, data-oriented IT development requires that data can be processed under all conditions. Real-world multidimensional data are often incomplete, whether because of damage, loss, or privacy restrictions. To increase the usability of such data sets, this study addresses one of the problems in processing reverse skyline queries over incomplete data, namely the "why-not" problem. The solution considered for the "why-not" problem takes the form of advice and steps by which a query point whose result initially excludes an incomplete record can be changed so that the record becomes part of the result. The study further discusses the dominance relationship between incomplete records, together with the solution to the problem. Finally, performance evaluations are conducted to measure efficiency and effectiveness.
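The dominance relationship mentioned above can be illustrated with a minimal sketch. It assumes one convention found in the incomplete-data skyline literature, which may differ from the definition used in this study: two records are compared only on the dimensions observed in both, and smaller values are preferred. The names `dominates` and `skyline` are hypothetical, and missing values are represented as `None`.

```python
from typing import List, Optional, Sequence

Record = Sequence[Optional[float]]


def dominates(p: Record, q: Record) -> bool:
    """Return True if p dominates q on the dimensions observed in both.

    Assumption: smaller is better, and None marks a missing value.
    """
    shared = [(a, b) for a, b in zip(p, q) if a is not None and b is not None]
    if not shared:
        # No common observed dimension: the records are incomparable.
        return False
    no_worse = all(a <= b for a, b in shared)
    strictly_better = any(a < b for a, b in shared)
    return no_worse and strictly_better


def skyline(records: List[Record]) -> List[Record]:
    """Keep every record that no other record dominates.

    Under this convention dominance need not be transitive, so the sketch
    compares all pairs instead of pruning early.
    """
    return [
        q for q in records
        if not any(dominates(p, q) for p in records if p is not q)
    ]


if __name__ == "__main__":
    data = [(1, None, 3), (2, 2, None), (None, 1, 4), (3, 3, 5)]
    print(skyline(data))  # [(1, None, 3)]
```

A why-not question would then ask why a particular record is absent from such a result and what would have to change for it to appear; the sketch covers only the dominance test, not that explanation step.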
OpenCodeInterpreter: Integrating Code Generation with Execution and Refinement
The introduction of large language models has significantly advanced code
generation. However, open-source models often lack the execution capabilities
and iterative refinement of advanced systems like the GPT-4 Code Interpreter.
To address this, we introduce OpenCodeInterpreter, a family of open-source code
systems designed for generating, executing, and iteratively refining code.
Supported by Code-Feedback, a dataset featuring 68K multi-turn interactions,
OpenCodeInterpreter integrates execution and human feedback for dynamic code
refinement. Our comprehensive evaluation of OpenCodeInterpreter across key
benchmarks such as HumanEval, MBPP, and their enhanced versions from EvalPlus
reveals its exceptional performance. Notably, OpenCodeInterpreter-33B achieves
an accuracy of 83.2 (76.4) on the average (and plus versions) of HumanEval and
MBPP, closely rivaling GPT-4's 84.2 (76.2), and rises further to 91.6 (84.6)
with synthesized human feedback from GPT-4. OpenCodeInterpreter narrows the gap
between open-source code generation models and proprietary systems like GPT-4
Code Interpreter.
K-Dominance in Multidimensional Data: Theory and Applications
We study the problem of k-dominance in a set of d-dimensional vectors, prove bounds on the number of maxima (skyline vectors) under both worst-case and average-case models, perform an experimental evaluation using synthetic and real-world data, and explore an application of the k-dominant skyline for extracting a small set of top-ranked vectors in high dimensions, where the full skylines can be unmanageably large.
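As a rough illustration of the concept, the sketch below uses the commonly cited definition of k-dominance, which is an assumption here because the abstract does not state it: p k-dominates q if p is no worse than q on at least k of the d dimensions and strictly better on at least one of them. Smaller values are taken as better, and the function names are hypothetical.

```python
from typing import List, Sequence

Vector = Sequence[float]


def k_dominates(p: Vector, q: Vector, k: int) -> bool:
    """Return True if p k-dominates q (assumption: smaller is better)."""
    no_worse = sum(a <= b for a, b in zip(p, q))
    strictly_better = any(a < b for a, b in zip(p, q))
    # A strictly better dimension is also a no-worse one, so it can always
    # be chosen among the k dimensions.
    return no_worse >= k and strictly_better


def k_dominant_skyline(points: List[Vector], k: int) -> List[Vector]:
    """Keep the points that no other point k-dominates."""
    return [
        q for q in points
        if not any(k_dominates(p, q, k) for p in points if p is not q)
    ]


if __name__ == "__main__":
    pts = [(1, 5, 3), (2, 2, 2), (3, 1, 4), (4, 4, 1)]
    print(k_dominant_skyline(pts, 3))  # all four points (ordinary skyline)
    print(k_dominant_skyline(pts, 2))  # [(2, 2, 2)]
```

With k equal to the number of dimensions this reduces to the ordinary skyline. Lowering k makes domination easier, which shrinks the result; on this toy input it goes from four vectors to one.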
Integration of Skyline Queries into Spark SQL
Skyline queries are frequently used in data analytics and multi-criteria
decision support applications to filter relevant information from large amounts
of data. Apache Spark is a popular framework for processing large, distributed
data. The framework even provides a convenient SQL-like interface via the Spark
SQL module. However, skyline queries are not natively supported and require
tedious rewriting to fit the SQL standard or Spark's SQL-like language. The
goal of our work is to fill this gap. We thus provide a full-fledged
integration of the skyline operator into Spark SQL. This allows skyline
queries to be written in a simple, easy-to-use syntax. Moreover, our empirical
results show that this integrated solution for skyline queries by far
outperforms a solution based on rewriting into standard SQL.
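To show what rewriting into standard SQL can look like, here is a small sketch that expresses a two-dimensional skyline as a `NOT EXISTS` subquery. It runs on Python's built-in `sqlite3` module, not on Spark, and the `hotels` table, its columns, and the function name are hypothetical. It illustrates the general rewriting pattern under those assumptions and is not the baseline evaluated in the paper.

```python
import sqlite3
from typing import Iterable, List, Tuple


def skyline_via_sql(rows: Iterable[Tuple[str, float, float]]) -> List[str]:
    """Return names of hotels not dominated on (price, distance).

    Assumption: lower price and lower distance are both preferred.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE hotels (name TEXT, price REAL, distance REAL)")
    conn.executemany("INSERT INTO hotels VALUES (?, ?, ?)", rows)
    # A hotel h is in the skyline if no other hotel o is at least as good
    # on both dimensions and strictly better on at least one.
    query = """
        SELECT h.name
        FROM hotels AS h
        WHERE NOT EXISTS (
            SELECT 1
            FROM hotels AS o
            WHERE o.price <= h.price
              AND o.distance <= h.distance
              AND (o.price < h.price OR o.distance < h.distance)
        )
        ORDER BY h.name
    """
    names = [row[0] for row in conn.execute(query)]
    conn.close()
    return names


if __name__ == "__main__":
    hotels = [("A", 50, 8), ("B", 80, 2), ("C", 90, 9), ("D", 60, 3)]
    print(skyline_via_sql(hotels))  # ['A', 'B', 'D']
```

The dominance condition has to be spelled out once per dimension and the correlated subquery grows with every added dimension, which is the kind of tedious rewriting the abstract refers to.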