Improved Heterogeneous Distance Functions
Instance-based learning techniques typically handle continuous and linear
input values well, but often do not handle nominal input attributes
appropriately. The Value Difference Metric (VDM) was designed to find
reasonable distance values between nominal attribute values, but it largely
ignores continuous attributes, requiring discretization to map continuous
values into nominal values. This paper proposes three new heterogeneous
distance functions, called the Heterogeneous Value Difference Metric (HVDM),
the Interpolated Value Difference Metric (IVDM), and the Windowed Value
Difference Metric (WVDM). These new distance functions are designed to handle
applications with nominal attributes, continuous attributes, or both. In
experiments on 48 applications the new distance metrics achieve higher
classification accuracy on average than three previous distance functions on
those datasets that have both nominal and continuous attributes.
Comment: See http://www.jair.org/ for an online appendix and other files
accompanying this article
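The idea of one distance function covering both nominal and continuous attributes can be sketched as follows. This is a simplified illustration in the spirit of a heterogeneous value difference metric, not the paper's exact definition: the abstract gives no formulas, so the normalization choices (class-conditional frequency differences for nominal attributes, absolute difference divided by four standard deviations for continuous ones) and the names `fit_stats` and `heterogeneous_distance` are assumptions made for this example.

```python
import math
from collections import Counter, defaultdict


def fit_stats(rows, labels, nominal):
    """Collect per-attribute statistics from training data.

    rows: list of equal-length tuples; labels: class label per row;
    nominal: set of column indices holding nominal values (assumed known).
    """
    classes = sorted(set(labels))
    counts = {a: defaultdict(Counter) for a in nominal}  # counts[a][value][class]
    sigma = {}
    for a in range(len(rows[0])):
        if a in nominal:
            for row, label in zip(rows, labels):
                counts[a][row[a]][label] += 1
        else:
            values = [row[a] for row in rows]
            mean = sum(values) / len(values)
            var = sum((v - mean) ** 2 for v in values) / len(values)
            sigma[a] = math.sqrt(var) or 1.0  # avoid division by zero
    return counts, sigma, classes


def heterogeneous_distance(x, y, stats, nominal):
    """Distance between two instances with mixed attribute types."""
    counts, sigma, classes = stats
    total = 0.0
    for a, (xa, ya) in enumerate(zip(x, y)):
        if a in nominal:
            nx = sum(counts[a][xa].values())
            ny = sum(counts[a][ya].values())
            if nx == 0 or ny == 0:
                d = 1.0  # value never seen in training: assumed fallback
            else:
                # Compare the class distributions conditioned on each value.
                d = sum(abs(counts[a][xa][c] / nx - counts[a][ya][c] / ny)
                        for c in classes)
        else:
            d = abs(xa - ya) / (4.0 * sigma[a])
        total += d * d
    return math.sqrt(total)
```

A nearest-neighbor classifier could call `heterogeneous_distance` in place of Euclidean distance; two nominal values are then close when they predict similar class distributions, even though they have no numeric ordering.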
Distribution of Value of Time and Ways to Model Value of Time in Long-Range Planning Models
As managed lanes (ML) become more integrated into regional urban networks alongside existing general purpose (GP) lanes, transportation planning agencies increasingly need to quantify the distribution of travelers’ value of time (VOT) in order to predict future travel patterns accurately. Because travelers’ VOT varies with a multitude of factors, this study investigates how the VOT distribution of a region can be determined from existing travel data and how VOT can be modeled effectively using traffic assignment algorithms. In networks with available link volumes and toll data on segments where travelers can choose between staying on the GP lanes and entering an ML facility, a VOT distribution can be inferred by assuming that travelers who enter the ML do so based on a certain “threshold” VOT. When these VOT distributions are modeled, errors are observed in the traffic assignment results both when the continuous VOT distribution is discretized and when varying toll values are assumed to be constant. Specifically, in the context of TransCAD software, link travel time errors appear to be much less significant than flow errors when tested on a nine-node network. Additional experimentation on larger regional networks is needed to verify the significance of these errors and their impact on predicted travel patterns.
Civil, Architectural, and Environmental Engineering
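The threshold reasoning in the abstract above can be illustrated with a small worked example. It assumes, as a simplification not spelled out in the abstract, that a traveler enters the ML exactly when VOT times the time saved is at least the toll, so the threshold VOT is the toll divided by the time saved, and the share of travelers staying on the GP lanes estimates the fraction of the population with VOT below that threshold. The function names and the input tuple layout are hypothetical.

```python
def threshold_vot(toll, gp_minutes, ml_minutes):
    """Threshold VOT in currency units per hour for one tolled segment."""
    saved_hours = (gp_minutes - ml_minutes) / 60.0
    if saved_hours <= 0:
        raise ValueError("the managed lane must save time for a threshold to exist")
    return toll / saved_hours


def vot_cdf_points(observations):
    """Turn segment observations into points on an empirical VOT distribution.

    observations: iterable of (toll, gp_minutes, ml_minutes, gp_volume, ml_volume).
    Returns (threshold VOT, share of travelers below it) pairs sorted by threshold.
    """
    points = []
    for toll, gp_min, ml_min, gp_vol, ml_vol in observations:
        share_below = gp_vol / (gp_vol + ml_vol)  # travelers who declined the toll
        points.append((threshold_vot(toll, gp_min, ml_min), share_below))
    return sorted(points)
```

For instance, a toll of 2.00 that saves 10 minutes implies a threshold of about 12 per hour; if 80% of vehicles stay on the GP lanes at that toll, roughly 80% of travelers would have a VOT below 12 per hour under this assumption.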
Surveying human habit modeling and mining techniques in smart spaces
A smart space is an environment, mainly equipped with Internet-of-Things (IoT) technologies, that is able to provide services to humans, helping them to perform daily tasks by monitoring the space and autonomously executing actions, giving suggestions, and sending alarms. Approaches suggested in the literature may differ in terms of required facilities, possible applications, amount of human intervention required, ability to support multiple users at the same time, and ability to adapt to changing needs. In this paper, we propose a Systematic Literature Review (SLR) that classifies the most influential approaches in the area of smart spaces according to a set of dimensions identified by answering a set of research questions. These dimensions make it possible to choose a specific method or approach according to available sensors, amount of labeled data, need for visual analysis, and requirements in terms of enactment and decision-making on the environment. Additionally, the paper identifies a set of challenges to be addressed by future research in the field.
Adaptive Contract Design for Crowdsourcing Markets: Bandit Algorithms for Repeated Principal-Agent Problems
Crowdsourcing markets have emerged as a popular platform for matching
available workers with tasks to complete. The payment for a particular task is
typically set by the task's requester, and may be adjusted based on the quality
of the completed work, for example, through the use of "bonus" payments. In
this paper, we study the requester's problem of dynamically adjusting
quality-contingent payments for tasks. We consider a multi-round version of the
well-known principal-agent model, whereby in each round a worker makes a
strategic choice of the effort level which is not directly observable by the
requester. In particular, our formulation significantly generalizes the
budget-free online task pricing problems studied in prior work.
We treat this problem as a multi-armed bandit problem, with each "arm"
representing a potential contract. To cope with the large (and in fact,
infinite) number of arms, we propose a new algorithm, AgnosticZooming, which
discretizes the contract space into a finite number of regions, effectively
treating each region as a single arm. This discretization is adaptively
refined, so that more promising regions of the contract space are eventually
discretized more finely. We analyze this algorithm, showing that it achieves
regret sublinear in the time horizon and substantially improves over
non-adaptive discretization (which is the only competing approach in the
literature).
Our results advance the state of the art on several different topics: the theory
of crowdsourcing markets, principal-agent problems, multi-armed bandits, and
dynamic pricing.
Comment: This is the full version of a paper in the ACM Conference on
Economics and Computation (ACM-EC), 201
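The adaptively refined discretization described in the abstract above can be illustrated with a generic sketch. This is not AgnosticZooming itself, whose details the abstract does not give; it is a plain adaptive-discretization bandit over a one-dimensional "contract" in [0, 1], using a UCB-style index and splitting a cell once its confidence radius drops below its width. The class and function names, the splitting rule, and the toy payoff `noisy_peak` are all assumptions made for illustration.

```python
import math
import random


class Cell:
    """One region of the contract space, treated as a single arm."""

    def __init__(self, lo, hi):
        self.lo, self.hi = lo, hi
        self.pulls = 0
        self.total = 0.0

    @property
    def width(self):
        return self.hi - self.lo

    def radius(self, horizon):
        """Confidence radius; infinite until the cell has been tried."""
        if self.pulls == 0:
            return float("inf")
        return math.sqrt(2.0 * math.log(horizon) / self.pulls)

    def index(self, horizon):
        """Optimistic estimate: mean payoff + confidence radius + cell width."""
        if self.pulls == 0:
            return float("inf")
        return self.total / self.pulls + self.radius(horizon) + self.width


def run_adaptive_bandit(payoff, horizon, seed=0):
    """Play `horizon` rounds and return the final list of cells."""
    rng = random.Random(seed)
    cells = [Cell(0.0, 1.0)]
    for _ in range(horizon):
        cell = max(cells, key=lambda c: c.index(horizon))
        x = rng.uniform(cell.lo, cell.hi)  # offer a contract from this region
        cell.pulls += 1
        cell.total += payoff(x, rng)
        # Refine only where enough data has accumulated: promising regions
        # are pulled more often, so they end up discretized more finely.
        if cell.radius(horizon) < cell.width:
            mid = (cell.lo + cell.hi) / 2.0
            cells.remove(cell)
            cells += [Cell(cell.lo, mid), Cell(mid, cell.hi)]
    return cells


def noisy_peak(x, rng):
    """Hypothetical payoff in [0, 1], peaked at x = 0.7, with Gaussian noise."""
    return min(1.0, max(0.0, 1.0 - abs(x - 0.7) + rng.gauss(0.0, 0.1)))
```

Calling `run_adaptive_bandit(noisy_peak, 2000)` returns cells that still partition [0, 1]; a non-adaptive scheme would instead fix a uniform grid up front and spend pulls on every cell regardless of how promising it looks.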