A Parallel and Efficient Algorithm for Learning to Match
Many tasks in data mining and related fields can be formalized as matching
between objects in two heterogeneous domains, including collaborative
filtering, link prediction, image tagging, and web search. Machine learning
techniques, referred to as learning-to-match in this paper, have been
successfully applied to these problems. Among them, a class of state-of-the-art
methods, named feature-based matrix factorization, formalizes the task as an
extension to matrix factorization by incorporating auxiliary features into the
model. Unfortunately, making those algorithms scale to real-world problems is
challenging, and simple parallelization strategies fail due to the complex
cross-talk patterns between sub-tasks. In this paper, we tackle this
challenge with a novel parallel and efficient algorithm for feature-based
matrix factorization. Our algorithm, based on coordinate descent, can easily
handle hundreds of millions of instances and features on a single machine. The
key recipe of this algorithm is an iterative relaxation of the objective to
facilitate parallel updates of parameters, with guaranteed convergence on
minimizing the original objective function. Experimental results demonstrate
that the proposed method is effective on a wide range of matching problems,
with efficiency significantly improved over the baselines while accuracy
remains unchanged.
Comment: 10 pages, short version was published in ICDM 201
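As a rough illustration of the model family the abstract above describes, the sketch below scores one pair by combining per-feature latent vectors on each side and taking their inner product. It is a minimal sketch under stated assumptions: the function names, feature keys, and toy numbers are hypothetical, bias terms are omitted, and it does not reproduce the paper's parallel coordinate-descent training.

```python
# Illustrative sketch only: names and toy numbers are assumptions,
# not taken from the paper.

def combine(features, factors, dim):
    """Return the feature-weighted sum of latent vectors."""
    out = [0.0] * dim
    for name, value in features.items():
        for k, component in enumerate(factors[name]):
            out[k] += value * component
    return out


def score(left_feats, right_feats, left_factors, right_factors, dim):
    """Matching score: inner product of the two combined latent vectors."""
    u = combine(left_feats, left_factors, dim)
    v = combine(right_feats, right_factors, dim)
    return sum(a * b for a, b in zip(u, v))


# Hypothetical latent vectors for features in the two domains.
user_factors = {"user:42": [1.0, 0.0], "age:young": [0.0, 2.0]}
item_factors = {"item:7": [0.5, 0.5], "genre:jazz": [1.0, -1.0]}

s = score(
    {"user:42": 1.0, "age:young": 0.5},
    {"item:7": 1.0, "genre:jazz": 1.0},
    user_factors,
    item_factors,
    dim=2,
)
print(s)
```

With only an identity feature on each side, this reduces to ordinary matrix factorization; the auxiliary features are what make it "feature-based".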
On thermal sensor calibration and software techniques for many-core thermal management
The high power density of a many-core processor results in increased temperature, which negatively impacts system reliability and performance. Dynamic thermal management (DTM) applies thermal-aware techniques at run time to avoid overheating using temperature information collected from on-chip thermal sensors. Temperature sensing and thermal control schemes are two critical technologies for successfully maintaining thermal safety. In this dissertation, on-line thermal sensor calibration schemes are developed to provide accurate temperature information.
Software-based dynamic thermal management techniques are proposed using calibrated thermal sensors. Due to process variation and silicon aging, on-chip thermal sensors require periodic calibration before use in DTM. However, the calibration cost for thermal sensors can be prohibitively high as the number of on-chip sensors increases. Linear models, which are suitable for on-line calculation, are employed to estimate temperatures at multiple sensor locations using performance counters. The estimated temperature and the actual sensor thermal profile show a very high similarity, with a correlation coefficient of ~0.9 for SPLASH-2 and SPEC2000 benchmarks.
A calibration approach is proposed to combine potentially inaccurate temperature values obtained from two sources: thermal sensor readings and temperature estimations. A data fusion strategy based on Bayesian inference, which combines information from these two sources, is demonstrated. The result shows the strategy can effectively recalibrate sensor readings in response to inaccuracies caused by process variation and environmental noise. The average absolute error of the corrected sensor temperature readings is
A dynamic task allocation strategy is proposed to address localized overheating in many-core systems. Our approach employs reinforcement learning, a dynamic machine learning algorithm that performs task allocation based on current temperatures and a prediction regarding which assignment will minimize the peak temperature. Our results show that the proposed technique is fast (scheduling performed in <1 ms) and can efficiently reduce peak temperature by up to 8 °C in a 49-core processor (6% on average) versus a leading competing task allocation approach for a series of SPLASH-2 benchmarks. Reinforcement learning has also been applied to 3D integrated circuits to allocate tasks with thermal awareness.
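To make the idea of temperature-driven allocation concrete, here is a generic tabular Q-learning sketch. It is not the dissertation's algorithm: the state encoding, the reward (negative peak temperature), the function names, and the parameters `alpha`, `gamma`, and `epsilon` are all assumptions chosen for illustration.

```python
import random
from collections import defaultdict


def choose_core(q, state, n_cores, epsilon, rng):
    """Epsilon-greedy choice of the core that receives the next task."""
    if rng.random() < epsilon:
        return rng.randrange(n_cores)  # explore
    return max(range(n_cores), key=lambda core: q[(state, core)])  # exploit


def update(q, state, core, reward, next_state, n_cores, alpha=0.5, gamma=0.9):
    """Standard Q-learning update for one observed allocation."""
    best_next = max(q[(next_state, c)] for c in range(n_cores))
    q[(state, core)] += alpha * (reward + gamma * best_next - q[(state, core)])


q = defaultdict(float)   # Q-values default to 0.0
rng = random.Random(0)
state = 0                # hypothetical discretized thermal state

# Assumed reward: negative peak temperature after placing a task on core 0.
update(q, state, core=0, reward=-70.0, next_state=state, n_cores=3)

# With exploration turned off, the allocator now avoids core 0.
next_core = choose_core(q, state, n_cores=3, epsilon=0.0, rng=rng)
print(q[(state, 0)], next_core)
```

Because each decision is a table lookup over the cores, a scheme of this shape keeps the scheduling step cheap.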
PREFERENCES: OPTIMIZATION, IMPORTANCE LEARNING AND STRATEGIC BEHAVIORS
Preferences are fundamental to decision making and play an important role in artificial intelligence. Our research focuses on three groups of problems based on the preference formalism Answer Set Optimization (ASO): preference aggregation problems such as computing optimal (or near-optimal) solutions, strategic behaviors in preference representation, and learning ranks (weights) for preferences.
In the first group of problems, of interest are optimal outcomes, that is, outcomes that are optimal with respect to the preorder defined by the preference rules. In this work, we consider computational problems concerning optimal outcomes. We propose, implement and study methods to compute an optimal outcome; to compute another optimal outcome once the first one is found; to compute an optimal outcome that is similar to (or, dissimilar from) a given candidate outcome; and to compute a set of optimal answer sets each significantly different from the others. For the decision version of several of these problems we establish their computational complexity.
For the second topic, strategic behaviors such as manipulation and bribery have received much attention from the social choice community. We study these concepts for preference formalisms that identify a set of optimal outcomes rather than a single winning outcome, the case common to social choice. Such preference formalisms are of interest in the context of combinatorial domains, where preference representations are only approximations to true preferences, and seeking a single optimal outcome runs a risk of missing the one which is optimal with respect to the actual preferences. In this work, we assume that preferences may be ranked (differ in importance), and we use the Pareto principle adjusted to the case of ranked preferences as the preference aggregation rule. For two important classes of preferences, representing the extreme ends of the spectrum, we provide characterizations of situations when manipulation and bribery are possible, and establish the complexity of deciding whether they are.
Finally, we study the problem of learning the importance of individual preferences in preference profiles aggregated by the ranked Pareto rule or positional scoring rules. We provide a polynomial-time algorithm that finds a ranking of preferences such that the ranked profile correctly decides all the examples, whenever such a ranking exists. We also show that the problem of learning a ranking that maximizes the number of correctly decided examples is NP-hard. We obtain similar results for the case of weighted profiles.
Advances in knowledge discovery and data mining Part II
19th Pacific-Asia Conference, PAKDD 2015, Ho Chi Minh City, Vietnam, May 19-22, 2015, Proceedings, Part II