Description of the Chinese-to-Spanish rule-based machine translation system developed with a hybrid combination of human annotation and statistical techniques
Two of the most popular Machine Translation (MT) paradigms are rule-based (RBMT) and corpus-based, the latter including statistical systems (SMT). When parallel corpora are scarce, RBMT becomes particularly attractive. This is the case for the Chinese-Spanish language pair.
This article presents the first RBMT system for Chinese to Spanish. We describe a hybrid method for constructing this system taking advantage of available resources such as parallel corpora that are used to extract dictionaries and lexical and structural transfer rules.
The final system is freely available online and open source. Although performance lags behind standard SMT systems for an in-domain test set, the results show that the RBMT's coverage is competitive and it outperforms the SMT system in an out-of-domain test set. This RBMT system is available to the general public, it can be further enhanced, and it opens up the possibility of creating future hybrid MT systems.
Peer Reviewed. Postprint (author's final draft)
Mining top-k granular association rules for recommendation
Recommender systems are important for e-commerce companies as well as
researchers. Recently, granular association rules have been proposed for
cold-start recommendation. However, existing approaches retain only globally
strong rules; therefore some users may receive no recommendation at all. In
this paper, we propose to mine the top-k granular association rules for each
user. First we define three measures of granular association rules. These are
the source coverage which measures the user granule size, the target coverage
which measures the item granule size, and the confidence which measures the
strength of the association. With the confidence measure, rules can be ranked
according to their strength. Then we propose algorithms for training the
recommender and suggesting items to each user. Experiments are undertaken on the
publicly available MovieLens data set. Results indicate that an appropriate
granule setting can avoid over-fitting and, at the same time, help obtain
high recommendation accuracy.
Comment: 12 pages, 5 figures, submitted to Advances in Granular Computing and
Advances in Rough Sets, 2013. arXiv admin note: substantial text overlap with
arXiv:1305.137
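The three measures named in this abstract can be illustrated with a minimal Python sketch. Everything below is an assumption made for illustration, not the paper's definitions or algorithms: granules are taken to be sets of users or items sharing one attribute value, confidence is taken to be the fraction of user-item pairs in the two granules that actually occur in the rating data, and the names `granule`, `rule_measures`, `top_k_rules` and the toy data are hypothetical.

```python
def granule(objects, attr, value):
    """Return the ids of all objects whose attribute `attr` equals `value`."""
    return {oid for oid, attrs in objects.items() if attrs.get(attr) == value}


def rule_measures(users, items, ratings, source, target):
    """Compute (source coverage, target coverage, confidence) for one rule.

    `source` and `target` are (attribute, value) conditions on users and items.
    Confidence here is an assumed definition: the share of (user, item) pairs
    drawn from the two granules that appear in `ratings`.
    """
    src = granule(users, *source)
    tgt = granule(items, *target)
    source_coverage = len(src) / len(users)   # relative user granule size
    target_coverage = len(tgt) / len(items)   # relative item granule size
    pairs = len(src) * len(tgt)
    hits = sum((u, i) in ratings for u in src for i in tgt)
    confidence = hits / pairs if pairs else 0.0
    return source_coverage, target_coverage, confidence


def top_k_rules(user_id, users, items, ratings, k):
    """Rank the rules whose source granule contains `user_id` by confidence."""
    item_conditions = sorted(
        {(a, v) for attrs in items.values() for a, v in attrs.items()}
    )
    candidates = []
    for source in sorted(users[user_id].items()):
        for target in item_conditions:
            _, _, conf = rule_measures(users, items, ratings, source, target)
            candidates.append((conf, source, target))
    candidates.sort(key=lambda rule: -rule[0])  # strongest rules first
    return candidates[:k]


# Hypothetical toy data: three users, three items, four observed ratings.
users = {"u1": {"gender": "F"}, "u2": {"gender": "F"}, "u3": {"gender": "M"}}
items = {"i1": {"genre": "drama"}, "i2": {"genre": "drama"}, "i3": {"genre": "action"}}
ratings = {("u1", "i1"), ("u2", "i1"), ("u2", "i2"), ("u3", "i3")}

measures = rule_measures(users, items, ratings, ("gender", "F"), ("genre", "drama"))
best_for_u1 = top_k_rules("u1", users, items, ratings, k=1)
```

Because rules are ranked per user rather than filtered by a global threshold, every user whose attributes match at least one source granule receives some rule, which is the motivation the abstract gives for the top-k formulation.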
When Social Influence Meets Item Inference
Research issues and data mining techniques for product recommendation and
viral marketing have been widely studied. Existing works on seed selection in
social networks do not take into account the effect of product recommendations
in e-commerce stores. In this paper, we investigate the seed selection problem
for viral marketing that considers the effects of both social influence and item
inference (for product recommendation). We develop a new model, the Social Item
Graph (SIG), that captures both effects in the form of hyperedges. Accordingly, we
formulate a seed selection problem, called the Social Item Maximization Problem
(SIMP), and prove the hardness of SIMP. We design an efficient algorithm with a
performance guarantee, called Hyperedge-Aware Greedy (HAG), for SIMP and
develop a new index structure, called SIG-index, to accelerate the computation
of the diffusion process in HAG. Moreover, to construct realistic SIG models for
SIMP, we develop a statistical inference based framework to learn the weights
of hyperedges from data. Finally, we perform a comprehensive evaluation of our
proposals against various baselines. The experimental results validate our ideas and
demonstrate the effectiveness and efficiency of the proposed model and
algorithms over the baselines.
Comment: 12 page
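To make the idea of seed selection over hyperedges concrete, here is a minimal Python sketch. It is a plain greedy heuristic on a deterministic simplification, not the paper's HAG algorithm or SIG-index: nodes are assumed to be (user, item) pairs, each hyperedge is assumed to fire with certainty once all of its source nodes are active (the learned hyperedge weights are ignored), and the names `spread`, `greedy_seeds` and the toy graph are hypothetical.

```python
def spread(seeds, hyperedges):
    """Return every node activated from `seeds`.

    `hyperedges` is a list of (sources, target) pairs, where `sources` is a
    frozenset of nodes. Assumption: an edge fires deterministically as soon
    as all of its sources are active.
    """
    active = set(seeds)
    changed = True
    while changed:
        changed = False
        for sources, target in hyperedges:
            if target not in active and sources <= active:
                active.add(target)
                changed = True
    return active


def greedy_seeds(nodes, hyperedges, budget):
    """Pick up to `budget` seeds, each time adding the node with the largest
    marginal gain in activated nodes. Stops early when no node adds anything."""
    seeds = []
    for _ in range(budget):
        current = len(spread(seeds, hyperedges))
        best, best_gain = None, 0
        for node in sorted(nodes):
            if node in seeds:
                continue
            gain = len(spread(seeds + [node], hyperedges)) - current
            if gain > best_gain:
                best, best_gain = node, gain
        if best is None:
            break
        seeds.append(best)
    return seeds


# Hypothetical toy graph: nodes are (user, item) pairs.
nodes = [("a", "phone"), ("a", "case"), ("b", "phone"), ("b", "case"), ("c", "phone")]
hyperedges = [
    (frozenset({("a", "phone")}), ("b", "phone")),                # social influence
    (frozenset({("a", "phone")}), ("a", "case")),                 # item inference
    (frozenset({("b", "phone"), ("a", "case")}), ("b", "case")),  # both effects combined
    (frozenset({("b", "phone")}), ("c", "phone")),                # social influence
]

chosen = greedy_seeds(nodes, hyperedges, budget=2)
reached = spread(chosen, hyperedges)
```

The third hyperedge shows why ordinary pairwise edges are not enough: `("b", "case")` is activated only when a friend's purchase and a related item are both present, which is the kind of joint effect the abstract attributes to hyperedges.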