64,925 research outputs found
A Shared Task on Bandit Learning for Machine Translation
We introduce and describe the results of a novel shared task on bandit
learning for machine translation. The task was organized jointly by Amazon and
Heidelberg University for the first time at the Second Conference on Machine
Translation (WMT 2017). The goal of the task is to encourage research on
learning machine translation from weak user feedback instead of human
references or post-edits. On each of a sequence of rounds, a machine
translation system is required to propose a translation for an input, and
receives a real-valued estimate of the quality of the proposed translation for
learning. This paper describes the shared task's learning and evaluation setup,
using services hosted on Amazon Web Services (AWS), the data and evaluation
metrics, and the results of various machine translation architectures and
learning protocols.Comment: Conference on Machine Translation (WMT) 201
The step project:societal and political engagement of young people in environmental issues
Decisions on environmental topics taken today are going to have long-term consequences that will affect future generations. Young people will have to live with the consequences of these decisions and undertake special responsibilities. Moreover, as tomorrowās decision makers, they themselves should learn how to negotiate and debate issues before final decisions are made. Therefore, any participation they can have in environmental decision making processes will prove essential in developing a sustainable future for the community.However, recent data indicate that the young distance themselves from community affairs, mainly because the procedures involved are āwoodenā, politiciansā discourse alienates the young and the whole experience is too formalized to them. Authorities are aware of this fact and try to establish communication channels to ensure transparency and use a language that speaks to new generations of citizens. This is where STEP project comes in.STEP (www.step4youth.eu) is a digital Platform (web/mobile) enabling youth Societal and Political e-Participation in decision-making procedures concerning environmental issues. STEP is enhanced with web/social media mining, gamification, machine translation, and visualisation features.Six pilots in real contexts are being organised for the deployment of the STEP solution in 4 European Countries: Italy, Spain, Greece, and Turkey. Pilots are implemented with the direct participation of one regional authority, four municipalities, and one association of municipalities, and include decision-making procedures on significant environmental questions.</p
Model-Based Mitigation of Availability Risks
The assessment and mitigation of risks related to the availability of the IT infrastructure is becoming increasingly important in modern organizations. Unfortunately, present standards for Risk Assessment and Mitigation show limitations when evaluating and mitigating availability risks. This is due to the fact that they do not fully consider the dependencies between the constituents of an IT infrastructure that are paramount in large enterprises. These dependencies make the technical problem of assessing availability issues very challenging. In this paper we define a method and a tool for carrying out a Risk Mitigation activity which allows to assess the global impact of a set of risks and to choose the best set of countermeasures to cope with them. To this end, the presence of a tool is necessary due to the high complexity of the assessment problem. Our approach can be integrated in present Risk Management methodologies (e.g. COBIT) to provide a more precise Risk Mitigation activity. We substantiate the viability of this approach by showing that most of the input required by the tool is available as part of a standard business continuity plan, and/or by performing a common tool-assisted Risk Management
Fighting Authorship Linkability with Crowdsourcing
Massive amounts of contributed content -- including traditional literature,
blogs, music, videos, reviews and tweets -- are available on the Internet
today, with authors numbering in many millions. Textual information, such as
product or service reviews, is an important and increasingly popular type of
content that is being used as a foundation of many trendy community-based
reviewing sites, such as TripAdvisor and Yelp. Some recent results have shown
that, due partly to their specialized/topical nature, sets of reviews authored
by the same person are readily linkable based on simple stylometric features.
In practice, this means that individuals who author more than a few reviews
under different accounts (whether within one site or across multiple sites) can
be linked, which represents a significant loss of privacy.
In this paper, we start by showing that the problem is actually worse than
previously believed. We then explore ways to mitigate authorship linkability in
community-based reviewing. We first attempt to harness the global power of
crowdsourcing by engaging random strangers into the process of re-writing
reviews. As our empirical results (obtained from Amazon Mechanical Turk)
clearly demonstrate, crowdsourcing yields impressively sensible reviews that
reflect sufficiently different stylometric characteristics such that prior
stylometric linkability techniques become largely ineffective. We also consider
using machine translation to automatically re-write reviews. Contrary to what
was previously believed, our results show that translation decreases authorship
linkability as the number of intermediate languages grows. Finally, we explore
the combination of crowdsourcing and machine translation and report on the
results
An Investigation on Text-Based Cross-Language Picture Retrieval Effectiveness through the Analysis of User Queries
Purpose: This paper describes a study of the queries generated from a user experiment for cross-language information retrieval (CLIR) from a historic image archive. Italian speaking users generated 618 queries for a set of known-item search tasks. The queries generated by userās interaction with the system have been analysed and the results used to suggest recommendations for the future development of cross-language retrieval systems for digital image libraries.
Methodology: A controlled lab-based user study was carried out using a prototype Italian-English image retrieval system. Participants were asked to carry out searches for 16 images provided to them, a known-item search task. Userās interactions with the system were recorded and queries were analysed manually quantitatively and qualitatively.
Findings: Results highlight the diversity in requests for similar visual content and the weaknesses of Machine Translation for query translation. Through the manual translation of queries we show the benefits of using high-quality translation resources. The results show the individual characteristics of userās whilst performing known-item searches and the overlap obtained between query terms and structured image captions, highlighting the use of userās search terms for objects within the foreground of an image.
Limitations and Implications: This research looks in-depth into one case of interaction and one image repository. Despite this limitation, the discussed results are likely to be valid across other languages and image repository.
Value: The growing quantity of digital visual material in digital libraries offers the potential to apply techniques from CLIR to provide cross-language information access services. However, to develop effective systems requires studying userās search behaviours, particularly in digital image libraries. The value of this paper is in the provision of empirical evidence to support recommendations for effective cross-language image retrieval system design.</p
Automatic Quality Estimation for ASR System Combination
Recognizer Output Voting Error Reduction (ROVER) has been widely used for
system combination in automatic speech recognition (ASR). In order to select
the most appropriate words to insert at each position in the output
transcriptions, some ROVER extensions rely on critical information such as
confidence scores and other ASR decoder features. This information, which is
not always available, highly depends on the decoding process and sometimes
tends to over estimate the real quality of the recognized words. In this paper
we propose a novel variant of ROVER that takes advantage of ASR quality
estimation (QE) for ranking the transcriptions at "segment level" instead of:
i) relying on confidence scores, or ii) feeding ROVER with randomly ordered
hypotheses. We first introduce an effective set of features to compensate for
the absence of ASR decoder information. Then, we apply QE techniques to perform
accurate hypothesis ranking at segment-level before starting the fusion
process. The evaluation is carried out on two different tasks, in which we
respectively combine hypotheses coming from independent ASR systems and
multi-microphone recordings. In both tasks, it is assumed that the ASR decoder
information is not available. The proposed approach significantly outperforms
standard ROVER and it is competitive with two strong oracles that e xploit
prior knowledge about the real quality of the hypotheses to be combined.
Compared to standard ROVER, the abs olute WER improvements in the two
evaluation scenarios range from 0.5% to 7.3%
- ā¦