Fuzzy matching in translation memories (TM) is mostly string-based in current CAT tools.
These tools look for TM sentences highly similar to an input sentence, using edit distance to
detect the differences between sentences. Current CAT tools use limited or no linguistic
knowledge in this procedure. In the recently started SCATE project, which aims at improving
translators’ efficiency, we apply syntactic fuzzy matching in order to detect abstract similarities
and to increase the number of fuzzy matches. We parse TM sentences in order to create
hierarchical structures identifying constituents and/or dependencies. We calculate TER
(Translation Error Rate) between an existing human translation of an input sentence and the
translation of its fuzzy match in the TM. This allows us to assess the usefulness of syntactic
matching with respect to string-based matching. First results hint at the potential of syntactic
matching to lower TER for sentences with a low match score in a string-based setting.
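
As an illustration of the string-based baseline described above, the following is a minimal sketch of fuzzy matching with word-level edit distance. It is not the implementation used in any particular CAT tool or in the SCATE project: the function names (`edit_distance`, `fuzzy_score`, `best_match`), the whitespace tokenisation, and the normalisation by the longer sentence length are all assumptions made for this example.

```python
def edit_distance(a, b):
    """Word-level Levenshtein distance between two token lists."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, tok_a in enumerate(a, start=1):
        curr = [i]  # deleting the first i tokens of a
        for j, tok_b in enumerate(b, start=1):
            cost = 0 if tok_a == tok_b else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def fuzzy_score(source, candidate):
    """Similarity in [0, 1]; 1.0 means identical token sequences.

    Assumption: normalise by the longer sentence; real tools differ.
    """
    a, b = source.split(), candidate.split()
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - edit_distance(a, b) / longest


def best_match(source, tm_sentences):
    """Return the TM sentence with the highest fuzzy score."""
    return max(tm_sentences, key=lambda s: fuzzy_score(source, s))
```

A score computed this way only reflects surface overlap: two sentences with the same syntactic structure but different words receive a low score, which is the kind of case syntactic matching is meant to recover.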