Abstract
Introduction
The new features of the IA-64 multimedia instructions make it difficult to abstract the semantics of machine instructions into higher-level intermediate representations, a step that is important in binary translation [1, 2]. This paper presents an effective approach to the problem that adapts existing techniques to the characteristics of the IA-64 multimedia instruction set [3]. The techniques described here are suitable not only for binary translators but also for decompilers.
New features of IA-64 multimedia instructions
A new set of multimedia instructions introduced in the IA-64 architecture implements SIMD (Single Instruction, Multiple Data) parallelism, which yields a significant performance improvement in multimedia applications [3]. The IA-64 SIMD extensions provide new data representations and an enhanced instruction set to improve application performance.
In IA-64, an integer multimedia instruction operates on subwords of 1, 2, or 4 bytes, so multiple subwords can be accommodated in a single 64-bit register.
The extensions also include new floating-point instructions, which operate on two single-precision floating-point values packed in bits <63:32> and <31:0> of the 82-bit floating-point registers [3]. The SIMD instructions significantly accelerate 3D graphics and video encoding and decoding.
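The subword layout described above can be illustrated with a small Python sketch. It is only a model of the data representation, not part of the translator: the helper names `split_subwords` and `join_subwords` are hypothetical, and the sketch assumes subwords are numbered from the least significant end of the register.

```python
REGISTER_BITS = 64  # width of an IA-64 general register

def split_subwords(reg, width_bytes):
    """Split a 64-bit value into subwords, least significant first (assumed order)."""
    if width_bytes not in (1, 2, 4):
        raise ValueError("subword width must be 1, 2 or 4 bytes")
    bits = 8 * width_bytes
    mask = (1 << bits) - 1
    return [(reg >> shift) & mask for shift in range(0, REGISTER_BITS, bits)]

def join_subwords(subwords, width_bytes):
    """Inverse of split_subwords: pack subwords back into one 64-bit value."""
    bits = 8 * width_bytes
    mask = (1 << bits) - 1
    reg = 0
    for index, subword in enumerate(subwords):
        reg |= (subword & mask) << (index * bits)
    return reg

if __name__ == "__main__":
    reg = 0x0123456789ABCDEF
    for width in (1, 2, 4):
        parts = split_subwords(reg, width)
        print(width, len(parts), [hex(p) for p in parts])
```

A 64-bit register therefore holds eight 1-byte, four 2-byte, or two 4-byte subwords, and splitting followed by joining returns the original value.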
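The packed floating-point format can be modelled in the same spirit. The sketch below packs two single-precision values into bits <63:32> and <31:0> of a 64-bit field and adds two such pairs element-wise. The function names are hypothetical, the remaining bits of the 82-bit register are ignored, and the element-wise add is a simplified stand-in for the actual IA-64 parallel floating-point instructions, whose rounding and exception behaviour is not modelled.

```python
import struct

def pack_fp_pair(hi, lo):
    """Pack two single-precision values into bits <63:32> (hi) and <31:0> (lo)."""
    hi_bits, = struct.unpack("<I", struct.pack("<f", hi))
    lo_bits, = struct.unpack("<I", struct.pack("<f", lo))
    return (hi_bits << 32) | lo_bits

def unpack_fp_pair(word):
    """Recover the (hi, lo) single-precision values from a packed 64-bit field."""
    hi, = struct.unpack("<f", struct.pack("<I", (word >> 32) & 0xFFFFFFFF))
    lo, = struct.unpack("<f", struct.pack("<I", word & 0xFFFFFFFF))
    return hi, lo

def parallel_fp_add(a, b):
    """Add the two lanes independently (illustrative operation, an assumption)."""
    a_hi, a_lo = unpack_fp_pair(a)
    b_hi, b_lo = unpack_fp_pair(b)
    # Re-packing rounds each lane's sum back to single precision.
    return pack_fp_pair(a_hi + b_hi, a_lo + b_lo)

if __name__ == "__main__":
    x = pack_fp_pair(1.5, 2.0)
    y = pack_fp_pair(0.25, -4.0)
    print(hex(x), unpack_fp_pair(parallel_fp_add(x, y)))
```

Each lane is processed independently, which is the property a semantic description of these instructions has to capture.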
Semantic abstraction of IA-64 multimedia instructions
The UQBT (University of Queensland Binary Translator) project developed a simple and parsable Semantic Specification Language (SSL) [1, 2]. The syntax of SSL is defined in EBNF (Extended Backus-Naur Form), and its semantics are described in natural language together with examples from the 80286 and SPARC architectures. However, SSL is limited in its ability to model features of other architectures.
To overcome this problem, we study the characteristics of the IA-64 instruction set and the semantic specification technology of UQBT. Our approach improves on SSL in two respects. First, we modify and extend the syntax of SSL according to the new features of the IA-64 architecture. Second, because the semantics of some multimedia instructions are too complicated to express directly, we add new algorithms so that they can be abstracted effectively.
Semantic abstraction

An optimal algorithm
As an example, we present a practical method for optimizing, at the instruction level, the code of the motion estimation algorithm used in real-time MPEG-2 video encoding. Motion estimation exposes a great deal of parallelism in its 16*16 block matching, which we exploit by replacing groups of instructions with a single multimedia instruction that performs the same operation.
Taking the padd4 instruction as an example, we present the semantic abstraction of integer multimedia instructions using the extended SSL. The padd4 instruction is the four-byte, modulo form of the paddn instruction. The sets of subwords from the two source operands are added in parallel, and the results are placed in the destination register. The following fragment of SSL abstracts the semantics of the padd4 instruction, which operates on two 32-bit subwords in parallel:
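Independently of the SSL notation, the behaviour being abstracted can be sketched in Python. This is an illustrative model under stated assumptions rather than the extended SSL itself: the function name `padd4` and the parameter names `r2` and `r3` are chosen here for readability, and modulo form is taken to mean that each 32-bit sum wraps around without carrying into the neighbouring subword.

```python
MASK32 = 0xFFFFFFFF  # one 4-byte subword

def padd4(r2, r3):
    """Model of a four-byte, modulo-form parallel add on 64-bit operands."""
    # Low subword: bits <31:0>, wrapped to 32 bits.
    lo = ((r2 & MASK32) + (r3 & MASK32)) & MASK32
    # High subword: bits <63:32>, added independently of the low subword.
    hi = (((r2 >> 32) & MASK32) + ((r3 >> 32) & MASK32)) & MASK32
    return (hi << 32) | lo

if __name__ == "__main__":
    a = 0x00000001FFFFFFFF
    b = 0x0000000200000001
    print(hex(padd4(a, b)))                    # lanes added separately
    print(hex((a + b) & 0xFFFFFFFFFFFFFFFF))   # ordinary 64-bit add, for contrast
```

The contrast with an ordinary 64-bit addition shows why a specification of such instructions needs per-subword semantics: in the parallel add, the overflow of the low subword is discarded instead of propagating into the high subword.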
Implementation and preliminary results
We have implemented our techniques in I2A, a static binary translator from IA-64 to Alpha. The translator lifts IA-64 executable programs in ELF (Executable and Linking Format) binary format to C code, and the resulting C files are then converted to machine code on Alpha. The source binary programs are compiled with gcc 2.95.
Conclusions and Future work
In this paper, a semantic abstraction approach for IA-64 multimedia instructions has been presented. Future work includes developing a more compact and optimized model to reduce the size of the intermediate representation, and evaluating the techniques on more multimedia applications.
