Implementation of IP Core of Fast Sine and Cosine Operation through FPGA  by Shang, Yalei
Energy Procedia 16 (2012) 1253 – 1258
1876-6102 © 2011 Published by Elsevier B.V. Selection and/or peer-review under responsibility of International Materials Science Society.
doi:10.1016/j.egypro.2012.01.200
 
Available online at www.sciencedirect.com
 
2012 International Conference on Future Energy, Environment, and Materials 
Implementation of IP Core of Fast Sine and Cosine Operation 
through FPGA 
Yalei Shang 
Hebi College of Vocation and Technology, Electronic Information Engineering Department. Hebi, China, 458030 
 
Abstract 
Based on an analysis of the principle of the CORDIC algorithm, a pipelined CORDIC structure operating in
rotation mode is designed. An IP core that calculates the sine and cosine of angles from 0° to 360° is then
implemented in the hardware description language VHDL in the Quartus II software environment.
 
 
Keywords: two-dimensional modeling, CPLD/FPGA, CORDIC algorithm structure
1. Design requirements:  
Evaluating trigonometric functions is an important operation in scientific computing and engineering
design. It arises in trigonometry, the solution of quadratic equations, two-dimensional modeling, error
calculation, numerical analysis, probability and statistics, image processing, and other fields. The
computation is complex and significantly slower than other arithmetic operations, and it is particularly
difficult to implement in hardware. For this reason, researchers have long sought an algorithm that is well
suited to hardware implementation in order to speed up trigonometric evaluation. The CORDIC
algorithm [1] is currently regarded as a relatively ideal choice.
At present, hardware implementations of algorithms rely mainly on complex programmable logic devices
and field-programmable gate arrays (CPLD/FPGA). Arithmetic speed and occupied chip area are the main
technical indicators of the quality of an implementation. Although the CORDIC algorithm already
evaluates trigonometric functions in hardware satisfactorily, its complexity still leaves some room for
improvement: the arithmetic speed of trigonometric evaluation can be increased and the occupied chip
area reduced. The work in this paper therefore has both theoretical significance and practical
application value.
Open access under CC BY-NC-ND license.
1.1. CORDIC algorithm principle in rotate mode 
    The basic idea of the CORDIC algorithm is to approximate the required rotation angle by a series of
fixed elementary rotations whose angles depend on the number base, so that only shifts, additions, and
subtractions are needed. Depending on the mode in which the algorithm is run (for example rotation mode,
hyperbolic mode, or linear mode), it can compute multiplication, division, sine, cosine, square root,
arctangent, vector rotation (that is, complex multiplication), exponentials, and so on.
Let a vector (x1, y1) be rotated by an angle θ to give a new vector (x2, y2), as shown in Figure 2-1.
By the rules of coordinate transformation, the two vectors are related by:
       
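$$
\begin{bmatrix} x_2 \\ y_2 \end{bmatrix}
= \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}
\begin{bmatrix} x_1 \\ y_1 \end{bmatrix}
= \cos\theta \begin{bmatrix} 1 & -\tan\theta \\ \tan\theta & 1 \end{bmatrix}
\begin{bmatrix} x_1 \\ y_1 \end{bmatrix}
\tag{2-1}
$$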
 
Figure 2-1 
Decompose the rotation angle θ into a sum of N successively smaller rotation angles, namely
$\theta = \sum_{i=0}^{N-1} \delta_i \theta_i$ with $\theta_i \ge 0$. When the rotation by $\theta_i$ is
clockwise, $\delta_i = -1$; when it is counterclockwise, $\delta_i = 1$. Choose
$\theta_i = \arctan 2^{-i}$ for $i = 0, 1, 2, \ldots, N-1$, so that $\tan\theta_i = 2^{-i}$. If the vector
$(x_n, y_n)$ lies at angle θ, it can be reached from the starting vector $(x_0, 0)$ in N rotation steps:
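$$
\begin{bmatrix} x_n \\ y_n \end{bmatrix}
= \left( \prod_{i=0}^{N-1} \cos\theta_i
\begin{bmatrix} 1 & -\delta_i 2^{-i} \\ \delta_i 2^{-i} & 1 \end{bmatrix} \right)
\begin{bmatrix} x_0 \\ 0 \end{bmatrix}
= \prod_{i=0}^{N-1} \cos\theta_i \cdot
\left( \prod_{i=0}^{N-1}
\begin{bmatrix} 1 & -\delta_i 2^{-i} \\ \delta_i 2^{-i} & 1 \end{bmatrix} \right)
\begin{bmatrix} x_0 \\ 0 \end{bmatrix}
\tag{2-2}
$$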
Here, let $K = \prod_{i=0}^{N-1} \cos\theta_i$; K is the scale correction factor. As N→∞, K→0.607253,
so K can be treated as a constant. With the factor K applied separately, each rotation step is:
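$$
\begin{aligned}
x_{i+1} &= x_i - \delta_i\, y_i\, 2^{-i} \\
y_{i+1} &= y_i + \delta_i\, x_i\, 2^{-i}, \qquad i = 0, 1, 2, \ldots, N-1 \\
\theta_{i+1} &= \theta_i - \delta_i \arctan\left(2^{-i}\right)
\end{aligned}
\tag{2-3}
$$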
Formula (2-3) involves only right shifts, additions, and subtractions, which makes it particularly
suitable for implementation on an FPGA. For each step, arctan2^(-i) is a constant that can be computed in
advance and fetched from a lookup table.
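Writing $z_i$ for the residual angle (the third line of formula 2-3) and starting from $z_0 = \theta$,
$x_0 = K$, $y_0 = 0$, the results after N iterations tend to:
$$
\begin{aligned}
x_n &\to \frac{1}{K}\left(x_0 \cos z_0 - y_0 \sin z_0\right) = \cos\theta \\
y_n &\to \frac{1}{K}\left(y_0 \cos z_0 + x_0 \sin z_0\right) = \sin\theta \\
z_n &\to 0
\end{aligned}
\tag{2-4}
$$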
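As an illustrative sketch (not the author's VHDL), iteration (2-3) and the limits in (2-4) can be modelled in floating-point Python. The function names `scale_factor` and `cordic_cos_sin`, the default of 16 levels, and the use of a multiplication by 2^-i in place of a hardware right shift are assumptions made for illustration only.

```python
import math

def scale_factor(n_levels):
    """K = product of cos(arctan(2^-i)) for i = 0 .. n_levels-1."""
    k = 1.0
    for i in range(n_levels):
        k *= math.cos(math.atan(2.0 ** -i))
    return k

def cordic_cos_sin(theta, n_levels=16):
    """Return (cos(theta), sin(theta)) using rotation-mode CORDIC.

    theta is in radians and must lie inside the convergence range of the
    iteration (about +/-99.88 degrees).
    """
    # Starting from (K, 0) absorbs the 1/K gain of the micro-rotations.
    x, y, z = scale_factor(n_levels), 0.0, theta
    for i in range(n_levels):
        delta = 1.0 if z >= 0.0 else -1.0   # rotate so that z moves toward 0
        shift = 2.0 ** -i                   # a right shift by i bits in hardware
        x, y = x - delta * y * shift, y + delta * x * shift
        z -= delta * math.atan(shift)       # table constant arctan(2^-i)
    return x, y
```

The residual angle after N levels is bounded by arctan(2^-(N-1)), so each extra level adds roughly one bit of accuracy.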
1.2. CORDIC design of pipeline organization
Rotation-mode CORDIC can be implemented with either an iterative structure or a pipeline structure.
The iterative scheme needs only one set of shift and add/subtract units, so its hardware cost is low.
However, it is relatively difficult to control, it takes several clock cycles to complete one CORDIC
operation, and its intermediate results must be latched between iterations, so it struggles to meet
high-speed requirements. In the pipeline scheme, each iteration level has its own set of operation
units. Compared with the iterative (feedback) scheme, the pipeline achieves a much higher processing
speed: once the pipeline is full, one set of results is produced in every clock cycle, and control is simple.
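The throughput difference can be illustrated with a small cycle-level model, offered here as a rough sketch rather than a description of the actual hardware. The names `stage` and `run_pipeline`, the 12-level depth, and the use of `None` to mark an empty pipeline register are assumptions. Each pass of the `while` loop stands for one clock edge.

```python
import math

N_LEVELS = 12
ANGLES = [math.atan(2.0 ** -i) for i in range(N_LEVELS)]
K = 1.0
for _a in ANGLES:
    K *= math.cos(_a)

def stage(i, x, y, z):
    """One CORDIC micro-rotation: the work done by pipeline level i."""
    delta = 1.0 if z >= 0.0 else -1.0
    return (x - delta * y * 2.0 ** -i,
            y + delta * x * 2.0 ** -i,
            z - delta * ANGLES[i])

def run_pipeline(thetas):
    """Clock a stream of angles (radians) through N_LEVELS registered stages.

    Returns a list of (cycle, cos, sin) in the order results leave the pipe.
    """
    regs = [None] * N_LEVELS      # regs[i] holds the registered output of level i
    pending = list(thetas)
    results = []
    cycle = 0
    while pending or any(r is not None for r in regs):
        cycle += 1
        new_input = (K, 0.0, pending.pop(0)) if pending else None
        # Update from the last level backwards so that each level reads the
        # value its predecessor held before this clock edge.
        for i in range(N_LEVELS - 1, -1, -1):
            src = new_input if i == 0 else regs[i - 1]
            regs[i] = stage(i, *src) if src is not None else None
        if regs[-1] is not None:
            results.append((cycle, regs[-1][0], regs[-1][1]))
    return results
```

In this model the first result appears after 12 cycles and a new result follows on every subsequent cycle, whereas an iterative structure reusing a single stage would need about 12 cycles for each result.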
 
Figure 3-2 CORDIC of iterative structure 
 
Figure 3-3 CORDIC of pipeline organization 
2. The implementation of CORDIC based on FPGA 
With the data width set to 16 bits, the input and output signals of each pipeline level of the CORDIC
operation are defined in Table 4-1:
Table 4-1 Definitions of input and output signals
| Name | Property | Function |
|---|---|---|
| x_i[15..0] | Input | 16-bit cosine; the top bit is the sign bit |
| y_i[15..0] | Input | 16-bit sine; the top bit is the sign bit |
| z_i[15..0] | Input | 16-bit angle value; z[15] is the sign bit, z[14..12] is the integer part of the angle value (radians, range 0 to 2π), z[11..0] is the fractional part of the angle value |
| n[3..0] | Input | Current level of the pipeline operation |
| Sign_i | Input | Sign of the angle z_i output by the previous level (the input of this level) |
| x_o[15..0] | Output | 16-bit cosine; the top bit is the sign bit |
| y_o[15..0] | Output | 16-bit sine; the top bit is the sign bit |
| z_o[15..0] | Output | 16-bit angle value; z[15] is the sign bit, z[14..12] is the integer part of the angle value (radians, range 0 to 2π), z[11..0] is the fractional part of the angle value |
| m[3..0] | Output | Next level of the pipeline operation |
| Sign_o | Output | Sign of the current output angle z_o |
 
In the Quartus II software environment, as shown in Figure 4-1, the right-shift module shift_r is written
in VHDL, and the LPM_add_sub module is instantiated to perform addition and subtraction. The module
constant_z outputs the constant arctan2^(-i) required by each pipeline level. The module add_i increments
the level number by 1 and passes it to the next level. The module sign determines the sign of the output
angle z_o of the current level.
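A behavioural model of one pipeline level, written in Python rather than VHDL, may help make the data flow concrete. It is only a sketch under several assumptions that the paper does not state: x and y use the same 12 fractional bits as the angle, `sign_i = 1` means that z_i is negative, the table constant is truncated, and the adders wrap to 16 bits. The helper `to_s16` is hypothetical. The other function names mirror the modules named above.

```python
import math

FRAC_BITS = 12  # z[11..0] is the fractional part (Table 4-1)

def to_s16(v):
    """Wrap an integer to signed 16 bits, as a 16-bit adder would (assumed)."""
    v &= 0xFFFF
    return v - 0x10000 if v & 0x8000 else v

def constant_z(n):
    """arctan(2^-n) as a fixed-point word, truncated (assumed rounding mode)."""
    return int(math.atan(2.0 ** -n) * (1 << FRAC_BITS))

def shift_r(value, n):
    """Arithmetic right shift by n bits; Python's >> preserves the sign."""
    return value >> n

def cordic_stage(x_i, y_i, z_i, n, sign_i):
    """One pipeline level. sign_i = 1 is taken to mean z_i < 0 (assumption)."""
    if sign_i:                      # delta = -1: rotate clockwise
        x_o = to_s16(x_i + shift_r(y_i, n))
        y_o = to_s16(y_i - shift_r(x_i, n))
        z_o = to_s16(z_i + constant_z(n))
    else:                           # delta = +1: rotate counterclockwise
        x_o = to_s16(x_i - shift_r(y_i, n))
        y_o = to_s16(y_i + shift_r(x_i, n))
        z_o = to_s16(z_i - constant_z(n))
    m = n + 1                       # add_i: level number for the next stage
    sign_o = 1 if z_o < 0 else 0    # sign: sign of the output angle
    return x_o, y_o, z_o, m, sign_o
```

Chaining twelve such calls, starting from x = 2487 (K scaled by 2^12), y = 0, and z equal to the scaled angle, yields fixed-point approximations of the cosine and sine of the angle in x and y.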
 
[Schematic of Figure 4-1. Inputs: X_i[15..0], Y_i[15..0], Z_i[15..0], n[3..0], sign_i. Outputs: X_o[15..0], Y_o[15..0], Z_o[15..0], m[3..0], sign_o. Blocks: constant_z (inst1), shift_r (inst2, inst6), sign (inst3), lpm_add_sub0 (inst4, inst5, inst7), NOT (inst10), add_i (inst).]
 
Figure 4-1 Single-level operation structure of CORDIC
Package the above modules as a single-level module. N such modules connected in series constitute the
CORDIC sine and cosine IP core with an N-level pipeline. The values stored in module constant_z are
listed in Table 4-2. The top four bits of z_i[15..0] represent the sign and the integer part, so only the
low 12 bits represent the fractional part; that is, the actual number of pipeline levels is 12.
Table 4-2 Data of arctan(2^(-i))
| Pipeline level | Arc tangent | Radian value | 16-bit binary data |
|---|---|---|---|
| 0 | arctan(2^(0)) | 0.785398 | 0000110010010000 |
| 1 | arctan(2^(-1)) | 0.463647 | 0000011101101011 |
| 2 | arctan(2^(-2)) | 0.244978 | 0000001111101011 |
| 3 | arctan(2^(-3)) | 0.124354 | 0000000111111101 |
| 4 | arctan(2^(-4)) | 0.062418 | 0000000011111111 |
| 5 | arctan(2^(-5)) | 0.031239 | 0000000001111111 |
| 6 | arctan(2^(-6)) | 0.015623 | 0000000000111111 |
| 7 | arctan(2^(-7)) | 0.007812 | 0000000000011111 |
| 8 | arctan(2^(-8)) | 0.003906 | 0000000000001111 |
| 9 | arctan(2^(-9)) | 0.001953 | 0000000000000111 |
| 10 | arctan(2^(-10)) | 0.000976 | 0000000000000011 |
| 11 | arctan(2^(-11)) | 0.000488 | 0000000000000001 |
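The binary words in Table 4-2 appear to be consistent with scaling each arctangent by 2^12 (12 fractional bits) and truncating. The paper does not state the rounding mode, so truncation is an assumption here, as is the function name `arctan_table`. Under that assumption the table can be regenerated as follows:

```python
import math

FRAC_BITS = 12  # number of fractional bits in the 16-bit angle word

def arctan_table(levels=12):
    """Return (level, radians, 16-bit binary string) for arctan(2^-level)."""
    rows = []
    for i in range(levels):
        radians = math.atan(2.0 ** -i)
        word = int(radians * (1 << FRAC_BITS))   # truncate toward zero
        rows.append((i, radians, format(word, "016b")))
    return rows

if __name__ == "__main__":
    for level, radians, bits in arctan_table():
        print(f"{level:2d}  {radians:.6f}  {bits}")
```

Because the stored constants are truncated, each can be up to one least significant bit smaller than the exact value, which adds a small error to the residual angle.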
The maximum rotation angle is $\theta = \lim_{N\to\infty} \sum_{i=0}^{N-1} \theta_i = 99.88^\circ$, which
does not cover the range from 0° to 360°. Pre-processing units therefore need to be added so that the
IP core can calculate the sine and cosine of any angle from 0° to 360°. Let the input angle be z. Four
cases are distinguished, and the transformation rules are listed in Table 4-3.
Table 4-3 Rules of angle transformation and function value restoration
| Input angle | Transformed angle | 16-bit transformation | Cosine | Sine |
|---|---|---|---|---|
| 0° ≤ z ≤ 90° | z | z | x | y |
| 90° ≤ z ≤ 180° | z-90° | z-0x1921 | -y | x |
| 180° ≤ z ≤ 270° | z-180° | z-0x3243 | -x | -y |
| 270° ≤ z ≤ 360° | z-270° | z-0x4B64 | y | -x |
The overall design, with pre-processing and post-processing units added for quadrant conversion, is
shown in Figure 4-2.
 
Figure 4-2 The principle diagram of quadrant conversion 
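The quadrant rules of Table 4-3 can be sketched in Python as a pre-processing step, a core call, and a post-processing step. This is a minimal illustration working in degrees, not the 16-bit hardware units. The names `preprocess`, `postprocess`, and `sin_cos_0_to_360` are assumptions, and `reference_core` is a hypothetical stand-in (using the math library) for the first-quadrant CORDIC core.

```python
import math

def preprocess(z_deg):
    """Return (quadrant index, reduced angle in degrees) for 0 <= z <= 360."""
    if not 0.0 <= z_deg <= 360.0:
        raise ValueError("angle must lie in [0, 360] degrees")
    if z_deg <= 90.0:
        return 0, z_deg
    if z_deg <= 180.0:
        return 1, z_deg - 90.0
    if z_deg <= 270.0:
        return 2, z_deg - 180.0
    return 3, z_deg - 270.0

def postprocess(quadrant, x, y):
    """Restore (cos z, sin z) from x = cos and y = sin of the reduced angle."""
    return ((x, y), (-y, x), (-x, -y), (y, -x))[quadrant]

def reference_core(angle_deg):
    """Hypothetical stand-in for the CORDIC core on [0, 90] degrees."""
    r = math.radians(angle_deg)
    return math.cos(r), math.sin(r)

def sin_cos_0_to_360(z_deg, core=reference_core):
    """Return (cos z, sin z) for z in degrees, using the rules of Table 4-3."""
    quadrant, reduced = preprocess(z_deg)
    x, y = core(reduced)
    return postprocess(quadrant, x, y)
```

For example, an input of 120° is reduced to 30°, and the outputs are restored as cosine = -y and sine = x, matching the second row of Table 4-3.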
References 
[1] Xiangping Meng, Yan Gao. Electric Systems Analysis [M]. Beijing: Higher Education Press, 2004. 3-21.
[2] Yu Li, Jingsen Liu. Mechanism and Improvement of Direct Anonymous Attestation Scheme [J]. Journal of Henan University,
[3] Nie Yi. The Analyses and Compensating of the Time Error SCM Timer In Error [J]. Control and Automation, 2002, (4).
[4] SONG Yifei, ZHAO Youyi, LIN Yinan. Temperature Test System Based on Sensor DS1820 [J]. Electro-Optic Technology Application, 2009, (3).
