Figure 19.5 Area vs. throughput for all DES designs.
Building on this, the DES core developed previously was used as the basis for a Triple-DES implementation. In Triple DES, each block of data is passed through DES three times. The rationale for choosing three iterations, together with its advantages and disadvantages, is explained in [5]. A common form of Triple DES is known as EDE2, which means data is encrypted, decrypted, and then encrypted again using two different keys. The first key is used for both encryptions and the second key for the decryption. A number of different trade-offs can be made in this design, and each is examined in the following sections. In all cases, the smallest implementation (design (4)) was used as the DES core.
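In symbols, the EDE2 composition described above can be written as follows. Here $E_K$ and $D_K$ denote single-DES encryption and decryption under key $K$, $P$ is a plaintext block, and $C$ is the corresponding ciphertext block; this notation is introduced here for illustration and is not taken from the original text.

```latex
\begin{align}
C &= E_{K_1}\bigl(D_{K_2}\bigl(E_{K_1}(P)\bigr)\bigr) \\
P &= D_{K_1}\bigl(E_{K_2}\bigl(D_{K_1}(C)\bigr)\bigr)
\end{align}
```

Decryption therefore runs the same three stages with the mode of each stage inverted. Setting $K_1 = K_2$ makes the first two stages cancel, so the scheme reduces to single DES.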
To achieve a minimum area implementation, a single DES core is used for all three stages. The data is passed through this core three times, with the appropriate key and encryption mode selected on each pass, to achieve the EDE2 algorithm. Two different styles of VHDL were tried. These differed in the method used to select the inputs for each encryption step. The first style used a case statement and the second style used indexed arrays. The case statement style results in the following VHDL design:
library ieee;
use ieee.std_logic_1164.all;
entity tdes_ede2_iterative is
  port(
    plaintext  : in  std_logic_vector(1 to 64);
    key1       : in  std_logic_vector(1 to 64);
    key2       : in  std_logic_vector(1 to 64);
    encrypt    : in  std_logic;
    go         : in  std_logic;
    ciphertext : out std_logic_vector(1 to 64);
    done       : out std_logic);
end;
architecture behavior of tdes_ede2_iterative is
  .
begin
  process
    variable data : vec64;
    variable key  : vec56;
    variable mode : std_logic;
  begin
    -- handshake: wait for go, then flag the result as not ready
    wait until go = '1';
    done <= '0';
    wait for 0 ns;
    data := plaintext;
    -- three passes through the single DES core
    for i in 0 to 2 loop
      case i is
        when 1 =>
          -- middle pass: second key, opposite mode
          key := key_reduce(key2);
          mode := not encrypt;
        when others =>
          -- first and last pass: first key, requested mode
          key := key_reduce(key1);
          mode := encrypt;
      end case;
      data := des_core(data,key,mode);
    end loop;
    ciphertext <= data;
    done <= '1';
  end process;
end;
It can be seen that this uses a case statement to select the appropriate key and encryption mode for each iteration. The characteristics of the case statement solution are shown in Table 19.2 in the line labeled (5). The core DES algorithm accounts for 48 cycles (3 iterations of 16 rounds with 1 cycle per round), leaving an additional overhead of 3 cycles. These additional 3 cycles are due to the case statement selection of the key, which adds an extra cycle per iteration of the core.
The second style uses arrays to store the keys and modes and then indexes these arrays to set the key and mode for each iteration. The process becomes:
process
  .
  type keys_type is array (0 to 2) of vec56;
  variable keys : keys_type;
  type modes_type is array (0 to 2) of std_logic;
  variable modes : modes_type;
begin
  .
  modes := (encrypt, not encrypt, encrypt);
  keys := (key_reduce(key1),
           key_reduce(key2),
           key_reduce(key1));
  for i in 0 to 2 loop
    data := des_core(data,keys(i),modes(i));
  end loop;
  .
It was found that the latency was the same as the case statement solution but the area was approximately 25% larger. This overhead is mostly due to the use of the register arrays, which add up to about 200 extra flip-flops. The case statement design is clearly the more efficient of the two, so it was kept and the array style solution was discarded.
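The go/done handshake of the single-core design can be exercised in behavioral simulation with a small testbench along the following lines. This is a minimal sketch rather than part of the original design: the testbench entity name, the signal names, and the key and data values are hypothetical, and it assumes that the entity tdes_ede2_iterative (with its des_core and key_reduce functions) has been compiled into the work library. No expected ciphertext is given; the check is only that encrypting and then decrypting returns the original block.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical testbench, not part of the original design.
entity tdes_ede2_iterative_tb is
end;

architecture sim of tdes_ede2_iterative_tb is
  signal plaintext, key1, key2, ciphertext : std_logic_vector(1 to 64);
  signal encrypt, go : std_logic := '0';
  signal done : std_logic;
begin
  -- Device under test: the iterative EDE2 entity from the text.
  dut : entity work.tdes_ede2_iterative
    port map (
      plaintext  => plaintext,
      key1       => key1,
      key2       => key2,
      encrypt    => encrypt,
      go         => go,
      ciphertext => ciphertext,
      done       => done);

  stimulus : process
    -- Arbitrary illustrative values, not published test vectors.
    constant block_in : std_logic_vector(1 to 64) := x"0123456789ABCDEF";
    variable encrypted : std_logic_vector(1 to 64);
  begin
    key1 <= x"133457799BBCDFF1";
    key2 <= x"0E329232EA6D0D73";

    -- First run: process the block with encrypt = '1'.
    plaintext <= block_in;
    encrypt <= '1';
    wait for 10 ns;
    go <= '1';
    wait until done = '1';
    encrypted := ciphertext;
    go <= '0';
    wait for 10 ns;

    -- Second run: feed the result back with the opposite mode.
    plaintext <= encrypted;
    encrypt <= '0';
    wait for 10 ns;
    go <= '1';
    wait until done = '1';
    assert ciphertext = block_in
      report "EDE2 round trip failed" severity error;
    go <= '0';
    wait;  -- end of stimulus
  end process;
end;
```

Because the two runs use opposite values of encrypt, the round-trip check does not depend on which polarity of encrypt the DES core treats as encryption.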
Minimum Latency Pipelined
To achieve minimum latency between samples, three DES cores are used to form a pipeline. Data samples can then be fed into the pipeline every 18 cycles (the latency of the single core), although the time taken for a result to be generated is 50 cycles because of the pipeline length. The circuit is simply three copies of the single-DES process:
architecture behavior of tdes_ede2_pipe is
  .
  signal intermediate1, intermediate2 : vec64;
begin
  -- stage 1: first key, requested mode
  process
  begin
    wait until go = '1';
    intermediate1 <=
      des_core(plaintext,key_reduce(key1),encrypt);
  end process;
  -- stage 2: second key, opposite mode
  process
  begin
    wait until go = '1';
    intermediate2 <=
      des_core(intermediate1,key_reduce(key2),not encrypt);
  end process;
  -- stage 3: first key, requested mode; the only stage driving done
  process
  begin
    wait until go = '1';
    done <= '0';
    wait for 0 ns;
    ciphertext <=
      des_core(intermediate2,key_reduce(key1),encrypt);
    done <= '1';
  end process;
end;
Note how the done output is driven by only one of the cores; this will give the right result provided all three cores synthesize to the same delay, which in practice they will. This design decision removes the need for handshaking between the cores. MOODS predicts that this design has the area and delay characteristics shown in Table 19.2 in the line labeled (6). The state machine in Figure 19.6 shows the three independent processes. For example, the first process is represented by states c2, c3, and c4. The first two states perform the handshaking on go and c4 implements the DES core with its 16 iterations. State c7 is the second DES core and c10 the third.
Figure 19.6 Control state machine for pipelined triple-DES.
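As a rough comparison, the cycle counts quoted in this section (48 core cycles plus 3 overhead cycles for the iterative design, and one new sample every 18 cycles for the pipeline) give the relative throughput below. This assumes both designs run at the same clock frequency, which the text does not state, and the symbols $T_{\text{iter}}$ and $T_{\text{pipe}}$ are introduced here for illustration only.

```latex
\begin{align}
T_{\text{iter}} &= 3 \times 16 + 3 = 51 \ \text{cycles per block} \\
T_{\text{pipe}} &= 18 \ \text{cycles per block} \\
\frac{T_{\text{iter}}}{T_{\text{pipe}}} &= \frac{51}{18} \approx 2.8
\end{align}
```

On these assumptions the pipeline accepts blocks a little under three times as often as the iterative design, while the time to produce a result (50 cycles) is almost the same as the iterative design's 51 cycles.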