16th Annual Computer Security Applications Conference
December 11-15, 2000
New Orleans, Louisiana
The Chinese Remainder Theorem and its Application in a High-Speed RSA Crypto-Chip
Johann Großschädl
IAIK, Graz University of Technology
Austria
The performance of RSA hardware is primarily determined by an efficient
implementation of the long integer modular arithmetic and the ability to
utilize the Chinese Remainder Theorem (CRT) for the private key
operations. This paper presents the multiplier architecture of the
RSAΓ crypto chip, a high-speed hardware accelerator for long integer
modular arithmetic. The RSAΓ multiplier datapath is reconfigurable
to execute either one 1024 bit modular exponentiation or two 512 bit
modular exponentiations in parallel. Another significant characteristic
of the multiplier core is its high degree of parallelism. The current
RSAΓ prototype contains a 1056 × 16 bit word-serial multiplier which
is optimized for modular multiplications according to Barrett's modular
reduction method. The multiplier core is designed for a clock frequency
of 200 MHz and requires 227 clock cycles for a single 1024 bit modular
multiplication. Pipelining in the highly parallel long integer unit yields
a decryption rate of 560 kbit/s for a 1024 bit exponent. In
CRT-mode, the multiplier executes two 512 bit modular exponentiations in
parallel, which increases the decryption rate by a factor of 3.5 to almost
2 Mbit/s.
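
The CRT mode described above replaces one full-length modular exponentiation with two half-length ones, one modulo each prime factor, and then recombines the two results. The following is a minimal software sketch of that idea, not the chip's datapath. The function names, the CRT parameter names (dp, dq, q_inv), the recombination step (Garner's formula), and the tiny toy key are all assumptions made for illustration. A real key would use primes of about 512 bits each.

```python
# Sketch of RSA private-key operation with and without the CRT.
# Names and the toy key below are illustrative assumptions, not from the paper.

def rsa_decrypt_plain(c, d, n):
    """One full-length modular exponentiation: m = c^d mod n."""
    return pow(c, d, n)

def rsa_decrypt_crt(c, p, q, dp, dq, q_inv):
    """Two half-length exponentiations plus recombination.

    dp = d mod (p-1), dq = d mod (q-1), q_inv = q^(-1) mod p.
    The two pow() calls are independent of each other, which is
    what makes it possible to run them in parallel.
    """
    m_p = pow(c % p, dp, p)          # exponentiation modulo p
    m_q = pow(c % q, dq, q)          # exponentiation modulo q
    h = (q_inv * (m_p - m_q)) % p    # Garner's recombination step
    return m_q + h * q

# Toy key (far too small for real use).
p, q = 61, 53
n = p * q
e = 17
d = pow(e, -1, (p - 1) * (q - 1))    # modular inverse; needs Python 3.8+

# Precomputed CRT parameters.
dp = d % (p - 1)
dq = d % (q - 1)
q_inv = pow(q, -1, p)

m = 65
c = pow(m, e, n)                     # encrypt with the public key
assert rsa_decrypt_crt(c, p, q, dp, dq, q_inv) == rsa_decrypt_plain(c, d, n) == m
```

Each of the two exponentiations works with a modulus and an exponent of half the original length, and neither depends on the other's result.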