TY - JOUR

T1 - Operation-saving VLSI architectures for 3D geometrical transformations

AU - Karagianni, Konstantina

AU - Paliouras, Vassilis

AU - Diamantakos, George

AU - Stouraitis, Thanos

N1 - Author biography:
Thanos Stouraitis received the PhD degree in electrical engineering from the University of Florida in 1986 (for which he received the Outstanding PhD Dissertation Award), the MSc degree in electrical and computer engineering from the University of Cincinnati in 1983, the MS degree in electronic automation from the University of Athens, Greece, in 1981, and the BS degree in physics from the University of Athens, Greece, in 1979. He is a professor of electrical and computer engineering at the University of Patras, Greece, where he directs an international graduate program on Digital Signal Processing Systems. He also serves as the director of the Electronics and Computers Division of the ECE Department. He has served on the faculties of Ohio State University and the University of Florida. His current research interests include signal and image processing, application-specific processor technology and design, design and architecture of optimal digital systems, and computer arithmetic. He leads several DSP processor design projects funded by the European Union, American organizations, and the Greek government and industry. He has authored or coauthored more than 130 technical papers. He holds one patent on DSP processor design. He has authored several book chapters and the University of Patras Press book Digital Signal Processing, and coauthored the Marcel Dekker Inc. book Digital Filter Design Software for the IBM PC. He is a senior member of the IEEE. He serves as a regional editor for the Journal of Circuits, Systems, and Computers, associate editor for the IEEE Transactions on Circuits and Systems I and II, associate editor for the IEEE Transactions on VLSI, editor for the IEEE Interactive Magazines, and editor-at-large for Marcel Dekker Inc. He has also served as an associate editor for the Journal of Circuits, Systems, and Computers, and as a consultant for various industries.
He also reviews proposals for the US National Science Foundation, the European Commission, and other agencies. He has served as chair of the VLSI Systems and Applications (VSA) Technical Committee and as a member of the DSP and the Multimedia Technical Committees of the IEEE Circuits and Systems Society. He is the chair of the IEEE Signal Processing chapter in Greece. He was the general chair of the 1996 IEEE International Conference on Electronics, Circuits, and Systems (ICECS) and the technical program chair of Eusipco '98 and ICECS '99. He has served as chair or as a member of the Technical Program Committees of a multitude of IEEE conferences, including ISCAS (program committee track chair). He is the general chair of ISCAS 2006. He received the 2000 IEEE Circuits and Systems Society Guillemin-Cauer Award for the paper “Multi-Function Architectures for RNS Processors.”

PY - 2001/6

Y1 - 2001/6

N2 - Two VLSI architectures for the computationally efficient implementation of the elementary 3D geometrical transformations are introduced. The first one is based on a single floating-point multiply/add unit, while the other one comprises a four processing-element vector unit. By exploiting the structure of the elementary transformation matrices, some of the elements of which are ones and zeros, the proposed architectures avoid full-matrix multiplication for the matrix multiplications involved in the calculation of the transformation matrix by treating them as updates of specific elements, the new values of which are obtained by scalar operations in the case of the single-processor architecture or by simple vector operations in the case of the processor array. Thus, the floating-point operation count and the number of memory accesses required by a transformation are reduced and, therefore, the performance of the circuit which computes the transformation matrix, in terms of execution time, is improved at minimal hardware cost. Furthermore, a circuit is proposed which, for each sequence of transformations, selects the most appropriate direction for computing the product of the matrices in the corresponding stack of transformation matrices in order to further reduce the number of floating-point operations compared to the case where the direction of the computation of the successive matrix products is predetermined. The proposed single-processor architecture is suitable for low-cost applications, while the parallel execution scheme implemented by the introduced parallel processor may be implemented by any four-PE processor with small overhead.

KW - Elementary geometrical transformations

KW - Graphics processor

KW - Vector unit

KW - VLSI architecture

UR - http://www.scopus.com/inward/record.url?scp=0035365363&partnerID=8YFLogxK

U2 - 10.1109/12.931896

DO - 10.1109/12.931896

M3 - Article

AN - SCOPUS:0035365363

SN - 0018-9340

VL - 50

SP - 609

EP - 622

JO - IEEE Transactions on Computers

JF - IEEE Transactions on Computers

IS - 6

ER -