Parallel H.264/AVC fast rate-distortion optimized motion estimation by using a graphics processing unit and dedicated hardware

Muhammad Usman Shahid, Ashfaq Ahmed, Maurizio Martina, Guido Masera, Enrico Magli

Research output: Contribution to journal, Article, peer-review

13 Scopus citations

Abstract

Heterogeneous systems on a single chip composed of a central processing unit, a graphics processing unit (GPU), and a field-programmable gate array (FPGA) are expected to emerge in the near future. In this context, the system on chip can be dynamically adapted to employ different architectures for the execution of data-intensive applications. Motion estimation (ME) is one such task that can be accelerated using an FPGA and a GPU for a high-performance H.264/Advanced Video Coding encoder implementation. This paper presents an inherently parallel, low-complexity, rate-distortion (RD) optimized fast ME algorithm that is well suited to parallel implementations because it eliminates various data dependencies caused by a reliance on spatial predictions. In addition, this paper provides details of the GPU and FPGA implementations of the parallel algorithm, written in OpenCL and Very High Speed Integrated Circuit (VHSIC) Hardware Description Language (VHDL), respectively, and presents a practical performance comparison between the two implementations. The experimental results show that the proposed scheme achieves significant speedup on both the GPU and the FPGA, with RD performance comparable to that of a sequential fast ME algorithm.
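To make the idea of RD-optimized ME without spatial dependencies concrete, the following is a minimal sketch and not the algorithm from the paper. It assumes the common Lagrangian cost J = SAD + lambda * R, uses an exhaustive search in place of the paper's fast search, and assumes a fixed motion vector predictor `pmv` rather than one derived from neighbouring blocks, so every block can be searched independently. The names `mv_bits`, `rd_search`, `lam`, and `pmv` are hypothetical, and the rate model (signed Exp-Golomb code lengths) is only an approximation of what an encoder would count.

```python
import numpy as np

def mv_bits(d):
    """Length in bits of the signed Exp-Golomb code for one MV component difference."""
    code_num = 2 * abs(d) - (1 if d > 0 else 0)
    return 2 * ((code_num + 1).bit_length() - 1) + 1

def rd_search(cur, ref, bx, by, bsize=16, srange=4, lam=4.0, pmv=(0, 0)):
    """Exhaustive search minimising SAD + lam * rate for one block.

    pmv is a fixed predictor (an assumption of this sketch): because it does
    not depend on neighbouring blocks' results, calls for different blocks
    are independent and could run in parallel.
    Returns (cost, (dx, dy)) or None if no candidate fits inside ref.
    """
    block = cur[by:by + bsize, bx:bx + bsize].astype(np.int32)
    best = None
    for dy in range(-srange, srange + 1):
        for dx in range(-srange, srange + 1):
            y, x = by + dy, bx + dx
            # Skip candidates that fall outside the reference frame.
            if y < 0 or x < 0 or y + bsize > ref.shape[0] or x + bsize > ref.shape[1]:
                continue
            cand = ref[y:y + bsize, x:x + bsize].astype(np.int32)
            sad = int(np.abs(block - cand).sum())
            rate = mv_bits(dx - pmv[0]) + mv_bits(dy - pmv[1])
            cost = sad + lam * rate
            if best is None or cost < best[0]:
                best = (cost, (dx, dy))
    return best

# Usage: the current frame is the reference shifted so that the true
# displacement of an interior block is (dx, dy) = (-1, 2).
rng = np.random.default_rng(0)
ref = rng.integers(0, 256, size=(48, 48), dtype=np.uint8)
cur = np.roll(ref, shift=(-2, 1), axis=(0, 1))
cost, mv = rd_search(cur, ref, bx=16, by=16)
print(mv, cost)
```

In a standard H.264 encoder the predictor is typically derived from already-coded neighbouring blocks, which serializes the search; fixing it, as assumed here, is one simple way to remove that dependency at some cost in rate accuracy.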

Original language: British English
Article number: 6882206
Pages (from-to): 701-715
Number of pages: 15
Journal: IEEE Transactions on Circuits and Systems for Video Technology
Volume: 25
Issue number: 4
State: Published - 1 Apr 2015

Keywords

  • Field-programmable gate array (FPGA)
  • graphics processing unit (GPU)
  • H.264/Advanced Video Coding (AVC)
  • OpenCL
  • parallel fast motion estimation (ME)
