GPU-based video processing in the context of TV broadcasting

Fink, Heinrich

doi:10.34726/hss.2013.21643

Record link:

https://doi.org/10.34726/hss.2013.21643
http://hdl.handle.net/20.500.12708/8085

Title:

GPU-based video processing in the context of TV broadcasting

Citation:

Fink, H. (2013). GPU-based video processing in the context of TV broadcasting [Diploma Thesis, Technische Universität Wien]. reposiTUm. https://doi.org/10.34726/hss.2013.21643

reposiTUm DOI:

10.34726/hss.2013.21643

CatalogPlus:

AC11042725

Publication Type:

Thesis - Diplomarbeit

Language:

English

Authors:

Fink, Heinrich

Advisor:

Wimmer, Michael

Organisational Unit:

E186 - Institut für Computergraphik und Algorithmen

Date (published):

2013

Number of Pages:

127

Abstract:

This thesis deals with GPU-based video processing in the context of the graphics system of a live TV broadcaster. Upcoming TV specifications such as UHD-1 entail a particularly high data rate of image information. Processing such data rates in real time poses a particular challenge for the implementation of a software-based TV graphics system. To reach the required data rates, the program must run its computations on the main processor and the graphics processor (CPU and GPU) in parallel. In particular, transfers of video frames between main memory and graphics memory over the PCIe bus must be overlapped with CPU and GPU computation to guarantee efficient execution. This thesis therefore addresses the question of which methods are available for implementing such a graphics program and which data rates can effectively be achieved with them. To answer these questions, we implement a prototype of a software system for rendering TV graphics. We use the OpenGL programming interface to exploit the graphics processor's capabilities for efficient image processing. We present advanced OpenGL programming methods that facilitate the processing of professional video material and help to reach the maximum degree of parallelism in execution on the graphics processor. In particular, we show GPU-based processing of the studio format V210, which poses particular implementation challenges compared with conventional image formats. Our prototype is based on the software model of a pipeline. The program is thereby able to parallelize individual image processing steps and distribute them dynamically across multiple processors. This lets us apply different optimization techniques to maximize the program's data throughput. 
To analyze these techniques and the implementation of the prototype in general, we integrate runtime measurement directly into our software. This enables the automated creation of profiles for different test scenarios, whose analysis forms the basis for the results of this thesis. Our prototype shows that the methods presented in this thesis enable real-time processing of high-resolution video material in high quality. Our results also show that different optimization techniques must be used for different GPU architectures to reach optimal throughput. The ability of our software to adapt the optimization of the video pipeline dynamically is therefore a particularly important property, and ultimately a prerequisite for implementing a market-ready graphics product that is meant to run efficiently on different hardware configurations.

This thesis investigates GPU-based video processing in the context of a graphics system for live TV broadcasting. Upcoming TV standards like UHD-1 result in much higher data rates than existing formats. Processing such data rates while satisfying the real-time requirement of live TV poses a particular challenge for the implementation of a software-based broadcast graphics system. In order to reach the required data rates, the software needs to process image data concurrently on the central processing unit (CPU) and graphics processing unit (GPU) of the machine. In particular, the transfers of image data between main and graphics memory need to be overlapped with CPU-based and GPU-based execution in order to maximize data throughput. In this thesis, we therefore investigate the following questions: Which methods are available to a software implementation in order to reach this level of parallelism? Which data rates can actually be reached using these methods? In order to answer these questions, we implement a software prototype for rendering TV graphics. To take advantage of the GPU's ability to efficiently process image data, we use the OpenGL application programming interface (API). We use advanced methods of OpenGL programming to render high-quality video and increase the level of employed parallelism of the GPU. We implement the transcoding between RGB and the professional video format V210, which is more complex to process than conventional consumer-oriented image formats. In our software, we apply the pipeline programming pattern in order to distribute stages of the video processing algorithm to different threads. As a result, those stages execute concurrently on different hardware units of the system. Our prototype exposes the applied degree of concurrency to the user as a collection of different optimization settings. In order to evaluate these optimizations, we integrate a profiling mechanism directly into the execution of the pipeline. 
This allows us to automatically create performance profiles while running our prototype with various test scenarios. The results of this thesis are based on the analysis of these traces. Our prototype shows that the methods described in this thesis enable a software program to process high-resolution video in high quality. The results of our evaluations also show that there is no single best optimization setting for every GPU architecture. Different driver implementations and hardware features require our prototype to apply different optimization settings for each device. The ability of our software structure to dynamically change the degree of concurrency is therefore an important feature. For broadcasting software that is expected to perform well on a range of hardware devices, this is ultimately an essential feature.
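
To illustrate why V210 is harder to process than a byte-aligned RGB format, the following is a minimal CPU-side sketch of the V210 layout as it is commonly documented: six pixels of 10-bit 4:2:2 Y'CbCr are packed into four little-endian 32-bit words, and each row is padded to a multiple of 128 bytes. This is not the thesis's OpenGL implementation; the function names are hypothetical and chosen only for this example.

```python
import struct

MASK10 = 0x3FF  # each V210 component is a 10-bit value


def pack_v210_group(y, cb, cr):
    """Pack 6 pixels (6 luma, 3 Cb, 3 Cr samples) into 16 bytes of V210."""
    assert len(y) == 6 and len(cb) == 3 and len(cr) == 3
    # Component order across the four 32-bit words of one group.
    order = [cb[0], y[0], cr[0],
             y[1], cb[1], y[2],
             cr[1], y[3], cb[2],
             y[4], cr[2], y[5]]
    words = []
    for i in range(0, 12, 3):
        a, b, c = order[i:i + 3]
        # Three 10-bit components per word; the top two bits stay unused.
        words.append((a & MASK10) | ((b & MASK10) << 10) | ((c & MASK10) << 20))
    return struct.pack("<4I", *words)


def unpack_v210_group(data):
    """Inverse of pack_v210_group: 16 bytes -> (y, cb, cr) sample lists."""
    words = struct.unpack("<4I", data)
    s = [(w >> shift) & MASK10 for w in words for shift in (0, 10, 20)]
    y = [s[1], s[3], s[5], s[7], s[9], s[11]]
    cb = [s[0], s[4], s[8]]
    cr = [s[2], s[6], s[10]]
    return y, cb, cr


def v210_frame_bytes(width, height):
    """Frame size in bytes, with rows padded to 48-pixel (128-byte) blocks."""
    stride = (width + 47) // 48 * 128
    return stride * height
```

Under this layout, a 3840x2160 (UHD-1) frame occupies 22,118,400 bytes. Assuming a frame rate of 60 frames per second (a figure chosen here for illustration, not taken from the abstract), a single uncompressed stream amounts to roughly 1.33 GB/s in each transfer direction, which gives a sense of why transfers need to be overlapped with computation.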

Additional information:

Title differs according to the author's translation
Abstract in German

License:

In Copyright

Appears in Collections:

Thesis