GPUImage – Overview

As a well-known, professional-grade image processing framework for iOS, GPUImage has been around for a long time (the first version was published in 2013). In this post, I would like to share my own thoughts about it. The cover image is one of the works created with my first GPUImage-based application.

I first heard about GPUImage in the autumn of 2015, because of a new application project. Before that, my first job as an iOS developer was at a fairly traditional IT company in Shanghai that mainly provided services to commercial firms. As you can imagine, my daily work was tedious: all kinds of TableView and CollectionView screens, data requests, and UI layout adaptation, and that was it. Even now I am not comfortable handling iPad layouts when the orientation changes, although there are some useful frameworks that help with this.

After that, I went to Hangzhou. I still think it was one of the best choices of my life, even though the work was tough at the beginning. There was only one person writing code, and that was me. The product manager was always late and kept changing his mind; for that application, a completely new prototype was released every month on average. I spent a couple of days reading the source code of the previous versions to understand how the filters and stickers were implemented. Core Image, a built-in framework, makes it easy to implement a simple filter, and the stickers were built with ordinary views (UIView). Interviewees I met later often mentioned these same techniques and thought they were already enough to handle most image processing problems on mobile devices.

Then one day the product manager came up with a requirement to implement something like the layers in PS. That was an incredible obstacle for me, since at that time I only knew how to request and update data. It cost me several days, and I was lucky to find that the CALayer class in the Core Animation framework could do it perfectly. I was also surprised by what its subclasses can do; they cover almost all kinds of animations and UI effects. The next and biggest problem was my realization that only a dedicated image processing technology such as OpenGL ES, OpenCV, or Metal could help me build an application with complicated image processing features, and I did not have enough time to learn them. Finally, I found GPUImage on GitHub, and it saved my job.

The GPUImage wiki on GitHub is not a detailed manual. You can only get three points from it:

  1. GPUImage is based on OpenGL ES 2.0;
  2. GPUImage performs better than Core Image;
  3. How to add a filter to a still image, a video, or a frame captured by the camera, and then display the result in a GPUImageView or write it to a file in a specific format.

Maybe because these three points were all I could understand, there were still things about its execution process that I did not know, such as: why the method 'processImage' must be called after the code that adds the filters; why the app crashes if 'useNextFrameForImageCapture' is not called before exporting the processed result as an image; and what GLSL is used for. Moreover, I could not find more information on the Internet at that time, in either English or Chinese. As a result, I almost gave up on using GPUImage for those complicated image processing requirements.

Now I want to share my rough understanding of GPUImage. If you find any problems or mistakes, please correct me. Cheers.

1. Fragment Shader and Vertex Shader

OpenGL ES processes images through a vertex shader and a fragment shader. GPUImage is a framework that wraps and extends OpenGL ES. In the built-in iOS framework GLKit, which also wraps OpenGL ES, the OpenGL ES version can be selected, while GPUImage is fixed at OpenGL ES 2.0. Keep in mind that there are many differences between the versions of OpenGL ES. In my opinion, most of the filter classes provided by GPUImage are implemented as a series of operations in the fragment shader. This is because most filters do not change the size or shape of the processed image; they generate new pixels by calculating on the original pixels. That is one of the most common uses of GLSL, which is the language that defines how each pixel is calculated. So GLSL is a very important part of GPUImage if you want to use it to produce unique and amazing results.
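To make the "new pixels calculated from the original pixels" idea concrete, here is a minimal conceptual sketch in plain Python. It is not GLSL and not GPUImage code: the function names are made up for illustration, and the luminance weights are only an assumed example of a per-pixel formula. The point is that the same small function runs once for every pixel, and the output keeps the size of the input.

```python
# Conceptual model only: a "fragment shader" here is a plain function that is
# called once per pixel. Real GPUImage filters express this step in GLSL.

def grayscale_fragment(pixel):
    """Return the output pixel for one input pixel (r, g, b each in 0.0-1.0)."""
    r, g, b = pixel
    # Illustrative luminance weights (an assumption for this sketch).
    luminance = 0.2125 * r + 0.7154 * g + 0.0721 * b
    return (luminance, luminance, luminance)


def apply_fragment(image, fragment):
    """Run `fragment` on every pixel; the result has the same width and height."""
    return [[fragment(pixel) for pixel in row] for row in image]


# A tiny 2x2 "image": red, green, blue, white.
image = [
    [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0)],
    [(0.0, 0.0, 1.0), (1.0, 1.0, 1.0)],
]
result = apply_fragment(image, grayscale_fragment)
```

Swapping `grayscale_fragment` for another per-pixel function gives a different filter without touching the loop, which is roughly the role a custom GLSL fragment shader plays.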

2. Pipeline

GPUImage is built around the concept of a pipeline. Unlike the chained calls in Masonry, GPUImage treats each input, filter, and output as one piece of pipe. Only when those pieces are connected in series can the image data flow through each independent piece, with the output of one piece serving as the input of the next, until it reaches the final result. This is my own simple understanding of the execution process of GPUImage.
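The pipeline idea can be sketched with a few lines of plain Python. This is a toy model, not GPUImage's API: `Node`, `add_target`, and `push` are hypothetical names, and a single number stands in for an image. It also hints at why the order of calls matters: a piece that is connected after the data has been pushed never sees that data, which may be the reason processing has to be triggered after the filters are added.

```python
class Node:
    """One piece of pipe: transforms a value and forwards it to its targets."""

    def __init__(self, name, transform=None):
        self.name = name
        # With no transform given, the node passes the value through unchanged.
        self.transform = transform if transform is not None else (lambda v: v)
        self.targets = []
        self.last_value = None

    def add_target(self, node):
        """Connect `node` downstream and return it so calls can be chained."""
        self.targets.append(node)
        return node

    def push(self, value):
        """Process `value`, remember the result, and forward it downstream."""
        self.last_value = self.transform(value)
        for target in self.targets:
            target.push(self.last_value)


source = Node("source")
brighten = Node("brighten", lambda v: min(v + 10, 255))
invert = Node("invert", lambda v: 255 - v)
display = Node("display")
late_display = Node("late display")

# Connect the pieces in series, then push one value through the pipeline.
source.add_target(brighten).add_target(invert).add_target(display)
source.push(100)

# Connected only after the push, so it never received the value.
invert.add_target(late_display)
```

After this runs, `display.last_value` holds the fully processed value, while `late_display.last_value` is still `None`.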

3. MVP

As I mentioned above, GPUImage connects every independent piece of the pipeline in series, and it does so through a design pattern that is quite special in iOS development, which is called MVP. The classes required by the execution process, as well as the classes that are allowed to be connected with others, have some common features:

  1. They are all subclasses of GPUImageOutput, except GPUImageView;
  2. They all conform to the GPUImageInput protocol, except the input classes (GPUImage provides five kinds of input classes: GPUImagePicture, GPUImageRawDataInput, GPUImageMovie, GPUImageUIElement, and GPUImageVideoCamera. Their names make it clear what type of input each one handles).

The class 'GPUImageOutput' is never used directly during the whole process; only its subclasses are. As the names suggest, a GPUImageOutput subclass acts as an output (it sends data on to the next node), and a class that conforms to the GPUImageInput protocol acts as an input (it receives data from the previous node). So, when the pieces of the pipeline are connected:

  1. The first object only provides data, so it must be one of the five input classes, and it does not conform to the GPUImageInput protocol;
  2. A class connected in the middle of the pipeline, such as GPUImageFilter (the parent class of all filter classes), is both a subclass of GPUImageOutput and conforms to the GPUImageInput protocol. This is because it needs to receive data from its previous node and pass the data on to the next node once processing is done;
  3. The last node/object, GPUImageView, does not need to be a subclass of GPUImageOutput, because there is no next node after it.
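The three roles above can be modelled in a short Python sketch. The classes below are hypothetical stand-ins written for illustration (`Output` for GPUImageOutput, `Input` for the GPUImageInput protocol, and so on); they only mirror the relationships described in this post, not GPUImage's real implementation.

```python
from abc import ABC, abstractmethod


class Input(ABC):
    """Stand-in for the GPUImageInput protocol: a node that can receive a frame."""

    @abstractmethod
    def receive(self, frame):
        ...


class Output:
    """Stand-in for GPUImageOutput: a node that can feed a next node."""

    def __init__(self):
        self.targets = []

    def add_target(self, target):
        # Only nodes that can receive frames may be connected downstream.
        if not isinstance(target, Input):
            raise TypeError(f"{type(target).__name__} cannot receive frames")
        self.targets.append(target)
        return target

    def send(self, frame):
        for target in self.targets:
            target.receive(frame)


class Picture(Output):
    """First node: sends data on, but cannot receive any."""

    def __init__(self, frame):
        super().__init__()
        self.frame = frame

    def process(self):
        self.send(self.frame)


class Filter(Output, Input):
    """Middle node: receives a frame, processes it, and sends it on."""

    def __init__(self, fn):
        super().__init__()
        self.fn = fn

    def receive(self, frame):
        self.send(self.fn(frame))


class View(Input):
    """Last node: receives a frame; there is no next node to send to."""

    def __init__(self):
        self.shown = None

    def receive(self, frame):
        self.shown = frame


picture = Picture("abc")
upper = Filter(str.upper)
view = View()
picture.add_target(upper).add_target(view)
picture.process()

# Trying to connect an input-only class as a target is rejected.
try:
    upper.add_target(Picture("x"))
    rejected = False
except TypeError:
    rejected = True
```

In this model `View` has no `add_target` at all, which matches the idea that the last node does not need to be an output.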

Those are what I consider the three key points of GPUImage. I actually wrote this post in Chinese almost two years ago; time flies. I will keep translating my other posts about GPUImage into English, and I hope I will have the chance to modify GPUImage or to develop a new image processing framework that can implement more interesting effects.
