GPUImage Detailed Analysis (XIII): Multi-Video Rendering


Preface

I have recently been working on rendering multiple video streams; this article is a preliminary study of the rendering schemes. The effect is roughly as follows.

(Effect screenshot)

Main Text

I. Multi-GPUImageView Solution

GPUImage offers a very simple solution for rendering multiple videos: the multi-GPUImageView approach, where each video screen is rendered separately. Each video corresponds to one filter chain; after the video data arrives, it is cropped and displayed directly on the corresponding GPUImageView. Multiple GPUImageViews form multiple video screens, and the screen-stitching effect can be achieved by changing the coordinates of the GPUImageViews.

The solution is simple; a demo is available here. The demo uses two mp4 videos as source data: GPUImageMovie serves as the start of the filter chain, GPUImageCropFilter crops the videos, GPUImageTransformFilter then transforms them, and finally they are displayed on GPUImageViews. The pros and cons of this option are clear. Pros: simple implementation, with screen stitching handled by the UIKit layer. Cons: more renders to the screen, rendering far more frequently than the display refresh rate. A minimal sketch of one such chain follows.
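For concreteness, here is a minimal sketch of one chain, assuming a local videoURL and a topHalfFrame layout rect (both illustrative names, not the demo's actual code):

    // One filter chain of Solution I; each video gets its own chain ending in
    // its own GPUImageView, and screen stitching is plain UIKit layout.
    GPUImageMovie *movie = [[GPUImageMovie alloc] initWithURL:videoURL];
    GPUImageCropFilter *crop =
        [[GPUImageCropFilter alloc] initWithCropRegion:CGRectMake(0.0, 0.25, 1.0, 0.5)];
    GPUImageTransformFilter *transform = [[GPUImageTransformFilter alloc] init];
    transform.affineTransform = CGAffineTransformMakeRotation(M_PI_2);

    GPUImageView *videoView = [[GPUImageView alloc] initWithFrame:topHalfFrame];
    [self.view addSubview:videoView]; // move/resize the view to stitch screens

    [movie addTarget:crop];
    [crop addTarget:transform];
    [transform addTarget:videoView];
    [movie startProcessing];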

II. Single GPUImageView Solution

The most obvious problem with the above solution is that it renders to the screen more often than the screen refreshes, so a natural evolution is to use only one GPUImageView and render all the videos into it. The GPUImageView's display area is divided into multiple regions, each corresponding to one video. Each video screen is rendered off-screen and drawn into the corresponding region of a texture, which is then processed by a multiTexFilter; the multiTexFilter collects the renderings of the multiple videos (reusing the last frame's data if one has not updated), passes the texture to the GPUImageView, and finally the GPUImageView renders to the screen.

This solution is more complex; the equivalent demo is implemented here. The demo uses two mp4 videos as source data, still with GPUImageMovie as the start of the filter chain, then processed by GPUImageCropFilter and GPUImageTransformFilter, handed to LYMultiTextureFilter, and finally displayed on a GPUImageView. The core of this solution is the LYMultiTextureFilter class; let's take a closer look at its composition.

/**
  Renders multiple textures into the same FrameBuffer
 
  To use it, first bind the frameBuffer and the display-area rects
  On each newFrame callback, the image is drawn into the rect bound to that index
  Specifically, a mainIndex is designated; when that index is ready, newFrame is
  called to notify the next target in the response chain
 */
@interface LYMultiTextureFilter : GPUImageFilter

- (instancetype)initWithMaxFilter:(NSInteger)maxFilter;

/**
  Sets the draw-area rect and binds it to a texture index

  @param rect draw area; origin is the starting point, size is the rectangle's size
  (values range from 0 to 1; point (0,0) is the lower-left corner, point (1,1) the
  upper-right corner, and Size(1,1) means the full area)
 
  @param filterIndex texture index
 */
- (void)setDrawRect:(CGRect)rect atIndex:(NSInteger)filterIndex;

- (void)setMainIndex:(NSInteger)filterIndex;

LYMultiTextureFilter inherits from GPUImageFilter; the drawing area of each texture is bound via -setDrawRect:atIndex:. In particular, a mainIndex can be set, and when that index is ready, newFrame is called to notify the next target in the filter chain. The implementation has several caveats (caveat 1 is sketched below):
1. The drawing regions are specified through the vertex coordinates.
2. GPUImageFramebuffer is reused by default, which makes it impossible to keep the previous frame's texture; the logic of -renderToTextureWithVertices: and -informTargetsAboutNewFrameAtTime: needs to be modified so that LYMultiTextureFilter always uses the same GPUImageFramebuffer.
3. The outputFramebuffer must be released in dealloc to avoid a memory leak.
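As an illustration of caveat 1, a rect in the 0~1 convention documented above maps to OpenGL vertex coordinates roughly as follows (a sketch under that convention, not the demo's actual code):

    // Map a 0~1 rect ((0,0) = lower-left) to -1~1 NDC vertex coordinates,
    // in GPUImage's usual triangle-strip order.
    static void LYVerticesForRect(CGRect rect, GLfloat vertices[8]) {
        GLfloat left   = rect.origin.x * 2.0 - 1.0;
        GLfloat bottom = rect.origin.y * 2.0 - 1.0;
        GLfloat right  = left   + rect.size.width  * 2.0;
        GLfloat top    = bottom + rect.size.height * 2.0;
        vertices[0] = left;  vertices[1] = bottom; // bottom-left
        vertices[2] = right; vertices[3] = bottom; // bottom-right
        vertices[4] = left;  vertices[5] = top;    // top-left
        vertices[6] = right; vertices[7] = top;    // top-right
    }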

Characteristics of this solution. Pros: rendering is unified, avoiding more renders to the screen than the screen frame rate. Cons: the multiTexFilter implementation is complex, and screen stitching must be implemented in shaders.

III. Comparison of the Two Solutions

Performance is analyzed by comparing the CPU and GPU usage percentages of the two solutions.

CPU comparison: over the selected first 30-second interval, the single-GPUImageView approach holds a slight advantage of about 0.5 s (roughly 2%). The CPU consumption waveforms are similar throughout, and the percentages are roughly the same.

GPU comparison: Solution 1 consumes less GPU than Solution 2! This is hard to believe: Solution 2 is meant to be an optimization of Solution 1, so why would it use more GPU? Inspecting with the Instruments tool shows that Solution 1 calls presentRenderbuffer more than 100 times per second, while Solution 2 calls it around 30 times. Review the designs of the two solutions.
Solution 1: GPUImageMovie => GPUImageCropFilter => GPUImageTransformFilter => GPUImageView
Solution 2: GPUImageMovie => GPUImageCropFilter => GPUImageTransformFilter => LYMultiTextureFilter => GPUImageView
Solution 2's filter chain has the extra LYMultiTextureFilter; although it makes fewer presentRenderbuffer calls, its off-screen rendering clearly becomes more frequent. The extra LYMultiTextureFilter pass produces more GL rendering instructions in total and thus more GPU consumption! The root cause is that the GPUImageMovie data sources are out of sync: even in Solution 2, the outputs of the multiple GPUImageMovies are not synchronized, so LYMultiTextureFilter may likewise render more times than its output is actually consumed. Solution 2 therefore has the same redundant-rendering problem as Solution 1, which motivated a further optimization.

IV. Screen-Frame-Rate-Driven Single GPUImageView Solution

First, an overall diagram of the design:

Compared with the previous solutions, a CADisplayLink is introduced here as the rendering driver, and only the latest frame of each video's data is kept. If every cropFilter could drive the multiTexFilter to draw, the multiTexFilter's rendering frequency could reach up to 120 FPS, wasting performance. To ensure the multiTexFilter draws at no more than 30 FPS, the cropFilters are rendered in order 1~N, and cropFilterN is used as the driver for the multiTexFilter's rendering. When cropFilterN finishes rendering, the multiTexFilter renders, using the previous frame's data for any cropFilter that has no new data. Each cropFilter fetches its data from the DataManager when rendering, or falls back to default data if none is available. A sketch of this loop follows.
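A minimal sketch of the display-link-driven loop, assuming a hypothetical dataManager that keeps only the latest CMSampleBuffer per video (the names here are illustrative, not the demo's actual code):

    // Drive rendering from the screen's refresh, capped at 30 FPS.
    - (void)startRenderLoop {
        CADisplayLink *link = [CADisplayLink displayLinkWithTarget:self
                                                          selector:@selector(onDisplayLink:)];
        link.preferredFramesPerSecond = 30; // caps the multiTexFilter draw rate (iOS 10+)
        [link addToRunLoop:[NSRunLoop mainRunLoop] forMode:NSRunLoopCommonModes];
    }

    - (void)onDisplayLink:(CADisplayLink *)link {
        // Render cropFilter 1..N in order; only the last one (the mainIndex)
        // drives the multiTexFilter, so the final draw happens once per tick.
        for (NSInteger i = 0; i < self.movies.count; i++) {
            CMSampleBufferRef sample = [self.dataManager latestSampleAtIndex:i]; // may be NULL
            if (sample) {
                [self.movies[i] processMovieFrame:sample];
            }
            // No new data: the bound region keeps the previous frame's contents.
        }
    }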

For comparison purposes, the equivalent demo is implemented here. Compared with Solution 2, there are several differences (points 2 and 3 are sketched below):
1. CADisplayLink drives the rendering, and only the current latest frame is read each time.
2. LYAssetReader is introduced; it is a simple wrapper around AVAssetReader and AVAssetReaderTrackOutput that enables looped playback and retrieval of the current latest video frame.
3. GPUImageMovie's -processMovieFrame: interface is used as the input to the filter chain.
4. The rendering order of the cropFilters can be controlled on each render pass.
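Roughly what such a wrapper does internally, sketched with plain AVFoundation calls (LYAssetReader's real interface is not shown in the article, so this is an assumption; videoURL and imageMovie are illustrative):

    #import <AVFoundation/AVFoundation.h>

    // Read a video frame with AVAssetReader/AVAssetReaderTrackOutput and feed
    // it to GPUImageMovie via -processMovieFrame:.
    AVURLAsset *asset = [AVURLAsset URLAssetWithURL:videoURL options:nil];
    AVAssetReader *reader = [AVAssetReader assetReaderWithAsset:asset error:NULL];
    AVAssetTrack *videoTrack = [[asset tracksWithMediaType:AVMediaTypeVideo] firstObject];
    NSDictionary *settings = @{(id)kCVPixelBufferPixelFormatTypeKey:
                                   @(kCVPixelFormatType_420YpCbCr8BiPlanarFullRange)};
    AVAssetReaderTrackOutput *output =
        [AVAssetReaderTrackOutput assetReaderTrackOutputWithTrack:videoTrack
                                                   outputSettings:settings];
    [reader addOutput:output];
    [reader startReading];

    CMSampleBufferRef sample = [output copyNextSampleBuffer]; // NULL at end of file
    if (sample) {
        // imageMovie must have been created with -initWithURL:
        // (see the pitfalls section below).
        [imageMovie processMovieFrame:sample];
        CFRelease(sample);
    }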

On balance, Solution 3, the screen-frame-rate-driven single-GPUImageView solution, is tentatively chosen as the rendering solution for the production environment.

V. Pitfalls in the Demo Implementation

1. Framebuffer reuse

It had been a while since I last touched GPUImage, which led to several pitfalls during demo development. The first was how to keep the on-screen content rendering continuously. When implementing the demo for Solution 2, one of the videos in the multi-video rendering may fail to update, for example when its GPUImageMovie is slow to read the source data; the display region corresponding to that GPUImageMovie cannot be redrawn at that moment, so the region's content displays abnormally and flickers (because GPUImage's framebuffers are reused). There are two options here: 1. keep the last frame read by GPUImageMovie and pass it to the cropFilters for rendering each time; 2. keep the cropFilters' last rendering results. From a performance-optimization point of view, keeping the last rendering result makes more sense. The implementation requires some knowledge of GPUImage and OpenGL: keeping the rendering result actually means reusing the previous framebuffer and not calling glClear to wipe it. However, GPUImage's outputFramebuffer is recycled after rendering, so some modifications are needed, as follows.

    if (!outputFramebuffer) {
        // First frame: fetch a framebuffer from the cache and clear it once.
        outputFramebuffer = [[GPUImageContext sharedFramebufferCache] fetchFramebufferForSize:[self sizeOfFBO] textureOptions:self.outputTextureOptions onlyTexture:NO];
        glClearColor(backgroundColorRed, backgroundColorGreen, backgroundColorBlue, backgroundColorAlpha);
        glClear(GL_COLOR_BUFFER_BIT);
    }
    else {
        // Subsequent frames: keep the same framebuffer (and its contents);
        // lock it again to balance the unlock performed after each render.
        [outputFramebuffer lock];
    }

A framebuffer is requested from the cache only the first time; later uses of outputFramebuffer only need a lock. There is a pitfall here: GPUImageContext's fetchFramebufferForSize: performs one lock by default, which is balanced by an unlock after the framebuffer is used; if the second use of outputFramebuffer is not preceded by a lock, the retainCount drops below 0.
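Correspondingly, caveat 3 from earlier: since the framebuffer is now held for the filter's whole lifetime, the outstanding lock has to be balanced when the filter is destroyed. A sketch, assuming this is the only remaining reference (the demo's actual cleanup may differ):

    // Release the permanently held framebuffer so it can return to the cache.
    - (void)dealloc {
        [outputFramebuffer unlock];
        outputFramebuffer = nil;
    }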

2. GPUImageMovie handles CMSampleBuffer

Another bigger pitfall was GPUImageMovie. While implementing the demo for Solution 3, a consistently reproducible crash occurred: after calling GPUImageMovie's -processMovieFrame: to process a CMSampleBuffer read from a video frame, glDrawArrays in convertYUVToRGBOutput raised an error of type EXC_BAD_ACCESS, with the keyword "gleRunVertexSubmitARM". I suspected texture anomalies, vertex-data anomalies, inconsistent processing threads, and so on; none of these turned out to be the cause.

Finally, the following code was inserted at suspect points to locate the problem by bisection.

{
    GLenum err = glGetError();
    if (err != GL_NO_ERROR) {
        printf("glError: %04x caught at %s:%u\n", err, __FILE__, __LINE__);
    }
}

The heart of the matter was the line GPUImageMovie *imageMovie = [[GPUImageMovie alloc] init];. GPUImageMovie's convertYUVToRGBOutput depends on internal state that is only set up by the -initWithURL: initializer!
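In other words (videoURL is illustrative; the point is which initializer runs the internal YUV setup):

    // Crashes later: plain -init skips GPUImageMovie's YUV-conversion setup,
    // so -processMovieFrame: fails inside convertYUVToRGBOutput.
    GPUImageMovie *broken = [[GPUImageMovie alloc] init];

    // Works: -initWithURL: performs the setup that -processMovieFrame: needs.
    GPUImageMovie *working = [[GPUImageMovie alloc] initWithURL:videoURL];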

Conclusion

Through this analysis, we can see the application scenarios of each of the three solutions. The demo does not involve a large amount of code, but it applies GPUImage flexibly and is a good test of one's OpenGL ES knowledge. If you do not know much about OpenGL ES yet, please click through to the OpenGL ES Anthology.

