-
-
Notifications
You must be signed in to change notification settings - Fork 414
Add option to wait for submitted GL commands before surface destruction #3584
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Add option to wait for submitted GL commands before surface destruction #3584
Conversation
Can you provide some more context? When is this needed? |
We're seeing a case on specialized hardware where one render thread is going and another one gets stuck shutting down in the surface or context destruction. Specifically in the egl calls. The thinking is flushing out the queue first might help with that. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Something like this should not be configurable in my opinion.
Unless it's just intended for testing with the intention to decide on a possible other default behavior later. In that case please add a comment with an explanation and a tracking issue.
mEglSurface = null; | ||
} | ||
} | ||
|
||
public void finish() { | ||
if (DEFER_CONTEXT_DESTRUCTION) { | ||
MapLibreGLSurfaceView view = mGLSurfaceViewWeakRef.get(); | ||
if (view != null && (mEglContext != null || mEglDisplay != null)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is a race condition here. To fix it, assign the "final" identifiers first, then use those in the "if" condition.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The surface/context are touched only by the render thread (and in synchronized blocks). Unless I'm missing something this should guarantee that the values are valid in that scope.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps, but that requires non-local knowledge. For example, I am not sure if your analysis is correct.
Meanwhile, if you adopt the suggestion, then we won't need to discuss it. It will locally-provable that there is no race condition. That is preferable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We'd like to see if this case is necessary. It's a weird way to work around a driver bug and, if we're lucky, we can get rid of it completely.
If it does have to be merged in, then we should look closer.
factory.destroyContext(mEgl, display, context); | ||
} | ||
|
||
if (display != null) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the original code this runs whenever display != null
. Here we might skip this based on the earlier "if" condition (line 313). I don't see why.
More generally, there are two ways of finishing, by duplication. What about combining these, e.g. always defining and using the "final" identifiers, so that we either do "the thing" in this thread or we do "the thing" in a cleanup thread, but where we always do the same thing?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm. You're right. The display cleanup will get skipped when there is no valid mGLSurfaceViewWeakRef
, but at the same time MapLibreGLSurfaceView
is the owner of the EglHelper
(and passes itself as a weak reference) and I'm not sure if there is a scenario where the parent can go out of scope and the helper is still alive (this leads to other problematic leaks).
My main concern when adding the workarounds was to keep the existing surface lifecycle code the same since it has been problematic in the past. If it is confirmed that they solve the existing issues this needs to be extended to cover the texture view variant.
MapLibreGLSurfaceView view = mGLSurfaceViewWeakRef.get(); | ||
if (view != null) { | ||
view.eglWindowSurfaceFactory.destroySurface(mEgl, mEglDisplay, mEglSurface); | ||
if (DEFER_CONTEXT_DESTRUCTION) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: To keep these cases from looking more different than they are, than they are supposed to be, how about swapping the "final" identifier assignments with the "if" statement. We could always call factory.destroySurface(mEgl, display, surface);
either in this thread or in the cleanup thread.
Possible workarounds for some obscure Android ANRs during surface/context destruction (more info #3618).
MapLibreGLSurfaceView.WAIT_COMMANDS_ON_CLEANUP = true
tries to wait the GPU commands since the opengl/egl implementation seems to hang on cleanup, so we're trying to enforce the order. When enabled it adds some wait time since it forces GPU task completion.MapLibreGLSurfaceView.DEFER_CONTEXT_DESTRUCTION = true
moves the blocking calls to a different thread. The main thread still needs to pause the render thread (and wait for it).