From 31ab17fb0c52b1d8aa7b556c0d18927eb92e2208 Mon Sep 17 00:00:00 2001
From: Christian Poetzsch <christian.potzsch@imgtec.com>
Date: Mon, 7 Dec 2015 13:36:22 +0000
Subject: [PATCH] Fix the execution point of onFrameAvailable/onFrameReplaced
 callbacks

In a4650a5 the concept of a maximum frame number allowance for the consumer was
introduced. A call to acquireBuffers will only return buffers when their frame
number is less-than-or-equal-to this maximum frame number. When SurfaceFlinger
is the consumer, this maximum  frame number is calculated in the
onFrameAvailable/onFrameReplaced callbacks. These callbacks are called when a
new buffer is dequeued by the application. The problem is that these callbacks
are called _after_ the fence wait which is used to throttle the frame
production of client apps. When the previous frame needs a long time to draw,
those waits can potentially be a long time. As a result SurfaceFlinger won't do
any composition with the new frame until the wait is over.

Normally this isn't a big problem because there is a queue of buffers for
SurfaceFlinger to work with. However, this changes massively when a client app
is using a swap interval of zero. In this case, a new frame will instantly
replace the previous queued frame. However, SurfaceFlinger doesn't know this
until the onFrameReplaced callback gets called - which is delayed by the fence
wait. If the timing is bad, SurfaceFlinger never gets a chance to pick up a new
frame to do the composition with.

We see this behaviour on our TC development system (slow GPU) with legacy
on-screen benchmarks. Such apps are using a swap interval of zero and sometimes
frames don't get updated for several seconds. This behaviour can be also seen
on a Nexus5, although it isn't as obvious as on our TC.

The fix in this cl is to move the EGL throttling to the end of the queueBuffers
function. This ensures that if a frame gets replaced in the queue, all
consumers who installed the callbacks, get called in a timely fashion.

Change-Id: I36e9ecda162150f41e97d4fb7437963a3d86b371
Signed-off-by: Christian Poetzsch <christian.potzsch@imgtec.com>
---
 libs/gui/BufferQueueProducer.cpp | 18 +++++++++---------
 1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/libs/gui/BufferQueueProducer.cpp b/libs/gui/BufferQueueProducer.cpp
index 87e5b4d27..a941e2d29 100644
--- a/libs/gui/BufferQueueProducer.cpp
+++ b/libs/gui/BufferQueueProducer.cpp
@@ -696,15 +696,6 @@ status_t BufferQueueProducer::queueBuffer(int slot,
         mCore->validateConsistencyLocked();
     } // Autolock scope
 
-    // Wait without lock held
-    if (mCore->mConnectedApi == NATIVE_WINDOW_API_EGL) {
-        // Waiting here allows for two full buffers to be queued but not a
-        // third. In the event that frames take varying time, this makes a
-        // small trade-off in favor of latency rather than throughput.
-        mLastQueueBufferFence->waitForever("Throttling EGL Production");
-        mLastQueueBufferFence = fence;
-    }
-
     // Don't send the GraphicBuffer through the callback, and don't send
     // the slot number, since the consumer shouldn't need it
     item.mGraphicBuffer.clear();
@@ -728,6 +719,15 @@ status_t BufferQueueProducer::queueBuffer(int slot,
         mCallbackCondition.broadcast();
     }
 
+    // Wait without lock held
+    if (mCore->mConnectedApi == NATIVE_WINDOW_API_EGL) {
+        // Waiting here allows for two full buffers to be queued but not a
+        // third. In the event that frames take varying time, this makes a
+        // small trade-off in favor of latency rather than throughput.
+        mLastQueueBufferFence->waitForever("Throttling EGL Production");
+        mLastQueueBufferFence = fence;
+    }
+
     return NO_ERROR;
 }