Discussion:
[SDL] Suspend worker thread when in background
rtrussell
2017-01-18 04:34:12 UTC
Permalink
I create a worker thread using SDL_CreateThread and on Android I need to be able to suspend that thread on an SDL_APP_WILLENTERBACKGROUND event. Unfortunately there doesn't seem to be a SDL_SuspendThread or similar function. This must be a common problem so what is the solution?

Richard.
rtrussell
2017-01-19 22:50:03 UTC
Permalink
This must be a common problem...
Well, so I thought but apparently not! Does nobody else use worker threads on Android?

Richard.
Daniel Gibson
2017-01-19 23:00:32 UTC
Permalink
You could use https://wiki.libsdl.org/SDL_CondWait in the worker threads
and signal the condition variable on SDL_APP_WILLENTERFOREGROUND event
so the worker threads continue running.
(Probably only call SDL_CondWait() if some global bool variable you set
on WILLENTERBACKGROUND is true - remember to set it back to false on
WILLENTERFOREGROUND)
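
A minimal sketch of this flag-plus-condition-variable idea follows. It is written against POSIX threads only so that it builds without SDL; in an SDL program SDL_LockMutex, SDL_UnlockMutex, SDL_CondWait and SDL_CondBroadcast would play the roles of the pthread calls. The PauseGate type and the pause_gate_* names are hypothetical and not part of SDL.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical helper: a "pause gate" shared by the event filter and the worker. */
typedef struct {
    pthread_mutex_t lock;    /* protects 'paused' (SDL: SDL_mutex) */
    pthread_cond_t  resumed; /* signalled when 'paused' becomes false (SDL: SDL_cond) */
    bool            paused;  /* the "global bool" from the suggestion above */
} PauseGate;

void pause_gate_init(PauseGate *g)
{
    pthread_mutex_init(&g->lock, NULL);
    pthread_cond_init(&g->resumed, NULL);
    g->paused = false;
}

/* Called from the event handling side: true on WILLENTERBACKGROUND,
   false on WILLENTERFOREGROUND. */
void pause_gate_set(PauseGate *g, bool paused)
{
    pthread_mutex_lock(&g->lock);
    g->paused = paused;
    if (!paused) {
        /* Wake every worker blocked in pause_gate_wait(). */
        pthread_cond_broadcast(&g->resumed);
    }
    pthread_mutex_unlock(&g->lock);
}

/* Called by the worker at points where it is safe to stop.
   Returns immediately unless the gate is paused. */
void pause_gate_wait(PauseGate *g)
{
    pthread_mutex_lock(&g->lock);
    while (g->paused) { /* loop guards against spurious wakeups */
        pthread_cond_wait(&g->resumed, &g->lock);
    }
    pthread_mutex_unlock(&g->lock);
}

bool pause_gate_is_paused(PauseGate *g)
{
    pthread_mutex_lock(&g->lock);
    bool paused = g->paused;
    pthread_mutex_unlock(&g->lock);
    return paused;
}
```

Note that the worker only stops the next time it reaches pause_gate_wait(), so how quickly it suspends depends on how often it calls it.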

Cheers,
Daniel
Post by rtrussell
I create a worker thread using SDL_CreateThread and on Android I need to
be able to suspend that thread on an SDL_APP_WILLENTERBACKGROUND event.
Unfortunately there doesn't seem to be a SDL_SuspendThread or similar
function. This must be a common problem so what is the solution?
Richard.
rtrussell
2017-01-19 23:31:33 UTC
Permalink
You could use SDL_CondWait in the worker threads
I'm already doing the equivalent, but since it's a 'polling' approach (the thread will only be suspended when it next calls SDL_CondWait) and my worker thread does some quite time-consuming things (it might even be in a lengthy memcpy), I cannot guarantee that it will be suspended in a timely fashion. It works most of the time, but my app still crashes intermittently when Android puts it into the background.

Google seems to suggest that asynchronously suspending a thread in Android is deprecated (http://stackoverflow.com/questions/10189289/how-to-suspend-and-resume-threads-in-android), and one should indeed use the kind of polling approach you are advocating. But I don't understand how that can meet the requirement to respond to the SDL_APP_WILLENTERBACKGROUND event 'immediately'.

Richard.
Daniel Gibson
2017-01-20 00:05:02 UTC
Permalink
Err.. about what kind of time spans are we talking here?
memcpy() taking a "long" amount of time seems very weird to me.
Even if it takes like a second until all threads are suspended (or maybe
even a few seconds?) that shouldn't be a problem?
You could use SDL_CondWait in the worker threads
I'm already doing the equivalent, but since it's a 'polling' approach
(the thread will only be suspended when it next calls SDL_CondWait) and
my worker thread does some quite time-consuming things (it might even be
in a lengthy memcpy), I cannot guarantee that it will be suspended in a
timely fashion. It works most of the time, but my app still crashes
intermittently when Android puts it into the background.
Google seems to suggest that asynchronously suspending a thread in
Android is deprecated
<http://stackoverflow.com/questions/10189289/how-to-suspend-and-resume-threads-in-android>,
and one should indeed use the kind of polling approach you are
advocating. But I don't understand how that can meet the requirement to
respond to the SDL_APP_WILLENTERBACKGROUND event 'immediately'.
Richard.
_______________________________________________
SDL mailing list
http://lists.libsdl.org/listinfo.cgi/sdl-libsdl.org
rtrussell
2017-01-20 00:24:37 UTC
Permalink
Even if it takes like a second until all threads are suspended (or maybe even a few seconds?) that shouldn't be a problem?
Shouldn't it? This is what the SDL docs say about the app events (https://wiki.libsdl.org/SDL_EventType#Android.2C_iOS_and_WinRT_Events): "These events must be handled in an event filter, since often the OS needs an immediate response and will terminate your process shortly after sending the event, and if it sits in the SDL event queue, it'll be too late". I interpreted this as meaning that the app must respond in less than a frame period (which is typically how long an SDL app might take before checking the event queue), so even a few milliseconds might be too long.

When I trace the crashes with logcat I see my worker thread trying to access memory that Android has already 'paged out' whilst putting my app into the background. Subjectively this is happening 'instantly', not after a few seconds.

Richard.
Daniel Gibson
2017-01-20 02:30:54 UTC
Permalink
Well ok, this sucks.
Is there a way to tell Android to wait a little with either "paging out"
the memory or even with switching to another app?
Otherwise I have no idea, but maybe someone who (unlike me) has actual
experience with Android can help ;)

Cheers,
Daniel
Shouldn't it? This is what the SDL docs say about the app events
"These events must be handled in an event filter, since often the OS
needs an immediate response and will terminate your process shortly
after sending the event, and if it sits in the SDL event queue, it'll be
too late". I interpreted this as meaning that the app must respond in
less than a frame period (which is typically how long an SDL app might
take before checking the event queue), so even a few milliseconds might
be too long.
When I trace the crashes with logcat I see my worker thread trying to
access memory that Android has already 'paged out' whilst putting my app
into the background. Subjectively this is happening 'instantly', not
after a few seconds.
Richard.
rtrussell
2017-01-20 10:24:04 UTC
Permalink
Is there a way to tell Android to wait a little with either "paging out" the memory or even with switching to another app?
Otherwise I have no idea, but maybe someone who (unlike me) has actual experience with Android can help
Like you my knowledge of Android is very limited, and to a large extent I'm relying on the SDL docs to tell me what to do (not entirely unreasonable I think - it is supposed to be an 'abstraction layer' that isolates the programmer from platform-specific detail!).

What surprises me is that this seems to be an issue that has not been widely encountered before. I would have expected worker threads to be the norm these days - it's effectively the only way to achieve a short latency even if the UI thread is waiting for VSYNC, and to leverage the power of multi-core CPUs.

Richard.
Eric Wing
2017-01-21 01:54:03 UTC
Permalink
Post by rtrussell
Post by Daniel Gibson
Is there a way to tell Android to wait a little with either "paging out"
the memory or even with switching to another app?
Otherwise I have no idea, but maybe someone who (unlike me) has actual
experience with Android can help
Like you my knowledge of Android is very limited, and to a large extent I'm
relying on the SDL docs to tell me what to do (not entirely unreasonable I
think - it is supposed to be an 'abstraction layer' that isolates the
programmer from platform-specific detail!).
What surprises me is that this seems to be an issue that has not been widely
encountered before. I would have expected worker threads to be the norm
these days - it's effectively the only way to achieve a short latency even
if the UI thread is waiting for VSYNC, and to leverage the power of
multi-core CPUs.
Richard.
Android, and particularly the Android NDK, is awful. You can find
many, many people who attest to this, including John Carmack. I give
talks from time to time on Android development and always remember to
call out how bad the NDK is, and I always get NDK developers thanking
me at the end of the talk for publicly saying what we all know. So you
are entering a rite of passage.

- To your original problem of suspending the thread:
I say don’t suspend your threads and just make them end. Design your
threads so you tell them to quit and you do a wait/join.

Then when your app resumes, start up new threads.


- I would have expected worker threads to be the norm these days

Threads + cross-platform has always been hard. So this does temper
devs to some degree.

SDL’s thread library tries to point you in the right direction by
having a very small set of functions. For example, there is no
SDL_SuspendThread because it is not ubiquitous. (As you discovered,
Android doesn’t want to support it.) But unfortunately, it does not
absolve you from understanding things about the platforms. Threads are
among the worst because there are so many subtle differences and they
really leak serious abstraction details. For example, Emscripten/web
browser does not support threads at all (ignore web workers). And as
far as leaky abstractions, one of the most common pitfalls is many
platforms can’t handle doing certain operations on the non-main
thread, so everybody using threads needs to be very careful about
this.



- app events race condition:

Unfortunately, this is a consequence of threads/leaky abstraction
details, and Android just being terrible. The fundamental problem is
that on Android, SDL is actually running on a background thread. So
when an app suspends, backgrounds, resumes, quits, etc., the events
are happening on the main Android thread, and not the SDL thread.
Android expects you to handle them NOW! But since SDL is on a
background thread, there is a potential race condition.

I have worked on many open source and commercial Android projects, and
every one of them made the same mistake and put their stuff on a
background thread. It didn’t help that Google itself encouraged
developers to do this with their Native Activity example. I have
compared notes with a lot of other Android devs who have to push the
envelope on the NDK too, and we all came to the same conclusion that
Native Activity/background thread is a terrible idea.

Every single project I’ve been on has had some hard edge case race
condition with suspend, background, resume, or quit. I’d actually like
to write a new alternative SDLActivity some day that keeps SDL on the
main thread. This would require some changes on how people deal with
the event loop, but SDL actually already has to do this for
Emscripten, and also has a special function to enable this mode on
iOS, so it actually isn’t unprecedented.


But that aside, I thought SDL will quit correctly on Android with
respect to threads because it basically wait/joins the SDL background
thread to end it, blocking the app on the main thread from quitting
until this happens. (Threads are tricky so maybe there is an edge case
I’m not seeing.)

The bug that I usually see about quitting is due to a very different
stupid, “NDK is terrible”, design problem. Android actually doesn’t
necessarily clean up your NDK side memory. It just leaves it there
untouched. So if you have global or static variables, when you restart
the app, the Android system actually skips re-initializing all your
global and static variables because it considers them already
initialized. So these variables remain at their previous value which
can completely break your application logic.

#include "SDL.h"

int g_isRunning = 1;

int main(int argc, char* argv[])
{
    SDL_Init(SDL_INIT_VIDEO);
    // standard SDL event loop
    while(g_isRunning)
    {
        // handle the event loop which will set the g_isRunning to 0 if the user quits
    }

    SDL_Quit();
    return 0;
}


The first time you launch this, it will work.
Then quit the app.
Then restart the app.

g_isRunning will still be 0 from the last run because Android skips
re-initializing the variable to 1.


Basically, this forces you to be very careful, and always explicitly
reinitialize your globals and statics.


#include "SDL.h"

int g_isRunning = 1;

int main(int argc, char* argv[])
{
    // forces reinitialization for subsequent starts
    g_isRunning = 1;

    SDL_Init(SDL_INIT_VIDEO);
    // standard SDL event loop
    while(g_isRunning)
    {
        // handle the event loop which will set the g_isRunning to 0 if the user quits
    }

    SDL_Quit();
    return 0;
}
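
The effect can be reproduced on a desktop with a small simulation, assuming (as described above) that the native side's globals survive from one launch to the next. Calling a stand-in for main twice in the same process models the relaunch; app_main and its parameters are hypothetical names used only for this sketch, and it makes no claim about when Android actually reuses a process.

```c
#include <stdbool.h>

/* Initialized once, when the code is loaded, not on every "launch". */
static int g_is_running = 1;

/* Stand-in for main(): runs a fake event loop and returns how many
   iterations it executed before the simulated user quit. */
int app_main(bool reinit_globals, int frames_until_quit)
{
    int frames = 0;

    if (reinit_globals) {
        /* Defensive reset, as in the second listing above. */
        g_is_running = 1;
    }

    while (g_is_running) {
        frames++;
        if (frames >= frames_until_quit) {
            g_is_running = 0; /* the "user quits" event */
        }
    }
    return frames;
}
```

Without the reset, the second call returns 0 because the loop never starts; with it, the loop runs as on the first launch.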



-Eric
Sik the hedgehog
2017-01-21 08:30:57 UTC
Permalink
Post by Eric Wing
I say don’t suspend your threads and just make them end. Design your
threads so you tell them to quit and you do a wait/join.
Then when your app resumes, start up new threads.
If I'm understanding correctly the problem is that the threads crash
because they take too long until the next poll (resulting in resources
being freed up before the thread can respond), so that still wouldn't
solve the issue. Only solution in this case would be to just break
down the tasks so it polls more often, at the expense of
performance...
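
As a rough illustration of breaking a long operation into pollable pieces, the sketch below copies a buffer in fixed-size chunks and checks a stop flag between chunks. The function name copy_in_chunks and the use of a C11 atomic_bool for the flag are assumptions made for this example, not something taken from the application being discussed.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Copies up to 'len' bytes from src to dst, 'chunk' bytes at a time.
   Before each chunk it polls *stop and gives up early if it is set.
   Returns the number of bytes actually copied, so the caller can
   resume later from that offset. */
size_t copy_in_chunks(void *dst, const void *src, size_t len,
                      size_t chunk, atomic_bool *stop)
{
    size_t done = 0;

    if (chunk == 0) {
        return 0; /* avoid an endless loop on a zero chunk size */
    }

    while (done < len) {
        if (atomic_load(stop)) {
            break; /* pause or quit was requested */
        }
        size_t n = (len - done < chunk) ? (len - done) : chunk;
        memcpy((char *)dst + done, (const char *)src + done, n);
        done += n;
    }
    return done;
}
```

The worst-case delay before the thread notices the request is then roughly the time taken by one chunk, and a smaller chunk means more polling overhead, which is the performance trade-off mentioned above.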

Also the implication here is that somehow Android is freeing up memory
before actually stopping the thread using it. Which sounds dangerous
o_O
rtrussell
2017-01-21 10:30:29 UTC
Permalink
To your original problem of suspending the thread: I say don’t suspend your threads and just make them end. Design your threads so you tell them to quit and you do a wait/join.
I appreciate that in an ideal world this is preferable, but it's just not practical for me. This is a port of an application with its roots going back 35 years and its codebase virtually unchanged for at least 15 years, and there's far too much state in the worker thread to hope to save it and then restore it again. In any case, if my understanding is correct, there's no satisfactory way even of killing the thread asynchronously in Android so I'm no better off if in order to 'end' the thread it must poll some flag.
many platforms can’t handle doing certain operations on the non-main thread
Tell me about it! This caused me much grief, and taught me that initially developing an SDL app in Windows is a bad idea because it's more tolerant of this than the other platforms. So when I thought I had a pretty much fully tested and working app, I discovered that it totally failed when ported to Linux, Mac OS, Android...
I’d actually like to write a new alternative SDLActivity some day that keeps SDL on the main thread. This would require some changes on how people deal with
the event loop
This sounds like a great idea, so long as it doesn't hit performance (running in a separate thread can be advantageous if it means running on a separate CPU core). Your frustration at the shortcomings of Android is clear and understandable, but it's not going to go away and we need to try to find workarounds like this.
So these variables remain at their previous value which can completely break your application logic.
I've read this before, but it's never affected me despite not coding defensively against it. Whether I have simply been lucky I don't know, but I think I have seen comments to the effect that more recent versions of Android/SDL (I'm not sure which) don't suffer from this issue to the same degree.

Richard.
Eric Wing
2017-01-23 18:03:31 UTC
Permalink
Post by rtrussell
To your original problem of suspending the thread: I say don’t suspend
your threads and just make them end. Design your threads so you tell them
to quit and you do a wait/join.
I appreciate that in an ideal world this is preferable, but it's just not
practical for me. This is a port of an application with its roots going
back 35 years and its codebase virtually unchanged for at least 15 years,
and there's far too much state in the worker thread to hope to save it and
then restore it again. In any case, if my understanding is correct, there's
no satisfactory way even of killing the thread asynchronously in Android so
I'm no better off if in order to 'end' the thread it must poll some flag.
So we’re talking about Android here. Android will happily keep sucking
up CPU cycles and battery on your app’s background threads while
backgrounded. Worst case, you can just keep eating cycles.

But your worker threads don’t have a natural stopping point even when
there is no new incoming data/activity? You can’t add a flag to poll
at the end of their natural work to end the thread? So maybe they eat
some extra cycles for awhile after your app first backgrounds, but if
we're talking at most a few dozen seconds, I doubt anybody will notice
or care.


In your case, Application exit is the edge case you need to worry
about. If it is a big problem, hacking in a kill thread may be a
reasonable option.
Post by rtrussell
I’d actually like to write a new alternative SDLActivity some day that
keeps SDL on the main thread. This would require some changes on how
people deal with
the event loop
This sounds like a great idea, so long as it doesn't hit performance
(running in a separate thread can be advantageous if it means running on a
separate CPU core). Your frustration at the shortcomings of Android are
clear and understandable, but it's not going to go away and we need to try
to find workarounds like this.
It *shouldn't* affect performance. The OS scheduler should be doing
load balancing. The main UI thread (your app) shouldn't be doing much
of anything except supporting your app so those are CPU cycles you
need to spend anyway. More often than not, what I see is people
automatically assume more threads == more performance, but that's far
from the truth and often works against you. The main UI thread still
must deal with things like touch events. These things must be passed
to SDL, but now we have the complication of needing to communicate
safely between threads. I don't remember SDL's implementation, but
I've been in many other Android projects, and things like this usually
introduce either locking (which stalls both threads), or more async
which increases latency.

Careful utilization of threads can give you high performance (like for
chugging long pipelines of data). But simply adding random threads
when you aren't CPU bound to begin with, more often than not just
creates more sync points (locking) and unnecessary context switches.
If you need performance, you should be designing in how you want to
use threads to accomplish this.


And also, the Android garbage collector can assert itself and halt all
threads. I worked on a major app that kept hitting this problem,
killing the game's playability. Even though we were on a background
thread, we were affected by the GC constantly blocking all threads for
more than a couple of frames. I'm not sure if Android was actually
halting our native thread, or if it was just a result of something
making a system call calling into Java which had suspended all
threads.
Post by rtrussell
So these variables remain at their previous value which can completely
break your application logic.
I've read this before, but it's never affected me despite not coding
defensively against it. Whether I have simply been lucky I don't know, but
I think I have seen comments to the effect that more recent versions of
Android/SDL (I'm not sure which) don't suffer from this issue to the same
degree.
This problem is an Android NDK behavior and not specific to SDL. It
still remains a problem and has not improved because Google considers
it a “performance feature” in that they do absolutely no work here.
Good libraries like SDL will clean up their global/static variables
and reinitialize them with their Quit/Init functions, but it presumes
that you called them correctly in your code. But there is a lot of
code (especially user code) that is not as meticulous as SDL in this
regard, and the complex application life-cycle of Android doesn’t
always make it easy to call quit/cleanup at the right time.



-Eric
rtrussell
2017-01-23 23:23:09 UTC
Permalink
Android will happily keep sucking up CPU cycles and battery on your app’s background threads while backgrounded. Worst case, you can just keep eating cycles.
What I think I've seen happening is that the worker thread keeps running but the memory that it is using 'disappears', resulting in a segfault as soon as it attempts to access that memory. This was not unexpected, because I assumed it was normal for Android to 'page out' an application's memory when it is in the background, and hence my desire to suspend the worker thread so it won't try to access it.

But since you don't seem to think that leaving a worker thread running should be a major problem (other than wasting CPU cycles) am I wrong in my assumption? What would you expect to happen to memory that the application has allocated (in my case using 'mmap') when it is backgrounded?

Richard.
Mason Wheeler
2017-01-24 02:01:57 UTC
Permalink
Paging is one thing; that refers to swapping unused memory out to the page file as part of virtual memory management.  But what you're describing is unmapping the memory out of the process space, and that is absolutely not something that should be happening over the normal course of events!
Mason

From: rtrussell <***@rtrussell.co.uk>
To: ***@lists.libsdl.org
Sent: Monday, January 23, 2017 6:23 PM
Subject: Re: [SDL] Suspend worker thread when in background




Eric Wing wrote:
Android will happily keep sucking up CPU cycles and battery on your app’s background threads while backgrounded. Worst case, you can just keep eating cycles.

What I think I've seen happening is that the worker thread keeps running but the memory that it is using 'disappears', resulting in a segfault as soon as it attempts to access that memory. This was not unexpected, because I assumed it was normal for Android to 'page out' an application's memory when it is in the background, and hence my desire to suspend the worker thread so it won't try to access it.

But since you don't seem to think that leaving a worker thread running should be a major problem (other than wasting CPU cycles) am I wrong in my assumption? What would you expect to happen to memory that the application has allocated (in my case using 'mmap') when it is backgrounded?

Richard.
Eric Wing
2017-01-24 17:50:42 UTC
Permalink
Post by rtrussell
Android will happily keep sucking up CPU cycles and battery on your app’s
background threads while backgrounded. Worst case, you can just keep
eating cycles.
What I think I've seen happening is that the worker thread keeps running but
the memory that it is using 'disappears', resulting in a segfault as soon as
it attempts to access that memory. This was not unexpected, because I
assumed it was normal for Android to 'page out' an application's memory when
it is in the background, and hence my desire to suspend the worker thread so
it won't try to access it.
But since you don't seem to think that leaving a worker thread running
should be a major problem (other than wasting CPU cycles) am I wrong in my
assumption? What would you expect to happen to memory that the application
has allocated (in my case using 'mmap') when it is backgrounded?
Richard.
First, you need to make sure you are distinguishing between a
background event and a quit event (which are you getting or maybe you
are getting both).

Second, generally speaking, Android is very hands off when it comes to
the NDK side, because it is mostly oblivious to what's going on and
Google really doesn't want to be bothered writing stuff for the NDK
one way or the other. In a few respects, this works to the advantage
of the NDK developer. For example, Android imposes a Java heap size
limit on all apps. Back in 2.3-ish, it was some ridiculously small
limit like 64MB for an app, where phones may have shipped 512MB of
RAM. Games written in purely the SDK would get killed if they exceeded
64MB, which you could blow through easily with just a few large
uncompressed textures. But Android being oblivious to the NDK is
unable to enforce the Java heap size limit so you can use all the
memory that is actually available. Hence another compelling reason why
games are mostly written using the NDK.


When Android needs more memory for another app, it will just reap your
entire process.

I don't know what Android does with mmap, if anything.

For background threads, I remember way, way back (2010-ish?), when we
were first getting OpenAL Soft and ALmixer running on Android, when
the app was backgrounded (not quit), both OpenAL Soft continued
running its mixing background thread, and ALmixer continued running
its background update loop thread. We got complaints from users (and
rightly so) that their CPU process meter apps were showing Corona
based apps eating significant CPU while backgrounded. (Remember, a lot
of users don't quit apps on Android...they just switch between apps
like on iOS.) So we had to write extra code to basically end their
threads on backgrounding. And I remember getting regression bugs or
subtle changes in behavior for some devices or OS versions in how they
traverse the application life cycle, that we might miss the background
event notification so we didn't suspend the threads (and we would get
more complaints to fix it).

-Eric
rtrussell
2017-02-01 20:29:39 UTC
Permalink
First, you need to make sure you are distinguishing between a background event and a quit event (which are you getting or maybe you
are getting both).
By "quit event" I take it you are referring to SDL_APP_TERMINATING, is that right? I am assuming that I will not receive that event as a result of putting my app into the background. The events I am 'expecting' are SDL_APP_WILLENTERBACKGROUND, SDL_APP_DIDENTERBACKGROUND and/or SDL_APP_DIDENTERFOREGROUND.
I don't know what Android does with mmap, if anything.
I have to use mmap rather than an alternative memory allocation function because I need the memory to be executable.

Here's a logcat entry for one of the crashes that happened when my app is backgrounded:


Code:
02-01 20:03:54.947 553 553 F DEBUG : signal 11 (SIGSEGV), code 2 (SEGV_ACCERR), fault addr 0x200390a5


It's a segfault for address 0x200390a5, which is within the block allocated to my process by mmap. How can I interpret this other than an indication that the memory has been unmapped?
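
One detail in the log line may help narrow this down: on Linux the si_code value distinguishes SEGV_MAPERR (code 1, address not mapped) from SEGV_ACCERR (code 2, invalid permissions for a mapped object). The sketch below assumes a Linux-like system with mmap, mprotect and sigaction, and uses a hypothetical helper name, probe_fault_code. It triggers both kinds of fault on a page of its own and reports the code each one produces.

```c
#define _DEFAULT_SOURCE /* MAP_ANONYMOUS, sigaction, sigsetjmp on glibc */
#include <setjmp.h>
#include <signal.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

static sigjmp_buf g_recover;
static volatile sig_atomic_t g_last_si_code;

/* Records why the fault happened, then jumps back out of the faulting read. */
static void on_segv(int sig, siginfo_t *info, void *ctx)
{
    (void)sig;
    (void)ctx;
    g_last_si_code = info->si_code;
    siglongjmp(g_recover, 1);
}

/* Maps one page, then either unmaps it (unmap_first != 0) or removes all
   access with PROT_NONE, and reads from it. Returns the si_code of the
   resulting SIGSEGV, 0 if no fault occurred, or -1 on setup failure. */
int probe_fault_code(int unmap_first)
{
    struct sigaction sa, old;
    long page_size = sysconf(_SC_PAGESIZE);
    if (page_size <= 0) {
        return -1;
    }
    size_t len = (size_t)page_size;

    void *mem = mmap(NULL, len, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED) {
        return -1;
    }
    volatile unsigned char *p = mem;

    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = on_segv;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGSEGV, &sa, &old) != 0) {
        munmap(mem, len);
        return -1;
    }

    g_last_si_code = 0;
    if (sigsetjmp(g_recover, 1) == 0) {
        if (unmap_first) {
            munmap(mem, len);               /* address no longer mapped */
        } else {
            mprotect(mem, len, PROT_NONE);  /* mapped, but no access allowed */
        }
        unsigned char sink = p[0];          /* this read is expected to fault */
        (void)sink;
    }

    sigaction(SIGSEGV, &old, NULL);         /* restore the previous handler */
    if (!unmap_first) {
        munmap(mem, len);
    }
    return g_last_si_code;
}
```

If the same holds on the device in question, a code 2 fault would point at the protection of the page rather than at the mapping having gone away. That is an inference from the signal code alone; the log line does not say what changed the protection, or which kind of access was attempted.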

Richard.
