New approach for IMAP Idle connection handling #2208
Comments
Related: deltachat/deltachat-android#1573
Link to commits, easier to review than the sources:
@link2xt great to point to commit history. This is much more clear 👍
I have looked into the implementation of Filed an issue: smol-rs/async-io#63
I also found that the OS treats the app differently when a shorter connection timeout value is used in a loop. I can describe it like this:
In the end, the loop duration is still not very accurate, but it is reliable enough for dependable operation. To get there I used a maximum timeout of 23 min instead of the 29 min that is the mail server's limit. It seems that an app which intends to run, rather than waiting on a semaphore or similar, is scheduled more often and more reliably by the OS. => These observations were made on Android 4.1 but may be valid for other OSes too.
imap connectivity has improved since then; meanwhile it is more smtp that causes problems. leaving this as a reference in resurrection, however. if needed, we can split off smaller, actionable tasks (cmp march2022 cleanup)
I think these findings could be important for further development, so it seems best to post them here as an issue. I think they are worth discussing, and it may even be necessary to do so. So let me start:
1. Motivation
Under bad/flaky network conditions, the following issues came up and led to a bad user experience:
Testing environment: Android 4.1.2
2. Goals
Reduce battery drain as much as possible => perform only the actions that are really necessary!
IMAP Idle timeout length limited only by technical conditions, max 29 min.
=> This should be handled properly by core alone (when possible).
Do job handling only if the network is available; all other actions will fail anyway and are wasted effort!
Reliable operation under all conditions
(For example: DC not opened, no manual intervention for a long time, device screen off)
** This is a must **
3. Background, Findings and Issues
While examining the approach to connection handling, job handling and the use of Android system functions, I found that no single issue is responsible for the unfavorable behaviour of DC; rather, several factors are responsible in combination!
In detail:
a) Periodic Work Request (PWR) (Android; interval 15 min)
With the default interval of 15 min, it is not possible to use the desired longer idle periods of up to 29 min.
After 15 min at the latest, all idles are interrupted.
Worse, the Periodic Work Request is not synchronized with the start of the IMAP idle timeout, so
very often the result is a much shorter idle duration.
Trying a longer interval for the PWR (for example 30 min) showed that this is not accepted by the Android
system; only the 15 min interval works.
Maybe this is an Android 4 limitation, but that is what I observed.
In the end, for all tests with timeouts longer than 15 min, the Periodic Work Request was either disabled completely or the actions it triggered were skipped by core!
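The effect of the unsynchronized PWR can be put into numbers with a small sketch. It assumes, purely for illustration, that the PWR fires exactly at minutes 0, 15, 30, ... and that each firing interrupts a running idle; the function name and parameters are hypothetical and not taken from DC or core.

```python
def effective_idle_minutes(idle_start: float,
                           idle_timeout: float = 23.0,
                           pwr_interval: float = 15.0) -> float:
    """Minutes an idle started at `idle_start` runs before it is ended.

    Assumption: the periodic work request fires at 0, pwr_interval,
    2 * pwr_interval, ... and interrupts any idle that is running.
    """
    # Minutes until the next periodic work request fires. An idle that
    # starts exactly on a tick gets the full interval.
    until_next_pwr = pwr_interval - (idle_start % pwr_interval)
    # The idle ends at whichever comes first: its own timeout or the PWR.
    return min(idle_timeout, until_next_pwr)


print(effective_idle_minutes(0.0))   # idle starts right on a PWR tick
print(effective_idle_minutes(12.0))  # idle starts 12 min after a tick
```

Under these assumptions a 23 min idle never runs longer than 15 min, and one that starts 12 min after a tick is cut off after 3 min.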
b) One long timeout for the IMAP idle connection lets core sleep unpredictably and hides network errors while waiting.
For the IMAP idle connection a timeout duration of 23 min is set (23 * 60s). Core then waits
for an external interrupt or for the timeout to expire.
The problem observed is that this long connection timeout leads to unreliable core behaviour.
Expiration of this timeout is never reliable. Most of the time it takes much, much longer than 23 min
or never happens at all! DC sleeps completely until the user wakes it up with a manual trigger!
As a result, the mail server ends the connection and core does not become aware of it!
Thus, broken connections and/or external connection problems are not detected.
Often there were minutes or even hours in which DC did not receive any messages.
c) Max idle connection timeout depends on network type (wifi/mobile)
When dealing with really long timeout periods, I found that the maximum connection timeout depends not only on the mail server but also on the type of network!
At home (wifi) I measured a maximum length of approx. 13 min;
on the mobile network I found stable operation up to the full 29 min that the mail server allows.
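Because the usable idle length differs per network, a fixed timeout fits poorly. The list in section 5 mentions a dynamic idle duration of 11-23 min controlled by connection failures, without saying how it is adjusted. The following is only a sketch of one possible rule: the 2 min step and the names are assumptions, not the author's implementation.

```python
MIN_IDLE_S = 11 * 60  # lower bound named in the issue
MAX_IDLE_S = 23 * 60  # upper bound named in the issue
STEP_S = 2 * 60       # hypothetical step size, not from the issue


def next_idle_duration(current_s: int, connection_failed: bool) -> int:
    """Return the idle duration (seconds) to use for the next idle period.

    Assumed rule: shorten the idle after a connection failure, lengthen it
    again after a successful idle, and always stay within 11-23 min.
    """
    if connection_failed:
        candidate = current_s - STEP_S
    else:
        candidate = current_s + STEP_S
    return max(MIN_IDLE_S, min(MAX_IDLE_S, candidate))


duration = MAX_IDLE_S
for failed in (True, True, False):
    duration = next_idle_duration(duration, failed)
    print(duration // 60, "min")
```

With such a rule the duration would drift toward the lower bound on a network that drops idle connections early and toward 23 min on one that does not.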
d) Network status delivered by the device is sometimes incorrect/unreliable
e) Unnecessary job handling, retransmissions and retries, even and especially when the network is not available.
This is caused by d) and by the fact that core is not told by an event whether the network goes up or down.
The interface between UI and core doesn't provide this up/down information (!)
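A rough sketch of how a connection status flag could gate job retries, following the ideas listed in section 5 (flag driven by network events and connection results, retries counted only while online, retry delays from a predefined list). All names and the delay values are hypothetical; the real FFI extension and core job code are not shown in this issue.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical predefined retry delays in seconds (values not from the issue).
RETRY_DELAYS_S = [10, 30, 60, 300]


class ConnectionStatus:
    """Single online/offline flag fed by two sources."""

    def __init__(self) -> None:
        self.online = False

    def on_network_event(self, up: bool) -> None:
        # Called when the UI reports a network up/down event.
        self.online = up

    def on_connection_result(self, ok: bool) -> None:
        # Called after a connection attempt succeeded or failed.
        self.online = ok


@dataclass
class Job:
    retries: int = 0
    failed: bool = False


def next_retry_delay(job: Job, status: ConnectionStatus) -> Optional[int]:
    """Return the delay before the next try, or None if no retry is scheduled."""
    if not status.online:
        # Offline: the job just waits and the retry counter is not touched.
        return None
    if job.retries >= len(RETRY_DELAYS_S):
        # All predefined delays are used up: give up on this job.
        job.failed = True
        return None
    delay = RETRY_DELAYS_S[job.retries]
    job.retries += 1
    return delay


status = ConnectionStatus()
job = Job()
print(next_retry_delay(job, status), job.retries)  # offline: nothing counted
status.on_network_event(True)
print(next_retry_delay(job, status), job.retries)  # online: first delay used
```

The point of the design is that time spent offline does not consume retries, so a job is not given up on merely because the device was in flight mode.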
f) Interleaved parallel job handling (old core version)
Job handling is not locked properly. When many network events are fired within seconds, some (identical) actions are started again in a new thread while the first action is still in progress.
=> Maybe this has been solved in the meantime by a newer core version.
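One common way to prevent such overlapping starts is a non-blocking lock: a trigger that arrives while job handling is already running is simply dropped. This is a generic sketch with hypothetical names, not the locking used in core.

```python
import threading

_job_lock = threading.Lock()


def run_jobs_once(perform_jobs) -> bool:
    """Run `perform_jobs` unless another run is already in progress.

    Returns True if the jobs were run, False if the call was skipped.
    """
    # Do not wait for the lock: a second trigger is redundant work.
    if not _job_lock.acquire(blocking=False):
        return False
    try:
        perform_jobs()
        return True
    finally:
        _job_lock.release()


def on_network_event() -> None:
    started = run_jobs_once(lambda: print("handling jobs"))
    if not started:
        print("job handling already in progress, trigger skipped")


on_network_event()
```

Skipping instead of queueing fits the case described above, where the repeated triggers would start the same actions again.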
g) Permanent notification (Android) - necessary to keep DC reliable, even with Android 4 (!)
Regardless of which actions were chosen, sooner or later DC ended up not receiving any messages.
The ONLY way to keep DC working reliably is to introduce (force) a "Permanent background notification", even for Android 4.
4. Debugging
At the beginning of all these examinations it was very hard to understand what core is really doing.
Preexisting logging was not sufficient to show all necessary information.
=> logging has been extended and reworked (text messages, trigger points and format) to reveal issues and root cause of core issues.
5. Basics of "new approach"
Use Android's "Periodic Work Request" only to check if core is working properly.
Interrupt only when a timing problem is detected.
=> store the end time of the next IMAP idle timeout in a variable and check it in a shorter loop!
Handle the IMAP idle connection with many short loop timeouts (5s) instead of one long timeout of 23 min.
=> This approach guarantees that core is inoperative for at most 2 min!
Dynamic IMAP idle duration, controlled by connection failures, 11-23mins
Extend FFI interface to get on/off status of a network event to core
New internal core connection status flag
=> controlled by device's network events AND connection behavior (error, success).
No job handling when being offline (connection status flag).
Increment job retries only when connection status flag shows "online".
Change the retry timer calculation to a predefined list of durations and reduce the number of retries.
No interleaved parallel job handling.
Suppress quick repetition and overlapping job starts caused by rapid network events.
Permanent background notification forced (this is a must, even on Android 4!)
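The central idea of the list above, many short waits checked against a stored deadline instead of one long timeout, could look roughly like the sketch below. It is a simplified illustration: `interrupted` and `connection_alive` are hypothetical callbacks standing in for whatever core uses, and only the 5s slice and the 23 min idle duration come from the issue.

```python
import time
from typing import Callable

SLICE_S = 5.0  # short loop timeout from the issue


def wait_idle(deadline: float,
              interrupted: Callable[[], bool],
              connection_alive: Callable[[], bool],
              now: Callable[[], float] = time.time,
              sleep: Callable[[float], None] = time.sleep) -> str:
    """Wait until `deadline`, waking up every SLICE_S seconds at most.

    Wall-clock time is used by default so that time the device spent
    suspended still counts toward the deadline (a choice of this sketch).
    """
    while True:
        if interrupted():
            return "interrupted"
        if not connection_alive():
            # A broken connection is noticed within one slice,
            # not only when the long timeout expires.
            return "connection_lost"
        remaining = deadline - now()
        if remaining <= 0:
            # Also reached if the process overslept past the deadline.
            return "timeout"
        sleep(min(SLICE_S, remaining))


# Usage with a very short idle, for demonstration only;
# a real idle would use e.g. time.time() + 23 * 60.
reason = wait_idle(time.time() + 0.2, lambda: False, lambda: True)
print(reason)
```

Because the deadline is an absolute time, a late wake-up cannot extend the idle: the next check sees that the deadline has passed and returns immediately.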
6. Experiences and Outlook with new approach
Experiences with flaky network conditions and overall operation:
- Very stable message reception
- Low network traffic
- Very low battery drain (always!)
- No job handling while being in "Flight Mode" or offline
No unsent messages any more.
There is good potential to optimize even the current DC connection handling
(DC 1.14.5, core 1.50.0): I checked the current sources and found the basic approach unchanged compared to DC 1.2.1 (core 1.27).
This has become a long summary, but as I mentioned at the start: no single issue is responsible for the unfavorable behaviour of DC. I have now been running the "new approach" for some months with great success.
I would say, it meets the goal :-)