|  | # EINTR | 
|  |  | 
|  | ## The problem | 
|  |  | 
|  | If your code is blocked in a system call when a signal needs to be delivered, | 
|  | the kernel has to interrupt that system call. For something like a read(2) | 
|  | call where some data has already been read, the call can just return with | 
|  | the data it has. (This is one reason why read(2) sometimes returns less data | 
|  | than you asked for, even though more data is available. It also explains why | 
|  | such short reads are relatively rare, which is why code that forgets to | 
|  | handle them is such a persistent cause of bugs.) | 
|  |  | 
|  | But what if read(2) hasn't read any data yet? Or what if you've made some other | 
|  | system call, for which there is no equivalent "partial" success, such as | 
|  | poll(2)? In poll(2)'s case, there's either something to report (in which | 
|  | case the system call would already have returned), or there isn't. | 
|  |  | 
|  | The kernel's solution to this problem is to return failure (-1) and set | 
|  | errno to `EINTR`: "interrupted system call". | 
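
To see this failure for yourself, here is a minimal sketch, assuming a Linux or Android system. It blocks in read(2) on an empty pipe and lets a `SIGALRM` handler interrupt it. The helper name `errno_from_interrupted_read` and the 50ms delay are made up for this illustration.

```c
#define _XOPEN_SOURCE 700  // Expose sigaction()/setitimer() under strict -std= flags.
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/time.h>
#include <unistd.h>

// A handler that does nothing. Its only job is to make the signal "caught"
// rather than ignored or fatal.
static void on_alarm(int signo) { (void)signo; }

// Hypothetical helper: blocks in read(2) on an empty pipe until SIGALRM
// arrives about 50ms later. Returns the errno left by the failed read(2),
// 0 if read(2) unexpectedly did not fail, or -1 if the pipe can't be created.
int errno_from_interrupted_read(void) {
  int fds[2];
  if (pipe(fds) == -1) return -1;

  struct sigaction sa;
  memset(&sa, 0, sizeof(sa));
  sa.sa_handler = on_alarm;  // sa_flags stays 0, so SA_RESTART is not requested.
  sigemptyset(&sa.sa_mask);
  sigaction(SIGALRM, &sa, NULL);

  struct itimerval timer;
  memset(&timer, 0, sizeof(timer));
  timer.it_value.tv_usec = 50 * 1000;  // One-shot timer, 50ms from now.
  setitimer(ITIMER_REAL, &timer, NULL);

  char buf[16];
  // The write end is still open and nothing has been written, so this blocks
  // until the signal arrives.
  ssize_t n = read(fds[0], buf, sizeof(buf));
  int saved_errno = (n == -1) ? errno : 0;

  close(fds[0]);
  close(fds[1]);
  return saved_errno;
}
```

Calling `errno_from_interrupted_read()` should return `EINTR`: no data was read before the signal arrived, so there was no partial result to return.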
|  |  | 
|  | ### Can I just opt out? | 
|  |  | 
|  | Technically, yes. In practice on Android, no. If a signal's disposition is | 
|  | set to ignore, the kernel doesn't even have to deliver the signal, so your | 
|  | code can just stay blocked in the system call it was already making. But | 
|  | unless you're a small single-threaded C program that doesn't use any | 
|  | libraries, you can't realistically guarantee that every signal is either | 
|  | ignored or will kill your process. If any code has installed a signal | 
|  | handler, you need to cope with | 
|  | `EINTR`. And if you're an Android app, the zygote has already installed a whole | 
|  | host of signal handlers before your code even starts to run. (And, no, you | 
|  | can't ignore them instead, because some of them are critical to how ART works. | 
|  | For example: Java `NullPointerException`s are optimized by trapping `SIGSEGV` | 
|  | signals so that the code generated by the JIT doesn't have to insert explicit | 
|  | null pointer checks.) | 
|  |  | 
|  | ### Why don't I see this in Java code? | 
|  |  | 
|  | You won't see this in Java because the decision was taken to hide this issue | 
|  | from Java programmers: libraries like `java.io.*` and `java.net.*` handle | 
|  | `EINTR` for you. (The same should be true of `android.*` too, so it's worth | 
|  | filing bugs if you find any undocumented exceptions to that rule!) | 
|  |  | 
|  | ### Why doesn't libc do that too? | 
|  |  | 
|  | For most people, things would be easier if libc hid this implementation | 
|  | detail. But there are legitimate use cases, and automatically retrying | 
|  | would hide those. For example, you might want to use signals and `EINTR` | 
|  | to interrupt another thread (in fact, that's how interruption of threads | 
|  | doing I/O works in Java behind the scenes!). As usual, C/C++ choose the more | 
|  | powerful but more error-prone option. | 
|  |  | 
|  | ## The fix | 
|  |  | 
|  | ### Easy cases | 
|  |  | 
|  | In most cases, the fix is simple: wrap the system call with the | 
|  | `TEMP_FAILURE_RETRY` macro. This is basically a while loop that retries the | 
|  | system call as long as the result is -1 and errno is `EINTR`. | 
|  |  | 
|  | So, for example: | 
|  | ``` | 
|  | n = read(fd, buf, buf_size); // BAD! | 
|  | n = TEMP_FAILURE_RETRY(read(fd, buf, buf_size)); // GOOD! | 
|  | ``` | 
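
To make the "basically a while loop" description concrete, here is a minimal sketch of such a macro. It is not the actual libc definition: `RETRY_ON_EINTR` is a hypothetical stand-in, written with the GNU C statement-expression extension that clang and gcc support. The demo function `read_despite_signals`, the 20ms tick, and the "write a byte on the third tick" handler are likewise assumptions made only so the retry path gets exercised.

```c
#define _XOPEN_SOURCE 700  // Expose sigaction()/setitimer() under strict -std= flags.
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/time.h>
#include <unistd.h>

// Hypothetical stand-in for TEMP_FAILURE_RETRY: evaluates `expr` repeatedly
// until it stops failing with EINTR, then yields its value.
#define RETRY_ON_EINTR(expr)                          \
  ({                                                  \
    __typeof__(expr) retry_result_;                   \
    do {                                              \
      retry_result_ = (expr);                         \
    } while (retry_result_ == -1 && errno == EINTR);  \
    retry_result_;                                    \
  })

static volatile sig_atomic_t g_ticks = 0;
static volatile sig_atomic_t g_write_fd = -1;

// Interrupts the reader on every tick; on the third tick it also makes one
// byte available. write(2) is async-signal-safe.
static void on_tick(int signo) {
  (void)signo;
  const int saved_errno = errno;  // write(2) may clobber errno.
  if (++g_ticks == 3) {
    const char byte = 'x';
    ssize_t ignored = write(g_write_fd, &byte, 1);
    (void)ignored;
  }
  errno = saved_errno;
}

// Demo: reads one byte from a pipe while SIGALRM fires every 20ms.
// Returns what the wrapped read(2) returned, or -1 if the pipe can't be created.
ssize_t read_despite_signals(char* out) {
  int fds[2];
  if (pipe(fds) == -1) return -1;
  g_write_fd = fds[1];
  g_ticks = 0;

  struct sigaction sa;
  memset(&sa, 0, sizeof(sa));
  sa.sa_handler = on_tick;  // sa_flags stays 0, so SA_RESTART is not requested.
  sigemptyset(&sa.sa_mask);
  sigaction(SIGALRM, &sa, NULL);

  struct itimerval timer;
  memset(&timer, 0, sizeof(timer));
  timer.it_value.tv_usec = 20 * 1000;
  timer.it_interval.tv_usec = 20 * 1000;
  setitimer(ITIMER_REAL, &timer, NULL);

  ssize_t n = RETRY_ON_EINTR(read(fds[0], out, 1));

  memset(&timer, 0, sizeof(timer));
  setitimer(ITIMER_REAL, &timer, NULL);  // Disarm the timer.
  close(fds[0]);
  close(fds[1]);
  return n;
}
```

A bare `read(fds[0], out, 1)` in the same position would typically fail with `EINTR` on the first tick. With the wrapper, the early interruptions are retried and the call returns 1 once the byte is written.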
|  |  | 
|  | ### close(2) | 
|  |  | 
|  | TL;DR: *never* wrap close(2) calls with `TEMP_FAILURE_RETRY`. | 
|  |  | 
|  | The case of close(2) is complicated. POSIX explicitly says that close(2) | 
|  | shouldn't close the file descriptor if it returns `EINTR`, but that's *not* | 
|  | true on Linux (and thus on Android). See | 
|  | [Returning EINTR from close()](https://lwn.net/Articles/576478/) | 
|  | for more discussion. | 
|  |  | 
|  | Given that most Android code (and especially "all apps") is multithreaded, | 
|  | retrying close(2) is especially dangerous because the file descriptor might | 
|  | already have been reused by another thread, so the "retry" succeeds, but | 
|  | actually closes a *different* file descriptor belonging to a *different* | 
|  | thread. | 
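
One way to follow this advice is to call close(2) exactly once and never look at the descriptor again. The sketch below assumes the Linux behavior described above (the descriptor is released even when close(2) fails with `EINTR`) and therefore reports `EINTR` as success. The name `close_once` is hypothetical, and treating `EINTR` as success is a design choice of this sketch rather than something the text above mandates.

```c
#define _XOPEN_SOURCE 700
#include <errno.h>
#include <unistd.h>

// Hypothetical helper: calls close(2) exactly once, with no retry loop.
// Returns 0 if the descriptor was released, including the EINTR case on
// Linux, or -1 with errno set for any other failure (EBADF, EIO, ...).
// Either way, the caller must not use or close `fd` again.
int close_once(int fd) {
  if (close(fd) == -1 && errno != EINTR) {
    return -1;
  }
  return 0;
}
```

Closing the same descriptor number a second time fails with `EBADF` in a single-threaded program. In a multithreaded one, that second call is exactly the dangerous case: the number may already refer to another thread's file.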
|  |  | 
|  | ### Timeouts | 
|  |  | 
|  | System calls with timeouts are the other interesting case where "just wrap | 
|  | everything with `TEMP_FAILURE_RETRY`" doesn't work. Because some amount of | 
|  | time will have elapsed before the retry, you need to recalculate the timeout. | 
|  | Otherwise your 1-minute timeout never expires if you're receiving signals at | 
|  | least once per minute, say. In this case you'll want to add an explicit loop | 
|  | around your system call, calculate the remaining timeout _inside_ the loop, | 
|  | and use `continue` each time the system call fails with `EINTR`. |
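
The loop described above might look like the following sketch for poll(2). It assumes a non-negative millisecond timeout and uses `CLOCK_MONOTONIC` to compute a fixed deadline. The names `now_ms`, `poll_with_deadline`, and `demo_wait_under_signals` are hypothetical, and the demo's 10ms tick and 100ms timeout are arbitrary values chosen to exercise the retry path.

```c
#define _XOPEN_SOURCE 700  // Expose the POSIX declarations under strict -std= flags.
#include <errno.h>
#include <poll.h>
#include <signal.h>
#include <string.h>
#include <sys/time.h>
#include <time.h>
#include <unistd.h>

// Milliseconds on the monotonic clock, which wall-clock changes don't affect.
static long long now_ms(void) {
  struct timespec ts;
  clock_gettime(CLOCK_MONOTONIC, &ts);
  return (long long)ts.tv_sec * 1000 + ts.tv_nsec / 1000000;
}

// Hypothetical helper: like poll(2) with a non-negative timeout, except that
// the total wait is bounded by `timeout_ms` however many signals arrive.
int poll_with_deadline(struct pollfd* fds, nfds_t count, int timeout_ms) {
  const long long deadline = now_ms() + timeout_ms;
  for (;;) {
    // Recompute the remaining time on every iteration.
    long long remaining = deadline - now_ms();
    if (remaining < 0) remaining = 0;  // Deadline passed: one non-blocking check.
    int rc = poll(fds, count, (int)remaining);
    if (rc == -1 && errno == EINTR) continue;
    return rc;
  }
}

static void on_tick(int signo) { (void)signo; }

// Usage demo: waits 100ms on a pipe that never becomes readable while SIGALRM
// fires every 10ms. Returns the elapsed milliseconds, or -1 if setup failed
// or poll(2) reported anything other than a timeout.
long long demo_wait_under_signals(void) {
  int fds[2];
  if (pipe(fds) == -1) return -1;

  struct sigaction sa;
  memset(&sa, 0, sizeof(sa));
  sa.sa_handler = on_tick;  // sa_flags stays 0, so SA_RESTART is not requested.
  sigemptyset(&sa.sa_mask);
  sigaction(SIGALRM, &sa, NULL);

  struct itimerval timer;
  memset(&timer, 0, sizeof(timer));
  timer.it_value.tv_usec = 10 * 1000;
  timer.it_interval.tv_usec = 10 * 1000;
  setitimer(ITIMER_REAL, &timer, NULL);

  struct pollfd pfd = {.fd = fds[0], .events = POLLIN};
  const long long start = now_ms();
  int rc = poll_with_deadline(&pfd, 1, 100);
  const long long elapsed = now_ms() - start;

  memset(&timer, 0, sizeof(timer));
  setitimer(ITIMER_REAL, &timer, NULL);  // Disarm the timer.
  close(fds[0]);
  close(fds[1]);
  return rc == 0 ? elapsed : -1;
}
```

Retrying `poll(&pfd, 1, 100)` unchanged in this setup would restart the full 100ms wait every 10ms and never time out. Because the deadline is fixed before the loop, the helper returns after roughly 100ms.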