[botanist][tftp] Fix over-send causing timeout during paving

First, for reference, window size is always 256.

For a given window, botanist currently sends blocks relative to the
window base (called "seq" in tftp.go). With a base of, say, 100, it
sends blocks [101..357] inclusive (see the changed line in this CL).

This is one block more than the protocol seems to require. Usually this
is harmless: the last block (357 here) is either consumed anyway or, if
it arrives out of order, simply dropped.
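
The off-by-one described above can be illustrated with a small sketch.
This is not the code in tftp.go; blocksSent and its inclusive flag are
hypothetical names used only to show how an inclusive loop bound yields
257 blocks for a 256-block window.

```go
package main

import "fmt"

// windowSize mirrors the fixed window size mentioned in the message.
const windowSize = 256

// blocksSent is a hypothetical helper listing the block numbers sent in
// one attempt for window base seq. With inclusive=true the loop offset
// runs over [0..windowSize]; otherwise it runs over [0..windowSize).
func blocksSent(seq uint64, inclusive bool) []uint64 {
	end := uint64(windowSize)
	if inclusive {
		end++
	}
	blocks := make([]uint64, 0, end)
	for i := uint64(0); i < end; i++ {
		blocks = append(blocks, seq+1+i)
	}
	return blocks
}

func main() {
	for _, inclusive := range []bool{true, false} {
		b := blocksSent(100, inclusive)
		fmt.Printf("inclusive=%v: %d blocks, [%d..%d]\n",
			inclusive, len(b), b[0], b[len(b)-1])
	}
}
```

With a base of 100, the inclusive variant covers [101..357] and the
exclusive variant covers [101..356].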

In one very specific situation though, it causes problems. If the target
is receiving blocks in order and advancing through the window normally,
it will eventually reach the point where it receives block 356 (i.e. the
second-to-last one that botanist will send in this attempt).

At that point, the target has received a full window of 256 blocks
(101..356), so it replies with an Ack saying it's OK to move to the next
window. Botanist has to wait for this Ack before moving on, and until
then keeps repeating its send of [101..357]. If the Ack for block 356 is
dropped (this is all UDP, so eventually that will happen) AND the
"bonus" block 357 is subsequently received on the target, then the
target will have moved on to the next window.

The target is now waiting for block 358, believing that both sides have
moved on to the next window. Botanist is still sending [101..357] and
waiting for an Ack, but the target will never resend an Ack for an old
window. On a timeout, the target Acks 357 to attempt to resync, but
botanist ignores that Ack because it falls outside the range check in
window advancement (and indeed it belongs to the next window).
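
To make the rejected resync concrete, here is a minimal sketch of a
window-advancement range check. The exact condition in tftp.go is not
shown in this message, so ackAdvancesWindow and its bounds are an
assumption: an Ack is accepted only if it names a block inside the
current window, (seq, seq+256].

```go
package main

import "fmt"

const windowSize = 256

// ackAdvancesWindow is a hypothetical stand-in for botanist's range
// check: it accepts an Ack only for blocks in (seq, seq+windowSize].
func ackAdvancesWindow(seq, ack uint64) bool {
	return ack > seq && ack <= seq+windowSize
}

func main() {
	seq := uint64(100)
	// Ack for the last block of the current window.
	fmt.Println("ack 356 accepted:", ackAdvancesWindow(seq, 356))
	// Resync Ack for the "bonus" block, which lies in the next window.
	fmt.Println("ack 357 accepted:", ackAdvancesWindow(seq, 357))
}
```

Under that assumption, the Ack for 356 would advance the window, while
the resync Ack for 357 is ignored and the sender keeps resending.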

So, two fixes are possible. Botanist could accept the resync Ack that is
one past the end of the window, but I think that's wrong. Instead, it
should simply not send block 357: the loop should run over [0..256),
that is, exclusive at the end of the range.

I tested this by artificially making the target drop Acks. With this
botanist change, paving now recovers and continues as expected even when
that happens.
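
The dropped-Ack experiment can be approximated with a toy simulation.
Everything below is a hypothetical model built from the description in
this message, not the real botanist or target code: the target consumes
in-order blocks, Acks a completed window, and on timeout Acks the last
block it has; the sender accepts Acks only in (seq, seq+256]. The first
window Ack is deliberately lost.

```go
package main

import "fmt"

const windowSize = 256

// target is a hypothetical model of the receiving device.
type target struct {
	windowBase uint64 // last block of the previous window
	lastRecv   uint64 // highest in-order block received so far
}

// receive consumes block only if it is the next one in order. It
// returns (ack, true) when that block completes a full window.
func (t *target) receive(block uint64) (uint64, bool) {
	if block != t.lastRecv+1 {
		return 0, false // stale or out of order: dropped
	}
	t.lastRecv = block
	if t.lastRecv-t.windowBase == windowSize {
		t.windowBase = t.lastRecv
		return t.lastRecv, true
	}
	return 0, false
}

// senderAdvances reports whether the sender gets an acceptable Ack
// within a few attempts when the first window Ack is lost in transit.
func senderAdvances(blocksPerAttempt uint64) bool {
	seq := uint64(100)
	t := &target{windowBase: seq, lastRecv: seq}
	dropped := false
	for attempt := 0; attempt < 5; attempt++ {
		var ack uint64
		gotAck := false
		for i := uint64(0); i < blocksPerAttempt; i++ {
			a, ok := t.receive(seq + 1 + i)
			if ok && !dropped {
				dropped = true // simulate the lost Ack
				continue
			}
			if ok {
				ack, gotAck = a, true
			}
		}
		if !gotAck {
			// Timeout: the target resyncs by Acking its last block.
			ack = t.lastRecv
		}
		// Sender-side range check (assumed form).
		if ack > seq && ack <= seq+windowSize {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println("257 blocks per attempt recovers:", senderAdvances(windowSize+1))
	fmt.Println("256 blocks per attempt recovers:", senderAdvances(windowSize))
}
```

In this model, sending 257 blocks per attempt never recovers because
the resync Ack is for 357, while sending 256 blocks recovers because
the resync Ack is for 356.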

ZX-4146 #comment [botanist][tftp] Fix over-send causing timeout during paving

Change-Id: I85faccf45a1a7b173db5d357ba427a22c4b9205c
1 file changed
tree: 29ec7ea2596b04f0589b610a3346a348d5e7aa11
README.md

tools

This repo contains tools used in Fuchsia build and development.

Go packages from here are automatically built and uploaded to CIPD and Google Storage by bots using the tools recipe. To add a tool to the build:

  • Edit the bot config.
  • Find the builder_mixins section with name: "tools".
  • Edit the JSON in properties_j to add a string to the packages list:
"fuchsia.googlesource.com/tools/cmd/your-new-tool"