[blobfs][minfs] Prevent lookup function from double-acquiring lock

A race condition (most reproducible in blobfs, with the null blob test,
on QEMU + ARM64 without KVM, but possible with any blob) can occur with
the following steps:

1) A Vnode is created, and has data which is being written
back to storage asynchronously.
2) All client-side connections to the Vnode are closed, so the only
remaining reference to the Vnode is the RefPtr held by the async
writeback thread.
3) A client-side call to look up the Vnode is issued. This lookup
call successfully acquires a reference to the Vnode, while holding
a lock on the "Vnode lookup table".
4) The writeback thread releases its reference to the Vnode.
5) The lookup call (still holding the lock!) also releases a reference
to the Vnode -- this time, it's the last reference, so the Vnode's
fbl_recycle function is called. This function attempts to lock and
remove the Vnode from the "Vnode lookup table", but it is UNSAFE to do
so: the lock is already held from earlier in the lookup call.

This double-lock causes blobfs to hang.

Fix: Avoid releasing references to Vnodes while holding the
|hash_lock_|.
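
The pattern behind the fix can be sketched as follows. This is a
minimal illustration, not the blobfs or minfs code: it uses
std::shared_ptr, std::weak_ptr and std::mutex as stand-ins for
fbl::RefPtr, fbl_recycle and the real lock type, and the names
VnodeCache, RemoveIfExpired and LastReferenceDroppedOutsideLock are
hypothetical. Only |hash_lock_| is taken from the description above.
The point it shows is that the strong reference obtained during lookup
is declared outside the locked scope, so it can only be released after
the lock has been dropped.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <mutex>
#include <string>
#include <unordered_map>
#include <utility>

class VnodeCache;  // Hypothetical stand-in for the "Vnode lookup table".

class Vnode {
public:
    Vnode(VnodeCache* cache, std::string name)
        : cache_(cache), name_(std::move(name)) {}
    // Plays the role of fbl_recycle: removes this Vnode from the table,
    // which requires taking the table lock.
    ~Vnode();
    const std::string& name() const { return name_; }

private:
    VnodeCache* cache_;
    std::string name_;
};

class VnodeCache {
public:
    void Insert(const std::shared_ptr<Vnode>& vn) {
        std::lock_guard<std::mutex> guard(hash_lock_);
        table_[vn->name()] = vn;  // The table holds only a weak reference.
    }

    // Called from ~Vnode. Acquires hash_lock_, so it must never run
    // while the calling thread already holds hash_lock_.
    void RemoveIfExpired(const std::string& name) {
        std::lock_guard<std::mutex> guard(hash_lock_);
        auto it = table_.find(name);
        if (it != table_.end() && it->second.expired()) {
            table_.erase(it);
        }
    }

    std::shared_ptr<Vnode> Lookup(const std::string& name) {
        // Declared before the lock is taken, so it outlives the locked
        // scope and cannot drop the last reference under hash_lock_.
        std::shared_ptr<Vnode> result;
        {
            std::lock_guard<std::mutex> guard(hash_lock_);
            auto it = table_.find(name);
            if (it != table_.end()) {
                result = it->second.lock();
            }
        }  // hash_lock_ is released here.
        return result;
    }

    std::size_t Size() {
        std::lock_guard<std::mutex> guard(hash_lock_);
        return table_.size();
    }

private:
    std::mutex hash_lock_;
    std::unordered_map<std::string, std::weak_ptr<Vnode>> table_;
};

inline Vnode::~Vnode() { cache_->RemoveIfExpired(name_); }

// Replays steps 1-5 on one thread: the lookup reference ends up being
// the last one, and it is dropped with hash_lock_ not held.
inline bool LastReferenceDroppedOutsideLock() {
    VnodeCache cache;
    auto writeback_ref = std::make_shared<Vnode>(&cache, "blob");
    cache.Insert(writeback_ref);
    std::shared_ptr<Vnode> found = cache.Lookup("blob");
    assert(found != nullptr);
    writeback_ref.reset();  // Step 4: writeback releases its reference.
    found.reset();          // Step 5: last reference; ~Vnode takes the lock.
    return cache.Size() == 0;
}
```

Had the reference returned by the table been a local variable inside
the locked scope and gone out of scope there, ~Vnode would have tried
to take |hash_lock_| a second time on the same thread, which is the
hang described above.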

Although this bug can only trigger with blobfs' implementation of
lookup & async writeback (since the minfs lookup implementation
may only release the last reference of unlinked vnodes, which
don't cause a lock to be acquired in the destructor), this
patch also refactors the minfs lookup function to be more resistant
to this class of failure conditions.

ZX-1842 #done

Change-Id: I3b680ed1cc059bdc5c9b47ca1b573d981af7e10a
3 files changed
tree: 5f7cefffbf4319807fcd2f790d80c192f840856b
README.md

Zircon

Zircon is the core platform that powers the Fuchsia OS. Zircon is composed of a microkernel (source in kernel/...) as well as a small set of userspace services, drivers, and libraries (source in system/...) necessary for the system to boot, talk to hardware, load userspace processes and run them, etc. Fuchsia builds a much larger OS on top of this foundation.

The canonical Zircon Git repository is located at: https://fuchsia.googlesource.com/zircon

A read-only mirror of the code is present at: https://github.com/fuchsia-mirror/zircon

The Zircon Kernel provides syscalls to manage processes, threads, virtual memory, inter-process communication, waiting on object state changes, and locking (via futexes).

Some temporary syscalls that were used for early bringup work currently remain; they will be removed as the long-term syscall API/ABI surface is finalized. The expectation is that there will be about 100 syscalls.

Zircon syscalls are generally non-blocking; wait_one, wait_many, port_wait, and thread sleep are the notable exceptions.

This page is a non-comprehensive index of the Zircon documentation.