| # Common coding patterns in Starnix |
| |
| This page provides a list of common coding patterns and best practices specific |
| to Starnix development. |
| |
| [Starnix][starnix-concepts], which is essentially a kernel running in userspace, |
| runs Linux programs on Fuchsia. This creates specific considerations for |
| developers who need to ensure that their code produces the results expected by |
| Linux programs. This page aims to clarify some of these patterns and provide |
| best practices, covering topics like testing, managing userspace addresses, |
| handling error messages, and more. |
| |
| The topics are: |
| |
| - [Testing Starnix using Linux binaries](#testing-starnix-using-linux-binaries) |
| - [Representing userspace addresses in Starnix](#representing-userspace-addresses-in-starnix) |
| - [Handling syscall arguments in Starnix](#handling-syscall-arguments-in-starnix) |
| - [Safe error handling in Starnix](#safe-error-handling-in-starnix) |
| - [Creating errors in Starnix](#creating-errors-in-starnix) |
| - [Preventing arithmetic overflow in Starnix](#preventing-arithmetic-overflow-in-starnix) |
| - [Using FIDL proxies in Starnix](#using-fidl-proxies-in-starnix) |
| |
| ## Testing Starnix using Linux binaries {:#testing-starnix-using-linux-binaries} |
| |
| This section covers best practices for testing Starnix functionality, which |
| involves using binaries compiled for Linux. |
| |
| Most of Starnix's test coverage comes from userspace binaries compiled for |
| Linux. The Fuchsia project runs these binaries on both Linux and Starnix to make |
| sure that Starnix matches Linux behavior. |
| |
| Userspace unit tests verify the Linux UAPI, which is what Starnix implements. |
| Verifying Starnix behavior at this level gives us the freedom to refactor |
| Starnix's implementation with confidence. |
| |
| In addition, Starnix kernel unit tests can be useful for verifying internal |
| invariants of the system. However, if you write kernel unit tests, be careful to |
| avoid "change-detector tests." In other words, ensure that you don't write tests |
| that fail when the implementation changes even if the changes are functionally |
| correct. |
| |
| ## Representing userspace addresses in Starnix {:#representing-userspace-addresses-in-starnix} |
| |
| This section covers best practices for representing and validating addresses |
| within the userspace of Starnix. |
| |
| The `UserAddress` and `UserRef` types are used to denote addresses that are in |
| "Linux userspace" in Starnix (that is, the |
| [restricted address space][restricted-mode]). Once it is determined that |
| a `UserAddress` points to an object of type `T`, convert it to a `UserRef<T>`. |
| This approach provides more type information, which makes it easier to read |
| and write to and from userspace. |
| |
| Address validation is performed by Fuchsia's memory management subsystem. In |
| general, code outside of the memory manager should not perform any checks to |
| determine whether or not an address is valid before passing the address to |
| the memory manager (for example, checking that an address is non-null). |
| |
| * {Good} |
| |
| ```cpp {:.devsite-disable-click-to-copy} |
| pub fn sys_something(current_task: &CurrentTask, user_events: UserRef<epoll_event>) |
| -> Result <(), Errno> { |
| let events = current_task.read_object(user_events)?; |
| ... |
| } |
| ``` |
| |
| * {Bad} |
| |
| ```cpp {:.devsite-disable-click-to-copy} |
| pub fn sys_something(current_task: &CurrentTask, user_events: UserRef<epoll_event>) |
| -> Result <(), Errno> { |
| if user_events.addr().is_null() { |
| return error!(EFAULT); |
| } |
| |
| let events = current_task.read_object(user_events)?; |
| ... |
| } |
| ``` |
| |
| However, the most common exception to this rule is when a syscall needs to |
| return a specific error when a provided address is null, for example: |
| |
| ```cpp {:.devsite-disable-click-to-copy} |
| pub fn sys_something(current_task: &CurrentTask, user_events: UserRef<epoll_event>) |
| -> Result <(), Errno> { |
| if user_events.addr().is_null() { |
| // The memory manager would never return ENOSYS for a read_object at a null |
| // address, so an explicit check is required. |
| return error!(ENOSYS); |
| } |
| |
| let events = current_task.read_object(user_events)?; |
| ... |
| } |
| ``` |
| |
| ## Handling syscall arguments in Starnix {:#handling-syscall-arguments-in-starnix} |
| |
| This section focuses on best practices around `SyscallArg` , which is the |
| default type for all Starnix syscall arguments. |
| |
| All arguments to Starnix syscall implementations start out as `SyscallArg`. This |
| type is then converted into specific syscall argument types using the `into()` |
| trait. For instance, when a syscall is being dispatched, it will be called as: |
| |
| ```cpp {:.devsite-disable-click-to-copy} |
| match syscall_nr { |
| __NR_execve => { |
| sys_execve(arg0.into(), arg1.into(), arg3.into()) |
| } |
| } |
| ``` |
| |
| This gives the syscall implementation flexibility to use any type that |
| `SyscallArg` can be converted into. |
| |
| * {Good} |
| |
| ```cpp {:.devsite-disable-click-to-copy} |
| fn sys_execve( |
| user_path: UserCString, |
| user_argv: UserRef<UserCString>, |
| user_environ: UserRef<UserCString>, |
| ) -> Result<(), Errno> { |
| ... |
| } |
| ``` |
| |
| * {Bad} |
| |
| ```cpp {:.devsite-disable-click-to-copy} |
| fn sys_execve( |
| user_path: SyscallArg, |
| user_argv: SyscallArg, |
| user_environ: SyscallArg, |
| ) -> Result<(), Errno> { |
| ... |
| } |
| ``` |
| |
| ## Safe error handling in Starnix {:#safe-error-handling-in-starnix} |
| |
| This section discusses the risks of using the `unwrap()` and `expect()` methods |
| within Starnix. |
| |
| Panicking in Starnix is the equivalent of a kernel panic for the container it is |
| running. This means that if a syscall uses APIs like `unwrap()` or `expect()`, |
| it has the potential to panic, not only the process that caused the error, but |
| the entire container. |
| |
| * {Good} |
| |
| ```cpp {:.devsite-disable-click-to-copy} |
| let value = option.ok_or_else(|| error!(EINVAL))?; |
| ``` |
| |
| This example is good practice because it uses `ok_or_else()` to handle |
| error cases. |
| |
| * {Bad} |
| |
| ```cpp {:.devsite-disable-click-to-copy} |
| let value = option.unwrap(); |
| ``` |
| |
| This example is bad practice because it has the potential to panic and |
| bring down the entire container, not just the process that caused the invariant |
| to be violated. |
| |
| However, if an error is truly unrecoverable, it is acceptable to use `unwrap()` |
| or `expect()`. However, its use should contain a context string that describes |
| why a kernel panic is the only option. |
| |
| ## Creating errors in Starnix {:#creating-errors-in-starnix} |
| |
| This section provides best practices for creating and translating errors. |
| |
| Starnix uses a wrapper type for Linux error codes called `errno`. This type is |
| useful because it can capture the source location of the error, which is helpful |
| when debugging. When creating new errors, use the `error!()` macro. |
| |
| * {Good} |
| |
| ```cpp {:.devsite-disable-click-to-copy} |
| if !name.entry.node.is_dir() { |
| return error!(ENOTDIR, "Invalid path provided to sys_chroot"); |
| } |
| ``` |
| |
| This example is good practice because the `errno!()` macro provides more |
| information, which helps debugging. |
| |
| * {Bad} |
| |
| ```cpp {:.devsite-disable-click-to-copy} |
| return error!(EINVAL); |
| ``` |
| |
| This example is bad practice because it returns an error code without |
| any context or information about the error. |
| |
| Plus, when translating one error into another, it may be convenient to use |
| `map_err()` with the `errno!()` macro. |
| |
| * {Good} |
| |
| ```cpp {:.devsite-disable-click-to-copy} |
| let s = mm.read_c_string_to_vec(user_string, elem_limit).map_err(|e| { |
| if e.code == ENAMETOOLONG { |
| errno!(E2BIG) |
| } else { |
| e |
| } |
| })?; |
| ``` |
| |
| * {Bad} |
| |
| ```cpp {:.devsite-disable-click-to-copy} |
| let s = match mm.read_c_string_to_vec(user_string, elem_limit) { |
| Err(e) if e.code == ENAMETOOLONG => { |
| errno!(E2BIG) |
| }, |
| Err(e) => { |
| e |
| }, |
| ok => ok, |
| }?; |
| ``` |
| |
| ## Preventing arithmetic overflow in Starnix {:#preventing-arithmetic-overflow-in-starnix} |
| |
| This section emphasizes the importance of using checked math operations when |
| dealing with numerical values originating from userspace. |
| |
| Always use checked math (for example, `checked_add()` and `checked_mul()`) with |
| numerical values that come from userspace. This prevents bad values in userspace |
| from overflowing arithmetic in the kernel. |
| |
| * {Good} |
| |
| ```cpp {:.devsite-disable-click-to-copy} |
| pub fn sys_something(user_value: u32) -> Result<(), Errno> { |
| let value = user_value.get()?; |
| let result = value.checked_mul(2).ok_or_else(|| error!(EOVERFLOW))?; |
| ... |
| } |
| ``` |
| |
| This example is good practice because the code uses `checked_mul()` to |
| perform multiplication on a value retrieved from userspace. This ensures that if |
| the multiplication overflows, an `EOVERFLOW` error is returned instead of |
| potentially causing unexpected behavior or crashes. |
| |
| * {Bad} |
| |
| ```cpp {:.devsite-disable-click-to-copy} |
| pub fn sys_something(user_value: u32) -> Result<(), Errno> { |
| let value = user_value.get()?; |
| let result = value * 2; // Potential overflow here! |
| ... |
| } |
| ``` |
| |
| This example is bad practice because the code directly multiplies the |
| user-supplied value by 2 without checking for potential overflow. If the value |
| is large enough, the multiplication could overflow, leading to incorrect results |
| or a system crash. |
| |
| ## Using FIDL proxies in Starnix {:#using-fidl-proxies-in-starnix} |
| |
| Note: The topic in this section is covered in more detail in |
| [IPC patterns in Starnix][ipc-patterns-in-starnix]. |
| |
| This section explains why Starnix, unlike some Fuchsia components, typically |
| uses synchronous proxies when interacting with FIDL protocols. |
| |
| Starnix typically uses synchronous proxies because of its execution model. |
| Specifically, when servicing a Linux system call, Starnix code runs on the |
| thread of the user program that invoked the Linux system call. |
| |
| Since the thread belongs to a Linux program, Starnix must perform the requested |
| work and then return control back to the Linux program. |
| |
| This constraint means that the work Starnix is doing needs to be completed |
| synchronously before returning control. |
| |
| Since the work needs to be completed before returning, a synchronous proxy |
| is the simplest solution. A synchronous proxy is also more performant, because |
| it avoids context switching to another thread and back (using an asynchronous |
| proxy would require a separate thread, for the asynchronous executor to use). |
| |
| To learn more about the execution model for Starnix, please see |
| * [Making Linux syscalls in Fuchsia](/docs/concepts/starnix/making-linux-syscalls-in-fuchsia.md) |
| * [RFC 0261: Fast and efficient user space kernel emulation](/docs/contribute/governance/rfcs/0261_fast_and_efficient_user_space_kernel_emulation.md) |
| |
| <!-- Reference links --> |
| |
| [starnix-concepts]: /docs/concepts/starnix/README.md |
| [restricted-mode]: /docs/concepts/starnix/making-linux-syscalls-in-fuchsia.md#running-a-linux-program-in-restricted-mode |
| [ipc-patterns-in-starnix]: ipc_patterns_in_starnix.md |