<!--
(C) Copyright 2018 The Fuchsia Authors. All rights reserved.
Use of this source code is governed by a BSD-style license that can be
found in the LICENSE file.
-->

# Hardware Interfacing

This document is part of the [Driver Development Kit tutorial](ddk-tutorial.md) documentation.

## Overview

In past chapters, we saw how the protocol stack was organized within a devhost,
and some of the work that goes into binding the individual driver protocols into
a device driver.

In this section, we'll look at practical considerations of dealing with hardware
such as determining configuration, binding to interrupts, allocating memory,
and performing DMA operations.

Here, we'll look at the concepts involved, and show snippets of code as required.
Complete working code is shown in subsequent chapters (e.g., [Ethernet Devices](ethernet.md)).

For the most part, we'll focus on the PCI bus, and we'll cover the following
functions:

* Access related:
  * **pci_map_bar()**
* Interrupt related:
  * **pci_map_interrupt()**
  * **pci_query_irq_mode()**
  * **pci_set_irq_mode()**
* DMA related:
  * **pci_enable_bus_master()**
  * **pci_get_bti()**

# Configuration

Hardware peripherals are attached to the CPU via a bus, such as the PCI bus.

During bootup, the BIOS (or equivalent platform startup software)
discovers all of the peripherals attached to the PCI bus.
Each peripheral is assigned resources (notably interrupt vectors,
and address ranges for configuration registers).

The impact of this is that the actual resources assigned to each peripheral may
be different across reboots.
When the operating system software starts up, it enumerates
the bus and starts drivers for all supported devices.
The drivers then call PCI functions in order to obtain configuration information about
their device(s) so that they can map registers and bind to interrupts.

## Base address register

The Base Address Register (**BAR**) is a configuration register that exists on each
PCI device.
It's where the BIOS stores information about the device, such as the assigned interrupt vector
and the addresses of its control registers.
Other device-specific information is stored there as well.

Call **pci_map_bar()**
to map the region described by a BAR into the devhost's address space:

```c
zx_status_t pci_map_bar(const pci_protocol_t* pci, uint32_t bar_id,
                        uint32_t cache_policy, void** vaddr, size_t* size,
                        zx_handle_t* out_handle);
```

The first parameter, `pci`, is a pointer to the PCI protocol.
Typically, you obtain this in your **bind()** function via
**device_get_protocol()**.

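For example, a **bind()** hook might fetch the protocol like this (a minimal sketch;
the surrounding driver structure and error handling are assumed):

```c
static zx_status_t my_bind(void* ctx, zx_device_t* parent) {
    pci_protocol_t pci;

    // Ask our parent device for the PCI protocol.
    zx_status_t status = device_get_protocol(parent, ZX_PROTOCOL_PCI, &pci);
    if (status != ZX_OK) {
        return status;  // the parent doesn't speak PCI
    }

    // ... stash `pci` in the device context and continue binding ...
    return ZX_OK;
}
```
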
The second parameter, `bar_id`, is the BAR register number, starting with `0`.

The third parameter, `cache_policy`, determines the caching policy for access,
and can take on the following values:

`cache_policy` value               | Meaning
-----------------------------------|---------------------------------------------
`ZX_CACHE_POLICY_CACHED`           | use hardware caching
`ZX_CACHE_POLICY_UNCACHED`         | disable caching
`ZX_CACHE_POLICY_UNCACHED_DEVICE`  | disable caching, and treat as device memory
`ZX_CACHE_POLICY_WRITE_COMBINING`  | uncached with write combining

Note that `ZX_CACHE_POLICY_UNCACHED_DEVICE` is architecture dependent
and may in fact be equivalent to `ZX_CACHE_POLICY_UNCACHED` on some architectures.

The next three arguments are return values.
`vaddr` and `size` return a pointer to (and the size of) the register region, while
`out_handle` stores the created handle to the
[VMO](/docs/reference/kernel_objects/vm_object.md).

## Reading and writing memory

Once the **pci_map_bar()**
function returns with a valid result, you can access the BAR via simple pointer
operations, for example:

```c
volatile uint32_t* base;
size_t size;
zx_handle_t handle;
...
zx_status_t rc;
rc = pci_map_bar(dev->pci, 0, ZX_CACHE_POLICY_UNCACHED_DEVICE, (void**)&base, &size, &handle);
if (rc == ZX_OK) {
    base[REGISTER_X] = 0x1234;  // configure register X for deep sleep mode
}
```

It's important to declare `base` as `volatile` — this tells the compiler not to
make any assumptions about the contents of the data that `base` points to.
For example:

```c
int timeout = 1000;
while (timeout-- > 0 && !(base[REGISTER_READY] & READY_BIT)) ;
```

is a typical (bounded) polling loop, intended for short polling sequences.
Without the `volatile` keyword in the declaration, the compiler would have no reason
to believe that the value at `base[REGISTER_READY]` could ever change, so it would
be free to read it just once and hoist the read out of the loop.

# Interrupts

An interrupt is an asynchronous event, generated by a device when it needs servicing.
For example, an interrupt is generated when data is available on a serial port,
or an ethernet packet has arrived.
Interrupts allow a driver to know about an event as soon as it
occurs, but without the driver spending time polling (actively waiting) for it.

The general architecture of a driver that uses interrupts is that a background
Interrupt Handling Thread (**IHT**) is created during the driver startup / binding
operation.
This thread waits for an interrupt to happen, and, when it does, performs some
kind of servicing action.

As an example, consider a serial port driver.
It may receive interrupts due to any of the following events happening:

* one or more characters have arrived,
* room is now available to transmit one or more characters,
* a control line (like `DTR`, for example) has changed state.

The interrupt wakes up the IHT.
The IHT determines the cause of the event, usually by reading some status registers.
Then, it runs an appropriate service function to handle the event.
Once done, the IHT goes back to sleep, waiting for the next interrupt.

For example, if a character arrives, the IHT wakes up, reads a status register that
indicates "data is available," and then calls a function that drains all available
characters from the serial port FIFO into the driver's buffer.

## No kernel-level code required

You may be familiar with other operating systems which use Interrupt
Service Routines (**ISR**).
These are kernel-level handlers that run in privileged mode and interface with
the interrupt controller hardware.

In Fuchsia, the kernel deals with the privileged part of the interrupt
handling, and provides thread-level functions for driver use.

The difference is that the IHT runs at thread level, whereas the ISR runs
at kernel level in a very restricted (and sometimes fragile) environment.
A principal advantage is that if the IHT crashes, it takes out only the
driver, whereas a failing ISR can take out the entire operating system.

## Attaching to an interrupt

Currently, the only bus that provides interrupts is the PCI bus.
It supports two kinds: legacy and Message Signaled Interrupts (**MSI**).

Therefore, in order to use interrupts on PCI:

1. determine which kind your device supports (legacy or MSI),
2. set the interrupt mode to match,
3. get a handle to your device's interrupt vector (usually one, but there may be multiple),
4. start an IHT background thread,
5. arrange for the IHT to wait for interrupts (on the handle(s) from step 3).

Steps `1` and `2` are usually done closely together, for example:

```c
// Query whether we have MSI or Legacy interrupts.
uint32_t irq_cnt = 0;
if ((pci_query_irq_mode(&edev->pci, ZX_PCIE_IRQ_MODE_MSI, &irq_cnt) == ZX_OK) &&
    (pci_set_irq_mode(&edev->pci, ZX_PCIE_IRQ_MODE_MSI, 1) == ZX_OK)) {
    // using MSI interrupts
} else if ((pci_query_irq_mode(&edev->pci, ZX_PCIE_IRQ_MODE_LEGACY, &irq_cnt) == ZX_OK) &&
           (pci_set_irq_mode(&edev->pci, ZX_PCIE_IRQ_MODE_LEGACY, 1) == ZX_OK)) {
    // using legacy interrupts
} else {
    // an error
}
```

The **pci_query_irq_mode()**
function takes three arguments:

```c
zx_status_t pci_query_irq_mode(const pci_protocol_t* pci,
                               zx_pci_irq_mode_t mode,
                               uint32_t* out_max_irqs);
```

The first argument, `pci`, is a pointer to the PCI protocol stack bound to your device,
just like we saw above in the BAR documentation.

The second argument, `mode`, is the kind of interrupt that you are interested in;
it's one of the two constants shown in the example.

The third argument is a pointer to an integer that returns how many
interrupts of the specified type your device supports.

Having determined the kind of interrupt supported, you then call
**pci_set_irq_mode()**
to indicate that this is indeed the kind of interrupt that you wish to use.

Finally, you call **pci_map_interrupt()**
to create a handle to the selected interrupt. Note that
**pci_map_interrupt()** has the following prototype:

```c
zx_status_t pci_map_interrupt(const pci_protocol_t* pci,
                              int which_irq,
                              zx_handle_t* out_handle);
```

The first argument is the same as in the previous call.
The second argument, `which_irq`, indicates the device-relative interrupt number you'd like,
and the third argument is a pointer to the created interrupt handle.

You now have an interrupt handle.

> Note that the vast majority of devices have just one interrupt, so simply passing
> `0` for `which_irq` is normal.
> If your device does have more than one interrupt, the common practice is to run the
> **pci_map_interrupt()** function in a `for` loop
> and bind handles to each interrupt, as sketched below.

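The following is a minimal sketch of that loop; the `irq_handles` array in the device
context and the `irq_cnt` value (from **pci_query_irq_mode()**) are assumed names:

```c
// Map each of the device's interrupts and save the handles for later use.
for (uint32_t i = 0; i < irq_cnt; i++) {
    zx_status_t status = pci_map_interrupt(&dev->pci, i, &dev->irq_handles[i]);
    if (status != ZX_OK) {
        // Clean up any handles mapped so far and fail the bind.
        return status;
    }
}
```
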
## Waiting for the interrupt

In your IHT, you call [**zx_interrupt_wait()**](/docs/reference/syscalls/interrupt_wait.md)
to wait for the interrupt.
The following prototype applies:

```c
zx_status_t zx_interrupt_wait(zx_handle_t handle,
                              zx_time_t* out_timestamp);
```

The first argument is the handle you obtained via the call to
**pci_map_interrupt()**,
and the second parameter can be `NULL` (typical), or it can be a pointer to a time
stamp that indicates when the interrupt was triggered (in nanoseconds,
relative to the clock source `ZX_CLOCK_MONOTONIC`).

Therefore, a typical IHT would have the following shape:

```c
static int irq_thread(void* arg) {
    my_device_t* dev = arg;
    for (;;) {
        zx_status_t rc;
        rc = zx_interrupt_wait(dev->irq_handle, NULL);
        // do stuff
    }
}
```

The convention is that the argument passed to the IHT is your device context block.
The context block has a member (here `irq_handle`) that is the handle you obtained via
**pci_map_interrupt()**.

## Edge vs level interrupt mode

The interrupt hardware can operate in one of two modes: "edge" or "level".

In edge mode, the interrupt is armed on the active-going edge (when the hardware
signal goes from inactive to active), and works as a one-shot.
That is, the signal must go back to inactive before it can be recognized again.

In level mode, the interrupt is active while the hardware signal is in the
active state.

Typically, edge mode is used when the interrupt is dedicated, and level mode is
used when the interrupt is shared by multiple devices (because you want the
interrupt to remain active until *all* devices have de-asserted their request line).

The Zircon kernel automatically masks and unmasks the interrupt as appropriate.
For level-triggered hardware interrupts,
[**zx_interrupt_wait()**](/docs/reference/syscalls/interrupt_wait.md)
masks the interrupt before returning, and unmasks it when called the next time.
For edge-triggered interrupts, the interrupt remains unmasked.

> The IHT should not perform any long-running tasks.
> For drivers that perform lengthy tasks, use a worker thread.

## Shutting down a driver that uses interrupts

In order to cleanly shut down a driver that uses interrupts, you can use
[**zx_interrupt_destroy()**](/docs/reference/syscalls/interrupt_destroy.md)
to abort the
[**zx_interrupt_wait()**](/docs/reference/syscalls/interrupt_wait.md)
call.

The idea is that when the foreground thread determines that the driver should be
shut down, it simply destroys the interrupt handle, causing the IHT to shut down:

```c
static void main_thread() {
    ...
    if (shutdown_requested) {
        // destroy the handle, this will cause zx_interrupt_wait() to pop
        zx_interrupt_destroy(dev->irq_handle);

        // wait for the IHT to finish
        thrd_join(dev->iht, NULL);
    }
    ...
}

static int irq_thread(void* arg) {
    ...
    for (;;) {
        zx_status_t rc;
        rc = zx_interrupt_wait(dev->irq_handle, NULL);
        if (rc == ZX_ERR_CANCELED) {
            // we are being shut down, do any cleanups required
            ...
            return 0;
        }
        ...
    }
}
```

The main thread, when requested to shut down, destroys the interrupt handle.
This causes the IHT's
[**zx_interrupt_wait()**](/docs/reference/syscalls/interrupt_wait.md)
call to wake up with an error code.
The IHT looks at the error code (in this case, `ZX_ERR_CANCELED`) and makes
the decision to end.
Meanwhile, the main thread is waiting to join the IHT via the call
to **thrd_join()**.
Once the IHT exits, **thrd_join()** returns, and the main
thread can finish its processing.

The advanced reader is invited to look at some of the other interrupt related
functions available:

* [**zx_interrupt_ack()**](/docs/reference/syscalls/interrupt_ack.md)
* [**zx_interrupt_bind()**](/docs/reference/syscalls/interrupt_bind.md)
* [**zx_interrupt_create()**](/docs/reference/syscalls/interrupt_create.md)
* [**zx_interrupt_trigger()**](/docs/reference/syscalls/interrupt_trigger.md)

# DMA

Direct Memory Access (**DMA**) is a feature that allows hardware to access
memory without CPU intervention.
At the highest level, the hardware is given the source and destination of the
memory region to transfer (along with its size) and told to copy the data.
Some hardware peripherals even support the ability to do multiple
"scatter / gather" style operations, where several copy operations
can be performed, one after the other, without additional CPU intervention.

## DMA considerations

In order to fully appreciate the issues involved, it's important to
keep the following in mind:

* each process operates in a virtual address space,
* an MMU can map a contiguous virtual address range onto multiple,
  discontiguous physical address ranges (and vice-versa),
* each process has a limited window into physical address space,
* some peripherals support their own virtual addresses
  via an Input / Output Memory Management Unit (**IOMMU**).

Let's discuss each point in turn.

### Virtual, physical, and device-physical addresses

The addresses that the process has access to are virtual; that is, they are
an illusion created by the CPU's Memory Management Unit (**MMU**).
A virtual address is mapped by the MMU into a physical address.
The mapping granularity is based on a parameter called "page size," which
is at least 4k bytes, though larger sizes are available on modern processors.



In the diagram above, we show a specific process (process 12) with a number of
virtual addresses (in blue).
The MMU is responsible for mapping the blue virtual addresses into CPU physical
bus addresses (red).
Each process has its own mapping; so even though process 12 has a virtual address
`300`, some other process may also have a virtual address `300`.
That other process's virtual address `300` (if it exists) would be mapped
to a different physical address than the one in process 12.

> Note that we've used small decimal numbers as "addresses" to keep the discussion simple.
> In reality, each square shown above represents a page of memory (4k or more),
> and is identified by a 32 or 64 bit value (depending on the platform).

The key points shown in the diagram are:

1. virtual addresses can be allocated in groups (three are shown, `300`-`303`, `420`-`421`,
   and `770`-`771`),
2. virtually contiguous (e.g., `300`-`303`) is not necessarily physically contiguous,
3. some virtual addresses are not mapped (for example, there is no virtual address
   `304`),
4. not all physical addresses are available to each process (for example, process
   `12` doesn't have access to physical address `120`).

Depending on the hardware available on the platform, a device's address space
may or may not follow a similar translation.
Without an IOMMU, the addresses that the peripheral uses are the same as
the physical addresses used by the CPU:



In the diagram above, portions of the device's address space (for example, a
frame buffer, or control registers), appear directly in the CPU's physical
address range.
That is to say, the device occupies physical addresses `122` through `125`
inclusive.

In order for the process to access the device's memory, it would need to create
an MMU mapping from some virtual addresses to the physical addresses `122` through
`125`.
We'll see how to do that below.

But with an IOMMU, the addresses seen by a peripheral may be different than
the CPU's physical addresses:



Here, the device has its own "device-physical" addresses that it knows about,
that is, addresses `0` through `3` inclusive.
It's up to the IOMMU to map the device-physical addresses `0` through `3`
into CPU physical addresses `109`, `110`, `101`, and `119`, respectively.

In this scenario, in order for the process to use the device's memory, it needs
to arrange two mappings:

* one set from the virtual address space (e.g., `300` through `303`) to the
  CPU physical address space (`109`, `110`, `101`, and `119`, respectively),
  via the MMU, and
* one set from the CPU physical address space (addresses `109`, `110`, `101`,
  and `119`) to the device-physical addresses (`0` through `3`) via the IOMMU.

While this may seem complicated, Zircon provides an abstraction that removes
the complexity.

Also, as we'll see below, the reason for having an IOMMU, and the benefits provided,
are similar to those obtained by having an MMU.

### Contiguity of memory

When you allocate a large chunk of memory (e.g. via **calloc()**),
your process will, of course, see a large, contiguous virtual address range.
The MMU creates the illusion of contiguous memory at the virtual addressing
level, even though the MMU may choose to back that memory area with physically
discontiguous memory at the physical address level.

Furthermore, as processes allocate and deallocate memory, the mapping of
physical memory to virtual address space tends to become more
complex, causing more "swiss cheese" holes to appear (that is,
more discontiguities in the mapping).

Therefore, it's important to keep in mind that contiguous virtual addresses
are not necessarily contiguous physical addresses, and indeed that contiguous
physical memory becomes more precious over time.

### Access controls

Another benefit of the MMU is that processes are limited in their view of
physical memory (for security and reliability reasons).
The impact on drivers, though, is that a process has to specifically request
a mapping from virtual address space to physical address space, and
have the requisite privilege in order to do so.

### IOMMU

Contiguous physical memory is generally preferred.
It's more efficient to do one transfer (with one source address and one
destination address) than it is to set up and manage multiple individual
transfers (which may require CPU intervention between each transfer in
order to set up the next one).

The IOMMU, if available, alleviates this problem by doing the same thing for
the peripherals that the CPU's MMU does for the process — it gives the peripheral
the illusion that it's dealing with a contiguous address space by
mapping multiple discontiguous chunks into a virtually contiguous space.
By limiting the mapping region, the IOMMU also provides security (in the same way as
the MMU does), by preventing the peripheral from accessing memory that's not "in scope"
for the current operation.

### Tying it all together

So, it may appear that you need to worry about virtual, physical, and device-physical
address spaces when you are writing your driver.
But that's not the case.

## DMA and your driver

Zircon provides a set of functions that allow you to cleanly deal with all of the
above.
The following work together:

* a Bus Transaction Initiator ([BTI](/docs/reference/kernel_objects/bus_transaction_initiator.md)), and
* a Virtual Memory Object ([VMO](/docs/reference/kernel_objects/vm_object.md)).

The [BTI](/docs/reference/kernel_objects/bus_transaction_initiator.md)
kernel object provides an abstraction of the model, and an API to deal with
physical (or device-physical) addresses associated with
[VMO](/docs/reference/kernel_objects/vm_object.md)s.

In your driver's initialization, call
**pci_get_bti()**
to obtain a [BTI](/docs/reference/kernel_objects/bus_transaction_initiator.md) handle:

```c
zx_status_t pci_get_bti(const pci_protocol_t* pci,
                        uint32_t index,
                        zx_handle_t* bti_handle);
```

The **pci_get_bti()**
function takes a `pci` protocol pointer (just like all the other **pci_...()** functions
discussed above) and an `index` (reserved for future use, use `0`).
It returns a [BTI](/docs/reference/kernel_objects/bus_transaction_initiator.md)
handle through the `bti_handle` pointer argument.

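For example (a sketch; `dev->pci` and `dev->bti_handle` are assumed fields in your
device context):

```c
// During initialization: get the BTI for this device.
zx_status_t status = pci_get_bti(&dev->pci, 0, &dev->bti_handle);
if (status != ZX_OK) {
    // fail initialization
}
```
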
Next, you need a [VMO](/docs/reference/kernel_objects/vm_object.md).
Simplistically, you can think of the [VMO](/docs/reference/kernel_objects/vm_object.md)
as a pointer to a chunk of memory,
but it's more than that — it's a kernel object that represents a set
of virtual pages (that may or may not have physical pages committed to them),
which can be mapped into the virtual address space of the driver process.
(It's even more than that, but that's a discussion for a different chapter.)

Ultimately, these pages serve as the source or destination of the DMA transfer.

There are two functions,
[**zx_vmo_create()**](/docs/reference/syscalls/vmo_create.md)
and
[**zx_vmo_create_contiguous()**](/docs/reference/syscalls/vmo_create_contiguous.md),
that allocate memory and bind it to a [VMO](/docs/reference/kernel_objects/vm_object.md):

```c
zx_status_t zx_vmo_create(uint64_t size,
                          uint32_t options,
                          zx_handle_t* out);

zx_status_t zx_vmo_create_contiguous(zx_handle_t bti,
                                     size_t size,
                                     uint32_t alignment_log2,
                                     zx_handle_t* out);
```

As you can see, they both take a `size` parameter indicating the number of bytes required,
and they both return a [VMO](/docs/reference/kernel_objects/vm_object.md) (via `out`).
They both allocate virtually contiguous pages for the given size.

| |
| > Note that this differs from the standard C library memory allocation functions, |
| > (e.g., **malloc()**), which allocate virtually contiguous memory, but without |
| > regard to page boundaries. Two small **malloc()** calls in a row might allocate |
| > two memory regions from the *same* page, for instance, whereas |
| > the [VMO](/docs/reference/kernel_objects/vm_object.md) |
| > creation functions will always allocate memory starting with a *new* page. |
| |
| The |
| [**zx_vmo_create_contiguous()**](/docs/reference/syscalls/vmo_create_contiguous.md) |
| function does what |
| [**zx_vmo_create()**](/docs/reference/syscalls/vmo_create.md) |
| does, *and* ensures that the pages are suitably |
| organized for use with the specified [BTI](/docs/reference/kernel_objects/bus_transaction_initiator.md) |
| (which is why it needs the [BTI](/docs/reference/kernel_objects/bus_transaction_initiator.md) handle). |
| It also features an `alignment_log2` parameter that can be used to specify a minimum |
| alignment requirement. |
| As the name suggests, it must be an integer power of 2 (with the value `0` indicating |
| page aligned). |
| |
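For example, a driver might allocate a DMA buffer like this (a sketch; `DMA_BUF_SIZE`
and the `dev->bti_handle` field are assumed names):

```c
zx_handle_t dma_vmo;

// Allocate a physically contiguous, page-aligned buffer usable with our BTI.
zx_status_t status = zx_vmo_create_contiguous(dev->bti_handle, DMA_BUF_SIZE, 0, &dma_vmo);
if (status != ZX_OK) {
    // handle the allocation failure
}
```
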
At this point, you have two "views" of the allocated memory:

* one contiguous virtual address space that represents memory
  from the point of view of the driver, and
* a set of physical pages (which may or may not be contiguous, and may not
  yet be committed) for use by the peripheral.

Before using these pages, you need to ensure that they are present in memory (that is,
"committed" — the physical pages are accessible to your process), and that the
peripheral has access to them (via the IOMMU if present).
You will also need the addresses of the pages (from the point of view of the device)
so that you can program the DMA controller on your device to access them.


The
[**zx_bti_pin()**](/docs/reference/syscalls/bti_pin.md)
function is used to do all that:

```c
#include <zircon/syscalls.h>

zx_status_t zx_bti_pin(zx_handle_t bti, uint32_t options,
                       zx_handle_t vmo, uint64_t offset, uint64_t size,
                       zx_paddr_t* addrs, size_t addrs_count,
                       zx_handle_t* pmt);
```

There are 8 parameters to this function:

Parameter       | Purpose
----------------|------------------------------------
`bti`           | the [BTI](/docs/reference/kernel_objects/bus_transaction_initiator.md) for this peripheral
`options`       | options (see below)
`vmo`           | the [VMO](/docs/reference/kernel_objects/vm_object.md) for this memory region
`offset`        | offset from the start of the [VMO](/docs/reference/kernel_objects/vm_object.md)
`size`          | total number of bytes in the [VMO](/docs/reference/kernel_objects/vm_object.md)
`addrs`         | list of return addresses
`addrs_count`   | number of elements in `addrs`
`pmt`           | returned [PMT](/docs/reference/kernel_objects/pinned_memory_token.md) (see below)

The `addrs` parameter is a pointer to an array of `zx_paddr_t` that you supply.
This is where the peripheral addresses for each page are returned.
The array is `addrs_count` elements long, and `addrs_count` must match the number of
entries that
[**zx_bti_pin()**](/docs/reference/syscalls/bti_pin.md)
is expected to return.

> The values written into `addrs` are suitable for programming the peripheral's
> DMA controller — that is, they take into account any translations that
> may be performed by an IOMMU, if present.

On a technical note, the other effect of
[**zx_bti_pin()**](/docs/reference/syscalls/bti_pin.md)
is that the kernel will ensure those pages are not decommitted
(i.e., moved or reused) while pinned.

The `options` argument is actually a bitmap of options:

Option                   | Purpose
-------------------------|--------------------------------
`ZX_BTI_PERM_READ`       | pages can be read by the peripheral (written by the driver)
`ZX_BTI_PERM_WRITE`      | pages can be written by the peripheral (read by the driver)
`ZX_BTI_COMPRESS`        | (see "Minimum contiguity property," below)

For example, refer to the diagrams above showing "Device #3".
If an IOMMU is present, `addrs` would contain `0`, `1`, `2`, and `3` (that is,
the device-physical addresses).
If no IOMMU is present, `addrs` would contain `109`, `110`, `101`, and `119` (that is,
the physical addresses).

### Permissions

Keep in mind that the permissions are from the perspective
*of the peripheral*, and not the driver.
For example, in a block device **write** operation, the device **reads** from memory pages, and
therefore the driver specifies `ZX_BTI_PERM_READ`; the reverse applies to a block device **read**.

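A sketch of that flag choice (the `is_write` flag describing the block operation is an
assumed name):

```c
// For a block WRITE, the device reads from memory; for a block READ, it writes to memory.
uint32_t options = is_write ? ZX_BTI_PERM_READ : ZX_BTI_PERM_WRITE;
```
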
### Minimum contiguity property

By default, each entry returned through `addrs` covers one page.
Larger chunks may be requested by setting the `ZX_BTI_COMPRESS` option
in the `options` argument.
In that case, the length of each entry returned corresponds to the "minimum contiguity" property.
While you can't set this property, you can read it via
[**zx_object_get_info()**](/docs/reference/syscalls/object_get_info.md).
Effectively, the minimum contiguity property is a guarantee that
[**zx_bti_pin()**](/docs/reference/syscalls/bti_pin.md)
will always be able to return addresses that are contiguous for at least that many bytes.

For example, if the property had the value 1MB, then a call to
[**zx_bti_pin()**](/docs/reference/syscalls/bti_pin.md)
with a requested size of 2MB would return at most two physically-contiguous runs.
If the requested size was 2.5MB, it would return at most three physically-contiguous runs,
and so on.

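For instance, the property could be read like this (a sketch; it assumes the
`ZX_INFO_BTI` topic and `zx_info_bti_t` structure from `zircon/syscalls/object.h`,
and the `dev->bti_handle` field is an assumed name):

```c
#include <zircon/syscalls.h>
#include <zircon/syscalls/object.h>

// Read the BTI's minimum contiguity guarantee.
zx_info_bti_t info;
zx_status_t status = zx_object_get_info(dev->bti_handle, ZX_INFO_BTI,
                                        &info, sizeof(info), NULL, NULL);
if (status == ZX_OK) {
    // info.minimum_contiguity is the guaranteed contiguous run length, in bytes.
}
```
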
### Pinned Memory Token (PMT)

[**zx_bti_pin()**](/docs/reference/syscalls/bti_pin.md) returns a Pinned Memory Token
([PMT](/docs/reference/kernel_objects/pinned_memory_token.md))
upon success in the *pmt* argument.
The driver must call [**zx_pmt_unpin()**](/docs/reference/syscalls/pmt_unpin.md) when the device is done with
the memory transaction to unpin and revoke access to the memory pages by the device.

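Putting the pieces together, a DMA transaction might look like the following sketch
(the `dev->...` fields, `DMA_BUF_SIZE`, and the device programming step are assumptions;
`DMA_BUF_SIZE` is taken to be a whole number of pages):

```c
#include <zircon/limits.h>
#include <zircon/syscalls.h>

zx_paddr_t addrs[DMA_BUF_SIZE / ZX_PAGE_SIZE];
zx_handle_t pmt;

// Grant the device read access to the buffer and get its device-visible addresses.
zx_status_t status = zx_bti_pin(dev->bti_handle, ZX_BTI_PERM_READ,
                                dev->dma_vmo, 0, DMA_BUF_SIZE,
                                addrs, DMA_BUF_SIZE / ZX_PAGE_SIZE, &pmt);
if (status != ZX_OK) {
    return status;
}

// ... program the device's DMA engine with the addresses in addrs[] and
// wait for the transfer to complete ...

// Release the pages once the device is finished with them.
zx_pmt_unpin(pmt);
```
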
## Advanced topics

### Cache Coherency

On fully DMA-coherent architectures, hardware ensures the data in the CPU cache is the same
as the data in main memory without software intervention. Not all architectures are
DMA-coherent. On these systems, the driver must ensure the CPU cache is made coherent by
invoking appropriate cache operations on the memory range before performing DMA operations,
so that no stale data will be accessed.

To invoke cache operations on the memory represented by [VMO](/docs/reference/kernel_objects/vm_object.md)s, use the
[**zx_vmo_op_range()**](/docs/reference/syscalls/vmo_op_range.md)
syscall.
Prior to a peripheral-read
(driver-write) operation, clean the cache using `ZX_VMO_OP_CACHE_CLEAN` to write out dirty
data to main memory. Prior to a peripheral-write (driver-read), mark the cache lines
as invalid using `ZX_VMO_OP_CACHE_INVALIDATE` to ensure data is fetched from main
memory on the next access.

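For example, before handing a buffer to the device for a peripheral-read (driver-write)
operation, a driver on a non-DMA-coherent system might do the following (a sketch; the
`dev->dma_vmo` handle and `DMA_BUF_SIZE` are assumed names):

```c
// Write back any dirty cache lines covering the buffer so the device sees the latest data.
zx_status_t status = zx_vmo_op_range(dev->dma_vmo, ZX_VMO_OP_CACHE_CLEAN,
                                     0, DMA_BUF_SIZE, NULL, 0);
if (status != ZX_OK) {
    // handle the error
}
```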