commit	0e29d3d0e01e31385df2637419ce4dce34f850e1
author	John Grossman <johngro@fuchsia.infra.roller.google.com>	Mon Jan 29 21:49:25 2024 +0000
committer	Copybara-Service <copybara-worker@google.com>	Mon Jan 29 13:51:06 2024 -0800
tree	dfabd4bccf20c558b8a610a1cd48bb00e0bbc121
parent	2ba5b62c5ec4803de6ccd789eaddd6da549d59a5

[roll] Roll fuchsia [kernel][time] Add synchronized timer access at the platform level.

Add a platform level routine which lets code observe the lowest level
ticks reference for the system, while explicitly specifying how the
observation needs to be synchronized against the instruction pipeline.

Background:
The C++ memory model is pretty spiffy, but it only specifies the
synchronization behavior of accessing _memory_, both in the local
instruction pipeline, and the relationship of observations across
multiple concurrently executing instruction pipelines.  It does not
make any guarantees about accessing registers (whether they be MMIO
registers, or registers accessed using special architecture specific
instructions).

As it turns out, the behavior of an observation of the ticks reference
is all over the map on this front.  For example, when using the `TSC`
on `Intel/AMD`, we generally use the `RDTSC` instruction, which is not
synchronized against the instruction pipeline at all.  Suppose we had
code which did something like:

```
// Acquire a mutex in a way which has `Acquire` semantics.  None of
// the loads after this can move ahead of the acquire, so they observe
// any stores done before the previous ReleaseMutex() which this
// acquire sync'ed with.
AcquireMutex();
gFooTicks = rdtsc();
// Release the shared mutex in a way which has `Release` semantics.
ReleaseMutex();
```

`rdtsc` provides little in the way of guarantees here.  Technically,
it is allowed to happen before (in the local instruction pipeline)
both the load and the store which were used to acquire the shared
mutex.  The observed value of the `TSC` cannot "happen after" the
release of the shared mutex, but perhaps not for the reasons you might
initially think.  The resolution of the `RDTSC` instruction itself
could technically "happen-after" the release of the mutex; it is only
the store to the global variable which would prevent this.  Basically,
the store to the global is a memory access which depends on the read
of the TSC being globally visible.  The `gFooTicks` store cannot be
observed to move past the store (with release) which Releases the
mutex.

So, in the code above, the value of TSC can never be observed to have
been taken after the release, but _could_ actually be observed to
happen-before the mutex was entered because of pipelining issues.  If
the synchronization construct happened to be a shared sequence lock
transaction instead, then the observation of the TSC could seem to
happen either before or after a successful transaction.  There is
nothing to contain the observation to what seems like the reasonable
bounds of the transaction.

Most of the time, this does not matter.  Concurrent observations of
time tend to happen in situations where strictly asserting that an
observation of the current time happened between two specific
instructions is not all that important.  Rarely, however, it does
actually matter (see the associated bug number).

The Change:
So, for those rare situations where it matters, we introduce a new way
to access the lowest level `ticks` reference,
`platform_current_ticks_synchronized`.  The function can be
instantiated with a set of template flags which allow the user to
request that the observation of the ticks reference is bounded by
specific things relative to the instruction pipeline.  In particular,
users may require that their observation take place:

++ After any previous loads.
++ After any previous stores.
++ Before any subsequent loads.
++ Before any subsequent stores.
++ Or, any combination of the above.
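
As a rough illustration of how such a set of requirements can be
expressed as template flags, here is a minimal sketch.  The names
`TicksSync`, `Has`, and `RequestedSync` are hypothetical and exist only
for this example; they are not taken from the actual change, and the
real definitions in the kernel may look different.

```cpp
#include <cstdint>

// Hypothetical flag set, one bit per requirement listed above.
enum class TicksSync : uint8_t {
  kNone = 0,
  kAfterPreviousLoads = 1 << 0,
  kAfterPreviousStores = 1 << 1,
  kBeforeSubsequentLoads = 1 << 2,
  kBeforeSubsequentStores = 1 << 3,
};

// Combine requirements ("any combination of the above").
constexpr TicksSync operator|(TicksSync a, TicksSync b) {
  return static_cast<TicksSync>(static_cast<uint8_t>(a) |
                                static_cast<uint8_t>(b));
}

// Test whether a single requirement is present in a flag set.
constexpr bool Has(TicksSync flags, TicksSync bit) {
  return (static_cast<uint8_t>(flags) & static_cast<uint8_t>(bit)) != 0;
}

// Stand-in for a templated accessor.  It only reports the flags it was
// instantiated with, to show that the request is known at compile time.
template <TicksSync Flags>
constexpr TicksSync RequestedSync() {
  return Flags;
}

// Example request: keep the observation after earlier loads and before
// later stores.
constexpr TicksSync kExample =
    TicksSync::kAfterPreviousLoads | TicksSync::kBeforeSubsequentStores;

static_assert(Has(RequestedSync<kExample>(), TicksSync::kAfterPreviousLoads),
              "requested bit should be set");
static_assert(!Has(RequestedSync<kExample>(), TicksSync::kAfterPreviousStores),
              "unrequested bit should be clear");
```

Because the flags are a template argument in this sketch, a platform
layer could pick the required barriers at compile time and emit nothing
at all when no synchronization is requested.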

Based on the combination of architecture, reference timer, and
user-specified requirements, the platform is responsible for inserting
the proper barriers or data dependencies (if any) to achieve the
desired result.  Typically, this involves adding special barrier
instructions to the pipeline which force previous instructions to
complete and become "globally visible" (to use the Intel term) before
the read of the time reference can start, or force the read of the
time reference to complete before starting subsequent instructions.
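
To make the idea concrete, here is a minimal sketch of what barrier
selection could look like for the x86 `TSC` case.  It is not the code
from this change.  The names `SyncRequest`, `X86Plan`, `PlanForX86`, and
`ReadTicksSynchronized` are hypothetical, and the mapping assumes the
fencing guidance Intel documents for `RDTSC` (`LFENCE` before to wait
for previous loads, `MFENCE; LFENCE` before to also wait for previous
stores, `LFENCE` after to hold back subsequent instructions).  Other
vendors and architectures may need different sequences, so the non-x86
branch below is only a placeholder.

```cpp
#include <chrono>
#include <cstdint>
#if defined(__x86_64__)
#include <x86intrin.h>
#endif

// Hypothetical description of what the caller requires.
struct SyncRequest {
  bool after_previous_loads;
  bool after_previous_stores;
  bool before_subsequent_loads;
  bool before_subsequent_stores;
};

// Hypothetical list of barriers to place around the RDTSC.
struct X86Plan {
  bool mfence_before;
  bool lfence_before;
  bool lfence_after;
};

// Assumed mapping from requirements to barriers (see the lead-in).
constexpr X86Plan PlanForX86(SyncRequest r) {
  return X86Plan{
      r.after_previous_stores,
      r.after_previous_loads || r.after_previous_stores,
      r.before_subsequent_loads || r.before_subsequent_stores,
  };
}

inline uint64_t ReadTicksSynchronized(SyncRequest r) {
  const X86Plan plan = PlanForX86(r);
#if defined(__x86_64__)
  if (plan.mfence_before) _mm_mfence();
  if (plan.lfence_before) _mm_lfence();
  const uint64_t ticks = __rdtsc();
  if (plan.lfence_after) _mm_lfence();
  return ticks;
#else
  // Placeholder for other architectures: no pipeline barriers are
  // issued here, and a monotonic clock stands in for the ticks source.
  (void)plan;
  return static_cast<uint64_t>(
      std::chrono::steady_clock::now().time_since_epoch().count());
#endif
}
```

A templated version would resolve the plan at compile time rather than
branching at runtime; the runtime checks here just keep the sketch
short.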

These explicit pipeline barriers _are absolutely not free_.  Users
should only ever reach for this tool when it is *required* to make
their system's behavior correct, from an externally observable
standpoint.

Original-Bug: 136323
Original-Reviewed-on: https://fuchsia-review.googlesource.com/c/fuchsia/+/979875
Original-Revision: 04a65d39ba3d9738220525a3f778f64265b5d5cb
GitOrigin-RevId: 9f499bd8b2ed66a9a1878fda242f913776879c04
Change-Id: Id4db173c5a30073a0de316360946a3d3d2f4cc6d

stem

1 file changed


README.md

Integration

This repository contains Fuchsia's Global Integration manifest files.

Making changes

All changes should be made to the internal version of this repository. Our infrastructure automatically updates this version when the internal one changes.

Currently all changes must be made by a Google employee. Non-Google employees wishing to make a change can ask for assistance via the IRC channel #fuchsia on Freenode.

Obtaining the source

First install Jiri.

Next run:

$ jiri init
$ jiri import minimal https://fuchsia.googlesource.com/integration
$ jiri update

Third party

Third party projects should have their own subdirectory in ./third_party.