Revert "[kernel][vm] Robust PageRequest batching"
This reverts commit 2f5df2c170150b19c0fd8adfae34817b8f7a6566.
Reason for revert: causing a kernel panic in CI
Original change's description:
> [kernel][vm] Robust PageRequest batching
>
> Changes the batching state from being inferred from the return value of
> PageSource::GetPage, to being queryable from the PageSource. This has
> three benefits:
> 1. If an in-progress batch request is given to the wrong PageSource it
> can be detected and safely handled. This cannot happen today, but could
> happen when there are additional implementors of PageRequestInterface.
> Specifically, with an anonymous page requester you could have a
> copy-on-write child of a pager backed VMO that has CommitRange called.
> The first page could be present in the parent, but missing in the
> child, requiring a page allocation, which could fail causing the
> PageRequest to be filled in. Since it's in batch mode we would continue
> and then possibly find a page missing from the root vmo, and attempt
> to fill in the PageRequest using the PageSource.
> 2. The ZX_ERR_NEXT state that previously meant further batching was
> possible is also used for loop control everywhere in the VM system.
> Propagating GetPage statuses up to the root caller therefore has the
> chance of colliding with an attempt to perform a ForEveryPage or
> equivalent iteration. This does not happen today, but this is a hazard
> that would not be easily detected.
> 3. Additional assertions can be added using this state to ensure the
> page request API is used correctly.
>
> Bug: 99890
> Run-All-Tests: True
> Change-Id: Id35c487c52591f709656c412e5f13933e0225883
> Reviewed-on: https://fuchsia-review.googlesource.com/c/fuchsia/+/679849
> Commit-Queue: Adrian Danis <adanis@google.com>
> Fuchsia-Auto-Submit: Adrian Danis <adanis@google.com>
> Reviewed-by: Rasha Eqbal <rashaeqbal@google.com>
Bug: 99890
Change-Id: Ib30888a3c8e328841fca0b67e86add0230a77286
No-Presubmit: true
No-Tree-Checks: true
No-Try: true
Reviewed-on: https://fuchsia-review.googlesource.com/c/fuchsia/+/681056
Commit-Queue: Dangyi Liu <dangyi@google.com>
Reviewed-by: RubberStamper 🤖 <android-build-ayeaye@system.gserviceaccount.com>
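The diff below restores the pre-2f5df2c caller protocol: GetPage reports batching progress through its return value (ZX_ERR_NEXT to keep adding pages, ZX_ERR_SHOULD_WAIT once the request is sent) instead of through a queryable BatchState on the PageRequest. The following is a minimal, hypothetical sketch of that restored protocol; FakePageRequest, FakeGetPage, FakeFinalize, and kMaxBatchPages are invented stand-ins, not Zircon API, and the real status constants live in zircon/errors.h.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical status codes standing in for the Zircon values.
enum Status { OK, SHOULD_WAIT, NEXT, NOT_FOUND };

constexpr uint64_t kPageSize = 4096;
constexpr uint64_t kMaxBatchPages = 4;  // assumed batch cap, for the sketch only

// Stand-in for PageRequest after the revert: batching is a fixed
// construction-time property (allow_batching_), not a BatchState enum.
struct FakePageRequest {
  explicit FakePageRequest(bool allow_batching) : allow_batching_(allow_batching) {}
  const bool allow_batching_;
  uint64_t offset_ = UINT64_MAX;  // UINT64_MAX means "not yet initialized"
  uint64_t len_ = 0;
};

// Sketch of the populate logic described in the page_source.h hunks: append a
// contiguous page and return NEXT while the batch can still grow, or
// SHOULD_WAIT once the request is (or must be) sent to the provider.
Status FakeGetPage(FakePageRequest* req, uint64_t offset) {
  if (req->offset_ == UINT64_MAX) {
    req->offset_ = offset;  // first page initializes the request
    req->len_ = kPageSize;
  } else if (req->allow_batching_ && req->offset_ + req->len_ == offset) {
    req->len_ += kPageSize;  // contiguous: extend the batch
  } else {
    return SHOULD_WAIT;  // non-contiguous or unbatched: send the request as-is
  }
  if (!req->allow_batching_ || req->len_ >= kMaxBatchPages * kPageSize) {
    return SHOULD_WAIT;  // batch full (or batching disabled): send now
  }
  return NEXT;  // caller may keep adding pages, or finalize
}

// Caller-side finalize: close an open batch so the request can be waited on.
Status FakeFinalize(FakePageRequest* req) {
  return req->len_ > 0 ? SHOULD_WAIT : NOT_FOUND;
}
```

A caller loops while FakeGetPage returns NEXT, and calls FakeFinalize if the loop exits with the batch still open; only after SHOULD_WAIT does it wait on the request.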
diff --git a/zircon/kernel/vm/include/vm/page_source.h b/zircon/kernel/vm/include/vm/page_source.h
index a3c73eb..14af77e 100644
--- a/zircon/kernel/vm/include/vm/page_source.h
+++ b/zircon/kernel/vm/include/vm/page_source.h
@@ -201,8 +201,8 @@
// tolerant of having already been detached/closed.
virtual zx_status_t WaitOnRequest(PageRequest* request) = 0;
- // Called to complete a batched PageRequest if the last call to GetPage returned
- // ZX_ERR_SHOULD_WAIT *and* the |request->BatchAccepting| is true.
+ // Called to complete a batched PageRequest if the last call to GetPage
+ // returned ZX_ERR_NEXT.
//
// Returns ZX_ERR_SHOULD_WAIT if the PageRequest will be fulfilled after
// being waited upon.
@@ -246,12 +246,11 @@
// Sends a request to the backing source to provide the requested page.
//
// Returns ZX_OK if the request was synchronously fulfilled.
+ // Returns ZX_ERR_SHOULD_WAIT if the request will be asynchronously
+ // fulfilled. The caller should wait on |req|.
+ // Returns ZX_ERR_NEXT if the PageRequest is in batch mode and the caller
+ // can continue to add more pages to the request.
// Returns ZX_ERR_NOT_FOUND if the request cannot be fulfilled.
- // Returns ZX_ERR_SHOULD_WAIT if the request will be asynchronously fulfilled. If
- // |req->BatchAccepting| is true then additional calls to |GetPage| may be performed to add more
- // pages to the request, or if no more pages want to be added the request should be finalized by
- // |req->FinalizeRequest|. If |BatchAccepting| was false, or |req| was finalized, then the caller
- // should wait on |req|.
zx_status_t GetPage(uint64_t offset, PageRequest* req, VmoDebugInfo vmo_debug_info,
vm_page_t** const page_out, paddr_t* const pa_out);
@@ -364,20 +363,23 @@
// Helper that adds page at |offset| to |request| and potentially forwards it to the provider.
// |request| must already be initialized. |offset| must be page-aligned.
//
- // Returns ZX_ERR_NEXT if |internal_batching| is true and more pages can be added to the request,
- // in which case the caller of this function within PageSource *must* handle ZX_ERR_NEXT itself
- // before returning from PageSource. This option is used for request types that want to operate in
- // batch mode by default (e.g. DIRTY requests), where pages are added to the batch internally in
- // PageSource without involving the external caller.
+ // Returns ZX_ERR_SHOULD_WAIT if the request will be asynchronously fulfilled. The caller should
+ // wait on |request|.
//
- // Otherwise this method always returns ZX_ERR_SHOULD_WAIT, and transitions the
- // PageRequest::batch_state_ as required.
+ // Returns ZX_ERR_NEXT if the request is in batch mode and the caller can continue
+ // to add more pages to the request. The request can be in batch mode under two scenarios:
+ // 1) The request was created with |allow_batching_| set. The external caller into PageSource
+ // will handle the ZX_ERR_NEXT in this case, and add more pages.
+ // 2) |internal_batching| is true, in which case the caller of this function within PageSource
+ // *must* handle ZX_ERR_NEXT itself before returning from PageSource. This option is used for
+ // request types that want to operate in batch mode by default (e.g. DIRTY requests), where pages
+ // are added to the batch internally in PageSource without involving the external caller.
// TODO(rashaeqbal): Figure out if internal_batching can be unified with allow_batching_.
zx_status_t PopulateRequestLocked(PageRequest* request, uint64_t offset,
bool internal_batching = false) TA_REQ(page_source_mtx_);
// Helper used to complete a batched page request if the last call to PopulateRequestLocked
- // left the page request in the BatchRequest::Accepting state.
+ // returned ZX_ERR_NEXT.
zx_status_t FinalizeRequestLocked(PageRequest* request) TA_REQ(page_source_mtx_);
// Sends a request to the backing source, or adds the request to the overlap_ list if
@@ -415,8 +417,7 @@
public:
// If |allow_batching| is true, then a single request can be used to service
// multiple consecutive pages.
- explicit PageRequest(bool allow_batching = false)
- : batch_state_(allow_batching ? BatchState::Accepting : BatchState::Unbatched) {}
+ explicit PageRequest(bool allow_batching = false) : allow_batching_(allow_batching) {}
~PageRequest();
// Returns ZX_OK on success, or a permitted error code if the backing page provider explicitly
@@ -426,12 +427,6 @@
// Forwards to the underlying PageRequestInterface::FinalizeRequest, see that for details.
zx_status_t FinalizeRequest();
- // Returns |true| if this is a batch request that can still accept additional requests. If |true|
- // the |FinalizeRequest| method must be called before |Wait| can be used. If this is |false| then
- // either this is not a batch request, or the batch request has already been closed and does not
- // need to be finalized.
- bool BatchAccepting() const { return batch_state_ == BatchState::Accepting; }
-
DISALLOW_COPY_ASSIGN_AND_MOVE(PageRequest);
private:
@@ -441,20 +436,7 @@
void Init(fbl::RefPtr<PageRequestInterface> src, uint64_t offset, page_request_type type,
VmoDebugInfo vmo_debug_info);
- // The batch state is used both to implement a stateful query of whether a batch page request is
- // finished taking new requests or not, and to implement assertions to catch misuse of the request
- // API.
- enum class BatchState {
- // Does not support batching.
- Unbatched,
- // Supports batching and can keep taking new requests. A request in this state must have
- // FinalizeRequest called before it can be waited on.
- Accepting,
- // This was a batched request that has been finalized and may be waited on.
- Finalized
- };
-
- BatchState batch_state_;
+ const bool allow_batching_;
// The page source this request is currently associated with.
fbl::RefPtr<PageRequestInterface> src_;
diff --git a/zircon/kernel/vm/page_source.cc b/zircon/kernel/vm/page_source.cc
index 802fd1c..e5c4824 100644
--- a/zircon/kernel/vm/page_source.cc
+++ b/zircon/kernel/vm/page_source.cc
@@ -230,15 +230,6 @@
request->Init(fbl::RefPtr<PageRequestInterface>(this), offset, page_request_type::READ,
vmo_debug_info);
LTRACEF_LEVEL(2, "%p offset %lx\n", this, offset);
- } else {
- // The user should never have been trying to keep using a PageRequest that ended up in any state
- // other than Accepting.
- DEBUG_ASSERT(request->batch_state_ == PageRequest::BatchState::Accepting);
- if (request->src_.get() != static_cast<PageRequestInterface*>(this)) {
- // Ask the correct page source to finalize the request as we cannot add our page to it, and
- // the caller should stop trying to use it.
- return request->FinalizeRequest();
- }
}
return PopulateRequestLocked(request, offset);
@@ -259,9 +250,9 @@
#endif // DEBUG_ASSERT_IMPLEMENTED
bool send_request = false;
- zx_status_t res = ZX_ERR_SHOULD_WAIT;
+ zx_status_t res;
DEBUG_ASSERT(!request->provider_owned_);
- if (request->batch_state_ == PageRequest::BatchState::Accepting || internal_batching) {
+ if (request->allow_batching_ || internal_batching) {
// If possible, append the page directly to the current request. Else have the
// caller try again with a new request.
if (request->offset_ + request->len_ == offset) {
@@ -289,23 +280,21 @@
if (end_batch) {
send_request = true;
- } else if (internal_batching) {
+ res = ZX_ERR_SHOULD_WAIT;
+ } else {
res = ZX_ERR_NEXT;
}
} else {
send_request = true;
+ res = ZX_ERR_SHOULD_WAIT;
}
} else {
request->len_ = PAGE_SIZE;
send_request = true;
+ res = ZX_ERR_SHOULD_WAIT;
}
if (send_request) {
- if (request->batch_state_ == PageRequest::BatchState::Accepting) {
- // Sending the request means it's finalized and we do not want the caller attempting to add
- // further pages or finalizing.
- request->batch_state_ = PageRequest::BatchState::Finalized;
- }
SendRequestToProviderLocked(request);
}
@@ -344,9 +333,7 @@
void PageSource::SendRequestToProviderLocked(PageRequest* request) {
LTRACEF_LEVEL(2, "%p %p\n", this, request);
DEBUG_ASSERT(request->type_ < page_request_type::COUNT);
- DEBUG_ASSERT(request->offset_ != UINT64_MAX);
DEBUG_ASSERT(page_provider_->SupportsPageRequestType(request->type_));
- DEBUG_ASSERT(request->batch_state_ != PageRequest::BatchState::Accepting);
// Find the node with the smallest endpoint greater than offset and then
// check to see if offset falls within that node.
auto overlap = outstanding_requests_[request->type_].upper_bound(request->offset_);
@@ -513,9 +500,6 @@
DEBUG_ASSERT(type < page_request_type::COUNT);
type_ = type;
src_ = ktl::move(src);
- if (batch_state_ != BatchState::Unbatched) {
- batch_state_ = BatchState::Accepting;
- }
event_.Unsignal();
}
@@ -523,9 +507,6 @@
zx_status_t PageRequest::Wait() {
lockdep::AssertNoLocksHeld();
VM_KTRACE_DURATION(1, "page_request_wait", offset_, len_);
- // Ensure this request was both initialized, and if it's a batched request that it was finalized.
- ASSERT(offset_ != UINT64_MAX);
- ASSERT(batch_state_ != BatchState::Accepting);
zx_status_t status = src_->WaitOnRequest(this);
VM_KTRACE_FLOW_END(1, "page_request_signal", reinterpret_cast<uintptr_t>(this));
if (status != ZX_OK && !PageSource::IsValidInternalFailureCode(status)) {
@@ -536,8 +517,6 @@
zx_status_t PageRequest::FinalizeRequest() {
DEBUG_ASSERT(src_);
- DEBUG_ASSERT(batch_state_ == BatchState::Accepting);
- batch_state_ = BatchState::Finalized;
return src_->FinalizeRequest(this);
}
diff --git a/zircon/kernel/vm/vm_cow_pages.cc b/zircon/kernel/vm/vm_cow_pages.cc
index 18ede15..b93e276 100644
--- a/zircon/kernel/vm/vm_cow_pages.cc
+++ b/zircon/kernel/vm/vm_cow_pages.cc
@@ -2548,34 +2548,32 @@
zx_status_t res = LookupPagesLocked(offset, flags, DirtyTrackingAction::None, 1, &page_list,
page_request, &lookup_info);
if (unlikely(res == ZX_ERR_SHOULD_WAIT)) {
- if (page_request->get()->BatchAccepting()) {
- // In batch mode, will need to finalize the request later.
- if (!have_page_request) {
- // Stash how much we have committed right now, as we are going to have to reprocess this
- // range so we do not want to claim it was committed.
- *committed_len = offset - start_offset;
- have_page_request = true;
- }
- } else {
- // We can end up here in two cases:
- // 1. We were in batch mode but had to terminate the batch early.
- // 2. We hit the first missing page and we were not in batch mode.
- //
- // If we do have a page request, that means the batch was terminated early by
- // pre-populated pages (case 1). Return immediately.
- //
- // Do not update the |committed_len| for case 1 as we are returning on encountering
- // pre-populated pages while processing a batch. When that happens, we will terminate the
- // batch we were processing and send out a page request for the contiguous range we've
- // accumulated in the batch so far. And we will need to come back into this function again
- // to reprocess the range the page request spanned, so we cannot claim any pages have been
- // committed yet.
- if (!have_page_request) {
- // Not running in batch mode, and this is the first missing page (case 2). Update the
- // committed length we have so far and return.
- *committed_len = offset - start_offset;
- }
- return ZX_ERR_SHOULD_WAIT;
+ // We can end up here in two cases:
+ // 1. We were in batch mode but had to terminate the batch early.
+ // 2. We hit the first missing page and we were not in batch mode.
+ //
+ // If we do have a page request, that means the batch was terminated early by pre-populated
+ // pages (case 1). Return immediately.
+ //
+ // Do not update the |committed_len| for case 1 as we are returning on encountering
+ // pre-populated pages while processing a batch. When that happens, we will terminate the
+ // batch we were processing and send out a page request for the contiguous range we've
+ // accumulated in the batch so far. And we will need to come back into this function again
+ // to reprocess the range the page request spanned, so we cannot claim any pages have been
+ // committed yet.
+ if (!have_page_request) {
+ // Not running in batch mode, and this is the first missing page (case 2). Update the
+ // committed length we have so far and return.
+ *committed_len = offset - start_offset;
+ }
+ return ZX_ERR_SHOULD_WAIT;
+ } else if (unlikely(res == ZX_ERR_NEXT)) {
+ // In batch mode, will need to finalize the request later.
+ if (!have_page_request) {
+ // Stash how much we have committed right now, as we are going to have to reprocess this
+ // range so we do not want to claim it was committed.
+ *committed_len = offset - start_offset;
+ have_page_request = true;
}
} else if (unlikely(res != ZX_OK)) {
VMO_VALIDATION_ASSERT(DebugValidatePageSplitsHierarchyLocked());