[Preprocessor] Fix incorrect token caching that occurs when lexing _Pragma
in macro argument pre-expansion mode when skipping a function body

This commit fixes a token caching problem that currently occurs when clang is
skipping a function body (e.g. when looking for a code completion token) and at
the same time caching the tokens for _Pragma when lexing it in macro argument
pre-expansion mode.

When _Pragma is being lexed in macro argument pre-expansion mode, it caches the
tokens so that it can avoid interpreting the pragma immediately (as the macro
argument may not be used in the macro body), and then either backtracks over or
commits these tokens. The problem is that, when we're backtracking/committing in
such a scenario, there's already a previous backtracking position stored in
BacktrackPositions (as we're skipping the function body), and this leads to a
situation where the cached tokens from the pragma (like '(' 'string_literal'
and ')') will remain in the cached tokens array incorrectly even after they're
consumed (in the case of backtracking) or just ignored (in the case when they're
committed). Furthermore, what makes it even worse, is that because of a previous
backtracking position, the logic that deals with when should we call
ExitCachingLexMode in CachingLex no longer works for us in this situation, and
more tokens in the macro argument get cached, to the point where the EOF token
that corresponds to the macro argument EOF is cached. This problem leads to all
sorts of issues in code completion mode, where incorrect errors get presented
and code completion completely fails to produce completion results.

rdar://28523863

Differential Revision: https://reviews.llvm.org/D28772

git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@296140 91177308-0d34-0410-b5e6-96231b3b80d8
(cherry picked from commit b35cf4128a9e78994ff33b963728456db9471d3a)
diff --git a/include/clang/Lex/Preprocessor.h b/include/clang/Lex/Preprocessor.h
index 66ff490..0216901 100644
--- a/include/clang/Lex/Preprocessor.h
+++ b/include/clang/Lex/Preprocessor.h
@@ -1085,6 +1085,24 @@
   /// \brief Disable the last EnableBacktrackAtThisPos call.
   void CommitBacktrackedTokens();
 
+  struct CachedTokensRange {
+    CachedTokensTy::size_type Begin, End;
+  };
+
+private:
+  /// \brief A range of cached tokens that should be erased after lexing
+  /// when backtracking requires the erasure of such cached tokens.
+  Optional<CachedTokensRange> CachedTokenRangeToErase;
+
+public:
+  /// \brief Returns the range of cached tokens that were lexed since
+  /// EnableBacktrackAtThisPos() was previously called.
+  CachedTokensRange LastCachedTokenRange();
+
+  /// \brief Erase the range of cached tokens that were lexed since
+  /// EnableBacktrackAtThisPos() was previously called.
+  void EraseCachedTokens(CachedTokensRange TokenRange);
+
   /// \brief Make Preprocessor re-lex the tokens that were lexed since
   /// EnableBacktrackAtThisPos() was previously called.
   void Backtrack();
diff --git a/lib/Lex/PPCaching.cpp b/lib/Lex/PPCaching.cpp
index 4742aae..79776f4 100644
--- a/lib/Lex/PPCaching.cpp
+++ b/lib/Lex/PPCaching.cpp
@@ -35,6 +35,29 @@
   BacktrackPositions.pop_back();
 }
 
+Preprocessor::CachedTokensRange Preprocessor::LastCachedTokenRange() {
+  assert(isBacktrackEnabled());
+  auto PrevCachedLexPos = BacktrackPositions.back();
+  return CachedTokensRange{PrevCachedLexPos, CachedLexPos};
+}
+
+void Preprocessor::EraseCachedTokens(CachedTokensRange TokenRange) {
+  assert(TokenRange.Begin <= TokenRange.End);
+  if (CachedLexPos == TokenRange.Begin && TokenRange.Begin != TokenRange.End) {
+    // We have backtracked to the start of the token range as we want to consume
+    // them again. Erase the tokens only after consuming then.
+    assert(!CachedTokenRangeToErase);
+    CachedTokenRangeToErase = TokenRange;
+    return;
+  }
+  // The cached tokens were committed, so they should be erased now.
+  assert(TokenRange.End == CachedLexPos);
+  CachedTokens.erase(CachedTokens.begin() + TokenRange.Begin,
+                     CachedTokens.begin() + TokenRange.End);
+  CachedLexPos = TokenRange.Begin;
+  ExitCachingLexMode();
+}
+
 // Make Preprocessor re-lex the tokens that were lexed since
 // EnableBacktrackAtThisPos() was previously called.
 void Preprocessor::Backtrack() {
@@ -51,6 +74,13 @@
 
   if (CachedLexPos < CachedTokens.size()) {
     Result = CachedTokens[CachedLexPos++];
+    // Erase the some of the cached tokens after they are consumed when
+    // asked to do so.
+    if (CachedTokenRangeToErase &&
+        CachedTokenRangeToErase->End == CachedLexPos) {
+      EraseCachedTokens(*CachedTokenRangeToErase);
+      CachedTokenRangeToErase = None;
+    }
     return;
   }
 
diff --git a/lib/Lex/Pragma.cpp b/lib/Lex/Pragma.cpp
index 1ce4b43..0fb5cc0 100644
--- a/lib/Lex/Pragma.cpp
+++ b/lib/Lex/Pragma.cpp
@@ -142,12 +142,23 @@
 
   ~LexingFor_PragmaRAII() {
     if (InMacroArgPreExpansion) {
+      // When committing/backtracking the cached pragma tokens in a macro
+      // argument pre-expansion we want to ensure that either the tokens which
+      // have been committed will be removed from the cache or that the tokens
+      // over which we just backtracked won't remain in the cache after they're
+      // consumed and that the caching will stop after consuming them.
+      // Otherwise the caching will interfere with the way macro expansion
+      // works, because we will continue to cache tokens after consuming the
+      // backtracked tokens, which shouldn't happen when we're dealing with
+      // macro argument pre-expansion.
+      auto CachedTokenRange = PP.LastCachedTokenRange();
       if (Failed) {
         PP.CommitBacktrackedTokens();
       } else {
         PP.Backtrack();
         OutTok = PragmaTok;
       }
+      PP.EraseCachedTokens(CachedTokenRange);
     }
   }
 
diff --git a/test/CodeCompletion/pragma-macro-token-caching.c b/test/CodeCompletion/pragma-macro-token-caching.c
new file mode 100644
index 0000000..432706e
--- /dev/null
+++ b/test/CodeCompletion/pragma-macro-token-caching.c
@@ -0,0 +1,18 @@
+
+#define Outer(action) action
+
+void completeParam(int param) {
+    ;
+    Outer(__extension__({ _Pragma("clang diagnostic push") }));
+    param;
+}
+
+// RUN: %clang_cc1 -fsyntax-only -code-completion-at=%s:7:1 %s | FileCheck %s
+// CHECK: param : [#int#]param
+
+void completeParamPragmaError(int param) {
+    Outer(__extension__({ _Pragma(2) })); // expected-error {{_Pragma takes a parenthesized string literal}}
+    param;
+}
+
+// RUN: %clang_cc1 -fsyntax-only -verify -code-completion-at=%s:16:1 %s | FileCheck %s