[engine] Avoid calling git ls-files unnecessarily

When files are specified on the command line, only those files should be
included in the `scm` object. As an optimization, we should also avoid
calling `git ls-files` just to discover shac.star files, since it adds
noticeable latency on large repositories.

Instead, traverse the filesystem to discover shac.star files that may
apply to the listed files.
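
As a rough illustration of that traversal (not the code from this change),
the sketch below assumes that a shac.star file may apply to a listed file
when it sits in the file's own directory or in any ancestor directory up to
the root, and that the listed paths are relative to the root. The names
`ancestorDirs` and `discoverShacFiles` are hypothetical.

```go
package main

import (
	"os"
	"path/filepath"
)

// entrypoint is the file name the commit message refers to.
const entrypoint = "shac.star"

// ancestorDirs returns every directory from the root (".") down to the
// directory of each listed file, as slash-separated paths. Each directory
// appears once, parents before children. Paths are assumed to be relative
// to the repository root (an assumption of this sketch).
func ancestorDirs(files []string) []string {
	seen := map[string]bool{}
	var dirs []string
	for _, f := range files {
		// Walk upward from the file's directory until Dir stops changing.
		var chain []string
		d := filepath.Dir(filepath.Clean(f))
		for {
			chain = append(chain, filepath.ToSlash(d))
			parent := filepath.Dir(d)
			if parent == d {
				break
			}
			d = parent
		}
		// Record the chain root-first so the order is stable.
		for i := len(chain) - 1; i >= 0; i-- {
			if !seen[chain[i]] {
				seen[chain[i]] = true
				dirs = append(dirs, chain[i])
			}
		}
	}
	return dirs
}

// discoverShacFiles stats only the candidate directories instead of
// listing the whole repository, and returns the shac.star files found.
func discoverShacFiles(root string, files []string) []string {
	var found []string
	for _, d := range ancestorDirs(files) {
		p := filepath.Join(root, filepath.FromSlash(d), entrypoint)
		if info, err := os.Stat(p); err == nil && info.Mode().IsRegular() {
			found = append(found, p)
		}
	}
	return found
}
```

For the inputs `a/b/c.go` and `a/d.go`, `ancestorDirs` yields `.`, `a` and
`a/b`, so at most three `os.Stat` calls are needed regardless of repository
size.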

The overhead of the `Run` function (from the start until after all
checks have been registered) goes from ~670ms to ~5ms for fuchsia.git
when a single file is specified on the command line.

The only side effect I'm aware of is that when files are specified on
the command line, some `shac.star` files will now be run that wouldn't
normally be included. For example:
1. If a `dir1/shac.star` is git-ignored then it wouldn't normally be
   run, but it will now be run if a file in `dir1` is specified. This is
   probably going to be very rare; I don't see a reason for individual
   shac.star files to be git-ignored.
2. If a file in a submodule or in an entirely git-ignored directory is
   specified, then shac.star files from that directory may be run. This is
   more of an issue, as it's plausible that a dependency submodule could
   also use shac and have its own shac.star files that shouldn't be run,
   because shac should only ever run checks that are registered in the
   root repository's shac.star file.

We can fix both of these eventually by skipping shac.star files that
aren't known to the underlying scm. For now they're rare enough that we
can ignore them.

Change-Id: I4e6e6a81bd6eccb0d3ee3fd32bf9b68d1ceef933
Reviewed-on: https://fuchsia-review.googlesource.com/c/shac-project/shac/+/912734
Reviewed-by: Marc-Antoine Ruel <maruel@google.com>
Commit-Queue: Auto-Submit <auto-submit@fuchsia-infra.iam.gserviceaccount.com>
Fuchsia-Auto-Submit: Oliver Newman <olivernewman@google.com>
3 files changed
tree: af038a95850338f0e3377d0d6181b42117f0c1ef
  1. .github/
  2. checks/
  3. doc/
  4. images/
  5. internal/
  6. scripts/
  7. vendor/
  8. .gitignore
  9. AUTHORS
  10. codecov.yml
  11. CONTRIBUTING.md
  12. go.mod
  13. go.sum
  14. LICENSE
  15. main.go
  16. OWNERS
  17. PATENTS
  18. README.md
  19. shac.star
  20. shac.textproto
README.md

shac

Shac (Scalable Hermetic Analysis and Checks) is a unified and ergonomic tool and framework for writing and running static analysis checks.

Shac checks are written in Starlark.

Usage

go install go.fuchsia.dev/shac-project/shac@latest
shac check
shac doc shac.star | less

Documentation

Road map

Planned features/changes, in descending order by priority:

  • [x] Configuring files to exclude from shac analysis in shac.textproto
  • [x] Include unstaged files in analysis, including respecting unstaged shac.star files
  • [x] Automatic fix application with handling for conflicting suggestions
  • [ ] Provide a .shac cache directory that checks can write to
  • [ ] Mount checkout directory read-only
    • [x] By default
    • [ ] Unconditionally
  • [ ] Give checks access to the commit message via ctx.scm
  • [ ] Built-in formatting of Starlark files
  • [ ] Configurable “pass-throughs” - non-default environment variables and mounts that can optionally be passed through to the sandbox
  • [ ] Add glob arguments to ctx.scm.{all,affected}_files() functions for easier filtering
  • [ ] Filesystem sandboxing on macOS
  • [ ] Windows sandboxing
  • [ ] Testing framework for checks

Contributing

⚠ The source of truth is at https://fuchsia.googlesource.com/shac-project/shac.git and uses Gerrit for code review.

See CONTRIBUTING.md to submit changes.