| # Creating static analyzers for Fuchsia |
| |
| Shac (Scalable Hermetic Analysis and Checks) is a unified and ergonomic tool and |
| framework for writing and running static analysis checks. The tool’s source can |
| be found in the [shac-documentation]. Shac checks are written in |
| [Starlark]. |
| |
| ## Setup |
| |
| Shac script implementations live in Fuchsia’s `//scripts/shac` directory. |
| |
| * A shac check is implemented as a starlark functions which takes a ctx |
| argument. Use this ctx argument to access the shac standard library. |
| * If your check is language specific, it should go in one of the language |
| specific files (Eg: `rust.star`, `go.star`, `fidl.star`). If it’s language |
| specific but does not have a `language.star` file, then create one. If it’s generic, |
| use `title.star` (where title is the name of the check function). |
| |
| ### Simple Example |
| |
| The following example is a static analyzer on all files that creates a |
| non-blocking, gerrit warning comment on changes where the string “http://” |
| exists, pointing the user to use “https://” instead. |
| |
| ```python |
| def http_links(ctx): |
| for path, meta in ctx.scm.affected_files().items(): |
| for num, line in meta.new_lines(): |
| matches = ctx.re.allmatches(r"(http://)\w+", line) |
| if not matches: |
| continue |
| for match in matches: |
| ctx.emit.finding( |
| message = "Avoid http:// links, prefer https://", |
| # Change to "error" if the check should block presubmit. |
| level = "warning", |
| filepath = path, |
| line = num, |
| col = match.offset + 1, |
| end_col = match.offset + 1 + len(match.groups[1]), |
| replacements = ["https://"], |
| ) |
| ``` |
| |
| Learn more about shac’s implementation of [emit.findings]. |
| |
| Note: Shac does not automatically discover checks. In order for a check to run, |
| a check function must be passed to `shac.register_check()` in |
| `//scripts/shac/main.star`: |
| |
| ```python |
| load("./http_links.star", "http_links") # NEW |
| |
| ... |
| |
| def register_all_checks(): |
| ... |
| shac.register_check(http_links) # NEW |
| ... |
| ``` |
| |
| Note: When implementing a new check in a file that already contains other |
| checks, you may be able to register the new check within that file. For |
| example, `//scripts/shac/fidl.star` has a `register_fidl_checks()` function |
| that gets called from `//scripts/shac/main.star`. Add new FIDL |
| checks to `fidl.star` and register them in the `register_fidl_checks()` |
| function in the same file. |
| |
| ### Advanced example |
| |
| Using a subprocess is useful if there’s an existing tool that does the check or |
| if the logic of the check is complex (e.g. more than just a substring search). |
| Starlark is intentionally feature-limited to encourage writing complicated |
| business logic in a self-contained tool with its own unit tests. |
| |
| The following is an example of a JSON formatter implemented in a separate Python |
| script and run as a subprocess. |
| |
| Rather than rewriting badly formatted files, the check computes the formatted |
| contents and passes them to the `replacements` argument of the |
| `ctx.emit.finding()` function. All formatting checks must be implemented this |
| way, for the following reasons: |
| |
| * Subprocesses run by checks are not allowed to write to files in the checkout |
| directory. This prevents badly behaved tools from making unexpected changes, and |
| ensures that it's safe to run multiple checks in parallel without risking race |
| conditions. (Note that filesystem sandboxing is only enforced on Linux). |
| * Shac is designed to integrate easily with other automation that needs to |
| propose the change to the user (e.g. in Gerrit) rather than automatically |
| applying the change, so in order for these use cases to work the diff must be |
| passed into shac rather than applied by a subprocess. |
| |
| ```python |
| import json |
| import sys |
| |
| |
| def main(): |
| # Accepts one positional argument referring to the file to format. |
| path = sys.args[1] |
| with open(path) as f: |
| original = f.read() |
| # Always use 2-space indents and a trailing blank line. |
| formatted = json.dumps(json.loads(original), indent=2) + "\n" |
| if formatted == original: |
| sys.exit(0) |
| else: |
| print(json.dumps(doc, indent=2) + "\n") |
| sys.exit(1) |
| |
| |
| if __name__ == "__main__": |
| main() |
| ``` |
| |
| ```python |
| load("./common.star", "FORMATTER_MSG", "cipd_platform_name", "get_fuchsia_dir", "os_exec") |
| |
| def json_format(ctx): |
| # Launch processes in parallel. |
| procs = {} |
| for f in ctx.scm.affected_files(): |
| if not f.endswith(".json"): |
| continue |
| # Call fuchsia-specific `os_exec` function instead of |
| # `ctx.os.exec()` to ensure proper executable resolution. |
| # `os_exec` starts the subprocess but does not block. |
| procs[f] = os_exec(ctx, [ |
| "%s/prebuilt/third_party/python3/%s/bin/python3" % ( |
| get_fuchsia_dir(ctx), |
| cipd_platform_name(ctx), |
| ), |
| "scripts/shac/json_format.py", |
| f, |
| ]) |
| |
| for f, proc in procs.items(): |
| # wait() blocks until the process completes. |
| res = proc.wait() |
| if proc.retcode != 0: |
| ctx.emit.finding( |
| level = "error", |
| filepath = f, |
| # FORMATTER_MSG is the standard message for formatters |
| # in fuchsia.git. |
| message = FORMATTER_MSG, |
| # json_format.py prints the formatted file contents to stdout. |
| # Passing it to `replacements` is necessary for shac to know |
| # how to apply the fix. |
| replacements = [res.stdout], |
| ) |
| |
| # TODO: call this somewhere |
| shac.register_check(shac.check( |
| json_format, |
| # Mark the check as a formatter. Only checks with `formatter = True` |
| # get run by `fx format-code`. |
| formatter = True, |
| )) |
| |
| ``` |
| |
| ##### Performance optimization |
| |
| Some formatters have built-in support for validating the formatting of many |
| files at a time, which is often parallelized internally and therefore much |
| faster than launching a separate subprocess to check every file. In this case, |
| you can run the formatter once on all files in "check" mode to get a list of |
| badly formatted files, and then iterate over only the badly formatted files to |
| get the formatted result (as opposed to iterating over all files). |
| |
| Example: for [rustfmt] first run `rustfmt --check --files-with-diff |
| <all rust files>` to get a list of badly formatted files, then run `rustfmt` |
| separately on each file to get the formatted result. |
| |
| If the formatter does not have a dry-run mode to print the formatted result to |
| `stdout`: The formatter subprocesses will not be able to write to the checkout. |
| However, some formatters unconditionally write files. In this case, you'll need |
| to copy each file into a tempdir, to which the subprocess can write, format the |
| temp file, and report its contents, as an example see [buildifier]. |
| |
| By default, `os_exec` raises an un-recoverable error if the subprocess produces |
| a nonzero return code. If non-zero return codes are expected, you can use the |
| ok_retcodes parameter, e.g. `ok_retcodes = [0, 1]` may be appropriate if the |
| formatter produces a return code of 1 when the file is unformatted. |
| |
| ### Locally running checks |
| |
| During local check development it’s recommended to test your check by running |
| shac directly via `fx host-tool shac check <file>`. Let’s create a scenario in |
| which we can test the `http_links` check described above: |
| |
| 1. Find a file that currently violates the check, or create a new one if one |
| doesn't exist, eg: `echo "http://example.com" > temp.txt` |
| 1. `fx host-tool shac check --only http_links temp.txt` |
| * This should fail and print the file contents with "http://" highlighted |
| * `--only` causes shac to only run the http_links check, excluding other |
| checks because in this instance we only care about testing http_links and |
| don't care about results from other checks |
| 1. `fx host-tool shac fix --only http_links temp.txt` should change the http:// |
| to https:// |
| 1. `fx host-tool shac check --only http_links temp.txt` Should now pass |
| 1. `fx host-tool shac check --only http_links --all` |
| * Runs on all files in the tree (except git-ignored or ignored in |
| `//shac.textproto`), not just changed files |
| * If this fails with errors, then you'll need to fix those errors in the |
| offending files either in the same commit or in a separate commit |
| (preferable if there are more than ~10 files to fix) before landing your |
| check. |
| * Alternatively, land the check as non-blocking, fix the errors, then |
| switch it to blocking |
| * If your check emits warnings, note how many warnings there are. If there |
| is a very large number (more than 100s) this will lead to many noisy |
| Gerrit comments and may be disruptive to other contributors. Consider doing a |
| bulk fix-up beforehand, reducing the scope of the check or reconsidering the |
| check’s usefulness. |
| 1. Finally, upload your check to Gerrit, run pre-submit, examine the failures |
| with the goal of 0 failures. (Presubmit’s behavior is the same as running `fx |
| host-tool shac check --all`) |
| |
| It is recommended that you document your check if it is opt-in (not run in pre-submit) or there's a non-obvious |
| opt-out mechanism. All documentation should be added to `//docs/development/source_code/presubmit_checks.md` |
| |
| <!-- Reference links --> |
| |
| [starlark]: https://bazel.build/rules/language |
| [emit.findings]: https://fuchsia.googlesource.com/shac-project/shac/+/HEAD/doc/stdlib.md#ctx_emit_finding |
| [shac-documentation]: https://fuchsia.googlesource.com/shac-project/shac/+/refs/heads/main/doc/stdlib.md |
| [rustfmt]: https://cs.opensource.google/fuchsia/fuchsia/+/main:scripts/shac/rust.star |
| [buildifier]: https://cs.opensource.google/fuchsia/fuchsia/+/main:scripts/shac/starlark.star;l=7 |