tree eb2e9e9e09de0b9db8adbfbd9f94d97b2664c1dc
parent f20db3b6f339e35246aac1036d696aec48edd6fa
author Bill Neubauer <wcn@google.com> 1644893008 -0800
committer Bill Neubauer <wcn@google.com> 1647470327 -0700

Modify the classifier to use the new asset layout for input files.

The assets directory was created in cl/427532245 to give content a directory
structure that is easier to manage and extend. The structure of a filename is
now:

Category/Name_of_content/name_of_variant.txt

Accordingly, all licenses are in License, headers are in Header, and new
content types can be added just by creating a directory in assets. Every
variant in a content directory matches that content, which takes its name from
the directory.
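
As a rough illustration of the layout described above (not the code in this
change), a path could be split into its three parts as sketched below. The
assetInfo type, the parseAssetPath function, and the example path are all
hypothetical names chosen for this sketch.

```go
package main

import (
	"fmt"
	"strings"
)

// assetInfo is a hypothetical type for this sketch; it is not part of the
// classifier's API.
type assetInfo struct {
	Category string // e.g. "License" or "Header"
	Name     string // content name, taken from the directory
	Variant  string // variant file name without the .txt suffix
}

// parseAssetPath splits a path of the form
// Category/Name_of_content/name_of_variant.txt into its parts.
// Paths with a different shape, or without a .txt suffix, are rejected.
func parseAssetPath(p string) (assetInfo, error) {
	parts := strings.Split(p, "/")
	if len(parts) != 3 {
		return assetInfo{}, fmt.Errorf("asset path %q: want Category/Name/variant.txt", p)
	}
	for _, part := range parts {
		if part == "" {
			return assetInfo{}, fmt.Errorf("asset path %q: empty path element", p)
		}
	}
	if !strings.HasSuffix(parts[2], ".txt") {
		return assetInfo{}, fmt.Errorf("asset path %q: variant must end with .txt", p)
	}
	return assetInfo{
		Category: parts[0],
		Name:     parts[1],
		Variant:  strings.TrimSuffix(parts[2], ".txt"),
	}, nil
}

func main() {
	// The path below is an invented example, not a file known to exist
	// in assets.
	info, err := parseAssetPath("License/Example-1.0/variant_a.txt")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("category=%s name=%s variant=%s\n", info.Category, info.Name, info.Variant)
}
```

With this shape, the content name comes only from the directory, so adding a
variant is just a matter of dropping another .txt file next to the others.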

Future work will involve placing pristine license copies (called pristine.txt)
in each directory, so we have a clear reference copy that can be made available
for other purposes. This will address an issue where modified copies of
licenses were getting pulled into manifests for cloud projects.

The changes to the code replace the filename parsing scheme with the new
directory-based scheme. This also fixes a problem with LicenseRef-iccjpeg: the
original license file didn't end with .txt, so the classifier never picked it
up. expected.txt was updated to include these new hits.

PiperOrigin-RevId: 428660046
