Include unlicensed files in scanner results #9435
Labels: configuration, enhancement, scanner
What is the existing functionality and how should it be enhanced?
Currently, the scanner does not include files without licenses.
Problems
Improvement
To make sure that all files are considered in the scanning phase, and that they can potentially be curated, I propose adding a flag (include-unlicensed: boolean) to the scan phase so that all files found to NOT have a license are included in the scan_results.summary.licenses object. Since this happens when calling the scan phase, these results should be recorded in the scan-result.json file. Those files would be listed with the license NONE (or whatever is the default for an unknown license).
What is the use-case for your enhancement?
Source SBOMs may need to include all files in a repository. At the moment, SBOM generation also includes the files without a license, but there is no option to curate files that should have a specific license. With the flag include-unlicensed: true, the scanner includes unlicensed files in the ORT scan result and gives developers the possibility to curate those files if needed.
As an example, projects with a single license at the top level can enable this flag to include all files with the NONE license, and then apply a curation to all files that should have the MIT license.
I believe this is quite a common case. Examples include the Elixir programming language (https://github.com/elixir-lang/elixir), Gleam (https://github.com/gleam-lang/gleam), the Django web framework (this shows an example of a file without a license, so no license is applied AFAIK), and the Rails web framework, where each folder contains the license that applies.
Alternatives you have considered
I have a script that parses the ORT scanner result for the files with licenses and for all files with a SHA1 hash, takes the set difference, and adds the missing files to the corresponding scanner field with the license NONE. This works, but I am not sure how maintainable it will be in the future. It means I need to run the ORT analyzer and scanner, then run a custom script, and then run the evaluator to get results and apply curations.
Additional context
--
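For illustration, the set-difference workaround described under the alternatives above could be sketched roughly as follows. This is a minimal sketch, not the actual script: the flat list of paths and the {"path", "license"} finding dictionaries are hypothetical simplifications, and the real scan-result.json layout is more deeply nested.

```python
# Hypothetical, simplified model of a scan result; NOT the real ORT schema.
NO_LICENSE = "NONE"  # placeholder license named in this issue


def find_unlicensed(all_paths: list[str], license_findings: list[dict]) -> list[str]:
    """Return the sorted paths that have no license finding at all."""
    licensed = {finding["path"] for finding in license_findings}
    return sorted(set(all_paths) - licensed)


def add_unlicensed_findings(all_paths: list[str], license_findings: list[dict]) -> list[dict]:
    """Return a new findings list with a NONE entry appended for every unlicensed path."""
    extra = [
        {"path": path, "license": NO_LICENSE}
        for path in find_unlicensed(all_paths, license_findings)
    ]
    return list(license_findings) + extra


# Example input (assumed data): every scanned file, plus the findings the scanner produced.
all_paths = ["LICENSE", "lib/a.ex", "lib/b.ex"]
license_findings = [{"path": "LICENSE", "license": "MIT"}]

merged = add_unlicensed_findings(all_paths, license_findings)
```

With an include-unlicensed option in the scanner itself, this post-processing step between the scanner and the evaluator would no longer be needed.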