From a complexity perspective, it is much easier to ignore submodules.
Consider the following scenario:
Project Foo is licensed under Apache-2.0. Project Bar is a submodule of Project Foo, licensed under MIT. Both projects use a `.reuse/dep5` file to license some files.
If a tool wants to verify the compliance of Project Foo *including* the submodule, it has to:
- Detect that Bar is a submodule.
- Separately (and recursively!) lint Bar.
- Combine the results of that lint into the results of the lint of Foo.
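To make the extra work concrete, here is a rough Python sketch of those three steps. It is not how any existing REUSE tool works. `LintResult`, `submodule_paths`, `lint_recursive`, and the `lint_one` and `read_gitmodules` callbacks are all names I made up for illustration, and I am assuming submodules can be found by reading the `path = ...` entries of a `.gitmodules` file.

```python
import re
from dataclasses import dataclass, field


@dataclass
class LintResult:
    """Hypothetical container for the lint outcome of one project."""
    root: str
    problems: list = field(default_factory=list)
    children: list = field(default_factory=list)

    def all_problems(self):
        # Step 3: fold the submodule results into the parent's results.
        out = [f"{self.root}: {p}" for p in self.problems]
        for child in self.children:
            out.extend(child.all_problems())
        return out


def submodule_paths(gitmodules_text):
    # Step 1: detect submodules from the `path = ...` lines of .gitmodules.
    pattern = r"^[ \t]*path[ \t]*=[ \t]*(.+?)[ \t]*$"
    return re.findall(pattern, gitmodules_text, flags=re.MULTILINE)


def lint_recursive(root, lint_one, read_gitmodules):
    # `lint_one(root)` returns a list of problems for that project alone;
    # `read_gitmodules(root)` returns the .gitmodules text, or "" if absent.
    result = LintResult(root, list(lint_one(root)))
    for path in submodule_paths(read_gitmodules(root)):
        # Step 2: lint each submodule separately, and recursively.
        child_root = f"{root}/{path}"
        result.children.append(
            lint_recursive(child_root, lint_one, read_gitmodules)
        )
    return result
```

Even this toy version needs a tree of results and a rule for merging them, and it does not yet touch the question of which project's `.reuse/dep5` applies to which files.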
An even more interesting scenario:
Project Foo has a `.reuse/dep5` file containing a glob that declares everything in submodule Bar to be licensed under 0BSD. This used to be true, but Bar has since changed its licensing to MIT.
- Should the `.reuse/dep5` declaration in Foo still be valid, even though everything in Bar is licensed under MIT these days?
- If yes, should there be a `LICENSES/0BSD.txt` file in Bar?
- Should there be a `LICENSES/0BSD.txt` file in Foo?
---
It's not impossible to handle submodules, but doing so adds complexity that is unnecessary, because you can simply lint each project independently. For that reason, I'd rather submodules were ignored for REUSE compliance.
The Specification could state this explicitly to remove any ambiguity.
Yours with kindness, Carmen