Add optional schema_type for data format hinting (ala CVE) #134

joshbuker · 2023-03-29T20:55:42Z

Revisiting #51

Fixes #30

joshbuker · 2023-03-29T21:13:12Z

This is particularly useful for distinguishing data that looks like OSV but is actually custom, e.g. some of the entries in ruby-advisory-db.

joshbuker · 2023-03-30T00:19:52Z

This change is fully backward compatible: schema_type has not been used historically, and because it is optional, consumers can ignore it.

Signed-off-by: Josh Buker <[email protected]>

oliverchang · 2023-03-30T07:39:55Z

Hmm, I'm not sure if we arrived at any conclusions in #51. I recall that the main point against this is that there is no standardised way to label JSON fields with their type, so for a parser to detect what format a JSON entry is, it'd need specialised knowledge of the format anyway. If you get an OSV from somewhere (e.g. https://osv.dev, or a database export explicitly formatted as OSV), you already typically know that it's formatted as OSV. I can't think of a real use case where this would not be the case. Generally, communicating about types is also typically more the domain of the protocol serving these (e.g. mime types, file extensions).

For databases like GSD, which deal with many different types of data, labelling the data can be done more consistently through the namespacing approach instead, wrapping each entry with the type it actually is.

I realise that this seems like a small change to be adding, but one of the guiding principles of OSV is to be minimal, and have each field serve intentional use cases that we've encountered.

joshbuker · 2023-03-30T12:37:23Z

This is primarily so that the data is explicit even without the context of a file server, i.e. the json/yaml can be parsed standalone without any doubts.

For example, if GSD extends the OSV schema to require fields such as summary, details, and schema_version, we would want to use a schema_type of OSV-GSD so that data could clearly be distinguished between OSV and the slightly expanded format.

This can definitely be solved with wrappers and server hinting. However, in the scenario where those file servers shut down and someone wants to do some archaeology on the archived json/yaml, this type of format hinting could be invaluable. It also reduces the institutional knowledge required, by including that hinting in the data directly rather than in service documentation. It also simplifies tooling, as you can dynamically scan data and deterministically validate it against the related schema (dependent, of course, on the schema providing declarative hinting, like CVE and CSAF, the only other common vuln id formats, currently do).
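To illustrate the tooling point, here is a minimal sketch of hint-based detection. It assumes the proposed `schema_type` field exists and also checks `dataType`, the top-level hint used by CVE JSON 5 records. The function name, the list of hint fields, and the sample record are hypothetical and not part of any schema.

```python
import json

# Top-level fields that may carry a declared format hint.
# "schema_type" is the field proposed in this PR (an assumption, not in OSV today);
# "dataType" is the hint used by CVE JSON 5 records.
HINT_FIELDS = ("schema_type", "dataType")


def declared_format(raw: str):
    """Return the declared format hint of a JSON document, or None if absent."""
    record = json.loads(raw)
    if not isinstance(record, dict):
        return None
    for field in HINT_FIELDS:
        value = record.get(field)
        if isinstance(value, str):
            return value
    return None


# Made-up record used purely for illustration.
print(declared_format('{"schema_type": "OSV-GSD", "id": "EXAMPLE-0001"}'))
```

A tool could then use the returned string to pick which schema to validate against, falling back to other heuristics when the result is `None`.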

kurtseifried · 2023-04-05T04:05:32Z

To confirm, has this been rejected?

joshbuker · 2023-04-19T02:53:57Z

@chrisbloom7 @oliverchang Checking in on status. Is this officially rejected?

oliverchang · 2023-04-19T07:13:33Z

> @chrisbloom7 @oliverchang Checking in on status. Is this officially rejected?

I don't see any of the arguments against this in the original #51 addressed, so this is a no from me.

joshbuker · 2023-04-24T17:46:48Z

> > @chrisbloom7 @oliverchang Checking in on status. Is this officially rejected?
>
> I don't see any of the arguments against this in the original #51 addressed, so this is a no from me.

@oliverchang What are the current arguments against, just for clarity?

I feel my earlier comment addresses the pushback on doing this exclusively with a wrapper/server hinting, and I have yet to get feedback on why OSV can't match what the other two formats (CVE & CSAF) are currently doing when it comes to format hinting.

oliverchang · 2023-04-26T22:26:24Z

The main point against is that this doesn't really offer any reliable way for an automated system to determine what format this is, because to make use of it you'd need to have pre-existing knowledge of the OSV format to begin with. This would ideally be addressed at a protocol level of some form.

For your use case of identifying random files lying around without any other metadata about what they are: the same can also likely be achieved just by running the JSON validator using the latest version of the schema and also making sure that all fields exist in the schema. We can likely also improve the JSON schema / validator to make this easier (e.g. setting additionalProperties to false among other things).
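The structural approach described above could look roughly like the following sketch. It mimics a schema with `additionalProperties` set to false by hand rather than calling a real JSON Schema validator. The field list is an abbreviated assumption that may not match the current schema exactly, and `looks_like_osv` is a hypothetical helper; a real check would load the official OSV JSON schema instead.

```python
# Abbreviated, assumed list of top-level OSV fields (not authoritative).
OSV_TOP_LEVEL = {
    "schema_version", "id", "modified", "published", "withdrawn",
    "aliases", "related", "summary", "details", "severity",
    "affected", "references", "credits", "database_specific",
}

# The OSV schema requires "id" and "modified".
OSV_REQUIRED = {"id", "modified"}


def looks_like_osv(record: dict) -> bool:
    """True if all required fields are present and no unknown top-level field exists."""
    keys = set(record)
    return OSV_REQUIRED <= keys and keys <= OSV_TOP_LEVEL


# Made-up records used purely for illustration.
print(looks_like_osv({"id": "EXAMPLE-0001", "modified": "2023-04-26T00:00:00Z"}))
print(looks_like_osv({"id": "EXAMPLE-0001", "modified": "2023-04-26T00:00:00Z", "gsd": {}}))
```

Under these assumptions, an entry that adds custom top-level fields is rejected, which is how an OSV-like but extended format would be told apart from plain OSV.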

We've also not seen any other requests for this, across all of our existing producers and consumers of OSV data.

Adding fields to the OSV schema is very expensive, and we need to make sure every field added (even if they're optional) serves an intentional purpose.

joshbuker mentioned this pull request Mar 29, 2023

OSV-GSD Extended Schema CloudSecurityAlliance/gsd-tools#197

Closed

joshbuker added 2 commits March 29, 2023 17:56

Add optional schema_type for data format hinting (ala CVE)

fab176b

Signed-off-by: Josh Buker <[email protected]>

Add documentation for schema_type field

bf57f98

Signed-off-by: Josh Buker <[email protected]>

joshbuker force-pushed the schema/schema-type branch from bbf7b99 to bf57f98 Compare March 30, 2023 00:59

Use tabs instead of spaces for consistency

f70b688

Signed-off-by: Josh Buker <[email protected]>

oliverchang closed this Apr 26, 2023

joshbuker deleted the schema/schema-type branch April 27, 2023 22:29
