Don't rely on .sqlite metadata #97

dralley · 2023-02-01T17:27:46Z

Currently mdapi is the only consumer of .sqlite metadata. Previously it was used by "yum" on EL7, but yum is perfectly capable of working without it anyway. DNF does not use it. The "repoview" tool used it, but it has been defunct for years, and AFAIK hasn't been packaged in Fedora since Fedora 27.

This blocks the discontinuation of .sqlite metadata https://pagure.io/releng/issue/10745.

The possible approaches:

1. Process the XML directly. This might not be as performant or as light on memory as querying the sqlite databases, so it may not be a good option.
2. Create the sqlite metadata from the XML metadata using sqliterepo_c if it is not present.
3. Always create the sqlite metadata from the XML metadata using sqliterepo_c. This would make the code simpler over the long term, as repos that provide sqlite metadata should gradually become a minority.
4. Process the XML ourselves and manually create a single combined sqlite database from it. This might simplify some things (only one file to track) and make new APIs possible, but would perhaps be the most work. The only reason not to do this is that sqliterepo_c already exists.
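
To make the difference between the two sqliterepo_c approaches concrete, here is a rough Python sketch. It is not mdapi code: the function names, the `always` switch and the injectable `run` parameter are all made up for illustration. It assumes that sqliterepo_c takes the repository directory as its positional argument and that `--force` overwrites existing databases; please check both against `sqliterepo_c --help` before relying on them.

```python
import subprocess
from pathlib import Path


def has_sqlite_metadata(repo_dir: Path) -> bool:
    """Return True if repodata/ already holds a primary sqlite database."""
    # Matches both "primary.sqlite.bz2" and "<checksum>-primary.sqlite.xz".
    return any((repo_dir / "repodata").glob("*primary.sqlite*"))


def ensure_sqlite_metadata(repo_dir: Path, always: bool = False, run=subprocess.run) -> bool:
    """Generate sqlite metadata from the XML metadata when needed.

    always=False: only generate if the repo ships no sqlite metadata.
    always=True:  regenerate unconditionally.
    Returns True if sqliterepo_c was invoked.
    """
    if not always and has_sqlite_metadata(repo_dir):
        return False
    cmd = ["sqliterepo_c"]
    if always:
        # Assumption: --force overwrites databases that already exist.
        cmd.append("--force")
    cmd.append(str(repo_dir))
    run(cmd, check=True)  # raises CalledProcessError on a non-zero exit
    return True
```

With `always=True` the "is it already there?" branch disappears from the calling code entirely, which is the simplification the "always create" approach is after.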


dralley · 2023-02-03T04:21:23Z

@t0xic0der I'd give implementing this a shot, but I'm curious if you have a preference on the particular approach to take?

dralley · 2023-02-15T14:31:22Z

@t0xic0der I would like your input please ^^

gridhead · 2023-02-28T04:38:11Z

@dralley,

Hi, you have my sincere apologies for getting back to you late.

We really want to make sure that the API remains as lightweight and fast as possible, so we would not want to take any approach that could slow it down.

To begin with, I am assigning this issue ticket to you. You can find me on the internal Slack platform for a more synchronous conversation, and we can take the discussion of the details forward there.

dralley · 2023-02-28T04:55:41Z

Cool, I can do that. That rules out option 1, but the latter 3 options should be equivalent or very nearly equivalent to what exists currently.

gridhead · 2023-02-28T04:57:48Z

Perfect, I will let you pick whichever of the latter 3 approaches best suits the stated conditions.

Thank you for taking this up.

dralley · 2023-02-28T05:05:07Z

I will probably go with option 3, in that case. Thanks!

Is the updating of databases a bottleneck at all? That is to say, generating the sqlite metadata locally will add runtime during that step, but as it is separate from the actual queries taking place it may not matter.

gridhead · 2023-02-28T05:24:48Z

Fetching those databases is not a bottleneck, or rather, we ensure that it does not become one. The service that fetches the databases runs as a periodic job, and we replace the existing database with a new one only after confirming, by verifying the hashes, that the database file has been downloaded successfully.

If a download fails, the incompletely downloaded database file is disposed of, and the backend keeps reading from the last successfully fetched database. As we are downloading a lot of databases for a bunch of branches, the only time this ends up being a bottleneck is the first run.
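
As a rough illustration of the verify-then-swap step described above (a minimal sketch, not the actual mdapi implementation): the helper names are hypothetical, and SHA-256 is an assumption, since the hash algorithm is not stated here. The candidate file is assumed to sit on the same filesystem as the live database so that `os.replace` swaps it in atomically.

```python
import hashlib
import os
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large databases are never fully in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def install_if_valid(candidate: Path, live: Path, expected_sha256: str) -> bool:
    """Swap a freshly downloaded database in only if its hash checks out.

    On mismatch the candidate is deleted and the live database is left
    untouched, so readers keep using the last good copy.
    """
    if sha256_of(candidate) != expected_sha256.lower():
        candidate.unlink()
        return False
    # Atomic when both paths are on the same filesystem: readers see either
    # the old file or the new one, never a partially written file.
    os.replace(candidate, live)
    return True
```

Because the swap is the last step, any time spent generating or downloading a database happens off to the side and never affects queries against the live file.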

gridhead assigned dralley Feb 28, 2023
