chart by package version #132
That would be nice. We would need to collect download numbers per version though, which we don't.
Oh, I see. I was assuming that since nuget.org shows the per-version download numbers that you'd have access to that data as well.
@MarkPflug it might be available through one of nuget.org's APIs, but right now we query each package (not each version of it) once per day and get the total number across all versions, so we'd need to change that. That said, unless we can fetch the whole thing with a single hit to their API, the job will likely need some redesign: it takes 1 or 2 hours to go through the 220,000+ packages right now. Probably a good chance to simplify the backend.
You should be able to get the downloads by version using your current approach. The search response contains a breakdown of downloads by version. For example: https://azuresearch-usnc.nuget.org/query?q=packageid:Newtonsoft.Json&take=1
{
...
"totalHits": 1,
"data": [
{
...
"id": "Newtonsoft.Json",
"version": "12.0.3",
"totalDownloads": 824781418,
...
"versions": [
{
"version": "3.5.8",
"downloads": 586170,
"@id": "https://api.nuget.org/v3/registration5-semver1/newtonsoft.json/3.5.8.json"
},
...
{
"version": "12.0.3",
"downloads": 83014646,
"@id": "https://api.nuget.org/v3/registration5-semver1/newtonsoft.json/12.0.3.json"
}
]
}
]
}
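As a rough illustration of reading that breakdown, here is a minimal Python sketch that pulls a version-to-downloads mapping out of an already-parsed search response. The function name `downloads_by_version` is hypothetical, and the sample data is trimmed from the response above (the elided fields and versions are simply left out), so treat this as a sketch under those assumptions rather than how the scheduler does or should do it.

```python
from typing import Any, Dict


def downloads_by_version(search_response: Dict[str, Any], package_id: str) -> Dict[str, int]:
    """Return {version: downloads} for package_id from a parsed search response.

    Hypothetical helper: it only relies on the "data", "id", "versions",
    "version" and "downloads" fields visible in the sample response.
    """
    for entry in search_response.get("data", []):
        # Package ids are compared case-insensitively here.
        if entry.get("id", "").lower() == package_id.lower():
            return {v["version"]: v["downloads"] for v in entry.get("versions", [])}
    return {}


# Trimmed copy of the sample response; only the two versions shown are included.
sample = {
    "totalHits": 1,
    "data": [
        {
            "id": "Newtonsoft.Json",
            "version": "12.0.3",
            "totalDownloads": 824781418,
            "versions": [
                {"version": "3.5.8", "downloads": 586170},
                {"version": "12.0.3", "downloads": 83014646},
            ],
        }
    ],
}

per_version = downloads_by_version(sample, "newtonsoft.json")
```

With the trimmed sample, `per_version` holds one entry per listed version, so a daily snapshot of this mapping would be enough to chart downloads per version over time.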
You should be able to get the downloads by version by calling nuget-trends/src/NuGetTrends.Scheduler/DailyDownloadWorker.cs Lines 189 to 194 in d114dc6
FYI, the method is async but it doesn't do anything expensive like additional web requests when using the V3 protocol (see this). P.S. Nice CSV library @MarkPflug :)
Thanks for the pointers @loic-sharma. The only question left is: do we want to do that in the current architecture? I wonder how much more data per day we'd be dumping into pgsql. @clairernovotny mentioned the foundation can host the site on Azure, so maybe we can use blob storage to dump these numbers, given they are immutable, or some other strategy. We can probably get rid of rabbitmq too, which is used only to queue the batches of ids to hit nuget.org. We'd need some other way to make the job resumable, so we can restart it without having to start from the beginning.
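One possible way to restart the job without starting over, sketched in Python purely for illustration: walk the package ids in a stable order and persist the last completed id to a small checkpoint file. Everything here is an assumption and not part of the existing scheduler, including the `run_resumable` name, the `process` callback and the local JSON checkpoint; a real job might checkpoint per batch instead of per id, and would clear the checkpoint once a daily run completes.

```python
import json
import os
from typing import Callable, Iterable, List


def run_resumable(
    package_ids: Iterable[str],
    process: Callable[[str], None],
    checkpoint_path: str,
) -> List[str]:
    """Process ids in sorted order, skipping ids at or before the checkpoint.

    Returns the ids processed during this call.
    """
    last_done = None
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path, encoding="utf-8") as f:
            last_done = json.load(f)["last_done"]

    processed = []
    for pid in sorted(package_ids):
        if last_done is not None and pid <= last_done:
            continue  # already handled by a previous, interrupted run
        process(pid)
        # Write to a temp file and rename it, so a crash mid-write
        # cannot leave a half-written checkpoint behind.
        tmp_path = checkpoint_path + ".tmp"
        with open(tmp_path, "w", encoding="utf-8") as f:
            json.dump({"last_done": pid}, f)
        os.replace(tmp_path, checkpoint_path)
        processed.append(pid)
    return processed
```

If `process` raises partway through, calling `run_resumable` again with the same ids and checkpoint path picks up at the id that failed, since only completed ids are recorded.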
It would be nice if the graph could be narrowed down to a specific version of the package. Maybe even a stacked bar chart broken down by version. If this were paired with an x-axis overlay of version releases (via a colored vertical line), it would help visualize how quickly new versions are adopted and how much old versions are still in use.