Documentation, UI and examples #15
Comments
Thank you for the feedback! This is very valuable to me, as those are issues I didn't see myself since I'm blinded by already knowing how it's supposed to be used 😄 Empty properties/values should be an error or default to the hint text, and property/id comparison shouldn't be case sensitive (the second example has lowercase So there are three tasks here:
I will hopefully be able to fix the first one this week, so that at least there is an error if you enter invalid data.
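For illustration, the validation described above (reject empty input, compare IDs case-insensitively) could look roughly like the following. This is only a minimal sketch: `normalize_id` is a hypothetical helper, not part of wdumper, and the accepted ID shape (a `P` or `Q` followed by digits) is an assumption.

```python
import re


def normalize_id(raw: str, prefix: str) -> str:
    """Return the canonical upper-case ID, or raise ValueError.

    Hypothetical helper, not wdumper's actual code. `prefix` is "P" for
    properties or "Q" for items.
    """
    value = raw.strip().upper()  # case-insensitive: "p31" becomes "P31"
    if not value:
        # Empty input is an error instead of silently matching nothing.
        raise ValueError(f"{prefix}-ID must not be empty")
    if not re.fullmatch(rf"{prefix}[1-9][0-9]*", value):
        raise ValueError(f"not a valid {prefix}-ID: {raw!r}")
    return value
```

With this, `normalize_id("p31", "P")` and `normalize_id("P31", "P")` give the same result, and an empty field fails immediately rather than after hours of processing.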
Please don't forget the limit options. Those seem important to me and should probably be a default, so that the first dumps only run for a few minutes to check that things are as expected, and then the limits can be relaxed.
Limit and preview options would be great. There are a lot of empty requests that keep running for days on end, just wasting resources, because whoever requested them couldn't figure out how to use this tool. These things should be easy to implement as well.
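A rough sketch of what such a limit/preview mode could mean: stop after scanning a fixed number of entities or after collecting a fixed number of matches, whichever comes first. Everything here is hypothetical (the `preview` function, its parameters, and the simplified entity shape); it is not how wdumper processes dumps.

```python
from itertools import islice
from typing import Callable, Iterable, List


def preview(
    entities: Iterable[dict],
    matches: Callable[[dict], bool],
    max_scanned: int,
    max_results: int,
) -> List[dict]:
    """Return at most `max_results` matches from the first `max_scanned` entities."""
    scanned = islice(entities, max_scanned)      # cap the input side
    hits = (e for e in scanned if matches(e))    # lazy filter, nothing is buffered
    return list(islice(hits, max_results))       # cap the output side


# Toy data in an assumed, simplified shape: every second entity is a scholarly article.
entities = [
    {"id": f"Q{i}", "P31": ["Q13442814"] if i % 2 == 0 else ["Q5"]}
    for i in range(1, 101)
]
sample = preview(entities, lambda e: "Q13442814" in e["P31"], max_scanned=10, max_results=3)
```

Because both caps are applied lazily, a test run touches only a small prefix of the data, which is the point of a preview.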
Yesterday I came up with this query:
It takes 3.5 hours on my local copy of Wikidata, see http://wiki.bitplan.com/index.php/WikiData_Import_2020-08-15. I'd love to have this as a regular dump, e.g. monthly, but I wouldn't know how to create a dump from a query. I think the dumper should be changed to accept SPARQL queries as input.
Any news on this? would be something I'd love to specify.
wdumper looks like a very promising and potentially very helpful tool.
When trying out wdumper I was not able to achieve what I wanted. I had expected that if I specify P31 ("instance of") and Q13442814 (scholarly article, https://www.wikidata.org/wiki/Q13442814), I'd get a dump with triples of all scholarly articles (hopefully with all their properties).
The dump ended up to be:
https://tools.wmflabs.org/wdumps/dump/414
It took hours to finish and included just 38 triples after processing 86949976 items.
So I tried again, this time after seeing that P31 was not used, and ended up with:
https://tools.wmflabs.org/wdumps/dump/415
with the same timing and result. I find this very frustrating, since a simple "give me all entities of type xy1, xy2, xy3" should be straightforward. It would be great to have improved documentation, UI and examples; that would save a lot of wasted processing time that others might want to use. A very important factor should be "limit" options, which make sure that for tests only a subset of the data is processed and only a subset of the result is produced, to speed up finding out what a query should look like.
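To make the "give me all entities of type xy1, xy2, xy3" request concrete, here is a small sketch that builds the equivalent SPARQL query with an optional limit. It assumes the `wd:` and `wdt:` prefixes as predefined by the Wikidata Query Service; `build_type_query` is a hypothetical helper, and the query is only an illustration of the intent, not something wdumper is known to accept as input.

```python
from typing import Optional, Sequence


def build_type_query(type_ids: Sequence[str], limit: Optional[int] = None) -> str:
    """Build a SPARQL query for all items that are an instance of (P31) any given type."""
    if not type_ids:
        raise ValueError("at least one type ID is required")
    values = " ".join(f"wd:{type_id}" for type_id in type_ids)
    query = (
        "SELECT ?item WHERE {\n"
        f"  VALUES ?type {{ {values} }}\n"
        "  ?item wdt:P31 ?type .\n"
        "}"
    )
    if limit is not None:
        # A small limit keeps test runs short while the query is being worked out.
        query += f"\nLIMIT {limit}"
    return query


print(build_type_query(["Q13442814"], limit=10))
```

Leaving out `limit` gives the unrestricted query once the small test run returns what was expected.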