[Feature suggestion] HLTB link suggestion for the missing matches #91

Open
enchained opened this issue Mar 6, 2020 · 1 comment
Labels
Scrapers: Issues related to scraping games, stores and external content
Suggested
To do in backlog.rip: Those issues are good ideas to make on the rework

Comments

@enchained

Hi. I'm not sure how you are matching the HLTB data, but the combination of HLTB and Steam Reviews data is very important to me. Sadly, a lot of games don't have that data on your site, for example GRIS: https://steam-backlog.com/game/gris. I suppose it could be a matching problem, and one way to solve it would be a report form for each game. It could contain a field for the HLTB link, and users could report other issues about the game there too. Since a report would have to pass premoderation before being made public, it would be nice if the data were added to that user's own database as soon as they submit the request, so they can use sorting and filtering right away.
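To make the "visible to the reporting user right away" idea concrete, here is a minimal sketch. It assumes nothing about how the site actually stores data: the `HltbLinks` class, its method names, and the numeric IDs are all hypothetical. The point is only that a lookup checks the user's own pending reports before falling back to the shared, moderated data.

```python
class HltbLinks:
    """Hypothetical store: moderated links plus per-user pending reports."""

    def __init__(self):
        self.master = {}   # steam_appid -> hltb_id, visible to everyone
        self.pending = {}  # (user, steam_appid) -> hltb_id, visible to the reporter only

    def submit(self, user, steam_appid, hltb_id):
        # A report is stored immediately, but only for the user who sent it.
        self.pending[(user, steam_appid)] = hltb_id

    def lookup(self, user, steam_appid):
        # The user's own pending report wins; otherwise use the moderated value.
        return self.pending.get((user, steam_appid), self.master.get(steam_appid))

    def approve(self, user, steam_appid):
        # Moderation promotes the report to the shared data and clears the pending copy.
        self.master[steam_appid] = self.pending.pop((user, steam_appid))


links = HltbLinks()
links.submit("enchained", 1, 100)      # placeholder IDs, not real ones
mine = links.lookup("enchained", 1)    # the reporter sees the link at once
theirs = links.lookup("someone", 1)    # other users see nothing until approval
```

After `approve("enchained", 1)` the same link would be returned for every user.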

Another way to solve the problem would be to adjust the matching system: when there is no direct match, use a fuzzy search algorithm, take the top result, parse its HLTB page, and compare fields such as Alias, Year, Developer, and Publisher.
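A rough sketch of that fallback, using only Python's standard `difflib` for the fuzzy step. The `Game` record, the `0.8` threshold, and the assumption that HLTB candidates are already parsed into such records are all hypothetical; this is not how the site currently matches anything.

```python
import re
from dataclasses import dataclass
from difflib import SequenceMatcher
from typing import List, Optional


@dataclass
class Game:
    # Hypothetical record; the fields mirror the ones mentioned above.
    title: str
    year: Optional[int] = None
    developer: Optional[str] = None
    publisher: Optional[str] = None


def normalize(title: str) -> str:
    # Lowercase, turn punctuation and symbols (e.g. trademark signs) into spaces,
    # then collapse runs of whitespace.
    return " ".join(re.sub(r"[^a-z0-9 ]+", " ", title.lower()).split())


def title_score(a: str, b: str) -> float:
    # Similarity ratio between 0.0 and 1.0 on the normalized titles.
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def best_match(steam: Game, candidates: List[Game], min_title: float = 0.8) -> Optional[Game]:
    if not candidates:
        return None
    # Take the top fuzzy result by title similarity.
    top = max(candidates, key=lambda c: title_score(steam.title, c.title))
    if title_score(steam.title, top.title) < min_title:
        return None
    # Confirm with metadata: reject if a field known on both sides disagrees.
    for field in ("year", "developer", "publisher"):
        ours, theirs = getattr(steam, field), getattr(top, field)
        if ours is not None and theirs is not None:
            if str(ours).lower() != str(theirs).lower():
                return None
    return top
```

Comparing metadata exactly is deliberately strict; a real matcher would probably need to tolerate small differences such as regional release years or publisher name variants.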

Also, a nice database of Steam-HLTB matches already exists: https://www.howlongtobeatsteam.com/
I'm not sure how active it is nowadays or whether the author would be fine with someone parsing it, and it only shows matches for the library of the user you enter. I wrote to the creator to ask about parsing, just in case.
Besides auto-parsing by user ID, maybe the JSON from the site's internal API network response could be imported, so users could just paste it to submit their own library matches as a list of requests to be added to your database.
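As an illustration of the paste-to-import idea, here is a minimal sketch. The real shape of that site's API response is not known here, so the `steam_appid` and `hltb_id` keys, the top-level list, and the `pending` status are purely assumptions; the sketch only shows validating pasted JSON and turning it into a list of match requests.

```python
import json


def parse_pasted_matches(pasted: str) -> list:
    """Turn pasted JSON into pending match requests (hypothetical format)."""
    try:
        data = json.loads(pasted)
    except json.JSONDecodeError:
        return []  # not valid JSON, nothing to import
    if not isinstance(data, list):
        return []  # assumed format: a top-level list of entries

    requests = {}
    for entry in data:
        if not isinstance(entry, dict):
            continue  # skip anything that is not an object
        appid = entry.get("steam_appid")  # assumed key name
        hltb_id = entry.get("hltb_id")    # assumed key name
        # Accept only plain integers (this also excludes booleans).
        if type(appid) is int and type(hltb_id) is int:
            # Keyed by appid so repeated entries collapse to the last one seen.
            requests[appid] = {
                "steam_appid": appid,
                "hltb_id": hltb_id,
                "status": "pending",
            }
    return list(requests.values())


# Placeholder IDs for illustration only.
pasted = '[{"steam_appid": 1, "hltb_id": 100}, {"steam_appid": "bad"}, 7]'
print(parse_pasted_matches(pasted))
```

Each resulting request would still go through moderation before reaching the shared database.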

@gsabater
Owner

gsabater commented Mar 8, 2020

Hello @enchained

Data from HLTB comes from one of my partners, and in fact that information is populated by users who enter the HLTB ID manually. That is why not all games have it.

You suggest a report form page for games where missing data can be filled in, and I also think this is the best option for fixing data such as missing HLTB links, wrong release dates, and so on. I have wanted to make this page for a long time, but never ended up doing it because I don't have many active users who would spend their time filling in forms.

The only issue with this solution is what you also suggest: adding the data to the user's own database before adding it to the master database. It could be done, and I have an idea of how to do it, but it will definitely take some time.

Finally, I contacted the guy behind howlongtobeatsteam about two years ago. Even though he was cool and things started moving, it eventually went cold and we parted ways. I did end up working with his database, but it didn't quite fit the project.

Thanks a lot for your feedback. Even if I haven't helped much by giving you a timeline or a confirmation of how this will be solved, I really appreciate the interest and I'm going to think about this. I will leave the issue open for a while.

Thanks

@gsabater added the Scrapers and Suggested labels on Mar 8, 2020
@gsabater added the To do in backlog.rip label on Mar 14, 2024