Sitemap ping failing #3780

Open
veganstraightedge opened this issue Feb 26, 2024 · 7 comments
Labels
📣 reach Everything needed to reach more people and get easily find online

Comments

@veganstraightedge
Contributor

Seen in output from:

heroku releases:output --remote heroku
Pinging with URL 'https://crimethinc.com/sitemap.xml.gz':
Ping failed for Google: #<OpenURI::HTTPError: 404 Sitemaps ping is deprecated. See https://developers.google.com/search/blog/2023/06/sitemaps-lastmod-ping.> (URL http://www.google.com/webmasters/tools/ping?sitemap=https%3A%2F%2Fcrimethinc.com%2Fsitemap.xml.gz)
@just1602
Collaborator

Is it possible with the heroku CLI to list the files in the dyno public/ directory?

@just1602
Collaborator

just1602 commented Jun 6, 2024

The sitemap_generator gem doesn't seem to be maintained anymore. The right solution would probably be to generate our own sitemap with a template, like we do for the atom feed.

I just don't know whether we should do it in a rake task and save it to disk like the gem does, or expose an endpoint that we would cache.
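To make the "generate our own sitemap" idea concrete, here is a minimal sketch in plain Ruby of rendering a sitemap from a list of records. `SitemapEntry` and `render_sitemap` are hypothetical names used only for illustration, not anything from the app or from the sitemap_generator gem, and a real version would more likely live in a view template, like the atom feed does.

```ruby
require "time"

# Hypothetical stand-in for a published article or page record.
SitemapEntry = Struct.new(:url, :updated_at)

# Render a sitemap <urlset> document for the given entries.
def render_sitemap(entries)
  urls = entries.map do |entry|
    <<~XML.chomp
      <url>
        <loc>#{entry.url.encode(xml: :text)}</loc>
        <lastmod>#{entry.updated_at.utc.iso8601}</lastmod>
      </url>
    XML
  end

  <<~XML
    <?xml version="1.0" encoding="UTF-8"?>
    <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
    #{urls.join("\n")}
    </urlset>
  XML
end
```

The same function could be called from a rake task that writes the result to disk or from a controller action that returns it directly, so the choice between the two can be made later.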

@just1602
Collaborator

just1602 commented Jun 7, 2024

I was thinking about this today: we should probably move sitemap generation into a background job that is triggered at deployment, and also every time we publish or update any type of content. According to the developers.google.com page linked in the warning, if a sitemap's lastmod attribute isn't up to date and accurate, Google will stop trusting it.

@bensheldon
Contributor

bensheldon commented Jun 7, 2024

Just an idea (and not entirely trivial), but I've been wanting to convert my personal sites from that gem to something like this: https://www.johnnunemaker.com/rails-easy-sitemaps/

@just1602
Collaborator

just1602 commented Jun 7, 2024

My fear was that it would be slow, but I guess I can add some caching to the template based on the lastmod value.

Otherwise, I'm not sure I understand why it needs a sitemap of sitemaps (the index and pages actions), but I really like the general idea.
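One way to read "caching based on the lastmod value" is to fold the newest `updated_at` into the cache key, so a cached sitemap is reused until some record changes. The sketch below is plain Ruby under that assumption; `Page`, `sitemap_lastmod`, and `sitemap_cache_key` are hypothetical names, not part of the app.

```ruby
# Hypothetical stand-in for a record that appears in the sitemap.
Page = Struct.new(:url, :updated_at)

# The most recent modification time across all pages, or nil when empty.
def sitemap_lastmod(pages)
  pages.map(&:updated_at).max
end

# Build a cache key that changes whenever a page is added, removed,
# or updated, so stale sitemap output is never served from the cache.
def sitemap_cache_key(pages)
  lastmod = sitemap_lastmod(pages)
  stamp = lastmod ? lastmod.utc.strftime("%Y%m%d%H%M%S") : "empty"
  "sitemap/#{pages.size}-#{stamp}"
end
```

Including the record count in the key covers deletions, which would not otherwise change the newest timestamp.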

@bensheldon
Contributor

I'm not sure I understand why it needs a sitemap of sitemaps (the index and pages actions)

A single sitemap file may hold at most 50,000 URLs, so an arbitrarily large or growing site has to split its URLs across multiple sitemap files, plus an index file that references them. The idea of breaking the sitemap files down by month is that the index can be generated without table-scanning every record to do a numeric group-by, or anything else that would require checking for the presence of records.
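As a rough illustration of why month-based files keep the index cheap, the sketch below builds a sitemap index from nothing but a start month and an end month, with no per-record query. The URL pattern `/sitemaps/YYYY/MM.xml` and the function names are assumptions made for this example, not taken from the linked post.

```ruby
require "date"

# List the first day of every month from first_month through last_month.
def sitemap_months(first_month, last_month)
  months = []
  current = Date.new(first_month.year, first_month.month, 1)
  stop = Date.new(last_month.year, last_month.month, 1)
  while current <= stop
    months << current
    current = current.next_month
  end
  months
end

# Render a <sitemapindex> document that points at one sitemap per month.
# The "/sitemaps/YYYY/MM.xml" path is a hypothetical URL scheme.
def render_sitemap_index(base_url, months)
  sitemaps = months.map do |month|
    "  <sitemap><loc>#{base_url}/sitemaps/#{month.strftime('%Y/%m')}.xml</loc></sitemap>"
  end

  <<~XML
    <?xml version="1.0" encoding="UTF-8"?>
    <sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
    #{sitemaps.join("\n")}
    </sitemapindex>
  XML
end
```

This assumes no single month ever exceeds the per-file URL limit; a month that did would need to be split further.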

My fear was that it would be slow

Same, same 🤗 They could even be cached with a 1-day TTL and really be no worse than the sitemap_generator's static sitemap.
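For reference, the 1-day TTL idea amounts to "recompute at most once per day". In a Rails app that is usually a `Rails.cache.fetch(key, expires_in: 1.day) { ... }` call; the plain-Ruby class below is only a self-contained sketch of the same behavior, and `TtlCache` is a hypothetical name.

```ruby
# A tiny in-memory cache whose entries expire after a fixed number of seconds.
class TtlCache
  # clock is injectable so expiry can be exercised without sleeping.
  def initialize(ttl_seconds, clock: -> { Time.now })
    @ttl_seconds = ttl_seconds
    @clock = clock
    @store = {}
  end

  # Return the cached value for key, or run the block and cache its result
  # when the entry is missing or has expired.
  def fetch(key)
    now = @clock.call
    entry = @store[key]
    return entry[:value] if entry && now < entry[:expires_at]

    value = yield
    @store[key] = { value: value, expires_at: now + @ttl_seconds }
    value
  end
end
```

With a one-day TTL the sitemap can be up to a day stale, which is the trade-off being accepted in the comment above.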

@just1602
Collaborator

just1602 commented Jun 7, 2024

Thanks for the clarification @bensheldon! That totally makes sense. I'll try to give this a go, but unless you have some spare time, I won't be able to tackle it super soon. 😃

@just1602 just1602 added the 📣 reach Everything needed to reach more people and get easily find online label Jun 13, 2024