Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[5.2] Extend field length of route in finder_link #44414

Closed
wants to merge 5 commits into from

Conversation

MacJoom
Copy link
Contributor

@MacJoom MacJoom commented Nov 5, 2024

Pull Request for Issue #44413

Summary of Changes

Extend field length of field 'route' to varchar 500 in finder_links
Since the field is based on the article alias (varchar 400) the route field must be longer since it includes the route and the alias
After the manual update of the field the longest route was 484.

Testing Instructions

Use a website with article aliases close to 400 characters

Actual result BEFORE applying this Pull Request

Goto to Components, Smart Search, Index - click Index to run the indexing process
Process fails with "Data too long for column 'route' at row 1" at some point

Expected result AFTER applying this Pull Request

Index is built completly

Link to documentations

Please select:

  • Documentation link for docs.joomla.org:

  • No documentation changes for docs.joomla.org needed

  • Pull Request link for manual.joomla.org:

  • No documentation changes for manual.joomla.org needed

@Quy
Copy link
Contributor

Quy commented Nov 5, 2024

See #43545 for the suggested fix.

@MacJoom
Copy link
Contributor Author

MacJoom commented Nov 5, 2024

See #43545 for the suggested fix.

What would be the fix? No custom fields involved in my case.

@Quy
Copy link
Contributor

Quy commented Nov 5, 2024

I think it is two separate issues but related. Hopefully @Hackwar will have a solution for the initial issue.

@brianteeman
Copy link
Contributor

yes it is poor practice to have such a long title
but if you do then it is illogical for the two fields to have the same length as route will always be longer than title

@MacJoom
Copy link
Contributor Author

MacJoom commented Nov 6, 2024

I migrated a Joomla 3 website with over 88,000 articles, and the issue appears in 284 of them. However, this problem could occur on any site, as users can copy and paste up to 400 characters into the title/alias fields. This solution is pragmatic; otherwise, we would have to truncate the title/alias, which would be a backward compatibility break and would require additional code and communication to the user.

@brianteeman
Copy link
Contributor

I have tested this item ✅ successfully on 405a214

obvious fix


This comment was created with the J!Tracker Application at issues.joomla.org/tracker/joomla-cms/44414.

@richard67
Copy link
Member

@MacJoom I've tested this PR with success. But on PostgreSQL we still can't save an article with an alias close to 400 characters because in PostgreSQL the alias field is still limited to 255 characters:
https://github.com/joomla/joomla-cms/blob/5.2-dev/installation/sql/postgresql/extensions.sql#L154

The alias field has been increased from 191 to 400 characters for MySQL with PR #9131 by @wilsonge . Before that, it was limited to 191 due to the utf8mb4 implementation at that time.

Should we clean that up with this PR here and increase the alias column to 400 for PostgreSQL? Or should we do that with a separate PR?

@MacJoom
Copy link
Contributor Author

MacJoom commented Nov 11, 2024

@richard67 – Thanks for your input. Consistency between PostgreSQL and MySQL is essential. I'm not very experienced with creating and testing for PostgreSQL. Since this also affects another table (content), could you do a separate PR for it?

@richard67
Copy link
Member

@richard67 – Thanks for your input. Consistency between PostgreSQL and MySQL is essential. I'm not very experienced with creating and testing for PostgreSQL. Since this also affects another table (content), could you do a separate PR for it?

@MacJoom Yes, or maybe an issue first in case I don't find the time for a PR soon.

@richard67
Copy link
Member

I have tested this item ✅ successfully on 4dedd60

I've tested this PR with success. The error "Data too long for column 'route' at row 1" is fixed.

But on PostgreSQL it is still not possible to save an article with an alias close to 400 characters because in PostgreSQL the alias field is still limited to 255 characters:
https://github.com/joomla/joomla-cms/blob/5.2-dev/installation/sql/postgresql/extensions.sql#L154

The alias field has been increased from 191 to 400 characters for MySQL with PR #9131 by @wilsonge . Before that, it was limited to 191 due to the utf8mb4 implementation at that time. The size of the field in PostgreSQL has not been changed for ages so that was already 255 at that time.

I will later create either an issue or a PR for that.


This comment was created with the J!Tracker Application at issues.joomla.org/tracker/joomla-cms/44414.

@richard67
Copy link
Member

RTC


This comment was created with the J!Tracker Application at issues.joomla.org/tracker/joomla-cms/44414.

@joomla-cms-bot joomla-cms-bot added the RTC This Pull Request is Ready To Commit label Nov 11, 2024
@Hackwar
Copy link
Member

Hackwar commented Nov 12, 2024

I understand what you want to do, but to be honest, I disagree with these gigantic URLs. A URL of 250+ chars seems very wrong and in this case we actually are talking about the non-SEF URL, so simply by using menu items (1024 chars), an alias for the article (400 chars) and the category path (400 chars), we are talking about a potential length of the URL of 1824 chars plus suffix, additional query elements, slashes which I didn't count and the domain itself... When searching for maximum length of a url, the first result claims that the maximum length is 2048 chars, but for SEO it shouldn't really be more than 75... I would shorten the URLs to fit the current 400 chars and to rather shorten the aliases everywhere...

In any case I don't think this is a bugfix and thus shouldn't go into 5.2. Could we move this to 5.3?

@MacJoom
Copy link
Contributor Author

MacJoom commented Nov 13, 2024

I’m not a fan of the long URLs either, but this should have been considered before setting the alias limit up to 400 characters (lucky PostgreSQL users... - the expand was not done with PostgreSQL). Now, we have this frustrating bug where indexing in Smart Search fails due to overly long aliases. Pushing this to version 5.3 won’t improve the situation but would leave alone frustrated users for a longer period. It is clearly a bug which should be fixed ASAP. If you gonna short the alias or the title you have to inform the users and it would be a b/c break - it would need to go into 6.0

@Hackwar
Copy link
Member

Hackwar commented Nov 13, 2024

The thing is that as you can see above, even setting this to 500 wouldn't be enough. DOCman for example is storing far longer URLs by default with the alias and the category path as well, resulting in potentially URLs of 2000 chars. Raising this from 400 to 500 is just a bandaid and no real solution. I'd say the proper solution would be better error management in our indexer to report back which item created an issue and to not fail the whole indexing because of one broken item.

@MacJoom
Copy link
Contributor Author

MacJoom commented Nov 13, 2024

484 was enough in my case for 88000 articles.

@rdeutz
Copy link
Contributor

rdeutz commented Nov 13, 2024

We can do both extend the field and work on the error handeling, it's really odd that it fails completly when one data set is wrong.

@MacJoom MacJoom closed this Nov 13, 2024
@joomla-cms-bot joomla-cms-bot removed the RTC This Pull Request is Ready To Commit label Nov 13, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants