Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix BadRow not being created for illegal querystring #205

Open
dilyand opened this issue Dec 16, 2021 · 1 comment
Open

Fix BadRow not being created for illegal querystring #205

dilyand opened this issue Dec 16, 2021 · 1 comment

Comments

@dilyand
Copy link
Contributor

dilyand commented Dec 16, 2021

In #73 we introduced a new generic_error bad row, generated whenever a querystring could not be parsed. In #113 we fixed a bug, where the bad row that was being generated was not a valid BadRow.

This was all working as expected in versions until 2.3.1, inclusive. As of 2.4.0, the bad row is no longer generated when deploying a collector (I've tested with kinesis, pubsub and stdout) with the *.minimal.hocon configuration.

The bad row is supposed to be created here, after catching an exception when trying to parse it here.

In deployments of 2.4.0 and later, the exception is replaced with a warning; so it's not caught, and no bad row is created. The offending event is silently dropped, reverting to the behaviour from before #73.

We have a test for generating the bad row, which still passes. This suggests that the bug is not caused by changes in the collector service logic. Rather, it looks like we used to have some silent defaults in the config that changed with 2.4.0, resulting in changed behaviour. The two most suspicious changes are #162 and #163.

In #162, we upgraded akka-http from 10.1.12. In akka-http 10.2.0, there was a change that might have affected this behaviour: percent-encoding URI queries.

It's also possible that the JRE defaults, to which we switched in #163, have changed the behaviour when failing to parse a querystring to warn instead of throw.

Impact

Currently, when a collector is deployed with default configuration, no bad row is created and 400 is returned to the tracker. For trackers with retry logic, it means the event might get stuck in a retry loop. However, the most common source of illegal querystrings is events tracked via a webhook, where there is no retry logic, so that is not a top concern. The biggest issue is that users won't have a chance to diagnose faulty webhook tracking implementations.

@istreeter
Copy link
Contributor

I think the way to fix this is to implement the ParsingErrorHandler. And turn it on by adding to the application.conf see here.

@istreeter istreeter added this to the 2.6.0 milestone Feb 24, 2022
@peel peel removed this from the 2.6.0 milestone Apr 21, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants