
PHOENIX-7434 Extend atomic support to single row delete with condition on non-pk columns #2008

Status: Open — wants to merge 5 commits into master
Conversation

virajjasani (Contributor):
Jira: PHOENIX-7434

PHOENIX-7411 introduces atomic delete support for single row deletion. The server returns the result (the deleted row) to the client (via a new API) only if the client-initiated Delete operation actually deleted a live row. In the context of atomicity, this means the delete operation is, and remains, idempotent; however, only the client that successfully deletes the row by taking the lock on the live row (a row without a delete marker) can receive the old row result.
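The semantics above can be sketched in miniature. This is a hedged, simplified stand-in — RowStore, atomicDelete, and the String row payload are hypothetical illustrations, not Phoenix APIs — showing that the delete stays idempotent while only the caller that removes a live row gets the old row back.

```java
import java.util.HashMap;
import java.util.Map;

// Simplified stand-in for the atomic delete semantics described above:
// the delete stays idempotent, but only the caller that removes a live
// row (one without a delete marker) receives the old row. RowStore and
// atomicDelete are hypothetical illustrations, not Phoenix APIs.
class RowStore {
    private final Map<String, String> rows = new HashMap<>();

    void put(String key, String value) {
        rows.put(key, value);
    }

    // Returns the old row only if this call actually deleted a live row;
    // a repeated delete of the same key is a no-op that returns null.
    synchronized String atomicDelete(String key) {
        return rows.remove(key);
    }
}
```

A second delete of the same key returns null, mirroring how only the first successful deleter can be handed the old row result.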

PHOENIX-7411 covers only the case where the WHERE clause contains exclusively pk columns, i.e. a single-row point lookup plan. However, a user can also add non-pk columns to the WHERE clause in addition to all pk columns. This still makes the Delete a single-row delete, but because of the non-pk columns, the server-side delete plan is used with auto-commit connections.

The server-side delete plan works as follows: DeleteCompiler initiates the Scan with the attribute "_DeleteAgg", which allows UngroupedAggregateRegionObserver to execute Delete mutations on the rows read by the UngroupedAggregateRegionScanner. This mutation is then used by IndexRegionObserver to perform the mutation on the server side.
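The attribute-driven handoff above can be sketched as follows. This is a minimal illustration, not Phoenix code: the Map stands in for the attribute store of an HBase Scan, "_DeleteAgg" is the attribute name quoted in the text, and the method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

// Minimal sketch of the attribute-driven flow described above: the
// compiler tags the Scan with a marker attribute, and the server-side
// observer checks for it before executing deletes. The Map stands in
// for HBase Scan attributes; only "_DeleteAgg" comes from the text,
// everything else is illustrative.
class DeleteScanFlow {
    static final String DELETE_AGG = "_DeleteAgg";

    // Compiler side: tag the scan so the server observer runs the delete.
    static void markAsServerSideDelete(Map<String, byte[]> scanAttributes) {
        scanAttributes.put(DELETE_AGG, "TRUE".getBytes(StandardCharsets.UTF_8));
    }

    // Server side: the observer checks for the marker before deleting.
    static boolean isServerSideDelete(Map<String, byte[]> scanAttributes) {
        return scanAttributes.containsKey(DELETE_AGG);
    }
}
```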

Given that IndexRegionObserver already supports atomic single-row delete as part of PHOENIX-7411, the proposal for this Jira is to extend that support to single-row deletes with conditions on non-pk columns, so that the atomic delete result can be produced by UngroupedAggregateRegionObserver, sent back to DeleteCompiler, and later consumed by the PhoenixStatement API, similar to PHOENIX-7411.

    isUpsert = stmt instanceof ExecutableUpsertStatement;
    isDelete = stmt instanceof ExecutableDeleteStatement;
    if (isDelete && connection.getAutoCommit() &&
Contributor:
The whole purpose of this is to identify whether it is a ServerSelectDeleteMutation plan and then set an additional scan attribute. You can do this later too, along with the other checks for singleRowUpdate. You have the plan, which has the StatementContext, which has the Scan. That way you don't have to introduce this specialized compilePlan method.

Contributor:

Sorry, scratch that comment.

Comment on lines 759 to 771
    private Result annotateCommitAndReturnResult(
            UngroupedAggregateRegionObserver.MutationList mutations) throws IOException {
        annotateDataMutations(mutations, scan);
        if (isDelete || isUpsert) {
            annotateDataMutationsWithExternalSchemaId(mutations, scan);
        }
        Result result = ungroupedAggregateRegionObserver.commitWithResultReturned(mutations,
                indexUUID, indexMaintainersPtr, txState, targetHTable,
                useIndexProto, clientVersionBytes);
        mutations.clear();
        return result;
    }

Contributor:

I didn't like how the code has been duplicated with the regular path just to handle the specialized single row delete case.

virajjasani (Author):

The duplicated portion is very small:

        annotateDataMutations(mutations, scan);
        if (isDelete || isUpsert) {
            annotateDataMutationsWithExternalSchemaId(mutations, scan);
        }

Hence, I wanted to avoid creating a small method, but it makes sense to have a common utility that applies the annotations to the list of mutations.
Done.
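The shared helper agreed on above might look roughly like this. This is a hedged sketch, not the actual patch: mutations are modeled as plain attribute maps, and every name here is an illustrative stand-in for the Phoenix Mutation/annotation code.

```java
import java.util.List;
import java.util.Map;

// Sketch of the common utility discussed above: one helper applies the
// shared annotation to every mutation, plus the external-schema-id
// annotation only on the delete/upsert paths. Mutations are modeled as
// attribute maps; all names are illustrative stand-ins, not Phoenix APIs.
class MutationAnnotator {
    static void annotate(List<Map<String, String>> mutations, boolean isDeleteOrUpsert) {
        for (Map<String, String> mutation : mutations) {
            mutation.put("annotated", "true");           // annotateDataMutations stand-in
            if (isDeleteOrUpsert) {
                mutation.put("externalSchemaId", "set"); // schema-id annotation stand-in
            }
        }
    }
}
```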

Comment on lines +855 to +858
    if (isSingleRowDelete) {
        return deleteRowAndGetMutationState(table);
    } else {
        Tuple row = iterator.next();
Contributor:

Why are we going out of the way to do a separate scan? We already know it is a singleRowDelete. We can take different actions based on that information on the tuple that is returned by the server.

virajjasani (Author):

The client expects only a COUNT(*) result returned in the Tuple at this point. Now that we need to return the row result, we need to break this mandatory translation to a count result anyway, right?

virajjasani (Author):

To be more specific, for the single-row atomic delete returning-result case, the server is going to return the entire row in non-aggregate CF:CQ format rather than a count value. Hence, Tuple row = iterator.next() simply does not work: it throws an error because iterator#next expects the Cell result to contain only the non-aggregate CF:CQ (s:s).
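The client-side branching described above can be sketched like this. ServerDeleteResult and its fields are hypothetical simplifications, not Phoenix classes: the point is only that the client must pick a parser based on the compiled plan instead of always expecting the aggregate count cell.

```java
// Sketch of the client-side branching discussed above: the ordinary
// server-side delete returns a single aggregate cell carrying the row
// count, while the single-row atomic delete returns the full deleted
// row. ServerDeleteResult and its fields are hypothetical
// simplifications, not Phoenix classes.
class ServerDeleteResult {
    final boolean isSingleRowAtomicDelete;
    final long rowCount;   // meaningful only on the aggregate (count) path
    final String fullRow;  // meaningful only on the atomic path

    ServerDeleteResult(boolean atomic, long count, String row) {
        this.isSingleRowAtomicDelete = atomic;
        this.rowCount = count;
        this.fullRow = row;
    }

    // The client branches on the compiled plan instead of always running
    // the aggregate count parser on the returned cell.
    String describe() {
        return isSingleRowAtomicDelete
                ? "deleted row: " + fullRow
                : "rows deleted: " + rowCount;
    }
}
```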

tkhurana (Contributor):

@virajjasani The changes look functionally correct, but I didn't like how we are creating a specialized path in the code just for this use case. I would like it better if we can implement it in a much cleaner way without creating so many if/elses. Maybe @kadirozde has some ideas.

virajjasani (Author) commented Nov 16, 2024:

> @virajjasani The changes look functionally correct but I didn't like how we are creating a specialized path in the code just for this use case. I would like it better if we can implement it in a much cleaner way without creating so many if/else's. Maybe @kadirozde has some ideas.

I agree that having a separate path makes it difficult to comprehend, but when I looked at how things are implemented for the server-side delete workflow, I didn't feel we could do much, because this server-side delete path is all about the scanner performing the delete on the server side and then returning the count of rows deleted. This itself looks a bit hacky to me in the first place, because the client expects the result as a single cell containing the count value with the non-aggregate CF:CQ as s:s.

Now, our goals are:

  • We should not change the above hacky behavior, because we need to maintain compatibility with old clients. Moreover, we might cause a regression if we try to change the client aggregation interface to handle both the "count(*)" and "entire row" return cases.
  • We need the server to return the full row as the result for the atomically deleted row, and the client to parse that result correctly and return it to the user via the new PhoenixStatement API.

Therefore, we need a special code path that changes both client and server for the specific "single row atomic delete" case. I don't think there is any other way we can achieve both of the above goals without the risk of causing a regression with old and new clients.
