
Eliminate Periodic Realtime Segment Metadata Queries: Tasks Now Publish Schema for Seamless Coordinator Updates #15475

Merged: 40 commits, Jan 10, 2024

Conversation

@findingrish (Contributor) commented Dec 4, 2023

Description

Issue: #14989

The initial step in optimizing segment metadata was to centralize the construction of datasource schema in the Coordinator (#14985). Subsequently, our goal is to eliminate the requirement for regularly executing queries to obtain segment schema information. This task encompasses addressing both realtime and finalized segments.

This modification specifically addresses the issue with realtime segments. Tasks will now routinely communicate the schema for realtime segments during the segment announcement process. The Coordinator will identify the schema alongside the segment announcement and subsequently update the schema for realtime segments in the metadata cache.

Design

Task

  • New method Sink.getSignature returns the RowSignature of the Sink by combining the signatures of each FireHydrant.
  • Periodically, the StreamAppenderator.SinkSchemaAnnouncer will compute sink schema changes and announce them to the DataSegmentAnnouncer.
  • New APIs have been introduced in DataSegmentAnnouncer to receive sink schema information and manage schema cleanup when a task is closed.
  • A new POJO named SegmentSchemas has been added to facilitate passing schema information for multiple segments.
  • A new implementation of DataSegmentChangeRequest has been introduced, named SegmentSchemasChangeRequest.
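
To illustrate the first bullet, here is a minimal, hypothetical sketch of how per-FireHydrant signatures could be combined into a Sink-level signature. It uses simplified ordered maps (column name to type) instead of Druid's actual RowSignature class; all names are illustrative, not the real API.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch only: real Druid code combines RowSignature objects;
// here a signature is simplified to an ordered map of column name -> type.
public class SinkSignatureSketch
{
  public static Map<String, String> mergeSignatures(List<Map<String, String>> hydrantSignatures)
  {
    // Preserve the order in which columns were first observed across hydrants;
    // in this sketch, the first hydrant to report a column decides its type.
    Map<String, String> merged = new LinkedHashMap<>();
    for (Map<String, String> signature : hydrantSignatures) {
      for (Map.Entry<String, String> column : signature.entrySet()) {
        merged.putIfAbsent(column.getKey(), column.getValue());
      }
    }
    return merged;
  }

  public static void main(String[] args)
  {
    Map<String, String> hydrant1 = new LinkedHashMap<>();
    hydrant1.put("__time", "LONG");
    hydrant1.put("page", "STRING");

    Map<String, String> hydrant2 = new LinkedHashMap<>();
    hydrant2.put("__time", "LONG");
    hydrant2.put("added", "LONG");

    // prints [__time, page, added]
    System.out.println(mergeSignatures(List.of(hydrant1, hydrant2)).keySet());
  }
}
```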

Coordinator

  • Modifications have been made to the HttpServerInventoryView to handle schema information.
  • Schema Update Flow: HttpServerInventoryView -> CoordinatorServerView -> CoordinatorSegmentMetadataCache.
  • The CoordinatorSegmentMetadataCache has been updated to incorporate schema changes. The refresh logic has also been changed to eliminate the need for executing segment metadata queries for realtime segments.
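
As a rough sketch of how a cache could incorporate an incoming schema update: a "delta" update merges changed/new columns into the cached signature, while an absolute update replaces it. This is a simplified illustration under assumed semantics, not the actual CoordinatorSegmentMetadataCache API; all names are illustrative.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative sketch only, not Druid's actual cache implementation.
public class SchemaUpdateSketch
{
  public static Map<String, String> applyUpdate(
      Map<String, String> cached,
      Map<String, String> update,
      boolean delta
  )
  {
    if (!delta) {
      // Absolute update: the full signature was sent, so replace outright.
      return new LinkedHashMap<>(update);
    }
    // Delta update: only changed/new columns were sent, so merge them in.
    Map<String, String> merged = new LinkedHashMap<>(cached);
    merged.putAll(update);
    return merged;
  }

  public static void main(String[] args)
  {
    Map<String, String> cached = new LinkedHashMap<>();
    cached.put("__time", "LONG");
    cached.put("page", "STRING");

    Map<String, String> delta = new LinkedHashMap<>();
    delta.put("added", "LONG");

    // prints {__time=LONG, page=STRING, added=LONG}
    System.out.println(applyUpdate(cached, delta, true));
  }
}
```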

Testing

  • Added unit tests.
  • Tested locally with the Wikipedia dataset and Kafka-based ingestion.
  • Tested in a Druid cluster; verified the datasource signature and numRows for realtime segments.

Potential side effects

None

Limitations

Currently, this feature doesn't work with ZooKeeper-based segment announcement.

Upgrade considerations

The general upgrade order should be followed. The new code is behind a feature flag, so it is compatible with existing setups. Even if centralized datasource schema building (#14985) is enabled, realtime segments will still be refreshed using segment metadata queries to the Indexer/Task.

This experimental feature aims to eliminate the need to periodically issue a SegmentMetadataQuery to the Indexer/Task to retrieve the schema of realtime segments. At present it is gated behind two feature flags and should only be enabled for proof-of-concept (PoC) or testing purposes. To activate it, set druid.centralizedDatasourceSchema.enabled and druid.centralizedDatasourceSchema.announceRealtimeSegmentSchema in the common configurations. Note that druid.centralizedDatasourceSchema.announceRealtimeSegmentSchema is a temporary feature flag and will be removed in a subsequent update.
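
For reference, enabling the two flags described above could look like the following in the common runtime properties (a sketch; exact file and placement depend on your deployment):

```properties
# Enable centralized datasource schema building on the Coordinator (#14985)
druid.centralizedDatasourceSchema.enabled=true
# Temporary flag: have tasks announce realtime segment schemas
druid.centralizedDatasourceSchema.announceRealtimeSegmentSchema=true
```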

This PR has:

  • been self-reviewed.
  • added documentation for new or modified features or behaviors.
  • a release note entry in the PR description.
  • added Javadocs for most classes and all non-trivial methods. Linked related entities via Javadoc links.
  • added or updated version, license, or notice information in licenses.yaml
  • added comments explaining the "why" and the intent of the code wherever it would not be obvious for an unfamiliar reader.
  • added unit tests or modified existing tests to cover new code paths, ensuring the threshold for code coverage is met.
  • added integration tests.
  • been tested in a test Druid cluster.

@findingrish (Author) commented Dec 11, 2023

Please note that the test failures are because of an uncovered noop method in BrokerSegmentMetadataCache.

@cryptoe (Contributor) left a comment

Need to go through the UTs.


if ((!Objects.equals(numRows, previousNumRows)) || (updatedColumns.size() > 0) || (newColumns.size() > 0)) {
    publish = true;
    delta = true;

@cryptoe (Contributor) suggested a change on this line:

    delta = true;

new SegmentSchema(
    segmentId.getDataSource(),
    segmentId.toString(),
    delta,

@cryptoe (Contributor) suggested a change:

    delta,
    true,

@findingrish (Author): Delta can be either true or false, hence it cannot be hardcoded.

private final LinkedHashSet<String> dimOrder = new LinkedHashSet<>();
// columns excluding current index, includes __time column

@cryptoe (Contributor): What is curIndex, the in-memory FireHydrant? If so, let's rename it accordingly or document it.

numRowsExcludingCurrIndex.addAndGet(segment.asQueryableIndex().getNumRows());
QueryableIndex index = segment.asQueryableIndex();
mergeIndexDimensions(new QueryableIndexStorageAdapter(index));
numRowsExcludingCurrIndex.addAndGet(index.getNumRows());

@cryptoe (Contributor): This is only for restoring tasks, right?

/**
 * Merge the column from the index with the existing columns.
 */
private void mergeIndexDimensions(StorageAdapter storageAdapter)

@cryptoe (Contributor) suggested a change:

    private void mergeIndexDimensions(StorageAdapter storageAdapter)
    private void overWriteIndexDimensions(StorageAdapter storageAdapter)

@@ -148,7 +159,9 @@ public Sink(
    maxCount = hydrant.getCount();
    ReferenceCountingSegment segment = hydrant.getIncrementedSegment();
    try {
      numRowsExcludingCurrIndex.addAndGet(segment.asQueryableIndex().getNumRows());
      QueryableIndex index = segment.asQueryableIndex();

@cryptoe (Contributor): Sink is also used for both realtime and batch. How do your changes affect batch ingestion?

@findingrish (Author): Additionally, I am maintaining a set of columns in each sink. I am assuming this wouldn't be much of an overhead in batch ingestion.

@cryptoe (Contributor) left a comment

Changes LGTM. Nit comments only.

// Newest segments first, so they override older ones.
private static final Comparator<SegmentId> SEGMENT_ORDER = Comparator
protected static final Comparator<SegmentId> SEGMENT_ORDER = Comparator

@cryptoe (Contributor): What's the change in this class apart from the field access change?

@findingrish (Author): Visibility is changed for a couple of fields.

@@ -219,11 +220,20 @@ public void configure(Binder binder)
    taskDirPath = taskAndStatusFile.get(0);
    attemptId = taskAndStatusFile.get(1);

    if (Boolean.parseBoolean(properties.getProperty("druid.centralizedDatasourceSchema.enabled"))
        && !properties.getOrDefault("druid.serverview.type", "http").equals("http")) {
      throw new RuntimeException(

@cryptoe (Contributor): Shouldn't this be a DruidException targeted at the cluster administrator?

@@ -192,6 +194,12 @@ protected List<? extends Module> getModules()
    modules.add(JettyHttpClientModule.global());

    if (isSegmentMetadataCacheEnabled) {
      if (!properties.getOrDefault(SERVERVIEW_TYPE_PROPERTY, "http").equals("http")) {
        throw new RuntimeException(
            "CentralizedDatasourceSchema feature is incompatible with Zookeeper based segment discovery. "

@cryptoe (Contributor): I think we should not mention ZooKeeper but just the value of properties.getOrDefault(SERVERVIEW_TYPE_PROPERTY). If tomorrow we add a new way of announcing server views, we need not change this piece of code.

@@ -153,7 +154,8 @@ public class CliCoordinator extends ServerRunnable
{
  private static final Logger log = new Logger(CliCoordinator.class);
  private static final String AS_OVERLORD_PROPERTY = "druid.coordinator.asOverlord.enabled";
  private static final String CENTRALIZED_SCHEMA_MANAGEMENT_ENABLED = "druid.centralizedDatasourceSchema.enabled";
  private static final String CENTRALIZED_DATASOURCE_SCHEMA_ENABLED = "druid.centralizedDatasourceSchema.enabled";
  private static final String SERVERVIEW_TYPE_PROPERTY = "druid.serverview.type";

@cryptoe (Contributor): Please push this config to ServerViewModule and reference it from there.

@cryptoe cryptoe merged commit 71f5307 into apache:master Jan 10, 2024
83 checks passed
@cryptoe (Contributor) commented Jan 10, 2024

@findingrish Thank you for the PR.

@LakshSingla LakshSingla added this to the 29.0.0 milestone Jan 29, 2024
cryptoe pushed a commit that referenced this pull request Apr 24, 2024
…rce Schema Building (#15817)

Issue: #14989

The initial step in optimizing segment metadata was to centralize the construction of datasource schema in the Coordinator (#14985). Thereafter, we addressed the problem of publishing schema for realtime segments (#15475). Subsequently, our goal is to eliminate the requirement for regularly executing queries to obtain segment schema information.

This is the final change which involves publishing segment schema for finalized segments from task and periodically polling them in the Coordinator.