Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adding pause/resume functionality for document lambdas #23138

Open
wants to merge 13 commits into
base: main
Choose a base branch
from

Conversation

shubhi1092
Copy link
Contributor

@shubhi1092 shubhi1092 commented Nov 19, 2024

Description

Recently we added circuit breaker and pause/resume for scriptorium, i.e. partition lambdas. With this PR, I am adding the pause/resume functionality for document lambdas. It will be triggered by a document lambda during various circuit breaker states (pause during open, and resume during close). Currently no doc lambda is using circuit breaker, so this is just to add functionality which will not be triggered until we implement circuit breaker in any doc lambda.

Important changes:

  • Added pause/resume listeners for the event emitted by documentContext.
  • Since there can be multiple docContexts, we pause the kafka partition at the lowest offset out of them, so that no messages are missed during resume.
  • On resume, we start from the lowest offset and send to the contextManager and respective docContext. The offsets which were already processed successfully (if any) are skipped during this resume.
  • Remaining offsets are then processed as usual by the respective lambdas. (Lambdas should make sure to reset any state needed on pause, such that it can reprocess the failed messages on resume)

Reviewer Guidance

  • Tested it locally by hardcoding context.pause() and resume() from deli lambda and making sure that ops are not lost on resume. Tested it with a single and multiple docPartitions.
  • This feature shouldnt have any impact until we implement circuit breaker in any doc lambda and trigger pause/resume from there.

@github-actions github-actions bot added area: server Server related issues (routerlicious) base: main PRs targeted against main branch labels Nov 19, 2024
Copy link
Contributor

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot reviewed 5 out of 9 changed files in this pull request and generated 2 suggestions.

Files not reviewed (4)
  • server/routerlicious/packages/lambdas-driver/src/document-router/documentLambda.ts: Evaluated as low risk
  • server/routerlicious/packages/lambdas-driver/src/document-router/documentPartition.ts: Evaluated as low risk
  • server/routerlicious/packages/lambdas-driver/src/kafka-service/partitionManager.ts: Evaluated as low risk
  • server/routerlicious/packages/lambdas/src/scriptorium/lambda.ts: Evaluated as low risk
Comments skipped due to low confidence (2)

server/routerlicious/packages/lambdas-driver/src/document-router/contextManager.ts:62

  • [nitpick] The parameter name 'allowBackToOffset' is not very descriptive. Consider renaming it to 'resumeOffset'.
public setHead(head: IQueuedMessage, allowBackToOffset?: number | undefined) {

server/routerlicious/packages/lambdas-driver/src/document-router/contextManager.ts:99

  • The new behavior introduced by the 'allowBackToOffset' parameter should be covered by tests to ensure it works as expected.
public setHead(head: IQueuedMessage, allowBackToOffset?: number | undefined) {

Tip: Copilot only keeps its highest confidence comments to reduce noise and keep you focused. Learn more

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area: server Server related issues (routerlicious) base: main PRs targeted against main branch
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants