Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix abort controller handling #36

Merged
merged 15 commits into from
Nov 1, 2023
Merged

fix abort controller handling #36

merged 15 commits into from
Nov 1, 2023

Conversation

joewagner
Copy link
Contributor

@joewagner joewagner commented Sep 24, 2023

The initial work for this PR is located in https://github.com/tablelandnetwork/js-tableland/pull/576/files

Summary
Fixes abort controller handling which allows a user to abort and configure the polling timeout and interval for SDK methods that poll the gateway for a receipt.

Details
The core of this PR was adding the wiring to make the underlying polling handling aware of the user's abort controller. However, I found the API to be unintuitive and spent some time refactoring how a user goes about providing a abort-able controller with a custom timeout and polling interval. See comments below for details.

NOTE: This would be a breaking API change and therefore would require a major version bump. We probably could make the core fix while not breaking the API, but I don't think there's any need to be bashful about frequently releasing new major versions at this point. What do you guys think?

@sanderpick
Copy link
Member

Thanks for sorting this out @joewagner. Should we get this out soonish? Sounded like the timeout issue was blocking some folks on Filecoin.

@joewagner
Copy link
Contributor Author

Thanks for sorting this out @joewagner. Should we get this out soonish? Sounded like the timeout issue was blocking some folks on Filecoin.

Sounds good to me! I'm planning to make sure the studio users get support, but otherwise can work on this.

@sanderpick
Copy link
Member

@joewagner I asked @dtbuchholz if he could take this over... sounds like the abort bug is the main cause of pain for the hackathon so figured we should try to get this out.

statements.map(async (stmt) => await stmt.all<T>(undefined, opts))
statements.map(
async (stmt) => await stmt.all<T>(undefined, controller)
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sanderpick from this comment—i don't think this is a problem? i tested it out like this, if it's what you meant:

const batch = await db.batch(
  [
    db.prepare(`SELECT * FROM ${tableOne};`),
    db.prepare(`SELECT * FROM ${tableTwo};`),
    db.prepare(`SELECT * FROM ${tableThree};`),
    db.prepare(`SELECT * FROM ${tableFour};`),
    db.prepare(`SELECT * FROM ${tableFive};`),

  ],
  controller
);

and i get the array back w/o issue:

[
  {
    meta: { duration: 50.153499603271484 },
    success: true,
    results: [ [Object] ]
  },
  {
    meta: { duration: 48.47224998474121 },
    success: true,
    results: [ [Object] ]
  }
]

i can add a test but double checking my understanding.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cool, trying to refresh my memory here. shouldn't your array have 5 entries?

@dtbuchholz
Copy link
Contributor

two other follow ups:

  • regarding this comment, is it something worthwhile implementing, or should we keep this as-is?
  • in order to set a predefined per-chain polling timeout value, would it make sense to put that in evm-tableland's network.ts file? otherwise, it'd require custom chain handling in the chains helper, which tries to avoid that by importing from evm-tableland. for example, network.ts could export a "chainPollingTimeout" or "chainBlockTime"...something like that.

@dtbuchholz
Copy link
Contributor

dtbuchholz commented Oct 19, 2023

i tested these changes in the Studio with just a symlink (no code changes to the Studio...and just using the default controller functionality in this PR). occasionally, it works on sepolia, but it's still aborting. in this video, you'll notice:

  1. the tx is signed, the SDK starts waiting for confirmation, and it gets confirmed onchain ~20 seconds later—the tx is logged in the console.
  2. thereafter, it takes 13 seconds for the validator to materialize the table. the SDK makes 8 requests without success and then aborts. (you can see on the far right, i'm manually refreshing the /tables endpoint until this table shows up.)

it seems like the validator is having issues materializing sepolia data?

Screen.Recording.2023-10-19.at.2.33.14.PM.mov

@sanderpick
Copy link
Member

sanderpick commented Oct 20, 2023

  • regarding this comment, is it something worthwhile implementing, or should we keep this as-is?

hmm, i'm not totally clear on what needs to conform to the D1 API, and if we're planning on maintaining compatibility. my pref would be to make an options object like in my example commit. @joewagner do you have any insights here?

  • in order to set a predefined per-chain polling timeout value, would it make sense to put that in evm-tableland's network.ts file? otherwise, it'd require custom chain handling in the chains helper, which tries to avoid that by importing from evm-tableland. for example, network.ts could export a "chainPollingTimeout" or "chainBlockTime"...something like that.

Nice, I like the idea of using evm-tableland here. That could be integrated here in a diff PR.

@sanderpick
Copy link
Member

@dtbuchholz regarding the video, if you increase the timeout to like 60 seconds, it should work then right? ignoring the fact that it shouldn't be that slow on sepolia for now (@avichalp / @brunocalza any idea what's going on there? 13 seconds sounds pretty slow.)

@joewagner
Copy link
Contributor Author

  • regarding this comment, is it something worthwhile implementing, or should we keep this as-is?

hmm, i'm not totally clear on what needs to conform to the D1 API, and if we're planning on maintaining compatibility. my pref would be to make an options object like in my example commit. @joewagner do you have any insights here?

Based on the abstract class D1 publishes, it looks like we need to make the first method polymorphic and potentially accept a colName string parameter, but it can also be called with no arguments.
This means we should be able to accept an optional Options that contains whatever properties we want to support in all of the statement methods.
maybe something like:

{
  first<T = unknown>(colName: string, opts?: TablelandOptions): Promise<T | null>;
  first<T = Record<string, unknown>>(opts?: TablelandOptions): Promise<T | null>;
  run<T = Record<string, unknown>>(opts?: TablelandOptions): Promise<D1Result<T>>;
  all<T = Record<string, unknown>>(opts?: TablelandOptions): Promise<D1Result<T>>;
  raw<T = unknown[]>(opts?: TablelandOptions): Promise<T[]>;
}

Any kind of third party tools that expect the SDK to be D1 compliant will be able to call those methods as ewpected, and we can still allow folks to supply an options object that indicates the timeout they want.

NOTE: looks like we currently do (opts: SignalAndInterval = {}) instead of using the ?, i.e. (opts?: SignalAndInterval). Our current way of setting a default feels nicer and has the same affact imo, but either should work.

@joewagner
Copy link
Contributor Author

joewagner commented Oct 20, 2023

I have a thought after looking at this with fresh eyes. Regardless if we go with per chain default timeouts, it would be nice to be able to set a global timeout via an overrideDefaults[fixed link] type of interface.
This way, as a dev, I can decide how often my default polling should happen and how long to wait before stopping the polling.

@brunocalza
Copy link

@dtbuchholz regarding the video, if you increase the timeout to like 60 seconds, it should work then right? ignoring the fact that it shouldn't be that slow on sepolia for now (@avichalp / @brunocalza any idea what's going on there? 13 seconds sounds pretty slow.)

According to the current config , we're waiting for at least one block to consider the tx final. and the ethereum block time is 12s, so I guess that's kind of expected. Were we experiencing faster times before?

@dtbuchholz
Copy link
Contributor

dtbuchholz commented Oct 20, 2023

According to the current config , we're waiting for at least one block to consider the tx final. and the ethereum block time is 12s, so I guess that's kind of expected. Were we experiencing faster times before?

@brunocalza ah okay that makes sense (i dont think they were faster before...just never really noticed it). out of curiosity, what's the a reason we impose the 1 block depth for sepolia and mumbai when arb/op goerli use 0? do sepolia/mumbai commonly have reorgs or something?

@sanderpick
Copy link
Member

Any kind of third party tools that expect the SDK to be D1 compliant will be able to call those methods as ewpected, and we can still allow folks to supply an options object that indicates the timeout they want.

@joewagner sounds good. @dtbuchholz do you want to give this a shot?

NOTE: looks like we currently do (opts: SignalAndInterval = {}) instead of using the ?, i.e. (opts?: SignalAndInterval). Our current way of setting a default feels nicer and has the same affact imo, but either should work.

There was some discussion of this here. Personally I prefer the ? for objects that don't have optional params. However, if we implement a true opts object on the top-level API (like here), then it would have optional values and using the default pattern ( = {}) makes sense to me, but internal methods that take controller would keep the ? pattern because that object doesn't have optional params.

I have a thought after looking at this with fresh eyes. Regardless if we go with per chain default timeouts, it would be nice to be able to set a global timeout via an #36 type of interface.
This way, as a dev, I can decide how often my default polling should happen and how long to wait before stopping the polling.

Sounds good! Did you mean to link back to this PR? Should this come in this PR, or as a follow on?

@dtbuchholz
Copy link
Contributor

@sanderpick sgtm, will take a stab. and wrt the overrideDefaults, i think it's referring to this.

@brunocalza
Copy link

According to the current config , we're waiting for at least one block to consider the tx final. and the ethereum block time is 12s, so I guess that's kind of expected. Were we experiencing faster times before?

@brunocalza ah okay that makes sense (i dont think they were faster before...just never really noticed it). out of curiosity, what's the a reason we impose the 1 block depth for sepolia and mumbai when arb/op goerli use 0? do sepolia/mumbai commonly have reorgs or something?

i think it's because reorgs can't happen on L2s

@joewagner
Copy link
Contributor Author

Sounds good! Did you mean to link back to this PR? Should this come in this PR, or as a follow on?

Oops, no I meant to link the overideDefaults method.

export function overrideDefaults(

@sanderpick
Copy link
Member

Sweet, @dtbuchholz feel free to try and implement something like that here

return {
signal: controller.signal,
abort: () => {
clearTimeout(timeoutId);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: the only change I made here from Sander's original code was the addition of this clearTimeout(timeoutId) line, to make sure that it's cleared if an abort occurs.

*/
export interface ChainInfo {
chainName: ChainName;
chainId: number;
contractAddress: string;
baseUrl: string;
pollingTimeout: number;
pollingInterval: number;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I figured that since overrideDefaults is already storing validator-specific and chain information, we could just include the validator polling info there. If needed, I can separate it out into its own interface.

const entries = Object.entries(proxies) as Array<[ChainName, string]>;
const mapped = entries.map(([chainName, contractAddress]) => {
// @ts-expect-error this imported object's values are always a string
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because the TablelandNetworkConfig can now have types of string | number, the compiler was complaining here with the new URL constructor and baseURIs[chainName]. I tried a few different fixes like checking types & throwing etc, but nothing worked. I think it's safe to ignore since we know that baseURIs will always be strings, though.

// Use per-chain validator polling timeout period
const pollingTimeout = validatorPollingTimeouts[chainName];
// Default to 1500ms polling interval, except for Filecoin due to long block times
const pollingInterval = chainName.includes("filecoin") ? 5000 : 1500;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For each chain, we take a chain-specific validator polling timeout as defined here, and then we can use that polling timeout by default in other methods. For the interval, I've set the frequency to 1500ms for all chains—except for Filecoin.

Since it can take a few minutes for the validator to materialize FIL data, the standard interval is, instead, 5 seconds...it could be even longer, if needed.

* @param chainNameOrId The requested chain name.
* @returns A {@link PollingController} with standard timeout & interval per-chain.
*/
export function getChainPollingController(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This method makes it easy to use the chain-specific validator polling timeouts/intervals. It'll create a new polling controller with those chain-specific settings.

colName?: K,
opts: SignalAndInterval = {}
): Promise<Result<T | T[K]>> {
async all<T = S>(opts: Options = {}): Promise<Result<T>> {
Copy link
Contributor

@dtbuchholz dtbuchholz Oct 28, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here's an example of how the options are passed.

Note that I had started down the path of D1 API conformance in this PR, but I ended up creating a separate branch for this work. Looks like I removed some stuff on accident. Thus, the removed code might need to be added back, unless that PR gets merged into this one. For example, I removed the colName since it's not part of the D1 API, but that might actually be something we want to keep in there.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sounds good. I don't quite follow the point about removed code, but if easier, you could just target this branch. generally speaking though, its easier to keep PRs small and specific.

strictEqual(meta.txn, undefined);
strictEqual(error, undefined);
assert(meta.duration != null);
deepStrictEqual(results, [1, 2, 3, 4]);
deepStrictEqual(results, [
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As noted, I started down the path of D1 API conformance and looks like I muddied up a bit here. So, the colName can't be passed to all() anymore. Again, I can revert this change here if we want to keep things simple and/or still want that feature. The PR with D1 changes, for reference.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. So maybe we can just merge this PR and then you can rebase the other on and fix it up before a major release?

Copy link
Contributor

@dtbuchholz dtbuchholz Oct 31, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, that makes sense to me.

also tagging @joewagner to help address one of your questions above about the removed code since he was helping with some cleanups for the D1 conformance. do the changes in this comment make sense? namely, this is the current SDK code, before this PR and any D1 conformance:

async all<T = S, K extends keyof T = keyof T>(
  colName?: undefined,
  opts?: SignalAndInterval
): Promise<Result<T>>;
async all<T = S, K extends keyof T = keyof T>(
  colName: K,
  opts?: SignalAndInterval
): Promise<Result<T[K]>>;
async all<T = S, K extends keyof T = keyof T>(
  colName?: K,
  opts: SignalAndInterval = {}
): Promise<Result<T | T[K]>

and with the changes in this PR & D1 conformance (#63), we dropped colName, and it now simply looks like this:

async all<T = Record<string, S>>(opts: Options = {}): Promise<Result<T>>

similarly, the first method currently looks like this, before any changes from those PRs:

async first<T = S, K extends keyof T = keyof T>(): Promise<T>;
async first<T = S, K extends keyof T = keyof T>(
  colName: undefined,
  opts?: SignalAndInterval
): Promise<T>;
async first<T = S, K extends keyof T = keyof T>(
  colName: K,
  opts?: SignalAndInterval
): Promise<T[K] | null>;
async first<T = S, K extends keyof T = keyof T>(
  colName?: K,
  opts: SignalAndInterval = {}
): Promise<T | T[K] | null>

and after these PR changes:

async first<T = Record<string, S>>(opts?: Options): Promise<T | null>;
async first<T = S, K extends keyof T = keyof T>(
  colName: K,
  opts?: Options
): Promise<T[K] | null>;
async first<T = S, K extends keyof T = keyof T>(
  colName?: K,
  opts: Options = {}
): Promise<T | T[K] | null> 

the main callout with first is that i dropped the overloading with colName: undefined (primarily because chatgpt didn't see why it was useful, but the return type is a bit different). and if no colName is specified at all, we use the Record/return types from the new D1 API.

anyways, i can just merge this PR and make any potential fixes in the conformance PR...just wanna double check here first so that i dont lose track of things

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Making sure I'm on the right page here, are we trying to match these types? https://github.com/cloudflare/workerd/blob/main/types/defines/d1.d.ts

Without trying it, would the colName: undefined let you call first like this: db.prepare(...).first(colName, options);
Where colName is of type string | undefined?

Copy link
Contributor

@dtbuchholz dtbuchholz Oct 31, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

correct, i was just trying to match those types.

so, when i test out your example, it looks like we do need to keep the colName: undefined overload...i'll revert/add it back in #63 and include more test cases. but also...the D1 types use first<T = unknown>(colName: string), so maybe colName: undefined shouldn't be allowed because they require string?

edit: i think i misunderstood how generics worked with overloading

wrt your example: what's interesting is that if i don't specify the opts/controller, we dont need the colName: undefined overload; it'll execute the same, whether or not that overload definition is included. but if we do use opts/controller, then it fails, and it only works if i add the colName: undefined overload back in there. i'll debug a bit more, too.

// when the `colName: undefined` overload is not included
let c: string | undefined;
c = undefined;
let row = await stmt.first<any>(c); // works
row = await stmt.first<any>(c, { controller }); // fails
// error: Argument of type 'undefined' is not assignable to parameter of type 'string | number | symbol'. 
// The call would have succeeded against this implementation, but implementation signatures of overloads are not externally visible.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dtbuchholz I'm ok adding it back, or not, up to you.
To put it all in one place... If we can support the following signatures, that would make sense to me:

// needed for ORM
await stmt.first();
// needed for ORM, i think?
await stmt.first(col);
// needed for users who want to modify there timeouts and specify a col
await stmt.first(col, opts); // NOTE: I don't know if we **must** allow `col` to be undefined, i'm ok either way.
// needed for users who want to modify there timeouts
await stmt.first(opts);

Copy link
Member

@sanderpick sanderpick left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very nice... thanks for adding all the extra tests.

strictEqual(meta.txn, undefined);
strictEqual(error, undefined);
assert(meta.duration != null);
deepStrictEqual(results, [1, 2, 3, 4]);
deepStrictEqual(results, [
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. So maybe we can just merge this PR and then you can rebase the other on and fix it up before a major release?

@dtbuchholz dtbuchholz merged commit 8c23061 into main Nov 1, 2023
4 checks passed
@dtbuchholz dtbuchholz deleted the joe/fix-polling-opts branch November 2, 2023 00:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants