[Fluid AI] Update ai-collab client API to transform TreeEdit types into a consumable version for UI visualizations #23117
base: main
Conversation
…to a consumable version for UI visualizations
🔗 No broken links found! ✅ Your attention to detail is admirable. linkcheck output
Code Coverage Summary
↓ packages.framework.ai-collab.src:
Line Coverage Change: -2.67% Branch Coverage Change: No change
| Metric Name | Baseline coverage | PR coverage | Coverage Diff |
| --- | --- | --- | --- |
| Branch Coverage | 0.00% | 0.00% | → No change |
| Line Coverage | 35.59% | 32.92% | ↓ -2.67% |
↓ packages.framework.ai-collab.src.explicit-strategy:
Line Coverage Change: -0.40% Branch Coverage Change: No change
| Metric Name | Baseline coverage | PR coverage | Coverage Diff |
| --- | --- | --- | --- |
| Branch Coverage | 77.98% | 77.98% | → No change |
| Line Coverage | 77.60% | 77.20% | ↓ -0.40% |
Baseline commit: 0b6d14f
Baseline build: 307935
Happy Coding!!
Code coverage comparison check failed!!
More Details: Readme
- [ ] Skip This Check!!
What to do if the code coverage check fails:
- Ideally, add more tests to increase the code coverage for the package(s) whose code coverage regressed.
- If a regression is causing the build to fail for a valid reason (e.g. removal of tests, or removal of code that had many tests), there is a checkbox further up in this comment that determines whether the code coverage check should fail the build. You can check the box and trigger the build again; the coverage analysis will still run, but it will not fail the build if a regression is detected.
- Unchecking the checkbox and triggering another build will go back to failing the build when a test-coverage regression is detected.
- You can check which lines are covered or not covered by your tests with these steps:
  - Go to the PR ADO build.
  - Click the link to see its published artifacts. You will see an artifact named `codeCoverageAnalysis`, which you can expand to reach a particular source file's coverage HTML, which shows which lines are covered or not covered by your tests.
  - You can also run the different kinds of tests locally with the `:coverage` test commands to find out the coverage.
Besides the comments below, a couple of things:
- It would be nice to see how the React changes look in the running app.
- Not sure I get what the `Test` section in the PR description is supposed to convey. 0 passing tests, 1 skipped test?
```diff
@@ -32,6 +32,7 @@ import { useSharedTreeRerender } from "@/useSharedTreeRerender";
 // Uncomment the import line that corresponds to the server you want to use
 // import { createContainer, loadContainer, postAttach, containerIdFromUrl } from "./spe"; // eslint-disable-line import/order
 import { createContainer, loadContainer, postAttach, containerIdFromUrl } from "./tinylicious"; // eslint-disable-line import/order
+// import { DiffViewer } from "@/components/DiffViewer";
```
Remove?
```diff
@@ -342,6 +343,7 @@ describe.skip("AI Job Listings App Benchmark", () => {
 	}
 	const foundJohnDoe = createJohnDoeCandidateTask.data;
 	assert(foundJohnDoe !== undefined);
+	assert(response.diffs !== undefined);
```
Can we make this test better by asserting what the diffs look like? Or are they non-deterministic?
Shouldn't we unskip the whole test suite? (It's currently skipped on line 223.)
Error: 401 You didn't provide an API key. You need to provide your API key in an Authorization header using Bearer auth (i.e. Authorization: Bearer YOUR_KEY), or as the password field (with blank username) if you're accessing the API from your browser and are prompted for a username and password. You can obtain an API key from https://platform.openai.com/account/api-keys
This is what I get when I run the test. I'd like to spend some time in tmrw's parking lot to ramp up on some AI-collab testing library / concepts 🤯 .
Reading through the suite, and given that it requires an API key, this test is not a unit test; it's more of an integration/e2e test. The code lives here now because this is where everything started, but it needs to move to a different location (potentially the e2e tests package?) later. So it should remain skipped as long as it's here.
If we can extract the mapping of EditLog to Diff into a separate function, we can use unit tests instead of this integration test. It isn't quite deterministic, given we haven't set the temperature value on the model, and even when it's set to be the most deterministic, it can technically give a different response.
```ts
/**
 * The GenerateTreeEditsResponse interface defines the structure of the response object
 *
 * @alpha
 */
export interface GenerateTreeEditsResponse {
	status: "success" | "failure" | "partial-failure";
	errorMessage?: string;
	tokensUsed: TokenUsage;
	diffs?: Diff[];
}
```
Not sure if we really need a new type; should we just add the diffs to the two existing response types we already have?
I agree we should be reusing the existing types here; I'm not seeing value in this new type, and we're exposing an internal naming scheme that doesn't match the public API surface.
Adding 21569 in the description would help 😄 .
Yeah, in order for the AI collab to work you need an OpenAI API key for a paid account.
```ts
const diffs: Diff[] = editLog.map((log, index) => ({
	id: `diff-${index}`,
	// eslint-disable-next-line @typescript-eslint/strict-boolean-expressions
	type: log.error ? "error" : "edit",
	description: log.error ?? log.edit.explanation ?? "No description available",
}));
```
Recommend making this a separate function; that way you will also be able to unit test it.
What does the id here represent? This should be the id of the tree node that was edited, which the log object should have. Additionally, these diffs need more information. We need to know what type of edit this was, for example:
- Is it a modification of an existing object? If so, what fields were modified, and what are the new values, if you can provide that?
- Is it an array operation, e.g. the insertion of a new object, a deletion, a move? If so, what index was it inserted at, moved to, or removed from?
```ts
/**
 * The Diff interface defines the structure of the response object
```
This comment needs a more detailed explanation; it doesn't explain the concept of what a diff is.
```diff
@@ -46,6 +46,16 @@ export function createMergableDiffSeries(diffs: Difference[]): Difference[];
 // @alpha
 export function createMergableIdDiffSeries(oldObject: unknown, diffs: Difference[], idAttributeName: string | number): Difference[];

+// @alpha
+export interface Diff {
+    // (undocumented)
```
Please be sure to document all new exported members before merging 🙂
```diff
@@ -182,3 +182,26 @@ export interface TokenLimits {
  */
 readonly outputTokens?: number;
 }

+/**
+ * The GenerateTreeEditsResponse interface defines the structure of the response object
```
Response to what? Would be good to be a bit more descriptive here.
```diff
 * - Primitive root nodes are not supported
 *
 * @internal
+ * The editLog is transformed into an array of Diff objects. Each Diff object includes
```
"editLog" is something that's defined within the function - the caller of this function likely won't know what it is. We should express the docs here in terms of what the function means to the caller.
Description

Currently, the only way to get a set of objects for visualizing each change an LLM makes to your tree using the aiCollab library is to separately use the `sharedTreeDiff` utility that was created for the implicit strategy. This utility is not perfect, likely does not cover many edge cases, and is fickle. However, when using the explicit strategy, we maintain an edit log internally that is the source of truth. Currently, we don't return the editLog because the `TreeEdit` type of the edit log is not simple for users to consume and visualize changes on their UI.

The purpose of this ticket is to update the explicit strategy's entry point function `generateTreeEdits` by taking the editLog, which is an array of `TreeEdit`, transforming it into a new set of "diff" objects that can easily be used for UI visualizations, and returning it along with the success/fail response.

Test