Refactor fetchText to use host.fetchText function (#886)
* refactor: migrate fetchText to host.fetchText function ✨

* feat: add resolveIssue option to GitHub parsing 🎉

* update tests

* refactor: ♻️ restructure project & remove configs
pelikhan authored Nov 21, 2024
1 parent 8adbce7 commit 30d6a0c
Showing 17 changed files with 104 additions and 66 deletions.
2 changes: 1 addition & 1 deletion docs/src/content/docs/getting-started/tutorial.md
@@ -201,7 +201,7 @@ defTool(
"fetch",
"Download text from a URL",
{ url: "https://..." },
({ url }) => fetchText(url)
({ url }) => host.fetchText(url)
)

$`Summarize https://raw.githubusercontent.com/microsoft/genaiscript/main/README.md in 1 sentence.`
6 changes: 3 additions & 3 deletions docs/src/content/docs/guides/search-and-fetch.mdx
@@ -30,16 +30,16 @@ You will need a [Bing Web Search API key](/genaiscript/reference/scripts/web-sea
3. Use the [`webSearch`](/genaiscript/reference/scripts/web-search/) function to
search for information about the destination.
If you don't have one, then you can search for the web pages manually and use the URLs directly
in the call to the `fetchText` function.
in the call to the `host.fetchText` function.
```js
const parkinfo = await retrieval.webSearch("mt rainier things to do")
```
4. `webSearch` returns a list of URLs. Use [`fetchText`](/genaiscript/reference/scripts/fetch/)
to fetch the contents of the 1st URL.
```js
const parktext = await fetchText(parkinfo.webPages[0])
const parktext = await host.fetchText(parkinfo.webPages[0])
```
5. `fetchText` returns a lot of formatting HTML tags, etc.
5. `host.fetchText` returns a lot of formatting HTML tags, etc.
Use [`runPrompt`](/genaiscript/reference/scripts/inline-prompts/)
to call the LLM to clean out the tags and just keep the text.
```js
16 changes: 12 additions & 4 deletions docs/src/content/docs/reference/scripts/fetch.md
@@ -9,12 +9,20 @@ keywords: fetch API, fetchText, HTTP requests, scripts, API key
The JavaScript `fetch` API is available; but we also provide a helper
`fetchText` for issuing requests into a friendly format.

## `fetchText`
## `host.fetch`

Use `fetchText` to issue requests and download text from the internet.
The `host.fetch` function is a wrapper around the global `fetch` function which adds builtin proxy support and retry capabilities.

```js
const response = await host.fetch("https://api.example.com", { retries: 3 })
```
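
As a rough illustration of what "retry capabilities" can mean, the sketch below computes an exponential backoff schedule from the `retries`, `retryDelay`, and `maxDelay` options and decides whether a status code warrants another attempt. The helper names `backoffDelays` and `shouldRetry`, the default values, the millisecond unit, and the doubling rule are all assumptions made for this example; the actual behavior of `host.fetch` may differ.

```ts
// Hypothetical helpers for illustration only; not the GenAIScript implementation.
type RetryOptions = {
    retryOn?: number[] // HTTP status codes to retry on
    retries?: number // number of retry attempts
    retryDelay?: number // initial delay between retries (assumed milliseconds)
    maxDelay?: number // maximum delay between retries (assumed milliseconds)
}

// Returns the delay to wait before each retry attempt, doubling every time
// and never exceeding maxDelay. Default values are assumptions.
function backoffDelays(options: RetryOptions = {}): number[] {
    const { retries = 3, retryDelay = 1000, maxDelay = 10000 } = options
    const delays: number[] = []
    for (let attempt = 0; attempt < retries; attempt++)
        delays.push(Math.min(maxDelay, retryDelay * Math.pow(2, attempt)))
    return delays
}

// Decides whether a response status should trigger another attempt.
// `attempt` counts the retries already performed.
function shouldRetry(
    status: number,
    attempt: number,
    options: RetryOptions = {}
): boolean {
    const { retryOn = [429, 500, 502, 503, 504], retries = 3 } = options
    return attempt < retries && retryOn.indexOf(status) >= 0
}
```

With `{ retries: 4, retryDelay: 1000, maxDelay: 5000 }` the schedule doubles from the initial delay until it reaches the cap.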

## `host.fetchText`

Use `host.fetchText` to issue requests and download text from the internet.

```ts
const { text, file } = await fetchText("https://....")
const { text, file } = await host.fetchText("https://....")
if (text) $`And also ${text}`

def("FILE", file)
@@ -23,7 +31,7 @@ def("FILE", file)
`fetchText` will also resolve the contents of a file in the current workspace if the URL is a relative path.

```ts
const { file } = await fetchText("README.md")
const { file } = await host.fetchText("README.md")
def("README", file)
```

5 changes: 4 additions & 1 deletion packages/cli/src/run.ts
@@ -503,7 +503,10 @@ export async function runScript(
let _ghInfo: GithubConnectionInfo = undefined
const resolveGitHubInfo = async () => {
if (!_ghInfo)
_ghInfo = await githubParseEnv(process.env, { issue: pullRequest })
_ghInfo = await githubParseEnv(process.env, {
issue: pullRequest,
resolveIssue: true,
})
return _ghInfo
}
let adoInfo: AzureDevOpsEnv = undefined
38 changes: 34 additions & 4 deletions packages/core/src/fetch.ts
@@ -42,7 +42,13 @@ export async function createFetch(
} = options || {}

// We create a proxy based on Node.js environment variables.
const proxy = process.env.HTTPS_PROXY || process.env.HTTP_PROXY || process.env.https_proxy || process.env.http_proxy;
const proxy =
process.env.GENAISCRIPT_HTTPS_PROXY ||
process.env.GENAISCRIPT_HTTP_PROXY ||
process.env.HTTPS_PROXY ||
process.env.HTTP_PROXY ||
process.env.https_proxy ||
process.env.http_proxy
const agent = proxy ? new HttpsProxyAgent(proxy) : null

// We enrich crossFetch with the proxy.
@@ -89,6 +95,22 @@ export async function createFetch(
return fetchRetry
}

export async function fetch(
input: string | URL | globalThis.Request,
options?: FetchOptions & TraceOptions
): Promise<Response> {
const { retryOn, retries, retryDelay, maxDelay, trace, ...rest } =
options || {}
const f = await createFetch({
retryOn,
retries,
retryDelay,
maxDelay,
trace,
})
return f(input, rest)
}

/**
* Fetches text content from a URL or file.
*
@@ -101,8 +123,10 @@ export async function createFetch(
*/
export async function fetchText(
urlOrFile: string | WorkspaceFile,
fetchOptions?: FetchTextOptions
fetchOptions?: FetchTextOptions & TraceOptions
) {
const { retries, retryDelay, retryOn, maxDelay, trace, ...rest } =
fetchOptions || {}
if (typeof urlOrFile === "string") {
urlOrFile = {
filename: urlOrFile,
@@ -114,8 +138,14 @@ export async function fetchText(
let status = 404
let text: string
if (/^https?:\/\//i.test(url)) {
const fetch = await createFetch()
const resp = await fetch(url, fetchOptions)
const f = await createFetch({
retries,
retryDelay,
retryOn,
maxDelay,
trace,
})
const resp = await f(url, rest)
ok = resp.ok
status = resp.status
if (ok) text = await resp.text()
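
The proxy lookup added to `createFetch` in this file gives the `GENAISCRIPT_*` variables precedence over the conventional ones. A minimal sketch of that precedence follows, assuming a hypothetical `resolveProxy` helper that takes the environment as a parameter instead of reading `process.env` directly; the helper and the `PROXY_VARIABLES` constant are illustrative names and are not part of the commit.

```ts
// Illustrative helper; the commit inlines this logic as a chain of `||` operators.
type Env = Record<string, string | undefined>

// Variable names in the order the commit checks them.
const PROXY_VARIABLES = [
    "GENAISCRIPT_HTTPS_PROXY",
    "GENAISCRIPT_HTTP_PROXY",
    "HTTPS_PROXY",
    "HTTP_PROXY",
    "https_proxy",
    "http_proxy",
]

// Returns the first non-empty proxy URL, or undefined when none is set.
// Empty strings are skipped, matching the falsy check of `||`.
function resolveProxy(env: Env): string | undefined {
    for (const name of PROXY_VARIABLES) {
        const value = env[name]
        if (value) return value
    }
    return undefined
}
```

Taking the environment as an argument keeps the sketch easy to test; a caller would pass `process.env` in a Node.js runtime.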
7 changes: 5 additions & 2 deletions packages/core/src/github.ts
@@ -86,7 +86,9 @@ async function githubGetPullRequestNumber() {

export async function githubParseEnv(
env: Record<string, string>,
options?: { issue?: number } & Partial<Pick<GithubConnectionInfo, "owner" | "repo">>
options?: { issue?: number; resolveIssue?: boolean } & Partial<
Pick<GithubConnectionInfo, "owner" | "repo">
>
): Promise<GithubConnectionInfo> {
const res = githubFromEnv(env)
try {
@@ -110,8 +112,9 @@ export async function githubParseEnv(
res.repo = repo
res.owner = owner.login
res.repository = res.owner + "/" + res.repo
if (isNaN(res.issue)) res.issue = await githubGetPullRequestNumber()
}
if (isNaN(res.issue) && options?.resolveIssue)
res.issue = await githubGetPullRequestNumber()
} catch (e) {}
return Object.freeze(res)
}
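
The effect of the new `resolveIssue` flag can be sketched in isolation: the pull request lookup only runs when the issue number is missing and the caller opted in. The `resolveIssueNumber` helper and its `lookup` callback below are hypothetical stand-ins; in the commit the lookup is the asynchronous `githubGetPullRequestNumber`, and it is kept synchronous here only to keep the example short.

```ts
// Hypothetical, simplified version of the gating logic in githubParseEnv.
function resolveIssueNumber(
    issue: number | undefined,
    options: { resolveIssue?: boolean } | undefined,
    lookup: () => number
): number | undefined {
    // Number(undefined) is NaN, so a missing issue is treated like an invalid one.
    if (isNaN(Number(issue)) && options?.resolveIssue) return lookup()
    return issue
}
```

Callers that do not pass `resolveIssue: true` never trigger the lookup, which is the behavior `runScript` now opts into explicitly.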
3 changes: 3 additions & 0 deletions packages/core/src/promptcontext.ts
@@ -27,6 +27,7 @@ import { HTMLEscape } from "./html"
import { hash } from "./crypto"
import { resolveModelConnectionInfo } from "./models"
import { DOCS_WEB_SEARCH_URL } from "./constants"
import { fetch, fetchText } from "./fetch"

/**
* Creates a prompt context for the given project, variables, trace, options, and model.
@@ -212,6 +213,8 @@ export async function createPromptContext(

// Define the host for executing commands, browsing, and other operations
const promptHost: PromptHost = Object.freeze<PromptHost>({
fetch: (url, options) => fetch(url, {...(options || {}), trace }),
fetchText: (url, options) => fetchText(url, {...(options || {}), trace }),
resolveLanguageModel: async (modelId) => {
const { configuration } = await resolveModelConnectionInfo(
{ model: modelId },
34 changes: 32 additions & 2 deletions packages/core/src/types/prompt_template.d.ts
@@ -2237,8 +2237,6 @@ interface Retrieval {
): Promise<WorkspaceFile[]>
}

type FetchTextOptions = Omit<RequestInit, "body" | "signal" | "window">

interface DataFilter {
/**
* The keys to select from the object.
@@ -3242,11 +3240,43 @@ interface ContentSafetyHost {
contentSafety(id?: ContentSafetyProvider): Promise<ContentSafety>
}

type FetchOptions = RequestInit & {
retryOn?: number[] // HTTP status codes to retry on
retries?: number // Number of retry attempts
retryDelay?: number // Initial delay between retries
maxDelay?: number // Maximum delay between retries
}

type FetchTextOptions = Omit<FetchOptions, "body" | "signal" | "window">
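
Because `FetchOptions` mixes retry settings with the standard request settings, a consumer has to separate the two before calling the underlying fetch, as `fetch` and `fetchText` do in `fetch.ts`. The sketch below shows that split with object rest destructuring. `RequestSettings` is a reduced stand-in for `RequestInit`, and `splitFetchOptions` is a hypothetical helper; neither appears in the commit.

```ts
// Reduced stand-ins for illustration; the commit uses RequestInit directly.
type RetrySettings = {
    retryOn?: number[]
    retries?: number
    retryDelay?: number
    maxDelay?: number
}
type RequestSettings = {
    method?: string
    headers?: Record<string, string>
}

// Separates retry settings from the settings forwarded to the request.
function splitFetchOptions(options: RetrySettings & RequestSettings = {}) {
    const { retryOn, retries, retryDelay, maxDelay, ...request } = options
    return { retry: { retryOn, retries, retryDelay, maxDelay }, request }
}
```

Only the `request` part would be handed to the underlying fetch call, so retry-specific keys never leak into the request settings.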

interface PromptHost
extends ShellHost,
UserInterfaceHost,
LanguageModelHost,
ContentSafetyHost {
/**
* A fetch wrapper with proxy, retry and timeout handling.
*/
fetch(
input: string | URL | globalThis.Request,
init?: FetchOptions
): Promise<Response>

/**
* A function that fetches text from a URL or a file
* @param url
* @param options
*/
fetchText(
url: string | WorkspaceFile,
options?: FetchTextOptions
): Promise<{
ok: boolean
status: number
text?: string
file?: WorkspaceFile
}>

/**
* Opens an in-memory key-value cache for the given cache name. Entries are dropped when the cache grows too large.
* @param cacheName
3 changes: 1 addition & 2 deletions packages/core/src/types/prompt_type.d.ts
@@ -222,8 +222,7 @@ declare var git: Git
declare var tokenizers: Tokenizers

/**
* Fetches a given URL and returns the response.
* @param url
* @deprecated use `host.fetchText` instead
*/
declare function fetchText(
url: string | WorkspaceFile,
@@ -8,12 +8,17 @@ script({
/**
* @type {any}
*/
const res = await fetch(
const res = await host.fetch(
"https://raw.githubusercontent.com/microsoft/genaiscript/main/package.json",
{ method: "GET" }
)
const pkg = await res.json()

const { file: readme } = await host.fetchText(
"https://raw.githubusercontent.com/microsoft/genaiscript/refs/heads/main/README.md"
)

def("PACKAGE", YAML.stringify(pkg))
def("README", readme)

$`Explain the purpose of the product described in PACKAGE. Mention its name.`
$`Explain the purpose of the product described in PACKAGE and README. Mention its name.`
2 changes: 1 addition & 1 deletion packages/sample/genaisrc/lza_review.genai.js
@@ -30,7 +30,7 @@ for (const link of biceps) {
const [, , p] = dependency
if (p.includes("shared")) continue // ignore those shared files
const dp = path.join(dirname, p)
const resp = await fetchText(dp)
const resp = await host.fetchText(dp)
def("DEPS", resp.file, { lineNumbers: true })
}
}
File renamed without changes.
2 changes: 1 addition & 1 deletion packages/vscode/tutorial.md
@@ -201,7 +201,7 @@ defTool(
"fetch",
"Download text from a URL",
{ url: "https://..." },
({ url }) => fetchText(url)
({ url }) => host.fetchText(url)
)

$`Summarize https://raw.githubusercontent.com/microsoft/genaiscript/main/README.md in 1 sentence.`
1 change: 0 additions & 1 deletion slides/genaisrc/.gitattributes

This file was deleted.

4 changes: 0 additions & 4 deletions slides/genaisrc/.gitignore

This file was deleted.

17 changes: 0 additions & 17 deletions slides/genaisrc/jsconfig.json

This file was deleted.

21 changes: 0 additions & 21 deletions slides/genaisrc/tsconfig.json

This file was deleted.
