1 parent 9efc17e, commit b2a1dda
Showing 1 changed file with 138 additions and 1 deletion.
docs/core_docs/docs/integrations/document_loaders/web_loaders/dropbox.mdx
@@ -1 +1,138 @@
// documentation goes here
---
hide_table_of_contents: true
sidebar_class_name: node-only
---

# Dropbox Loader

The `DropboxLoader` loads files from Dropbox into your LangChain application. It retrieves individual files or whole directories from your Dropbox account and converts them into documents ready for processing.

## Overview

Dropbox is a file hosting service that brings all your files (traditional documents, cloud content, and web shortcuts) together in one place. The `DropboxLoader` downloads those files through the Dropbox API and passes them to Unstructured, which converts them into documents.

## Setup

1. Create a Dropbox app using the [Dropbox App Console](https://www.dropbox.com/developers/apps/create).
2. Ensure the app has the `files.metadata.read` and `files.content.read` scope permissions.
3. Generate an access token from the Dropbox App Console.
4. Set up Unstructured so that it is reachable at a URL endpoint. It can be hosted remotely or run locally. See the [Unstructured documentation](https://docs.unstructured.io) for information on how to do that.
5. Install the necessary packages:

```bash npm2yarn
npm install @langchain/community @langchain/core dropbox
```
## Usage | ||
|
||
### Loading Specific Files | ||
|
||
To load specific files from Dropbox, specify the file paths: | ||
|
||
```typescript | ||
import { DropboxLoader } from "@langchain/community/document_loaders/web/dropbox"; | ||
const loader = new DropboxLoader({ | ||
clientOptions: { | ||
accessToken: "your-dropbox-access-token", | ||
}, | ||
unstructuredOptions: { | ||
apiUrl: "http://localhost:8000/general/v0/general", // Replace with your Unstructured API URL | ||
}, | ||
filePaths: ["/path/to/file1.txt", "/path/to/file2.pdf"], // Replace with file paths on Dropbox. | ||
}); | ||
const docs = await loader.load(); | ||
console.log(docs); | ||
``` | ||

### Loading Files from a Directory

To load all files from a specific directory, provide the `folderPath` and set the `mode` to `"directory"`. Set `recursive` to `true` to traverse subdirectories:

```typescript
import { DropboxLoader } from "@langchain/community/document_loaders/web/dropbox";

const loader = new DropboxLoader({
  clientOptions: {
    accessToken: "your-dropbox-access-token",
  },
  unstructuredOptions: {
    apiUrl: "http://localhost:8000/general/v0/general",
  },
  folderPath: "/path/to/folder",
  recursive: true, // Load documents found in subdirectories
  mode: "directory",
});

const docs = await loader.load();
console.log(docs);
```
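
Directory listings in the Dropbox API are paginated: each response carries a batch of entries, a cursor, and a flag saying whether more remain. The following is a minimal sketch of how file paths might be collected across pages. The `FolderClient` interface, the `listFilePaths` helper, and the in-memory `fakeClient` are hypothetical names introduced for illustration, and nothing here is taken from the loader's actual implementation.

```typescript
// Hypothetical, simplified shapes modelled on a paginated folder listing.
interface Entry {
  tag: "file" | "folder";
  path: string;
}

interface ListPage {
  entries: Entry[];
  cursor: string;
  hasMore: boolean;
}

// Hypothetical client interface; a real one would wrap the Dropbox SDK.
interface FolderClient {
  listFolder(path: string, recursive: boolean): Promise<ListPage>;
  listFolderContinue(cursor: string): Promise<ListPage>;
}

// Collect every file path under `folderPath`, following cursors until
// the listing reports that no more pages remain.
async function listFilePaths(
  client: FolderClient,
  folderPath: string,
  recursive: boolean
): Promise<string[]> {
  const paths: string[] = [];
  let page = await client.listFolder(folderPath, recursive);
  for (;;) {
    for (const entry of page.entries) {
      // Folders are skipped; only files can be downloaded and parsed.
      if (entry.tag === "file") paths.push(entry.path);
    }
    if (!page.hasMore) break;
    page = await client.listFolderContinue(page.cursor);
  }
  return paths;
}

// In-memory stand-in that serves two pages, for demonstration only.
const fakeClient: FolderClient = {
  async listFolder(): Promise<ListPage> {
    return {
      entries: [
        { tag: "folder", path: "/docs/sub" },
        { tag: "file", path: "/docs/a.txt" },
      ],
      cursor: "page-2",
      hasMore: true,
    };
  },
  async listFolderContinue(): Promise<ListPage> {
    return {
      entries: [{ tag: "file", path: "/docs/sub/b.pdf" }],
      cursor: "done",
      hasMore: false,
    };
  },
};
```

Calling `listFilePaths(fakeClient, "/docs", true)` resolves to the two file paths and omits the folder entry.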

### Streaming Documents

To process large datasets efficiently, use the `loadLazy` method to stream documents asynchronously:

```typescript
import { DropboxLoader } from "@langchain/community/document_loaders/web/dropbox";

const loader = new DropboxLoader({
  clientOptions: {
    accessToken: "your-dropbox-access-token",
  },
  unstructuredOptions: {
    apiUrl: "http://localhost:8000/general/v0/general",
  },
  folderPath: "/large/dataset",
  recursive: true,
  mode: "directory",
});

for await (const doc of loader.loadLazy()) {
  // Process each document as it's loaded
  console.log(doc);
}
```
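
When streaming, it is often convenient to hand documents to downstream steps (embedding, indexing) in small groups rather than one at a time. The sketch below shows one way to batch any async iterable. `inBatches`, `fakeDocs`, and `collectBatchSizes` are hypothetical helpers written for this example, and `fakeDocs` only stands in for `loader.loadLazy()` so the snippet runs without a Dropbox account.

```typescript
// Group items from any async iterable into arrays of at most `size` items.
async function* inBatches<T>(
  source: AsyncIterable<T>,
  size: number
): AsyncGenerator<T[]> {
  if (!Number.isInteger(size) || size < 1) {
    throw new Error("size must be a positive integer");
  }
  let batch: T[] = [];
  for await (const item of source) {
    batch.push(item);
    if (batch.length === size) {
      yield batch;
      batch = [];
    }
  }
  // Emit the final, possibly shorter, batch.
  if (batch.length > 0) yield batch;
}

// Stand-in for `loader.loadLazy()`: yields three minimal document objects.
async function* fakeDocs(): AsyncGenerator<{ pageContent: string }> {
  for (const text of ["first", "second", "third"]) {
    yield { pageContent: text };
  }
}

// Consume the stream two documents at a time and record each batch size.
async function collectBatchSizes(): Promise<number[]> {
  const sizes: number[] = [];
  for await (const batch of inBatches(fakeDocs(), 2)) {
    sizes.push(batch.length);
  }
  return sizes;
}
```

With three documents and a batch size of two, `collectBatchSizes()` resolves to `[2, 1]`. Only one batch is held in memory at a time, which is the point of streaming in the first place.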

### Authentication with Environment Variables

You can set the `DROPBOX_ACCESS_TOKEN` environment variable instead of passing the access token in `clientOptions`:

```bash
export DROPBOX_ACCESS_TOKEN=your-dropbox-access-token
```

Then initialize the loader without specifying `accessToken`:

```typescript
import { DropboxLoader } from "@langchain/community/document_loaders/web/dropbox";

const loader = new DropboxLoader({
  clientOptions: {},
  unstructuredOptions: {
    apiUrl: "http://localhost:8000/general/v0/general",
  },
  filePaths: ["/important/notes.txt"],
});

const docs = await loader.load();
console.log(docs[0].pageContent);
```
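
If you want the same fallback in your own code, for example to fail early with a clear message before constructing the loader, a small helper can resolve the token. This is a sketch under stated assumptions: `resolveAccessToken` and the `Env` type are hypothetical, and the precedence shown (explicit value first, then the environment variable) is an illustrative choice, not a description of the loader's internals.

```typescript
// Minimal view of an environment map such as `process.env`.
type Env = Record<string, string | undefined>;

// Prefer an explicitly supplied token, then fall back to the
// DROPBOX_ACCESS_TOKEN entry of the given environment map.
function resolveAccessToken(explicit: string | undefined, env: Env): string {
  const token = explicit || env["DROPBOX_ACCESS_TOKEN"];
  if (!token) {
    throw new Error(
      "No Dropbox access token: pass accessToken or set DROPBOX_ACCESS_TOKEN"
    );
  }
  return token;
}

// Example: the environment value is used when nothing is passed explicitly.
const tokenFromEnv = resolveAccessToken(undefined, {
  DROPBOX_ACCESS_TOKEN: "token-from-env",
});
```

In a Node.js application you would pass `process.env` as the second argument.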

## Configuration Options

Here are the configuration options for the `DropboxLoader`:

| Option | Type | Description |
| --- | --- | --- |
| `clientOptions` | `DropboxOptions` | Configuration options for initializing the Dropbox client, including authentication details. Refer to the [Dropbox SDK Documentation](https://dropbox.github.io/dropbox-sdk-js/Dropbox.html#Dropbox__anchor) for more information. |
| `unstructuredOptions` | `UnstructuredLoaderOptions` | Options for the `UnstructuredLoader` used to process downloaded files. Includes the `apiUrl` for your Unstructured server. |
| `folderPath` | `string` (optional) | The path to the folder in Dropbox from which to load files. Defaults to the root folder (`""`) if not specified. |
| `filePaths` | `string[]` (optional) | An array of specific file paths in Dropbox to load. Required when `mode` is `"file"`, which is the default. |
| `recursive` | `boolean` (optional) | Indicates whether to recursively traverse folders when `mode` is `"directory"`. Defaults to `false`. |
| `mode` | `"file"` or `"directory"` (optional) | The mode of operation. Set to `"file"` to load specific files or `"directory"` to load all files in a directory. Defaults to `"file"`. |
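
The defaults and the `filePaths` requirement in the table can be expressed as a small normalization step. The sketch below is illustrative only: `LoaderPathOptions`, `ResolvedPathOptions`, and `resolvePathOptions` are hypothetical names, and whether the loader itself validates options this way is an assumption.

```typescript
type Mode = "file" | "directory";

// The path-related options from the table, all optional as documented.
interface LoaderPathOptions {
  folderPath?: string;
  filePaths?: string[];
  recursive?: boolean;
  mode?: Mode;
}

// The same options after the documented defaults have been applied.
interface ResolvedPathOptions {
  folderPath: string;
  filePaths: string[];
  recursive: boolean;
  mode: Mode;
}

// Apply the documented defaults and reject "file" mode without file paths.
function resolvePathOptions(options: LoaderPathOptions): ResolvedPathOptions {
  const mode: Mode = options.mode ?? "file";
  const filePaths = options.filePaths ?? [];
  if (mode === "file" && filePaths.length === 0) {
    throw new Error('filePaths is required when mode is "file"');
  }
  return {
    mode,
    filePaths,
    folderPath: options.folderPath ?? "", // "" is the Dropbox root folder
    recursive: options.recursive ?? false,
  };
}

// Example: directory mode with every other option left at its default.
const resolved = resolvePathOptions({ mode: "directory" });
```

Here `resolved` has `folderPath` set to `""` and `recursive` set to `false`, matching the defaults in the table, while calling `resolvePathOptions({})` throws because the default mode is `"file"`.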

## API References

- [Dropbox SDK for JavaScript](https://github.com/dropbox/dropbox-sdk-js)