TypeAgent Shell is a UI entry point to TypeAgent sample code that explores architectures for building interactive agents with natural language interfaces using TypeChat.
TypeAgent Shell is a single personal assistant that takes user requests and uses an extensible set of agents to perform actions, answer questions, and carry on a conversation. TypeAgent CLI's interactive mode is a command line version of the app, and both share the core Dispatcher component. Please read Dispatcher's README.md for example requests and usage.
TypeAgent Shell is built using Electron. On Linux/WSL, install the libraries needed to build it by following the instructions.

To start the shell:

```shell
pnpm run shell
```
On Windows, if you notice a lag when starting up the shell, you can add the source code folder to the exclusion list for Windows Defender scans by following the instructions.
Additionally, if you are running MS Graph based sample agents such as Calendar and Email, an auth token persisted in the identity cache can occasionally become corrupted. This can also slow down the startup time of the shell; you can delete the cache by running:

```shell
del %LOCALAPPDATA%\.IdentityService\typeagent-tokencache
```
Currently, in addition to keyboard input, TypeAgent Shell optionally supports voice input via Azure Speech Services or a local Whisper service.
To set up the Azure Speech to Text service, the following variables are needed in the `.env` file:

| Variable | Value |
| --- | --- |
| `SPEECH_SDK_ENDPOINT` | Service URL (or Speech API resource ID when using identity-based authentication) |
| `SPEECH_SDK_KEY` | API key |
| `SPEECH_SDK_REGION` | Region of the service (e.g. `westus2`) |
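For example, a key-based setup might look like the following in `.env` (placeholder values, not real credentials or an actual endpoint from this project):

```
SPEECH_SDK_ENDPOINT=https://westus2.api.cognitive.microsoft.com/
SPEECH_SDK_KEY=<your-speech-api-key>
SPEECH_SDK_REGION=westus2
```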
If you would like to enable keyless Speech API access, you must perform the following steps:

- Specify `identity` as the `SPEECH_SDK_KEY` in the `.env` file.
- Replace the `SPEECH_SDK_ENDPOINT` value with the Azure resource ID of your Cognitive Services instance (i.e. `/subscriptions/<your subscription guid>/resourceGroups/myResourceGroup/providers/Microsoft.CognitiveServices/accounts/speechapi`).
- Configure your Speech API to support Azure Entra RBAC and add the necessary users/groups with the necessary permissions (typically `Cognitive Services Speech User` or `Cognitive Services Speech Contributor`). More information on Cognitive Services roles here.
- If you are using JIT access, elevate prior to calling the Speech API. Please refer to the elevate.js script for doing this efficiently.
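As a sketch of how an app could distinguish the two authentication modes from these settings (the function and type names below are illustrative assumptions, not the actual TypeAgent source):

```typescript
// Hypothetical helper: decide between key-based and keyless (identity)
// Speech SDK configuration from the .env variables described above.
interface SpeechAuth {
    mode: "key" | "identity";
    // Service URL in key mode; Azure resource ID in identity mode.
    endpoint: string;
    key?: string;
    region: string;
}

function resolveSpeechAuth(
    env: Record<string, string | undefined>,
): SpeechAuth {
    const key = env.SPEECH_SDK_KEY;
    const endpoint = env.SPEECH_SDK_ENDPOINT;
    const region = env.SPEECH_SDK_REGION;
    if (!endpoint || !region) {
        throw new Error("SPEECH_SDK_ENDPOINT and SPEECH_SDK_REGION are required");
    }
    if (key === "identity") {
        // Keyless mode: the endpoint holds the Cognitive Services resource ID,
        // and a token would be acquired via Entra ID at call time.
        return { mode: "identity", endpoint, region };
    }
    if (!key) {
        throw new Error("SPEECH_SDK_KEY is required (an API key, or 'identity')");
    }
    return { mode: "key", endpoint, key, region };
}
```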
This project may contain trademarks or logos for projects, products, or services. Authorized use of Microsoft trademarks or logos is subject to and must follow Microsoft's Trademark & Brand Guidelines. Use of Microsoft trademarks or logos in modified versions of this project must not cause confusion or imply Microsoft sponsorship. Any use of third-party trademarks or logos is subject to those third parties' policies.