
Implement Circuit Breaker for OpenAI API Calls #32

Open

ayush-vibrant opened this issue Oct 31, 2023 · 1 comment

@ayush-vibrant

Given our reliance on the OpenAI API for multiple operations, it's crucial to ensure our system gracefully handles potential downtimes or rate limits from the API. While the current utils.py does incorporate a retry mechanism with exponential backoff, we could further enhance this by implementing a circuit breaker pattern.

Current Behavior:
In utils.py, when making a call to the OpenAI API (e.g., call_llm), there's retry logic in place that makes a maximum of 3 attempts with a backoff factor of 1.5. If the API service is still unavailable after these attempts, an exception is raised.
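
For reference, a minimal sketch of retry logic along these lines (the helper name call_with_retry and its exact shape are illustrative, not the actual utils.py code):

import time
from typing import Callable

MAX_ATTEMPTS = 3      # matches the current retry limit
BACKOFF_FACTOR = 1.5  # matches the current backoff factor

def call_with_retry(fn: Callable[[], str]) -> str:
    """Run fn, retrying with exponential backoff on any exception."""
    for attempt in range(MAX_ATTEMPTS):
        try:
            return fn()
        except Exception:
            if attempt == MAX_ATTEMPTS - 1:
                raise  # attempts exhausted; surface the last error
            time.sleep(BACKOFF_FACTOR ** attempt)  # waits 1.0s, then 1.5s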

Suggestion:
Implement a circuit breaker mechanism. This would allow the system to temporarily "break" the connection if it detects continuous failures (e.g., due to API downtime). By doing so, we could avoid making calls when the API is known to be down, saving resources and improving response times. Especially since this is distributed as a PyPI package, this mechanism would ensure we're being considerate to the downstream service, allowing it the recovery time it needs when it's facing issues.
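
A minimal sketch of the pattern, layered on top of the retry helper above (the thresholds, the RuntimeError, and the class shape are illustrative assumptions, not a proposed final API):

import time
from typing import Callable, Optional

class CircuitBreaker:
    """After repeated failures, 'open' the circuit and fail fast for a
    cooldown period instead of hammering the downstream API."""

    def __init__(self, failure_threshold: int = 5, reset_timeout: float = 60.0):
        self.failure_threshold = failure_threshold  # consecutive failures before opening
        self.reset_timeout = reset_timeout          # seconds to stay open
        self.failure_count = 0
        self.opened_at: Optional[float] = None

    def call(self, fn: Callable[[], str]) -> str:
        if self.opened_at is not None:
            if time.monotonic() - self.opened_at < self.reset_timeout:
                # Circuit is open: skip the API call entirely.
                raise RuntimeError("circuit open: OpenAI API call skipped")
            self.opened_at = None  # half-open: let one trial call through
        try:
            result = fn()
        except Exception:
            self.failure_count += 1
            if self.failure_count >= self.failure_threshold:
                self.opened_at = time.monotonic()  # trip the breaker
            raise
        self.failure_count = 0  # a success closes the circuit again
        return result

call_llm could then wrap its API invocation in something like breaker.call(...), so sustained outages fail fast while the existing retries still cover transient errors.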

@krrishdholakia

hey @ayush-vibrant, I'm the maintainer of LiteLLM. We let you create a Router to maximize throughput with load balancing + queuing (beta).

I'd love your feedback on whether this solves your issue.

Here's the quick start:

import asyncio
import os

from litellm import Router

model_list = [{  # list of model deployments
    "model_name": "gpt-3.5-turbo",  # model alias
    "litellm_params": {  # params for litellm completion/embedding call
        "model": "azure/chatgpt-v-2",  # actual model name
        "api_key": os.getenv("AZURE_API_KEY"),
        "api_version": os.getenv("AZURE_API_VERSION"),
        "api_base": os.getenv("AZURE_API_BASE")
    }
}, {
    "model_name": "gpt-3.5-turbo",
    "litellm_params": {  # params for litellm completion/embedding call
        "model": "azure/chatgpt-functioncalling",
        "api_key": os.getenv("AZURE_API_KEY"),
        "api_version": os.getenv("AZURE_API_VERSION"),
        "api_base": os.getenv("AZURE_API_BASE")
    }
}, {
    "model_name": "gpt-3.5-turbo",
    "litellm_params": {  # params for litellm completion/embedding call
        "model": "gpt-3.5-turbo",
        "api_key": os.getenv("OPENAI_API_KEY"),
    }
}]

router = Router(model_list=model_list)

async def main():
    # openai.ChatCompletion.create replacement
    response = await router.acompletion(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": "Hey, how's it going?"}])
    print(response)

asyncio.run(main())
