Gunicorn Workers Not Using GPU in Parallel #2985
vibhas-singh started this conversation in General
Replies: 3 comments
-
I moved it to a discussion since it's more likely an OS issue than directly related to gunicorn.
-
I'm having a similar issue. @vibhas-singh did you resolve this issue?
-
Any updates on this @vibhas-singh or @Irtiza17? I am also facing a similar issue. Any help will be greatly appreciated.
-
I am trying to deploy a PyTorch image classification model wrapped in Flask on g4dn.xlarge (4 vCPU, 16 GB RAM, T4 GPU with 16 GB memory) instances on AWS. For selecting the optimal number of workers I performed some experiments:
Experiment 1:
Concurrent Requests: 1
Total Time To Process 15 Requests By A Client: 15.87s (`model.forward` takes 14.98s)

Experiment 2:
Concurrent Requests: 2 (2 clients sending requests in parallel)
Total Time To Process 15 Requests By A Client: 29.35s (`model.forward` takes 28.34s, 2x of a single request, every other step taking a similar time)

Experiment 3:
Concurrent Requests: 3 (3 clients sending requests in parallel)
Total Time To Process 15 Requests By A Client: 43.82s (`model.forward` takes 41.81s, 3x of a single request, every other step taking a similar time)

Using 3 workers lets me process 3 requests in parallel, but the overall processing time of those requests also becomes 3x, so there is no improvement in real terms.
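The pattern in the experiments above can be reproduced with a pure-Python sketch (no GPU or PyTorch required; the lock and sleep below are hypothetical stand-ins for the single T4 and one `model.forward` call): if every worker's forward pass contends for one shared device, the passes serialize, so with N concurrent clients each request takes roughly N times the single-client latency and total throughput stays flat.

```python
# Simulate N concurrent clients whose "forward passes" must all go
# through one shared resource (a stand-in for the single GPU).
import threading
import time
from concurrent.futures import ThreadPoolExecutor

gpu = threading.Lock()       # stand-in for the single T4 GPU
FORWARD_TIME = 0.05          # stand-in for one model.forward call (seconds)

def forward():
    # Only one "forward pass" can hold the device at a time,
    # so concurrent calls serialize just like in the experiments.
    with gpu:
        time.sleep(FORWARD_TIME)

def client(n_requests):
    # One client sending n_requests sequentially; returns its wall time.
    start = time.perf_counter()
    for _ in range(n_requests):
        forward()
    return time.perf_counter() - start

for n_clients in (1, 2, 3):
    with ThreadPoolExecutor(n_clients) as pool:
        times = list(pool.map(client, [5] * n_clients))
    # Per-client wall time grows roughly linearly with n_clients,
    # while total requests completed per second stays flat.
    print(f"{n_clients} concurrent client(s): "
          f"{max(times):.2f}s for 5 requests each")
```

This is only a model of the observed behavior, not a claim about the root cause; it just shows that the 2x/3x latency numbers are exactly what full serialization of `model.forward` on one device would produce.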
I initially thought CPU or I/O was the bottleneck in the app, but after intensively logging the time taken at each step, I found that the bottleneck is the GPU processing (`model.forward` starts taking 2x-3x as long). By checking the process IDs of the workers for each request, I can also confirm that all the workers are receiving requests in parallel, but they are not able to perform the GPU processing in parallel at the same time.
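The per-step logging described above can be sketched like this (pure Python; `handle_request` and the sleeps are hypothetical stand-ins for the real preprocessing, `model.forward`, and postprocessing stages): each stage is wrapped in a timer that also records the process ID, which is how one can tell which gunicorn worker handled a request and where its time went.

```python
# Per-stage timing with the worker PID attached to every log line,
# so logs from multiple gunicorn workers can be told apart.
import os
import time
from contextlib import contextmanager

@contextmanager
def timed(stage):
    # Times the enclosed block and prints the elapsed time with the
    # current process ID (each gunicorn worker is its own process).
    start = time.perf_counter()
    yield
    elapsed = time.perf_counter() - start
    print(f"[pid {os.getpid()}] {stage}: {elapsed:.3f}s")

def handle_request():
    with timed("preprocess"):
        time.sleep(0.01)     # stand-in for image decoding/resizing
    with timed("model.forward"):
        time.sleep(0.05)     # stand-in for the GPU forward pass
    with timed("postprocess"):
        time.sleep(0.01)     # stand-in for building the response

handle_request()
```

With this kind of logging, seeing different PIDs with overlapping timestamps but a `model.forward` duration that grows with concurrency is what points the finger at the GPU rather than at CPU or I/O.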
Any guidance on what the bottleneck might be would be very helpful.
Also, is there a recommended worker type for this kind of GPU-dependent processing?