Part 2: Deploy Flask API in production using WSGI gunicorn with nginx reverse proxy
Upasana | August 31, 2019 | 7 min read | Flask - Python micro web framework
This concise tutorial will walk you through taking a Flask REST API from development to production. It is a continuation of the Part 1 article, where we discussed creating the Flask REST API.
Part 2. WSGI Gunicorn Setup and Nginx Reverse Proxy Setup

What will you learn?

- Production readiness & deployment - using WSGI + gunicorn
  - Why WSGI gunicorn?
  - Creating a systemd service for gunicorn
- Nginx setup as the reverse proxy
  - Why use nginx?
  - Configuring nginx
  - Configuring load balancing using nginx
- HTTP benchmarking using h2load
Production readiness & deployment - using WSGI + gunicorn
Why use WSGI Gunicorn ?
When we run a Flask app using the inbuilt development server, we get the below warning on the console:
Fig. development server warning
Flask's official documentation advises against using the inbuilt Flask server in a production deployment.
While lightweight and easy to use, Flask's built-in server is not suitable for production, as it doesn't scale well and by default serves only one request at a time.
Flask ships with Werkzeug's WSGI development server, which has the following issues:

- The inbuilt development server doesn't scale well.
- If you leave debug mode on and an error occurs, it opens a shell that allows arbitrary code to be executed on your server.
- It handles only one request at a time by default.
The right way to run a Flask app in production is to use a production-grade WSGI server, such as gunicorn.
What is WSGI?
WSGI is not a server, a Python module, or a framework. Rather, it is just an interface specification by which server and application communicate. Both the server and application sides of the interface are described in detail by PEP 3333.
A WSGI-compliant server only receives the request from the client, passes it to the application, and then sends the response returned by the application back to the client. That's all; it does nothing else.
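To make the specification concrete, here is a minimal sketch of the application side of the interface: any callable that accepts the WSGI environ dict and a start_response callable, and returns an iterable of byte strings, is a valid WSGI application.

```python
# A minimal WSGI application per PEP 3333: the server calls this with the
# request environ and a start_response callable; the application declares
# the status and headers via start_response and returns the body bytes.
def application(environ, start_response):
    body = b"Hello from WSGI\n"
    status = "200 OK"
    headers = [("Content-Type", "text/plain"),
               ("Content-Length", str(len(body)))]
    start_response(status, headers)
    return [body]
```

Any WSGI-compliant server, gunicorn included, can serve this callable directly; the stdlib `wsgiref` reference server can also run it for a quick local check.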
What is gunicorn?
Gunicorn ('Green Unicorn') is a production-ready, WSGI-compatible application server: a Python WSGI HTTP server for UNIX that uses a pre-fork worker model. The gunicorn server is broadly compatible with various web frameworks, simply implemented, light on server resources, and fairly speedy.
gunicorn setup
Install gunicorn using the below command:
$ pip install gunicorn
Now let's create a separate entry point for the WSGI app for our tutorial, in src/wsgi.py:

# src/wsgi.py
from src.main import app

if __name__ == "__main__":
    app.run()
Run the gunicorn from command line for testing:
$ gunicorn -w 4 -b 127.0.0.1:8000 src.wsgi:app
This will spawn 4 worker processes for the gunicorn server, as we can see in the logs:
[2019-04-03 09:02:55 +0530] [11333] [INFO] Starting gunicorn 19.9.0
[2019-04-03 09:02:55 +0530] [11333] [INFO] Listening at: http://127.0.0.1:8000 (11333)
[2019-04-03 09:02:55 +0530] [11333] [INFO] Using worker: sync
[2019-04-03 09:02:55 +0530] [11338] [INFO] Booting worker with pid: 11338
[2019-04-03 09:02:55 +0530] [11339] [INFO] Booting worker with pid: 11339
[2019-04-03 09:02:55 +0530] [11342] [INFO] Booting worker with pid: 11342
[2019-04-03 09:02:55 +0530] [11343] [INFO] Booting worker with pid: 11343
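Instead of passing flags on the command line, gunicorn can also read its settings from a Python config file passed with `-c`. A hypothetical gunicorn.conf.py equivalent to the flags above might look like this (the worker-count heuristic is a common rule of thumb, not a requirement):

```python
# gunicorn.conf.py -- start with: gunicorn -c gunicorn.conf.py src.wsgi:app
import multiprocessing

bind = "127.0.0.1:8000"                        # same as -b 127.0.0.1:8000
workers = multiprocessing.cpu_count() * 2 + 1  # common sizing heuristic
worker_class = "sync"                          # gunicorn's default worker type
accesslog = "-"                                # write access logs to stdout
```

Keeping the settings in a file makes them easy to version-control alongside the application.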
Systemd Service
We can create a systemd service for our Flask application, so that the app server starts automatically upon system reboot.
We need to create a unit file with the .service extension within the /etc/systemd/system directory:
$ sudo vi /etc/systemd/system/wsgi-app.service
Here is the content for this service unit file:
[Unit]
Description=Gunicorn instance to serve flask app
After=network.target
[Service]
User=ubuntu
Group=www-data
WorkingDirectory=/home/ubuntu/wsgi-app
Environment="PATH=/home/ubuntu/wsgi-app/venv/bin"
ExecStart=/home/ubuntu/wsgi-app/venv/bin/gunicorn --workers 3 --bind unix:wsgi-app.sock -m 007 src.wsgi:app
[Install]
WantedBy=multi-user.target
Now our systemd service file is complete and we can save and close it.
$ sudo systemctl daemon-reload
This command reloads the systemd manager configuration, re-reads all unit files, and recreates the entire dependency tree.
$ sudo systemctl start wsgi-app
$ sudo systemctl enable wsgi-app
$ sudo systemctl status wsgi-app
Configure Ubuntu Firewall
We need to open port 5000 in the Ubuntu firewall to allow traffic from the outside world (assuming gunicorn is bound to TCP port 5000 rather than a unix socket):
$ sudo ufw allow 5000
Nginx Setup & Configuration
What is nginx
nginx is a front facing web server that most commonly acts as a reverse proxy for an application server.
Any non-trivial production setup for flask may look like this:
Fig. Nginx as front facing web server
Why use nginx on top of wsgi gunicorn?
Regardless of the app server in use (gunicorn, mod_wsgi, etc.), any production deployment will have something like nginx configured in front as a reverse proxy, for various reasons, for example:
- nginx can handle requests that gunicorn should not be handling, such as serving static files (CSS assets, JS bundles, images); only the dynamic requests are passed on to the gunicorn application server. More importantly, nginx can easily cache these static files and boost performance.
- nginx can act as a load balancer, evenly routing requests across multiple instances of gunicorn in round-robin fashion.
- It is easy to configure nginx for request throttling, API rate limiting, and blocking potentially unwanted calls from insecure origins.
- gunicorn does not need to worry about slow clients, since nginx takes care of that complex part, keeping gunicorn's processing model embarrassingly simple. Slow clients can otherwise make your application simply stop handling new requests.
- SSL/TLS and HTTP/2 can be configured at the nginx level, since nginx is the only front-facing web server exposed to the internet. As gunicorn is never exposed to the internet, all internal communication can happen over plain HTTP/1.1, without thinking about security. Additionally, nginx can optimize SSL/TLS with session caching, session tickets, etc.
- GZIP compression can be handled at the nginx level, which reduces network bandwidth requirements for clients.
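As a concrete sketch of the rate-limiting and GZIP points above, an nginx http block might contain something like the following (the zone name, rate, and paths are illustrative values, not a recommendation):

```nginx
http {
    # GZIP compression for common text-based responses
    gzip on;
    gzip_types text/plain text/css application/json application/javascript;

    # Allow each client IP roughly 10 requests/second, with small bursts
    limit_req_zone $binary_remote_addr zone=api_limit:10m rate=10r/s;

    server {
        location /api/ {
            limit_req zone=api_limit burst=20 nodelay;
            proxy_pass http://127.0.0.1:5000/;
        }
    }
}
```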
Installation
Installing nginx on Linux is easy; you just need root privileges on the VPS, then run the below command:
$ sudo apt-get install nginx
On macOS, it can be installed using Homebrew:
$ brew install nginx
To restart nginx and edit its configuration:
$ sudo nginx -s stop
$ sudo nginx
$ vim /usr/local/etc/nginx/nginx.conf
Configuration
Configuring nginx as the reverse proxy in front of our WSGI deployment is very simple. We just need to add the below configuration to our nginx server:
server {
listen 80;
listen [::]:80;
server_name www.carvia.io carvia.io;
access_log /var/log/nginx/wsgi-app.access.log;
error_log /var/log/nginx/wsgi-app.error.log;
location / {
proxy_pass http://127.0.0.1:5000/; (1)
proxy_redirect off;
proxy_set_header Host $http_host; (2)
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
proxy_set_header X-Forwarded-Proto $scheme;
}
}
(1) 5000 is the port on which the gunicorn WSGI server is running.
(2) We need to configure the proxy server to pass these headers, especially $http_host and $remote_addr, to make the WSGI server work properly behind the reverse proxy.
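On the application side, Flask can be told to trust these forwarded headers via Werkzeug's ProxyFix middleware, so that request.remote_addr and request.scheme reflect the original client rather than nginx. A minimal sketch (the /whoami route is purely illustrative):

```python
from flask import Flask, request
from werkzeug.middleware.proxy_fix import ProxyFix

app = Flask(__name__)
# Trust one hop of X-Forwarded-For / X-Forwarded-Proto / X-Forwarded-Host,
# i.e. the values set by our single nginx reverse proxy
app.wsgi_app = ProxyFix(app.wsgi_app, x_for=1, x_proto=1, x_host=1)

@app.route("/whoami")
def whoami():
    # With ProxyFix applied, these reflect the original client, not nginx
    return {"client_ip": request.remote_addr, "scheme": request.scheme}
```

Only enable ProxyFix when the app actually sits behind a trusted proxy; otherwise clients could spoof these headers.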
Restart the nginx web server:
$ sudo nginx -t
$ sudo service nginx restart
Using nginx as load balancer for multiple wsgi gunicorn instances
Fig. Nginx as load balancer
$ sudo nano /etc/nginx/sites-available/default
upstream backend {
server 127.0.0.1:5000;
server 127.0.0.1:5001;
}
server {
location / {
proxy_pass http://backend;
}
}
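Round robin is nginx's default balancing method; the upstream block above can also be tuned with other methods and per-server weights. A sketch (the third instance and the weight values are illustrative):

```nginx
upstream backend {
    least_conn;                        # route to the instance with fewest active connections
    server 127.0.0.1:5000 weight=2;    # receives roughly twice the share of requests
    server 127.0.0.1:5001;
    server 127.0.0.1:5002 backup;      # used only when the other instances are down
}
```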
Test the configuration and restart the nginx server:
$ sudo nginx -t
$ sudo service nginx restart
HTTP Server Benchmarking using h2load
h2load is a modern HTTP benchmarking tool, often used to benchmark server capabilities for a given deployment. We can use h2load to send 10,000 requests over 10 concurrent connections to our recently deployed server:
$ h2load -n10000 -c10 -m1 --h1 http://localhost:5000/health.json
Typical results on my machine look like below:
finished in 3.08s, 3246.23 req/s, 155.21KB/s
requests: 10000 total, 10000 started, 10000 done, 10000 succeeded, 0 failed, 0 errored, 0 timeout
status codes: 10000 2xx, 0 3xx, 0 4xx, 0 5xx
traffic: 478.12KB (489592) total, 1.08MB (1130000) headers (space savings 0.00%), 156.25KB (160000) data
min max mean sd +/- sd
time for request: 794us 78.51ms 2.94ms 3.12ms 99.63%
time for connect: 78us 170us 116us 26us 70.00%
time to 1st byte: 898us 2.43ms 1.69ms 628us 50.00%
req/s : 324.64 326.36 325.31 0.63 70.00%
You can continue to Part 3. Docker, Jenkins and CI/CD setup