Deploying Keras Model in Production with TensorFlow 2.0
Upasana | October 24, 2019
In this article, we discuss how to build a REST API over a saved Keras model in TF 2.0 and deploy it to production using Flask and Gunicorn/WSGI.
If you are looking for TensorFlow 1.x support, refer to this article.
Introduction
We will take the example of a mood detection model built with NLTK and Keras in Python. After training a deep learning model in Keras, we still need something around it to test its results. For a demo, we cannot show raw probabilities (the model's output); we have to present interactive results that someone without a machine learning background can also understand.
Keras
Keras is an open-source neural-network library written in Python. It is capable of running on top of TensorFlow, Microsoft Cognitive Toolkit, Theano, and PlaidML. It is designed to enable fast experimentation with deep neural networks, and focuses on being user-friendly, modular, and extensible.
NLTK
NLTK is a leading platform for building Python programs to work with human language data. It provides easy-to-use interfaces to over 50 corpora and lexical resources such as WordNet, along with a suite of text processing libraries for classification, tokenization, stemming, tagging, parsing, and semantic reasoning, and wrappers for industrial-strength NLP libraries.
NLTK has now added support for Indian languages as well.
Flask
Flask is a micro web framework written in Python, frequently used by developers to create simple REST endpoints.
We will create one Python script that defines the REST endpoints in a Flask application, and keep the service classes in the service folder.
Mood detection model
This model was built on 182,689 observations labelled with seven emotion categories: anger, disgust, joy, sadness, shame, guilt and fear. The model is based on a bidirectional LSTM and was trained for only 50 epochs. Since the data was not normalized beforehand, in order to retain its pattern, a BatchNormalization layer was also used in the model. Below are the recall scores of the model on the test data:
- Anger: 0.72
- Disgust: 0.68
- Fear: 0.96
- Guilt: 0.63
- Joy: 0.92
- Sad: 0.94
- Shame: 0.81
Directory structure
Our directory structure is as follows:
In the src folder, we have two directories and main.py to start the Flask app.
- Directory mood-saved-models contains the saved Keras models and the saved tokenizer in pickle format.
- Directory service contains the service scripts as .py files.
Text pre-processing
Before training deep learning models on textual data, we usually apply a few transformations to clean the data and convert it into vector format. This process is generally known as text pre-processing.
Since we perform these steps on the training data, we must perform the same steps on the test data, and on any text sent for prediction.
We will now build a service that pre-processes the text before sending it to the model for prediction.
def tweet_preprocessing(self, text):
    eyes = r"[8:=;]"
    nose = r"['`-]?"

    # Helper that always applies the pattern to the current value of text
    def re_sub(pattern, repl):
        return re.sub(pattern, repl, text, flags=self.FLAGS)

    text = re_sub(r"https?:\/\/\S+\b|www\.(\w+\.)+\S*", " ")
    text = re_sub(r"@\w+", "user")
    text = re_sub(r"{}{}[)dD]+|[)dD]+{}{}".format(eyes, nose, nose, eyes), "smile")
    text = re_sub(r"{}{}p+".format(eyes, nose), "laugh")
    text = re_sub(r"{}{}\(+|\)+{}{}".format(eyes, nose, nose, eyes), "sad")
    text = re_sub(r"{}{}[\/|l*]".format(eyes, nose), "neutral")
    text = re_sub(r"/", " / ")
    text = re_sub(r"<3", "love")
    text = re_sub(r"[-+]?[.\d]*[\d]+[:,.\d]*", " ")
    text = re_sub(r"#\S+", self.hashtag)
    text = re_sub(r"([!?.]){2,}", r"\1 repeat")
    text = re_sub(r"\b(\S*?)(.)\2{2,}\b", r"\1\2 <elong>")
    text = re_sub(r"([A-Z]){2,}", self.allcaps)
    return text.lower()
We will use this method to clean the text. It involves:
- Removing URLs and numbers, and marking repeated punctuation and elongated words
- Converting smileys to text
- Extracting text from hashtags
We can also add a spell corrector to take care of typos. The enchant library can be used to correct the spelling of words. Try installing it with pip install pyenchant. This should work on macOS and Ubuntu; it has not been verified on Windows.
So now, the whole class looks like this:
import re


class TextPreprocessing(object):
    def __init__(self):
        self.FLAGS = re.MULTILINE | re.DOTALL

    def hashtag(self, text):
        # Called by re.sub with a match object for each "#something" token
        text = text.group()
        hashtag_body = text[1:]
        if hashtag_body.isupper():
            result = " {} ".format(hashtag_body.lower())
        else:
            # Split CamelCase hashtags into words by adding a space before capitals
            result = " ".join([""] + [re.sub(r"([A-Z])", r" \1", hashtag_body, flags=self.FLAGS)])
        return result

    def allcaps(self, text):
        # Called by re.sub for each run of two or more capital letters
        text = text.group()
        return text.lower() + " "

    def re_sub(self, pattern, repl, text):
        return re.sub(pattern, repl, text, flags=self.FLAGS)

    def tweet_preprocessing(self, text):
        eyes = r"[8:=;]"
        nose = r"['`-]?"

        # Helper that always applies the pattern to the current value of text
        def re_sub(pattern, repl):
            return re.sub(pattern, repl, text, flags=self.FLAGS)

        text = re_sub(r"https?:\/\/\S+\b|www\.(\w+\.)+\S*", " ")
        text = re_sub(r"@\w+", "user")
        text = re_sub(r"{}{}[)dD]+|[)dD]+{}{}".format(eyes, nose, nose, eyes), "smile")
        text = re_sub(r"{}{}p+".format(eyes, nose), "laugh")
        text = re_sub(r"{}{}\(+|\)+{}{}".format(eyes, nose, nose, eyes), "sad")
        text = re_sub(r"{}{}[\/|l*]".format(eyes, nose), "neutral")
        text = re_sub(r"/", " / ")
        text = re_sub(r"<3", "love")
        text = re_sub(r"[-+]?[.\d]*[\d]+[:,.\d]*", " ")
        text = re_sub(r"#\S+", self.hashtag)
        text = re_sub(r"([!?.]){2,}", r"\1 repeat")
        text = re_sub(r"\b(\S*?)(.)\2{2,}\b", r"\1\2 <elong>")
        text = re_sub(r"([A-Z]){2,}", self.allcaps)
        return text.lower()
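To see what these substitutions do, here is a minimal, self-contained sketch that applies a handful of them, in the same order, to one sample sentence. The function name normalize and the sample text are made up for illustration; the regular expressions are copied from the class above.

```python
import re

FLAGS = re.MULTILINE | re.DOTALL
EYES = r"[8:=;]"
NOSE = r"['`-]?"


def normalize(text):
    """Apply a small subset of the cleaning rules, in their original order."""
    # Drop URLs
    text = re.sub(r"https?:\/\/\S+\b|www\.(\w+\.)+\S*", " ", text, flags=FLAGS)
    # Replace @mentions with a generic token
    text = re.sub(r"@\w+", "user", text, flags=FLAGS)
    # Turn smiling emoticons such as :) or :-D into the word "smile"
    smile = r"{}{}[)dD]+|[)dD]+{}{}".format(EYES, NOSE, NOSE, EYES)
    text = re.sub(smile, "smile", text, flags=FLAGS)
    # Turn the heart emoticon into the word "love"
    text = re.sub(r"<3", "love", text, flags=FLAGS)
    # Collapse repeated punctuation into one mark plus the word "repeat"
    text = re.sub(r"([!?.]){2,}", r"\1 repeat", text, flags=FLAGS)
    return text.lower()


print(normalize("@Anna I <3 this!!! :)"))
# user i love this! repeat smile
```

The full tweet_preprocessing method applies more rules (numbers, hashtags, elongated words, all-caps words), so its output on the same input may differ from this reduced version.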
Now we need a service that loads the saved Keras model and exposes it for prediction. Saved deep learning models are usually large, and some of them take a while to load, so we shall implement the service in such a way that the model is not loaded on every call to the endpoint.
To achieve this, we will use the singleton design pattern.
import pickle

import tensorflow as tf


class SentimentService(object):
    # Cached at class level so they are loaded only once per process
    model1 = None
    tokenizer = None

    @classmethod
    def load_deep_model(cls, model):
        # Load a saved Keras model (HDF5 file) by its file name without extension
        loaded_model = tf.keras.models.load_model("./src/mood-saved-models/" + model + ".h5")
        return loaded_model

    @classmethod
    def get_model1(cls):
        # Load the model on first use, then reuse the cached instance
        if cls.model1 is None:
            cls.model1 = cls.load_deep_model('model5_ver1')
        return cls.model1

    @classmethod
    def load_tokenizer(cls):
        # Load the pickled tokenizer on first use, then reuse the cached instance
        if cls.tokenizer is None:
            with open('./src/mood-saved-models/tokenizer.pickle', 'rb') as handle:
                cls.tokenizer = pickle.load(handle)
        return cls.tokenizer
get_model1 loads the saved model on first use and caches it; load_tokenizer does the same for the saved tokenizer.
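The load-once behaviour can be checked in isolation with a stand-in for the expensive load. The sketch below is an illustration only: LazyResource, _load and load_count are hypothetical names, the dictionary stands in for a real model, and the lock is an addition that the service above does not have (it may be useful if the server handles requests in several threads).

```python
import threading


class LazyResource:
    """Load an expensive object once and reuse it on later calls."""

    _instance = None
    _lock = threading.Lock()
    load_count = 0  # counts how often the expensive load actually ran

    @classmethod
    def _load(cls):
        # Stand-in for an expensive call such as loading a saved model from disk
        cls.load_count += 1
        return {"name": "dummy-model"}

    @classmethod
    def get(cls):
        # Fast path: already loaded
        if cls._instance is None:
            with cls._lock:
                # Re-check inside the lock so two threads cannot both load
                if cls._instance is None:
                    cls._instance = cls._load()
        return cls._instance


first = LazyResource.get()
second = LazyResource.get()
print(first is second, LazyResource.load_count)
# True 1
```

Note that a class-level cache lives inside one Python process. If the app is served by several worker processes, each process will typically load its own copy of the model.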
Now, we need to build the endpoints which will use these services. We will build three endpoints:
- Health check, to check whether the Flask service is running
- Get the structure and parameters of the saved model
- Get a prediction from the model
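import json
import pandas as pd
import tensorflow as tf
from flask import Flask, Response, abort, request
from tensorflow.keras.preprocessing.sequence import pad_sequences

# TextPreprocessing and SentimentService are the classes shown above;
# import them from your modules in the service folder.
app = Flask(__name__)

@app.route("/health", methods=["GET"])
def health():
    return Response(json.dumps({"status": "UP"}), status=200, mimetype='application/json')

@app.route("/show_model", methods=["GET"])
def show_model():
    model = request.args.get("model", default=None, type=str)
    if model is None:
        abort(400)
    # Same base path as SentimentService, relative to the project root
    with open('./src/mood-saved-models/' + model + '.json') as handle:
        model_format = json.load(handle)
    return Response(json.dumps(model_format), status=200, mimetype='application/json')

@app.route('/mood-detect', methods=['POST'])
def model_predict():
    if not request.json or 'text' not in request.json:
        abort(400)
    tp = TextPreprocessing()
    sent = pd.Series(request.json['text'])
    new_sent = [tp.tweet_preprocessing(i) for i in sent]
    seq = SentimentService.load_tokenizer().texts_to_sequences(pd.Series(''.join(new_sent)))
    test = pad_sequences(seq, maxlen=256)
    another_strategy = tf.distribute.MirroredStrategy()
    with another_strategy.scope():
        model = SentimentService.get_model1()
        # predict returns the per-class outputs; predict_proba was removed in later TensorFlow releases
        res = model.predict(test, batch_size=32, verbose=0)
    lab_list = ['anger', 'disgust', 'fear', 'guilt', 'joy', 'sadness', 'shame']
    moods = {}
    for label, probability in zip(lab_list, res[0]):
        # Cast to a plain float so that json.dumps can serialise the value
        moods[label] = 100 * float(probability)
    return Response(json.dumps(moods), status=200, mimetype='application/json')

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=5000)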
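Between cleaning and prediction, the endpoint turns the text into a fixed-length list of word ids with texts_to_sequences and pad_sequences. The pure-Python sketch below imitates that step on a tiny scale, assuming the default Keras behaviour of padding and truncating at the start of the sequence. The vocabulary word_index and the helper pad_pre are hypothetical; the real service uses the pickled tokenizer and maxlen=256.

```python
def pad_pre(sequence, maxlen, value=0):
    """Left-pad (or left-truncate) one list of ids to exactly maxlen items.

    Assumes maxlen is a positive integer.
    """
    trimmed = sequence[-maxlen:]  # keep only the last maxlen ids
    return [value] * (maxlen - len(trimmed)) + trimmed


# Hypothetical vocabulary; a fitted tokenizer would provide the real mapping
word_index = {"great": 1, "i": 2, "am": 3, "liking": 4, "it": 5}

tokens = "great i am liking it".split()
# Words missing from the vocabulary are skipped in this sketch
ids = [word_index[t] for t in tokens if t in word_index]

print(pad_pre(ids, maxlen=8))
# [0, 0, 0, 1, 2, 3, 4, 5]
print(pad_pre(ids, maxlen=3))
# [3, 4, 5]
```

The padded length has to match the input length the model was trained with, which is why the endpoint hard-codes 256.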
Now we are ready to use this service to detect the mood of a text.
Run main.py and call the endpoints to get results.
$ python src/main.py
To get structure of model [GET]
GET http://0.0.0.0:5000/show_model?model=model5_ver1
{
"class_name": "Sequential",
"config": [
{
"class_name": "Embedding",
"config": {
"name": "embedding_2",
"trainable": false,
"batch_input_shape": [
null,
256
],
"dtype": "float32",
"input_dim": 57888,
"output_dim": 100,
"embeddings_initializer": {
"class_name": "RandomUniform",
"config": {
"minval": -0.05,
"maxval": 0.05,
"seed": null
}
},
"embeddings_regularizer": null,
"activity_regularizer": null,
"embeddings_constraint": null,
"mask_zero": false,
"input_length": 256
}
},
{
"class_name": "SpatialDropout1D",
"config": {
"name": "spatial_dropout1d_4",
"trainable": true,
"rate": 0.2,
"noise_shape": null,
"seed": null
}
},
{
"class_name": "Bidirectional",
"config": {
"name": "bidirectional_7",
"trainable": true,
"layer": {
"class_name": "LSTM",
"config": {
"name": "lstm_13",
"trainable": true,
"return_sequences": true,
"return_state": false,
"go_backwards": false,
"stateful": false,
"unroll": false,
"units": 128,
"activation": "tanh",
"recurrent_activation": "hard_sigmoid",
"use_bias": true,
"kernel_initializer": {
"class_name": "VarianceScaling",
"config": {
"scale": 1,
"mode": "fan_avg",
"distribution": "uniform",
"seed": null
}
},
"recurrent_initializer": {
"class_name": "Orthogonal",
"config": {
"gain": 1,
"seed": null
}
},
"bias_initializer": {
"class_name": "Zeros",
"config": {}
},
"unit_forget_bias": true,
"kernel_regularizer": null,
"recurrent_regularizer": null,
"bias_regularizer": null,
"activity_regularizer": null,
"kernel_constraint": null,
"recurrent_constraint": null,
"bias_constraint": null,
"dropout": 0.2,
"recurrent_dropout": 0.2,
"implementation": 1
}
},
"merge_mode": "concat"
}
},
{
"class_name": "BatchNormalization",
"config": {
"name": "batch_normalization_10",
"trainable": true,
"axis": -1,
"momentum": 0.99,
"epsilon": 0.001,
"center": true,
"scale": true,
"beta_initializer": {
"class_name": "Zeros",
"config": {}
},
"gamma_initializer": {
"class_name": "Ones",
"config": {}
},
"moving_mean_initializer": {
"class_name": "Zeros",
"config": {}
},
"moving_variance_initializer": {
"class_name": "Ones",
"config": {}
},
"beta_regularizer": null,
"gamma_regularizer": null,
"beta_constraint": null,
"gamma_constraint": null
}
},
{
"class_name": "Bidirectional",
"config": {
"name": "bidirectional_8",
"trainable": true,
"layer": {
"class_name": "LSTM",
"config": {
"name": "lstm_14",
"trainable": true,
"return_sequences": false,
"return_state": false,
"go_backwards": false,
"stateful": false,
"unroll": false,
"units": 128,
"activation": "tanh",
"recurrent_activation": "hard_sigmoid",
"use_bias": true,
"kernel_initializer": {
"class_name": "VarianceScaling",
"config": {
"scale": 1,
"mode": "fan_avg",
"distribution": "uniform",
"seed": null
}
},
"recurrent_initializer": {
"class_name": "Orthogonal",
"config": {
"gain": 1,
"seed": null
}
},
"bias_initializer": {
"class_name": "Zeros",
"config": {}
},
"unit_forget_bias": true,
"kernel_regularizer": null,
"recurrent_regularizer": null,
"bias_regularizer": null,
"activity_regularizer": null,
"kernel_constraint": null,
"recurrent_constraint": null,
"bias_constraint": null,
"dropout": 0.2,
"recurrent_dropout": 0.2,
"implementation": 1
}
},
"merge_mode": "concat"
}
},
{
"class_name": "Dense",
"config": {
"name": "dense_10",
"trainable": true,
"units": 7,
"activation": "sigmoid",
"use_bias": true,
"kernel_initializer": {
"class_name": "VarianceScaling",
"config": {
"scale": 1,
"mode": "fan_avg",
"distribution": "uniform",
"seed": null
}
},
"bias_initializer": {
"class_name": "Zeros",
"config": {}
},
"kernel_regularizer": null,
"bias_regularizer": null,
"activity_regularizer": null,
"kernel_constraint": null,
"bias_constraint": null
}
}
],
"keras_version": "2.2.2",
"backend": "tensorflow"
}
To get prediction [POST]
POST http://0.0.0.0:5000/mood-detect
Request body:
{
    "text": "great i am liking it"
}
Response:
{
    "anger": 7.112710922956467,
    "disgust": 3.1775277107954025,
    "fear": 12.434638291597366,
    "guilt": 2.8116755187511444,
    "joy": 56.977683305740356,
    "sadness": 13.96680623292923,
    "shame": 3.2702498137950897
}
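For a demo, the raw percentages are easier to read once they are ranked and rounded. A minimal sketch of such a post-processing step is shown here; top_moods is a hypothetical helper that is not part of the service, and the dictionary is the response shown above.

```python
def top_moods(moods, k=3):
    """Return the k highest-scoring moods as (label, rounded score) pairs."""
    ranked = sorted(moods.items(), key=lambda item: item[1], reverse=True)
    return [(label, round(score, 1)) for label, score in ranked[:k]]


# Response body returned by the /mood-detect endpoint in the example above
response = {
    "anger": 7.112710922956467,
    "disgust": 3.1775277107954025,
    "fear": 12.434638291597366,
    "guilt": 2.8116755187511444,
    "joy": 56.977683305740356,
    "sadness": 13.96680623292923,
    "shame": 3.2702498137950897,
}

print(top_moods(response, k=2))
# [('joy', 57.0), ('sadness', 14.0)]
```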
GitHub repository
The source code is available in the TensorFlow 2.0 repository on GitHub. You can clone the project and run it on your system.
Production deployment using WSGI
You can check out the three-part series of articles on production deployment of Flask endpoints.
Thanks for reading this article.