Hierarchical Text Classification in tf.keras
In this blog, we will learn to perform hierarchical text classification on a dataset. The dataset has three main columns: a text headline, a text description (a paragraph), and the text label. We will build a deep learning model in Keras with the TensorFlow backend.
The model diagram can be seen below:
Step 1: Text Preprocessing
We will be using our custom text preprocessing code for this. Initially, we will have to import the necessary libraries.
Now we will load the data, which I assume is a .csv file with three columns: headline, description, and label.
Let’s do the coding:
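A minimal sketch of the loading step, assuming pandas is available. The column names come from the description above; the two sample rows are made up for illustration, and in practice you would pass the path to your own .csv file instead of the in-memory file used here.

```python
import io

import pandas as pd


def load_dataset(source):
    """Read the three expected columns and replace missing text with ''."""
    df = pd.read_csv(source, usecols=["headline", "description", "label"])
    text_cols = ["headline", "description"]
    df[text_cols] = df[text_cols].fillna("")
    return df


# Made-up rows; an in-memory file keeps the sketch self-contained.
sample = io.StringIO(
    "headline,description,label\n"
    "Rates rise again,The central bank raised rates. Markets fell.,business\n"
    "Cup final tonight,,sport\n"
)
df = load_dataset(sample)
print(df.shape)
```

Filling missing text with empty strings is a design choice made here so that later string operations do not fail on NaN values.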
To prepare data that can be fed to the model, we need to preprocess the text (leaving the full stops and question marks in place, as they can be used to separate sentences from each other). Since we will focus mainly on the model, we will skip the preprocessing code.
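The full preprocessing code is skipped here, but the one rule mentioned above (keep full stops and question marks so sentences can still be separated) can be sketched as follows. The function names `clean_text` and `split_sentences` are hypothetical, and the exact cleaning rules (lowercasing, dropping all other punctuation) are assumptions rather than the original pipeline.

```python
import re


def clean_text(text):
    """Lowercase, drop punctuation except '.' and '?', squeeze whitespace."""
    text = text.lower()
    text = re.sub(r"[^a-z0-9.?\s]", " ", text)
    # Put the sentence terminators in their own tokens.
    text = re.sub(r"([.?])", r" \1 ", text)
    return re.sub(r"\s+", " ", text).strip()


def split_sentences(text):
    """Split cleaned text into sentences at '.' and '?'."""
    parts = re.split(r"[.?]", text)
    return [p.strip() for p in parts if p.strip()]


cleaned = clean_text("Hello, World! Is this OK? Yes.")
print(cleaned)                   # hello world is this ok ? yes .
print(split_sentences(cleaned))  # ['hello world is this ok', 'yes']
```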
We will be using GloVe 300d word embeddings to represent the words in the data. Since our data can contain words that are not in the GloVe vocabulary, we will append those to the existing vocabulary and assign them random word embeddings drawn from a Gaussian distribution. We will then tokenize our headline and description according to a word2id dictionary built from the vocab, and the final data is prepared. Everything discussed so far is just a few loops, so we will skip this part and move on to creating the model.
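Those loops might look roughly like the sketch below. It assumes the GloVe vectors are already loaded into a NumPy matrix whose row i belongs to the word with id i, and that id 0 is reserved for padding. The helper names, the standard deviation of 0.1, and the choice to drop unknown words at encoding time are all assumptions made for illustration.

```python
import numpy as np


def extend_vocab(word2id, embeddings, new_words, scale=0.1, seed=0):
    """Append unseen words and give each a random Gaussian embedding."""
    rng = np.random.default_rng(seed)
    word2id = dict(word2id)  # do not mutate the caller's dictionary
    unseen = [w for w in dict.fromkeys(new_words) if w not in word2id]
    for w in unseen:
        word2id[w] = len(word2id)
    extra = rng.normal(0.0, scale, size=(len(unseen), embeddings.shape[1]))
    return word2id, np.vstack([embeddings, extra]).astype("float32")


def encode(tokens, word2id, max_len, pad_id=0):
    """Map tokens to ids, drop unknown tokens, then truncate or pad."""
    ids = [word2id[t] for t in tokens if t in word2id][:max_len]
    return ids + [pad_id] * (max_len - len(ids))


# Tiny stand-in for the real GloVe vocabulary and 300d matrix.
word2id = {"<pad>": 0, "the": 1}
embeddings = np.zeros((2, 4), dtype="float32")

word2id, embeddings = extend_vocab(word2id, embeddings, ["the", "keras", "keras", "tf"])
print(embeddings.shape)                            # (4, 4)
print(encode(["the", "tf", "zzz"], word2id, 4))    # [1, 3, 0, 0]
```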
Step 2: Model
We will now load the newly formed word vocab and word embeddings in our code.
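One way to do this is sketched below, assuming the vocab was pickled and the embedding matrix was saved with NumPy. The file names `word2id.pkl` and `embeddings.npy` are hypothetical, and the save helper is only there so the example can run on its own.

```python
import pickle
import tempfile
from pathlib import Path

import numpy as np


def save_artifacts(directory, word2id, embedding_matrix):
    """Write the vocab and embedding matrix (hypothetical file names)."""
    directory = Path(directory)
    with open(directory / "word2id.pkl", "wb") as f:
        pickle.dump(word2id, f)
    np.save(directory / "embeddings.npy", embedding_matrix)


def load_artifacts(directory):
    """Load the vocab and embedding matrix and check that they line up."""
    directory = Path(directory)
    with open(directory / "word2id.pkl", "rb") as f:
        word2id = pickle.load(f)
    embedding_matrix = np.load(directory / "embeddings.npy")
    if embedding_matrix.shape[0] != len(word2id):
        raise ValueError("expected one embedding row per vocabulary word")
    return word2id, embedding_matrix


# Round trip through a temporary directory with a tiny stand-in vocab.
with tempfile.TemporaryDirectory() as tmp:
    save_artifacts(tmp, {"<pad>": 0, "the": 1}, np.zeros((2, 300), dtype="float32"))
    word2id, embedding_matrix = load_artifacts(tmp)

print(len(word2id), embedding_matrix.shape)
```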
Our first layer will be an embedding layer whose initial weights are the pre-trained embedding matrix. We will set the trainable parameter to True so that the word embeddings keep getting updated as training goes on.
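A minimal sketch of such a layer in tf.keras is shown below. The function name `build_embedding_layer` is hypothetical, and the small random matrix stands in for the real embedding matrix. The weights are passed through a constant initializer, which is one of several ways to do this.

```python
import numpy as np
import tensorflow as tf


def build_embedding_layer(embedding_matrix, trainable=True):
    """Embedding layer initialised from a pre-trained matrix."""
    vocab_size, dim = embedding_matrix.shape
    return tf.keras.layers.Embedding(
        input_dim=vocab_size,
        output_dim=dim,
        embeddings_initializer=tf.keras.initializers.Constant(embedding_matrix),
        trainable=trainable,  # True: embeddings are fine-tuned during training
        name="word_embedding",
    )


# Stand-in for the real (vocab_size, 300) matrix.
matrix = np.random.default_rng(0).normal(size=(10, 4)).astype("float32")
embedding = build_embedding_layer(matrix)

vectors = embedding(np.array([[1, 2, 0]]))  # one sequence of three word ids
print(vectors.shape)
```

Setting `trainable=False` instead would freeze the pre-trained vectors, which can be worth trying on small datasets.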
Now, for each sentence in the description, we need to have a sentence encoder model.
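The sketch below shows one way to structure this: a small sentence encoder model is applied to every sentence of the description with `TimeDistributed`, and a second recurrent layer turns the sentence vectors into a single description vector. The bidirectional GRUs, the average pooling over words, the unit sizes, and the function names are assumptions made here; the 3HAN model uses attention where this sketch uses simpler pooling.

```python
import numpy as np
import tensorflow as tf

layers = tf.keras.layers


def build_sentence_encoder(embedding_matrix, max_words, units=64):
    """Encode one sentence (a sequence of word ids) into a single vector."""
    vocab_size, dim = embedding_matrix.shape
    words = layers.Input(shape=(max_words,), dtype="int32", name="sentence_words")
    x = layers.Embedding(
        vocab_size,
        dim,
        embeddings_initializer=tf.keras.initializers.Constant(embedding_matrix),
        trainable=True,
    )(words)
    x = layers.Bidirectional(layers.GRU(units, return_sequences=True))(x)
    x = layers.GlobalAveragePooling1D()(x)  # (batch, 2 * units)
    return tf.keras.Model(words, x, name="sentence_encoder")


def build_description_encoder(sentence_encoder, max_sentences, max_words, units=64):
    """Apply the sentence encoder to each sentence, then encode the sequence."""
    description = layers.Input(
        shape=(max_sentences, max_words), dtype="int32", name="description"
    )
    # (batch, max_sentences, 2 * units): one vector per sentence
    sentences = layers.TimeDistributed(sentence_encoder)(description)
    # (batch, 2 * units): one vector per description
    vector = layers.Bidirectional(layers.GRU(units))(sentences)
    return tf.keras.Model(description, vector, name="description_encoder")


# Stand-in embedding matrix and a dummy batch of 2 descriptions,
# each with 3 sentences of 5 word ids.
matrix = np.random.default_rng(0).normal(size=(20, 8)).astype("float32")
sentence_encoder = build_sentence_encoder(matrix, max_words=5, units=4)
description_encoder = build_description_encoder(sentence_encoder, 3, 5, units=4)

batch = np.zeros((2, 3, 5), dtype="int32")
description_vectors = description_encoder(batch)
print(description_vectors.shape)
```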
At this point, we have a vector for each description. We need to concatenate it with the headline embedding sequence, and then we will build our final model, which predicts the class probabilities.
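A sketch of this last stage is given below. To keep it self-contained, the description vector is a plain model input here, standing in for the output of the description encoder. Projecting that vector to the embedding size before appending it to the headline sequence, the bidirectional GRU on top, and the integer-label loss are assumptions, and `build_classifier` is a hypothetical name.

```python
import numpy as np
import tensorflow as tf

layers = tf.keras.layers


def build_classifier(embedding_matrix, max_headline_words, desc_dim, num_classes, units=64):
    """Combine headline word embeddings with a description vector and classify."""
    vocab_size, dim = embedding_matrix.shape
    headline = layers.Input(shape=(max_headline_words,), dtype="int32", name="headline")
    desc_vec = layers.Input(shape=(desc_dim,), name="description_vector")

    h = layers.Embedding(
        vocab_size,
        dim,
        embeddings_initializer=tf.keras.initializers.Constant(embedding_matrix),
        trainable=True,
    )(headline)                                          # (batch, H, dim)
    d = layers.Dense(dim, activation="tanh")(desc_vec)   # project to embedding size
    d = layers.Reshape((1, dim))(d)                      # (batch, 1, dim)
    seq = layers.Concatenate(axis=1)([h, d])             # (batch, H + 1, dim)

    x = layers.Bidirectional(layers.GRU(units))(seq)
    probs = layers.Dense(num_classes, activation="softmax", name="class_probs")(x)

    model = tf.keras.Model([headline, desc_vec], probs, name="headline_description_classifier")
    # Assumes the labels are encoded as integers 0..num_classes-1.
    model.compile(optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"])
    return model


# Stand-in embedding matrix and a dummy batch of 2 examples.
matrix = np.random.default_rng(0).normal(size=(20, 8)).astype("float32")
model = build_classifier(matrix, max_headline_words=6, desc_dim=8, num_classes=3, units=4)

probs = model([np.zeros((2, 6), dtype="int32"), np.zeros((2, 8), dtype="float32")])
print(probs.shape)
```

In a full model, the `description_vector` input would be replaced by the description encoder's output so that everything trains end to end.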
That’s it: we have just created a hierarchical deep learning model in Keras. I will soon share a link to the full code, including the pre-processing steps. If you want to see the code in TensorFlow, it can be found here.
The code is an implementation of the 3HAN model.
I hope you enjoyed the blog. Thank you 😊