Hierarchical Text Classification in Tf.Keras

In this blog, we will learn to perform hierarchical text classification on a dataset. The dataset contains mainly three columns: text headline, text description which will be a paragraph and finally the text label. We will be creating a deep learning model in Keras on TensorFlow backend.

The model diagram can be seen below:

htc1

Step 1: Text Preprocessing

We will be using our custom text preprocessing code for this. Initially, we will have to import the necessary libraries.

htc2

Now, we will load the data which I assume is in a .csv format with three columns namely, headline, description, and label.

Let’s do the coding:

htc3

To prepare the data which can be fed to model, we need to to the preprocessing(except the full stops and question marks as they can be used to separate sentences from each other). Since we will be focused mainly on the model part, we will skip the code for preprocessing.

We will be using glove 300d word embeddings to represent the words in the data. Since our data can have some unique words, we will append those with the existing vocabulary and assign them random word embeddings from a gaussian distribution. We will then tokenize our headline and description according to a word2id dictionary from the vocab and the final data is prepared. All the things, discussed previously are just some loops so we will skip this part and move towards creating the model.

Step 2: Model

We will now load the newly formed word vocab and word embeddings in our code.

htc4

Our first layer will be an embedding layer whose initial weights will be the pre-trained embedding matrix. We will set the trainable parameter as True so that our word embeddings keep getting updated while the training goes on.

htc5

Now, for each sentence in the description, we need to have a sentence encoder model.

htc6

Till this point, we have got a vector for each description. We need to concatenate this with the headline embedding sequence and then we will build our final model which predicts the probabilities for the classes.

htc7

htc8

That’s it we have just created a hierarchical deep learning model in Keras. I will be giving you the link for the full code soon including the pre-processing steps. If you want to see the code in TensorFlow, it can be found here. The code is an implementation of the 3HAN model.
I hope you enjoyed the blog. Thank you 😊