Fun with Machine Learning : Building a Sudoku Solver

18 Oct 2019

Knowledge of machine learning is not only a geek thing. It can be fun too if you are taking some simple, yet interesting problems like solving a Sudoku puzzle.

sud1

So, in this post, we will be implementing a CNN based deep learning network for solving a sudoku puzzle. We will be using a dataset, which I found on Kaggle.

We will start with some definitions:

1. The dataset consists of one million solved sudoku puzzles.

2. The format of the dataset is like this:

X = [a string of 81 characters] with 0 character meaning, empty block

Y = [a string of 81 characters] with all characters being non-zero and between 1 to 9.

So, we need to convert this into a 9x9 2-D matrix format first.

After doing that, we will normalize the dataset and then feed it to our model.

Let’s perform the coding part now:

The code below shows the following pipeline:

1. necessary imports
2. data processing
3. defining the model and compiling it
4. training the model.

import tensorflow.keras as keras
import pandas as pd
import numpy as np
from tensorflow.keras.models import Model
from sklearn.model_selection import train_test_split
from tensorflow.keras.models import load_model as load_model
from tensorflow.keras.models import save_model as save_model


df = pd.read_csv('sudoku.csv', encoding='latin1')

X = df['quizzes']
Y = df['solutions']
X_data=[]
Y_data=[]

for i in range(len(X)):
    a = np.asarray([((int(ch)/9)-0.5) for ch in X[i]]).reshape(9,9,1)
    b = np.asarray([int(ch) for ch in Y[i]]).reshape(81,1)-1
    X_data.append(a)
    Y_data.append(b) 
   
X_data = np.asarray(X_data)
Y_data = np.asarray(Y_data)

inp = keras.Input(shape=(9,9,1,), dtype='float64')
conv1 = keras.layers.Conv2D(filters=32,
                            kernel_size=(3,3), 
                            activation='relu')(inp)
conv2 = keras.layers.Conv2D(filters=64, 
                            kernel_size=(2,2), 
                            activation='relu')(conv1)
conv3 = keras.layers.Conv2D(filters=256,
                            kernel_size=(1,1), 
                            activation='relu')(conv2)

fvector = keras.layers.Flatten()(conv3)
dense1 = keras.layers.Dense(9*81, activation='sigmoid')(fvector)
reshape1 = keras.layers.Reshape((81,9))(dense1)
output = keras.layers.Activation('softmax')(reshape1)

model = keras.Model(inp, output)


adam = keras.optimizers.Adam(lr=0.009)
model.compile(loss='sparse_categorical_crossentropy',
              optimizer='adam', metrics=['acc'])

x_train, x_test,y_train, y_test = train_test_split(X_data,Y_data, 
                                          test_size=0.2,random_state=0)

for i in range(100):
    model.fit(x_train,y_train, 
                       epochs=1,validation_split=0.2, batch_size=200)
                       
    save_model(model, '/run/praphul/saved_models/sudoku.h5')

I hope you enjoyed the blog and the fun part of machine learning.

Keep Learning, Keep Sharing 😊