To classify the three constituent elements of a handwritten Bengali grapheme image (grapheme root, vowel diacritic, and consonant diacritic), we started with a model built from multiple convolutional layers followed by fully connected layers, and evaluated its performance.
Credit to Bengali Graphemes: Starter EDA+ Multi Output CNN kernel
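# imports are assumed here (tf.keras); the original snippet omits them
from tensorflow.keras.layers import Input, Conv2D, BatchNormalization, MaxPool2D, Dropout, Flatten, Dense
from tensorflow.keras.models import Model

# all images are resized to 64*64
IMG_SIZE = 64
inputs = Input(shape=(IMG_SIZE, IMG_SIZE, 1))
model = Conv2D(filters=32, kernel_size=(3, 3), padding='SAME', activation='relu')(inputs)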
model = Conv2D(filters=32, kernel_size=(3, 3), padding='SAME', activation='relu')(model)
model = Conv2D(filters=32, kernel_size=(3, 3), padding='SAME', activation='relu')(model)
model = BatchNormalization(momentum=0.15)(model)
model = MaxPool2D(pool_size=(2, 2))(model)
model = Conv2D(filters=32, kernel_size=(5, 5), padding='SAME', activation='relu')(model)
model = Dropout(rate=0.3)(model)
model = Conv2D(filters=64, kernel_size=(3, 3), padding='SAME', activation='relu')(model)
model = Conv2D(filters=64, kernel_size=(3, 3), padding='SAME', activation='relu')(model)
model = Conv2D(filters=64, kernel_size=(3, 3), padding='SAME', activation='relu')(model)
model = BatchNormalization(momentum=0.15)(model)
model = MaxPool2D(pool_size=(2, 2))(model)
model = Conv2D(filters=64, kernel_size=(5, 5), padding='SAME', activation='relu')(model)
model = BatchNormalization(momentum=0.15)(model)
model = Dropout(rate=0.3)(model)
model = Conv2D(filters=128, kernel_size=(3, 3), padding='SAME', activation='relu')(model)
model = Conv2D(filters=128, kernel_size=(3, 3), padding='SAME', activation='relu')(model)
model = Conv2D(filters=128, kernel_size=(3, 3), padding='SAME', activation='relu')(model)
model = BatchNormalization(momentum=0.15)(model)
model = MaxPool2D(pool_size=(2, 2))(model)
model = Conv2D(filters=128, kernel_size=(5, 5), padding='SAME', activation='relu')(model)
model = BatchNormalization(momentum=0.15)(model)
model = Dropout(rate=0.3)(model)
model = Conv2D(filters=256, kernel_size=(3, 3), padding='SAME', activation='relu')(model)
model = Conv2D(filters=256, kernel_size=(3, 3), padding='SAME', activation='relu')(model)
model = Conv2D(filters=256, kernel_size=(3, 3), padding='SAME', activation='relu')(model)
model = BatchNormalization(momentum=0.15)(model)
model = MaxPool2D(pool_size=(2, 2))(model)
model = Conv2D(filters=256, kernel_size=(5, 5), padding='SAME', activation='relu')(model)
model = BatchNormalization(momentum=0.15)(model)
model = Dropout(rate=0.3)(model)
model = Flatten()(model)
model = Dense(1024, activation = "relu")(model)
model = Dropout(rate=0.3)(model)
dense = Dense(512, activation = "relu")(model)
head_root = Dense(168, activation = 'softmax')(dense)
head_vowel = Dense(11, activation = 'softmax')(dense)
head_consonant = Dense(7, activation = 'softmax')(dense)
model = Model(inputs=inputs, outputs=[head_root, head_vowel, head_consonant])
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
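As a quick sanity check on the architecture above, the spatial size of the feature maps can be traced by hand. This is a minimal sketch in plain Python, assuming that the SAME-padded convolutions keep the spatial size unchanged and that each 2x2 max-pool halves it; the helper name `trace_spatial_size` is ours, not part of Keras.

```python
def trace_spatial_size(input_size, num_pools, pool_size=2):
    """Return the spatial size after each max-pool.

    Assumes every convolution uses SAME padding with stride 1 (size unchanged)
    and every pooling layer divides the size by pool_size.
    """
    sizes = [input_size]
    for _ in range(num_pools):
        sizes.append(sizes[-1] // pool_size)
    return sizes


# The model above takes 64x64 inputs and has four MaxPool2D layers.
sizes = trace_spatial_size(64, 4)

# The last convolutional block has 256 filters, so Flatten() sees
# height * width * 256 features.
flat_features = sizes[-1] * sizes[-1] * 256

# Weights plus biases of the first Dense(1024) layer.
dense_params = flat_features * 1024 + 1024

print(sizes)          # [64, 32, 16, 8, 4]
print(flat_features)  # 4096
print(dense_params)   # 4195328
```

Under these assumptions the first dense layer alone holds about 4.2 million parameters, which is one reason the image resize and the number of pooling stages are worth tuning.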
Since the image files are quite large, the model was trained on four separate files. Dataset 3 is the last file containing the images. Losses continue to decrease on both the training set and the test set, and the accuracy scores for root, vowel, and consonant increase slowly. We observe no overfitting.
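One way to make the overfitting check above explicit is to compare the two loss curves programmatically. The sketch below is an illustration only, not the procedure we used: the function name `shows_overfitting`, the `patience` parameter, and the loss values are all hypothetical. In practice the lists could come from the `loss` and `val_loss` entries that Keras records in `History.history` when validation data is supplied.

```python
def shows_overfitting(train_loss, val_loss, patience=3):
    """Flag overfitting when validation loss has stopped improving over the
    last `patience` epochs while training loss is still going down."""
    if len(val_loss) <= patience:
        return False  # not enough epochs to judge
    best_before = min(val_loss[:-patience])
    recent_val = val_loss[-patience:]
    val_stalled = min(recent_val) >= best_before
    train_improving = train_loss[-1] < train_loss[-patience - 1]
    return val_stalled and train_improving


# Hypothetical curves: both losses keep decreasing, as in our runs.
healthy = shows_overfitting(
    train_loss=[1.0, 0.8, 0.6, 0.5, 0.4],
    val_loss=[1.1, 0.9, 0.7, 0.6, 0.55],
)

# Hypothetical curves: training loss falls while validation loss rises.
overfit = shows_overfitting(
    train_loss=[1.0, 0.8, 0.6, 0.4, 0.3, 0.2],
    val_loss=[1.1, 0.9, 0.8, 0.85, 0.9, 1.0],
)

print(healthy, overfit)
```

The same test can be applied to each of the three heads separately, since a multi-output model reports a loss per output.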
We will evaluate more architectures and consider different ways of constructing the convolutional layers of our model. Hyperparameters such as the number of filters, dropout rates, stride, and image resize dimensions should be considered carefully and evaluated further, both to reduce computational complexity and to improve model performance. We will also try ensembling the best models or using transfer learning (if available) to boost performance.
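To illustrate the ensembling idea, here is a minimal sketch of one common approach: averaging the softmax probabilities of several models and taking the most likely class. It assumes NumPy is available; the function `ensemble_predict` and the probability arrays are made up for illustration and are not outputs of our models.

```python
import numpy as np


def ensemble_predict(prob_list):
    """Average class probabilities from several models and return the
    predicted class index for each sample."""
    stacked = np.stack(prob_list, axis=0)  # (n_models, n_samples, n_classes)
    mean_probs = stacked.mean(axis=0)      # (n_samples, n_classes)
    return mean_probs.argmax(axis=1)


# Hypothetical softmax outputs of three models for two samples, three classes.
model_a = np.array([[0.6, 0.3, 0.1], [0.2, 0.5, 0.3]])
model_b = np.array([[0.2, 0.7, 0.1], [0.1, 0.3, 0.6]])
model_c = np.array([[0.5, 0.4, 0.1], [0.3, 0.3, 0.4]])

preds = ensemble_predict([model_a, model_b, model_c])
print(preds)  # [1 2]
```

For a three-headed model like ours, the same averaging would be applied independently to the root, vowel, and consonant outputs.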
Stay tuned.
We are NNPlayer, a group of students from Brown University. We are one of the teams participating in the Bengali.AI Handwritten Grapheme Classification competition: kaggle.com/c/bengaliai-cv19.
You can find more great work by our team members on GitHub:
You can find the starter code for our project at github.com/bao1981105/decode-bengali