Handwriting Recognition & Deep Learning; Minyue Dai

From CSclasswiki

Week 01: Jun05 - Jun11


  • Install Tensorflow and the system used in the GAN article
On Windows, Tensorflow only works with Python 3, but the DIGITS system only works with Python 2.
Successfully installed Tensorflow on Python 3.5
  • BLSTM (Bidirectional Long Short-Term Memory)
This is a modified recurrent neural network that makes use of both distant and nearby context in both directions.
  • Combine Syriac image data from Spring 2017 into a single file
In SyriacGenesis/GenesisSync3820_4540.mat
721 new images with matched tags


  • Rename the Syriac Image data file
  • Find how to solve the Python version problem of Tensorflow & DIGITS
The author's GitHub contains the source code of the GAN he uses
To run DIGITS, the only way might be to use Linux, because the DIGITS the author uses is a modified version,
and public version of digits does not support digits now
    • 1. Use Linux and run the DIGITS platform
    • 2. Use Windows and the Tensorflow source code
    • 3. Both solutions require understanding the Tensorflow library
Network Source Code of mnist dataset
  • Implementation of BLSTM
Tutorial of Tensorflow(including BLSTM)
Tensorflow Documents
BLSTM in Handwriting Recognition Paper


  • Finish the BLSTM paper in handwriting recognition
  • Finish the Tensorflow Tutorial
This is a copy of a BLSTM example from the Tensorflow tutorial, with comments added on the
functions used. I also wrote a multi-layer BLSTM.
Important Functions
    • tensorflow.contrib.rnn.BasicLSTMCell: Builds an instance of a basic LSTM cell.
    • tensorflow.contrib.rnn.static_bidirectional_rnn: Builds a single-layer BLSTM
    • tensorflow.contrib.rnn.stack_bidirectional_rnn: Builds a multi-layer BLSTM
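
The bidirectional idea behind these functions can be illustrated without Tensorflow. The following is a minimal pure-Python sketch, not the tutorial's code: a toy recurrent update (a decayed running sum, standing in for an LSTM cell) is run once left-to-right and once right-to-left, and the two outputs are paired per time step. The names `run_direction` and `bidirectional` and the `decay` parameter are hypothetical.

```python
def run_direction(inputs, decay=0.5):
    """Toy recurrent pass: each output mixes the current input with the previous state."""
    state = 0.0
    outputs = []
    for x in inputs:
        state = decay * state + x  # stand-in for an LSTM cell update
        outputs.append(state)
    return outputs


def bidirectional(inputs, decay=0.5):
    """Pair forward and backward outputs so every step sees both contexts."""
    forward = run_direction(inputs, decay)
    # Process the reversed sequence, then flip back to align time steps.
    backward = run_direction(inputs[::-1], decay)[::-1]
    return list(zip(forward, backward))


sequence = [1.0, 0.0, 0.0, 2.0]
for step, (fw, bw) in enumerate(bidirectional(sequence)):
    print(step, fw, bw)
```

At step 0 the forward value only knows the first input, while the backward value already carries a trace of the last input; this is the "far and close context" effect in miniature.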


  • Better Understanding of LSTM and Implementation
Tutorial in RNN(including BLSTM)
  • Test Syriac Letter Data in BLSTM
Read Data in Tensorflow
Read *.csv Data


  • Visualization of Tensorflow
After running the program, type tensorboard --logdir=/Self-defined graph directory/ on the command line to start TensorBoard
Then open http://localhost:6006/ in a browser
SCALARS: Plots of variables, such as cost and accuracy.
GRAPHS: The visualization of the model's structure
Official TensorBoard Tutorial
Tensorboard Example
  • Read image data in Tensorflow
Read *.jpg in Tensorflow
show grayscale image
Read grayscale image and convert it into numpy array

Week 02: Jun12 - Jun18


  • Work on Autoencoding
Encode given data through hidden units, similar to PCA
Impose sparsity on hidden units, which means each unit should be inactive for most input
ρ is a "sparsity parameter", typically a small value close to zero (say ρ = 0.05). In other words, we would like the average activation of each hidden neuron to be close to 0.05.
Blog Post on Autoencoding
Tensorflow Example for Autoencoder
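
The sparsity constraint above is commonly enforced with a KL-divergence penalty between the target ρ and the measured average activation of each hidden unit. The sketch below is a pure-Python illustration under that assumption, not code from the linked examples; `kl_sparsity_penalty` and `total_sparsity_penalty` are hypothetical names.

```python
import math


def kl_sparsity_penalty(rho, rho_hat):
    """KL divergence between Bernoulli(rho) and Bernoulli(rho_hat)."""
    return (rho * math.log(rho / rho_hat)
            + (1.0 - rho) * math.log((1.0 - rho) / (1.0 - rho_hat)))


def total_sparsity_penalty(rho, mean_activations):
    """Sum the penalty over all hidden units (activations must lie in (0, 1))."""
    return sum(kl_sparsity_penalty(rho, a) for a in mean_activations)


rho = 0.05
print(kl_sparsity_penalty(rho, 0.05))  # a unit exactly at the target
print(kl_sparsity_penalty(rho, 0.5))   # a unit that is active half the time
print(total_sparsity_penalty(rho, [0.05, 0.1, 0.5]))
```

The penalty is zero when a unit's average activation equals ρ and grows as the unit becomes more active, so adding it to the reconstruction cost pushes hidden units toward being inactive for most inputs.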


  • Test Syriac Letter on BLSTM Model
Need to read same-size Syriac Letter images into a binarized numpy array and save the array.
Syriac Image file:
Local: C:\Syriac\CharSet\CharSet
Remote: H:\Syriac\CharSet.zip
Images with "focus" in their names are all the same size
Python code that converts images in P1 to a numpy array and saves it: H:\Summer2017\SyriacLetter.py
Numpy Array data file of images in P1: C:\Syriac\FocusArray\Image\P1_focus.npy
save numpy array data in npy file
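
A minimal sketch of the binarize-and-save step, assuming NumPy is available. It is not the contents of SyriacLetter.py: the tiny in-memory "images", the threshold of 128, the polarity (bright pixels become 1), and the file name `example_focus.npy` are all placeholders chosen for illustration.

```python
import os
import tempfile

import numpy as np


def binarize(gray_image, threshold=128):
    """Map pixels at or above the threshold to 1 and the rest to 0."""
    return (np.asarray(gray_image) >= threshold).astype(np.uint8)


# Two fake 2x3 grayscale "images" standing in for same-size letter images.
images = [
    [[0, 200, 255], [30, 128, 90]],
    [[255, 255, 0], [10, 20, 250]],
]
stacked = np.stack([binarize(img) for img in images])  # shape (2, 2, 3)

# Save to a temporary location and read the array back.
path = os.path.join(tempfile.mkdtemp(), "example_focus.npy")
np.save(path, stacked)
restored = np.load(path)
print(restored.shape, restored.dtype)
```

Stacking only works because every image has the same size, which is why only the same-size "focus" images are used here.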

  • Pretrained Model on Caffe2 and Tensorflow
Convert Caffe Model to Tensorflow
Caffe Model Zoo


  • Transform Syriac Letter Image to Numpy Array
    • H:\Summer2017\AllSyriacLetter.py: Read image in P2-P5 into Numpy Array and save them
    • C:\Syriac\CharSet\FocusArray\Image\SyriacFocus.npy: Data File for all Syriac Image array
Local: C:\Syriac\CharSet\FocusArray\Image
Remote: H:\Summer2017\FocusArray\Image

  • Test Syriac Letter Data in AutoEncoder
For efficiency, just test image in P1: P1_focus.npy
H:\Summer2017\AutoEncoder\SyriacAutoEncoder.py: The cost function is defined as squared error, which does not perform well on Syriac letters. For the AutoEncoder, it is better to add a KL divergence term for sparsity.

  • Read the code file of GAN (Generative Adversarial Networks)
    • H:\Summer2017\GAN\GAN.py: The code file of the GAN, which contains only some functions and classes and only works with the DIGITS platform


  • Read the Syriac Letter Image label and transform them as Numpy Array Data
    • H:\Summer2017\SyriacLetterLabel.py: Uses the number-letter file CharSetLabels.txt to generate a binary code for each letter, then writes the binary codes as numpy array data and saves them
Local: C:\Syriac\
Remote: H:\Summer2017\Syriac\
CharSetLabels.txt: Image name number - Letter data file
SyriacBinaryCode.txt: Letter Name - Binary Code data file
FocusArray\Label\: All Image Label File(All in binary code)
Pn_focus.npy: Image label data for Pnth folder data
SyriacFocusLabel.npy: Image label data for SyriacFocus.npy(All Syriac Letters)

  • Test the labeled Syriac Letter Data on BLSTM
    • H:\Summer2017\RNN\SyriacBLSTM.py: Both single-layer and multi-layer BLSTM perform badly; training does not converge.

  • Find GAN Paper


  • Find GAN Paper
GAN Fundamental Paper
Yann LeCun Blog on GAN
GAN Tutorial from its Author
GAN Tutorial Blog Post
Code for Blog Post
  • Transform handreg Syriac Letter Data and Label into Numpy array
This is the better sliced data in size of 59*58
C:\Syriac\OldChar: This is the image data
C:\Syriac\OldCharArray\Image: This is the Numpy data for Image
C:\Syriac\OldCharArray\Label: This is the Numpy data for Label
  • Test handreg Syriac Letter Data on AutoEncoder
The model works on handreg data, but it performs much worse than it does on the mnist data set:
1. mnist images are much smaller (28*28) than handreg Syriac images (59*58) ---- Enlarged the NN, but it still performs badly
2. The sparsity issue: 80% of the output should be 0 ---- Adding the NN's weights to the cost, or adding -0.05 at the initialization of weights, both work well.
Model and Code File:
H:\Summer2017\AutoEncoder\oldSyriacAutoEncoder.py: The 2-layer NN AutoEncoder codefile for handreg data set
Result(Image) (H:\Summer2017\AutoEncoderOldCharImage\):
AutoEncoder_oldSyraic1024.png: Result for just squared error as cost and 1024-512 cells in NN
AutoEncoder_oldSyraic-0.05.png: Result for just squared error as cost and -0.05 for all weights at initialization
AutoEncoder_oldSyraic0.1Weightcost.png: Result for just squared error and 0.1*mean(weights) as cost
AutoEncoder_oldSyraic0.2Weightcost.png: Result for just squared error and 0.2*mean(weights) as cost
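
The "squared error plus 0.1*mean(weights)" cost used for the last two results can be written out as follows. This is a pure-Python sketch that follows the note literally (the plain mean of the weights, not their absolute values); it is not taken from oldSyriacAutoEncoder.py, and the function names and sample numbers are hypothetical.

```python
def squared_error(targets, outputs):
    """Mean squared reconstruction error."""
    return sum((t - o) ** 2 for t, o in zip(targets, outputs)) / len(targets)


def cost_with_weight_term(targets, outputs, weights, weight_coeff=0.1):
    """Reconstruction error plus a scaled mean-of-weights term."""
    mean_weight = sum(weights) / len(weights)
    return squared_error(targets, outputs) + weight_coeff * mean_weight


targets = [0.0, 0.0, 1.0, 0.0]
outputs = [0.5, 0.0, 1.0, 0.0]
weights = [0.2, 0.4, 0.6]
print(squared_error(targets, outputs))
print(cost_with_weight_term(targets, outputs, weights, weight_coeff=0.1))
print(cost_with_weight_term(targets, outputs, weights, weight_coeff=0.2))
```

Raising `weight_coeff` from 0.1 to 0.2 doubles the contribution of the weight term, which is the only difference between the two Weightcost results listed.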

Week 03: Jun19 - Jun25


  • Test handreg Syriac Letter Data on BLSTM classification model
    • H:\Summer2017\RNN\handregBLSTM.py: BLSTM Model for handreg Syriac Letter Data
Training a single-layer BLSTM on 10000 training samples gives about 30% accuracy on the test data
Multi-layer BLSTM performs even worse
After changing the optimizer, the accuracy rises to over 95%, but it drops dramatically during training. This problem might be caused by the order of the data.

  • Build CNN AutoEncoder
Tensorflow Tutorial2
CNN AutoEncoder Tutorial


  • Shuffle the handreg Syriac data and test it on BLSTM again
    • H:\Summer2017\DataPrepare\shuffleSyriac.py: Shuffle and save handreg Syriac Data
shuffleSyriacReg.npy: shuffled Syriac data
shuffleSyriacRegLabel.npy: shuffled Syriac data's label
For the shuffled data, the accuracy plot grows normally, but the test accuracy is just 55% after 100,000 training steps.
Also tested on a CNN; the accuracy is similar.
When working with only P1 data, the accuracy is over 95%
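
When shuffling, the data and label files must be reordered with the same permutation, otherwise images and labels no longer match. The sketch below shows the idea with the standard library only; it is not the contents of shuffleSyriac.py, and `shuffle_together` and the sample values are hypothetical. With NumPy arrays, the same index list can be used to index both arrays.

```python
import random


def shuffle_together(data, labels, seed=0):
    """Shuffle data and labels with one shared permutation of indices."""
    if len(data) != len(labels):
        raise ValueError("data and labels must have the same length")
    order = list(range(len(data)))
    random.Random(seed).shuffle(order)  # seeded so the shuffle is repeatable
    return [data[i] for i in order], [labels[i] for i in order]


data = ["img_a", "img_b", "img_c", "img_d"]
labels = ["A", "B", "C", "D"]
shuffled_data, shuffled_labels = shuffle_together(data, labels)
for item, label in zip(shuffled_data, shuffled_labels):
    print(item, label)
```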

  • Find the problem of Syriac Data and relabel all of them
The issue is that the letter-to-code dictionary is randomized on each labeling run.
Sort the letter list to make sure the order of the list will be the same for all files
Relabel all data files
Retest data on BLSTM and the model works well
Now the accuracy on test data for handreg Syriac Letter classification is about 99%(Need about 10,000 steps)
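
The fix described above (sorting the letter list before assigning codes) can be sketched as below. One plausible cause of the original problem is that the iteration order of a Python set of strings can differ between runs; that is an assumption, as is the use of one-hot vectors for the "binary code". The function name and the three letter names are illustrative only.

```python
def build_label_codes(letters):
    """Assign a one-hot code to each distinct letter in sorted order."""
    ordered = sorted(set(letters))  # sorting makes the order deterministic
    codes = {}
    for index, letter in enumerate(ordered):
        one_hot = [0] * len(ordered)
        one_hot[index] = 1
        codes[letter] = one_hot
    return codes


# Two labeling runs that see the same letters in a different order.
codes_first = build_label_codes(["nun", "alaph", "beth", "nun"])
codes_second = build_label_codes(["beth", "nun", "alaph"])
print(codes_first)
print(codes_first == codes_second)
```

Because the codes depend only on the sorted set of letters, every data file labeled this way uses the same letter-to-code mapping.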


  • Read and Test GAN
CNN Tutorial
Interactive GAN Tutorial
Code for Interactive Tutorial
GAN Training Tricks
Tested the given GAN on the mnist data set; it takes a long time to train
Tested the pretrained model, and it performs pretty well on the mnist data set
This code example only uses CNN, but most complicated models from papers use deCNN.


  • Check GPU
How to use GPU in tensorflow
Tensorflow-GPU install document
Current GPU: GeForce GTX 480, Compute Capability: 2.0
Required Compute Capability for Tensorflow-GPU: >=3.0
CUDA Path: C:\Users\howelab\AppData\Local\Temp\CUDA

  • Test GAN on handreg Data
    • GANhandreg.py: It works but is really slow. (This model doesn't use deCNN)

  • Save Binarized Context Syriac Data
    • C:\Syriac\Bcontext: Image Size: 60*60


  • Check GAN on handreg Syriac Data
The model runs slowly on CPU. It runs about 6000 iterations over 18 hours.
  • Test GAN on smaller data
    • shufflesmallSyriac10000.npy: The first 10,000 shuffled handreg images, resized to 29*29
    • GANtestsmall.py: The model file
It trains much more quickly, but it still takes a long time.
It's hard to say whether the model works, because training takes so long.

Week 04: Jun 26 - Jul 02


  • Create AutoEncoder model's generator with CNN and DeCNN
CNN: output width and height: out = ceil(in / stride) (with "same" padding)
W(filter) = [height, width, in_channels, out_channels]
DeCNN: output width and height: out = in * stride
W(filter) = [height, width, out_channels, in_channels]
CNNAutoEncoder_2layers.py: 3-layer CNN and DeCNN Autoencoder (shared weights) on mnist
Give up fully-connected layer and add a Convolutional layer
Code Size [4 4 4] = 64; Original Size = 28*28 = 784
Use Xavier initialization for weights rather than normal-distribution
Xavier Initialization
  • Save and restore model
Save and Restore Model Tutorial
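
The output-size rules noted for the CNN and DeCNN layers can be checked with a few lines of arithmetic. This sketch assumes stride 2 in every layer and 4 channels in the code layer; neither value is stated in the notes, but together they reproduce the recorded code size of [4 4 4] = 64 for a 28*28 input. The function names are hypothetical.

```python
import math


def conv_same_output(size, stride):
    """Output width/height of a convolution with 'same' padding."""
    return math.ceil(size / stride)


def deconv_output(size, stride):
    """Output width/height of a transposed convolution, per the rule out = in * stride."""
    return size * stride


def encoder_sizes(size, stride, layers):
    """Spatial size after each stacked convolution layer."""
    sizes = [size]
    for _ in range(layers):
        sizes.append(conv_same_output(sizes[-1], stride))
    return sizes


sizes = encoder_sizes(28, stride=2, layers=3)
code_channels = 4  # assumed number of channels in the code layer
code_size = sizes[-1] * sizes[-1] * code_channels
print(sizes, code_size, 28 * 28)
print(deconv_output(sizes[-1], 2))
```

Under these assumptions the encoder shrinks 28 to 14, 7, and 4, giving a 64-value code in place of the 784 original pixels.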


  • Test CNN Autoencoder on handreg Syriac Data set
    • CNNAutoEncoder_oldSyriac.py: 3-layer CNN AutoEncoder
    • trainedCNNAutoEn_oldSyriac\CNNAutoEncoder_oldSyriac.ckpt: Trained model (5 epoch, 600,00 training data)
    • AutoEncoder_oldSyriac_CNN.png: Example(Test Data) for the CNN AutoEncoder
This model only uses CNN and DeCNN without pooling and performs really well, much better than fully connected networks. Also, CNN and DeCNN share parameters and they can be used in GAN.


  • Implement GAN with deCNN
GAN with DeCNN for mnist
    • DeGAN_testsmall.py: DeGAN on small handreg Syriac data
Discriminator: CNN - CNN - FullyConnected - FullyConnected
Generator: FullyConnected - DeCNN - DeCNN - DeCNN
Training is much faster, but the model does not converge


  • Try to make DeGAN on handreg Syriac works
Add batch_norm to each layer. The model works much better, but the generated images have a weird "grid" pattern, which may be caused by the batch_norm function.
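
For reference, the core computation of batch normalization is small. The sketch below is a generic pure-Python version for a single feature across one batch, not the batch_norm function used in the model; `gamma`, `beta`, and `epsilon` follow the usual convention, and the sample batch is made up.

```python
import math


def batch_norm(values, epsilon=1e-5, gamma=1.0, beta=0.0):
    """Normalize one feature over a batch to zero mean and unit variance, then scale and shift."""
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    return [gamma * (v - mean) / math.sqrt(variance + epsilon) + beta
            for v in values]


batch = [1.0, 2.0, 3.0, 4.0]
normalized = batch_norm(batch)
print(normalized)
```

In a real layer, `gamma` and `beta` are learned per feature, and running averages of the mean and variance are used at test time.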

Week 05: Jul 17 - Jul 23


  • Try to build an encoder in the discriminator of GAN
GAN/DeAutoGAN_testsmall.py: Trained on the small (29*28) handreg Syriac data set. The discriminator also returns a z vector, and D is trained on Diff(real_image, rebuilt_image)
GANimage\DeAutoGAN_small_random.png: The first row is the original image, the second row is the rebuilt image, and the third row is the fake image. It seems the autoencoder works well, but the GAN does not perform well.

  • Try to build conditional GAN
conditionalGAN\CDAutoGAN.py: This is the conditional GAN with autoencoder, trained on the handreg Syriac data set.
CDGANimage\CDGAN_handreg_30960period.png: The result shows that the autoencoder performs well, but the generator does not converge.
CDGAN for mnist


  • Build CDGAN without autoencoder
In the article, the author first trains a good CDGAN and then uses transfer learning to build an autoencoder with the weights from the CDGAN. The author actually builds two models: a CDGAN to generate fake images and discriminate images, and an autoencoder to get the z vector of a real image.
Article for CDGAN
conditionalGAN\CDGAN.py: The CDGAN without autoencoder, trained on handreg Syriac dataset.
CDGANimage\CDGAN_handreg_noauto_3440period.png: The result shows the model does not converge.

  • Other structures to get z vector of real image: Adversarially Learned Inference and learned similarity metric
Adversarially Learned Inference
The idea is to generate a z vector for a real image and a fake image for a z vector input, so the discriminator input is both an image and a z vector rather than just an image
Webpage and Github Code
learned similarity metric
The idea is to build an encoder and a decoder rather than just a generator; the input of the discriminator is the real image rebuilt through the autoencoder rather than the real image itself.
Github Code


  • Build CDGAN on mnist data, exactly same as the author's model
Script from Author to Rebuild his work
conditionalGAN\CDGAN_mnist.py: The model which is exactly the same as author's model.
CDGANimage\CDGAN_mnist_noauto_10000period.png: The result shows it works, but performs poorly. The cause might be:
1. It needs more training periods
2. The learning rate is not optimal
3. The initialization of parameters, especially weights for layers, is not optimal.

  • Try Autoencoder for learned similarity metric on mnist dataset
AutoLSM\AutoLSM_mnist.py: Incomplete


  • Improve the reproduced CDGAN on mnist
It seems the author constrains the stddev of the weights to be 0.02, which means all weight parameters should be close to 0, but the weight initialization I use is Xavier, which can make the parameters too small or too large.
conditionalGAN\CDGAN_mnist.py: Change the initialization of weights, and it performs really well.
conditionalGAN\pretrainedCDGANmnist12400\model.ckpt: The model parameters
CDGANimage\modifiedCDGAN_mnist_noauto_12400period.png: The fake image examples
tensorboard --logdir=/tmp/tensorflow_logs/CDGANmnist2: Run in command line and open http://ford343-r08838:6006 to see the loss curve in learning
conditionalGAN\trainedCDGAN_mnist.py: Get and run the trained models
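
To see how a fixed stddev of 0.02 compares with Xavier initialization, the scale of the normal-distribution variant of Xavier (Glorot), sqrt(2 / (fan_in + fan_out)), can be computed per layer. This is a sketch only: the two convolution filter shapes are hypothetical and are not taken from CDGAN_mnist.py, and the notes do not say which Xavier variant was used.

```python
import math


def xavier_normal_stddev(fan_in, fan_out):
    """Standard deviation of the normal-distribution Xavier (Glorot) initializer."""
    return math.sqrt(2.0 / (fan_in + fan_out))


def conv_fans(kernel_h, kernel_w, in_channels, out_channels):
    """Fan-in and fan-out of a convolution filter [h, w, in, out]."""
    receptive_field = kernel_h * kernel_w
    return receptive_field * in_channels, receptive_field * out_channels


fixed_stddev = 0.02
# Hypothetical filter shapes, chosen only to show how the scale varies.
for shape in [(5, 5, 1, 64), (5, 5, 64, 128)]:
    fan_in, fan_out = conv_fans(*shape)
    print(shape, round(xavier_normal_stddev(fan_in, fan_out), 4), fixed_stddev)
```

The Xavier scale changes with the shape of each layer, whereas the fixed value keeps every layer's initial weights at the same small scale.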

  • Build the Autoencoder based on the trained CDGAN on mnist
Change the output of the discriminator from 1 to the length of the z vector for the encoder, and then train the model based on the weights from the trained CDGAN. Freeze the variables in the generator
conditionalGAN\CDGANAuto_transfer\model.ckpt: Parameters from trained CDGAN for training
AutoCDGAN\AutoCDGAN_mnist.py: Train Autoencoder based on trained CDGAN and it performs really well.
Found a problem with loading a subset of trained variables; found the solution in RalphMao's answer
AutoCDGANimage\AutoCDGAN_mnist_1000period.png: The test result. It works really well. The first row is original image, and the second row is the rebuilt image.
AutoCDGAN\pretrainedAutoCDGAN_mnist\model.ckpt: The model is saved.


  • Try CDGAN on handreg Syriac data set
conditionalGAN\CDGAN_handreg.py: Uses exactly the same parameters, and the result shows it doesn't work.
CDGANimage\modifiedCDGAN_handreg_noauto_8900period.png: The result of the network.
Possible Reasons:
1. It needs more layers because the image size is four times that of mnist
2. The data is skewed because some Syriac letters are rarely used.
3. Some Syriac letters are really similar.
  • Find other examples of CDGAN
The same article does another experiment on faces, which uses a more complicated network.
README of projects
code for face project
  • Improve the CDGAN for handreg
1. Add 2 more CNN/DeCNN for D/G
2. Set fakelabel to be the same as y_label to solve the skewed data issue

Week 06: Jul 24 - Jul 30

Jul 24

  • Finish training 6-layer CDGAN on handreg data set and save the model
conditionalGAN\CDGAN_handreg2.py: The model has 6 layers for both generator and discriminator, and the fake labels are drawn from the distribution of real labels.
CDGAN\modifiedCDGAN_handreg2_noauto_11680period.png: Examples from the model. The first row is real images and the second row is fake images generated from the same labels.
conditionalGAN\pretrainedCDGANhandreg\model.ckpt: The trained model for 11680 periods.
conditionalGAN\trainedCDGAN_handreg2.py: Restore and run the trained model.
conditionalGAN\CDGANAuto_handreg2_transfer\model.ckpt: Resaved model for transfer learning in the Autoencoder for handreg Syriac data.
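
Drawing fake labels from the distribution of real labels, as CDGAN_handreg2.py is described as doing, can be sketched by sampling with replacement from the real label list, so that rare letters are requested from the generator about as rarely as they occur in the data. This is a standard-library illustration, not the script's code; `sample_fake_labels` and the 8:2 toy label list are hypothetical.

```python
import random
from collections import Counter


def sample_fake_labels(real_labels, count, seed=0):
    """Sample labels with replacement, following the empirical label frequencies."""
    rng = random.Random(seed)
    return [rng.choice(real_labels) for _ in range(count)]


# A skewed toy label set: one letter is four times as common as the other.
real_labels = ["alaph"] * 8 + ["beth"] * 2
fake_labels = sample_fake_labels(real_labels, 1000)
print(Counter(fake_labels))
```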

  • Train Autoencoder for getting z vector from input image based on trained CDGAN
AutoCDGAN\AutoCDGAN_handreg.py: Train an Autoencoder based on the trained CDGAN.
AutoCDGANimage\AutoCDGAN_handreg2_1000period.png: Results from the model
AutoCDGAN\pretrainedAutoCDGAN_handreg2\model.ckpt: Trained model after 1000 periods

Jul 25

  • Get Data for denoising networks
C:\Syriac\ContextB: Binary Image
C:\Syriac\ContextC: Color Image
C:\Syriac\ContextG: Grey Scale Image
C:\Syriac\StackB: Denoised Binary Image

  • Prepare Data for Training
DataPrepare\LabelImage_GandStack.py: Load and save gray scale and stack images and labels (only their intersection).
C:\Syriac\DenoiseArray\shuffledGrayimage.npy: shuffled Gray Scale Image data
C:\Syriac\DenoiseArray\shuflledStackimage.npy: shuffled Binary Stack Image data
C:\Syriac\DenoiseArray\shuffledLabel.npy: shuffled label data

Jul 26

  • Finish training Denoise Model for Gray to Stack Image
Denoiseimage\Denoise_GandS.png: The result from the trained model. The first row is the gray scale input image, the second is the target stack image, and the third is the result from the model.
Denoise\pretrained_GandS\model.ckpt: Model saved.

  • Read More Papers
PCA-Initialized Deep Neural Networks
Detailed Paper on PCA in CNN
Professor Nich's paper
  • Use AutoEncoder for CDGAN to generate all z vectors of handreg Syriac Data
OldCharArray\Label\SyriacRegZvector.npy: z vectors data in order of SyriacRegLabel.npy (60880*100)

Jul 27

  • Train Denoise Model with labels on Gray to Stack
Denoise\CDDenoise_GandS.py: New model with the label as an additional input; the threshold is 0.3
DenoiseIimage\DenoiseCD_GandS_0.3.png: Result from Model
Denoise\pretrainedCD_GandS\model.ckpt: Model saved.
  • Install Tensorflow-gpu on new computer
All software is installed, and it works.
Main Instructions on Tensorflow
Download Cuda
Download cuDNN (5.1 rather than 6)
testGPU.py: Test if the tensorflow works on GPU.

Jul 28

  • Make the Syriac Number labels for z-vector data set.
OldCharArray\Label\SyriacRegNum.txt: Number of Syriac Letter for handreg data

  • Build CDGAN with pooling layers on handreg Syriac
Blog for building unpooling layers in tensorflow

Week 07: Jul 31 - Aug 06

Jul 30

  • Prepare Data for handreg Syriac letters for dating model.
DataPrepare\handleDate.py: Preprocess needed data for dating
Syriac\DatingArray\: All train and test data for dating

Aug 01

  • Build Dating Model for handreg Syriac Data set
Dating\DatingHandreg.py: Model based on CNN for dating
  • Run and save the model for CDGAN with pooling layers for handreg data set
conditionalGAN\pretrainedCDGANhandregpooling\model.ckpt: Model saved.

Aug 02

  • Build AutoEncoder based on CDGAN with pooling layers for handreg Syriac data
AutoCDGAN\AutoCDGAN_handreg_pooling.py: Model for training CDGAN based on transfer learning.
AutoCDGAN\pretrainedAutoCDGAN_handreg_pooling\model.ckpt: AutoEncoder Model saved.

  • Generate Z-vector for handreg Syriac data in CDGAN with pooling layers
AutoCDGAN\trainedAutoCDGAN_handreg_pooling.py: Generate z-vectors

Aug 03

  • Run Dating Model
Dating\DatingHandreg.py: Dating Model based on Conditional CNN. The result shows that it's overfitting.

  • Write description for built CDGAN with pooling layers model.
GAN Paper

Week 08: Aug 07 - Aug 13

Test different dimensions for z-vectors and write up summaries.

Week 09: Aug 14 - Aug 20

Aug 14

  • Prepare all needed scripts and data for presentation
  • Create Diagram

Aug 18

  • Use pretrained CDGAN model to train Combined CDGAN
H:\Summer2017\CBCDGAN\: Scripts and model

Week 10: Aug 21 - Aug 27

Aug 21

  • Use F-test to evaluate performance of Denoise and CDDenoise Model
F test method
DenoiseResult\DenoiseTestRaw.npy: Raw model results for 10000 test samples
DenoiseResult\CDDenoiseTestRaw.npy: Raw results from the conditional model for 10000 test samples
  • Compute F-table and F-score
Denoise\evaluation.py: Generate F-table and F-score
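
One way the F-table and F-score could be computed is shown below, assuming the "F-table" is the table of true/false positive and negative pixel counts and the "F-score" is the usual F-measure. evaluation.py is not reproduced here, so this is a guess at the method; the 0.3 threshold is borrowed from the CDDenoise note, and the function names and sample values are hypothetical.

```python
def confusion_counts(predicted, target):
    """Count true positives, false positives, false negatives, true negatives."""
    tp = fp = fn = tn = 0
    for p, t in zip(predicted, target):
        if p == 1 and t == 1:
            tp += 1
        elif p == 1 and t == 0:
            fp += 1
        elif p == 0 and t == 1:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def f_score(tp, fp, fn, beta=1.0):
    """F-measure from precision and recall (beta=1 gives the F1 score)."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


raw = [0.9, 0.2, 0.4, 0.1, 0.8]      # raw model outputs for five pixels
target = [1, 0, 1, 0, 0]             # target binary pixels
predicted = [1 if v >= 0.3 else 0 for v in raw]
counts = confusion_counts(predicted, target)
print(counts, f_score(*counts[:3]))
```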

Aug 23

  • Build Classifier based on CNN and test it on reg3 and contextB data
Average Accuracy for reg3 Test Images: 0.9917
Average Accuracy for contextB Test Images: 0.9348