Attention-based Neural Networks for Handwriting Recognition
Documentation
Recording of the working system: https://smith.zoom.us/rec/share/841_Ne3snhwP3mSduKZu63ctTFzYvdDdCrwsdPvCQWOAFDxka9tsdDTwGGZM3fWw.n5T9sxD24vdlCBzQ (passcode: iZQc4=5s)
Fall 2020
This honors thesis aims to improve handwritten text recognition (HTR) by refining the use of attention mechanisms in sequence-to-sequence and Transformer models.
Week 1: 09/04 - 09/10
Goals:
- Install PyTorch and get something running (a minimal sanity check follows this list)
- Find a good starting point
- Review more literature (particularly for sequence-to-sequence models)
- Distill knowledge from currently cited papers in the proposal
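A minimal sanity check for the PyTorch install, assuming nothing about the eventual model; the linear layer and tensor shapes below are throwaway placeholders:

```python
import torch
import torch.nn as nn

# Confirm the install and run one forward/backward pass on a
# throwaway linear layer. Shapes here are arbitrary.
print(torch.__version__)

model = nn.Linear(10, 2)
x = torch.randn(4, 10)      # batch of 4 random feature vectors
loss = model(x).sum()
loss.backward()             # gradients should populate without error
print("PyTorch is working:", model.weight.grad.shape)
```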
Summer 2021 Notes
Pretraining run notes:
model_2021-06-18_11_53_47 is the first successful run, using ReLU on the fully connected layers. 4x image subsampling on the 32x64 basic unit means that letters don't fill much of the space.
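A sketch of what this run's basic unit might look like; the layer widths, the pooling-based 4x subsampling, and the PatchEncoder name are illustrative assumptions, not the run's recorded configuration:

```python
import torch
import torch.nn as nn

class PatchEncoder(nn.Module):
    """Hypothetical encoder for a 32x64 basic unit.

    4x subsampling (here via average pooling) shrinks the patch to
    8x16, which is why letters occupy little of the resulting space.
    Layer widths are placeholders, not the actual run config.
    """
    def __init__(self, hidden=256, out=128):
        super().__init__()
        self.pool = nn.AvgPool2d(kernel_size=4)  # 32x64 -> 8x16
        self.fc = nn.Sequential(
            nn.Flatten(),                        # 8*16 = 128 features
            nn.Linear(8 * 16, hidden),
            nn.ReLU(),                           # ReLU on FC layers, per the note
            nn.Linear(hidden, out),
            nn.ReLU(),
        )

    def forward(self, x):                        # x: (batch, 1, 32, 64)
        return self.fc(self.pool(x))

enc = PatchEncoder()
print(enc(torch.randn(2, 1, 32, 64)).shape)      # torch.Size([2, 128])
```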
model_2021-06-22_16_58_03 implements random pixel sampling to handle images that don't fit in memory. 2x image subsampling makes letters bigger, but decreases the random sampling fraction. [SVN 2293]
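A sketch of the random pixel sampling idea, assuming it means scoring the loss on a random fraction of pixel positions so full images never need dense evaluation; the function name, sampling fraction, and use of MSE are placeholders:

```python
import torch

def sample_pixels(pred, target, fraction=0.1):
    """Evaluate a per-pixel loss on a random subset of positions.

    pred, target: (batch, H, W) tensors; `fraction` is the share of
    pixels kept. Names and values are placeholders, not run settings.
    """
    b, h, w = pred.shape
    n = max(1, int(fraction * h * w))
    # Sampling with replacement keeps the index draw cheap; a
    # permutation-based draw would avoid duplicates at some cost.
    idx = torch.randint(0, h * w, (b, n), device=pred.device)
    flat_pred = pred.reshape(b, -1).gather(1, idx)
    flat_tgt = target.reshape(b, -1).gather(1, idx)
    return torch.nn.functional.mse_loss(flat_pred, flat_tgt)

loss = sample_pixels(torch.randn(2, 512, 512), torch.randn(2, 512, 512))
```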
model_2021-06-24_17_28_20 standard size run; added character frequency weighting.
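Character frequency weighting is most often implemented as inverse-frequency class weights on the loss; whether this run used exactly that scheme is an assumption, and the corpus and alphabet below are placeholders:

```python
import torch
import torch.nn as nn
from collections import Counter

def char_weights(corpus, alphabet):
    """Inverse-frequency weights so rare characters contribute more
    to the loss. The exact scheme used in the run is an assumption;
    `corpus` and `alphabet` are placeholders.
    """
    counts = Counter(corpus)
    freqs = torch.tensor([counts.get(c, 1) for c in alphabet],
                         dtype=torch.float)      # floor of 1 avoids div by zero
    return freqs.sum() / (len(alphabet) * freqs)

alphabet = list("abcdefghijklmnopqrstuvwxyz ")
weights = char_weights("example transcription text", alphabet)
criterion = nn.CrossEntropyLoss(weight=weights)  # one weight per character class
```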
6/24 Ford342-05: standard res 32x64 unit, x2 subsampling, no character weights
6/24 Ford342-06: lo-res 16x32 unit, x4 subsampling, no character weights
6/25 Ford 354: standard res 32x64 unit, x2 subsampling, linear character weights, argmax pixels