Common Schedule

From CSclasswiki
Revision as of 11:31, 13 June 2014 by Rhuang (talk | contribs) (Week 5: 6/9 - 6/13)
Jump to: navigation, search

Alice Yang & Rui Huang


Week 1: 5/12 - 5/16

- Set up computer; install software
- Familiarize myself with Matlab, having no previous experience
- Create Matlab exercise Q/A
- Become familiar with Adriane's progress and Emma's thesis Automated Writer Identification for Syriac Scribes

Week 2: 5/19 - 5/23

In Adriane's function processSyriacAnnotations.m, we saved the two structural arrays lftr and ldoc. lftr and ldoc are both structural arrays containing 14 fields which correspond to different letters in the Syriac alphabet. Each field is a cell array of various length, in which feature vectors describe a sample of that letter found in a manuscript. Inside each cell array, the first two indices indicate the character's global skew, while the rest of the indices in groups of 7 are feature vectors that describe each feature/cavity's 7 parameters of transformation (translation, rotation, elongation, and shear). At same index of each sample in ldoc is the string holding the source manuscript where the sample came from.

Here are screenshots from Matlab:

  • lftr


Lftr.png

  • lftr.Alaph


Lftr.Alaph.png


  • lftr.Alaph{1,1}


Lftr.Alaphcell.png




Functions written:

  • runSyriacStyleComparison.m compares test features against sample features and ranks all sample sources in proximity to each test source. It takes 4 fixed parameters and 3 modification parameter (sampFtr, sampSrc, testFtr, testSrc + method, weights, letters). The four fixed parameters and the output rank are all structural arrays. The method parameter can be one of the three functions, simple vote, rank vote, or weighted rank vote. Users can pass in which method, which set of weights and which letters they want to use in the program.
  • filterManuscripts.m filters a set of sample features and their sources and returns a subset of features and sources corresponding to the given subset of sources.
  • identifyRepresentativeFeatureSample.m takes a cell array of sample features and returns the most representative sample feature and its index in the given array.
  • identifyRepresentativeSample.m takes a cell array of samples and returns the most representative sample and its index in the given array.

Functions in progress:

  • cleanSamples.m cleans samples of characters passed in as a cell array. The method of cleaning may be specified: ccTouch, selectLetters, predefined, predefinedFromFile. When using selectLetters method, users have the option to save (default empty string vs. a given directory) the masks generated for later use in the option to use predefined, which uses the latest masks saved. The default settings for cleaning: method = selectLetters; mask = ' ' (no mask); saveMasks = ' ' (don't save)

Week 3 & 4: 5/26 - 6/6

Please refer to Rui's page

Week 5: 6/9 - 6/13

  • Display raw manuscripts on the Representative Samples webpage
  • Functions written/revised:
    • processSyriacAnnotations.m: This function processes Syriac annotations by creating a struct array of all given samples with their source information and images. The user can add padding to the extracted samples and specify the desired data format: 'raw', 'binarized', 'grayscale' or 'blackOnWhite' by entering 'true' or 'false' after each argument.
@param dbFile a database text file containing annotations
@param varagin user can specify the directory of manuscripts, the padding in pixels added to the image coordinates, and fields of character images with different formats (raw, binarized, grayscale, blackOnWhite) of the images
ex: 'padding', 3, 'raw', true
defaults: raw = true
binarized = false,
grayscale = false,
blackOnWhite = false,
padding = 0,
directory = 'C:\MATLAB\Handwriting\Summer2014\ImageDirectory');
@returns a struct array of character sample source information and image with fields: source, url, letter, coordinates
ex. sample(1).source = 'VatSyr1'
ex. sample(1).page = '01'
ex. sample(1).url = 'http://.../VatSyr1-01.png'
ex. sample(1).letter = 'Alaph'
ex. sample(1).coordinates = [515 65 80 105]
ex. sample(1).raw = %raw image
ex. sample(1).binarized = %binarized image
    • getSampleSources.m: This function returns a struct (samples) of the data passed in from a database file.
@param dbFile The database file containing info about the character samples
ex. 'C:\MATLAB\Handwriting\Summer2014\FullDB-2014-06-09.txt'
@returns a struct array of character sample source information with fields: source, url, letter, coordinates
ex. sample(1).source = 'VatSyr1'
ex. sample(1).page = '01'
ex. sample(1).url = 'http://.../VatSyr1-01.png'
ex. sample(1).letter = 'Alaph'
ex. sample(1).coordinates = [515 65 80 105]
    • getSamples.m:
    • retrieveManuscriptImage.m:
    • extractSampleFromManuscript.m: