Yesterday I prepared two databases: one for training data and one for test data. For every symbol with at least 10 samples, I randomly chose up to 100 samples and put one third into the test database and the rest into the training database. I also wrote a benchmark script and took it for a spin last night.
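The split described above could look roughly like this in Python. This is only a minimal sketch, not the actual script: the function name `split_samples`, the `(symbol, sample)` pair format, the fixed seed, and rounding the test third down are all assumptions made for illustration.

```python
import random
from collections import defaultdict


def split_samples(samples, min_samples=10, max_samples=100, seed=0):
    """Split (symbol, sample) pairs into training and test lists.

    Hypothetical helper: symbols with fewer than `min_samples` samples are
    skipped, at most `max_samples` are drawn per symbol, and one third of
    the drawn samples (rounded down, an assumption) goes to the test set.
    """
    rng = random.Random(seed)  # fixed seed so the split is reproducible

    # Group all samples by their symbol.
    by_symbol = defaultdict(list)
    for symbol, sample in samples:
        by_symbol[symbol].append(sample)

    train, test = [], []
    for symbol, items in by_symbol.items():
        if len(items) < min_samples:
            continue  # too few samples for this symbol
        # Draw up to max_samples without replacement, in random order.
        chosen = rng.sample(items, min(len(items), max_samples))
        n_test = len(chosen) // 3
        test.extend((symbol, s) for s in chosen[:n_test])
        train.extend((symbol, s) for s in chosen[n_test:])
    return train, test
```

With 150 samples for one symbol, 12 for another and 5 for a third, this would put 33 + 4 samples into the test list and 67 + 8 into the training list, and drop the third symbol entirely.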
For every sample in the test database, the script calculates its position in the hit list of a classifier trained on all the samples in the training database. The results are disappointing, but I expected that.
- Top 1: 63.9128461189287%
- Top 2: 72.2575276138599%
- Top 3: 77.9013466485096%
- Top 4: 83.1366318656378%
- Top 5: 85.5651384475715%
Overall: 13218 tests for 873 symbols in 13413 seconds (about 3.7 hours).
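To make the numbers above concrete, here is a minimal sketch of how such cumulative Top k rates can be computed from hit lists. It is not the actual benchmark script: `rank_of`, `top_k_rates`, `benchmark`, and the `classify` callable (assumed to return a list of symbols ordered from best to worst match) are hypothetical names.

```python
def rank_of(true_symbol, hit_list):
    """Return the 1-based position of true_symbol in hit_list, or None."""
    for position, symbol in enumerate(hit_list, start=1):
        if symbol == true_symbol:
            return position
    return None  # the right symbol did not show up at all


def top_k_rates(ranks, max_k=5):
    """Percentage of tests whose rank is <= k, for k = 1..max_k."""
    total = len(ranks)
    if total == 0:
        return {k: 0.0 for k in range(1, max_k + 1)}
    return {
        k: 100.0 * sum(1 for r in ranks if r is not None and r <= k) / total
        for k in range(1, max_k + 1)
    }


def benchmark(classify, test_samples, max_k=5):
    """Run classify on every (symbol, sample) pair and report Top k rates.

    `classify` is an assumed callable: sample -> ordered list of symbols.
    """
    ranks = [rank_of(symbol, classify(sample)) for symbol, sample in test_samples]
    return top_k_rates(ranks, max_k)
```

The rates are cumulative, so Top 5 counts every test in which the right symbol appeared anywhere in the first five positions.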
Of course, for an interactive application like detexify, the most important number is not the Top 1 recognition rate, as it would be for most OCR applications, but the Top 5 recognition rate. If the right symbol is in the Top 5, it shows up in the result list right away and no further searching is required. I’ve already done some work on alternative classifiers (both completely different ones and variations of the original with only the parameters tweaked), and I will evaluate the performance of each against this benchmark.