But this post won’t deal with geometric adventures, as some of my previous ones did.
Tesseract is also an OCR engine dedicated to extracting text from printed (scanned) material.
Meroitic was a language and script used in Meroë and the Sudan during the Meroitic period (attested from 300 BCE), which went extinct around 400 CE. For purposes beyond the scope of this discussion, I needed to OCR some Meroitic text in hieroglyphic form. Btw, maybe (or maybe not) these purposes were related to some derivative work based on the Cthulhu Mythos.
So, to begin with, I had some pages written in Meroitic that I wanted to transliterate into the Latin alphabet. The Meroitic alphabet is pretty small.
As there is no language data for Meroitic on Tesseract’s site, we’ll have to “train” Tesseract to recognize it. Fortunately it’s…