Icelandic OCR
In fact, all languages supported by the open-source Tesseract project should be available to Adobe customers. What is more, Adobe licenses its OCR from I.R.I.S. S.A., and Readiris 17 already supports 130 languages, including Icelandic. It is time to update, or to drop some license restrictions and move to open source. What are you paying for, other than sub-par technology? Tesseract is a project supported by Google and released under the Apache 2.0 license.
Foxit and ABBYY FineReader also support Icelandic. Adobe is severely lagging behind here. The Tesseract engine is particularly good at character recognition but does not have a very nice interface. There are scholars all over the world who study this language. Editions of medieval texts are especially popular, and Tesseract is actually pretty good at older editions of manuscripts, thanks to community-based work on Tesseract training. It's a shame not to be able to scan these documents and run OCR on them straight from an Adobe product. That would make it much easier for translators to get text into a computer-aided translation tool, and could provide the impetus for a windfall of new publications.
When an open-source solution is available, it is a real shame that a paid consumer product cannot offer the same level of language support. I chose Adobe over the other commercial options for access to a wider range of professional-level software, but I am left disappointed by this aspect of it, while every other piece of OCR software on the market today beats Adobe in this field.
This is a sore disappointment, and has been for some time now.
-
Anonymous commented
So, when will Icelandic be available?
-
Joren Roth commented
I second this motion wholeheartedly. I have encountered the exact same barrier as the OP when trying to export text from a PDF with unrecognized characters (specifically Old Norse/Icelandic). Why isn't character recognition tied to Unicode? It doesn't have to be language-specific if all it is recognizing is a specific character that has already been encoded.
I hope this feature can be implemented soon, as it is crucial to my needs as an Adobe Acrobat customer.
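To illustrate the point about characters that are already encoded, here is a minimal sketch using Python's standard `unicodedata` module. The sample string and the `describe` helper are my own assumptions for illustration, not anything from Adobe or Tesseract, and the sample is not an exhaustive list of Icelandic or Old Norse letters.

```python
import unicodedata

# Illustrative sample only (an assumption, not an exhaustive alphabet):
# thorn, eth, ae, o with diaeresis, o with ogonek (used in normalized Old Norse).
SAMPLE = "\u00fe\u00f0\u00e6\u00f6\u01eb"


def describe(text):
    """Return (character, code point, Unicode name) for each character in text."""
    return [(ch, f"U+{ord(ch):04X}", unicodedata.name(ch)) for ch in text]


if __name__ == "__main__":
    for ch, code_point, name in describe(SAMPLE):
        print(ch, code_point, name)
```

Each of these letters has its own code point and name in Unicode, so the output text itself is not the obstacle. Whether an OCR engine can recognize the shapes is a separate matter of its trained language data.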
-
Hi,
Thank you for your suggestion.
I'll forward this to the team.
Regards,
Gaurav