Icelandic OCR

In fact, every language supported by the open source Tesseract project should be available to Adobe customers. Better yet, Adobe licenses its OCR from I.R.I.S. S.A., and Readiris 17 already supports 130 languages, including Icelandic. It is time to update, or to drop some license restrictions and move to open source. What are you paying for, other than sub-par technology? Tesseract is a project supported by Google and released under the Apache 2.0 license.

Foxit and ABBYY FineReader also support Icelandic. Adobe is lagging severely behind here. The Tesseract engine is particularly good at character recognition but does not have a very nice interface. Scholars all over the world study this language. Editions of medieval texts are especially popular, and Tesseract is actually pretty good at older editions of manuscripts, thanks to community-based work on Tesseract training. It's a shame not to be able to scan these documents and run OCR on them straight from an Adobe product. That would make it much easier for translators to get text into a computer-aided translation tool, and could provide the impetus for a wave of new publications.
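For anyone who wants to try Tesseract in the meantime, here is a minimal sketch of driving its command line from Python. It assumes the `tesseract` executable and its Icelandic trained data (language code `isl`) are already installed; the function names and the file names in the usage note are hypothetical, chosen only for illustration.

```python
import shutil
import subprocess
from pathlib import Path


def build_tesseract_command(image_path, output_base, lang="isl"):
    """Build the argument list for the tesseract command line.

    "isl" is the language code of Tesseract's Icelandic trained data.
    """
    return ["tesseract", str(image_path), str(output_base), "-l", lang]


def ocr_icelandic(image_path, output_base):
    """Run Tesseract on one image and return the recognized text.

    Assumes tesseract and the "isl" trained data are installed locally.
    """
    if shutil.which("tesseract") is None:
        raise RuntimeError("tesseract executable not found on PATH")
    subprocess.run(build_tesseract_command(image_path, output_base), check=True)
    # For plain-text output, Tesseract appends ".txt" to the output base.
    return Path(f"{output_base}.txt").read_text(encoding="utf-8")
```

Calling `ocr_icelandic("scan.png", "scan")` would write `scan.txt` next to the script and return its contents. How good the result is depends on the scan and on the trained data, so treat this as a starting point only.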

When an open source solution is available, it is a real shame that a paid consumer product cannot offer the same level of language support. I chose Adobe over the other commercial options for access to a wider range of professional-level software, but I am left disappointed by this aspect of it, while every other piece of OCR software on the market today beats Adobe in this field.

This is a sore disappointment, and has been for some time now.

9 votes

Ryan E. Johnson shared this idea · May 10, 2018

Resolved · Jun 27, 2019

Anonymous commented · January 5, 2021 4:52 PM

So, when will Icelandic be available?

Joren Roth commented · February 9, 2020 3:16 PM

I second this motion wholeheartedly. I have encountered the exact same barrier as the OP when trying to export text from a PDF with unrecognized characters (specifically Old Norse/Icelandic). Why isn't character recognition tied to Unicode? It doesn't have to be language-specific if all it is recognizing is a specific character, which has already been encoded.

I hope this feature can be implemented soon, as it is crucial to my needs as an Adobe Acrobat customer.

Admin Gaurav Maheshwari (Software Engineer, Adobe) commented · June 27, 2019 7:26 AM

Hi,

Thank you for your suggestion.
I'll forward this to the team.

Regards,
Gaurav
