  1. 109 votes

    James Keeline commented:

    Ben, I think we can all agree that a 4,000-page document is a lot to OCR at once. It would be a particular issue if the PDFs are very large, leaving little RAM for the page processing.

    I have noticed that when Acrobat "Pro" is doing certain actions on my MacBook Pro (2020), it grabs disk space and doesn't return it until the application is restarted.

    I wonder if you have tried my suggestion of a parallel OCR solution called OCRmyPDF. It is a Python program that uses the Tesseract engine for the work. On my machine (6-core i7, 16 GB RAM), you can tell when it is running because the fans spin up. It is a bit faster than Acrobat "Pro" and seems more reliable, but I have not tried to do more than 1,000 pages at a time.
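
For anyone who wants to try it, here is a minimal sketch of how an OCRmyPDF run with parallel workers might be launched from Python. It assumes the `ocrmypdf` command is installed and on the PATH, and it relies on the tool's `--jobs` option to set the number of worker processes. The helper names `build_ocrmypdf_command` and `run_ocr` are hypothetical, made up for this illustration.

```python
import os
import shutil
import subprocess


def build_ocrmypdf_command(input_pdf, output_pdf, jobs=None):
    """Build the argument list for an OCRmyPDF run (hypothetical helper).

    If jobs is not given, fall back to the number of CPUs reported by the OS.
    """
    if jobs is None:
        jobs = os.cpu_count() or 1
    return ["ocrmypdf", "--jobs", str(jobs), input_pdf, output_pdf]


def run_ocr(input_pdf, output_pdf, jobs=None):
    """Run OCRmyPDF on one file; raises if the tool is missing or fails."""
    # Assumption: the ocrmypdf executable is installed and on the PATH.
    if shutil.which("ocrmypdf") is None:
        raise RuntimeError("ocrmypdf was not found on the PATH")
    subprocess.run(
        build_ocrmypdf_command(input_pdf, output_pdf, jobs),
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
```

Something like `run_ocr("scan.pdf", "scan_ocr.pdf", jobs=6)` (placeholder file names) would then use six workers, matching a 6-core machine. For very long documents it may still be sensible to start with a smaller file, since memory use grows with the number of pages processed at once.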

    If you care to supply a sample file, I can give it a try and let you know the results if you don't care to do it yourself.

    But Acrobat "Pro" should be made multithreaded, rather than just getting a different UI. I finally turned off the new UI and am happier being able to find certain functions that were well hidden in the new one.