It would be great if Acrobat were multi-threaded. I routinely have to OCR 4,000-page documents, which almost always fail and take hours.
Acrobat needs to be multithreaded. I have a top-of-the-line machine with 32 GB of RAM, an i7-6700K processor, and an M.2 SSD, yet I am unable to successfully run OCR on a 4,000-page document without splitting it into 250-500 page chunks, which is cumbersome. Even then, Acrobat will only use 12% of my resources.
Yet another reason to finally make Acrobat 64-bit and leave the Windows 3.1 days behind.
James D. Keeline commented
4,000 pages is a lot. There is merit in doing it in pieces and combining them afterward.
When I don't care to wait for Adobe Acrobat Amateur (it does not deserve the "Pro" label) to OCR a document, I use OCRmyPDF, a Python-based tool, on my 16" MacBook Pro 2020 (32G RAM, 2T SSD). As I recall from the installation, there were a few dependencies, but ultimately I got it going with "brew." Now I can OCR from the command line (terminal). When it is running, the fans and processor meters show that all of the resources are being employed. The time required depends on the resources but is about 1/3 that of Acrobat on an 8-core processor.
It is a tad ironic that a free, open source program can out-perform the expensive Adobe product which has not improved in this area in the past 10-20 years.
How is it 2021 and this still uses a single thread?
Why does it need to take an hour to OCR 1000 pages?
Hear, hear! Just spent a very frustrating afternoon (read: 4 hours) trying to Bates stamp docs in Adobe. Tried multiple versions of Adobe and kept hitting crashes. Even tried multiple computers, same results. Installed a trial of Foxit and while it wasn't blazing fast, it got the job done without complaining.
Safe to say we're reevaluating using Acrobat in our office and are now looking at the alternatives. That's 4 hours of my life I can't get back, not to mention 4 billable hours wasted. I can't bill the client for that time.
STILL WAITING ON THIS, ADOBE
Just use PDF-XChange Editor. It's cheaper (about $50) and works better for 99% of everything you need to do with a PDF. It uses multiple threads and cores. The only thing I could not do with PDF-XChange Editor is remove duplicates, which in Acrobat is handled by a plugin from a third-party provider. It's just not worth paying for DC imho.
James D. Keeline commented
I have seen this sentiment raised for about a decade. Considering the millions of dollars (or equivalent currency) that people pay for Adobe Acrobat Pro, it is well past the point when it should behave like a fully professional program. This means rock-solid stability and reliability and speed. If we invest in hardware with ample RAM, multi-core processors, and good CPU speed, the software should take advantage of the available resources. Doing otherwise is cheating us out of the productivity we deserve.
When I use software like HandBrake, it is immediately obvious because the fans spin up and any metering software will show that the resources are being fully utilized. Video is harder to process in a multicore, multithreaded environment, but they manage. Why can't Adobe do the same for the processes we use every time we open Acrobat Pro? This includes OCR ("Text Recognition") and building PDFs from images.
This should be faster than it is with a brand new MacBook Pro 2020 with 16G RAM and 2.6 GHz 6-Core Intel Core i7. For the OCR process, it is not really different than my old computer, a mid-2014 MacBook Pro with only 8G RAM.
Tom Bilan commented
I second this. I have an 8 core computer and this process is a perfect candidate for parallelization. I think all that Adobe would need to do is divide the document up into X number of smaller documents and then process them in separate streams then combine the results. X = # of cores. It doesn't seem like too hard of a computer science problem and would be a big win for anyone who's bought a computer in the last decade since everything is multi-core now.
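To make the idea above concrete, here is a minimal Python sketch of the split / process / combine approach. It is only an illustration, not how Acrobat or any real OCR engine works: `ocr_chunk` is a hypothetical stand-in that returns placeholder text, and the function names are made up for this example.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(total_pages, num_chunks):
    """Return (start, end) page ranges (end exclusive) that cover every page once."""
    num_chunks = max(1, min(num_chunks, total_pages))
    base, extra = divmod(total_pages, num_chunks)
    ranges = []
    start = 0
    for i in range(num_chunks):
        # The first `extra` chunks take one additional page each.
        size = base + (1 if i < extra else 0)
        ranges.append((start, start + size))
        start += size
    return ranges


def ocr_chunk(page_range):
    """Hypothetical placeholder for a real OCR call on one range of pages."""
    start, end = page_range
    return [f"text of page {page + 1}" for page in range(start, end)]


def parallel_ocr(total_pages, workers=None):
    """Split the page range, process the chunks concurrently, and recombine in order."""
    workers = workers or os.cpu_count() or 1  # X = number of cores
    ranges = split_into_chunks(total_pages, workers)
    # A thread pool keeps the sketch simple; a CPU-bound engine written in Python
    # would more likely use ProcessPoolExecutor, which offers the same map() call.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        chunk_results = pool.map(ocr_chunk, ranges)  # map() preserves chunk order
        return [page_text for chunk in chunk_results for page_text in chunk]


if __name__ == "__main__":
    pages = parallel_ocr(4000, workers=8)
    print(len(pages), "pages recognized")
```

Because `map()` hands results back in the order the chunks were submitted, the pages come out in document order without any extra sorting step.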
Michel Phillips commented
I agree. I have a fast, new-ish CPU and plenty of RAM, but I can't OCR more than about 108 pages without Acrobat DC crashing. For a 1500 page document (which I routinely deal with), this means I have to run OCR manually 14 or 15 times. On Acrobat Pro XI I would start the OCR process when I was leaving for the day, and when I came back in the morning my entire document would be OCR'd.
New products are supposed to be BETTER. Not WORSE.
I have the same problems and get very frustrated with minute-long load times on documents over 3000 MB.
It's been over a year! Has this been implemented yet?
Please also improve the Embed Index tool to use multiple threads and make better use of the CPU. OCR and index embedding go hand in hand.
Adminrishusha (Admin, Adobe) commented
We have raised a feature request for this and shall update you about the same.
mohamed emad commented