Export to MS Word without putting every **** piece of text in a text box
I can see that certain documents were created in MS Word with simple flowing text and standard paragraph styles, i.e. no **** text boxes. Even with the export options turned off, Acrobat does this incredibly annoying export to text boxes, even where the content is simply run-on (flowing) text. You then have to extract the text, box by box, to get the document into an editable form. Exporting to RTF has similar issues, and exporting to plain text means you have to redo all the formatting.
In the end, you waste valuable time either extracting the text or reformatting it to get something useful.
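For anyone stuck with the box-by-box extraction, here is a rough Python sketch of how it could be scripted rather than done by hand. It rests on assumptions I have not verified against Acrobat's output: that the exported .docx stores each box as a `w:txbxContent` element in `word/document.xml`, and that the order of the boxes in the file matches the reading order. The function names are my own, not part of any Adobe or Microsoft tool.

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespaces used by WordprocessingML (.docx) files.
W = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"
MC = "{http://schemas.openxmlformats.org/markup-compatibility/2006}"

TXBX = W + "txbxContent"   # container for the contents of a text box
PARA = W + "p"             # paragraph
TEXT = W + "t"             # run of text
FALLBACK = MC + "Fallback" # legacy copy of a drawing; skipped to avoid duplicates


def paragraphs_from_document_xml(xml_bytes):
    """Return the text of each paragraph found inside a text box, in file order."""
    root = ET.fromstring(xml_bytes)
    found = []

    def walk(element, inside_box):
        for child in element:
            if child.tag == FALLBACK:
                continue  # the same box is usually repeated here for old readers
            if inside_box and child.tag == PARA:
                # Join every text run in the paragraph (anything nested is folded in).
                found.append("".join(t.text or "" for t in child.iter(TEXT)))
                continue
            walk(child, inside_box or child.tag == TXBX)

    walk(root, False)
    return found


def text_box_paragraphs(docx_path):
    """Read word/document.xml from a .docx file and collect its text-box paragraphs."""
    with zipfile.ZipFile(docx_path) as archive:
        return paragraphs_from_document_xml(archive.read("word/document.xml"))
```

Calling `text_box_paragraphs("exported.docx")` (the file name is a placeholder) would give a list of strings to paste into a clean document. It recovers the text only, so styles and formatting still have to be reapplied.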
Stephen Yuenger commented
I am scanning long documents of scripts for plays, and having to manually edit the resulting text boxes makes this a non-solution. It's easier to retype the whole thing.
I did not see any example files attached for you to review, Jessica, so here are some. RTF format worked to remove the multiple text boxes, but look at how the highlighted text is handled in some of the areas. Thanks to whoever is willing to help. Take care & KLUJICS!
Jessica Rainey commented
The best way I have found to correct this issue is to open Word (Microsoft 365) and then open the PDF in Word. Microsoft Word does a better job of converting the PDF to a Word document than Adobe Acrobat DC. The previous Adobe Acrobat Pro version did a good job too, but DC is horrible. If you want to use DC to export to Word, the only way I have figured out to avoid the text boxes is to go into Preferences/Convert From PDF/Word Document/Edit Settings and, under Layout Settings, select "Retain Flowing Text." If "Retain Page Layout" is selected, then every line is placed in its own text box (which is ludicrous and stupid - but that's what it does). I could deal with a whole paragraph in a text box, but come on, every line!!! Insane. What is even more ridiculous is that if the PDF is OCR'd first and the correct text function is used, those corrections are not saved for the export to Word. WTF! I corrected the OCR, so the export should retain the corrections, but no, it runs OCR again on export and reintroduces the same errors. The OCR in DC is not as good as it was in Pro.
Gerald Montagu commented
However, I have just, completely by chance, found a fix. Try exporting your file from InDesign to a PDF. Then open that PDF in Word for Windows. You should then have a fully stable Word file that you can work with (including running comparisons). The key seems to be that Word's import process works completely differently from Acrobat's export process: it will be very evident that the resulting file is structured completely differently. The difference in the end results is so great that it must be strongly arguable that when Adobe claim Acrobat 'exports' to Word they are effectively mis-selling. Please note that this fix does not seem to work if you try to open the PDF in Word for Mac (as of September 2018), because the import process there is (sadly) nothing like as refined as it is in the Windows version.
Girija Agarwala (Admin, Adobe) commented
Could you please share the following information with us:
1. The PDF files and the output Word files, with the problem areas highlighted.
2. The OS platform that you are using.
3. The Acrobat dot version you are on.