Rebuild a Word Document From PDF
File Juicer is first and foremost an image and text extractor, but if you have Mac OS X 10.4, 10.5 or 10.6 you can use it to convert simple PDF files to Word, RTF or plain text, unless they are scanned or encrypted.
Information about the structure of the Word document is not saved in the PDF file when it is generated, and File Juicer will not try to recreate it. You can extract the text from PDF documents as RTF (rich text format), and this may be good enough if you don't need to preserve multicolumn or table layout.
Converting Scanned Documents to Text
File Juicer does not convert scanned images to text. That process is called optical character recognition (OCR). Three classic applications do this: Adobe Acrobat X Professional [Mac], OmniPage Pro X for Macintosh and Readiris Pro 12 For Mac. You can also choose to buy a scanner (Canon from Amazon) which comes with bundled OCR software (read the description closely). VueScan will also do OCR, and while not as advanced, it may be enough to cover your needs.
Also visit Apple's Mac App Store and do a search for OCR there. There are several OCR apps available, some more advanced than others and some more accurate than others. Accuracy is one of the things you pay a premium for. ABBYY FineReader Express has gotten good reviews as of this writing.
Using a Scanning Service
If you are not ready to learn the art of OCR, you can hire an OCR service to do it. They may offer affordable prices, in particular if they have offices in India. One example is New York Document Scanning, or you can do a Google search.
PDF To Word via RTF or ASCII Text
RTF was developed by Microsoft to carry formatted text between applications. Word, TextEdit and other applications can open it, retaining the fonts, font sizes and colors. It will not preserve layout.
File Juicer uses the same PDF to RTF engine as Apple's Preview, and you can do the same extraction with Preview if you copy and paste the text out of each page of the PDF. File Juicer extracts images from the PDF and places them in a separate folder. You place them manually in the Word document when you have recreated the layout.
These are the File Juicer preferences I would recommend for extracting the text and images needed to rebuild a Word file.
I recommend extracting both ASCII and RTF as sometimes it is easier to rebuild the document from pure text without the formatting. Word lets you use abstract names for the formatting like "Heading 1" or "Normal". In the RTF file, this is replaced by the actual font names and sizes used - like "Arial 16" and "Times 12".
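If the RTF file is long, a small script can help you see which font sizes it uses, so you know which ones to turn back into "Heading 1" or "Normal" in Word. The following is only a rough sketch and not something File Juicer does: it relies on the RTF control word \fsN, which gives the font size in half-points, and the function names and the rule "most common size is body text, larger sizes are headings" are my own assumptions for illustration.

```python
import re
from collections import Counter


def font_sizes_in_rtf(rtf_text):
    """Count how often each font size (in points) is set in an RTF string."""
    sizes = Counter()
    # \fsN sets the font size in half-points, so \fs32 means 16 pt.
    for match in re.finditer(r"\\fs(\d+)", rtf_text):
        sizes[int(match.group(1)) / 2] += 1
    return sizes


def suggest_styles(sizes):
    """Guess a Word style name for each font size (hypothetical heuristic)."""
    if not sizes:
        return {}
    # Assumption: the most frequently set size is the body text.
    body_size = sizes.most_common(1)[0][0]
    mapping = {body_size: "Normal"}
    # Assumption: every larger size is a heading, biggest first.
    larger = sorted((s for s in sizes if s > body_size), reverse=True)
    for level, size in enumerate(larger, start=1):
        mapping[size] = f"Heading {level}"
    return mapping


if __name__ == "__main__":
    sample = (
        r"{\rtf1\ansi{\fonttbl{\f0 Arial;}{\f1 Times;}}"
        r"\f0\fs32 Title\par\f1\fs24 Body one\par\fs24 Body two\par}"
    )
    print(suggest_styles(font_sizes_in_rtf(sample)))
```

The guess will be wrong for documents that use small captions or bold body-sized headings, so treat the result as a starting point and check it against the original PDF.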
Extract Images from Files
File Juicer is a general purpose extraction tool designed to search inside any file to see if there are images in any standard format. It was originally made to extract images from PowerPoint files, but since then it has been extended to recognize a lot of file formats.
Extracting images from PDF is done without re-compression so it preserves all the quality that was saved in the PDF file originally.
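To give an idea of how images can be pulled out of a file without re-compression, here is a minimal sketch of the general technique of scanning the raw bytes for JPEG start and end markers and copying what lies between them unchanged. It is not File Juicer's actual code, and the function name carve_jpegs is made up for this example. It assumes the JPEG data is stored unmodified inside the file, and it will cut an image short if that image contains an embedded thumbnail.

```python
def carve_jpegs(data):
    """Return every byte range in `data` that looks like a JPEG image."""
    start_marker = b"\xff\xd8\xff"  # JPEG start-of-image plus next marker byte
    end_marker = b"\xff\xd9"        # JPEG end-of-image
    found = []
    pos = 0
    while True:
        start = data.find(start_marker, pos)
        if start == -1:
            break
        end = data.find(end_marker, start + len(start_marker))
        if end == -1:
            break
        # Copy the bytes as they are, so no image quality is lost.
        found.append(data[start:end + len(end_marker)])
        pos = end + len(end_marker)
    return found


if __name__ == "__main__":
    import sys

    # Usage: python carve.py some_file.pdf
    with open(sys.argv[1], "rb") as source:
        images = carve_jpegs(source.read())
    for number, image in enumerate(images, start=1):
        with open(f"image_{number}.jpg", "wb") as target:
            target.write(image)
```

Because the bytes are written out exactly as found, the saved files have the same quality as the images stored in the original file.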
Rebuilding Word documents from other files
You can download and try File Juicer for free for just this one function from the File Juicer page, but you may also check out its other functions by browsing the User guide and the File Format tips.