File Juicer: Byte-by-Byte Search, Find, and Extract
File Juicer scans files byte by byte to
identify and extract known formats, including JPEG,
PNG, GIF, PDF, BMP, WMF, EMF, PICT, TIFF, Flash, ZIP, HTML, WAV, AVI, MOV, MPEG, WMV, MP3, MP4, AU, and
AIFF.
The only requirement for this to work is that the file to recover must be stored in one of the formats listed
above.
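As a rough illustration of what a byte-by-byte search for known formats looks like, here is a minimal Python sketch. It is not File Juicer's actual code: the `SIGNATURES` table and the `find_signatures` helper are hypothetical names, and only a handful of well-known file signatures are included.

```python
# Illustrative only: a few well-known file signatures ("magic numbers").
SIGNATURES = {
    b"\xff\xd8\xff": "JPEG",
    b"\x89PNG\r\n\x1a\n": "PNG",
    b"GIF87a": "GIF",
    b"GIF89a": "GIF",
    b"%PDF-": "PDF",
    b"PK\x03\x04": "ZIP",
}


def find_signatures(data: bytes):
    """Return sorted (offset, format) pairs for every signature found in data."""
    hits = []
    for magic, name in SIGNATURES.items():
        start = data.find(magic)
        while start != -1:
            hits.append((start, name))
            start = data.find(magic, start + 1)
    return sorted(hits)


if __name__ == "__main__":
    # A made-up container: some junk, a PNG header, filler, then a PDF header.
    blob = b"junk" + b"\x89PNG\r\n\x1a\n" + b"...." + b"%PDF-1.4"
    for offset, fmt in find_signatures(blob):
        print(f"{fmt} signature at byte {offset}")
```

Finding a signature only marks where a file may start. A real extractor also has to work out where the file ends, which differs for each format.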
File Juicer does not decode or reencode data. For example, dropping an MP3 file onto File Juicer will not
convert it into a WAV or AAC file.
However, it will extract album cover art from MP3 files, as they can contain images.
For converting from one video file format to another, I recommend checking out VLC or searching for a dedicated video converter in Apple's App
Store.
Text extraction is somewhat different. The requirement here is “fuzzy”: there should not be too much binary data
nearby. More details on text extraction are provided below.
1 Feeding File Juicer
You can tell File Juicer which files to search by dragging
and dropping them, or by selecting files and folders through the File menu.
Drag files or folders into the File Juicer window.
If you have many windows open, you can still perform this action. Start the drag from Finder
(or another application), and while holding down the mouse button, switch applications using
Command-Tab.
Drag the files onto File Juicer's Dock
icon, or onto the File Juicer application icon in Finder.
Select the files and folders from the File -> Open...
menu.
Here, I have selected Safari’s cache folder, which usually contains around 1,000
images—mostly ads.
You can select multiple files and folders at once by holding down the Shift key while
selecting them.
From the Menu
There are shortcuts for juicing the caches of web browsers like Safari and Google Chrome, and
for the temporary images different applications save.
From Finder
You can "juice" .EXE, .PPS, .PPT and .PDF files from
Finder, by holding down the Control key while clicking on the file.
2 The Preferences
File Juicer can search for multiple file formats, and searching for all formats will take more time.
Decoding email attachments requires approximately twice the effort compared to searching for other file
formats.
Email attachments are a special case because File Juicer must perform base64 decoding, which is used to
send various file types via email. File Juicer decodes the attachment and attempts to identify the file
type if the file name or type is not immediately available.
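The decoding step can be sketched with Python's standard `email` module. This is an assumption-laden illustration, not how File Juicer is implemented: `guess_type` and `extract_attachments` are hypothetical helpers, the type table is deliberately tiny, and the `attachment.` fallback name is made up for the example.

```python
import email
from email.message import EmailMessage


def guess_type(data: bytes) -> str:
    """Guess a format from the first bytes (tiny, illustrative table)."""
    if data.startswith(b"\x89PNG\r\n\x1a\n"):
        return "png"
    if data.startswith(b"\xff\xd8\xff"):
        return "jpg"
    if data.startswith(b"%PDF-"):
        return "pdf"
    return "unknown"


def extract_attachments(raw_message: bytes):
    """Return (name, decoded bytes) for each attachment in a raw message."""
    msg = email.message_from_bytes(raw_message)
    found = []
    for part in msg.walk():
        if part.is_multipart():
            continue
        name = part.get_filename()
        if name is None and part.get_content_maintype() == "text":
            continue  # plain message body, not an attachment
        payload = part.get_payload(decode=True)  # undoes the base64 encoding
        if not payload:
            continue
        if name is None:
            # No file name in the message: identify the type from the data.
            name = "attachment." + guess_type(payload)
        found.append((name, payload))
    return found


if __name__ == "__main__":
    # Build a small test message with one unnamed PNG-like attachment.
    message = EmailMessage()
    message["Subject"] = "demo"
    message.set_content("See attachment.")
    message.add_attachment(b"\x89PNG\r\n\x1a\n" + b"\x00" * 8,
                           maintype="application", subtype="octet-stream")
    for name, data in extract_attachments(message.as_bytes()):
        print(name, len(data), "bytes")
```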
File Juicer has one requirement for the files it searches: the images must be stored in their original
format within the files. In Flash, QuickTime, and Windows Media files, the original files are sometimes
converted into an internal format that File Juicer cannot process.
File Juicer can squeeze a large number of files out of browser caches or folders containing
many images. By default, the results are saved in a folder on the Desktop, but you can configure File
Juicer to save the results next to the original files (like StuffIt Expander), or choose a different
location.
Creating thumbnails can take some time, but they are useful for getting an overview of the images. You
can enable this option and skip the thumbnail creation process manually if it takes too long.
To avoid saving duplicates, File Juicer compares the contents of the images it finds. If it detects the
same image more than once, it will not save the duplicates, even if they have different names or
dates.
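One common way to compare contents regardless of name or date is to hash the bytes of each file. The sketch below assumes that approach. File Juicer may compare images differently, and `unique_by_content` is a hypothetical name.

```python
import hashlib


def unique_by_content(files):
    """Keep the first (name, data) pair seen for each distinct content."""
    seen = set()
    kept = []
    for name, data in files:
        digest = hashlib.sha256(data).hexdigest()
        if digest in seen:
            continue  # same bytes already saved under another name
        seen.add(digest)
        kept.append((name, data))
    return kept


if __name__ == "__main__":
    found = [("ad1.jpg", b"same bytes"), ("ad2.jpg", b"same bytes"),
             ("photo.jpg", b"other bytes")]
    print([name for name, _ in unique_by_content(found)])
```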
The checkbox labeled “Organize files in folders for each format” is useful when extracting numerous
files of different formats. If you are generating icons to test for potential file corruption, File
Juicer will place files for which it could not generate icons into a separate folder.
If (when!) I learn about bugs, I will fix them and update the web site.
3 File Formats
Up-to-date information is
in the File Formats List. I have included information
on over 100 common and less common file formats, detailing what they may contain and providing hints for
relevant applications.
Here is a short list of the formats File Juicer handles:
Microsoft PowerPoint, Word & Excel. This is also where most WMF and EMF files are found, as they are
the native clipboard types for images on Windows.
RAW files (from Canon digital cameras) can contain a JPEG version of the image taken. File Juicer makes
it easy to extract them.
Web browser cache files from Safari, Mozilla and Internet Explorer. You can find those in the "Library"
folder (see the screenshot of the "Open" sheet above).
Applications can be dropped on File Juicer, and the images known to File Juicer will be extracted. You
can do that by using the "Contextual Menu" in Finder too.
Apple Keynote
Email attachments from Apple Mail. You can extract those using Mail too, and that is the preferred
way, but it is not practical for mailboxes that are not in use, are damaged, or are stored on
CDs.
QuickTime and Flash files are examples of file formats which do contain images, but the images are not stored
in any of the formats File Juicer currently supports.
PDF files can contain bitmap images in JPEG format as well as in losslessly compressed formats. File
Juicer can save the JPEG images it finds both in their original form and as PDF images. This is because PDFs
include color management, which can add color information to JPEGs. When File Juicer extracts
images from a PDF, it also copies the color information. Lossless images are saved only as PDF images.
EXE files. Flash animations can be saved as
self-playing Windows applications, which means Windows users without a Flash player can still view them.
However, this makes them unusable on macOS. Not anymore! Check the “Flash” and “Inflate” checkboxes in
the preferences and process the files. There’s a good chance you will recover the Flash animation.
Occasionally, the initial extraction may yield files with the “.inflated” extension instead of Flash
files. Extract the largest of these to see whether the Flash file is contained within.
Some applications (.prc files) for the Palm handheld are distributed as .exe files, even though Mac users
cannot run them. If these .exe files were made with WinZip, File Juicer can extract them.
File Juicer can find text in most files. Use the preferences in TextEdit to set the encoding, if you
extract from files coming from Windows.
For more info about which formats File Juicer has extracted files from, see the File Juicer Formats page.
4 Flash Cards and Disk Images
File Juicer can extract files from disk images. This is useful for
recovering files that were accidentally deleted.
The situation where I’ve needed this most is when recovering photos from the flash card of my digital camera.
You can try this on your flash card now, after you’ve read the images in your preferred way.
This gives you an idea of what to expect if you ever need it.
File Juicer uses Apple’s "Disk Utility" tool to create a disk image of your flash card.
To create a disk image from a flash card, choose "Flash Card..." from the File menu, and File Juicer will
read the flash card,
extract the files, and save the disk image.
When recovering images from flash cards, you may encounter many tiny JPG files, which are remnants of deleted
images.
To distinguish between good and bad files, you can set File Juicer’s preferences to generate thumbnail icons
or simply sort by file size.
If you’ve erased individual images, taken new ones, and erased them again on the flash card,
the results may be less successful, as the images may be fragmented and stored in different parts of the
card. I typically empty my flash card completely when I connect it to my Mac, and I’ve found images that
were over a year old on my cards.
You can make disk images from any type of disk and try extracting from those as well. However, it may be
very slow if you attempt to juice disk images that are larger than
the amount of RAM in your Mac.
Text can sometimes get stuck in unreadable files, mixed with binary data.
File Juicer can search any file for data that might be text and extract it into a text file,
which can be opened with any text editing application.
In the screenshot, I’ve filtered a JPEG file, and the small amounts of text inside have been extracted. File
Juicer names the file with a percentage in the filename, which indicates how much of the file consists of
text. For the MP3 file, it was about 1%. Even with files created by word processors, this percentage can be
well below 50%.
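The idea of pulling text out of binary data and reporting it as a percentage can be sketched as follows. This is only an approximation in the spirit of the Unix `strings` tool: the minimum run length of 4 printable ASCII characters is my assumption, not File Juicer's actual rule, and `extract_text` is a hypothetical name.

```python
import re

# Assumption: a run of at least 4 printable ASCII bytes
# (plus tab, newline, carriage return) counts as text.
TEXT_RUN = re.compile(rb"[\t\n\r\x20-\x7e]{4,}")


def extract_text(data: bytes):
    """Return (text, percent), where percent is the share of bytes kept as text."""
    runs = TEXT_RUN.findall(data)
    text = b"\n".join(runs).decode("ascii")
    kept = sum(len(run) for run in runs)
    percent = round(100 * kept / len(data)) if data else 0
    return text, percent


if __name__ == "__main__":
    # 94 bytes of binary noise followed by 6 bytes of text.
    sample = b"\x00\x01" * 47 + b"Hello!"
    text, percent = extract_text(sample)
    print(f"{text!r} ({percent}%)")
```

Short runs are dropped because random binary data regularly contains two or three printable bytes in a row by chance.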
If you filter files from Microsoft Office or other Windows-originated files,
you may benefit from adjusting the preferences in TextEdit,
which can help interpret these files more effectively.
When File Juicer extracts text from PDF and Word documents, the text is encoded in
UTF-8,
which TextEdit handles well, even though it may not be the default setting.
6 Other Uses for File Juicer
If image files lose their extension and file type, and you do not know what they were, they cannot be
opened.
File Juicer will identify the correct file type and append the appropriate extension.
If files are stored in deeply nested folders, File Juicer can create a copy of all the images in a flat,
unnested structure.
iPhoto stores images in nested folders, and this feature can be used to extract images from there.
The Finder's search function can also be used for the same purpose.
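Copying images out of nested folders into one flat folder can be sketched like this. The extension list, the `-1`, `-2` suffix for name clashes, and the `flatten_images` helper are all assumptions made for the example, not File Juicer's behavior.

```python
import shutil
import tempfile
from pathlib import Path

# Assumption: which extensions count as images for this example.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".gif", ".tif", ".tiff"}


def flatten_images(source: Path, target: Path):
    """Copy every image under source into target, without subfolders."""
    target.mkdir(parents=True, exist_ok=True)
    copied = []
    for path in sorted(source.rglob("*")):
        if not path.is_file() or path.suffix.lower() not in IMAGE_EXTENSIONS:
            continue
        dest = target / path.name
        counter = 1
        while dest.exists():
            # Files in different folders may share a name: add a suffix.
            dest = target / f"{path.stem}-{counter}{path.suffix}"
            counter += 1
        shutil.copy2(path, dest)  # copy2 also keeps the modification date
        copied.append(dest)
    return copied


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        src, dst = Path(tmp) / "library", Path(tmp) / "flat"
        (src / "2004" / "05").mkdir(parents=True)
        (src / "2004" / "img.jpg").write_bytes(b"first")
        (src / "2004" / "05" / "img.jpg").write_bytes(b"second")
        print(sorted(p.name for p in flatten_images(src, dst)))
```

The target folder should sit outside the source folder, otherwise a second run would copy the earlier copies again.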
Recovering albums from a damaged iPhoto library. If the iPhoto library becomes damaged and iPhoto cannot
repair it,
the files are still inside the folders but are organized by date. Version 4 of iPhoto
(though not newer versions) also creates album folders with aliases to the original images.
If you drop one of these folders onto File Juicer, it will create copies of the original files—not the
aliases.
Stripping icons from images. This may be useful in certain cases,
perhaps saving a bit of disk space or removing icons that do not work well outside of macOS.
However, this is a limited-use feature, as most software that doesn't support icons will simply ignore
them.
7 Getting Overview of Many Images
With File Juicer you can
collect many images in just one juicing. File Juicer can put icons on images of the following types: JPG, GIF, TIFF, PNG,
and PDF, so getting an overview in Finder is easy.
Two options from the preferences are:
Preserve Dates - this copies the date from the files searched to the files extracted. This helps you
get an overview in Finder when sorting by date.
Organize found files after format - this sorts the images into subfolders by format. Files for
which no icon could be generated are placed in yet another subfolder within. Those files may be corrupt
or have very small icons.
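Copying a date from a searched file to an extracted file, as the Preserve Dates option does, comes down to reading one file's timestamps and writing them onto another. Here is a minimal sketch using Python's standard library, with `preserve_dates` as a hypothetical helper name:

```python
import os
import tempfile


def preserve_dates(source_path: str, extracted_path: str) -> None:
    """Give the extracted file the access and modification times of its source."""
    info = os.stat(source_path)
    os.utime(extracted_path, ns=(info.st_atime_ns, info.st_mtime_ns))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        source = os.path.join(tmp, "cache.bin")
        extracted = os.path.join(tmp, "image.jpg")
        for path in (source, extracted):
            open(path, "wb").close()
        # Pretend the searched file dates from September 2001.
        os.utime(source, (1_000_000_000, 1_000_000_000))
        preserve_dates(source, extracted)
        print(int(os.stat(extracted).st_mtime))
```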
Other options for handling lots of images are:
Safari. File Juicer generates index files in HTML format, which are quick to open in Safari. Good for many smaller
images.
Preview. Preview is fast for opening many images in one go.
Apple Photos. This is best for photographs, or larger images. It is not suited for tiny graphics.
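An HTML index of images, like the one mentioned for Safari above, can be very simple. The sketch below shows one possible layout. The markup, the default title, and the `build_index` helper are my own assumptions and do not reproduce the files File Juicer actually writes.

```python
import html
from urllib.parse import quote


def build_index(image_names, title="Extracted images"):
    """Return a minimal HTML page showing each image by relative path."""
    items = []
    for name in image_names:
        src = quote(name)          # percent-encode spaces and the like for the URL
        label = html.escape(name)  # escape &, <, > and quotes for display
        items.append(f'<p><img src="{src}" alt="{label}"><br>{label}</p>')
    body = "\n".join(items)
    return (
        "<!DOCTYPE html>\n"
        f'<html><head><meta charset="utf-8"><title>{html.escape(title)}</title></head>\n'
        f"<body>\n{body}\n</body></html>\n"
    )


if __name__ == "__main__":
    print(build_index(["banner 1.gif", "logo.png"]))
```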
If you want to convert RAW files to JPEG or TIFF, just drop them on File Juicer with the JPEG and/or
TIFF checkboxes set in the preferences.
If you want the JPEG previews usually found inside RAW files extracted, turn OFF conversion in the
preferences. This option is a lot faster.
9 Limitations & Troubleshooting
File Juicer comes with no warranty, expressed or implied. It may or may not work as intended, and I am not
responsible for any damages, special, indirect, consequential, or whatsoever caused by using the software.
File Juicer does not decrypt PDF files which are encrypted. This will result in white images.
File Juicer recognizes .ZIP, .bz2, .rar and "deflate" compressed data and will extract it so you can
decompress it with Finder, but it does not recognize other compression algorithms. It does not decrypt encrypted data.
If File Juicer should crash and you wish to tell me about it, I would appreciate it if you sent me the file:
If you can't send me the file(s), then please be more detailed in explaining what I can do to replicate the crash.
10 Technical Details
Log Files
File Juicer saves two log files in your Library > Logs folder: "FileJuicerLog.txt"
and "FileJuicerResultsLog.txt". They are created for every "juicing" and contain the names of the files juiced
and the files found. The log files are emptied every time you juice a new set of files, so they don't grow and
you don't need to delete them.
Special Names on found files
Some files get special names. If they contain "[1346]", it means that
compressed data was found 1346 bytes from the beginning of the file.
If the name ends in ".inflated" or ".bz2 extracted", it refers to the name of the decompression algorithm used when
extracting the file, and means that the extracted data was not one of the formats File Juicer can identify. See .inflated for more info.
Text files can have something like "(6%)" in the file name. It means that only 6% of the juiced file was
text.
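To show where an offset such as 1346 could come from, here is a sketch that scans a file for zlib ("deflate") streams and reports the byte offset at which each one starts. It is an illustration under assumptions, not File Juicer's method: it only recognizes streams with a standard zlib header, not raw deflate data, and `find_zlib_streams` and its `min_size` parameter are hypothetical.

```python
import zlib


def find_zlib_streams(data: bytes, min_size: int = 16):
    """Return (offset, inflated bytes) for each complete zlib stream in data."""
    results = []
    pos = 0
    while pos < len(data) - 1:
        # A zlib header commonly starts with 0x78, and the two header
        # bytes read as a big-endian number are a multiple of 31.
        if data[pos] == 0x78 and ((data[pos] << 8) | data[pos + 1]) % 31 == 0:
            inflater = zlib.decompressobj()
            try:
                out = inflater.decompress(data[pos:])
            except zlib.error:
                pos += 1
                continue
            if inflater.eof and len(out) >= min_size:
                results.append((pos, out))
                # Skip past the compressed bytes that were just consumed.
                pos = len(data) - len(inflater.unused_data)
                continue
        pos += 1
    return results


if __name__ == "__main__":
    payload = b"hello world, " * 10
    blob = b"\x00" * 1346 + zlib.compress(payload) + b"tail"
    for offset, inflated in find_zlib_streams(blob):
        print(f"[{offset}] {len(inflated)} bytes inflated")
```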
Display of Images while Juicing
You can disable this feature by checking the small checkbox. This makes File Juicer slightly faster,
but more importantly, it increases stability during very large "juicings." For example,
if you have a disk image of a PC hard drive, many remnants of deleted files can be found.
Ancient versions of macOS like 10.3.9 occasionally encountered issues when trying to display damaged PDF or
TIFF files.