A file system governs how data is allocated, stored and manipulated on a storage medium. Because the file system is responsible for data management, it also determines how data is handled during and after deletion. The chances of complete and successful data recovery therefore depend heavily on file system behavior.
Virus attacks and logical failures make system behavior unpredictable: the system might delete user files or format the entire file system. For this reason, two cases of data loss are addressed here for each file system: file deletion and formatting (to the same file system type).
Let us take a closer look at the file systems of Windows (NTFS, FAT/FAT32 and exFAT), macOS (HFS+ and APFS) and Linux (Ext2, Ext3/Ext4, ReiserFS, XFS and JFS).
Windows file systems
NTFS file system
Structure: the file system header (boot record), the Master File Table ($MFT) and space for files.
The NTFS file system uses the Master File Table (MFT) to keep track of files. The MFT contains information about all files and the folders that hold them, including each file's location, name, size, and the date and time of its creation and last modification.
If a file's attributes do not fit into one MFT record, the file system allocates another record and keeps track of it in the list of file attributes.
Procedure (file deletion): the file system does not erase the file; instead, it labels the file record in the MFT as unused and marks the space occupied by the file as released in the MFT and the Bitmap. The system also removes the file entry from its directory.
Recovery: the information about the deleted file (name, size, location) remains in the MFT. If the MFT record remains unchanged and the data on the disk has not been overwritten, the chances of file recovery are up to 100%. If the record itself is lost, it is still possible to find the file by its content with the RAW-recovery method (recovery by disk contents, bypassing the file system structure).
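The "unused" label can be illustrated with a minimal sketch. It assumes the commonly documented NTFS record layout, in which a record starts with the signature FILE and carries a 16-bit flags field at offset 0x16 whose lowest bit means "in use". The function names and the synthetic 1024-byte record are hypothetical and serve only as an illustration; this is not how any particular recovery program is implemented.

```python
import struct

MFT_SIGNATURE = b"FILE"
FLAGS_OFFSET = 0x16      # assumed offset of the 16-bit flags field
FLAG_IN_USE = 0x0001     # cleared when the file is deleted
FLAG_DIRECTORY = 0x0002  # set when the record describes a folder


def describe_mft_record(record: bytes) -> str:
    """Report whether a raw MFT record is live or merely labeled as unused."""
    if len(record) < FLAGS_OFFSET + 2 or record[:4] != MFT_SIGNATURE:
        return "not an MFT record"
    (flags,) = struct.unpack_from("<H", record, FLAGS_OFFSET)
    kind = "directory" if flags & FLAG_DIRECTORY else "file"
    state = "in use" if flags & FLAG_IN_USE else "deleted"
    return f"{kind}, {state}"


def make_record(flags: int) -> bytes:
    """Build a synthetic record for the demonstration (hypothetical helper)."""
    record = bytearray(1024)
    record[:4] = MFT_SIGNATURE
    struct.pack_into("<H", record, FLAGS_OFFSET, flags)
    return bytes(record)


print(describe_mft_record(make_record(FLAG_IN_USE)))  # file, in use
print(describe_mft_record(make_record(0)))            # file, deleted
```

A record reported as "deleted" still holds the name, size and location of the file, which is why it remains a good recovery candidate until it is reused.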
Procedure (formatting): the file system wipes only the beginning of the MFT. The rest of the MFT remains unchanged.
Recovery: the first 256 files lose their links to the MFT, so they can only be recovered with the RAW-recovery method. For the files that follow these 256, recovery chances are up to 100%.
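The RAW-recovery idea, finding files by their contents rather than through file system structures, can be sketched as a simple signature search. The example below is a simplified illustration that looks for JPEG images by their well-known start and end markers in a raw byte buffer; the function name is hypothetical, and real tools handle many more formats and edge cases. It assumes that each file is stored contiguously, which is exactly why fragmented files are hard to retrieve this way.

```python
JPEG_HEADER = b"\xff\xd8\xff"  # start-of-image marker plus first marker byte
JPEG_FOOTER = b"\xff\xd9"      # end-of-image marker


def carve_jpegs(raw):
    """Return byte strings that look like complete, contiguous JPEG files."""
    found = []
    position = 0
    while True:
        start = raw.find(JPEG_HEADER, position)
        if start == -1:
            break
        end = raw.find(JPEG_FOOTER, start + len(JPEG_HEADER))
        if end == -1:
            break  # header without a footer: truncated or fragmented file
        end += len(JPEG_FOOTER)
        found.append(raw[start:end])
        position = end
    return found


# Synthetic "disk": two fake images surrounded by unrelated bytes.
image_a = JPEG_HEADER + b"\xe0first picture" + JPEG_FOOTER
image_b = JPEG_HEADER + b"\xe1second picture" + JPEG_FOOTER
disk = b"\x00" * 32 + image_a + b"unrelated data" + image_b + b"\x00" * 8
print(len(carve_jpegs(disk)))  # 2
```

Note that file names and folder structure are not recovered by this approach, since they live in the file system metadata that is being bypassed.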
FAT/FAT32 file system
Structure: the file system header (two more headers for FAT32), FAT tables and the data area.
The FAT file system uses the File Allocation Table, which contains an entry for each cluster on the disk and links each entry to the corresponding location on the disk. The table also holds the links between the starting cluster of a file, its continuation clusters and its end. The FAT file system does not defragment fragmented files. By its original design, a file on FAT has 8 characters for the file name and 3 characters for the file extension, so the file system stores long file names separately, using the long file name (LFN) extension.
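The linking of clusters through the table can be sketched as follows. This is a minimal model that uses a Python list as a stand-in for an in-memory FAT32 table; the constants follow the commonly documented FAT32 conventions (28 significant bits per entry, end-of-chain values starting at 0x0FFFFFF8, data clusters numbered from 2), and the function name is hypothetical.

```python
FAT32_MASK = 0x0FFFFFFF        # only the low 28 bits of an entry are used
END_OF_CHAIN_MIN = 0x0FFFFFF8  # values at or above this mark the last cluster


def follow_chain(fat, start_cluster):
    """Collect the clusters of a file by walking its links in the table."""
    chain = []
    cluster = start_cluster
    # Stop on free/invalid entries and on loops in a damaged table.
    while 2 <= cluster < len(fat) and cluster not in chain:
        chain.append(cluster)
        next_entry = fat[cluster] & FAT32_MASK
        if next_entry >= END_OF_CHAIN_MIN:
            break
        cluster = next_entry
    return chain


# A fragmented file occupying clusters 2 -> 3 -> 7.
fat = [0] * 10
fat[2], fat[3], fat[7] = 3, 7, 0x0FFFFFFF
print(follow_chain(fat, 2))  # [2, 3, 7]

# Once the entries are cleared, only the starting cluster is still known.
fat[2] = fat[3] = fat[7] = 0
print(follow_chain(fat, 2))  # [2]
```

The second call shows why cleared table entries are a problem for recovery: the chain that connected cluster 3 to cluster 7 no longer exists anywhere on the disk.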
Procedure (file deletion): the file system clears the file's entries in the File Allocation Table, including the links to the continuation and end clusters. The data area itself is not wiped. In the directory entry, the first character of the short file name is overwritten with a deletion marker, and on FAT32 part of the information about the starting cluster of the file is erased as well.
Recovery: the start of the file can be found, but its continuation and end have to be guessed, so data recovery may be incomplete. In addition, because the FAT file system does not defragment files, fragmented files are difficult to retrieve even with the RAW-recovery method. Another issue is that file names are limited in length and long names may be stored separately on the disk, so recovery of long file names may fail.
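The "assumption" about the continuation of a file is typically that its clusters follow the first one without gaps. The sketch below illustrates this under stated assumptions: it uses the commonly documented layout of a 32-byte FAT short directory entry (name in bytes 0 to 10, starting cluster split between offsets 20 and 26, size at offset 28) and the 0xE5 byte that marks a deleted entry. The function names and the sample entry are hypothetical.

```python
import struct

DELETED_MARKER = 0xE5  # first name byte of a deleted short directory entry


def parse_dir_entry(entry):
    """Decode the fields of a 32-byte short directory entry."""
    deleted = entry[0] == DELETED_MARKER
    raw_name = entry[0:8]
    if deleted:
        raw_name = b"?" + raw_name[1:]  # the first character is lost
    (cluster_high,) = struct.unpack_from("<H", entry, 20)
    (cluster_low,) = struct.unpack_from("<H", entry, 26)
    (size,) = struct.unpack_from("<I", entry, 28)
    return {
        "name": raw_name.decode("ascii", "replace").rstrip(),
        "ext": entry[8:11].decode("ascii", "replace").rstrip(),
        "deleted": deleted,
        "first_cluster": (cluster_high << 16) | cluster_low,
        "size": size,
    }


def guess_clusters(first_cluster, size, cluster_size):
    """Assume the file is contiguous: take as many clusters as the size needs."""
    count = -(-size // cluster_size)  # ceiling division
    return list(range(first_cluster, first_cluster + count))


# Synthetic deleted entry: "REPORT.TXT", 10000 bytes, starting at cluster 5.
entry = bytearray(32)
entry[0:11] = b"\xe5EPORT  TXT"
struct.pack_into("<H", entry, 26, 5)
struct.pack_into("<I", entry, 28, 10000)

info = parse_dir_entry(bytes(entry))
print(info["name"], info["ext"], info["deleted"])  # ?EPORT TXT True
print(guess_clusters(info["first_cluster"], info["size"], 4096))  # [5, 6, 7]
```

If the file was actually fragmented, the guessed clusters 6 and 7 belong to something else, and the recovered file is corrupted past its first cluster.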
exFAT file system
Structure: the file system header, the FAT table and the data area.
Like its predecessors, the exFAT file system uses the File Allocation Table to manage files. This table contains an entry for each cluster on the disk and links each entry to the corresponding location on the disk. It also holds the links between the start of a file, its continuation and its end. The file system tries to avoid file fragmentation. It does not provide links to file subdirectories.
Procedure: the file system deletes the information contained in the File Allocation Table, including the links to the continuation and end of the file. The data area itself is not wiped.
Recovery: as links to the continuation of a file may be lost, recovery of files that span several blocks can be incomplete. The chances of successfully recovering a file can also be low if its directory is damaged. At the same time, recovery of files by their contents (the RAW-recovery method) may give very good results thanks to low file fragmentation.
macOS file systems
HFS+ file system
Structure: the file system header; the file system journal; the Catalog File, which includes files containing information about other files (so-called hard-link files).
The HFS+ file system supports journaling: the file system journal keeps track of all file system modifications. The journal is limited in size, so new records are written over the oldest ones. In this way, the file system overwrites older information to free the journal for data about newer modifications.
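The effect of a size-limited journal on recovery can be shown with a toy model. The sketch below is not the HFS+ on-disk journal format; it is a hypothetical fixed-capacity log, built on a bounded deque, in which every new record pushes out the oldest one once the journal is full.

```python
from collections import deque


class CircularJournal:
    """Fixed-size journal: once full, each new record evicts the oldest one."""

    def __init__(self, capacity):
        self.records = deque(maxlen=capacity)

    def log(self, operation, name):
        self.records.append((operation, name))

    def still_records_deletion(self, name):
        """True while the record about deleting `name` has not been evicted."""
        return ("delete", name) in self.records


journal = CircularJournal(capacity=3)
journal.log("delete", "report.doc")
journal.log("create", "a.txt")
journal.log("create", "b.txt")
print(journal.still_records_deletion("report.doc"))  # True

journal.log("create", "c.txt")  # the journal is full: the oldest record goes
print(journal.still_records_deletion("report.doc"))  # False
```

The longer the system keeps running after the deletion, the more likely it is that the relevant record has already been pushed out of the journal.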
The HFS+ file system tries to keep files defragmented. It carefully looks for a suitable place to store a file and joins the file fragments together there. Still, the files that remain fragmented can pose a problem for recovery.
HFS+ supports hard links, which are stored as separate files inside a hidden HFS+ root directory and hold information about a user file. Each hard-link file is bound to its user file.
Procedure (file deletion): the file system deletes the hard link from the directory. Nevertheless, it keeps this information in its journal records for some time.
Recovery: the program can consult the file system journal to find an older file system state and restore the lost hard link to its place. Recovery chances depend greatly on how long the system has been in use since the file was deleted. If the journal record is no longer available, you can try RAW-recovery, which can give excellent results for non-fragmented files.
Procedure (formatting): the file system deletes the hard-link directory, leaving the journal and the on-disk data area intact.
Recovery: the program consults the file system journal to recover everything that is recoverable from it, or employs RAW-recovery (by file contents) to retrieve the lost files. Recovery chances may be low for fragmented files because the hard links have been deleted.
APFS file system
Procedure (file deletion): the file system has a two-tree structure: one tree contains the data itself, while the other is a virtualization of the first one. During deletion, the first tree is wiped. In some data loss cases, however, both trees get damaged, and file recovery is then not possible.
Recovery: previous versions of the first tree are analyzed with the help of the second one and thus recovered.
Linux file systems
Ext2 file system
Structure: the file system header; inodes; inode table.
The Ext2 file system uses inodes containing information about files. This information includes user and group ownership, access mode and extension. Some inodes include an inode table copy.
Inodes include neither file contents nor file names: file names are stored in directories and are not treated as metadata by the file system.
Procedure (file deletion): Ext2 labels the file inode as free and updates the map of free blocks. The file name entry is unlinked from the directory record, and the reference from the file name to the inode is wiped. The file is deleted as soon as all references to its inode are removed.
Recovery: because the file description remains in the inode, the chances of retrieving the file are quite high. Nevertheless, file names, which are stored in directories and have been unlinked from the file, will be lost.
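The deletion behavior described above can be modeled in a few lines. The class below is a deliberately simplified, hypothetical model and not the Ext2 on-disk format: a directory maps names to inode numbers, each inode counts the names that refer to it, and an inode is labeled free only when the last name is removed, while its block list is left in place.

```python
class TinyExt2:
    """Toy model: names live in the directory, file descriptions in inodes."""

    def __init__(self):
        self.inodes = {}     # inode number -> {"links", "blocks", "free"}
        self.directory = {}  # file name -> inode number

    def create(self, name, inode_number, blocks):
        self.inodes[inode_number] = {
            "links": 1, "blocks": list(blocks), "free": False,
        }
        self.directory[name] = inode_number

    def link(self, new_name, existing_name):
        inode_number = self.directory[existing_name]
        self.inodes[inode_number]["links"] += 1
        self.directory[new_name] = inode_number

    def unlink(self, name):
        inode_number = self.directory.pop(name)  # the name itself is lost
        inode = self.inodes[inode_number]
        inode["links"] -= 1
        if inode["links"] == 0:
            inode["free"] = True  # block list is deliberately left in place

    def recoverable_inodes(self):
        """Inodes labeled free that still describe where the data was."""
        return sorted(
            number for number, inode in self.inodes.items()
            if inode["free"] and inode["blocks"]
        )


fs = TinyExt2()
fs.create("a.txt", 12, [100, 101])
fs.link("b.txt", "a.txt")
fs.unlink("a.txt")
print(fs.recoverable_inodes())  # [] because b.txt still refers to inode 12
fs.unlink("b.txt")
print(fs.recoverable_inodes())  # [12]
```

After the last unlink, inode 12 still lists blocks 100 and 101, so the contents can be located, but neither "a.txt" nor "b.txt" can be found anywhere in the model.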
Procedure (formatting): Ext2 wipes all file allocation groups and deletes file inodes.
Recovery: the program can apply the RAW-recovery method to find files by their contents. Recovery chances depend on file fragmentation: fragmented files are hard to retrieve.
Ext3/Ext4 file system
Structure: the file system header; inodes; inode table.
[Figure: Ext3 and Ext4 structure]
In addition to the inodes implemented in Ext2, the Ext3 and Ext4 file systems use journaling. The file system journal keeps track of all modifications made by the file system. Ext4 differs from Ext3 in the structure of its references.
Procedure (file deletion): the file system makes an entry in the journal and then wipes the file inode entry. The directory record is not deleted completely; instead, the order in which the directory is read gets changed.
Recovery: thanks to the file system journal, deleted files can be retrieved, even with their file names. Still, the recovery result depends on how long the file system remains in operation after file deletion.
Procedure (formatting): all allocation groups, file inodes and even the journal are wiped. The file system journal may nevertheless still contain information about some of the recently created files.
Recovery: lost files can only be retrieved with the RAW-recovery method, which finds files by their contents. Fragmented files have low recovery chances.
ReiserFS file system
Structure: the file system header, the S+-tree.
The file system uses the S+-tree, which stores file metadata and holds descriptors of all files and file fragments. When new metadata is written, a new tree created for the new data replaces the old one, while the older copy remains on the disk. Thus, the file system can store many copies of the metadata. This technique is called Copy-on-Write (COW).
Procedure (file deletion): the system updates its S+-tree to exclude the file and updates the map of free space.
Recovery: thanks to COW, it is possible to recover all files, including their names. Moreover, a previous version of such a file can also be retrieved from an older copy of the S+-tree.
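The copy-on-write idea described above can be sketched with a toy model. This is only an illustration of the principle, with a plain dictionary standing in for the tree and hypothetical names throughout; it does not reflect the actual on-disk layout of the S+-tree.

```python
class CowTree:
    """Toy copy-on-write store: every change writes a new tree version."""

    def __init__(self):
        self.versions = [{}]  # older versions stay "on disk"

    def _commit(self, new_root):
        self.versions.append(new_root)

    def write(self, name, data):
        root = dict(self.versions[-1])  # copy, never modify in place
        root[name] = data
        self._commit(root)

    def delete(self, name):
        root = dict(self.versions[-1])
        root.pop(name, None)
        self._commit(root)

    def recover(self, name):
        """Search older tree versions, newest first, for a lost file."""
        for root in reversed(self.versions[:-1]):
            if name in root:
                return root[name]
        return None


tree = CowTree()
tree.write("notes.txt", "first draft")
tree.write("notes.txt", "final text")
tree.delete("notes.txt")

print("notes.txt" in tree.versions[-1])  # False: gone from the current tree
print(tree.recover("notes.txt"))         # final text
print(tree.versions[1]["notes.txt"])     # first draft
```

In the model the old versions are kept forever; on a real partition they survive only until their space is reused, which is why a full partition lowers the recovery chances.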
Procedure (formatting): the file system creates a new S+-tree over the existing one.
Recovery: COW helps to retrieve the previous file system state, enabling complete data recovery. However, the chances of complete recovery decrease if the file system partition was full, since in that case the system would overwrite the old data with new data.
XFS file system
Structure: complex tree structures, inodes, bitmaps
[Figure: SGI XFS structure]
The XFS file system uses inodes to store file metadata and journaling to keep track of system modifications. Only metadata is journaled. Each inode has a header and a bitmap. XFS stores inodes in a special tree in a specific place on the disk. The system also has a bitmap for free storage blocks.
Procedure (file deletion): the inode responsible for the file is excluded from the tree; its place is overwritten with new information.
Recovery: XFS keeps the file metadata, so much information is left and recovery of lost files is possible. The chances of recovering a deleted file, even with the correct file name, are quite high.
Procedure (formatting): the root directories of the file system are overwritten.
Recovery: the chances of recovering files that were not located at the beginning of the storage are high, in contrast to files that were stored close to the start of the disk.
JFS file system
Structure: the superblock, the B+-tree, the journal, inode file sets
The JFS file system employs the B+-tree structure for storing data, journaling for tracking file system modifications and inodes for describing files. The system is also capable of storing several file systems on one partition with links to the same file. File names can be saved in the Unicode and UTF-8 encodings.
Procedure (file deletion): JFS updates the counter of object uses and releases the object's inode in the inode map. The directory is rebuilt to reflect the changes.
Recovery: the file inode remains on the disk, which raises the chances of file recovery to almost 100%. Recovery chances are low only for file names.
Procedure (formatting): JFS writes a new tree. It is small at the beginning and grows with further file system use.
Recovery: the chances of recovering lost files after formatting are quite high because the new B+-tree is small. Moreover, the internal numbering of inodes increases the chances of easy file recovery after formatting.