Data recovery chances: The filesystems of macOS

The chances of recovering lost files depend mostly on the type of filesystem the storage used at the moment of deletion or formatting. This is because the filesystem's mechanisms determine how much information about the files remains on the medium and is available to data recovery software. Thus, with a basic understanding of how a given filesystem performs these operations, you can reasonably predict the actual degree of recoverability.

Apple computers running macOS have used HFS+ for many years. You may also know it as Mac OS Extended. This filesystem can be found under macOS 10.12 Sierra or earlier. Starting with macOS 10.13 High Sierra, however, the more advanced APFS filesystem became the default option for newer models of Macs.

HFS+

HFS+ organizes the placement of data using a set of special files. Yet, they are not presented in the form we are used to seeing. Instead, they are arranged as tree structures, the so-called B-trees. The concept of trees may be quite confusing, but you should at least be familiar with it to figure out how most modern filesystems work. The principal point that sets trees apart is that they do not store information sequentially. Instead, it is arranged on multiple levels that are linked together to express the relationships between the items.
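To make the idea of linked levels more tangible, here is a minimal sketch of a lookup in a two-level tree. It is a generic illustration, not the on-disk B-tree layout of HFS+; the Node class, the search function and the sample file names are all hypothetical.

```python
from bisect import bisect_right

class Node:
    """A tree node: index nodes have children, leaf nodes have values."""
    def __init__(self, keys, children=None, values=None):
        self.keys = keys
        self.children = children
        self.values = values

def search(node, key):
    # Walk down from the top level, choosing one child per level.
    while node.children is not None:
        node = node.children[bisect_right(node.keys, key)]
    # On the leaf level, the key either sits next to its value or is absent.
    if key in node.keys:
        return node.values[node.keys.index(key)]
    return None

left = Node(keys=["a.txt", "f.txt"], values=["record A", "record F"])
right = Node(keys=["m.txt", "t.txt"], values=["record M", "record T"])
root = Node(keys=["m.txt"], children=[left, right])

print(search(root, "t.txt"))
```

The lookup never reads the records one after another: it only visits one node per level, which is why such structures stay fast even with a very large number of files.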

Any partition formatted with HFS+ is split into equally sized chunks named allocation blocks. The state of each allocation block is marked in the Allocation File. This file serves as a map of available space for the whole filesystem.
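A map of this kind can be pictured as a bitmap with one bit per allocation block. The sketch below is a simplified illustration with hypothetical function names and a made-up two-byte bitmap; it assumes that a set bit means "in use" and that the most significant bit of each byte corresponds to the lowest-numbered block.

```python
def is_block_allocated(bitmap: bytes, block: int) -> bool:
    """Return True if the bit for `block` is set in the bitmap."""
    byte_index, bit_offset = divmod(block, 8)
    # Assumption: the most significant bit of a byte maps to the lowest block number.
    mask = 0x80 >> bit_offset
    return bool(bitmap[byte_index] & mask)

def free_blocks(bitmap: bytes, total_blocks: int) -> list:
    """List the numbers of all blocks marked as free."""
    return [b for b in range(total_blocks) if not is_block_allocated(bitmap, b)]

# Sample bitmap for 16 blocks: blocks 0, 1, 2 and 15 are in use.
bitmap = bytes([0b11100000, 0b00000001])
print(free_blocks(bitmap, 16))
```

Note that the map only records whether a block is free or in use. Marking a block as free does not touch the data stored in it.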

Allocation blocks are assigned to files in continuous sequences that are called extents. Each extent is characterized by the location of its starting block and the number of blocks that follow it. Extents are also important elements frequently encountered in different modern filesystem types.
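As a rough illustration, an extent can be translated into a byte range on the partition once the allocation block size is known. The Extent class, the helper names and the 4096-byte block size in this sketch are assumptions made for the example.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Extent:
    start_block: int   # location of the first allocation block
    block_count: int   # number of blocks in the sequence

def extent_byte_range(extent, block_size):
    """Return (first byte, byte just past the end) of the extent."""
    start = extent.start_block * block_size
    return start, start + extent.block_count * block_size

def is_contiguous(extents):
    """True if each extent begins exactly where the previous one ends."""
    return all(a.start_block + a.block_count == b.start_block
               for a, b in zip(extents, extents[1:]))

print(extent_byte_range(Extent(10, 4), 4096))             # (40960, 57344)
print(is_contiguous([Extent(10, 4), Extent(14, 2)]))      # True
print(is_contiguous([Extent(10, 4), Extent(20, 2)]))      # False
```

A file whose extents are not adjacent is fragmented: its content sits in several separate places on the storage.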

The Catalog File is the central tree structure in HFS+. It sets out the relations between all files and directories and stores their basic properties, including names and the locations of the first eight extents. Additional extents, if present, are kept in the Extents Overflow File.
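The split between the two files can be sketched as follows. This is a simplified model that uses plain dictionaries instead of trees; the function names, the file ID 42 and the sample extents are hypothetical.

```python
INLINE_EXTENT_LIMIT = 8  # extents that fit into the catalog record itself

def place_extents(extents):
    """Split a file's extent list into the catalog part and the overflow part."""
    return extents[:INLINE_EXTENT_LIMIT], extents[INLINE_EXTENT_LIMIT:]

def gather_extents(file_id, catalog, overflow):
    """Collect all extents of a file: catalog record first, then overflow entries."""
    return catalog[file_id]["extents"] + overflow.get(file_id, [])

# A heavily fragmented file with ten (start_block, block_count) extents.
extents = [(100 + 10 * i, 2) for i in range(10)]
inline, extra = place_extents(extents)

catalog = {42: {"name": "video.mov", "extents": inline}}
overflow = {42: extra}

print(len(inline), len(extra))
print(gather_extents(42, catalog, overflow) == extents)
```

In this model, a file with eight extents or fewer is described by its catalog record alone, while a more fragmented file also depends on the overflow entries.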

Prior to making any changes to its service files, HFS+ first records them in a separate file known as the Journal. Only after that does it apply the actual changes. The size of the Journal is fixed, though, so its older content is regularly overwritten with information about newer modifications.
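The effect of the fixed size can be modelled with a bounded queue that silently drops its oldest record when a new one arrives. This is only a toy model: the Journal class, the record fields and the capacity of three records are assumptions chosen for the example, not the real journal format.

```python
from collections import deque

class Journal:
    """A toy fixed-size journal: new records push out the oldest ones."""
    def __init__(self, capacity):
        self.records = deque(maxlen=capacity)

    def log(self, record):
        self.records.append(record)

    def find(self, name):
        """Return the records that still mention the given file name."""
        return [r for r in self.records if r.get("name") == name]

journal = Journal(capacity=3)
journal.log({"op": "delete", "name": "report.doc", "extents": [(120, 4)]})
print(len(journal.find("report.doc")))   # the record is still available

# Further activity on the volume fills the journal with newer records.
for i in range(3):
    journal.log({"op": "create", "name": f"new{i}.txt", "extents": [(200 + i, 1)]})
print(len(journal.find("report.doc")))   # the record has been overwritten
```

The more operations happen after the event of interest, the sooner its record leaves the journal.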

Deletion

Procedure: HFS+ rebuilds the Catalog File immediately, destroying the information about the file's name and extents. The Allocation File is updated to reflect that the corresponding allocation blocks are now free. Despite that, the extents themselves remain somewhere on storage. Also, the records related to these updates stay in the Journal for some time.

Recovery: The Journal may contain a copy of the Catalog File structures that describe the deleted file. The chances of that depend on how actively the OS has been used after deletion. If this information has been overwritten, but the content of the extents is still intact, there is another way to "undelete" the file. In this case, data recovery software has to bypass the filesystem and analyze the storage on a deeper level, finding and piecing together the file based on the known peculiarities of its format. This method is also called RAW data recovery. Its main weakness is that the file is retrieved without its correct name and loses its original directory. Also, this method gives positive results only if HFS+ happened to store the file's content in adjacent extents.
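The idea behind RAW data recovery can be sketched with a simple signature search. The example below looks for JPEG images by their well-known start and end markers in a raw byte sequence; the function name and the sample data are hypothetical, and real tools use far more elaborate format checks (for instance, an end marker may also occur inside an embedded thumbnail).

```python
JPEG_START = b"\xff\xd8\xff"  # bytes that open a JPEG file
JPEG_END = b"\xff\xd9"        # bytes that close a JPEG file

def carve_jpegs(image: bytes):
    """Return byte strings that look like complete JPEG files."""
    found = []
    pos = 0
    while True:
        start = image.find(JPEG_START, pos)
        if start == -1:
            break
        end = image.find(JPEG_END, start + len(JPEG_START))
        if end == -1:
            break
        end += len(JPEG_END)
        found.append(image[start:end])
        pos = end
    return found

# Sample "storage": empty space, one small fake JPEG, more empty space.
disk = b"\x00" * 16 + b"\xff\xd8\xff\xe0photo\xff\xd9" + b"\x00" * 8
print(len(carve_jpegs(disk)))
```

The sketch reads everything between the two markers as one piece, without any knowledge of names, directories or extents. This is why the result has no original name and why a file scattered over non-adjacent extents comes out damaged.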

Formatting

Procedure: The Catalog File gets reset, so the information about the previous files is lost. At the same time, the content of files and the Journal remain unaffected.

Recovery: First, the Journal can be analyzed to retrieve whatever is left in its records. The rest of the files can be restored using RAW data recovery. Likewise, positive results can be achieved only for files that occupy adjacent extents.

APFS

Unlike HFS+ and most other filesystem types, APFS resides in a Container instead of a partition. A Container can include several filesystems at a time, which share the available storage space. The state of each block in the Container is indicated in the Bitmap. However, the rest of the structures are managed by each filesystem individually.

Just like HFS+, APFS allocates blocks to files in continuous series named extents, each characterized by the starting block and the count of blocks in the sequence. All extents are organized as a tree structure – the already mentioned B-tree.

The hierarchy of files and directories, along with the information describing them, is stored in the form of a B-tree as well.

One characteristic feature of APFS is that it never modifies its structures on the spot. It always creates a copy in a new location and makes the required changes to it, while the original structure remains on the storage. After that, other structures are updated to point to this modified copy.
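This copy-on-write behaviour can be illustrated with a toy model in which the "storage" is an append-only list and a single pointer marks the current copy of the file tree. The CowVolume class and its methods are hypothetical and do not mirror the real APFS structures.

```python
class CowVolume:
    """A toy copy-on-write volume: every change writes a new copy of the tree."""
    def __init__(self):
        self.blocks = []   # append-only "storage" holding every copy ever written
        self.root = None   # position of the current copy

    def commit(self, tree):
        # Write the modified copy to a new location, then repoint the root.
        self.blocks.append(dict(tree))
        self.root = len(self.blocks) - 1

    def current(self):
        return self.blocks[self.root]

    def delete(self, name):
        tree = dict(self.current())   # work on a copy, never on the original
        del tree[name]
        self.commit(tree)

    def older_versions(self):
        """Copies that are no longer referenced but still sit on the storage."""
        return [b for i, b in enumerate(self.blocks) if i != self.root]

vol = CowVolume()
vol.commit({"a.txt": (10, 4)})   # file name -> (start_block, block_count)
vol.delete("a.txt")

print(vol.current())
print(vol.older_versions())
```

In this model, the current tree no longer knows about the file, yet the previous copy that describes it survives until its location is reused or erased.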

Deletion

Procedure: APFS reorganizes the B-tree in response to deletion, wiping the information about the file.

Recovery: It may be possible to find older copies of the structures and use them for reconstruction. Yet, as APFS is meant for solid-state drives, the file's content is likely to be erased by the TRIM operation, which leaves no opportunities for data recovery.
