With Flash Memory Summit approaching next week, I thought it would be a good time to dig into the technology and life cycle of the SSD. Unlike traditional hard drives, SSDs do not store data on a magnetic surface but inside flash memory chips (NAND flash). By design, an SSD consists of a circuit board, a few memory chips (how many depends on the capacity of the drive in GB) and a controller that manages the SSD.
The memory in an SSD is non-volatile; in other words, it is able to retain data even without power. We can imagine the data stored in the NAND flash chips as an electric charge preserved in each cell. With that in mind, the question arises: how long is the life span or life cycle of an SSD?
The wear and tear of flash memory
It is known that write operations wear out the memory cells of an SSD, reducing its life. But do all types of memory wear out in the same way?
The memory used in flash chips is not all the same; there are actually three types of NAND:
SLC (Single Level Cell) – 1 bit of data per cell
MLC (Multi Level Cell) – 2 bits of data per cell
TLC (Triple Level Cell) or 3-bit MLC – 3 bits of data per cell
As you can see, the more levels a cell has, the more bits it can store, which makes it possible to produce higher-capacity chips. Thanks to today’s technological advances, we have SSDs that can store many GB at an affordable price. No wonder a recent report shows that TLC memory should account for about 50% of all NAND chips by the end of 2015, with a production cost about 15% to 20% lower than that of MLC chips.
However, there is a downside: adding more bits to a cell reduces its reliability, durability and performance. It is quite easy to determine the state (how much charge it holds) of an SLC cell, as it is either empty or full, while it is more difficult to do the same for MLC and TLC cells, as they have multiple states. As a result, a TLC cell requires 4 times the writing time and 2.5 times the reading time of an SLC cell. When it comes to the life cycle of an SSD, storing multiple bits per cell also speeds up the wear of the NAND memory.
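To make the "multiple states" point concrete, here is a small Python sketch. It relies on one general fact, namely that storing n bits requires telling 2^n values apart. The function name and the dictionary are made up for this illustration.

```python
# Bits per cell for the three NAND types listed above.
CELL_TYPES = {"SLC": 1, "MLC": 2, "TLC": 3}


def charge_levels(bits_per_cell: int) -> int:
    """Return how many distinct states a cell must distinguish to hold this many bits."""
    return 2 ** bits_per_cell


for name, bits in CELL_TYPES.items():
    print(f"{name}: {bits} bit(s) per cell, {charge_levels(bits)} states to tell apart")
```

An SLC cell only has to separate two states, while a TLC cell has to separate eight within the same cell, which is why reading and writing it takes more care.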
A memory cell is built from a floating-gate transistor. It has two gates, the control gate and the floating gate, insulated by a layer of oxide (you can see a schematic representation on the right). Each time an operation is performed, e.g. programming or erasing the cell, the oxide layer that traps electrons on the floating gate wears a little. Consequently, as the oxide layer weakens, electrons may leak from the floating gate.
How long do SSDs last?
This is the million-dollar question. Obviously it’s not possible to give an exact answer, but… continue to read!
The trend in SSDs is to focus on developing products based on 3-bit MLC (TLC) memory, and TLC is beginning to dominate the SSD market. For common use, 2-bit MLC technology seems to offer more durability and performance than is needed, not to mention SLC, which is needed less and less and is almost completely disappearing. In other words, manufacturers are giving up an extended life cycle in favor of lower costs, which allows flash memory to spread and storage capacities to grow.
However, it seems that there is no need to worry about how long an SSD will last. In an experiment conducted by The TechReport on 6 SSDs to understand how well they withstand write operations, 2 drives out of 6 managed to write 2 PB of data, and all the SSDs tested were able to write hundreds of TB without problems.
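To put those figures in perspective, the following Python sketch converts a total amount of written data into years of use. The daily write volume of 20 GB and the 200 TB lower figure are assumptions chosen purely for illustration; they do not come from the experiment. Only the 2 PB figure is taken from the text above.

```python
def years_of_life(endurance_tb: float, daily_writes_gb: float) -> float:
    """Years until endurance_tb has been written at a constant daily write rate."""
    total_gb = endurance_tb * 1000  # decimal units: 1 TB = 1000 GB
    days = total_gb / daily_writes_gb
    return days / 365


# Hypothetical workload, not a measured value.
ASSUMED_DAILY_WRITES_GB = 20

# 200 TB stands in for "hundreds of TB" (assumed); 2000 TB is the 2 PB from the experiment.
for endurance_tb in (200, 2000):
    years = years_of_life(endurance_tb, ASSUMED_DAILY_WRITES_GB)
    print(f"{endurance_tb} TB at {ASSUMED_DAILY_WRITES_GB} GB/day: about {years:.0f} years")
```

Under these assumptions even the lower figure corresponds to decades of use, although a heavier workload would shorten it proportionally.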
SSDs, TRIM and the Recovery of Deleted Data
SSDs are well known for their reliability and speed, but what seems to be less well understood is what happens on a modern SSD that has TRIM enabled when a file is deleted. So first of all:
What is TRIM?
TRIM is a command that has to be supported by both the computer’s operating system and the SSD within that computer. In short, TRIM is a method of prolonging an SSD’s life and of preventing the inevitable degradation in performance that will occur over time if TRIM is not enabled.
The reason that SSDs degrade over time is quite simple. For an area of the SSD to accept new data, the old data must first be removed. This creates a performance hit on each write attempt, because each page must first be read (to determine whether or not it is empty) and then effectively be written to twice: once to clear the original data and once more to write the new data to the page.
The more deletion that takes place over time, the more likely it is that some or all of the pages being written to will need to be cleared of their current data first. In the early days of SSD drives this performance hit eventually resulted in any speed benefits that may have existed eroding to the point that the SSD would become slower than a traditional hard disk.
This problem is known as write amplification and quickly becomes a big and unwanted issue on your expensive SSD.
More on Write Amplification
Write amplification exists because flash cells must be empty before they can be written to. The problem is compounded by the fundamental way in which an SSD organises its data.
Data on an SSD is split up into various units; here we are considering SSD ‘pages’ and SSD ‘blocks’. When data is written to an SSD it is written in pages, which are normally 2 or 4 KB in size. An erase cycle, however, must take place on a whole block. A block is made up of a number of pages, let’s say 5 for the sake of this article.
When a small file, or a portion of a larger file, is written to a page within a block, other pages in that block may be used by other data. There is no problem with the SSD storing data in this way until one of the files occupying the same block is deleted. We have already discussed that, in order for the SSD to be optimised, this deleted data must be cleared by TRIM, but we are presented with a further issue that can slow down the SSD even more: the SSD can only clear the contents of a used area a block at a time. This adds a further overhead, because the data which occupies that same block and has not been deleted needs to be cached by the SSD whilst it clears the block, and then re-written once the clearing has taken place.
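The cache, clear and re-write sequence can be sketched in a few lines of Python. This is a deliberately simplified model that uses the 5-page block from this article; the function and variable names are invented for the example, and a real controller would typically move data to a different block rather than rewrite in place.

```python
PAGES_PER_BLOCK = 5  # illustrative figure used in this article


def rewrite_block(block, page_index, new_data):
    """Replace one page of a full block by caching, clearing and re-writing the block.

    Returns the rewritten block and the number of pages physically written.
    """
    cached = list(block)                 # 1. cache the contents of the whole block
    cached[page_index] = new_data        # 2. apply the change to the cached copy
    cleared = [None] * len(block)        # 3. clearing always affects the entire block
    pages_written = 0
    for i, data in enumerate(cached):    # 4. write every page back, changed or not
        cleared[i] = data
        pages_written += 1
    return cleared, pages_written


block = ["A", "B", "C", "D", "E"]
assert len(block) == PAGES_PER_BLOCK

pages_requested = 1  # the host only asked for one page to change
block, pages_written = rewrite_block(block, 2, "C2")
amplification = pages_written / pages_requested

print(block)
print(f"pages requested: {pages_requested}, pages written: {pages_written}")
print(f"write amplification factor: {amplification}")
```

In this toy case a single-page change costs five page writes plus a block clear, which is the extra work the text refers to as write amplification.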
For this reason, write amplification can make a write take many times longer than the advertised write speed of your SSD would suggest; in some cases the overhead may be as much as 1000%. Clearly this is a big problem if it is not dealt with effectively.
How does TRIM Work?
With a TRIM-enabled SSD and computer, some extra things go on when data is deleted that are not required for non-SSD drives.
Normally when a file is deleted, the record for that file within the file system is removed or marked as usable (or both, depending upon the particular file system/operating system combination). For a standard hard drive this saves time and overhead, as overwriting existing data does not cause the performance degradation described above.
On a TRIM enabled system an extra command is sent to the SSD drive, which otherwise would be unaware of what is going on at the file system level, when a file is deleted. This command tells the SSD which pages contain deleted data, allowing the SSD to clear these pages during its garbage collection cycles.
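The interaction between a delete, the TRIM notification and garbage collection can be modelled roughly as follows. ToySSD and its methods are hypothetical names for this sketch and do not correspond to any real drive interface; the model also ignores the block-at-a-time clearing described earlier.

```python
class ToySSD:
    """Toy model of a drive that accepts TRIM hints (illustration only)."""

    def __init__(self, num_pages):
        self.pages = [None] * num_pages  # None means the page is empty
        self.stale = set()               # pages the OS has reported as deleted

    def write(self, page, data):
        if self.pages[page] is not None:
            raise ValueError("page must be cleared before it can be written")
        self.pages[page] = data

    def trim(self, pages):
        """The OS tells the drive which pages contain deleted data."""
        self.stale.update(pages)

    def garbage_collect(self):
        """Clear the reported pages so that later writes find them empty."""
        cleared = len(self.stale)
        for page in self.stale:
            self.pages[page] = None
        self.stale.clear()
        return cleared


ssd = ToySSD(num_pages=5)
for page, data in enumerate(["report.doc", "photo.jpg", "notes.txt"]):
    ssd.write(page, data)

ssd.trim([0, 1])                 # two files are deleted at the file system level
print(ssd.pages)                 # the data is still present until garbage collection runs
cleared = ssd.garbage_collect()
print(cleared, ssd.pages)        # the trimmed pages are now empty
ssd.write(0, "new.doc")          # a new write no longer has to clear the page first
```

The point of the sketch is the ordering: the clearing happens in the background, ahead of the next write, instead of during it.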
How does TRIM affect Data Recovery?
This is a simple question to answer. Once the data has been cleared using TRIM it is not recoverable. The data is zeroed out in a way that leaves no remaining trace of the previous content.
It is not possible to recover deleted data from a TRIM-enabled SSD/operating system once the TRIM command has cleared the data. All of those file un-delete programs that have saved people from losing data on their traditional hard disk drives are no longer of any use once TRIM has come into play.
If the prevalence of the SSD is to continue, file recovery software may soon become a thing of the past, or at least be relegated to limited usage where the system has failed or become corrupted rather than the data having been deleted.