Flash Testing Learning


References:
[1] Yang Chao, Zhang Jinfeng, Ma Chengying. Discussion on NAND Flash Test Design and Use[J]. Electronic World, 2018, No.551(17):116-118. DOI:10.19353/j.cnki.dzsj.2018.17.063.

NAND flash is a non-volatile memory organized into blocks, each made up of a number of pages. The page is the smallest unit for reading and writing (programming), and the block is the smallest unit for erasing. Before a page can be programmed, the block it belongs to must be erased. NAND flash leaves the factory with a small number of bad blocks, which the manufacturer marks so that users can recognize them. New bad blocks may also appear during testing and use. The component reliability organization that tests the part should identify these new bad blocks and mark them, and the user should manage bad blocks by skipping them to avoid data loss. Major suppliers of NAND flash include Samsung and Micron. Their devices have similar internal structures and come in 8-bit and 16-bit data widths. Addresses, data, and commands are time-multiplexed over the same I/O lines and transferred serially, and each operation is started by a specific control command.

The MT29F64G08AJABAWP-IT selected in the article [1] has 16,384 blocks, each block consists of 128 pages, and each page contains (4096+224) bytes, of which the 224 redundant (spare) bytes store configuration information such as the bad block mark. The total main-area capacity is 64 Gb (16,384 × 128 × 4096 bytes = 8 GB), and the data width is 8 bits. The device comes in a 48-pin TSOP package, with 8 multiplexed I/O pins plus control and power pins; the remaining pins can be left floating.
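As a quick check on the geometry quoted above, the following sketch multiplies out the block, page, and byte counts and maps a linear page index to a (block, page) pair. The constant and function names are illustrative assumptions, not taken from the article or from any datasheet.

```python
# Geometry figures as quoted in the text above.
BLOCKS = 16_384
PAGES_PER_BLOCK = 128
PAGE_DATA_BYTES = 4096    # main area of one page
PAGE_SPARE_BYTES = 224    # redundant (spare) area of one page


def capacity_bytes(include_spare: bool = False) -> int:
    """Total bytes in the array, with or without the spare area."""
    page = PAGE_DATA_BYTES + (PAGE_SPARE_BYTES if include_spare else 0)
    return BLOCKS * PAGES_PER_BLOCK * page


def split_page_index(page_index: int) -> tuple[int, int]:
    """Map a linear page index to (block number, page number within the block)."""
    return divmod(page_index, PAGES_PER_BLOCK)


main_bytes = capacity_bytes()
print(main_bytes)                        # 8589934592
print(main_bytes // 2**30, "GiB")        # 8 GiB
print(main_bytes * 8 // 2**30, "Gibit")  # 64 Gibit
print(split_page_index(300))             # (2, 44)
```

The product comes to 8 GiB of main-area data, which is the same thing as 64 Gibit, so the "64" in the capacity figure is a bit count rather than a byte count.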

Testing starts by reading the chip ID and comparing it with the value in the technical manual. If they match, testing continues; if not, the chip is rejected and testing stops. The manufacturer marks each factory bad block by writing 00 to the first redundant byte of the first or second page of that block, so the test checks this byte for every block. If it is 00, the block was marked bad at the factory: the bad block counter is incremented and the test jumps to the next block. Blocks marked bad by the manufacturer must not be erased, because erasing would remove the manufacturer's bad block mark. If the byte is FF, the block was good when it left the factory and needs to be tested.
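The factory bad block check described above could be sketched roughly as follows. The `read_byte(block, page, column)` callback is a hypothetical driver hook, and placing the first redundant byte at column 4096 (directly after the main data) is an assumption about the column layout; a fake reader stands in for the hardware so the example runs on its own.

```python
from typing import Callable, Iterable, List, Tuple

PAGE_DATA_BYTES = 4096   # assumed: the spare area starts right after the main data
BAD_MARK = 0x00          # value the text says marks a bad block
GOOD_MARK = 0xFF         # value of an unmarked (good) block

# Hypothetical driver hook: read one byte at (block, page, column).
ReadByte = Callable[[int, int, int], int]


def is_factory_bad(read_byte: ReadByte, block: int) -> bool:
    """True if the first spare byte of page 0 or page 1 holds the bad mark."""
    return any(read_byte(block, page, PAGE_DATA_BYTES) == BAD_MARK
               for page in (0, 1))


def scan_factory_bad_blocks(read_byte: ReadByte, num_blocks: int) -> List[int]:
    """Return the factory-marked bad blocks without erasing anything."""
    return [b for b in range(num_blocks) if is_factory_bad(read_byte, b)]


def make_fake_reader(marks: Iterable[Tuple[int, int]]) -> ReadByte:
    """Fake device for the demo: `marks` lists (block, page) pairs holding 00."""
    marked = set(marks)

    def read_byte(block: int, page: int, column: int) -> int:
        if column == PAGE_DATA_BYTES and (block, page) in marked:
            return BAD_MARK
        return GOOD_MARK

    return read_byte


reader = make_fake_reader([(3, 0), (7, 1)])
factory_bad = scan_factory_bad_blocks(reader, 10)
print(factory_bad)   # [3, 7]
```

The scan only reads, which matches the requirement that factory-marked blocks are never erased before their marks have been recorded.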

Every block of the NAND flash storage array is then verified by reading, writing, and erasing the entire block. If a bad block that was not marked at the factory is found, the test system stores its address. Once the whole chip has been tested, the bad blocks are counted, and the chip is qualified if the count meets the requirements of the technical manual. Any newly found bad blocks are then marked in the same way as the factory ones, by writing 00 to the first redundant byte of the first or second page, so that users can recognize all bad blocks in a uniform way.
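The overall flow (skip factory bad blocks, test the rest, record new bad blocks, count, then write the marks back) might look like the sketch below. `FakeNand` is a tiny in-memory stand-in invented for illustration, with one mark byte per block as a simplification; `max_bad_blocks` is a hypothetical parameter representing whatever limit the technical manual specifies.

```python
class FakeNand:
    """Tiny in-memory stand-in for a NAND device (illustrative only)."""

    def __init__(self, blocks=8, pages=4, page_bytes=16,
                 factory_bad=(), stuck_blocks=()):
        self.blocks, self.pages, self.page_bytes = blocks, pages, page_bytes
        self.stuck = set(stuck_blocks)
        self.data = [[b"\xff" * page_bytes for _ in range(pages)]
                     for _ in range(blocks)]
        # Simplification: one bad-block mark byte per block.
        self.spare_mark = [0x00 if b in set(factory_bad) else 0xFF
                           for b in range(blocks)]

    def erase_block(self, b):
        self.data[b] = [b"\xff" * self.page_bytes for _ in range(self.pages)]
        self.spare_mark[b] = 0xFF   # erasing also wipes the bad-block mark

    def program_page(self, b, p, payload):
        if b in self.stuck:         # model a stuck-at-1 cell in this block
            payload = bytes([payload[0] | 0x01]) + payload[1:]
        self.data[b][p] = payload

    def read_page(self, b, p):
        return self.data[b][p]

    def mark_bad(self, b):
        self.spare_mark[b] = 0x00


def block_passes(nand, b, pattern_byte):
    """Erase the block, check it is blank, write a pattern, read it back."""
    nand.erase_block(b)
    blank = b"\xff" * nand.page_bytes
    if any(nand.read_page(b, p) != blank for p in range(nand.pages)):
        return False
    payload = bytes([pattern_byte]) * nand.page_bytes
    for p in range(nand.pages):
        nand.program_page(b, p, payload)
    return all(nand.read_page(b, p) == payload for p in range(nand.pages))


def screen_chip(nand, max_bad_blocks, pattern_byte=0x00):
    factory_bad, new_bad = [], []
    for b in range(nand.blocks):
        if nand.spare_mark[b] == 0x00:
            factory_bad.append(b)   # skip: never erase a factory-marked block
            continue
        if not block_passes(nand, b, pattern_byte):
            new_bad.append(b)       # store the address of the new bad block
    for b in new_bad:
        nand.mark_bad(b)            # write the mark back, same as the factory
    qualified = len(factory_bad) + len(new_bad) <= max_bad_blocks
    return factory_bad, new_bad, qualified


nand = FakeNand(factory_bad={2}, stuck_blocks={5})
result = screen_chip(nand, max_bad_blocks=2)
print(result)   # ([2], [5], True)
```

Only a single all-0 pattern is used here to keep the sketch short; a real screen would repeat `block_passes` for each chosen pattern.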

Note that, to keep it from becoming too cluttered, the figure above shows the read, write, and erase operations for only one test pattern algorithm. In actual testing, fault coverage and time complexity must both be weighed when choosing an algorithm, and in some cases several algorithms need to be used together. Common test pattern algorithms fall into two categories: N-type algorithms with time complexity O(N), such as all 0, all 1, random number, checkerboard, anti-checkerboard, and diagonal; and N²-type algorithms with time complexity O(N²), such as stepping, walking, and jumping. Each algorithm detects specific faults, and algorithms with higher time complexity generally have higher fault coverage.
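Several of the N-type patterns named above can be generated with a few lines each. The sketch below is one plausible byte-level rendering, not the article's implementation: the checkerboard alternates 0x55 and 0xAA between even and odd pages, which assumes that logically adjacent bits and pages are also physically adjacent (a device-dependent property). The random pattern is seeded so the expected data can be regenerated at read-back time.

```python
import random


def all_zeros(n: int) -> bytes:
    """n bytes of 0x00 (every cell programmed to 0)."""
    return bytes(n)


def all_ones(n: int) -> bytes:
    """n bytes of 0xFF (every cell left at 1)."""
    return b"\xff" * n


def checkerboard(n: int, page_index: int = 0) -> bytes:
    """Alternating bits, flipped on odd pages so neighbouring rows differ."""
    byte = 0x55 if page_index % 2 == 0 else 0xAA
    return bytes([byte]) * n


def anti_checkerboard(n: int, page_index: int = 0) -> bytes:
    """Bitwise inverse of the checkerboard for the same page."""
    return bytes(b ^ 0xFF for b in checkerboard(n, page_index))


def random_pattern(n: int, seed: int) -> bytes:
    """Reproducible pseudo-random data; reuse the seed to verify on read."""
    rng = random.Random(seed)
    return bytes(rng.getrandbits(8) for _ in range(n))


print(checkerboard(4, 0).hex())        # 55555555
print(checkerboard(4, 1).hex())        # aaaaaaaa
print(anti_checkerboard(4, 0).hex())   # aaaaaaaa
print(random_pattern(8, seed=1) == random_pattern(8, seed=1))   # True
```

Writing a checkerboard followed by its inverse drives every cell to both 0 and 1 while its neighbours hold the opposite value, which is why the pair is commonly used together.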

The selected chip operates serially, with an 8-bit data port and time-multiplexed addressing, and has 64 Gb of storage to address, so a single full-chip write with an N-type algorithm takes about 400 seconds. Therefore, although N²-type algorithms have higher fault coverage, they are not practical for reliability organizations that screen parts in volume. The final choice was four N-type algorithms: all 0, random number, checkerboard, and anti-checkerboard. Together they cover common memory faults such as stuck-at-0, stuck-at-1, address decoding faults, short circuits, and open circuits. Within the allowable test time they maximize fault coverage and achieve good results, effectively screening out defective chips.
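A rough back-of-the-envelope calculation shows why the N²-type algorithms were ruled out. It takes the geometry and the 400-second figure from the text and assumes, purely for scale, that an N²-type algorithm costs N linear passes with N counted in blocks (the coarsest possible unit; counting pages or cells would make the result far larger). The names are illustrative.

```python
BLOCKS = 16_384
PAGES_PER_BLOCK = 128
PAGE_DATA_BYTES = 4096
FULL_WRITE_SECONDS = 400   # full-chip write time quoted in the text

total_bytes = BLOCKS * PAGES_PER_BLOCK * PAGE_DATA_BYTES

# Implied sustained write throughput for one linear pass.
throughput_mib_s = total_bytes / FULL_WRITE_SECONDS / 2**20
print(f"{throughput_mib_s:.2f} MiB/s")    # 20.48 MiB/s

# Write time alone for the four chosen N-type patterns
# (reads and erases would add to this).
four_pattern_seconds = 4 * FULL_WRITE_SECONDS
print(four_pattern_seconds, "s")          # 1600 s


def quadratic_seconds(n_units: int, linear_pass_seconds: float) -> float:
    """Assumed scaling: an N^2 algorithm costs N times one linear pass."""
    return n_units * linear_pass_seconds


n2_days = quadratic_seconds(BLOCKS, FULL_WRITE_SECONDS) / 86_400
print(f"{n2_days:.1f} days")              # 75.9 days
```

Even under this optimistic block-level assumption, a single N²-type run would take on the order of months per chip, compared with less than half an hour of write time for the four linear patterns.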