Raspberry Pi microSD follow-up, SD Association fools me twice?

 ____________________________________________
/ Fool me once, shame on you. Fool me twice, \
\ prepare to die. (Klingon Proverb)          /
 --------------------------------------------
        \   ^__^
         \  (oo)\_______
            (__)\       )\/\
                ||----w |
                ||     ||

(Excerpt from Ansible for DevOps, chapter 12.)

The fallout from this year's microSD card performance comparison has turned into quite a rabbit hole; first I found that new 'A1' and 'A2' classifications were supposed to offer better performance than the not-Application-Performance-class-rated cards I have been testing. Then I found that A2 rated cards offer no better performance for the Raspberry Pi—in fact they didn't even perform half as well as they were supposed to, for 4K random reads and writes, on any hardware I have in my possession.

[Image: pile of microSD cards (A1 and A2) and a Raspberry Pi NOOBS card]

And now, I've discovered that A1 class cards like the SanDisk Extreme Pro A1 actually perform better than A2 cards I've tested. And in a complete about-face from their A2 counterparts, it seems like A1 cards actually perform 2x better than their rated minimum spec:

[Image: Raspberry Pi microSD card benchmark showing the SanDisk Extreme Pro A1 comparison]

The Extreme Pro A1 finally tops the random write performance of the three-year-old Samsung Evo+, though it's still a bit shy of the random read performance. What's more interesting to me (since I buy a ton of these little cards and cost is a major consideration) is the IOPS you get per US dollar:

Card                         Price, mid-2019 (USD)  IOPS/USD (read)  IOPS/USD (write)
Samsung Evo+ 32GB            8.60                   322              115
SanDisk Extreme A2 64GB      15.49                  122              61
SanDisk Extreme Pro A1 32GB  13.73                  183              88

So, unless SanDisk decides to halve their price points, the Samsung microSD cards seem to be the best value for any kind of 'application' level performance, even though it seems Samsung has not yet applied for any A1/A2 ratings on their cards. (Note: I couldn't find a 32GB version of the A2 SanDisk Extreme to purchase and test, so I'm sticking with the 64GB pricing.)
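As a rough sanity check, the table above can be turned back into approximate raw IOPS by multiplying each IOPS/USD figure by the card's price. The sketch below does that and compares the result with the minimum random IOPS the SD Association publishes for each Application Performance Class (A1: 1500 read / 500 write; A2: 4000 read / 2000 write). Those minimums come from the class definitions, not from this post, and the recovered IOPS are only approximate because the table values are rounded. The function and variable names are made up for illustration.

```python
# Price (USD, mid-2019), IOPS/USD read, IOPS/USD write, copied from the table.
CARDS = {
    "Samsung Evo+ 32GB": (8.60, 322, 115),
    "SanDisk Extreme A2 64GB": (15.49, 122, 61),
    "SanDisk Extreme Pro A1 32GB": (13.73, 183, 88),
}

# Minimum random IOPS (read, write) per Application Performance Class.
# The Evo+ carries no rating, so it has no entry here.
SPEC_MINIMUM = {
    "SanDisk Extreme A2 64GB": (4000, 2000),
    "SanDisk Extreme Pro A1 32GB": (1500, 500),
}


def implied_iops(price, read_per_usd, write_per_usd):
    """Recover approximate raw IOPS from a price and IOPS-per-dollar figures."""
    return price * read_per_usd, price * write_per_usd


def report():
    """Return (name, read IOPS, write IOPS, ratio-to-spec or None) per card."""
    rows = []
    for name, (price, read_per_usd, write_per_usd) in CARDS.items():
        read, write = implied_iops(price, read_per_usd, write_per_usd)
        spec = SPEC_MINIMUM.get(name)
        ratio = (read / spec[0], write / spec[1]) if spec else None
        rows.append((name, read, write, ratio))
    return rows


if __name__ == "__main__":
    for name, read, write, ratio in report():
        line = f"{name}: ~{read:.0f} read IOPS, ~{write:.0f} write IOPS"
        if ratio:
            line += f" ({ratio[0]:.2f}x read, {ratio[1]:.2f}x write vs. rated minimum)"
        print(line)
```

Run this way, the A2 card lands below half of its rated minimum for both reads and writes, while the A1 card clears its minimum on both, by more than 2x on writes.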

After posting the A2 card performance article, I received a lot of feedback in comments on Reddit and Hacker News, and one thing I learned is that to achieve the rated performance, it seems you need to have special firmware and/or kernel-level support for A2 Command Queue and Cache functions. From /u/farptr's comment summarizing the requirements:

Command Queue

The new CQ mechanism allows the SD memory card to accept several commands in a series (without their associated data) and execute them (with the data) whenever the memory card is ready. It contributes mainly to random read performance. During the data transfer, additional commands may be sent to the card as long as the maximum number of queued commands does not exceed the maximum queue depth supported by the card (the SD standard allows queue depth of min 2, max 32). With CQ, advanced information on intended commands is provided to the card. The card may manage and optimize its internal operations to prepare for the various commands in advance. Multiple tasks can be handled at one time in arbitrary order. New information on next commands may be sent to the card during current execution and during data transfer.

Cache function

In order to overcome the relatively limited write speed operation of flash memory, the Cache function allows the card to accumulate the data accepted by the host in a high-speed memory (e.g., RAM or SLC flash) first, release the busy line and perform the actual write to the non-volatile slower memory (e.g., TLC NAND Flash) in the background or upon flush command. The card may cache the host data during write and read operation. Cache size is card-implementation specific; flushing of contents stored in cache is done in less than one second. It is supported by OSs today for embedded memory devices and is assumed to be easy to implement for cards.
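To make the Command Queue description a little more concrete, here is a toy model, not anything taken from the SD specification. It assumes a card with a handful of internal flash dies that can each service one command per time step, and a host that keeps the queue topped up to a given depth. The die count, the address-to-die mapping, and the function name are all invented for illustration; real controllers are far more involved.

```python
from collections import deque

MAX_QUEUE_DEPTH = 32  # upper bound mentioned in the quoted description


def ticks_to_complete(addresses, queue_depth, dies=4):
    """Count time steps a toy card needs to service all queued commands.

    Hypothetical model: address -> die is `address % dies`, every command
    takes one tick, and each die can service one queued command per tick.
    """
    if not 1 <= queue_depth <= MAX_QUEUE_DEPTH:
        raise ValueError("queue depth must be between 1 and 32")
    pending = deque(addresses)
    queued = []
    ticks = 0
    while pending or queued:
        # The host sends more commands while earlier ones are still executing.
        while pending and len(queued) < queue_depth:
            queued.append(pending.popleft())
        busy_dies = set()
        still_waiting = []
        for address in queued:
            die = address % dies
            if die in busy_dies:
                still_waiting.append(address)  # die already in use this tick
            else:
                busy_dies.add(die)             # serviced now, in any order
        queued = still_waiting
        ticks += 1
    return ticks


if __name__ == "__main__":
    requests = [0, 4, 1, 5, 2, 6, 3, 7]
    for depth in (1, 4, 32):
        print(f"queue depth {depth}: {ticks_to_complete(requests, depth)} ticks")
```

A depth of 1 stands in for a host without command queueing: the card only ever sees one command at a time, so it cannot overlap work across dies. With a deeper queue the same eight requests finish in fewer ticks, which is the intuition behind the claimed random read benefit.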

It seems like, for A1 cards, the card has to do all the hard work in its controller to achieve the IOPS benchmarks, whereas on A2 cards, a lot of the heavy lifting would be offloaded to the host device or operating system. This has a couple of interesting implications:

  • A2 cards, if they are ever properly supported by devices like the Raspberry Pi, may have different behavior in situations where the power supply is inadequate. For older microSD cards, a common problem is data corruption if you use a low quality power supply (and this has become a bigger problem with every generation of Pi; you need a good power supply or you'll have a lot of annoying problems).
  • A2 cards seem to sacrifice some performance when used with hardware which doesn't have A2 Command Queue and Cache functions, versus A1 cards, which offer the same random I/O performance on any device.
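One way to picture why the Cache function could change behavior under a flaky power supply is the toy model below. It assumes the cache lives in volatile RAM (the quoted description also allows SLC flash, which would not lose data this way), and the class and method names are hypothetical, not a real card or driver API.

```python
class ToyCachedCard:
    """Toy model of a card with a volatile write cache (illustration only)."""

    def __init__(self):
        self.cache = {}  # fast, volatile staging area (assumed to be RAM)
        self.flash = {}  # slow, non-volatile storage

    def write(self, block, data):
        # The card reports the write as accepted once the data is cached.
        self.cache[block] = data

    def flush(self):
        # A flush command (or background work) commits the cache to flash.
        self.flash.update(self.cache)
        self.cache.clear()

    def read(self, block):
        # Cached data wins over whatever is already on flash.
        if block in self.cache:
            return self.cache[block]
        return self.flash.get(block)

    def power_loss(self):
        # Anything that was accepted but not yet flushed is gone.
        self.cache.clear()


if __name__ == "__main__":
    card = ToyCachedCard()
    card.write(0, b"flushed")
    card.flush()
    card.write(1, b"not flushed")
    card.power_loss()
    print(card.read(0), card.read(1))
```

In this model the host has been told both writes succeeded, yet only the flushed block survives the power loss. Whether real A2 cards behave like this depends on how each card implements its cache.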

So unless and until more devices and operating systems support A2 functionality, it's best to avoid purchasing any A2 cards. If you buy an A1 card, you'll get better performance now, and quite possibly forever. A2 might not technically be marketing BS, but in the real world, I stand by my assertion that it still is.

I haven't seen any indication there is work being done to add support in the Linux kernel, nor in the Raspberry Pi hardware, so I'd posit that, at least for most general computing purposes, the A1 and A2 designations from the SD Association are pretty meaningless, and you still have to rely on 3rd party testing to determine which cards give the best performance.

Comments

Have you tested these on the new Raspberry Pi 4?

All of my testing for the past few blog posts has been on the Raspberry Pi 4 model B, or in a few rare places—if noted explicitly—on a 3 model B+.

nice work man .... enjoy reading your findings and thankful you do what you do.

From what I can determine, command queue support was added to the Linux kernel in 2017, around the same time that version 6.0 of the SD Card spec came out.
https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1528214.html
https://github.com/torvalds/linux/commits/master?after=6789f873ed373319c...

Command Queue can be enabled using the `cmdq_en` attribute:
https://www.kernel.org/doc/Documentation/mmc/mmc-dev-attrs.txt
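A minimal sketch for checking that attribute from Python, assuming the card shows up as `mmcblk0` and that its attributes live under `/sys/block/mmcblk0/device/`. Both are assumptions about a typical setup; the attribute may simply not exist for a given card or host controller, in which case the function returns `None`. The function name is made up for illustration, and it only reads the attribute.

```python
from pathlib import Path


def read_cmdq_en(device="mmcblk0", sysfs_root="/sys/block"):
    """Return True/False for the cmdq_en attribute, or None if it is absent.

    `device` and `sysfs_root` are assumed defaults; adjust for your system.
    """
    attribute = Path(sysfs_root) / device / "device" / "cmdq_en"
    try:
        return attribute.read_text().strip() == "1"
    except OSError:
        # Missing attribute, missing device, or no permission to read it.
        return None
```

Calling `read_cmdq_en()` on a system that does not expose the attribute returns `None` rather than raising, so it is safe to try on any machine.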

I'm thinking about making my own database asynchronous on writes, basically implementing these two features manually myself.

I don't think you should need some (A2) version of a product to get this feature.

That `cmdq_en` attribute is read-only.

I have been obsessed with the new generation of larger SD cards and their latency (install sysstat, then run `iostat -x` and look at r_await and w_await).

Toshiba has seriously dropped the ball and has latency peaks of more than 5 seconds. SanDisk cards are still the ones to use; at least the High Endurance cards work fine. I have some of their regular larger (200/400GB) ones on the way.
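For anyone who wants the same await figures without installing anything, they can be approximated from two samples of `/proc/diskstats`: divide the change in milliseconds spent on reads (or writes) by the change in completed reads (or writes). The sketch below assumes a Linux system and a device named `mmcblk0`; the function names are made up for illustration.

```python
import time


def parse_diskstats(text):
    """Map device name -> (reads, ms_reading, writes, ms_writing)."""
    stats = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 11:
            continue  # skip blank or malformed lines
        # fields: major minor name reads merged sectors ms_reading
        #         writes merged sectors ms_writing ...
        stats[fields[2]] = (
            int(fields[3]), int(fields[6]), int(fields[7]), int(fields[10])
        )
    return stats


def await_ms(before, after):
    """Average ms per completed read and per completed write between samples."""
    reads = after[0] - before[0]
    writes = after[2] - before[2]
    r_await = (after[1] - before[1]) / reads if reads else 0.0
    w_await = (after[3] - before[3]) / writes if writes else 0.0
    return r_await, w_await


def sample_await(device="mmcblk0", interval=5.0, path="/proc/diskstats"):
    """Sample the stats file twice and return (r_await, w_await) in ms.

    Raises KeyError if `device` (an assumed name) is not listed.
    """
    with open(path) as stats_file:
        first = parse_diskstats(stats_file.read())[device]
    time.sleep(interval)
    with open(path) as stats_file:
        second = parse_diskstats(stats_file.read())[device]
    return await_ms(first, second)
```

Calling `sample_await()` while a benchmark is running gives an average over the interval. A multi-second stall like the one described above would only stand out clearly with a short interval or a mostly idle card.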

There is a chicken-and-egg race around releasing hardware before drivers for it are available. Normally, I would have assumed the device manufacturer reached out to somebody in VLSI/FPGA/Integration land, did tests, and got chipset confirmation of behaviour, but that would imply somebody got first-market-mover advantage, and some device out there has a huge "A2 ready" flag, which you would have seen. Maybe it's in a camera or phone, but if it's in a phone (Linux == Android) there are drivers in a repo, which means it should be in a mainline repo for Linux on Intel or Arm in other devices. The chicken is ready but the egg is cooked? I don't get it: why release A2 branding and not work with your code supply chain logistics guy to make drivers available?

Your original article was good, and factual when stating that the A2 cards should not be used in the Raspberry Pi, but it quickly derailed into a criticism of the SD card makers, with the insinuation that they were peddling snake oil. Don't get me wrong though, SD specs are even more confusing than USB naming conventions, and it's almost impossible for a technical person to understand what performs best, let alone a consumer.

I think in this case you didn't understand what was required to achieve the advertised performance, but it's neatly laid out here: https://www.sdcard.org/downloads/pls/latest_whitepapers/Mobile_Device_In...

The big problem is that the spec came out in 2017, and the cards for A2 didn't even appear until late last year. The controller has to support the SD 6.0 spec, so I think that probably precludes anything you used to test your cards. The information is scant, but supposedly the SanDisk MobileMate 3.0 reader supports A2, so it might be worth picking one up for 15 bucks and retesting your A2 cards. Really, I think the problem is that for anything beyond video or photos, SD cards are dying out, and there isn't any rush for anyone to support them. They would be good for SBCs, but now even most SBCs are moving away from them. Mobile applications would be a logical choice, but more and more phones are dropping SD card support.

So is it the hardware or the software in the RPi 4 that is the problem? It seems command queuing is in the Raspbian kernel: https://github.com/raspberrypi/linux/blob/rpi-4.19.y/drivers/mmc/host/cq...

Wouldn't it be faster to have USB boot with USB3 sticks?

Not sure this is the place to share this comment, but I have been researching how to optimize the power usage, and consequently the life span, of SD cards (and especially USB thumb drives) on the Raspberry Pi by reducing unnecessary rewrite cycles.

What are the best practices for using a USB thumb drive on an RPi USB port as low-power, high-availability, low-budget, long-lived storage to keep our files and serve/store ebooks, videos, music, and other files, while avoiding public clouds and privacy issues?

I have a Raspberry Pi on my SOHO network, running 24x7 for the last 3 years. The RPi only goes offline when my ISP goes down or the light company has a problem, but I wish I could use it to fully replace my NAS, which is a power hog.

I could just plug a USB stick into the USB port, but what are the long-term effects and reliability of this approach?

I am using an RPi Model B Revision 2.0 with 512MB for a low power budget, and I use a small fan on the side of it (a clip-on fan originally meant for desktop DRAM modules), which has a very small power drain too. Keeping the CPU temperature low is a good practice to reduce power.

Please share your thoughts about this (or maybe even open a new blog topic), since I believe this is a major topic for all RPi users.

Cheers!