SanDisk's High Bandwidth Flash targets AI's memory bottleneck with SK hynix backing

On why HBF's economics matter for Australia's A$20 billion data centre buildout.

HBF offers 8 to 16 times the capacity of HBM at comparable cost, with 1.6 TB/s of bandwidth in its first generation, and it targets inference, the workload that dominates modern AI computing. SK hynix, the HBM market leader, is co-standardizing HBF through the Open Compute Project, a sign of industry confidence given that HBF competes with the company's own high-margin HBM business.

  • HBF delivers 8-16x the capacity of HBM at comparable cost, with 1.6 TB/s bandwidth in generation one
  • SK hynix, which controls 62% of the HBM market, is co-standardizing HBF through Open Compute Project
  • First HBF samples due H2 2026; production AI-inference devices expected early 2027
  • Hyperscalers (Google, Microsoft, Amazon, Meta) tracking toward roughly $725 billion in 2026 capex

SanDisk is introducing High Bandwidth Flash (HBF), a new memory architecture designed to address the capacity and cost limits of AI inference that current DRAM and HBM technologies cannot handle efficiently.

Every artificial intelligence accelerator built today is starving for the same thing: enough memory sitting close enough to the chip to hold an entire trained model without choking. That hunger is reshaping how the data centre thinks about memory itself.

The problem is older than it sounds. Today's memory, meaning the DRAM and the specialized high-bandwidth memory (HBM) that power data centres, was designed for a different job. It excels at low latency and random access, properties that matter enormously for training AI models. But inference, the workload that actually runs a trained model when you type a prompt into ChatGPT or Claude, has different needs entirely. Inference streams large model weights sequentially. It doesn't need DRAM's latency advantage. It needs capacity, and it needs it cheap. Current memory technology can't deliver capacity and low cost together at the scale AI now demands, and that gap is where SanDisk's new High Bandwidth Flash architecture is trying to wedge itself in.
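A rough back-of-envelope model shows why streaming weights makes inference sensitive to bandwidth and capacity rather than latency. The sketch below is illustrative only: it assumes a single request stream that reads every weight once per generated token, 2-byte (BF16) weights, and the 1.6 TB/s first-generation figure quoted for HBF. The function name is hypothetical, and real systems batch requests, so treat the result as an upper bound for one stream, not a benchmark.

```python
def max_tokens_per_second(params: float,
                          bytes_per_param: float,
                          bandwidth_bytes_per_s: float) -> float:
    """Upper bound on single-stream decode speed when the whole
    model is streamed from memory once per generated token."""
    weight_bytes = params * bytes_per_param
    return bandwidth_bytes_per_s / weight_bytes


# Illustrative inputs (assumptions, not vendor measurements):
params = 405e9          # a 405-billion-parameter model
bytes_per_param = 2     # assumption: BF16 weights
bandwidth = 1.6e12      # 1.6 TB/s, in bytes per second

tps = max_tokens_per_second(params, bytes_per_param, bandwidth)
print(f"{tps:.2f} tokens/s upper bound for one stream")
```

Under these assumptions the bound depends only on bandwidth divided by model size. Access latency does not appear in the formula, which is the intuition behind pitching flash at inference.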

The numbers tell the story. SanDisk's first-generation HBF uses 256-gigabit dies stacked 16 high for 512GB per stack, with 1.6 terabytes per second of read bandwidth. The company claims it can deliver 8 to 16 times the capacity of HBM at similar cost, enough to attach up to 4 terabytes of memory beside a single GPU. A second generation targets more than 2 terabytes per second and 1-terabyte stacks. A third aims at 3.2 terabytes per second and 1.5-terabyte stacks. In SanDisk's own simulation, HBF reading the weights of a Llama 3.1 405-billion-parameter model landed within 2.2 percent of a hypothetical unlimited-capacity HBM system. The honest caveat is that flash latency still runs meaningfully higher than DRAM's, which is precisely why HBF targets inference rather than training, and why it isn't pitched as a DRAM replacement.
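The capacity figures can be checked with a few lines of arithmetic. The sketch below assumes the die capacity is 256 gigabits (32 gigabytes), which is the reading under which 16 dies give the quoted 512GB stack. It also assumes eight stacks per GPU, a number inferred here from 4 TB divided by 512GB and not stated in the article, and BF16 (2-byte) weights for the 405-billion-parameter model.

```python
import math

GBIT_PER_DIE = 256     # assumed unit: gigabits per die (32 GB)
DIES_PER_STACK = 16    # 16-high stack, as quoted
STACKS_PER_GPU = 8     # assumption: inferred from 4 TB / 512 GB
PARAMS = 405e9         # Llama 3.1 405B parameter count
BYTES_PER_PARAM = 2    # assumption: BF16 weights


def stack_capacity_gb(gbit_per_die: float, dies: int) -> float:
    """Capacity of one stack in gigabytes (8 bits per byte)."""
    return gbit_per_die / 8 * dies


stack_gb = stack_capacity_gb(GBIT_PER_DIE, DIES_PER_STACK)
gpu_tb = stack_gb * STACKS_PER_GPU / 1024       # binary terabytes
model_gb = PARAMS * BYTES_PER_PARAM / 1e9       # decimal gigabytes
stacks_needed = math.ceil(model_gb / stack_gb)  # whole stacks to hold the weights

print(f"per stack: {stack_gb:.0f} GB, per GPU: {gpu_tb:.0f} TB")
print(f"405B weights: {model_gb:.0f} GB, stacks needed: {stacks_needed}")
```

Under these assumptions the weights of the 405-billion-parameter model occupy two first-generation stacks, leaving the rest of a 4 TB configuration free for larger models or additional copies.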

What gives HBF credibility beyond SanDisk's own claims is who's backing it. In July 2025, SanDisk formed an HBF technical advisory board chaired by David Patterson, the UC Berkeley emeritus professor and Google distinguished engineer who co-pioneered RISC and RAID and won the 2017 ACM Turing Award. Alongside him sits Raja Koduri, the former AMD chief architect and ex-Intel graphics executive who shipped Polaris, Vega, Navi, and Intel's Arc GPUs. Then came the real signal. In August 2025, SanDisk and SK hynix, which controls roughly 62 percent of the HBM market, signed a memorandum of understanding to collaborate on an HBF specification. In February 2026, they kicked off global standardization through the Open Compute Project. SK hynix is essentially helping standardize a technology that nibbles at its own high-margin cash cow. That's a hedge, and it's the clearest tell yet that the industry believes HBF is real.

The timing matters because the stakes are enormous. Hyperscale providers (Google, Microsoft, Amazon, Meta) are spending at a scale that turns memory efficiency into a first-order cost. Google alone guided to $175 billion to $185 billion in 2026 capex, close to double its 2025 outlay, much of it for data centres and custom chips. Across those four companies, 2026 capex is tracking toward roughly $725 billion. At that scale, shaving cost and power consumption off the memory tier is worth billions.

The edge and enterprise markets matter too. HBM barely exists in phones and edge devices because of density, cost, and power penalties. HBF's persistence—the fact that it retains data when power drops—changes the calculus. A smartphone with HBF-backed memory could run sophisticated AI models locally in real time, recalling context from previous conversations without a round trip to the cloud. For enterprises, HBF-enabled accelerators could finally make it economical for smaller companies to fine-tune large pre-trained models for domain-specific uses. The mid-market, priced out of hyperscaler-grade GPU clusters, finally gets a seat.

In Australia, this lands with particular force. Amazon has committed A$20 billion to Australian data centres from 2025 to 2029, the largest tech investment in the country's history. OpenAI and NextDC are building a roughly A$7 billion AI campus at Eastern Creek in Sydney. The local hyperscale market is projected to grow from about USD 6.27 billion in 2026 to USD 16.18 billion by 2031. Every one of those campuses runs into the same two walls: power and cost. A memory tier that promises more capacity per watt and a better price per terabyte matters here in hard dollars. It's the difference between a Sydney inference cluster that pencils out and one that doesn't.
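For a sense of how steep that market projection is, the two endpoints imply a compound annual growth rate. The sketch below uses only the USD 6.27 billion and USD 16.18 billion figures quoted above. The helper name is hypothetical, and the calculation assumes smooth compounding over the five years from 2026 to 2031.

```python
def cagr(start_value: float, end_value: float, years: int) -> float:
    """Compound annual growth rate implied by two endpoint values."""
    return (end_value / start_value) ** (1 / years) - 1


# Endpoints quoted in the article, in USD billions
rate = cagr(6.27, 16.18, 2031 - 2026)
print(f"Implied growth: {rate:.1%} per year")
```

That works out to roughly 21 percent a year, sustained for five years.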

HBF is still pre-silicon. First samples are due in the second half of 2026, with the first AI-inference devices carrying it expected in early 2027. Flash latency won't magically match DRAM between now and then. But the shape of this is hard to miss: a credible spec, a Turing laureate steering the board, the HBM market leader co-writing the standard, and the Open Compute Project holding the pen. That's how a single vendor's pitch turns into an industry conversation. If HBF ships close to its numbers, the AI memory wall finally gets a service door.

"HBF shows the promise of playing an important role in datacenter AI by delivering unprecedented memory capacity at high bandwidth, enabling inference workloads to scale far beyond today's constraints."
— David Patterson, UC Berkeley emeritus professor and chair of SanDisk's HBF technical advisory board

"The key to AI infrastructure is to go beyond the performance competition of individual technologies and to optimize the entire ecosystem."
— SK hynix president and chief development officer Ahn Hyun

The Hearth Conversation: another perspective on the story

Inventor

Why does inference need different memory than training? They're both running the same model.

Model

Training is random-access chaos: the algorithm bounces around the model constantly, updating weights everywhere. Inference is more like reading a book front to back. You stream the model weights through in order. You don't need low latency; you need capacity and bandwidth.

Inventor

So DRAM is overspecced for the job.

Model

Exactly. You're paying a premium for low latency and random access that inference barely uses. It's like buying a sports car to drive to the grocery store.

Inventor

Why is SK hynix helping standardize something that competes with their own business?

Model

Because they're hedging. If HBF takes off, they want a piece of it. If it doesn't, they've lost nothing. But the real reason is they believe it's coming anyway. Better to shape the standard than watch from the sidelines.

Inventor

What's the catch? Why hasn't someone done this before?

Model

Flash latency is still higher than DRAM. For training, that's a dealbreaker. For inference, it's tolerable because you're streaming, not jumping around. The technology had to wait for the workload to shift. Inference just became the dominant AI workload.

Inventor

Does this actually work, or is it just numbers on a spec sheet?

Model

SanDisk's simulation of a 405-billion-parameter model came within 2.2 percent of a hypothetical unlimited-capacity HBM system. That's encouraging, but it's a simulation, and we won't know for sure until sample silicon arrives in the second half of 2026. The spec is credible, but specs are easy. Fabs have to make the parts.

Inventor

Who actually benefits first?

Model

Hyperscalers, immediately. The big four are tracking toward roughly $725 billion in 2026 capex, and every watt and dollar saved matters. Then edge devices and mid-market enterprises that can't afford hyperscaler-grade clusters. That's where the real unlock happens.

Want the full story? Read the original at iTWire.