Twelve ideas the paper assumes you know
Each card clears up a single concept the paper builds on. Read them in order the first time, or skip to whatever looks unfamiliar.
1 · RAM is not the same as storage (disk / SSD)
Picture your desk and a filing cabinet. The desk holds whatever you're working on right now — instant to grab, but small, and cleared off at the end of the day. The cabinet is roomy and keeps everything safe overnight, but it's slower to reach.
A computer has both. RAM is the desk: fast, temporary working space, wiped blank when the power goes off. Storage (disk or SSD) is the cabinet: slower, but permanent — your files survive a reboot.
2 · What computer memory (RAM) actually does
Think of RAM like a kitchen whiteboard. While cooking you jot notes — the timer, what's in the oven, the next step. Fast to read, fast to scribble on, and wiped clean when you leave.
That's RAM for a computer: its fast, temporary working space. Every running program lives there. When a laptop "has 16 gigs of RAM," that whiteboard is what they mean — and more of it, faster, costs more.
3 · The difference between reading and writing data
A sticky note on the fridge. You can glance at it (reading) a hundred times without touching it. Or you can cross it out and rewrite it (writing) — slower, takes effort, actually changes the paper.
To us these feel like one act. For hardware they're completely separate operations with completely different costs. This gap matters more than any other in the paper: the new memory is wonderful at reading but slow and clumsy at writing.
4 · What an operating system does
Picture an apartment building. Tenants (apps like a browser or game) live in their units but never touch the wiring, plumbing, or front‑door locks. For that there's a building manager with master keys.
An operating system — Windows, macOS, Android — is that manager. It's a privileged program between every app and the actual hardware. It can quietly pause an app, do work behind its back, and resume it, and it's the only thing trusted to touch the hardware directly.
5 · What copy‑on‑write is
A shared document set to "read‑only." Everyone can open it; the edit button is locked. Try to type anyway and the system silently does "Save As," makes a fresh copy just for you, and sends your typing there. The original is never touched.
That trick is copy‑on‑write — a decades‑old, ordinary OS technique, not something this paper invented. One word to defuse: when the system notices your write attempt, the technical term is a "fault." That sounds like a crash, but it's just a polite tap on the manager's shoulder.
6 · Why memory wears out (write endurance)
A page you write in pencil, erase, and rewrite. The first few times are fine. But erase and rewrite the same spot enough and the paper thins, smudges, and finally tears.
Ordinary RAM is basically magic paper — you can rewrite it endlessly. But some memory is like that pencil page: each tiny cell survives only so many rewrites before it physically dies. Crucially, only writing wears it out — reading is harmless.
7 · What a memory page or chunk is
A warehouse that only handles standard boxes. Even one small item goes in a whole box, and the forklift only moves complete boxes, never loose items.
Memory works the same way. We picture changing one letter at a time, but really memory is handled in fixed‑size chunks: the OS shuffles "pages" of about 4 kilobytes (the standard boxes); the hardware moves smaller 64‑byte chunks into the processor. Nothing happens one loose byte at a time.
8 · Why memory has a cost plateau
A gadget whose price dropped a little every year — until the discount suddenly stops and the price freezes, because one stubborn part inside simply can't be made cheaper. It has hit a permanent floor.
That's what happened to ordinary memory (DRAM). For decades it got cheaper per unit of data, then plateaued because of a physical limit in its tiny cells. This is different from a temporary price spike (like the AI‑driven shortage of 2025‑2026); it's a wall that won't move on its own. In data centers, memory is now over half the cost of each server.
9 · The hardware / software boundary
A restaurant kitchen. The cleverness of a meal can live in one fancy all‑in‑one machine, or in a plain dumb hotplate paired with a skilled chef. Same meal — but you've drawn the line between "machine" and "person" in a totally different place.
Computers face the same choice. A memory stick can contain its own little controller chip that hides its quirks (the fancy machine), or that intelligence can live in the OS, leaving the chip simple (the hotplate‑plus‑chef). That dividing line is the hardware/software interface — a contract of who does what, not a screen you click.
10 · What "LtRAM‑as‑peer" means
Older systems add cheap memory like a slow basement across the house: roomy but a pain to reach, so anything sent there gets "demoted" and you pay a delay every time you fetch it.
This paper does something different: a second shelf right beside your desk, at the same height, just as quick to read from. The cheap new memory (LtRAM) isn't demoted storage and isn't a temporary copy (a "cache"). It's a genuine equal partner to normal memory, holding different kinds of data.
11 · What an "invariant" means
A board game with one rule players promise never to break: "the bank's money is never touched without a receipt." Everything else can vary, but that one rule holds every turn — so everyone plans around it with total confidence.
In computer systems that kind of always‑true rule is called an invariant. It just means "a promise the system guarantees will hold at all times, with no exceptions." AROM's invariant: regular programs may only read the new memory, never write it; only the OS writes it. Always.
12 · What latency vs bandwidth means
A highway. Two different questions: how long does one car's trip take end to end? That's latency. How many cars pass a point each hour? That's bandwidth. A road can have short trips but few lanes, or long trips but many.
Memory has both measures. Latency is how long one read or write takes; bandwidth is how much data flows per second. People blur them into "fast," but a memory can be great at one and poor at the other.
AROM, in a nutshell
The problem, the failed shortcut, and the clean fix — then a map of how every concept connects.
In large data centers, servers now spend more than half their hardware budget on one component: memory (the fast "RAM" your computer works out of, called DRAM). Worse, the price of DRAM has stopped falling (a permanent manufacturing wall, not a passing spike), so this cost will only keep growing.
The obvious escape is a cheaper memory called LtRAM (Long‑term RAM): it reads just as fast as DRAM and is far denser and cheaper — but it's slow to write, only in big chunks, and physically wears out if you rewrite the same spot too often. The one product that tried it, Intel Optane, disguised this odd memory as ordinary DRAM by bolting a complicated controller onto the memory stick. That disguise is exactly what made it slow and expensive.
This paper's key idea, AROM (Application Read‑Only Memory), throws the disguise away. Picture a library where visitors may read any book but only the librarian may shelve them. AROM lets ordinary programs only READ the new memory; only the operating system may write to it, and only when deliberately moving data into place. With small, messy writes forbidden, the memory chip can drop all of Optane's hidden machinery and stay dumb and cheap, while the software handles the awkward parts.
The payoff: across every candidate technology, projected read speed is 26–79% faster than Optane, landing within 0.9–3.2× of DRAM. The surprise is STT‑MRAM, projected at 0.9× — roughly 72 nanoseconds versus DRAM's 80 — meaning it could match or slightly beat ordinary memory on reads. The catch: STT‑MRAM costs about as much as DRAM and may never scale to DRAM‑sized capacity.
The work is honest about what's unsettled. No single LtRAM technology wins on speed, density, endurance, and manufacturability at once, so there's no clear front‑runner yet. And while the OS rations writes to keep the chip within its lifetime budget, it doesn't yet spread that wear evenly.
🔑 Key takeaways
- LtRAM reads as fast as DRAM but writes slowly, in big chunks, and wears out — so it fits read‑mostly data.
- One rule (apps may only read LtRAM; only the OS writes it) lets the chip drop all of Optane's costly hidden machinery.
- Projected reads are 26–79% faster than Optane and within 0.9–3.2× of DRAM.
- STT‑MRAM can match DRAM read speed (0.9×) but carries DRAM‑level cost and may not scale to DRAM capacity.
- The OS handles migration invisibly via copy‑on‑write — apps need no changes.
- Open questions: no single LtRAM technology dominates yet, and wear is bounded in total but not spread evenly.
Concept map, in three parts: ① Why, and what's good for it; ② Cautionary tale: Optane (what NOT to do); ③ The paper's answer: AROM + a thin interface (OS: page placement · dirty‑bit scan · token allocator).
How to think about AROM
Each picture stands alone — read them in any order. Every card ends with an honest note about where the picture stops being true.
The reference‑room encyclopedia
Anyone may walk up and read the giant encyclopedia for free — no limit. But there's one firm house rule: visitors never write in it. If a change is needed, only a librarian makes it, on a fresh copy elsewhere. The shared book stays untouched, so it keeps serving everyone fast. That one rule is what lets all the complicated write‑handling machinery be removed from the hardware.
The kitchen recipe cards
Master recipe cards the whole staff reads all day, with a rule: nobody writes on a master. The instant a cook touches pen to a card, a helper slides a fresh photocopy under the pen so the ink lands on the copy. The master is never marked, and the cook barely notices the swap. That reflex is copy‑on‑write: the moment a program tries to change read‑only data, the system instantly diverts the change to a private copy.
Stone tablets vs. a whiteboard
Reading a stone tablet is instant — exactly as fast as reading a whiteboard, and stone is cheap to stack densely. But writing on stone is the opposite of easy: carving is slow, you re‑carve a whole tablet rather than fix one letter, and a tablet can only be reground a limited number of times before it crumbles. LtRAM is shaped like that stone: read‑fast and cheap, but write‑slow, big‑chunk only, and it wears out.
The pencil page that tears
Write in pencil, erase, rewrite — many times, but not infinitely. Each erase scuffs the paper until it thins and finally tears. Crucially, reading costs the paper nothing; you could glance a million times and it stays pristine. Only the erase‑and‑rewrite cycle uses up its life. Write endurance is the same: each cell survives only a bounded number of rewrites, so rewrites are a finite budget to spend carefully.
The laminated evacuation map
People glance at the fire‑exit map constantly, but it's essentially never edited — maybe once in a remodel, years apart. Because it's read often and changed almost never, it makes sense to print and laminate it (cheap, durable, fast to glance at) rather than keep it on an expensive rewritable display. Read‑mostly data is the computer version: program code, a fixed AI model's numbers, a cached lookup table.
The apartment‑density floor
A city keeps fitting more apartments onto the same plots by building taller, so density rises every year. You'd expect price per apartment to keep falling — but every unit still needs one costly part (say a special elevator mechanism) that refuses to get cheaper. So even as buildings grow denser, the cost per apartment flattens out. DRAM is exactly this: engineers pack more in, but one stubborn cell ingredient won't get cheaper, so price per gigabyte has stopped falling.
The drought water allowance
One well must last the whole dry season, so the family sets a daily bucket allowance sized to reach the first rains. Each chore spends a bucket; quiet days bank unused buckets for a big laundry day later. The token allocator works the same: the OS releases "write tokens" at a steady, precalculated rate, every write spends one, unused tokens accumulate, and because writes can never outrun the token supply, the chip is guaranteed to survive its full lifetime.
The rotating work boots
You own five pairs and wear one a day. Always grabbing your favorite kills it in a year; instead you rotate, so all five last roughly five times as long. Wear leveling is this rotation for memory: since each location survives only so many rewrites, the system deliberately spreads writing around so no spot wears out far ahead of the rest.
The desk and the filing cabinet
Documents you're actively editing stay on the desk (fast memory), within instant reach; reference binders you only consult get filed in the cabinet (cheap memory). The arrangement isn't frozen — a binder that suddenly needs heavy editing comes back to the desk first; a desk document that settles down gets filed away. Page migration is this continual moving, steered by how each piece is currently used.
The simultaneous interpreter
Two people who share no language, with an interpreter relaying every sentence both ways. It works — but every sentence detours through the interpreter, so it's slower, and you pay the interpreter's salary the whole time. Optane took a strange new memory and bolted a translator onto the stick so the computer could talk to it as ordinary memory. The disguise worked, but the translator sat in the middle of every access — and that constant detour made Optane slow and expensive.
The coat‑check ledger
Behind the counter, attendants secretly reshuffle coats all night so no rack overcrowds. Because a coat is never where you left it, every ticket means first looking up its current rack in a big ledger in the back — a delay on every retrieval. The Address Indirection Table is that hidden ledger: a private map translating each address to where data actually sits. It lets the hardware shuffle data, but every read must consult the map first.
How it actually works
The complete walk‑through, section by section, with each of the paper's diagrams rebuilt as a visual you can read at a glance.
1 · Introduction — the movie trailer
The fast working memory in servers (DRAM) has become shockingly expensive. At companies like Microsoft and Meta, memory alone is now more than half the cost of a server, and the price per unit has stopped dropping.
The proposed solution: mix in a cheaper memory, LtRAM. It's dense, cheap, and reads as fast as DRAM, but it's slow to write, only in big chunks, and wears out — so it's perfect for data you read constantly but rarely change. The catch is the cautionary tale: Intel Optane made the new memory pretend to be DRAM by bolting a complicated controller onto the stick, and that disguise is exactly what made it slow and pricey. The authors say: stop forcing the disguise. Keep the hardware dumb, and let the OS handle the bookkeeping. Their rule for doing it safely is AROM — programs may only READ the new memory.
2 · Motivation — the money problem
Ordinary memory used to get cheaper every year, then stopped. Two things happen at once: computers want more and more memory, and the price per unit has flattened and isn't coming back down. The subtlety: this is a permanent wall, not a passing spike. Yes, an AI‑driven buying frenzy in 2025‑2026 roughly doubled memory prices, and that may ease — but underneath it is a deeper problem. The manufacturing process relies on a special capacitor that refuses to get cheaper as it shrinks, so more density no longer means a lower price per unit.
The usual tricks only go so far: you can compress data, or shove rarely‑used data onto slower memory ("tiering"). These help you use memory more efficiently, but none make memory itself fundamentally cheaper per unit. That sets up the pitch: maybe we need a genuinely cheaper kind of memory.
3 · LtRAM and the Optane warning
LtRAM is a family of new chips that share a "shape": read as fast as normal memory, much denser and cheaper, but with three matching downsides: writing is slow, writes must be big chunks, and the chip wears out. The candidate technologies have intimidating names (RRAM, PCM, FeFET, MRAM); treat them as labels for now, and note that none is perfect.
Now the cautionary tale. The deep reason Optane cost so much is a chunk‑size mismatch. Inside Optane the smallest writable unit is a 256‑byte block (an "XPLine"), but programs often want to change just a few bytes. The hardware can't touch only those bytes — it must read the whole block, change the small part, and write the whole block back, every time. That's a read‑modify‑write, and it moves roughly 4× as much data as asked. On top of that, a secret lookup table (the AIT) translates every address, and it's too big to keep handy, so most reads pay an extra fetch. Hold onto that chunk‑size mismatch — it's the hinge of the whole paper.
A · The chunk‑size mismatch
The CPU wants to change one 64‑byte cache line. Optane's media only writes a whole 256‑byte XPLine — four cache lines wide.
B · The read‑modify‑write dance: three steps for one write (① Read the whole block, ② Modify the small part, ③ Write the whole block back)
C · The AIT lookup tax — a toll on every single read
Every read must first check Optane's secret Address Indirection Table. Its on‑stick cache covers only ~6% of the chip, so 94% of reads miss and pay an extra media trip.
| Read scenario | Total latency | vs DRAM (80 ns) |
|---|---|---|
| AIT cache HIT (rare) | 351 ns | 4.4× slower |
| AIT cache MISS (most reads) | 427 ns | 5.3× slower |
| DRAM reference | 80 ns | 1.0× |
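To see what the two Optane rows mean for a typical read, here is a minimal sketch that blends them by hit rate. It assumes reads are spread evenly across the chip, so the hit rate simply equals the ~6% cache coverage quoted above; a workload with good locality would hit more often. The function and variable names are illustrative and do not come from the paper.

```python
# Latencies from the table above, in nanoseconds.
AIT_HIT_NS = 351.0
AIT_MISS_NS = 427.0
DRAM_NS = 80.0


def expected_read_ns(hit_rate: float) -> float:
    """Average Optane read latency for a given AIT cache hit rate."""
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    return hit_rate * AIT_HIT_NS + (1.0 - hit_rate) * AIT_MISS_NS


# Assumption: uniform access, so hit rate == cache coverage (~6%).
average_ns = expected_read_ns(0.06)
slowdown_vs_dram = average_ns / DRAM_NS

print(f"average read: {average_ns:.1f} ns, {slowdown_vs_dram:.1f}x DRAM")
```

Under that assumption the average sits very close to the miss row, which is why the miss case is the one that matters in practice.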
4 · The proposed hardware / software interface
This is the heart of the paper. Having shown that Optane failed by stuffing intelligence into the hardware, the authors flip it: make the hardware deliberately dumb, and move the clever decisions into the OS. The key that makes this safe is AROM — programs may only READ the new memory; only the OS writes it, and only when deliberately moving data. If a program tries to change something in LtRAM, the system quietly catches it (copy‑on‑write), copies the data into DRAM first, and lets the write land there. The program never notices.
Why does one little rule matter so much? Because it removes the need for all of Optane's hidden machinery. If programs can never make small messy writes — and AROM guarantees that — the hardware never does the read‑modify‑write dance and never needs a hidden translation table. The chip's whole job shrinks to three things: hand over data when read, accept whole 4 KB page writes from the OS, and report how worn each block is.
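The three‑job device described above can be pictured as a very small interface. The sketch below is a toy model in Python, not the paper's hardware design: the class name `ThinLtRAM`, its method names, and the choice to count wear per page (rather than per erase block) are all assumptions made for illustration.

```python
PAGE_SIZE = 4096  # bytes; the OS-sized page from the text


class ThinLtRAM:
    """Hypothetical model of the stripped-down device: read, page write, wear report."""

    def __init__(self, num_pages: int) -> None:
        self._pages = [bytes(PAGE_SIZE) for _ in range(num_pages)]
        self._writes = [0] * num_pages  # per-page write count, used as wear

    def read(self, address: int, length: int) -> bytes:
        """Return bytes straight from the media; there is no translation table."""
        page, offset = divmod(address, PAGE_SIZE)
        if offset + length > PAGE_SIZE:
            raise ValueError("this sketch only reads within one page")
        return self._pages[page][offset:offset + length]

    def write_page(self, page: int, data: bytes) -> None:
        """Accept only whole-page writes, so no read-modify-write is ever needed."""
        if len(data) != PAGE_SIZE:
            raise ValueError("LtRAM accepts whole 4 KB pages only")
        self._pages[page] = bytes(data)
        self._writes[page] += 1

    def wear(self, page: int) -> int:
        """Report how many times a page has been written."""
        return self._writes[page]


device = ThinLtRAM(num_pages=4)
device.write_page(1, b"\x07" * PAGE_SIZE)
print(device.read(PAGE_SIZE + 10, 4), device.wear(1))
```

Note what is absent: no small‑write path and no address remapping. Anything smaller than a page is rejected outright, which is the software‑visible form of the AROM rule.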
Figure (two panels): the read path, the fast and common case, where latency is just the raw LtRAM read time (~20–300 ns); and the write path, the copy‑on‑write detour.
5 · Implementation — did they build it?
Real LtRAM chips don't exist yet, so the team built a stand‑in: a research computer called Enzian (a processor wired to a reprogrammable chip), turning that chip into a pretend memory controller, with cheap NOR flash standing in for the exotic LtRAM. NOR flash isn't ideal, but it has the same "read‑fast, write‑slow, wears‑out" shape. They ran ordinary Linux with new memory‑management rules added.
The OS rations writes with a "write token" budget — every move of data into LtRAM spends one, so the chip can never wear out faster than planned. And the headline result: every candidate LtRAM technology comes out 26–79% faster than Optane on reads, landing between 0.9× and 3.2× of ordinary memory. The best candidate, STT‑MRAM, is projected at about 0.9× — roughly 72 ns against DRAM's 80, i.e. projected to match or slightly beat ordinary memory. Two honest caveats: these are modeled estimates, not measured guarantees, and STT‑MRAM in particular may hit manufacturing hurdles that keep it from DRAM‑class capacity.
How it works. The OS releases write tokens at rate r = N·E ÷ T — where N = total pages, E = erases each page survives, and T = the seconds the chip should last. Every migration into LtRAM spends one token. Quiet periods bank unused tokens for busy bursts; when the bucket hits 0, migrations wait. Because writes can never outrun the token supply, the chip is guaranteed to outlast its deployment — with almost no bookkeeping (just one running count).
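The token rule above can be sketched as a small token bucket. This is a minimal illustration, assuming a simple accrue‑and‑spend model with no cap on banked tokens; the class and method names are hypothetical, and the numbers are the round ones used in the worked example later on this page.

```python
class WriteTokenBucket:
    """Token bucket that caps page migrations into LtRAM at rate r = N*E/T."""

    def __init__(self, pages: int, endurance: int, lifetime_s: float) -> None:
        self.rate = pages * endurance / lifetime_s  # tokens per second
        self.tokens = 0.0                           # the single running count

    def tick(self, seconds: float) -> None:
        """Accrue tokens for elapsed time; unused tokens stay banked."""
        self.tokens += self.rate * seconds

    def try_migrate(self) -> bool:
        """Spend one token for one page write, or refuse if the bucket is empty."""
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


bucket = WriteTokenBucket(pages=1_000_000, endurance=100_000, lifetime_s=100_000_000)
bucket.tick(1)  # one quiet second banks one second's worth of tokens
granted = sum(bucket.try_migrate() for _ in range(1001))
print(bucket.rate, granted)
```

A burst of 1,001 migration requests after one idle second is granted only as many writes as were banked; the last request has to wait for the next refill.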
6 · Discussion — honest about the rough edges
The authors place their work next to three earlier buckets: making new memory pretend to be normal memory (Optane — failed on overhead); software that demotes rarely‑used data to a slower "basement" tier; and earlier thinkers who argued new memory deserves its own purpose‑built design. Their key distinction: they do NOT treat the cheap memory as a slow basement underneath DRAM — it's an equal partner beside it, specialized for read‑mostly data.
On open questions, they're upfront: which exact technology is best, how to cleverly guess which data is safe to move, how to spread wear when some data is written once and never touched, and how to avoid running out of DRAM if a workload suddenly starts writing a lot. These are flagged as future work, not failures.
7 · Conclusion — the recap
Start to finish: DRAM is now more than half a server's cost and its price has stopped falling. A cheaper memory (LtRAM) could help, but disguising it as DRAM (the Optane way) loads on too much hidden overhead. The answer is a thin, simple hardware/software line, made safe by one rule, AROM: programs see the new memory as read‑only, only the OS writes it, and only while deliberately moving data. With that, the chip strips down to almost nothing while the OS takes over the hard decisions. They demonstrated the design with a stand‑in prototype, aiming to let cheap memory sit as an equal partner to DRAM for read‑mostly data and meaningfully cut server cost.
Worked examples
A photo service called "Snaply" keeps a 4 GB AI model in memory — read millions of times, written almost never. Let's trace what happens on a read, a write, a wear check, and an Optane failure.
① Reading from AROM — the happy path
Snaply's model lives in cheap LtRAM (here, the fastest candidate, STT‑MRAM). It needs to read one slice.
- Snaply asks for a piece of the model — "give me 64 bytes at address X." It doesn't know or care which chip holds it.
- The hardware sees address X is in LtRAM. No permission check needed — everyone may always READ LtRAM. That's the whole point of the rule.
- The thin controller fetches the bytes directly. No secret lookup table, no bookkeeping, no detour — straight off the media.
- Snaply gets its data — speed depends on the chip. NOR proto ~530 ns, 3D FeFET ~252 ns, 3D V‑RRAM ~152 ns, STT‑MRAM ~72 ns (essentially DRAM's ~80 ns). Snaply chose STT‑MRAM, so the read lands at ~72 ns and feels instant.
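For a quick comparison of the latencies in the last step, the sketch below divides each one by DRAM's ~80 ns. The figures are the approximate ones quoted above; the dictionary and variable names are just for illustration.

```python
DRAM_NS = 80  # approximate DRAM read latency from the text

# Approximate read latencies quoted in the step above, in nanoseconds.
read_ns = {
    "NOR prototype": 530,
    "3D FeFET": 252,
    "3D V-RRAM": 152,
    "STT-MRAM": 72,
}

ratio_to_dram = {name: ns / DRAM_NS for name, ns in read_ns.items()}

# Print fastest first.
for name, ratio in sorted(ratio_to_dram.items(), key=lambda item: item[1]):
    print(f"{name}: {ratio:.2f}x DRAM")
```

A ratio below 1 means the read is projected to finish sooner than the same read from DRAM.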
② Writing to AROM data — the copy‑on‑write detour
A developer nudges one weight — Snaply tries to WRITE 8 bytes at address X, which lives in read‑only LtRAM. Apps may never write LtRAM directly.
- Snaply attempts the write — "store these 8 bytes at X," same as any memory.
- The hardware catches it and raises an alarm. The 4 KB page is marked copy‑on‑write. The write is frozen mid‑air — it has NOT happened.
- The OS copies the whole 4 KB page into DRAM. A fresh page in DRAM receives all 4,096 bytes.
- The OS remaps address X to point at the new DRAM copy.
- The write is retried — and now it lands. X is in writable DRAM, so the 8 bytes write instantly. Snaply never noticed the detour.
- The OS reclaims the old LtRAM page later — scheduled for a slow erase in the background, off the hot path.
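The six steps above can be mimicked in a few lines. This is a toy simulation, not kernel code: the `Page` and `AddressSpace` classes, the single‑level page table, and the reclaim queue are hypothetical stand‑ins, and the "fault" is just an `if` branch rather than a real hardware exception.

```python
PAGE_SIZE = 4096


class Page:
    def __init__(self, data: bytes, in_ltram: bool) -> None:
        self.data = bytearray(data)
        self.in_ltram = in_ltram  # LtRAM pages are read-only to the app


class AddressSpace:
    """Hypothetical one-process page table: virtual page number -> Page."""

    def __init__(self) -> None:
        self.table = {}
        self.reclaim_queue = []  # old LtRAM pages awaiting background erase

    def write(self, address: int, payload: bytes) -> None:
        vpn, offset = divmod(address, PAGE_SIZE)
        page = self.table[vpn]
        if page.in_ltram:
            # "Fault": copy the whole page to DRAM and remap before retrying.
            dram_copy = Page(bytes(page.data), in_ltram=False)
            self.table[vpn] = dram_copy
            self.reclaim_queue.append(page)
            page = dram_copy
        # The retried write now lands in writable DRAM.
        page.data[offset:offset + len(payload)] = payload


space = AddressSpace()
original = Page(bytes(PAGE_SIZE), in_ltram=True)
space.table[0] = original
space.write(16, b"\x01" * 8)
print(space.table[0].in_ltram, original.data[16:24] == bytes(8))
```

After the write, the address resolves to the DRAM copy and the LtRAM original is byte‑for‑byte unchanged, waiting in the reclaim queue.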
③ Rationing writes — how tokens keep LtRAM alive
A small example with round numbers:
- Chip specs: N = 1,000,000 pages; each survives E = 100,000 rewrites; target life T = 100,000,000 s (~3 years).
- Total lifetime budget: N × E = 100,000,000,000 (one hundred billion) page‑writes over the whole life.
- Convert to a steady drip: r = N·E ÷ T = 1,000 writes per second. The OS may hand out 1,000 tokens/second on average.
- Spend a token on every move into LtRAM. If tokens are available, the move happens; if the bucket's empty, it waits.
- Bank unused tokens during quiet hours, then spend the balance during a burst — like rolling over phone data. Long‑run average stays 1,000/s.
- Notice how little the OS tracks: just one running number (the token balance), not a separate erase count for every page. Because the OS is the only thing that writes LtRAM, counting tokens is enough.
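The arithmetic in this example is easy to check directly. The snippet below redoes it with the same round numbers; the one‑hour banking figure at the end is an extra illustration, not a number from the paper.

```python
N = 1_000_000        # pages on the chip
E = 100_000          # rewrites each page survives
T = 100_000_000      # target lifetime in seconds

lifetime_budget = N * E                 # total page-writes the chip can absorb
rate = lifetime_budget / T              # tokens per second
lifetime_years = T / (365 * 24 * 3600)  # seconds converted to years

quiet_hour_bank = rate * 3600           # tokens saved up by one fully idle hour

print(lifetime_budget, rate, round(lifetime_years, 2), quiet_hour_bank)
```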
④ Why Optane failed — the granularity‑mismatch walkthrough
Follow one tiny operation: a program writes a single 64‑byte cache line into Optane, whose media only writes 256‑byte blocks.
- The program asks to write 64 bytes (one cache line). Only those bytes need to change.
- The hardware can't write just 64 bytes. They sit inside a 256‑byte block, surrounded by 192 bytes it must not disturb.
- READ the whole 256‑byte block into scratch — 256 bytes moved, just to change 64.
- MODIFY the 64 bytes in the scratch copy; the other 192 stay as they were.
- WRITE the whole patched 256‑byte block back — another 256 bytes moved. That three‑step ritual is the read‑modify‑write.
- Tally the waste: 256 ÷ 64 = 4× write amplification (8× total media traffic once you count the read). Throughput collapsed from ~2.3 GB/s to ~0.56 GB/s (−75%). Every small write paid it, every time.
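The tally in the last step can be reproduced with a short calculation. This sketch assumes the small write sits entirely inside one media block, as in the walkthrough; the function name is hypothetical.

```python
CACHE_LINE = 64   # bytes the program wants to change
XPLINE = 256      # smallest block the Optane media can write


def rmw_traffic(request_bytes, block_bytes):
    """Media bytes (read, written) for one write smaller than a block.

    Assumes the request sits inside a single block, as in the walkthrough.
    """
    if not 0 < request_bytes < block_bytes:
        raise ValueError("sketch covers sub-block writes only")
    # The whole block is read into scratch, patched, then written back.
    return block_bytes, block_bytes


read_bytes, written_bytes = rmw_traffic(CACHE_LINE, XPLINE)
write_amplification = written_bytes / CACHE_LINE
total_amplification = (read_bytes + written_bytes) / CACHE_LINE

print(write_amplification, total_amplification)
```

Under AROM this function has nothing to model: the device only ever receives whole pages, so the requested size and the written size are the same.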
Optional code sketches
Pseudocode for the curious — safe to skip.
How the OS handles a write to read‑only memory
    # app tries to write to an LtRAM page
    when app writes to LtRAM page:
        OS catches the write    # hardware raises a flag
        OS copies the page into regular DRAM
        OS updates its map: app now points to DRAM copy
        OS lets the app write to the DRAM copy
        LtRAM original stays unchanged, cleared later
Like a librarian who won't let you write in the reference book but will photocopy the page so you can mark your own copy. The app never notices the swap.
How the OS rations writes to protect LtRAM
    every second:
        add a few tokens to the bucket

    when OS wants to move a page into LtRAM:
        if bucket has at least 1 token:
            spend 1 token
            write the page to LtRAM
        else:
            hold off and try again later
Total tokens over the device's lifetime equals exactly the total writes the chip can handle, so it can never wear out ahead of schedule.
How reading from LtRAM works — no hidden steps
    when app reads data at address X:
        send address X straight to LtRAM chip
        LtRAM chip returns the data directly
        done    # no lookup table, no translation, no detour
Optane consulted a secret address table (the AIT) before every read, adding 76–200 ns. Throwing that table away is the main reason the new design is projected 26–79% faster on reads.
Practice problems
Try each one before opening the answer. Each ends with the common misconception it's checking for.
Frequently asked questions
Fifteen of the questions a newcomer most often asks.
Q1. Why not just buy more DRAM instead of inventing new memory?
Q2. What actually happens when an app tries to write to AROM memory?
Q3. Is fast new memory like STT‑MRAM something I can buy today?
Q4. What if the OS guesses wrong and moves data that's actually written a lot?
Q5. How is this different from Intel Optane, which already tried cheap memory?
Q6. Why are small writes such a big deal? Optane could write, couldn't it?
Q7. Does every application benefit from AROM?
Q8. Is AROM a kind of permanent storage, like a hard drive or SSD?
Q9. Is AROM just a cache for DRAM?
Q10. When the system copies my data on a write, did something go wrong?
Q11. If this memory wears out from writing, how do they stop it dying early?
Q12. Does this token system fully solve the wear‑out problem?
Q13. Which exact memory technology will AROM actually use?
Q14. Could a smarter program, maybe using ML, predict which data to move?
Q15. Did they prove this is faster than DRAM, or is that still a hope?
Glossary & further reading
Every term in one place, then a guided path to go deeper.
The computer memory hierarchy: RAM, cache, and storage
How computers store data at different speeds and costs. Try "memory hierarchy explained" or "CPU cache vs RAM vs SSD" (Khan Academy, Computerphile, MIT OCW). The whole paper exists because different memory sits at different points on this curve.
How operating systems manage memory: virtual memory and page tables
How the OS gives each program its own view of memory, uses 4 KB "pages," and handles a page fault. See the free Operating Systems: Three Easy Pieces (ostep.org). The paper's solution lives almost entirely in the OS memory manager.
Copy‑on‑write: how the OS shares memory safely
The single mechanical trick the paper uses to enforce AROM. Search "copy‑on‑write Linux explained."
How flash storage works: cells, erase cycles, and wear leveling
Why flash erases in large blocks, why cells wear out, and how SSDs hide it. NOR flash is the prototype's stand‑in LtRAM, and these properties are shared by all serious LtRAM candidates.
Why memory is so expensive in cloud data centers
How server hardware costs break down at hyperscalers, and why DRAM became such a large fraction. The paper opens with DRAM being over half of server cost at Azure and Meta.
Storage Class Memory is Dead, All Hail Managed‑Retention Memory (difficulty: medium)
Proposes that emerging memory deserves its own purpose‑built interface — the intellectual parent of AROM. Short and accessible.
Towards Memory Specialization: A Case for Long‑Term and Short‑Term RAM (difficulty: medium)
The paper that coined "LtRAM" — effectively "chapter one" of the story AROM continues.
Software‑Defined Far Memory in Warehouse‑Scale Computers (difficulty: hard)
How Google reduced DRAM by compressing cold pages in software — the best pure‑software answer, and why software alone isn't enough.
Basic Performance Measurements of the Intel Optane DC Persistent Memory Module (difficulty: hard)
The measurement study that found Optane's read bandwidth collapses 67% when reads and writes mix — the hard numbers motivating AROM.
Pond: CXL‑Based Memory Pooling Systems for Cloud Platforms (difficulty: hard)
Sharing a common pool of memory over CXL — source of the "DRAM is more than half of Azure server cost" data point.