-
Notifications
You must be signed in to change notification settings - Fork 1.4k
Fix memory leak when pruning scrollback with non-standard pages #10251
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Great minds, you beat us by 45 mins! We literally just got some debug logs and Claude came up with a similar fix. Thank you! |
This fixes a memory leak where standard-sized pages (592K) accumulate in the memory pool without being freed during adjustCapacity operations. PR ghostty-org#10251 fixed non-standard page leaks during scrollback pruning, but a similar leak existed in the adjustCapacity() path. When a standard page is expanded to non-standard size (e.g., when doubling grapheme_bytes from 8192 to 16384), the old standard page was returned to the pool via pool.pages.destroy(). However, the pool uses an ArenaAllocator that never frees individual items, causing these pages to accumulate indefinitely. This is particularly common with emoji-heavy and hyperlink-rich output (like Claude Code), where adjustCapacity() is called frequently to expand page capacity. Over time, this results in hundreds of thousands of leaked 592K VM_ALLOCATE regions. The fix: when replacing a standard page with a non-standard page during adjustCapacity(), munmap the old page directly instead of returning it to the pool. This follows the same pattern as PR ghostty-org#10251 but applies to the adjustCapacity() code path.
|
@mitchellh can you recheck my assumption here. Flow // ...
if (first.data.memory.len > std_size) {
self.destroyNode(first);
break :prune; // here exit before tracked_pins remap
}
/// Later in the code...
const pin_keys = self.tracked_pins.keys();
for (pin_keys) |p| {
if (p.node != first) continue;
p.node = self.pages.first.?;
p.y = 0;
p.x = 0;
p.garbage = true;
}
// ...Code for go-to: ghostty/src/terminal/PageList.zig Lines 2545 to 2580 in 509f073
|
|
That does look like a bug on first glance! |
|
@mitchellh if (prev) |prev_node| prev: {
// ...
for (dst_rows, src_rows) |*dst_row, *src_row| {
prev_page.cloneRowFrom(...); // move page to prev_page
}
// and no pin update from page here
}
// Remove the old page.
// Deallocate the old page.
self.pages.remove(chunk.node);
self.destroyNode(chunk.node); // pin still here a suppose
// pin dangleghostty/src/terminal/PageList.zig Lines 1828 to 1861 in 509f073
upd. added link and formatting |
|
I think I got this in the linked PR (see GH), or did I miss something? |
This finally resolves #9962 (and the myriad of dupes).
The core issue was that when a non-standard size page is reused as part of our scrollback pruning, it was resized to be standard size, which caused our future frees to believe it was pooled and not call
munmapproperly.The solution I chose was to never reuse non-standard sized pages. If during scrollback pruning we detect a non-standard page, we destroy it and re-alloc. This frees the old memory and reuses pool memory for the new page.
As part of this, I also introduced a custom page allocator that uses macOS's mach kernel virtual memory tagging feature to specifically tag our PageList memory. I was able to use this in conjunction with Instruments and
footprintto verify that our PageList memory was previously not freed and is now successfully freed.No AI was used in my work here. AI was used by others in their analysis of this issue that I used the results of to help guide me and give me other things to consider, but the ultimate understanding and fix was all done via my own meat sticks.
Detailed Explainer
Ghostty uses a memory pool of fixed-size
mmap-ed pages to serve as the backing memory for our terminal. If the terminal requires a non-standard (larger) amount of memory due to an abundance of emoji, styles, hyperlinks, etc. then we allocate non-pooled pages directly withmmap. When freeing our pages, if it is<= standard sizewe just put it back into the memory pool. If it is larger, wemunmapit.This explains and defines both a standard and therefore non-standard page.
Ghostty also has the concept of a "scrollback limit" (exposed to the user via the
scrollback-limitconfig). This caps the size of our scrollback or history. When we reach this limit, we have a trick that we do: to avoid allocation, we reuse the oldest page in the history.Unfortunately, as part of this process, we were resizing the underlying memory to be standard size again. This was causing our future frees to believe this was pooled memory, and never unmap it. This was the main source of the leak.
For SEO
For search-ability, I will note that this particular case was often caused by:
exaoreza)Thanks
Big shout out to @grishy for being the person that finally got me a reproduction so I could analyze the issue for myself. His own analysis got to the same conclusion as me but thanks to the reproduction I was able to verify both our understandings independently.