Asynchronous non-blocking temporary storage initialization tolerating broken origins
Categories
(Core :: Storage: Quota Manager, task, P1)
Tracking
(ASSIGNED bug which should be worked on in the current release/iteration)
People
(Reporter: janv, Assigned: janv)
References
(Depends on 4 open bugs, Blocks 13 open bugs)
Details
Attachments
(1 obsolete file)
We are currently facing two major problems:
- Temporary storage initialization errors
- Slow temporary storage initialization in some cases
Temporary storage initialization must currently finish successfully before a quota client can access files on disk (this doesn't apply to the persistent repository).
If we could make the initialization asynchronous then we would mitigate the two major problems in a big way.
I'm still not 100% sure if it is feasible, but so far it looks good.
| Assignee | ||
Comment 1•5 years ago
We are now fully working on this; it's our top priority.
Comment 2•5 years ago
I think the description and summary should make one thing explicit that is currently only implicit: at the moment, storage initialization succeeds or fails atomically for all quota clients and origins. This bug intends to change that so that success or failure is partial (per origin?). This is an important point, because it could be implemented (addressing problem number 1) independently of making initialization asynchronous (which addresses problem number 2). However, this bug specifically suggests making both changes in one go.
| Assignee | ||
Comment 3•5 years ago
That's true (partially), but it can't be easily implemented independently. I'll try to provide more details about that since I'm focusing on this more right now.
| Assignee | ||
Comment 4•5 years ago
The main point here is that we currently allow only 100% accurate quota management. We want to change that so we allow "inaccurate" usage tracking while the temporary storage is being initialized. Quota clients will then be able to use storage even if the initialization has not finished yet. When the initialization finishes and we see that more data has been written than would have been allowed with synchronous initialization, we will evict some origins. After that we will have 100% accurate quota management/usage tracking again.
Once we make the necessary changes for initializing origins asynchronously, broken origins naturally won't be able to break the entire temporary storage initialization. The quota management will stay in "not 100% accurate" mode, since we couldn't get the exact usage for the broken origins. The broken origins will stay uninitialized with some files on disk, and they won't be included in the overall usage calculations. The fact that we leave some extra files on disk which are not included in the usage calculations shouldn't be a big problem. We already had to make an exception for LSNG, which tracks only the logical size of the database. So the total physical size of all files doesn't have to match the usage we use internally for quota checks. We only allow the use of 50% of free disk space anyway, so there should be a lot of space in reserve.
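To make the idea above concrete, here is a minimal C++ sketch of approximate usage tracking during initialization, followed by eviction once all origins have been scanned. It is an illustration only: TemporaryStorageModel, OriginInfo and every method name are hypothetical and are not Quota Manager APIs. It also assumes that eviction is least-recently-used and that the scanned usage counts only data that was on disk before startup.

```cpp
#include <algorithm>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical model of one origin; not a real Quota Manager type.
struct OriginInfo {
  uint64_t scannedUsage = 0;       // bytes found by the directory scan
  uint64_t writtenDuringInit = 0;  // bytes written before the scan finished
  bool broken = false;             // scan failed; excluded from usage totals
  int64_t lastAccess = 0;          // used for least-recently-used eviction
};

class TemporaryStorageModel {
 public:
  explicit TemporaryStorageModel(uint64_t aLimit) : mLimit(aLimit) {}

  // Quota clients may write before initialization has finished, so the
  // tracked usage is only approximate at this point.
  void Write(const std::string& aOrigin, uint64_t aBytes, int64_t aNow) {
    OriginInfo& info = mOrigins[aOrigin];
    info.writtenDuringInit += aBytes;
    info.lastAccess = aNow;
  }

  // Called when the scan of one origin completes (aOk) or fails (!aOk).
  // A failure marks only that origin as broken instead of failing everything.
  void OnOriginScanned(const std::string& aOrigin, bool aOk, uint64_t aUsage) {
    OriginInfo& info = mOrigins[aOrigin];
    if (aOk) {
      info.scannedUsage = aUsage;
    } else {
      info.broken = true;
    }
  }

  bool IsBroken(const std::string& aOrigin) const {
    auto it = mOrigins.find(aOrigin);
    return it != mOrigins.end() && it->second.broken;
  }

  // Broken origins keep their files on disk but are not counted.
  uint64_t TrackedUsage() const {
    uint64_t total = 0;
    for (const auto& entry : mOrigins) {
      if (!entry.second.broken) {
        total += entry.second.scannedUsage + entry.second.writtenDuringInit;
      }
    }
    return total;
  }

  // Once every origin has been scanned, evict least recently used origins
  // until the tracked usage fits the limit. Returns the evicted origins.
  std::vector<std::string> FinishInitialization() {
    std::vector<std::string> candidates;
    for (const auto& entry : mOrigins) {
      if (!entry.second.broken) {
        candidates.push_back(entry.first);
      }
    }
    std::sort(candidates.begin(), candidates.end(),
              [this](const std::string& a, const std::string& b) {
                return mOrigins.at(a).lastAccess < mOrigins.at(b).lastAccess;
              });
    std::vector<std::string> evicted;
    for (const auto& origin : candidates) {
      if (TrackedUsage() <= mLimit) {
        break;
      }
      mOrigins.erase(origin);
      evicted.push_back(origin);
    }
    return evicted;
  }

 private:
  uint64_t mLimit;
  std::map<std::string, OriginInfo> mOrigins;
};
```

In this model a failed scan affects only the origin in question, which is the per-origin failure behavior discussed in comment 2.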
Comment 5•5 years ago
As part of this, the comment in QuotaManager::CollectOriginsForEviction should be addressed, see https://phabricator.services.mozilla.com/D101182#inline-580859.
Comment 8•3 years ago
As I wrote in bug 1741865, if you need profiling done or a patched Firefox build tested, let me know. I have had this issue for two years, and only on my workstation.
| Assignee | ||
Comment 10•2 years ago
One of the goals of bug 1671932 is to call EnsureTemporaryStorageIsInitialized
only from InitTemporaryStorageOp. Calling from other places including quota
clients will be disallowed by changing the method to a private method. The
private nature of the method should be emphasized by adding the Internal
suffix.
Changes done in this patch:
- IsTemporaryStorageInitialized renamed to IsTemporaryStorageInitializedInternal
- EnsureTemporaryStorageIsInitialized renamed to EnsureTemporaryStorageIsInitializedInternal
Depends on D186781
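As a rough illustration of the access restriction described above, the sketch below makes the two renamed methods private and lets only the initialization operation call them. The method names come from this comment; the classes QuotaManagerSketch and InitTemporaryStorageOpSketch, the use of a friend declaration, and the bool return types are assumptions made for the example and may not match the actual patch.

```cpp
// Forward declaration so the manager can name the operation as a friend.
class InitTemporaryStorageOpSketch;

// Hypothetical stand-in for the quota manager; not the real class.
class QuotaManagerSketch {
 private:
  // Assumption: a friend declaration is one way to restrict callers.
  friend class InitTemporaryStorageOpSketch;

  bool IsTemporaryStorageInitializedInternal() const { return mInitialized; }

  // The Internal suffix signals that quota clients must not call this.
  bool EnsureTemporaryStorageIsInitializedInternal() {
    if (!mInitialized) {
      // The real initialization work would happen here.
      mInitialized = true;
    }
    return true;
  }

  bool mInitialized = false;
};

// Hypothetical stand-in for the only operation allowed to initialize.
class InitTemporaryStorageOpSketch {
 public:
  explicit InitTemporaryStorageOpSketch(QuotaManagerSketch& aManager)
      : mManager(aManager) {}

  bool Run() { return mManager.EnsureTemporaryStorageIsInitializedInternal(); }

  bool Initialized() const {
    return mManager.IsTemporaryStorageInitializedInternal();
  }

 private:
  QuotaManagerSketch& mManager;
};
```

With this layout, a call to the Internal methods from any other class fails to compile, so the restriction is enforced by the compiler and not only by convention.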
Comment 11•2 years ago
Comment on attachment 9353390 [details]
Bug 1671932 - Rename EnsureTemporaryStorageIsInitialized to EnsureTemporaryStorageIsInitializedInternal; r=#dom-storage
Revision D188332 was moved to bug 1808294. Setting attachment 9353390 [details] to obsolete.
Comment 12•2 years ago
Tracked by Reddit: https://old.reddit.com/r/firefox/comments/17u92cj/firefox_first_startup_high_disk_usage_io_wait/
This affects user experience, and in 2023 user experience is key.
| Assignee | ||
Comment 13•2 years ago
We are working really hard on this bug.
Comment 14•1 year ago
I noticed that there is a new loading tab icon. Does it have any special meaning?
https://i.imgur.com/IprcCyp.png
The left one is the old one. An animated dot.
The right one is the new one. Looks like an hourglass.
Comment 15•1 year ago
(In reply to eight04 from comment #14)
> I noticed that there is a new loading tab icon. Does it have any special meaning?
The tab loading animation now stops after 45s to avoid wasting CPU time / energy for pages that never finish loading. This was implemented in bug 1812019. So no special meaning, other than "this page has been in a loading state for more than 45s".
Comment 16•1 year ago
I'm pretty sure this bug is the cause of long-standing issues on startup (like... many years... at least 4 years, going back to at least v94), where the browser window opens, and even the first page loads, but you can't click anything for a long time. Like... 40+ seconds?
This affects every Firefox on every platform that I run it on, including several versions of Linux, both 32- and 64-bit, and on Windows. The common denominator is that all these systems have magnetic hard drives. I'm guessing that if you have an SSD, the problem is somewhat hidden from you, because I/O is faster. And what Firefox dev doesn't have an SSD?
But Firefox, on startup, wants to do a LOT of I/O, and what it's doing blocks all other activities, including page loads. You can open tabs, click menus, etc., but you can't interact with web pages.
I've actually become accustomed to walking away after starting Firefox, and coming back later. I've been doing this for years.
Oh... and also, in some earlier versions, doing something in a Private Window seems to bypass the problem. Only the non-private window is blocked by the startup IO. But in v136, which is the latest version that I'm running, this Private Window trick no longer works.
My current mitigation strategy, in v136, is to set Firefox to clear "Temporary cached files and pages" on exit. This seems to mostly fix the problem. However, whenever Firefox crashes, this clearing does not happen, and then the bad behavior returns on the next startup.
Currently, in v136, this heavy startup disk activity is all due to "QuotaManager IO".
Long ago, I logged all the files that were being touched at startup, and it seemed to be.... every single file in the cache?
I'm just chiming in here to applaud the efforts to work hard on this bug. It's a big one, and well worth fixing.
| Assignee | ||
Comment 17•1 year ago
Thanks a lot for this constructive and detailed comment, it really helps to hear from long-time users who have been observing these patterns across versions and platforms. It’s true that this issue has been around for a long time, partly because it’s been difficult to allocate enough resources to tackle it properly. Recently, though, we’ve started addressing it more directly through focus projects like the L2 Quota Info cache, see bug 1953860.
The L2 cache is designed as a middle ground between the very fast L1 Quota Info cache and full directory scans, and it should benefit Firefox Desktop as well, especially in cases where build IDs change frequently (such as on the Nightly channel) or after unexpected crashes, where we currently lose the fastest L1 cache and fall back to much slower full scans.
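The fallback order described above could be sketched roughly as follows. This is a hypothetical illustration in C++: StartupState, UsageSource and ChooseUsageSource are invented names, and the condition under which the L2 cache is usable is an assumption, since the comment does not say how it is validated.

```cpp
// Where the usage information for temporary storage comes from at startup.
enum class UsageSource { L1Cache, L2Cache, FullScan };

// Hypothetical summary of the conditions mentioned in the comment.
struct StartupState {
  bool buildIdChanged;          // for example after a Nightly update
  bool previousSessionCrashed;  // unexpected crash in the last session
  bool l2CacheAvailable;        // assumption: L2 survives both cases above
};

// Prefer the fastest source that is still usable.
UsageSource ChooseUsageSource(const StartupState& aState) {
  const bool l1Usable =
      !aState.buildIdChanged && !aState.previousSessionCrashed;
  if (l1Usable) {
    return UsageSource::L1Cache;
  }
  if (aState.l2CacheAvailable) {
    return UsageSource::L2Cache;
  }
  // Slowest path: scan every origin directory on disk.
  return UsageSource::FullScan;
}
```

Without an L2 tier, every startup after a build ID change or a crash ends up in the FullScan branch, which matches the slow startups reported in comment 16.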
Here’s a quick status update for this bug: the work on asynchronous storage initialization, also known as incremental origin initialization, is mostly done. If you're adventurous enough, you can actually try it out by flipping the preference dom.quotaManager.temporaryStorage.incrementalOriginInitialization, but only experimentally and with a backed-up profile. More details can be found in comment 23 on bug 1867997.
Once the L2 Quota Info cache work is finished, we’ll get back to incremental origin initialization and continue moving it forward.
Comment 18•1 year ago
I appreciate the response! Good luck on your quest.