-
Notifications
You must be signed in to change notification settings - Fork 200
Reclaim space for all structures in data.kz file #5443
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
6a2cbe5
to
9c0837d
Compare
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #5443 +/- ##
=======================================
Coverage 86.58% 86.59%
=======================================
Files 1422 1423 +1
Lines 62492 62579 +87
Branches 7656 7678 +22
=======================================
+ Hits 54110 54191 +81
- Misses 8203 8209 +6
Partials 179 179 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
@@ -274,7 +287,7 @@ DiskArrayInternal::getAPPageIdxAndAddAPToPIPIfNecessaryForWriteTrxNoLock( | |||
KU_ASSERT(apIdx == getNumAPs(headerForWriteTrx)); | |||
// We need to add a new AP. This may further cause a new pip to be inserted, which is | |||
// handled by the if/else-if/else branch below. | |||
page_idx_t newAPPageIdx = fileHandle.addNewPage(); | |||
page_idx_t newAPPageIdx = fileHandle.getPageManager()->allocatePage(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I noticed the PIP does some manual eviction + truncation of allocated pages, can somebody else double-check if using this PIP as-is with the page manager is safe?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
removePageIdxAndTruncateIfNecessary
should be safe to remove, and I think it might be necessary to remove it since it truncates the file size (though at the moment just truncates the PageStates and other in memory structures, not the actual file on disk), and re-using pages from the middle of the file could break stuff.
I'm not sure about clearWALPageVersionAndRemovePageFromFrameIfNecessary
, but I suspect it can also be removed. It appears that this is the only place we're using ShadowFile::clearShadowPage
, is it not generally necessary to clear those when rolling back any more?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ray6080 what are your thoughts on this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Going to remove removePageIdxAndTruncateIfNecessary
and keep clearWALPageVersionAndRemovePageFromFrameIfNecessary
as-is for now, we can revisit this in the future if we need to.
@@ -88,5 +88,17 @@ size_t DiskArrayCollection::addDiskArray() { | |||
return oldSize; | |||
} | |||
|
|||
void DiskArrayCollection::reclaimStorage(PageManager& pageManager, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I copied the iteration logic for DiskArrayCollection::checkpoint()
here, can you double-check to see if this is correct?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems fine to me. Since the body is much shorter, you could even simplify it a little by starting with pageManager.freePage(firstHeaderPage)
and then in the body freeing each nextHeaderPage
(if its not invalid). But it doesn't make much difference.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good. In this case I think it's possible for the first header page to also be invalid so I'll keep it as-is just to keep it as one check.
Benchmark ResultMaster commit hash:
|
@@ -274,7 +287,7 @@ DiskArrayInternal::getAPPageIdxAndAddAPToPIPIfNecessaryForWriteTrxNoLock( | |||
KU_ASSERT(apIdx == getNumAPs(headerForWriteTrx)); | |||
// We need to add a new AP. This may further cause a new pip to be inserted, which is | |||
// handled by the if/else-if/else branch below. | |||
page_idx_t newAPPageIdx = fileHandle.addNewPage(); | |||
page_idx_t newAPPageIdx = fileHandle.getPageManager()->allocatePage(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
removePageIdxAndTruncateIfNecessary
should be safe to remove, and I think it might be necessary to remove it since it truncates the file size (though at the moment just truncates the PageStates and other in memory structures, not the actual file on disk), and re-using pages from the middle of the file could break stuff.
I'm not sure about clearWALPageVersionAndRemovePageFromFrameIfNecessary
, but I suspect it can also be removed. It appears that this is the only place we're using ShadowFile::clearShadowPage
, is it not generally necessary to clear those when rolling back any more?
@@ -88,5 +88,17 @@ size_t DiskArrayCollection::addDiskArray() { | |||
return oldSize; | |||
} | |||
|
|||
void DiskArrayCollection::reclaimStorage(PageManager& pageManager, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems fine to me. Since the body is much shorter, you could even simplify it a little by starting with pageManager.freePage(firstHeaderPage)
and then in the body freeing each nextHeaderPage
(if its not invalid). But it doesn't make much difference.
update Run clang-format WIP Track more allocations through page manager Correctly reclaim overflow file pages Reclaim catalog/metadata/pk index header Fix python tests
Description
Allocations for the following structures are now tracked + freed with the page manager:
Also added a table function
FILE_INFO
for testing purposes. It returns the number of total pages in the data.kz file handle.