Native support for incremental restore (#13239)
Summary: With this change we add native library support for incremental restores. The solution follows a 'tiered' approach in which users pick one of three predefined and, for now, mutually exclusive restore modes (`kKeepLatestDbSessionIdFiles`, `kVerifyChecksum`, and `kPurgeAllFiles` [default]), trading write IO / CPU for the degree of certainty that existing destination db files match the selected backup files' contents. The new mode option is exposed via the existing `RestoreOptions` configuration, which by now is well baked into our APIs (see the usage sketch below). The restore engine consumes this configuration and infers which of the existing destination db files are 'in policy' to be retained during restore.

### Motivation

This work is motivated by an internal customer who runs a write-heavy, 1M+ QPS service and uses RocksDB restore functionality to scale up their fleet. Given the already high QPS on their end, the additional write IO from restores as they exist today contributes to prolonged spikes that cause the service to hit BLOB storage write quotas, which ultimately slows the pace of their scaling. See [T206217267](https://www.internalfb.com/intern/tasks/?t=206217267) for more.

### Impact

Enable faster service scaling by reducing the write IO footprint on BLOB storage (coming from restore) to the absolute minimum.

### Key technical nuances

1. According to prior investigations, the risk of collisions on [file #, db session id, file size] metadata triplets is low enough that we can confidently use the triplet to uniquely describe a file and its *perceived* contents, which is the rationale behind the `kKeepLatestDbSessionIdFiles` mode. To learn more about the risks / tradeoffs of using this mode, please check the related comment in `backup_engine.cc`. This mode is only supported for SSTs, where we persist the `db_session_id` information in the metadata footer.
2. `kVerifyChecksum` mode requires a full blob / SST file scan (assuming the backup file has its `checksum_hex` metadata set appropriately; if not, an additional scan of the backup file is needed). While it saves on write IOs (if checksums match), it is still a fairly complex and _potentially_ CPU-intensive operation.
3. We extend the `WorkItemType` enum introduced in #13228 with a new simple `ComputeChecksum` request, which enables us to run 2) in parallel. This will become increasingly important as we move towards disaggregated storage, where holding up a sequence of checksum evaluations on a single lagging remote file scan would not be acceptable.
4. Note that it is necessary to compute the checksum of the restored file if the corresponding backup file and existing destination db file checksums did not match.
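For illustration, here is a minimal sketch of a restore call selecting one of the new modes. It assumes the mode is exposed as a `mode` field with a nested `Mode` enum on `RestoreOptions` (see the header for the exact shape), and uses placeholder paths:

```cpp
#include <cassert>
#include <string>

#include "rocksdb/env.h"
#include "rocksdb/utilities/backup_engine.h"

using namespace ROCKSDB_NAMESPACE;

int main() {
  // Open a read-only backup engine on an existing backup directory
  // (placeholder path).
  BackupEngineReadOnly* backup_engine = nullptr;
  BackupEngineOptions engine_options("/path/to/backup_dir");
  Status s = BackupEngineReadOnly::Open(Env::Default(), engine_options,
                                        &backup_engine);
  assert(s.ok());

  // Pick a restore mode; `mode` is assumed to be the RestoreOptions field
  // added by this change. kPurgeAllFiles remains the default behavior
  // (delete all existing destination files, then copy everything).
  RestoreOptions restore_options;
  restore_options.mode = RestoreOptions::Mode::kKeepLatestDbSessionIdFiles;

  // In this mode, existing destination SSTs whose [file #, db session id,
  // file size] triplet matches a backup file are retained; everything else
  // is restored from the backup.
  s = backup_engine->RestoreDBFromLatestBackup("/path/to/db_dir",
                                               "/path/to/wal_dir",
                                               restore_options);
  assert(s.ok());

  delete backup_engine;
  return 0;
}
```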
### Test plan

1. Manual testing using debugger ✅
2. Automated tests ✅
   * `./backup_engine_test --gtest_filter=*IncrementalRestore*` covering the following scenarios:
     * Full clean restore
     * Integration with the `exclude files` feature (with proper write counting)
     * User workflow simulation: happy path with a mix of newly added files and deleted original backup files
     * Corruption of existing db files and the difference in handling between `kVerifyChecksum` and `kKeepLatestDbSessionIdFiles` modes
   * `./backup_engine_test --gtest_filter=*ExcludedFiles*`
     * Integrates existing test collateral with the newly introduced restore modes

Pull Request resolved: #13239

Reviewed By: pdillinger

Differential Revision: D67513875

Pulled By: mszeszko-meta

fbshipit-source-id: 273642accd7c97ea52e42f9dc1cc1479f86cf30e