@h7x4 h7x4 commented Sep 11, 2025

When running gitea dump, don't store the contents of data/repo-archive in the output.

These archives can easily be regenerated from the repository data and do not need to be backed up.

Fixes #35450


Added a --skip-repo-archive flag to the gitea dump command.

Enabling this flag will exclude the contents of data/repo-archive from the dump. The implementation is similar to the other --skip-* flags that exclude directories from data.
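
How an exclude list of this kind can take effect while collecting files for a dump is sketched below. This is only an illustration under stated assumptions, not Gitea's implementation: `isExcluded`, `collectFiles`, and the temporary directory layout are hypothetical names introduced for the example.

```go
package main

import (
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
	"strings"
)

// isExcluded reports whether path equals one of the excluded directories
// or lies underneath one of them. Empty entries are ignored.
func isExcluded(path string, excludes []string) bool {
	path = filepath.Clean(path)
	for _, ex := range excludes {
		if ex == "" {
			continue
		}
		ex = filepath.Clean(ex)
		if path == ex || strings.HasPrefix(path, ex+string(filepath.Separator)) {
			return true
		}
	}
	return false
}

// collectFiles walks root and returns every regular file that is not
// inside an excluded directory. Excluded directories are not descended into.
func collectFiles(root string, excludes []string) ([]string, error) {
	var files []string
	err := filepath.WalkDir(root, func(p string, d fs.DirEntry, err error) error {
		if err != nil {
			return err
		}
		if isExcluded(p, excludes) {
			if d.IsDir() {
				return filepath.SkipDir
			}
			return nil
		}
		if d.Type().IsRegular() {
			files = append(files, p)
		}
		return nil
	})
	return files, err
}

func main() {
	// Hypothetical layout standing in for a data directory.
	root, err := os.MkdirTemp("", "dump-sketch")
	if err != nil {
		panic(err)
	}
	defer os.RemoveAll(root)

	for _, rel := range []string{"repo-archive/1/a.zip", "avatars/b.png"} {
		full := filepath.Join(root, filepath.FromSlash(rel))
		if err := os.MkdirAll(filepath.Dir(full), 0o755); err != nil {
			panic(err)
		}
		if err := os.WriteFile(full, []byte("x"), 0o644); err != nil {
			panic(err)
		}
	}

	// Exclude the repo-archive directory, as the dump would.
	files, err := collectFiles(root, []string{filepath.Join(root, "repo-archive")})
	if err != nil {
		panic(err)
	}
	for _, f := range files {
		rel, _ := filepath.Rel(root, f)
		fmt.Println(rel)
	}
}
```

The prefix check appends a path separator before comparing, so a sibling directory such as `repo-archive-old` is not excluded by accident.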

@GiteaBot GiteaBot added the lgtm/need 2 This PR needs two approvals by maintainers to be considered for merging. label Sep 11, 2025
@github-actions github-actions bot added modifies/go Pull requests that update Go code modifies/cli PR changes something on the CLI, i.e. gitea doctor or gitea admin labels Sep 11, 2025
@@ -265,6 +269,7 @@ func runDump(ctx context.Context, cmd *cli.Command) error {
excludes = append(excludes, setting.LFS.Storage.Path)
excludes = append(excludes, setting.Attachment.Storage.Path)
excludes = append(excludes, setting.Packages.Storage.Path)
excludes = append(excludes, setting.RepoArchive.Storage.Path)
Contributor

I think we only need to exclude the repo archive path by default; there is no need to introduce another flag.

The repo archives are automatically generated from existing repositories, so copying them into the dump would only duplicate data.

Contributor Author

So like, just skip them outright? Or skip by default and have a --include-repo-archive flag?

I imagine some people might want to take care of these files for keeping the hashes of the archives the same. Although I have not tested it, I assume timestamps are going to change between each time you regenerate the archive. Or do we just leave those people to make their own rsync script or something if they really need it?

Contributor

@wxiaoguang wxiaoguang Sep 12, 2025

> I imagine some people might want to take care of these files for keeping the hashes of the archives the same.

As far as I know, there are no such users.

> I assume timestamps are going to change between each time you regenerate the archive

That's a good point, but I don't think it matters: the archives are generated automatically when a visitor clicks the "download archive" button, and Gitea also has a cron task, archive_cleanup, that cleans up the archives periodically. These archives are effectively caches.

> Or do we just leave those people to make their own rsync script or something if they really need it?

I think so. If a real use case comes up, we can still introduce this flag later.

@h7x4 h7x4 force-pushed the cmd-dump-add-skip-repo-archive-flag branch from e48fd27 to 8cc0f98 Compare September 12, 2025 01:52
@h7x4 h7x4 changed the title Add --skip-repo-archive flag to dump command Don't store repo archives on gitea dump Sep 12, 2025
@h7x4 h7x4 requested a review from wxiaoguang September 12, 2025 01:55
@GiteaBot GiteaBot added lgtm/need 1 This PR needs approval from one additional maintainer to be merged. and removed lgtm/need 2 This PR needs two approvals by maintainers to be considered for merging. labels Sep 12, 2025
@GiteaBot GiteaBot added lgtm/done This PR has enough approvals to get merged. There are no important open reservations anymore. and removed lgtm/need 1 This PR needs approval from one additional maintainer to be merged. labels Sep 12, 2025
@lunny lunny added the type/enhancement An improvement of existing functionality label Sep 12, 2025
@lunny lunny added this to the 1.25.0 milestone Sep 12, 2025
@techknowlogick techknowlogick merged commit 7a474d1 into go-gitea:main Sep 12, 2025
26 checks passed
zjjhot added a commit to zjjhot/gitea that referenced this pull request Sep 13, 2025
* giteaofficial/main:
  Fix different behavior in status check pattern matching with double stars (go-gitea#35474)
  Replace webpack with rspack (go-gitea#35460)
  Don't store repo archives on `gitea dump` (go-gitea#35467)
  Fix SSH signing key path will be displayed in the pull request UI (go-gitea#35381)
  [skip ci] Updated translations via Crowdin
  Update image name in integration README (go-gitea#35465)
silverwind added a commit to ChristopherHX/gitea that referenced this pull request Sep 16, 2025
* origin/main:
  Clean up npm dependencies (go-gitea#35484)
  Update eslint to v9 (go-gitea#35485)
  Revert the rspack change (go-gitea#35482)
  Replace gobwas/glob package (go-gitea#35478)
  Fix various typos in codebase (go-gitea#35480)
  Fix different behavior in status check pattern matching with double stars (go-gitea#35474)
  Replace webpack with rspack (go-gitea#35460)
  Don't store repo archives on `gitea dump` (go-gitea#35467)
Successfully merging this pull request may close these issues.

Add --skip-repo-archive option to gitea dump
5 participants