Borg - Httqm's Docs

borgbackup

Permissions needed to backup (source) :

your own files : just run Borg as your normal user
files of other users or the operating system : running Borg as root will be required

initialize a repository

A repository can be either a local directory or remote storage accessed via ssh (details). Local and ssh repositories are referred to with different syntaxes (e.g. /path/to/repo vs ssh://user@host:port/path/to/repo). Examples below apply to a local repository.

borg init --encryption=mode /path/to/repo

make a backup (aka archive)

borg create --stats --progress /path/to/repo::archiveName path/to/data/to/backup

archiveName : for a daily backup, the archive name may be something like yyyy-mm-dd, which can be achieved by :
- computing archiveName yourself with date
- using Borg's {now} placeholder (which accepts a custom format string, see examples)
--stats and --progress are there to make Borg more verbose (it's pretty quiet by default)
when --compression is omitted, archives are compressed with the default algorithm (lz4 since Borg 1.1)
you may also make Borg even more verbose with -v and --list

This outputs :

Creating archive at "/path/to/repo::archiveName"

(status + list of files if --list was used)

------------------------------------------------------------------------------
Repository: /path/to/repo
Archive name: archiveName
Archive fingerprint: 5fed43cf97cc903917073a436f9255879ff14b92505750c39f41db80008c3659
Time (start): Sun, 2024-04-28 11:28:27
Time (end):   Sun, 2024-04-28 11:28:27
Duration: 0.36 seconds
Number of files: 666
Utilization of max. archive size: 0%
------------------------------------------------------------------------------
                       Original size      Compressed size    Deduplicated size
This archive:              458.04 MB            433.47 MB                563 B
All archives:              916.09 MB            866.95 MB            403.56 MB

                       Unique chunks         Total chunks
Chunk index:                     717                 1538
------------------------------------------------------------------------------
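The two ways of naming a daily archive mentioned in the notes above can be sketched as follows. This is only an illustration: the function names daily_archive_name and backup_today are made up for this example, and the repository and data paths are passed in by the caller.

```shell
# Option 1 : compute the archive name in the shell with date(1).
# Prints something like 2024-04-28.
daily_archive_name() {
    date +%Y-%m-%d
}

# Option 2 : let Borg expand its {now} placeholder with a custom format.
# The braces are literal here : the shell only expands ${repo}.
# (hypothetical helper, usage : backup_today /path/to/repo path/to/data/to/backup)
backup_today() {
    repo=$1
    data=$2
    borg create --stats --progress "${repo}::{now:%Y-%m-%d}" "$data"
}
```

With option 1, the name can be injected as "/path/to/repo::$(daily_archive_name)". Option 2 keeps the date logic inside Borg, which avoids any mismatch between the shell's clock call and the time the archive is actually created.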

about the Deduplicated size :

                       Original size      Compressed size    Deduplicated size
This archive:              458.04 MB            433.47 MB                563 B
All archives:              916.09 MB            866.95 MB            403.56 MB

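Deduplicated size of :

This archive : amount of data stored only for this archive, i.e. chunks that exist only in this archive. In the sample above it is only 563 B, because nearly every chunk of this archive was already stored by an earlier one.
All archives : amount of data actually stored in the repo, i.e. all chunks in the repository, each counted once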
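To make the two definitions concrete, here is a toy model, not Borg's own accounting. It assumes each archive is described by a text file listing its chunks as "chunk_id size" lines (a format invented for this sketch), and dedup_report is a hypothetical helper name.

```shell
# usage : dedup_report THIS_ARCHIVE_FILE OTHER_ARCHIVE_FILE...
# Each input file lists one chunk per line as "chunk_id size_in_bytes".
dedup_report() {
    awk '
        { size[$1] = $2 }                    # every chunk id is counted once
        FNR == NR { mine[$1] = 1; next }     # first file = "this archive"
        { others[$1] = 1 }                   # remaining files = other archives
        END {
            for (id in size) {
                total += size[id]
                # chunks referenced by this archive only
                if ((id in mine) && !(id in others)) unique += size[id]
            }
            printf "this archive: %d\nall archives: %d\n", unique, total
        }
    ' "$@"
}
```

If the new archive references chunks a (100 bytes), b (200) and c (5), and an older archive references only a and b, the report shows 5 for this archive and 305 for all archives. The second backup costs 5 bytes even though it "contains" 305.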

get information about an archive

borg info /path/to/repo::archiveName

list archives

borg list /path/to/repo

list contents of an archive

borg list /path/to/repo::archiveName

use --json-lines for details

check

a single archive :
borg check /path/to/repo::archiveName
the repository itself (i.e. all archives) :
borg check /path/to/repo

Unless it is made more verbose (-v), Borg only reports the result through its exit code (success / failure)
see also, in the Borg FAQ : Can Borg verify data integrity of a backup archive?
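Since the result of a check is mostly carried by the exit code, a script has to test it explicitly. A minimal sketch, where verify_repo is a hypothetical wrapper name and the messages are made up for this example:

```shell
# usage : verify_repo /path/to/repo
# Prints a one-line verdict and propagates Borg's exit code on failure.
verify_repo() {
    if borg check "$1"; then
        echo "check OK: $1"
    else
        rc=$?    # exit code of "borg check" (the if condition)
        echo "check FAILED (exit code $rc): $1" >&2
        return "$rc"
    fi
}
```

This fits a cron job or a backup script : a non-zero return can be used to send an alert or to skip a later prune.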

compare archives

borg diff /path/to/repo::archive1 archive2

archive1 and archive2 must both belong to the same repository : /path/to/repo
see also, in the Borg FAQ : How can I compare contents of an archive to my local filesystem?

restore files

with extract

borg extract /path/to/repo::archiveName

see examples
you may also
- simulate the extract with -n / --dry-run
- make Borg more verbose with -v and --list
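Because extract writes into the current directory, restoring somewhere else means changing directory first. A possible sketch, where restore_into and preview_restore are hypothetical helper names:

```shell
# usage : preview_restore /path/to/repo::archiveName
# Lists what would be extracted without writing anything.
preview_restore() {
    borg extract --dry-run --list "$1"
}

# usage : restore_into /path/to/repo::archiveName /restore/dir
# Runs the extract from inside the target directory. The subshell
# (parentheses) keeps the caller's working directory unchanged.
restore_into() {
    archive=$1
    target=$2
    mkdir -p "$target" && ( cd "$target" && borg extract --list "$archive" )
}
```

Extracting into an empty scratch directory and copying files back by hand is a cautious way to avoid overwriting live data.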

with mount

This is not exactly a file restore command, but once an archive is mounted, you can browse its contents and copy files as you wish.

borg mount /path/to/repo::archiveName /mount/point
browse / copy files
fusermount -u /mount/point

prune repository

For consistent backup of data, you don't need to keep all archives you ever made. You'll have to define a retention policy and enforce it with :

borg prune [options] [policy] /path/to/repo

How does Borg's prune work with deduplication?

the magic is that files and chunks don't actually belong to an archive, they are just referenced by it
so any file or chunk can be —is!— referenced by multiple archives, be it archive_1, archive_2 or archive_n : this is deduplication at work !
since an archive is just a list of chunks, prune-ing an archive just deletes a list of keys, but chunks themselves remain untouched

What happens to orphan chunks (i.e. chunks that are not referenced anymore) ?

I guess they end up being garbage-collected by borg compact, but I've found no evidence of this 
https://www.reddit.com/r/BorgBackup/comments/17zoswk/how_does_deduplication_work_if_you_delete_a_backup/
https://github.com/systemd/casync/issues/43

Storage space is not freed until you run borg compact.
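Putting the pieces of this section together, a retention run could look like the sketch below. The policy (7 daily and 4 weekly archives) is an arbitrary example, and preview_prune / prune_and_compact are hypothetical helper names.

```shell
# usage : preview_prune /path/to/repo
# Shows which archives would be kept or pruned, without changing anything.
preview_prune() {
    borg prune --dry-run --list --keep-daily=7 --keep-weekly=4 "$1"
}

# usage : prune_and_compact /path/to/repo
# 1. drops the archives that fall outside the policy
# 2. reclaims the storage space of chunks that are no longer referenced
prune_and_compact() {
    repo=$1
    borg prune --list --keep-daily=7 --keep-weekly=4 "$repo" || return
    borg compact "$repo"
}
```

Running preview_prune first is a cheap safety net, since a wrong policy deletes archives for good.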

delete archive or repository

archive : borg delete /path/to/repo::archiveName
repository : borg delete /path/to/repo

Command	Flag	Usage
check	(none)	verify the consistency of a repository and its archives. It consists of two major steps :
- check the consistency of the repository itself
- check the consistency and correctness of the archive metadata (and of the archive data, with --verify-data)
both steps can also be run independently with --repository-only or --archives-only
diff		find differences (file contents, user/group/mode) between archives
extract	(none)	extract the contents of an archive
- by default, the whole archive is extracted
- to extract only a subset :
  - provide a list of files / directories to extract
  - use the --pattern* / --exclude* options
- data is written to the current directory, there is no option to specify the output directory
extract	-n --dry-run	do not actually change any file
init	--encryption=mode	with mode :
- none : anyone can read or alter archives
- authenticated : archives are not encrypted but modifications will be detected
- keyfile : stores the encrypted key into ~/.config/borg/keys/
- repokey : stores the encrypted key into repoDir/config
- repokey-blake2 : (?)
mode can only be specified when initializing a new repository and can't be changed afterwards
mount		mount archive as a FUSE filesystem to browse its contents or restore individual files
- examples
- you can also declare a borgfs device in /etc/fstab
prune	--keep-period=n	keep n period archives, with period : hourly, daily, weekly, monthly, yearly
- a daily archive to keep is the latest archive made on a given day (and so on with other values of period)
- n must be understood as the number of archives to keep, whenever archives are made : with --keep-daily=7, the oldest kept archive will be
  - 1 week old if we backup every day
  - 70 days old if we backup every 10 days
- n counts archives, not periods
prune	-n --dry-run	do not change the repository, only show what would be pruned
(any)	-p --progress	show progress
(any)	-v --verbose --info	verbose mode : display INFO level log entries
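The "n counts archives, not periods" rule of --keep-daily can be simulated without Borg. The sketch below is a simplified model of that single rule only (real prune combines several --keep-* options). It assumes archive names are ISO timestamps such as 2024-04-28T11:28, and keep_daily is a hypothetical helper name.

```shell
# usage : printf '%s\n' TIMESTAMP... | keep_daily N
# Reads one archive timestamp per line (YYYY-MM-DDTHH:MM) and prints
# the archives a "keep N daily" rule would retain, newest first.
keep_daily() {
    LC_ALL=C sort -r | awk -v n="$1" '
        { day = substr($0, 1, 10) }          # YYYY-MM-DD part
        # newest archive of a day not seen yet, while fewer than n are kept
        !(day in seen) && kept + 0 < n + 0 { seen[day] = 1; kept++; print }
    '
}
```

For archives made on 2024-04-01 (08:00 and 20:00), 2024-04-11 and 2024-04-21, "keep_daily 3" retains 2024-04-21T08:00, 2024-04-11T08:00 and 2024-04-01T20:00 : three archives spanning 20 days, because days without a backup do not use up a slot.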

archive

what you get after a single backup operation (run with borg create)
contains a backup copy of the data, i.e. a snapshot (aka backup)
one can later extract or mount an archive to restore data
Borg archives take advantage of (source) :
- deduplication : any file chunk is stored only once, allowing daily backups since only changes are stored
- compression : save space (at the cost of speed / CPU usage)
- authenticated encryption : you may store archives on not fully trusted targets

caches

files cache : stored in cache/files and is used at backup time to quickly determine whether a given file is unchanged and we have all its chunks
chunks cache : stored in cache/chunks and is used to determine whether we already have a specific chunk

chunks

The object graph
Files are sliced into chunks (which is at the root of deduplication) : when creating a new archive, if one of these slices of data is already stored as an existing chunk, that chunk is simply referenced by the new archive and uses no additional storage space. (inspired by, details)
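The idea can be tried out with standard tools. This is a deliberately crude stand-in : Borg cuts chunks at content-defined boundaries and identifies them with a cryptographic hash, whereas the sketch below uses fixed-size pieces and cksum. chunk_ids is a hypothetical helper name.

```shell
# usage : chunk_ids FILE CHUNK_SIZE_IN_BYTES
# Splits FILE into fixed-size pieces and prints one identifier per piece
# (cksum output : "checksum size"). Identical pieces get identical ids.
chunk_ids() {
    file=$1
    size=$2
    dir=$(mktemp -d) || return
    split -b "$size" "$file" "$dir/chunk."
    for chunk in "$dir"/chunk.*; do
        [ -e "$chunk" ] || continue    # empty input : nothing to report
        cksum < "$chunk"
    done
    rm -rf "$dir"
}
```

For a file containing AAAABBBBAAAA and a chunk size of 4, chunk_ids prints three lines but only two distinct ids ("chunk_ids file 4 | sort -u" shows them) : the repeated AAAA piece would need to be stored only once.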

repository

filesystem directories acting as self-contained stores of archives
can be either :
- a local filesystem :
  - a local storage device (dedicated partition, USB disk, etc.)
  - a remote filesystem mounted locally (Samba, SSHFS, NFS, etc.)
- or a remote filesystem accessed over ssh (details)
see also, in the Borg FAQ : What is the difference between a repo on an external hard drive vs. repo on a server?
under the hood, repositories contain data blocks and a manifest tracking which blocks are in each archive. If some data hasn't changed from one backup to another, Borg can simply reference an already uploaded data chunk (deduplication)

segment

Transactionality is achieved by using a log (aka journal) to record changes. The log is a series of numbered files called segments. Each segment is a series of log entries : the segment number together with the offset of each entry relative to its segment start establishes an ordering of the log entries. (source)
Borg stores its data in a repository : a filesystem-based transactional key-value store. Thus the repository does not know about the concept of archives or items. Objects referenced by a key are stored inline in files (segments) of approx. 500 MB size in numbered subdirectories of /path/to/repo/data.
Segments are strictly append-only.
Each log entry is like :
```
CRC32 of log entry|entry size|tag|object key|data
```
with :
- tag : one of
  - PUT : the log entry adds data
  - DELETE : the log entry marks data as DELETED but doesn't actually delete data
  - COMMIT : ends a transaction
- data : for PUT entries only
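How an append-only log of PUT / DELETE / COMMIT entries turns into a set of live objects can be sketched with a small replay. This is a toy model working on text lines "TAG key" rather than on Borg's binary segments. It assumes that entries not followed by a COMMIT belong to an unfinished transaction and are ignored. live_keys is a hypothetical helper name.

```shell
# usage : live_keys < LOG
# Reads log entries as "TAG key" lines (COMMIT has no key) and prints
# the keys that are live after replaying all committed transactions.
live_keys() {
    awk '
        $1 == "PUT"    { pending[++n] = "+" $2 }    # buffered until COMMIT
        $1 == "DELETE" { pending[++n] = "-" $2 }    # only marks the key as deleted
        $1 == "COMMIT" {
            for (i = 1; i <= n; i++) {
                key = substr(pending[i], 2)
                if (substr(pending[i], 1, 1) == "+") live[key] = 1
                else delete live[key]
            }
            n = 0                                   # transaction is complete
        }
        END { for (key in live) print key }         # uncommitted tail is dropped
    ' | LC_ALL=C sort
}
```

Replaying "PUT a, PUT b, COMMIT, DELETE a, PUT c, COMMIT, PUT d" leaves b and c : a was deleted in the second transaction, and d was never committed. Nothing is ever rewritten in place, which is what makes the log append-only.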

Borg - aka "BorgBackup" : Deduplicating archiver with compression and encryption


Installed with the Debian package
