Combining duplicity and rsync
#############################

:date: 2022-01-15 16:57
:category: Archiving
:author: Louis Holbrook
:tags: backup,rsync,duplicity,bash
:slug: backup-rsync-duplicity
:summary: An exercise in combining plain and encrypted backups on local and remote hosts
:series: Organizing backups
:seriesprefix: organizing-backups
:seriespart: 1
:lang: en
:status: published


There are two awesome, weathered tools out there that are all you really need for your personal backups. [1]_ One is the `rsync cli`_, the other is duplicity_.

The former should need no introduction.

The latter operates more like tar. But it still works over ssh like rsync. In fact, it's based on librsync_, which implements the rsync delta algorithm described in the `rsync protocol`_ report. The special sauce, however, is of course *encryption*.


Backup categories
=================

For the sake of argument, let's say that our personal backups can be divided into three categories:


Stuff that can be public
------------------------

Code snippets, git repositories, public data store states (e.g. blockchain ledgers), copies of OS packages and any other assets without redistribution issues.

For this we will use rsync_.


Sensitive stuff
---------------

Passwords, keys, contacts, calendars, contracts, invoices, task lists, databases, system configurations, application data.

For this we will use duplicity_.

Secret stuff
------------

Long-lived keys, password- and volume decryption keys, cryptocurrency keys and meta-information about the backups themselves.

This will not be addressed now.


Why not just one or the other?
==============================

Duplicity_ stores everything in an archive file format. That means that you must first authenticate, decrypt and unpack the archive in order to even browse the files inside.

If there is no reason to keep the files from prying eyes, then it's much more practical to be able to browse the files where they lie, with the regular filesystem tools. In such a case, rsync_ will scratch your itch.

For the **sensitive** and **secret stuff**, there would be no real need to use duplicity_ if you were only operating on your local host. You'd just use an encrypted volume [2]_ and rsync_ everything in there.

But half the point here is to keep remote copies as well as your local ones. You know, in case of fire, hardware-eating locust swarms or some totalitarian minions nabbing all your electronics. Unless "remote" here means some box hidden in some moated leisure castle of yours, you'll want to encrypt everything *before* you ship it off. And that's where duplicity_ comes in.


Vive la difference
==================

Of course, it would be too much to hope that duplicity_ and the `rsync cli`_ had aligned the ways they parse their invocation parameters.

Here are some examples [3]_ of how they do *not* match:


local to local
--------------

.. code-block:: bash

    $ rsync -a src/ /path/to/dst/

    $ duplicity src/ file:///path/to/dst/


local to remote, relative path
------------------------------

.. code-block:: bash

    $ rsync -a src/ user@remotehost:path/to/dst/

    $ duplicity src/ scp://user@remotehost/path/to/dst


toggle dotfiles from current path
---------------------------------

.. code-block:: bash

    # include only .foo/foo.txt given the current structure:
    $ tree src/ -a
    src/
    ├── .bar
    ├── baz
    └── .foo
        └── foo.txt

    # -a is needed here; without -r or -a rsync skips directories entirely
    $ rsync -a --exclude=".b*" --include=".*/***" --exclude="*" ./ ../dst/

    $ duplicity --exclude="./.b*" --include="./.*/***" --exclude="*" ./ file:///home/lash/tmp/dst/

logging
-------

.. code-block:: bash

    # spill the beans
    $ rsync -vv ...

    $ duplicity -v debug



Batchin'
========

Since you will want to select up front which tool to use for which sensitivity category, you'll be writing the includes and excludes specifically for the tool anyway.

So the only real issue with the above is the way the remote host is specified.

Let's say we choose to stick to the `rsync cli`_ host format. That means we need to make the following translations:

.. list-table::
   :widths: 50 50
   :header-rows: 1

   * - rsync
     - duplicity
   * - ``foo/bar``
     - ``file://foo/bar``
   * - ``/foo/bar``
     - ``file:///foo/bar``
   * - ``user@host:foo/bar``
     - ``scp://user@host/foo/bar``
   * - ``user@host:/foo/bar``
     - ``scp://user@host//foo/bar``


Expressed in ``bash`` that could look like this:

.. include:: code/backup-rsync-duplicity/translate.sh
   :code: bash


Let's behave and test our code:

.. include:: code/backup-rsync-duplicity/translate_test.sh
   :code: bash

.. code-block:: bash

    # 0 == good!
    $ BAK_TEST=1 bash remote.sh && echo $?
    0


Now we can take a path in the `rsync cli`_ format and feed that same input to a batch of individual backup steps, each of which may use either `rsync cli`_ or duplicity_:

.. code-block:: bash

    to_duplicity_remote localhost:/foo/bar

    rsync -avzP pub/ $remote_base:src/

    duplicity -v info secret/ $remote_duplicity_base:secret/


See also
========

* https://git.defalsify.org/rsync-duplicity-backups


..

.. [1] Ok, I know, I am assuming that you use ``git`` in daily life, too.

.. [2] Provided, of course, that it's an encrypted volume that you don't keep unlocked all the time.

.. [3] Duplicity needs at a minimum a password for symmetric encryption, and will prompt for it unless it's set in the environment.
   Simply ``export PASSPHRASE=test`` for these examples to relieve you of the annoyance.

..

.. _duplicity: https://duplicity.gitlab.io/duplicity-web/

.. _Duplicity: https://duplicity.gitlab.io/duplicity-web/

.. _`rsync cli`: https://rsync.samba.org/

.. _rsync: https://rsync.samba.org/

.. _librsync: http://librsync.sourcefrog.net/

.. _`rsync protocol`: https://rsync.samba.org/tech_report/
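

Translation sketch
==================

For reference, the rsync-to-duplicity translation table under "Batchin'" can be sketched in a few lines of ``bash``. This is not the ``translate.sh`` included above, only a minimal stand-in. The function name ``rsync_to_duplicity_url`` is made up for illustration, and the sketch assumes that a colon appearing before the first slash marks a remote host, which is the same rule rsync itself applies.

.. code-block:: bash

    #!/usr/bin/env bash

    # Hypothetical helper (an assumption, not the article's translate.sh):
    # print the duplicity URL that corresponds to an rsync-style destination.
    rsync_to_duplicity_url() {
        local target=$1
        # host part: one or more characters that are neither "/" nor ":",
        # followed by a colon; everything after the colon is the path
        local re='^([^/:]+):(.*)$'
        if [[ $target =~ $re ]]; then
            # scp://host/ + path. An absolute path keeps its own leading
            # slash, which produces the double slash from the table.
            printf 'scp://%s/%s\n' "${BASH_REMATCH[1]}" "${BASH_REMATCH[2]}"
        else
            # local path: file://foo/bar (relative), file:///foo/bar (absolute)
            printf 'file://%s\n' "$target"
        fi
    }

Calling ``rsync_to_duplicity_url user@host:/foo/bar`` prints ``scp://user@host//foo/bar``, and ``rsync_to_duplicity_url foo/bar`` prints ``file://foo/bar``. A local path whose first component contains a colon would be taken for a remote host, so such paths need a leading ``./``.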