Keeping your gits in a row
##########################

:date: 2024-06-18 16:28
:modified: 2024-06-18 16:33
:category: Archiving
:author: Louis Holbrook
:tags: git,bash
:slug: git-fresh
:summary: Scripts to keep your local git clones fresh, and help you move them around.
:lang: en
:status: published


I believe that if you use a piece of code, you are also responsible for making sure that that code is available in the future.

In this spirit, I decided a couple of years ago that I would keep a full clone of all VCS repositories that I use.


Can't someone else do it?
=========================

Yeah, yeah, I hear ya.

But imagine that one day you cannot reach the code repository anymore.

It could be because you are working where internet is scarce or impossible to rely on.

It could be that you have to cope with what was in your faraday cage when a giant solar flare happened.

It could be that you, or the author of the code, have been cut off by the accelerating `weaponization of everything <https://torrentfreak.com/the-eu-wants-its-own-dns-resolver-that-can-block-unlawful-traffic-220119/>`_.

Or maybe none of the above happened. But you still understand and appreciate what it means to build a truly decentralized society, where we all participate and contribute, not only consume.


Git organized
=============

For every `git` repository that I use, I actually keep a *local copy* on my daily device.

I also keep a copy on a device at home, *and* on a remote device.

My thinking is:

1. If I lose my laptop, I have two copies
2. If my house burns down, I have two copies
3. If my house burns down *with* my laptop inside, I have *at least one more copy*.

... and so on.


I hate to move it, move it
==========================

Sometimes we have to, though.

And what can be a real pain is to move heaps of code repositories around.
For example, if you are moving to a new machine, or want to bootstrap a new copy without having to source the data yourself.

To make this easier, I wrote the `gitrefresh bash tool <https://holbrook.no/src/gitrefresh/log.html>`_ to copy only the minimum of information required to source the data from a remote. [1]_


Freshening up
=============

To make sense of what is what in the repository store, I use a simple folder structure.

Obviously, when I create copies of the repository store, I would like to keep the same folder structure. So the tool needed to make that possible.

Additionally, what's needed are tools to bootstrap a repository group from a list, and a tool to refresh those repositories periodically once they've been bootstrapped.

To achieve this, I actually wrote `three tools <https://holbrook.no/src/gitrefresh/log.html>`_, as follows:


`gitlist.sh`
------------

Create a list of `git` repositories under a filesystem path, with the option of preserving the directory structure.


`gitstart.sh`
-------------

Clone `git` repositories from a list generated by :code:`gitlist.sh`, with or without directory structure.


`gitrefresh.sh`
---------------

Fetch and merge updates from remotes of each repository under a directory.


Behavior
========

The :code:`gitlist.sh` and :code:`gitrefresh.sh` tools work more or less the same way.

They traverse a directory structure recursively.

Every time a valid git repository is found, that repository is processed. Afterwards, the tool will exit to the parent folder.
[2]_


Example
-------

Let's say we have three repositories that we are mirroring locally:

* :code:`https://github.com/bitcoin/bips` under :code:`btc/bips`
* :code:`https://aur.archlinux.org/libkeccak.git` under :code:`os/archlinux/aur/libkeccak`
* :code:`git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git` under :code:`linux/linux`

First we use :code:`gitlist.sh` to generate the list of repos to bootstrap [3]_:

.. code:: console

    $ gitlist.sh -p | tee gitlist.txt
    https://github.com/bitcoin/bips btc/bips
    https://aur.archlinux.org/libkeccak.git os/archlinux/aur/libkeccak
    git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git linux/linux


Using :code:`gitstart.sh` with this list, we can restore this bunch of repositories *with* the same directory structure anywhere else:

.. code:: console

    $ cd /path/to/new/repos/location
    $ gitstart.sh < gitlist.txt

Now, the idea is that from time to time you should get the latest changes from the upstream source.

I simply combine :code:`gitrefresh.sh` with :code:`cron` to do this on the remote, while manually doing the refresh locally once in a while.

Using the tool, all it takes is:

.. code:: console

    $ cd /path/to/new/repos/location
    $ gitrefresh.sh pull


..

.. [1] Yes. I didn't get beyond `git` yet. But at least it's a start.

..

.. [2] This, of course, means that the tool will not automatically archive code from *submodules*. The submodule construct is a target of both a lot of love and a lot of hate. Personally, I like it. But at the same time it is my opinion that it does not absolve us from *knowing* and being *mindful* of which submodules a repository is using, and thus making sure that we have an independent clone of each of them.

..

.. [3] We add the :code:`-p` flag to preserve the directory structure on disk.
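
As an aside, the traversal described under *Behavior* can be sketched roughly as follows. This is not the actual :code:`gitrefresh` code, only a minimal illustration: the function name :code:`walk_repos` is made up, and treating "contains a :code:`.git` entry" as "is a valid repository" is a simplifying assumption.

.. code:: bash

    #!/usr/bin/env bash
    # Hypothetical sketch of the traversal; not the actual gitrefresh code.

    # walk_repos DIR
    # Print the path of every repository found at or below DIR.
    # Assumption: a directory counts as a repository if it contains
    # a .git entry. The real tools may check validity differently.
    walk_repos() {
        local dir=$1
        local entry
        if [ -e "$dir/.git" ]; then
            # Repository found: "process" it (here: just print the path),
            # then return to the parent without descending any further.
            # This is why nested repositories and submodules are skipped.
            printf '%s\n' "$dir"
            return 0
        fi
        for entry in "$dir"/*/; do
            # An empty directory leaves the pattern unexpanded; skip it.
            [ -d "$entry" ] || continue
            # Do not follow symlinked directories, to avoid loops.
            [ -L "${entry%/}" ] && continue
            walk_repos "${entry%/}"
        done
    }

Calling :code:`walk_repos .` from the top of the repository store prints one path per repository found. Swapping the :code:`printf` for something like :code:`git -C "$dir" pull` would turn the same walk into a crude refresh loop.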