manbytesgnu_site

Source files for manbytesgnu.org
git clone git://holbrook.no/manbytesgnu_site.git

20221015_fresh_git.rst (5293B)


      1 Keeping your gits in a row
      2 ##########################
      3 
      4 :date: 2024-06-18 16:28
      5 :modified: 2024-06-18 16:33
      6 :category: Archiving
      7 :author: Louis Holbrook
      8 :tags: git,bash
      9 :slug: git-fresh
     10 :summary: Scripts to keep your local git clones fresh, and help you move them around.
     11 :lang: en
     12 :status: published
     13 
     14 
     15 I believe that if you use a piece of code, you are also responsible for making sure that that code is available in the future.
     16 
     17 In this spirit, I decided a couple of years ago that I would keep a full clone of all VCS repositories that I use.
     18 
     19 
     20 Can't someone else do it?
     21 =========================
     22 
     23 Yeah, yeah, I hear ya.
     24 
     25 But imagine that one day you cannot reach the code repository anymore.
     26 
     27 It could be because you are working where internet is scarce or impossible to rely on.
     28 
     29 It could be that you have to cope with what was in your faraday cage when a giant solar flare happened.
     30 
     31 It could be that you, or the author of the code, have been cut off by the accelerating `weaponization of everything <https://torrentfreak.com/the-eu-wants-its-own-dns-resolver-that-can-block-unlawful-traffic-220119/>`_.
     32 
     33 Or maybe none of the above happened. But you still understand and appreciate what it means to build a truly decentralized society, where we all participate and contribute, not only consume.
     34 
     35 
     36 Git organized
     37 =============
     38 
     39 For every `git` repository that I use, I actually keep a *local copy* on my daily device.
     40 
     41 I also keep a copy on a device at home, *and* on a remote device.
     42 
     43 My thinking is:
     44 
     45 1. If I lose my laptop, I have two copies
     46 2. If my house burns down, I have two copies
     47 3. If my house burns down *with* my laptop inside, I have *at least one more copy*.
     48 
     49 ... and so on.
     50 
     51 
     52 I hate to move it, move it
     53 ==========================
     54 
     55 Sometimes we have to, though.
     56 
     57 Moving heaps of code repositories around can be a real pain, for example when you are moving to a new machine, or want to bootstrap a new copy without having to source the data yourself.
     58 
     59 To make this easier, I wrote the `gitrefresh bash tool <https://holbrook.no/src/gitrefresh/log.html>`_ to copy only the minimum of information required to source the data from a remote. [1]_
     60 
     61 
     62 
     63 
     64 Freshening up
     65 =============
     66 
     67 To make sense of what is what in the repository store, I use a simple folder structure.
     68 
     69 Obviously, when I create copies of the repository store, I would like to keep the same folder structure. So the tool needed to make that possible.
     70 
     71 Additionally, what's needed are tools to bootstrap a repository group from a list, and a tool to refresh those repositories periodically once they've been bootstrapped.
     72 
     73 To achieve this, I actually wrote `three tools <https://holbrook.no/src/gitrefresh/log.html>`_, as follows:
     74 
     75 
     76 `gitlist.sh`
     77 ------------
     78 
     79 Creates a list of `git` repositories under a filesystem path, with the option of preserving the directory structure.
     80 
     81 
     82 `gitstart.sh`
     83 -------------
     84 
     85 Clones `git` repositories from a list generated by :code:`gitlist.sh`, with or without directory structure.
     86 
     87 
     88 `gitrefresh.sh`
     89 ---------------
     90 
     91 Fetches and merges updates from the remotes of each repository under a directory.
     92 
     93 
     94 Behavior
     95 ========
     96 
     97 The :code:`gitlist.sh` and :code:`gitrefresh.sh` tools work more or less the same way.
     98 
     99 They traverse a directory structure recursively.
    100 
    101 Every time a valid git repository is found, that repository is processed. Afterwards, the tool returns to the parent folder without descending into the repository. [2]_
    102 
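The traversal described above can be sketched roughly as follows. This is a minimal illustration, not the actual code of :code:`gitlist.sh` or :code:`gitrefresh.sh`: the function name :code:`walk_repos` is hypothetical, and it assumes that any directory containing a :code:`.git` entry counts as a repository, which is a cruder check than a real validity test.

```shell
# Hypothetical sketch; not taken from the gitrefresh sources.

# Print the path of every repository found below the directory given as $1.
# Assumption: a directory is a repository if it contains a .git entry.
walk_repos() {
	local dir="$1" entry
	if [ -e "$dir/.git" ]; then
		# Repository found: process it (here we only print its path) ...
		echo "$dir"
		# ... and go back to the parent without descending any further.
		return 0
	fi
	for entry in "$dir"/*/; do
		# Skip the unexpanded pattern left behind when there are no subdirectories.
		if [ ! -d "$entry" ]; then
			continue
		fi
		# Skip symbolic links so the walk cannot loop back on itself.
		if [ -L "${entry%/}" ]; then
			continue
		fi
		walk_repos "${entry%/}"
	done
}
```

Calling :code:`walk_repos .` from the top of the repository store would print one path per repository. A real tool would do its work at the point where the sketch calls :code:`echo`.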
    103 
    104 Example
    105 -------
    106 
    107 Let's say we have three repositories that we are mirroring locally:
    108 
    109 * :code:`https://github.com/bitcoin/bips` under :code:`btc/bips`
    110 * :code:`https://aur.archlinux.org/libkeccak.git` under :code:`os/archlinux/aur/libkeccak`
    111 * :code:`git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git` under :code:`linux/linux`
    112 
    113 First we use :code:`gitlist.sh` to generate the list of repos to bootstrap [3]_:
    114 
    115 .. code:: console
    116 
    117         $ gitlist.sh -p | tee gitlist.txt
    118         https://github.com/bitcoin/bips btc/bips
    119         https://aur.archlinux.org/libkeccak.git os/archlinux/aur/libkeccak
    120         git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git linux/linux
    121 
    122 
    123 Using :code:`gitstart.sh` with this list, we can restore this bunch of repositories *with* the same directory structure anywhere else:
    124 
    125 .. code:: console
    126 
    127         $ cd /path/to/new/repos/location
    128         $ gitstart.sh < gitlist.txt
    129 
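For illustration, consuming such a list could look something like the sketch below. It is an assumption about the general approach rather than how :code:`gitstart.sh` is actually implemented. The function name :code:`clone_from_list` and the :code:`GIT_CMD` override (handy for a dry run with :code:`echo`) are made up for this example.

```shell
# Hypothetical sketch; not taken from the gitrefresh sources.

# Read "<url> <path>" lines from standard input and clone each repository
# into <path>, relative to the current directory.
clone_from_list() {
	local url path
	while read -r url path; do
		# Skip blank lines.
		if [ -z "$url" ]; then
			continue
		fi
		# Assumption: a line without a path column clones into a directory
		# named after the last component of the URL.
		if [ -z "$path" ]; then
			path=$(basename "$url" .git)
		fi
		# Redirect stdin so the clone command cannot swallow the rest of the list.
		${GIT_CMD:-git} clone "$url" "$path" </dev/null
	done
}
```

With this in place, :code:`GIT_CMD=echo` followed by :code:`clone_from_list < gitlist.txt` would only print the clone commands, which is a cheap way to check a list before fetching anything.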
    130 Now, the idea is that from time to time you should get the latest changes from the upstream source.
    131 
    132 I simply combine :code:`gitrefresh.sh` with :code:`cron` to do this on the remote, while manually doing the refresh locally once in a while.
    133 
    134 Using the tool, all it takes is:
    135 
    136 .. code:: console
    137 
    138         $ cd /path/to/new/repos/location
    139         $ gitrefresh.sh pull
    140 
    141 
    142 ..
    143 
    144    .. [1] Yes. I didn't get beyond `git` yet. But at least it's a start.
    145 
    146 ..
    147 
    148    .. [2] This, of course, means that the tool will not automatically archive code from *submodules*. The submodule construct is a target of both a lot of love and a lot of hate. Personally, I like it. But at the same time it is my opinion that it does not absolve us from *knowing* and being *mindful* which submodules a repository is using, and thus making sure that we have an independent clone of that repository.
    149 
    150 .. 
    151 
    152    .. [3] We add the :code:`-p` flag to preserve the directory structure on disk.
    153