20210508_nft_tokenid_content.rst - manbytesgnu_site - Source files for manbytesgnu.org

      1 The NFT token id as URI 
      2 #######################
      3 
      4 :date: 2021-05-08 19:14
      5 :modified: 2021-05-10 09:18
      6 :category: Code
      7 :author: Louis Holbrook
      8 :tags: nft,evm,hash,key-value store,decentralized storage
      9 :slug: nft-tokenid-content
     10 :lang: en
     11 :summary: How to embed asset references for NFTs that are independent of providers in the current standard.
     12 :status: published
     13 
     14 
     15 Let's consider an NFT that works like a badge for participating in development of a software project.
     16 
     17 This token is awarded as a proof that the task was completed.
     18 
     19 To make things more fun, each NFT should have some unique, immutable content attached to it.
     20 
     21 In other words, the properties of this token, once set, should never change.
     22 
     23 Nor should they disappear.
     24 
     25 So how do we refer to the artwork asset within the token standard?
     26 
     27 
     28 It was acceptable at the time
     29 =============================
     30 
     31 The ERC721 standard is not explicit about where the assets that belong with the NFT can be discovered and resolved.
     32 
     33 At the time when the standard was adopted by the Ethereum community, there were multiple *"[...] Alternatives considered: put all metadata for each asset on the blockchain (too expensive), use URL templates to query metadata parts (URL templates do not work with all URL schemes, especially P2P URLs), multiaddr network address (not mature enough)."* Furthermore, they *"[...] considered an NFT representing ownership of a house, in this case metadata about the house (image, occupants, etc.) can naturally change."* [EIP721]_
     34 
     35 A "changing house" doesn't sound quite like what we need. And anyway, if we stick a good old web2 URI in there, it will end up on the great bonfire of dead links before long.
     36 
     37 
     38 Image, schmimage
     39 ================
     40 
     41 To be honest, I find the presumption in the optional EIP721 metadata structure to be surprisingly short-sighted. It *specifically* defines the asset as an image, and at the same time it presupposes that only a *single* asset file will be used.
     42 
     43 We may want to add *multiple* sources, so this is another obstacle for us.
     44 
     45 So how to get around this, while still playing nice with existing implementations out there? Two ideas come to mind:
     46 
     47 - Embed a *thumbnail* as a preview of the artwork using a :code:`base64` *data URI* [1]_ in the metadata. Stick :code:`name` and :code:`description` on it, and the schema is still fulfilled.
     48 - *Extend* the structure with a list of *attachments* that *our* application layer knows about. Of course, each of these can have the same format as above.
     49 
     50 In other words:
     51 
     52 .. include:: code/nft-tokenid-content/erc721_metadata_schema_base.json
     53         :code: json
     54 
     55 
     56 Mirror, mirror
     57 ==============
     58 
     59 Since the asset reference shouldn't change, we can refer to it by its fingerprint or `content address <https://en.wikipedia.org/wiki/Content-addressable_storage>`_. If we define that the resource can be looked up over HTTP using that fingerprint as its basename, then we are free to define and modify the list of mirrors that serve the resource at any point in time. The application layer would simply try the endpoints one after another.
     60 
     61 We take the :code:`sha2-256` [2]_ of the asset reference (the json file above, free of evil whitespace and newlines):
     62 
     63 .. code-block:: bash
     64 
     65            $ cat reference.json | jq -c -j | sha256sum | awk '{ print $1; }'
     66            3fdfbfe3b510b69f90cd92618e4c1ec76cf8b9c330bc2da1922acda8f84f9551
     67 
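For application code, the same fingerprint could be computed along these lines. This is only a sketch: the function name :code:`reference_fingerprint` is made up for illustration, and Python's compact JSON output is not guaranteed to be byte-identical to what :code:`jq -c` emits for every document (number formatting and escaping can differ), so one canonical tool should be picked and used consistently.

.. code-block:: python

        import hashlib
        import json


        def reference_fingerprint(raw_json: str) -> str:
            """Return the sha256 hex digest of the compact form of a JSON document."""
            doc = json.loads(raw_json)
            # separators=(',', ':') drops all optional whitespace and newlines;
            # ensure_ascii=False keeps non-ASCII characters unescaped.
            compact = json.dumps(doc, separators=(',', ':'), ensure_ascii=False)
            return hashlib.sha256(compact.encode('utf-8')).hexdigest()
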
     68 Imagine we had a mirror list of https://foo.com and https://bar.com/baz/. Then our application would try these urls in sequence, stopping at the first that returns a valid result:
     69 
     70 .. code-block:: text
     71 
     72         https://foo.com/3fdfbfe3b510b69f90cd92618e4c1ec76cf8b9c330bc2da1922acda8f84f9551
     73         https://bar.com/baz/3fdfbfe3b510b69f90cd92618e4c1ec76cf8b9c330bc2da1922acda8f84f9551
     74 
     75 Once we receive the content, all we have to do is hash it ourselves and verify that the sum matches the basename of the URI. If it doesn't, the result is of course not valid and we continue down the list, appropriately banning the mischievous server and then thoroughly harassing its admin.
     76 
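The lookup described above might be sketched like this. The names :code:`resolve` and :code:`http_fetch` are hypothetical, the ten second timeout is an arbitrary choice, and the banning and harassing parts are left out. The fetcher is passed in as a parameter so that the verification logic can be tried without touching the network.

.. code-block:: python

        import hashlib
        import urllib.request


        def http_fetch(url: str) -> bytes:
            """Fetch a URL over HTTP and return the raw response body."""
            with urllib.request.urlopen(url, timeout=10) as response:
                return response.read()


        def resolve(digest_hex, mirrors, fetch=http_fetch):
            """Try each mirror in order; return the first content whose sha256 matches."""
            for mirror in mirrors:
                # The fingerprint is the basename of the resource on every mirror.
                url = mirror.rstrip('/') + '/' + digest_hex
                try:
                    content = fetch(url)
                except OSError:
                    # Unreachable mirror or HTTP error (urllib errors subclass OSError).
                    continue
                if hashlib.sha256(content).hexdigest() == digest_hex:
                    return content
                # Hash mismatch: the mirror served something else, so move on.
            return None

With the mirror list from the example, the call would be :code:`resolve('3fdf...9551', ['https://foo.com', 'https://bar.com/baz/'])`.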
     77 
     78 Cast away
     79 =========
     80 
     81 Since our fingerprint is 32 bytes, it fits exactly inside the :code:`tokenId` (:code:`uint256`). Let's decide to use big-endian numbers when converting (I find them easier to make sense of). In that case our hash from the reference turns into this modest number:
     82 
     83 .. code-block:: python
     84 
     85         # python3
     86         >>> hx = bytes.fromhex('3fdfbfe3b510b69f90cd92618e4c1ec76cf8b9c330bc2da1922acda8f84f9551')
     87         >>> int.from_bytes(hx, byteorder='big')
     88         28891040728719892888467057134569335350980764617882743994259054993630416573777
     89 
     90 As long as we're composing the :code:`evm` inputs ourselves, we don't really have to worry about the integer representation in this particular case. But since the interface is defined as an integer type, and other mortals may be using higher level interfaces, we have to be explicit about our choice.
     91 
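Being explicit matters in the other direction too: when going from the integer back to the fingerprint, the result has to be padded to the full 32 bytes, or a hash with leading zero bytes would come out too short. A small sketch of both directions, with helper names invented for this example:

.. code-block:: python

        def digest_to_token_id(digest_hex: str) -> int:
            """Interpret a 32 byte hex digest as a big-endian unsigned integer."""
            digest = bytes.fromhex(digest_hex)
            if len(digest) != 32:
                raise ValueError('expected a 32 byte digest')
            return int.from_bytes(digest, byteorder='big')


        def token_id_to_digest(token_id: int) -> str:
            """Convert a token id back to hex, zero-padded to 32 bytes."""
            # to_bytes raises OverflowError if the value does not fit in a uint256.
            return token_id.to_bytes(32, byteorder='big').hex()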
     92 
     93 Welcoming mint
     94 ==============
     95 
     96 Assume we have a method :code:`mintTo(address _recipient, uint256 _tokenId)` on our NFT contract. The Solidity signature (function selector) of that method is :code:`edb20b7e` [3]_. If I were to mint to myself then the input to the contract would be:
     97 
     98 .. code-block:: text
     99 
    100         edb20b7e000000000000000000000000185cbce7650ff7ad3b587e26b2877d95568805e33fdfbfe3b510b69f90cd92618e4c1ec76cf8b9c330bc2da1922acda8f84f9551
    101 
    102 Broken down:
    103 
    104 .. code-block:: text
    105 
    106         signature:             edb20b7e
    107         address, zero-padded:  000000000000000000000000185cbce7650ff7ad3b587e26b2877d95568805e3
    108         token id:              3fdfbfe3b510b69f90cd92618e4c1ec76cf8b9c330bc2da1922acda8f84f9551
    109 
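Putting that input together by hand is just concatenation of fixed-width fields. The sketch below uses a made-up helper name, :code:`encode_mint_to`, and takes the signature bytes as given above rather than computing them, since :code:`keccak256` is not part of the Python standard library.

.. code-block:: python

        def encode_mint_to(selector_hex: str, recipient_hex: str, token_id: int) -> str:
            """Build the hex input for mintTo(address,uint256) from its three parts."""
            selector = bytes.fromhex(selector_hex)
            if recipient_hex.startswith('0x'):
                recipient_hex = recipient_hex[2:]
            address = bytes.fromhex(recipient_hex)
            if len(selector) != 4 or len(address) != 20:
                raise ValueError('expected a 4 byte selector and a 20 byte address')
            # Every argument occupies one 32 byte word: the address is left-padded
            # with zeros, the token id is written as a big-endian integer.
            address_word = address.rjust(32, b'\x00')
            token_word = token_id.to_bytes(32, byteorder='big')
            return (selector + address_word + token_word).hex()
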
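    110 The corresponding web3.js code would look like this. The token id is passed as a string, since a JavaScript number cannot represent a 256-bit value exactly, and the transaction is submitted with :code:`send`, since :code:`call` only simulates execution without changing state:
    111 
    112 .. code-block:: javascript
    113 
    114         const c = new web3.eth.Contract([...], '0x...');
    115         c.methods.mintTo('0x185cbce7650ff7ad3b587e26b2877d95568805e3', '28891040728719892888467057134569335350980764617882743994259054993630416573777').send({from: '0x185cbce7650ff7ad3b587e26b2877d95568805e3'});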
    116 
    117 To satisfy the :code:`tokenURI` method, we can generate a string that's prefixed with sha256 as a "scheme" [4]_. A bit of (unoptimized) Solidity helps us out here:
    118 
    119 .. include:: code/nft-tokenid-content/tohex.sol
    120         :code: solidity
    121 
    122 This will return :code:`sha256:3fdfbfe3b510b69f90cd92618e4c1ec76cf8b9c330bc2da1922acda8f84f9551` for :code:`tokenId` :code:`3fdfbfe3b510b69f90cd92618e4c1ec76cf8b9c330bc2da1922acda8f84f9551` as input, provided that the :code:`tokenId` actually exists. That may seem a bit useless at first, but consider the scenario where we want to interface with other NFTs as well. Or perhaps we are implementing a contract that can optionally support a static web2 URI in storage. By doing it this way, all bases are covered.
    123 
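On the application side, the string coming back from :code:`tokenURI` then needs to be told apart from an ordinary web2 URI and turned back into a fingerprint. A minimal sketch of that, with hypothetical function names and assuming the contract returns lowercase hex after the :code:`sha256:` prefix:

.. code-block:: python

        def token_uri(token_id: int) -> str:
            """Off-chain equivalent of the URI format: 'sha256:' plus 64 hex digits."""
            return 'sha256:' + token_id.to_bytes(32, byteorder='big').hex()


        def parse_token_uri(uri: str) -> int:
            """Return the token id encoded in a 'sha256:<hex>' URI."""
            scheme, _, path = uri.partition(':')
            if scheme != 'sha256' or len(path) != 64:
                raise ValueError('not a sha256 content URI: ' + uri)
            digest = bytes.fromhex(path)
            if len(digest) != 32:
                raise ValueError('not a sha256 content URI: ' + uri)
            return int.from_bytes(digest, byteorder='big')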
    124 
    125 Decentralized identifiers
    126 =========================
    127 
    128 Even better would be to add redundancy with autonomous decentralized storage. However, networks like `Swarm <https://ethswarm.org>`_ and `IPFS <https://ipfs.io>`_ use their own hashing recipes. That means that for every network referenced, we'd have to define an *alternative* in our reference structure.
    129 
    130 Referencing the canonical :code:`sha256` as well as the :code:`Swarmhash` for the same item could then look like this [5]_:
    131 
    132 
    133 .. include:: code/nft-tokenid-content/erc721_metadata_schema_swarm.json
    134    :code: json
    135 
    136 ----
    137 
    138 ..
    139 
    140         .. [1] Yes, they are valid URIs actually: https://www.rfc-archive.org/getrfc.php?rfc=2397
    141 
    142 ..
    143 
    144         .. [2] Likely it would be prudent to start using the official :code:`sha3` instead of :code:`sha2` these days, also because the :code:`sha2` hash is not an :code:`evm` opcode (it is only available as a precompiled contract). But neither is :code:`sha3`. The :code:`keccak256` that the EVM uses is a precursor to the :code:`keccak` published as the *official* :code:`sha3`. Still, :code:`keccak256` and :code:`sha3` are used interchangeably in opcode lists (and previously in `Solidity <https://docs.soliditylang.org/en/v0.8.0/050-breaking-changes.html#functions>`_ too). This has caused me a fair bit of confusion, I might add. Apart from it being ambiguous, the :code:`keccak256` tooling is also less common in the wild. Therefore :code:`sha2` seems like a safer bet for our experiments. It's not broken yet, after all.
    145 
    146 ..
    147         
    148         .. [3] The first four bytes of the hex result of :code:`keccak256("mintTo(address,uint256)")`
    149 
    150 
    151 ..
    152 
    153         .. [4] A *data URI* is of no use here, because the hash itself is just nondescript binary data. Luckily :code:`<scheme>:<path>` is still a valid URI.
    154 
    155 ..
    156         
    157         .. [5] Here the hashes represent the media content itself, not the reference. That's why the :code:`sha256` one is different than before.
    158 
    159 ..
    160         
    161         .. [EIP721] https://eips.ethereum.org/EIPS/eip-721