manbytesgnu_site

Source files for manbytesgnu.org
git clone git://holbrook.no/manbytesgnu_site.git
Log | Files | Refs

20210627_eth_log_filters.rst (5613B)


      1 Making sense of Ethereum log bloom filters
      2 ##########################################
      3 
      4 :date: 2021-06-27 11:17
      5 :modified: 2021-06-27 15:56
      6 :category: Code
      7 :author: Louis Holbrook
      8 :tags: crypto,ethereum,bloom filter
      9 :slug: eth-log-bloom
     10 :summary: How to generate values to match the log bloom filters in Ethereum
     11 :lang: en
     12 :status: published
     13 
     14 
     15 Ethereum blocks and transaction receipts both carry a bloom filter. They provide a shortcut to check whether a particular event of a particular contract was triggered. The filter in the transaction indexes all events that occurred in the transaction. The filter in the block carries all the filter bits of the individual transactions within the block.
     16 
     17 Somewhat surprisingly, there are practically no friendly descriptions out there that tell you how to generate filters to match them with. In fact, the `Ethereum Yellow Paper <https://holbrook.no/doc/ethereum/yellowpaper.pdf>`_ was the only source I found.
     18 
     19 **So let's make a friendly description**
     20 
     21 Before we get to the friendliness, though, an ever so slight dose of brain melt is due:
     22 
     23 The formalities
     24 ===============
     25 
     26 The formal definition of the filter calculation in the `Ethereum Yellow Paper section 4.3.1 <https://holbrook.no/doc/ethereum/yellowpaper.pdf>`_ is (here cited in painfully inadequate math formatting):
     27 
     28         .. math::
     29 
     30                 M(O) \equiv \bigvee x \in \{Oa\} \cup Ot (M_{3:2048}(x))
     31 
     32 Where :math:`Oa` is the contract address and :math:`Ot` is all of the topics.
     33 
     34 In the context of *Solidity*, "all" of the topics means the actual event signature, along with all the *indexed* parameters to the event.
     35 
     36 The :math:`M` is the filter bit generator. The filter 2048 bits long, and for each entry that is added, 3 bits need to be set.
     37 
     38 Reading further, we learn that the bits to set are the "first three" pairs of bytes of the keccak256 hash [1]_, modulo the length of the filter - 2048.
     39 
     40 In the context of hashes, I usually assume "first" means from the left. That happens to be the case here, too. In other words, the "first" bytes are actually the bytes of most significance in big-endian order.
     41 
     42 What's less clear, however, is which bits to set in the filter. The paper says:
     43 
     44         :math:`\mathcal{B}` is the bit reference function such that :math:`\mathcal{B}_j(x)` equals the bit of index :math:`j` (indexed from 0) in the byte array :math:`x`.
     45 
     46 Turns out after some trial and error that "index 0" means the *right* end of the byte array. So when setting the bits, we have to count from the right side.
     47 
     48 
     49 The friendly recipe
     50 ===================
     51 
     52 Now let's try to bring this down to a mortal level:
     53 
     54 1. Take the keccak256 hash of the contract address
     55 
     56 2. Take the keccak256 hash of the first topic value
     57 
     58 3. Optionally, take the keccak256 hash of the remaining topic values (the indexed parameters of the event) [2]_
     59 
     60 4. Instantiate a byte array with length 256
     61 
     62 5. Now, *for each* of the hashes:
     63 
     64    1. Get the *big-endian* integer of the *first two* bytes of the hash
     65 
     66    2. Calculate the 2048-modulo of that integer [3]_
     67 
     68    3. *Bitwise or* the filter bit at the index corresponding to the modulated value, counting from *right to left* (i.e. the value :math:`0` corresponds to the *rightmost bit*).
     69 
     70    4. Repeat 2 and 3 for the two next pairs of bytes of the hash.
     71 
     72 
     73 The code
     74 ========
     75 
     76 All that taken into account, we can take a stab at implementing a filter generator:
     77 
     78 .. include:: code/eth-log-bloom/generate.py
     79    :code: python
     80    :number-lines: 0
     81 
     82 Let's say we'd want to check if an Solidity-defined event :code:`Foo(uint256,bytes32)` emitted by a contract at address :code:`0xeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee` exists in a filter.
     83 
     84 Using the class above, we would have to do something like this:
     85 
     86 .. code:: python
     87 
     88         # external imports
     89         import sha3
     90         import some_block_getter
     91 
     92         # local imports
     93         from the_above import LogBloom
     94 
     95         # all set bits in our filter must be set in theirs
     96         def filter_match(theirs, ours):
     97                 for i in range(len(ours)):
     98                         if ours[i] & theirs[i] != ours[i]:
     99                                 return False
    100                 return True
    101                         
    102         fltr = LogBloom()
    103 
    104         # add the contract address to the filter
    105         address = bytes.fromhex('eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee')
    106         h = sha3.keccak_256()
    107         h.update(address)
    108         address_element = h.digest()
    109         fltr.add(address_element)
    110 
    111         # create the topic signature for the event
    112         h = sha3.keccak_256()
    113         h.update(b'Foo(uint256,bytes32)')
    114         topic = h.digest()
    115 
    116         # add the topic to the filter
    117         h = sha3.keccak_256()
    118         h.update(topic)
    119         topic_element = h.digest()
    120         fltr.add(topic_element)
    121 
    122         # get the filter from a block
    123         block = some_block_getter()
    124         block_bloom = bytes.fromhex(block['logsBloom'][2:]) # assumes there is a 0x-prefix
    125 
    126         # check if it matches
    127         match = filter_match(block_bloom, fltr.contents)
    128         print(match) 
    129 ..
    130 
    131         .. [1] Remember the keccak hash used in Ethereum/Bitcoin is _not_ the official `sha3` hash. The padding parameter is different. More on that `here <{filename}/20210508_nft_tokenid_content.rst#id8>`_
    132 
    133 ..
    134 
    135         .. [2] Solidity supports up to 3 indexed parameters, which maps to the *evm* opcodes :code:`LOGn` where :math:`n \in \{1,2,3,4\}` 
    136 
    137 
    138 ..
    139 
    140         .. [3] Since :math:`2^11 = 2048`, this is the same as zeroing out all bits above bit number 11; :math:`x \wedge 2048`. In other words, the result is an *11 bit integer*.