20210627_eth_log_filters.rst (5613B)
1 Making sense of Ethereum log bloom filters 2 ########################################## 3 4 :date: 2021-06-27 11:17 5 :modified: 2021-06-27 15:56 6 :category: Code 7 :author: Louis Holbrook 8 :tags: crypto,ethereum,bloom filter 9 :slug: eth-log-bloom 10 :summary: How to generate values to match the log bloom filters in Ethereum 11 :lang: en 12 :status: published 13 14 15 Ethereum blocks and transaction receipts both carry a bloom filter. They provide a shortcut to check whether a particular event of a particular contract was triggered. The filter in the transaction indexes all events that occurred in the transaction. The filter in the block carries all the filter bits of the individual transactions within the block. 16 17 Somewhat surprisingly, there are practically no friendly descriptions out there that tell you how to generate filters to match them with. In fact, the `Ethereum Yellow Paper <https://holbrook.no/doc/ethereum/yellowpaper.pdf>`_ was the only source I found. 18 19 **So let's make a friendly description** 20 21 Before we get to the friendliness, though, an ever so slight dose of brain melt is due: 22 23 The formalities 24 =============== 25 26 The formal definition of the filter calculation in the `Ethereum Yellow Paper section 4.3.1 <https://holbrook.no/doc/ethereum/yellowpaper.pdf>`_ is (here cited in painfully inadequate math formatting): 27 28 .. math:: 29 30 M(O) \equiv \bigvee x \in \{Oa\} \cup Ot (M_{3:2048}(x)) 31 32 Where :math:`Oa` is the contract address and :math:`Ot` is all of the topics. 33 34 In the context of *Solidity*, "all" of the topics means the actual event signature, along with all the *indexed* parameters to the event. 35 36 The :math:`M` is the filter bit generator. The filter 2048 bits long, and for each entry that is added, 3 bits need to be set. 37 38 Reading further, we learn that the bits to set are the "first three" pairs of bytes of the keccak256 hash [1]_, modulo the length of the filter - 2048. 39 40 In the context of hashes, I usually assume "first" means from the left. That happens to be the case here, too. In other words, the "first" bytes are actually the bytes of most significance in big-endian order. 41 42 What's less clear, however, is which bits to set in the filter. The paper says: 43 44 :math:`\mathcal{B}` is the bit reference function such that :math:`\mathcal{B}_j(x)` equals the bit of index :math:`j` (indexed from 0) in the byte array :math:`x`. 45 46 Turns out after some trial and error that "index 0" means the *right* end of the byte array. So when setting the bits, we have to count from the right side. 47 48 49 The friendly recipe 50 =================== 51 52 Now let's try to bring this down to a mortal level: 53 54 1. Take the keccak256 hash of the contract address 55 56 2. Take the keccak256 hash of the first topic value 57 58 3. Optionally, take the keccak256 hash of the remaining topic values (the indexed parameters of the event) [2]_ 59 60 4. Instantiate a byte array with length 256 61 62 5. Now, *for each* of the hashes: 63 64 1. Get the *big-endian* integer of the *first two* bytes of the hash 65 66 2. Calculate the 2048-modulo of that integer [3]_ 67 68 3. *Bitwise or* the filter bit at the index corresponding to the modulated value, counting from *right to left* (i.e. the value :math:`0` corresponds to the *rightmost bit*). 69 70 4. Repeat 2 and 3 for the two next pairs of bytes of the hash. 71 72 73 The code 74 ======== 75 76 All that taken into account, we can take a stab at implementing a filter generator: 77 78 .. include:: code/eth-log-bloom/generate.py 79 :code: python 80 :number-lines: 0 81 82 Let's say we'd want to check if an Solidity-defined event :code:`Foo(uint256,bytes32)` emitted by a contract at address :code:`0xeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee` exists in a filter. 83 84 Using the class above, we would have to do something like this: 85 86 .. code:: python 87 88 # external imports 89 import sha3 90 import some_block_getter 91 92 # local imports 93 from the_above import LogBloom 94 95 # all set bits in our filter must be set in theirs 96 def filter_match(theirs, ours): 97 for i in range(len(ours)): 98 if ours[i] & theirs[i] != ours[i]: 99 return False 100 return True 101 102 fltr = LogBloom() 103 104 # add the contract address to the filter 105 address = bytes.fromhex('eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee') 106 h = sha3.keccak_256() 107 h.update(address) 108 address_element = h.digest() 109 fltr.add(address_element) 110 111 # create the topic signature for the event 112 h = sha3.keccak_256() 113 h.update(b'Foo(uint256,bytes32)') 114 topic = h.digest() 115 116 # add the topic to the filter 117 h = sha3.keccak_256() 118 h.update(topic) 119 topic_element = h.digest() 120 fltr.add(topic_element) 121 122 # get the filter from a block 123 block = some_block_getter() 124 block_bloom = bytes.fromhex(block['logsBloom'][2:]) # assumes there is a 0x-prefix 125 126 # check if it matches 127 match = filter_match(block_bloom, fltr.contents) 128 print(match) 129 .. 130 131 .. [1] Remember the keccak hash used in Ethereum/Bitcoin is _not_ the official `sha3` hash. The padding parameter is different. More on that `here <{filename}/20210508_nft_tokenid_content.rst#id8>`_ 132 133 .. 134 135 .. [2] Solidity supports up to 3 indexed parameters, which maps to the *evm* opcodes :code:`LOGn` where :math:`n \in \{1,2,3,4\}` 136 137 138 .. 139 140 .. [3] Since :math:`2^11 = 2048`, this is the same as zeroing out all bits above bit number 11; :math:`x \wedge 2048`. In other words, the result is an *11 bit integer*.