====== Crypto: Initial ArbitrageManager implementation ======

{{tag>dev python crypto nervhome finance blockchain}}

In this post, we will start with some preliminary analysis of the block/transaction content and progressively build our way to an arbitrage management system for EVM blockchains 😎 Or well... an initial version of it at least lol.

====== ======

===== Checking content of a transaction in block =====

  * Let's start with a simple overview of the transactions we can extract from each block...
  * When there is no **input** data for a transaction (i.e. ''input'' is ''0x''), it means we are simply transferring funds, like in this one:

  {'blockHash': HexBytes('0xe1bb91e1dd02d2032a760aefaa9434e26d74e252871ca987a0147ce08ae75129'),
   'blockNumber': 18224230,
   'from': '0xBD612a3f30dcA67bF60a39Fd0D35e39B7aB80774',
   'gas': 207128,
   'gasPrice': 10000000000,
   'hash': HexBytes('0xe81d4a96e5259149353554a012b040dd29d82df9a6a1147c392ca24c007aee75'),
   'input': '0x',
   'nonce': 3579524,
   'to': '0x00fC8a4aF0Cc0B917EA935a33e0369153B63B357',
   'transactionIndex': 10,
   'value': 1931268460000000000,
   'type': '0x0',
   'v': 147,
   'r': HexBytes('0x75bd9f1b6a3b8d1013b7ddedcdf103146e4304af6a8b4dc4f481d3a569d405e9'),
   's': HexBytes('0x6d750340acef0f58d2d0474cdf133bfbe964658c5058d2b3fbfeb02e72556db0')}

===== Retrieving common method signatures on a given DEX router =====

  * I used the following code to try to collect all possible method signatures/IDs:

  bck = chain.get_block('latest', full_tx=True)

  # find all the transactions occurring on a dex router
  # Checking with PancakeSwap2 router only below:
  routers = ['0x10ED43C718714eb63d5aA57B78B54704E256024E']

  alltx = bck["transactions"]
  tx_count = 0
  sigs = set()
  for txn in alltx:
      if txn['to'] in routers:
          print(f"Found transaction on router: {txn}")
          # (named input_data to avoid shadowing the input() builtin)
          input_data = txn['input']
          print(f"Hash: {txn['hash'].hex()}")
          print(f"Input: {input_data}")
          # print(f"Value: {txn['value']}")
          # The method ID is "0x" followed by the first 4 bytes (8 hex chars):
          sigs.add(input_data[:10])
          tx_count += 1

  print(f"Keeping {tx_count} transactions on {len(alltx)}")
  print(f"List of method signatures: {sigs}")

  * **0xa9059cbb** is the method id for ''transfer(address recipient, uint256 amount)''
  * Some of the signatures of interest to us are:
    * **0x7ff36ab5** is for ''swapExactETHForTokens(uint256 amountOutMin, address[] path, address to, uint256 deadline)''
    * **0x18cbafe5** is for ''swapExactTokensForETH(uint256 amountIn, uint256 amountOutMin, address[] path, address to, uint256 deadline)''
    * **0x8803dbee** is for ''swapTokensForExactTokens(uint256 amountOut, uint256 amountInMax, address[] path, address to, uint256 deadline)''
    * **0xe8e33700** is for ''addLiquidity(address tokenA, address tokenB, uint256 amountADesired, uint256 amountBDesired, uint256 amountAMin, uint256 amountBMin, address to, uint256 deadline)''
    * **0x38ed1739** is for ''swapExactTokensForTokens(uint256 amountIn, uint256 amountOutMin, address[] path, address to, uint256 deadline)''
    * **0xf305d719** is for ''addLiquidityETH(address token, uint256 amountTokenDesired, uint256 amountTokenMin, uint256 amountETHMin, address to, uint256 deadline)''

/*
    * **0xb6f9de95** is for ''swapExactETHForTokensSupportingFeeOnTransferTokens(uint256 amountOutMin, address[] path, address to, uint256 deadline)''
    * **0x791ac947** is for ''swapExactTokensForETHSupportingFeeOnTransferTokens(uint256 amountIn, uint256 amountOutMin, address[] path, address to, uint256 deadline)''
    * **0xfb3bdb41** is for ''swapETHForExactTokens(uint256 amountOut, address[] path, address to, uint256 deadline)''
    * **0x5c11d795** is for ''swapExactTokensForTokensSupportingFeeOnTransferTokens(uint256 amountIn, uint256 amountOutMin, address[] path, address to, uint256 deadline)''
    * **0xded9382a** is for ''removeLiquidityETHWithPermit(address token, uint256 liquidity, uint256 amountTokenMin, uint256 amountETHMin, address to, uint256 deadline, bool approveMax, uint8 v, bytes32 r, bytes32 s)''
    * **0x2195995c** is for ''removeLiquidityWithPermit(address tokenA, address tokenB, uint256 liquidity, uint256 amountAMin, uint256 amountBMin, address to, uint256 deadline, bool approveMax, uint8 v, bytes32 r, bytes32 s)''
    * **0x4a25d94a** is for ''swapTokensForExactETH(uint256 amountOut, uint256 amountInMax, address[] path, address to, uint256 deadline)''
    * **0x5b0d5984** is for ''removeLiquidityETHWithPermitSupportingFeeOnTransferTokens(address token, uint256 liquidity, uint256 amountTokenMin, uint256 amountETHMin, address to, uint256 deadline, bool approveMax, uint8 v, bytes32 r, bytes32 s)''
    * **0xaf2979eb** is for ''removeLiquidityETHSupportingFeeOnTransferTokens(address token, uint256 liquidity, uint256 amountTokenMin, uint256 amountETHMin, address to, uint256 deadline)''
    * **0x02751cec** is for ''removeLiquidityETH(address token, uint256 liquidity, uint256 amountTokenMin, uint256 amountETHMin, address to, uint256 deadline)''
    * **0xbaa2abde** is for ''removeLiquidity(address tokenA, address tokenB, uint256 liquidity, uint256 amountAMin, uint256 amountBMin, address to, uint256 deadline)''
    * **0xd67b571e** is for ''swapTokensForExactBNB(uint256 amountOut, uint256 amountInMax, address[] path, address to, uint256 deadline)''
    * **0x5d616c5b** is for ''swapExactTokensForBNB(uint256 amountIn, uint256 amountOutMin, address[] path, address to, uint256 deadline)''
    * **0x9cf68911** is for ''swapExactBNBForTokens(uint256 amountOutMin, address[] path, address to, uint256 deadline)''
    * **0x8332a963** is for ''swapBNBForExactTokens(uint256 amountOut, address[] path, address to, uint256 deadline)''
    * **0xeaaed442** is for ''addLiquidityBNB(address token, uint256 amountTokenDesired, uint256 amountTokenMin, uint256 amountBNBMin, address to, uint256 deadline)''
  * List of signatures: {'0x7ff36ab5', '0x18cbafe5', '0x8803dbee', '0xe8e33700', '0x38ed1739', '0xf305d719', '0xb6f9de95', '0x791ac947', '0xfb3bdb41', '0x5c11d795', '0xded9382a', '0x2195995c', '0x4a25d94a', '0x5b0d5984', '0xaf2979eb', '0x02751cec', '0xbaa2abde', '0xd67b571e', '0x5d616c5b', '0x9cf68911', '0x8332a963', '0xeaaed442'}
*/

  * Now trying to collect on more blocks:

  bck = chain.get_block('latest', full_tx=True)
  last_num = bck['number']

  # find all the transactions occurring on a dex router
  routers = ['0x10ED43C718714eb63d5aA57B78B54704E256024E']

  tx_count = 0
  all_count = 0
  sigs = set()
  for i in range(20):
      num = last_num - i
      print(f"Checking block {num}...")
      alltx = bck["transactions"]
      all_count += len(alltx)
      for txn in alltx:
          if txn['to'] in routers:
              # print(f"Found transaction on router: {txn}")
              # (named input_data to avoid shadowing the input() builtin)
              input_data = txn['input']
              print(f"Hash: {txn['hash'].hex()}")
              print(f"Input: {input_data}")
              # print(f"Value: {txn['value']}")
              sigs.add(input_data[:10])
              tx_count += 1
      # Move on to the previous block for the next iteration:
      bck = chain.get_block(num-1, full_tx=True)

  print(f"Keeping {tx_count} transactions on {all_count}")
  print(f"List of method signatures: {sigs}")

===== Retrieving most used contracts in the recent blocks =====

  * I then checked which contracts were the most used in the recent blocks with the following script:

  bck = chain.get_block('latest', full_tx=True)
  last_num = bck['number']

  all_count = 0
  targets = {}
  for i in range(100):
      num = last_num - i
      # print(f"Checking block {num}...")
      alltx = bck["transactions"]
      all_count += len(alltx)
      for txn in alltx:
          # Count how many transactions target each address:
          addr = txn['to']
          targets[addr] = targets.get(addr, 0) + 1
      bck = chain.get_block(num-1, full_tx=True)

  # Sort the targets by decreasing transaction count:
  arr = list(targets.items())
  arr.sort(key=lambda item: item[1], reverse=True)

  print("Most used contracts:")
  for addr, count in arr[:10]:
      print(f"- {addr}: {count*100/all_count:.2f}%")

  * And I got the following results with that:

  Most used contracts:
  - 0x10ED43C718714eb63d5aA57B78B54704E256024E: 16.89%
  - 0x55d398326f99059fF775485246999027B3197955: 6.20%
  - 0x4a2c860cEC6471b9F5F5a336eB4F38bb21683c98: 1.82%
  - 0x18B2A687610328590Bc8F2e5fEdDe3b582A49cdA: 1.58%
  - 0xe9e7CEA3DedcA5984780Bafc599bD69ADd087D56: 1.43%
  - 0xfe702057BDD6E725d8372708d16143F6928B1383: 1.40%
  - 0x093C639e4291fbDcC339557ACCDAFF52aaAC37d9: 1.10%
  - 0x760B8237e457e6A2b5b196ae073EF0301b8316a2: 1.04%
  - 0x1B6C9c20693afDE803B27F8782156c0f892ABC2d: 0.97%
  - 0xca4533591F5E5256f1bDb0f07FEe3be76a1aae35: 0.88%

  * **Note**: the first address on that list (0x10ED43C718714eb63d5aA57B78B54704E256024E) is the PancakeSwap2 router address.

===== Collecting the quote tokens =====

  * In my arbitrage mechanism I do not check for arbitrage setups between all tokens: instead I rely on swaps with **major** tokens, which is what I call **"quote tokens"**. These are the tokens that are used most often in all the available pairs.
  * So first we need to find those quote tokens (by default I only collect 10 of them; this is run in Jupyter):

  # Retrieve the quote tokens:
  allpairs = db.get_all_pairs()
  npairs = len(allpairs)
  print(f"Num pairs: {npairs}")

  tokens = {}
  # Each pair has 2 legs:
  tot_count = npairs*2
  for pair in allpairs:
      t0addr = pair[2]
      t1addr = pair[3]
      tokens[t0addr] = tokens.get(t0addr, 0) + 1
      tokens[t1addr] = tokens.get(t1addr, 0) + 1

  arr = list(tokens.items())
  arr.sort(key=lambda item: item[1], reverse=True)
  # arr[:30]
  arr = arr[:10]

  quotes = []
  for item in arr:
      token = chain.get_token(item[0])
      # [address, symbol, percentage of all pair legs]
      quotes.append([item[0], token.symbol(), 100.0*item[1]/tot_count])

  quotes

  * And this gives us the results:

  Num pairs: 231088
  [['0xbb4CdB9CBd36B01bD1cBaEBF2De08d9173bc095c', 'WBNB', 45.35523263864848],
   ['0xe9e7CEA3DedcA5984780Bafc599bD69ADd087D56', 'BUSD', 1.9501228969050752],
   ['0x55d398326f99059fF775485246999027B3197955', 'USDT', 0.818952087516444],
   ['0x0E09FaBB73Bd3Ade0a17ECC321fD13a19e81cE82', 'Cake', 0.2633196011908883],
   ['0x2170Ed0880ac9A755fd29B2688956BD959F933F8', 'ETH', 0.10537111403448036],
   ['0x7130d2A12B9BCbFAe4f2634d864A1Ee1Ce3Ead9c', 'BTCB#2', 0.07875787578757876],
   ['0x8AC76a51cc950d9822D68b83fE1Ad97B32Cd580d', 'USDC', 0.07507962334695008],
   ['0x8076C74C5e3F5852037F31Ff0093Eeb8c8ADd8D3', 'SAFEMOON', 0.06382849823443883],
   ['0x603c7f932ED1fc6575303D8Fb018fDCBb0f39a95', 'BANANA#2', 0.06296302707193796],
   ['0xE02dF9e3e622DeBdD69fb838bB799E3F168902c5', 'BAKE', 0.05452468323755452]]

  * In the list above, we see that the symbols we use for BTCB and BANANA (currently ''BTCB#2'' and ''BANANA#2'') should be renamed in our database: let's implement support for that.
  * => I added this command in the BlockchainManager class:

          if cmd == 'swap-symbols':
              chain_name = self.get_param("chain")
              chain: EVMBlockchain = self.get_component(f"{chain_name}_chain")
              sym1 = self.get_param("symbol1")
              sym2 = self.get_param("symbol2")
              self.swap_token_symbols(chain, sym1, sym2)
              return True

  * And now we can swap the symbols for BTCB and BTCB#2:

  nvp bchain swap-symbols -c bsc BTCB BTCB#2

  * Oh crap... this doesn't work because we are violating a constraint in the process (we try to write a symbol that the other row still holds) 😅:

  $ nvp bchain swap-symbols -c bsc BTCB BTCB#2
  Traceback (most recent call last):
    File "D:\Projects\NervHome\nvh\crypto\blockchain\blockchain_manager.py", line 171, in <module>
      comp.run()
    File "D:\Projects\NervProj\nvp\nvp_component.py", line 69, in run
      res = self.process_command(cmd)
    File "D:\Projects\NervHome\nvh\crypto\blockchain\blockchain_manager.py", line 54, in process_command
      self.swap_token_symbols(chain, sym1, sym2)
    File "D:\Projects\NervHome\nvh\crypto\blockchain\blockchain_manager.py", line 31, in swap_token_symbols
      chain.get_db().update_tokens([desc1, desc2])
    File "D:\Projects\NervHome\nvh\crypto\blockchain\chain_db.py", line 240, in update_tokens
      self.execute(SQL_UPDATE_TOKEN, data, many=True, commit=True)
    File "D:\Projects\NervHome\nvh\crypto\blockchain\chain_db.py", line 95, in execute
      return self.sql_db.execute(*args, **kaargs)
    File "D:\Projects\NervHome\nvh\core\postgresql_db.py", line 60, in execute
      c.executemany(code, data)
  psycopg2.errors.UniqueViolation: duplicate key value violates unique constraint "tokens_symbol_key"
  DETAIL:  Key (symbol)=(BTCB#2) already exists.

  * So I tested with an updated version, which first parks token 1 on a temporary (empty) symbol so that the unique constraint is never violated:

      def swap_token_symbols(self, chain, sym1, sym2):
          """Swap the symbols for 2 tokens"""
          desc1 = chain.get_token_desc(symbol=sym1)
          desc2 = chain.get_token_desc(symbol=sym2)
          self.check(desc1 is not None, "Cannot find token with symbol %s", sym1)
          self.check(desc2 is not None, "Cannot find token with symbol %s", sym2)

          # Rename token 1 with temp symbol name:
          desc1['symbol'] = ""
          chain.get_db().update_token(desc1)

          # Renaming token2:
          logger.info("Renaming %s to %s...", sym2, sym1)
          desc2['symbol'] = sym1
          chain.get_db().update_token(desc2)

          # Renaming token1:
          logger.info("Renaming %s to %s...", sym1, sym2)
          desc1['symbol'] = sym2
          chain.get_db().update_token(desc1)
          logger.info("Done.")

  * This time it works:

  $ nvp bchain swap-symbols -c bsc BTCB BTCB#2
  2022/05/30 08:32:16 [nvh.crypto.blockchain.chain_db] INFO: Updating token row: ('BTCB Token', '', 18, None, None, 0, '0x9621579a532C9052B1F3CE8e3022716120Ba4187')
  2022/05/30 08:32:16 [__main__] INFO: Renaming BTCB#2 to BTCB...
  2022/05/30 08:32:16 [nvh.crypto.blockchain.chain_db] INFO: Updating token row: ('BTCB Token', 'BTCB', 18, None, None, 0, '0x7130d2A12B9BCbFAe4f2634d864A1Ee1Ce3Ead9c')
  2022/05/30 08:32:17 [__main__] INFO: Renaming BTCB to BTCB#2...
  2022/05/30 08:32:17 [nvh.crypto.blockchain.chain_db] INFO: Updating token row: ('BTCB Token', 'BTCB#2', 18, None, None, 0, '0x9621579a532C9052B1F3CE8e3022716120Ba4187')
  2022/05/30 08:32:17 [__main__] INFO: Done.

  * Doing the same for BANANA (with less output this time):

  $ nvp bchain swap-symbols -c bsc BANANA BANANA#2
  2022/05/30 08:34:44 [__main__] INFO: Renaming BANANA#2 to BANANA...
  2022/05/30 08:34:44 [__main__] INFO: Renaming BANANA to BANANA#2...
  2022/05/30 08:34:44 [__main__] INFO: Done.

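  * As a side note, the same three-step swap can be reproduced in isolation. The sketch below is only an illustration under stated assumptions: it uses an in-memory SQLite table (not the PostgreSQL database from this post), and the ''tokens'' schema, the short addresses and the ''swap_unique_symbols'' helper are all made up for the example:

```python
import sqlite3


def swap_unique_symbols(conn, sym1, sym2, placeholder=""):
    """Swap two values of a UNIQUE column in three steps via a placeholder."""
    cur = conn.cursor()
    # 1. Park the first row on the placeholder so that sym1 becomes free:
    cur.execute("UPDATE tokens SET symbol = ? WHERE symbol = ?", (placeholder, sym1))
    # 2. The second row can now take over sym1:
    cur.execute("UPDATE tokens SET symbol = ? WHERE symbol = ?", (sym1, sym2))
    # 3. Finally the parked row takes sym2:
    cur.execute("UPDATE tokens SET symbol = ? WHERE symbol = ?", (sym2, placeholder))
    conn.commit()


# Hypothetical toy table, only for this illustration:
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tokens (address TEXT PRIMARY KEY, symbol TEXT UNIQUE)")
conn.executemany(
    "INSERT INTO tokens VALUES (?, ?)",
    [("0xAAA", "BTCB"), ("0xBBB", "BTCB#2")],
)

# A direct rename hits the unique constraint, like in the traceback above:
try:
    conn.execute("UPDATE tokens SET symbol = 'BTCB#2' WHERE address = '0xAAA'")
    naive_failed = False
except sqlite3.IntegrityError:
    naive_failed = True

swap_unique_symbols(conn, "BTCB", "BTCB#2")
result = dict(conn.execute("SELECT address, symbol FROM tokens ORDER BY address"))
print(naive_failed, result)
```

  * The placeholder has to be a value that no other row uses, otherwise step 1 would itself violate the constraint.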
  * Now checking the quote tokens again: **All OK**
  * => Among those quote tokens, WBNB alone represents 45.36% of all the pair legs, and then we have the following additional quotes:

{{ blog:2022:0530:quote_tokens.png }}

  * Now that we can get a list of valid quote tokens, the next step is to filter the full list of pairs, eliminating the pairs that are too exotic (those not using any of our accepted quote tokens):

  # Filter all pairs:
  qaddrs = [item[0] for item in quotes]
  valid_pairs = []
  for pair in allpairs:
      # Keep the pair if at least one of its tokens is a quote token:
      if pair[2] in qaddrs or pair[3] in qaddrs:
          valid_pairs.append(pair)

  print(f"quote filter: kept {len(valid_pairs)} on {len(allpairs)} pairs")

  * And we still get quite a lot of pairs from that filtering 👍:

  quote filter: kept 227846 on 233463 pairs

===== Collecting pairs usable for arbitrage setups =====

  * To collect the pairs that we can use for arbitrage setup construction I can use the following script:

  # Group the pairs by (sorted) token couple:
  tokens_map = {}
  for pair in valid_pairs:
      t0 = pair[2]
      t1 = pair[3]
      t0, t1 = chain.sort_tokens(t0, t1)
      key = f"{t0}_{t1}"
      if key not in tokens_map:
          tokens_map[key] = [pair]
      else:
          tokens_map[key].append(pair)

  # And we keep only the token pairs where we have at least 2 pairs:
  compat_pairs = {}
  count = 0
  for key, plist in tokens_map.items():
      npairs = len(plist)
      if npairs < 2:
          continue
      # if npairs > 2:
      #     print(f"Found {npairs} compat pairs.")
      for i in range(npairs):
          # Map each pair to the other pairs trading the same 2 tokens:
          others = plist.copy()
          del others[i]
          compat_pairs[plist[i]] = others
          count += 1

  print(f"Found {count} arb compatible pairs")

  * => Currently I can find 53156 such pairs, which is quite good already.

===== Building the ArbitrageManager component =====

  * The **ArbitrageManager** class will be responsible for finding those arbitrage setups, so it should be able to perform all the steps described above internally.
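  * As a compact recap of the steps above (quote filter, then grouping by sorted token couple), here is a minimal, self-contained sketch on made-up data. The tuple layout ''(pair_addr, token0, token1)'', the short fake addresses and the local ''sort_tokens'' helper are simplifications assumed for the illustration: the real rows come from the database and the real ordering from ''chain.sort_tokens'':

```python
from collections import defaultdict


def sort_tokens(t0, t1):
    """Hypothetical stand-in for chain.sort_tokens: order 2 addresses."""
    return (t0, t1) if t0.lower() < t1.lower() else (t1, t0)


def collect_compat_pairs(pairs, quote_addrs):
    """Group pair addresses by token couple, keeping couples with 2+ pairs.

    pairs: iterable of (pair_addr, token0, token1) tuples (assumed layout).
    """
    groups = defaultdict(list)
    for paddr, t0, t1 in pairs:
        # Quote filter: at least one leg must be a quote token.
        if t0 not in quote_addrs and t1 not in quote_addrs:
            continue
        a, b = sort_tokens(t0, t1)
        groups[f"{a}_{b}"].append(paddr)
    # An arbitrage needs at least 2 pairs trading the same 2 tokens:
    return {key: plist for key, plist in groups.items() if len(plist) >= 2}


toy_pairs = [
    ("0xP1", "0xA", "0xW"),  # token A against the quote token W on one DEX
    ("0xP2", "0xW", "0xA"),  # same 2 tokens on another DEX, reversed order
    ("0xP3", "0xB", "0xW"),  # only one pair for B/W: no arbitrage possible
    ("0xP4", "0xC", "0xD"),  # no quote token at all: filtered out
]
compat = collect_compat_pairs(toy_pairs, quote_addrs={"0xW"})
print(compat)
```

  * Sorting the two token addresses before building the key is what makes ''0xP1'' and ''0xP2'' land in the same group even though their token order differs.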
  * Here is an initial minimal version of that component:

  """arbitrage manager class"""

  import logging

  # import signal
  # import sys
  import time

  from nvp.nvp_component import NVPComponent
  from nvp.nvp_context import NVPContext

  from nvh.crypto.blockchain.evm_blockchain import EVMBlockchain

  # from nvp.nvp_object import NVPCheckError

  logger = logging.getLogger(__name__)

  # def signal_handler(_sig, _frame):
  #     """Handler for ctrl+c signal"""
  #     logger.info("Ctrl+C pressed, exiting.")
  #     sys.exit(0)

  # signal.signal(signal.SIGINT, signal_handler)


  class ArbitrageManager(NVPComponent):
      """ArbitrageManager component class"""

      def __init__(self, ctx):
          """ArbitrageManager base constructor"""
          NVPComponent.__init__(self, ctx)
          self.arb_pairs = None
          self.chain = None
          self.routers = None
          self.quote_tokens = None
          self.native_address = None
          self.ignore_dual_quote_pairs = True
          self.last_block_number = 0

      def collect_quote_tokens(self):
          """Collect the best quote tokens from the current pairs"""

      def monitor_arbitrages(self, chain: EVMBlockchain):
          """Method used to monitor the available arbitrage setups."""

          # Prepare the monitoring context:
          self.chain = chain
          self.arb_pairs, self.quote_tokens = chain.collect_arb_compatible_pairs()
          self.routers = chain.get_router_addresses()
          self.native_address = chain.get_wrapped_native_token().address()
          logger.info("wrapped native token address: %s", self.native_address)
          logger.info("Collected %d arb pairs", len(self.arb_pairs))

          result = True
          while result:
              try:
                  result = self.handle_arbitrage_setups()
              except KeyboardInterrupt:
                  logger.info("Stopping arbitrage monitoring on %s", chain.get_short_name())
                  break

      def handle_arbitrage_setups(self):
          """Method used to find and handle arbitrage setups"""

          # Collect the latest block:
          bck = self.chain.get_block("latest", full_tx=True)

          # Get the block number:
          bnum = bck["number"]

          if bnum == self.last_block_number:
              # logger.info("Waiting for new block...")
              time.sleep(0.1)
              return True

          logger.info("Processing block %s", bnum)
          self.last_block_number = bnum

          # Get all transactions:
          transactions = bck["transactions"]
          pairs = self.collect_potential_arb_pairs(transactions)
          npairs = len(pairs)
          if npairs > 0:
              logger.info("Collected %d potential arb pairs: %s", npairs, pairs)

          return True

      def parse_input_data(self, inputdat):
          """Parse the input data from a given transaction"""
          # "0x" + 8 hex chars for the method ID, then 32-byte (64 hex chars) registers:
          sig = inputdat[:10]
          regs = inputdat[10:]
          # the remaining "regs" length should be a multiple of 64 chars
          lenregs = len(regs)
          self.check(lenregs % 64 == 0, "Invalid register length: %s", regs)
          return {"sig": sig, "regs": [regs[i : i + 64] for i in range(0, lenregs, 64)]}

      def collect_potential_arb_pairs(self, txlist):
          """Collect the potential arb pairs from a list of transactions"""
          pairs = set()
          for txn in txlist:
              # Check what kind of transaction this was based on the 'to' value
              # and 'input' value:
              # logger.info("Parsing transaction: %s", txn)
              if txn["to"] not in self.routers:
                  # We ignore this transaction
                  continue

              # We should check that transaction:
              input_data = self.parse_input_data(txn["input"])
              # logger.info("Collected input data: %s", self.pretty_print(input_data))

              # Now we can collect the pairs involved in this transaction depending on the method signature:
              # try:
              tx_pairs = self.parse_input_pairs(input_data, txn["hash"])
              # except NVPCheckError as err:
              #     logger.error("Error while processing input data from %s", txn["hash"])
              #     raise err

              if tx_pairs is not None:
                  pairs = pairs.union(tx_pairs)

          return list(pairs)

      def parse_input_pairs(self, input_data, txh):
          """Parse the input data to extract the involved pairs"""
          regs = input_data["regs"]
          sig = input_data["sig"]
          tokens = None

          if sig in ["0x7ff36ab5", "0xb6f9de95", "0xfb3bdb41"]:
              # regs[1] is the offset of the path array: 4 head registers of 32 bytes = 0x80
              self.check(int(regs[1], 16) == 8 * 16, "Invalid value for regs[1]: %s", regs[1])
              # get the number of tokens:
              ntokens = int(regs[4], 16)
              # (the format string has 2 placeholders, so we must also pass the tx hash)
              self.check(len(regs) == 5 + ntokens, "Invalid number of tokens: %s (hash=%s)", regs, txh)
              # Get each token:
              tokens = [self.chain.to_checksum_address(regs[5 + i][-40:])
                        for i in range(ntokens)]

          if tokens is None:
              # Nothing by default:
              return None

          # now we construct the pair names:
          pnames = []
          for i in range(ntokens - 1):
              t0, t1 = tokens[i], tokens[i + 1]
              if t0 == t1:
                  logger.error("Detected same token (%s) in pair %d from %s", t0, i, txh)
                  continue

              t0, t1 = self.chain.sort_tokens(t0, t1)
              pair_name = f"{t0}_{t1}"

              # We only keep that pair if it is part of the arbitrage compatible pairs:
              if pair_name not in self.arb_pairs:
                  continue

              # If both tokens are quote tokens we should ignore the pair: we will probably not get
              # any valid arb setup on these:
              if self.ignore_dual_quote_pairs and t0 in self.quote_tokens and t1 in self.quote_tokens:
                  logger.debug("Ignoring double quote pair %s", pair_name)
                  continue

              # Otherwise we keep that pair name:
              pnames.append(pair_name)

          return set(pnames)

      def process_command(self, cmd):
          """Check if this component can process the given command"""

          if cmd == "monitor-arbs":
              chain_name = self.get_param("chain")
              chain: EVMBlockchain = self.get_component(f"{chain_name}_chain")
              self.monitor_arbitrages(chain)
              return True

          return False


  if __name__ == "__main__":
      # Create the context:
      context = NVPContext()

      # Add our component:
      comp = context.register_component("arb_man", ArbitrageManager(context))

      context.define_subparsers(
          "main",
          {
              "monitor-arbs": None,
          },
      )

      psr = context.get_parser("main.monitor-arbs")
      psr.add_argument(
          "-c",
          "--chain",
          dest="chain",
          type=str,
          default="bsc",
          help="Blockchain from where to monitor the arbitrage setups",
      )

      comp.run()

  * And with that code I can start collecting some arbitrage compatible pairs:

  $ nvp arbman monitor-arbs
  2022/06/01 22:26:39 [nvh.crypto.blockchain.evm_blockchain] INFO: Keeping 369459/377028 quotable pairs.
  2022/06/01 22:26:40 [nvh.crypto.blockchain.evm_blockchain] INFO: Found 33524 arb compatible pairs
  2022/06/01 22:26:40 [__main__] INFO: wrapped native token address: 0xbb4CdB9CBd36B01bD1cBaEBF2De08d9173bc095c
  2022/06/01 22:26:40 [__main__] INFO: Collected 33524 arb pairs
  2022/06/01 22:26:41 [__main__] INFO: Processing block 18320121
  2022/06/01 22:26:41 [__main__] INFO: Collected 1 potential arb pairs: ['0x9fD87aEfe02441B123c3c32466cD9dB4c578618f_0xbb4CdB9CBd36B01bD1cBaEBF2De08d9173bc095c']
  2022/06/01 22:26:42 [__main__] INFO: Processing block 18320122
  2022/06/01 22:26:42 [__main__] INFO: Collected 1 potential arb pairs: ['0x1CE0c2827e2eF14D5C4f29a091d735A204794041_0xbb4CdB9CBd36B01bD1cBaEBF2De08d9173bc095c']
  2022/06/01 22:26:45 [__main__] INFO: Processing block 18320123
  2022/06/01 22:26:45 [__main__] INFO: Collected 2 potential arb pairs: ['0x23396cF899Ca06c4472205fC903bDB4de249D6fC_0xe9e7CEA3DedcA5984780Bafc599bD69ADd087D56', '0xbb4CdB9CBd36B01bD1cBaEBF2De08d9173bc095c_0xc748673057861a797275CD8A068AbB95A902e8de']

===== Retrieving the reserves =====

  * Now that we can collect some potential arbitrage setup pairs, we need to retrieve all the reserves for those pairs, which should be easy and fast to do using the **"DualReserves"** smart contract I created some time ago for that: it accepts a list of pair addresses, and outputs the reserves for token0 and token1 in 2 separate arrays for all those pairs.
  * => Let's restore the usage of that contract now.
  * So we get the list of pair addresses from the "token_pairs" of interest to us with this kind of code:

          # Map each pair address to the token pair name it belongs to:
          pair_tk_map = {}
          for tkpair in tkpairs:
              # That name must be in the global mapping:
              plist = self.arb_pairs[tkpair]
              for pdesc in plist:
                  paddr = pdesc[1]  # address is element 1 in this row.
                  pair_tk_map[paddr] = tkpair

          # Get the list of addresses:
          paddrs = list(pair_tk_map.keys())
          logger.debug("Should collect reserves for %d pairs: %s", len(paddrs), paddrs)

          # Now we call the DualReserves contract:
          pair_reserves, block_ref = self.get_all_reserves(paddrs)
          logger.info("Got the pair reserves: %s", pair_reserves)
          if block_ref != bnum:
              logger.info("Pair reserves block offset: %d", block_ref - bnum)

  * And we collect the reserves for each pair with the following call to the ''getDualReserves'' function of our internal **DualReserves** contract:

      def get_all_reserves(self, paddrs):
          """Retrieve the reserves for all the pairs provided as input"""
          # logger.info("Calling getDualReserves with paddrs=%s", paddrs)
          result = self.dual_reserves_sc.call_function("getDualReserves", paddrs)
          # The contract returns (block number, token0 reserves, token1 reserves):
          block_ref = result[0]
          reserves0 = result[1]
          reserves1 = result[2]

          reserves = {}
          for idx, paddr in enumerate(paddrs):
              reserves[paddr] = (reserves0[idx], reserves1[idx])

          return reserves, block_ref

  * This works just fine and gives us some valid outputs (and we don't seem to have any block offset for now, which is nice):

  2022/06/02 07:17:00 [__main__] INFO: Processing block 18330640
  2022/06/02 07:17:00 [__main__] INFO: Got the pair reserves: {'0x1494CB2920AF3b296556a2569D637ed36EE3E653': (6679798815876686757, 321226250680082),
    '0xa584A5805DDE848cb200Da49a2A58174cCfd1Ba8': (56610244478243821, 1600579074),
    '0x038625dDCD10fde5253f1b3db84E53339D50f82F': (501060929594755052116933, 18729657199448284460),
    '0x438F9009A1a2A40301D4663366Af7eb7ffc91ff3': (2382367943914026, 179616631182),
    '0x756Ed14D42475b1EE7Cf1c5Ba035E3D1A9295061': (15935285709348273686573965, 337309236320016822568),
    '0x1f5C12a536d13689d0198B8df0F9637E15Bf8A29': (15272089060184523372197, 349075381400303126),
    '0x250B6610717Ef6F1F5e062d61FD5F3f075f201bF': (3334496309106401971007, 5333080855004205),
    '0xb5BcdAeDC04dBd0742fD9dC342dE7C427192Cf3c': (50801579006377274100, 88942202633920),
    '0x265E2B86Cff3d43e2cB732f1fD547775b03eD71C': (4959307302118341344, 16432324013437),
    '0x24E471d46596750BAAE5881CBB878c0ba929a288': (5186144545154995305313, 194895216166968),
    '0xc592F85794d80D8C048dF047E7091F74747D9B72': (14686598592882386211786, 561017969257450),
    '0xC0AFB6078981629F7eAe4f2ae93b6DBEA9D7a7e9': (12897197441848189284851144042, 463137737246621906914),
    '0x343e0c149cA3ba381F16485746B26FB81807a334': (1160295101461103435134530, 42751428133928641),
    '0x6b7b3523a6660a5fcE3c28E1536CC8dd8D57f7E0': (7012481874296233978981009795, 251414318937330880508)}
  2022/06/02 07:17:01 [__main__] INFO: Processing block 18330641
  2022/06/02 07:17:05 [__main__] INFO: Processing block 18330642
  2022/06/02 07:17:05 [__main__] INFO: Got the pair reserves: {'0xA0a414df3481c68b07B9FD809ee7Fa9f4d05A5d3': (157901004649148527, 206740412469435),
    '0x8CA3fF14A52b080C54A6d1a405eecA02959d39fE': (1128915779061263920786382, 2149602698120994035992),
    '0x81499F67774F8e5fc60d65F23cd799461b172BD3': (5321889577169921053, 6544234512333964371),
    '0x1A43917e018Fac283b520D1c3e3dc09072dCA343': (120449873172412, 119057808912149),
    '0xbB69c66E6005D810b13D65bA71ccfF5480AA1278': (5005476603695973, 3492139298165763),
    '0xA1f1AAe9aBd24C7C7fA08b86DD096710e02B3bc7': (574226955524916956552, 876692518047456226607),
    '0x0A147bf930ED01d92B9Cd83A87482020e66617e2': (4291124175, 2171744495416352781),
    '0x955631FEDea348D237F4FD772BD5bc81b4c651Bf': (4232821968, 2175965266826955625),
    '0xB450CBF17F6723Ef9c1bf3C3f0e0aBA368D09bF5': (3052235222724935529443704, 13107195094219077811599),
    '0xc232A127F9f57C2006bCABa8C9D813CE0811cf23': (53435982089657, 306540307770),
    '0x1C5041ca551e6C58381dC3f66500a477E11033A5': (204251654999022668, 1063235502407882),
    '0xc1b047cda56754eCA893cF0226274365EeE0b6D6': (838050760916017980147, 3529160662798421453),
    '0xfff58a50Fdde55F5B5626BB66dE476F63155E078': (76636548843548979981354, 344099901419847577931)}

===== Checking for arbitrage setups between pairs =====

  * So now we should try all the possible combinations between 2 pairs from the same tk_pair to see if we indeed have an arbitrage setup 👍!
  * So here is the main function to compute the best possible arb setup given 2 pairs:

      def compute_arb_profit(self, p0, p1, pair_reserves, qtoken):
          """Compute the arbitrage profit between 2 pairs"""
          p0addr = p0[1]
          p1addr = p1[1]

          # We really expect to have the same t0 and t1 values:
          t0addr = p0[2]
          t1addr = p0[3]
          self.check(t0addr == p1[2], "Mismatch in token0 between %s and %s", p0addr, p1addr)
          self.check(t1addr == p1[3], "Mismatch in token1 between %s and %s", p0addr, p1addr)

          # Get the exchanges from the indices:
          dex0 = self.chain.get_exchange(p0[8])
          dex1 = self.chain.get_exchange(p1[8])

          # Get the reserves:
          r0, r1 = pair_reserves[p0addr]
          rin, rout = pair_reserves[p1addr]

          # Get the fee points:
          fp0 = dex0.get_swap_fee_points(p0addr)
          fp1 = dex1.get_swap_fee_points(p1addr)

          amount0, profit = dex0.compute_flash_loan_best_investment(r0, r1, fp0, rin, rout, fp1)
          if profit is None:
              return None, None

          logger.debug("raw profit: %s, amount0: %s", str(profit), str(amount0))

          # The profit is given in token1, if this is not the quote token, then we should convert back
          # to the token0:
          if qtoken == t0addr:
              self.check(t0addr in self.quote_tokens, "Invalid pair with no quote token")

              # When the quote token is the source token then we are going to refund only the minimal swap amount
              # to the initial pair and keep the difference from what we borrowed in source token.
              # the formula to compute the min swap is:
              # aSwapMin = params[4] + (rin * aRfd * 10000) / ((rout - aRfd) * (10000 - params[1]));
              # or:
              # aSwapMin = refoff + (rin * aRfd * 10000) / ((rout - aRfd) * (10000 - fp1));
              # with for aRfd:
              # uint256 aRfd = (uint256(r1) * amount0) / (uint256(r0) - amount0);
              # aRfd = params[4] + (aRfd * 10000)/(10000 - params[0])

              # So we can compute aRfd first:
              a_rfd = self.refund_offset + (int(r1) * int(amount0) * 10000) // ((int(r0) - int(amount0)) * (10000 - fp0))
              a_swap_min = self.refund_offset + (int(rin) * a_rfd * 10000) // ((int(rout) - a_rfd) * (10000 - fp1))
              profit = amount0 - a_swap_min

          # At this point the profit is always given in the quote token,
          # so we can report it and then select the best pair:
          return amount0, profit

  * **Note**: this function is not really complete yet: for instance we are not taking into account that flash loans might not be possible on a given exchange, so some pairs would have to be swapped, but anyway.
  * And then we select only the 2 pairs that will give us the best profit:

      def process_arb_setups(self, plist, pair_reserves):
          """Process the best arbitrage setup between any 2 pairs of a given pair list"""
          t0addr = plist[0][2]
          t1addr = plist[0][3]
          qaddr = t1addr if t1addr in self.quote_tokens else t0addr

          # best (amount0,profit,p0,p1)
          best_amount0 = 0
          best_profit = 0.0
          best_p0 = None
          best_p1 = None

          npairs = len(plist)
          for i in range(npairs - 1):
              for j in range(i + 1, npairs):
                  # Check for the arb setup between pairs i and j
                  p0 = plist[i]
                  p1 = plist[j]
                  am0, pval = self.compute_arb_profit(p0, p1, pair_reserves, qaddr)
                  if am0 is not None and pval > best_profit:
                      best_amount0 = am0
                      best_profit = pval
                      best_p0 = p0
                      best_p1 = p1

          # Check if we found something valid:
          if best_amount0 == 0:
              # Nothing found.
              return

          # Convert the profit to value:
          qtoken = self.chain.get_token(qaddr)
          best_profit = qtoken.to_value(best_profit)

          # Get the tokens:
          t0 = self.chain.get_token(t0addr)
          t1 = self.chain.get_token(t1addr)
          logger.info(
              "Found arb setup of %.4g %s with pairs on %s/%s", best_profit, qtoken.symbol(), t0.symbol(), t1.symbol()
          )

  * And with that we already get some non-crazy outputs:

  2022/06/02 09:48:44 [__main__] INFO: Processing block 18333628
  2022/06/02 09:48:44 [__main__] INFO: Found arb setup of 0.01108 USDT with pair USDT/LIZ#2
  2022/06/02 09:48:44 [__main__] INFO: Found arb setup of 0.03424 USDT with pair LUNA#13/USDT
  2022/06/02 09:48:47 [__main__] INFO: Processing block 18333629
  2022/06/02 09:48:47 [__main__] INFO: Found arb setup of 1.918e-05 WBNB with pair DOGE#5/WBNB
  2022/06/02 09:48:49 [__main__] INFO: Processing block 18333630
  2022/06/02 09:48:49 [__main__] INFO: Found arb setup of 4.941e-08 WBNB with pair WBNB/GNT#4
  2022/06/02 09:48:50 [__main__] INFO: Found arb setup of 0.05885 BUSD with pair ADA/BUSD
  2022/06/02 09:48:50 [__main__] INFO: Found arb setup of 0.001192 WBNB with pair HotDoge/WBNB
  2022/06/02 09:48:50 [__main__] INFO: Found arb setup of 0.001166 USDT with pair SHIB#34/USDT
  2022/06/02 09:48:50 [__main__] INFO: Found arb setup of 0.0001915 WBNB with pair WBNB/BabyDoge
  2022/06/02 09:48:53 [__main__] INFO: Processing block 18333631
  2022/06/02 09:48:53 [__main__] INFO: Found arb setup of 8.468e-06 WBNB with pair PIT/WBNB
  2022/06/02 09:48:54 [__main__] INFO: Found arb setup of 0.1356 BUSD with pair UST/BUSD
  2022/06/02 09:48:55 [__main__] INFO: Processing block 18333632

  * Except that, of course, all those arbitrages are not profitable due to the gas fees we need to pay for them 😂.
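  * For intuition about where these (tiny) profits come from, here is a hedged sketch of the swap arithmetic, assuming Uniswap V2 style constant-product pools (which PancakeSwap V2 is derived from) with the fee expressed in points out of 10000, as the ''fp0''/''fp1'' values above suggest. The pool reserves, the 25-point default fee and both helper names are made-up inputs for the illustration, and this naive round trip is not the flash-loan optimisation done by ''compute_flash_loan_best_investment'':

```python
def get_amount_out(amount_in, reserve_in, reserve_out, fee_points):
    """Constant-product swap output, fee taken on the input (integer math)."""
    amount_in_with_fee = amount_in * (10000 - fee_points)
    numerator = amount_in_with_fee * reserve_out
    denominator = reserve_in * 10000 + amount_in_with_fee
    return numerator // denominator


def round_trip_profit(amount_in, pool_a, pool_b, fee_points=25):
    """Buy the token on pool_a with the quote token, sell it back on pool_b.

    Each pool is a (quote_reserve, token_reserve) tuple; the result is the
    gain (or loss if negative) in quote token units, before any gas cost.
    """
    quote_a, token_a = pool_a
    quote_b, token_b = pool_b
    tokens = get_amount_out(amount_in, quote_a, token_a, fee_points)
    back = get_amount_out(tokens, token_b, quote_b, fee_points)
    return back - amount_in


ONE = 10**18
# Hypothetical reserves: the token is about 10% more expensive on pool_b.
pool_a = (1000 * ONE, 100_000 * ONE)
pool_b = (1100 * ONE, 100_000 * ONE)

profit = round_trip_profit(1 * ONE, pool_a, pool_b)
no_arb = round_trip_profit(1 * ONE, pool_a, pool_a)
print(profit / ONE, no_arb / ONE)
```

  * With identical pools the round trip always loses the two swap fees, so a setup only exists when the price gap between the two pairs is larger than the combined fees (and it only becomes worth executing once the profit also covers the gas).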
===== Interesting error on parsing of the transaction input data ===== * Eventually I got this error reported: 2022/06/02 10:42:04 [__main__] INFO: Processing block 18334695 Traceback (most recent call last): File "D:\Projects\NervHome\nvh\crypto\blockchain\arbitrage_manager.py", line 408, in "monitor-arbs": None, File "D:\Projects\NervProj\nvp\nvp_component.py", line 73, in run res = self.process_command(cmd) File "D:\Projects\NervHome\nvh\crypto\blockchain\arbitrage_manager.py", line 378, in process_command logger.debug("Ignoring double quote pair %s", pair_name) File "D:\Projects\NervHome\nvh\crypto\blockchain\arbitrage_manager.py", line 82, in monitor_arbitrages try: File "D:\Projects\NervHome\nvh\crypto\blockchain\arbitrage_manager.py", line 106, in handle_arbitrage_setups File "D:\Projects\NervHome\nvh\crypto\blockchain\arbitrage_manager.py", line 280, in collect_potential_arb_pairs def collect_potential_arb_pairs(self, txlist): File "D:\Projects\NervHome\nvh\crypto\blockchain\arbitrage_manager.py", line 262, in parse_input_data reserves0 = result[1] File "D:\Projects\NervProj\nvp\nvp_object.py", line 69, in check raise NVPCheckError(fmt % args) nvp.nvp_object.NVPCheckError: Invalid register length: 00000000000000000000000000000000000000000000000000000039d5f8fac00000000000000000000000000000000000000000000000000 058785d816202d900000000000000000000000000000000000000000000000000000000000000a0000000000000000000000000380264831e98cd8f4c1b309d13d0b2f6f9f35da50000000000000000000000000 000000000000000000000000000000062988a980000000000000000000000000000000000000000000000000000000000000002000000000000000000000000a8d22026119bc8a33941853614148fda61971d4d0 00000000000000000000000bb4cdb9cbd36b01bd1cbaebf2de08d9173bc095c88 (tx: 0xb2c5192c2b6f7a17cfa303da3f02fe8c44c6bf6d4ef10369efaceb8089b4e63f) * And checking the transaction, it would seem that we have that additional '88' values at the end of the input data that is not related to anything ? 
I'm really wondering how this could happen...
  * => So in this case, we should probably discard that final part of the input (?).

===== Filtering arbitrage setup with min value =====

  * Naturally, our next step should then be to filter out all those arb setups that are not profitable anyway.
  * Basically, I have a fixed gas cost of about 0.001 BNB when executing a (successful) arbitrage setup (and about 0.0001885 BNB when the setup fails).
  * => So I would like to only keep the setups that are at least somewhat above that mean cost level.
  * Which means that I need to get the value of all quote tokens into WBNB before we start the arbitrage monitoring.
  * **OK**: So I have now added support to retrieve the value of the quote tokens in the native token:

  def update_quote_token_values(self):
      """Update the value of the quote tokens in the native currency"""
      self.quote_token_values = {}
      dex = self.chain.get_default_exchange()
      native_sym = self.chain.get_wrapped_native_token().symbol()
      for addr in self.quote_tokens:
          val = dex.get_quote(1.0, addr, native_sym)
          self.quote_token_values[addr] = val
          logger.info("Quote token %s value: %.6g BNB", self.chain.get_token_symbol(addr), val)

  * Which currently gives us:

  2022/06/02 11:38:25 [__main__] INFO: wrapped native token address: 0xbb4CdB9CBd36B01bD1cBaEBF2De08d9173bc095c
  2022/06/02 11:38:25 [__main__] INFO: Collected 37525 arb pairs
  2022/06/02 11:38:25 [__main__] INFO: Quote token WBNB value: 1 BNB
  2022/06/02 11:38:26 [__main__] INFO: Quote token BUSD value: 0.00329429 BNB
  2022/06/02 11:38:26 [__main__] INFO: Quote token USDT value: 0.00329267 BNB
  2022/06/02 11:38:26 [__main__] INFO: Quote token Cake value: 0.014148 BNB
  2022/06/02 11:38:26 [__main__] INFO: Quote token ETH value: 5.98795 BNB
  2022/06/02 11:38:26 [__main__] INFO: Quote token BTCB value: 98.3489 BNB
  2022/06/02 11:38:27 [__main__] INFO: Quote token USDC value: 0.00329528 BNB
  2022/06/02 11:38:27 [__main__] INFO: Quote token SAFEMOON value: 3.54682e-10 BNB
  2022/06/02 11:38:27 [__main__] INFO: Quote token DAI value: 0.00328415 BNB
  2022/06/02 11:38:27 [__main__] INFO: Quote token BANANA value: 0.000508487 BNB

  * So next, we need to convert the qtoken profit to a BNB profit value in all cases when checking the arbitrage setups, which is now a simple multiplication:

  # Convert the profit to value:
  qtoken = self.chain.get_token(qaddr)
  best_profit = qtoken.to_value(best_profit)
  # Convert the qtoken profit value into native (wrapped) token value:
  best_profit *= self.quote_token_values[qaddr]
  # Get the tokens:
  t0 = self.chain.get_token(t0addr)
  t1 = self.chain.get_token(t1addr)
  logger.info(
      "Found arb setup of %.4g %s with pairs on %s/%s", best_profit, self.native_symbol, t0.symbol(), t1.symbol()
  )

  * And now we can filter the arbitrage setups with a minimal profit value, yeah 🥳!

  # Convert the profit to value:
  qtoken = self.chain.get_token(qaddr)
  best_profit = qtoken.to_value(best_profit)
  # Convert the qtoken profit value into native (wrapped) token value:
  best_profit *= self.quote_token_values[qaddr]
  if best_profit < self.min_profit:
      # Profit too low, so we ignore this arb setup.
      return
  # Get the tokens:
  t0 = self.chain.get_token(t0addr)
  t1 = self.chain.get_token(t1addr)
  logger.info(
      "Block %d: Found arb setup of %.4g %s with pairs on %s/%s",
      self.last_block_number,
      best_profit,
      self.native_symbol,
      t0.symbol(),
      t1.symbol(),
  )

  * And from that I get the following kind of outputs:

  $ nvp arbman monitor-arbs
  2022/06/02 11:52:53 [nvh.crypto.blockchain.evm_blockchain] INFO: Keeping 409133/417141 quotable pairs.
  2022/06/02 11:52:54 [nvh.crypto.blockchain.evm_blockchain] INFO: Found 37562 arb compatible pairs
  2022/06/02 11:52:54 [__main__] INFO: wrapped native token address: 0xbb4CdB9CBd36B01bD1cBaEBF2De08d9173bc095c
  2022/06/02 11:52:54 [__main__] INFO: Collected 37562 arb pairs
  2022/06/02 11:52:54 [__main__] INFO: Min profit value: 0.001200
  2022/06/02 11:52:54 [__main__] INFO: Quote token WBNB value: 1 WBNB
  2022/06/02 11:52:54 [__main__] INFO: Quote token BUSD value: 0.00329354 WBNB
  2022/06/02 11:52:55 [__main__] INFO: Quote token USDT value: 0.00328904 WBNB
  2022/06/02 11:52:55 [__main__] INFO: Quote token Cake value: 0.0141319 WBNB
  2022/06/02 11:52:55 [__main__] INFO: Quote token ETH value: 5.99574 WBNB
  2022/06/02 11:52:56 [__main__] INFO: Quote token BTCB value: 98.3494 WBNB
  2022/06/02 11:52:56 [__main__] INFO: Quote token USDC value: 0.00329315 WBNB
  2022/06/02 11:52:56 [__main__] INFO: Quote token SAFEMOON value: 3.54682e-10 WBNB
  2022/06/02 11:52:56 [__main__] INFO: Quote token DAI value: 0.00327899 WBNB
  2022/06/02 11:52:56 [__main__] INFO: Quote token BANANA value: 0.000508487 WBNB
  2022/06/02 11:55:38 [__main__] INFO: Block 18336166: Found arb setup of 0.002436 WBNB with pairs on WBNB/QUACK#3
  2022/06/02 11:55:48 [__main__] INFO: Block 18336169: Found arb setup of 0.03963 WBNB with pairs on ASS#7/WBNB
  2022/06/02 11:56:20 [__main__] INFO: Block 18336180: Found arb setup of 0.001731 WBNB with pairs on WBNB/GINUX

===== Conclusion =====

  * Okay, so, this is all very interesting, but there are still a lot of steps to take before we can effectively do our first arbitrage transaction here lol.
  * Yet, as usual, this article is getting too big, so I will stop it here, and we will continue this journey in our next post ✌!