Crypto: Initial ArbitrageManager implementation

In this post, we will start with some preliminary analysis of the block/transaction content and progressively build our way to an arbitrage management system for EVM blockchains 😎 Or well… an initial version of it at least lol.

  • Let's start with a simple overview of the transactions we can extract from each block…
  • When there is no input data for a transaction (i.e. 'input' is just '0x'), it means we are simply transferring funds, like in this one:
    {'blockHash': HexBytes('0xe1bb91e1dd02d2032a760aefaa9434e26d74e252871ca987a0147ce08ae75129'),
       'blockNumber': 18224230,
       'from': '0xBD612a3f30dcA67bF60a39Fd0D35e39B7aB80774',
       'gas': 207128,
       'gasPrice': 10000000000,
       'hash': HexBytes('0xe81d4a96e5259149353554a012b040dd29d82df9a6a1147c392ca24c007aee75'),
       'input': '0x',
       'nonce': 3579524,
       'to': '0x00fC8a4aF0Cc0B917EA935a33e0369153B63B357',
       'transactionIndex': 10,
       'value': 1931268460000000000,
       'type': '0x0',
       'v': 147,
       'r': HexBytes('0x75bd9f1b6a3b8d1013b7ddedcdf103146e4304af6a8b4dc4f481d3a569d405e9'),
       's': HexBytes('0x6d750340acef0f58d2d0474cdf133bfbe964658c5058d2b3fbfeb02e72556db0')}
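  • The rule above (empty input data means plain transfer) can be captured in a small helper. This is only a minimal sketch: is_plain_transfer is a hypothetical name, and I'm assuming the 'input' field is either a hex string or a bytes-like object (which one you get may depend on your web3 version).

```python
def is_plain_transfer(txn):
    """Return True when the transaction carries no call data."""
    data = txn.get("input", "0x")
    # Normalise bytes-like values (HexBytes is a bytes subclass) to a hex string:
    if isinstance(data, (bytes, bytearray)):
        data = "0x" + bytes(data).hex()
    return data in ("", "0x")


# Trimmed-down transactions: the first one mirrors the transfer shown above,
# the second one is a made-up contract call with zeroed arguments.
transfer_tx = {"input": "0x", "value": 1931268460000000000}
call_tx = {"input": "0xa9059cbb" + "00" * 64, "value": 0}

print(is_plain_transfer(transfer_tx), is_plain_transfer(call_tx))
```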
  • I used the following code to try to collect all possible method signatures/IDs:
    bck = chain.get_block('latest', full_tx=True)
    
    # find all the transactions occurring on a dex router
    # Checking with PancakeSwap2 router only below:
    routers = ['0x10ED43C718714eb63d5aA57B78B54704E256024E']
    alltx = bck["transactions"]
    tx_count = 0
    sigs = set()
    
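    for txn in alltx:
        if txn['to'] in routers:
            print(f"Found transaction on router: {txn}")
            # Note: not calling this variable 'input' to avoid shadowing the builtin.
            tx_input = txn['input']
            print(f"Hash: {txn['hash'].hex()}")
            print(f"Input: {tx_input}")
            # print(f"Value: {txn['value']}")
            # The method ID is '0x' followed by the first 4 bytes (8 hex chars):
            sigs.add(tx_input[:10])
            tx_count += 1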
    
    print(f"Keeping {tx_count} transactions on {len(alltx)}")
    print(f"List of method signatures: {sigs}")
  • 0xa9059cbb is the method id for transfer(address recipient, uint256 amount)
  • Some of the signatures of interest to us are:
    • 0x7ff36ab5 is for swapExactETHForTokens(uint256 amountOutMin, address[] path, address to, uint256 deadline)
    • 0x18cbafe5 is for swapExactTokensForETH(uint256 amountIn, uint256 amountOutMin, address[] path, address to, uint256 deadline)
    • 0x8803dbee is for swapTokensForExactTokens(uint256 amountOut, uint256 amountInMax, address[] path, address to, uint256 deadline)
    • 0xe8e33700 is for addLiquidity(address tokenA, address tokenB, uint256 amountADesired, uint256 amountBDesired, uint256 amountAMin, uint256 amountBMin, address to, uint256 deadline)
    • 0x38ed1739 is for swapExactTokensForTokens(uint256 amountIn, uint256 amountOutMin, address[] path, address to, uint256 deadline)
    • 0xf305d719 is for addLiquidityETH(address token, uint256 amountTokenDesired, uint256 amountTokenMin, uint256 amountETHMin, address to, uint256 deadline)
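  • For reference, a method ID is the first 4 bytes of the Keccak-256 hash of the canonical method signature. Since the Python standard library has no Keccak-256 (hashlib.sha3_256 is a different function), the sketch below does not recompute the IDs: it simply uses the list above as a lookup table to tally which router methods are called. The names method_name and count_methods, and the sample inputs, are made up for illustration.

```python
from collections import Counter

# Selector -> method name, copied from the list above:
KNOWN_SELECTORS = {
    "0x7ff36ab5": "swapExactETHForTokens",
    "0x18cbafe5": "swapExactTokensForETH",
    "0x8803dbee": "swapTokensForExactTokens",
    "0xe8e33700": "addLiquidity",
    "0x38ed1739": "swapExactTokensForTokens",
    "0xf305d719": "addLiquidityETH",
}


def method_name(input_hex):
    """Map the input data of a transaction to a known router method name."""
    selector = input_hex[:10].lower()
    return KNOWN_SELECTORS.get(selector, f"unknown({selector})")


def count_methods(inputs):
    """Tally the router methods used in a list of input data strings."""
    return Counter(method_name(data) for data in inputs)


# Made-up inputs: only the selector matters here, the arguments are zeroed out.
sample_inputs = [
    "0x7ff36ab5" + "0" * 64,
    "0x38ed1739" + "0" * 64,
    "0x7ff36ab5" + "0" * 64,
    "0xdeadbeef",
]
method_counts = count_methods(sample_inputs)
print(method_counts.most_common())
```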
  • Now trying to collect the signatures over more blocks (the 20 most recent ones):
    bck = chain.get_block('latest', full_tx=True)
    last_num = bck['number']
    
    # find all the transactions occurring on a dex router
    routers = ['0x10ED43C718714eb63d5aA57B78B54704E256024E']
    tx_count = 0
    all_count = 0
    sigs = set()
    
    for i in range(20):
        num = last_num - i
        print(f"Checking block {num}...")
        
        alltx = bck["transactions"]
        all_count += len(alltx)
        
        for txn in alltx:
            if txn['to'] in routers:
                # print(f"Found transaction on router: {txn}")
                # Note: not calling this variable 'input' to avoid shadowing the builtin.
                tx_input = txn['input']
                print(f"Hash: {txn['hash'].hex()}")
                print(f"Input: {tx_input}")
                # print(f"Value: {txn['value']}")
                sigs.add(tx_input[:10])
                tx_count += 1
       
        bck = chain.get_block(num-1, full_tx=True)
        
    print(f"Keeping {tx_count} transactions on {all_count}")
    print(f"List of method signatures: {sigs}")
  • I then checked which contracts were the most used in the last 100 blocks with the following script:
    bck = chain.get_block('latest', full_tx=True)
    last_num = bck['number']
    
    all_count = 0
    targets = {}
    
    for i in range(100):
        num = last_num - i
        # print(f"Checking block {num}...")
        
        alltx = bck["transactions"]
        all_count += len(alltx)
        
        for txn in alltx:
            addr = txn['to']
            targets[addr] = targets.get(addr,0)+1
    
        bck = chain.get_block(num-1, full_tx=True)
    
    # Sort the (address, count) entries by decreasing count:
    arr = sorted(targets.items(), key=lambda item: item[1], reverse=True)
    
    print("Most used contracts:")
    for addr, count in arr[:10]:
        print(f"- {addr}:  {count*100/all_count:.2f}%")
  • And I got the following results with that:
    Most used contracts:
    - 0x10ED43C718714eb63d5aA57B78B54704E256024E:  16.89%
    - 0x55d398326f99059fF775485246999027B3197955:  6.20%
    - 0x4a2c860cEC6471b9F5F5a336eB4F38bb21683c98:  1.82%
    - 0x18B2A687610328590Bc8F2e5fEdDe3b582A49cdA:  1.58%
    - 0xe9e7CEA3DedcA5984780Bafc599bD69ADd087D56:  1.43%
    - 0xfe702057BDD6E725d8372708d16143F6928B1383:  1.40%
    - 0x093C639e4291fbDcC339557ACCDAFF52aaAC37d9:  1.10%
    - 0x760B8237e457e6A2b5b196ae073EF0301b8316a2:  1.04%
    - 0x1B6C9c20693afDE803B27F8782156c0f892ABC2d:  0.97%
    - 0xca4533591F5E5256f1bDb0f07FEe3be76a1aae35:  0.88%
  • Note: the first address on that list (0x10ED43C718714eb63d5aA57B78B54704E256024E) is the PancakeSwap2 router address.
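  • As a side note, the same kind of ranking can be written more compactly with collections.Counter. The sketch below runs on a made-up transaction list (the addresses are placeholders, not real contracts), and rank_targets is a hypothetical helper name. Contract creation transactions have no 'to' address, so None can show up as a key.

```python
from collections import Counter


def rank_targets(transactions, top=10):
    """Return (address, percentage) tuples for the most frequent 'to' addresses."""
    counts = Counter(txn["to"] for txn in transactions)
    total = sum(counts.values())
    return [(addr, 100.0 * num / total) for addr, num in counts.most_common(top)]


# Placeholder data: 3 calls to a router, 1 to a token, 1 contract creation.
sample_txs = [
    {"to": "0xRouter"},
    {"to": "0xToken"},
    {"to": "0xRouter"},
    {"to": None},
    {"to": "0xRouter"},
]

ranking = rank_targets(sample_txs)
for addr, percent in ranking:
    print(f"- {addr}:  {percent:.2f}%")
```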
  • In my arbitrage mechanism I do not check for arbitrage setups between all tokens: instead I rely on swaps with major tokens, which I call “quote tokens”. These are the tokens that are used most often across all the available pairs.
  • So first we need to find those quote tokens (by default collecting only the top 10 of them). In Jupyter:
    # Retrieve the quote tokens:
    allpairs = db.get_all_pairs()
    npairs = len(allpairs)
    print(f"Num pairs: {npairs}")
    
    tokens = {}
    tot_count = npairs*2
    
    for pair in allpairs:
        t0addr = pair[2]
        t1addr = pair[3]
        tokens[t0addr] = tokens.get(t0addr,0)+1
        tokens[t1addr] = tokens.get(t1addr,0)+1
        
    arr = [ (key,val) for key,val in tokens.items()]
    arr.sort(key=lambda item: item[1], reverse=True)
    # arr[:30]
    arr = arr[:10]
    
    quotes = []
    for item in arr:
        token = chain.get_token(item[0])
        quotes.append([item[0], token.symbol(), 100.0*item[1]/tot_count])
    
    quotes
  • And this gives us the results:
    Num pairs: 231088
    [['0xbb4CdB9CBd36B01bD1cBaEBF2De08d9173bc095c', 'WBNB', 45.35523263864848],
     ['0xe9e7CEA3DedcA5984780Bafc599bD69ADd087D56', 'BUSD', 1.9501228969050752],
     ['0x55d398326f99059fF775485246999027B3197955', 'USDT', 0.818952087516444],
     ['0x0E09FaBB73Bd3Ade0a17ECC321fD13a19e81cE82', 'Cake', 0.2633196011908883],
     ['0x2170Ed0880ac9A755fd29B2688956BD959F933F8', 'ETH', 0.10537111403448036],
     ['0x7130d2A12B9BCbFAe4f2634d864A1Ee1Ce3Ead9c', 'BTCB#2', 0.07875787578757876],
     ['0x8AC76a51cc950d9822D68b83fE1Ad97B32Cd580d', 'USDC', 0.07507962334695008],
     ['0x8076C74C5e3F5852037F31Ff0093Eeb8c8ADd8D3',
      'SAFEMOON',
      0.06382849823443883],
     ['0x603c7f932ED1fc6575303D8Fb018fDCBb0f39a95',
      'BANANA#2',
      0.06296302707193796],
     ['0xE02dF9e3e622DeBdD69fb838bB799E3F168902c5', 'BAKE', 0.05452468323755452]]
  • In the list above, we see that the symbols we use for BTCB and BANANA (currently stored as BTCB#2 and BANANA#2) should be renamed in our database: let's implement support for that.
  • ⇒ I added this command in the BlockchainManager class:
            if cmd == 'swap-symbols':
                chain_name = self.get_param("chain")
                chain: EVMBlockchain = self.get_component(f"{chain_name}_chain")
    
                sym1 = self.get_param("symbol1")
                sym2 = self.get_param("symbol2")
                self.swap_token_symbols(chain, sym1, sym2)
                return True
  • And now we can swap the symbols for BTCB and BTCB#2:
    nvp bchain swap-symbols -c bsc BTCB BTCB#2
  • Oh crap… this doesn't work because we are violating a constraint in the process 😅:
    $ nvp bchain swap-symbols -c bsc BTCB BTCB#2
    Traceback (most recent call last):
      File "D:\Projects\NervHome\nvh\crypto\blockchain\blockchain_manager.py", line 171, in <module>
        comp.run()
      File "D:\Projects\NervProj\nvp\nvp_component.py", line 69, in run
        res = self.process_command(cmd)
      File "D:\Projects\NervHome\nvh\crypto\blockchain\blockchain_manager.py", line 54, in process_command
        self.swap_token_symbols(chain, sym1, sym2)
      File "D:\Projects\NervHome\nvh\crypto\blockchain\blockchain_manager.py", line 31, in swap_token_symbols
        chain.get_db().update_tokens([desc1, desc2])
      File "D:\Projects\NervHome\nvh\crypto\blockchain\chain_db.py", line 240, in update_tokens
        self.execute(SQL_UPDATE_TOKEN, data, many=True, commit=True)
      File "D:\Projects\NervHome\nvh\crypto\blockchain\chain_db.py", line 95, in execute
        return self.sql_db.execute(*args, **kaargs)
      File "D:\Projects\NervHome\nvh\core\postgresql_db.py", line 60, in execute
        c.executemany(code, data)
    psycopg2.errors.UniqueViolation: duplicate key value violates unique constraint "tokens_symbol_key"
    DETAIL:  Key (symbol)=(BTCB#2) already exists.
    
  • So testing with an updated version now, which first parks token 1 under a temporary symbol name:
        def swap_token_symbols(self, chain, sym1, sym2):
            """Swap the symbols for 2 tokens"""
            desc1 = chain.get_token_desc(symbol=sym1)
            desc2 = chain.get_token_desc(symbol=sym2)
            self.check(desc1 is not None, "Cannot find token with symbol %s", sym1)
            self.check(desc2 is not None, "Cannot find token with symbol %s", sym2)
    
            # Rename token 1 with temp symbol name:
            desc1['symbol'] = "<temp_symbol_name>"
            chain.get_db().update_token(desc1)
    
            # Renaming token2:
            logger.info("Renaming %s to %s...", sym2, sym1)
            desc2['symbol'] = sym1
            chain.get_db().update_token(desc2)
            # Renaming token1:
            logger.info("Renaming %s to %s...", sym1, sym2)
            desc1['symbol'] = sym2
            chain.get_db().update_token(desc1)
            logger.info("Done.")
  • This time it works:
    $ nvp bchain swap-symbols -c bsc BTCB BTCB#2
    2022/05/30 08:32:16 [nvh.crypto.blockchain.chain_db] INFO: Updating token row: ('BTCB Token', '<temp_symbol_name>', 18, None, None, 0, '0x9621579a532C9052B1F3CE8e3022716120Ba4187')
    2022/05/30 08:32:16 [__main__] INFO: Renaming BTCB#2 to BTCB...
    2022/05/30 08:32:16 [nvh.crypto.blockchain.chain_db] INFO: Updating token row: ('BTCB Token', 'BTCB', 18, None, None, 0, '0x7130d2A12B9BCbFAe4f2634d864A1Ee1Ce3Ead9c')
    2022/05/30 08:32:17 [__main__] INFO: Renaming BTCB to BTCB#2...
    2022/05/30 08:32:17 [nvh.crypto.blockchain.chain_db] INFO: Updating token row: ('BTCB Token', 'BTCB#2', 18, None, None, 0, '0x9621579a532C9052B1F3CE8e3022716120Ba4187')
    2022/05/30 08:32:17 [__main__] INFO: Done.
  • Doing the same for BANANA (but with less output):
    $ nvp bchain swap-symbols -c bsc BANANA BANANA#2
    2022/05/30 08:34:44 [__main__] INFO: Renaming BANANA#2 to BANANA...
    2022/05/30 08:34:44 [__main__] INFO: Renaming BANANA to BANANA#2...
    2022/05/30 08:34:44 [__main__] INFO: Done.
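  • To see why the direct swap hits the unique constraint while the 3-step version goes through, here is a small self-contained reproduction. Big caveat: it uses an in-memory SQLite database instead of PostgreSQL, and the table layout is a simplified assumption (the real tokens table has more columns), so take it as an illustration of the idea only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Simplified schema (assumption): the symbol column must be unique.
conn.execute("CREATE TABLE tokens (address TEXT PRIMARY KEY, symbol TEXT UNIQUE)")
conn.executemany(
    "INSERT INTO tokens VALUES (?, ?)",
    [
        ("0x9621579a532C9052B1F3CE8e3022716120Ba4187", "BTCB"),
        ("0x7130d2A12B9BCbFAe4f2634d864A1Ee1Ce3Ead9c", "BTCB#2"),
    ],
)


def rename(address, symbol):
    """Update the symbol of a single token row."""
    conn.execute("UPDATE tokens SET symbol = ? WHERE address = ?", (symbol, address))


# Renaming token 1 directly collides with the symbol still held by token 2:
try:
    rename("0x9621579a532C9052B1F3CE8e3022716120Ba4187", "BTCB#2")
    direct_rename_failed = False
except sqlite3.IntegrityError:
    direct_rename_failed = True

# Going through a temporary symbol never creates a duplicate:
rename("0x9621579a532C9052B1F3CE8e3022716120Ba4187", "<temp_symbol_name>")
rename("0x7130d2A12B9BCbFAe4f2634d864A1Ee1Ce3Ead9c", "BTCB")
rename("0x9621579a532C9052B1F3CE8e3022716120Ba4187", "BTCB#2")
conn.commit()

symbols = dict(conn.execute("SELECT address, symbol FROM tokens"))
print(direct_rename_failed, symbols)
```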
  • Now checking the quote tokens again: all OK.
  • ⇒ Among those quote tokens, WBNB alone represents a large 45.36% of all the pair legs, while each of the other quote tokens in the list above accounts for less than 2%.

  • Now that we can get a list of valid quote tokens, the next step is to filter the full list of pairs, eliminating the pairs that are too exotic (those not using any of our accepted quote tokens):
    # Filter all pairs:
    qaddrs = [item[0] for item in quotes]
    
    valid_pairs = []
    
    for pair in allpairs:
        if pair[2] in qaddrs or pair[3] in qaddrs:
            valid_pairs.append(pair)
    
    print(f"quote filter: kept {len(valid_pairs)} on {len(allpairs)} pairs")
  • And we still get quite a lot of pairs from that filtering 👍:
    quote filter: kept 227846 on 233463 pairs
  • To collect the pairs that we can use to build arbitrage setups (i.e. token pairs that are listed at least twice), I can use the following script:
    tokens_map = {}
    for pair in valid_pairs:
        t0 = pair[2]
        t1 = pair[3]
        t0, t1 = chain.sort_tokens(t0, t1)
        key = f"{t0}_{t1}"
        if key not in tokens_map:
            tokens_map[key] = [pair]
        else:
            tokens_map[key].append(pair)
    
    # And we keep only the token pairs where we have at least 2 pairs:
    compat_pairs = {}
    count = 0
    for key, plist in tokens_map.items():
        npairs = len(plist)
        if npairs < 2:
            continue
        # if npairs > 2:
        #     print(f"Found {npairs} compat pairs.")
    
        for i in range(npairs):
            others = plist.copy()
            del others[i]
            compat_pairs[plist[i]] = others 
            count += 1
    
    print(f"Found {count} arb compatible pairs")
  • ⇒ Currently I can find 53156 such pairs, which is quite good already.
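  • The grouping idea can be tried out on toy data. In this sketch the rows mimic the layout used above (pair address at index 1, token0 at index 2, token1 at index 3), but the addresses are placeholders, and sort_tokens is my own stand-in for chain.sort_tokens: I'm assuming it orders the two addresses the way Uniswap V2 does (smallest address first).

```python
from collections import defaultdict


def sort_tokens(t0, t1):
    """Order two token addresses (assumption: smallest hex address first)."""
    return (t0, t1) if t0.lower() < t1.lower() else (t1, t0)


def group_arb_pairs(pairs):
    """Group pair rows by sorted token pair, keeping groups with 2+ pairs."""
    groups = defaultdict(list)
    for pair in pairs:
        t0, t1 = sort_tokens(pair[2], pair[3])
        groups[f"{t0}_{t1}"].append(pair)
    return {key: plist for key, plist in groups.items() if len(plist) >= 2}


# Placeholder rows: (id, pair address, token0, token1)
toy_pairs = [
    (1, "0xP1", "0xA", "0xB"),
    (2, "0xP2", "0xB", "0xA"),  # same tokens, listed on another exchange
    (3, "0xP3", "0xA", "0xC"),  # only one pair for these tokens: dropped
]

arb_groups = group_arb_pairs(toy_pairs)
print(arb_groups)
```

    Keying the result by token pair name (rather than by pair row) makes it easy to look up all the candidate pairs from the tokens found in a swap path.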
  • The ArbitrageManager class will be responsible for finding those arbitrage setups, so it should be able to perform all the steps described above internally.
  • Here is an initial minimal version of that component:
    """arbitrage manager class"""
    
    import logging
    # import signal
    # import sys
    import time
    
    from nvp.nvp_component import NVPComponent
    from nvp.nvp_context import NVPContext
    
    from nvh.crypto.blockchain.evm_blockchain import EVMBlockchain
    
    # from nvp.nvp_object import NVPCheckError
    
    
    logger = logging.getLogger(__name__)
    
    
    # def signal_handler(_sig, _frame):
    #     """Handler for ctrl+c signal"""
    #     logger.info("Ctrl+C pressed, exiting.")
    #     sys.exit(0)
    
    
    # signal.signal(signal.SIGINT, signal_handler)
    
    
    class ArbitrageManager(NVPComponent):
        """ArbitrageManager component class"""
    
        def __init__(self, ctx):
            """ArbitrageManager base constructor"""
            NVPComponent.__init__(self, ctx)
            self.arb_pairs = None
            self.chain = None
            self.routers = None
            self.quote_tokens = None
            self.native_address = None
            self.ignore_dual_quote_pairs = True
            self.last_block_number = 0
    
        def collect_quote_tokens(self):
            """Collect the best quote tokens from the current pairs"""
    
        def monitor_arbitrages(self, chain: EVMBlockchain):
            """Method used to monitor the available arbitrage setups."""
    
            # Prepare the monitoring context:
            self.chain = chain
            self.arb_pairs, self.quote_tokens = chain.collect_arb_compatible_pairs()
            self.routers = chain.get_router_addresses()
            self.native_address = chain.get_wrapped_native_token().address()
            logger.info("wrapped native token address: %s", self.native_address)
            logger.info("Collected %d arb pairs", len(self.arb_pairs))
    
            result = True
            while result:
                try:
                    result = self.handle_arbitrage_setups()
                except KeyboardInterrupt:
                    logger.info("Stopping arbitrage monitoring on %s", chain.get_short_name())
                    break
    
        def handle_arbitrage_setups(self):
            """Method used to find and handle arbitrage setups"""
    
            # Collect the latest block:
            bck = self.chain.get_block("latest", full_tx=True)
    
            # Get the block number:
            bnum = bck["number"]
            if bnum == self.last_block_number:
                # logger.info("Waiting for new block...")
                time.sleep(0.1)
                return True
    
            logger.info("Processing block %s", bnum)
            self.last_block_number = bnum
    
            # Get all transactions:
            transactions = bck["transactions"]
    
            pairs = self.collect_potential_arb_pairs(transactions)
            npairs = len(pairs)
            if npairs > 0:
                logger.info("Collected %d potential arb pairs: %s", npairs, pairs)
    
            return True
    
        def parse_input_data(self, inputdat):
            """Parse the input data from a given transaction"""
            sig = inputdat[:10]
            regs = inputdat[10:]
            # the remaining "regs" length should be a multiple of 64 chars
            lenregs = len(regs)
            self.check(lenregs % 64 == 0, "Invalid register length: %s", regs)
    
            return {"sig": sig, "regs": [regs[i : i + 64] for i in range(0, lenregs, 64)]}
    
        def collect_potential_arb_pairs(self, txlist):
            """Collect the potential arb pairs from a list of transactions"""
            pairs = set()
            for txn in txlist:
                # Check what kind of transaction this was based on the 'to' value
                # and 'input' value:
                # logger.info("Parsing transaction: %s", txn)
                if txn["to"] not in self.routers:
                    # We ignore this transaction
                    continue
    
                # We should check that transaction:
                input_data = self.parse_input_data(txn["input"])
                # logger.info("Collected input data: %s", self.pretty_print(input_data))
    
                # Now we can collect the pairs involved in this transaction depending on the method signature:
                # try:
                tx_pairs = self.parse_input_pairs(input_data, txn["hash"])
                # except NVPCheckError as err:
                #     logger.error("Error while processing input data from %s", txn["hash"])
                #     raise err
    
                if tx_pairs is not None:
                    pairs = pairs.union(tx_pairs)
    
            return list(pairs)
    
        def parse_input_pairs(self, input_data, txh):
            """Parse the input data to extract the involved pairs"""
            regs = input_data["regs"]
            sig = input_data["sig"]
            tokens = None
    
            if sig in ["0x7ff36ab5", "0xb6f9de95", "0xfb3bdb41"]:
                self.check(int(regs[1], 16) == 8 * 16, "Invalid value for regs[1]: %s", regs[1])
                # get the number of tokens:
                ntokens = int(regs[4], 16)
                self.check(len(regs) == 5 + ntokens, "Invalid number of tokens: %s (hash=%s)", regs, txh)
                # Get each token:
                tokens = [self.chain.to_checksum_address(regs[5 + i][-40:]) for i in range(ntokens)]
    
            if tokens is None:
                # Nothing by default:
                return None
    
            # now we construct the pair names:
            pnames = []
            for i in range(ntokens - 1):
                t0, t1 = tokens[i], tokens[i + 1]
                if t0 == t1:
                    logger.error("Detected same token (%s) in pair %d from %s", t0, i, txh)
                    continue
    
                t0, t1 = self.chain.sort_tokens(t0, t1)
    
                pair_name = f"{t0}_{t1}"
    
                # We only keep that pair if it is part of the arbitrage compatible pairs:
                if pair_name not in self.arb_pairs:
                    continue
    
                # If both tokens are quote tokens we should ignore the pair: we will probably not get
                # any valid arb setup on these:
                if self.ignore_dual_quote_pairs and t0 in self.quote_tokens and t1 in self.quote_tokens:
                    logger.debug("Ignoring double quote pair %s", pair_name)
                    continue
    
                # Otherwise we keep that pair name:
                pnames.append(pair_name)
    
            return set(pnames)
    
        def process_command(self, cmd):
            """Check if this component can process the given command"""
    
            if cmd == "monitor-arbs":
                chain_name = self.get_param("chain")
                chain: EVMBlockchain = self.get_component(f"{chain_name}_chain")
                self.monitor_arbitrages(chain)
                return True
    
            return False
    
    
    if __name__ == "__main__":
        # Create the context:
        context = NVPContext()
    
        # Add our component:
        comp = context.register_component("arb_man", ArbitrageManager(context))
    
        context.define_subparsers(
            "main",
            {
                "monitor-arbs": None,
            },
        )
    
        psr = context.get_parser("main.monitor-arbs")
        psr.add_argument(
            "-c",
            "--chain",
            dest="chain",
            type=str,
            default="bsc",
            help="Blockchain from where to monitor the arbitrage setups",
        )
    
        comp.run()
    
  • And with that code I can start collecting some arbitrage compatible pairs:
    $ nvp arbman monitor-arbs
    2022/06/01 22:26:39 [nvh.crypto.blockchain.evm_blockchain] INFO: Keeping 369459/377028 quotable pairs.
    2022/06/01 22:26:40 [nvh.crypto.blockchain.evm_blockchain] INFO: Found 33524 arb compatible pairs
    2022/06/01 22:26:40 [__main__] INFO: wrapped native token address: 0xbb4CdB9CBd36B01bD1cBaEBF2De08d9173bc095c
    2022/06/01 22:26:40 [__main__] INFO: Collected 33524 arb pairs
    2022/06/01 22:26:41 [__main__] INFO: Processing block 18320121
    2022/06/01 22:26:41 [__main__] INFO: Collected 1 potential arb pairs: ['0x9fD87aEfe02441B123c3c32466cD9dB4c578618f_0xbb4CdB9CBd36B01bD1cBaEBF2De08d9173bc095c']
    2022/06/01 22:26:42 [__main__] INFO: Processing block 18320122
    2022/06/01 22:26:42 [__main__] INFO: Collected 1 potential arb pairs: ['0x1CE0c2827e2eF14D5C4f29a091d735A204794041_0xbb4CdB9CBd36B01bD1cBaEBF2De08d9173bc095c']
    2022/06/01 22:26:45 [__main__] INFO: Processing block 18320123
    2022/06/01 22:26:45 [__main__] INFO: Collected 2 potential arb pairs: ['0x23396cF899Ca06c4472205fC903bDB4de249D6fC_0xe9e7CEA3DedcA5984780Bafc599bD69ADd087D56', '0xbb4CdB9CBd36B01bD1cBaEBF2De08d9173bc095c_0xc748673057861a797275CD8A068AbB95A902e8de']
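  • To make the "regs" layout used in parse_input_pairs more concrete, here is a round trip on hand-built call data for swapExactETHForTokens(amountOutMin, path, to, deadline). Each argument takes one 32-byte word (64 hex chars), and the dynamic path array is replaced by its byte offset (4 words = 128 = 8 * 16 bytes), followed at that offset by its length and its elements. The helper names are made up, the amounts and deadline are arbitrary, and the decoded addresses are left in lowercase since checksumming would need Keccak-256.

```python
def encode_word(value):
    """Encode an integer as one 32-byte ABI word (64 hex chars)."""
    return f"{value:064x}"


def build_swap_exact_eth_for_tokens(amount_out_min, path, to, deadline):
    """Build the call data of swapExactETHForTokens by hand."""
    words = [
        encode_word(amount_out_min),  # regs[0]
        encode_word(4 * 32),          # regs[1]: byte offset of the path array
        encode_word(int(to, 16)),     # regs[2]
        encode_word(deadline),        # regs[3]
        encode_word(len(path)),       # regs[4]: number of tokens in the path
    ]
    words += [encode_word(int(addr, 16)) for addr in path]
    return "0x7ff36ab5" + "".join(words)


def decode_path(input_hex):
    """Extract the token path from the call data built above."""
    body = input_hex[10:]
    if len(body) % 64 != 0:
        raise ValueError("Argument data is not a multiple of 32 bytes")
    regs = [body[i : i + 64] for i in range(0, len(body), 64)]
    start = int(regs[1], 16) // 32  # index of the word holding the array length
    ntokens = int(regs[start], 16)
    if len(regs) != start + 1 + ntokens:
        raise ValueError("Unexpected number of words")
    # An address is the last 20 bytes (40 hex chars) of its word:
    return ["0x" + regs[start + 1 + i][-40:] for i in range(ntokens)]


WBNB = "0xbb4CdB9CBd36B01bD1cBaEBF2De08d9173bc095c"
BUSD = "0xe9e7CEA3DedcA5984780Bafc599bD69ADd087D56"
RECIPIENT = "0xBD612a3f30dcA67bF60a39Fd0D35e39B7aB80774"

calldata = build_swap_exact_eth_for_tokens(1, [WBNB, BUSD], RECIPIENT, 1654120000)
decoded_path = decode_path(calldata)
print(decoded_path)
```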
    
  • Now that we can collect some potential arbitrage setup pairs, we need to retrieve all the reserves for those pairs, which should be easy and fast to do using the “DualReserves” smart contract I created some time ago for that: it accepts a list of pair addresses, and outputs the reserves for token0 and token1 in 2 separate arrays for all those pairs.
  • ⇒ Let's restore the usage of that contract now.
  • So we get the list of pair addresses from the “token_pairs” of interest to us with this kind of code:
            pair_tk_map = {}
            for tkpair in tkpairs:
                # That name must be in the global mapping:
                plist = self.arb_pairs[tkpair]
                for pdesc in plist:
                    paddr = pdesc[1]  # address is element 1 in this row.
                    pair_tk_map[paddr] = tkpair
    
            # Get the list of addresses:
            paddrs = list(pair_tk_map.keys())
            logger.debug("Should collect reserves for %d pairs: %s", len(paddrs), paddrs)
    
            # Now we call the DualReserves contract:
            pair_reserves, block_ref = self.get_all_reserves(paddrs)
            logger.info("Got the pair reserves: %s", pair_reserves)
            if block_ref != bnum:
                logger.info("Pair reserves block offset: %d", block_ref - bnum)
  • And we collect the reserves for each pair with the following call to the getDualReserves function of our internal DualReserves contract:
        def get_all_reserves(self, paddrs):
            """Retrieve the reserves for all the pairs provided as input"""
            # logger.info("Calling getDualReserves with paddrs=%s", paddrs)
            result = self.dual_reserves_sc.call_function("getDualReserves", paddrs)
            block_ref = result[0]
            reserves0 = result[1]
            reserves1 = result[2]
            reserves = {}
            for idx, paddr in enumerate(paddrs):
                reserves[paddr] = (reserves0[idx], reserves1[idx])
    
            return reserves, block_ref
  • This works just fine and gives us some valid outputs (and we don't seem to have any block offset for now, which is nice):
    2022/06/02 07:17:00 [__main__] INFO: Processing block 18330640
    2022/06/02 07:17:00 [__main__] INFO: Got the pair reserves: {'0x1494CB2920AF3b296556a2569D637ed36EE3E653': (6679798815876686757, 321226250680082), '0xa584A5805DDE848cb2
    00Da49a2A58174cCfd1Ba8': (56610244478243821, 1600579074), '0x038625dDCD10fde5253f1b3db84E53339D50f82F': (501060929594755052116933, 18729657199448284460), '0x438F9009A1a
    2A40301D4663366Af7eb7ffc91ff3': (2382367943914026, 179616631182), '0x756Ed14D42475b1EE7Cf1c5Ba035E3D1A9295061': (15935285709348273686573965, 337309236320016822568), '0x
    1f5C12a536d13689d0198B8df0F9637E15Bf8A29': (15272089060184523372197, 349075381400303126), '0x250B6610717Ef6F1F5e062d61FD5F3f075f201bF': (3334496309106401971007, 5333080
    855004205), '0xb5BcdAeDC04dBd0742fD9dC342dE7C427192Cf3c': (50801579006377274100, 88942202633920), '0x265E2B86Cff3d43e2cB732f1fD547775b03eD71C': (4959307302118341344, 16
    432324013437), '0x24E471d46596750BAAE5881CBB878c0ba929a288': (5186144545154995305313, 194895216166968), '0xc592F85794d80D8C048dF047E7091F74747D9B72': (14686598592882386
    211786, 561017969257450), '0xC0AFB6078981629F7eAe4f2ae93b6DBEA9D7a7e9': (12897197441848189284851144042, 463137737246621906914), '0x343e0c149cA3ba381F16485746B26FB81807a
    334': (1160295101461103435134530, 42751428133928641), '0x6b7b3523a6660a5fcE3c28E1536CC8dd8D57f7E0': (7012481874296233978981009795, 251414318937330880508)}
    2022/06/02 07:17:01 [__main__] INFO: Processing block 18330641
    2022/06/02 07:17:05 [__main__] INFO: Processing block 18330642
    2022/06/02 07:17:05 [__main__] INFO: Got the pair reserves: {'0xA0a414df3481c68b07B9FD809ee7Fa9f4d05A5d3': (157901004649148527, 206740412469435), '0x8CA3fF14A52b080C54A
    6d1a405eecA02959d39fE': (1128915779061263920786382, 2149602698120994035992), '0x81499F67774F8e5fc60d65F23cd799461b172BD3': (5321889577169921053, 6544234512333964371), '
    0x1A43917e018Fac283b520D1c3e3dc09072dCA343': (120449873172412, 119057808912149), '0xbB69c66E6005D810b13D65bA71ccfF5480AA1278': (5005476603695973, 3492139298165763), '0x
    A1f1AAe9aBd24C7C7fA08b86DD096710e02B3bc7': (574226955524916956552, 876692518047456226607), '0x0A147bf930ED01d92B9Cd83A87482020e66617e2': (4291124175, 217174449541635278
    1), '0x955631FEDea348D237F4FD772BD5bc81b4c651Bf': (4232821968, 2175965266826955625), '0xB450CBF17F6723Ef9c1bf3C3f0e0aBA368D09bF5': (3052235222724935529443704, 131071950
    94219077811599), '0xc232A127F9f57C2006bCABa8C9D813CE0811cf23': (53435982089657, 306540307770), '0x1C5041ca551e6C58381dC3f66500a477E11033A5': (204251654999022668, 106323
    5502407882), '0xc1b047cda56754eCA893cF0226274365EeE0b6D6': (838050760916017980147, 3529160662798421453), '0xfff58a50Fdde55F5B5626BB66dE476F63155E078': (7663654884354897
    9981354, 344099901419847577931)}
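  • Before computing any exact profit, a quick way to get a feel for these numbers is to compare the spot prices implied by the reserves of 2 pairs trading the same tokens. This is only a rough sketch with made-up reserves: it ignores swap fees and price impact, and the function names are hypothetical.

```python
from fractions import Fraction


def spot_price(reserve0, reserve1):
    """Price of token0 expressed in token1, ignoring fees."""
    return Fraction(reserve1, reserve0)


def price_gap_percent(reserves_a, reserves_b):
    """Relative price difference between 2 pairs trading the same tokens."""
    price_a = spot_price(*reserves_a)
    price_b = spot_price(*reserves_b)
    return float(abs(price_a - price_b) / min(price_a, price_b) * 100)


# Made-up (reserve0, reserve1) tuples for 2 pairs on the same tokens:
pair_a = (1_000_000, 2_000_000)
pair_b = (1_000_000, 2_200_000)

gap = price_gap_percent(pair_a, pair_b)
print(f"Price gap: {gap:.2f}%")
```

    A gap smaller than the combined swap fees of the 2 pairs cannot lead to a profitable setup, so this kind of check could serve as a cheap pre-filter.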
    
  • So now we should try all the possible combinations between 2 pairs from the same tk_pair to see if we indeed have an arbitrage setup 👍!
  • So here is the main function to compute the best possible arb setup given 2 pairs:
        def compute_arb_profit(self, p0, p1, pair_reserves, qtoken):
            """Compute the arbitrage profit between 2 pairs"""
            p0addr = p0[1]
            p1addr = p1[1]
    
            # We really expect to have the same t0 and t1 values:
            t0addr = p0[2]
            t1addr = p0[3]
            self.check(t0addr == p1[2], "Mismatch in token0 between %s and %s", p0addr, p1addr)
            self.check(t1addr == p1[3], "Mismatch in token1 between %s and %s", p0addr, p1addr)
    
            # Get the exchanges from the indices:
            dex0 = self.chain.get_exchange(p0[8])
            dex1 = self.chain.get_exchange(p1[8])
    
            # Get the reserves:
            r0, r1 = pair_reserves[p0addr]
            rin, rout = pair_reserves[p1addr]
    
            # Get the fee points:
            fp0 = dex0.get_swap_fee_points(p0addr)
            fp1 = dex1.get_swap_fee_points(p1addr)
    
            amount0, profit = dex0.compute_flash_loan_best_investment(r0, r1, fp0, rin, rout, fp1)
    
            if profit is None:
                return None, None
    
            logger.debug("raw profit: %s, amount0: %s", str(profit), str(amount0))
    
            # The profit is given in token1: if this is not the quote token, then we should convert
            # it back to token0:
    
            if qtoken == t0addr:
                self.check(t0addr in self.quote_tokens, "Invalid pair with no quote token")
    
                # When the quote token is the source token then we are going to refund only the minimal swap amount
                # to the initial pair and keep the difference from what we borrowed in source token.
    
                # the formula to compute the min swap is:
                # aSwapMin = params[4] + (rin * aRfd * 10000) / ((rout - aRfd) * (10000 - params[1]));
                # or:
                # aSwapMin = refoff + (rin * aRfd * 10000) / ((rout - aRfd) * (10000 - fp1));
                # with for aRfd:
                # uint256 aRfd = (uint256(r1) * amount0) / (uint256(r0) - amount0);
                # aRfd = params[4] + (aRfd * 10000)/(10000 - params[0])
                # So we can compute aRfd first:
                a_rfd = self.refund_offset + (int(r1) * int(amount0) * 10000) // ((int(r0) - int(amount0)) * (10000 - fp0))
                a_swap_min = self.refund_offset + (int(rin) * a_rfd * 10000) // ((int(rout) - a_rfd) * (10000 - fp1))
                profit = amount0 - a_swap_min
    
            # At this point the profit is always given in the quote token,
            # so we can report it and then select the best pair:
            return amount0, profit
  • Note: this function is not really complete yet: for instance, we are not taking into account that flash loans might not be possible on a given exchange, in which case the 2 pairs would have to be swapped. But anyway.
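  • The aRfd / aSwapMin formulas quoted in the comments above are both instances of the usual constant-product "amount in" formula, with the fee expressed in points out of 10000. Here is a worked example with made-up reserves. Assumptions on my side: 25 fee points on both pairs, a refund offset of 1 (the value of self.refund_offset is not shown here, 1 mirrors the "+1" rounding used in Uniswap V2's getAmountIn), and a borrowed amount picked by hand rather than computed by compute_flash_loan_best_investment.

```python
def get_amount_in(amount_out, reserve_in, reserve_out, fee_points, offset=1):
    """Input amount needed to get `amount_out` out of a constant-product pair."""
    if amount_out >= reserve_out:
        raise ValueError("Not enough liquidity in the pair")
    numerator = reserve_in * amount_out * 10000
    denominator = (reserve_out - amount_out) * (10000 - fee_points)
    return offset + numerator // denominator


def flash_arb_profit(amount0, pair0, pair1, fp0, fp1, offset=1):
    """Profit in token0 when borrowing `amount0` of token0 from pair0."""
    r0, r1 = pair0
    rin, rout = pair1
    # token1 amount we owe to pair0 for the borrowed token0:
    a_rfd = get_amount_in(amount0, r1, r0, fp0, offset)
    # token0 amount to swap on pair1 to receive a_rfd of token1:
    a_swap_min = get_amount_in(a_rfd, rin, rout, fp1, offset)
    return a_rfd, a_swap_min, amount0 - a_swap_min


# Made-up reserves: token1 is cheaper on pair1 than on pair0.
pair0 = (1_000_000, 2_000_000)
pair1 = (1_000_000, 2_200_000)

a_rfd, a_swap_min, profit = flash_arb_profit(10_000, pair0, pair1, 25, 25)
print(f"refund={a_rfd} token1, swap={a_swap_min} token0, profit={profit} token0")
```

    With these numbers we only need to swap part of the borrowed token0 to cover the refund in token1, and whatever is left of the borrowed amount is the profit.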
  • And then we select only the 2 pairs that will give us the best profit:
        def process_arb_setups(self, plist, pair_reserves):
            """Process the best arbitrage setup between any 2 pairs of a given pair list"""
    
            t0addr = plist[0][2]
            t1addr = plist[0][3]
            qaddr = t1addr if t1addr in self.quote_tokens else t0addr
    
            # best (amount0,profit,p0,p1)
            best_amount0 = 0
            best_profit = 0.0
            best_p0 = None
            best_p1 = None
    
            npairs = len(plist)
            for i in range(npairs - 1):
                for j in range(i + 1, npairs):
                    # Check for the arb setup between pairs i and j
                    p0 = plist[i]
                    p1 = plist[j]
    
                    am0, pval = self.compute_arb_profit(p0, p1, pair_reserves, qaddr)
    
                    if am0 is not None and pval > best_profit:
                        best_amount0 = am0
                        best_profit = pval
                        best_p0 = p0
                        best_p1 = p1
    
            # Check if we found something valid:
            if best_amount0 == 0:
                # Nothing found.
                return
    
            # Convert the profit to value:
            qtoken = self.chain.get_token(qaddr)
            best_profit = qtoken.to_value(best_profit)
    
            # Get the tokens:
            t0 = self.chain.get_token(t0addr)
            t1 = self.chain.get_token(t1addr)
            logger.info(
                "Found arb setup of %.4g %s with pairs on %s/%s", best_profit, qtoken.symbol(), t0.symbol(), t1.symbol()
            )
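  • As a side note, the double loop over `i` and `j` above visits every unordered couple of pairs exactly once, which is what `itertools.combinations` does too. Here is a small sketch of the same selection logic written that way. The function name `select_best_setup` and the toy `toy_profit` callback are hypothetical, they just stand in for `compute_arb_profit`:

```python
from itertools import combinations


def select_best_setup(pairs, profit_fn):
    """Return (amount, profit, p0, p1) for the most profitable couple, or None.

    profit_fn(p0, p1) must return (amount, profit), with amount set to None
    when no arbitrage is possible between the two pairs.
    """
    best = None
    for p0, p1 in combinations(pairs, 2):
        amount, profit = profit_fn(p0, p1)
        if amount is None or profit <= 0:
            continue
        if best is None or profit > best[1]:
            best = (amount, profit, p0, p1)
    return best


# Toy data (not real prices): the profit is just the price gap when it is
# large enough to be worth anything.
prices = {"A": 100, "B": 103, "C": 99}


def toy_profit(p0, p1):
    gap = abs(prices[p0] - prices[p1])
    return (1000, gap) if gap > 2 else (None, 0)


print(select_best_setup(list(prices), toy_profit))
```

    Returning `None` when nothing is found plays the same role as the `best_amount0 == 0` check in the original method.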
  • And with that we already get some non-crazy outputs:
    2022/06/02 09:48:44 [__main__] INFO: Processing block 18333628
    2022/06/02 09:48:44 [__main__] INFO: Found arb setup of 0.01108 USDT with pair USDT/LIZ#2
    2022/06/02 09:48:44 [__main__] INFO: Found arb setup of 0.03424 USDT with pair LUNA#13/USDT
    2022/06/02 09:48:47 [__main__] INFO: Processing block 18333629
    2022/06/02 09:48:47 [__main__] INFO: Found arb setup of 1.918e-05 WBNB with pair DOGE#5/WBNB
    2022/06/02 09:48:49 [__main__] INFO: Processing block 18333630
    2022/06/02 09:48:49 [__main__] INFO: Found arb setup of 4.941e-08 WBNB with pair WBNB/GNT#4
    2022/06/02 09:48:50 [__main__] INFO: Found arb setup of 0.05885 BUSD with pair ADA/BUSD
    2022/06/02 09:48:50 [__main__] INFO: Found arb setup of 0.001192 WBNB with pair HotDoge/WBNB
    2022/06/02 09:48:50 [__main__] INFO: Found arb setup of 0.001166 USDT with pair SHIB#34/USDT
    2022/06/02 09:48:50 [__main__] INFO: Found arb setup of 0.0001915 WBNB with pair WBNB/BabyDoge
    2022/06/02 09:48:53 [__main__] INFO: Processing block 18333631
    2022/06/02 09:48:53 [__main__] INFO: Found arb setup of 8.468e-06 WBNB with pair PIT/WBNB
    2022/06/02 09:48:54 [__main__] INFO: Found arb setup of 0.1356 BUSD with pair UST/BUSD
    2022/06/02 09:48:55 [__main__] INFO: Processing block 18333632
    
  • Except that, of course, none of those arbitrages are profitable once we account for the gas fees we need to pay for them 😂.
  • Eventually I got this error reported:
    2022/06/02 10:42:04 [__main__] INFO: Processing block 18334695
    Traceback (most recent call last):
      File "D:\Projects\NervHome\nvh\crypto\blockchain\arbitrage_manager.py", line 408, in <module>
        "monitor-arbs": None,
      File "D:\Projects\NervProj\nvp\nvp_component.py", line 73, in run
        res = self.process_command(cmd)
      File "D:\Projects\NervHome\nvh\crypto\blockchain\arbitrage_manager.py", line 378, in process_command
        logger.debug("Ignoring double quote pair %s", pair_name)
      File "D:\Projects\NervHome\nvh\crypto\blockchain\arbitrage_manager.py", line 82, in monitor_arbitrages
        try:
      File "D:\Projects\NervHome\nvh\crypto\blockchain\arbitrage_manager.py", line 106, in handle_arbitrage_setups
    
      File "D:\Projects\NervHome\nvh\crypto\blockchain\arbitrage_manager.py", line 280, in collect_potential_arb_pairs
        def collect_potential_arb_pairs(self, txlist):
      File "D:\Projects\NervHome\nvh\crypto\blockchain\arbitrage_manager.py", line 262, in parse_input_data
        reserves0 = result[1]
      File "D:\Projects\NervProj\nvp\nvp_object.py", line 69, in check
        raise NVPCheckError(fmt % args)
    nvp.nvp_object.NVPCheckError: Invalid register length: 00000000000000000000000000000000000000000000000000000039d5f8fac00000000000000000000000000000000000000000000000000
    058785d816202d900000000000000000000000000000000000000000000000000000000000000a0000000000000000000000000380264831e98cd8f4c1b309d13d0b2f6f9f35da50000000000000000000000000
    000000000000000000000000000000062988a980000000000000000000000000000000000000000000000000000000000000002000000000000000000000000a8d22026119bc8a33941853614148fda61971d4d0
    00000000000000000000000bb4cdb9cbd36b01bd1cbaebf2de08d9173bc095c88 (tx: 0xb2c5192c2b6f7a17cfa303da3f02fe8c44c6bf6d4ef10369efaceb8089b4e63f)
    
  • And checking the transaction, it looks like there is an additional '88' byte at the end of the input data that is not related to any of the arguments? I'm really wondering how this could happen…
  • ⇒ So in this case, we should probably discard that final part of the input (?).
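  • A possible way to do that, as a rough sketch: assuming the input is a 4-byte method selector followed by a whole number of 32-byte words (which is how standard ABI encoding lays out the arguments), anything left over at the end can be split off before decoding. The helper name `split_calldata` is hypothetical and this is not the actual `parse_input_data` code:

```python
def split_calldata(input_hex):
    """Split hex calldata into (selector, list of 32-byte words, trailing hex)."""
    data = input_hex[2:] if input_hex.startswith("0x") else input_hex
    selector, payload = data[:8], data[8:]

    # Number of hex characters that do not fit in a whole 64-character word:
    extra = len(payload) % 64
    trailing = payload[len(payload) - extra:]
    payload = payload[:len(payload) - extra]

    words = [payload[i:i + 64] for i in range(0, len(payload), 64)]
    return selector, words, trailing


# Dummy selector and two empty words, followed by a stray '88' byte:
sel, words, trailing = split_calldata("0xaabbccdd" + "00" * 64 + "88")
print(sel, len(words), trailing)
```

    With something like this the register length check could be applied to `words` only, and the trailing part logged instead of raising an error.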
  • Naturally, our next step should then be to filter out all those arb setups that are not profitable anyway.
  • Basically, I have a fixed gas cost of about 0.001 BNB when executing a (successful) arbitrage setup (and about 0.0001885 BNB when the setup fails)
  • ⇒ So I would like to only keep the setups that are at least somewhat above that mean cost level.
  • Which means that I need to get the value of all quote tokens into WBNB before we start the arbitrage monitoring.
  • OK, so I have now added support to retrieve the value of the quote tokens in the native token:
        def update_quote_token_values(self):
            """Update the value of the quote tokens in the native currency"""
            self.quote_token_values = {}
    
            dex = self.chain.get_default_exchange()
    
            native_sym = self.chain.get_wrapped_native_token().symbol()
    
            for addr in self.quote_tokens:
                val = dex.get_quote(1.0, addr, native_sym)
                self.quote_token_values[addr] = val
                logger.info("Quote token %s value: %.6g BNB", self.chain.get_token_symbol(addr), val)
  • Which currently gives us:
    2022/06/02 11:38:25 [__main__] INFO: wrapped native token address: 0xbb4CdB9CBd36B01bD1cBaEBF2De08d9173bc095c
    2022/06/02 11:38:25 [__main__] INFO: Collected 37525 arb pairs
    2022/06/02 11:38:25 [__main__] INFO: Quote token WBNB value: 1 BNB
    2022/06/02 11:38:26 [__main__] INFO: Quote token BUSD value: 0.00329429 BNB
    2022/06/02 11:38:26 [__main__] INFO: Quote token USDT value: 0.00329267 BNB
    2022/06/02 11:38:26 [__main__] INFO: Quote token Cake value: 0.014148 BNB
    2022/06/02 11:38:26 [__main__] INFO: Quote token ETH value: 5.98795 BNB
    2022/06/02 11:38:26 [__main__] INFO: Quote token BTCB value: 98.3489 BNB
    2022/06/02 11:38:27 [__main__] INFO: Quote token USDC value: 0.00329528 BNB
    2022/06/02 11:38:27 [__main__] INFO: Quote token SAFEMOON value: 3.54682e-10 BNB
    2022/06/02 11:38:27 [__main__] INFO: Quote token DAI value: 0.00328415 BNB
    2022/06/02 11:38:27 [__main__] INFO: Quote token BANANA value: 0.000508487 BNB
    
  • So next, we need to convert the qtoken profit to BNB profit value in all cases when checking the arbitrage setups, which is a simple multiplication now:
            # Convert the profit to value:
            qtoken = self.chain.get_token(qaddr)
            best_profit = qtoken.to_value(best_profit)
    
            # Convert the qtoken profit value into native (wrapped) token value:
            best_profit *= self.quote_token_values[qaddr]
    
            # Get the tokens:
            t0 = self.chain.get_token(t0addr)
            t1 = self.chain.get_token(t1addr)
            logger.info(
                "Found arb setup of %.4g %s with pairs on %s/%s", best_profit, self.native_symbol, t0.symbol(), t1.symbol()
            )
  • And now we can filter the arbitrage setups with a minimal profit value yeah 🥳!:
            # Convert the profit to value:
            qtoken = self.chain.get_token(qaddr)
            best_profit = qtoken.to_value(best_profit)
    
            # Convert the qtoken profit value into native (wrapped) token value:
            best_profit *= self.quote_token_values[qaddr]
    
            if best_profit < self.min_profit:
                # Profit too low, so we ignore this arb setup.
                return
    
            # Get the tokens:
            t0 = self.chain.get_token(t0addr)
            t1 = self.chain.get_token(t1addr)
            logger.info(
                "Block %d: Found arb setup of %.4g %s with pairs on %s/%s",
                self.last_block_number,
                best_profit,
                self.native_symbol,
                t0.symbol(),
                t1.symbol(),
            )
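  • To get a feeling for what this filter does, here is a small worked example reusing a few of the numbers logged earlier in this post. The function names are hypothetical, the table is keyed by symbol instead of address just for readability, and the quote token values were logged a bit later than the setups themselves, so this is only illustrative:

```python
# Minimal profit in native token, as reported in the monitor output:
MIN_PROFIT = 0.0012

# Quote token values in BNB, taken from the log shown earlier:
QUOTE_VALUES = {"WBNB": 1.0, "BUSD": 0.00329429, "USDT": 0.00329267}


def to_native(profit, quote_symbol, quote_values=QUOTE_VALUES):
    """Convert a profit expressed in a quote token into native token value."""
    return profit * quote_values[quote_symbol]


def is_worth_it(profit, quote_symbol, min_profit=MIN_PROFIT):
    """Mirror of the filter above: keep the setup unless profit < min_profit."""
    return to_native(profit, quote_symbol) >= min_profit


# Some of the setups reported before the filter was added:
for profit, sym in [(0.1356, "BUSD"), (0.03424, "USDT"), (0.001192, "WBNB")]:
    print(f"{profit} {sym} -> {to_native(profit, sym):.4g} BNB, keep: {is_worth_it(profit, sym)}")
```

    Even the 0.1356 BUSD setup, which looked like the best one in the first run, is only worth about 0.00045 BNB, so it ends up well below the threshold.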
  • And from that I get the following kind of outputs:
    $ nvp arbman monitor-arbs
    2022/06/02 11:52:53 [nvh.crypto.blockchain.evm_blockchain] INFO: Keeping 409133/417141 quotable pairs.
    2022/06/02 11:52:54 [nvh.crypto.blockchain.evm_blockchain] INFO: Found 37562 arb compatible pairs
    2022/06/02 11:52:54 [__main__] INFO: wrapped native token address: 0xbb4CdB9CBd36B01bD1cBaEBF2De08d9173bc095c
    2022/06/02 11:52:54 [__main__] INFO: Collected 37562 arb pairs
    2022/06/02 11:52:54 [__main__] INFO: Min profit value: 0.001200
    2022/06/02 11:52:54 [__main__] INFO: Quote token WBNB value: 1 WBNB
    2022/06/02 11:52:54 [__main__] INFO: Quote token BUSD value: 0.00329354 WBNB
    2022/06/02 11:52:55 [__main__] INFO: Quote token USDT value: 0.00328904 WBNB
    2022/06/02 11:52:55 [__main__] INFO: Quote token Cake value: 0.0141319 WBNB
    2022/06/02 11:52:55 [__main__] INFO: Quote token ETH value: 5.99574 WBNB
    2022/06/02 11:52:56 [__main__] INFO: Quote token BTCB value: 98.3494 WBNB
    2022/06/02 11:52:56 [__main__] INFO: Quote token USDC value: 0.00329315 WBNB
    2022/06/02 11:52:56 [__main__] INFO: Quote token SAFEMOON value: 3.54682e-10 WBNB
    2022/06/02 11:52:56 [__main__] INFO: Quote token DAI value: 0.00327899 WBNB
    2022/06/02 11:52:56 [__main__] INFO: Quote token BANANA value: 0.000508487 WBNB
    2022/06/02 11:55:38 [__main__] INFO: Block 18336166: Found arb setup of 0.002436 WBNB with pairs on WBNB/QUACK#3
    2022/06/02 11:55:48 [__main__] INFO: Block 18336169: Found arb setup of 0.03963 WBNB with pairs on ASS#7/WBNB
    2022/06/02 11:56:20 [__main__] INFO: Block 18336180: Found arb setup of 0.001731 WBNB with pairs on WBNB/GINUX
    
  • Okay, so, this is all very interesting, but there are still a lot of steps to take before we can effectively do our first arbitrage transaction here lol.
  • Yet, as usual, this article is getting too big, so I will stop it here, and we will continue this journey in our next post ✌!
  • Last modified: 2022/06/02 11:47