# EVM Disassembler

A comprehensive Node.js library for disassembling and analyzing Ethereum Virtual Machine (EVM) bytecode. This tool provides detailed analysis including function detection, stack analysis, security checks, and multiple output formats.

## Features

- 🔍 **Complete Bytecode Decoding** - Decode all EVM opcodes with support for multiple EVM versions
- 🎯 **Function Detection** - Automatically detect function selectors and signatures
- 📊 **Stack Analysis** - Track stack depth changes and detect potential issues
- 🔒 **Security Analysis** - Identify dangerous opcodes and potential vulnerabilities
- 📝 **Multiple Output Formats** - Text, JSON, Assembly, Markdown, and CSV
- 🏷️ **Metadata Detection** - Extract compiler information and metadata
- ⚡ **Performance Optimized** - Efficient parsing and analysis algorithms
- 🔧 **Modular Architecture** - Use individual components as needed

## Installation

```bash
npm install @bcoders.gr/evm-disassembler
```

## Quick Start

```javascript
const { EVMDisassembler } = require('@bcoders.gr/evm-disassembler');

// Create disassembler instance
const disassembler = new EVMDisassembler();

// Example bytecode (simple contract)
const bytecode = '0x608060405234801561001057600080fd5b50610150806100206000396000f3fe...';

// Disassemble with full analysis
const result = disassembler.disassemble(bytecode);

// Format as text
console.log(disassembler.format(result, 'text'));

// Get JSON output
const jsonOutput = disassembler.format(result, 'json', { pretty: true });
```

## API Reference

### EVMDisassembler Class

#### Constructor

```javascript
new EVMDisassembler(options)
```

Options:

- `evmVersion` (string): EVM version to use ('homestead', 'byzantium', 'constantinople', 'istanbul', 'berlin', 'london', 'paris', 'shanghai', 'cancun', 'latest'). Default: 'latest'
- `includeMetadata` (boolean): Include metadata detection. Default: true
- `stopAtMetadata` (boolean): Stop disassembly at metadata boundary. Default: true
- `performStackAnalysis` (boolean): Perform stack depth analysis. Default: true
- `performFunctionAnalysis` (boolean): Detect functions and signatures. Default: true
- `performSecurityAnalysis` (boolean): Perform security checks. Default: true

#### Methods

##### disassemble(bytecode)

Perform complete disassembly with all analysis.

```javascript
const result = disassembler.disassemble(bytecode);
```

Returns an object containing:

- `instructions`: Array of decoded instructions
- `metadata`: Detected compiler metadata
- `functions`: Detected function signatures
- `stack`: Stack analysis results
- `security`: Security analysis findings
- `summary`: High-level summary

##### decode(bytecode)

Decode bytecode without analysis.

```javascript
const instructions = disassembler.decode(bytecode);
```

##### format(results, format, options)

Format disassembly results.

Supported formats:

- `'text'`: Human-readable text format
- `'json'`: JSON format
- `'assembly'` or `'asm'`: Assembly-like syntax
- `'markdown'` or `'md'`: Markdown documentation
- `'csv'`: CSV format

```javascript
const textOutput = disassembler.format(result, 'text');
const jsonOutput = disassembler.format(result, 'json', { pretty: true });
```

##### validate(bytecode)

Quick validation without full disassembly.

```javascript
const validation = disassembler.validate(bytecode);
if (validation.valid) {
  console.log('Bytecode is valid');
}
```

##### analyzeFunctionsWithPatterns(bytecode)

Get detailed function analysis including opcode patterns for comparison.

```javascript
const functionAnalysis = disassembler.analyzeFunctionsWithPatterns(bytecode);

// Access function patterns
functionAnalysis.functionsWithPatterns.forEach(func => {
  console.log(`Function ${func.selector} (${func.signature}):`);
  console.log(`  Pattern Hash: ${func.patternHash}`);
  console.log(`  Instructions: ${func.instructionCount}`);
  console.log(`  Opcode Pattern: ${func.opcodePattern.join(', ')}`);
  console.log(`  Stack Ops: ${JSON.stringify(func.stackOperations)}`);
  console.log(`  Storage Ops: ${JSON.stringify(func.storageOperations)}`);
});

// Find similar functions
if (functionAnalysis.patternComparisons.exactMatches.length > 0) {
  console.log('Functions with identical patterns:');
  functionAnalysis.patternComparisons.exactMatches.forEach(match => {
    console.log(`  Pattern ${match.patternHash}: ${match.count} functions`);
    match.functions.forEach(f => console.log(`    - ${f.selector}: ${f.signature}`));
  });
}

if (functionAnalysis.patternComparisons.similarFunctions.length > 0) {
  console.log('Functions with similar patterns:');
  functionAnalysis.patternComparisons.similarFunctions.forEach(comp => {
    console.log(`  ${comp.function1.selector} ~ ${comp.function2.selector} (${Math.round(comp.similarity * 100)}% similar)`);
  });
}
```

##### compareFunctionPatterns(bytecode)

Compare function patterns to find similarities.

```javascript
const comparison = disassembler.compareFunctionPatterns(bytecode);
console.log(`Found ${comparison.exactMatches.length} exact pattern matches`);
console.log(`Found ${comparison.similarFunctions.length} similar function pairs`);
```

### Convenience Functions

```javascript
const { disassemble, decode } = require('@bcoders.gr/evm-disassembler');

// Quick disassembly
const result = disassemble(bytecode);

// Quick decode
const instructions = decode(bytecode);
```

## Output Examples

### Text Format

```
PC   | OPCODE    | HEX | DATA
--------------------------------------------------
0    | PUSH1     | 60  | 0x80
2    | PUSH1     | 60  | 0x40
4    | MSTORE    | 52  |
5    | CALLVALUE | 34  |
6    | DUP1      | 80  |
7    | ISZERO    | 15  |
8    | PUSH2     | 61  | 0x0010
```

### Assembly Format

```assembly
    PUSH1 0x80
    PUSH1 0x40
    MSTORE
    CALLVALUE
    DUP1
    ISZERO
    PUSH2 0x0010 ; 16
    JUMPI
label_0:
    PUSH1 0x00
    DUP1
    REVERT
```

### JSON Format

```json
{
  "summary": {
    "totalInstructions": 150,
    "bytecodeSize": 336,
    "functionCount": 5,
    "securityScore": 85
  },
  "functions": [
    {
      "selector": "a9059cbb",
      "signature": "transfer(address,uint256)",
      "isKnown": true,
      "opcodePattern": ["PUSH", "CALLDATALOAD", "PUSH", "SHR", "DUP1", "PUSH", "EQ", "PUSH", "JUMPI"],
      "patternHash": "a1b2c3d4e5f6789a",
      "instructionCount": 42,
      "stackOperations": { "pushes": 15, "pops": 8, "dups": 3, "swaps": 2 },
      "storageOperations": { "loads": 2, "stores": 1 },
      "memoryOperations": { "loads": 1, "stores": 0 },
      "controlFlow": { "jumps": 3, "calls": 0, "returns": 1 }
    }
  ],
  "patternComparisons": {
    "exactMatches": [
      {
        "patternHash": "a1b2c3d4e5f6789a",
        "functions": [
          {"selector": "a9059cbb", "signature": "transfer(address,uint256)"},
          {"selector": "23b872dd", "signature": "transferFrom(address,address,uint256)"}
        ],
        "count": 2
      }
    ],
    "similarFunctions": [
      {
        "function1": {"selector": "a9059cbb", "signature": "transfer(address,uint256)"},
        "function2": {"selector": "095ea7b3", "signature": "approve(address,uint256)"},
        "similarity": 0.85,
        "similarityType": "high"
      }
    ]
  },
  "instructions": [
    { "pc": 0, "opcode": "PUSH1", "pushData": "0x80" }
  ]
}
```

## Advanced Usage

### Custom EVM Version

```javascript
const disassembler = new EVMDisassembler({ evmVersion: 'london' });
```

### Security-Focused Analysis

```javascript
const result = disassembler.disassemble(bytecode);

if (result.security.score < 70) {
  console.warn('Security issues detected:');
  result.security.potentialVulnerabilities.forEach(vuln => {
    console.warn(`- ${vuln.type}: ${vuln.description}`);
  });
}
```

### Function Detection Only

```javascript
const functions = disassembler.detectFunctions(bytecode);
console.log(`Found ${functions.totalFunctions} functions`);
functions.functions.forEach(func => {
  console.log(`- ${func.selector}: ${func.signature}`);
});
```

### Stack Analysis

```javascript
const stackAnalysis = disassembler.analyzeStack(bytecode);
console.log(`Max stack depth: ${stackAnalysis.maxDepth}`);
if (stackAnalysis.hasErrors) {
  console.error('Stack errors detected:', stackAnalysis.errors);
}
```

## Error Handling

```javascript
const { InvalidBytecodeError } = require('@bcoders.gr/evm-disassembler');

try {
  const result = disassembler.disassemble(bytecode);
} catch (error) {
  if (error instanceof InvalidBytecodeError) {
    console.error('Invalid bytecode:', error.message);
  } else {
    console.error('Disassembly failed:', error.message);
  }
}
```

## Pattern Analysis and Function Comparison

The disassembler includes pattern analysis to help identify similar functions and code reuse:

```javascript
const { EVMDisassembler } = require('@bcoders.gr/evm-disassembler');
const disassembler = new EVMDisassembler();

const analysis = disassembler.analyzeFunctionsWithPatterns(bytecode);

// Find functions with identical implementations
console.log('Identical Functions:');
analysis.patternComparisons.exactMatches.forEach(match => {
  console.log(`Pattern ${match.patternHash}:`);
  match.functions.forEach(func => {
    console.log(`  - ${func.selector}: ${func.signature}`);
  });
});

// Find functions with similar implementations
console.log('Similar Functions:');
analysis.patternComparisons.similarFunctions.forEach(pair => {
  const similarity = Math.round(pair.similarity * 100);
  console.log(`${pair.function1.signature} ≈ ${pair.function2.signature} (${similarity}% similar)`);
});

// Analyze function complexity
analysis.functionsWithPatterns.forEach(func => {
  console.log(`${func.signature}:`);
  console.log(`  Instructions: ${func.instructionCount}`);
  console.log(`  Stack Operations: ${func.stackOperations.pushes} pushes, ${func.stackOperations.pops} pops`);
  console.log(`  Storage Access: ${func.storageOperations.loads} reads, ${func.storageOperations.stores} writes`);
  console.log(`  External Calls: ${func.controlFlow.calls}`);
});
```

### Pattern Hash

Each function gets a pattern hash derived from its opcode sequence. Functions with identical hashes have the same implementation logic (ignoring specific values such as pushed constants).

### Similarity Scoring

The similarity algorithm compares opcode patterns using Levenshtein distance, normalized by sequence length:

- **0.9-1.0**: Very high similarity (likely copy/paste with minor changes)
- **0.7-0.9**: High similarity (similar logic, different implementations)
- **< 0.7**: Low similarity (filtered out by default)

## ERC20 Source Code Analysis

The disassembler also includes ERC20 contract analysis capabilities for Solidity source code:

### Basic ERC20 Analysis

```javascript
const { EVMDisassembler } = require('@bcoders.gr/evm-disassembler');
const disassembler = new EVMDisassembler();

// Analyze Solidity source code
const solidityCode = `
pragma solidity ^0.8.0;

contract MyToken {
    string public name = "MyToken";
    string public symbol = "MTK";

    // Private storage, so the names do not clash with the
    // balanceOf() and totalSupply() functions declared below.
    uint256 private _totalSupply = 1000000;
    mapping(address => uint256) private _balances;
    mapping(address => mapping(address => uint256)) public allowance;

    function transfer(address to, uint256 amount) public returns (bool) {
        // implementation
        return true;
    }

    function approve(address spender, uint256 amount) public returns (bool) {
        // implementation
        return true;
    }

    function transferFrom(address from, address to, uint256 amount) public returns (bool) {
        // implementation
        return true;
    }

    function balanceOf(address account) public view returns (uint256) {
        return _balances[account];
    }

    function totalSupply() public view returns (uint256) {
        return _totalSupply;
    }
}
`;

// Extract ERC20 data
const erc20Data = disassembler.extractERC20Data(solidityCode);

console.log('Contract Analysis:');
console.log(`Contract Name: ${erc20Data.extraction_summary.contract_name}`);
console.log(`Total Functions: ${erc20Data.extraction_summary.total_functions}`);
console.log(`Public Functions: ${erc20Data.extraction_summary.public_functions}`);
console.log(`Total Variables: ${erc20Data.extraction_summary.total_variables}`);
console.log(`Total Mappings: ${erc20Data.extraction_summary.total_mappings}`);

// List all functions
console.log('\nFunctions:');
erc20Data.functions.forEach(func => {
  console.log(`  ${func.name}(${func.parameters.join(', ')}) ${func.visibility} ${func.state_mutability}`);
});

// List all variables
console.log('\nState Variables:');
erc20Data.variables.forEach(variable => {
  console.log(`  ${variable.type} ${variable.visibility} ${variable.name}`);
});

// List all mappings
console.log('\nMappings:');
erc20Data.mappings.forEach(mapping => {
  console.log(`  ${mapping.full_type} ${mapping.visibility} ${mapping.name}`);
});
```

### Combined Bytecode + Source Analysis

```javascript
// Analyze both bytecode and source code together
const combinedAnalysis = disassembler.analyzeWithSource(bytecode, solidityCode);

console.log('Combined Analysis Results:');
console.log(`Has Source Code: ${combinedAnalysis.combined.hasSourceCode}`);
console.log(`Is ERC20: ${combinedAnalysis.combined.isERC20}`);
console.log(`Source Matches Bytecode: ${combinedAnalysis.combined.sourceMatchesBytecode.match}`);

if (combinedAnalysis.combined.sourceMatchesBytecode.match) {
  const comparison = combinedAnalysis.combined.sourceMatchesBytecode;
  console.log(`Match Percentage: ${comparison.matchPercentage}%`);
  console.log(`Common Functions: ${comparison.commonFunctions.join(', ')}`);

  if (comparison.missingInBytecode.length > 0) {
    console.log(`Functions in source but not in bytecode: ${comparison.missingInBytecode.join(', ')}`);
  }

  if (comparison.extraInBytecode.length > 0) {
    console.log(`Functions in bytecode but not in source: ${comparison.extraInBytecode.join(', ')}`);
  }
}

// Access both analyses
const bytecodeAnalysis = {
  functions: combinedAnalysis.functions,
  security: combinedAnalysis.security,
  patterns: combinedAnalysis.patterns
};
const sourceAnalysis = combinedAnalysis.sourceAnalysis;
```

### Individual Extraction Methods

```javascript
// Check if contract is ERC20
const isERC20 = disassembler.isERC20Contract(solidityCode);
console.log(`Is ERC20: ${isERC20}`);

// Extract only functions
const functions = disassembler.extractSourceFunctions(solidityCode);
functions.forEach(func => {
  console.log(`${func.name}: ${func.full_signature}`);
});

// Extract only variables
const variables = disassembler.extractSourceVariables(solidityCode);
variables.forEach(variable => {
  console.log(`${variable.type} ${variable.name}`);
});

// Extract only mappings
const mappings = disassembler.extractSourceMappings(solidityCode);
mappings.forEach(mapping => {
  console.log(`${mapping.name}: ${mapping.full_type}`);
});

// Extract contract name
const contractName = disassembler.extractContractName(solidityCode);
console.log(`Contract: ${contractName}`);
```

### Advanced Source Analysis Features

- **Function Extraction**: Captures visibility, state mutability, parameters, and return types
- **Variable Detection**: Identifies all state variables with their types and visibility
- **Mapping Analysis**: Extracts mapping structures with key-value type information
- **ERC20 Validation**: Automatically detects if source code implements ERC20 standard
- **Bytecode Comparison**: Compares source functions with detected bytecode functions
- **Comprehensive Reporting**: Provides detailed statistics and summaries

## 🚀 Latest Features & Improvements

### v2.0 - Advanced Pattern Analysis & ERC20 Support

#### 🎯 Function Pattern Analysis

- **Opcode Pattern Extraction**: Each detected function now includes its complete opcode pattern for comparison
- **Pattern Hashing**: Fingerprints that match when function implementations are identical
- **Similarity Detection**: Advanced algorithm to find functions with similar logic (70%+ similarity threshold)
- **Cross-Function Comparison**: Automatic detection of code reuse and similar implementations

#### 🪙 ERC20 Source Code Analysis

- **Complete ERC20 Detection**: Automatically identifies ERC20 contracts in Solidity source code
- **Function Extraction**: Detailed analysis of all functions with parameters, visibility, and state mutability
- **Variable Analysis**: Extraction of state variables with type and visibility information
- **Mapping Detection**: Comprehensive mapping structure analysis
- **Source-Bytecode Correlation**: Compare source code functions with detected bytecode functions

#### 🔍 Enhanced Analysis Capabilities

- **Pattern Comparison Engine**: Find duplicate and similar function implementations
- **Levenshtein Distance Algorithm**: Precise similarity scoring between opcode patterns
- **Security Pattern Detection**: Identify common security patterns (Ownable, Pausable, etc.)
- **Complexity Metrics**: Advanced code complexity analysis based on patterns

#### 📊 Improved Output Formats

- **Extended JSON Output**: Includes pattern data, similarity scores, and source analysis
- **Pattern Visualization**: Clear representation of function opcode patterns
- **Comparison Reports**: Detailed similarity analysis between functions

### New API Methods

```javascript
// Pattern analysis
const patterns = disassembler.analyzeFunctionsWithPatterns(bytecode);
const comparison = disassembler.compareFunctionPatterns(bytecode);

// ERC20 source analysis
const erc20Data = disassembler.extractERC20Data(sourceCode);
const isERC20 = disassembler.isERC20Contract(sourceCode);

// Combined analysis
const combined = disassembler.analyzeWithSource(bytecode, sourceCode);
```

### Example Use Cases

1. **Smart Contract Auditing**: Identify similar functions that might share vulnerabilities
2. **Code Reuse Detection**: Find copied implementations across different contracts
3. **Pattern-Based Security Analysis**: Detect common security patterns and anti-patterns
4. **Source Code Verification**: Check that the functions declared in source code match those detected in deployed bytecode
5. **Contract Classification**: Automatically identify token standards and contract types
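
### Illustrative Sketch: Decoding PUSH Immediates

The bytecode decoding described under Features boils down to walking the byte array and skipping over the immediate data that follows each `PUSH1`..`PUSH32` opcode. The sketch below is not the library's implementation: `decodeSketch` and its deliberately tiny `OPCODES` table are hypothetical names used for illustration only, and every opcode missing from the table is reported as unknown.

```javascript
// Hypothetical, heavily reduced opcode table; a real decoder covers every opcode.
const OPCODES = {
  0x00: 'STOP', 0x15: 'ISZERO', 0x34: 'CALLVALUE', 0x52: 'MSTORE',
  0x57: 'JUMPI', 0x5b: 'JUMPDEST', 0x80: 'DUP1', 0xfd: 'REVERT',
};

// Hypothetical helper: turn a hex string into { pc, opcode, pushData } records.
function decodeSketch(bytecode) {
  const hex = bytecode.startsWith('0x') ? bytecode.slice(2) : bytecode;
  if (hex.length % 2 !== 0 || /[^0-9a-fA-F]/.test(hex)) {
    throw new Error('Bytecode must be an even-length hex string');
  }
  const bytes = Buffer.from(hex, 'hex');
  const instructions = [];
  let pc = 0;
  while (pc < bytes.length) {
    const byte = bytes[pc];
    if (byte >= 0x60 && byte <= 0x7f) {
      // PUSH1..PUSH32 carry 1..32 bytes of immediate data after the opcode.
      const size = byte - 0x5f;
      const data = bytes.subarray(pc + 1, pc + 1 + size);
      instructions.push({ pc, opcode: `PUSH${size}`, pushData: '0x' + data.toString('hex') });
      pc += 1 + size;
    } else {
      const name = OPCODES[byte] || `UNKNOWN_0x${byte.toString(16).padStart(2, '0')}`;
      instructions.push({ pc, opcode: name });
      pc += 1;
    }
  }
  return instructions;
}

// First bytes of the Quick Start bytecode.
const decoded = decodeSketch('0x6080604052348015610010');
decoded.forEach(i => console.log(i.pc, i.opcode, i.pushData || ''));
```

For this input the sketch yields the same seven rows as the Text Format example: the program counter jumps from 0 to 2 to 4 because each `PUSH1` occupies two bytes.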
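
### Illustrative Sketch: Pattern Hashing

The Pattern Hash section says that functions with identical hashes share the same logic "ignoring specific values". One way to get that behaviour is to drop push data, collapse `PUSH1`..`PUSH32` to a generic `PUSH` (as the `opcodePattern` in the JSON example suggests), and hash the resulting sequence. The README does not state which hash function or normalization the library uses, so `normalizeOpcode`, `patternHashSketch`, the choice of SHA-256, and the 16-character truncation are all assumptions made for this example.

```javascript
const { createHash } = require('crypto');

// Assumption: only PUSHn opcodes are collapsed; everything else is kept verbatim.
function normalizeOpcode(opcode) {
  return /^PUSH\d+$/.test(opcode) ? 'PUSH' : opcode;
}

// Hypothetical helper: derive a short fingerprint from an instruction list.
// Push data is ignored, so constants do not influence the hash.
function patternHashSketch(instructions) {
  const pattern = instructions.map(i => normalizeOpcode(i.opcode));
  const hash = createHash('sha256')
    .update(pattern.join(','))
    .digest('hex')
    .slice(0, 16); // assumption: 16 hex characters, like the JSON example
  return { pattern, hash };
}

const first = patternHashSketch([
  { opcode: 'PUSH1', pushData: '0x04' },
  { opcode: 'CALLDATALOAD' },
  { opcode: 'PUSH1', pushData: '0xe0' },
  { opcode: 'SHR' },
]);

// Same shape, different constants and push widths.
const second = patternHashSketch([
  { opcode: 'PUSH2', pushData: '0x0024' },
  { opcode: 'CALLDATALOAD' },
  { opcode: 'PUSH1', pushData: '0xf0' },
  { opcode: 'SHR' },
]);

console.log(first.pattern.join(', '));
console.log(first.hash === second.hash); // true: only the opcode shape is hashed
```

Hashing the normalized sequence makes exact-match detection a simple grouping by hash, at the cost of treating functions that differ only in constants as identical.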
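
### Illustrative Sketch: Normalized Levenshtein Similarity

The Similarity Scoring section describes comparing opcode patterns with Levenshtein distance, normalized by sequence length. A minimal sketch of that idea follows. The function names `levenshtein` and `patternSimilarity` are hypothetical, and dividing by the length of the longer sequence is an assumption, since the README does not say which length is used.

```javascript
// Classic dynamic-programming Levenshtein distance over arrays of opcode names.
// Only two rows of the table are kept in memory at a time.
function levenshtein(a, b) {
  let prev = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const curr = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr[j] = Math.min(
        prev[j] + 1,        // deletion
        curr[j - 1] + 1,    // insertion
        prev[j - 1] + cost  // substitution or match
      );
    }
    prev = curr;
  }
  return prev[b.length];
}

// Assumption: normalize by the longer sequence so the score lies in [0, 1].
function patternSimilarity(a, b) {
  const longest = Math.max(a.length, b.length);
  return longest === 0 ? 1 : 1 - levenshtein(a, b) / longest;
}

const patternA = ['PUSH', 'CALLDATALOAD', 'PUSH', 'SHR', 'DUP1', 'PUSH', 'EQ', 'PUSH', 'JUMPI'];
const patternB = ['PUSH', 'CALLDATALOAD', 'PUSH', 'SHR', 'DUP1', 'PUSH', 'EQ', 'PUSH', 'JUMP'];

const score = patternSimilarity(patternA, patternB);
console.log(score.toFixed(2)); // 0.89: one substitution across nine opcodes
```

With the thresholds listed under Similarity Scoring, a score of 0.89 would fall in the "high similarity" band, just below the 0.9 cutoff for "very high".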