# Xatu Data

> Ethereum network data via Parquet files or ClickHouse database

## Choose Your Access Method

Xatu provides Ethereum network data through two primary access methods. Select the documentation that matches your needs:

### 🌐 Public Parquet Files (No Authentication)

**Best for:** Data scientists, researchers, anyone wanting free access to Ethereum data

👉 **See https://raw.githubusercontent.com/ethpandaops/xatu-data/refs/heads/master/llms/parquet/llms.txt** for complete Parquet documentation

- Public HTTP access, no credentials needed
- Python, R, SQL, DuckDB examples
- Daily updates with 1-3 day delay
- Privacy-conscious (some columns redacted)

📚 **Advanced Parquet usage:** https://raw.githubusercontent.com/ethpandaops/xatu-data/refs/heads/master/llms/parquet/llms-full.txt

---

### 🔒 ClickHouse Database (Authentication Required)

**Best for:** Real-time analysis, ethPandaOps partners, advanced users

👉 **See https://raw.githubusercontent.com/ethpandaops/xatu-data/refs/heads/master/llms/clickhouse/llms.txt** for complete ClickHouse documentation

- Direct database access with lower latency
- No redactions or delays
- Production and experimental endpoints
- Advanced query capabilities

📚 **Advanced ClickHouse usage:** https://raw.githubusercontent.com/ethpandaops/xatu-data/refs/heads/master/llms/clickhouse/llms-full.txt

---

## Quick Overview

### What is Xatu?

Xatu is a comprehensive Ethereum network data collection and processing pipeline:

- Multiple collection modules for different data types
- Data stored in ClickHouse database
- Public Parquet exports available
- Run by ethPandaOps with community contributions

### Available Data

- **Beacon API Event Stream** - Block/attestation timing from multiple sentries
- **Canonical Beacon/Execution** - Deduplicated, authoritative chain data
- **MEV Relay** - Block auction and builder data
- **P2P Network Events** - Consensus and execution layer propagation
- **CBT Tables** - Pre-aggregated analytics (ClickHouse only). These tables live in the `$network` databases.

### Networks

- Mainnet (production Ethereum)
- Holesky (testnet)
- Sepolia (testnet)
- Hoodi (testnet)
- Experimental networks such as devnets (via the experimental endpoint)

## ⚠️ Critical: Query Performance

**ALWAYS filter on partitioning columns** when querying:

- Datetime tables: Filter on date/time columns (e.g., `slot_start_date_time`)
- Integer tables: Filter on ranges (e.g., `block_number`)
- A query without a filter on the partitioning column scans billions of rows and will be very slow

## Getting Started

1. **Choose your access method** (Parquet or ClickHouse)
2. **Read the appropriate documentation** (links above)
3. **Check the table catalog** for available data and date ranges
4. **Start with small queries** on recent data (last 24 hours)
5. **Always use partition filters** for better performance

## Additional Resources

- **GitHub Repository**: https://github.com/ethpandaops/xatu-data
- **Schema Repository**: https://github.com/ethpandaops/xatu-data/tree/master/schema/clickhouse
- **Config File**: https://raw.githubusercontent.com/ethpandaops/xatu-data/master/config.yaml
- **Contact**: ethpandaops@ethereum.org

## License

Data licensed under CC BY 4.0
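
## Example: Partition Filter Sketch

To illustrate the partition-filter advice from the Query Performance section, here is a minimal Python sketch that builds SQL strings bounded on a partitioning column. The helper names (`time_window_query`, `block_range_query`) and the table name `example_table` are hypothetical and not part of Xatu; only the column names `slot_start_date_time` and `block_number` come from this document. The code only builds query text and does not connect to any endpoint, so check the access-method documentation linked above for real table names and the exact SQL dialect.

```python
from datetime import datetime, timedelta, timezone

TIME_FORMAT = "%Y-%m-%d %H:%M:%S"


def time_window_query(table, column="slot_start_date_time", hours=24,
                      limit=100, now=None):
    """Build a query restricted to the last `hours` hours of a datetime table.

    `table` is a placeholder: substitute a real table from the catalog.
    Values are interpolated directly, so pass trusted inputs only.
    """
    end = now if now is not None else datetime.now(timezone.utc)
    start = end - timedelta(hours=hours)
    return (
        f"SELECT * FROM {table} "
        f"WHERE {column} >= '{start.strftime(TIME_FORMAT)}' "
        f"AND {column} < '{end.strftime(TIME_FORMAT)}' "
        f"LIMIT {limit}"
    )


def block_range_query(table, first_block, last_block,
                      column="block_number", limit=100):
    """Build a query restricted to an inclusive range on an integer table."""
    if first_block > last_block:
        raise ValueError("first_block must not exceed last_block")
    return (
        f"SELECT * FROM {table} "
        f"WHERE {column} BETWEEN {int(first_block)} AND {int(last_block)} "
        f"LIMIT {limit}"
    )


if __name__ == "__main__":
    # "example_table" is hypothetical; it stands in for a real Xatu table.
    print(time_window_query("example_table"))
    print(block_range_query("example_table", 20_000_000, 20_000_100))
```

Starting with a 24-hour window and a small `LIMIT` follows the "start with small queries" step in Getting Started; widen the range only once the query shape is confirmed.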