📊 Data Engineering Resources

Resources for data pipelines, ETL, big data, and data infrastructure.

📖 Books

Essential Data Engineering Books
  1. Fundamentals of Data Engineering by Joe Reis and Matt Housley - Data engineering basics
  2. Designing Data-Intensive Applications by Martin Kleppmann - Data systems
  3. The Data Warehouse Toolkit by Ralph Kimball and Margy Ross - Data warehousing
  4. Building Data Science Applications with FastAPI by François Voron - Data applications

📄 Research Papers

Data Engineering Research
  1. MapReduce: Simplified Data Processing on Large Clusters - Distributed processing
  2. The Google File System - Distributed storage
  3. Bigtable: A Distributed Storage System for Structured Data - NoSQL database
  4. Apache Spark: A Unified Engine for Big Data Processing - Spark architecture

⭐ GitHub Repositories

Important Data Engineering Repos
  1. Awesome Data Engineering - Curated resources
  2. Data Engineering Projects - Data projects
  3. Apache Spark - Big data processing
  4. Apache Airflow - Workflow orchestration
  5. Data Engineering Zoomcamp - Free course

🎥 Videos & Courses

Video Resources
  1. Data Engineering Zoomcamp - Free course
  2. Data Engineering Tutorials - Tutorials
  3. Apache Spark Tutorials - Spark tutorials

📰 Articles & Blogs

Recommended Blogs
  1. Seattle Data Guy - Data engineering blog
  2. Locally Optimistic - Data team blog
  3. Data Engineering Podcast - Podcast and blog
  4. Airflow Blog - Airflow updates

Additional Resources
  1. Data Engineering Roadmap - Learning path
  2. Data Engineering Guide - Data engineering cookbook
  3. Data Engineering Resources - Wiki resources