I designed and implemented a complete post-merger sales data platform using Databricks, PySpark, SQL, Delta Lake, and AWS S3, focused on incremental processing, automation, and analytics delivery. ✅ What I Implemented 🥉 Bronze Layer (Raw Ingestion) Incremental ingestion from AWS S3 Raw data preserved for audit & replay 🥈 Silver Layer (Data Cleaning & Standardization) City name typo standardization Customer name normalization (trim, initcap) Invalid & negative price correction Duplicate record removal Schema alignment across merged datasets 🥇 Gold Layer (Business-Ready Data) Daily → monthly sales aggregation Upserts (MERGE) into unified fact tables Denormalized views for fast analytics ⏱️ Automation & Scheduling Databricks Jobs for scheduled pipeline execution Incremental loads across all layers 📊 Databricks SQL Dashboard Revenue KPIs Monthly revenue trends Top products & customers Channel-wise revenue contribution Leadership-ready single analytics layer 🛠️ Tech Stack Databricks | PySpark | Spark SQL | Delta Lake | AWS S3 | Python | Databricks Jobs | Databricks SQL Dashboards 🔗 GitHub (Full Code & Pipeline): 👉 https://lnkd.in/gAtDBafu