SEO Reporting Pipeline
(Data Engineering)
Built a production ETL pipeline for Herd, merging historical ranking data
with a new tracking platform for one of the world's leading airlines and hospitality brands.
Validated the existing system, expanded its capabilities, and redesigned the architecture
for multi-client scalability across 15+ brand websites and 13 international markets.
The Challenge
The agency needed to migrate from a legacy ranking tool to a new platform while preserving years of historical data. Their existing reporting system had two significant problems: Performance and reliability - all calculations and filtering happened within Looker Studio, making reports slow and prone to bugs. Scalability - each client had a separate codebase to generate reporting tables, making maintenance difficult and expansion costly. I took full ownership of the project, delivering a complex data engineering system end-to-end.
What I Delivered
- Validated and enhanced the calculated fields table - correcting calculation logic, adding filter granularity, and rebuilding metrics to cascade down to multiple levels to support Looker filtering
- Moved calculation logic from Looker into BigQuery - faster queries, fewer bugs, easier to validate
- Designed cross-market aggregation methodology for unified reporting across 5+ international markets
- Consolidated separate per-client codebases into a single templated system - adding new clients now requires only configuration changes, no code modifications
- Built cross-region data transfer process for 1.9 billion rows of historical data
- Created comprehensive documentation and validation queries for ongoing monitoring
Technical Highlights
- Three-stage daily pipeline with scheduled dependencies (extract 05:00 → merge 09:00 → calculate 11:00 UTC)
- Configuration-driven onboarding - new clients added via database tables, not code changes
- Idempotent processing - queries safe to re-run without creating duplicates
- Multi-level hierarchy - metrics calculated at GLOBAL, market, and 5 levels of keyword grouping (L1-L5)
- SEO metrics implemented - visibility score, weighted visibility, search volume, ranking search volume, estimated click volume, position bucket distributions
Key Technologies
BigQuery, SQL, Scheduled Queries, Google Cloud Storage, Looker Studio, Git.
Skills Demonstrated
SQL development, ETL pipeline design, data quality assurance, debugging production systems, technical documentation, stakeholder communication, independent problem-solving.