Welcome
Harsha Reddy
Principal Data Engineer — building scalable data systems, writing about what I learn along the way.
Apache Spark
Kubernetes
Python
AWS / GCP
Kafka
dbt
Terraform
TypeScript
Latest
Recent Articles
Data Engineering 8 min read
Building Real-Time Data Pipelines with Kafka and Spark Structured Streaming
A practical guide to designing production-grade streaming pipelines that handle millions of events per second — with patterns I've battle-tested across multiple teams.
Cloud & DevOps 7 min read
Terraform Patterns I Actually Use for Multi-Cloud Data Platforms
Forget the toy examples — here are the Terraform patterns that survive contact with real infrastructure across AWS and GCP data platforms.
AI/ML 9 min read
A Practical Guide to Building RAG Pipelines That Don't Hallucinate
RAG sounds simple on paper — retrieve context, generate answer. In practice, there are a dozen ways it can go wrong. Here's how I build them for production.