
System Design for ChatGPT
A deep dive into ChatGPT's system design. Learn to serve 225M DAU, stream sub-500ms responses, optimize expensive GPUs, and efficiently manage context at scale.
Moyez Rabbani
Hi, I'm Moyez, a Software Engineer from Kolkata, India, with a genuine curiosity for the deeper side of Computer Science. I don't just use tools; I want to understand how they work under the hood.
Over the past 2.5 years, I've built end-to-end software that works well and looks even better. I care about design and user experience as much as the code behind them, because good software should feel effortless to use.
When I'm not coding, you'll find me reading or at the gym. I like building things, staying sharp, and never sitting still for too long.
Featured writing first, then the newest notes on engineering, design, and building useful products.

Services make Pods reachable; Ingress routes external traffic to the right Service. Learn ClusterIP, NodePort, LoadBalancer, and Ingress (controller vs resource).

How does LeetCode securely handle millions of code submissions? Dive into the system design, exploring Docker containers, worker queues, caching, and vital security constraints.
To get in touch, send an email to moyezrabbani.work@gmail.com