How the Threads Team Built the Fastest App to Reach 100 Million Users
December 20, 2023
What does it take to build an app that reaches 100 million users in just 5 days? In 2023, Meta's Instagram team accomplished exactly this with Threads, achieving the fastest initial user acquisition in history - surpassing even ChatGPT's record.
But initial viral growth and sustained product success are different engineering challenges. While Threads captured headlines for its explosive launch, the real engineering lessons lie not in the growth metrics, but in how specific technical decisions enabled rapid development under extreme constraints.
Important caveat: This analysis examines engineering decisions that enabled fast initial deployment, not long-term product success. The techniques that work for rapid market entry under time pressure may differ from those needed for sustained growth and iteration.
The Pragmatic Engineer published an excellent interview with Threads engineers titled "Building Meta's Threads App (Real-World Engineering Challenges)". Let's examine the engineering principles behind their approach - and what they reveal about building products under extreme time and scale constraints.
Small Teams Move Faster - But Why?
When Instagram decided to build Threads, they faced a classic question: should they use a large team to move quickly, or a small team to avoid coordination overhead?
They chose small. The founding team consisted of just 3 product managers, 3 designers, and 60 engineers. But here's the key insight - they specifically recruited senior engineers to maintain high autonomy within the team.
This might seem counterintuitive - wouldn't more people mean faster delivery? The reality is more complex. Communication overhead grows quadratically with team size due to the number of possible communication paths: n(n-1)/2.
With 60 engineers, there are 1,770 potential communication paths. With 120 engineers, this jumps to 7,140 paths - over 4x the coordination complexity. By keeping the team small and hiring senior engineers who needed less guidance and could work more autonomously, they minimized this coordination cost while maintaining high individual productivity.
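The arithmetic above is easy to check. The following is a minimal sketch; the function name `communication_paths` is our own, chosen for illustration, and it simply evaluates n(n-1)/2 for the two team sizes mentioned.

```python
def communication_paths(n: int) -> int:
    """Number of distinct person-to-person pairs in a team of n: n(n-1)/2."""
    if n < 0:
        raise ValueError("team size must be non-negative")
    return n * (n - 1) // 2


for size in (60, 120):
    print(f"{size} engineers -> {communication_paths(size)} paths")

# Doubling the team roughly quadruples the number of paths.
ratio = communication_paths(120) / communication_paths(60)
print(f"ratio: {ratio:.2f}x")
```

Doubling headcount doubles n but multiplies the number of pairs by roughly four, which is why the coordination cost grows so much faster than the team itself.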
But wait - couldn't they have just reused Instagram's existing codebase to move even faster? Here's where many engineers get it wrong.
Build from Scratch or Reuse Code?
The Threads team made another counterintuitive choice: they built a completely new prototype instead of modifying Instagram's existing codebase.
Why would starting from scratch be faster than reusing proven code? Consider this: when you modify existing code, you inherit all its constraints, technical debt, and assumptions. You spend time understanding complex systems that weren't designed for your specific use case.
However - and this is crucial - they didn't rebuild everything. They leveraged Instagram's existing infrastructure: the tech stack, UI frameworks, server architecture, and deployment systems.
This hybrid approach optimized for two competing constraints:
- Development velocity: Clean codebase without technical debt or inappropriate abstractions
- Operational risk: Battle-tested infrastructure that could handle 100M+ users without unknown failure modes
The trade-off was strategic: accept the development cost of rebuilding application logic to avoid the operational risk of new infrastructure at unprecedented scale. For a product that needed to handle massive traffic from day one, infrastructure reliability trumped code reuse.
Technology Stack Decisions
How do you choose technology when speed and scale are both critical? The Threads team's stack reveals their decision-making framework. Working within a 5-month timeline from first line of code to App Store submission, they chose:
- Backend: Python + Django for REST APIs
- Inter-service communication: GraphQL using Meta's in-house Hack language
- iOS: Swift with some Objective-C
- Android: Jetpack Compose (Kotlin + Java)
This technology choice reveals a crucial principle: optimize for constraints, not novelty.
When building under extreme time pressure (5 months) with aggressive scale targets (100M+ users), each technology decision carries massive risk. Using Instagram's existing stack provided three critical advantages:
- Zero learning curve: Engineers could implement features immediately without mastering new APIs or debugging unfamiliar tooling
- Battle-tested at scale: Every component had already proven itself under Instagram's traffic loads
- Existing infrastructure: Deployment pipelines, monitoring, and operational knowledge transferred directly
The constraint was time-to-market with guaranteed scalability. Novel technology would optimize for different goals - potentially better developer experience or performance - but at the expense of delivery speed and proven behavior at scale.
Quality Without Automation - How?
Here's where the Threads approach gets really interesting. They deliberately chose not to invest heavily in automated end-to-end testing initially.
This decision reflects a fundamental trade-off in software development: test investment vs. change velocity.
End-to-end tests create coupling between tests and UI implementation. When you're rapidly iterating on user interface based on feedback, this coupling becomes expensive - every UI change breaks multiple tests, requiring constant test maintenance that slows feature development.
However, they didn't abandon testing entirely. Their approach targeted different risk levels:
- Unit tests for business logic: Core functionality (post creation, user management, feed algorithms) remained stable regardless of UI changes. These tests provided value without constraining iteration.
- Internal testing ("dogfooding"): Team members used the app extensively, starting with the core team to avoid overwhelming feedback. This caught integration issues without test maintenance overhead.
- Manual QA for critical paths: User registration, posting, and core interactions received focused manual testing since these directly impacted growth metrics.
The principle: optimize test strategy for your change patterns. High-change areas (UI) got lightweight testing; stable areas (business logic) got comprehensive coverage.
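To make the idea of UI-independent unit tests concrete, here is a small sketch. Everything in it is hypothetical: `create_post`, the `Post` type, and the `MAX_POST_LENGTH` value are invented for illustration and are not taken from the Threads codebase. The point is only that the tests exercise a business rule directly, so a redesigned screen would not break them.

```python
import unittest
from dataclasses import dataclass

MAX_POST_LENGTH = 500  # assumed limit, for illustration only


@dataclass(frozen=True)
class Post:
    author_id: int
    text: str


def create_post(author_id: int, text: str) -> Post:
    """Validate and build a post. Hypothetical business rule with no UI involved."""
    cleaned = text.strip()
    if not cleaned:
        raise ValueError("post text must not be empty")
    if len(cleaned) > MAX_POST_LENGTH:
        raise ValueError("post text exceeds maximum length")
    return Post(author_id=author_id, text=cleaned)


class CreatePostTests(unittest.TestCase):
    def test_strips_surrounding_whitespace(self):
        self.assertEqual(create_post(1, "  hello  ").text, "hello")

    def test_rejects_empty_text(self):
        with self.assertRaises(ValueError):
            create_post(1, "   ")

    def test_rejects_overlong_text(self):
        with self.assertRaises(ValueError):
            create_post(1, "x" * (MAX_POST_LENGTH + 1))
```

Saved as a module, these tests can be run with `python -m unittest`. None of them reference a button, screen, or layout, which is what keeps their maintenance cost low while the interface is still changing.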
From Quality to Operations: Measuring What Matters
Their unconventional quality approach required equally thoughtful operational monitoring. Since they couldn't rely on comprehensive automated tests to catch issues before deployment, they needed metrics that would immediately reveal problems in production.
Once Threads launched, the team closely monitored PREQ metrics (Performance, Reliability, Efficiency, Quality). Their key indicators included:
- Time from cold app start to seeing the first post
- Frame drops during scrolling
- QPS and success rates for core APIs
- Queue lengths for complex background tasks
Why these specific metrics? They directly correlate with user experience. A user who can't see content quickly or experiences laggy scrolling will abandon the app, regardless of how well your backend systems are performing.
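As a rough illustration of the API-level indicators, the sketch below computes QPS and success rate from a batch of request records. The `RequestRecord` type, the fixed measurement window, and the rule that only 5xx responses count as failures are all assumptions made for this example, not details from the Threads team.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class RequestRecord:
    endpoint: str
    status_code: int


def qps(records: Sequence[RequestRecord], window_seconds: float) -> float:
    """Average queries per second over a measurement window."""
    if window_seconds <= 0:
        raise ValueError("window_seconds must be positive")
    return len(records) / window_seconds


def success_rate(records: Sequence[RequestRecord]) -> float:
    """Fraction of requests that did not fail with a server error (assumed: 5xx)."""
    if not records:
        return 1.0  # no traffic means no observed failures
    succeeded = sum(1 for r in records if r.status_code < 500)
    return succeeded / len(records)


# Four requests observed during a 2-second window.
sample = [
    RequestRecord("/feed", 200),
    RequestRecord("/feed", 200),
    RequestRecord("/post", 500),
    RequestRecord("/post", 404),
]
print(qps(sample, window_seconds=2.0))  # 2.0
print(success_rate(sample))             # 0.75
```

Whether a 4xx response should count as a failure depends on the endpoint. Treating only server errors as failures is a common starting point, but it is a design choice rather than a rule.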
For engineers starting their careers, this highlights a crucial skill: choosing metrics that actually predict user behavior, not just internal system health.
Engineering Principles Under Extreme Constraints
The Threads case study shows how engineering decisions change when time and scale matter most. The team had 5 months to build an app that went on to reach 100 million users within 5 days of launch. The biggest challenge was not perfect code - it was launching fast with systems that could handle massive traffic.
Their approach was simple: match your development process to how often things change. Parts of the app that changed frequently (like the user interface) used lightweight development processes. Parts that stayed stable (like core business logic) used more thorough processes. This wasn't random - it was a deliberate strategy based on risk.
They also made key trade-offs. They chose to rebuild application code instead of modifying Instagram's existing code. They chose manual QA and dogfooding over automated end-to-end testing, while keeping unit tests for stable business logic. These choices created more work later, but they helped ship faster under their constraints.
The underlying principle: engineering decisions should optimize for your actual constraints, not theoretical best practices. Threads succeeded not by following conventional wisdom, but by understanding their specific trade-offs and optimizing accordingly.
Interestingly, by the time this article was written, Threads had evolved significantly and even launched a web version - built using StyleX, the CSS solution we've discussed in "What is StyleX? What Problems Does it Solve?"