Managing a team and managing a project go hand in hand. Teams, in particular, need an effective distribution of responsibilities and roles. Once that is set up, a proper process guides people to make progress. All of this fits into a product lifecycle, which is essential for developing the right product, in the right way, and delivering it at the right time.
● Goal: agility.
● Services are product specific
● Responsible for both frontend & backend services.
● Connect with backend infrastructure services to do heavy lifting.
● Goal: stability.
● Functional -> Product Infrastructure -> Core infrastructure
● Each service is used by a variety of clients.
● Pre-agreed SLA: QPS, performance
● Company-wide / Product-wide impact.
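A pre-agreed SLA on QPS and performance can be checked mechanically. The following is a minimal sketch, not any team's actual tooling: the `Sla` fields, the choice of 99th-percentile latency as the performance measure, and the function names are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sla:
    """Hypothetical SLA terms agreed between an infrastructure service and a client."""
    max_qps: int            # peak queries per second the client may send
    p99_latency_ms: float   # assumed performance bound: 99th-percentile latency


def percentile(samples, pct):
    """Nearest-rank percentile of a non-empty list; pct is an integer from 1 to 100."""
    ordered = sorted(samples)
    rank = (pct * len(ordered) + 99) // 100  # integer ceiling of pct * n / 100
    return ordered[rank - 1]


def sla_violations(sla, observed_qps, latencies_ms):
    """Return the names of the SLA terms that were violated (empty list means OK)."""
    violations = []
    if observed_qps > sla.max_qps:
        violations.append("qps")
    if latencies_ms and percentile(latencies_ms, 99) > sla.p99_latency_ms:
        violations.append("p99_latency")
    return violations


if __name__ == "__main__":
    sla = Sla(max_qps=1000, p99_latency_ms=200.0)
    print(sla_violations(sla, observed_qps=1200, latencies_ms=[50.0] * 98 + [500.0] * 2))
```

Keeping both sides of the contract in one record makes it clear whether the client (too much traffic) or the service (too slow) broke the agreement.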
Infrastructure teams: examples
● Common layer for container deployment & orchestration
● Microservices framework.
● Common monitoring framework.
● Common alerting framework.
● Common storage offerings with inbuilt usage monitoring.
● People come up with ideas
● Pitch ideas / prototype
○ To fellow engineers, managers
○ To leadership, customers
● Gain momentum. Move fast. Move faster.
● It’s like running a startup
Effective flat hierarchy operation
● Different functions
○ Product vision
○ Technical complexity
○ Process leadership
○ Maintaining stability
● Different individual skillset
● Different teams, different priorities.
4 key players
● Product manager
● Technical Lead (TL)
● Program Manager
● Provides direction & vision for the product
● mini-CEO for the product
● Works with TLs/UX to define detailed specifics
● Works with Sales and Marketing on go-to-market
“It is like being the Conductor of an orchestra”
the key role,
but PMs are
Influence without authority
Technical Lead, a.k.a. TL
● Provides technical direction
○ Getting the technical architecture right
○ Negotiating product specs with PMs
● Responsible for keeping eng-team productive
○ Coordinates sub-tasks within team
● Main point-of-contact with external teams
● Not a manager, and usually has no reports
(my code, my algos)
(my team x 10)
From Junior SWE
● Provides structure to big projects
● Influence spans multiple teams
● Manage plans, schedules and drive deployment of projects
● Proficient in technical details and organizational skills
● Influence authority through processes and structure
● Responsible for keeping downtimes low (add another 9)
● Responsible for building and maintaining large scale
● Responsible for capacity planning
● Not all teams have SREs - they have to earn the traffic to
SREs optimize for uptime, not for egos of individual
teams / people
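The "add another 9" goal above has simple arithmetic behind it: each extra nine of availability cuts the allowed downtime by a factor of ten. A small sketch of that calculation follows; the function name and the 365-day period are assumptions chosen for illustration.

```python
def allowed_downtime_minutes(nines, period_days=365):
    """Downtime budget in minutes for an availability of `nines` nines.

    For example, nines=3 means 99.9% availability, so 0.1% of the period
    may be spent down.
    """
    if nines < 1:
        raise ValueError("nines must be at least 1")
    total_minutes = period_days * 24 * 60
    return total_minutes * 10 ** (-nines)


if __name__ == "__main__":
    for n in range(2, 6):
        print(f"{n} nines: {allowed_downtime_minutes(n):.2f} minutes per year")
```

Three nines allows about 8.76 hours of downtime per year, while four nines allows under an hour, which is why each additional nine is a significant engineering commitment.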
What do they do?
● Coach. Help people achieve their individual goals
● Well-being & employee success. Finding the right team-fit.
● Collate feedback from a person’s peers.
● Resolve interpersonal issues.
Who are they?
● Sometimes TLs are TLMs (TL + Manager)
● Part of a larger team, but not responsible for product decisions
One good product, and many many
“ … no good product-market fit”
“ … there were too many things changing. In the end, the backend
architecture was a mess.”
“ … it was too bloated. Maintenance hazard”
“ … too much coordination was required across different teams”.
It’s worthwhile thinking about
Double Diamond: Strategy + Execution of the right solution
Step 1: PM / TL have an idea
Step 2: Create a short Product-requirement
document (few paragraphs) and circulate it
quickly among team(s).
Step 3: Discuss & iterate.
- Vision Document
- Product Requirements Document
Step 4: Technical design / architecture proposal
Step 5: Circulate it among your own team first.
Step 6: Resolve feedback, rinse repeat.
Step 7: Circulate the design doc with other impacted teams
Step 8: Define milestones and their respective scopes. Two
special milestones: “dogfood” and “launch”
- Technical design doc with full detail
- Milestones & Dependency plan
Step 9: Code, Code, Code. Daily / Bi-weekly
standups until milestones.
Step 10: Dark Launch (feature / product only
enabled for the team)
Step 11: Company dogfood launch
Step 12: Public Launch
- Emergency playbook
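The dark launch, company dogfood, and public launch steps above can be thought of as one feature gate whose audience widens over time. Below is a minimal sketch under stated assumptions: the `Stage` enum, the email-based checks, and the `example.com` domain are hypothetical and only illustrate the idea.

```python
from enum import Enum


class Stage(Enum):
    DARK = "dark"          # enabled only for the owning team
    DOGFOOD = "dogfood"    # enabled for everyone in the company
    PUBLIC = "public"      # enabled for all users


def is_enabled(stage, user_email, team_members, company_domain):
    """Decide whether a feature is visible to a user at the given launch stage."""
    if stage is Stage.PUBLIC:
        return True
    if stage is Stage.DOGFOOD:
        return user_email.endswith("@" + company_domain)
    # Dark launch: only explicitly listed team members see the feature.
    return user_email in team_members


if __name__ == "__main__":
    team = {"alice@example.com"}
    for stage in Stage:
        print(stage.value, is_enabled(stage, "bob@example.com", team, "example.com"))
```

Moving from one stage to the next then becomes a configuration change rather than a code change, which also makes it easy to roll back if the emergency playbook is needed.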
Step 13: Discover bugs. Fix bugs. Push
Step 14: More teams want to launch similar stuff.
They want to reuse some components.
Refactoring galore!
Fun fact: Google code changes about 50% in
- Stable Product
- Reusable infrastructure
Step 15: Some more generic infrastructure
components are identified and new backend created.
Step 16: Slice of traffic is sent to new backend to try
Step 17: New backend is stable. Traffic to old
backend is turned off.
Step 18: Old backend is turned off and deleted if no
one is using it.
- Stronger, Faster, Higher
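Steps 16 to 18 above describe moving traffic gradually from an old backend to a new one. One common way to send a slice of traffic is to hash a stable request key into 100 buckets and route the lowest buckets to the new backend. This is a sketch of that general technique, not a description of any specific system; `pick_backend` and its parameters are hypothetical names.

```python
import hashlib


def pick_backend(request_key, new_backend_percent):
    """Route a request to "new" or "old" based on a stable hash of its key.

    The same key always lands in the same bucket, so raising the percentage
    only ever moves keys from the old backend to the new one.
    """
    if not 0 <= new_backend_percent <= 100:
        raise ValueError("new_backend_percent must be between 0 and 100")
    digest = hashlib.sha256(request_key.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % 100
    return "new" if bucket < new_backend_percent else "old"


if __name__ == "__main__":
    keys = [f"user-{i}" for i in range(1000)]
    for percent in (0, 5, 50, 100):
        share = sum(pick_backend(k, percent) == "new" for k in keys)
        print(f"{percent}% target: {share} of {len(keys)} keys on the new backend")
```

Setting the percentage to 100 corresponds to Step 17 (traffic to the old backend is turned off); the old backend itself can then be deleted once nothing else depends on it.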
Product Requirements Doc (PRD)
Defines the “What”, “Why” and for “Whom”.
“If PRD is done well, still might not be a successful product, but it is certain that if
the PRD is not done well, it is nearly impossible to build a good product.”
- Silicon Valley Product Group
Technical Design Doc (by Malte Ubl, Tech Lead, Google AMP)
● Document the software design.
● Clarify the problem being solved.
● Act as discussion platform to further refine
● Explain the reasoning behind those
decisions and tradeoffs made.
● List alternative designs and why they were
● Support future maintainers and other
interested parties to understand the
● Establish non-technical requirements of the
● Be independently understandable by a
person with no background in software
● Act as user documentation for product or
The Chromium projects (Design Docs)
“Hope is not a strategy.”
● Note down past failures,
○ What went wrong
○ What was the immediate fix
○ What is the long term fix
○ How to prevent it in future.
● Avoid a culture of “heroes”
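The four questions above map naturally onto a structured postmortem record. The sketch below is one possible shape, with hypothetical class and field names; the point is that an incomplete postmortem can be detected automatically instead of relying on someone remembering to fill it in.

```python
from dataclasses import dataclass, fields


@dataclass
class Postmortem:
    """Hypothetical record of a past failure, one field per question."""
    title: str
    what_went_wrong: str = ""
    immediate_fix: str = ""
    long_term_fix: str = ""
    prevention: str = ""

    def missing_sections(self):
        """Names of the sections that are still empty."""
        return [f.name for f in fields(self) if not getattr(self, f.name).strip()]


if __name__ == "__main__":
    pm = Postmortem(
        title="Checkout outage",
        what_went_wrong="Config push disabled the payment backend.",
        immediate_fix="Rolled back the config.",
    )
    print(pm.missing_sections())
```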
Ruthlessly measure progress
● Define milestones
○ Date - Monthly / quarterly
○ Features / items in scope
● Define progress
○ What metrics
○ Define levels of achievement
■ Red = Below Expectations
■ Yellow = Meets expectations
■ Green = Stretch Goal
● Product launch is just one of the milestones
Old adage: what you can measure is what you can improve.
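The Red / Yellow / Green levels above can be derived directly from a metric once the expected and stretch targets are written down. A minimal sketch follows, assuming a metric where higher is better; the function and parameter names are hypothetical.

```python
def progress_status(actual, expected, stretch):
    """Classify a milestone metric as red, yellow, or green.

    red    = below expectations
    yellow = meets expectations
    green  = reaches the stretch goal
    Assumes a higher value is better and that stretch >= expected.
    """
    if stretch < expected:
        raise ValueError("stretch target must not be below the expected target")
    if actual >= stretch:
        return "green"
    if actual >= expected:
        return "yellow"
    return "red"


if __name__ == "__main__":
    # Hypothetical metric: number of in-scope features completed this milestone.
    for done in (3, 5, 8):
        print(done, progress_status(done, expected=5, stretch=8))
```

Agreeing on the two thresholds when the milestone is defined keeps the status from becoming a matter of opinion at review time.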
● Unit of time for progress check
● 2 weeks to 1 month long
● Priorities should remain constant within a sprint
● Execution style: agile vs waterfall
From one sprint to the next, specs might change and priorities might change. Minimize
throwaway work.
Software time estimation is hard. Iterations help manage estimation risk.
● Not just about informing stakeholders
● What is working, what is not.
● What is pending, what is not.
● What is working amazingly well, what needs more work.
Most engineers want to work independently, and later emerge with their
In practice, this results in wasted effort and missed corner cases.
“I think, we should cancel that weekly sync, there is no use for it”.
“We should write a script for that. We get too many bugs doing that manually”.
“If we launch this hack, it would help us launch and iterate.”
Remember: the goal of a process is to add efficiency for the overall team towards the
Obviate unnecessary artifacts
● Pretty docs
● Over documentation
● Status meetings
● Lengthy launch approvals
● Milestone planning leads to backlog
● Features / Bugs that were P0 before, may be P2 in later milestones.
● Careful management of backlog helps manage technical debt
● Openness is the key
○ Bug burn rates
○ Who’s working on what