
Reliability Practices for an Ad System on Docker/Mesos

Ad serving on Mesos with Docker containers for reliability


  1. 1. Reliability Practices for an Ad System on Docker/Mesos. Michael, Apr. 2014, 聚效广告 (MediaV)
  2. 2. Who am I?
  3. 3. Where is our system?
  4. 4. Where is our system?
  5. 5. Small Impression with Huge Computing. AD Request: 1 billion, 20 billion+. QPS: 1 million+, 10 thousand. Latency: 500 ms, 10 ms. 60 DevOps engineers; 2000+ physical servers; 100+ modules with realtime service; 99.95% service availability
  6. 6. Why Container? Why Scheduler?
  7. 7. • Human-caused incidents: debugging, environment changes, etc. • Unintentional failures: bugs, crashes, OOM, memory leaks, full disks, etc. • External causes: ad code • On-call recovery • Scaling services • Resource utilization
  8. 8. We are in 2016
  9. 9. We are in 2014
  10. 10. How to start? 2014 Q4: touch lmctfy. 2015 Q1: try Docker with k8s. 2015 Q2: Docker on Mesos/YARN? 2015 Q3: we are running Docker/Mesos etc. 2015 Q4: more service CI/release. 2016 Q1: more batch jobs & LTS online.
  11. 11. What can Mesos bring to the team?
  12. 12. Resource usage distribution demo (chart labels: typical LTS, ad hoc tasks, light services, Free; scale up to 100%)
  13. 13. Typical problems when containerizing services with Docker: SE7EN
  14. 14. 1/7
  15. 15. 1/7 “If you run SSHD in your Docker containers, you're doing it wrong!” https://jpetazzo.github.io/2014/06/23/docker-ssh-considered-evil/ –Jérôme Petazzoni
  16. 16. 2/7 Where are my debug logs?
  17. 17. 3/7 Is Docker network performance poor?
  18. 18. /machinezone.github.io/research/networking-solutions-for-kubern
  19. 19. 4/7 How do we write local files? How do we handle persistent storage?
  20. 20. +
  21. 21. 5/7 Service registration and discovery?
  22. 22. We’re OR
  23. 23. 6/7 How do we make services schedulable?
  24. 24. This is a big question, left to every dev engineer
  25. 25. 7/7 How do servers load their data?
  26. 26. Abandon: rsync, cp, scp, ftp. Embrace: Everything API / Thrift
  27. 27. Marathon Framework on MESOS
  28. 28. Chronos Framework on MESOS
  29. 29. Chronos: a replacement for batch jobs on distributed systems
  30. 30. Comparison (chronos / cron / azkaban). Distributed: Yes / No / half. Web UI: Yes / No / Yes. Job history: Yes, simple / Manual / Yes, full. Dependency: Yes, simple / No / Yes, full. User auth: No / No / Yes. Resource limit (cpu/mem/disk): Yes / No / No. Debug log: mesos sandbox / Manual / web UI.
  31. 31. Things to watch out for when putting Docker/Mesos into practice
  32. 32. Health check with Marathon on Mesos. Example 1, probing an HTTP endpoint with curl: { "protocol": "COMMAND", "command": { "value": "curl -f -X GET http://$HOST:$PORT0/health" }, "gracePeriodSeconds": 300, "intervalSeconds": 60, "timeoutSeconds": 20, "maxConsecutiveFailures": 3 } Example 2, probing a port with nc: { "protocol": "COMMAND", "portIndex": 0, "command": { "value": "nc localhost 8119" }, "gracePeriodSeconds": 300, "intervalSeconds": 60, "timeoutSeconds": 20, "maxConsecutiveFailures": 3, "ignoreHttp1xx": false }
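
The thresholds in the health check on this slide interact: a task is only treated as unhealthy after `maxConsecutiveFailures` failed checks in a row. The following is a minimal sketch of that counting logic, not Marathon's actual implementation. The function names `first_unhealthy_index` and `rough_detection_seconds` are hypothetical, and the timing estimate assumes every failing check waits for its full timeout before the next interval begins.

```python
import json

# The first health check definition from the slide, parsed as JSON.
HEALTH_CHECK = json.loads("""
{
  "protocol": "COMMAND",
  "command": {"value": "curl -f -X GET http://$HOST:$PORT0/health"},
  "gracePeriodSeconds": 300,
  "intervalSeconds": 60,
  "timeoutSeconds": 20,
  "maxConsecutiveFailures": 3
}
""")


def first_unhealthy_index(results, max_consecutive_failures):
    """Return the index of the check that completes a failure streak, or None.

    `results` is a sequence of booleans (True means the check passed).
    """
    streak = 0
    for index, passed in enumerate(results):
        # A passing check resets the streak; a failing one extends it.
        streak = 0 if passed else streak + 1
        if streak >= max_consecutive_failures:
            return index
    return None


def rough_detection_seconds(check):
    """Rough estimate of the time needed to notice a dead task.

    Assumption (not taken from Marathon's documentation): each failing
    check runs for the full timeout, then the next starts one interval later.
    The grace period is not included.
    """
    per_check = check["intervalSeconds"] + check["timeoutSeconds"]
    return check["maxConsecutiveFailures"] * per_check


if __name__ == "__main__":
    print(first_unhealthy_index([True, False, False, False], 3))
    print(rough_detection_seconds(HEALTH_CHECK))
```

Under these assumptions, a single passing check in the middle of a run of failures resets the counter, which is why a flapping service may never be restarted with a high `maxConsecutiveFailures`.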
  33. 33. Marathon port resource --resources="ports(*):[8000-9000, 31000-32000]"
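
To see which host ports a setting like this makes available, the ranges can be parsed and checked directly. This is a minimal sketch, assuming the ranges are inclusive at both ends; `parse_port_ranges` and `port_is_offered` are hypothetical helper names and are not part of Mesos or Marathon.

```python
import re

# The resource string from the slide.
RESOURCES = "ports(*):[8000-9000, 31000-32000]"


def parse_port_ranges(resources):
    """Extract (low, high) port ranges from a ports(*) resource string."""
    match = re.search(r"ports\(\*\):\[([^\]]*)\]", resources)
    if match is None:
        return []
    ranges = []
    for part in match.group(1).split(","):
        part = part.strip()
        if not part:
            continue  # tolerate an empty list such as "ports(*):[]"
        low, high = part.split("-")
        ranges.append((int(low), int(high)))
    return ranges


def port_is_offered(port, ranges):
    """Return True if the port falls inside any range (assumed inclusive)."""
    return any(low <= port <= high for low, high in ranges)


if __name__ == "__main__":
    ranges = parse_port_ranges(RESOURCES)
    print(ranges)
    print(port_is_offered(8119, ranges))
```

With this setting, a service listening on port 8119, as in the `nc` health check on the previous slide, falls inside the first range, while a port such as 80 would not be offered.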
  34. 34. Dockerfile review rules • Every Dockerfile must go through code review • Everything in the codebase: code/config • Unstable wget/curl sources are forbidden • Port resources must be requested and registered
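
Two of these review rules (download sources and port registration) lend themselves to an automated pre-review check. The sketch below is one possible way to do that, not the tooling used by the authors. The host allowlist, the port registry, the sample Dockerfile, and the function `review_dockerfile` are all hypothetical, and the check does not handle instructions continued across lines with a backslash.

```python
import re
from urllib.parse import urlparse

# Hypothetical values for illustration; a real team would maintain its own.
TRUSTED_HOSTS = {"mirror.internal.example"}
REGISTERED_PORTS = {8119}

URL_PATTERN = re.compile(r"https?://[^\s\"']+")


def review_dockerfile(text, trusted_hosts=TRUSTED_HOSTS,
                      registered_ports=REGISTERED_PORTS):
    """Return a list of (line_number, message) findings for a Dockerfile."""
    findings = []
    for number, line in enumerate(text.splitlines(), start=1):
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue
        # Rule: wget/curl may only download from trusted hosts.
        if re.search(r"\b(wget|curl)\b", stripped):
            for url in URL_PATTERN.findall(stripped):
                host = urlparse(url).hostname
                if host not in trusted_hosts:
                    findings.append((number, f"untrusted download host: {host}"))
        # Rule: every exposed port must be registered.
        if stripped.upper().startswith("EXPOSE "):
            for token in stripped.split()[1:]:
                port_text = token.split("/")[0]
                if not port_text.isdigit():
                    continue  # skip variables such as $PORT
                if int(port_text) not in registered_ports:
                    findings.append((number, f"unregistered port: {port_text}"))
    return findings


# Hypothetical Dockerfile used only to exercise the checks.
SAMPLE = """\
FROM centos:6
RUN curl -fsSL http://mirror.internal.example/pkg.tar.gz | tar xz
RUN wget http://random-site.example/tool.sh
EXPOSE 8119 9999/tcp
"""

if __name__ == "__main__":
    for number, message in review_dockerfile(SAMPLE):
        print(f"line {number}: {message}")
```

A check like this only supplements human code review; it cannot judge whether a trusted source is actually stable.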
  35. 35. Q&A ? ye.mikez@gmail.com zhangye@mvad.com
