Whoscall's Realtime Monitoring Experience

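How are service-wide monitoring and alerting done for a mobile app like Whoscall, which has users around the world and more than 20 million downloads? This talk looks at several parts of the service and shares our hands-on experience with open-source software and paid services.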


  1. 1. Architect @ Gogolook. How Realtime Monitoring Works in…
  2. 2. ★ Instant Caller Identification: Whoscall identifies background information of incoming unknown calls in seconds through tags reported by other users, Internet search results, and our comprehensive global database.
  3. 3. Incoming Call Dialogue: Fraud Call, Business Corporation, Restaurant
  4. 4. ★ Database with over 700 Million Phone Numbers: Whoscall boasts an online database with over 700 million phone numbers. The database covers yellow pages, spammers, telemarketers, customer services, etc., with numerous community tags contributed by users and comments based on real users’ experiences. Database & Number Details: 0287871XXX, 1111 Job Bank, "1111 – Job bank of trust", No.1, Lane 35, An Ho Road, Taipei City, Taiwan, http://www.1111.com
  5. 5. 3 of every 5 strangers’ calls can be identified. Over 500M phone calls are identified every month. 3,000 spammer numbers are reported by Whoscall users every month in Taiwan. Number Identification – 2015.03
  6. 6. Market: United States, Brazil, Saudi Arabia, India, Thailand, Indonesia, Malaysia, Taiwan, Japan, South Korea, Hong Kong. Top 5 countries of Whoscall users: South Korea, Taiwan, Hong Kong, Japan, India.
  7. 7. Join us in creating a contact network of trust
  8. 8. 
 10
  9. 9. ❶ 11
  10. 10. 12
  11. 11. 12
  12. 12. 13
  13. 13. 14
  14. 14. 15
  15. 15. ❷ 16
  16. 16. 17
  17. 17. 18
  18. 18. • • • • 19 👎
  19. 19. • • • • 19 • • • • 👍👎
  20. 20. 20 Consequence Likelihood Risk register
  21. 21. 20 Consequence Likelihood service outage app crash Risk register
  22. 22. 20 Consequence Likelihood service outage app crash db outage Risk register
  23. 23. 20 Consequence Likelihood service outage app crash db outage malfunction Risk register
  24. 24. 20 Consequence Likelihood service outage app crash db outage poor performance capacity shortage malfunction Risk register
  25. 25. 20 Consequence Likelihood service outage app crash db outage poor performance capacity shortage internal errors malfunction Risk register
  26. 26. 21 Consequence Likelihood service outage app crash db outage poor performance capacity shortage internal errors malfunction Risk register short term detection & recovery
  27. 27. 21 Consequence Likelihood service outage app crash db outage poor performance capacity shortage internal errors malfunction Risk register long term diagnosis short term detection & recovery
  28. 28. 22 API servers DB servers
  29. 29. 22 Load balancer API servers DB servers
  30. 30. CDN 22 Load balancer API servers DB servers
  31. 31. CDN Virtual Private Cloud 22 Load balancer API servers DB servers
  32. 32. 23 VPC CloudFront ELB API servers MongoDB
  33. 33. 24 Consequence Likelihood service outage app crash db outage poor performance capacity shortage internal errors malfunction Risk register
  34. 34. 24 Consequence Likelihood service outage app crash db outage poor performance capacity shortage internal errors malfunction Service levels: • availability • response time • error Risk register
  35. 35. 25 CloudFront CloudWatch Black Box
  36. 36. 25 CloudFront CloudWatch Black Box
  37. 37. 25 CloudFront CloudWatch Black Box
  38. 38. 26
  39. 39. 27
  40. 40. 27
  41. 41. 27
  42. 42. 28 CloudFront CloudWatch Black Box Crashlytics
  43. 43. 28 CloudFront CloudWatch Black Box Crashlytics
  44. 44. 29 Consequence Likelihood service outage app crash db outage poor performance capacity shortage internal errors malfunction Risk register
  45. 45. 29 Consequence Likelihood service outage app crash db outage poor performance capacity shortage internal errors malfunction Risk register Service levels: • R/W lock • size • load • speed
  46. 46. 30 VPC CloudFront ELB API servers MongoDB
  47. 47. 30 CloudWatch VPC CloudFront ELB API servers MongoDB
  48. 48. 30 CloudWatch VPC CloudFront ELB API servers MongoDB
  49. 49. 31 VPC CloudFront ELB API servers MongoDB
  50. 50. 31 VPC CloudFront ELB API servers MongoDB Cloud Manager
  51. 51. 32
  52. 52. 33 Consequence Likelihood service outage app crash db outage poor performance capacity shortage internal errors malfunction Risk register
  53. 53. 33 Consequence Likelihood service outage app crash db outage poor performance capacity shortage internal errors malfunction Risk register Diagnosis: • instance metrics • app metrics • resource
  54. 54. 34 VPC CloudFront ELB API servers MongoDB
  55. 55. 34 CloudWatch VPC CloudFront ELB API servers MongoDB
  56. 56. 34 CloudWatch VPC CloudFront ELB API servers MongoDB
  57. 57. 35
  58. 58. 35 memory space? disk space?
  59. 59. 36
  60. 60. 36
  61. 61. 36 mnemonic?
  62. 62. 37 CloudWatch CloudFront ELB API servers MongoDB StatsD for long-term metrics
  63. 63. 37 CloudWatch CloudFront ELB API servers MongoDB StatsD for long-term metrics
  64. 64. 37 CloudWatch CloudFront ELB API servers MongoDB StatsD StatsD for long-term metrics
  65. 65. 38 For more details: Centralized logging and monitoring in Fluentd, at Taipei.py, Feb 26, 2015, http://www.slideshare.net/suitingtseng/fluentd-49952996
  66. 66. 39 For more details: http://www.oreilly.com/webops-perf/free/lightweight-systems.csp
  67. 67. 39 For more details: http://www.oreilly.com/webops-perf/free/lightweight-systems.csp
  68. 68. 39 For more details: http://www.oreilly.com/webops-perf/free/lightweight-systems.csp
  69. 69. 40 CloudWatch CloudFront ELB API servers MongoDB StatsD Application-specific metrics • gauges • counters • histograms • meters & timers
  70. 70. 41
  71. 71. 42 Metrics
  72. 72. 42 Metrics: http://jcconf.tw/2014/manage-servers-on-the-cloud-with-opensource-tools.html
  73. 73. 43 CloudWatch CloudFront ELB API servers MongoDB low-level metrics profiling
  74. 74. 43 CloudWatch CloudFront ELB API servers MongoDB low-level metrics profiling
  75. 75. 44
  76. 76. 45
  77. 77. 46
  78. 78. 47
  79. 79. 48
  80. 80. 49 Consequence Likelihood service outage app crash db outage poor performance capacity shortage internal errors malfunction Risk register
  81. 81. 49 Consequence Likelihood service outage app crash db outage poor performance capacity shortage internal errors malfunction Risk register Diagnosis: • logging • patterns • drill down analysis
  82. 82. 50 CloudFront ELB API servers MongoDB CloudWatch
  83. 83. 50 CloudFront ELB API servers MongoDB CloudWatch log in S3
  84. 84. 50 CloudFront ELB API servers MongoDB CloudWatch log in S3
  85. 85. 50 CloudFront ELB API servers MongoDB CloudWatch log in S3
  86. 86. 50 CloudFront ELB API servers MongoDB CloudWatch log in S3
  87. 87. 50 CloudFront ELB API servers MongoDB CloudWatch log in S3 StatsD
  88. 88. 50 CloudFront ELB API servers MongoDB CloudWatch log in S3 StatsD BigQuery
  89. 89. 51 http://www.slideshare.net/tw_dsconf/elasticsearch-kibana Aug 23, 2015
  90. 90. 52 Consequence Likelihood service outage app crash db outage poor performance capacity shortage internal errors malfunction Risk register
  91. 91. 52 Consequence Likelihood service outage app crash db outage poor performance capacity shortage internal errors malfunction Risk register Diagnosis: • logging • context • aggregation
  92. 92. 53 CloudFront ELB API servers MongoDB CloudWatch log in S3 error logs
  93. 93. 53 CloudFront ELB API servers MongoDB CloudWatch log in S3 error logs
  94. 94. 54
  95. 95. 55 Consequence Likelihood service outage app crash db outage poor performance capacity shortage internal errors malfunction Risk register
  96. 96. 56 CloudFront ELB API servers MongoDB Cloud Manager CloudWatch log in S3 StatsD BigQuery
  97. 97. 57 CloudFront ELB API servers MongoDB Cloud Manager CloudWatch log in S3 StatsD BigQuery
  98. 98. 58 For more details: Centralized logging and monitoring in Fluentd, at Taipei.py, Feb 26, 2015, http://www.slideshare.net/suitingtseng/fluentd-49952996
  99. 99. 58 For more details: Centralized logging and monitoring in Fluentd, at Taipei.py, Feb 26, 2015, http://www.slideshare.net/suitingtseng/fluentd-49952996 We’ve built a unified logging mechanism…
  100. 100. 59 CloudFront ELB API servers MongoDB Cloud Manager CloudWatch log in S3 StatsD BigQuery
  101. 101. 59 CloudFront ELB API servers MongoDB Cloud Manager CloudWatch log in S3 StatsD BigQuery How about unified monitoring alerts?
  102. 102. 60
  103. 103. • • • • 60
  104. 104. • • • • 60
  105. 105. 61 CloudFront ELB API servers MongoDB Cloud Manager CloudWatch log in S3 BigQuery
  106. 106. 61 CloudFront ELB API servers MongoDB Cloud Manager CloudWatch log in S3 BigQuery
  107. 107. 62 CloudFront ELB API servers MongoDB Cloud Manager CloudWatch log in S3 BigQuery
  108. 108. 63
  109. 109. 64
  110. 110. 64
  111. 111. 65
  112. 112. 66
  113. 113. 66
  114. 114. 67
  115. 115. • • • • 68 • • • • 👍👎
  116. 116. 69 Consequence Likelihood service outage app crash db outage poor performance capacity shortage internal errors malfunction Risk register
  117. 117. 70 Consequence Likelihood service outage app crash db outage poor performance capacity shortage internal errors malfunction Risk register long term diagnosis short term detection & recovery
  118. 118. 71 CloudFront ELB API servers MongoDB Cloud Manager CloudWatch log in S3 BigQuery
  119. 119. 72
  120. 120. 73
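
Slide 69 lists the application-specific metric types that the API servers report next to StatsD. As a minimal sketch, and not the speakers' actual code, the Python below formats counters, gauges, and timers in the plain-text StatsD line protocol and sends them over UDP. The metric names and the helper functions are hypothetical; the host and port are assumed to be a local StatsD daemon on its default port 8125. Histograms and meters are left out of the sketch.

```python
import socket

# Type suffixes of the StatsD line protocol for the metric kinds covered here.
_SUFFIX = {"counter": "c", "gauge": "g", "timer": "ms"}


def format_metric(name: str, value, kind: str, sample_rate: float = 1.0) -> str:
    """Build one StatsD line such as 'api.requests:1|c'."""
    if kind not in _SUFFIX:
        raise ValueError(f"unsupported metric kind: {kind}")
    line = f"{name}:{value}|{_SUFFIX[kind]}"
    # A sample rate below 1.0 tells the server how much the client is sampling.
    if sample_rate < 1.0:
        line += f"|@{sample_rate}"
    return line


def send_metric(line: str, host: str = "127.0.0.1", port: int = 8125) -> None:
    """Fire-and-forget UDP send; host and port are assumptions."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(line.encode("ascii"), (host, port))


if __name__ == "__main__":
    # Hypothetical metric names, for illustration only.
    send_metric(format_metric("api.requests", 1, "counter"))
    send_metric(format_metric("api.open_connections", 42, "gauge"))
    send_metric(format_metric("api.lookup_time", 12, "timer"))
```

Because the transport is UDP, a slow or missing metrics daemon does not block the API request that emitted the metric, which is one reason this style suits application-level instrumentation.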

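Slide 34 names three service levels for black-box monitoring: availability, response time, and error. The sketch below shows one way such figures could be derived from a batch of probe results. The `Probe` record, the rule that an HTTP 5xx status counts as an error, and the nearest-rank 95th percentile are all assumptions for illustration, not details from the talk.

```python
import math
from dataclasses import dataclass


@dataclass
class Probe:
    ok: bool           # True if the endpoint answered at all
    status: int        # HTTP status code, 0 when there was no answer
    latency_ms: float  # round-trip time of the probe


def service_levels(probes, percentile=95):
    """Return (availability, error_rate, latency percentile in ms)."""
    if not probes:
        raise ValueError("need at least one probe result")
    answered = [p for p in probes if p.ok]
    availability = len(answered) / len(probes)
    if not answered:
        return availability, 0.0, None
    # Assumption: server-side failures are the 5xx responses.
    errors = sum(1 for p in answered if p.status >= 500)
    error_rate = errors / len(answered)
    # Nearest-rank percentile over the answered probes.
    latencies = sorted(p.latency_ms for p in answered)
    rank = max(1, math.ceil(percentile / 100 * len(latencies)))
    return availability, error_rate, latencies[rank - 1]


if __name__ == "__main__":
    # Hypothetical probe results, for illustration only.
    sample = [
        Probe(True, 200, 100.0),
        Probe(True, 200, 200.0),
        Probe(True, 500, 300.0),
        Probe(False, 0, 0.0),
    ]
    print(service_levels(sample))
```

Keeping the three figures separate matters for alerting: an unanswered probe lowers availability, while an answered 5xx response raises the error rate, and the two usually call for different responses.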