Uber Mobility Meetup: Mobile Testing

Uber mobility meetup about mobile testing on 7/7/2016

  1. 1. Evolution of Octopus: from cross-app/cross-device testing, to network record/replay, to scenario-based testing. Bian Jiang, Nirav Nagda. Uber Mobility Meetup, July 7, 2016
  2. 2. Soon after we started at Uber, we encountered a fresh testing challenge. Can you guess what it is? Our Uber Challenge
  3. 3. Octopus: Our platform agnostic test runner
  4. 4. GTAC 2015 Octopus Takes on the Uber Challenge of Cross-app/Cross-device Testing Uber Eng Blog
  5. 5. PREREQUISITES: HEALTHY TEST PYRAMID. Top, End-to-End (L): a small number of large end-to-end tests (dozens). Middle, Scenario-based Tests (M): a moderate number of medium integration tests (hundreds). Bottom, Unit Tests (S): a large number of small unit tests (thousands).
  6. 6. Octopus Evolution (Hermetic Tests)
  7. 7. HERMETIC TESTS: USING NETWORK RECORD/REPLAY. Record mode: records the network responses. Replay mode: plays back the network responses from local files.
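
The two modes on slide 7 can be sketched in plain Java. This is a minimal illustration under stated assumptions, not Octopus's implementation: `RecordReplayClient`, its `Mode` enum, the one-file-per-request layout and the `Function<String, String>` stand-in for the network are all hypothetical names chosen for the example.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.function.Function;

/** Hypothetical record/replay wrapper; not the Octopus API. */
final class RecordReplayClient {
    enum Mode { RECORD, REPLAY }

    private final Mode mode;
    private final Path cacheDir;
    private final Function<String, String> network; // stand-in for a real HTTP call

    RecordReplayClient(Mode mode, Path cacheDir, Function<String, String> network) {
        this.mode = mode;
        this.cacheDir = cacheDir;
        this.network = network;
    }

    /** Returns the response body for a request key such as "GET /favorites". */
    String fetch(String requestKey) {
        // One file per request. The hash is only a simple way to get a safe
        // file name; a real cache would need to handle collisions.
        Path file = cacheDir.resolve(Integer.toHexString(requestKey.hashCode()) + ".txt");
        try {
            if (mode == Mode.RECORD) {
                String body = network.apply(requestKey); // hit the real backend
                Files.createDirectories(cacheDir);
                Files.write(file, body.getBytes(StandardCharsets.UTF_8));
                return body;
            }
            // REPLAY: never touch the network, only the local files.
            if (!Files.exists(file)) {
                throw new IllegalStateException("No recording for: " + requestKey);
            }
            return new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Failing loudly on a missing recording, rather than falling back to the network, is what keeps a replayed test hermetic in this sketch.
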
  8. 8. ANDROID EXAMPLE: NETWORK RECORD/REPLAY

         // Slide callout: (record = true)
         @Test
         @Replay
         public void whenDestinationPromptTap_shouldGoToLocationEditor() {
             HomeLayout
                 .verifyLayout()
                 .clickDestinationPrompt();
         }
  9. 9. IOS EXAMPLE: NETWORK RECORD/REPLAY

         func testFavoriteLocationsScreen() {
             do {
                 // Launch with a named replay cache and a fake device location.
                 launchAppWithOctopusConfig(try OctopusConfigBuilder()
                     .setNetworkReplayCache("rider_favorite_locations")
                     .setFakeLocation("555 Market St")
                     .build())
             } catch {
                 print("Error when launching app")
             }
             login(withUsername: "fake_user", password: "fake_password")
             let favoriteLocationsScreen = FavoriteLocationsScreen(
                 testCase: self, application: application)
             favoriteLocationsScreen.waitForRequiredElements()
         }
  10. 10. Octopus Demo
  11. 11. Octopus Evolution (Rules & Scenario-based testing)
  12. 12. ● What are rules? ● How do they help? ○ Easier test code ○ Shared setup/tearDown code logic ○ Abstract out the complicated logic from tests UBER RULES
  13. 13. ● Experiments ● Location ● Accounts ● Animations ● App reset RULES EXAMPLES
  14. 14. EXAMPLE: LOCATION RULE

          @Location(device = TestLocation.CHICAGO_SUBURB)
          @Replay
          @Test
          public void whenValidAccount_shouldBeAbleToSignIn() {
              WelcomeLayout
                  .verifyLayout()
                  .clickLogin();
              LoginLayout
                  .verifyLayout()
                  .signInWith(Account.GENERIC.getEmail(), Account.GENERIC.getPassword());
              RideLayout.verifyLayout();
          }
  15. 15. ● What is scenario-based testing? ● How does it help? ○ Speeds up the test run ○ Easier to maintain ○ Deterministic tests SCENARIO-BASED TESTING
  16. 16. SCENARIO-BASED TESTING: ANDROID EXAMPLE

          public void cancelTripRequest() {
              // Only when recording
              replayScenarioManager.doWhenRecording(new Interactions() {
                  @Override
                  public void perform() {
                      AcceleratorsLayout.verifyLayout().clickItem(0);
                      ProductSelectionLayout.verifyLayout().swipeToIndex(1)
                          .clickProduct(R.string.uberx);
                      ConfirmationLayout.verifyLayout().clickGo();
                  }
              });
              // Actual test. The test will start from here in replay mode.
              DispatchingLayout.verifyLayout().clickCancel();
              ConfirmationLayout.verifyLayout();
          }
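
The `doWhenRecording` call on slide 16 suggests a simple contract: setup interactions run only while recording, and replay starts at the assertion steps. A rough sketch of that contract follows. The class name `ScenarioManagerSketch` and its boolean constructor are assumptions made for illustration; the deck does not show how the real `replayScenarioManager` is built.

```java
/** Hypothetical sketch; the real replayScenarioManager is not shown in the deck. */
final class ScenarioManagerSketch {
    /** Mirrors the shape of the Interactions callback used on the slide. */
    interface Interactions {
        void perform();
    }

    private final boolean recording;

    ScenarioManagerSketch(boolean recording) {
        this.recording = recording;
    }

    /** Runs the setup interactions only while recording; replay skips them. */
    void doWhenRecording(Interactions setup) {
        if (recording) {
            setup.perform();
        }
        // In replay mode the recorded responses are assumed to put the app
        // directly into the state that the setup steps would have produced.
    }
}
```

Skipping the setup clicks in replay is presumably where the speed-up claimed on slide 15 comes from, since only the steps under test drive the UI.
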
  17. 17. Special thanks to mobile-test-platform@, octopus-eng@, mobile-eng@ and all the amazing people at Uber :) Uber eng blog GTAC talk on Octopus
  18. 18. How an "ideal" CI setup for mobile looks like Valera Zakharov
  19. 19. Ideal = Reliable Scalable Performant Controllable Debuggable
  20. 20. On Mobile Different screen resolutions Different OS versions
  21. 21. On Android: need Android Runtime to run instrumentation tests. $ adb shell am instrument -w <test_package_name>/<runner_class>
  22. 22. Basic Building Blocks Orchestration Device ADB Host (build node)
  23. 23. Ideal Orchestration: obtains/releases a device; sets up the device for testing; issues ‘adb shell am instrument’ calls for each test; collects debug info.
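
Issuing one `adb shell am instrument` call per test, as slide 23 describes, can be sketched as a command builder. This is only an illustration: `InstrumentCommands` and `perTest` are hypothetical names, and the package and runner values are supplied by the caller. The `-e class <class>#<method>` argument is the filter that AndroidJUnitRunner accepts for running a single test method.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper that builds one instrumentation command per test. */
final class InstrumentCommands {
    /**
     * @param serial      device serial as listed by `adb devices`
     * @param testPackage package of the instrumentation (test) apk
     * @param runner      fully qualified runner class
     * @param tests       entries of the form com.example.LoginTest#signsIn
     */
    static List<List<String>> perTest(String serial, String testPackage,
                                      String runner, List<String> tests) {
        List<List<String>> commands = new ArrayList<>();
        for (String test : tests) {
            commands.add(Arrays.asList(
                "adb", "-s", serial, "shell", "am", "instrument", "-w",
                "-e", "class", test,          // run exactly one test method
                testPackage + "/" + runner));
        }
        return commands;
    }
}
```

Each returned list could be handed to `ProcessBuilder`, which would give every test its own process and a natural place to collect per-test logcat output.
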
  24. 24. Ideal Android Debug Bridge: issues shell commands; installs apks; pushes data to device; pulls data from device; does all of the above reliably and quickly.
  25. 25. Device
  26. 26. Ideal Test Device: starts reliably and quickly; runs as fast as a top-of-the-line phone; allows you to scale; provides tests with control over system settings (Test Services); prevents UI popups.
  27. 27. Android Vanilla Setup. Orchestration: Jenkins or another equivalent; ./gradlew connectedCheck. Adb: stock. Device: locally connected emulator or physical device.
  28. 28. Vanilla Setup Problems. Orchestration: one process for all tests; no sharding; no separate logcat/debug info. Adb: slow.
  29. 29. Vanilla Setup Problems. Device: not test friendly. Physical devices: not designed for CI usage; hard to maintain and scale. Emulator: hard to configure properly; system images not well maintained.
  30. 30. Ideal = Reliable Scalable Performant Controllable Debuggable
  31. 31. Google’s Internal Setup. Orchestration: custom (hooked into the internal build system). Adb: adb.turbo ™. Device: stock x86 emulator running in dedicated data-center machines; properly configured/maintained; config abstracted away by a script.
  32. 32. Existing services come up short. Firebase / Google Cloud Test Lab: runs all tests in one instrumentation; virtual devices are slow; not debuggable. Xamarin: doesn’t run instrumentation tests. AWS: runs all tests in one instrumentation.
  33. 33. So… now what ¯_(ツ)_/¯
  34. 34. Some thoughts... Orchestration: tools for discovering tests in an apk (based on dexdump); an alternative to ./gradlew connectedCheck, e.g. android_test --apk my.apk --tests test.apk. ADB: adb.turbo (or equivalent) if planning to run at Google-scale. Device: custom IActivityController.
  35. 35. Faster Tests with Test Parallelization JUSTIN MARTIN & DANIEL RIBEIRO
  36. 36. Too many tests? No such thing!
  37. 37. Distributed vs. Non-distributed (chart). Machine time: ~150 minutes. Wall time: ~50 minutes.
  38. 38. Distributed vs. Non-distributed (chart; series NON-DISTRIBUTED and DISTRIBUTED). Machine time: ~150 minutes, ~125 minutes. Wall time: ~50 minutes.
  39. 39. Distributed vs. Non-distributed (chart; series NON-DISTRIBUTED and DISTRIBUTED). Machine time: ~150 minutes, ~125 minutes. Wall time: ~50 minutes, ~20 minutes.
  40. 40. Separation of Powers BUILD TEST
  41. 41. Separation of Powers BUILD MAC PROS TEST MAC MINIS
  44. 44. XCKnife Balancing: each iteration takes a test class from the queue and assigns it to the machine that is the least busy.
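
The balancing rule on slide 44 is a greedy assignment, which can be sketched as follows. This is not XCKnife's code (XCKnife is a separate tool); `ShardBalancer` is a hypothetical name, the per-class durations are assumed to come from a previous run, and sorting the queue longest-first is an added assumption that the slide does not state.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical greedy balancer: next test class goes to the least busy machine. */
final class ShardBalancer {
    static List<List<String>> balance(Map<String, Double> secondsByTestClass, int machines) {
        // Queue of test classes, longest first (assumption, see lead-in).
        List<Map.Entry<String, Double>> queue = new ArrayList<>(secondsByTestClass.entrySet());
        queue.sort(Map.Entry.<String, Double>comparingByValue().reversed());

        List<List<String>> shards = new ArrayList<>();
        double[] load = new double[machines]; // total assigned seconds per machine
        for (int i = 0; i < machines; i++) {
            shards.add(new ArrayList<>());
        }

        for (Map.Entry<String, Double> entry : queue) {
            int leastBusy = 0;
            for (int i = 1; i < machines; i++) {
                if (load[i] < load[leastBusy]) {
                    leastBusy = i;
                }
            }
            shards.get(leastBusy).add(entry.getKey());
            load[leastBusy] += entry.getValue();
        }
        return shards;
    }
}
```

With classes timed at 10, 8, 5, 4 and 3 seconds on two machines, this sketch yields shards of 14 and 16 seconds, which is balanced but not guaranteed optimal in general.
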
  45. 45. Balanced Sharding
  46. 46. Approximate
  47. 47. Key takeaways
  48. 48. Disclaimer
  49. 49. Diminishing Returns
  50. 50. Testing Pyramid
  51. 51. Testing Pyramid
  52. 52. Testing Matrix
  53. 53. Testing Matrix: many devices
  54. 54. Testing Matrix: OS versions
  55. 55. Testing Matrix: orientations
  56. 56. Testing Matrix: localizations
  57. 57. Test Explosion
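
The "test explosion" on slides 52 to 57 is multiplicative: each matrix dimension multiplies the number of runs. A small sketch of that arithmetic follows. `TestMatrix` is a hypothetical helper, and any sizes passed to it are illustrative inputs, not figures from the talk.

```java
/** Hypothetical helper that counts runs across a full testing matrix. */
final class TestMatrix {
    /** Number of runs if every test executes on every combination of the dimensions. */
    static long totalRuns(long tests, int... dimensionSizes) {
        long runs = tests;
        for (int size : dimensionSizes) {
            // multiplyExact throws ArithmeticException instead of silently overflowing.
            runs = Math.multiplyExact(runs, (long) size);
        }
        return runs;
    }
}
```

For instance, `TestMatrix.totalRuns(1000, 5, 4, 2, 20)` models 1,000 tests across 5 devices, 4 OS versions, 2 orientations and 20 localizations, and evaluates to 800,000 runs.
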
  58. 58. It is distributed
  59. 59. Availability
  60. 60. Operations matter
  61. 61. Test your infrastructure
  62. 62. Everything breaks
  63. 63. When you fix, please open source it
  64. 64. Toolchain
  65. 65. UX Matters
  66. 66. Bad actors
  67. 67. Justin Martin Daniel Ribeiro
  68. 68. 3x3: Speeding Up Mobile Releases Drew Hannay (@drewhannay)
  69. 69. Project Voyager New version of flagship LinkedIn app 250+ committers across Android & iOS ~1 year of development Investment in mobile infrastructure at LinkedIn
  70. 70. Before Voyager 12 releases per year RC build + manual regression suite Mad dash to commit code before the RC cutoff Missing the cutoff meant a long wait for the next release Product & marketing plans were made around the monthly releases Hard to iterate on member feedback
  71. 71. 3x3 Release three times per day, no more than three hours from code commit to member availability
  72. 72. Why three hours? Not enough time for manual testing steps Not enough time to test everything The goal isn’t 100% automation, it’s faster iterations We don’t want engineers spending most of their time maintaining tests that break whenever a design changes UI tests are chosen based on production-critical business flows Faster iteration helps emphasize craftsmanship Devs can take the extra time to write quality code since the next release is soon
  73. 73. Commit Pipeline Code Review Static Analysis Unit Tests Build Release Artifacts UI Tests Alpha Release Feature Development Production Release Beta Release
  74. 74. Commit Pipeline Code Review Static Analysis Unit Tests Build Release Artifacts UI Tests Alpha Release Feature Development Production Release Beta Release
  75. 75. Static analysis. Compile-time contract with API server using data templates: templates are shared between API server & clients; static analysis checks guarantee backwards compatibility; client models are code generated for further safety. Java Checkstyle. Android Lint: over 200 checks provided by Google; several custom checks written for LinkedIn-specific patterns.
  76. 76. Building the code. Over 500k lines of code between Android & iOS; building production binaries for a large codebase is slow. iOS & Swift: at one point early on, Swift compilation took over two hours; refactoring into sub-projects and modules led to a more than 50% speed-up. Android: Split APKs (separate binary for each combination of screen density and CPU architecture). Distributed builds.
  77. 77. What do we test? Unit tests Layout tests Unit tests for views Stress test views with long strings, short strings Make sure views don’t overlap, and render properly in right-to-left mode Scenario tests Validate that key business metric flows are working properly Usually flows that span multiple screens in the app
  78. 78. Test stability UI tests use Android Espresso & iOS KIF frameworks Needed to create a consistent test environment across dev & build machines Android Self-contained, standardized Android emulator bundle Custom test harness that runs one test per application launch iOS KIF contained hard-coded waits that passed on dev boxes, but failed on slower build servers Forked KIF to convert to a checkpoint-based system, where the code tells the test when to proceed to the next step
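
The checkpoint idea on slide 78, where the code tells the test when to proceed instead of the test sleeping for a fixed time, can be sketched in plain Java. The talk applied it to a fork of KIF on iOS; the `Checkpoints` class below is a hypothetical illustration of the same principle and does not reflect that fork's API.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

/** Hypothetical checkpoint registry shared by app code and test code. */
final class Checkpoints {
    private final ConcurrentHashMap<String, CountDownLatch> latches = new ConcurrentHashMap<>();

    private CountDownLatch latch(String name) {
        return latches.computeIfAbsent(name, n -> new CountDownLatch(1));
    }

    /** Called by app code when it reaches a named point, e.g. "feed_loaded". */
    void reach(String name) {
        latch(name).countDown();
    }

    /**
     * Called by the test: returns true as soon as the checkpoint is reached,
     * or false if the timeout expires first. The timeout is only an upper
     * bound, so a fast machine does not wait longer than it needs to.
     */
    boolean await(String name, long timeout, TimeUnit unit) {
        try {
            return latch(name).await(timeout, unit);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

Unlike a hard-coded wait, the same test passes on a slow build server (it simply waits longer) and stays fast on a developer machine.
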
  79. 79. Test speed Android Use Espresso’s IdlingResource API to avoid sleeps and waits Run up to 16 Android emulators in parallel on a single build machine Custom test harness allows optimal test parallelization iOS Refactoring KIF cut UI testing time by more than 80% Distributed testing -> Shard tests across multiple machines Significantly faster, but led to greater exposure to any tooling instability
  80. 80. Android multi-emulator test run
  81. 81. iOS KIF refactoring
  82. 82. iOS multi-simulator testing
  83. 83. Partner teams Historically, several partner teams validated the build before a release For example, we needed sign off from the localization team Lint checks catch hardcoded or improperly formatted strings Layout tests catch strings that are too long and RTL layout bugs Semantic correctness of translations is still validated by translators manually
  84. 84. Getting to members Every three hours, internal alpha testers get a new build Mainly members of the Flagship team Product managers, devs, and execs who want to see the latest code ASAP Every week, the rest of the company gets a new beta build iOS build is submitted to Apple for review After a week of beta, the build is promoted to production Assuming Apple’s review is complete, iOS is released Take advantage of Google Play staged rollout for Android
  85. 85. Dogfooding Android: Google Play alpha/beta channel Easy upgrades for employees, even while off the corporate network Somewhat difficult to get set up, but easy once registered iOS: TestFlight Nice, but limited number of users iOS: Custom enterprise distribution Scales to our number of users, but employees must be on corporate wifi to upgrade Splash screen in the production app encourages employees to use beta builds
  86. 86. Minimizing risk & enabling experiments Take advantage of LinkedIn’s existing A/B testing infrastructure New features are developed behind feature flags Code can be ramped dynamically to different groups of members Performance of new features or changes can be monitored Dynamic configuration Server-controlled kill switch Crashing or buggy code can often be disabled without a new build
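
Ramping a feature to a share of members and disabling it from the server, as slide 86 describes, might look roughly like the sketch below. It assumes the server sends a ramp percentage per feature; `FeatureFlags`, `applyServerConfig` and the hash-based bucketing are hypothetical and are not LinkedIn's A/B testing infrastructure.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical client-side flag store driven by server configuration. */
final class FeatureFlags {
    // Feature name -> percentage of members (0..100) that should see it.
    private final Map<String, Integer> rampPercent = new ConcurrentHashMap<>();

    /** Applies the latest configuration fetched from the server. */
    void applyServerConfig(Map<String, Integer> percentByFeature) {
        rampPercent.putAll(percentByFeature);
    }

    /** Unknown features are off; a ramp of 0 acts as the kill switch. */
    boolean isEnabled(String feature, String memberId) {
        int percent = rampPercent.getOrDefault(feature, 0);
        // Stable bucket in 0..99, so a member keeps the same answer between launches.
        int bucket = Math.floorMod((feature + ":" + memberId).hashCode(), 100);
        return bucket < percent;
    }
}
```

Setting a feature's ramp to 0 in the server configuration disables the code path for everyone on the next config fetch, without shipping a new build.
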
  87. 87. 3x3 after 6 months: areas to improve Release automation Production uploads to the app stores are still a manual process Getting release notes & translations is painful Automated performance testing We can sample performance of the app in production, but don’t have a great way of catching issues before release Android Monkey testing Enables wide range of API level & device coverage with very low overhead cost iOS speed improvements
  88. 88. Questions
  89. 89. 3x3 blogs & videos 3x3: Speeding up mobile releases 3x3: iOS Build Speed and Stability Test Stability - How We Make UI Tests Stable UI Automation: Keep it Functional - and Stable! Consistent Android Testing Environments with Gradle (slides) Effective Layout Testing Library for iOS Managing iOS Continuous Integration at Enterprise Scale