The Future of Selenium Testing for Mobile Web and Native Apps
Uber Mobility Meetup: Mobile Testing
2. Evolution of Octopus
Bian Jiang, Nirav Nagda
Uber Mobility Meetup
July 7, 2016
From cross-app/cross-device testing, to network
record/replay, to scenario-based testing
3. Soon after we started at Uber, we encountered a fresh
testing challenge. Can you guess what it is?
Our Uber Challenge
5. GTAC 2015
Octopus Takes on the Uber Challenge of
Cross-app/Cross-device Testing
https://www.youtube.com/watch?v=p6gsssppeT0
Uber Eng Blog
https://eng.uber.com/rescued-by-octopus/
6. End-to-End Tests (L)
Scenario-based Tests (M)
Unit Tests (S)
Top: a small number of large end-to-end
tests (dozens)
Middle: a medium number of
integration tests (hundreds)
Bottom: a large number of small unit
tests (thousands)
PREREQUISITES: HEALTHY TEST PYRAMID
8. Record mode:
- Records the network responses
Replay mode:
- Plays back the network responses from local files
HERMETIC TESTS: USING NETWORK RECORD/REPLAY
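The record/replay idea above can be sketched in plain Java. This is an illustrative model, not Octopus's actual API (the class and method names here are ours): in record mode each request goes to the real network and the response is saved; in replay mode the saved response is served back, so the test never touches the network.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Minimal sketch of network record/replay (names are illustrative). */
public class ReplayCache {
    public enum Mode { RECORD, REPLAY }

    private Mode mode;
    private final Map<String, String> responses = new HashMap<>();
    private final Function<String, String> network; // the real network call

    public ReplayCache(Mode mode, Function<String, String> network) {
        this.mode = mode;
        this.network = network;
    }

    public void setMode(Mode mode) {
        this.mode = mode;
    }

    public String fetch(String request) {
        if (mode == Mode.RECORD) {
            // Record mode: hit the real network and save the response.
            String response = network.apply(request);
            responses.put(request, response);
            return response;
        }
        // Replay mode: serve the recorded response; no network access.
        String recorded = responses.get(request);
        if (recorded == null) {
            throw new IllegalStateException("No recording for: " + request);
        }
        return recorded;
    }
}
```

In practice the recordings would be persisted to local files (as the slide describes) rather than held in memory, but the mode switch is the essential mechanism.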
14. ● What are rules?
● How do they help?
○ Simpler test code
○ Shared setup/teardown logic
○ Abstracts complicated logic out of tests
UBER RULES
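The core idea behind rules (as in JUnit's TestRule) is that shared setup and teardown wrap around the test body, so individual tests stay simple. A plain-Java sketch of that pattern, with illustrative names rather than Uber's actual rule classes:

```java
/**
 * Sketch of the "rule" pattern: shared setup/teardown wrapped around a
 * test body, so every test doesn't repeat that logic. Names here are
 * illustrative, not an actual Uber API.
 */
public class ScenarioRule {
    private boolean serverStarted;

    private void before() { serverStarted = true; }   // e.g. start a mock server
    private void after()  { serverStarted = false; }  // e.g. shut it down

    /** Runs a test body with setup before and teardown after, like a JUnit rule. */
    public void apply(Runnable testBody) {
        before();
        try {
            testBody.run();
        } finally {
            after(); // teardown runs even if the test body throws
        }
    }

    public boolean isServerStarted() { return serverStarted; }
}
```

Because the `finally` block always runs, teardown is guaranteed even when a test fails, which is exactly the shared logic the slide says rules abstract away from tests.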
17. ● What is scenario-based testing?
● How does it help?
○ Speeds up the test run
○ Easier to maintain
○ Deterministic tests
SCENARIO-BASED TESTING
18. SCENARIO-BASED TESTING: ANDROID EXAMPLE
public void cancelTripRequest() {
  // Only when recording:
  replayScenarioManager.doWhenRecording(new Interactions() {
    @Override
    public void perform() {
      AcceleratorsLayout.verifyLayout().clickItem(0);
      ProductSelectionLayout.verifyLayout().swipeToIndex(1)
          .clickProduct(R.string.uberx);
      ConfirmationLayout.verifyLayout().clickGo();
    }
  });
  // Actual test: starts from here in replay mode.
  DispatchingLayout.verifyLayout().clickCancel();
  ConfirmationLayout.verifyLayout();
}
19. Special thanks to mobile-test-platform@,
octopus-eng@, mobile-eng@ and all the
amazing people at Uber :)
Uber eng blog
https://eng.uber.com/rescued-by-octopus/
GTAC talk on Octopus
https://www.youtube.com/watch?v=p6gsssppeT0
21. What an "ideal" CI setup for mobile looks like
Valera Zakharov
32. Ideal Test Device
Starts reliably and quickly
Runs as fast as a top-of-the-line phone
Allows you to scale
Provides tests with control over system settings
Test Services
Prevents UI popups
35. Vanilla Setup Problems
Device
Not test-friendly
Physical devices:
Not designed for CI usage
Hard to maintain and scale
Emulator:
Hard to configure properly
System images not well maintained
37. Google’s Internal Setup
Orchestration
Custom (hooked into the internal build system)
ADB
adb.turbo ™
Device
Stock x86 emulator running in dedicated data-center machines
Properly configured/maintained
Config abstracted away by a script
39. Existing services come up short
Firebase Google Cloud Test Lab
Runs all tests in one instrumentation
Virtual Devices are slow
Not debuggable
Xamarin
Doesn’t run instrumentation tests
AWS
Runs all tests in one instrumentation
41. Some thoughts...
Orchestration
Tools for discovering tests in an apk (based on dexdump)
An alternative to ./gradlew connectedCheck
android_test --apk my.apk --tests test.apk
ADB
adb.turbo (or equivalent) if planning to run at Google-scale
Device
Custom IActivityController
44. Faster Tests with Test Parallelization
JUSTIN MARTIN & DANIEL RIBEIRO
82. Project Voyager
New version of flagship
LinkedIn app
250+ committers across
Android & iOS
~1 year of development
Investment in mobile
infrastructure at
LinkedIn
83. Before Voyager
12 releases per year
RC build + manual regression suite
Mad dash to commit code before the RC cutoff
Missing the cutoff meant a long wait for the next release
Product & marketing plans were made around the monthly releases
Hard to iterate on member feedback
84. 3x3
Release three times per day, no more than three
hours from code commit to member availability
85. Why three hours?
Not enough time for manual testing steps
Not enough time to test everything
The goal isn’t 100% automation, it’s faster iterations
We don’t want engineers spending most of their time maintaining tests that break whenever a
design changes
UI tests are chosen based on production-critical business flows
Faster iteration helps emphasize craftsmanship
Devs can take the extra time to write quality code since the next release is soon
88. Static analysis
Compile-time contract with API server using Rest.li
Rest.li data templates are shared between API server & clients
Provides static analysis checks that guarantee backwards compatibility
Client models are code generated for further safety
Java Checkstyle
Android Lint
Over 200 checks provided by Google
Several custom checks written for LinkedIn-specific patterns
89. Building the code
Over 500k lines of code between Android & iOS
Building production binaries for a large codebase is slow
iOS & Swift
At one point early on, Swift compilation took over two hours
Refactoring into sub-projects and modules led to a more than 50% speed-up
Android Split APKs
Separate binary for each combination of screen density and CPU architecture
Distributed builds
90. What do we test?
Unit tests
Layout tests
Unit tests for views
Stress test views with long strings, short strings
Make sure views don’t overlap, and render properly in right-to-left mode
Scenario tests
Validate that key business metric flows are working properly
Usually flows that span multiple screens in the app
91. Test stability
UI tests use Android Espresso & iOS KIF frameworks
Needed to create a consistent test environment across dev & build machines
Android
Self-contained, standardized Android emulator bundle
Custom test harness that runs one test per application launch
iOS
KIF contained hard-coded waits that passed on dev boxes, but failed on slower build servers
Forked KIF to convert to a checkpoint-based system,
where the code tells the test when to proceed to the next step
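The checkpoint idea, replacing hard-coded sleeps with polling until the code under test signals readiness, can be sketched in plain Java (this is a generic illustration of the technique, not LinkedIn's KIF fork):

```java
import java.util.function.BooleanSupplier;

/**
 * Sketch of a checkpoint-style wait: instead of sleeping for a fixed time
 * (which passes on fast dev boxes and fails on slow build servers), poll
 * until the code under test reports that it is ready. Illustrative only.
 */
public final class Checkpoint {
    private Checkpoint() {}

    /** Returns true once {@code ready} holds, false if the timeout expires first. */
    public static boolean awaitCondition(BooleanSupplier ready, long timeoutMs) {
        long deadline = System.currentTimeMillis() + timeoutMs;
        while (System.currentTimeMillis() < deadline) {
            if (ready.getAsBoolean()) {
                return true; // proceed as soon as the checkpoint is reached
            }
            try {
                Thread.sleep(10); // short poll interval, not a hard-coded wait
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return ready.getAsBoolean(); // one final check at the deadline
    }
}
```

The key property is that the wait ends as soon as the condition holds, so fast machines aren't slowed down and slow machines aren't flaky.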
92. Test speed
Android
Use Espresso’s IdlingResource API to avoid sleeps and waits
Run up to 16 Android emulators in parallel on a single build machine
Custom test harness allows optimal test parallelization
iOS
Refactoring KIF cut UI testing time by more than 80%
Distributed testing -> Shard tests across multiple machines
Significantly faster, but led to greater exposure to any tooling instability
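Sharding tests across machines, as mentioned above, needs a stable assignment so every test runs exactly once across the fleet. A minimal hash-based sketch (our own illustration, not LinkedIn's harness):

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch of stable, hash-based test sharding across N machines (illustrative). */
public final class Sharder {
    private Sharder() {}

    /** Returns the subset of tests that shard {@code shardIndex} of {@code shardCount} should run. */
    public static List<String> testsForShard(List<String> allTests, int shardIndex, int shardCount) {
        List<String> mine = new ArrayList<>();
        for (String test : allTests) {
            // Stable assignment: the same test name always lands on the same shard,
            // and each test lands on exactly one shard.
            if (Math.floorMod(test.hashCode(), shardCount) == shardIndex) {
                mine.add(test);
            }
        }
        return mine;
    }
}
```

Hash-based assignment keeps shards stable between runs; real harnesses often refine this with per-test timing data so shards finish at roughly the same time.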
96. Partner teams
Historically, several partner teams validated the
build before a release
For example, we needed sign off from the
localization team
Lint checks catch hardcoded or improperly
formatted strings
Layout tests catch strings that are too long and
RTL layout bugs
Semantic correctness of translations is still
validated by translators manually
97. Getting to members
Every three hours, internal alpha testers get a new build
Mainly members of the Flagship team
Product managers, devs, and execs who want to see the latest code ASAP
Every week, the rest of the company gets a new beta build
iOS build is submitted to Apple for review
After a week of beta, the build is promoted to production
Assuming Apple’s review is complete, iOS is released
Take advantage of Google Play staged rollout for Android
98. Dogfooding
Android: Google Play alpha/beta channel
Easy upgrades for employees, even while off the corporate network
Somewhat difficult to get set up, but easy once registered
iOS: TestFlight
Nice, but limited number of users
iOS: Custom enterprise distribution
Scales to our number of users, but employees must be on corporate Wi-Fi to upgrade
Splash screen in the production app encourages employees to use beta builds
99. Minimizing risk & enabling experiments
Take advantage of LinkedIn’s existing A/B testing infrastructure
New features are developed behind feature flags
Code can be ramped dynamically to different groups of members
Performance of new features or changes can be monitored
Dynamic configuration
Server-controlled kill switch
Crashing or buggy code can often be disabled without a new build
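A server-controlled kill switch of the kind described above can be sketched in a few lines of Java (an illustrative model, not LinkedIn's actual configuration system): the client checks a flag before running feature code, and the server can flip the flag without shipping a new build.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Sketch of a server-controlled feature flag / kill switch (illustrative). */
public class FeatureFlags {
    private final Map<String, Boolean> config = new ConcurrentHashMap<>();

    /** Called whenever a fresh config payload arrives from the server. */
    public void update(String feature, boolean enabled) {
        config.put(feature, enabled);
    }

    /** Features default to off until the server explicitly enables them. */
    public boolean isEnabled(String feature) {
        return config.getOrDefault(feature, false);
    }
}
```

Defaulting to off means a feature that never receives config stays dormant, and setting a flag to false acts as the kill switch: crashing or buggy code paths are skipped on the next config refresh, with no app-store release required.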
100. 3x3 after 6 months: areas to improve
Release automation
Production uploads to the app stores are still a manual process
Getting release notes & translations is painful
Automated performance testing
We can sample performance of the app in production,
but don’t have a great way of catching issues before release
Android Monkey testing
Enables wide range of API level & device coverage with very low overhead cost
iOS speed improvements
102. 3x3 blogs & videos
3x3: Speeding up mobile releases
3x3: iOS Build Speed and Stability
Test Stability - How We Make UI Tests Stable
UI Automation: Keep it Functional - and Stable!
Consistent Android Testing Environments with Gradle (slides)
Effective Layout Testing Library for iOS
Managing iOS Continuous Integration at Enterprise Scale