The Future of Selenium Testing for Mobile Web and Native Apps
Uber Mobility Meetup: Mobile Testing
2. Evolution of Octopus
Bian Jiang, Nirav Nagda
Uber Mobility Meetup
July 7, 2016
From cross-app/cross-device testing, to network
record/replay, to scenario-based testing
3. Soon after we started at Uber, we encountered a fresh
testing challenge. Can you guess what it is?
Our Uber Challenge
5. GTAC 2015
Octopus Takes on the Uber Challenge of
Cross-app/Cross-device Testing
https://www.youtube.com/watch?v=p6gsssppeT0
Uber Eng Blog
https://eng.uber.com/rescued-by-octopus/
6. End-to-End Tests (L)
Scenario-based Tests (M)
Unit Tests (S)
Top: a small number of large end-to-end
tests (dozens)
Middle: a medium number of
integration tests (hundreds)
Bottom: a large number of small unit
tests (thousands)
PREREQUISITES: HEALTHY TEST PYRAMID
8. Record mode:
- Records the network responses
Replay mode:
- Plays back the network responses from local files
HERMETIC TESTS: USING NETWORK RECORD/REPLAY
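The record/replay idea above can be sketched in plain Java. This is an illustrative model, not Octopus's actual API (the class and method names here are ours): in record mode each request goes to the real network and the response is saved; in replay mode the saved response is served back, so the test never touches the network.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Minimal sketch of network record/replay (names are illustrative). */
public class ReplayCache {
    public enum Mode { RECORD, REPLAY }

    private Mode mode;
    private final Map<String, String> responses = new HashMap<>();
    private final Function<String, String> network; // the real network call

    public ReplayCache(Mode mode, Function<String, String> network) {
        this.mode = mode;
        this.network = network;
    }

    public void setMode(Mode mode) {
        this.mode = mode;
    }

    public String fetch(String request) {
        if (mode == Mode.RECORD) {
            // Record mode: hit the real network and save the response.
            String response = network.apply(request);
            responses.put(request, response);
            return response;
        }
        // Replay mode: serve the recorded response; no network access.
        String recorded = responses.get(request);
        if (recorded == null) {
            throw new IllegalStateException("No recording for: " + request);
        }
        return recorded;
    }
}
```

In practice the recordings would be persisted to local files (as the slide describes) rather than held in memory, but the mode switch is the essential mechanism.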
14. ● What are rules?
● How do they help?
○ Simpler test code
○ Shared setup/teardown logic
○ Abstracts complicated logic out of tests
UBER RULES
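The core idea behind rules (as in JUnit's TestRule) is that shared setup and teardown wrap around the test body, so individual tests stay simple. A plain-Java sketch of that pattern, with illustrative names rather than Uber's actual rule classes:

```java
/**
 * Sketch of the "rule" pattern: shared setup/teardown wrapped around a
 * test body, so every test doesn't repeat that logic. Names here are
 * illustrative, not an actual Uber API.
 */
public class ScenarioRule {
    private boolean serverStarted;

    private void before() { serverStarted = true; }   // e.g. start a mock server
    private void after()  { serverStarted = false; }  // e.g. shut it down

    /** Runs a test body with setup before and teardown after, like a JUnit rule. */
    public void apply(Runnable testBody) {
        before();
        try {
            testBody.run();
        } finally {
            after(); // teardown runs even if the test body throws
        }
    }

    public boolean isServerStarted() { return serverStarted; }
}
```

Because the `finally` block always runs, teardown is guaranteed even when a test fails, which is exactly the shared logic the slide says rules abstract away from tests.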
17. ● What is scenario-based testing?
● How does it help?
○ Speeds up the test run
○ Easier to maintain
○ Deterministic tests
SCENARIO-BASED TESTING
18. SCENARIO-BASED TESTING: ANDROID EXAMPLE
public void cancelTripRequest() {
  // Only when recording:
  replayScenarioManager.doWhenRecording(new Interactions() {
    @Override
    public void perform() {
      AcceleratorsLayout.verifyLayout().clickItem(0);
      ProductSelectionLayout.verifyLayout().swipeToIndex(1)
          .clickProduct(R.string.uberx);
      ConfirmationLayout.verifyLayout().clickGo();
    }
  });
  // Actual test: starts from here in replay mode.
  DispatchingLayout.verifyLayout().clickCancel();
  ConfirmationLayout.verifyLayout();
}
19. Special thanks to mobile-test-platform@,
octopus-eng@, mobile-eng@ and all the
amazing people at Uber :)
Uber eng blog
https://eng.uber.com/rescued-by-octopus/
GTAC talk on Octopus
https://www.youtube.com/watch?v=p6gsssppeT0
21. What an "ideal" CI setup for mobile looks like
Valera Zakharov
32. Ideal Test Device
Starts reliably and quickly
Runs as fast as a top-of-the-line phone
Allows you to scale
Provides tests with control over system settings
Test Services
Prevents UI popups
35. Vanilla Setup Problems
Device
Not test-friendly
Physical devices:
Not designed for CI usage
Hard to maintain and scale
Emulator:
Hard to configure properly
System images not well maintained
37. Google’s Internal Setup
Orchestration
Custom (hooked into the internal build system)
ADB
adb.turbo ™
Device
Stock x86 emulator running in dedicated data-center machines
Properly configured/maintained
Config abstracted away by a script
39. Existing services come up short
Firebase Google Cloud Test Lab
Runs all tests in one instrumentation
Virtual Devices are slow
Not debuggable
Xamarin
Doesn’t run instrumentation tests
AWS
Runs all tests in one instrumentation
41. Some thoughts...
Orchestration
Tools for discovering tests in an apk (based on dexdump)
An alternative to ./gradlew connectedCheck
android_test --apk my.apk --tests test.apk
ADB
adb.turbo (or equivalent) if planning to run at Google-scale
Device
Custom IActivityController
44. Faster Tests with Test Parallelization
JUSTIN MARTIN & DANIEL RIBEIRO
82. Project Voyager
New version of flagship
LinkedIn app
250+ committers across
Android & iOS
~1 year of development
Investment in mobile
infrastructure at
LinkedIn
83. Before Voyager
12 releases per year
RC build + manual regression suite
Mad dash to commit code before the RC cutoff
Missing the cutoff meant a long wait for the next release
Product & marketing plans were made around the monthly releases
Hard to iterate on member feedback
84. 3x3
Release three times per day, no more than three
hours from code commit to member availability
85. Why three hours?
Not enough time for manual testing steps
Not enough time to test everything
The goal isn’t 100% automation, it’s faster iterations
We don’t want engineers spending most of their time maintaining tests that break whenever a
design changes
UI tests are chosen based on production-critical business flows
Faster iteration helps emphasize craftsmanship
Devs can take the extra time to write quality code since the next release is soon
88. Static analysis
Compile-time contract with API server using Rest.li
Rest.li data templates are shared between API server & clients
Provides static analysis checks that guarantee backwards compatibility
Client models are code generated for further safety
Java Checkstyle
Android Lint
Over 200 checks provided by Google
Several custom checks written for LinkedIn-specific patterns
89. Building the code
Over 500k lines of code between Android & iOS
Building production binaries for a large codebase is slow
iOS & Swift
At one point early on, Swift compilation took over two hours
Refactoring into sub-projects and modules led to a more than 50% speed-up
Android Split APKs
Separate binary for each combination of screen density and CPU architecture
Distributed builds
90. What do we test?
Unit tests
Layout tests
Unit tests for views
Stress test views with long strings, short strings
Make sure views don’t overlap, and render properly in right-to-left mode
Scenario tests
Validate that key business metric flows are working properly
Usually flows that span multiple screens in the app
91. Test stability
UI tests use Android Espresso & iOS KIF frameworks
Needed to create a consistent test environment across dev & build machines
Android
Self-contained, standardized Android emulator bundle
Custom test harness that runs one test per application launch
iOS
KIF contained hard-coded waits that passed on dev boxes, but failed on slower build servers
Forked KIF to convert to a checkpoint-based system,
where the code tells the test when to proceed to the next step
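The checkpoint idea, replacing hard-coded sleeps with polling until the code under test signals readiness, can be sketched in plain Java (this is a generic illustration of the technique, not LinkedIn's KIF fork):

```java
import java.util.function.BooleanSupplier;

/**
 * Sketch of a checkpoint-style wait: instead of sleeping for a fixed time
 * (which passes on fast dev boxes and fails on slow build servers), poll
 * until the code under test reports that it is ready. Illustrative only.
 */
public final class Checkpoint {
    private Checkpoint() {}

    /** Returns true once {@code ready} holds, false if the timeout expires first. */
    public static boolean awaitCondition(BooleanSupplier ready, long timeoutMs) {
        long deadline = System.currentTimeMillis() + timeoutMs;
        while (System.currentTimeMillis() < deadline) {
            if (ready.getAsBoolean()) {
                return true; // proceed as soon as the checkpoint is reached
            }
            try {
                Thread.sleep(10); // short poll interval, not a hard-coded wait
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return ready.getAsBoolean(); // one final check at the deadline
    }
}
```

The key property is that the wait ends as soon as the condition holds, so fast machines aren't slowed down and slow machines aren't flaky.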
92. Test speed
Android
Use Espresso’s IdlingResource API to avoid sleeps and waits
Run up to 16 Android emulators in parallel on a single build machine
Custom test harness allows optimal test parallelization
iOS
Refactoring KIF cut UI testing time by more than 80%
Distributed testing -> Shard tests across multiple machines
Significantly faster, but led to greater exposure to any tooling instability
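Sharding tests across machines, as mentioned above, needs a stable assignment so every test runs exactly once across the fleet. A minimal hash-based sketch (our own illustration, not LinkedIn's harness):

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch of stable, hash-based test sharding across N machines (illustrative). */
public final class Sharder {
    private Sharder() {}

    /** Returns the subset of tests that shard {@code shardIndex} of {@code shardCount} should run. */
    public static List<String> testsForShard(List<String> allTests, int shardIndex, int shardCount) {
        List<String> mine = new ArrayList<>();
        for (String test : allTests) {
            // Stable assignment: the same test name always lands on the same shard,
            // and each test lands on exactly one shard.
            if (Math.floorMod(test.hashCode(), shardCount) == shardIndex) {
                mine.add(test);
            }
        }
        return mine;
    }
}
```

Hash-based assignment keeps shards stable between runs; real harnesses often refine this with per-test timing data so shards finish at roughly the same time.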
96. Partner teams
Historically, several partner teams validated the
build before a release
For example, we needed sign off from the
localization team
Lint checks catch hardcoded or improperly
formatted strings
Layout tests catch strings that are too long and
RTL layout bugs
Semantic correctness of translations is still
validated by translators manually
97. Getting to members
Every three hours, internal alpha testers get a new build
Mainly members of the Flagship team
Product managers, devs, and execs who want to see the latest code ASAP
Every week, the rest of the company gets a new beta build
iOS build is submitted to Apple for review
After a week of beta, the build is promoted to production
Assuming Apple’s review is complete, iOS is released
Take advantage of Google Play staged rollout for Android
98. Dogfooding
Android: Google Play alpha/beta channel
Easy upgrades for employees, even while off the corporate network
Somewhat difficult to get set up, but easy once registered
iOS: TestFlight
Nice, but limited number of users
iOS: Custom enterprise distribution
Scales to our number of users, but employees must be on corporate Wi-Fi to upgrade
Splash screen in the production app encourages employees to use beta builds
99. Minimizing risk & enabling experiments
Take advantage of LinkedIn’s existing A/B testing infrastructure
New features are developed behind feature flags
Code can be ramped dynamically to different groups of members
Performance of new features or changes can be monitored
Dynamic configuration
Server-controlled kill switch
Crashing or buggy code can often be disabled without a new build
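A server-controlled kill switch of the kind described above can be sketched in a few lines of Java (an illustrative model, not LinkedIn's actual configuration system): the client checks a flag before running feature code, and the server can flip the flag without shipping a new build.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Sketch of a server-controlled feature flag / kill switch (illustrative). */
public class FeatureFlags {
    private final Map<String, Boolean> config = new ConcurrentHashMap<>();

    /** Called whenever a fresh config payload arrives from the server. */
    public void update(String feature, boolean enabled) {
        config.put(feature, enabled);
    }

    /** Features default to off until the server explicitly enables them. */
    public boolean isEnabled(String feature) {
        return config.getOrDefault(feature, false);
    }
}
```

Defaulting to off means a feature that never receives config stays dormant, and setting a flag to false acts as the kill switch: crashing or buggy code paths are skipped on the next config refresh, with no app-store release required.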
100. 3x3 after 6 months: areas to improve
Release automation
Production uploads to the app stores are still a manual process
Getting release notes & translations is painful
Automated performance testing
We can sample performance of the app in production,
but don’t have a great way of catching issues before release
Android Monkey testing
Enables wide range of API level & device coverage with very low overhead cost
iOS speed improvements
102. 3x3 blogs & videos
3x3: Speeding up mobile releases
3x3: iOS Build Speed and Stability
Test Stability - How We Make UI Tests Stable
UI Automation: Keep it Functional - and Stable!
Consistent Android Testing Environments with Gradle (slides)
Effective Layout Testing Library for iOS
Managing iOS Continuous Integration at Enterprise Scale