Although the IRS can process your refund quickly, it does not have a reputation for quick internal change; it still uses many applications from the Kennedy administration. But Andrea Schneider, Senior Manager of Product Management in Online Services, is helping the IRS launch an MVP early in next year’s filing season that will allow taxpayers to access basic tax information online, including their balance, tax records, and payment histories, as well as make a payment. She, along with Lauren Gilchrist, Product Manager at Pivotal Labs, who is partnering with the IRS to build the MVP, will discuss their biggest challenges along the way and share advice for experimenting and iterating inside large bureaucracies.
Minimum Viable Taxes - Lessons learned building a MVP in the IRS
Andrea Schneider @AZSchneiderinMD
Lauren Gilchrist @lgilchrist
Lovingly illustrated by Linda Joy @ljoy
Editor's Notes
I’m Lauren, from Pivotal Labs
I’m Andrea, from the IRS
Service levels are declining
Call center costs are increasing
IRS budget is shrinking
You filed your taxes and you have a balance you can’t pay, or you’ve gone through an audit, or you’ve filed your taxes incorrectly. You owe money to the IRS.
The IRS communicates that via a letter (not email, not a phone call).
People get the letter, and their first reaction is to call the IRS. They get on the phone, wait in the queue for hours, and it’s very possible they won’t even get through to someone.
In 2014 the IRS collected $3 trillion in taxes.
It receives an average of 200,000 calls a month asking “What is my balance due?”
On a constrained budget, we need to figure out the most cost-effective way to serve taxpayers who need this question answered.
This is an obvious problem to fix with software.
There are a ton of obstacles to launching software as a federal agency. The IRS is still running some applications from the Kennedy administration.
To build software in the government, you start with an investment proposal. This begins 2 years out, and gives a high-level framing to OMB.
Once this gets approved, traditionally, business spends about a year writing requirements.
Then, IT spends another year building to those requirements. Unfortunately, those requirements often change, and IT doesn’t know about it. This extends the deadline because IT was building to an out-of-date requirement with little input from the business.
We have additional security and testing requirements that can take another 6 months.
And finally when all is said and done, someone has to sign an Authority to Operate agreement, which lists out every single risk of launching this software, so that if something goes wrong, that person can be accountable in front of Congress.
Compound that with the fact that the IRS processed 240 million tax returns in 2014, most of which were filed between January 15 and April 15.
This means we do everything possible to not deploy software during filing season. So we can’t just make changes at any time.
Some of these problems are outside our control.
OMB budgeting process? Probably can’t fix that.
Shortening the test and security process cycles? We can probably fix that. Breaking down the wall between business and developers so we can respond to changes in requirements faster? We can probably fix that too.
We decided to solve this problem in a more agile fashion.
We set out with the aim of delivering something faster, not defining everything upfront, and responding more quickly to taxpayer needs.
We know we’re not awesome at this, and it’s new to us. We need some help changing our ways. How do the pros do it? We wind up partnering with Pivotal Labs.
Labs is an agile delivery consultancy.
We see a lot of organizations like the IRS who turn to agile to help solve their problems.
What we’ve learned over the years is that Agile is a planning and delivery methodology that can deliver a product to market more quickly and with fewer errors.
Agile can break down barriers between teams, and it can shorten cycle times to deliver software to users.
Unfortunately, Agile cannot ensure that you build the right software. In fact, it can even help you deliver the wrong software faster.
And to compound this problem…
We need to have confidence that the software meets the needs of our customers.
The IRS, or any other government agency, or nonprofit for that matter, cannot be cavalier with how we spend our money.
This isn’t like the private sector, where people who don’t like your product just don’t buy it.
Everyone in America pays for the software the IRS is making.
Wouldn't you want the IRS to be making software that actually solves taxpayer needs?
Wouldn't you rather the IRS build something you actually need?
We certainly would!
But as we learned from healthcare.gov, government can’t afford to fail when it rolls out software.
When Pivotal Labs began its engagement with the IRS, we knew we needed to build the right software.
I began to ask if lean methodologies could help ensure that we build the right software.
Luckily, I get to work with Janice Fraser, who did that wonderful workshop on the Leader’s Guide that you hopefully attended on Monday.
Janice has always advocated that lean breaks down into four very basic concepts.
So we kick off this engagement with the IRS, and we decide to understand the assumptions we are making.
The first thing we ask is “What problem are we solving, and how do we know if we are right?”
The IRS answers: we think the problem is that taxpayers want to know their basic tax history.
When taxpayers call, a customer service rep can tell them everything that’s going on in their account, including how much they owe, what they’ve paid in prior years, if they’re due a refund, etc.
We think it’s our obligation to provide as much information to the taxpayer as we possibly can, particularly because this information is not currently available anywhere online.
So we start down the path of solving the problem with an online account.
Along the way, we decide to ask some innocent questions.
Is this really what taxpayers want? How do we know for sure?
Understanding your customers is a major challenge.
People that owe the IRS money are not exactly lining up around the block to give feedback on software that the IRS is building.
So we had to get creative. We talked to people who had made a payment to the IRS. We asked them a ton of questions about the last time they paid the IRS and what that experience was like.
But we didn’t stop there...
We took some early design prototypes, put them in front of taxpayers, and asked them to complete simple tasks such as checking their balance and seeing if they were due a refund.
These were clickable prototypes.
We sweetened the deal with free donuts.
What we learned is that taxpayers were overwhelmed by our efforts to show them all the information they could possibly want about their account.
Turns out, taxpayers didn’t want to know everything. They simply wanted to know if they owed anything to the IRS, and if they did, what they owed.
Furthermore, they wanted to see that information presented in a way that made sense to them, like a bank statement or a credit card bill.
Here’s an example of design iterations that we did as a result of taxpayer feedback.
On the left, you’ll see that we’ve got all kinds of information about taxes paid, refund due, and balance due.
On the right, we streamlined this information to only reflect what, if anything, the taxpayer owed.
We also made adjustments to make this look more like their mental model of a credit card bill or a bank statement.
By going through a lean cycle of listing our assumptions, talking to real taxpayers, putting a real prototype in front of people, and then adjusting based on evidence, we became more and more confident that we were building the right software.
We continued to do this feedback cycle every two weeks throughout our work with the IRS.
These short feedback loops gave us a lot of confidence we were building the right thing for taxpayers.
So now we want to switch gears and talk a bit more about what we learned and how you can apply lean principles to building software in the government, a nonprofit, or any other environment with significant constraints.
And we have a couple of pieces of advice that we want to share:
Imagine a world where the IRS is searching for people who owe money to the IRS to give feedback on a prototype. Hard to picture people lining up for that one, even for free donuts.
In the past, it’s been very difficult to get access to real people if you work in the government building software. The Paperwork Reduction Act requires an extensive approval process if a government employee wants to solicit opinions from 10 or more people at a time.
This legislation was designed to make surveys and forums more just, but when you want to get prototypes in the hands of people and get feedback, it can feel like a huge hurdle.
Luckily much of this is changing, and folks in 18F have been helping government agencies design within the constraints of the PRA by publishing their design methods, which are a great crib sheet for how to color inside the lines in the eyes of the law.
So all that being said, we still had to do a ton of legwork to find people to give us feedback on what we were working on, including posting Google signup forms across every form of social media, and soliciting feedback in our kitchen by offering free donuts. We easily spent 10 hours just finding people to talk to us.
Because this is so difficult, in the past, the solution was to just test with IRS employees.
Unfortunately, employees are not a good substitute for real taxpayers.
In fact, IRS employees can face disciplinary action for not filing their taxes correctly. This means they know too much. Every IRS employee that saw our designs automatically knew that interest and penalties accrue daily for a balance due.
Unfortunately, taxpayers that we tested with did NOT know that, and we had to adjust our designs to help inform them of the consequences of not paying quickly.
So make sure you are giving preference to real users, not employees.
We can’t afford to put something into production and see how it goes (like startups do).
Remember healthcare.gov?
No one wants to do that ever again.
One of the principles of lean is to put something in the real world quickly, but that can look different for the government, and doesn’t necessarily involve putting software into production. Though that’s a great goal to have.
So instead, we used high-fidelity clickable prototypes to see how users would really interact with the software we were building.
We asked them non-leading questions and asked them to complete simple tasks.
Testing with real people was crucial to us gaining confidence that we were on track to building the right software.
So to recap, make sure you do the legwork to find real, representative users to test with.
When we first started trying to spin up this engine of regular user feedback, we got pushback within the IRS that the number of taxpayers we were testing with was not statistically significant or representative of all taxpayers.
There’s logic behind this.
When the government needs to survey people, there is legislation in place to ensure it is done fairly and broadly.
Unfortunately, that also depends on having something out there in the public to survey about.
We don’t have the luxury of building the wrong thing and then surveying to see how it did.
That means we need to get feedback from a smaller number of people and look for patterns.
So when we do usability testing with people, we learn a LOT from the first person we test with. And then we get a lot of overlap from the second person. The third person gives you diminishing returns. The things that are most salient are repeated over and over again.
There’s a quote from the legendary user researcher Jared Spool, who says that “statistical significance” does not mean “more data.” It means the data you have can predict overall outcomes.
By putting prototypes in front of 8 people at a time, we were able to see enough patterns to make decisions and gain evidence that we were on track to building the right software, and predict the right outcome.
So remember that evidence doesn’t always need to be statistically significant.
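One way to reason about these diminishing returns is the problem-discovery model published by Jakob Nielsen and Tom Landauer. The talk does not cite it, so treat it here as an outside assumption rather than the team's method. It estimates the share of usability problems found after n testers as 1 - (1 - L)^n, where L is the chance that a single tester hits a given problem. The default of 0.31 below is the typical value Nielsen reports; the real rate for any given product may differ, and the function name is hypothetical.

```python
def share_of_problems_found(testers: int, per_tester_rate: float = 0.31) -> float:
    """Estimate the fraction of usability problems seen after `testers` sessions.

    Uses the Nielsen-Landauer model: 1 - (1 - L) ** n.
    `per_tester_rate` (L) is an assumed average, not a figure from the talk.
    """
    if testers < 0:
        raise ValueError("testers must be non-negative")
    if not 0.0 <= per_tester_rate <= 1.0:
        raise ValueError("per_tester_rate must be between 0 and 1")
    return 1.0 - (1.0 - per_tester_rate) ** testers


if __name__ == "__main__":
    previous = 0.0
    for n in range(1, 9):
        found = share_of_problems_found(n)
        # The gain from each extra tester shrinks as n grows.
        print(f"{n} testers: {found:.0%} found (+{found - previous:.0%})")
        previous = found
```

Under these assumptions, a round of 8 testers would surface most of the problems in a prototype, which is consistent with the patterns described above without being proof of them.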
As a contractor, it was really intimidating to design software that could touch every single American.
I’ve never worked on that scale before, and I imagine most of us in this room never will.
Additionally, you have to remember that every IRS employee is a public servant and has taken an oath to protect the American taxpayer.
That means it can be really easy to get caught up in edge cases and anecdotal experiences when designing a product.
For me, it was really important to balance both qualitative and quantitative evidence when making decisions about what software is right.
I’ll give you two examples.
In one scenario, we had decided to leave out a feature to display what’s called the shared responsibility payment (the penalty you pay if you are uninsured under Obamacare), because taxpayers did not understand it when interacting with our prototype. We got qualitative evidence that we didn’t need it. We subsequently learned that about 30% of taxpayers owed this payment, so the quantitative evidence made us add it back in.
On the flipside, we spent a lot of time worrying about how to handle an experience for users who need to pay on a future date, and designed several iterations of a prototype.
We ultimately learned that the number of calls about this problem only represents about 0.5% of all calls about balance due. (Mind you, that’s still 1,000 calls a month.)
So the lesson here is that evidence can be finicky and plagued with opinions and agendas. It’s your job to balance both quant and qual, and not get stuck in analysis paralysis.
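As a quick check on the quantitative side of that example, the sketch below turns a share of calls into a monthly call count. The 200,000 monthly balance-due calls and the 0.5% share are the rounded figures quoted in the talk; the helper name `monthly_calls_for` is hypothetical and only for illustration.

```python
# Rounded figures quoted in the talk.
MONTHLY_BALANCE_CALLS = 200_000  # average monthly "what is my balance due" calls
FUTURE_PAYMENT_SHARE = 0.005     # about 0.5% of those calls


def monthly_calls_for(share: float, total_calls: int = MONTHLY_BALANCE_CALLS) -> int:
    """Return the approximate number of monthly calls that a share represents."""
    if not 0.0 <= share <= 1.0:
        raise ValueError("share must be between 0 and 1")
    return round(share * total_calls)


future_payment_calls = monthly_calls_for(FUTURE_PAYMENT_SHARE)
print(future_payment_calls)  # 1000
```

A small percentage of a large volume is still a real number of people, which is why the share alone is not enough to decide whether a feature is worth the design effort.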
And that leads to my last point...
I spoke last year at this conference about how thinking like a scientist, instead of an expert, helped me be a better consultant, and helped me facilitate teams to better outcomes.
I found that mentality has really applied in working with the IRS.
Because IRS employees are taxpayers, they use the software that we build as well.
They have opinions on how the software should look and function.
There is also a strong government culture around what are called SMEs, or subject matter experts.
There’s a lot of “expert” mentality in the government, and as we know, those opinions can often result in stalemates or analysis paralysis, or even government shutdowns.
The best way we’ve found to combat this culturally is to think of ourselves as scientists, not experts.
We don’t know what the best experience is for users, but we can find it out by listing our assumptions, talking to our customers, putting something in front of them, and adjusting based on real evidence.
So when you run into expert mentality, try to use evidence to your advantage. And try to inspire others to think that way as well.
We are providing a federal service via the internet.
This means any software we release is legally bound to be accessible to taxpayers using assistive technology, such as a screen reader, voice command software, or keyboard-only navigation.
A startup very rarely has to think about making its first release compliant, though there is a small and active community trying to change that fact.
For the government, we don’t have a choice. So we decided to embrace the constraint, and adjust our processes to make accessibility a first-class user experience.
This meant that product, design, development, and our 508 expert all collaborated on what the experience of a feature should be for a user who is visually impaired, or accesses the web with dictation software.
Government has rules and regulations that it needs to follow that don’t apply to startups.
We are legally required to be auditable by an independent regulatory body, which means we must document our processes and that we have followed those processes.
We are legally bound to security regulations from NIST SP 800-53, which require an independent security audit.
We already mentioned the Authority to Operate agreement, which requires someone to be dragged in front of Congress if something goes wrong.
Compliance is table stakes.
We needed to adjust our process to ensure we could iterate and compliance would still be a first-class citizen.
Launching an MVP in a week, while it sounds like unicorns dancing on rainbows, is likely not going to happen at a government agency.
But we’re learning, and we’re getting better.
For us, it’s important to remember our MVP needs to be both viable for our users and viable for all the federal regulations in place for a government agency to launch software.
We’re learning a ton. We could do a whole other talk on how we’ve been trying to adjust processes and regulations to be compliant but allow us to quickly course correct to ensure we are building the right thing.
We are also embracing the growing digital government community that’s helping pave the way to build more MVPs in government, including 18F, USDS, and DigitalGov.
We wanted to speak here because we think there are not many examples of large institutions, in particular government agencies, embracing lean.
We wanted to share an open and honest perspective on our experience in hopes that others will open up about theirs and be inspired to try this.
Janice Fraser, my colleague who ran the Leader’s Guide workshop, always says, “Try it on Monday.”
The things we’ve learned by embracing lean have saved thousands of hours of development, and countless taxpayer dollars.
This is hard, but it’s also really rewarding.
Don’t be afraid. Try it on Monday! Share your stories!