The Digital Services Playbook, which is sponsored by the Chief Information Officers Council, lays out best practices for building effective digital services like web and mobile applications and will serve as a guide for agencies across government.
About the Digital Services Playbook
The American people expect to interact with government through digital channels such as websites, email, and mobile applications. By building digital services that better meet the needs of the people who use them, government can make the delivery of its policies and programs more effective.
Today, too many government digital service projects do not work well, are delivered late, or run over budget. To increase the success rate of these projects, the U.S. Government needs a new approach. This is why the CIO Council created a playbook of 13 key “plays” drawn from successful practices in the private sector and government that, if followed together, will help government build effective digital services.
DIGITAL SERVICE PLAYS
- Understand what people need
- Address the whole experience, from start to finish
- Make it simple and intuitive
- Build the service using agile and iterative practices
- Structure budgets and contracts to support delivery
- Assign one leader and hold that person accountable
- Bring in experienced teams
- Choose a modern technology stack
- Deploy in a flexible hosting environment
- Automate testing and deployments
- Manage security and privacy through reusable processes
- Use data to drive decisions
- Default to open
PLAY 1
Understand what people need
We must begin digital projects by exploring and pinpointing the needs of the people who will use the service, and the ways in which the service will fit into their lives. Whether the users are members of the public or government employees, policy makers must include real people in their design process from the very beginning. The needs of people — not constraints of government structures or silos — should drive technical and design decisions. We need to continually test the products we build with real people to keep us honest about what is important.
checklist
- Early in the project, spend time with current and prospective users of the service
- Use a range of qualitative and quantitative user research methods to determine people’s goals, needs, and behaviors; be thoughtful about the time spent
- Test prototypes of possible solutions with real people, in the field if possible
- Document the findings about user goals, needs, behaviors, and preferences
- Share findings with the team and agency leadership
- Create a prioritized list of user stories, which are short descriptions of the goals the user is trying to accomplish (for example: “As a small-business owner, I want to check the status of my permit application online so that I don’t have to call the office”)
- As the digital service is being built, regularly test it with potential users to ensure it will meet people’s needs
key questions
- What user needs will this service address?
- Why does the user want or need this service?
- Who are your key users?
- Which people will have the most difficulty with your service?
- What research methods were used?
- What were the key findings from users’ current experience?
- How were the findings documented? Where can future team members access the documentation?
- How often are you testing with real people?
PLAY 2
Address the whole experience, from start to finish
We must build digital services with an understanding of the range of ways a person might interact with our service, including the actions they take online, through a mobile application, on the phone, or in person. Every encounter should move the user closer towards the desired outcome, whether that encounter is online or offline.
checklist
- Understand the different points at which people will interact with the service – both online and in person
- Identify pain points in the current way users interact with the service, and prioritize these according to user needs
- Design the digital parts of the service so that they are integrated with the offline touch points people use to interact with the service
- Develop metrics that will measure how well the service is meeting user needs, at each step of the service
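The last item above can be made concrete with a simple step-by-step funnel: count how many people reach each stage of the service and where they drop off. A minimal sketch in Python follows; the step names and counts are hypothetical placeholders, and a real service would pull these numbers from its analytics.

```python
# Funnel sketch: completion rate at each step of a service.
# Step names and counts are hypothetical placeholders.
steps = [
    ("landed on form page", 10_000),
    ("started application", 6_200),
    ("uploaded documents", 3_900),
    ("submitted application", 3_400),
]

first = steps[0][1]
prev = first
for name, count in steps:
    print(f"{name:22s} {count:6,d}  "
          f"{count / prev:5.0%} of previous step, {count / first:5.0%} overall")
    prev = count
```

A sharp drop between two adjacent steps points directly at the pain points the checklist asks teams to identify and prioritize.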
key questions
- What are the different ways (both online and offline) that people currently accomplish the task the digital service is designed to help with?
- Where are user pain points in the current way people accomplish the task?
- Where does this specific project fit into the larger way people currently obtain the service being offered?
- What metrics will best indicate how well the service is working for its users?
PLAY 3
Make it simple and intuitive
Using a government service shouldn’t be stressful, confusing, or daunting — it’s our job to build services that are simple and intuitive enough that users succeed the first time, unaided.
checklist
- Create or use an existing, simple, and flexible design style guide for the service
- Use the design style guide across related digital services
- Provide users with clear information about where they are in the process as they use the service
- Follow accessibility best practices to ensure all people can use the service
- Provide users with a way to exit and return later to complete the process
- Use language that is familiar to the user and is easy to understand
- Use language and design consistently throughout the service, including in the online and offline (non-digital) touch points people use to interact with the service
key questions
- What primary tasks are the user trying to accomplish?
- What is the reading level of the language the service uses? (a readability sketch follows these questions)
- What languages is your service offered in?
- If a user needs help while using the service, how do they go about getting it?
- How does the service’s design visually relate to other government services?
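The reading-level question above can be checked mechanically rather than by guesswork. The sketch below assumes the standard Flesch-Kincaid grade-level formula with a crude vowel-group syllable count; a team would more likely use a vetted readability tool, but the arithmetic is the same.

```python
import re

def count_syllables(word: str) -> int:
    # Crude heuristic: each run of consecutive vowels counts as one syllable.
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def fk_grade(text: str) -> float:
    # Flesch-Kincaid grade level:
    # 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59
    sentences = max(1, len(re.findall(r"[.!?]+", text)))
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    return 0.39 * len(words) / sentences + 11.8 * syllables / len(words) - 15.59

print(f"{fk_grade('Apply online. We will mail you a decision within ten days.'):.1f}")
```

Plain-language guidance commonly targets roughly an 8th-grade reading level or below for services aimed at the general public.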
PLAY 4
Build the service using agile and iterative practices
We should use an incremental, fast-paced style of software development to reduce the risk of failure by getting working software into users’ hands quickly, and by providing frequent opportunities for the delivery team members to adjust requirements and development plans based on watching people use prototypes and real software. A critical capability is being able to automatically test and deploy the service so that new features can be added often and easily put into production. Following agile methodologies is a proven best practice for building digital services, and will increase our ability to build services that effectively meet user needs.
checklist
- Ship a functioning “minimum viable product” (MVP) that solves a core user need addressed by the service as soon as possible, and not longer than three months from the beginning of any new digital project, using a “beta” or “test” period if needed
- Run usability tests frequently to see how well the service works for users, and identify improvements that should be made
- Ensure the individuals building the service are in close communication using techniques such as war rooms, daily standups, and team chat tools
- Keep delivery teams small and focused; limit organizational layers that separate these teams from the business owners
- Release features and improvements multiple times each month
- Create a prioritized list of features and bugs, also known as the “feature backlog” and “bug backlog” (a minimal sketch follows this checklist)
- Use an “issue tracker” to catalog features and bugs
- Use a source code version control system
- Ensure the entire team has access to the issue tracker and version control system
- Use code reviews to ensure quality
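The feature and bug backlog named above is, at bottom, an ordered list the whole team can see. A minimal sketch of the idea in Python, with hypothetical items and priorities; in practice the backlog lives in the team’s issue tracker, not in code.

```python
from dataclasses import dataclass

@dataclass
class BacklogItem:
    title: str
    kind: str      # "feature" or "bug"
    priority: int  # 1 = highest, as set by the product owner

backlog = [
    BacklogItem("Status lookup times out on slow connections", "bug", 1),
    BacklogItem("Let users save a half-finished application", "feature", 2),
    BacklogItem("Send an email receipt after submission", "feature", 3),
]

# Each sprint, the team pulls work from the top of the prioritized list.
for item in sorted(backlog, key=lambda i: i.priority):
    print(f"P{item.priority} [{item.kind}] {item.title}")
```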
key questions
- How long did it take to ship the MVP? If it has not shipped yet, when will it?
- How long does it take for a production deployment?
- How long in days are the iterations/sprints?
- Which source code version control system is being used?
- What tool is being used to track bugs and issue tickets?
- What tool is being used to manage the feature backlog?
- How often do you review and reprioritize the items in your feature and bug backlog?
- How do you collect user feedback during development, and how is that feedback used to improve the service?
- At each stage of usability testing, what gaps were identified in addressing user needs?
PLAY 5
Structure budgets and contracts to support delivery
To improve our chances of success when contracting out development work, we need to work with experienced budgeting and contracting officers. In cases where we use third parties to help build a service, a well-defined contract can facilitate good development practices like conducting a research and prototyping phase, refining product requirements as the service is built, evaluating open source alternatives, ensuring frequent delivery milestones, and allowing the flexibility to purchase cloud computing resources.
The TechFAR Handbook provides a detailed explanation of the flexibilities in the Federal Acquisition Regulation (FAR) that can help agencies implement this play.
checklist
- Budget includes research, discovery, and prototyping activities
- Contract is structured to request frequent deliverables and not multi-month milestones
- Contract is structured to hold vendors accountable to deliverables
- Contract allows the government delivery team the flexibility to adjust feature prioritization as the project evolves
- Contract ensures open source solutions are evaluated alongside commercial solutions when technology choices are made
- Contract specifies that software and data generated by third parties remain under our control, and can be reused and released to the public as appropriate and in accordance with the law
- Contract allows us to use tools, services, and hosting from vendors with a variety of pricing models, including fixed fees and variable service-based models like “pay-for-what-you-use” services
- Contract specifies a warranty period where defects uncovered by the public are addressed by the vendor(s) at no additional cost to the government
- Contract includes a transition of services period and transition-out plan
key questions
- How frequent are the delivery milestones?
- What are the performance metrics defined in the contract (e.g., response time, system uptime, time to resolve priority issues)?
PLAY 6
Assign one leader and hold that person accountable
There must be a single product owner who has the authority and responsibility across teams to assign tasks and work elements and to make business, product, and technical decisions, and who is accountable for the success or failure of the overall service. This product owner is ultimately responsible for how well the service meets the needs of its users, which is how a service should be evaluated. The product owner is responsible for ensuring that features are built and for managing the feature and bug backlogs.
checklist
- A product owner has been identified
- All stakeholders agree that the product owner has the authority to assign tasks and make decisions about features and technical implementation details
- The product owner has a product management background with technical experience to assess alternatives and weigh tradeoffs
- The product owner has a work plan that includes budget estimates and identification of funding sources
- The product owner has a strong relationship with his or her contracting officer
key questions
- Who is the product owner?
- What organizational changes have been made to ensure the product owner has sufficient authority over and support for the project?
- What does it take for the product owner to add or remove a feature from the service?
PLAY 7
Bring in experienced teams
We need talented people working in government who have experience creating modern digital services. This includes bringing in seasoned product managers, engineers, and designers. When outside help is needed, our teams should work with contracting officers who understand how to evaluate third-party technical competency so our teams can be paired with contractors who are good at both building and delivering effective digital services. The makeup and experience requirements of the team will vary depending on the scope of the project.
checklist
- Member(s) of the team have experience building popular, high-traffic digital services
- Member(s) of the team have experience designing mobile and web applications
- Member(s) of the team have experience using automated testing frameworks
- Member(s) of the team have experience with modern development and operations (DevOps) techniques such as continuous integration and continuous deployment
- Member(s) of the team have experience securing digital services
- A Federal contracting officer is on the internal team if a third party will be used for development work
- A Federal budget officer is on the internal team or is a partner
- The appropriate privacy, civil liberties, and/or legal advisor for the department or agency is a partner
PLAY 8
Choose a modern technology stack
The technology decisions we make need to enable development teams to work efficiently and enable services to scale easily and cost-effectively. Our choices for hosting infrastructure, databases, software frameworks, programming languages, and the rest of the technology stack should seek to avoid vendor lock-in and match what successful modern consumer and enterprise software companies would choose today. In particular, digital services teams should consider using open source, cloud-based, and commodity solutions across the technology stack, as these solutions have seen widespread adoption and support by the most successful private-sector consumer and enterprise software technology companies.
checklist
- Choose software frameworks that are commonly used by private-sector companies creating similar services
- To the extent practical, ensure that software can be deployed on a variety of commodity hardware types
- Ensure that each project has easy-to-understand instructions for setting up a local development environment (see the sketch after this checklist), and that team members can be quickly added to or removed from projects
- Consider open source software solutions at all layers of the stack
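As one illustration of those setup instructions, the sketch below assumes a Python project whose dependencies are pinned in a requirements.txt file; the file names are illustrative, and the same one-command idea applies to any stack.

```python
# bootstrap.py - one-command local development setup (illustrative).
# Assumes dependencies are pinned in requirements.txt at the repository root.
import subprocess
import sys
import venv

ENV_DIR = ".venv"

venv.create(ENV_DIR, with_pip=True)  # isolated per-project environment
pip = (f"{ENV_DIR}/bin/pip" if sys.platform != "win32"
       else rf"{ENV_DIR}\Scripts\pip")
subprocess.check_call([pip, "install", "-r", "requirements.txt"])
print("Environment ready; run the test suite to verify the setup.")
```

A new team member who can go from checkout to passing tests with one command is the outcome this checklist item is after.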
key questions
- What is your development stack and why did you choose it?
- What database(s) are you using and why did you choose them?
- How long does it take for a new team member to set up a local development environment?
PLAY 9
Deploy in a flexible hosting environment
Our services should be deployed on flexible infrastructure, where resources can be provisioned in real time to meet spikes in user demand. Our digital services are crippled when we host them in data centers that market themselves as “cloud hosting” but require us to manage and maintain hardware directly. This outdated practice wastes time, weakens our disaster recovery plans, and results in significantly higher costs.
checklist
- Resources are provisioned on demand
- Resources scale based on real-time user demand
- Resources are provisioned through an API (see the sketch after this checklist)
- Resources are available in multiple regions
- We pay only for the resources we use
- Static assets are served through a content delivery network
- Application is hosted on commodity hardware
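“Provisioned through an API” means a script, not a ticket or a phone call, obtains new capacity. The sketch below assumes AWS EC2 and the boto3 library purely as one example of a commodity cloud provider; the machine image ID is a placeholder, and any provider with a programmable API supports the same pattern.

```python
import boto3  # AWS SDK for Python, used here as one example provider

ec2 = boto3.client("ec2", region_name="us-east-1")

# Request one small commodity server; capacity arrives in minutes and is
# billed only while it runs.
response = ec2.run_instances(
    ImageId="ami-12345678",   # placeholder machine image ID
    InstanceType="t3.micro",  # small commodity instance type
    MinCount=1,
    MaxCount=1,
)
print(response["Instances"][0]["InstanceId"])
```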
key questions
- Where is your service hosted?
- What hardware does your service run on?
- What is the demand / usage pattern for your service?
- What happens to your service when it experiences a surge in traffic or load?
- How much capacity is available in your hosting environment?
- How long does it take you to provision a new resource such as an application server?
- How have you designed your service to scale based on demand?
- How are you paying for your hosting infrastructure — i.e., by the minute, hourly, daily, monthly, fixed?
- Is your service hosted in multiple regions / availability zones / data centers?
- In the event of a catastrophic disaster at a data center, how long will it take to have the service operational again?
- What would be the impact of a prolonged downtime window?
- What data redundancy do you have built into the system, and what would be the impact of a catastrophic data loss?
- How often do you need to contact a person from your hosting provider to get resources or to fix an issue?
PLAY 10
Automate testing and deployments
Today, developers write automated scripts that can verify thousands of scenarios in minutes and then deploy updated code into production environments multiple times per day. They use automated performance tests which simulate surges in traffic to identify performance bottlenecks. While manual tests and quality assurance are still necessary, automated tests provide consistent and reliable protection against unintentional regressions, and make it possible for developers to confidently release frequent updates to the service.
checklist
- Create automated tests that verify all user-facing functionality (a minimal example follows this checklist)
- Create unit and integration tests to verify modules and components
- Run tests automatically as part of the build process
- Perform deployments automatically with deployment scripts, continuous delivery services, or similar techniques
- Conduct load and performance tests at regular intervals, including before public launch
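Below is a minimal example of the first checklist item, written in the pytest style common in Python projects. The status_for() function stands in for a real piece of user-facing functionality and is entirely hypothetical; the point is that each behavior has a test that runs automatically on every build.

```python
import pytest

# Hypothetical stand-in for a user-facing feature: looking up a claim status.
_CLAIMS = {"CLM-1001": "In review", "CLM-1002": "Approved"}

def status_for(claim_id: str) -> str:
    return _CLAIMS[claim_id]  # unknown claims raise KeyError

def test_known_claim_returns_its_status():
    assert status_for("CLM-1001") == "In review"

def test_unknown_claim_is_an_error():
    with pytest.raises(KeyError):
        status_for("CLM-0000")
```

Wired into the build process, tests like these catch regressions before a deployment reaches users.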
key questions
- What percentage of the code base is covered by automated tests?
- How long does it take to build, test, and deploy a typical bug fix?
- How long does it take to build, test, and deploy a new feature into production?
- How frequently are builds created?
- What test tools are used?
- What deployment automation or continuous integration tools are used?
- What is the estimated maximum number of concurrent users who will want to use the system?
- How many simultaneous users could the system handle, according to the most recent capacity test?
- How does the service perform when you exceed the expected target usage volume? Does the service degrade gracefully or catastrophically?
- What is your scaling strategy when demand increases suddenly?
PLAY 11
Manage security and privacy through reusable processes
It is critical that our digital services protect sensitive information and keep systems secure. This is typically a process of continuous review and improvement that should be built into the development and maintenance of the service. At the start of designing a new service or feature, the team lead should engage the appropriate privacy, security, and legal officer(s) to discuss the type of information collected, how it should be secured, and how it may be used and shared. The sustained engagement of a privacy specialist helps ensure that personal data is properly managed. In addition, a key process in building a secure service is comprehensively testing and certifying the components in each layer of the technology stack for security vulnerabilities, and then re-using these same pre-certified components for multiple services.
The following checklist provides a starting point, but teams should work closely with their privacy specialist and security engineer to meet the needs of the specific service.
checklist
- Contact the appropriate privacy or legal officer of the department or agency to determine whether a System of Records Notice (SORN), Privacy Impact Assessment, or other review should be conducted
- Determine, in consultation with a records officer, what data is collected and why, how it is used or shared, how it is stored and secured, and how long it is kept
- Determine, in consultation with a privacy specialist, whether and how users are notified about how personal information is collected and used, including whether a privacy policy is needed and where it should appear, and how users will be notified in the event of a security breach
- Consider whether the user should be able to access, delete, or remove their information from the service
- “Pre-certify” the hosting infrastructure used for the project using FedRAMP
- Use deployment scripts to ensure that the configuration of the production environment remains consistent and controllable, as sketched below
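A minimal sketch of that last item: the approved configuration is recorded as data (here, file hashes) and every deployment verifies the running environment against it, so drift is detected rather than accumulating silently. The file name and hash value are hypothetical placeholders.

```python
import hashlib
from pathlib import Path

# Hashes recorded when the configuration was last reviewed and approved
# (hypothetical file name and value).
EXPECTED = {
    "app.conf": "9f86d081884c7d659a2feaa0c55ad015a3bf4f1b2b0b822cd15d6c15b0f00a08",
}

for name, expected_sha in EXPECTED.items():
    actual = hashlib.sha256(Path(name).read_bytes()).hexdigest()
    print(f"{name}: {'OK' if actual == expected_sha else 'DRIFT DETECTED'}")
```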
key questions
- Does the service collect personal information from the user (whether government or public)? How is the user notified of this collection?
- Does it collect more information than is needed to perform the requested task? Are there uses of the data that would not be expected by the average user?
- How does a user contact a responsible person to seek access, correction, deletion, or removal of his or her personal information?
- Will information stored in the system be shared with others?
- How and how often is the service tested for security vulnerabilities?
- How can someone from the public report a security issue?
PLAY 12
Use data to drive decisions
At all stages of a digital project, we should measure how well our service is working for our users. This includes measuring how well a system performs and how people are interacting with the system in real time. Our teams and agency leadership should carefully watch these metrics to proactively spot issues and identify which improvements should be prioritized. In addition to monitoring tools, a feedback mechanism should be in place for people to report issues directly.
checklist
- Monitor system-level resource utilization in real time
- Monitor system performance, measuring response time, latency, throughput, and error rates in real time
- Ensure the monitoring in place can measure median, 95th percentile, and 98th percentile performance (see the sketch after this checklist)
- Create automated alerts based on this monitoring
- Track concurrent users in real time, and monitor user behaviors (in the aggregate) to determine how well the service is meeting user needs
- Publish metrics internally
- Publish metrics externally
- Use an experimentation tool that supports multivariate testing in production
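The median, 95th percentile, and 98th percentile measurements called for above take only a few lines of arithmetic once response times are being recorded. A sketch in Python, with randomly generated sample latencies standing in for real monitoring data.

```python
import random
import statistics

# Hypothetical sample: 10,000 response times in milliseconds. A real service
# would stream these from its monitoring pipeline.
latencies_ms = [random.lognormvariate(5, 0.5) for _ in range(10_000)]

cuts = statistics.quantiles(latencies_ms, n=100)  # 99 percentile cut points
print(f"median: {statistics.median(latencies_ms):7.1f} ms")
print(f"p95:    {cuts[94]:7.1f} ms")  # 95th percentile
print(f"p98:    {cuts[97]:7.1f} ms")  # 98th percentile
```

Percentiles matter because averages hide the slow requests that frustrate real users.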
key questions
- What are the key metrics for the service?
- How have these key metrics performed over the life of the service?
- What system monitoring tool(s) are in place?
- What is the targeted average response time for your service? What percent of requests take more than 1 second, 2 seconds, 4 seconds, and 8 seconds?
- What is the average response time and percentile breakdown (percent of requests taking more than 1s, 2s, 4s, and 8s) for your service’s top 10 transactions?
- What is your service’s monthly uptime target?
- What is your service’s monthly uptime percentage including scheduled maintenance? Excluding scheduled maintenance?
- How does your team receive automated alarms when incidents occur?
- What is the volume of each of your service’s top 10 transactions? What is the percentage of transactions started vs. completed?
- What tool(s) are in place to measure user behavior?
- What tool/technology is used for A/B testing?
- How do you measure customer satisfaction?
PLAY 13
Default to open
When we collaborate in the open and publish our data publicly, we can improve Government together. By building services more openly and publishing open data, we simplify the public’s access to government services and information, allow the public to easily provide fixes and contributions, and enable reuse by entrepreneurs, nonprofits, other agencies, and the public.
checklist
- Offer users a mechanism to report bugs and issues, and be responsive to these reports
- Provide datasets to the public, in their entirety, through bulk downloads and APIs (application programming interfaces); a minimal sketch follows this checklist
- Ensure that data from the service is explicitly in the public domain, and that rights are waived globally via an international public domain dedication, such as the “Creative Commons Zero” waiver
- Catalog data in the agency’s enterprise data inventory and add any public datasets to the agency’s public data listing
- Ensure that we maintain the rights to all data developed by third parties in a manner that allows it to be released to and reused by the public at no cost
- Ensure that we maintain contractual rights to all custom software developed by third parties in a manner that allows it to be published and reused at no cost
- When appropriate, create an API for third parties to interact with the service directly
- When appropriate, publish source code of projects or components online
- When appropriate, share your development process and progress publicly
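As a sketch of the bulk-download and API items above, the example below uses Flask, a common Python web framework; the dataset, route names, and field names are hypothetical. The same service exposes the dataset in its entirety for download and through an API for third parties.

```python
from flask import Flask, jsonify

app = Flask(__name__)

# Placeholder records; real data would come from the system of record.
DATASET = [
    {"id": 1, "office": "Springfield", "avg_wait_minutes": 12},
    {"id": 2, "office": "Riverton", "avg_wait_minutes": 34},
]

@app.route("/data/wait-times.json")  # bulk download: the dataset in its entirety
def bulk_download():
    return jsonify(DATASET)

@app.route("/api/v1/wait-times")     # API for third parties to build against
def api_listing():
    return jsonify(DATASET)

if __name__ == "__main__":
    app.run()
```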
key questions
- How are you collecting user feedback for bugs and issues?
- If there is an API, what capabilities does it provide? Who uses it? How is it documented?
- If the codebase has not been released under an open source license, explain why.
- What components are made available to the public as open source?
- What datasets are made available to the public?