Faster onboarding in a fully remote world

Max Saltonstall
Google Cloud - Community
8 min read · Nov 19, 2020


(Ayka Agayeva, IT lead for Google’s program to shift onboarding to a fully work-from-home world, sat down with me to talk about the crazy year this has been and some of her team’s work. She organized the significant changes to establish virtual onboarding, and worked with me to collect the lessons they learned along the way.)

Hiring new people during a global pandemic isn’t usually covered in your standard HR or IT handbooks. Our goal was to get people up and running as soon as possible, ideally productive during the first week.

We needed to get new employees up and running quickly

But we did not have the resiliency to handle it right off the bat. We had to scramble, and find a new way of working.

To start, we asked ourselves what it would take to create fully virtual new hire onboarding, which had historically been done in large groups on our major campuses. How could we enable our employees to become productive within a week of their start date, whether that meant submitting code, managing infrastructure, or providing customer service? Our program team worked closely with teams in IT support, HR, Security, and Inventory Management to put in place the pieces for a speedy onramp and access.

People waiting in line

Thanks to cross-team efforts, redesigned processes, and some technical adaptations, we were able to reach our goal for more than 90% of new hires. In this post we’ll cover how we accomplished that and what we learned along the way.

Cross-functional teamwork and technology changes allowed for faster onboarding

Forcing function

At the beginning of 2020, Google was onboarding an average of 500 employees per week across its global offices. Pre-pandemic, “Day 1” onboarding was a major in-person event, which involved crowds of excited new employees completing tasks like HR paperwork, identity verification, IT equipment distribution and set-up, login credential creation, and orientation sessions with dozens of HR, IT, and Security team members.

As a company, we rely heavily on a mix of people and technical infrastructure to make events like this possible. We test the resilience of our teams and systems annually with exercises that we call DiRT, but these tests are usually regional and not designed to simulate a worldwide pandemic. When everyone was suddenly working from home, we had to quickly shift how we onboard new employees to ensure we were doing it securely from anywhere in the world.

Good disaster recovery practices enable better responses for novel scenarios

To get there, we focused first on people, then on process, and finally on technology.

Who’s involved

Who is this service going to affect and which assumptions are no longer true for them?

A users-first approach is important in any effort, but it was especially important during this unique crisis. The global workforce’s circumstances had changed and that introduced new personal challenges for the people we were onboarding, as well as the supporting teams. For example, we could no longer assume that our new employees were available during business hours, had access to technology at their homes, or were able to travel.

Cross-functional team collaborating remotely

Who should be part of the core project team?

This project required our team to be cross-functional, have embedded decision-makers, and maintain clear channels of escalation. Structuring our team this way allowed us to make decisions quickly and conduct experiments without sending everything up the chain. Having the right people in the room sped everything up, from unblocking work to escalating as needed. Empowering the team to act let us resolve problems more efficiently than routing every decision through one approver, which would have been a risky single point of failure.

Front-line and back-end teams are both key to success

For example, decisions on what constitutes critical productivity equipment or the maximum weekly onboarding capacity were made live during the daily sync meetings: every lead could represent their organization and sign off without needing to escalate. For major decisions, such as whether we should buy or build a specific technical capability, our project leads had clear channels of escalation with response times within hours.

The execution team needs authority to make fast decisions

Who do we need to maintain focus on supporting the needed infrastructure (and protect from day-to-day interruptions)?

Lastly, it was important to ensure that staffing plans included both the frontline and the behind-the-scenes strategists. While in crisis, the front-line emergency-response folks ran the experiments (e.g., which security keys to use, length of appointment slots), handled escalations, triaged, and communicated with our new hires. We were deliberate in staffing the front-line with senior people, who could identify friction in the process, make judgement calls when necessary, and help inform fast-paced iterations. Distanced from interruptions, the strategists and backend support ensured the availability of necessary services, evaluated experiment strategy and outcomes, adjusted support as needed, and reported to stakeholders.

Unexpected failure: security keys surprisingly hidden in the folds of shipment boxes and being missed.

As an example, early on the support team identified that a large number of new hires were struggling with the identity verification and login steps. As a result of their frontline analysis, multiple changes were tested and implemented, such as adjustments to security key shipments, improvements to certain operating systems, and timeout flags for some identity and login tasks.

Our new employee process, pre-COVID

Operations, training and coordination

While moving quickly, with such high stakes and so much uncertainty, we knew our ability to turn onboarding into a successful remote-only process would depend on having quick access to information, minimizing distractions, and prioritizing efficiently.

First, we embedded our remote onboarding program within the company-wide incident response effort that was already in flight. This gave us access to the daily briefing calls and communications channels, connecting us to the leadership and other incident response teams. That presence helped us leverage research and findings from other workstreams, such as compatibility ratings of various platforms for remote development work.

Another benefit was direct, fast access to a pool of experts across many domains, since they were all operating in crisis mode with faster-than-usual response times. We were able to tap into this for one-off requests (e.g., to understand or change the behavior of login accounts under certain conditions).

Connect the front-line response team to the subject matter experts

Next, we prioritized aligning all of the stakeholders and the broader response team on our success criteria, specifically what percentage of Nooglers (new Google employees) we would get logged into corporate resources by the end of days 1, 3, and 5. Taking the time to agree on key metrics and report on them regularly saved us escalations and confusion down the road. We minimized distractions so the team could stay focused on building and running the service.
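To make the day 1, 3, and 5 success criteria concrete, here is a minimal Python sketch of how such a metric could be computed. It is an illustration only, not the tooling our team used: the `NewHire` record, the `first_login_day` field, and the `login_rate_by_day` function are all hypothetical names, and the cohort data is made up.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class NewHire:
    """Hypothetical record for one new hire (not an actual internal schema)."""
    name: str
    # Day of first successful login, counting the start date as day 1.
    # None means the hire has not logged in yet.
    first_login_day: Optional[int]


def login_rate_by_day(hires, checkpoints=(1, 3, 5)):
    """Return {checkpoint_day: percent of hires logged in by the end of that day}."""
    total = len(hires)
    if total == 0:
        # Avoid dividing by zero for an empty cohort.
        return {day: 0.0 for day in checkpoints}
    rates = {}
    for day in checkpoints:
        logged_in = sum(
            1
            for hire in hires
            if hire.first_login_day is not None and hire.first_login_day <= day
        )
        rates[day] = 100.0 * logged_in / total
    return rates


# Made-up cohort of five hires, one of whom has not logged in yet.
cohort = [
    NewHire("a", 1),
    NewHire("b", 1),
    NewHire("c", 2),
    NewHire("d", 4),
    NewHire("e", None),
]
rates = login_rate_by_day(cohort)
print(rates)
```

Reporting the same three checkpoints every week keeps the numbers comparable across cohorts, which is what lets stakeholders agree on whether the process is improving.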

Align everyone on specific, measurable goals for success

Lastly, we had a “parking lot” for capturing items (long-term risks, business continuity considerations, scaling, etc.) that were not blockers during the crisis, but were important nevertheless. Because we’d eventually be transitioning out of crisis response and back to sustaining operations, it was important not to lose these data points.

Technology bridging the gaps

Our remote onboarding process had three major components that needed a (quick!) technical rework:

  1. Verifying a person’s identity remotely
  2. Issuing login credentials
  3. Setting the new employee up with necessary IT equipment

We were able to lean heavily on our existing cloud infrastructure and also brought in third-party systems where we didn’t have expertise or coverage.

Remote identity verification was one of the biggest challenges, and we made an early decision to use a third-party technology rather than build internally. We focused our efforts on the infrastructure for ingesting verification data and on adding security controls to connect it to our internal systems. Because of the criticality of this first step, we worked closely with our security engineering team on both security and usability.

Authentication is ALWAYS the hard part

Remotely issuing login credentials was relatively straightforward since our IT support staff was able to replicate it over the existing technology of Google Meet. To compensate for the lack of physical presence, we implemented additional security controls, such as out-of-band SMS verifications. We took feedback from our support representatives daily, and our new hires weekly, to identify what was working and what wasn’t. That feedback was a crucial input to the iterations we made alongside the security and account engineering teams.
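For readers less familiar with out-of-band checks, the general idea can be sketched in a few lines of Python. This is a generic illustration under stated assumptions, not a description of our internal systems: the `OutOfBandChallenge` class, its method names, and the five-minute expiry are hypothetical, and actually sending the SMS is left to the caller.

```python
import hmac
import secrets
import time


class OutOfBandChallenge:
    """Hypothetical one-time code check delivered over a second channel."""

    def __init__(self, ttl_seconds=300, now=time.time):
        self._ttl = ttl_seconds
        self._now = now          # injectable clock, useful for testing
        self._pending = {}       # user_id -> (code, expires_at)

    def issue(self, user_id):
        """Create a 6-digit code; the caller sends it by SMS (not shown)."""
        code = f"{secrets.randbelow(10**6):06d}"
        self._pending[user_id] = (code, self._now() + self._ttl)
        return code

    def verify(self, user_id, submitted):
        """Accept the code at most once, and only before it expires."""
        entry = self._pending.pop(user_id, None)  # pop makes it single-use
        if entry is None:
            return False
        code, expires_at = entry
        if self._now() > expires_at:
            return False
        # Constant-time comparison avoids leaking digits through timing.
        return hmac.compare_digest(code.encode(), submitted.encode())


challenge = OutOfBandChallenge()
sent_code = challenge.issue("new-hire-123")
print(challenge.verify("new-hire-123", sent_code))  # the new hire reads it back
```

In a video-call setting, the support representative would trigger `issue`, the code would arrive on the new hire's phone, and the new hire would read it back to confirm they control the number on file.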

Setting up corporate hardware for new employees had two components: physically providing the equipment to the person and connecting it to our corp resources. We won’t focus on the logistics in this post; instead we’ll cover the device choices and set-up. Because Google is a cloud company, we had a few things working in our favor, but we also encountered a few challenges.

Working in the cloud enables a rapid shift to new work circumstances

Chromebooks were the device of choice, as the imaging and setup could be done by anyone in a few short minutes. Built-in features in the OS also made it easier to set up the rest of the security controls, such as security keys and certificates. For the rest of the platforms, we shipped machines with pre-configured images. While this made the experience better for new hires, it was not a scalable long-term solution (and landed in our post-crisis “parking lot”).

Chromebooks!

We were able to deploy virtual desktops on Google Cloud Platform as the dev environments for the vast majority of our developers and engineers. For the rest, we set up a process to deploy physical workstations, which was expensive and unscalable (and also landed in our post-crisis “parking lot”).

Our BeyondCorp environment made remote access a seamless experience. Our new hires could connect to corp resources and be productive without requiring additional software. This was a huge win.

The new normal

Better all around

Crises provide an opportunity to rethink your processes, challenge your assumptions, and deliver infrastructure and services in a more resilient and scalable state. Here are three key takeaways from our experience pivoting to remote onboarding:

  • Cloud capabilities matter. Assess your infrastructure today and how it can scale in various scenarios. We leaned on the scale and reliability of our public cloud infrastructure (virtual desktops being one example) to keep onboarding employees smoothly.
  • Trivial bottlenecks can become major failure points during a crisis. It’s worthwhile to rethink your supply chains, platform management systems, account operations and support staff.
  • Know your company’s incident response procedures. When possible, embed or align with an existing company-wide effort. This can open access to the necessary subject matter experts. More importantly, it also connects you to your senior leadership and business owners when it’s imperative to make decisions quickly and stay coordinated.

Adaptation can lead to permanent improvements

Don’t wait for the next crisis. Take a moment now to assess your people, policies, processes, systems, and supply chains, and identify potential failure points or bottlenecks. This will help you navigate the next surprise with more preparation and confidence.

Max Saltonstall
Google Cloud - Community

Father, gamer, juggler, tech enthusiast. I tell stories about how to cloud, and keep it all secure. Sometimes make games. Opinions are my own. Also chocolate