Site Reliability Engineering - Public Home

Hi. My name is Alexander Nicholson, and I’m the current Site Reliability Engineering Manager here at TableCheck. I want to try and give you an idea of what it’s like to apply for a job in the Site Reliability Engineering team at TableCheck, as well as what it’s like to work for us if you’re hired.

Hiring

Firstly, we’re looking for people who want to work on our mission:

We help TableCheck’s various development, operations and other business units to run a robust and fault-tolerant infrastructure built on Amazon Web Services (AWS) with Terraform, Kubernetes, Helm, and an array of tools for CI/CD, logging, monitoring, and so on. We emphasize DevOps best practices such as agile, scrum, automation, and customer-centric improvements.

In a nutshell, this entails the following:

Keep the lights on,
reduce toil,
increase velocity,
and keep our customers happy.

Our team thrives on continuous communication, which means we talk every day, both asynchronously (using Slack and Loom) and synchronously (using Around.co).

We aren’t exclusively trying to hire people who believe they are in the top 1% of a competency. We are a team that believes in learning, and as the “quarterbacks” of the Engineering department, we are responsible for learning a lot. We don’t lie about what we know, and we are realistic about our goals and expectations. We bias for action, so we aren’t afraid to make leaps. We want architecture, infrastructure and supporting systems that allow us to bias for action.

We have a set of principles (https://tablecheck.atlassian.net/wiki/spaces/SREPUB/pages/2653618276) which we follow during our everyday work. It may be odd to have a set of principles in a specific team, but considering that we act as a multi-faceted part of the organization, it’s important for us to treat departments and teams like customers.

Our organization has code written in Ruby, Elixir, Python, Scala, JavaScript, TypeScript, and Bash, with different frameworks, differing ages of code, and numerous teams involved in projects. This means that as the backbone of these teams, we need to be at the same skill level as those teams. We are all versed in at least one programming language, not including YAML, and all have a high level of competency using Kubernetes (AWS EKS).

Working

We provide paid time off for all members of the team, including contractors.
We are 100% remote. No offices.
- A note from Alexander:
  We at TableCheck actually have a "work remotely, from anywhere, at anytime" policy, and we do mean anywhere. I’ve seen multiple companies in Japan advertise remote work but actually mean that they want you to work somewhere within Japan. We have SREs and developers in France, Ukraine, the Philippines and many other countries, who work the hours of their own time zone.
We provide an industry-level pay scale.
We sponsor visas for those wishing to relocate to Japan. But you do not have to! This is a fully remote position.
We work hard to reduce toil and ensure our pagers go off as little as possible. You can see our live status page here: http://status.tablecheck.com/