firstname.lastname@example.org +1 707 799 8675
Based in Beacon, NY, I am a Software Developer, Technical Lead and Manager. I have been writing software for the web professionally since 2005. I have two passions in software. The first is building beautiful tools for people to share knowledge, advice and experience. The second is creating reliable infrastructure that is easy to use and maintain. Outside of building systems, I love growing team culture and helping folks grow in their careers.
Outside of the tech world, I am an Eagle Scout. I enjoy contributing to open source projects, writing, reading, fishing, listening to music and wandering through cities and countrysides. I maintain a personal website at natwelch.com.
Author & Speaker
- Co-Author of the book Reliable Webservers with Go from Newline. 2021.
- Author of the book Real World SRE from Packt Publishing. 2018.
- Published in Issue Three and Issue Six of Code Words. 2015 & 2016.
- Spoke at LinuxConf 2014, Strange Loop 2017, SRECon Americas 2017, SRECon Americas 2019, Illuminate 2022 and others.
- Mentor since 2015 through natwelch.com/wiki/mentoring. From 2016 to 2020, I mentored through Out of Office Hours. In 2022 and 2023, I mentored through ADPList. I help folks with career and architecture questions weekly, averaging around 30 individuals a year.
Open Source Developer
I am the former lead of the open source project fog-google (60 million installs), the lead maintainer of danluu/post-mortems, and a contributor to a number of other small projects.
Laurel (fka Time by Ping)
Principal Software Engineer, Cloud Platform Lead November 2020 to present.
- Lead and manage a team of three.
- Manage our cloud infrastructure, automation, developer tooling, observability, performance and reliability efforts.
- Own all technical vendor relationships and evaluate new vendors. I drove a vendor consolidation project that took us from nine observability providers to one and from three CI/CD providers to one.
- I manage our infrastructure budget and work closely with our Head of Finance to manage and report our CoGS. I have led multiple projects to lower our CoGS and improve engineering monetary efficiency.
- Provide architectural guidance and write backend software. Define infrastructure and reliability requirements for teams.
- I regularly handle technical communication with customers. I led integrations with four large law firms and one of the world's largest accounting firms.
- I define and manage our on-call policies and rotations, and am an active member of the infrastructure rotation.
- It is a startup, so I wear a lot of hats.
Senior Site Reliability Engineer November 2018 to November 2020.
- Worked on the Customer Reliability Engineering team. CRE helps customers achieve Google-level reliability by partnering with them to implement SRE operational best practices. I gave presentations to groups of varying size and seniority at companies of every size. I helped companies architect and plan for global launches, re-architect on-prem systems as they moved to cloud, and develop SRE programs. My role was a mix of Tech Lead, Developer Advocate, Software Developer and Traveling Consultant.
- CRE's small team was listed as one of the top three strengths of GCP in both the 2018 and 2019 Gartner IaaS reports.
- Built multiple data pipelines to evaluate customer reliability.
- Worked as the SRE lead on Google's COVID-19 Exposure Notification system.
First Look Media
Lead Site Reliability Engineer March 2017 to October 2018.
Migrated three services from colocation facilities to AWS. Maintained the company's Terraform configuration and AWS infrastructure. Improved deploy reliability and automation, wrote new features for most services and refactored the entire ECS infrastructure. Mentored engineers on infrastructure, reliability and architecture design. Automated capacity planning, started a postmortem culture, and improved the performance and reliability of our CMS platform. Wrote Go and Node.js, working extensively with a GraphQL API.
Hillary for America
Staff Site Reliability Engineer January 2016 through December 2016.
Promoted reliability in both our web serving infrastructure and data analytics pipelines. Built tools and infrastructure to prevent humans from making mistakes while sleep deprived. Survived constant attacks with minimal externally visible downtime for the entire campaign.
Technical Lead August 2015 to January 2016.
- Led optimization efforts for holiday traffic. Cut average site load time in half and shrank average API response time by two orders of magnitude.
- Managed a team of three full-time software engineers. Helped define code review and code style policies for the development team.
Site Reliability Engineer III April 2012 to March 2015.
- SRE for Google Compute Engine in London and San Francisco. My job included being part of an on-call rotation and writing software to maintain, monitor and optimize millions of servers globally.
- While in SRE, I also worked on Google Cloud Storage and designed and built Google Cloud Status.
Software Engineer II August 2011 to April 2012.
I worked on Punchd, Google Offers and Google Local.
Software Developer January 2011 to September 2012.
Maintained backend app. Migrated to AWS. Acquired by Google.
Software Developer April 2009 to April 2011.
Built Answers, improved wiki and image processors.
Software Developer 2005 to 2009.
I was a software developer contractor dealing mainly in web design and Linux systems management.
Adobe Systems Incorporated
Dreamweaver Quality Engineering Intern Summer 2007 and 2008.
Cal Poly CSC Department
Computer Lab Staff 2007.
County of Sonoma ISD
Software Development Intern Summer 2005.
BSA Camp Oljato
Nature Director Summer 2006.
Camp Counselor Summer 2002, 2003 and 2004.
Computer Science, B.S. California Polytechnic State University, San Luis Obispo. Fall 2006 - Spring 2011.
Recurser Recurse Center, New York. Spring 2015