“The best way to predict the future is to create it.”

Abraham Lincoln


Site Reliability Engineer (SRE)

Portland, Oregon, or Remote

Job Description

We are on the lookout for a seasoned Site Reliability Engineer (SRE) with a strong focus on Kubernetes to enhance our technology team. This pivotal role centers around improving the reliability and efficiency of applications managed through Kubernetes. The ideal candidate is someone who brings a deep understanding of deploying and managing Kubernetes in live production environments and who is passionate about maintaining high standards of system stability and performance.

Key Responsibilities

  • Implement and manage containerized applications on Kubernetes, ensuring both reliability and scalability through largely automated processes.

  • Proactively monitor system performance and reliability to identify and resolve issues swiftly, meeting or exceeding Service Level Agreements (SLAs).

  • Handle system alerts, accurately diagnose issues, escalate when necessary, and execute predefined scripts to address operational challenges.

  • Create detailed documentation of system anomalies and initiate JIRA tickets to support the engineering team in addressing complex or recurrent issues.

  • Use tools such as Grafana to set up alerts and monitor system performance for effective troubleshooting.

Qualifications and Skills

  • Extensive experience in deploying code and managing Kubernetes clusters in live production environments.

  • Proficient in major cloud platforms (AWS, GCP, Azure) and their Kubernetes management frameworks.

  • Skilled in using performance monitoring tools such as Grafana and Prometheus, and operational incident management tools like PagerDuty and JIRA.

  • Capable of running predefined scripts efficiently, with a strong understanding of their functionalities and implications.

  • Excellent problem-solving skills, with a focus on detailed documentation and clear communication to expedite issue resolution.

  • Experience with Git, Kafka, Prometheus, and Postgres is desirable.

  • Effective in a collaborative team setting, demonstrating a high level of responsibility for assigned tasks.

Bonus Qualifications

  • In-depth expertise in Kubernetes management.

  • Experience with database management and SQL for optimizing data handling within Kubernetes platforms.

To apply, please send your resume, cover letter, and any relevant references to careers@hydrolix.io

We look forward to seeing how you can make an impact at Hydrolix.

Hydrolix provides equal employment opportunities without regard to an applicant’s race, sex, pregnancy, sexual orientation, gender identity or expression, genetic information, national origin, age, physical or mental disability, medical condition, religion, marital status or veteran status.

Applicants with disabilities may be entitled to reasonable accommodation under the terms of the Americans with Disabilities Act and certain state or local laws. A reasonable accommodation is a change in the way things are normally done that ensures an equal employment opportunity without imposing undue hardship on Hydrolix. Please inform us if you need assistance completing any forms or otherwise participating in the application process.
