About the Role
The Software Engineer (L3) leads production reliability efforts for KAST's core financial platforms, partnering closely with engineering teams. This role combines deep engineering insight with hands-on operational ownership of production systems, directly protecting customer trust as we scale globally.
Responsibilities
- Lead production incident response for KAST's core platforms, ensuring fast resolution and minimal customer impact.
- Drive deep root cause analysis for high-severity incidents and turn learnings into lasting reliability improvements.
- Debug and resolve issues across application, data, and infrastructure layers in distributed, cloud-based systems.
- Use logs, metrics, and traces to understand system behavior, identify failure patterns, and improve observability.
- Partner closely with engineering and platform teams to resolve defects and raise the overall reliability bar.
- Lead incident reviews and contribute to improving how we prevent, detect, and respond to production issues.
- Execute configuration changes, hotfixes, and rollbacks safely while protecting system availability.
- Improve operational readiness by evolving runbooks, SOPs, alerts, and dashboards as systems scale.
- Ensure production systems consistently meet availability, performance, security, and compliance expectations.
- Participate in on-call rotations and take ownership during live incidents when reliability matters most.
- Proactively identify operational risks, technical debt, and system stability gaps before they impact users.
Requirements
- 5+ years of experience in application development, site reliability engineering, or similar roles, including exposure to high-impact production incidents.
- A strong foundation in supporting cloud-hosted systems on platforms like AWS and GCP, with an understanding of how reliability scales in real-world environments.
- Hands-on experience working with containerized, microservices-based architectures and the challenges that come with operating them in production.
- Confidence debugging and troubleshooting both front-end and back-end applications, including systems built with modern programming languages and frameworks such as Go, Python, JavaScript, Next.js, Flutter (Dart), and related ecosystems.
- Solid understanding of CI/CD pipelines, deployment strategies, and release management practices.
- Experience using monitoring, logging, and alerting tools to diagnose production issues and improve system observability over time.
- Strong understanding of Linux systems, networking fundamentals, and cloud security basics.
- Familiarity with infrastructure-as-code and configuration management concepts.
- Experience supporting data stores and messaging platforms in production environments.
- The ability to stay calm, structured, and effective during high-severity incidents.
- Clear communication skills and a collaborative mindset when working across engineering, operations, and business teams.
- AWS cloud certification (associate or professional level) is a plus.
Skills
`AWS` · `GCP` · `Containerized Architectures` · `Microservices` · `Go` · `Python` · `JavaScript` · `Next.js` · `Flutter` · `Dart` · `CI/CD Pipelines` · `Deployment Strategies` · `Release Management` · `Monitoring` · `Logging` · `Alerting Tools` · `Linux Systems` · `Networking Fundamentals` · `Cloud Security` · `Infrastructure-as-Code` · `Configuration Management` · `Data Stores` · `Messaging Platforms` · `AWS Cloud Certification`
What we offer
- Be part of an innovative, fast-growing finance platform.
- Play a key role in fixing issues, improving systems, and influencing how production reliability evolves at KAST.
- Career growth opportunities as we expand our global technical team.
- Flexible work environment with remote and hybrid options.
- Competitive compensation package and stock option plan, ensuring you're invested in our success.
- Work with a passionate team pushing the boundaries of stablecoin finance.
How to apply
Email your CV to **careers@kastcard.com** with the position title in the subject line.
---
_Source: https://topjobs.lk/logo/topjobs-batch/14.png_