Kiwimesh

HR Tech · SaaS · Enterprise

HR performance management platform

Zero-downtime AWS migration for a live US SaaS product

We stepped into a live, third-party-hosted platform and migrated it to AWS without a minute of downtime. Established CI/CD, agile process, and a full UI overhaul.

Zero downtimeOngoing T&MCloud migration

Employee Performance Management Platform

Domain: HR Tech / SaaS Engagement: Time & Material (ongoing) Team Size: Dedicated cross-functional pod, full-stack Client Location: United States


The Problem

The client operates an established US-based SaaS platform for employee performance management — a real business with real revenue and real customers, not a greenfield product. The engineering reality had drifted away from the product reality: the platform was hosted on a poorly managed third-party cloud provider with limited scalability, no CI/CD pipeline, no staging environment, and an aging UI that no longer met modern usability standards. Every change carried risk because the system under it was brittle.

They needed a partner who could do two hard things at once: keep the plane in the air (active customers, live contracts) while rebuilding the plane (infrastructure, delivery pipeline, UI, code quality). Cutting customers off for a maintenance window wasn't an option. Taking six months of feature freeze to refactor wasn't either.


What We Do

An ongoing engagement where we operate as the client's extended engineering team. Scope covers the full application lifecycle — infrastructure modernization, delivery engineering, incremental codebase rehabilitation, feature development, and UI redesign — with zero customer-visible disruption as a non-negotiable baseline.

Infrastructure migration — with zero downtime

Moved the entire production stack from an unreliable third-party host to AWS without a maintenance window. The approach wasn't a single big cut-over; it was a staged migration designed so every phase was independently rollback-able:

  • Infrastructure parity first — stood up the target AWS environment (EC2 / ASG, RDS Multi-AZ, S3, CloudFront, Route 53, VPC with private subnets) and validated it against production traffic shapes before touching DNS
  • Database replication — ran MySQL replication from the legacy host into RDS, kept it running hot while the app still wrote to the legacy primary
  • Application dual-readiness — deployed the app to AWS pointing at the replica, ran shadow traffic and smoke tests against it without customers seeing any of it
  • Controlled cut-over — promoted the AWS RDS instance, flipped DNS through a low-TTL window, monitored error and latency budgets, with a documented reverse path if anything spiked
  • Observed soak period — held the legacy environment in standby for a bounded window post-cut-over in case anything needed to fall back

Post-migration the stack picked up what the previous host lacked: automated daily backups with point-in-time recovery, Multi-AZ failover for the database, auto-scaling groups behind a load balancer, secrets in AWS Secrets Manager rather than config files, IAM role-based access rather than long-lived keys, and VPC-level network segmentation.

Delivery engineering — CI/CD from zero

There was no pipeline when we arrived. Every deploy was a human with SSH access. We built the delivery system the product had always needed:

  • Automated CI on every pull request: linters, unit tests, security dependency scans, and build verification — red builds don't merge
  • Environments — isolated dev, staging, and production with parity of configuration (same infra-as-code, just scaled and parameterized differently)
  • Staged deployment — merges to main deploy to staging automatically; production deploys are promoted from a staging artifact after smoke tests, never built fresh for prod
  • Rollback as a first-class action — previous release artifacts are retained, so rolling back is a one-command operation, not a re-deploy of old code from git
  • Release discipline — a documented release checklist, change-log per deployment, and on-call rotation for post-deploy ownership

Application development — rehabilitating a legacy PHP + MySQL codebase

The codebase is PHP + MySQL with a React frontend. Legacy PHP earns its reputation honestly, but responsible modernization isn't a rewrite — it's disciplined incremental improvement:

  • Contract-first API stability — the boundary between backend and frontend is documented and tested so frontend and backend can evolve on independent cadences
  • Feature flags for anything risky — new behavior ships behind a flag, rolls out to a cohort, and is measurable before full enablement
  • Audit-driven hardening — code quality and security audit findings are triaged, prioritized by risk, and worked off as a structured backlog rather than ad-hoc
  • Performance work — targeted profiling of slow endpoints, query plan review, index additions, and N+1 elimination; improvements verified against production traffic shapes, not synthetic benchmarks
  • Frontend evolution — incremental React component modernization aligned with the UI redesign, with a shared component library as the target state

Observability & operations

  • Centralized application logs with structured fields and correlation IDs
  • Infrastructure metrics and alerting tied to user-facing SLOs, not CPU graphs
  • Error tracking with stack traces and user context
  • On-call runbook for the incidents we've actually seen, not theoretical ones

UI/UX revamp (in progress)

A dedicated UI designer is leading a full redesign of the SaaS experience. The redesign is delivered as a connected design-to-development pipeline: design system tokens, component specs, accessibility review, and staged rollout screen-by-screen rather than a flag-day relaunch. The goal is a modern, accessible, cohesive product — without a risky one-shot migration.

Project management

Agile sprints with regular communication, proactive architectural recommendations, and scope prioritization done jointly with stakeholders. We operate with the transparency of an in-house team: the client can see what we're working on, what's blocked, and what's shipping.


Tech Stack

Layer Technology
Backend PHP, MySQL
Frontend React JS
Infrastructure AWS (EC2/ASG, RDS Multi-AZ, S3, CloudFront, Route 53, VPC)
Secrets & Identity AWS Secrets Manager, IAM role-based access
CI/CD Automated pipelines, PR gates, staged promotion, artifact-based rollback
Environments Isolated dev / staging / prod with configuration parity
Observability Structured logs, correlation IDs, SLO-based alerting, error tracking
Release Practice Feature flags, documented rollback, change-log per deploy
Design UI/UX revamp via dedicated designer + component-library-driven delivery

Outcome

  • Zero-downtime migration to AWS executed on a live production system with active paying customers
  • Measurable reliability and performance improvements post-migration — bounded by SLOs rather than vibes
  • CI/CD pipeline where there was none, cutting deployment risk and accelerating release cadence
  • Legacy codebase on a stable, improving trajectory — audit findings retired on a schedule, performance regressions caught before they reach customers
  • Feature-flag discipline means risky changes reach production safely
  • UI revamp underway on a rollout plan that protects existing users
  • Client operates with the velocity and confidence of a modern SaaS engineering org without having re-platformed the product

Key Takeaway

Stepping into a live, customer-facing SaaS product carries a specific kind of engineering risk: the business doesn't pause while you fix it. The value of this engagement is that we modernized the infrastructure, stood up the delivery pipeline, and began rehabilitating the codebase without the customers ever experiencing a maintenance window, a failed deploy, or a regression. That outcome — moving a legacy platform forward without breaking what works — is the real test of operational engineering maturity, and it's where this team earns its keep.

Have a project like this?

Tell us what you're trying to build. Discovery calls this week, scope within 3 business days.