HR Tech · SaaS · Enterprise
HR performance management platform
Zero-downtime AWS migration for a live US SaaS product
We stepped into a live, third-party-hosted platform and migrated it to AWS without a minute of downtime. We then established CI/CD and an agile delivery process, and began a full UI overhaul.
Employee Performance Management Platform
- Domain: HR Tech / SaaS
- Engagement: Time & Material (ongoing)
- Team Size: Dedicated cross-functional pod, full-stack
- Client Location: United States
The Problem
The client operates an established US-based SaaS platform for employee performance management — a real business with real revenue and real customers, not a greenfield product. The engineering reality had drifted away from the product reality: the platform was hosted on a poorly managed third-party cloud provider with limited scalability, no CI/CD pipeline, no staging environment, and an aging UI that no longer met modern usability standards. Every change carried risk because the system under it was brittle.
They needed a partner who could do two hard things at once: keep the plane in the air (active customers, live contracts) while rebuilding the plane (infrastructure, delivery pipeline, UI, code quality). Cutting customers off for a maintenance window wasn't an option. Taking six months of feature freeze to refactor wasn't either.
What We Do
An ongoing engagement where we operate as the client's extended engineering team. Scope covers the full application lifecycle — infrastructure modernization, delivery engineering, incremental codebase rehabilitation, feature development, and UI redesign — with zero customer-visible disruption as a non-negotiable baseline.
Infrastructure migration — with zero downtime
We moved the entire production stack from an unreliable third-party host to AWS without a maintenance window. The approach wasn't a single big cut-over; it was a staged migration in which every phase could be rolled back independently:
- Infrastructure parity first — stood up the target AWS environment (EC2 / ASG, RDS Multi-AZ, S3, CloudFront, Route 53, VPC with private subnets) and validated it against production traffic shapes before touching DNS
- Database replication — ran MySQL replication from the legacy host into RDS, kept it running hot while the app still wrote to the legacy primary
- Application dual-readiness — deployed the app to AWS pointing at the replica, ran shadow traffic and smoke tests against it without customers seeing any of it
- Controlled cut-over — promoted the AWS RDS instance, flipped DNS through a low-TTL window, monitored error and latency budgets, with a documented reverse path if anything spiked
- Observed soak period — held the legacy environment in standby for a bounded window post-cut-over in case anything needed to fall back
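The hold, proceed, and reverse-path decisions in the steps above can be sketched as two small gate functions. This is an illustrative sketch only, not the tooling used on the engagement: `Sample`, `Budget`, `pre_cutover`, `post_cutover`, and every threshold value are hypothetical names and numbers chosen for the example.

```python
from dataclasses import dataclass
from enum import Enum


class Decision(Enum):
    PROCEED = "proceed"
    HOLD = "hold"
    ROLL_BACK = "roll_back"


@dataclass(frozen=True)
class Sample:
    """One monitoring window. All fields are illustrative."""
    replica_lag_s: float    # seconds the RDS replica trails the legacy primary
    error_rate: float       # fraction of failed requests, 0.0 to 1.0
    p95_latency_ms: float   # 95th percentile response time


@dataclass(frozen=True)
class Budget:
    """Assumed thresholds; real values would come from the agreed budgets."""
    max_lag_s: float = 1.0
    max_error_rate: float = 0.01
    max_p95_ms: float = 800.0


def pre_cutover(sample: Sample, budget: Budget = Budget()) -> Decision:
    """Before promotion: hold until the replica has caught up."""
    if sample.replica_lag_s > budget.max_lag_s:
        return Decision.HOLD
    return Decision.PROCEED


def post_cutover(sample: Sample, budget: Budget = Budget()) -> Decision:
    """After the DNS flip: take the reverse path if any budget is exceeded."""
    over_errors = sample.error_rate > budget.max_error_rate
    over_latency = sample.p95_latency_ms > budget.max_p95_ms
    if over_errors or over_latency:
        return Decision.ROLL_BACK
    return Decision.PROCEED
```

Keeping the two gates separate mirrors the staged approach: replication lag only matters before promotion, while error and latency budgets decide whether the reverse path is taken afterwards.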
Post-migration the stack picked up what the previous host lacked: automated daily backups with point-in-time recovery, Multi-AZ failover for the database, auto-scaling groups behind a load balancer, secrets in AWS Secrets Manager rather than config files, IAM role-based access rather than long-lived keys, and VPC-level network segmentation.
Delivery engineering — CI/CD from zero
There was no pipeline when we arrived. Every deploy was a human with SSH access. We built the delivery system the product had always needed:
- Automated CI on every pull request: linters, unit tests, security dependency scans, and build verification — red builds don't merge
- Environments — isolated dev, staging, and production with parity of configuration (same infra-as-code, just scaled and parameterized differently)
- Staged deployment — merges to main deploy to staging automatically; production deploys are promoted from a staging artifact after smoke tests, never built fresh for prod
- Rollback as a first-class action — previous release artifacts are retained, so rolling back is a one-command operation, not a re-deploy of old code from git
- Release discipline — a documented release checklist, change-log per deployment, and on-call rotation for post-deploy ownership
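The promotion and rollback rules above can be illustrated with a small in-memory model. It is a sketch under stated assumptions, not the pipeline itself: `ReleaseHistory`, its methods, and the `build-41` style artifact names are hypothetical, and a real pipeline would store artifacts in a registry or bucket instead of a list.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ReleaseHistory:
    """Tracks which immutable build artifact each environment runs."""
    keep: int = 5                                         # past artifacts to retain
    staging: Optional[str] = None
    production: List[str] = field(default_factory=list)   # newest last

    def deploy_to_staging(self, artifact: str) -> None:
        self.staging = artifact

    def promote(self, smoke_tests_passed: bool) -> str:
        """Promote the exact artifact verified on staging; never rebuild."""
        if self.staging is None:
            raise RuntimeError("nothing deployed to staging")
        if not smoke_tests_passed:
            raise RuntimeError("smoke tests failed; promotion blocked")
        self.production.append(self.staging)
        self.production = self.production[-self.keep:]
        return self.staging

    def rollback(self) -> str:
        """Drop the current release and reactivate the previous artifact."""
        if len(self.production) < 2:
            raise RuntimeError("no previous artifact retained")
        self.production.pop()
        return self.production[-1]
```

A short usage example: after promoting `build-41` and then `build-42`, calling `rollback()` returns `build-41` without touching git or rebuilding anything, which is the property that makes rollback a single action.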
Application development — rehabilitating a legacy PHP + MySQL codebase
The codebase is PHP + MySQL with a React frontend. Legacy PHP earns its reputation honestly, but responsible modernization isn't a rewrite — it's disciplined incremental improvement:
- Contract-first API stability — the boundary between backend and frontend is documented and tested so frontend and backend can evolve on independent cadences
- Feature flags for anything risky — new behavior ships behind a flag, rolls out to a cohort, and is measurable before full enablement
- Audit-driven hardening — code quality and security audit findings are triaged, prioritized by risk, and worked off as a structured backlog rather than ad-hoc
- Performance work — targeted profiling of slow endpoints, query plan review, index additions, and N+1 elimination; improvements verified against production traffic shapes, not synthetic benchmarks
- Frontend evolution — incremental React component modernization aligned with the UI redesign, with a shared component library as the target state
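Cohort rollout behind a feature flag is commonly implemented with deterministic hashing. The sketch below shows that general technique in Python for brevity; it is not the platform's PHP implementation, and `in_rollout` and the flag name in the usage note are hypothetical.

```python
import hashlib


def in_rollout(flag: str, user_id: str, percent: int) -> bool:
    """Deterministically place a user in or out of a percentage cohort.

    Hashing flag + user_id gives each user a stable bucket from 0 to 99,
    so the same user sees the same behavior on every request, and raising
    `percent` only ever adds users to the cohort.
    """
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    digest = hashlib.sha256(f"{flag}:{user_id}".encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % 100
    return bucket < percent
```

For example, `in_rollout("new-review-form", user_id, 10)` would enable the flag for roughly a tenth of users. Including the flag name in the hash means different flags get independent cohorts, so the same users are not always the first to see every change.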
Observability & operations
- Centralized application logs with structured fields and correlation IDs
- Infrastructure metrics and alerting tied to user-facing SLOs, not CPU graphs
- Error tracking with stack traces and user context
- On-call runbook for the incidents we've actually seen, not theoretical ones
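As a minimal sketch of structured logs with correlation IDs, the following uses only the Python standard library. It illustrates the pattern, not the platform's actual logging stack: the logger name, the `fields` key passed through `extra`, and `handle_request` are assumptions made for the example.

```python
import contextvars
import json
import logging
import uuid

# Holds the ID of the request currently being handled.
correlation_id = contextvars.ContextVar("correlation_id", default="-")


class JsonFormatter(logging.Formatter):
    """Render each record as one JSON object with the correlation ID."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
            "correlation_id": correlation_id.get(),
        }
        # Structured fields supplied via extra={"fields": {...}}.
        payload.update(getattr(record, "fields", {}))
        return json.dumps(payload)


def build_logger(stream) -> logging.Logger:
    logger = logging.getLogger("app.sketch")
    logger.handlers.clear()
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    logger.propagate = False
    return logger


def handle_request(logger, user_id, incoming_id=None):
    """Reuse the caller's ID if present, otherwise mint one for this request."""
    token = correlation_id.set(incoming_id or uuid.uuid4().hex)
    try:
        logger.info("review submitted", extra={"fields": {"user_id": user_id}})
        return correlation_id.get()
    finally:
        correlation_id.reset(token)
```

Because every log line emitted while handling a request carries the same `correlation_id`, an error report can be joined back to the full sequence of events that led to it.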
UI/UX revamp (in progress)
A dedicated UI designer is leading a full redesign of the SaaS experience. The redesign is delivered as a connected design-to-development pipeline: design system tokens, component specs, accessibility review, and staged rollout screen-by-screen rather than a flag-day relaunch. The goal is a modern, accessible, cohesive product — without a risky one-shot migration.
Project management
Agile sprints with regular communication, proactive architectural recommendations, and scope prioritization done jointly with stakeholders. We operate with the transparency of an in-house team: the client can see what we're working on, what's blocked, and what's shipping.
Tech Stack
| Layer | Technology |
|---|---|
| Backend | PHP, MySQL |
| Frontend | React |
| Infrastructure | AWS (EC2/ASG, RDS Multi-AZ, S3, CloudFront, Route 53, VPC) |
| Secrets & Identity | AWS Secrets Manager, IAM role-based access |
| CI/CD | Automated pipelines, PR gates, staged promotion, artifact-based rollback |
| Environments | Isolated dev / staging / prod with configuration parity |
| Observability | Structured logs, correlation IDs, SLO-based alerting, error tracking |
| Release Practice | Feature flags, documented rollback, change-log per deploy |
| Design | UI/UX revamp via dedicated designer + component-library-driven delivery |
Outcome
- Zero-downtime migration to AWS executed on a live production system with active paying customers
- Measurable reliability and performance improvements post-migration, tracked against user-facing SLOs rather than anecdote
- CI/CD pipeline where there was none, cutting deployment risk and accelerating release cadence
- Legacy codebase on a stable, improving trajectory — audit findings retired on a schedule, performance regressions caught before they reach customers
- Feature-flag discipline means risky changes reach production safely
- UI revamp underway on a rollout plan that protects existing users
- Client operates with the velocity and confidence of a modern SaaS engineering org without having re-platformed the product
Key Takeaway
Stepping into a live, customer-facing SaaS product carries a specific kind of engineering risk: the business doesn't pause while you fix it. The value of this engagement is that we modernized the infrastructure, stood up the delivery pipeline, and began rehabilitating the codebase without the customers ever experiencing a maintenance window, a failed deploy, or a regression. That outcome — moving a legacy platform forward without breaking what works — is the real test of operational engineering maturity, and it's where this team earns its keep.