OpenShift HA
Good references and resources
- https://www.openshift.com/blog/disaster-recovery-strategies-for-applications-running-on-openshift
- https://www.openshift.com/blog/deploying-openshift-applications-multiple-datacenters
- https://www.openshift.com/blog/stateful-workloads-and-the-two-data-center-conundrum
- https://www.openshift.com/blog/disaster-recovery-with-gitops
- https://cloud.redhat.com/blog/stateful-workloads-and-the-two-data-center-conundrum
- https://cloud.redhat.com/blog/disaster-recovery-strategies-for-applications-running-on-openshift
- https://cloud.redhat.com/blog/deploying-openshift-applications-multiple-datacenters
- https://cloud.redhat.com/blog/geographically-distributed-stateful-workloads-part-two-cockroachdb
- https://cloud.redhat.com/blog/geographically-distributed-stateful-workloads-part-3-keycloak
- https://cloud.redhat.com/blog/geographically-distributed-stateful-workloads-part-four-kafka
- https://cloud.redhat.com/blog/geographically-distributed-stateful-workloads-part-five-yugabytedb
DR options
Failure Scenarios
- Complete datacenter outage
- Availability Zone failure (network segment or group of racks)
- Rack failure
- Host failure
- VM failure
- OpenShift master failure
- OpenShift etcd failure (etcd backup and restore)
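For the etcd failure scenario, it helps to know how many etcd members can be lost before quorum is gone and a restore from backup becomes necessary. The following is a minimal sketch of the majority-quorum arithmetic only; the function names are illustrative and are not part of any OpenShift or etcd API.

```python
def etcd_quorum(members: int) -> int:
    """Smallest number of members that forms a strict majority."""
    if members < 1:
        raise ValueError("an etcd cluster needs at least one member")
    return members // 2 + 1


def etcd_fault_tolerance(members: int) -> int:
    """Number of members that can fail while a majority is still reachable."""
    return members - etcd_quorum(members)


if __name__ == "__main__":
    # Print cluster size, quorum size, and tolerated failures.
    for size in (1, 3, 4, 5):
        print(size, etcd_quorum(size), etcd_fault_tolerance(size))
```

A three-member cluster tolerates one failed member and a five-member cluster tolerates two; an even member count adds no extra tolerance over the next lower odd count.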
Active - Active Production
- deploy two active OpenShift clusters in separate datacenters
- synchronize application rollouts between datacenters using CI/CD pipelines
- spread application load across the two datacenters using an F5 GTM or NetScaler GSLB (Global Server Load Balancing, NetScaler Enterprise Edition)
- requires support from the application's backend datastore, i.e. replication, sharding, etc. across datacenters
- optionally, deploy the backend datastore in one datacenter and plan its failover separately from the application (e.g. a frontend app uses a backend Oracle database, which handles its own sync/replication/restore). This results in active/active on the frontend with active/passive on the backend: the cluster in the datacenter without the datastore must always be configured to send its database requests to the single datastore in the other datacenter
- alternatively, handle applications individually: applications that do not require persistent data can always run active/active, while those that do require persistent data run active/passive with managed replication
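The load-balancing behaviour described above can be sketched as follows. This is a simplified model of what a global load balancer does, not the configuration or API of F5 GTM or NetScaler: the `Site` and `GlobalBalancer` names, the datacenter labels, and the addresses (taken from the documentation-only IP ranges) are all assumptions for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Site:
    name: str             # hypothetical datacenter label
    vip: str              # hypothetical ingress address of that cluster
    healthy: bool = True  # result of the balancer's health check


class GlobalBalancer:
    """Round-robin across healthy sites; unhealthy sites are skipped."""

    def __init__(self, sites: List[Site]) -> None:
        self.sites = list(sites)
        self._next = 0

    def resolve(self) -> str:
        candidates = [s for s in self.sites if s.healthy]
        if not candidates:
            raise RuntimeError("no healthy datacenter available")
        site = candidates[self._next % len(candidates)]
        self._next += 1
        return site.vip


if __name__ == "__main__":
    dc1 = Site("dc1", "192.0.2.10")
    dc2 = Site("dc2", "198.51.100.10")
    balancer = GlobalBalancer([dc1, dc2])
    print(balancer.resolve(), balancer.resolve())  # alternates between sites
    dc1.healthy = False                            # simulate a datacenter outage
    print(balancer.resolve())                      # only dc2 is returned now
```

With both sites healthy the traffic alternates between clusters; once one site fails its health check, every request goes to the remaining site, which therefore needs enough capacity to carry the full load.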
Active - Passive Production
- deploy a secondary passive cluster in a separate datacenter from the active OpenShift cluster
- synchronize application rollouts between datacenters using CI/CD pipelines
- employ a load balancer to switch traffic to the secondary passive cluster on failover
- replicate application data from the active to the passive cluster on an interval short enough to keep data loss on failover acceptable
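How often to replicate depends on how much data the application can afford to lose. A minimal sketch of that arithmetic, assuming periodic (not continuous) replication and ignoring the time a replication run itself takes; the function names and the idea of a fixed "acceptable loss" window are illustrative assumptions, not something prescribed by OpenShift.

```python
from datetime import datetime, timedelta


def data_at_risk(last_replicated: datetime, failure_time: datetime) -> timedelta:
    """Window of writes that exist only on the failed active cluster."""
    if failure_time < last_replicated:
        raise ValueError("failure_time is earlier than the last replication")
    return failure_time - last_replicated


def interval_is_acceptable(interval: timedelta, acceptable_loss: timedelta) -> bool:
    """Worst case, the failure hits just before the next run,
    so up to one full interval of data can be lost."""
    return interval <= acceptable_loss


if __name__ == "__main__":
    last_run = datetime(2024, 1, 1, 12, 0)
    outage = datetime(2024, 1, 1, 12, 40)
    print(data_at_risk(last_run, outage))
    print(interval_is_acceptable(timedelta(hours=1), timedelta(minutes=15)))
```

In this example an hourly replication does not satisfy a 15-minute acceptable-loss window, so either the interval has to shrink or the application's datastore has to replicate continuously.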
Active Production - Active Nonprod Staging Standby
- deploy Production and Nonproduction in separate datacenters
- use isolated nodes in the Nonproduction cluster as the Production DR environment
- on failover, switch the applications' load balancers from Prod to the Nonprod DR nodes
- the environment is already in place; only the deployments need to be synchronized to the latest prod releases
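The synchronization step above amounts to comparing what runs in Production with what is deployed on the standby nodes. The sketch below assumes the deployed image per application has already been collected into plain dictionaries (for instance by a CI/CD pipeline); `plan_sync`, the application names, and the `registry.example.com` image references are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

# (deployment name, image currently on standby or None, image running in prod)
Change = Tuple[str, Optional[str], str]


def plan_sync(prod: Dict[str, str], standby: Dict[str, str]) -> List[Change]:
    """List every deployment whose standby image differs from prod,
    including deployments that are missing on the standby side."""
    changes: List[Change] = []
    for name in sorted(prod):
        wanted = prod[name]
        current = standby.get(name)
        if current != wanted:
            changes.append((name, current, wanted))
    return changes


if __name__ == "__main__":
    prod = {
        "api": "registry.example.com/shop/api:2.0",
        "web": "registry.example.com/shop/web:1.4",
    }
    standby = {
        "api": "registry.example.com/shop/api:2.0",
        "web": "registry.example.com/shop/web:1.3",
    }
    for name, current, wanted in plan_sync(prod, standby):
        print(f"{name}: {current} -> {wanted}")
```

An empty result means the standby already matches Production; running such a comparison on a schedule keeps the amount of work needed at failover time small.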