OpenShift HA
OpenShift HA
Good references and resources
-
https://www.openshift.com/blog/disaster-recovery-strategies-for-applications-running-on-openshift
-
https://www.openshift.com/blog/deploying-openshift-applications-multiple-datacenters
-
https://www.openshift.com/blog/stateful-workloads-and-the-two-data-center-conundrum
-
https://www.openshift.com/blog/disaster-recovery-with-gitops
-
https://cloud.redhat.com/blog/stateful-workloads-and-the-two-data-center-conundrum
-
https://cloud.redhat.com/blog/disaster-recovery-strategies-for-applications-running-on-openshift
-
https://cloud.redhat.com/blog/deploying-openshift-applications-multiple-datacenters
-
https://cloud.redhat.com/blog/geographically-distributed-stateful-workloads-part-two-cockroachdb
-
https://cloud.redhat.com/blog/geographically-distributed-stateful-workloads-part-3-keycloak
-
https://cloud.redhat.com/blog/geographically-distributed-stateful-workloads-part-four-kafka
-
https://cloud.redhat.com/blog/geographically-distributed-stateful-workloads-part-five-yugabytedb
DR options
Failure Scenarios
-
Complete datacenter outage
-
Availability Zone failure (network segment or group of racks)
-
Rack failure
-
Host failure
-
VM failure
-
OpenShift master failure
-
OpenShift etcd failure (etcd backup and restore)
Active - Active Production
-
deploy two active OpenShift clusters in separate datacenters
-
synchronize application rollouts between datacenters using CICD pipelines
-
spread application load balancing across two datacenters using a F5 GTM / Netscaler GSLB (Global Server Load Balancing, NetScaler Enterprise Edition.)
-
requires support from application’s backend datastore: i.e. replication/sharding/etc across datacenters
-
optionally, deploy backend datastore in one datacenter and plan for its failover separately from the application (e.g. front end app uses backend Oracle database, which handles its own sync/replication/restore): this results in active/active on the frontend with active/passive on the backend, where one active datacenter will always be configured to handle database requests to the single datastore in the other datacenter
-
you could choose to handle applications individually, and any applications that do not require persistent data can always run active/active while those that do require persistent data run active/passive with managed replication
Active - Passive Production
-
deploy a secondary passive cluster is a separate datacenter from the active OpenShift cluster
-
synchronize application rollouts between datacenters using CICD pipelines
-
employ a load balancer to switch to secondary passive cluster
-
replicate application data from active to passive cluster on a reasonable interval
Active Production - Active Nonprod Staging Standby
-
deploy Production and Nonproduction in separate datacenters
-
use Nonproduction isolated nodes as a Production DR
-
switch load balancers for applications from Prod to Nonprod DR
-
already have environment ready, just need to synchronize all deployments to latest prod releases