OpsDB runs on two servers per environment, xxxxx-opsdb01.geant.net and xxxxx-opsdb02.geant.net, where xxxxx is prod, uat, or test.
First Steps
If the system becomes unavailable for any reason, the first action is to make sure we have switched from the ‘Primary’ (01) instance of OpsDB to the ‘Secondary’ (02) instance. This allows general users to continue working in OpsDB whilst we investigate why it went down.
Change the Domain Name System (DNS) entry for OpsDB (i.e. Move from one instance to another)
Currently we direct all the OpsDB public domain URI calls to the ‘01’ instance (the ‘Primary’ instance) of the appropriate OpsDB VM.
If required (i.e. the 01 instance of a VM is down) we can change the DNS to point our public domain URI to the 02 VM (the ‘Secondary’ instance) whilst the 01 VM is being fixed – this should ensure the service continues to be available to the public.
This change has to be made by the systems administrator and is documented elsewhere.
Once this has been done the system should then be available to the users again whilst more detailed investigation takes place.
Please do not forget to inform the users that OpsDB is back up once this has been done.
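After a DNS change it can be useful to confirm which instance the public name now resolves to. The following is a minimal sketch only, not part of the documented procedure: it assumes the public name resolves to the same IPv4 address as one of the two VMs, and the host names in the usage comment are placeholders. Cached DNS answers may keep returning the old address until their TTL expires.

```python
import socket


def active_instance(public_name, primary_host, secondary_host,
                    resolve=socket.gethostbyname):
    """Report which VM the public name currently resolves to.

    Returns "primary", "secondary", or "unknown". The resolve argument
    exists so the lookup can be replaced in tests; by default it does a
    normal IPv4 DNS lookup.
    """
    public_ip = resolve(public_name)
    if public_ip == resolve(primary_host):
        return "primary"
    if public_ip == resolve(secondary_host):
        return "secondary"
    return "unknown"


# Example call (all names are placeholders, substitute the real ones):
# active_instance("opsdb.example.org",
#                 "prod-opsdb01.geant.net",
#                 "prod-opsdb02.geant.net")
```

A result of "unknown" would suggest the public name goes through a proxy or alias that this simple comparison does not cover.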
Further Investigation
The following points may help troubleshoot any issues that arise with this application.
Check Apache.
- Has Apache failed? Is it running?
...
This should start or restart the HTTP server (Apache) on the VM – please perform this on both VMs separately.
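To confirm that Apache is answering after a restart, a simple HTTP request against each VM can help. This is a hedged sketch rather than the documented check: the URL in the usage comment is a placeholder, and the opener argument is only there so the function can be exercised without a network. A certificate problem on an https URL is reported the same way as an unreachable server.

```python
import urllib.error
import urllib.request


def http_status(url, timeout=5.0, opener=urllib.request.urlopen):
    """Return the HTTP status code for url, or None if nothing answered."""
    try:
        with opener(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # The server is up but returned an error status such as 500.
        return err.code
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure, TLS failure, etc.
        return None


# Example call (placeholder URL, run once per VM):
# http_status("http://prod-opsdb01.geant.net/")
```

A return value of None points at Apache or the network; a 5xx code suggests Apache is running but the application behind it is failing.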
Check MySQL.
- Is the MySQL instance running?
...
This should start or restart MySQL on the VM – please perform this on both VMs separately.
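A quick way to see whether MySQL is accepting connections is to test its TCP port. The sketch below assumes MySQL listens on its default port, 3306; if the instance is bound to localhost only, the check has to be run on the VM itself. It shows only that something is listening, not that the database is healthy.

```python
import socket


def port_open(host, port=3306, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, timed out, or host not reachable.
        return False


# Example call, run on the VM being checked:
# port_open("127.0.0.1")
```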
Recovery of MySQL Data
Currently MySQL data backups are stored in the /opt/backups/mysql folder on each VM.
...
To restore any of these data sets, locate the appropriate DB dump and follow the MySQL restore procedure (documented elsewhere, in the MySQL documentation).
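Locating the appropriate dump usually starts with the most recent ones. The sketch below lists the newest files in a backup folder by modification time. The file name pattern is an assumption, since the naming of the dumps is not described here; pass the backup folder named above as backup_dir and adjust the pattern to match the real files.

```python
from pathlib import Path


def newest_dumps(backup_dir, pattern="*.sql*", limit=5):
    """Return up to `limit` files matching pattern, newest first.

    The default pattern is a guess that would match names such as
    dump.sql or dump.sql.gz; it is not taken from the OpsDB setup.
    """
    files = [p for p in Path(backup_dir).glob(pattern) if p.is_file()]
    files.sort(key=lambda p: p.stat().st_mtime, reverse=True)
    return files[:limit]


# Example call:
# for dump in newest_dumps("/path/to/mysql/backups"):
#     print(dump)
```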
Security updates for underlying software and operating systems
OpsDB is, in terms of software, an ‘old lady’ now, awaiting retirement.
...
HTML and JavaScript are currently supported and have no planned end-of-support dates; in fact, older versions are more widely supported than the latest ones.
Check disk usage
Is the VM disk full?
...
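The disk question can also be answered from a script, for example as part of a routine check. This is a minimal sketch using Python's standard library; the path to check and any alert threshold are choices for the operator and are not defined by this document. The figure may differ slightly from what df reports, because df accounts for blocks reserved for root.

```python
import shutil


def disk_usage_percent(path="/"):
    """Return the percentage of the filesystem holding `path` that is used."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total


# Example call:
# print(f"{disk_usage_percent('/'):.1f}% used")
```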