Work in 24x7 rotational shifts.
Monitor, install, configure and troubleshoot our production servers and services.
Maintain 100% uptime of the production services.
Ensure that our monitoring tools catch and generate alerts on all production issues.
Follow the escalation process until each issue is resolved, and document the resolution afterward.
Follow regular Operations procedures and complete all assigned tasks during the shift.
Assist in root cause analysis of production issues and help write a report that includes details of the failure, the relevant log entries, and the likely root cause.
Send periodic NOC reports on system and service status.
Monitor backups and disaster recovery, including verifying backups, performing restoration tests, and participating in disaster recovery drills.
Build a knowledge base by creating and updating support documentation.