-Work in 24x7 rotational shifts.
-Monitor, install, configure and troubleshoot our production servers and services.
-Maintain 100% uptime of the production services.
-Ensure that our monitoring tools catch and generate alerts on all production issues.
-Follow the escalation process through to issue resolution, and provide documentation after resolution.
-Follow regular Operations procedures and complete all assigned tasks during the shift.
-Assist in root cause analysis of production issues and help write a report that includes details about the failure, the relevant log entries, and the likely root cause.
-Send periodic NOC reports with the system and service status.
-Monitor backups and disaster recovery, including verifying backups, performing restoration tests, and participating in disaster recovery drills.
-Build a knowledge base by creating and updating documentation for support.