Job Description
1. Main Purpose (Why does the role exist?)
The main purpose of the Infrastructure and Cloud Specialist role is to design and implement infrastructure and cloud environments, ensuring the availability, scalability, security, and efficiency of the organization’s infrastructure and cloud systems, with guidance from the Technology and Infrastructure Lead.
2. Education
A tertiary qualification, such as a BSc in Computer Science, Information Systems, or a related field. Practical experience and technical skills are also important.
3. Experience
- 5+ years of practical experience working with cloud platforms, such as Amazon Web Services (AWS), Microsoft Azure, or Google Cloud Platform (GCP).
- Demonstrated experience managing and maintaining on-premises infrastructure, including servers, storage systems, and network devices.
4. Certifications
Any of the following certifications would be desirable: AWS Certified Solutions Architect; Cisco Certified Network Associate (CCNA) Data Center; NetApp Certified Storage Associate; Microsoft Certified: Azure Solutions Architect; Google Cloud Certified – Professional Cloud Architect.
5. Knowledge
- Knowledge of core services, deployment models, networking, security, and resource management.
- Strong knowledge of server administration and operating systems such as Linux (e.g., Ubuntu, CentOS, Red Hat) and Windows Server.
- Understanding of cloud deployment models and of integrating and managing hybrid environments to maintain cloud-based solutions.
- Familiarity with virtualization technologies, including virtual machines (VMs) and containerization platforms such as Docker and Kubernetes.
- Knowledge of storage technologies, including SAN, NAS, and object storage, as well as database systems such as MySQL, PostgreSQL, SQL Server, and MongoDB.
- Strong understanding of cloud principles, best practices, and compliance frameworks.
- Knowledge of monitoring and management tools, and familiarity with backup tools and disaster recovery solutions.
6. Skills
- Proficiency with server operating systems, including Windows and Linux distributions (such as Ubuntu, CentOS, or Red Hat). Familiarity with tools such as Terraform, Ansible, or CloudFormation is valuable for automating infrastructure deployment and managing configuration changes.
- Strong knowledge and experience with cloud platforms such as Amazon Web Services (AWS), Microsoft Azure, or Google Cloud Platform (GCP).
- Strong knowledge of server administration and operating systems such as Linux (e.g., Ubuntu, CentOS, Red Hat) and Windows Server.
- Familiarity with virtualization technologies like VMware, Hyper-V, or KVM, and containerization platforms such as Docker and Kubernetes.
- Strong scripting and automation skills using languages such as Python, PowerShell, or Bash.
- Ability to set up monitoring solutions, analyze performance metrics, identify bottlenecks, and implement optimizations to improve system performance.
- Experience in managing and optimizing database systems such as MySQL, PostgreSQL, or MongoDB.
- Proficiency in collaboration and DevOps tools such as Git, Jenkins, Jira, or Confluence.
- Ability to collaborate with cross-functional teams, stakeholders, and external vendors to drive digital transformation initiatives and ensure alignment with business goals.
7. Critical Deliverables/Core Accountabilities and Responsibilities
- Deliver infrastructure designs and implement them according to Letshego’s requirements and standards by designing and deploying servers, storage systems, networking components, and other infrastructure resources.
- Create detailed architecture and configuration documentation, including network diagrams, system architecture diagrams, and configuration specifications.
- Implement and configure cloud-based solutions, ensuring integration with existing on-premises infrastructure, based on the organization’s requirements. This includes managing server resources, allocation, and networking services, ensuring high availability and security.
- Develop automation scripts and templates using tools like Terraform, Ansible, or CloudFormation to streamline deployment and configuration processes.
- Implement monitoring and alerting systems to track the health, performance, and availability of infrastructure resources for compliance with Letshego standards.
- Provide technical support and troubleshooting assistance for infrastructure-related issues, including investigating and resolving incidents, performing root cause analysis, and implementing preventive measures to minimize future occurrences.
- Set up and implement backup and recovery procedures to ensure data integrity and availability.
- Identify and resolve security vulnerabilities within network configurations, and implement best practices.
- Maintain detailed controls and best practices for infrastructure components, including configuration management, system hardening, and secure technical standards. Maintain documentation, standard operating procedures (SOPs), and knowledge base articles that record infrastructure configurations, processes, and troubleshooting steps.
- Collaborate with developers, operations teams, and other stakeholders to ensure seamless integration of infrastructure components with applications and services.
8. Key Performance Indicators
- Infrastructure Availability: Uptime and availability of infrastructure components to ensure that critical systems are accessible and operational.
- Customer Satisfaction: Internal customer satisfaction with infrastructure services and support.
- Incident Response and Resolution: The effectiveness and efficiency of incident response and resolution processes.
- Backup and Recovery Performance: The effectiveness and efficiency of backup and recovery processes.
- Security Incident Response: The team’s ability to respond to and mitigate security incidents.
9. Complexity of the Role
- Managing infrastructure and applications across multiple cloud platforms, such as AWS, Azure, and Google Cloud, as well as on-premises infrastructure, can be complex, requiring a range of skills, task synchronization, and consistent security controls and best practices.
- Requires balancing competing demands, tight deadlines, multiple challenges, and changing priorities, while maintaining effective communication and collaboration between all involved teams.
- Requires collaboration with developers, application teams, digital designers, operations teams, and other stakeholders to ensure alignment and successful implementation of infrastructure projects.
Closing Date: 01 July 2025
Application email: [email protected]
Disclaimer: Only shortlisted candidates will be contacted.
Letshego Contact number: 364 3000
Remuneration: A competitive remuneration package will be offered to the suitable candidate.
Interested applicants should forward their Curriculum Vitae to the provided email address, indicating the position they are applying for in the subject line of the email: Ref: Infrastructure & Cloud Specialist.