Run a highly available SharePoint Server 2016 farm in Azure
About Alif: Alif empowers Microsoft MSP-CSP partners to provide exceptional IT services to their clients, so that the partners can reduce their costs and focus on their business. We provide white-labelled managed services for technologies like Microsoft Azure, Microsoft 365, Microsoft Dynamics 365, Microsoft Security, SharePoint, Power Platform, SQL, Azure DevOps and a lot more. Our headquarters is in Pune, India, and we work with over 50 partners across the globe who trust us with their client delivery.
This reference architecture shows proven practices for deploying a highly available SharePoint Server 2016 farm on Azure, using MinRole topology and SQL Server Always On availability groups. The SharePoint farm is deployed in a secured virtual network with no Internet-facing endpoint or presence.
This architecture builds on the one shown in Run Windows VMs for an N-tier application. It deploys a SharePoint Server 2016 farm with high availability inside an Azure virtual network (VNet). This architecture is suitable for a test or production environment, a SharePoint hybrid infrastructure with Microsoft 365, or as the basis for a disaster recovery scenario.
The architecture consists of the following components:
Resource groups. A resource group is a container that holds related Azure resources. One resource group is used for the SharePoint servers, and another resource group is used for infrastructure components that are independent of VMs, such as the virtual network and load balancers.
Virtual network (VNet). The VMs are deployed in a VNet with a unique intranet address space. The VNet is further subdivided into subnets.
Virtual machines (VMs). The VMs are deployed into the VNet, and private static IP addresses are assigned to all of the VMs. Static IP addresses are recommended for the VMs running SQL Server and SharePoint Server 2016, to avoid issues with IP address caching and changes of addresses after a restart.
Availability sets. Place the VMs for each SharePoint role into separate availability sets, and provision at least two virtual machines (VMs) for each role. This configuration makes the VMs eligible for a higher service level agreement (SLA).
Internal load balancer. The load balancer distributes SharePoint request traffic from the on-premises network to the front-end web servers of the SharePoint farm.
Network security groups (NSGs). For each subnet that contains virtual machines, a network security group is created. Use NSGs to restrict network traffic within the VNet, in order to isolate subnets.
Gateway. The gateway provides a connection between your on-premises network and the Azure virtual network. Your connection can use ExpressRoute or site-to-site VPN.
Windows Server Active Directory (AD) domain controllers. This reference architecture deploys Windows Server AD domain controllers. These domain controllers run in the Azure VNet and have a trust relationship with the on-premises Windows Server AD forest. Client web requests for SharePoint farm resources are authenticated in the VNet rather than sending that authentication traffic across the gateway connection to the on-premises network. In DNS, intranet A or CNAME records are created so that intranet users can resolve the name of the SharePoint farm to the private IP address of the internal load balancer.
SharePoint Server 2016 also supports using Azure Active Directory Domain Services. Azure AD Domain Services provides managed domain services so that you don't need to deploy and manage domain controllers in Azure.
SQL Server Always On availability group. For high availability of the SQL Server database, we recommend SQL Server Always On availability groups. Two virtual machines are used for SQL Server. One contains the primary database replica, and the other contains the secondary replica.
Majority node VM. This VM allows the failover cluster to establish a quorum.
SharePoint servers. The SharePoint servers perform the web front-end, caching, application, and search roles.
Jump box. Also called a bastion host, this is a secure VM on the network that administrators use to connect to the other VMs. The jump box has an NSG that allows remote desktop (RDP) traffic only from public IP addresses on a safe list.
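The jump box rule described above can be sketched in a few lines of Python. This is only an illustration of the filtering logic, not an actual NSG definition: the address ranges in SAFE_LIST are documentation-only placeholders, and the function name is_allowed is hypothetical.

```python
import ipaddress

# Hypothetical safe list of administrator source ranges
# (placeholder addresses reserved for documentation).
SAFE_LIST = ["203.0.113.0/24", "198.51.100.17/32"]
RDP_PORT = 3389  # default Remote Desktop port


def is_allowed(source_ip, dest_port, safe_list=SAFE_LIST):
    """Return True only for RDP traffic from a safe-listed source range."""
    if dest_port != RDP_PORT:
        return False
    addr = ipaddress.ip_address(source_ip)
    return any(addr in ipaddress.ip_network(cidr) for cidr in safe_list)
```

For example, is_allowed("203.0.113.5", 3389) passes, while the same source on any other port, or any source outside the safe list, is rejected.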
To scale up the existing servers, simply change the VM size.
With the MinRole capability in SharePoint Server 2016, you can scale out servers based on the server's role and also remove servers from a role. When you add servers to a role, you can specify any of the single roles or one of the combined roles. If you add servers to the Search role, however, you must also reconfigure the search topology using PowerShell. You can also convert a server from one role to another using MinRole.
Note that SharePoint Server 2016 doesn't support using virtual machine scale sets for autoscaling.
This reference architecture supports high availability within an Azure region because each role has at least two VMs deployed in an availability set.
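As a rough illustration of that rule, the following Python sketch checks a farm inventory for roles that have fewer than two VMs or whose VMs are not all in one availability set. The inventory, VM names, and the function roles_lacking_redundancy are hypothetical; in practice the data would come from your own deployment records.

```python
from collections import defaultdict

# Hypothetical inventory: (vm_name, role, availability_set)
FARM = [
    ("sp-wfe-1", "front-end", "as-wfe"),
    ("sp-wfe-2", "front-end", "as-wfe"),
    ("sp-app-1", "application", "as-app"),
    ("sp-app-2", "application", "as-app"),
    ("sp-srch-1", "search", "as-search"),
]


def roles_lacking_redundancy(vms, minimum=2):
    """Return the roles that do not have `minimum` VMs in a single availability set."""
    sets_by_role = defaultdict(list)
    for _name, role, avail_set in vms:
        sets_by_role[role].append(avail_set)

    failing = []
    for role, sets in sets_by_role.items():
        too_few = len(sets) < minimum
        split_across_sets = len(set(sets)) != 1
        if too_few or split_across_sets:
            failing.append(role)
    return sorted(failing)


print(roles_lacking_redundancy(FARM))
```

With the sample inventory, only the search role is reported, because it has a single VM.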
To protect against a regional failure, create a separate disaster recovery farm in a different Azure region. Your recovery time objectives (RTOs) and recovery point objectives (RPOs) will determine the setup requirements. The secondary region should be the region paired with the primary region. In the event of a broad outage, recovery of one region out of every pair is prioritized.
To operate and maintain servers, server farms, and sites, follow the recommended practices for SharePoint operations.
The tasks to consider when managing SQL Server in a SharePoint environment may differ from the ones typically considered for a database application. A best practice is to fully back up all SQL databases weekly with incremental nightly backups. Back up transaction logs every 15 minutes. Another practice is to implement SQL Server maintenance tasks on the databases while disabling the built-in SharePoint ones.
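The backup cadence above can be expressed as a small scheduling function. This is a sketch under assumptions the text does not state: the weekly full backup is placed on Sunday at midnight, the nightly incremental backup at midnight on other days, and the name backup_due is hypothetical.

```python
from datetime import datetime

FULL_BACKUP_WEEKDAY = 6  # Sunday in datetime.weekday(); an assumed choice
NIGHTLY_HOUR = 0         # assumed time of the nightly backup (midnight)


def backup_due(ts):
    """Return which backup is due at `ts`: 'full', 'incremental', 'log', or None."""
    # Nothing is scheduled off the 15-minute marks.
    if ts.second != 0 or ts.minute % 15 != 0:
        return None
    # The nightly slot runs the weekly full backup or an incremental one.
    if ts.hour == NIGHTLY_HOUR and ts.minute == 0:
        return "full" if ts.weekday() == FULL_BACKUP_WEEKDAY else "incremental"
    # Every other 15-minute mark is a transaction log backup.
    return "log"
```

With a 15-minute log backup interval, the worst-case data loss from a database failure is roughly 15 minutes of transactions, which is worth comparing against your RPO.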
The domain-level service accounts used to run SharePoint Server 2016 require Windows Server AD domain controllers or Azure Active Directory Domain Services for domain-join and authentication processes. However, to extend the Windows Server AD identity infrastructure already in place in the intranet, this particular architecture uses two VMs as Windows Server AD replica domain controllers of an existing on-premises Windows Server AD forest.
In addition, it's always wise to plan for security hardening. Other recommendations include:
Add rules to NSGs to isolate subnets and roles.
Don't assign public IP addresses to VMs.
For intrusion detection and analysis of payloads, consider using a network virtual appliance in front of the front-end web servers instead of an internal Azure load balancer.
As an option, use IPsec policies for encryption of cleartext traffic between servers. If you are also doing subnet isolation, update your network security group rules to allow IPsec traffic.
Install anti-malware agents for the VMs.
Use the Azure pricing calculator to estimate costs. Here are some factors for optimizing the cost for this architecture.
Active Directory Domain Services
Consider having Active Directory Domain Services as a shared service that is consumed by multiple workloads to lower costs.
For the VPN gateway, the billing model is based on the amount of time the gateway is provisioned and available.
All inbound traffic is free. All outbound traffic is billed. Internet bandwidth costs are applied to VPN outbound traffic.
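The gateway billing model can be approximated with simple arithmetic. The sketch below assumes flat rates; the function name and all rate values are placeholders supplied by the caller, not actual Azure prices, so use the Azure pricing calculator for real figures.

```python
def gateway_monthly_cost(hours_provisioned, outbound_gb, inbound_gb,
                         hourly_rate, outbound_rate_per_gb):
    """Estimate gateway cost: provisioned time plus outbound data transfer.

    Inbound traffic is accepted as an argument only to make explicit
    that it contributes nothing to the total.
    """
    time_cost = hours_provisioned * hourly_rate
    outbound_cost = outbound_gb * outbound_rate_per_gb
    inbound_cost = inbound_gb * 0.0  # inbound traffic is free
    return time_cost + outbound_cost + inbound_cost
```

For example, with placeholder rates of 0.5 per hour and 0.25 per GB, a gateway provisioned for 730 hours that sends 100 GB outbound would be estimated at 730 × 0.5 + 100 × 0.25 = 390, regardless of how much data it receives.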
Azure Virtual Network is free. Every subscription is allowed to create up to 50 virtual networks across all regions. All traffic that originates within the boundaries of a virtual network is free. So, communication between two VMs in the same virtual network is free.