#VCP #Storage #Concept

vSAN is VMware's own hyperconverged, network-distributed storage. Local storage devices (DAS, direct-attached storage) from ESXi hosts are pooled to form a shared vSAN datastore. vSAN runs natively in the ESXi hypervisor, is classified as object-based storage, and is managed through the vSphere Client.

One can imagine vSAN as a network-distributed RAID where the local disks of the hosts are used as shared storage: each VM's data is stored as multiple copies on different nodes in the cluster. The number of copies, for data protection and performance, can be configured per object.

There are 2 different types of vSAN:
- vSAN OSA (Original Storage Architecture)
- vSAN ESA (Express Storage Architecture)

## Requirements/Restrictions
- vSphere 5.5 U1 or later
- only a single vSAN datastore is created per cluster
- RDM is not supported
- SIOC, DPM and VMFS are not supported on a vSAN datastore

## Fault domains
Configure fault domains to protect against rack or chassis failures. vSAN places the copies of the data in different fault domains (for example, hosts in different racks), so that the failure of an entire fault domain does not take all copies offline.

## Stretched cluster
vSAN supports stretching a cluster across different locations/datacenters. A witness in a third datacenter is required for this to work.

## Support for Windows Server failover clusters (WSFCs)
SCSI-3 persistent reservations are supported on virtual disks. vSAN support has some limitations, though: there is a maximum of 6 application nodes in each vSAN cluster.

## Skyline Health
Skyline Health is integrated into each vSAN cluster and can be viewed in the vSphere Client. It shows a general health score from 1-100, raises alerts about problems inside the cluster and gives recommendations on how to remediate them.

## Integration with vSphere storage features
Snapshots, linked clones and replication are all supported. Third-party backup solutions like Veeam, Nakivo, Altaro or Zerto are fully supported.

## Deduplication and compression
Block-level deduplication and compression are available in vSAN. They are configured at the cluster level and applied on each disk group.

## Data at rest encryption
Encrypts data that is stored on the drives, as opposed to data in transit. Encryption is applied after other processing such as deduplication and compression. If drives are removed from a host, the data on them remains encrypted.

## vSAN specific storage policies
### Primary Level of Failures to tolerate (PFTT)
Describes how many host and device failures VM objects can survive. Where n = "failures to tolerate", mirroring stores the data in n+1 locations.

### Secondary Level of Failures to tolerate (SFTT)
Used in stretched clusters. It describes the number of additional host failures the object tolerates after a complete site has failed. The sites must be correctly configured in the vSAN configuration. Maximum value is 3.

Example: with PFTT = 1 and SFTT = 2, one site can be unavailable and the cluster can still tolerate two additional host failures.

For more information: [Docs](https://docs.vmware.com/en/VMware-vSphere/7.0/com.vmware.vsphere.vsan.doc/GUID-08911FD3-2462-4C1C-AE81-0D4DBC8F7990.html)

### Data locality
Allows objects to be limited to a specific site. The default setting is none.

### Failure Tolerance Method
Defines the data replication mechanism, that is the RAID level: RAID-1 (mirroring) or RAID-5/6 (erasure coding).

### Number of disk stripes per object
Number of capacity devices across which each VM replica is striped. The higher the value, the more resources are consumed.

### Flash Read cache reservation
Only for hybrid vSAN (HDDs & SSDs mixed). Defines the percentage of a VM object's size that is reserved as read cache for that object.

### Force Provisioning
Can be either yes or no. Yes forces provisioning of objects even when the policy cannot be met.

### Object Space Reservation
Percentage of a VMDK object's size that must be reserved (thick provisioned) on deployment.

### Disable object Checksum
Checksums are used for integrity checks, to make sure that the copies of data spread across a vSAN cluster are identical. Yes = checksum is not calculated.

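To make the failures-to-tolerate arithmetic from the storage policies above more concrete, here is a minimal Python sketch. It is not a VMware API: `replica_count` and `raw_capacity_gb` are hypothetical helper names, the allowed FTT range of 0 to 3 is an assumption, and witness components and other overhead are ignored. The RAID-1 rule (n+1 copies) comes from the PFTT section, and the RAID-6 multiplier of 1.5x is the figure quoted in the RAID 5/6 erasure coding section of this note.

```python
# Rough capacity estimate for a vSAN object under a given policy.
# Hypothetical helpers for illustration only; not part of any VMware SDK.

RAID6_MULTIPLIER = 1.5  # figure quoted in this note for FTT = 2 with RAID-6


def replica_count(ftt: int) -> int:
    """Number of full copies kept by RAID-1 mirroring: n + 1."""
    if not 0 <= ftt <= 3:  # assumed valid range
        raise ValueError("ftt must be between 0 and 3")
    return ftt + 1


def raw_capacity_gb(vmdk_gb: float, ftt: int, method: str = "RAID-1") -> float:
    """Raw datastore capacity consumed by a fully written VMDK.

    Ignores witness components, metadata and deduplication/compression.
    """
    if method == "RAID-1":
        return vmdk_gb * replica_count(ftt)
    if method == "RAID-6":
        if ftt != 2:
            raise ValueError("RAID-6 corresponds to ftt = 2")
        return vmdk_gb * RAID6_MULTIPLIER
    raise ValueError(f"unsupported method: {method}")


if __name__ == "__main__":
    # Compare both methods for a 100 GB VMDK with FTT = 2.
    for method in ("RAID-1", "RAID-6"):
        print(method, raw_capacity_gb(100, ftt=2, method=method))
```

For the same tolerance of two failures, the mirrored object consumes twice as much raw capacity as the erasure-coded one, which is the trade-off the Failure Tolerance Method policy controls.
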
## vSAN OSA (Original Storage Architecture)
### Requirements
- 3 ESXi hosts
    - 2 are possible with a virtualized witness appliance
- at least a 1 Gbit NIC on each host
    - 10 Gbit for all-flash
- dedicated VMkernel (VMK) port for vSAN
- per disk group (max. of 5 disk groups per host):
    - 1 cache drive (SSD)
    - 1-7 capacity drives (HDD/SSD)
- network latency of 1 ms RTT
    - 5 ms RTT between sites for a stretched cluster
    - 200 ms between witness and main site
- storage controller and all disks must be vSAN certified/listed in the HCL
- vCenter for configuration

### Configurations
#### Hybrid
vSAN is configured with traditional HDDs for capacity and SSDs as cache tier. As a rule of thumb, the SSD cache should be around 10% of the HDD capacity.

Frequently read data is stored on the faster cache SSDs; 70% of the cache device is reserved for this read cache, the remaining 30% is used as write buffer. If requested data is not in the cache, this is called a "cache miss" and the read has to be served from the capacity HDD.

Writes are mirrored: a copy of the data is sent to each node that holds a replica. The writes go to the write buffer on the SSD first and are then written to the capacity HDD.

#### All-Flash
A cache drive is still required. All-flash additionally enables compression/deduplication. The cache drive is only used as write cache; reads are served directly from the capacity disks.

### Disk groups
Disk groups are groups of drives (1 cache, 1 or more capacity) on an ESXi host that make up the capacity of the single vSAN datastore. A disk group is a container for multiple disks. You can configure multiple disk groups per host.

### Best practices
- dedicated VLAN for vSAN VMK traffic
- configure the same disk groups on each ESXi host, to balance I/O and failure tolerance more easily

## vSAN ESA (Express Storage Architecture)
ESA is designed for NVMe all-flash storage. Cache drives are no longer required in an ESA cluster because the ESA file system is log-structured, and there are no more disk groups.
It is generally much faster than OSA: performance in a RAID-5 configuration is almost as fast as a RAID-1 configuration in OSA. It also consumes less CPU and has better compression.

### Requirements
- at least a 10 Gbit NIC on each host
- all NVMe-based TLC flash disks
- maximum of 24 drives per node
- dedicated VMK port for vSAN

### Native snapshots
ESA brings a new snapshot system with up to 100x faster consolidation times. Snapshots are available to backup APIs, so backups should proceed much faster with vSAN 8.0 ESA.

### Security improvements
ESA uses encryption, but the encryption process only occurs on the host where the VM resides. In OSA, data had to be decrypted when it was moved between the cache and capacity tiers, which used additional CPU cycles. In ESA the data does not need to be moved between cache and capacity, which lowers CPU usage and overall I/O.

### Log-structured file system (LFS)
One of the biggest differences between OSA and ESA is that ESA operates on a log-structured file system. The LFS brings significant efficiencies throughout the stack and allows vSAN to ingest writes much more quickly. Incoming data is first written quickly and resiliently to a redundant mirror, then packaged so that vSAN can write it to a stripe with parity, all while maintaining the metadata efficiently.

### RAID 5/6 erasure coding
Erasure coding brings huge capacity savings compared to RAID-1 mirroring. With failures to tolerate (FTT) = 2, RAID-6 consumes 1.5x the size of the data, while RAID-1 with the same tolerance consumes 3x.

## Setup
1. create vSphere cluster
2. create dedicated vDS or vSwitch for vSAN (optional)
3. create VMKs on each host for vSAN
    1. on dedicated VLAN (recommended)
4. enable vSAN on the cluster
    1. either OSA or ESA
5. configure disk groups with cache/capacity (OSA) or claim disks (ESA)
6. configure fault domains

## 🔗Resources
- [vSAN Storage Policies](https://docs.vmware.com/en/VMware-vSphere/7.0/com.vmware.vsphere.vsan.doc/GUID-08911FD3-2462-4C1C-AE81-0D4DBC8F7990.html)