Aug 10 2012
 

First, I would like to thank everyone for participating. When I got the idea to run the contest I anticipated fewer than 10 people entering, but my expectations were greatly exceeded.

logo_trainsignal

Secondly, I would like to give a big THANK YOU to David Davis (blog / twitter) and the wonderful people at Trainsignal for sponsoring part of the giveaway. The quality video training they produce has definitely helped me along my quest for knowledge and certifications. Keep them coming!

Finally, the winner. I wanted to do a random drawing, but instead of throwing numbers into a hat and pulling one out, I decided to use the Get-Random cmdlet native to PowerShell. Here is the code:

[sourcecode language="powershell" padlinenumbers="true"]
# there were 33 entries for the contest
$entries = 1..33

# since this is for the VCP5, I decided to use Get-Random five times;
# the fifth random number drawn is the winner
$count = 0
while ($count -lt 5) { $winner = $entries | Get-Random; $count++; $winner }

[/sourcecode]

The Result

vcp5ga_winner

The lucky number is 6, and the sixth entry for the contest is Mark Latham. Congratulations Mark! I will be contacting you shortly.

Again, thank you to everyone that entered, and thank you for the honest feedback on the website; I will definitely be making some changes based on it. I will be at VMworld SF this year, so hit me up on Twitter or in the comments if you want to meet up and chat.

Aug 06 2012
 

For this objective I used the following documents:

Objective 3.3 – Implement and Maintain Complex DRS Solutions

Knowledge

**ITEMS IN BOLD ARE TOPICS PULLED FROM THE BLUEPRINT**

  • Explain DRS / storage DRS affinity and anti-affinity rules
    • DRS affinity and anti-affinity rules
      • Two types of rules exist; VM-Host affinity rules and VM-VM affinity rules
      • VM-Host affinity rules
        • Allows you to tie a virtual machine or group of virtual machines to a particular host or particular set of hosts. Also allows anti-affinity for said objects
        • Before creating a VM-Host affinity rule you need to create a DRS group and a host group
        • Decide whether it is a “must” rule or a “should” rule
          • “Must” rules will never be violated by DRS, DPM, or HA
          • “Should” rules are best effort and can be violated
      • VM-VM affinity rules
        • Used to keep virtual machines on the same host or ensure they do NOT run on the same host. If you have two servers that provide load-balancing for an application, it’s a good idea to ensure they aren’t running on the same host (see the PowerCLI sketch after this list)
      • VM-VM affinity rules shouldn’t conflict with each other. Meaning, you shouldn’t have one rule that separates virtual machines and another rule that keeps them together. If you have conflicting rules then the older rule wins and the new rule is disabled
    • Storage DRS affinity and anti-affinity rules
      • Storage DRS affinity rules are similar to DRS affinity rules, but instead of being applied to virtual machines and hosts they are applied on virtual disks and virtual machines when using datastore clusters
      • The three different storage DRS affinity/anti-affinity rules are Inter-VM Anti-Affinity, Intra-VM Anti-Affinity and Intra-VM Affinity (the “intra” rules are also known as VMDK anti-affinity and VMDK affinity)
        • Inter-VM anti-affinity allows you to specify which virtual machines should not be kept on the same datastore within a datastore cluster
        • Intra-VM anti-affinity lets you specify that the virtual disks belonging to a particular virtual machine be stored on separate datastores within a datastore cluster
        • Intra-VM affinity will store all of your virtual disks on the same datastore within the datastore cluster (this is the default)
      • Storage DRS affinity rules are invoked during initial placement of the virtual machine and when storage DRS makes its recommendations. A migration initiated by a user will not cause storage DRS to be invoked
      • You can change the default behavior for all virtual machines in a datastore cluster by modifying the Virtual Machine Settings

image

      • This allows you to specify VMDK affinity or VMDK anti-affinity
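
If you would rather script these rules than click through the GUI, PowerCLI can create and list VM-VM affinity/anti-affinity rules with New-DrsRule (VM-Host rules need the API or the GUI). A minimal sketch; the cluster and VM names are just placeholders:

[sourcecode language="powershell"]
# Connect to vCenter first
Connect-VIServer vcenter.lab.local

# VM-VM anti-affinity: keep two load-balanced web servers on different hosts
$cluster = Get-Cluster "Prod-Cluster"
New-DrsRule -Cluster $cluster -Name "Separate-Web-Servers" `
    -KeepTogether $false -VM (Get-VM "web01","web02")

# Verify the rules defined on the cluster
Get-DrsRule -Cluster $cluster | Select-Object Name, KeepTogether, Enabled
[/sourcecode]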

 

  • Identify required hardware components to support DPM
    • DPM uses one of the following methods to bring hosts out of standby:
      • Intelligent Platform Management Interface (IPMI)
      • HP Integrated Lights-Out (HP iLO)
      • Wake on LAN (WOL)
    • IPMI and HP iLO both require a baseboard management controller (BMC) – this allows access to hardware functions from a remote computer over the LAN
      • The BMC is always on, whether the host is powered on or not, enabling it to listen for power-on commands
    • IPMI that uses MD2 for authentication is not supported (use plaintext or MD5)
    • To use the WOL feature instead of IPMI or HP iLO the NIC(s) you are using must support WOL. More importantly, the physical NIC that corresponds to the vMotion vmkernel portgroup must be capable of WOL
      • In this case you can see that my vMotion vmkernel is located on vSwitch0, which has vmnic0 as its uplink
      • If you look at the Network Adapters section (host > configuration > network adapters) you can see that vmnic0 has WOL support

image
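
If you want to check WOL support from PowerCLI instead of clicking through the GUI, the physical NIC objects expose a WakeOnLanSupported property. A quick sketch (the host name is a placeholder):

[sourcecode language="powershell"]
# List the physical NICs on a host and whether they advertise Wake on LAN support
Get-VMHost "esx01.lab.local" |
    Get-VMHostNetworkAdapter -Physical |
    Select-Object Name, WakeOnLanSupported
[/sourcecode]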

 

 

  • Identify EVC requirements, baselines and components
    • Enhanced vMotion Compatibility (EVC) is used to mask certain CPU features from virtual machines when hosts in a cluster have slightly different processors than the other hosts in the cluster
    • An AWESOME knowledge base article answers a lot of questions about EVC; VMware KB1005764
    • There are multiple EVC modes so check out the VMware Compatibility Guide to see which mode(s) your CPU can run
    • Enable Intel VT or AMD-V on your hosts
    • Enable the execute disable bit (XD)
    • CPUs must be of the same vendor
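
A quick way to check EVC from the command line is to pull the highest EVC mode each host supports and the mode each cluster is currently running. A PowerCLI sketch; the MaxEVCModeKey property comes from the vSphere API host summary, and the EVCMode property on clusters may not exist on older PowerCLI builds:

[sourcecode language="powershell"]
# Highest EVC mode each host is capable of
Get-VMHost | Select-Object Name,
    @{N="MaxEVCMode";E={$_.ExtensionData.Summary.MaxEVCModeKey}}

# Current EVC mode per cluster (blank = EVC not enabled)
Get-Cluster | Select-Object Name, EVCMode
[/sourcecode]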

 

  • Understand the DRS / storage DRS migration algorithms, the Load Imbalance Metrics, and their impact on migration recommendations
    • DRS and Storage DRS use different metrics and algorithms, so I’ll talk about each of them separately
    • DRS
      • By default DRS is invoked every 5 minutes (300 seconds). This can be changed by modifying the vpxd configuration file, but it is highly discouraged and may or may not be supported
      • Prior to DRS performing load-balancing it will first try to correct any constraint violations that exist, such as DRS rule violations
      • Once constraints have been corrected, DRS moves on to load-balancing using the following process:
        • Calculates the Current Host Load Standard Deviation (CHLSD)
        • If the CHLSD is less than the Target Host Load Standard Deviation (THLSD) then DRS has no further actions to execute
        • If CHLSD is greater than the THLSD then:
          • DRS executes a “bestmove” calculation which determines which VMs are candidates to be vMotioned in order to balance the cluster. The CHLSD is then calculated again
          • The costs, benefits and risks are then weighed based on that move
          • If the migration does not exceed the costs, benefits, and risks threshold, the migration will get added to the recommended migration list
        • Once all migration recommendations have been added to the list, the CHLSD is then calculated based on simulating those migrations on the list
      • The tolerance for imbalance is based on the user-defined migration thresholds (five total). The more aggressive the threshold, the lower the tolerance is for cluster imbalance
      • For a much deeper dive into DRS calculations, check out chapter 14 of the vSphere 5 Technical Deepdive mentioned at the top of this post (a rough CHLSD-style calculation is also sketched at the end of this section)
    • Imbalance Calculation and metrics
      • As mentioned earlier, load imbalance is when the CHLSD is greater than the THLSD.
      • Some things that will cause the DRS imbalance calculation to trigger are:
        • Resource settings change in a virtual machine or resource pool
        • When a host is added/removed from a DRS cluster
        • When a host enters/exits maintenance mode
        • Moving a virtual machine in/out of a resource pool
    • Storage DRS
      • There are two types of calculations performed by Storage DRS; initial placement and load-balancing
      • As with DRS, Storage DRS has a default invocation period, however it is much longer – 8 hours is the default interval. Again, it is not recommended that you change the default interval
      • Initial placement takes datastore space and I/O metrics into consideration prior to placing a virtual machine on a datastore. It also prefers to use a datastore that is connected to all hosts in the cluster instead of one that is not
      • Storage DRS Load imbalance
        • Before load-balancing is taken into consideration, corrections to constraints are processed first. Examples of constraints are VMDK affinity and anti-affinity rule violations
        • Once constraint violations have been corrected, load-balancing calculations are processed and recommendations are generated
          • There are Storage DRS rules that are taken into account when the load-balancing algorithms run; Utilized Space and I/O Latency. Recommendations for Storage DRS migrations will not be made unless these thresholds are exceeded
          • Additionally, you can set advanced options that specify your tolerance for I/O imbalance and the percentage differential of space between source and destination datastores
            • Example: destination datastore must have more than a 10% utilization difference compared to the source datastore before that destination will be considered
        • Storage DRS also calculates a cost vs. benefits analysis (like DRS) prior to making a recommendation
    • Besides the standard invocation interval, the following will invoke Storage DRS:
      • If you manually click the Run Storage DRS hyperlink
      • When you place a datastore into datastore maintenance mode (the I/O latency metric is ignored during this calculation)
      • When you move a datastore into the datastore cluster
      • If the space threshold for a datastore is exceeded
    • There are a lot more technical details involved, such as workload and device modeling, but these facets of Storage DRS are complex and would make this post extremely long. If you care to review them, check out chapter 24 of the vSphere 5 Technical Deepdive mentioned at the top of this post
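
The CHLSD itself is just a standard deviation over per-host load. DRS computes load from dynamic entitlements, which you can't easily reproduce, but as a rough illustration you can calculate the standard deviation of current host CPU utilization across a cluster with PowerCLI. This is an approximation for intuition only, not the real DRS algorithm, and the cluster name is a placeholder:

[sourcecode language="powershell"]
# Rough "imbalance" number: standard deviation of per-host CPU utilization
$hosts = Get-Cluster "Prod-Cluster" | Get-VMHost
$loads = $hosts | ForEach-Object { $_.CpuUsageMhz / $_.CpuTotalMhz }

$mean     = ($loads | Measure-Object -Average).Average
$variance = ($loads | ForEach-Object { [math]::Pow($_ - $mean, 2) } |
            Measure-Object -Average).Average

[math]::Sqrt($variance)   # closer to 0 = more balanced cluster
[/sourcecode]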

 

 

Skills and Abilities

  • Properly configure BIOS and management settings to support DPM
    • This will be slightly different for each system depending on the BIOS that it’s running. You will also need to configure your IPMI or iLO settings if you are using either of those technologies to support DPM. Most IPMI controllers (BMCs) will have their own configuration screen that can be accessed when booting the host
    • Some BIOS implementations may require you to enable the WOL feature (if it’s an onboard NIC)

 

  • Test DPM to verify proper configuration
    • Before you can use the WOL option for DPM and enable it on a DRS cluster you must first successfully enter standby mode and power the host back on. If you aren’t able to successfully power the host back on after entering standby mode then you need to disable the power management setting for that host (a PowerCLI version of this test is sketched after these steps)
      • Log into the vSphere client
      • From the inventory tree right-click the cluster and select Edit Settings…
      • Under Power Management click on Host Options
      • On the right, find the host(s) that failed to exit standby and, under the Power Management column, select Disabled from the dropdown box

image

    • Click OK
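
A PowerCLI version of the same standby test looks roughly like the following; it assumes your PowerCLI build exposes Suspend-VMHost/Start-VMHost for standby operations, and the host name is a placeholder. Make sure the host has no running VMs (or DRS can evacuate them) before suspending it:

[sourcecode language="powershell"]
# Put the host into standby to exercise the IPMI/iLO/WOL path...
$vmhost = Get-VMHost "esx02.lab.local"
Suspend-VMHost -VMHost $vmhost -Confirm:$false

# ...then power it back on; if this fails, disable DPM for that host as shown above
Start-VMHost -VMHost $vmhost -Confirm:$false
[/sourcecode]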

 

  • Configure appropriate DPM Threshold to meet business requirements
    • As a business, all resources consumed cost money, and being as efficient as possible while still meeting business requirements is important. Using DPM can save you on unneeded power consumption, but you don’t want to use it to the point of negative returns. Setting the DPM threshold for your cluster(s) is an important consideration. You set the DPM threshold by:
      • Log into the vSphere client
      • From the inventory tree, right-click on your DRS cluster > click Edit Settings…
      • Under vSphere DRS click the Power Management setting

image

    • Here you can see that there are three different options you can choose; Off, Manual and Automatic
      • Off – power management is turned off
      • Manual – vCenter will give you recommendations during low resource utilization for hosts that can be put into standby mode
      • Automatic – vCenter will automatically place hosts in standby mode based on the DPM threshold that is set
    • Setting the Automatic option and figuring out the DPM threshold to use is where business requirements are factored in. Before we can make the correlation, let’s talk about the different migration thresholds. Like the DRS migration threshold, the DPM threshold is based on priority recommendations; the further to the right you move the slider, the more aggressive DPM becomes and recommendations with higher priority numbers start to be included
      • There are five priority recommendations from 1-5
      • With the slider all the way to the left, only priority one recommendations are generated. When you move the slider to the right one notch, only priority one and two recommendations are generated. Each notch to the right will include a new priority level
    • Consider your hard requirements regarding resource availability. Determine if your workloads can operate under resource contention should they need to wait for a host to be brought out of standby mode. Workloads can fluctuate, and while DPM will always keep enough resources powered on to satisfy admission control, it may not be able to react fast enough to meet resource demand (an API-based way to set the DPM behavior and threshold is sketched below)
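
PowerCLI 5.x doesn't have dedicated DPM cmdlets, but the cluster's DPM behavior and threshold can be set through the vSphere API. A hedged sketch; the DpmConfig property names are from the vSphere 5 API reference, the cluster name is a placeholder, and HostPowerActionRate runs 1 (conservative) through 5 (aggressive):

[sourcecode language="powershell"]
$cluster = Get-Cluster "Prod-Cluster"

$spec = New-Object VMware.Vim.ClusterConfigSpecEx
$spec.DpmConfig = New-Object VMware.Vim.ClusterDpmConfigInfo
$spec.DpmConfig.Enabled = $true
$spec.DpmConfig.DefaultDpmBehavior = "automated"   # or "manual"
$spec.DpmConfig.HostPowerActionRate = 3            # DPM threshold, 1-5

# Second argument $true means modify the existing config rather than replace it
$cluster.ExtensionData.ReconfigureComputeResource($spec, $true)
[/sourcecode]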

 

  • Configure EVC using appropriate baseline
    • EVC allows you to present the same CPU instruction sets to your virtual machines across a DRS cluster, even if the instruction sets of your physical CPUs across hosts are different. A few EVC requirements:
      • All hosts must have CPUs from the same vendor (Intel or AMD)
      • Hardware virtualization for each host should be enabled (Intel-VT or AMD-V)
      • Execute Disabled bit (Intel) or the No Execute bit (AMD) should be enabled in the BIOS
      • Any virtual machine that is running on a host with a higher CPU feature set than what is presented via the configured EVC baseline must be powered off prior to configuring EVC
        • If those virtual machines are not powered off then you will not be able to enable EVC
    • Unless you are using applications that take advantage of certain advanced CPU features that can potentially be masked by EVC you want to use the highest baseline compatible with your hosts. To configure EVC on a new cluster:
      • Log into the vSphere client
      • Right-click a datacenter from the inventory tree and click New Cluster…
      • Enter in a Name for the cluster (you will most likely want to enable DRS and HA, but for these purposes we’ll skip those steps and go straight to EVC)
      • Click Next
      • Choose Enable EVC for AMD Hosts for AMD processors or Enable EVC for Intel Hosts if using Intel processors
      • Choose an EVC mode
        • Intel

image

        • AMD

image

      • Each mode you select will give you a description of that mode, as well as the knowledge base article to look at (VMware KB1003212)
      • Complete the cluster configuration
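
Depending on your PowerCLI build, Set-Cluster may also expose an -EVCMode parameter, which makes this scriptable; if yours doesn't, stick with the GUI. The cluster name and mode key below are just examples:

[sourcecode language="powershell"]
# Set an EVC baseline on a cluster (assumes Set-Cluster supports -EVCMode in your build)
Get-Cluster "Prod-Cluster" | Set-Cluster -EVCMode "intel-nehalem" -Confirm:$false
[/sourcecode]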

 

  • Change the EVC mode on an existing DRS cluster
    • Changing the EVC mode or enabling EVC for the first time on an existing cluster can potentially be disruptive. As stated earlier, if you have virtual machines that are running on hosts that expose a higher level of advanced CPU features than are presented by the EVC baseline you want to configure, then those virtual machines must be powered off. To enable or change the EVC mode on an existing DRS cluster:
      • Log into the vSphere client
      • Right-click the DRS cluster you want to modify from the inventory tree > click Edit Settings…
      • Select the VMware EVC option > click the Change EVC Mode… button
      • Select Enable EVC for AMD Hosts or Enable EVC for Intel Hosts
      • Select the desired mode from the dropdown
        • If the mode you select is not compatible with the processors running in your hosts you will get errors

image

        • If the mode you select is not compatible, possibly due to powered-on virtual machines running on hosts with greater CPU features than the selected EVC mode, or possibly due to a misconfigured BIOS setting on a host, you will see the following error

image

        • When you choose a mode that is compatible, it will show as Validation Succeeded

image

        • Click OK when finished

 

  • Create DRS and DPM alarms
    • Since DPM is a facet of DRS, I’ll cover creating DRS and DPM alarms together
    • One of the best pre-configured alarms for DRS/DPM is the Exit Standby Error. This is an event based alarm, so it will only trigger when the host/cluster reports an event of a host not able to exit standby mode
    • To create a new DRS/DPM alarm for a cluster:
      • Log into the vSphere client
      • Select a cluster from the inventory tree > on the right, click the Alarms tab
      • Click the Definitions button > here you will see a list of pre-defined alarms
      • Right-click a white area of that pane > click New Alarm…
      • Enter in an Alarm name and Description > from the alarm type dropdown select Cluster

image

      • Click the Triggers tab > click the Add button > click the event in the event column to get a drop down
      • Select which event you want. Here are a few DRS/DPM alarm events

image

      • Click on the Actions tab > click Add > select a desired action from the dropdown and when the action should be initiated (when alarm goes from green to red, red to green, etc…)

image

      • Click OK when finished
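
If you just want to bolt a notification onto the pre-configured exit standby alarm mentioned above rather than build a new one, PowerCLI can do that in a couple of lines. The alarm name match and email address are placeholders:

[sourcecode language="powershell"]
# Add an email action to the built-in "exit standby" alarm
$alarm = Get-AlarmDefinition -Name "*standby*"
$alarm | New-AlarmAction -Email -To "vmware-admins@example.com" `
    -Subject "Host failed to exit standby"
[/sourcecode]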

 

  • Configure applicable power management settings for ESXi hosts
    • Power management settings can be set on the hosts themselves (the active policy) or within the DRS cluster settings for DPM purposes
    • Set the Active Policy power management for an ESXi host
      • Log into the vSphere client
      • Select a host from the inventory tree > click the Configuration tab on the right
      • In the Hardware pane click the Power Management hyperlink > click the Properties hyperlink
      • Choose from one of the following power management policies
        • High Performance
        • Balanced
        • Low power
        • Custom
    • Set the Power management setting for DRS/DPM
      • Log into the vSphere client
      • Right-click a DRS cluster from the inventory tree > click Edit Settings…
      • Click Host Options under vSphere DRS > Power Management
      • Here you will see a list of hosts that are part of the DRS cluster, under the Power Management column choose from one of the following settings
        • Default
        • Disabled
        • Manual
        • Automatic

image

        • Click OK
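
The active host power policy can also be set without the GUI, via the host's PowerSystem managed object. A sketch; the host name is a placeholder, and it lists the available policies with their keys first so you aren't guessing at the key numbers:

[sourcecode language="powershell"]
$vmhost   = Get-VMHost "esx01.lab.local"
$powerSys = Get-View $vmhost.ExtensionData.ConfigManager.PowerSystem

# Show the policies the host offers and their keys (High Performance is key 1 on my hosts)
$powerSys.Capability.AvailablePolicy | Select-Object Key, ShortName, Name

# Activate a policy by key
$powerSys.ConfigurePowerPolicy(1)
[/sourcecode]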

 

  • Properly size virtual machine and clusters for optimal DRS efficiency
    • You don’t want to size your virtual machines to the cluster; rather, you want to size your clusters based on your virtual machines
    • Properly sizing your virtual machines and clusters can be tricky, especially if you don’t have hard requirements. Virtual machine sizing is the most important, and cluster sizes will be based on how you size your virtual machines, with a percentage added in for scale and redundancy
    • In order to get optimal DRS efficiency from your clusters you want to
      • Ensure each host has the same resource configuration (memory, CPU)
      • DRS Clusters support a maximum of 32 hosts and 3000 virtual machines
      • Put vMotion on a separate layer 2 network and use 10Gb if possible; also consider multi-NIC vMotion
      • Don’t set VM-HOST affinity rules (must rules) unless you absolutely have to
      • Don’t change the default automation level per virtual machine if you don’t have to
    • Don’t oversize your virtual machines; wasted resources can cause cluster imbalance
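
A quick PowerCLI report makes it easy to spot hosts in a cluster that don't match on CPU/memory capacity or build (the cluster name is a placeholder):

[sourcecode language="powershell"]
# Hosts with mismatched capacity reduce DRS efficiency; check that they are uniform
Get-Cluster "Prod-Cluster" | Get-VMHost |
    Select-Object Name, NumCpu, CpuTotalMhz, MemoryTotalMB, Version, Build |
    Format-Table -AutoSize
[/sourcecode]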

 

  • Properly apply virtual machine automation levels based upon application requirements
    • When creating a DRS cluster you set a virtual machine automation level for the cluster. There might be use cases where a virtual machine, or a set of virtual machines, requires a different level of automation than the cluster default. You can set automation levels for virtual machines individually
      • Do this sparingly. The more individual changes you make, the more management overhead you add, as well as potentially reducing the effectiveness of DRS
    • Why would you want to make changes to an individual virtual machine?
      • Applications might have to stay on a particular host due to licensing requirements
      • If you have an application that is constantly changing its memory contents, you may not want it to move hosts as often as other virtual machines
    • Apply automation levels to individual virtual machines
      • Log into the vSphere client
      • Right-click on a DRS cluster from the inventory tree and click Edit Settings…
      • Under the vSphere DRS option choose Virtual Machine Options
      • Ensure that the Enable individual virtual machine automation levels checkbox is checked
      • In the Automation Level column, change the virtual machine(s) to the desired automation level using the dropdown

vm_automation

      • Click OK

 

 

  • Administer DRS / Storage DRS
    • Administering a DRS cluster involves creating and managing DRS affinity and anti-affinity rules, DRS virtual machine groups, DRS cluster validation and standard addition/removal of hosts from the DRS cluster
    • All administration takes place within the GUI and almost all of it within the cluster settings
    • Administering DRS
      • Adding and removing hosts
        • This is pretty straight forward; right-click on the cluster and click Add Host and go through the wizard
        • To remove a host from the cluster the host must be in maintenance mode first (a PowerCLI sketch for both operations appears at the end of this section)
      • Cluster Validation
        • A cluster can become overcommitted or invalid. The cluster object in the inventory tree will show yellow for overcommitted and red for invalid. A cluster can become invalid if you make changes directly to a host, and those changes aren’t reflected in vCenter. When vCenter comes back into the mix, there is a mismatch, which causes it to become invalid
      • Creating VM Anti Affinity/Affinity rules
        • There is some overlap between some of the VCAP-DCA objectives and the VCP5 objectives. While I hate referring you to another link to get information, I feel that it isn’t very efficient to duplicate some of these items when I could be continuing with other objectives in the blueprint.
    • Storage DRS can only be used with a new construct known as Datastore Clusters. With this new construct come different points of administration, such as datastore maintenance mode, Storage DRS scheduled tasks, Storage DRS recommendations and, as with DRS, automation levels for individual virtual machines
    • Administering Storage DRS
      • Storage DRS Maintenance Mode
        • Must be manually invoked and is only available to datastores within a datastore cluster
        • Log into the vSphere client
        • Navigate to the Datastores and Datastore Clusters view (Ctrl+Shift+D)
        • Right-click on the datastore within the datastore cluster and click Enter SDRS Maintenance Mode

sdrs_maint

      • SDRS Scheduling
        • You can schedule Storage DRS to run at certain times (such as when few or no users are at the office) in order to move VMDKs to different datastores within the cluster. You then set the end settings, which will revert SDRS back to its original configuration, or to a configuration you specify
        • Log into the vSphere client
        • Navigate to the Datastores and Datastore Clusters view (Ctrl+Shift+D)
        • Right-click a datastore cluster from the inventory tree > click Edit Settings…
        • Choose SDRS Scheduling
        • Click the Add… button

sdrs_sched_add

        • Enter a Start and End time as well as the Frequency > click Next
        • At the Start Settings page enter in a Description
        • Choose the Automation Level (Manual or Fully Automated)
        • Enable the I/O metric for SDRS recommendations (optional)
        • Set the Utilized Space (%)
        • Set the I/O Latency (ms)
        • Decide and set your I/O imbalance threshold (see screenshot for description)

sdrs_sched_start

        • Click Next
        • At the End Settings page enter in a Description
        • Leave the Restore settings to the original configuration checkbox checked
          • If you uncheck this option, set the Utilized Space (%), I/O Latency (ms)  and the I/O imbalance threshold

sdrs_sched_end

        • Click Next
        • Click Finish
      • Storage DRS Recommendations
        • Before you can get Storage DRS recommendations, or use it at all, you need to make sure it is enabled on the datastore cluster

sdrs_enable

        • Log into the vSphere client
        • Navigate to the Datastores and Datastore Clusters view (Ctrl+Shift+D)
        • Choose a datastore cluster from the inventory tree > click the Storage DRS tab on the right
        • In the Storage DRS Recommendations pane you can choose any pending recommendations and apply them

sdrs_recommend

      • Storage DRS Virtual Machine Settings
        • There are two parts that make up virtual machine settings; the automation level and the option to keep VMDKs (disk affinity) together
        • Log into the vSphere client
        • Navigate to the Datastores and Datastore Clusters view (Ctrl+Shift+D)
        • Right-click a datastore cluster from the inventory tree > click Edit Settings…
        • Choose Virtual Machine Settings
        • For each virtual machine you want to change, set the Automation Level (Fully Automated, Manual, Default or Disabled)
        • Check/uncheck the box to Keep VMDKs together

sdrs_vm_options

        • Click OK
      • Storage DRS Rules
        • Like DRS, Storage DRS has rules you can set up. VM anti-affinity rules keep VMs separate, which means disks from those particular virtual machines will be kept on different datastores within the datastore cluster. The other option is VMDK anti-affinity, which separates the virtual disks that belong to a particular virtual machine onto different datastores within the datastore cluster
        • Log into the vSphere client
        • Navigate to the Datastores and Datastore Clusters view (Ctrl+Shift+D)
        • Right-click a datastore cluster from the inventory tree > click Edit Settings…
        • Choose Rules > click Add…
        • Enter in a Name for the new rule > choose the type of rule from the dropdown

sdrs_rules_vmdk

          • VMDK anti-affinity
            • Click Add
            • Click the Select Virtual Machine button
            • Choose a virtual machine from the list > click OK > choose the virtual disks you want to separate (in the screenshot below there is only one virtual disk; you need at least two before you can proceed)
            • Click OK

sdrs_rules_vmdk1

          • VM anti-affinity
            • Click Add
            • Select two or more virtual machines from the list
            • Click OK

sdrs_rules_vm1

            • Click OK when finished
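
As mentioned in the add/remove hosts bullet above, those pieces of DRS administration script cleanly as well. A sketch; the host/cluster names and credentials are placeholders:

[sourcecode language="powershell"]
# Add a host to a DRS cluster
Add-VMHost -Name "esx03.lab.local" -Location (Get-Cluster "Prod-Cluster") `
    -User root -Password "VMware1!" -Force

# Put the host into maintenance mode (required before removing it from the cluster)
Get-VMHost "esx03.lab.local" | Set-VMHost -State Maintenance -Evacuate
[/sourcecode]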

Tools

Jul 23 2012
 

For this objective I used the following documents:

  • Documents listed in the Tools section

Objective 3.2 – Optimize Virtual Machine Resources

Knowledge

**ITEMS IN BOLD ARE TOPICS PULLED FROM THE BLUEPRINT**

  • Compare and contrast virtual and physical hardware resources
    • At the most basic level, virtual resources allow you to overcommit your virtual machines. Virtual resources are carved out of physical resources and provide flexibility
    • Overcommitment of virtual resources is a good idea as long as it’s managed well. Configuring X amount of virtual resources on a virtual machine does not mean the commensurate physical resources will be used, which is where the flexibility of virtual resources comes in
    • While virtual hardware resources add overhead that physical resources do not, using virtual resources you can get the most out of the physical resources
      • I don’t have a lot more to say about this. I didn’t see any reference to this topic in the documentation and I believe the basic comparisons of physical vs. virtual still apply. If anyone has a reference from the documentation, please let me know in the comments
  • Identify VMware memory management techniques
    • With vSphere 5, new techniques were introduced to further optimize memory management
    • Hosts allocate memory to virtual machines based on their most recent working set size and relative shares to the resource pool. The working set size is monitored for 60 seconds (default period). This interval can be changed by modifying the advanced setting Mem.SamplePeriod
    • A cool new feature introduced with vSphere 5 is VMX Swap. I’ve explained this in previous objectives, but I’ll go through it real quick. Every VM has memory overhead, and that memory is reserved during power on. A chunk of that memory is reserved for the VMX process. Instead of using physical memory for the VMX process, VMX swap files are created during power on and memory needed for the VMX process is swapped to the VMX swap files instead of using memory. This feature is enabled by default and is invoked when the host memory is overcommitted
    • ESXi memory sharing allows virtual machines running the same operating systems and/or applications to, when possible, share memory pages. This technique is called Transparent Page Sharing (TPS). You can set advanced settings per-host to specify a custom interval for how often the host scans for memory and how much CPU resources to consume doing it. Those two settings are Mem.ShareScanTime and Mem.ShareScanGHz. The defaults are 60 (minutes) and 4 respectively
    • Memory Compression is a technique that is used right before pages start getting swapped to disk. Memory pages that can be compressed to 2KB or less are stored in what’s called the virtual machine’s compression cache. You can set the maximum size of the compression cache with the Mem.MemZipMaxPct advanced setting. The default is 10%. If you want to enable/disable memory compression use the Mem.MemZipEnable advanced setting, with the value 0 to disable and 1 to enable
    • Before getting into the memory reclamation techniques let’s talk about the idle memory tax. The idle memory tax is a construct that, during a time of contention, will reclaim idle memory that is held by a virtual machine. Jason Boche (blog / twitter) has an older, but still excellent and relevant blog post on the Idle Memory Tax (IMT). The more idle memory a virtual machine has, the more the tax goes up, effectively reclaiming more memory. There are two advanced settings associated with the idle memory tax; Mem.IdleTax and Mem.IdleTaxType. Mem.IdleTax is the maximum percentage of total guest memory that can be reclaimed by the idle memory tax, with a default of 75%. Mem.IdleTaxType specifies whether the tax increases/decreases based on the amount of idle memory (this is called variable and is the default). For this setting, 1 is the default (variable) and 0 is for a flat rate
    • There are two memory reclamation techniques that are used when memory contention exists amongst virtual machines; memory ballooning and memory swapping (a quick way to check whether either is happening is sketched after this list)
    • Memory ballooning uses the memory balloon driver, known as vmmemctl, which is loaded into the guest operating system as part of the VMware tools installation. Obviously, if the virtual machine doesn’t have VMware tools installed, it won’t have the balloon driver, which means the ballooning technique will be skipped and swapping may occur (this is bad!). When memory pressure exists the balloon driver determines the least valuable pages and swaps them to the virtual disk of the virtual machine via the guest operating system’s own swap space (this is not host swapping). Once the memory is swapped to virtual disk, the hypervisor can reclaim the physical memory that was backing those pages and allocate it elsewhere. Since ballooning performs swap to virtual disk, there must be sufficient swap space within the guest operating system. You can limit the amount of memory that gets reclaimed on a per-virtual machine basis by adding the sched.mem.maxmemctl line to the virtual machine configuration file. The value is specified in MB
    • There are two swap-to-disk mechanisms; swapping to disk and swapping to host cache. Swapping to disk is the same as it’s been in previous versions; a swap file is created (by default in the same location as the virtual machine’s configuration file) during power-on, and during times of memory contention, if ballooning doesn’t slow/stop contention or the balloon driver isn’t working or available, swapping to disk occurs. As for the swap file location, you can change this per-VM, per-host, or specify a datastore for an entire cluster
    • Host cache is new in vSphere 5. If you have a datastore that lives on an SSD, you can designate space on that datastore as host cache. Host cache acts as a write-back cache for the swap files of all virtual machines on that particular host. What this means is that pages that need to be swapped to disk will swap to host cache first, and then be written back to the particular swap file for that virtual machine
  • Identify VMware CPU load balancing techniques
    • CPU affinity is a technique that doesn’t necessarily imply load balancing, but it can be used to restrict a virtual machine to a particular set of processors. Affinity may not apply after a vMotion and it can disrupt ESXi’s ability to apply and meet shares and reservations
    • ESXi hosts can take advantage of multicore processors and use them to produce the most optimized performance for your virtual machines. The ESXi CPU scheduler is aware of the processor topology within the system and can see how the sockets, cores and logical processors are related to each other
      • By default, the CPU scheduler will spread the workload across all sockets in the system in undercommitted systems
      • You can override the default behavior by adding sched.cpu.vsmpConsolidate = True to the virtual machine configuration file. This setting will prevent the workload from being spread across all sockets and limit it to a single socket
    • Hyperthreading is a feature that only exists in certain Intel processor families. Hyperthreading breaks up a single core into two logical threads. This allows vCPU1 to execute instructions on thread1 while vCPU2 executes instructions on thread2
      • Be careful when setting manual CPU affinity when hosts have hyperthreading enabled. The scenario exists where two virtual machines get bound to the same core (one on thread1 and one on thread2) which could be detrimental to the performance of those workloads
      • Hyperthreading needs to be enabled in the host BIOS, and once that is done it should automatically be enabled in vSphere
    • NUMA (Non-Uniform Memory Access) nodes work differently than your standard x86 system, and therefore ESXi has a separate CPU scheduler; the NUMA scheduler. At a very high level, NUMA is an architecture that provides more than one memory bus. Each socket has its own bus to memory, and the physical processors have the option to access memory that isn’t located on their dedicated bus (that’s the non-uniform part)
      • When a virtual machine is allocated memory, the scheduler takes memory locality into account, meaning it will provide best effort in assigning memory from the home node (the home node is a term used to describe a processor and the memory local to that processor)
      • If there is an imbalance in the load, the NUMA scheduler can change a virtual machine’s home node on-the-fly (CPU DRS for NUMA?). Even though the virtual machine moves to a new home node, it does not automatically mean that its memory is relocated to the new home node; however, the scheduler has the ability to relocate remote memory to once again make it local
      • The dynamic load balancing algorithm will examine the load and decide whether a rebalance is needed; this happens every two seconds by default
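
To see whether ballooning or swapping is actually happening to a VM (see the reclamation techniques above), the relevant counters are exposed through Get-Stat. A sketch; the VM name is a placeholder and both values are reported in KB:

[sourcecode language="powershell"]
# Ballooned vs. swapped memory for a VM, last few realtime samples
$vm = Get-VM "app01"
Get-Stat -Entity $vm -Realtime -MaxSamples 5 `
    -Stat "mem.vmmemctl.average","mem.swapped.average" |
    Select-Object MetricId, Timestamp, Value
[/sourcecode]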

 

  • Identify pre-requisites for Hot Add features
    • There are a lot of different virtual hardware items that can be hot added to a virtual machine. Even though the topic doesn’t specifically refer to CPU and memory, that is what I’m going to focus on
    • Hot add cannot be enabled for all virtual machines; here are some of the prerequisites:
      • Only certain guest operating systems are supported for hot add, so ensure the guest operating system you are using supports it
      • Hot add must be enabled per virtual machine and the virtual machine must be powered off in order to enable it
      • If you are hot-adding multicore vCPUs then the virtual machine must be using hardware version 8
      • If you are hot-adding a vCPU to a virtual machine using virtual hardware 7, the number of cores per socket must be set to 1
      • The virtual machine MUST be hardware version 7 or later
      • Install VMware tools
    • You can perform hot-add operations through the standard vSphere client or the vSphere web client
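
Enabling CPU/memory hot add is a per-VM reconfigure, and per the prerequisites above the VM has to be powered off first. Through the vSphere API it looks roughly like this (property names per the vSphere 5 API; the VM name is a placeholder):

[sourcecode language="powershell"]
# Power the VM off first (cleanly, via VMware Tools), then enable hot add
$vm = Get-VM "app01"
Shutdown-VMGuest -VM $vm -Confirm:$false
# ...wait for the power off to complete before continuing...

$spec = New-Object VMware.Vim.VirtualMachineConfigSpec
$spec.CpuHotAddEnabled    = $true
$spec.MemoryHotAddEnabled = $true
$vm.ExtensionData.ReconfigVM($spec)
[/sourcecode]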

Skills and Abilities

  • Tune Virtual Machine memory configurations
    • In this section (and the rest of the “tuning” sections) I will not go over how to identify bottlenecks or misconfigurations (such as using ESXTOP to diagnose). I will simply be listing some recommended practices that should optimize and make your virtual machines more efficient. ESXTOP and vscsiStats will be covered in section 3.4 and in section 6
    • Pay attention to your virtual machine memory allocation. You don’t want to overcommit to the point where the VM starts swapping to host cache, or worse, disk. You can use the built-in performance charts and esxtop / resxtop to determine whether the VM is swapping pages to virtual disk or the host is swapping to disk (these items are covered in detail in section 3.4 and section 6)
    • Don’t oversize memory on your virtual machines
      • Even if you have the available physical memory, don’t allocate any more than what’s needed. Over-allocating memory wastes physical memory, as the more memory you allocate to a virtual machine, the more memory the vmkernel takes for overhead
    • Proceed cautiously when setting memory reservations and limits. Setting these too low or too high can cause unnecessary memory ballooning and swapping
    • Ensure VMware tools is installed and up-to-date
    • If you need to control priority over memory, use memory shares to determine relative priority
    • Use an SSD disk to configure Host cache
  • Tune Virtual Machine networking configurations
    • Here’s an easy one – use the paravirtualized network adapter, also known as the VMXNET3 adapter (a scripted way to switch an existing NIC is sketched after this list)
      • Requires VMware tools to be installed
      • Requires virtual machine hardware version 7 or later
      • Ensure the guest operating system is supported
      • Enable jumbo frames for the VM if the rest of the infrastructure is using jumbo frames
        • Is set in the guest OS driver
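
Switching an existing VM over to VMXNET3 can be scripted; keep in mind the guest sees it as a brand new adapter, so plan on redoing the IP configuration. A sketch with a placeholder VM name:

[sourcecode language="powershell"]
# Change a VM's network adapter type to the paravirtualized VMXNET3
Get-VM "app01" | Get-NetworkAdapter |
    Set-NetworkAdapter -Type Vmxnet3 -Confirm:$false
[/sourcecode]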

 

  • Tune Virtual Machine CPU configurations
    • If hyperthreading is enabled for the host, ensure that the Hyperthreaded Core Sharing Mode for your virtual machines is set to Any

ht_advcpu (1)

    • If you need to disable hyperthreading for a particular virtual machine, set the Hyperthreaded Core Sharing Mode to None
    • Select the proper hardware abstraction layer (HAL) for the guest operating system you are using
        • This only applies for the guest operating systems that have different kernels for single processor (UP) and multiple processors (SMP). Single vCPU would use UP and all others will use SMP
    • If your application or guest OS can’t leverage multiple processors then configure them with only 1 vCPU
    • If your physical hosts are using NUMA, ensure the virtual machines are hardware version 8, as this exposes the NUMA architecture to the guest operating systems, allowing NUMA-aware applications to take advantage of it. This is known as Virtual NUMA (a quick report to find VMs below version 8 is sketched below)
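
A quick report to find VMs that are not on hardware version 8 yet (and therefore can't present virtual NUMA) is as simple as:

[sourcecode language="powershell"]
# Report VM hardware versions; virtual NUMA requires hardware version 8 ("v8")
Get-VM | Select-Object Name, Version, NumCpu | Sort-Object Version
[/sourcecode]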

 

  • Tune Virtual Machine storage configurations
    • Logical disks you create inside the guest OS should be separated into separate VMDK files. In other words, have a 1:1 mapping between logical disks and VMDKs for your OS disk and data disks
    • Ensure the guest operating system disks are aligned with the VMFS volumes they reside on
      • Some guest operating systems (such as Windows Server 2008) do this automatically
    • Consider using the paravirtualized SCSI (PVSCSI) adapter
      • The PVSCSI adapter can provide higher throughput and lower CPU utilization
      • Requires virtual machine hardware version 7 or later
    • Large I/O requests have the potential to be broken up into smaller requests by the device driver within the guest OS. Modify the registry to increase the block size as fragmented I/O requests can reduce performance
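
Moving a VM's disks onto a PVSCSI controller can also be scripted. Do it with the VM powered off and make sure the guest already has the PVSCSI driver (it comes with VMware tools); the VM name is a placeholder:

[sourcecode language="powershell"]
# Switch a VM's SCSI controller to the paravirtualized type
Get-ScsiController -VM (Get-VM "app01") |
    Set-ScsiController -Type ParaVirtual -Confirm:$false
[/sourcecode]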

 

  • Calculate available resources
    • In vSphere 5 there are many visualizations within the GUI that will show you what the available resources are for a cluster, host or a virtual machine. You can also determine available resources for a host using esxtop
    • Determining available resources for a cluster can be done by viewing the Resource Distribution Chart
      • Log into the vSphere client
      • Navigate to the Hosts and Clusters view > select a cluster from the inventory tree
      • In the vSphere DRS pane on the right, click the View Resource Distribution Chart hyperlink
      • Here you can see CPU and Memory usages in MHz or MB or the percentage of each

clusres_cpu

clusres_mem

    • Viewing available host memory in the GUI
      • Log into the vSphere client
      • Navigate to the Hosts and Clusters view > select a host from the inventory tree
      • You can view the current host resource utilization, as well as available resources by taking the total capacity and subtracting the current usage > located in the Resources pane

hostres_cpumem

    • You can also calculate available host resources using esxtop
      • SSH into a host and type esxtop at the command line
      • On the CPU screen (default screen when running esxtop, press C to get to it) you’ll see two lines at the top called PCPU USED (%) and PCPU UTIL (%)

esxtop_pcpu

      • There is some difference between PCPU USED and PCPU UTIL, but for calculating available CPU resources, let’s focus on PCPU USED (%). You’ll see each physical CPU represented on this line and its corresponding USED percentage
      • PCPU USED (%) represents the effective work of that particular PCPU, thus allowing you to calculate the available resources per PCPU
      • You can also look at AVG, which is the last field in the PCPU USED (%) line and that averages all PCPUs. This would tell you the overall CPU resources used for the host (thus enabling you to calculate the available resources)
      • To calculate the available memory for a host in esxtop press the M button to navigate to the memory screen. This time we’ll focus on the second and third lines, which are PMEM /MB and VMKMEM /MB, respectively. This is physical memory represented in megabytes and vmkernel memory represented in megabytes

image

      • PMEM /MB will show you your total amount of physical memory, how much is being used by the vmkernel and how much memory is free on the host
      • For VMKMEM /MB the important items are the rsvd and ursvd fields. These represent, in MB, how much memory is reserved and unreserved for the host. Here is why these are important:
        • If your PMEM is showing 20GB of memory available, but the VMKMEM only shows 15GB ursvd (unreserved) then your virtual machines only have 15GB available to them
    • You can view the available virtual machine resources through the GUI as well
      • Log into the vSphere client
      • Navigate to the Hosts and Clusters view > select a virtual machine from the inventory tree
      • On the right, click on the Resource Allocation tab
      • Here you can see the physical CPU and Memory that is allocated to the virtual machine. You can also see what is being consumed (CPU) or is active (MEM) within the guest operating system

vmres_cpumem

      • Above you can see that this particular virtual machine could consume up to ~4.5GHz of CPU, but is only consuming 113MHz. You can also see that the VM has the potential to use ~9GB of memory, and it has consumed 6.4GB, but there is only 645MB active
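
The same numbers can be pulled in one pass with PowerCLI, which is handy when you want free CPU/memory across every host at once. A quick sketch:

[sourcecode language="powershell"]
# Free CPU (MHz) and free memory (MB) per host
Get-VMHost | Select-Object Name,
    @{N="FreeCpuMhz";E={$_.CpuTotalMhz - $_.CpuUsageMhz}},
    @{N="FreeMemMB"; E={$_.MemoryTotalMB - $_.MemoryUsageMB}} |
    Format-Table -AutoSize
[/sourcecode]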

 

  • Properly size a Virtual Machine based on application workload

 

  • Modify large memory page settings
    • Large memory page settings are configured per-host. Here is a list of the existing large page settings:

largepages

    • You can modify any of the settings above by doing the following
      • Log into the vSphere client
      • Navigate to the Hosts and Clusters view > select a host from the inventory tree
      • On the right, click the Configuration tab
      • In the Software pane click the Advanced Settings hyperlink
      • Click the Mem object on the left > find the setting you want to modify (see list above)

advset_mem
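
The same Mem.* values can be read and changed from PowerCLI; Get-AdvancedSetting with -Entity needs a reasonably recent PowerCLI build, and Mem.AllocGuestLargePage below is just one example of the large page settings shown above. The host name is a placeholder:

[sourcecode language="powershell"]
# View the host's large page related settings...
$vmhost = Get-VMHost "esx01.lab.local"
Get-AdvancedSetting -Entity $vmhost -Name "Mem.*LargePage*" | Select-Object Name, Value

# ...and change one (example: stop backing guest pages with host large pages)
Get-AdvancedSetting -Entity $vmhost -Name "Mem.AllocGuestLargePage" |
    Set-AdvancedSetting -Value 0 -Confirm:$false
[/sourcecode]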

  • Understand appropriate use cases for CPU affinity
    • CPU affinity is a tricky thing, and as a general practice, shouldn’t be used.
    • Some use cases (there are very few) that you want to use CPU affinity:
      • Simulating a workload
      • Load testing for an application
      • Certain workloads can also benefit from this
        • From a blog post by Frank Denneman: “When the virtual machine workload is cache bound and has a larger cache footprint than the available cache of one CPU, it can profit from aggregated caches”
    • Along with understanding the use cases for CPU affinity, it is important to understand some potential issues that are associated with it:
      • If you are using NUMA hardware, the NUMA scheduler may not be able to manage a virtual machine with CPU affinity, essentially disabling NUMA for that virtual machine
      • Hyperthreading-enabled hosts may not be able to fully leverage hyperthreading for a virtual machine with CPU affinity
      • Reservations and shares may not fully be respected for a virtual machine configured for CPU affinity
      • CPU affinity might not exist for a virtual machine across all hosts in a cluster during a migration
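
For the rare cases where you do need it, CPU affinity can be set through the vSphere API as well as the GUI. A sketch; the VirtualMachineAffinityInfo/AffinitySet names are from the vSphere 5 API and the VM name is a placeholder:

[sourcecode language="powershell"]
# Pin a VM to physical CPUs 2 and 3 (use sparingly, for the reasons above)
$vm = Get-VM "loadtest01"

$spec = New-Object VMware.Vim.VirtualMachineConfigSpec
$spec.CpuAffinity = New-Object VMware.Vim.VirtualMachineAffinityInfo
$spec.CpuAffinity.AffinitySet = 2,3
$vm.ExtensionData.ReconfigVM($spec)

# To clear the pinning later, reconfigure again with an empty AffinitySet
[/sourcecode]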

 

  • Configure alternate virtual machine swap locations
    • Configuring an alternate location for virtual machine swap files is a simple task, but can be mundane if you have to do it for a lot of virtual machines
      • Alternatively you could configure an alternate swap location for the host
        • Select host from inventory tree
        • Choose Configuration > click the Virtual Machine Swapfile Location hyperlink
        • Click the Edit… hyperlink (if this is disabled then you have to choose the Store the swapfile in datastore specified by host option for the cluster)

image

        • Select the datastore where you want to store the swapfile for all virtual machines on that host

image

        • Click OK
    • To configure the alternate swapfile location for a particular virtual machine
      • Log into the vSphere client
      • Navigate to the Hosts and Clustersview
      • Right-click a virtual machine from the inventory > click Edit Settings…
      • Click the Options tab > at the bottom click the Swapfile Location

image

      • Select one of the following options (as you can see in the screenshot above)
        • Default – which is either the cluster default or host default
        • Always store with the virtual machine – swapfile is stored in the same directory as the virtual machine
        • Store swap file in the host’s swapfile directory
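
Host and cluster swapfile placement is scriptable too, assuming your PowerCLI build exposes the -VMSwapfilePolicy and -VMSwapfileDatastore parameters; the cluster and datastore names are placeholders:

[sourcecode language="powershell"]
# Tell the cluster that each host specifies the swapfile datastore...
Get-Cluster "Prod-Cluster" | Set-Cluster -VMSwapfilePolicy InHostDatastore -Confirm:$false

# ...then point every host in the cluster at a dedicated swap datastore
Get-Cluster "Prod-Cluster" | Get-VMHost |
    Set-VMHost -VMSwapfileDatastore (Get-Datastore "swap-ds01")
[/sourcecode]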

Tools

Jul 18 2012
 

 

Welcome to the first ever ValCo Labs giveaway event! Since this is my contest/giveaway cherry, I don’t want to go too CRAZY because any subsequent contests/giveaways have the potential for being subpar, and nobody wants that!

How to Enter:

Leave a comment in the comments section and include the following:

  1. If you could change one thing about vSphere, what would it be?
  2. In an effort to make this site better, give an honest opinion of valcolabs.com (the site you are at!). Any feedback, good or bad, is welcome
  3. Name the podcasts you listen to, if any (I’m always curious what other technology enthusiasts listen to)
  4. Include an email address or some way to get in contact with you

 

 

Here’s what you get:

  • VMware vSphere 5 Training by David Davis and Elias Khnaser
    • Trainsignal’s vSphere 5 training was an invaluable resource to me when I was studying for my VCP5 exam. The audio files that came with it definitely helped me with my preparation.
  • vSphere Advanced Networking Training by Jason Nash
    • I wasn’t sure what to expect with this course. I mean, there’s a standard switch and a distributed switch, and some settings, what else is there? Answer, A CRAP TON MORE. I am halfway through this course and I can tell you it has a lot of good content. I’m currently preparing for my VCAP5-DCA exam and I will be using this to help me prepare.
  • Mastering VMware vSphere 5 (paperback or Kindle edition) by Scott Lowe
    • I read this book cover-to-cover during my VCP5 preparation and referenced it many times after. A MUST have for any aspiring VCP
  • One VCP5 voucher
    • Will be sent via email to the winner
  • One awesome TRAINSIGNAL sticker!
    • it’s a sticker, and it’s awesome

 

WP_000627

 

The winner will be chosen at random on August 9th

 

Rules and Restrictions

  1. You cannot participate If you work for VMware, EMC, or Trainsignal
  2. If you work for the Federal Government (Armed Services, DoD Civilian, etc..) you aren’t eligible for the VCP5 voucher (sorry, but the manner in which the voucher was obtained precludes said parties, if you win and are in the armed services, I’ll buy you a voucher)
  3. If you have your VCP5 already, please don’t participate, give others a chance
  4. If your first name is Josh you can participate
  5. I am willing to ship worldwide so there is no geographical discrimination
  6. If your last name is Atwell you can participate
  7. Following me on twitter (@joshcoen) will NOT increase your chances of winning
  8. If rules four and six both apply then you are ineligible for all prizes
Jul 13 2012
 

For this objective I used the following documents:

Objective 3.1 – Tune and Optimize vSphere Performance

Knowledge

**ITEMS IN BOLD ARE TOPICS PULLED FROM THE BLUEPRINT**

  • Identify appropriate BIOS and firmware setting requirements for optimal ESXi host performance
    • The BIOS settings on your hosts are an important thing to take into consideration when optimizing your environment. Here are some general guidelines (pulled from the aforementioned whitepaper) you can follow that will assist you in your optimization efforts
      • Ensure you are using the most up-to-date firmware for your host
      • Ensure all populated sockets are enabled
      • Enable “Turbo Boost” if your processor supports it (is this like the turbo button on my x486?)
      • If your processor(s) support hyper-threading, make sure it is enabled
      • Leave node interleaving disabled (enabling node interleaving essentially disables NUMA)
      • Disable any hardware devices that you won’t be using
      • Depending on your workload characteristics you may, or may not, want to disable cache prefetching features. Workloads that randomly access memory may get a performance boost if these features are disabled
      • Set the CPU power-saving features to “OS Controlled”
        • this will allow the hypervisor to control and manage these features
      • Last, but certainly not least, enable Hardware Virtualization (VT). You will know right away if this is NOT enabled if you try to boot a 64-bit virtual machine and get a ‘longmode’ error
  • Identify appropriate driver revisions required for optimal ESXi host performance
    • I don’t know exactly what it is they are looking for here, and I can’t find it in any of their product documentation. A few things that come to mind though:
      • Check the VMware HCL
        • From the dropdown you can select a category of what you are looking for

image

        • In this example I chose IO Devices
        • You can then select which VMware product and version and then select which vendor and I/O Device type
        • Click Update and View Results
        • Scroll through the list until you find the device you are looking for. The model of the device should be a hyperlink, click the hyperlink
        • Here you will see pertinent information for the release and the device driver to use

image

        • I have no idea if this is the ‘optimized’ driver, but logically you would think it’s the best device driver for that device, based on the product version
      • The next best place to look would be to check the vendor’s website and see if they have made a separate driver to use with your version of vSphere
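
PowerCLI's Get-EsxCli is a convenient way to pull the NIC driver names and the installed driver VIB versions in one place so you can compare them against the HCL. A sketch; the esxcli namespaces shown are the ESXi 5.x ones and the host name is a placeholder:

[sourcecode language="powershell"]
$esxcli = Get-EsxCli -VMHost (Get-VMHost "esx01.lab.local")

# Physical NICs and the driver each one is using
$esxcli.network.nic.list() | Select-Object Name, Driver, Description

# Installed VIB versions (includes driver VIBs) to compare against the HCL
$esxcli.software.vib.list() | Select-Object Name, Version, Vendor
[/sourcecode]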

Skills and Abilities

  • Tune ESXi host memory configuration
    • In this section (and the rest of the “tuning” sections) I will not go over how to identify bottlenecks or misconfigurations (such as using ESXTOP to diagnose). I will simply be listing some recommended practices that should optimize and make your hosts more efficient. Troubleshooting will be covered in section 6
    • One thing that you will see a lot of is blanket memory configurations for virtual machines, such as all Windows Server 2K8 R2 VMs getting a base of 4GB of RAM that will only be increased if needed. On the surface this seems like a good practice, but what if that VM only needs 3GB?
      • Virtual machine memory overhead is dependent on the configured memory size of a virtual machine, the more you configure, the more overhead it takes, the less memory is available for your other virtual machines.
        • Don’t under-configure the memory where the working set can’t keep up because of too little memory (thrashing)
        • Don’t over-configure the memory where the working set doesn’t use all the configured memory and now you have wasted more memory than needed on memory overhead
    • The same concept applies to the number of vCPUs you configure for a virtual machine; the more vCPUs you configure, the more memory overhead is needed.
      • Don’t give your virtual machines more vCPUs than what is needed. Doing so increases memory overhead
    • Memory over-commitment is a feature of vSphere, and VMware has 5 different mechanisms to deal with over-commitment. There are a few things to keep in mind when talking about over-commitment, and tuning our hosts to use it effectively
      • The biggest degradation of performance to a virtual machine is when the host starts swapping to disk. There are four other memory over-commitment techniques that are used before swapping to disk
        • Don’t disable these other memory over-commitment techniques; ballooning, page sharing and memory compression
    • Use the new swap to host cache feature
      • This is a new memory over-commitment technique that allows the host to swap to cache instead of to disk. The ‘cache’ it is referring to is a SSD disk
        • Configure a SSD as host cache, which will get much better performance than swapping to traditional disk
    • Virtual machine swap files are created in the VM working directory by default (typically where the .vmx file is located)
      • Ensure the location of those swap files has enough free disk space. The swap file is created dynamically during a power-on operation and its size is determined using the following calculation:
        • allocated memory – reserved memory = swap file size (.vswp)
      • Don’t place swap files on thin-provisioned disks
    • The biggest takeaway here should be: it’s OK to overcommit memory, but not to the point where you are swapping out to disk.
  • Tune ESXi host networking configuration
    • One thing that you want to monitor when thinking about virtual networking and how to make it perform as efficient as possible is your CPU utilization. Virtual networking relys heavily on the CPU to process the network queues. The higher CPU utilization you have, the less throughput you make get
    • DirectPath I/O may provide you a bump in network performance, but you really need to look at the use case. You can lose a lot of core functionlity when using this feature, such as vMotion and FT (some special exceptions when running on UCS for vMotion) so you really need to look at the cost:beneift ratio and determine if it’s worth the tradeoffs
    • You can control your bandwidth and how it is allocated by using Network I/O Control (NIOC). You allocate bandwidth to resource pools and use shares/limits to establish priority. There are seven pre-defined network resource pools:

image

      • There is also something called a user-defined resource pool in which you can create your own resource pool in order to prioritize other traffic not covered by the pre-defined pools
    • User-defined pools are pretty basic; all you can do is assign shares and a QoS priority tag. Let’s go through an example of creating a user-defined network resource pool:
      • Log into the vSphere client
      • Switch to the Networking view by selecting the View menu > select Inventory > select Networking (Ctrl + Shift + N)
      • Select a vDistributed Switch from the inventory on the left (remember that NIOC requires an enterprise+ license) > click the Resource Allocation tab
      • Click the New Network Resource Pool… hyperlink

image

      • Enter in a Name and Description
      • Set the Physical Adapter Shares value (Low, Normal, High or Custom)
      • If you uncheck the Unlimited option, be sure to enter the limit, in Mbps, that you want to set
      • Set a QoS Priority Tag if desired and select a tag from the dropdown (1-7)
      • Click OK

image

    • Use separate vSwitches with different physical adapters. Doing so should help avoid unnecessary contention between the VMkernel and virtual machines
    • The VMXNET3 paravirtualized adapter should be the standard, not the exception. When creating new virtual machines you should be asking yourself “Why shouldn’t I use VMXNET3?”, not “Why should I use VMXNET3?”
    • If you have network latency-sensitive applications you want to set the ESXi host power management policy to High Performance. You do this so resources aren’t powered down or throttled when your application needs them (a PowerCLI sketch follows the steps below)
      • Log into the vSphere client and navigate to the Hosts and Clusters view
      • Select a host from the inventory > click the Configuration tab
      • In the Hardware pane click the Power Management hyperlink
      • Click the Properties hyperlink in the upper right
      • Select the High Performance option
      • Click OK
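
Here is the same change scripted against the vSphere API with PowerCLI, a minimal sketch assuming the HostPowerSystem managed object is available on your build and that the "static" policy maps to High Performance on your hardware (the host name is hypothetical):

[sourcecode language="powershell"]
$vmhost   = Get-VMHost esx01.lab.local
$powerSys = Get-View $vmhost.ExtensionData.ConfigManager.PowerSystem

# List the available policies so you can confirm the right key on your hardware
$powerSys.Capability.AvailablePolicy | Select-Object Key, ShortName, Description

# "static" is typically the High Performance policy
$highPerf = $powerSys.Capability.AvailablePolicy | Where-Object { $_.ShortName -eq "static" }
$powerSys.ConfigurePowerPolicy($highPerf.Key)
[/sourcecode]
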
    • Also for applications that are sensitive to network latency you want to disable C1E and other C-states in the BIOS of the host(s) that the application may run on
    • When using the VMXNET3 networking adapter, there is a feature called virtual interrupt coalescing. Disabling this feature can improve performance for certain network latency-sensitive applications. However, be careful when disabling it as it may reduce performance for other types of workloads. This is set per-VM with the ethernetX.coalescingScheme advanced configuration option, which we’ll go over configuring in a later section
    • SplitRx Mode is a new feature that was introduced with vSphere 5.0 and it can improve performance for virtual machines in certain circumstances. Typically, networking traffic comes into a network queue and is processed by a single physical CPU. SplitRx Mode is a per-VM setting that allows network traffic coming into a single network queue to be processed by multiple physical CPUs
      • If the VM is a network appliance that passes traffic between virtual machines on the same host, then throughput may be increased with the use of SplitRx
      • If you have more than one virtual machine on the same host receiving multicast traffic from the same location then SplitRx can improve throughput and CPU efficiency
      • Enable SplitRx mode using the ethernetX.emuRxMode advanced configuration setting
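
Both of these per-VM options can be added from PowerCLI as well. A minimal sketch (the VM name and ethernet index are assumptions; check which ethernetX entry maps to the vNIC you care about, and note the VM needs a power cycle for the change to take effect):

[sourcecode language="powershell"]
$vm = Get-VM LatencyVM01   # placeholder VM name

# Disable virtual interrupt coalescing on the first VMXNET3 vNIC (ethernet0)
New-AdvancedSetting -Entity $vm -Name "ethernet0.coalescingScheme" -Value "disabled" -Confirm:$false

# Enable SplitRx mode on the same vNIC
New-AdvancedSetting -Entity $vm -Name "ethernet0.emuRxMode" -Value "1" -Confirm:$false
[/sourcecode]
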
  • Tune ESXi host CPU configuration
    • This may be a given, but turn on DRS. You don’t want a host getting overloaded with VMs and maxing out the CPU when there are other hosts in your cluster that have idle CPU cycles
    • Don’t configure your VMs for more vCPUs than their workloads require. Configuring a VM with more vCPUs than it needs will cause additional, unnecessary CPU utilization due to the increased overhead relating to multiple vCPUs
    • If your hardware supports Hyper-threading (the hardware itself and BIOS) then the hypervisor should automatically take advantage of it. If your hardware does support hyper-threading but it doesn’t show enabled in vCenter, ensure that you enable it in your hardware BIOS
      • Here you can see that hyper-threading is enabled

ht_procs

      • In vCenter you can enable/disable hyper-threading by going to the Configuration tab of the host > click the Processors hyperlink > click the Properties hyperlink

ht_enable

    • When using hyper-threading ensure that you leave the per-VM advanced CPU setting (Hyperthreaded Core Sharing) set to Any. Changing this setting to None will essentially disable hyper-threading for that particular virtual machine as it will place the other ‘core’ in a halted state

ht_advcpu

    • When dealing with NUMA systems, ensure that node interleaving is disabled in the BIOS. If node interleaving is set to enabled it essentially disables NUMA capability on that host
    • When possible configure the number of vCPUs to be equal to or less than the number of physical cores on a single NUMA node
      • When you configure vCPUs equal to or fewer than the physical cores of a NUMA node, the VM will get all of its memory from that single NUMA node, resulting in lower memory access latency
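
A quick PowerCLI sanity check for the points above, confirming hyper-threading is active and flagging VMs configured with more vCPUs than their host has physical cores (this is a coarse, host-level check; it doesn’t look at per-NUMA-node core counts):

[sourcecode language="powershell"]
# Is hyper-threading active on each host?
Get-VMHost | Select-Object Name, NumCpu, HyperthreadingActive

# Flag VMs configured with more vCPUs than their current host has physical cores
Get-VM | Where-Object { $_.NumCpu -gt $_.VMHost.NumCpu } |
    Select-Object Name, NumCpu, @{N="HostCores";E={$_.VMHost.NumCpu}}
[/sourcecode]
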
  • Tune ESXi host storage configuration
    • Enable Storage DRS. Even if you set it to manual, enable Storage DRS in order to get the initial placement recommendations. Storage DRS is enabled by creating a Datastore Cluster. This has been covered in Objective 1.2 – Manage Storage Capacity in a vSphere Environment so I won’t go over it again, but just know that you should enable this when possible
    • Turn on Storage I/O Control (SIOC) to split up disk shares globally across all hosts accessing that datastore. SIOC will proportionally assign disk shares per-host based on the sum of VM disk shares and total disk shares for that datastore
    • Ensure that the storage configuration setup has enough IOPS to support the virtual machine workloads running on said storage
    • One of the key metrics you want to monitor in resxtop/esxtop are the GAVG counters, these are the “guest average” counters and they indicate what the guest VM is seeing. For example, the GAVG/cmd counter will show what latency the guest VM is seeing when accessing that particular storage device. Again, this will be covered more in-depth in Section 6
    • Ensure that your multi-pathing policies are set in accordance with the best practices from VMware and your storage vendor. Even though the multi-pathing policy you are currently using might be working, it doesn’t mean that there isn’t a better, more efficient one out there
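
As a quick example, SIOC can also be flipped on from PowerCLI; a sketch assuming the Set-Datastore -StorageIOControlEnabled parameter is available in your PowerCLI build and using a hypothetical datastore naming pattern:

[sourcecode language="powershell"]
# Enable SIOC on any shared datastore matching the pattern that doesn't have it yet
Get-Datastore SAN-* | Where-Object { -not $_.StorageIOControlEnabled } |
    Set-Datastore -StorageIOControlEnabled $true
[/sourcecode]
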
  • Configure and apply advanced ESXi host attributes
    • There are many advanced host attributes that can be set, such as for memory or CPU
    • Configure Advanced ESXi Host Attributes
      • Log into the vSphere client
      • Click on a host from the inventory > click the Configuration tab
      • On the right, in the Software pane click the Advanced Settings hyperlink

advsettings

      • Choose the item on the left where the attribute is located, such as Cpu
      • On the right, locate the proper attribute and make the required change
      • A list of Memory and CPU advanced attributes can be found in the vSphere Resource Management guide on pages 104-106
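
These same attributes can be read and changed from PowerCLI; a sketch where the host name, attribute and value are examples only (consult the Resource Management guide before changing anything in production):

[sourcecode language="powershell"]
$vmhost = Get-VMHost esx01.lab.local

# View the memory-related advanced attributes
Get-AdvancedSetting -Entity $vmhost -Name "Mem.*" | Select-Object Name, Value

# Change one of them, e.g. the idle memory tax percentage
Get-AdvancedSetting -Entity $vmhost -Name "Mem.IdleTax" | Set-AdvancedSetting -Value 50 -Confirm:$false
[/sourcecode]
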
  • Configure and apply advanced Virtual Machine attributes
    • Advanced virtual machine attributes are changed per VM and typically the VM will need to be powered off in order to make the change
      • I have successfully made advanced VM changes with VMs powered on using PowerCLI and then either powering the VM off/on or performing a vMotion. The vMotion is registering the VM on a new host, which means it goes through the .VMX file again
    • Configuring Advanced Virtual Machine Attributes
      • Log into the vSphere client
      • From the inventory, right-click a VM and select Edit Settings…
      • Click the Options tab > click General > click the Configuration Parameters button

configparam

      • Click the Add Row button
      • Enter in the Name of the attribute and the Value

newparam
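
And since I mentioned PowerCLI above, here is a minimal sketch of adding a configuration parameter without opening the GUI (the VM name and parameter are examples only):

[sourcecode language="powershell"]
# Add a configuration parameter to a (preferably powered-off) test VM
New-AdvancedSetting -Entity (Get-VM TestVM01) -Name "isolation.tools.copy.disable" -Value "true" -Confirm:$false
[/sourcecode]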

  • Configure advanced cluster attributes
    • The only advanced cluster attributes that I know of are for vSphere HA. If there are others that can be configured for the cluster please let me know!
    • Configure Advanced Cluster Attributes
      • Log into the vSphere client
      • From the inventory, right-click on a cluster and click Edit Settings…
      • Click on vSphere HA > click the Advanced Options… button

advha

      • Here you can add different options and values. A list and explanation of advanced HA options can be found on Duncan Epping’s (blog / twitter) HA Deepdive post
      • Click OK when finished

Tools

Jun 262012
 

For this objective I used the following documents:

  • Documents listed in the Tools section

Objective 4.2 – Deploy and Test VMware FT

Knowledge

**ITEMS IN BOLD ARE TOPICS PULLED FROM THE BLUEPRINT**

  • Identify VMware FT Hardware Requirements
    • There are some pretty strict hardware requirements when it comes to VMware Fault Tolerance (FT). Only certain processors are compatible with VMware’s vLockstep technology, meaning vLockstep requires certain extensions on the physical processor in order for it to work. VMware provides a nice utility called VMware SiteSurvey which will generate a report showing you your host’s hardware compatibility with VMware technology, including Fault Tolerance (you need to have the vSphere client installed on the system you are loading SiteSurvey on)
    • There are too many processor combinations to list out here, but running SiteSurvey will tell you if your processors are compatible. Check out the help page for SiteSurvey which includes a list of FT capable processors
    • Here is a list of other hardware requirements
      • Both hosts that are hosting the FT virtual machines must have processors in the same family and be within +/- 400MHz of each other
      • The hosts must also be certified for FT use. Check out the VMware Compatibility Guide to ensure your hosts are supported
        • Open the VMware Compatibility Guide in a web browser
        • Choose the version of vSphere you are using
        • In the Features list box choose Fault Tolerant (FT)
        • Click Update and View Results

image

      • Just searching on these two items will return a lot of results. Feel free to narrow down the search by selecting additional criteria
      • The host(s) must support Hardware Virtualization and it must be enabled in the BIOS
      • If your hosts are part of a vSphere cluster you can check the Profile Compliance tab to determine if they are compatible with FT
        • Log into the vSphere client
        • From the inventory select the cluster which contains the hosts you are checking FT compatibility on
        • Select the Profile Compliance tab on the right
        • Click the Check Compliance Now hyperlink
        • Check the Compliance Status to see whether it is compliant or not
          • Click the Description hyperlink to view the details

image

        • As you can see from the screenshot above, my hosts meet the physical hardware requirements and compatibility requirements (which is covered in the next section) for FT. The Profile Summary is just a list of what items are checked, it does not indicate which checks have passed or failed
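
A quick way to eyeball the processor family and the +/- 400MHz requirement across a cluster with PowerCLI (the cluster name is hypothetical; NumCpu is the host’s core count, so MHz per core is an approximation):

[sourcecode language="powershell"]
Get-Cluster "FT-Cluster" | Get-VMHost |
    Select-Object Name, ProcessorType,
        @{N="MhzPerCore";E={[math]::Round($_.CpuTotalMhz / $_.NumCpu)}}
[/sourcecode]
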
  • Identify VMware FT compatibility requirements
    • There are a slew of compatibility requirements on your cluster and virtual machines for FT; let’s start with the cluster requirements
    • Cluster Requirements
      • Both hosts participating in FT must have access to the same datastores that the FT virtual machine resides on
      • Host certificate checking must be enabled
        • Log into the vSphere client
        • Click the Administration menu > select vCenter Server Settings…
        • Click SSL Settings > ensure that the vCenter requires verified host SSL certificates checkbox is checked

image

      • FT logging and vMotion networks must be configured
      • The hosts must be part of a HA enabled cluster
    • Virtual Machine Requirements
      • Virtual machine files must be on storage that both hosts have access to
      • The FT virtual machine cannot have more than 1 vCPU
      • Ensure the virtual machine is running a supported guest operating system
    • vSphere Compatibility Requirements
      • Virtual machines that are provisioned as a linked clone are not supported for FT
      • Storage vMotion is not supported on FT virtual machines
        • Disable FT if you want to perform a storage vMotion and then re-enable
      • If you are using an application that leverages VADP (API for data protection) to backup your virtual machines then you won’t be able to enable FT on it
      • You cannot snapshot a FT virtual machine
    • Incompatible Features and Devices (list taken straight from the Availability Guide)
      • You can’t use SMP, only 1 vCPU for a FT virtual machine
      • The FT virtual machine cannot have a physical raw device mapping
      • Virtual CD-ROM/floppy drives cannot be backed by a physical or remote device
      • Paravirtualization is not supported on FT virtual machines
      • USB and sound devices
      • NPIV is not supported
      • NIC passthrough or the use of the vlance networking drivers
      • Thin-provisioned virtual disks
        • When enabling FT, thin-provisioned disks are automatically expanded (zeroed out and made thick); the virtual machine must be powered off for this to happen
      • Hot-pluggable devices are not supported
      • Extended Page Tables/Rapid Virtualization Indexing is disabled
      • Serial or parallel ports
      • IPv6 is not supported with the FT logging, so use IPv4
      • Video devices that have 3D enabled are not supported for FT

Abilities

  • Modify VM and ESXi host settings to allow for FT compatibility
    • As discussed earlier there are strict requirements that must be followed in order to run FT in your environment. Those requirements are for hosts and virtual machines  (most if not all of this section I just covered in the preceding section, if you think something else should be a part of this section that I have not listed, please feel free to leave a comment)
    • Host Settings
      • Configure the proper networking on each host. You will need at least two physical 1Gb NICs. Configure one vSwitch or port group with one physical NIC for vMotion and create another vSwitch or port group with the other physical NIC for FT logging (more detail on how to configure FT logging is below)
      • Ensure that the hosts you are using for FT are at the same vSphere build
      • Configure shared storage for the hosts. The hosts that will be hosting the FT VMs need to have access to the storage in which the FT virtual machine’s files are located
      • Ensure that all of the requirements listed in the preceding section Identify VMware FT compatibility requirements have been met
    • Virtual Machine Settings
      • No snapshots
      • Only 1 vCPU
      • No physical raw device mappings
      • Refer to the list in the preceding section, Identify VMware FT compatibility requirements, for all of the virtual machine restrictions
  • Use VMware best practices to prepare a vSphere environment for FT
    • Here are a list of some good best practices to use when preparing to implement FT
      • Your hosts’ CPU frequencies should be within +/- 400MHz of each other
      • Ensure your hosts are running the same CPU instruction sets. These settings are typically enabled/disabled in the system’s BIOS
      • Use 10Gb NICs and enable jumbo frames for FT logging
      • Store any required ISO files on a datastore that both hosts have access to
    • Avoid Network Partitions
      • I discussed this in the first part of this objective. If you have a network partition and the master that owns the primary FT VM does not own the secondary FT VM then the secondary will not start if the primary fails
    • Best practices for performing host upgrades
      • Since FT requires that the FT virtual machines run on hosts with the same version and build, you’ll need a methodical way of performing upgrades to those hosts
        • vMotion the primary and secondary FT virtual machines off of the two hosts that you will be upgrading
        • Upgrade both hosts
        • On the primary VM, turn off FT
        • vMotion the primary VM (which currently has FT disabled) to one of the new upgraded hosts
        • Turn FT back on
    • There are some other FT configuration recommendations that, in most cases, should be followed
      • No more than four FT virtual machines on a single ESXi host. Total should include both primary and secondary VMs
      • Allocate excess memory to the resource pool that contains the FT virtual machines. This allows for overhead memory
        • A reservation is automatically set to the configured memory amount when you enable FT. The excess memory will allow for overhead should the FT VM utilize the configured amount of memory
      • Do not use more than 16 virtual disks per FT virtual machine
      • Have at least three hosts in a cluster (the third host should meet the same requirements for FT and match the physical and logical configurations of the other two hosts). This allows for n+1 should a host fail, another one in the cluster will be there to allow for the creation of a secondary FT virtual machine
      • When using NFS, have at least one 1Gb NIC on the NAS hardware side
  • Configure FT logging
    • Configuring FT logging is a pretty easy process. You’ll need to do this on each host:
      • Log into the vSphere client
      • Select a host from the inventory that you will be using for FT > click the Configuration tab
      • Click the Networking hyperlink
        • If you don’t have a vSwitch with a dedicated physical NIC for FT logging, create one now
      • Click the Properties hyperlink for the vSwitch you will be using for FT logging
      • Click the Add button > select VMkernel > click Next
      • Enter in a Network Label such as FT Logging
      • Choose a VLAN if necessary
      • Check the Use this port group for Fault Tolerance logging  > click Next

image

      • Enter in the IP Address and Subnet Mask (this should be on a different subnet than your vMotion VMkernel port group) > click Next
      • Click Finish > click Close to close the vSwitch Properties dialog box
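
The same FT logging VMkernel port can be created with PowerCLI; a sketch with hypothetical host, vSwitch and IP values:

[sourcecode language="powershell"]
$vmhost  = Get-VMHost esx01.lab.local
$vswitch = Get-VirtualSwitch -VMHost $vmhost -Name vSwitch1

# Creates a "FT Logging" port group with a VMkernel port enabled for FT logging
New-VMHostNetworkAdapter -VMHost $vmhost -VirtualSwitch $vswitch -PortGroup "FT Logging" `
    -IP 192.168.50.11 -SubnetMask 255.255.255.0 -FaultToleranceLoggingEnabled:$true
[/sourcecode]
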
  • Prepare the infrastructure for FT compliance
    • If you have followed all of the guidance and configuration in the preceding sections then your infrastructure should be ready for FT. Here are a few ways that you can check to make sure you’re FT compliant
      • Use Profile Compliance on whichever cluster you will be using for FT (the cluster needs to be enabled for HA for this to work)
        • Log into the vSphere client > select a cluster object from the inventory
        • Click on the Profile Compliance tab
        • Click the Check Compliance Now hyperlink
        • Ensure the Compliance Status is marked as Compliant

image

        • The above screenshot is what you don’t want to see
        • Here you can see which host(s) are not compliant

image

      • Profile Compliance will show you whether or not you are compliant for HA, but if you’re not, it doesn’t tell you which requirements haven’t been met. To do so you need to look at the Summary tab for each host
        • Log into the vSphere client > select a host object from the inventory
        • Click the Summary tab
        • In the General pane you will see Host Configured for FT: and it will say either Yes or No and it has a small dialog icon next to it
        • Click the dialog icon. This will show you if the host does or does not meet FT requirements. If it does not, it will list the issues

image

        • As you can see, the host I used for this screenshot is not configured for FT and has hardware and software configurations that must be changed in order for it to meet FT requirements
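
If you just want a quick list of which hosts report FT-capable hardware, this PowerCLI one-liner reads the host capability object (it only reflects hardware/CPU support, not every FT requirement; the cluster name is an example):

[sourcecode language="powershell"]
Get-Cluster "FT-Cluster" | Get-VMHost |
    Select-Object Name, @{N="FTSupported";E={$_.ExtensionData.Capability.FtSupported}}
[/sourcecode]
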
  • Test FT failover, secondary restart, and application fault tolerance in a FT Virtual Machine
    • Before we can test FT failover we need to enable FT on a virtual machine. This process is easy once you’ve met all of the requirements that have been discussed earlier in this post
    • Turn on Fault Tolerance
      • Log into the vSphere client
      • Navigate to the Hosts and Clusters view
      • Right-click on the virtual machine that you want to enable FT on > click Fault Tolerance > select Turn On Fault Tolerance (ensure the VM is powered off)

image

      • Once you do this you will see a warning about a few operations that will occur once you turn FT on
        • Thin-provisioned disks will be zeroed out and made thick
          • This can take quite some time depending on the amount of free space that needs to be zeroed
        • DRS automation for the VM will be set to disabled
        • A memory reservation is created; the reservation size is equal to the configured amount of memory

image

      • Click Yes to continue
      • Once complete the icon for the virtual machine will be blue
      • Right-click the VM and select Power > click Power On
      • Once it is powered on click on the VM itself in the inventory > on the right pane select the Summary tab
      • In the Fault Tolerance pane ensure that the Fault Tolerance Status shows as Protected

image

    • Test FT Failover
      • Log into the vSphere client
      • Right-click on the FT protected VM > select Fault Tolerance
      • Click Test Failover
    • The task itself only takes a second to show completed in the Recent Tasks pane, but there is a lot more that still has to happen before the failover test is complete
    • You’ll notice during the failover test that the Fault Tolerance Status goes from Protected to Not protected and that it is Starting

image

    • It will then go into a Not protected state and it will show Need Secondary VM
    • At this point in the test, the primary has failed and the secondary has now completely taken over as the primary. In order for FT to be fully operational it now needs a new secondary

image

    • You will see the status change again from Not protected, Need Secondary VM to Not protected, Starting
    • Eventually the Fault Tolerance Status will show as Protected meaning a new secondary VM has been created and FT logging is now occurring between the new primary and secondary
    • Test Secondary Restart
      • Log into the vSphere client
      • Right-click on the FT protected VM > select Fault Tolerance
      • Click Test Restart Secondary
    • Like you saw previously the Fault Tolerance Status will go from Protected to Not protected, Starting while the secondary FT VM restarts. Once the restart is complete the Fault Tolerance Status will once again show Protected
  • Unfortunately I don’t have a way to show testing of application FT. I’m not 100% sure if that is referring to application monitoring or merely testing an application running in a FT VM while it is failing over or the secondary is restarting. If it is the latter, it is simple enough to monitor and see if you stay connected to that particular application, such as Microsoft Exchange
  • There is a good VMware KB (KB1020058) that covers how to test a FT configuration. It also goes over different scenarios and what behavior to expect from FT. There are certain situations where a FT VM can become unavailable and the secondary will not take over (such as loss of network connectivity to the primary VM).

Tools

Jun 052012
 

For this objective I used the following documents:

Objective 4.1 – Implement and Maintain Complex VMware HA Solutions

Knowledge

**ITEMS IN BOLD ARE TOPICS PULLED FROM THE BLUEPRINT**

  • Identify the three admission control policies for HA
    • There are actually three types of admission control mechanisms; host, resource and HA. As you may be aware, HA is the only one of the three that can be disabled. There are several operations within vSphere that will result in resource constraints being applied. Operations such as powering on a virtual machine, migrating a virtual machine or increasing CPU/memory reservations on a virtual machine
    • There are three types of HA admission control policies:
      • Host Failures Cluster Tolerates
        • Using this policy you specify the number of host failures to tolerate. Meaning, resources are kept available based on the number of hosts you specify in order to ensure resource capacity for failed-over virtual machines
        • This is accomplished using a ‘slot size’ mechanism. Slot sizes are logical constructs of memory and CPU and represent a single virtual machine.
        • Slot sizes are calculated based on the largest CPU and memory reservation for a virtual machine. If no reservations are present, the defaults are:
          • 32MHz for CPU
            • this can be changed by modifying the advanced setting das.vmcpuminmhz
          • 0MB + overhead for memory
          • The most restrictive between memory slots and CPU slots will ultimately determine the slot count
        • Lets go through an example:
          • Host 1: 8GB memory, one 2.56GHz CPU
          • Host 2: 8GB memory, one 2.56GHz CPU
          • VM1: 2GB memory reservation, 700MHz CPU reservation
          • VM2: 3GB memory reservation, 400MHz CPU reservation
          • With the configuration above, The memory slot would be 3GB and the CPU slot would be 700MHz
          • Since these hosts are the same size, the number of slots per host is 2 for memory (8GB / 3GB, rounded down) and 3 for CPU (2560MHz / 700MHz, rounded down)
          • Since the number of memory slots is the most restrictive, it is used as the number of slots per host (see the small PowerShell sketch of this math after the Advanced Runtime Info screenshot below)
          • Total number of cluster slots: 4
          • Used Slots: 2
          • Available Slots: 0
          • Failover Slots: 2
          • Total powered on VMs in cluster: 2
          • Total hosts in cluster: 2
          • Total good hosts in cluster: 2
        • You can view slot information for the cluster using the Advanced Runtime Info
          • Log into the vSphere client > select a cluster
          • Click the Summary tab
          • In the vSphere HA pane click the Advanced Runtime Info hyperlink

image
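
Here is the slot math from the example above as a small PowerShell sketch (memory overhead is ignored to keep it simple):

[sourcecode language="powershell"]
$hostMemGB = 8    ; $hostCpuMhz = 2560
$slotMemGB = 3    ; $slotCpuMhz = 700    # largest memory / CPU reservations

$memSlotsPerHost = [math]::Floor($hostMemGB / $slotMemGB)    # 2
$cpuSlotsPerHost = [math]::Floor($hostCpuMhz / $slotCpuMhz)  # 3

# the most restrictive resource determines the slots per host
$slotsPerHost = [math]::Min($memSlotsPerHost, $cpuSlotsPerHost)
"Slots per host: $slotsPerHost - total cluster slots (2 hosts): $($slotsPerHost * 2)"
[/sourcecode]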

      • Percentage of Cluster Resources Reserved
        • This admission control policy implements resource constraints based upon a user-defined percentage of memory and CPU resource
        • Virtual Machine resource requirements:
            • If no CPU reservation exists a default of 32MHz is used
            • If no memory reservation exists a default of 0MB + overhead is used
          • Calculate failover capacity with the following formula
            • (Total Host Resources – Total Resource Requirements) / Total Host Resources
            • Here’s an example:
              • Total host CPU resources: 5000MHz
              • Total CPU resource requirements: 2400MHz
              • (5000 – 2400) / 5000 = 52% failover capacity
        • When admission control is invoked, it will check the current CPU and memory failover capacity. If the operation that invoked admission control will violate the percentages defined for the cluster, then admission control will not allow the operation to complete. Here are the steps:
          • Total resources currently being used by powered-on virtual machines is calculated
          • Total host resources are calculated (excluding overhead)
          • CPU and memory failover capacity is calculated
          • The percentage of failover capacity for CPU and memory is compared to the user-defined percentages of the cluster
          • Prior to the operation being performed, a calculation is done to determine the new failover capacity if the operation is allowed. If the new failover capacity violates the user-defined percentages (CPU or memory), then the operation is not allowed
        • If you log into the vSphere client and look at the Summary tab for a cluster you can see information related to this admission control policy in the vSphere HA pane

image

        • Here you can easily see the current CPU failover and memory capacity as well as the user-defined percentages
      • Specify Failover Hosts
        • This is the most straight-forward policy of the three. Using this admission control policy will set aside whatever number of hosts you specify ONLY for failover purposes
        • If you have a 4-node HA cluster using the Specify Failover Hosts and configure it for 1, then whichever host you specify will never be used except in the event of an HA failover
  • Identify heartbeat options and dependencies
    • vSphere HA has two heartbeating mechanisms; network and datastore heartbeating
    • Network Heartbeating
      • Network heartbeating is pretty straight-forward. Slave nodes will send a heartbeat to the master node and the master node will send a heartbeat to each of the slave nodes. The slaves do not send heartbeats to each other, but will communicate during the master node election process
      • Network heartbeats occur every 1 second by default
      • Network heartbeating is dependent on the management address of the host
    • Datastore Heartbeating
      • Datastore heartbeating was introduced in vSphere 5 and adds another layer of resiliency for HA. Datastore heartbeating also helps in preventing unnecessary restarts of virtual machines
      • When a master node stops receiving network heartbeats it will then use datastore heartbeats to determine if the host is network partitioned, isolated or if it has completely failed
      • The datastore heartbeating mechanism is only used when:
        • The master node loses connectivity to slave nodes
        • Network heartbeating fails
      • HA will select two datastores to use for datastore heartbeating (by default, you can increase this with an advanced setting which is covered later). The criteria used for the datastore selection is:
        • Datastore that is connected to all hosts
          • this is best effort; if there isn’t a datastore connected to all hosts it will select the datastore with the highest number of connected hosts
        • When possible, VMFS datastores are chosen over NFS datastores
        • When possible, the two datastores selected will be on different storage arrays
      • On VMFS datastores, heartbeating creates a file on the selected datastores for each host and the file remains in an up-to-date state as long as the host is connected to the datastore. If the host gets disconnected from the datastore, then the file for that host will no longer be up-to-date. On NFS datastores, the host writes to its heartbeat file every 5 seconds
      • If you so desire, you can manually select the datastores to be used for datastore heartbeating
        • Log into the vSphere client > right-click a cluster and select Edit Settings…
        • Under vSphere HA select Datastore Heartbeating
        • Choose the Select only from my preferred datastores radio button
        • Place a checkbox next to at least two datastores you want to use for datastore heartbeating

image

Skills and Abilities

  • Calculate host failure requirements
    • Earlier I covered how you can manually calculate host failover requirements depending on the admission control policy you’re using, but  I’ll go over it again here
    • Host Failures Cluster Tolerates
      • This uses a logical object called a ‘slot’. The number of powered-on virtual machines, and the resources they are configured with, determine how many slots are required for failover on any given host
      • Once you determine the slot size for CPU and Memory you calculate the total number of slots for the host
        • CPU = Total CPU resources/CPU slot size
        • Mem = Total Mem resources/Mem slot size
      • Here is an example of the slots calculation

image

      • In the example above the host failover requirement could be up to 8 slots
    • Percentage of Cluster Resources Reserved
      • You can configure separate percentages for CPU and memory.
      • If no CPU or memory reservations exist each VM will use 32MHz and 0 + overhead, respectively
      • Same scenario as before

image

      • In this example the percentages for CPU and memory are both set to 30%. The current available percentage is 82% for CPU and 81% for memory. Operations such as powering on and migrating virtual machines will not have any issues as the available percentages are well above the user-defined 30%. Assuming other hosts have the same resource configuration you would need 18% CPU and 19% memory free on another host in order for all virtual machines to be successfully failed over
    • Specify failover hosts
      • There isn’t much to calculate here, the specified hosts will stand idle unless a failover occurs
  • Configure customized isolation response settings
    • You can set custom HA isolation responses for each individual virtual machine
      • Log into the vSphere client
      • Right-click on a cluster > click Edit Settings…
      • Under vSphere HA options click Virtual Machine Options
      • Here you can set the cluster default isolation response and the isolation response for individual virtual machines
      • Find the virtual machine you want to modify > choose an option under the Host Isolation Response column
        • Leave Powered On
        • Power Off
        • Shut Down
        • Use cluster setting

image

    • There are a multitude of custom HA isolation response settings that you can configure on an HA cluster. These settings are configured at the cluster level, within vSphere HA > Advanced Options…
      • das.isolationaddress[#] – by default the IP address used to check isolation is the default gateway of the host. You can add more IP addresses for the host to use during an isolation check. A total of 10 addresses can be used (0-9)
      • das.usedefaultisolationaddress – this option is either set to true or false. When set to false a host will NOT use the default gateway as an isolation address. This may be useful when the default gateway of your host is an unpingable address, or a virtual machine, such as a virtual firewall
      • das.isolationShutdownTimeout – use this option to specify the amount of time (in seconds) HA will wait for a guest shutdown process that was initiated by the isolation response before forcefully powering off the virtual machine
  • Configure HA redundancy
    • Management Network
      • Since HA uses the management network to send out network heartbeats, it is a good idea and best practice to make your management network redundant. There are two ways that you can accomplish this; use NIC teaming on the vSS or vDS where your management network resides or add an additional vmkernel port on a separate vSS or vDS and enable it for management
      • NIC Teaming
        • Add an additional NIC to the vSS or VDS that hosts the management network
          • Ideally this will be physically connected to a separate switch
        • Set the new NIC as a standby adapter
        • If the active adapter fails, the standby will take over, thus allowing network heartbeats to be transmitted and received
      • Add a new vmkernel port
        • Create a new vmkernel port on an existing or new vSS/vDS that currently is not being used for management
        • Enable the vmkernel port for management
        • Network heartbeats can now be sent/received on this new vSS/vDS which will allow network heartbeats to continue should your primary management network fail
    • Datastore Heartbeat
      • The nature of datastore heartbeating is, by default, redundant. When HA is enabled it will select two datastores to use for datastore heartbeating. VMware states that two datastores are enough for all failure scenarios
      • If you have a need to configure more than two heartbeat datastores per host you can use this advanced setting
        • das.heartbeatDsPerHost – set this to the number heartbeat datastores you want to use
      • If possible, ensure you have two datastores that reside on two separate physical storage arrays
    • Network partitions
      • A network partition is created when a host or a subset of hosts loses network communication with the master node, but can still communicate with each other. When this happens an election occurs and one of the hosts is elected as a master
      • The criteria for a network partition is
        • The host(s) cannot communicate with the master node using network heartbeats
        • The host(s) can communicate with the master using datastore heartbeats
        • The host(s) are receiving election traffic
      • I don’t fully understand what network partitions has to do with “Configuring HA for redundancy”, but I do know that network partitions are bad. Why are they bad?
        • vCenter can only connect to one master host, so if you have a subset of hosts in a network partition, they will not receive any configuration changes related to vSphere HA until the network partition is resolved
        • Hosts can only be added to the partitioned segment that communicates with vCenter
        • When using FT, the primary and secondary VMs could end up being on a partition where the host is not responsible for the primary or secondary FT virtual machine. This scenario could prevent the secondary VM from restarting should the primary VM fail IF the primary VM lived on a host that was not responsible for that VM
          • This is possible because a master host that has a lock on a datastore is responsible for all the VMs that live on that datastore. The master host of a network partition that the FT VMs are running on may not be the master that has a lock on that datastore, thereby it is not responsible for it from a HA perspective
      • So I guess the lesson is, configure HA for redundancy in order to avoid network partitions
        • Ensure management network redundancy at the vmkernel layer, the hardware layer (think NICs on a separate bus) and the physical network layer
  • Configure HA related alarms and monitor an HA cluster
    • There are seven default alarms that ship with vCenter related to HA
      • Insufficient vSphere HA failover resources
      • vSphere HA failover in progress
      • Cannot find a vSphere HA master agent
      • vSphere HA host status
      • vSphere HA virtual machine failover failed
      • vSphere HA virtual machine monitoring action
      • vSphere HA virtual machine monitoring error
    • There are plenty of additional alarms that you can create for clusters and virtual machines related to vSphere HA. Here are a list of available triggers for each
      • Clusters

image

      • Virtual Machines

image

    • Aside from the vSphere HA alarms you can monitor an HA cluster using the Summary tab of a given cluster. In the vSphere HA pane you can look at the Cluster Status and any Configuration Issues that may be related to HA
      • Log into the vSphere client > click a cluster from the inventory > select the Summary tab
      • Click the Cluster Status hyperlink located in the vSphere HA pane
      • There are three tabs in this dialog box
        • Hosts: allows you to see which host is the master and how many hosts are connected to the master

image

        • VMs: shows you how many VMs are protected/unprotected

image

        • Heartbeat Datastores: shows you which datastores are being used for datastore heartbeating. Clicking each datastore shows you which hosts are using that particular datastore

image

      • Click on the Configuration Issues hyperlink
      • Here you can see any configuration issues for vSphere HA

image

      • As you can see in the example above, there is no management network redundancy for either host that is part of this HA cluster. Remember that having management network redundancy can be key in avoiding network partitions
    • Looking at the summary tab of each host that is part of a HA cluster will show you the vSphere HA State for that host
      • Log into the vSphere client > select a host from the inventory > click the Summary tab

image

      • Clicking on the small dialogue button will give you more information about the HA state

image

    • When troubleshooting vSphere HA you can look at logs for a host that is giving you trouble. Here are some key logs and their locations
      • fdm.log – /var/log
      • hostd.log – /var/log
  • Create a custom slot size configuration
    • There are two advanced settings that you can configure in order to create a custom slot size; one for CPU and one for memory
      • das.slotCpuInMHz
      • das.slotMemInMB
    • These two advanced settings allow you to specify the maximum slot size in your cluster
      • If a VM has reservations that exceed the maximum slot size then the VM will use multiple slots
    • Customizing the slot size can have an unintended, and adverse effect during failover
      • You have a custom slot size of 1GB of memory. Let’s say that nets you 20 slots for a host. If you have a virtual machine on that host with a 5GB memory reservation then 5 slots need to be available on that host in order for the VM to be powered on. Now, let’s say across your cluster you have 15 free slots, but none of the hosts in the cluster have 5 free slots, then the VM with the 5GB memory reservation will not be able to power-on during a failover
    • To set these advanced settings
      • Log into the vSphere client > right-click a cluster from the inventory > click Edit Settings…
      • Click vSphere HA > click the Advanced Options… button
      • In the option column add a new option das.slotCpuInMHz > specify the maximum CPU slot size in the value column
      • In the option column add a new option das.slotMemInMB > specify the maximum memory slot size in the value column

image

      • Click OK when finished > click OK again to exit the cluster settings dialog
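
The same two options can be set from PowerCLI; a sketch assuming New-AdvancedSetting with the ClusterHA type (the cluster name and values are examples only):

[sourcecode language="powershell"]
$cluster = Get-Cluster "Prod-Cluster"

New-AdvancedSetting -Entity $cluster -Type ClusterHA -Name "das.slotCpuInMHz" -Value 500  -Confirm:$false
New-AdvancedSetting -Entity $cluster -Type ClusterHA -Name "das.slotMemInMB"  -Value 1024 -Confirm:$false
[/sourcecode]
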
  • Understand interactions between DRS and HA
    • vSphere DRS and vSphere HA can complement each other when they are enabled on the same cluster. For example, after a HA failover DRS can help to load balance the cluster. Here are some other interactions
      • If DPM has put hosts in standby mode and HA admission control is disabled, this can cause insufficient resources to be available during a HA failover. When DRS is enabled it can work to bring those hosts out of standby mode and allow HA to use them for failover
      • When entering maintenance mode DRS is used to evacuate virtual machines to other hosts. DRS is HA aware and will not migrate a virtual machine to a host if doing so would violate HA admission control rules. When this happens you will have to manually migrate the virtual machine
      • If you are using required DRS VM-HOST affinity rules this may limit the ability to place VMs on certain hosts as HA will not violate required VM-HOST affinity rules
      • If you have a VM that needs to be powered on with enough available resources, but those resources are fragmented, HA will ask DRS to try and defragment those resources in order to allow the VM(s) to be powered on
  • Analyze vSphere environment to determine appropriate HA admission control policy
    • There are multiple factors to be considered when deciding which HA admission control policy should be chosen. Here are some things to consider
      • Availability requirements – across your cluster you need to determine what resources you have available for failover and how limiting you want to be with those available resources
      • Cluster configuration – the size of your hosts, whether the hosts are sized the same or unbalanced with regards to total resources
      • Virtual Machine reservations – if you are using virtual machine reservations you need to look at the largest reservation
      • Frequency of cluster configuration changes – this refers to how often you are adding/removing hosts from your cluster
    • All these things should be considered when choosing the HA admission control policy. Let’s look at the different HA admission control policies and analyze them based on the factors listed above
      • Specify Failover Hosts – This policy is geared towards availability. If you HAVE to have available resources above all other factors to ensure HA failover and have the budget to let hosts stand idle then choose the Specify Failover Hosts admission control policy
        • Geared towards availability
        • Cluster configuration isn’t an issue, specify the proper amount of failover hosts dependent upon your availability requirements
        • Virtual machine reservations don’t matter at this point
        • Frequency of cluster configuration changes do play a small role here. If you are constantly adding new hosts to your cluster there may be a requirement to specify additional failover hosts to meet availability requirements
      • Host Failures Cluster Tolerates– This policy isn’t as cut and dry as Specify Failover hosts. If you are worried about resource fragmentation, meaning you have enough resources spread across the hosts in the cluster, but not enough per host to meet availability requirements during a HA failover, then this policy is for you
        • Meets availability requirements by avoiding the resource fragmentation paradigm
        • Cluster configuration is a serious issue. If you have unbalanced hosts, meaning some hosts have more total resources than others, then this can lead to under-utilized hosts. Using this policy the host with the highest number of slots is NOT included in the slot calculation, thereby limiting the number of cluster slots, in other words the number of virtual machines that can be powered on
        • Virtual machine reservations are another serious issue. If you have some VMs with rather large CPU or memory reservations then the number of slots will be smaller. This leads to a conservative consolidation ratio and, again, under-utilized hosts
          • You can use advanced settings to limit the size of the CPU and memory slots, but doing so directly undermines resource fragmentation avoidance and may not always meet availability requirements
        • Frequency of cluster configuration changes can be an administrative overhead problem. If you have a 10 host cluster and specify the Host Failures Cluster Tolerates at 3 and then add 10 more hosts, the number of host failures that the cluster will tolerate is still 3. Therefore, if you are constantly adding hosts you will need to change the number of host failures appropriately to meet availability requirements
      • Percentage of Cluster Resources– This policy is meant to be flexible and is the HA admission control policy recommended by VMware for most HA clusters. If you need flexibility and seamless scalability with regards to admission control then this is the policy you’ll want to pick
        • This policy meets availability requirements based on CPU and memory percentages you define as needing to be available
        • Cluster configuration is a non-issue. Regardless of the size of your hosts, balanced or unbalanced, the percentages for CPU and memory that you define will stay the same. You will however need to do a bit more leg work upfront to calculate what percentages to define based on availability requirements. If your hosts are unbalanced it will take more time to do
        • Virtual machine reservations have no effect when using the Percentage of Cluster Resources admission control policy. Again, the user-defined percentages will remain the same regardless of virtual machine reservations
        • The frequency of cluster configuration changes have no impact when using this admission control policy. As you add or remove hosts the total number of cluster resources that need to be available will dynamically change based on resources being added or removed from the cluster
        • The big downside to using this admissions control policy is resource fragmentation. Just because your cluster meets the availability requirements based on the user-defined percentages does not mean that those available resources aren’t fragmented across all the hosts in the cluster. As discussed earlier, if DRS is also enabled and resources are fragmented during a failover event, HA will ask DRS for best effort to try and defragment the cluster in order to facilitate the best outcome of said failover event
    • Again, VMware recommends using the Percentage of Cluster Resources admission control policy for most environments. Should you find this policy does not meet some of your business requirements, evaluate the other two policies based on the factors detailed above to determine the proper course of action
  • Analyze performance metrics to calculate host failure requirements
    • Regardless of the HA admission control policy you choose you need to determine what your host failure requirements are. In order to do this you will need to look at the performance metrics of your virtual machines that will be part of the HA cluster
    • To look at the performance metrics of a virtual machine you can use the vSphere client performance tab to look at advanced metrics, such as CPU and memory utilization, and you can do so over a specified period of time
    • You should look at the virtual machines performance over a period of time to determine the average utilization. You should also look at the hosts performance over a period of time to determine its resource consumption and resource availability
      • Determining the host’s resource availability should give you a better handle on determining your available cluster resources compared to the average virtual machine resource consumption. When you compare those two metrics you can further determine what percentage of resources you need to always keep available in order to satisfy a HA failover. This really adds value when using the Percentage of Cluster Resource admission control policy
    • A big factor that must be considered is the size of your virtual machine reservations. HA will not power on a virtual machine if it violates the admission control policy. HA will also not power on a virtual machine if it can’t meet the reservation. While this doesn’t relate directly to performance metrics, I feel it is an important factor to consider when calculating host failure requirements
  • Analyze HA cluster capacity to determine optimum cluster size
    • Trying to right-size a HA cluster can be challenging, especially in a fluid environment. Above all it will come down to availability requirements
      • What VMs do you need available even when a failover occurs
        • What is their resource utilization
      • How many hosts are currently in your cluster
        • Does this meet your availability requirements
        • How do your availability requirements match up in terms of scaling up within the cluster based on the number of hosts in the cluster? A better way of asking the question: how many more VMs can I run with my current cluster resources while still maintaining the required resource availability?
      • What is your current cluster utilization and availability and how does that matchup against availability requirements
      • What admission control policy are you using
    • These are very basic questions, but answering each of them and taking into consideration your calculated host failure requirements should enable you to determine if you have right-sized your cluster, or if configuration changes need to be made to meet availability, and ultimately, business requirements

Tools

May 282012
 

For this objective I used the following documents:

  • Documents listed in the Tools section

Objective 1.3 – Configure and Manage Complex Multipathing and PSA Plug-ins

Knowledge

**ITEMS IN BOLD ARE TOPICS PULLED FROM THE BLUEPRINT**

  • Explain the Pluggable Storage Architecture (PSA) layout
    • The Pluggable Storage Architecture (PSA) is a framework that is used for handling multipathing in a VMware environment. The framework is modular, so it allows third-party vendors to build their own multipathing plugins and put them directly inline with storage I/O. The PSA sits at the VMkernel layer and is essentially a collection of VMkernel APIs (image from the vSphere Storage Guide)

image

    • The PSA consists of plug-ins and sub plug-ins that perform different functions
      • Multipathing Plug-in (MPP)
        • These are provided by third-party vendors. An example of an MPP is EMC’s PowerPath/VE. VMware’s Native Multipathing Plug-in is also an MPP
      • Native Multipathing Plug-in (NMP)
        • Path Selection Plug-in (PSP)
          • Determines which active path to use when issuing an I/O request to a storage device
          • If the active path to a particular storage device fails, PSP will determine which path to use next to issue the I/O request
          • Third-party vendors can create and integrate PSPs that run alongside VMware’s PSPs
        • Storage Array Type Plug-ins (SATP)
          • Determines and monitors the physical path states to the storage array
          • Determines when a physical path has failed
          • Activates new physical paths when the active path(s) has failed
          • Perform any other necessary array specific actions required during a storage fail-over
          • Third-party vendors can create and integrate SATPs that run alongside VMware’s SATPs

Skills and Abilities

  • Install and Configure PSA plug-ins
    • Third-party vendors can supply their own MPP, such as EMC PowerPath/VE, or they can supply sub-plugins for PSP or SATP that supplements VMware’s NMP. These plug-ins will come in the form of a bundle and can be installed the following ways:
      • VMware vSphere Update Manager
      • Connected directly to the host via SSH console (use the esxcli software vib install command)
      • Using the vSphere Management Assistant (vMA) using the esxcli software vib install command
      • If the new plugin is not automatically registered you can do so manually
    • If you need to set a new default PSP for a SATP use the esxcli storage nmp satp set command, specifying the SATP and its new default PSP (esxcli storage nmp satp set --satp=<SATP_name> --default-psp=<PSP_name>)
    • Any devices that are currently using the SATP that you just changed will need to have all of their paths unclaimed and reclaimed. If you want to perform these operations via esxcli you will have to stop all I/O going to these devices, which usually isn’t a possibility. In this case you must reboot the host(s) in order for the new PSP to take effect
    • When you load a third-party SATP into NMP you are doing so in order to use the new SATP with a particular device. Claiming a device under a different SATP is done with the esxcli storage nmp satp rule and esxcli storage core claiming commands – in this example I’m changing the default SATP for a particular device to another SATP. When you install a third-party SATP the claim rule will most likely be specific to a class of devices and not a device ID, which is what I’m doing here.

  • Understand different multipathing policy functionalities
    • I understand “multipathing policy functionalities” to be the Path Selection Plug-ins, or PSP. If someone has any comments what else this might be referring to, please let me know! VMware KB 1011340 also refers to PSPs as multipathing policies
    • By default there are three PSP’s that ship with vSphere
      • VMW_PSP_MRU
        • The host will use the path that was most recently used (MRU). When a path fails and another one is activated, the host will continue to use this new active path even when the original path comes back up.
        • Default for active/passive arrays
        • Default for ALUA devices
      • VMW_PSP_FIXED
        • The host will use a fixed path that is either set as the preferred path by the administrator, or is the first path discovered by the host during the boot process
        • Default for active/active arrays
      • VMW_PSP_RR
        • The host will use all active paths in a round robin (RR) fashion. It uses an algorithm to iterate through all active paths. The default number of I/Os that are issued to a particular path is 1000 before moving on to the next active/available path
        • No default array types are listed for this PSP

 

  • Perform command line configuration of multipathing options
    • There are a multitude of multipathing options that can be changed using the command line. Some can also be changed in the GUI, but others must be changed via the command line
    • In the Install and Configure PSA Plug-ins section I covered how to change the default PSP for a particular SATP, so I won’t go over that again here
    • Changing the PSP on a particular device is done with the esxcli storage nmp device set command (see the sketch after this list)
    • You can view device configurations for individual devices based on their assigned PSP. The sketch after this list includes commands that view the device configuration for devices assigned the RR and Fixed PSPs, as well as a command that lists the device configuration regardless of its assigned PSP
    • You can also set different parameters for a PSP with esxcli, for example setting the preferred path on a device using VMW_PSP_FIXED or customizing the I/O operation limit for a device using VMW_PSP_RR (again, see the sketch after this list)
    • You can also make changes to a device configuration using the generic option, for example against a device that is using the VMW_PSP_RR plug-in
    • As you can see there are a lot of different things you can change with esxcli and multipathing configuration. Here is a video of performing some of these configurations
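
A minimal sketch of the esxcli commands referred to above (the device ID and path name are placeholders):

[sourcecode language="text"]
# change the PSP assigned to a particular device
esxcli storage nmp device set -d naa.xxxxxxxxxxxxxxxx --psp VMW_PSP_RR

# view the device configuration for devices using the Round Robin and Fixed PSPs
esxcli storage nmp psp roundrobin deviceconfig get -d naa.xxxxxxxxxxxxxxxx
esxcli storage nmp psp fixed deviceconfig get -d naa.xxxxxxxxxxxxxxxx

# view the device configuration regardless of the assigned PSP
esxcli storage nmp device list -d naa.xxxxxxxxxxxxxxxx

# set the preferred path on a device that uses VMW_PSP_FIXED
esxcli storage nmp psp fixed deviceconfig set -d naa.xxxxxxxxxxxxxxxx --path vmhba35:C0:T1:L0

# change the Round Robin I/O operation limit on a device that uses VMW_PSP_RR
esxcli storage nmp psp roundrobin deviceconfig set -d naa.xxxxxxxxxxxxxxxx --type iops --iops 10
[/sourcecode]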

 

 

  • Change a multipath policy
    • You can change the multipathing policy either in the GUI or via the command line. I covered the command-line method in the previous section, Perform command line configuration of multipathing options, so I won’t go over it again here. Here is how you change the multipath policy in the GUI
      • Log into the vSphere client > select a host that is connected to the device you want to change the multipathing policy for
      • Click the Configuration tab > click the Storage hyperlink
      • Right-click the datastore for which you want to modify the multipathing policy > click Properties…
      • Click the Manage Paths… button

image

      • From the Path Selection: drop-down select the multipathing policy you want to change it to
      • Click Change   << this is important; if you click the Close button without first clicking Change, the multipathing policy will not be changed

image

      • Click Close (MAKE SURE YOU CLICKED CHANGE FIRST)
      • Click Close to exit the datastore properties

 

  • Configure Software iSCSI port binding
    • Prior to vSphere 5 software iSCSI port binding could only be configured via the CLI. With the release of vSphere 5, VMware has made all of our lives easier and added this to the GUI (in the properties of the iSCSI software initiator)
    • Before you begin the port binding process you need to have created 1:1 mappings of vmkernel adapters:physical adapters. This way, we can bind a single vmkernel adapter to a single physical adapter, enabling multipathing. Ensure these steps have been completed:
      • Created as many virtual switches or port groups as the number of physical adapters you will be using for iSCSI
      • You’ve created a vmkernel adapter for each vswitch or port group
      • You’ve changed the NIC Teaming on each vswitch or port group so there is one active adapter and no standby adapters
      • The iSCSI software adapter is enabled and has its targets configured
    • Once you have this done you need to configure port binding. Let’s go through how to do it in the GUI first
      • Log into the vSphere client > select the host for which you are configuring iSCSI port binding on
      • Click the Configuration tab on the right > click the Storage Adapters hyperlink
      • Select the iSCSI software initiator > click the Properties… hyperlink
      • Select the Network Configuration tab > click the Add button

image

      • Select the vswitch or port group that corresponds with the vmkernel adapter and physical adapter that you have set up for iSCSI

image

      • Click OK
      • Ensure that the Port Group Policy appears as Compliant

image

      • Click Close > click Yes to perform a rescan
    • Now let’s do the iSCSI port binding using esxcli (a sketch of the commands follows the screenshot below)
      • Here is the result of the list command; as you can see, vmhba35 and vmk1 are bound

image
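
For reference, the binding and listing commands look roughly like this (vmhba35 and vmk1 are the adapter and vmkernel port from the example above):

[sourcecode language="text"]
# bind vmkernel port vmk1 to the software iSCSI adapter vmhba35
esxcli iscsi networkportal add -A vmhba35 -n vmk1

# list the vmkernel ports bound to the adapter
esxcli iscsi networkportal list -A vmhba35
[/sourcecode]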

Tools

May 132012
 

For this objective I used the following documents:

  • Documents listed in the Tools section

Objective 1.2 – Manage Storage Capacity in a vSphere Environment

Knowledge

**ITEMS IN BOLD ARE TOPICS PULLED FROM THE BLUEPRINT**

  • Identify storage provisioning methods
    • There are two types of storage that can be provisioned through vSphere; block storage and NAS.
      • Block Storage
        • Local – any local storage attached to the host; uses VMFS
        • iSCSI – IP storage using a hardware or software iSCSI initiator; uses VMFS
        • FCoE – Fibre Channel over Ethernet using a hardware or software adapter; uses VMFS
        • FC – Fibre Channel using a hardware HBA; uses VMFS
      • NAS Storage
        • NFS – currently using NFSv3 to mount NFS shares as datastores; uses NFS instead of VMFS
    • GUI Provisioning Method
      • The easiest way to provision storage is using the vSphere client. From the vSphere client you can create VMFS 3 or VMFS 5 datastores, you can create Raw Device Mappings or create a Network File System. You can do all this through the Add Storage wizard from within the client
        • Log into the vSphere client
        • Select a host > click the Configuration Tab
        • Click the Storage hyperlink
        • Click the Add Storage. . . hyperlink to launch the Add Storage wizard
      • From the Add Storage wizard you can provision block or NAS storage into the vSphere environment
    • Command-line Provisioning Methods
      • To provision storage through the command-line you can use vmkfstools
      • There aren’t a WHOLE lot of options for this command as it relates to creating file systems (you can also use vmkfstools to provision virtual disks). Here are the options:
        • You can specify whether it will be VMFS 3 or VMFS 5
        • You can set a block size (VMFS 3 ONLY)
        • You can set the volume name
      • You can also choose to span or grow an existing file system
      • Check out the example after this list for creating a new VMFS 5 volume named vmkfstools_vcap5_volume (a partition must exist on the LUN prior to creating a file system, which is what partedUtil is used for) — VMware KB1009829 details this out as well
      • You can also add and remove NAS volumes from the command line using esxcli (also shown in the sketch below)
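
A rough sketch of those commands (the device ID, sector values and NFS host/share/volume names are placeholders; use partedUtil getUsableSectors to find the real end sector, per VMware KB1009829):

[sourcecode language="text"]
# create a GPT partition table with a single VMFS partition on the LUN
# (the GUID identifies the partition type as VMFS)
partedUtil setptbl /vmfs/devices/disks/naa.xxxxxxxxxxxxxxxx gpt "1 2048 104857566 AA31E02A400F11DB9590000C2911D1B8 0"

# create a VMFS 5 file system named vmkfstools_vcap5_volume on partition 1
vmkfstools -C vmfs5 -S vmkfstools_vcap5_volume /vmfs/devices/disks/naa.xxxxxxxxxxxxxxxx:1

# add and remove an NFS mount with esxcli
esxcli storage nfs add -H 192.168.1.50 -s /export/nfs01 -v nfs01
esxcli storage nfs remove -v nfs01
[/sourcecode]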

 

  • Identify available storage monitoring tools, metrics and alarms
    • Two built-in monitoring tools that come with vSphere are Storage Reports and Storage Maps. Both of these can be found in the Storage Views tab within the vSphere client (this pertains to looking at host inventory objects)
      • In the hosts and clusters view click on a host
      • Click the Storage Views tab on the right
    • Different metrics exist to monitor storage performance and utilization. These metrics can be viewed within the vSphere client or by using esxtop/resxtop
    • There are also a number of pre-defined alarms that will assist your monitoring efforts, such as Datastore usage on disk and Thin-provisioned LUN capacity exceeded.  
    • Storage Reports
      • Storage reports will show you information on how different objects within your inventory map to storage entities. By default a storage report for a host inventory object includes:
        • VM Name
        • Multipathing Status
        • Space Used
        • Snapshot Space
        • Number of disks
      • Here is a screen shot detailing out the defaults (the items checked) as well as all available fields that can be displayed within storage reports (for host inventory objects)

image

      • The columns and information displayed will be dependent upon which inventory object you have selected. I’ll let you go through each one and see how these reports vary
      • Reports are updated every 30 minutes by default. You can manually update them by clicking the Update… hyperlink from within Storage Views > Reports located on the upper right of the screen
      • You can filter these reports by selecting which columns you want to search on, and then typing in the keyword(s)

image

      • You can export reports in the following formats
        • HTML
        • XLS
        • CSV
        • XML
        • Export Reports
          • Choose an inventory object
          • Click the Storage Views tab and select Reports
          • Choose which columns you want to view and any filtering
          • Right-click below the table and select Export List…
          • Enter in a name and choose the file format > click Save
    • Storage Maps
      • Storage maps give you a nice representation of storage resources (physical and virtual) as they pertain to a specific inventory object. Storage maps are also updated automatically every 30 minutes and you can manually update them by clicking the Update… hyperlink located near the top right of the inventory object > Storage Views > Maps screen
      • Just as with Storage reports, Storage maps have default views for each type of inventory object. Using the different checkbox within the Maps area you can filter out object relationships that you do not wish to see

image

      • By left-clicking on an object you can drag it to different parts of the screen
      • Storage maps can also be exported in the same fashion as Storage reports, although, as you can imagine, your file type selection will be different
        • .jpeg
        • .bmp
        • .png
        • .tiff
        • .gif
        • .emf
    • Storage Metrics (vSphere Client)
      • As with storage reports and storage maps, the types of metrics you will see as they relate to storage will vary depending upon which inventory object you select. For example, if you select a datastore inventory object you will by default be shown space utilization views in a graph format (graphs based on file type and the top 5 virtual machines)
      • You can then change that default view from Space and change it to Performance, which will show you a slew of performance charts for that particular datastore
      • To see the real “meat and potatoes” of metrics as they relate to storage within the vSphere client you need to look at advanced performance charts
        • Select a host from the inventory
        • Click the Performance tab > click the Advanced button
        • From the drop down there are four related storage items
          • Datastore
          • Disk
          • Storage Adapter
          • Storage Path
        • If I went into every counter that you could see for the objects above, you would be reading this post for the next 6 weeks. So know where these metrics are and, at the very least, familiarize yourself with the defaults
    • Storage Metrics (esxtop/resxtop)
      • I decided not to go into a lot of detail for this section as there are already some great resources out there. For a good review of this tool check out Duncan Epping’s blog post on esxtop. For a detailed review of all statistics for esxtop check out this VMware community post
      • For storage monitoring there are three panels within esxtop that you will want to be intimately familiar with (the letters at the end correspond to the esxtop hotkey for each panel)
        • Storage Adapter Panel (d)
        • Storage Device Panel (u)
        • Virtual Machine Storage Panel (v)
      • Some key metrics you want to look at for the panels above
        • MBREAD/s — megabytes read per second
        • MBWRTN/s — megabytes written per second
        • KAVG — latency generated by the ESXi kernel
        • DAVG — latency generated by the device driver
        • QAVG — latency generated from the queue
        • GAVG — latency as it appears to the guest VM (KAVG + DAVG)
        • AQLEN – storage adapter queue length (amount of I/Os the storage adapter can queue)
        • LQLEN – LUN queue depth (amount of I/Os the LUN can queue)
        • %USD – percentage of the queue depth being actively used by the ESXi kernel (ACTV / QLEN * 100%)
    • Alarms
      • There are a number of different pre-configured alarms related to storage that can be leveraged to alert you of impending storage doom. As with a lot of functions within vSphere, different alarms are pre-defined based on the inventory object that you select. Which means there are different storage related alarms for different inventory objects
        • If you are in the vSphere client and you select the top-most inventory object (the vCenter object) and you go to the Alarms tab, you can select Definitions and view ALL pre-configured alarms for all objects
      • Again, I won’t go into every single alarm and what they do, but here are a list of some I think are important to know, along with their default triggers
        • Cannot connect to storage – this alarm will alert you when a host has an issue connecting to a storage device. The three default triggers are:
          • Lost Storage Connectivity
          • Lost Storage Path Redundancy
          • Degraded Storage Path Redundancy
        • Datastore cluster is out of space – this alarm monitors disk space on datastore clusters. The default triggers are:
          • Send a Warning when utilization is above 75%
          • Send an Alert when utilization is above 85%
        • Datastore usage on disk – this alarm monitors disk space on a datastore. The default triggers are:
          • Send a Warning when utilization is above 75%
          • Send an Alert when utilization is above 85%
        • Thin-provisioned LUN capacity exceeded – this alarm monitors thin-provisioned LUNs using the vSphere Storage APIs. Triggers for this alarm are implemented by your storage vendor and must be modified through the vSphere Storage APIs (VASA)

 

Skills and Abilities

  • Apply space utilization data to manage storage resources
    • I’m not 100% sure what VMware is looking for here, but my best guess is to use some of the techniques above to determine current space utilization, and then manage your storage resources appropriately
    • Since we’ve already gone through the different metrics and alarms to monitor, let’s use the ESXi shell to determine VMFS disk usage. The command df, which in Linux speak stands for disk filesystem, is used to display the filesystems that are mounted to that particular host (see the sketch at the end of this section)

Since I filtered the results you don’t see an explanation of each column. From left to right the columns are: Filesystem, Size, Used, Available, Use%, Mounted on

image

      • At the moment we are focused on space utilization, so we want to focus on the Use%. As you can see, none of my partitions are over 50%. If I had a heavily used partition I would most likely get an alert from the Datastore usage on disk alarm, and I could use df to see a summary of all my partitions
      • There are lots of ways to rectify this: add more space/extents, delete unneeded virtual machines, or remove unneeded virtual disks (you could accomplish this through the vSphere client or by using the vmkfstools -U command — see the sketch after this list)
    • The bottom line is that you need to be aware not only of how you can determine space utilization, but also how to apply that data in an intelligent way in order to manage your storage resources effectively
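
A quick sketch of the commands mentioned in this section (the VMDK path is a placeholder):

[sourcecode language="text"]
# show the mounted filesystems and their space usage from the ESXi shell
df -h

# remove an unneeded virtual disk to reclaim space
vmkfstools -U /vmfs/volumes/datastore1/oldvm/oldvm_1.vmdk
[/sourcecode]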

 

  • Provision and manage storage resources according to Virtual Machine requirements
    • I’ve covered some of this in Objective 1.1 – Implement and Manage Complex Storage Solutions. Before you can provision, or manage, storage resources for a virtual machine, you first must know the virtual machine requirements, which includes, but is not limited to:
      • Space – how much space is needed
      • I/O workload – how many spindles are needed to satisfy the workload
      • Resiliency  — how protected does the data need to be
    • Looking at the above list, you can check the application requirements for the recommended amount of disk space and use tools such as vscsiStats or Iometer to determine the workload characteristics and how many spindles you’ll need. The availability and resiliency requirements will determine the RAID level, whether array-level snapshots will be used, what level of backup and how often to back up, and how long the data needs to remain in an off-site location
    • Once you’ve determined the virtual machine requirements you can start to provision and manage your storage based on those requirements. If you have a virtual machine that requires a certain level of service or, say it needs to be on super fast storage, you can leverage a few vSphere features to help you accomplish that goal
      • Profile Driven Storage – again, I covered this in Objective 1.1 on how to configure and implement profile driven storage. You can create a profile based on a virtual machine(s) requirement, such as fast disks, and assign that storage capability to one or more datastores. You can then create a storage profile and apply it to the virtual machine. Whenever that particular virtual machine is on a datastore that doesn’t meet that storage profile, it will be marked non-compliant
      • Datastore Cluster – you can group similar datastores into a construct known as a datastore cluster. This allows you to assign virtual machines to that datastore cluster, and, in conjunction with Storage DRS, the virtual machine will be placed on the least used datastore (in terms of I/O and space utilization)
    • You can provision storage for a virtual machine in a few different ways:
      • Adding a new disk through the vSphere Client
      • vmkfstools
    • Adding storage to a virtual machine through the vSphere client is pretty straightforward, so let’s go through how you would create an empty virtual disk using vmkfstools (a sketch of the command appears below the screenshot)

image

        • Above you can see that the command was successful and that the vcap5-flat.vmdk and vcap5.vmdk files were created
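
For reference, a vmkfstools command along these lines would produce the vcap5.vmdk and vcap5-flat.vmdk files shown above (the size, datastore and folder are placeholders):

[sourcecode language="text"]
# create an empty 40GB thin-provisioned virtual disk (the target folder must already exist)
vmkfstools -c 40G -d thin /vmfs/volumes/datastore1/vcap5/vcap5.vmdk
[/sourcecode]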

 

  • Understand the interactions between virtual storage provisioning and physical storage provisioning
    • The virtual provisioning of physical storage can add benefit to your organization as long as you understand the implications of what you are doing. Virtual storage provisioning allows you to over-commit your storage resources as needed
    • If I had to pick one construct to understand when it comes to the interaction between virtual storage provisioning and physical storage provisioning it would be with Thin Provisioning. Thin provisioning allows you to create a virtual disk that is, for example, 40GB in size, but you’re actually only using 5GB. The guest operating system thinks its hard disk is physically 40GB, while the physical storage has only allocated 5GB
      • The biggest thing that you need to understand here is that by thin provisioning the actual size on the disk is less than what you’ve provisioned, which can get you into trouble if you aren’t paying attention to the physical storage
      • If you have a 100GB datastore, you can put 40 VMs with 5GB thin-provisioned virtual hard disks on it. Even though those 40 VMs may only be using 2GB each, they have the potential to grow up to 5GB each, which at a certain point would cause you to physically run out of storage space; NOT GOOD!
      • In the section above we went over creating an empty virtual disk, and we created it as a thin disk. Since it is a thin disk, the provisioned size will be different from the actual size. Here is what you’ll see when looking in the datastore browser

image

      • As you can see the Size and Provisioned Size are much different.
      • The same exists when you have a datastore full of thin disks, the Capacity and Provisioned Space will differ. Let’s have a look (Go to the Datastores and Datastore Cluster view > click on a datastore on the left > click the Summary tab on the right)

provisioned_space

      • The Capacity is 1.56 TB while the provisioned space is more than 1TB over the physical capacity. However, my physical free space is still ~600GB
    • The point I’m trying to get across is that you need to be intimately familiar with what your virtual storage environment is, and what it is doing, while keeping the physical storage in mind
    • If you have a thinly provisioned virtual disk that you want/need to physically consume all of its provisioned space AFTER you have created it then you can Inflate the disk. This can be done within the datastore browser by right-clicking on the VMDK file and selecting Inflate. You can also do this from the command line; a sketch of the command follows the screenshot below
      • This operation can take quite a long time to complete depending on how much physical space needs to be zeroed out
      • Now as you can see the Size shows what the Provisioned Size used to show, and now the Provisioned Size column is blank (which is expected as that field isn’t populated unless the virtual disk is thin)

image
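
For reference, the inflate operation from the shell looks roughly like this (the path is a placeholder):

[sourcecode language="text"]
# inflate a thin virtual disk so it consumes its full provisioned size
vmkfstools --inflatedisk /vmfs/volumes/datastore1/vcap5/vcap5.vmdk
[/sourcecode]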

 

 

  • Configure Datastore Alarms
    • There are five pre-configured datastore alarms that ship with vSphere 5, see the below screen shot for their names and descriptions

image

    • Aside from the five datastore alarms you see above, there are many more triggers we can use to create alarms with the Datastore monitor, whether you choose to monitor for a specific condition/state or for specific events
      • Log into the vSphere client and navigate to the Datastores and Datastore Cluster view
      • Click on a datastore from the listing on the left > click the Alarms tab > click the Definitions button
      • Right-click anywhere under the pre-configured alarms and select New Alarm…
      • Enter in the following details:
        • Alarm Name: Datastore Over Provisioning Alarm
        • Description: Alarm to monitor the provisioned space on the datastore
        • Alarm Type: Datastore
        • Choose Monitor for specific conditions or state…
        • Enable this alarm: Check this box

image

      • Click on the Triggers tab > click Add to add a new trigger
      • Enter in the following details:
        • Trigger Type: Datastore Disk Provisioned (%)
        • Condition: Is above
        • Warning: 100
        • Alert: 200
        • Select the Trigger if any of the conditions are satisfied radio button

image

      • Click the Reporting tab
      • Choose if you want the alarm to repeat when the condition exceeds a certain range
      • Choose the frequency

image

    • Click the Actions tab > click Add to add an action
    • Enter in the following details
      • Action: Send a notification email
      • Configuration: josh.coen@valcolabs.com
      • You can choose when to perform this action based on the alarm transition state. By default this will perform the action one time when the alarm goes from warning to alert. Just leave the default

 

    • Click OK (you will get a warning message if your vCenter SMTP settings are not configured)

image

    • There are A LOT more triggers that relate to the Datastore monitor when you select the Monitor for specific events occurring… radio button. Here is a list:

image

    • As you can see you have A LOT of options to choose from and you can use the instructions in the previous steps to create new alarms that can help you effectively monitor your datastores

 

  • Analyze Datastore Alarms and errors to determine space availability
    • Using datastore alarms and errors to determine your available space is pretty straight forward. The default alarm Datastore usage on disk is the perfect alarm to use, and it’s enabled by default
    • The Datastore usage on disk alarm is pre-configured to trigger a warning when disk usage is over 75%, and an alert if it gets above 85%. Again, these are the defaults for this alarm; you may want to edit the thresholds based on your organization’s best practices for free storage space
    • You can only edit alarms in the scope in which they are defined. In this case, the Datastore usage on disk alarm is defined at the top-level object, which is the vCenter object
    • I created an 8.6GB eagerzeroedthick virtual disk using vmkfstools on a datastore that had only 8.89GB free (a sketch of the command appears at the end of this section)
    • Once my view was updated (these are updated every 30 minutes) an alert was triggered

image

    • Now if I was seeing this alert for the first time the first thing I would do is check the space availability of my datastore. If it was in fact close to being at capacity I would either allocate more space, delete unneeded virtual disks/files or perform a storage vMotion to another datastore that had more capacity
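
For reference, the eagerzeroedthick disk used to trigger the alert can be created with a command along these lines (the size and path are placeholders):

[sourcecode language="text"]
# create an ~8.6GB eagerzeroedthick virtual disk to push the datastore over the alarm threshold
vmkfstools -c 8600m -d eagerzeroedthick /vmfs/volumes/small_datastore/test/test.vmdk
[/sourcecode]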

  • Configure Datastore Clusters
    • Configuring a datastore cluster is an easy enough process, but it is a multi-step process, and datastore clusters can only be created from the vSphere client (you can’t create one in the vSphere Web Client)
      • Log into the vSphere client and navigate to the Datastores and Datastore Clusters view
      • Right-click on your datacenter object and select New Datastore Cluster…
      • Enter in a name for the datastore cluster and choose whether or not to enable Storage DRS

image

      • Click Next
      • Choose either No Automation (Manual Mode) or Fully Automated
      • We aren’t adding any Advanced Options so click Next

image

      • Decide whether you want to enable the I/O metric for SDRS recommendations
      • Choose the thresholds you want SDRS recommendations to be triggered on
        • Utilized Space — default is 80%
        • I/O Latency — default is 15ms

image

      • Click the Show Advanced Options hyperlink to set the advanced options
        • Set the percentage for the minimum utilization difference between the source and destination datastore before SDRS will make a recommendation
          • Here is an example: if you leave this at the default (5%), SDRS will not make a recommendation for a move unless there is at least a 5% difference in utilization between the source datastore and the destination datastore. So the datastore first needs to exceed the utilized space threshold, and then there needs to be at least a 5% utilization difference before SDRS will make a recommendation
        • Set the frequency that SDRS should check for imbalances — default is 8 hours
        • Set the I/O imbalance threshold

image

      • Click Next
      • Select which cluster(s) you want to use > click Next

image

      • Select which datastores you want as part of the datastore cluster
        • Best practice is to use datastores that have similar capabilities, that way application owners and users should never experience a degradation of service due to an applied SDRS recommendation

image

      • Click Next > click Finish

 

Tools

May 122012
 

For this objective I used the following documents:

Objective 1.1 – Implement and Manage Complex Storage Solutions

Knowledge

**ITEMS IN BOLD ARE TOPICS PULLED FROM THE BLUEPRINT**

  • Identify RAID Levels
    • If you are looking at this blueprint and contemplating taking this exam I’m going to assume that you know what RAID is. If you don’t, then you are possibly in for a LONG VCAP5-DCA preparation. I’m not going to list out every single RAID level, but I will go over the most commonly used ones; RAID 0, 1, 5, 6 and 1+0
      • RAID 0: Striping only, no redundancy. Data is striped over all disks in a RAID 0 set. Minimum of 2 disks.
        • Pros:
          • Very good performance
          • Allows for the maximum use of disk space
        • Cons
          • No redundancy
          • Any drive failure will destroy the entire dataset
      • RAID 1: Mirroring only, no striping. Data is mirrored across disks. If you have a two disk RAID 1 set then the same data is on both disks. Minimum of 2 disks.
        • Pros:
          • Redundant
          • Write performance degradation is minimal
        • Cons:
          • You lose half of your disk capacity (two 1TB disks, 2TB total only nets you 1TB)
      • RAID 5: Striping with parity. Data is striped across all disks in the RAID 5 set and parity bits are distributed across the disks. Minimum of 3 disks
        • Pros:
          • Can sustain a loss of 1 drive in the set
          • Very good read performance
        • Cons:
          • Write performance not as good as RAID 1 due to parity calculation
          • Throughput is degraded when a disk does fail
      • RAID 6: Striping with double parity. Data is striped across all disks in the RAID 6 set along with double parity. Minimum of 4 disks
        • Pros:
          • Can sustain a loss of 2 drives in the set
          • Useful in large RAID sets
          • Very good read performance
        • Cons:
          • Requires 4 disks
          • More disk space is utilized for the extra parity
          • Write performance not as good as RAID 1 or 5 due to double parity calculation
      • RAID 1+0 (RAID 10): Mirroring and Striping. Disks in a RAID 10 set are mirrored and then striped across more disks. Minimum of 4 drives, and the total number of drives must be even
        • Pros:
          • Great read/write performance
          • Can survive many drive failures as long as all drives in a mirror don’t fail
        • Cons:
          • Only 50% of disk capacity is available due to mirroring
          • Complex compared to RAID 0 and RAID 1

 

  • Identify Supported HBA types
    • The three types of Host Bus Adapters (HBA) that you can use on an ESXi host are Ethernet (iSCSI), Fibre Channel and Fibre Channel over Ethernet (FCoE). In addition to the hardware adapters, software versions of the iSCSI and FCoE adapters are available (software FCoE is new with version 5).
    • There are far too many adapters to list, but the usual suspects make them:
      • Broadcom
      • Brocade
      • Cisco
      • Emulex
      • QLogic
    • To see all the results search VMware’s compatibility guide

 

  • Identify virtual disk format types
    • There are three types of virtual disk formats:
      1. Thick Provision Lazy Zeroed – a thick disk is created and all space on the underlying storage is allocated upon creation. The blocks within the allocated space are zeroed out on demand (not at the time of virtual disk creation)
      2. Thick Provision Eager Zeroed – a thick disk is created and all space on the underlying storage is allocated upon creation. The blocks within the allocated space are zeroed out up front – it will take some time (considerable amount of time depending on disk size) to create this type of virtual disk
      3. Thin Provisioned – Only space that is needed is allocated to these types of disks. As the need for more physical space grows a thin provisioned disk will grow to meet that demand, but only up to its configured size
    • Using a Raw Device Mapping (RDM) may also be considered a virtual disk format type. While I don’t consider it a virtual disk format, I wanted to include it anyway. A RDM is a pointer to a physical LUN on a SAN. When you create a RDM a .vmdk file is created, but it only contains a pointer to the physical LUN

Skills and Abilities

  • Determine use cases for and configure VMware DirectPath I/O
    • DirectPath I/O allows a VM to access a device on the physical server without intervention from the hypervisor
    • The CPUs must have Intel Virtualization Technology for Directed I/O (Intel VT-d) feature or if using AMD processors, have AMD I/O Virtualization Technology (IOMMU). Once you verify your CPUs are capable, ensure the feature is enabled within the BIOS
    • According to test results done by VMware in a recent performance whitepaper, Network I/O Latency in vSphere 5, using DirectPath I/O lowered the round trip time by 10 microseconds. While 10 microseconds may seem miniscule, it can be the difference with very low latency applications
    • A few use cases:
      • Stock Market applications (an example used in the aforementioned white paper)
      • A legacy application that may be bound to the physical device
      • Can improve CPU performance for applications with a high packet rate
    • Configuring DirectPath I/O on the ESXi host (from VMware KB 1010789)
      • In the vSphere client select a host from the inventory > click the Configuration tab > click Advanced Settings  under the Hardware pane
      • Click Edit and select the device(s) you want to use > click OK
      • Reboot the host (once the reboot is complete the devices should now appear with a green icon)
    • Configuring a PCI Device (Direct Path I/O) on a Virtual Machine (from VMware KB 1010789)
      • In the vSphere client right-click the virtual machine you want to add the PCI device to and click Edit Settings…
      • Click the Hardware tab > click Add
      • Choose the PCI device > click Next

 

  • Determine requirements for and configure NPIV
    • N-Port ID Virtualization (NPIV) is used to present multiple World Wide Names (WWN) to a SAN network (fabric) through one physical adapter. NPIV is an extension of the Fibre Channel protocol and is used extensively on converged platforms (think Cisco UCS)

     

    • Here are a list of requirements you must meet in order to use NPIV
      • The Fibre Channel switches must support NPIV
      • The physical HBAs in your hosts must support NPIV
        • vMotioning a virtual machine configured with NPIV to a host whose physical HBA does not support NPIV will revert to using the WWN of the physical HBA
      • Heterogeneous HBAs across physical hosts are not supported
      • The physical HBAs must have access to the LUNs that will be accessed by the NPIV-enabled virtual machines
      • Ensure that the NPIV LUN ID at the storage layer is the same as the NPIV target ID
    • Guest NPIV only works with Fibre Channel switches
    • NPIV does not support Storage vMotion
    • Unfortunately I don’t have an environment that I can go through and document for you the step-by-step process. The steps below are from the vSphere 5 Storage Guide
    • Configuring NPIV
      • Open the New Virtual Machine wizard.
      • Select Custom, and click Next.
      • Follow all steps required to create a custom virtual machine.
      • On the Select a Disk page, select Raw Device Mapping, and click Next.
      • From a list of SAN disks or LUNs, select a raw LUN you want your virtual machine to  access directly.
      • Select a datastore for the RDM mapping file
      • Follow the steps required to create a virtual machine with the RDM.
      • On the Ready to Complete page, select the Edit the virtual machine settings before completion check box and click Continue. The Virtual Machine Properties dialog box opens.
      • Assign WWNs to the virtual machine.
        • Click the Options tab, and select Fibre Channel NPIV.
        • Select Generate new WWNs.
        • Specify the number of WWNNs and WWPNs.
          A minimum of 2 WWPNs are needed to support failover with NPIV. Typically only 1 WWNN is created for each virtual machine.
      • Click Finish.

 

  • Determine appropriate RAID level for various Virtual Machine workloads
    • Earlier in this objective I covered different RAID levels and their respective advantages/disadvantages. Now lets discuss where these RAID levels fit in best with different workloads
    • Typically when your workloads are read intensive it is best to use RAID 5 or RAID 6. When the workload is write intensive you want to use RAID 1 or RAID 1+0. Hopefully the application owner can give you the read/write percentages so that you can determine which RAID level is best.
    • Here’s an example:
      • Formula: (total required IOPs * read%) + (total required IOPs * write% * RAID write penalty) = total backend IOPs required
        • 400 IOPs required
        • 35% read
        • 65% write
      • RAID1 = (400 * 0.35) + (400 * 0.65 * 2) = 660 IOPs
        • 15K disks required = 4
        • 10K disks required = 5
        • 7.2K disks required = 9
      • RAID5 = (400 * 0.35) + (400 * 0.65 * 4) =  1180 IOPs
        • 15K disks required = 7
        • 10K disks required = 9
        • 7.2K disks required = 16
      • RAID6 = (400 * 0.35) + (400 * 0.65 * 6) = 1700 IOPs
        • 15K disks required = 10
        • 10K disks required = 14
        • 7.2K disks required = 23
    • As you can see, the number of disks required depends on the RAID level you choose. So when determining which RAID level to choose, you need to factor in the number of disks you have against the level of protection you will provide. Each of the above RAID levels can meet the IOPs required for the workload, but some require more disks dependent upon the RAID level and type of disks.
    • In the above example I would go with RAID 5 on 15K disks. While RAID 1 would only require 4 disks to meet the IOPs requirement, it may actually require more disks because you lose 50% of your capacity in any given RAID 1 set.
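
If you want to play with the numbers, the formula above can be wrapped in a small PowerShell function (the function and parameter names are my own):

[sourcecode language="powershell"]
# calculate the backend IOPs required for a workload at a given RAID write penalty
function Get-BackendIops {
    param([int]$FrontendIops, [double]$ReadPct, [double]$WritePct, [int]$WritePenalty)
    ($FrontendIops * $ReadPct) + ($FrontendIops * $WritePct * $WritePenalty)
}

# 400 IOPs at 35% read / 65% write for RAID 1 (penalty 2), RAID 5 (penalty 4) and RAID 6 (penalty 6)
Get-BackendIops -FrontendIops 400 -ReadPct 0.35 -WritePct 0.65 -WritePenalty 2   # 660
Get-BackendIops -FrontendIops 400 -ReadPct 0.35 -WritePct 0.65 -WritePenalty 4   # 1180
Get-BackendIops -FrontendIops 400 -ReadPct 0.35 -WritePct 0.65 -WritePenalty 6   # 1700
[/sourcecode]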
    • A tool built in to ESXi that can be VERY useful in determining the I/O characteristics of a virtual machine workload is vscsiStats. I’m not going to go into real detail here as to how exactly to interpret the statistics it pulls, but I will provide you with the basics and a super AWESOME blog that really goes into detail and even provides some templates
      • You can run vscsiStats from anywhere within the shell (console or SSH), but keep in mind that the “S” in “Stats” is capitalized
      • To get going, here are the commands you will run, along with an explanation of each parameter
[sourcecode language="text"]
# find the world ID for the VM you want to collect statistics on
vscsiStats -l

# this will start the collection. -s tells it to start and -w specifies the world ID
vscsiStats -s -w 466937

# here is what should be returned after entering the command above
# "vscsiStats: Starting Vscsi stats collection for worldGroup 466937, handleID 8207 (scsi0:0)"
# "Success."

# after this runs for a period of time you need to pull what’s been collected using the -w parameter
# for the world ID and -p <stat> for the stat you want to pull (-p can be ioLength, seekDistance, outstandingIOs,
# latency, interarrival and all. Use the -c parameter to specify a csv format
vscsiStats -w 466937 -p all -c

# once you’re done you want to stop the collection
vscsiStats -x
[/sourcecode]

      • If you want to learn how to interpret these results check out Erik Zandboer’s three-part series, it is definitely a useful resource

 

  • Apply VMware storage best practices
    • Best practices for storage and vSphere will always require a look at your storage vendor’s documentation as it will differ across platforms. However, from the vSphere side we can apply general best practices regardless of the underlying storage platform
    • Best Practices for Fibre Channel Storage
      • First and foremost you should document the environment
        • includes software versions, zoning, LUN masking, etc.
      • Only one VMFS datastore per LUN
      • Disable automatic host registration
        • GUI – Modify Advanced Settings > Disk > Disk.EnableNaviReg = 0

image

        • the esxcli way
[sourcecode language="text"]
esxcli system settings advanced set -i=0 -o "/Disk/EnableNaviReg"
[/sourcecode]

      • Use read/write caching on the array
      • ensure non-ESXi hosts are not accessing the same LUNs or physical disks as your ESXi hosts
      • Ensure you have paths to all storage processors for proper load balancing and redundancy
      • Enable Storage I/O Control (SIOC)
      • Ensure you design your storage with proper IOPs in mind (see above section on identifying proper RAID levels)
      • use a dual redundant switching fabric
      • match all queue depths across the application, guest OS, ESXi host, HBA and storage array
    • Best Practices for iSCSI
      • Document the environment
      • Use only one VMFS datastore per LUN
      • Enable read/write cache on the array
      • only ESXi hosts should be accessing the LUN(s) and underlying physical disks
      • Ensure each ESXi hosts has the appropriate number of network adapters to handle throughput for iSCSI traffic
      • Bind multiple network adapters to the iSCSI software adapter for redundancy
      • match all queue depths across the application, guest OS, ESXi host and storage array
      • separate uplinks on the physical switch so they are not using the same buffers
      • Ensure you don’t have Ethernet bottlenecks going to your storage (or anywhere for that matter)
      • Isolate storage traffic to its own VLAN if possible
      • Enable Storage I/O Control (SIOC)
    • Best Practices for NFS
      • Isolate storage traffic to its own VLAN if possible
      • Enable Storage I/O Control (SIOC)
      • Mount all NFS exports the same across all hosts
      • If you increase the max number of NFS mounts for a host, be sure to also increase the TCP/IP heap size accordingly
        • Increase Max NFS volumes through the GUI
          • Modify Advanced Settings > NFS > NFS.MaxVolumes
        • The esxcli way
[sourcecode language="text"]
esxcli system settings advanced set -i=32 -o "/NFS/MaxVolumes"
[/sourcecode]

        • Increase the TCP Heap Size through the GUI (changing the heap size requires a reboot of the ESXi host)
          • Modify Advanced Settings > Net > Net.TcpipHeapSize
        • The esxcli way
[sourcecode language="text"]
esxcli system settings advanced set -i=16 -o "/Net/TcpipHeapSize"
[/sourcecode]

 

  • Understand the use cases for Raw Device Mapping
    • In order to understand why you would use a Raw Device Mapping (RDM), we need to define it.  “An RDM is a mapping file in a separate VMFS volume that acts as a proxy for a raw physical storage device” – vSphere Storage Guide
    • RDMs come in two flavors; physical compatibility mode and virtual compatibility mode
      • Physical compatibility mode:
        • The VMkernel passes all SCSI commands to the mapped device with the exception of the REPORT LUNs command. This command is virtualized so that the VMkernel can isolate the mapped device to whichever virtual machine owns it
        • Can be greater than 2TB in size (assumes VMFS5)
      • Virtual compatibility mode:
        • Unlike physical compatibility mode, virtual mode will only pass the READ and WRITE command to the mapped device, all other SCSI commands are handled by the VMkernel
        • Cannot be greater than 2TB
    • There are certain scenarios in which you don’t have a choice but to use RDMs:
      • When using Microsoft Clustering Services across physical hosts. Any cluster data disks and quorum disks should be configured as a RDM
      • If at any point you want to use N-Port ID Virtualization (NPIV) within the guest you will need to use a RDM
      • If you need to run SAN management agents inside a virtual machine
    • To fully understand the use cases for RDMs you must also know their limitations
      • Virtual machine snapshots are only available when using a RDM in virtual compatibility mode
      • You can’t map to a certain partition on a device, you must map to the entire LUN
      • You cannot use direct attached storage devices to create a RDM (direct attached devices do not export the SCSI serial number, which is required for a RDM)
    • Now that you have read what a RDM is, the available modes, when you MUST use them and what some of their limiting factors are, you can start to narrow down the use cases. To further assist you, here is a table from the vSphere Storage Guide that outlines the feature sets when using VMFS, virtual RDM and physical RDM
      The columns are: Virtual Disk File / Virtual Mode RDM / Physical Mode RDM
        • SCSI Commands Passed Through: No / No / Yes (REPORT LUNs is not passed through)
        • vCenter Server Support: Yes / Yes / Yes
        • Snapshots: Yes / Yes / No
        • Distributed Locking: Yes / Yes / Yes
        • Clustering: Cluster-in-a-box only / Cluster-in-a-box, cluster-across-boxes / Physical-to-virtual clustering, cluster-across-boxes
        • SCSI Target-Based Software: No / No / Yes
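
For reference, the RDM mapping file itself can be created from the ESXi shell with vmkfstools; here is a rough sketch (the device ID and paths are placeholders):

[sourcecode language="text"]
# create a physical compatibility mode RDM mapping file for a LUN
vmkfstools -z /vmfs/devices/disks/naa.xxxxxxxxxxxxxxxx /vmfs/volumes/datastore1/vm1/vm1_rdmp.vmdk

# create a virtual compatibility mode RDM mapping file for a LUN
vmkfstools -r /vmfs/devices/disks/naa.xxxxxxxxxxxxxxxx /vmfs/volumes/datastore1/vm1/vm1_rdm.vmdk
[/sourcecode]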

 

 

  • Configure vCenter Server storage filters
    • There are four different storage filters that can be configured; VMFS Filter, RDM Filter, Same Host and Transports Filter and the Host Rescan Filter. If you don’t know what these are, here is a quick explanation:
      • VMFS Filter: filters out storage devices or LUNs that are already used by a VMFS datastore
      • RDM Filter: filters out LUNs that are already mapped as a RDM
      • Same Host and Transports Filter: filters out LUNs that can’t be used as a VMFS datastore extent.
        • Prevents you from adding LUNs as an extent not exposed to all hosts that share the original VMFS datastore.
        • Prevents you from adding LUNs as an extent that use a storage type different from the original VMFS datastore
      • Host Rescan Filter: Automatically rescans and updates VMFS datastores after you perform datastore management operations
    • You create these filters from vCenter through Administration > vCenter Server Settings… > Advanced Settings. From here you enter in a new Key/Value pair and click the Add button
    • Once those settings are added there are a few different places you can view them:
      • within the Advanced Settings window of where you added them
      • The vpxd.cfg file on your vCenter server (C:\ProgramData\VMware\VMware VirtualCenter)
        • located between the <filter></filter> tags
      • you can also view the vpxd.cfg file from the ESXi host itself (/etc/vmware/vpxa)
    • All storage filters are enabled by default. To disable one, set the corresponding key to false:
      • VMFS Filter: config.vpxd.filter.vmfsFilter
      • RDM Filter: config.vpxd.filter.rdmFilter
      • Same Host and Transports Filter: config.vpxd.filter.SameHostAndTransportsFilter
      • Host Rescan Filter: config.vpxd.filter.hostRescanFilter

 

    • Here is a short video of Configuring vCenter Server Storage Filters

 

  • Understand and apply VMFS resignaturing
    • VMFS resignaturing comes into play when you are trying to mount a LUN that already has a VMFS datastore on it (for example, a replicated or cloned LUN) to a host. You have three options when mounting a LUN to an ESXi host with an existing VMFS partition: Keep the existing signature, Assign a new signature and Format the disk. Here is a brief description of each of those options
      • Keep the existing signature: Choosing this option will leave the VMFS partition unchanged. If you want to preserve the VMFS volume (keep the existing UUID), choose this option. This is useful when you are doing LUN replication to a DR site and need to mount the cloned LUN – MUST BE WRITABLE
      • Assign a new signature: Choosing this option will delete the existing disk signature and replace it with a new one. You MUST use this option (or the format option) if the original VMFS volume is still mounted (you can’t have two separate volumes with the same UUID mounted simultaneously). During resignaturing a new UUID and volume label are assigned, which consequently means that any virtual machines that are registered on this VMFS volume must have their configuration files updated to point to the new name/UUID or the virtual machines must be removed/re-added back to the inventory
      • Format the disk: Nothing much new here; choosing this option is the same as creating a new VMFS volume on a blank LUN – – ALL EXISTING DATA WILL BE LOST
    • There are two ways that you can add a LUN with an existing VMFS volume to a host: through the GUI and through the command line. The following assumes your host has access to the LUN on the array side:
    • Adding a LUN with an Existing VMFS Volume using the GUI
      1. From within the vSphere client, either connect to vCenter or directly to a host, navigate to the Hosts and Clusters view: Home > Hosts and Clusters (or Ctrl + Shift + H)
      2. Select the host you want to add the LUN to on the right > select the Configuration tab
      3. Click on the Storage Hyperlink
      4. Click the Add Storage… hyperlink in the upper right
      5. Select Disk/LUN > click Next
      6. Select the appropriate LUN > click Next
      7. Select one of the aforementioned options (Keep the existing signature, Assign a new signature or Format the disk)
      8. Click Finish
      9. If you are connected to vCenter you may receive the following error during this process
          1. image
          2. Check out VMware KB1015986 for a workaround (connect directly to the host and add the LUN)

       

    • Adding a LUN with an Existing VMFS Volume using esxcli
      1. SSH or direct console to the ESXi host that you want to add the LUN with the existing VMFS volume to  — You can also connect to a vMA instance and run these commands
      2. Once connected you need to identify the ‘snapshots’ (which volumes have an existing VMFS volume on it)
[sourcecode language="text" padlinenumbers="true"]
# This will list the snapshots that are available
esxcli storage vmfs snapshot list

# Mount a snapshot named 'replicated_lun' and keep the existing signature (find the snapshot you want
# to mount using the output from the previous command)
esxcli storage vmfs snapshot mount -l 'replicated_lun'

# Mount a snapshot named 'replicated_lun' and assign a new signature (find the snapshot you want
# to mount using the output from the first command)
esxcli storage vmfs snapshot resignature -l 'replicated_lun'
[/sourcecode]

    • Here is a video showing you how to mount a VMFS volume that has an identical UUID to another volume. It will show you how to mount a volume while keeping the existing signature and by applying a new signature, all using esxcli. Enjoy!

 

  • Understand and apply LUN masking using PSA-related commands
    • LUN masking gives you control over which hosts see which LUNs. This allows multiple hosts to be connected to a SAN with multiple LUNs while allowing only hosts that you specify to see a particular LUN(s). The most common place to do LUN masking is on the back-end storage array. For example, an EMC Clariion or VNX provides LUN masking by way of Storage Groups. You add hosts and LUNs to a storage group and you have then essentially “masked” that host to only seeing those LUNs.
    • Now that we have a better idea of what LUN masking is, let’s go into an example of how you would actually do this on an ESXi host.
    • The first thing we need to do is identify which LUN we want to mask. To do this:
      • esxcfg-scsidevs -m  — the -m will display only LUNs with VMFS volumes, along with the volume label. In this example we are using the “vmfs_vcap_masking” volume
      • image
      • Now that we see the volume we want, we need to find the device ID and copy it (it starts with “naa.”). In this example our device ID is naa.5000144fd4b74168
    • We have the device ID and now we have to find the path(s) to that LUN
      • esxcfg-mpath -L | grep naa.5000144fd4b74168  — the -L parameter gives a compact list of paths
      • image
      • We now see there are two paths to my LUN, which are C0:T1:L0 and C2:T1:L0
    • Knowing what our paths are, we can now create a new claim rule, but first we need to see which claim rules exist so that we don’t reuse an existing claim rule number
      • esxcli storage core claimrule list
      • image
    • We can use any rule number for our new claim rule that isn’t in the list above. We’ll use 500. Now let’s create the new claim rule for the first path, C0:T1:L0, which is on adapter vmhba35
      • esxcli storage core claimrule add -r 500 -t location -A vmhba35 -C 0 -T 1 -L 0 -P MASK_PATH   — you know the command succeeded if you don’t get any errors.
    • Masking one path to a LUN that has two paths will still allow the LUN to be seen on the second path, so we need to mask the second path as well. This time we’ll use 501 for the rule number and C2:T1:L0 as the path. The adapter will still be vmhba35
      • esxcli storage core claimrule add -r 501 -t location -A vmhba35 -C 2 -T 1 -L 0 -P MASK_PATH — you know the command succeeded if you don’t get any errors.
    • Now if you run esxcli storage core claimrule list again you will see the new rules, 500 and 501, but you will notice the Class for those rules shows as file, which means the rules are loaded in /etc/vmware/esx.conf but aren’t yet loaded into runtime. Let’s load our new rules into runtime
      • esxcli storage core claimrule load
      • Now run esxcli storage core claimrule list and this time you will see those rules displayed twice, once as the file Class and once as the runtime Class
      • image
    • Only one more step left. Before those paths can be associated with the new plugin (MASK_PATH), they need to be disassociated from the plugin they are currently using. In this case those paths are claimed by the NMP plugin (rule 65535). This next command will unclaim all paths for that device and then reclaim them based on the claim rules in runtime. Again we’ll use naa.5000144fd4b74168 to specify the device
      • esxcli storage core claiming reclaim -d naa.5000144fd4b74168
      • After about 30 seconds, if you are watching the storage area on your host within the vSphere client you will see that datastore disappear from the list
      • Running esxcfg-mpath -L | grep naa.5000144fd4b74168 again will now show 0 paths (before it showed 2)
    • Here is a quick list of commands you would need to run if you wanted to unmask those two paths to that LUN and get it to show up again in the vSphere client
[sourcecode language="text"]
esxcli storage core claimrule remove -r 500
esxcli storage core claimrule remove -r 501
esxcli storage core claimrule load
esxcli storage core claiming unclaim -t location -A vmhba35 -C 0 -T 1 -L 0
esxcli storage core claiming unclaim -t location -A vmhba35 -C 2 -T 1 -L 0
esxcli storage core adapter rescan -A vmhba35
[/sourcecode]

    • Here is a pretty awesome video of performing LUN masking using the all powerful OZ esxcli

 

  • Identify and tag SSD devices
    • There are a few ways that you can identify an SSD device. The easiest way is to look in the storage area (select host > click Configuration >  click the Storage hyperlink) and look at the Drive Type column of your existing datastores. This will either say Non-SSD or SSD
      • image
    • Now you can only use the previous method if you already have a datastore mounted on that LUN. If you don’t, SSH into your host and let’s use esxcli to figure out which devices are SSDs
      • esxcli storage core device list
          • image
        • The Is SSD will show True or False
    • The PowerCLI Way
      [sourcecode language="powershell" padlinenumbers="true"]
      $esxcli = Get-EsxCli
      $esxcli.storage.core.device.list()

      #Here is the output (truncated)

      #AttachedFilters :
      #DevfsPath : /vmfs/devices/disks/na
      #Device : naa.5000144f60f4627a
      #DeviceType : Direct-Access
      #DisplayName : EMC iSCSI Disk (naa.50
      #IsPseudo : false
      #IsRDMCapable : true
      #IsRemovable : false
      #IsSSD : true
      #Model : LIFELINE-DISK

      [/sourcecode]

    • Identifying a SSD device is easy when it is detected automatically, but what if your SSD device isn’t tagged as an SSD by default? The answer is that you can manually tag it. This has to be done with our good friend esxcli
      • First you need to identify the device that is not being tagged automatically (there are multiple ways of tagging the device; in this example we will use the device name). Run the following command so you can get the Device Display Name and the Storage Array Type (SATP)
        • esxcli storage nmp device list
        • image
        • In this example the device name will be naa.5000144f60f4627a and the SATP will be VMW_SATP_DEFAULT_AA – now we must add a PSA claim rule specifying the device, the SATP and the option to enable SSD
          • esxcli storage nmp satp rule add -s VMW_SATP_DEFAULT_AA -d naa.5000144f60f4627a -o enable_ssd    — no result should be displayed
        • Just like our claimrules in the previous section, we need to unclaim the device and load the claimrules into runtime. An additional step is also needed to execute the claimrules (this step was not required when creating LUN Masking claim rules). Again, you will need the device ID for the next command (naa.5000144f60f4627a)
        [sourcecode language="text"]
        # unclaim the device
        esxcli storage core claiming unclaim -t device -d naa.5000144f60f4627a

        # load the claim rules into runtime
        esxcli storage core claimrule load

        # execute the claim rules
        esxcli storage core claimrule run

        # if the device is already mounted you will see it disappear from the Datastore view
        # and then reappear with a Drive Type of SSD
        [/sourcecode]

 

  • Administer hardware acceleration for VAAI
    • Since I only have block storage in my lab I will not be showing examples of hardware acceleration for NFS, but I will list the procedures and capabilities for it
    • Within the vSphere client you can see whether Hardware Acceleration is supported for your device (click on a host > click Configuration > click the Storage hyperlink)
      • image
    • The hardware acceleration available for block devices are:
      • Full Copy
      • Block Zeroing
      • Hardware Assisted Locking (ATS)
      • Unmap
    • If your device is T10 compliant, it uses the T10-based SCSI commands directly, enabling hardware acceleration support without the use of a VAAI plugin. If your device is not (or is only partially) T10 compliant, the VAAI plugin is used to bridge the gap and enable hardware acceleration
    • Display Hardware Acceleration Plug-Ins and Filter
      • esxcli storage core plugin list -N VAAI (displays the VAAI plugins)
      • esxcli storage core plugin list -N Filter (displays the VAAI filter)
      • image
    • Displaying whether the device supports VAAI and any attached filters (for this example I’m using naa.6006016014422a00683427125a61e011 as the device)
      • esxcli storage core device list -d naa.6006016014422a00683427125a61e011
      • image
    • Display VAAI status of each primitive on a device (again using naa.6006016014422a00683427125a61e011)
      • esxcli storage core device vaai status get -d naa.6006016014422a00683427125a61e011
      • image
    • Before we move on to adding hardware acceleration claim rules, let’s check out how to display the current claim rules for the filters and for VAAI (a PowerCLI equivalent of these checks is sketched at the end of this objective)
      • Filter: esxcli storage core claimrule list -c Filter
      • VAAI: esxcli storage core claimrule list -c VAAI
    • Adding hardware acceleration claim rules is a five-step process. The first two steps create two claim rules, one for the VAAI filter and one for the VAAI plugin. The third and fourth steps load those claim rules into runtime. The last step runs the filter-class claim rules. Since you are doing this manually you need to know the rule type, which in our case is Vendor, and the vendor string, which in this case is vlabs. Let’s get to it:
[sourcecode language="text"]
# this will create a new claim rule for the VAAI_Filter with a type of "Vendor" and the vendor will be "vlabs"
# the -u parameter automatically assigns the rule number
esxcli storage core claimrule add -c Filter -P VAAI_FILTER -t vendor -V vlabs -u

# this will create a new claim rule for the VAAI Plugin with a plugin name of "VMW_VAAI_VLABS"
# the -f parameter is being used to force the command as the aforementioned plugin name is not registered
esxcli storage core claimrule add -c VAAI -P VMW_VAAI_VLABS -t vendor -V vlabs -u -f

# load the filter plugin claim rule into runtime
esxcli storage core claimrule load -c Filter

# load the VAAI plugin claim rule into runtime
esxcli storage core claimrule load -c VAAI

# execute the new claim rules
esxcli storage core claimrule run -c Filter
[/sourcecode]

    • For NFS you will need to install the plug-in provided by your array vendor and then verify the hardware acceleration (use esxcli storage nfs list). To see the full procedure for installing and updating NAS plugins see pages 177-180 of the vSphere Storage Guide
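    • As promised above, the same VAAI checks can also be driven from PowerCLI through the Get-EsxCli view, since its namespaces mirror the esxcli commands used in this section. This is only a sketch: the positional arguments (plugin class and device ID) are assumptions based on the equivalent CLI parameters, and the device ID is the example device used above
[sourcecode language="powershell"]
# list the VAAI plugins and the VAAI filter
# (mirrors: esxcli storage core plugin list -N VAAI / -N Filter)
$esxcli = Get-EsxCli
$esxcli.storage.core.plugin.list("VAAI")
$esxcli.storage.core.plugin.list("Filter")

# show the VAAI primitive status for a single device
# (mirrors: esxcli storage core device vaai status get -d <device>)
$esxcli.storage.core.device.vaai.status.get("naa.6006016014422a00683427125a61e011") | Format-List
[/sourcecode]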

 

  • Configure and administer profile-based storage
    • Before we can administer profile-based storage we first must configure it (I know, “DUH”). Of course, before we can configure it we need a basic understanding of its elements. Profile-based storage is built on profiles of storage capabilities, i.e. features an array might have. Those features are added as capabilities (if they are not already defined by the array), and there are both system-defined and user-defined capabilities. Here is a list of the basic steps on the road to profile-based storage
      • Create user-defined capabilities (optional) to go along with any system-defined capabilities
      • Associate those capabilities with datastores that coincide with said capability
      • Enable virtual machine storage profiles (host or cluster level)
      • Create virtual machine storage profiles
      • Associate a virtual machine storage profile with virtual disks or virtual machine files
      • Check for compliance of the associated storage profile on the virtual machine
    • Let’s create some user-defined storage capabilities.
      • Log into vCenter using the vSphere client and click the Home button in the navigation bar
      • Under Management click the VM Storage Profiles button
      • Just under the navigation bar, click Manage Storage Capabilities
      • You’re now presented with a dialog box where you can add your own. Click the Add… button
      • Type the Name of the capability > give it a Description > click OK
      • I’ve created three user-defined capabilities: vcap5-dca, 7200 Disks, and SSD
      • image
      • When you’re finished adding capabilities, click the Close button
    • We’ve created the capabilities, but now we need to associate them with a datastore(s)
      • Navigate to the Datastores and Datastore Cluster view (Home > Inventory > Datastores and Datastore Clusters or use the hot keys Ctrl + Shift + D)
      • Right-click on the datastore that you want to assign a capability to > click Assign User-Defined Storage Capability…
      • image
      • From the drop-down menu select an existing storage capability (you can also create a new capability from here should you need to, by clicking the New… button)
            • image
      • Click OK
      • Repeat on all datastores in which you need to assign a user-defined storage capability. If you are assigning the same storage capability to multiple datastores you can select them all at once and then assign the capability
      • NOTE: You can only assign one storage capability per datastore
    • We need to create virtual machine storage profiles, but first we must enable this on either a host or a cluster
      • In the vSphere client and click the Home button in the navigation bar
      • Under Management click the VM Storage Profiles button
      • Under the navigation bar click Enable VM Storage Profiles
      • From here you can select a particular cluster
        • ALL hosts within the cluster must have a Licensing Status of Licensed. With any other status, such as Unknown, you will not be able to enable it
      • Once you’ve selected which cluster you want click the Enable hyperlink in the top right
      • image
      • Click the Close button once the VM Storage Profile Status changes to Enabled
    • Creating a new VM Storage Profile
      • In the vSphere client and click the Home button in the navigation bar
      • Under Management click the VM Storage Profiles button
      • Under the navigation bar click Create VM Storage Profile
      • Enter in a descriptive name (such as a defined SLA, e.g. Platinum)
      • Enter in a description for the new profile > click Next
      • Select which storage capabilities should be a part of this profile (for this example I’m selecting the vcap5-dca capability)
        • BE CAREFUL HERE. If you select more capabilities than exist on a single datastore then a VM that has this particular storage profile applied to it will never show up as compliant
      • Click Next > click Finish
    • We have successfully created a VM Storage Profile, but it won’t do us any good until we associate it with a virtual machine
      • In the vSphere client navigate to the VMs and Templates view (Home > Inventory > VMs and Templates or press Ctrl + Shift + V)
      • Right-click on a virtual machine that you want to apply a VM Storage Profile to > click VM Storage Profile > Manage Profiles…
      • image
      • From the drop-down menu choose a profile. In our case it’s the Platinum profile
      • From here you have two options. You can click Propagate to disks, which will associate all of the VM’s virtual disks with the Platinum profile. If you don’t want to propagate to all the disks, you can manually set which disks should be associated with that profile
      • In this example I am forgoing the propagate option and only setting this on Hard disk 1
      • image
      • Click OK when you are finished
    • Lastly, we need to check the compliance of the VM Storage Profile as it relates to that particular VM
      • In the vSphere client navigate to the VMs and Templates view (Home > Inventory > VMs and Templates or press Ctrl + Shift + V)
      • Click on the virtual machine that you just associated the VM Storage Profile with and click the Summary tab (should be default)
      • Look at the VM Storage Profiles section and check the Profile Compliance
      • image
      • Here it will list whether the VM is compliant or not and the last time compliance was checked (if you need to check again, right-click the VM > click VM Storage Profile > Check Profiles Compliance). A quick PowerCLI cross-check is sketched below
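    • The built-in compliance check is the authoritative answer, but as a quick manual cross-check you can list where each of the VM’s disks actually resides with PowerCLI and eyeball that against the capabilities you assigned to your datastores. This is a minimal sketch; the VM name vcap-vm01 is hypothetical
[sourcecode language="powershell"]
# list each virtual disk of a VM and the datastore it resides on
Get-VM "vcap-vm01" | Get-HardDisk |
    Select-Object Name,
        @{N="Datastore"; E={ ($_.Filename -split "\]")[0].TrimStart("[") }},
        Filename
[/sourcecode]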

 

  • Prepare storage for maintenance (mounting/un-mounting)
    • Should you need to perform storage maintenance on the disks that make up a VMFS volume, you will want to unmount it from vSphere. Here is a list of prerequisites a VMFS datastore must meet before it can be unmounted (a small PowerCLI pre-flight sketch follows at the end of this topic)
      • No virtual machine resides on the datastore
      • The datastore is not part of a Datastore Cluster
      • The datastore is not managed by storage DRS
      • Storage I/O control is disabled for this datastore
      • The datastore is not used for vSphere HA heartbeating
    • To un-mount a datastore perform the following steps:
      • In the vSphere client, navigate to the Hosts and Clusters view
      • Select a host on the left and click the Configuration tab on the right > click the Storage hyperlink
      • Right-click on the datastore you want to un-mount and click Unmount
      • Verify that all the aforementioned checks have passed validation > click OK
      • image
      • If any of the requirements fail to validate then you will not be able to unmount the datastore
    • Using esxcli (I’m using the vmfs_vcap_masking datastore)
      • esxcli storage filesystem unmount -l vmfs_vcap_masking
        • There are scenarios where the GUI won’t let you unmount a volume, for example when the datastore has a virtual machine on it. In this case, even if the VM is powered off, the GUI won’t let you unmount the datastore. The esxcli command above, however, will let you unmount the datastore IF the VM is powered off
        • If you try to unmount a datastore via esxcli while a powered-on VM resides on that datastore, you will receive the following error
        • image
        • Here is more information from the vmkernel log (screenshot is broken up)
  • imageimage

 

    • Once you’re done with your maintenance you will want to mount the volume again
      • In the vSphere client, navigate to the Hosts and Clusters view
      • Select a host on the left and click the Configuration tab on the right > click the Storage hyperlink
      • Right-click on the datastore you want to mount and click Mount
      • Monitor the Recent Tasks pane to see when the operation is complete. Once complete the datastore will be available
      • Using esxcli (I’m using the vmfs_vcap_masking datastore)
        • esxcli storage filesystem mount -l vmfs_vcap_masking
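    • As mentioned above, here is a small PowerCLI pre-flight sketch that checks a couple of the unmount prerequisites before you take a datastore offline. The datastore name matches the example used in this section, and it does not cover every prerequisite (HA heartbeating, for instance, still needs to be checked separately)
[sourcecode language="powershell"]
# pre-flight checks before unmounting a datastore
$ds = Get-Datastore "vmfs_vcap_masking"

# 1. are there any registered virtual machines still on the datastore?
Get-VM -Datastore $ds

# 2. is Storage I/O Control still enabled?
# (IormConfiguration is the vSphere API property exposed via ExtensionData; assumption on my part)
$ds.ExtensionData.IormConfiguration.Enabled
[/sourcecode]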

 

  • Upgrade VMware storage infrastructure
    • As with unmounting/mounting datastores, upgrading your VMware storage infrastructure, particularly upgrading to VMFS5, can be done through the GUI or with esxcli. Here are a few facts about upgrading from VMFS3 to VMFS5
      • Newly created VMFS5 datastores use a unified 1MB block size, regardless of the maximum file size they need to support
      • VMFS5 sub-blocks are now 8KB (VMFS3 is 64KB)
      • The block size you used on your VMFS3 partition will carry over to the upgraded VMFS5 partition
      • The partition table of your newly upgraded VMFS5 volume will remain MBR until it exceeds the 2TB limit, at which point it is automatically converted to GPT
      • The upgrade can be done online without disruption to running virtual machines
    • If you have any VMFS2 partitions you will need to first upgrade them to VMFS3 and then you can upgrade to VMFS5
    • If you prefer to build new VMFS5 partitions instead of upgrading, but don’t have space to create a new volume, you can use the VM shuffle methodology: move VMs off one datastore onto another, wipe the partition and create a new one, and then continue the shuffle until all VMFS datastores are done. Conrad Ramos wrote a PowerCLI script to automate this; check it out here
    • Upgrade VMFS3 to VMFS5 via the vSphere Client
      • In the vSphere client, navigate to the Hosts and Clusters view
      • Select a host on the left and click the Configuration tab on the right > click the Storage hyperlink
      • Click on the datastore you want to upgrade > below the Datastore pane on the right, click the Upgrade to VMFS-5… hyperlink
      • image
      • Click OK to perform the upgrade
    • Upgrade VMFS3 to VMFS5 via esxcli (upgrading a volume with the name of vmfs3_upgrade)
      • esxcli storage vmfs upgrade -l vmfs3_upgrade
      • Once the command completes you will see that volume reflected as VMFS5 under the Type column of the Datastore Views section within the vSphere client. For a quick way to report VMFS versions across all datastores, see the PowerCLI sketch below
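    • To find which volumes are still VMFS3 (and what block size they carry) before or after an upgrade, here is a hedged PowerCLI sketch; it relies on the vSphere API’s VmfsDatastoreInfo object exposed through ExtensionData
[sourcecode language="powershell"]
# report the VMFS version and block size of every VMFS datastore
Get-Datastore | Where-Object { $_.Type -eq "VMFS" } |
    Select-Object Name,
        @{N="VmfsVersion"; E={ $_.ExtensionData.Info.Vmfs.Version }},
        @{N="BlockSizeMB"; E={ $_.ExtensionData.Info.Vmfs.BlockSizeMb }}
[/sourcecode]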

Tools

Command-line Tools

  • vscsiStats
  • esxcli
  • vifs
  • vmkfstools
  • esxtop/resxtop