PVS Internals #1 - Cache Manager
Before reading this article, you definitely want to check out the incredible white paper by Dan Allen called Advanced Memory and Storage Considerations for Provisioning Services. I will not explain the basics here, so I recommend you do your homework first.
I'll write a follow-up blog post later. This first part focuses on theory, while the second part will focus on how you can use this knowledge in real life: how to properly determine the amount of memory required for a PVS server and which tools can help you during testing and troubleshooting.
Lie-to-children
Back in 2000, Messrs Ian Stewart, Jack Cohen and Terry Pratchett came up with the term "lie-to-children" in their marvelous book The Science of Discworld (which I can recommend to everyone). According to Wikipedia, a lie-to-children is an expression that describes the simplification of technical or difficult-to-understand material for consumption by children. The word "children" should not be taken literally, but as including everyone in the process of learning about a topic, regardless of age.
There are a lot of lies-to-children all around us in IT; we often just do not recognize them anymore. When we try to explain something to end users or family members (especially older ones), we often use them to explain a complex technical problem, and many times we do not even realize that what we say is not 100% true. It is true enough, and if necessary we can always explain the details later, right?
Developers tend to improve the product with each and every version, and sometimes (especially if we are talking about core components) these changes are very complex and have evolved over the years into something that cannot easily be explained in a few sentences. Guess what happens when a developer attempts to translate a complex MSDN article with tons of API references for a regular IT pro? That's right. It's time to come up with yet another lie-to-children.
One of the most common examples is related to free memory: you want to have as much free memory as possible to keep your system responsive, but is this really the case? In fact, free memory is bad memory; it is wasted and useless. You do not want free memory, you want to use as much of it as you can. I highly recommend the famous series by Mark Russinovich called "Pushing the Limits of Windows."
Provisioning Services
As we all know, Citrix Provisioning Services lets us stream operating systems and manage them as a single image. I like to think of PVS as a super cheap cache that sits between your expensive storage systems and the target devices that want to read from that storage. From this perspective, PVS is not as magical as you might expect; we just use the Windows caching mechanism to our advantage, and that's why I'll spend most of this article on the internals of Windows caching.
One of the most common questions we are asked is "How much memory should I allocate to my PVS servers?" The answer (and you probably do not want to hear it) is "It depends." If you really want a safe answer, it would be "your used memory + the total size of all your VHD files", and this would of course mean that you waste your resources and do not allocate them efficiently. To understand this vague answer, let's take a look at how the Windows caching mechanism works internally.
If you are looking for a basic rule of thumb to start your testing, you can use the following formula:
4GB + (#vDisksClient * 2GB) + (#vDisksXenApp * 5GB)
We use 2GB for client platforms such as Windows XP or Windows 7 and 5GB for XenApp servers (the assumption is daily reboots for clients and weekly reboots for servers).
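As a quick way to play with the rule of thumb 4GB + (#vDisksClient * 2GB) + (#vDisksXenApp * 5GB), here is a minimal Python sketch. The function name and parameter names are my own, chosen for illustration; the result is only a starting point for testing, not a final sizing.

```python
def pvs_ram_estimate_gb(client_vdisks, xenapp_vdisks,
                        base_gb=4, per_client_gb=2, per_xenapp_gb=5):
    """Rule-of-thumb RAM estimate (in GB) for a PVS server.

    base_gb        - memory reserved for the operating system and PVS itself
    per_client_gb  - cache allowance per client vDisk (e.g. Windows 7)
    per_xenapp_gb  - cache allowance per XenApp vDisk
    """
    if client_vdisks < 0 or xenapp_vdisks < 0:
        raise ValueError("vDisk counts must be non-negative")
    return (base_gb
            + client_vdisks * per_client_gb
            + xenapp_vdisks * per_xenapp_gb)


# Example: 3 client vDisks and 2 XenApp vDisks
print(pvs_ram_estimate_gb(3, 2))  # 4 + 3*2 + 2*5 = 20
```

The per-vDisk allowances are kept as parameters because they depend on the reboot assumptions above; longer uptimes usually mean more unique blocks get read and cached.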
Windows Caching
Years ago, I can imagine that there was a discussion among Windows engineers: there are tons of free memory out there and it is simply wasted, shouldn't we do something about it? For example, we could use it to cache information, right? If we have finished processing certain data, instead of expensively flushing the pages, we'll leave them where they are, and perhaps use them later. That day was very important for us; if it were known, we should mark it in our calendars and celebrate it each year. Not only did it greatly improve the responsiveness of the overall system, it also allowed us to develop one of our products: Citrix Provisioning Services.
In Windows, most file system operations are automatically cached, and this applies to both reads and writes. You can divide the memory in your system into a few different categories (I am completely ignoring virtual vs. physical memory here and focusing only on physical memory; page lists exist only in physical memory).
- Hardware reserved
- In use
- Modified
- Standby
- Free
We can further divide this list into three different areas from the perspective of the cache manager:
- In use (hardware reserved + in use) - here be dragons, do not touch, do not care
- Cached (modified + standby) - here is your cached data, recycled if necessary
- Free - wasted, try to find a use for it
"Cached" memory is memory that is not immediately required by any active process, so Windows uses it for caching information instead. If necessary, however, it can easily be repurposed and assigned to any process that requires more memory (free memory will be used first, however). Cached + free memory is called available memory (strictly speaking, modified pages have to be written to disk before they can be reused).
The best way (and yes, it is a lie) to explain the difference between the standby and modified page lists is that standby is a read-only cache, while modified is used as a cache for write operations.
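To make the grouping of the page lists concrete, here is a small sketch that derives the "cached" and "available" numbers from the five categories. The class and field names are my own; I am assuming the definitions Resource Monitor uses (cached = modified + standby, available = standby + free), and the sample numbers are made up.

```python
from dataclasses import dataclass


@dataclass
class PhysicalMemory:
    """Physical memory split into the five categories, in MB."""
    hardware_reserved_mb: int
    in_use_mb: int
    modified_mb: int
    standby_mb: int
    free_mb: int

    @property
    def total_mb(self):
        return (self.hardware_reserved_mb + self.in_use_mb
                + self.modified_mb + self.standby_mb + self.free_mb)

    @property
    def cached_mb(self):
        # Write cache (modified) plus read cache (standby)
        return self.modified_mb + self.standby_mb

    @property
    def available_mb(self):
        # Assumption: only standby and free pages can be handed out
        # immediately; modified pages must be written to disk first.
        return self.standby_mb + self.free_mb


# Made-up numbers for illustration only
mem = PhysicalMemory(hardware_reserved_mb=1024, in_use_mb=8192,
                     modified_mb=100, standby_mb=2048, free_mb=512)
print(mem.cached_mb)     # 2148
print(mem.available_mb)  # 2560
```

The point of the sketch is that a server with very little free memory can still have plenty of available memory, because the standby list can be repurposed at any time.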
When you read something from your disk, the pages are automatically stored in the standby list, so if you access the data again, you can avoid a costly IRP (I/O Request Packet) and retrieve it from fast memory instead. Windows uses standby for read caching; this includes not only frequently used data, but also data gathered by SuperFetch.
When you change data (or write new data), the changes are stored in the modified list and are written to disk when possible. The principle is quite simple to understand, because it makes a lot of sense. For example, when you copy a file from a network share, the pages are automatically added to the modified list during the copy operation; the lazy writer then writes these pages to disk and automatically moves them to the standby list. So when you open the file afterwards, it is actually opened from memory and there is no need to read it from disk. "Flushing" modified pages does not happen all at once. It is dynamic, controlled partly by the scheduled lazy writer and partly by the current conditions of the operating system (for example, a write is triggered when memory is overcommitted).
Important lesson: the modified list is not only a buffer. Data written here is moved to read-only memory (standby) when the operation is complete. If you see 2GB of data in standby memory before the copy operation, 100MB in the modified list during the copy operation and 2GB of standby data after the copy, it does not mean that standby memory contains the same data as before the copy operation. Instead of being copied, the pages are simply moved from one list to another.
Modified list while copying a large file
Copy operation is complete - notice that the pages have been moved to standby
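The copy behavior described above (pages land on the modified list, the lazy writer flushes them and moves them to standby) can be sketched as a toy model. This is a deliberately simplified illustration: the class, its fixed standby capacity and the oldest-first replacement are my assumptions, not how the Windows memory manager is actually implemented.

```python
from collections import deque


class PageLists:
    """Toy model of a bounded standby list plus a modified list."""

    def __init__(self, standby_capacity):
        self.standby_capacity = standby_capacity
        self.standby = deque()   # oldest page on the left
        self.modified = deque()

    def read(self, page):
        """A read from disk leaves the page on the standby list."""
        self._add_to_standby(page)

    def write(self, page):
        """A write parks the dirty page on the modified list."""
        self.modified.append(page)

    def lazy_write(self, max_pages):
        """Flush up to max_pages; flushed pages are moved, not copied."""
        for _ in range(min(max_pages, len(self.modified))):
            self._add_to_standby(self.modified.popleft())

    def _add_to_standby(self, page):
        if page in self.standby:
            self.standby.remove(page)      # refresh its position
        elif len(self.standby) >= self.standby_capacity:
            self.standby.popleft()         # repurpose the oldest page
        self.standby.append(page)


lists = PageLists(standby_capacity=4)
for page in ["old1", "old2", "old3", "old4"]:
    lists.read(page)
before = list(lists.standby)

# Copy two new pages; the lazy writer flushes one page per step
for page in ["new1", "new2"]:
    lists.write(page)
    lists.lazy_write(max_pages=1)
after = list(lists.standby)

print(len(before) == len(after))  # True: standby is the same size
print(after)                      # ['old3', 'old4', 'new1', 'new2']
```

The standby list is the same size before and after the copy, but its content is different, which is exactly why the counters alone can be misleading.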
Now, what many people have difficulty understanding is that technically you are not caching complete files. The cache manager uses the virtual block principle, which allows it to cache only certain parts of a file. As a developer, you can influence this behavior (a typical example: when copying large files, read-ahead is used and the data is read sequentially based on the recognized pattern). Each block is identified by its address and offset; in the end (especially with a random-access solution such as Provisioning Services), your file map may look more like an Emmental cheese (or Swiss cheese for all you Americans!). This is also how SuperFetch works: it does not need to understand that after you click "I agree", you click "Next." It just knows that when the virtual block at address "A" with offset "B" is requested, the next request will be for address "C" with offset "D".
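Here is a minimal sketch of block-level caching that shows where the "Swiss cheese" picture comes from. The class name, the 256 KB block size and the file name are assumptions chosen for illustration only.

```python
BLOCK_SIZE = 256 * 1024  # assumed block size, for illustration only


class BlockCache:
    """Tracks which blocks of which file are cached, not whole files."""

    def __init__(self, block_size=BLOCK_SIZE):
        self.block_size = block_size
        self.cached = set()   # entries are (file name, block index)
        self.hits = 0
        self.misses = 0

    def read(self, name, offset, length):
        """Touch every block covered by the byte range."""
        if length <= 0:
            return
        first = offset // self.block_size
        last = (offset + length - 1) // self.block_size
        for index in range(first, last + 1):
            if (name, index) in self.cached:
                self.hits += 1
            else:
                self.misses += 1              # would go to disk
                self.cached.add((name, index))

    def file_map(self, name, file_size):
        """'#' for a cached block, '.' for a hole."""
        total = -(-file_size // self.block_size)  # ceiling division
        return "".join("#" if (name, i) in self.cached else "."
                       for i in range(total))


cache = BlockCache()
vhd = "win7.vhd"                      # hypothetical file name
size = 8 * BLOCK_SIZE                 # a tiny 2 MB "vDisk"

cache.read(vhd, 0, 4096)                     # block 0
cache.read(vhd, 5 * BLOCK_SIZE + 100, 4096)  # block 5
cache.read(vhd, 2 * BLOCK_SIZE - 10, 20)     # spans blocks 1 and 2
cache.read(vhd, 0, 4096)                     # block 0 again: a hit

print(cache.file_map(vhd, size))  # ###..#..
print(cache.hits, cache.misses)   # 1 4
```

Only the blocks that were actually requested occupy memory, which is why you do not have to plan for the full size of every VHD file.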
The best way to think about the standby and modified page lists is that Windows uses them to store data that might be needed later. There is no harm in keeping it around a little longer, just in case we need it.
Now this is again a lie-to-children. The story is not over yet. When you start explaining the cache manager to someone, one of the obvious questions is whether your valuable data (e.g. indexes or system libraries) could easily get pushed out by a large VHD file that you do not actually plan to use until next year. Internally, there are 8 different standby lists with different priorities (0-7; the higher the number, the higher the priority). For example, priority 7 is a static set generated by smart people at Microsoft, priority 6 holds the SuperFetch VIP residents, priority 5 is your standard priority, and the lower priorities are used by low-priority processes or read-ahead operations.
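To illustrate why a large low-priority copy does not push out standard-priority data, here is a toy model of the eight standby lists. It assumes that pages are repurposed from the lowest-priority non-empty list first, oldest page first; the class and page names are made up for this sketch.

```python
from collections import deque


class PrioritizedStandby:
    """Toy model of eight standby lists, priority 0 (low) to 7 (high)."""

    def __init__(self, capacity):
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self.capacity = capacity
        self.lists = [deque() for _ in range(8)]  # index = priority

    def __len__(self):
        return sum(len(pages) for pages in self.lists)

    def add(self, page, priority):
        if not 0 <= priority <= 7:
            raise ValueError("priority must be between 0 and 7")
        if len(self) >= self.capacity:
            self._repurpose()
        self.lists[priority].append(page)

    def _repurpose(self):
        """Drop the oldest page from the lowest-priority non-empty list."""
        for pages in self.lists:
            if pages:
                return pages.popleft()

    def pages(self, priority):
        return list(self.lists[priority])


standby = PrioritizedStandby(capacity=6)

# Blocks of a vDisk in active use, cached at the standard priority 5
for i in range(4):
    standby.add("vhd-%d" % i, priority=5)

# A file copy streams five pages through the priority 2 list
for i in range(5):
    standby.add("copy-%d" % i, priority=2)

print(standby.pages(5))  # ['vhd-0', 'vhd-1', 'vhd-2', 'vhd-3']
print(standby.pages(2))  # ['copy-3', 'copy-4']
```

In this model the copy only ever recycles its own low-priority pages; the priority 5 pages survive untouched until the lower lists are empty.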
Consequences for PVS
There are many areas where understanding the Windows cache manager can help you properly design a PVS environment: from the most obvious one (how to correctly size PVS memory), through regular maintenance tasks (how to copy vDisks between servers), to less obvious topics (should you delete a vDisk when you do not plan to use it anymore?).
vDisk replication between servers
The majority of Windows and 3rd party utilities use buffered I/O to perform read/write operations. The reason is quite simple: this functionality is provided by the Windows API, so it does not really matter whether you use Explorer, Robocopy or other tools. Some utilities (Xcopy, RichCopy and the latest Robocopy) support switches for unbuffered copy operations. But should you use buffered or unbuffered I/O with PVS?
What happens if you use the old active/passive vDisk approach? You've updated the .vhd file on your master server, and now you want to replicate it to the rest of the servers. Most of the time, buffered I/O is actually what you want, because unbuffered I/O is meant for cases where you do not plan to access the resource afterwards, which is not the case here. The copy operation goes through the priority 2 list, so copying large VHD files should not affect your running target devices (most VHD blocks will be cached with priority 5). And since you plan to switch your target devices to this new version of the VHD file, the copy will actually pre-cache that VHD file for you.
It is also important to remember that it takes two to tango. When you copy a vDisk from PVS server A to PVS server B, both servers are affected by the operation. If the copy operation is unbuffered, server A does not use standby memory for the copy, while server B proceeds as usual: it stores the pages in the modified list and moves them to the standby list after the write has finished.
It is also important to stress again that this caching mechanism has nothing to do with Provisioning Services itself; it is simply operating system functionality that we benefit from. This means there is another side effect: as the target device is most likely running a Windows client, it also uses standby memory for caching. That's why you do not see constant read I/O from PVS to the target device; the target device simply uses its own standby cache to cache all the read operations from the PVS server. In an interconnected world of Windows devices, a file can easily be cached on multiple devices.
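The effect of the target device's own cache on the PVS server can be sketched with two small classes. This is a simplified illustration: the class names are made up, and the local cache is unbounded here, whereas a real standby cache is limited by the memory of the target device.

```python
class PvsServer:
    """Counts how many block requests actually reach the server."""

    def __init__(self):
        self.requests = 0

    def read(self, block):
        self.requests += 1
        return block


class TargetDevice:
    """Caches every block it has read, like the local standby list."""

    def __init__(self, server):
        self.server = server
        self.local_cache = set()   # assumption: never evicted

    def read_block(self, block):
        if block not in self.local_cache:
            self.server.read(block)        # only misses hit the network
            self.local_cache.add(block)


server = PvsServer()
device = TargetDevice(server)

# The same few blocks are requested again and again
for block in [0, 1, 2, 1, 0, 3, 2, 0]:
    device.read_block(block)

print(server.requests)  # 4 - one request per unique block
```

Eight reads on the target device turn into four requests on the server; the less memory the target device has for caching, the closer the server-side number gets to the total.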
When a file is deleted, all of its cached entries are automatically purged. So if you no longer plan to use these large files and you want to clear your cache, just delete them.
Once the cache of the target device is populated, the standby cache of the provisioning server is not as important as you might expect. You can use this behavior to your advantage when you use active/passive versioning with two different vDisks: once your devices are up and running normally, you can easily swap the vDisks even if your PVS server does not have enough memory to hold both vDisk files fully cached (of course, you want to carefully test this scenario for negative impact before using it in your production environment).
The file was deleted after the copy operation - notice that the memory was released
Pagefile sizing
Notice another pattern here: a PVS server is designed to have large amounts of memory "wasted" so that Windows can take it and use it for caching vDisks. If you have seen the blog post by Nicholas Rintalan (The Pagefile Done Right!, one of the articles with a "must read" tag), you already know that there is no magic formula to calculate the right pagefile size. What you can really say is that, as long as your commit charge does not increase, the more memory you have, the smaller the pagefile should be. A PVS server is one of the prime examples: you can hardly find a better case where a huge amount of memory can be paired with a very small pagefile.
Help - my server has 128GB of memory, but none is free
This is a very common concern, almost as common as people complaining about no free memory in Windows Vista. Based on what we have discussed here, Windows will try to keep the amount of free memory to a minimum. Forget about flushing the standby cache. That might have made sense back in the NT era, but operating systems these days know better than we do what to cache, and memory management is not a static process that can be tuned by magic configuration changes.
However, you might want to check what is stored in memory; for example, misconfigured scripts or antivirus software calculating hashes can also affect your standby cache. If you want to dig deeper, subscribe to the Citrix RSS feed and wait for the next post on this topic. We will leave the theory behind and dig deep into the Windows cache and the practical aspects for PVS servers.
Ehm, so what does that really mean?
To summarize:
- Most copy operations in Windows are cached (even local operations)
- Free memory is bad memory, and it is quite normal for your PVS server to have little free memory
- The cache manager caches blocks, not entire files, so you do not need to plan for caching entire VHD files in memory
- Buffered and unbuffered copy operations can affect your cache; however, the impact is much smaller (and may actually be beneficial) than most people think
- There are different priorities for cached data, so you need not fear that a copy operation will empty your cache completely
- Not only the memory design of the PVS server is important, but also the memory design of the target devices. If you do not leave memory for caching on the target devices, they will need to perform more read operations from the PVS server
In the next part of this blog, we will take a look at some tools that can help you determine the appropriate amount of RAM for PVS and give you some insight into how your cache is actually used. Stay tuned.
Last but not least, I would like to thank Dan Allen and Nicholas Rintalan for their great feedback; it helped me a lot while I was preparing this blog.
UPDATE: You can find the second part of this article here
Martin Zugec