Subject:             Windows NT Memory Architecture Overview
Note:                46053.1
Type:                REFERENCE
Last Revision Date:  17-APR-2001
Status:              PUBLISHED

1) Purpose
==========

This article is intended to help customers understand how the Windows NT
memory architecture works, which in turn should help them understand how
the Oracle database interacts with it when read in combination with
Note 46001.1. It is not intended to be a definitive guide to the Windows
NT memory architecture; please refer to Intel's and Microsoft's own
documentation for that.

This note is only relevant to Windows NT 4.0. Windows 2000 includes many
new features that are not addressed here.

2) The Windows NT 32 Bit Memory Model
=====================================

a) Standard Windows NT Memory Model
-----------------------------------

Windows NT 4.0 has a virtual-memory system that combines physical memory,
the file system cache, and disk into a flexible information storage and
retrieval system. Each process running on Windows NT has a flat, linear
32-bit memory address space. This means that each process can "see" 32 bits
of address space, or 4 Gigabytes (GB) of virtual memory. The upper half
(0x80000000 through 0xFFFFFFFF) of the virtual memory is reserved for
system code and data that is visible to the process only when it is running
in privileged mode. The lower half (0x00000000 through 0x7FFFFFFF) is
available to the process when it is running in user mode and to user-mode
system services called by the program.

Windows NT versions prior to 3.51 included some 16-bit data structures that
limited processes to 256MB (64K pages) of virtual memory. These have been
converted to 32-bit data structures in Windows NT 4.0, so 2GB of virtual
memory is now available to every process.

b) Breaking through the Intel Windows NT 2GB limits
---------------------------------------------------

With the cost of physical memory continuing to drop and more I/O intensive
applications such as database management systems becoming available, the
2GB limit imposed on processes has become a constraining factor.
To address this issue Microsoft introduced the 4GT RAM Tuning feature of
Windows NT Server, Enterprise Edition version 4.0 (Intel only).

The 4GT tuning feature increases the 2GB user-mode partition of the virtual
address space to 3GB (0x00000000 through 0xBFFFFFFF) by reducing the kernel
mode partition to 1GB (0xC0000000 through 0xFFFFFFFF). One of the main
benefits of this change is that it is transparent to applications.

This has been achieved by moving the guard page, which protects the
boundary between the User Address Space and the System Address Space, from
0x7FFFEFFF to 0xBFFFEFFF. To enable this feature the /3GB switch must be
added to the boot.ini startup line, for example :

  multi(0)disk(0)rdisk(0)partition(2)\WINNT="Windows NT 4.0 Enterprise" /3GB

For applications to take advantage of this feature they need to be linked
with the /LARGEADDRESSAWARE switch, which sets a bit in the executable's
image header (IMAGE_FILE_LARGE_ADDRESS_AWARE). For executables that were
not linked with this switch, the bit can be set by running the imagecfg
tool against the executable, for example :

  imagecfg -l oracle80.exe

This tool is available on the Windows NT 4.0 Enterprise Edition CD 2 under
the \support\DEBUG\i386\ directory.

c) Breaking through the Intel Windows NT 4GB limits
---------------------------------------------------

The introduction of Intel servers that can support more than 4GB of main
memory has presented a challenge to Windows NT, because NT is only capable
of using up to 4GB in total. This is especially important as a growing
number of enterprise class applications are capable of deriving benefit
from this extra memory.

Intel has introduced servers based on the Pentium II/III Xeon processor
with support for the Intel Extended Server Memory (ESM) Architecture, which
breaks through the 4GB (32-bit) memory barrier.
ESM includes 36-bit memory addressing technologies which are capable of
addressing 64GB of main memory, using the Page Size Extension 36-bit
(PSE36) driver, which must be obtained from Intel. The current PSE36 driver
is limited to 8GB.

The Intel PSE36 driver is a standard RAM disk device (based on the Windows
NT DDK RAM disk driver) that lacks a file system and is backed by main
memory that is unused by the operating system. The PSE36 driver functions
like a raw disk with much lower latency and allows 4MB pages to exist at
addresses anywhere in the 36-bit address space. Applications must be
rewritten to make use of this feature.

Only one process may open / access the PSE36 driver at a time, and this
process gets exclusive access to all of the additional memory. The RAM disk
is not shared between processes, it is never mapped into the address space
of a process, and it is not backed by the Windows NT page file.
Applications that use this device driver access it via the same Win32 API
function calls used to access standard raw disk partitions :

  - CreateFile      : obtains a file handle to the PSE36 device driver and
                      specifies access modes
  - DeviceIoControl : obtains the size of the PSE36 driver device and
                      provides optimised READ and WRITE device controls

Systems with less than 4GB of memory can still utilize the PSE36 driver as
long as the /MAXMEM switch is added to the Windows NT boot.ini file. For
example, on a system with 4GB of memory and a Xeon processor MAXMEM could
be set to 2048 MB :

  multi(0)disk(0)rdisk(0)partition(2)\WINNT="Windows NT 4.0 EE" /MAXMEM:2048

Under such a configuration, assuming 256MB of address space at the top of
memory has been reserved for I/O devices, Windows NT would control a 2GB
chunk of memory and 1.75GB would be controlled by the PSE36 driver.
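
The arithmetic behind the 4GB example above can be sketched as follows. This
is only an illustration under the assumptions stated in the text (a fixed
I/O reservation and a simple subtraction); the function and parameter names
are hypothetical and are not part of any Intel or Microsoft tool.

```python
def pse36_memory_mb(physical_mb, maxmem_mb, io_reserved_mb=0):
    """Memory (in MB) left over for the PSE36 driver.

    Assumes, as in the text, that Windows NT controls exactly the amount
    given by /MAXMEM and that a fixed amount is reserved for I/O devices.
    """
    usable_mb = physical_mb - io_reserved_mb
    if maxmem_mb > usable_mb:
        raise ValueError("MAXMEM exceeds the usable physical memory")
    return usable_mb - maxmem_mb


# 4GB machine, 256MB reserved for I/O devices, /MAXMEM:2048
left_mb = pse36_memory_mb(physical_mb=4096, maxmem_mb=2048, io_reserved_mb=256)
print(left_mb, "MB =", left_mb / 1024, "GB")  # 1792 MB = 1.75 GB
```
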
For systems with more than 4GB of physical memory the MAXMEM parameter can
be used to maximize the amount of memory used by the PSE36 driver, which
is useful in systems where processes have only modest kernel memory
requirements. For example, on a machine with 5GB of physical memory, MAXMEM
could be set to 3GB (3072) to increase the memory available to the PSE36
driver from 1GB to 2GB. It is often unnecessary to set MAXMEM on such
systems, however, because Windows NT is unable to access memory beyond 4GB.

2) NT's Virtual Memory Manager (VMM)
====================================

a) The virtual address space and address translation
----------------------------------------------------

As is the case with other Virtual Memory Managers, the Windows NT VMM is
responsible for creating the illusion that every process has exclusive
access to 32 bits (4GB) of physical memory, the reality being that all
processes share the same physical memory (up to a maximum of 4GB). The
32 bits of address space are known as virtual memory because they do not
directly correspond to physical memory; it is the VMM's responsibility to
translate back and forth between virtual and physical memory.

Both physical and virtual memory are divided up into blocks known as Memory
Management Units (MMU) that the VMM performs memory address translation
upon. A computer's physical MMU is known as a page frame, which the
processor numbers consecutively with page frame numbers up to the maximum
physical memory available, whereas the virtual MMU is known as a page. The
size of a page varies with the processor platform: Intel platforms have
4096 bytes per page; Alpha platforms have 8192 bytes per page.

The Windows NT VMM uses a three step address resolution mechanism, where
the virtual address is split into three parts :

  - Page Directory Entry/Offset (PDE) : bits 22 to 31
  - Page Table Entry/Offset (PTE)     : bits 12 to 21
  - Page Offset                       : bits 0 to 11

Each process has its own private page directory and a special hardware
register is used to point to its address. When the scheduler switches
between processes NT copies the new process's pointer into the register.
The MMU translation mechanism uses the PDE offset from the virtual address
to retrieve from the page directory the page frame number of the page
table. It then uses the PTE offset from the virtual address to retrieve the
page frame number of the code or data page required :

   31             22 21             12 11              0
  +-----------------+-----------------+-----------------+
  |    Directory    |   Page Table    |      Page       |  Virtual
  |     Offset      |     Offset      |     Offset      |  Address
  +-----------------+-----------------+-----------------+
           |                 |                 |
           v                 v                 v
    Page Directory      Page Table        Page Frame
     (per process)
     +-----------+     +-----------+     +-----------+
     |           |     |           |     |           |
     +-----------+     +-----------+     +-----------+
     |  PT Addr  |---->|  PF Addr  |---->| Code/Data |
     +-----------+     +-----------+     +-----------+
     |     .     |     |     .     |     |     .     |
     |     .     |     |     .     |     |     .     |
     +-----------+     +-----------+     +-----------+

When a page frame is shared between two processes, the VMM inserts a level
of indirection into its page tables by using a prototype page table entry
(prototype PTE) data structure. This ensures only the prototype PTE needs
to be updated when a page frame is paged in, rather than the PTE of each
process.

The Windows NT VMM uses a three step approach to save memory because it
assumes most processes have the majority of their 4GB address space
unallocated.
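
The three-part split of a 32-bit virtual address described above can be
sketched as follows. This is a minimal illustration for the Intel 4096-byte
page size only; the dictionary-based page directory is a hypothetical
stand-in for the real hardware structures, and the function names are
assumptions made for this example.

```python
PAGE_SIZE = 4096  # bytes per page on Intel platforms, per the text


def split_virtual_address(va):
    """Split a 32-bit virtual address into (PDE index, PTE index, offset)."""
    if not 0 <= va <= 0xFFFFFFFF:
        raise ValueError("not a 32-bit address")
    pde_index = (va >> 22) & 0x3FF  # bits 22 to 31
    pte_index = (va >> 12) & 0x3FF  # bits 12 to 21
    offset = va & 0xFFF             # bits 0 to 11
    return pde_index, pte_index, offset


def translate(va, page_directory):
    """Walk a toy page directory: {pde_index: {pte_index: page_frame_number}}.

    A KeyError stands in for an unmapped region (no page table or no PTE).
    """
    pde_index, pte_index, offset = split_virtual_address(va)
    page_table = page_directory[pde_index]
    page_frame_number = page_table[pte_index]
    return page_frame_number * PAGE_SIZE + offset


# One page table holding one mapped page: virtual page (1, 3) -> frame 0x1234
toy_directory = {1: {3: 0x1234}}
print(split_virtual_address(0x00403A10))          # (1, 3, 2576)
print(hex(translate(0x00403A10, toy_directory)))  # 0x1234a10
```

With 10 bits each for the directory and table indexes and 12 bits of
offset, the split covers 1024 x 1024 x 4096 bytes, which is the full 4GB
address space.
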
It fully defines the Page Directory, but Page Table Pages are defined only
as and when needed, whereas a two step translation would need to fully
maintain the PTEs, which would require one million entries each using a
four byte pointer = 4MB per process. A three step translation would cause
poor performance without features such as Translation Lookaside Buffers,
where the processor provides an array of associative memory which holds a
direct virtual to physical page mapping for the most frequently used pages.

b) Paging
---------

When the number of available page frames runs low, the VMM selects page
frames to free and copies them to disk; this process is known as paging.
Paging is essential to a virtual memory system where multiple processes
compete for the same physical memory, although excessive paging can
monopolize processors and disks. The Windows NT VMM architecture includes
sophisticated strategies for anticipating the code and data requirements
of competing processes to minimize disk access through paging.

A page fault occurs when a program requests a page of code or data that is
not in its working set (the set of pages visible to the program in physical
memory).

  - A hard page fault occurs when the requested page must be retrieved
    from disk.
  - A soft page fault occurs when the requested page is found elsewhere
    in physical memory.

Soft page faults can be satisfied quickly and relatively easily by the
Virtual Memory Manager, but hard faults cause paging from disk, which can
degrade performance. There are many causes of soft page faults, including
accessing new PDE/PTE entries and re-accessing pages that were removed
from a working set but are still unmodified.

Pages that are written to disk are written to the Windows NT Page File
(pagefile.sys). The paging file can be split across multiple devices (up
to 15 secondary files are allowed) but only one file per device.
The total size of the paging file plus physical memory limits the amount of
data that can be stored in memory by all processes. It is usually
recommended that the page file is at least twice as large as the physical
memory to accommodate a mix of active and inactive processes, but the
actual size will be dependent upon the required total number of concurrent
committed pages in the system.

c) Working Set Management
-------------------------

Processes have a certain number of pages that reside in physical memory,
known as the process's working set, and they may have other pages that are
stored in the pagefile. Three types of working set exist :

  - system  : there is one of these and it belongs to the Windows NT kernel
  - session : used per logged on session in Windows Terminal Server
  - process : per user process

Every process has a maximum and minimum working set defined for it. During
the lifetime of the process the working set will vary between these values,
and because the minimum is by default non zero the whole process will never
be completely swapped out. If a process requests more pages than its
defined maximum, the VMM will remove one of the process's pages using a
FIFO algorithm (oldest first), causing a page fault for each new page. When
a page is removed from a process's working set it remains in physical
memory for a period of time and can be brought back into the working set
if required, avoiding a hard fault. The page frame goes onto either the
modified list (the process wrote to the page and its contents are not yet
on disk) or the standby list (available for reuse).

When physical memory runs low, the VMM uses a technique known as automatic
working-set trimming to increase the amount of memory available to the
system. It examines each process, comparing its current working set with
its defined minimum and the level of page faulting it incurs, and removes
pages from the working set, making them available to other processes.
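
The FIFO replacement and the soft / hard fault distinction described in the
working set discussion above can be sketched with a toy simulation. This is
a deliberately simplified model, not the VMM's actual algorithm: it assumes
a single process, treats every first touch of a page as a hard fault,
ignores the modified list, and uses a hypothetical fixed-size standby list.

```python
from collections import deque


def simulate_faults(accesses, max_working_set, standby_capacity):
    """Return (soft_faults, hard_faults) for a sequence of page numbers."""
    working_set = deque()  # oldest page on the left (FIFO order)
    standby = deque()      # removed from the working set, still in memory
    soft = hard = 0
    for page in accesses:
        if page in working_set:
            continue                    # no fault at all
        if page in standby:
            standby.remove(page)        # contents still in memory
            soft += 1
        else:
            hard += 1                   # would need a read from disk
        if len(working_set) >= max_working_set:
            evicted = working_set.popleft()   # oldest page first
            standby.append(evicted)
            if len(standby) > standby_capacity:
                standby.popleft()       # frame reused, contents gone
        working_set.append(page)
    return soft, hard


# Page 1 is evicted, then re-accessed while still on the standby list.
print(simulate_faults([1, 2, 3, 1, 4, 1], max_working_set=2,
                      standby_capacity=1))  # (1, 4)
```
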
Where the process has not itself released the memory, the page frame goes
on to the modified or standby list, which means the page frame contents are
still in memory. These lists use FIFO, so a system with lots of free
physical memory will not immediately reuse a page that just went to the
standby list. If a process needs to re-access the page, the VMM will
revalidate that page using the existing image still in memory, causing a
soft page fault. A disk read (hard page fault) only occurs when a process
re-accesses a page that is no longer on the modified or standby lists,
i.e. no longer in physical memory.

d) Sharing Memory / Executable Images
-------------------------------------

Sharing memory is an important feature of any VMM, and one of the
mechanisms that Windows NT uses is called memory mapped files, which allow
normal files to back physical memory rather than the pagefile. Memory
mapped files allow processes to map files to their virtual address space by
creating a section object and mapping a view of all or part of the file;
this returns the starting address of the mapped view.

Windows NT uses memory-mapped files to load and execute EXE/DLL files,
which greatly reduces both pagefile space and the time required for an
application to begin executing. When subsequent instances of a process are
started NT simply opens another memory mapped view of the executable file's
image, which allows multiple instances of the same application to share the
same code and data in physical memory. To protect against one instance
altering the global variables of another, or a code page being changed by a
debugger setting breakpoints, NT uses the copy-on-write feature. When an
attempt is made to write to a memory mapped file the VMM catches the
attempt and allocates a new block of memory for the pages containing the
memory the application is trying to write to. The newly allocated page
frames will be backed by the page file.

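The copy-on-write behaviour of a mapped view described above can be
demonstrated in miniature with Python's standard mmap module. This is only
an analogy, assuming that mmap.ACCESS_COPY is an acceptable stand-in for
the mechanism NT applies to executable images; the temporary file and the
function name are hypothetical.

```python
import mmap
import os
import tempfile


def copy_on_write_demo():
    """Write through a copy-on-write view and show the file is untouched."""
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(b"shared image data")
        with open(path, "r+b") as f:
            # ACCESS_COPY: writes affect this view's private pages only
            view = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_COPY)
            view[0:6] = b"SHARED"
            private_copy = bytes(view[:])
            view.close()
        with open(path, "rb") as f:
            on_disk = f.read()
        return private_copy, on_disk
    finally:
        os.remove(path)


private_copy, on_disk = copy_on_write_demo()
print(private_copy)  # b'SHARED image data'
print(on_disk)       # b'shared image data'
```

The view sees its own modification, while the backing file (and so any
other view of it) keeps the original contents.
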
e) Caching Files
----------------

Windows NT Server is commonly used as a network file server. To provide
better response times to applications accessing common files across the
network, and to programs that are I/O intensive, NT implements a file
system cache. The size of the Windows NT file system cache is continually
adjusted by the VMM based upon the size of physical memory and the demand
for memory space.

The cache is designed to be self-tuning but can be influenced by selecting :

  - Control Panel -> Network -> Services -> Server

For systems that mainly act as a file server set optimisation to :

  - Maximize Throughput for File Sharing

For systems that run applications which are accessed via client / server
architectures and often perform their own file caching, such as database
servers, set optimisation to :

  - Maximize Throughput for Network Applications

3) Virtual Memory Terminology
=============================

All virtual memory in Windows NT is either reserved, committed, or
available. The following provides a description of these states :

a) Reserved Memory :
--------------------

When a process is created and given its address space, the bulk of this
usable address space is free, or unallocated. In order to use portions of
this address space the process must allocate regions within it; this is
known as reserving. Reserved regions consist of contiguous pages and are
rounded up to a whole multiple of the page size. This address space is
set aside by the VMM for the process but does not count against the
process's memory quota until it is used. When a process needs to write to
memory, some of the reserved memory is committed to the process. If the
process runs out of memory, available memory can be reserved and committed
simultaneously.

b) Committed Memory :
---------------------

To use a reserved region of address space a process must allocate and map
this storage to the reserved region.
This process is known as committing physical storage. Storage is always
committed in whole memory pages, but a process does not need to commit
storage to an entire region. The VMM "saves space" for the committed pages
in the pagefile.sys file in case they need to be written to disk. The
amount of committed memory for a process is an indication of how much
memory it is really using. Committed memory is limited by the size of the
paging file.

c) The Commit Limit :
---------------------

This is the amount of memory that can be committed without expanding the
paging file. If disk space is available, the paging file can be expanded
and the commit limit will be increased, as long as it does not exceed the
maximum page file size.

d) Available Memory :
---------------------

Memory that is neither reserved nor committed is available. Available
memory includes free memory, zeroed memory (which is cleared and filled
with zeros), and memory on the standby list, which has been removed from a
process's working set but might be reclaimed.

4) Threads and Memory allocations
=================================

Windows NT processes consist of one or more threads (commonly known as a
multi-threaded architecture), where a thread describes a path of execution
within the process. Threaded architectures increase the complexity of
memory management because memory accessed by one thread needs to be
protected from invalid access by other threads.

When a process is created, Windows NT creates a heap in the process's
address space; this heap is called the process's default heap. The default
heap is used by many of the Win32 API calls and C runtime calls such as
malloc / LocalAlloc, although processes can create additional named
heaps within the virtual address space if required. The default heap is
created as a 1MB region (reserved and committed); as allocations are made
to and released from the heap, the heap manager commits or decommits the
region.
Access to the heap is serialized using critical sections so that
multiple threads cannot simultaneously access the heap.

When a thread is created in a process, Windows NT reserves a region of
the address space for the thread's stack (each thread gets its own stack)
and also commits some of the reserved region. When a process is linked in
the standard manner the system reserves a 1MB region of the virtual address
space for the stack and commits two of the pages. When a thread allocates
a static or global variable, multiple threads can access this variable
at the same time, potentially corrupting the variable's contents. Local
and automatic variables are created on the thread's stack and are therefore
less likely to be corrupted. Allocations are made from the top of the
stack down, e.g. with a stack from 0x08000000 to 0x080FF000 allocations
will cause pages to be committed from 0x080FF000 down through to
0x08001000, and access to a page at 0x08001000 will cause a stack overflow
exception. The stack cannot grow any further, and any additional attempt to
access the stack will cause an access violation which could cause the
process to terminate without notice.

We have covered two of the three mechanisms for manipulating memory above:
heaps (which are best for managing large numbers of small objects) and
memory mapped files (which are best for managing large streams of data).
The final mechanism is direct virtual memory allocation. The Win32
functions for manipulating virtual memory allow you to directly reserve a
region of the address space and commit physical storage (from the page
file) to the region. The main APIs used to achieve this are VirtualAlloc /
VirtualFree, and they allow a contiguous region of the virtual address
space to be defined at either an explicit or implicit address, rounded to
a 64K boundary. They also allow the region to be reserved (MEM_RESERVE) or
reserved and committed (MEM_RESERVE | MEM_COMMIT), as well as setting the
region's access permissions, e.g. PAGE_READWRITE / PAGE_READONLY. This
gives the process a flexible and efficient means of managing limited memory
resources, particularly for large allocations.

A thread can terminate in one of two ways: by calling ExitThread or by
being terminated with TerminateThread. When a thread detaches under normal
conditions the ExitThread function will be called, the stack for the thread
will be destroyed and its memory will be released back to the virtual
address space of the process. If a thread is terminated using
TerminateThread the thread detach code will not be called and Windows NT
will not destroy the thread's stack. As a result the regions and stack for
the terminated thread will not be released back to the process's virtual
address space; this memory will only be released when the process that owns
the thread terminates.
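
The rounding applied to a directly reserved region, as described in the
virtual memory allocation discussion above, can be sketched as follows.
This is a rough model only: it assumes the requested base address is
rounded down to a 64K boundary and the end of the region is rounded up to
a whole 4096-byte page, and the function name is hypothetical. The exact
behaviour of VirtualAlloc should be checked against Microsoft's own
documentation.

```python
ALLOCATION_GRANULARITY = 64 * 1024  # 64K boundary, per the text
PAGE_SIZE = 4096                    # Intel page size, per the text


def reserved_region(requested_base, requested_size):
    """Return (base, size) of the region after rounding.

    Assumption: base rounds down to the 64K granularity and the end of
    the region rounds up to the next page boundary.
    """
    base = requested_base - (requested_base % ALLOCATION_GRANULARITY)
    end = requested_base + requested_size
    end = -(-end // PAGE_SIZE) * PAGE_SIZE  # ceiling to a page multiple
    return base, end - base


base, size = reserved_region(0x08012345, 10000)
print(hex(base), hex(size))  # 0x8010000 0x5000
```
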