Translated from Jonathan Levin’s site. Original link: newosxbook.com/articles/Me…

About

Memory pressure in OS X and iOS is a very important aspect of virtual memory management, and one I covered in my book [1]. The Jetsam/memorystatus mechanism I mentioned there has changed significantly over time, culminating in some very important system mechanisms and system calls recently introduced in Mavericks. I ran into these new additions while working on my Process Explorer for OS X and iOS, so I document them here. This article is intended as a supplement to chapter 12 of the book, but can also be read on its own.

Why should you care? (Target Audience)

Physical memory (RAM) is, alongside the CPU, the scarcest resource in the system, and the one most likely to be contended: apps compete for every available bit. An application’s memory use is directly related to its performance, often at someone else’s expense. iOS has no swap space to page memory out to, which makes memory even more precious. The purpose of this article is to make you think twice before calling malloc() or mmap() next time, and to shed light on low system memory, the most common cause of crashes on iOS.

Prerequisite: Virtual Memory in a nutshell

No matter how an application is written, it must run in a memory space. This space is where the application keeps its own code, data, and state. Isolating that space from other applications is highly beneficial, since it provides better security and stability. We call this space the application’s virtual memory, and it is one of the defining features of an application: all threads of an application share the same virtual memory space, and are thereby defined to be in the same process.

The term “virtual” in virtual memory implies that the memory space, while closely related to the process in question, does not correspond exactly to the actual memory in the system. This is manifested in several aspects:

  • Virtual memory space can exceed the actual amount of memory available – depending on the processor word size and operating system in question, virtual memory space can be as high as 4GB (32-bit) or 256TB (64-bit) [1]. This, especially in the latter case, can far exceed the existing amount of available memory.

  • Virtual memory does not actually exist: since physical memory cannot back such a huge address space, the system maps physical memory to virtual memory only when the application explicitly requests (that is, allocates) it. Thus, the image of a process’s virtual memory is very sparse, with “islands” of memory in a vast ocean of emptiness.

  • Even when allocated, virtual memory may still be virtual – calling malloc(3) does not mean the system will jump to find the corresponding amount of RAM to back your allocation. In most cases, programmers allocate far more than they need. Thus, malloc(3) only allocates page table entries and very little actual memory. Physical allocation happens only when the memory is accessed (say, by memset(3)).

  • The system can back memory up to disk or to the network – also known as “swapping” memory out to a backing store. OS X typically uses swap files (in /var/vm). iOS has no swap mechanism.

  • The virtual memory you use may or may not be shared – the operating system reserves the right to implicitly share your virtual memory with other processes. This applies to memory that is backed by files (that is, established by calling mmap(2)). If your process mmap(2)s the same file as another process, the operating system can give each process a private virtual copy that is in fact backed by the same physical copy. That physical copy is marked non-writable. As long as every process only reads from the memory, one copy is sufficient. If any process writes to such implicitly shared memory, however, the writing process triggers a page fault, which causes the kernel to perform copy-on-write (COW), producing a new physical copy that can be modified.
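The copy-on-write behavior in the last point can be observed from user space. Here is a minimal sketch in Python, assuming a Unix-like system where mmap's ACCESS_COPY mode creates a private file mapping. The function name cow_demo and the temporary file are made up for illustration; nothing here is specific to XNU.

```python
import mmap
import os
import tempfile


def cow_demo():
    """Write to a private (copy-on-write) file mapping.

    Returns (bytes seen through the mapping, bytes actually in the file).
    """
    # Create a small scratch file to map (hypothetical test data).
    with tempfile.NamedTemporaryFile(delete=False) as f:
        f.write(b"original")
        path = f.name
    try:
        with open(path, "r+b") as f:
            # ACCESS_COPY: writes affect this process's memory only,
            # never the underlying file.
            with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_COPY) as m:
                m[0:8] = b"modified"      # the write goes to a private copy
                in_memory = bytes(m[:8])
        with open(path, "rb") as f:
            on_disk = f.read()            # the file itself is untouched
    finally:
        os.unlink(path)
    return in_memory, on_disk
```

Calling cow_demo() should show the mapping holding the new bytes while the file keeps the old ones, which is the visible effect of the kernel handing the writer its own physical copy.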

Summing up the above, we get the following “formula”:

VSS = RSS + LSS + SwSS

where:

  • VSS – virtual size, as displayed by top(1), ps(1), etc.
  • RSS – resident size: memory that is actually in RAM
  • LSS – “lazy” size: memory that the system has agreed to allocate but has not yet allocated
  • SwSS – “swap” size: memory that was previously in RAM but has been pushed out to swap. On iOS, it is always 0
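As a worked example of the formula, here is a small sketch in Python. The function names and the numbers used with them are invented for illustration, and any unit works as long as it is used consistently.

```python
def virtual_size(rss, lss, swss=0):
    """VSS = RSS + LSS + SwSS (all values in the same unit, e.g. MB)."""
    return rss + lss + swss


def lazy_size(vss, rss, swss=0):
    """Rearranged formula: the part of VSS that is promised but not yet backed."""
    return vss - rss - swss


# Hypothetical desktop process: 1024 MB virtual, 200 MB resident, 50 MB swapped.
desktop_lazy = lazy_size(vss=1024, rss=200, swss=50)

# Hypothetical iOS process: same virtual and resident sizes, but SwSS is always 0.
ios_lazy = lazy_size(vss=1024, rss=200)
```

With these made-up figures, the desktop process has 774 MB of lazy memory and the iOS process 824 MB: without swap, everything that is not resident is memory the system has promised but not yet had to deliver.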

All of the above can be clearly illustrated with a simple example: running vmmap(1) on any process, here the shell itself:



Terminology

In this article, the following terms are used:

  • Page – The basic unit of memory management. It is typically 4K (4096 bytes) on Intel and ARM, and 16K on ARM64. You can use the pagesize(1) command on OS X (or sysctl hw.pagesize on either operating system) to determine the default page size. The Intel architecture also supports super pages (8K) and giant pages (2MB), but in practice these are relatively rare.

  • Physical Memory/RAM – The finite amount of memory installed on the host (Mac or i-Device). You can get this value using the hostinfo(1) command.

  • Virtual Memory – Memory allocated by a program or by the system itself, usually through calls to malloc(3), mmap(2), or higher-level calls (such as Objective-C’s [alloc], etc.). Virtual memory can be private (owned by a single process) or shared (owned by two or more processes). Shared memory can be shared explicitly or implicitly.

  • Page Fault – Raised when the Memory Management Unit (MMU) detects an invalid access to virtual memory. One of the following has occurred:

    • Access to unallocated memory: dereferencing a pointer to memory that was never allocated. XNU converts this to an EXC_BAD_ACCESS exception, and the process receives a segmentation fault (SIGSEGV, signal #11).
    • Access to allocated but uncommitted memory: dereferencing a pointer to memory that was allocated but not yet used (or was madvise(2)d accordingly). XNU intercepts the fault, realizes it can defer no longer, and allocates a physical page. The faulting thread is frozen while the page is allocated.
    • Access to memory in violation of its permissions: memory pages are protected as r/w/x, much like standard UNIX file permissions. Attempting to write to a read-only page (r-- or r-x) results in a page fault, which XNU either converts to a bus error (SIGBUS, signal #10) or handles by forcing a copy-on-write (COW) operation (if the page is implicitly shared).
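Since the page is the unit in which all of the above is accounted, it helps to be able to convert byte counts into pages. The following is a small sketch in Python; pages_needed is a hypothetical helper, and mmap.PAGESIZE simply reports the page size of whatever machine runs the code.

```python
import mmap


def pages_needed(nbytes, page_size=mmap.PAGESIZE):
    """Number of whole pages required to back nbytes (rounded up)."""
    if nbytes < 0:
        raise ValueError("nbytes must be non-negative")
    # Ceiling division without floating point.
    return -(-nbytes // page_size)


# An 8MB chunk on a 4K-page system versus a 16K-page system.
chunk = 8 * 1024 * 1024
pages_4k = pages_needed(chunk, 4096)
pages_16k = pages_needed(chunk, 16384)
```

Note that even a one-byte allocation costs a full page once it is touched, which is why the same chunk maps to fewer, but larger, pages on ARM64.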

Tools

Apple provides several important tools for inspecting virtual memory:

  • vmmap(1) – Inspects the virtual memory of a single process, laying out its “map” in a manner similar to Linux’s /proc/<pid>/maps.
  • vm_stat(1) – Provides statistics on virtual memory from a system-wide perspective. This is really just a wrapper that calls the Mach host_statistics64 API and prints the vm_statistics64_t (from <mach/vm_statistics.h>).
  • top(1) – Provides performance-related statistics, both system-wide and per process. Of these, the MemRegions, PhysMem, and VM statistics refer to virtual memory. I shamelessly plug my own tool here, Process Explorer (procexp), which offers (IMHO) better functionality (including richer memory statistics) than top(1).

Memory Pressure

There are two counters inside the Mach layer that define memory pressure:

  • vm_page_free_count: how many pages of RAM are currently free
  • vm_page_free_target: how many pages of RAM should be free, at a minimum

You can easily view these using sysctl:

If the number of free pages falls below the target, there is memory pressure (there are other potential cases of course, but I have omitted them for simplicity [2]). You can also use sysctl(8) to query vm.memory_pressure. On OS X 10.9 and later, you can also query kern.memorystatus_vm_pressure_level, whose value is 1 (NORMAL), 2 (WARN), or 4 (CRITICAL).
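The free-versus-target comparison can be sketched in a few lines of Python. This is only an illustration: it parses text in the "name: value" form that sysctl prints, the sample values are invented, and the names vm.page_free_count and vm.page_free_target are assumed here to be the sysctl spellings of the two counters above.

```python
# Values quoted above for kern.memorystatus_vm_pressure_level.
PRESSURE_LEVELS = {1: "NORMAL", 2: "WARN", 4: "CRITICAL"}

# Made-up sample of sysctl output, for illustration only.
SAMPLE = """vm.page_free_target: 4000
vm.page_free_count: 2500
kern.memorystatus_vm_pressure_level: 2"""


def parse_sysctl(text):
    """Turn 'name: value' lines into a dict of integers, skipping anything else."""
    values = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        value = value.strip()
        if sep and value.isdigit():
            values[name.strip()] = int(value)
    return values


def under_pressure(values):
    """The simplified condition from the text: fewer free pages than the target."""
    return values["vm.page_free_count"] < values["vm.page_free_target"]


values = parse_sysctl(SAMPLE)
level = PRESSURE_LEVELS[values["kern.memorystatus_vm_pressure_level"]]
```

On a real machine, the text would come from running sysctl rather than from SAMPLE; as noted above, the kernel's actual conditions are more involved than this single comparison.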

After kernel initialization, the main thread becomes vm_pageout and spawns a dedicated thread, aptly named vm_pressure_thread, to monitor for pressure events. The thread sits idle (blocking itself) until pressure is detected, at which point it is woken up by vm_pageout. This behavior was modified in XNU 2422/3 (OS X 10.9/iOS 7), most notably by being wrapped in vm_pressure_response.

Pressure handling is compiled into XNU only if VM_PRESSURE_EVENTS is #defined. If it is not (for example, in a custom build), vm_pressure_thread does nothing in the 2050 version and is not even started in 2422/3. Also, iOS kernels define CONFIG_JETSAM, which changes certain behaviors so that memory handling is dispatched to the memorystatus thread more frequently and its counters are updated (more on that later).

[mach]_vm_pressure_monitor

XNU exports an undocumented system call, #296, vm_pressure_monitor (bsd/vm/vm_unix.c), which is a wrapper over mach_vm_pressure_monitor (osfmk/vm/vm_pageout.c). The system call (and the corresponding internal Mach call) is defined as follows:

int vm_pressure_monitor(int wait_for_pressure, int nsecs_monitored, uint32_t *pages_reclaimed);

This call either returns immediately or blocks (if wait_for_pressure is non-zero). It reports, in pages_reclaimed, how many physical pages were freed during the nsecs_monitored interval (it does not actually loop for that many nsecs). The return value indicates how many pages are wanted (vm.page_free_wanted in the sysctl(8) output above). Invoking the system call is simple and does not require root privileges. (Again, you can also use sysctl(8) to query vm.memory_pressure, although that does not wait for memory pressure).

You can try this system call by running Process Explorer with the “vmmon” argument (otherwise, in interactive mode, Process Explorer does this in a separate thread in order to display pressure warnings). Specifying the additional argument “oneshot” makes the call without waiting for pressure. Otherwise, the call waits until pressure is detected:

But how does the system actually reclaim memory? For that, memorystatus needs to get involved.

MemoryStatus and Jetsam

When XNU was ported to iOS, Apple ran into a major challenge posed by the limitations of mobile devices: no swap space. On the desktop, virtual memory can “overflow” into external storage, but that is not an option here (mainly due to the limitations of flash storage). As a result, memory became an even more important (and scarcer) resource.

Enter memorystatus. This mechanism, originally introduced in iOS, is a kernel thread responsible for handling low-RAM events in the only way iOS considers possible: jettisoning (ejecting) as much as it can in order to free up memory for applications, even when that means killing applications. This is the Jetsam mechanism of iOS, and you can see it under the #if CONFIG_JETSAM compilation option in the XNU source code. On OS X, memorystatus does not kill outright; instead it handles processes that are marked for idle exit, which is a milder approach and more appropriate for desktop environments [3]. Using dmesg and grep, you can see memorystatus at work:

The memorystatus thread is a separate thread (that is, not directly related to vm_pressure_thread) that is started in the BSD portion of XNU (by the call to memorystatus_init in bsd/kern/bsd_init.c). If CONFIG_JETSAM is defined (iOS), memorystatus starts another thread, memorystatus_jetsam_thread, which spends most of its time in a blocking loop: it is woken when memorystatus_available_pages <= memorystatus_available_pages_critical, kills the process at the top of the memory list, and then blocks again.

On iOS, memorystatus/Jetsam does not print messages, but it certainly leaves traces of the processes it kills in /Library/Logs/CrashReporter/LowMemory-YYYY-MM-DD-hhmmss.plist. These logs are generated by CrashReporter, much like crash logs containing a dump. If you have a jailbroken device, a simple way to force Jetsam into action on a large scale is to write a small binary that keeps allocating and memset()ing 8MB chunks of memory (this is left as an exercise for the interested reader), and then run it. You will see applications die until the offending binary is (finally) killed. The log will look like this:



(Note that you can also do this on a non-jailbroken device: if it is provisioned for development, you can create a simple iOS application in Objective-C that performs the same allocations, then collect the logs via Xcode’s Organizer).

It should be noted that Jetsam’s ruthless killing of processes is not unique: Linux (and its derivative, Android) has a similar mechanism in the “OOM” (out-of-memory) killer, which maintains a (possibly adjustable) score for each process and kills high-scoring processes when the system runs out of memory. On desktop Linux, the OOM killer wakes up when the system runs out of swap space. On Android, it wakes up a little earlier, when RAM runs low. Android’s approach is score-driven (the score is really a heuristic that depends on how much RAM is used and how often), while iOS’s approach is priority-based.

Starting with XNU 2423, Jetsam uses “priority bands” (see the JETSAM_PRIORITY_* constants in <sys/kern_memorystatus.h>). That is, the processes tracked by Jetsam are maintained in kernel space in an array of 21 linked lists (memstat_bucket). Jetsam selects the first process in the lowest priority band (starting with 0, or JETSAM_PRIORITY_IDLE). If the current band is empty, it moves on to the next one (see memorystatus_get_first_proc_locked in bsd/kern/kern_memorystatus.c). The default priority for a process is 18, which lets Jetsam pick idle and background processes ahead of interactive and potentially important ones. As shown below:

Jetsam also operates in another way: a “high water mark” (HWM) can be set on a process’s memory, and processes that exceed their HWM are killed outright. HWM mode is triggered when a task’s RSS exceeds the system-wide limit (more specifically, the task’s phys_footprint, which includes RSS as well as compressed and I/O Kit related memory). The HWM can be set with memorystatus_control operation #5 (MEMORYSTATUS_CMD_SET_JETSAM_HIGH_WATER_MARK, discussed later).
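Both selection modes can be modeled in user space. What follows is a toy simulation in Python, not XNU code: the class JetsamModel, the function relieve_pressure, and all PIDs and page counts are invented for illustration. Only the band count (21), the idle band (0), the default band (18), and the "available <= critical" wake-up condition are taken from the text.

```python
from collections import deque

NUM_BANDS = 21         # an array of 21 lists, as described above
PRIORITY_IDLE = 0
PRIORITY_DEFAULT = 18


class JetsamModel:
    """Toy model of priority bands plus per-process high water marks."""

    def __init__(self):
        self.buckets = [deque() for _ in range(NUM_BANDS)]
        self.limits = {}   # pid -> high water mark, in pages

    def add(self, pid, priority=PRIORITY_DEFAULT, hwm_pages=None):
        self.buckets[priority].append(pid)
        if hwm_pages is not None:
            self.limits[pid] = hwm_pages

    def next_victim(self):
        """First process of the lowest non-empty band, or None."""
        for bucket in self.buckets:
            if bucket:
                return bucket[0]
        return None

    def kill(self, pid):
        for bucket in self.buckets:
            if pid in bucket:
                bucket.remove(pid)
        self.limits.pop(pid, None)

    def over_high_water_mark(self, footprints):
        """PIDs whose footprint (in pages) exceeds their own limit."""
        return [pid for pid, limit in self.limits.items()
                if footprints.get(pid, 0) > limit]


def relieve_pressure(model, available, critical, freed_per_kill):
    """Kill victims, lowest band first, while available <= critical."""
    killed = []
    while available <= critical:
        pid = model.next_victim()
        if pid is None:
            break          # nothing left to kill
        model.kill(pid)
        killed.append(pid)
        available += freed_per_kill.get(pid, 0)
    return killed
```

In this model an idle process (band 0) is always chosen before a background one, and a process in the default band is only reached once the lower bands are empty. The high water mark check is independent of the bands: it singles out a process by its own footprint, regardless of priority.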

On iOS, launchd can set Jetsam priorities. Previously this was done in each daemon’s own property list (plist) file. These settings now appear to have moved to com.apple.jetsamproperties.model.plist (where model is, for example, N51 (5s), J71 (the Air), etc.). As follows:

<dict>
    <key>CachedDefaults</key>
    <!-- Array of dict entries, with key being daemon name e.g. -->
    <dict>
        <key>com.apple.usb.networking.addNetworkInterface</key>
        <dict>
            <key>JetsamMemoryLimit</key>
            <integer>6</integer>
            <key>JetsamPriority</key>
            <integer>3</integer>
            <key>WellBehaved</key>
            <true/>
        </dict>
        ..
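A property list like this can be read with Python's standard plistlib. The sketch below is only illustrative: SAMPLE_PLIST completes the truncated fragment above with an XML header, a plist wrapper, and closing tags that are assumptions, and jetsam_settings is a hypothetical helper name.

```python
import plistlib

# The fragment above, closed off so that it parses. The wrapper and the
# closing tags are assumptions; only the keys and values come from the text.
SAMPLE_PLIST = b"""<?xml version="1.0" encoding="UTF-8"?>
<plist version="1.0">
<dict>
    <key>CachedDefaults</key>
    <dict>
        <key>com.apple.usb.networking.addNetworkInterface</key>
        <dict>
            <key>JetsamMemoryLimit</key>
            <integer>6</integer>
            <key>JetsamPriority</key>
            <integer>3</integer>
            <key>WellBehaved</key>
            <true/>
        </dict>
    </dict>
</dict>
</plist>
"""


def jetsam_settings(plist_bytes):
    """Map each daemon name to its (JetsamPriority, JetsamMemoryLimit) pair."""
    root = plistlib.loads(plist_bytes)
    return {
        name: (props.get("JetsamPriority"), props.get("JetsamMemoryLimit"))
        for name, props in root.get("CachedDefaults", {}).items()
    }


settings = jetsam_settings(SAMPLE_PLIST)
```

For the sample, the single daemon maps to priority 3 and memory limit 6. A real device file may contain sections other than CachedDefaults, which this sketch ignores.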

Killing a process outright over its RAM consumption may seem overly harsh, but on a system with no swap there is very little else that can be done. Before Jetsam kills a process, memorystatus gives the process a chance to “redeem itself” and avoid an untimely termination: the memorystatus thread first sends a kernel note (knote, also known as a kevent) to the process it has selected as a “candidate” for termination. This knote (NOTE_VM_PRESSURE, <sys/event.h>) is picked up by the EVFILT_VM kevent() filter, and UIKit, for example, converts it into the didReceiveMemoryWarning notification, which is no doubt familiar to (and hated by) iOS app developers. Darwin’s libc and GCD have also added memory pressure handlers, as follows:

  • Darwin’s libc (<malloc/malloc.h>) defines malloc_zone_pressure_relief (starting with OS X 10.7/iOS 4.3)
  • libcache defines a cost for cached items (in cache_set_and_retain), which allows caches to be purged automatically when a pressure event occurs
  • GCD (<dispatch/source.h>) defines DISPATCH_SOURCE_TYPE_MEMORYPRESSURE (from OS X 10.9 onwards)
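The handlers listed above share one idea: when a pressure notification arrives, shed memory that can be rebuilt later. Here is a rough sketch of that idea in Python. It does not use any Darwin API: PressureAwareCache is a made-up class, the level constants simply reuse the 1/2/4 values quoted earlier for kern.memorystatus_vm_pressure_level, and the policy (halve the cost on a warning, drop everything when critical) is an assumption chosen for illustration.

```python
from collections import OrderedDict

NORMAL, WARN, CRITICAL = 1, 2, 4


class PressureAwareCache:
    """Cache whose entries carry a cost and can be shed under pressure."""

    def __init__(self):
        self._items = OrderedDict()   # key -> (value, cost), oldest first

    def set(self, key, value, cost):
        self._items.pop(key, None)    # re-inserting moves the key to the end
        self._items[key] = (value, cost)

    def get(self, key):
        entry = self._items.get(key)
        return entry[0] if entry else None

    def total_cost(self):
        return sum(cost for _, cost in self._items.values())

    def on_memory_pressure(self, level):
        """Hypothetical callback, to be invoked by whatever delivers the event."""
        if level == CRITICAL:
            self._items.clear()
        elif level == WARN:
            # Evict oldest entries until the cost is at most half of what it was.
            target = self.total_cost() // 2
            while self._items and self.total_cost() > target:
                self._items.popitem(last=False)
```

The handler only drops references and never walks the cached values themselves, which avoids touching (and faulting in) the very memory it is trying to give back.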

In general, applications that register for memory pressure notifications (directly through the Darwin APIs or indirectly through UIKit) should shrink their caches and release memory they may not need (note, however, that traversing memory structures can itself cause page faults, which can exacerbate memory pressure). UIKit is not open source, but jtool provides a nice disassembly that demonstrates UIApplication’s behavior when it encounters a memory warning:

Sometimes, however, freeing memory is not enough to relieve memory pressure. In many cases, the freed memory is soon taken up by another application that is unwilling to give it back. In these cases, the last resort is to kill the process at the top of the list of candidates, which is where Jetsam comes in.

Controlling memorystatus

A thread that can arbitrarily decide to kill processes can be somewhat dangerous. Apple therefore provides several APIs to “rein in” Jetsam/memorystatus. Naturally, these are private and undocumented (Apple may kill your developer account if you use them in an app). They are:

  • Using sysctl kern.memorystatus_jetsam_change: this lets you change Jetsam’s priority list from user space. It is a bit like Linux’s oom_adj, which allows processes to evade the OOM killer by specifying a negative adjustment (effectively lowering their score). Similarly, on iOS, launchd (the process that launches all applications) can set the Jetsam priority list (see, for example, com.apple.voice.plist, which specifies JetsamMemoryLimit (8000) and JetsamPriority (-49)). The sysctl internally calls memorystatus_list_change (in bsd/kern/kern_memorystatus.c), which sets the priority and the state flags (active, foreground, etc.). Again much like Linux, Android’s mechanism here is the “Low Memory Killer” (oom_adj can be adjusted at run time according to the foreground state of the application/activity, so that background applications are killed first). This works on iOS 6.x.

  • Using the memorystatus_control (#440) system call: introduced somewhere around XNU 2107 (that is, as early as iOS 6, but not until OS X 10.9), this (undocumented) system call controls memorystatus and Jetsam (the latter on iOS) through one of several “commands”, as shown in the following table:

MEMORYSTATUS_CMD_ constant | Availability | Usage
GET_PRIORITY_LIST (1) | OS X 10.9, iOS 6+ | Gets the priority list: an array of memorystatus_priority_entry from <sys/kern_memorystatus.h> (example code is linked in the original article)
SET_PRIORITY_PROPERTIES (2) | iOS only (or CONFIG_JETSAM) | Updates properties for a given process
GET_JETSAM_SNAPSHOT (3) | iOS only (or CONFIG_JETSAM) | Gets a Jetsam snapshot: an array of memorystatus_jetsam_snapshot_t entries (from <sys/kern_memorystatus.h>)
GET_PRESSURE_STATUS (4) | iOS (or CONFIG_JETSAM) | Privileged call: returns 1 if memorystatus_vm_pressure_level is not normal
SET_JETSAM_HIGH_WATER_MARK (5) | iOS (or CONFIG_JETSAM) | Sets the maximum memory utilization for a given PID, after which it may be killed. Used for processes with a memory limit
SET_JETSAM_TASK_LIMIT (6) | iOS 8 (or CONFIG_JETSAM) | Sets the maximum memory utilization for a given PID, after which it will be killed. Used by launchd for processes with a memory limit
SET_MEMLIMIT_PROPERTIES (7) | iOS 9 (or CONFIG_JETSAM) | Sets memory limits + attributes
GET_MEMLIMIT_PROPERTIES (8) | iOS 9 (or CONFIG_JETSAM) | Retrieves memory limits + attributes
PRIVILEGED_LISTENER_ENABLE (9) | xnu-3247 (10.11, iOS 9) | Registers self to receive memory notifications
PRIVILEGED_LISTENER_DISABLE (10) | xnu-3247 (10.11, iOS 9) | Stops self from receiving memory notifications
TEST_JETSAM (1000) | CONFIG_JETSAM && (DEVELOPMENT or DEBUG) | Tests Jetsam, killing specific processes (Debug/Development kernels only)
TEST_JETSAM_SORT (1001) | iOS 9 && (DEVELOPMENT or DEBUG) | Tests Jetsam sorting (Debug/Development kernels only)
SET_JETSAM_PANIC_BITS (1001/1002) | CONFIG_JETSAM && (DEVELOPMENT or DEBUG) | Alters Jetsam’s panic settings (Debug/Development kernels only)

  • Using posix_spawnattr_setjetsam: a function from the posix_spawnattr family that is undocumented and exists only on iOS (this is how launchd handles Jetsam on iOS 7)

  • Using sysctl kern.memorypressure_manual_trigger: simulates a memory pressure level without actually consuming memory. This is used by the memory_pressure utility (-S) in OS X 10.9. It takes one of the NOTE_MEMORYSTATUS_PRESSURE_[NORMAL|WARN|CRITICAL] values from <sys/event.h>.

Other memorystatus configurable values:

  • Using the sysctl kern.memorystatus_purge_on_* values (OS X): these affect the pageout daemon rather than memorystatus itself, forcing it to purge at the warning (2), urgent (5), or critical (8) level. Setting any of these values to 0 disables purging.

  • Using memorystatus_get_level (#453): this system call returns (into an int *) a number between 0 and 100 specifying the percentage of memory available. It is purely diagnostic. It is used by Activity Monitor (and my Process Explorer) to show memory pressure in Mavericks and later versions.

Ledgers

Ledgers were reintroduced in iOS 5 (or 5.1?), and the concept has since been ported to OS X. I say “reintroduced” because ledgers have been around since the beginning of the Mach design; they just had never really been implemented.

Ledgers help solve the problem of excessive resource utilization. Unlike the classic UN*X model (setrlimit(2), known to users as ulimit(1)), ledgers have a finer-grained model similar to QoS. A ledger allots a certain quota of each resource (RAM, CPU, I/O) per unit of time, and is “refilled” over time. This allows the operating system to provide leaky-bucket style QoS mechanisms and guarantee service levels. If a process exceeds its ledger, a Mach exception is generated (EXC_RESOURCE, #11, if memory serves).

Going forward, Apple will likely move entirely to ledger-based RAM management, which makes a lot of sense, especially on iOS, where RAM is scarce (and there is no swap). Jetsam will probably be retained as a last resort.
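The quota-per-interval idea can be sketched as follows. This is a toy model in Python of a refilling quota, not the XNU ledger implementation: the class ToyLedger, its fields, and the explicit now parameter (used instead of a real clock so that the example is deterministic) are all assumptions.

```python
class ToyLedger:
    """A quota of some resource that is reset once per refill interval."""

    def __init__(self, limit, refill_interval):
        self.limit = limit                    # allowed usage per interval
        self.refill_interval = refill_interval
        self.balance = 0                      # usage charged in this interval
        self.last_refill = 0.0

    def charge(self, amount, now):
        """Record usage at time `now`; return False once the quota is exceeded.

        A real kernel would raise a resource exception at that point;
        here the caller just gets a boolean.
        """
        if now - self.last_refill >= self.refill_interval:
            self.balance = 0                  # a new interval: refill
            self.last_refill = now
        self.balance += amount
        return self.balance <= self.limit
```

For example, with a limit of 100 units per second, charging 60 and then 50 within the same second exceeds the quota, while the same 50 charged after the refill is accepted. This is the difference from a setrlimit(2)-style cap, which never replenishes.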

References:

  1. Mac OS X and iOS Internals, J Levin

ChangeLog

  • 3/1/2014 – Added jetsam properties plist from iPhone5s, and note about ledgers
  • 2/10/2016 – Added jetsam/memorystatus commands for xnu 32xx (iOS 9, OS X 10.11). Also updated procexp to show mem limits on iOS

Footnotes

  1. For simplicity, we ignore the fact that part of the virtual address space of any given process is really reserved and mapped for use by the kernel. Incidentally, the 64-bit figure is 256TB because of hardware limitations (besides, nobody would really use it all, let alone the 16EB of a full 64 bits). Mac OS X uses 47 bits (0x7fffffffffff), or 128TB, for user-space virtual memory; the top 128TB (technically, 0xFFFFFF8…) is reserved for the kernel.
  2. Again, for simplicity, I have not stated the actual conditions.
  3. I have not covered idle demotion here, whereby (as of 10.9) processes may be moved to the idle band so that they become candidates for idle exit. A process can call proc_info with PROC_INFO_CALL_DIRTYCONTROL to have the kernel track its state, protecting it from being killed while “dirty” and letting it volunteer to be killed while “clean” (idle). This is used with the vproc mechanism.