Introduction
I’m on a system with some Intel® Optane™ Persistent Memory DIMMs, which contain large capacities of non-volatile memory. From here on out, I’ll refer to them as “AEP”, which stands for “Apache Pass”, the memory technology that they use under the hood. The three most common ways of making this memory available to the operating system are:
- 2LM mode (or Memory Mode, or Cache Mode), enabled in the BIOS, which uses a number of DDR4 DIMMs as a cache for the AEP DIMMs.
- Filesystems, in which you first enable 1LM mode, then install a filesystem onto the AEP DIMMs to take advantage of their persistence.
- NUMA mode, in which you enable 1LM mode, then bind the AEP DIMMs’ memory regions to a driver which makes them available as a NUMA node.
In this post, I’m just going to lay out concisely how to get into the first and third modes.
NUMA Mode
In order to get into NUMA mode:
- Ensure you’ve got a new enough kernel. It needs to be at least version
5.1.0-rc4
. - Change the AEP to
1LM
mode in the BIOS. -
Use
ipmctl
to put each individual AEP DIMM into AppDirect mode: To do this to all AEP on socket 0, do:ipmctl create -goal -socket 0 PersistentMemoryType=AppDirect
-
Reboot. Now the DIMMs that you chose from the previous command will show as one distinct memory region in
ipmctl
. However, you needndctl
anddaxctl
to bind this region to the appropriate kernel driver. Thegit checkout
command is only necessary if you want to later use this utility to online your memory asZONE_NORMAL
, and ifv67
hasn’t been released yet.git clone https://github.com/pmem/ndctl.git
git checkout 3fdaf7ed512ac564a78d6d2c7152be218e39c3c1
- Build the project.
- Migrate the device model. This writes to the file
/etc/modprobe.d/daxctl.conf
. If it doesn’t, then you didn’t globally installdaxctl
to your system (it hardcodes some paths, so running this command from the source directory doesn’t work). Reboot again. -
View the memory regions that you created with
ipmctl
, usingndctl
:ndctl list -R
-
Create a namespace consisting of that region:
ndctl create-namespace --region region0 -m dax
-
Ensure you don’t have
dax_pmem_compat
running:lsmod | grep "dax_pmem_compat"
-
Ensure you do have
kmem
running:lsmod | grep kmem
The last step has two methods, depending on whether you want the memory in
ZONE_NORMAL
or ZONE_MOVABLE
. This post
explains this conundrum in excrutiating detail.
If you have the kernel configuration CONFIG_MEMORY_HOTPLUG_DEFAULT_ONLINE
, you can
manually bind the memory to the correct driver, and the kernel will automatically online
the memory as ZONE_NORMAL
for you:
-
Find the device that you want to bind. Examples include
dax0.0
,dax1.0
, etc. This will list the namespaces, and within these should be the device that they’re associated with:ndctl list -N
-
Unbind from the
device_dax
driver:echo dax0.0 > /sys/bus/dax/drivers/device_dax/unbind
-
Bind to the
kmem
driver:echo dax0.0 > /sys/bus/dax/drivers/kmem/new_id
If you don’t have that configuration, then use the daxctl
utility to do it
for you. Keep in mind that if you have version v66
and didn’t do the git
checkout
command above, you won’t have the --no-movable
option.
-
For
ZONE_MOVABLE
:daxctl reconfigure-device --mode=system-ram all
-
For
ZONE_NORMAL
:daxctl reconfigure-device --no-movable --mode=system-ram all
2LM Mode
In order to get back into 2LM mode:
- Switch to “2LM” mode in the BIOS.
-
Disable and destroy all namespaces:
ndctl list -N
ndctl disable-namespace namespace0.0
ndctl destroy-namespace namespace0.0
-
Disable all regions:
ndctl list -R
ndctl disable-region region0
ndctl disable-region region1
-
Put all of each socket’s memory in Memory Mode:
ipmctl create -goal -socket 0 MemoryMode=100
ipmctl create -goal -socket 1 MemoryMode=100
- Reboot again.