DUECA/DUSIME
|
Making the most out of your hardware
DUECA is a middleware environment for real-time distributed calculation. However, that does not mean that DUECA can run always real-time, for running real-time, DUECA relies on the services of the underlying operating system. This page describes some tricks for tuning your Linux system so that it can run real-time tasks.
One of the first things to do is to ensure that the dueca processes do not run in normal scheduling mode, but in a realtime priority mode. There are two things that make this possible:
/etc/security/limits.conf
, or add a file to /etc/security/limits.d
It may be clear that the second option is to be preferred; in addition, by not making your process run as root, the log files that will be written are accessible from your normal user id.
Another thing that must be handled is the memory lock. In Linux and most unix systems, a process can be (partially or fully) swapped to a disk partition (the swap partition) if not enough memory is available. This must be prevented at all times for real-time processes, so the process must be locked in memory. I suggest the following adjustments:
rtdueca
, and add the users that need real-time priorities to this groupdueca.conf
in the directory /etc/security/limits.d
: @rtdueca - rtprio 100 @rtdueca - nice -20 @rtdueca - memlock unlimitedThe option '-' enforces both soft and hard limits together; the default limit is changed to the new value, which is probably the best option for a "production" machine in a simulation set-up. If you want to change the limits on a developer workstation, you might consider using the option 'hard' for only setting the hard limits. The developer can then use the "ulimit" shell command to change the soft limits and allow real-time running, or not use the command, and run dueca non-realtime, e.g. when developing.
Modern Linux kernels and systems use aggressive energy saving strategies. This commonly destroys the real-time performance; as an example on modern DELL workstations with Ubuntu 18.04, wake-up latencies as long as 360 microseconds will be produced. This can be corrected by switching off sleep modes in the kernel. DUECA will, if memory lock has been succesful, try to also do this by seting the kernel to low-latency behaviour through the /dev/cpu_dma_latency
file. A standard DUECA rpm or deb installation will include a udev rules file that makes this device file writeable for the rtdueca group. If you want this for another user, adapt the 90-rtdueca-cpulatency.rules
file.
The default setting for the memory lock limit on Ubuntu 22.04 is almost 500 kB. (to be checked with ulimit -a
).
For development, people normally do not adjust the limits. In common uses, this will allow your DUECA process to lock the memory at start-up, but after this, it is common that the DUECA process grows to beyond 500 kB. When that happens, memory allocation fails, and the process crashes.
So for development on Ubuntu 22 machines, set the memory lock limit to zero. This prevents locked memory alltogether; in the terminal where you are running DUECA, or in your .bashrc
file, set:
ulimit -l 0
To facilitate starting a DUECA process on multiple computers, start-up scripts have been included in the dueca distribution. These scripts facilitate starting DUECA remotely. Currently, the following are available:
/usr/bin/rrundueca /usr/bin/rrunduecaX /usr/bin/rrunduecaXfvwm /usr/bin/rrunduecaXmwm /usr/bin/rrunduecaXxf
These start, respectively (1) a bare (no X) dueca, (2) dueca after starting a bare X server, and the remaining three start the X server, a window manager (fvwm, Motif Window Manager and xfce respectively) and dueca. The latter three are commonly used in testing, when you need several windows with, e.g., plots for debugging.
These scripts are typically run over ssh. You need to ensure unhindered (no password) logins from the machine where you do the remote logins. There is a script (actually, a piece of script) called GenericStart
that can do most of the magic for you. This will be explained at the hand of an example:
#!/bin/bash # In this lab, we use ssh to remote login on Linux RSHL=ssh # the main directory with the configuration files for this simulation MAINDIR=/home/fltsim/dapps/ESVS2/ESVS2/run/HMILab # A list of all nodes, except node no 0 and timing masternode NODES="dutmms3 dutmms2 dutmms6" # Node 0, the node with the dueca interface ZERONODE="dutmms1" # timing master node, will initiate the communication MASTERNODE="dutmms4" # Here we define what type of machine each node is. These are extended # regular expressions # nodes with Linux, and start up X XNODES="dutmms[326]" # nodes with Linux, but don't start X LNODES="dutmms[14]" # for debugging, you can build one or more of the nodes with debug symbols # and manually start them with the debugger. List these nodes in MNODES # MNODES="dutmms1" # beyond this point, you should not have to modify anything source `dueca-config --path-datafiles`/data/GenericStart
A full description of all possible options is in the GenericStart
file. The principle is simple; in your script, you define which nodes need to be started, and how; X, or X with a window manager, or no X.
In the above examples, there are three "normal" nodes, the timing master node and node 0.
The normal nodes are all started with X (XNODES), and thus with the script rrunduecaX. As an example, the DUECA process on dutmms3
will be started from:
/home/fltsim/dapps/ESVS2/ESVS2/run/HMILab/dutmms3
so it will look for its configuration files there. Other nodes will be started from their respective directories.
Note that there are many more options for the GenericStart
script; check with
bash /usr/share/dueca/data/GenericStart
I have experimented with several options for real-time kernels, and have decided that most of the functionality needed for simulation, with a minimum of hassle, can be found in Linux kernels with the PREEMPT_RT
patch set applied. In such a set-up, use nanosleep for timing (see configuring dueca.cnf
). Timing is generally accurate to within 20 microseconds, and the clock on these system is based on one-shot timers, so there are no granularity problems (i.e. the clock can wait until exactly the time you want it to, it does not under- or overshoot the target time by maximally one clock period, which is typically 10 or 1 ms). I maintain a set of these kernels on the opensuse build service, project home:repabuild:preempt
.
On Ubuntu, you can install a real-time kernel with:
apt install linux-image-preempt linux-headers-preempt linux-modules-preempt
To find out which NVIDIA driver is suitable for your set-up (if you have an NVIDIA card, that is), run:
user@linux:~$ ubuntu-drivers devices == /sys/devices/pci0000:64/0000:64:00.0/0000:65:00.0 == modalias : pci:v000010DEd00001CB2sv00001028sd000011BDbc03sc00i00 vendor : NVIDIA Corporation model : GP107GL [Quadro P600] driver : nvidia-driver-470 - distro non-free recommended driver : nvidia-driver-390 - distro non-free driver : nvidia-driver-470-server - distro non-free driver : nvidia-driver-418-server - distro non-free driver : nvidia-driver-460-server - distro non-free driver : nvidia-driver-450-server - distro non-free driver : nvidia-driver-460 - distro non-free driver : xserver-xorg-video-nouveau - distro free builtin
You see that this gives you information on the installed graphics card, and what drivers are compatible with that card. Choose a version and install:
apt install nvidia-driver-470
(of course, Supply the right version for the nvidia module here)
Compilation of the kernel module will now fail, because NVIDIA by default does not accept to build for the PREEMPT kernel series. With a little script you can patch the NVIDIA sources, and then let dpkg correct the configuration of the installed nvidia-dkms package, so it builds and installs the kernel module:
# run a script that modifies the nvidia package code to be compatible # with a PREEMPT built kernel dkms-allow-preempt # re-run the failed package install configuration, which should now # correctly build and install the nvidia modules for the PREEMPT kernel dpkg --configure -a
Nothing is more annoying than being halfway an experiment, and seeing the display go blank. To prevent that, adjust the Serverflags section in your xorg.conf file, or in a file in /etc/X11/xorg.conf.d
, I am suggesting /etc/X11/xorg.conf.d/10-dontblank.conf
:
Section "ServerFlags" Option "BlankTime" "600" Option "StandbyTime" "610" Option "SuspendTime" "620" Option "OffTime" "630" EndSection
This will give you 10 hours (600 minutes) of running until the screen blanks.
For a set of DUECA nodes that communicates over Ethernet, it might be important to tweak the ethernet card settings. Using ethtool, one can for most cards modify the timing settings; this is particularly important for Gigabit ethernet cards, that are in many cases tuned to maximize throughput, at the expense of some latency. The technique used is coalescing, gathering several packets before generating an interrupt.
In real-time communication, latency is more important; the following command turns the coalescing of, by setting the delay for interrupt generation after packet reception to 0 and the number of frames to collect before generating an interrupt to 1.
ethtool -C eth0 adaptive-rx off adaptive-tx off \ rx-usecs 0 rx-usecs-irq 0 rx-frames 1 rx-frames-irq 1 \ tx-usecs 0 tx-usecs-irq 0 tx-frames 1 tx-frames-irq 1
If you want this change to persist, add it to a start-up script that is executed after the network cards are configured.
Some ethernet cards cannot be controlled with ethtool, but the settings of the kernel module that controls the card can be supplied when the module is loaded. The e1000e
driver used for many Intel cards can be configured in this way. There are several steps in this process:
Create a file e1000e.conf
in /etc/modprobe.d
, with the contents
options e1000e InterruptThrottleRate=3,0
Note that the InterruptThrottleRate
takes an array (comma-separated), as argument, each argument corresponds to a card controlled by the driver. "3" is the default value, indicating automated coalescing of interrupts, "0" turns off the coalescing. Use that option for the cards involved in real- time communication, thus adapt the array of arguments as needed.
modprobe -r e1000e && modprobe e1000e
update-initramfs -uk all
For other drivers, use the modinfo command to determine which load options are possible.
In some cases, it is handy to drive several displays off a single computer. It is possible to give each video card in that computer its own X server. These must then be specified in the xorg configuration, and selected when starting the X server. There are several concepts to consider:
Section "ServerLayout" Identifier "sides" Screen 0 "ScreenSides" InputDevice "Mouse1" "CorePointer" InputDevice "Keyboard1" "CoreKeyboard" EndSection
The next step is to consistently attach the keyboard and mouse input devices to physical hardware. In this case, we don't connect the mouse to anything, but you might want to use udev to get a consistent name for e.g. a usb mouse or touchscreen, and then link that input:
Here a mouse attached to nothing (dev/null):
Section "InputDevice" Identifier "Mouse1" Driver "mouse" # Option "Protocol" "auto" Option "Device" "/dev/null" Option "ZAxisMapping" "4 5 6 7" EndSection
And a standard system keyboard:
Section "InputDevice" Identifier "Keyboard1" Driver "kbd" EndSection
These are both used in the above ServerLayout, binding them to a server.
The screen needs to further define the monitor and the associated/connected graphics card:
Section "Screen" Identifier "ScreenSides" Device "Card1" Monitor "Monitor2" SubSection "Display" Viewport 0 0 Depth 24 EndSubSection EndSection
The identifier links back to the server layout.
Section "Device" Identifier "Card1" Driver "nvidia" VendorName "nVidia Corporation" BoardName "Unknown Board" BusID "PCI:23:0:0" Option "TwinView" "true" Option "TwinViewOrientation" "RightOf" Option "UseEdidFreqs" "true" Option "ProbeAllGpus" "false" Option "NoLogo" "true" EndSectionThe BusID and ProbeAllGpus options are used to isolate the single graphics card.
Section "Monitor" Identifier "Monitor2" VendorName "Monitor Vendor" ModelName "Monitor Model" EndSection
Many lower-cost IO devices use USB interface to communicate with the computer. In case you want to read these from your simulation program, you need to figure out which of the many event, mouse or joystick device files you have in your /dev
folder is the one that you need to attach to, and it can be particularly annoying that usb devices end up at a different device file when other devices are plugged in or not. In this case udev scripts can provide persistence.
Supposing you figured out that /dev/input/event3
is attached to your touchscreen. A handy little program for checking this is evtest
. After a next reboot, this device may have moved to /dev/input/event5
. To get a persistent name, you may add a symbolic link pointing to the device when it becomes available.
At any moment, a device is uniquely identified by its device path (but note that that may change after a reboot). You can get udevadm to get information on the device in the following manner
udevadm info -a -p $(udevadm info -q path -n /dev/input/event3)
You will get information on the device itself and on the parents of the device. You can select the device for udev rules by specifying attributes of the device itself and of one of the parent devices (with ==
between the attribute and the value!). Sometimes you can select on, e.g., device name and possibly serial number. If you have only one of these devices, the selection is simple, just select on the name of the thing, like:
SUBSYSTEM=="input", ATTRS{name}=="Wacom BambooPT 2FG Small Pen"
If you have multiple devices of the same type, and they are not distinguished by serial or other tags, you can try to match on the physical bus location of the USB connection. As long as you don't re-plug the devices, this will stay constant, here is an example match on the plug/bus:
SUBSYSTEM=="input", ATTRS{phys}=="usb-0000:00:1d.0-1.6.4/input0"
Using the match, create a rules file in the /etc/udev/rules.d
folder, e.g., 90-touch1.rules
:
SUBSYSTEM=="input", ATTRS{phys}=="usb-0000:00:1d.0-1.6.4/input0", GROUP="users", MODE="0660", SYMLINK+="touchinput1"
This modifies the file mode, group name and adds a symbolic link in /dev
. Note that here you use a single equal sign (=
), or an addition (+=
) between actions and arguments.
To find out, after all the connecting, what devices are available, you can use the xrandr
and xinput
programs.
xrandrThis produces devices like
DP-0
, etc.xinput list
Your touchscreen should have a number here, we assume 14 for now.
xinput --map-to-output 14 DP-0
xinput list-props 14
You can set the coordinate transformation matrix in the xorg.conf, or (if you have only one graphics output/x window) add it to the udev rules magic; using the same example:
SUBSYSTEM=="input", ATTRS{phys}=="usb-0000:00:1d.0-1.6.4/input0", ENV{WL_OUTPUT}="DP-0",ENV{LIBINPUT_CALIBRATION_MATRIX}="0.545455 0.000000 0.000000 0.000000 0.900000 0.000000 0.000000 0.000000 1.000000"
And in xorg, in the InputDevice, add:
Option "TransformationMatrix" "0.545455 0.000000 0.000000 0.000000 0.900000 0.000000 0.000000 0.000000 1.000000"
This description on sound tuning is not yet complete. Here, we assume an Ubuntu 20.04 or 22.04 workstation. Sound is controlled over pulseaudio.
Add the simulation user to the groups audio
and pulse-access
.
To detect which hardware devices are seen by the operating system, check first with the ALSA facility aplay:
aplay -l
To verify if another program is accessing your sound devices directly:
sudo fuser -v /dev/snd/*
To check that pulseaudio has control your sound devices, run pactl.
pactl list sinks pactl list cards
Volume control gui with pavucontrol.
If there is a complaint about bluetooth and pulseaudio in the journalctl log, and there is no bluetooth device to configure, run:
sudo apt-get remove --auto-remove pulseaudio-module-bluetooth
First try to auto-start the pulseaudio daemon for the user. It will start automatically once you try to access the audio (pactl). Have all lines in /etc/pulse/client.conf
commented out.
Since the dueca process is run over an ssh login – unless you are running node 0 on the desktop you are starting from – it may also be needed to run the pulseaudio as a system service, use [this configuration] (https://github.com/shivasiddharth/PulseAudio-System-Wide)
Edit the /etc/pulse/client.conf
file, set
autospawn = no default-server = /var/run/pulse/native
Reboot if needed, verify that there is only one pulseaudio daemon running;
srs@srsefis2:~ ps aux | grep pulseaudio pulse 1003 0.0 0.1 313524 13120 ? S<sl Jan26 0:00 /usr/bin/pulseaudio --daemonize=no --system --realtime --log-target=journal
Further instructions (setting group membership!) here
Selecting specific cards, if needed, at https://unix.stackexchange.com/questions/473846/how-does-pulseaudio-determine-which-alsa-devices-to-make-available-or-not
This requires a reboot before it works.
As a note to future tweakers, audio services are moving to pipewire
.
For exploration and checking of the connection with a bluetooth device, the bluetoothctl
and hcitool
command line tools and small python scripts are useful. To get these to work properly when not running them as root, set up the required permissions:
sudo setcap 'cap_net_raw,cap_net_admin+eip' $(readlink -f $(which python3)) sudo setcap 'cap_net_raw+ep' $(readlink -f $(which hcitool))
In many cases, you will need the mac address of a sensor for connection. To scan for sensors in range:
bluetoothctl scan on ... [NEW] Device 20:1B:88:03:85:AC 20-1B-88-03-85-AC [CHG] Device A4:C1:38:E8:8D:79 RSSI: -52 [CHG] Device 20:1B:88:03:85:AC RSSI: -83 [CHG] Device 20:1B:88:03:85:AC Name: Mi True Wireless EBs Basic 2 [CHG] Device 20:1B:88:03:85:AC Alias: Mi True Wireless EBs Basic 2
The example here is for a Mi wireless humidity and temperature sensor.
When using python scripts to communicate with bluetooth low energy devices, the bleak
module is quite succesful.
hcitool
can be used to modify latency of bluetooth connections, see this (discussion on an archlinus forum)[https://bbs.archlinux.org/viewtopic.php?id=248133]
For IO with bluetooth devices from a DUECA module, the libblepp
library can be used. The latest versions have a configurable receive buffer, which is needed for some devices that send large (usually fragmented in packages) messages. The python3 scripts seem to have no problem there, so if you tested a communication with python, and it does not seem to work with libblepp, check with Wireshark what the message size used by the device is.