Linux kernel modifications for the Kernel Hacking exam
Christoph Lameter (Ampere) d0dd066a0f seqcount: replace smp_rmb() in read_seqcount() with load acquire
Many architectures support load acquire which can replace a memory
barrier and save some cycles.

A typical sequence

	do {
		seq = read_seqcount_begin(&s);
		<something>
	} while (read_seqcount_retry(&s, seq));

requires 13 cycles on a Neoverse N1 arm64 core (Ampere Altra, to be
specific) for an empty loop.  Two read memory barriers are needed, one
for each of the seqcount_* functions.

We can replace the first read barrier with a load acquire of the
seqcount which saves us one barrier.

On the Altra doing so reduces the cycle count from 13 to 8.
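
As a rough illustration of the difference, here is a minimal userspace
sketch using C11 atomics. It is not the kernel code: struct
seqcount_sketch and all function names below are hypothetical, and the
C11 acquire fence and acquire load only stand in for the roles that
smp_rmb() and smp_load_acquire() play in the kernel.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical userspace stand-in for the kernel's seqcount_t. */
struct seqcount_sketch {
	atomic_uint sequence;
};

/* Old model: relaxed load, then a separate read barrier (smp_rmb() role). */
static inline unsigned int begin_with_fence(struct seqcount_sketch *s)
{
	unsigned int seq;

	/* An odd value means a writer is in progress, so keep polling. */
	while ((seq = atomic_load_explicit(&s->sequence,
					   memory_order_relaxed)) & 1)
		;
	atomic_thread_fence(memory_order_acquire);
	return seq;
}

/* New model: the load itself is an acquire (smp_load_acquire() role). */
static inline unsigned int begin_with_acquire(struct seqcount_sketch *s)
{
	unsigned int seq;

	while ((seq = atomic_load_explicit(&s->sequence,
					   memory_order_acquire)) & 1)
		;
	return seq;
}

/* The retry check keeps its barrier before re-reading the counter. */
static inline int retry(struct seqcount_sketch *s, unsigned int seq)
{
	atomic_thread_fence(memory_order_acquire);
	return atomic_load_explicit(&s->sequence,
				    memory_order_relaxed) != seq;
}

/* Single-writer side: make the counter odd before touching the data. */
static inline void write_begin(struct seqcount_sketch *s)
{
	unsigned int seq = atomic_load_explicit(&s->sequence,
						memory_order_relaxed);

	assert((seq & 1) == 0);	/* sketch assumes one non-nested writer */
	atomic_store_explicit(&s->sequence, seq + 1, memory_order_relaxed);
	atomic_thread_fence(memory_order_release);
}

/* Make the counter even again, publishing the data written before it. */
static inline void write_end(struct seqcount_sketch *s)
{
	unsigned int seq = atomic_load_explicit(&s->sequence,
						memory_order_relaxed);

	atomic_store_explicit(&s->sequence, seq + 1, memory_order_release);
}
```

A reader would call one of the two begin helpers, read the protected
data, and loop while retry() returns nonzero. Only the begin side
changes between the two models; the barrier in retry() stays.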

According to ARM, this is a general improvement for the ARM64
architecture and not specific to a certain processor.

See

  https://developer.arm.com/documentation/102336/0100/Load-Acquire-and-Store-Release-instructions

 "Weaker ordering requirements that are imposed by Load-Acquire and
  Store-Release instructions allow for micro-architectural
  optimizations, which could reduce some of the performance impacts that
  are otherwise imposed by an explicit memory barrier.

  If the ordering requirement is satisfied using either a Load-Acquire
  or Store-Release, then it would be preferable to use these
  instructions instead of a DMB"

[ NOTE! This is my original minimal patch that unconditionally switches
  over to using smp_load_acquire(), instead of the much more involved
  and subtle patch that Christoph Lameter wrote that made it
  conditional.

  But Christoph gets authorship credit because I had initially thought
  that we needed the more complex model, and Christoph ran with it
  and did the work. Only after looking at code generation for all the
  relevant architectures, did I come to the conclusion that nobody
  actually really needs the old "smp_rmb()" model.

  Even architectures without load-acquire support generally do as well
  or better with smp_load_acquire().

  So credit to Christoph, but if this then causes issues on other
  architectures, put the blame solidly on me.

  Also note as part of the ruthless simplification, this gets rid of the
  overly subtle optimization where some code uses a non-barrier version
  of the sequence count (see the __read_seqcount_begin() users in
  fs/namei.c). They then play games with their own barriers and/or with
  nested sequence counts.

  Those optimizations are literally meaningless on x86, and questionable
  elsewhere. If somebody can show that they matter, we need to re-do
  them more cleanly than "use an internal helper".       - Linus ]

Signed-off-by: Christoph Lameter (Ampere) <cl@gentwo.org>
Link: https://lore.kernel.org/all/20240912-seq_optimize-v3-1-8ee25e04dffa@gentwo.org/
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2024-09-22 13:35:36 -07:00
arch Merge branch 'address-masking' 2024-09-22 11:19:35 -07:00
block vfs-6.12.blocksize 2024-09-20 17:53:17 -07:00
certs
crypto crypto: aegis128 - Fix indentation issue in crypto_aegis128_process_crypt() 2024-09-13 18:26:52 +08:00
Documentation ring-buffer: Updates for v6.12: 2024-09-22 09:47:16 -07:00
drivers sched_ext: Initial pull request for v6.12 2024-09-21 09:44:57 -07:00
fs Merge branch 'address-masking' 2024-09-22 11:19:35 -07:00
include seqcount: replace smp_rmb() in read_seqcount() with load acquire 2024-09-22 13:35:36 -07:00
init sched_ext: Initial pull request for v6.12 2024-09-21 09:44:57 -07:00
io_uring slab updates for 6.12 2024-09-18 08:53:53 +02:00
ipc ipc/shm, mm: drop do_vma_munmap() 2024-09-03 21:15:52 -07:00
kernel ring-buffer: Updates for v6.12: 2024-09-22 09:47:16 -07:00
lib Merge branch 'address-masking' 2024-09-22 11:19:35 -07:00
LICENSES LICENSES: add 0BSD license text 2024-09-01 20:43:24 -07:00
mm Many singleton patches - please see the various changelogs for details. 2024-09-21 08:20:50 -07:00
net bpf-next-6.12 2024-09-21 09:27:50 -07:00
rust Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2024-09-05 20:37:20 -07:00
samples bpf-next-6.12 2024-09-21 09:27:50 -07:00
scripts bpf-next-6.12 2024-09-21 09:27:50 -07:00
security bpf-next-6.12 2024-09-21 09:27:50 -07:00
sound sound updates for 6.12-rc1 2024-09-17 17:03:43 +02:00
tools ring-buffer: Updates for v6.12: 2024-09-22 09:47:16 -07:00
usr
virt KVM: use follow_pfnmap API 2024-09-17 01:06:59 -07:00
.clang-format
.cocciconfig
.editorconfig
.get_maintainer.ignore
.gitattributes
.gitignore
.mailmap drm next for 6.12-rc1 2024-09-19 10:18:15 +02:00
.rustfmt.toml
COPYING
CREDITS MAINTAINERS: Mark powerpc spufs as orphaned 2024-08-19 21:27:56 +10:00
Kbuild
Kconfig
MAINTAINERS sched_ext: Initial pull request for v6.12 2024-09-21 09:44:57 -07:00
Makefile Linux 6.11 2024-09-15 16:57:56 +02:00
README

Linux kernel
============

There are several guides for kernel developers and users. These guides can
be rendered in a number of formats, like HTML and PDF. Please read
Documentation/admin-guide/README.rst first.

In order to build the documentation, use ``make htmldocs`` or
``make pdfdocs``.  The formatted documentation can also be read online at:

    https://www.kernel.org/doc/html/latest/

There are various text files in the Documentation/ subdirectory,
several of them using the reStructuredText markup notation.

Please read the Documentation/process/changes.rst file, as it contains the
requirements for building and running the kernel, and information about
the problems which may result from upgrading your kernel.