
Commit 0c01c87

Author: CKI KWF Bot
Merge: bpf: update to v6.15
MR: https://gitlab.com/redhat/centos-stream/src/kernel/centos-stream-10/-/merge_requests/1401

JIRA: https://issues.redhat.com/browse/RHEL-78202
Omitted-fix: df0cb5c ("bpf: Allow fall back to interpreter for programs with stack size <= 512")
- fixes an issue introduced by two commits, only one of which is backported. We will take the fix when we backport the second commit
Omitted-fix: 56b4d16 ("bpf: Cleanup unused func args in rqspinlock implementation")
- no functional change (just a refactoring), will be taken in the corresponding rebase

Update the BPF subsystem to upstream kernel version 6.15

Signed-off-by: Gregory Bell <grbell@redhat.com>

Approved-by: Jerome Marchand <jmarchan@redhat.com>
Approved-by: Viktor Malik <vmalik@redhat.com>
Approved-by: Rafael Aquini <raquini@redhat.com>
Approved-by: Toke Høiland-Jørgensen <toke@redhat.com>
Approved-by: David Arcari <darcari@redhat.com>
Approved-by: CKI KWF Bot <cki-ci-bot+kwf-gitlab-com@redhat.com>

Merged-by: CKI GitLab Kmaint Pipeline Bot <26919896-cki-kmaint-pipeline-bot@users.noreply.gitlab.com>
2 parents 7880279 + e59fb50 commit 0c01c87

222 files changed, +11367 -1586 lines changed

Documentation/bpf/bpf_devel_QA.rst

Lines changed: 8 additions & 0 deletions
@@ -382,6 +382,14 @@ In case of new BPF instructions, once the changes have been accepted
 into the Linux kernel, please implement support into LLVM's BPF back
 end. See LLVM_ section below for further information.
 
+Q: What "BPF_INTERNAL" symbol namespace is for?
+-----------------------------------------------
+A: Symbols exported as BPF_INTERNAL can only be used by BPF infrastructure
+like preload kernel modules with light skeleton. Most symbols outside
+of BPF_INTERNAL are not expected to be used by code outside of BPF either.
+Symbols may lack the designation because they predate the namespaces,
+or due to an oversight.
+
 Stable submission
 =================

Documentation/bpf/bpf_iterators.rst

Lines changed: 1 addition & 1 deletion
@@ -86,7 +86,7 @@ following steps:
 The following are a few examples of selftest BPF iterator programs:
 
 * `bpf_iter_tcp4.c <https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git/tree/tools/testing/selftests/bpf/progs/bpf_iter_tcp4.c>`_
-* `bpf_iter_task_vma.c <https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git/tree/tools/testing/selftests/bpf/progs/bpf_iter_task_vma.c>`_
+* `bpf_iter_task_vmas.c <https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git/tree/tools/testing/selftests/bpf/progs/bpf_iter_task_vmas.c>`_
 * `bpf_iter_task_file.c <https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git/tree/tools/testing/selftests/bpf/progs/bpf_iter_task_file.c>`_
 
 Let us look at ``bpf_iter_task_file.c``, which runs in kernel space:

Documentation/bpf/btf.rst

Lines changed: 21 additions & 4 deletions
@@ -102,7 +102,8 @@ Each type contains the following common data::
          * bits 24-28: kind (e.g. int, ptr, array...etc)
          * bits 29-30: unused
          * bit     31: kind_flag, currently used by
-         *             struct, union, fwd, enum and enum64.
+         *             struct, union, enum, fwd, enum64,
+         *             decl_tag and type_tag
          */
         __u32 info;
         /* "size" is used by INT, ENUM, STRUCT, UNION and ENUM64.
@@ -478,7 +479,7 @@ No additional type data follow ``btf_type``.
 
 ``struct btf_type`` encoding requirement:
  * ``name_off``: offset to a non-empty string
- * ``info.kind_flag``: 0
+ * ``info.kind_flag``: 0 or 1
  * ``info.kind``: BTF_KIND_DECL_TAG
  * ``info.vlen``: 0
  * ``type``: ``struct``, ``union``, ``func``, ``var`` or ``typedef``
@@ -489,7 +490,6 @@ No additional type data follow ``btf_type``.
         __u32 component_idx;
     };
 
-The ``name_off`` encodes btf_decl_tag attribute string.
 The ``type`` should be ``struct``, ``union``, ``func``, ``var`` or ``typedef``.
 For ``var`` or ``typedef`` type, ``btf_decl_tag.component_idx`` must be ``-1``.
 For the other three types, if the btf_decl_tag attribute is
@@ -499,12 +499,21 @@ the attribute is applied to a ``struct``/``union`` member or
 a ``func`` argument, and ``btf_decl_tag.component_idx`` should be a
 valid index (starting from 0) pointing to a member or an argument.
 
+If ``info.kind_flag`` is 0, then this is a normal decl tag, and the
+``name_off`` encodes btf_decl_tag attribute string.
+
+If ``info.kind_flag`` is 1, then the decl tag represents an arbitrary
+__attribute__. In this case, ``name_off`` encodes a string
+representing the attribute-list of the attribute specifier. For
+example, for an ``__attribute__((aligned(4)))`` the string's contents
+is ``aligned(4)``.
+
 2.2.18 BTF_KIND_TYPE_TAG
 ~~~~~~~~~~~~~~~~~~~~~~~~
 
 ``struct btf_type`` encoding requirement:
  * ``name_off``: offset to a non-empty string
- * ``info.kind_flag``: 0
+ * ``info.kind_flag``: 0 or 1
  * ``info.kind``: BTF_KIND_TYPE_TAG
  * ``info.vlen``: 0
  * ``type``: the type with ``btf_type_tag`` attribute
@@ -522,6 +531,14 @@ type_tag, then zero or more const/volatile/restrict/typedef
 and finally the base type. The base type is one of
 int, ptr, array, struct, union, enum, func_proto and float types.
 
+Similarly to decl tags, if the ``info.kind_flag`` is 0, then this is a
+normal type tag, and the ``name_off`` encodes btf_type_tag attribute
+string.
+
+If ``info.kind_flag`` is 1, then the type tag represents an arbitrary
+__attribute__, and the ``name_off`` encodes a string representing the
+attribute-list of the attribute specifier.
+
 2.2.19 BTF_KIND_ENUM64
 ~~~~~~~~~~~~~~~~~~~~~~
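
The `kind_flag` change above can be illustrated with a small decoder for the `info` word. This is only a sketch: the helper names are hypothetical, the bit positions follow the `btf_type` comment quoted in the hunk (bits 0-15 for `vlen` are taken from the same comment in the full file), and the kind numbers 17 and 18 are assumed to match `BTF_KIND_DECL_TAG` and `BTF_KIND_TYPE_TAG` in the UAPI header.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed to match BTF_KIND_DECL_TAG / BTF_KIND_TYPE_TAG in linux/btf.h. */
enum { KIND_DECL_TAG = 17, KIND_TYPE_TAG = 18 };

/* bits 0-15: vlen */
static inline uint32_t btf_info_vlen(uint32_t info)
{
	return info & 0xffff;
}

/* bits 24-28: kind */
static inline uint32_t btf_info_kind(uint32_t info)
{
	return (info >> 24) & 0x1f;
}

/* bit 31: kind_flag */
static inline bool btf_info_kind_flag(uint32_t info)
{
	return (info >> 31) != 0;
}

/*
 * Hypothetical helper: true when a decl_tag or type_tag has kind_flag set,
 * i.e. when name_off holds an attribute-list such as "aligned(4)" rather
 * than a btf_decl_tag/btf_type_tag string.
 */
static inline bool tag_is_generic_attribute(uint32_t info)
{
	uint32_t kind = btf_info_kind(info);

	if (kind != KIND_DECL_TAG && kind != KIND_TYPE_TAG)
		return false;
	return btf_info_kind_flag(info);
}
```

A consumer that only understands the older encoding would presumably treat every tag as a plain tag string, so checking bit 31 first is the conservative choice.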

Documentation/bpf/standardization/instruction-set.rst

Lines changed: 14 additions & 6 deletions
@@ -324,34 +324,42 @@ register.
 
 .. table:: Arithmetic instructions
 
-  =====  =====  =======  ==========================================================
+  =====  =====  =======  ===================================================================================
   name   code   offset   description
-  =====  =====  =======  ==========================================================
+  =====  =====  =======  ===================================================================================
   ADD    0x0    0        dst += src
   SUB    0x1    0        dst -= src
   MUL    0x2    0        dst \*= src
   DIV    0x3    0        dst = (src != 0) ? (dst / src) : 0
-  SDIV   0x3    1        dst = (src != 0) ? (dst s/ src) : 0
+  SDIV   0x3    1        dst = (src == 0) ? 0 : ((src == -1 && dst == LLONG_MIN) ? LLONG_MIN : (dst s/ src))
   OR     0x4    0        dst \|= src
   AND    0x5    0        dst &= src
   LSH    0x6    0        dst <<= (src & mask)
   RSH    0x7    0        dst >>= (src & mask)
   NEG    0x8    0        dst = -dst
   MOD    0x9    0        dst = (src != 0) ? (dst % src) : dst
-  SMOD   0x9    1        dst = (src != 0) ? (dst s% src) : dst
+  SMOD   0x9    1        dst = (src == 0) ? dst : ((src == -1 && dst == LLONG_MIN) ? 0: (dst s% src))
   XOR    0xa    0        dst ^= src
   MOV    0xb    0        dst = src
   MOVSX  0xb    8/16/32  dst = (s8,s16,s32)src
   ARSH   0xc    0        :term:`sign extending<Sign Extend>` dst >>= (src & mask)
   END    0xd    0        byte swap operations (see `Byte swap instructions`_ below)
-  =====  =====  =======  ==========================================================
+  =====  =====  =======  ===================================================================================
 
 Underflow and overflow are allowed during arithmetic operations, meaning
 the 64-bit or 32-bit value will wrap. If BPF program execution would
 result in division by zero, the destination register is instead set to zero.
+Otherwise, for ``ALU64``, if execution would result in ``LLONG_MIN``
+dividing -1, the destination register is instead set to ``LLONG_MIN``. For
+``ALU``, if execution would result in ``INT_MIN`` dividing -1, the
+destination register is instead set to ``INT_MIN``.
+
 If execution would result in modulo by zero, for ``ALU64`` the value of
 the destination register is unchanged whereas for ``ALU`` the upper
-32 bits of the destination register are zeroed.
+32 bits of the destination register are zeroed. Otherwise, for ``ALU64``,
+if execution would result in ``LLONG_MIN`` modulo -1, the destination
+register is instead set to 0. For ``ALU``, if execution would result in
+``INT_MIN`` modulo -1, the destination register is instead set to 0.
 
 ``{ADD, X, ALU}``, where 'code' = ``ADD``, 'source' = ``X``, and 'class' = ``ALU``, means::
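
The revised `SDIV` and `SMOD` rows can be read as the following C functions for the 64-bit (`ALU64`) case. This is a sketch of the documented semantics only, not the interpreter or JIT code; the function names are made up for illustration. The explicit `LLONG_MIN` / `-1` branch matters in C because evaluating that division or remainder directly is undefined behaviour.

```c
#include <limits.h>

/* SDIV, ALU64: signed division with the two documented special cases. */
long long bpf_sdiv64(long long dst, long long src)
{
	if (src == 0)
		return 0;			/* division by zero yields 0 */
	if (src == -1 && dst == LLONG_MIN)
		return LLONG_MIN;		/* quotient does not fit, keep LLONG_MIN */
	return dst / src;
}

/* SMOD, ALU64: signed modulo with the two documented special cases. */
long long bpf_smod64(long long dst, long long src)
{
	if (src == 0)
		return dst;			/* modulo by zero leaves dst unchanged */
	if (src == -1 && dst == LLONG_MIN)
		return 0;			/* remainder of LLONG_MIN / -1 is 0 */
	return dst % src;
}
```

The 32-bit (`ALU`) variants follow the same shape with `INT_MIN` in place of `LLONG_MIN`, as the added paragraphs state.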

MAINTAINERS

Lines changed: 2 additions & 0 deletions
@@ -4232,6 +4232,8 @@ F: include/uapi/linux/filter.h
 F:	kernel/bpf/
 F:	kernel/trace/bpf_trace.c
 F:	lib/buildid.c
+F:	arch/*/include/asm/rqspinlock.h
+F:	include/asm-generic/rqspinlock.h
 F:	lib/test_bpf.c
 F:	net/bpf/
 F:	net/core/filter.c

arch/arm64/include/asm/insn.h

Lines changed: 10 additions & 2 deletions
@@ -188,8 +188,10 @@ enum aarch64_insn_ldst_type {
 	AARCH64_INSN_LDST_STORE_PAIR_PRE_INDEX,
 	AARCH64_INSN_LDST_LOAD_PAIR_POST_INDEX,
 	AARCH64_INSN_LDST_STORE_PAIR_POST_INDEX,
+	AARCH64_INSN_LDST_LOAD_ACQ,
 	AARCH64_INSN_LDST_LOAD_EX,
 	AARCH64_INSN_LDST_LOAD_ACQ_EX,
+	AARCH64_INSN_LDST_STORE_REL,
 	AARCH64_INSN_LDST_STORE_EX,
 	AARCH64_INSN_LDST_STORE_REL_EX,
 	AARCH64_INSN_LDST_SIGNED_LOAD_IMM_OFFSET,
@@ -351,8 +353,10 @@ __AARCH64_INSN_FUNCS(ldr_imm, 0x3FC00000, 0x39400000)
 __AARCH64_INSN_FUNCS(ldr_lit, 0xBF000000, 0x18000000)
 __AARCH64_INSN_FUNCS(ldrsw_lit, 0xFF000000, 0x98000000)
 __AARCH64_INSN_FUNCS(exclusive, 0x3F800000, 0x08000000)
-__AARCH64_INSN_FUNCS(load_ex, 0x3F400000, 0x08400000)
-__AARCH64_INSN_FUNCS(store_ex, 0x3F400000, 0x08000000)
+__AARCH64_INSN_FUNCS(load_acq, 0x3FDFFC00, 0x08DFFC00)
+__AARCH64_INSN_FUNCS(store_rel, 0x3FDFFC00, 0x089FFC00)
+__AARCH64_INSN_FUNCS(load_ex, 0x3FC00000, 0x08400000)
+__AARCH64_INSN_FUNCS(store_ex, 0x3FC00000, 0x08000000)
 __AARCH64_INSN_FUNCS(mops, 0x3B200C00, 0x19000400)
 __AARCH64_INSN_FUNCS(stp, 0x7FC00000, 0x29000000)
 __AARCH64_INSN_FUNCS(ldp, 0x7FC00000, 0x29400000)
@@ -602,6 +606,10 @@ u32 aarch64_insn_gen_load_store_pair(enum aarch64_insn_register reg1,
 				     int offset,
 				     enum aarch64_insn_variant variant,
 				     enum aarch64_insn_ldst_type type);
+u32 aarch64_insn_gen_load_acq_store_rel(enum aarch64_insn_register reg,
+					enum aarch64_insn_register base,
+					enum aarch64_insn_size_type size,
+					enum aarch64_insn_ldst_type type);
 u32 aarch64_insn_gen_load_store_ex(enum aarch64_insn_register reg,
 				   enum aarch64_insn_register base,
 				   enum aarch64_insn_register state,
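
Each `__AARCH64_INSN_FUNCS(name, mask, value)` line above pairs a mask with an expected value, and an instruction word is classified by testing `(insn & mask) == value`. The sketch below replays that test in plain C for the constants in this hunk. The struct and function names are hypothetical, and the sample words `0xC8DFFC20` (`ldar x0, [x1]`), `0xC89FFC20` (`stlr x0, [x1]`) and `0xC85F7C20` (`ldxr x0, [x1]`) are hand-assembled for illustration. It also suggests one likely reason the `load_ex` mask was widened from `0x3F400000` to `0x3FC00000`: with the old mask, an `LDAR` word would also have matched `load_ex`.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical container for one (mask, value) pair from the hunk above. */
struct insn_pattern {
	uint32_t mask;
	uint32_t value;
};

static const struct insn_pattern load_acq    = { 0x3FDFFC00, 0x08DFFC00 };
static const struct insn_pattern store_rel   = { 0x3FDFFC00, 0x089FFC00 };
static const struct insn_pattern load_ex_old = { 0x3F400000, 0x08400000 };
static const struct insn_pattern load_ex_new = { 0x3FC00000, 0x08400000 };

/* Same test the generated aarch64_insn_is_*() helpers perform. */
static inline bool insn_matches(uint32_t insn, struct insn_pattern p)
{
	return (insn & p.mask) == p.value;
}

/* Hand-assembled sample encodings (illustrative). */
#define SAMPLE_LDAR_X0_X1	0xC8DFFC20u	/* ldar x0, [x1] */
#define SAMPLE_STLR_X0_X1	0xC89FFC20u	/* stlr x0, [x1] */
#define SAMPLE_LDXR_X0_X1	0xC85F7C20u	/* ldxr x0, [x1] */
```

With these definitions, `insn_matches(SAMPLE_LDAR_X0_X1, load_ex_old)` is true while `insn_matches(SAMPLE_LDAR_X0_X1, load_ex_new)` is false, because the new mask also checks bit 23.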
Lines changed: 93 additions & 0 deletions
@@ -0,0 +1,93 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_RQSPINLOCK_H
+#define _ASM_RQSPINLOCK_H
+
+#include <asm/barrier.h>
+
+/*
+ * Hardcode res_smp_cond_load_acquire implementations for arm64 to a custom
+ * version based on [0]. In rqspinlock code, our conditional expression involves
+ * checking the value _and_ additionally a timeout. However, on arm64, the
+ * WFE-based implementation may never spin again if no stores occur to the
+ * locked byte in the lock word. As such, we may be stuck forever if
+ * event-stream based unblocking is not available on the platform for WFE spin
+ * loops (arch_timer_evtstrm_available).
+ *
+ * Once support for smp_cond_load_acquire_timewait [0] lands, we can drop this
+ * copy-paste.
+ *
+ * While we rely on the implementation to amortize the cost of sampling
+ * cond_expr for us, it will not happen when event stream support is
+ * unavailable, time_expr check is amortized. This is not the common case, and
+ * it would be difficult to fit our logic in the time_expr_ns >= time_limit_ns
+ * comparison, hence just let it be. In case of event-stream, the loop is woken
+ * up at microsecond granularity.
+ *
+ * [0]: https://lore.kernel.org/lkml/20250203214911.898276-1-ankur.a.arora@oracle.com
+ */
+
+#ifndef smp_cond_load_acquire_timewait
+
+#define smp_cond_time_check_count	200
+
+#define __smp_cond_load_relaxed_spinwait(ptr, cond_expr, time_expr_ns,	\
+					 time_limit_ns) ({		\
+	typeof(ptr) __PTR = (ptr);					\
+	__unqual_scalar_typeof(*ptr) VAL;				\
+	unsigned int __count = 0;					\
+	for (;;) {							\
+		VAL = READ_ONCE(*__PTR);				\
+		if (cond_expr)						\
+			break;						\
+		cpu_relax();						\
+		if (__count++ < smp_cond_time_check_count)		\
+			continue;					\
+		if ((time_expr_ns) >= (time_limit_ns))			\
+			break;						\
+		__count = 0;						\
+	}								\
+	(typeof(*ptr))VAL;						\
+})
+
+#define __smp_cond_load_acquire_timewait(ptr, cond_expr,		\
+					 time_expr_ns, time_limit_ns)	\
+({									\
+	typeof(ptr) __PTR = (ptr);					\
+	__unqual_scalar_typeof(*ptr) VAL;				\
+	for (;;) {							\
+		VAL = smp_load_acquire(__PTR);				\
+		if (cond_expr)						\
+			break;						\
+		__cmpwait_relaxed(__PTR, VAL);				\
+		if ((time_expr_ns) >= (time_limit_ns))			\
+			break;						\
+	}								\
+	(typeof(*ptr))VAL;						\
+})
+
+#define smp_cond_load_acquire_timewait(ptr, cond_expr,			\
+				      time_expr_ns, time_limit_ns)	\
+({									\
+	__unqual_scalar_typeof(*ptr) _val;				\
+	int __wfe = arch_timer_evtstrm_available();			\
+									\
+	if (likely(__wfe)) {						\
+		_val = __smp_cond_load_acquire_timewait(ptr, cond_expr,	\
+							time_expr_ns,	\
+							time_limit_ns);	\
+	} else {							\
+		_val = __smp_cond_load_relaxed_spinwait(ptr, cond_expr,	\
+							time_expr_ns,	\
+							time_limit_ns);	\
+		smp_acquire__after_ctrl_dep();				\
+	}								\
+	(typeof(*ptr))_val;						\
+})
+
+#endif
+
+#define res_smp_cond_load_acquire(v, c) smp_cond_load_acquire_timewait(v, c, 0, 1)
+
+#include <asm-generic/rqspinlock.h>
+
+#endif /* _ASM_RQSPINLOCK_H */
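
The fallback path `__smp_cond_load_relaxed_spinwait` polls the lock word on every iteration but only consults the clock once `smp_cond_time_check_count` polls have gone by. A user-space approximation with C11 atomics is sketched below. It is not the kernel macro: the function names, the callback-based clock and the equality condition are all assumptions made to keep the example self-contained, and `cpu_relax()` is omitted.

```c
#include <stdatomic.h>
#include <stdint.h>

#define TIME_CHECK_COUNT 200	/* mirrors smp_cond_time_check_count */

/* Hypothetical clock callback, so the loop can be driven by a fake clock. */
typedef uint64_t (*clock_fn)(void *ctx);

/* Demo clock: returns 1, 2, 3, ... and counts how often it was sampled. */
uint64_t counting_clock(void *ctx)
{
	uint64_t *ticks = ctx;

	return ++*ticks;
}

/*
 * Spin until *ptr == want or until now(ctx) >= limit_ns. The clock is only
 * sampled after TIME_CHECK_COUNT polls that did not see the wanted value.
 * Returns the last value read from *ptr.
 */
uint32_t spin_until_equal(_Atomic uint32_t *ptr, uint32_t want,
			  clock_fn now, void *ctx, uint64_t limit_ns)
{
	unsigned int count = 0;
	uint32_t val;

	for (;;) {
		val = atomic_load_explicit(ptr, memory_order_relaxed);
		if (val == want)
			break;
		if (count++ < TIME_CHECK_COUNT)
			continue;
		if (now(ctx) >= limit_ns)
			break;
		count = 0;
	}
	/* Rough stand-in for smp_acquire__after_ctrl_dep(). */
	atomic_thread_fence(memory_order_acquire);
	return val;
}
```

Note that `res_smp_cond_load_acquire(v, c)` passes `0` and `1` as the time expression and limit, so the `>=` comparison never fires there; per the header comment, the timeout is checked as part of the condition expression instead.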

arch/arm64/lib/insn.c

Lines changed: 29 additions & 0 deletions
@@ -541,6 +541,35 @@ u32 aarch64_insn_gen_load_store_pair(enum aarch64_insn_register reg1,
 					     offset >> shift);
 }
 
+u32 aarch64_insn_gen_load_acq_store_rel(enum aarch64_insn_register reg,
+					enum aarch64_insn_register base,
+					enum aarch64_insn_size_type size,
+					enum aarch64_insn_ldst_type type)
+{
+	u32 insn;
+
+	switch (type) {
+	case AARCH64_INSN_LDST_LOAD_ACQ:
+		insn = aarch64_insn_get_load_acq_value();
+		break;
+	case AARCH64_INSN_LDST_STORE_REL:
+		insn = aarch64_insn_get_store_rel_value();
+		break;
+	default:
+		pr_err("%s: unknown load-acquire/store-release encoding %d\n",
+		       __func__, type);
+		return AARCH64_BREAK_FAULT;
+	}
+
+	insn = aarch64_insn_encode_ldst_size(size, insn);
+
+	insn = aarch64_insn_encode_register(AARCH64_INSN_REGTYPE_RT, insn,
+					    reg);
+
+	return aarch64_insn_encode_register(AARCH64_INSN_REGTYPE_RN, insn,
+					    base);
+}
+
 u32 aarch64_insn_gen_load_store_ex(enum aarch64_insn_register reg,
 				   enum aarch64_insn_register base,
 				   enum aarch64_insn_register state,

arch/arm64/net/bpf_jit.h

Lines changed: 20 additions & 0 deletions
@@ -119,6 +119,26 @@
 	aarch64_insn_gen_load_store_ex(Rt, Rn, Rs, A64_SIZE(sf), \
 				       AARCH64_INSN_LDST_STORE_REL_EX)
 
+/* Load-acquire & store-release */
+#define A64_LDAR(Rt, Rn, size)  \
+	aarch64_insn_gen_load_acq_store_rel(Rt, Rn, AARCH64_INSN_SIZE_##size, \
+					    AARCH64_INSN_LDST_LOAD_ACQ)
+#define A64_STLR(Rt, Rn, size)  \
+	aarch64_insn_gen_load_acq_store_rel(Rt, Rn, AARCH64_INSN_SIZE_##size, \
+					    AARCH64_INSN_LDST_STORE_REL)
+
+/* Rt = [Rn] (load acquire) */
+#define A64_LDARB(Wt, Xn)	A64_LDAR(Wt, Xn, 8)
+#define A64_LDARH(Wt, Xn)	A64_LDAR(Wt, Xn, 16)
+#define A64_LDAR32(Wt, Xn)	A64_LDAR(Wt, Xn, 32)
+#define A64_LDAR64(Xt, Xn)	A64_LDAR(Xt, Xn, 64)
+
+/* [Rn] = Rt (store release) */
+#define A64_STLRB(Wt, Xn)	A64_STLR(Wt, Xn, 8)
+#define A64_STLRH(Wt, Xn)	A64_STLR(Wt, Xn, 16)
+#define A64_STLR32(Wt, Xn)	A64_STLR(Wt, Xn, 32)
+#define A64_STLR64(Xt, Xn)	A64_STLR(Xt, Xn, 64)
+
 /*
  * LSE atomics
  *
