diff options
| author | David S. Miller <davem@davemloft.net> | 2016-03-08 15:28:33 -0500 |
|---|---|---|
| committer | David S. Miller <davem@davemloft.net> | 2016-03-08 15:28:33 -0500 |
| commit | f14b488d50b7dc234ddaed53ce4293c9eac47457 (patch) | |
| tree | ff802293cd7a0020211818c8039cd9f117f30a35 /include/linux | |
| parent | 8aba8b83128a04197991518e241aafd3323b705d (diff) | |
| parent | c3f85cffc50d2f259903555979581a632b945ec2 (diff) | |
| download | cachepc-linux-f14b488d50b7dc234ddaed53ce4293c9eac47457.tar.gz cachepc-linux-f14b488d50b7dc234ddaed53ce4293c9eac47457.zip | |
Merge branch 'bpf-map-prealloc'
Alexei Starovoitov says:
====================
bpf: map pre-alloc
v1->v2:
. fix few issues spotted by Daniel
. converted stackmap into pre-allocation as well
. added a workaround for lockdep false positive
. added pcpu_freelist_populate to be used by hashmap and stackmap
this path set switches bpf hash map to use pre-allocation by default
and introduces BPF_F_NO_PREALLOC flag to keep old behavior for cases
where full map pre-allocation is too memory expensive.
Some time back Daniel Wagner reported crashes when bpf hash map is
used to compute time intervals between preempt_disable->preempt_enable
and recently Tom Zanussi reported a dead lock in iovisor/bcc/funccount
tool if it's used to count the number of invocations of kernel
'*spin*' functions. Both problems are due to the recursive use of
slub and can only be solved by pre-allocating all map elements.
A lot of different solutions were considered. Many implemented,
but at the end pre-allocation seems to be the only feasible answer.
As far as pre-allocation goes it also was implemented 4 different ways:
- simple free-list with single lock
- percpu_ida with optimizations
- blk-mq-tag variant customized for bpf use case
- percpu_freelist
For bpf style of alloc/free patterns percpu_freelist is the best
and implemented in this patch set.
Detailed performance numbers in patch 3.
Patch 2 introduces percpu_freelist
Patch 1 fixes simple deadlocks due to missing recursion checks
Patch 5: converts stackmap to pre-allocation
Patches 6-9: prepare test infra
Patch 10: stress test for hash map infra. It attaches to spin_lock
functions and bpf_map_update/delete are called from different contexts
Patch 11: stress for bpf_get_stackid
Patch 12: map performance test
Reported-by: Daniel Wagner <daniel.wagner@bmw-carit.de>
Reported-by: Tom Zanussi <tom.zanussi@linux.intel.com>
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Diffstat (limited to 'include/linux')
| -rw-r--r-- | include/linux/bpf.h | 6 |
1 files changed, 6 insertions, 0 deletions
diff --git a/include/linux/bpf.h b/include/linux/bpf.h index 51e498e5470e..21ee41b92e8a 100644 --- a/include/linux/bpf.h +++ b/include/linux/bpf.h @@ -10,6 +10,7 @@ #include <uapi/linux/bpf.h> #include <linux/workqueue.h> #include <linux/file.h> +#include <linux/percpu.h> struct bpf_map; @@ -36,6 +37,7 @@ struct bpf_map { u32 key_size; u32 value_size; u32 max_entries; + u32 map_flags; u32 pages; struct user_struct *user; const struct bpf_map_ops *ops; @@ -163,6 +165,8 @@ bool bpf_prog_array_compatible(struct bpf_array *array, const struct bpf_prog *f const struct bpf_func_proto *bpf_get_trace_printk_proto(void); #ifdef CONFIG_BPF_SYSCALL +DECLARE_PER_CPU(int, bpf_prog_active); + void bpf_register_prog_type(struct bpf_prog_type_list *tl); void bpf_register_map_type(struct bpf_map_type_list *tl); @@ -175,6 +179,7 @@ struct bpf_map *__bpf_map_get(struct fd f); void bpf_map_inc(struct bpf_map *map, bool uref); void bpf_map_put_with_uref(struct bpf_map *map); void bpf_map_put(struct bpf_map *map); +int bpf_map_precharge_memlock(u32 pages); extern int sysctl_unprivileged_bpf_disabled; @@ -190,6 +195,7 @@ int bpf_percpu_hash_update(struct bpf_map *map, void *key, void *value, u64 flags); int bpf_percpu_array_update(struct bpf_map *map, void *key, void *value, u64 flags); +int bpf_stackmap_copy(struct bpf_map *map, void *key, void *value); /* memcpy that is used with 8-byte aligned pointers, power-of-8 size and * forced to use 'long' read/writes to try to atomically copy long counters. |
