From 94604548aa7163fa14b837149bb0cb708bc613bc Mon Sep 17 00:00:00 2001 From: Andrea Mayer Date: Tue, 27 Apr 2021 17:44:04 +0200 Subject: seg6: add counters support for SRv6 Behaviors This patch provides counters for SRv6 Behaviors as defined in [1], section 6. For each SRv6 Behavior instance, counters defined in [1] are: - the total number of packets that have been correctly processed; - the total amount of traffic in bytes of all packets that have been correctly processed; In addition, this patch introduces a new counter that counts the number of packets that have NOT been properly processed (i.e. errors) by an SRv6 Behavior instance. Counters are not only interesting for network monitoring purposes (i.e. counting the number of packets processed by a given behavior) but they also provide a simple tool for checking whether a behavior instance is working as we expect or not. Counters can be useful for troubleshooting misconfigured SRv6 networks. Indeed, an SRv6 Behavior can silently drop packets for very different reasons (i.e. wrong SID configuration, interfaces set with SID addresses, etc) without any notification/message to the user. Due to the nature of SRv6 networks, diagnostic tools such as ping and traceroute may be ineffective: paths used for reaching a given router can be totally different from the ones followed by probe packets. In addition, paths are often asymmetrical and this makes it even more difficult to keep up with the journey of the packets and to understand which behaviors are actually processing our traffic. When counters are enabled on an SRv6 Behavior instance, it is possible to verify if packets are actually processed by such behavior and what is the outcome of the processing. Therefore, the counters for SRv6 Behaviors offer an non-invasive observability point which can be leveraged for both traffic monitoring and troubleshooting purposes. [1] https://www.rfc-editor.org/rfc/rfc8986.html#name-counters Troubleshooting using SRv6 Behavior counters -------------------------------------------- Let's make a brief example to see how helpful counters can be for SRv6 networks. Let's consider a node where an SRv6 End Behavior receives an SRv6 packet whose Segment Left (SL) is equal to 0. In this case, the End Behavior (which accepts only packets with SL >= 1) discards the packet and increases the error counter. This information can be leveraged by the network operator for troubleshooting. Indeed, the error counter is telling the user that the packet: (i) arrived at the node; (ii) the packet has been taken into account by the SRv6 End behavior; (iii) but an error has occurred during the processing. The error (iii) could be caused by different reasons, such as wrong route settings on the node or due to an invalid SID List carried by the SRv6 packet. Anyway, the error counter is used to exclude that the packet did not arrive at the node or it has not been processed by the behavior at all. Turning on/off counters for SRv6 Behaviors ------------------------------------------ Each SRv6 Behavior instance can be configured, at the time of its creation, to make use of counters. This is done through iproute2 which allows the user to create an SRv6 Behavior instance specifying the optional "count" attribute as shown in the following example: $ ip -6 route add 2001:db8::1 encap seg6local action End count dev eth0 per-behavior counters can be shown by adding "-s" to the iproute2 command line, i.e.: $ ip -s -6 route show 2001:db8::1 2001:db8::1 encap seg6local action End packets 0 bytes 0 errors 0 dev eth0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Impact of counters for SRv6 Behaviors on performance ==================================================== To determine the performance impact due to the introduction of counters in the SRv6 Behavior subsystem, we have carried out extensive tests. We chose to test the throughput achieved by the SRv6 End.DX2 Behavior because, among all the other behaviors implemented so far, it reaches the highest throughput which is around 1.5 Mpps (per core at 2.4 GHz on a Xeon(R) CPU E5-2630 v3) on kernel 5.12-rc2 using packets of size ~ 100 bytes. Three different tests were conducted in order to evaluate the overall throughput of the SRv6 End.DX2 Behavior in the following scenarios: 1) vanilla kernel (without the SRv6 Behavior counters patch) and a single instance of an SRv6 End.DX2 Behavior; 2) patched kernel with SRv6 Behavior counters and a single instance of an SRv6 End.DX2 Behavior with counters turned off; 3) patched kernel with SRv6 Behavior counters and a single instance of SRv6 End.DX2 Behavior with counters turned on. All tests were performed on a testbed deployed on the CloudLab facilities [2], a flexible infrastructure dedicated to scientific research on the future of Cloud Computing. Results of tests are shown in the following table: Scenario (1): average 1504764,81 pps (~1504,76 kpps); std. dev 3956,82 pps Scenario (2): average 1501469,78 pps (~1501,47 kpps); std. dev 2979,85 pps Scenario (3): average 1501315,13 pps (~1501,32 kpps); std. dev 2956,00 pps As can be observed, throughputs achieved in scenarios (2),(3) did not suffer any observable degradation compared to scenario (1). Thanks to Jakub Kicinski and David Ahern for their valuable suggestions and comments provided during the discussion of the proposed RFCs. [2] https://www.cloudlab.us Signed-off-by: Andrea Mayer Reviewed-by: David Ahern Signed-off-by: David S. Miller --- include/uapi/linux/seg6_local.h | 30 ++++++++++++++++++++++++++++++ 1 file changed, 30 insertions(+) (limited to 'include/uapi/linux') diff --git a/include/uapi/linux/seg6_local.h b/include/uapi/linux/seg6_local.h index 3b39ef1dbb46..5ae3ace84de0 100644 --- a/include/uapi/linux/seg6_local.h +++ b/include/uapi/linux/seg6_local.h @@ -27,6 +27,7 @@ enum { SEG6_LOCAL_OIF, SEG6_LOCAL_BPF, SEG6_LOCAL_VRFTABLE, + SEG6_LOCAL_COUNTERS, __SEG6_LOCAL_MAX, }; #define SEG6_LOCAL_MAX (__SEG6_LOCAL_MAX - 1) @@ -78,4 +79,33 @@ enum { #define SEG6_LOCAL_BPF_PROG_MAX (__SEG6_LOCAL_BPF_PROG_MAX - 1) +/* SRv6 Behavior counters are encoded as netlink attributes guaranteeing the + * correct alignment. + * Each counter is identified by a different attribute type (i.e. + * SEG6_LOCAL_CNT_PACKETS). + * + * - SEG6_LOCAL_CNT_PACKETS: identifies a counter that counts the number of + * packets that have been CORRECTLY processed by an SRv6 Behavior instance + * (i.e., packets that generate errors or are dropped are NOT counted). + * + * - SEG6_LOCAL_CNT_BYTES: identifies a counter that counts the total amount + * of traffic in bytes of all packets that have been CORRECTLY processed by + * an SRv6 Behavior instance (i.e., packets that generate errors or are + * dropped are NOT counted). + * + * - SEG6_LOCAL_CNT_ERRORS: identifies a counter that counts the number of + * packets that have NOT been properly processed by an SRv6 Behavior instance + * (i.e., packets that generate errors or are dropped). + */ +enum { + SEG6_LOCAL_CNT_UNSPEC, + SEG6_LOCAL_CNT_PAD, /* pad for 64 bits values */ + SEG6_LOCAL_CNT_PACKETS, + SEG6_LOCAL_CNT_BYTES, + SEG6_LOCAL_CNT_ERRORS, + __SEG6_LOCAL_CNT_MAX, +}; + +#define SEG6_LOCAL_CNT_MAX (__SEG6_LOCAL_CNT_MAX - 1) + #endif -- cgit v1.2.3-71-gd317 From c7d13358b6a2f49f81a34aa323a2d0878a0532a2 Mon Sep 17 00:00:00 2001 From: Pablo Neira Ayuso Date: Fri, 30 Apr 2021 14:00:13 +0200 Subject: netfilter: xt_SECMARK: add new revision to fix structure layout This extension breaks when trying to delete rules, add a new revision to fix this. Fixes: 5e6874cdb8de ("[SECMARK]: Add xtables SECMARK target") Signed-off-by: Phil Sutter Signed-off-by: Pablo Neira Ayuso --- include/uapi/linux/netfilter/xt_SECMARK.h | 6 +++ net/netfilter/xt_SECMARK.c | 88 ++++++++++++++++++++++++------- 2 files changed, 75 insertions(+), 19 deletions(-) (limited to 'include/uapi/linux') diff --git a/include/uapi/linux/netfilter/xt_SECMARK.h b/include/uapi/linux/netfilter/xt_SECMARK.h index 1f2a708413f5..beb2cadba8a9 100644 --- a/include/uapi/linux/netfilter/xt_SECMARK.h +++ b/include/uapi/linux/netfilter/xt_SECMARK.h @@ -20,4 +20,10 @@ struct xt_secmark_target_info { char secctx[SECMARK_SECCTX_MAX]; }; +struct xt_secmark_target_info_v1 { + __u8 mode; + char secctx[SECMARK_SECCTX_MAX]; + __u32 secid; +}; + #endif /*_XT_SECMARK_H_target */ diff --git a/net/netfilter/xt_SECMARK.c b/net/netfilter/xt_SECMARK.c index 75625d13e976..498a0bf6f044 100644 --- a/net/netfilter/xt_SECMARK.c +++ b/net/netfilter/xt_SECMARK.c @@ -24,10 +24,9 @@ MODULE_ALIAS("ip6t_SECMARK"); static u8 mode; static unsigned int -secmark_tg(struct sk_buff *skb, const struct xt_action_param *par) +secmark_tg(struct sk_buff *skb, const struct xt_secmark_target_info_v1 *info) { u32 secmark = 0; - const struct xt_secmark_target_info *info = par->targinfo; switch (mode) { case SECMARK_MODE_SEL: @@ -41,7 +40,7 @@ secmark_tg(struct sk_buff *skb, const struct xt_action_param *par) return XT_CONTINUE; } -static int checkentry_lsm(struct xt_secmark_target_info *info) +static int checkentry_lsm(struct xt_secmark_target_info_v1 *info) { int err; @@ -73,15 +72,15 @@ static int checkentry_lsm(struct xt_secmark_target_info *info) return 0; } -static int secmark_tg_check(const struct xt_tgchk_param *par) +static int +secmark_tg_check(const char *table, struct xt_secmark_target_info_v1 *info) { - struct xt_secmark_target_info *info = par->targinfo; int err; - if (strcmp(par->table, "mangle") != 0 && - strcmp(par->table, "security") != 0) { + if (strcmp(table, "mangle") != 0 && + strcmp(table, "security") != 0) { pr_info_ratelimited("only valid in \'mangle\' or \'security\' table, not \'%s\'\n", - par->table); + table); return -EINVAL; } @@ -116,25 +115,76 @@ static void secmark_tg_destroy(const struct xt_tgdtor_param *par) } } -static struct xt_target secmark_tg_reg __read_mostly = { - .name = "SECMARK", - .revision = 0, - .family = NFPROTO_UNSPEC, - .checkentry = secmark_tg_check, - .destroy = secmark_tg_destroy, - .target = secmark_tg, - .targetsize = sizeof(struct xt_secmark_target_info), - .me = THIS_MODULE, +static int secmark_tg_check_v0(const struct xt_tgchk_param *par) +{ + struct xt_secmark_target_info *info = par->targinfo; + struct xt_secmark_target_info_v1 newinfo = { + .mode = info->mode, + }; + int ret; + + memcpy(newinfo.secctx, info->secctx, SECMARK_SECCTX_MAX); + + ret = secmark_tg_check(par->table, &newinfo); + info->secid = newinfo.secid; + + return ret; +} + +static unsigned int +secmark_tg_v0(struct sk_buff *skb, const struct xt_action_param *par) +{ + const struct xt_secmark_target_info *info = par->targinfo; + struct xt_secmark_target_info_v1 newinfo = { + .secid = info->secid, + }; + + return secmark_tg(skb, &newinfo); +} + +static int secmark_tg_check_v1(const struct xt_tgchk_param *par) +{ + return secmark_tg_check(par->table, par->targinfo); +} + +static unsigned int +secmark_tg_v1(struct sk_buff *skb, const struct xt_action_param *par) +{ + return secmark_tg(skb, par->targinfo); +} + +static struct xt_target secmark_tg_reg[] __read_mostly = { + { + .name = "SECMARK", + .revision = 0, + .family = NFPROTO_UNSPEC, + .checkentry = secmark_tg_check_v0, + .destroy = secmark_tg_destroy, + .target = secmark_tg_v0, + .targetsize = sizeof(struct xt_secmark_target_info), + .me = THIS_MODULE, + }, + { + .name = "SECMARK", + .revision = 1, + .family = NFPROTO_UNSPEC, + .checkentry = secmark_tg_check_v1, + .destroy = secmark_tg_destroy, + .target = secmark_tg_v1, + .targetsize = sizeof(struct xt_secmark_target_info_v1), + .usersize = offsetof(struct xt_secmark_target_info_v1, secid), + .me = THIS_MODULE, + }, }; static int __init secmark_tg_init(void) { - return xt_register_target(&secmark_tg_reg); + return xt_register_targets(secmark_tg_reg, ARRAY_SIZE(secmark_tg_reg)); } static void __exit secmark_tg_exit(void) { - xt_unregister_target(&secmark_tg_reg); + xt_unregister_targets(secmark_tg_reg, ARRAY_SIZE(secmark_tg_reg)); } module_init(secmark_tg_init); -- cgit v1.2.3-71-gd317