=================================
Using ftrace to hook to functions
=================================

.. Copyright 2017 VMware Inc.
..   Author:   Steven Rostedt <srostedt@goodmis.org>
..  License:   The GNU Free Documentation License, Version 1.2
..               (dual licensed under the GPL v2)

Written for: 4.14

Introduction
============

The ftrace infrastructure was originally created to attach callbacks to the
beginning of functions in order to record and trace the flow of the kernel.
But callbacks at the start of a function have other uses as well, such as
live kernel patching or security monitoring. This document describes
how to use ftrace to implement your own function callbacks.


The ftrace context
==================

.. warning::

  The ability to add a callback to almost any function within the
  kernel comes with risks. A callback can be called from any context
  (normal, softirq, irq, and NMI). Callbacks can also be called just before
  going to idle, during CPU bring up and takedown, or going to user space.
  This requires extra care about what can be done inside a callback. A
  callback can be called outside the protective scope of RCU.

There are helper functions to protect against recursion and to make sure
that RCU is watching. These are explained below.


The ftrace_ops structure
========================

To register a function callback, an ftrace_ops is required. This structure
is used to tell ftrace what function should be called as the callback
as well as what protections the callback will perform and not require
ftrace to handle.

Only one field needs to be set when registering an ftrace_ops with ftrace:

.. code-block:: c

 struct ftrace_ops ops = {
       .func			= my_callback_func,
       .flags			= MY_FTRACE_FLAGS,
       .private			= any_private_data_structure,
 };

Both .flags and .private are optional. Only .func is required.

To enable tracing, call::

    register_ftrace_function(&ops);

To disable tracing, call::

    unregister_ftrace_function(&ops);

Both functions are declared in the header::

    #include <linux/ftrace.h>

The registered callback will start being called some time after
register_ftrace_function() is called and before it returns. The exact time
that callbacks start being called depends on the architecture and the
scheduling of services. The callback itself will have to handle any
synchronization if it must begin at an exact moment.

unregister_ftrace_function() guarantees that the callback is no longer
being called by functions after unregister_ftrace_function() returns.
Note that to provide this guarantee, unregister_ftrace_function() may take
some time to finish.


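The register/unregister life cycle described above can be tried outside the
kernel with a small userspace model. The sketch below is not kernel code:
struct model_ops, model_register(), model_unregister() and
model_function_entry() are hypothetical stand-ins for ftrace_ops,
register_ftrace_function(), unregister_ftrace_function() and the
fentry/mcount call site. It only illustrates that the callback runs between
registration and unregistration, and that .func is the one required field.

.. code-block:: c

	#include <stddef.h>

	/* Hypothetical userspace stand-in for struct ftrace_ops. */
	struct model_ops {
		void (*func)(unsigned long ip, unsigned long parent_ip,
			     struct model_ops *op);
		unsigned long flags;	/* optional, unused in this model */
		void *private;		/* optional data for the callback */
	};

	/* This model supports a single registered ops at a time. */
	static struct model_ops *model_active;

	/* Stand-in for register_ftrace_function(); returns 0 on success. */
	static int model_register(struct model_ops *ops)
	{
		if (!ops || !ops->func)
			return -1;	/* .func is the only required field */
		if (model_active)
			return -1;
		model_active = ops;
		return 0;
	}

	/* Stand-in for unregister_ftrace_function(). */
	static int model_unregister(struct model_ops *ops)
	{
		if (model_active != ops)
			return -1;
		model_active = NULL;
		return 0;
	}

	/* Stand-in for the hook at the start of a traced function. */
	static void model_function_entry(unsigned long ip, unsigned long parent_ip)
	{
		if (model_active)
			model_active->func(ip, parent_ip, model_active);
	}

	/* Example callback: counts calls through the private pointer. */
	static void count_calls(unsigned long ip, unsigned long parent_ip,
				struct model_ops *op)
	{
		(void)ip;
		(void)parent_ip;
		(*(unsigned long *)op->private)++;
	}

Unlike this model, the real register and unregister calls patch running code,
which is why their timing depends on the architecture.
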
The callback function
=====================

The prototype of the callback function is as follows (as of v4.14):

.. code-block:: c

   void callback_func(unsigned long ip, unsigned long parent_ip,
                      struct ftrace_ops *op, struct pt_regs *regs);

@ip
	This is the instruction pointer of the function that is being traced
	(where the fentry or mcount is within the function).

@parent_ip
	This is the instruction pointer of the function that called the
	function being traced (where the call of the function occurred).

@op
	This is a pointer to the ftrace_ops that was used to register the
	callback. This can be used to pass data to the callback via the
	private pointer.

@regs
	If the FTRACE_OPS_FL_SAVE_REGS or FTRACE_OPS_FL_SAVE_REGS_IF_SUPPORTED
	flags are set in the ftrace_ops structure, then this will be pointing
	to the pt_regs structure like it would be if a breakpoint was placed
	at the start of the function where ftrace was tracing. Otherwise it
	either contains garbage, or NULL.

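As a rough illustration of how the four parameters are typically consumed,
the following userspace sketch mirrors the parameter order of the prototype
above. It is an assumption-laden model, not kernel code: struct model_regs and
struct model_ops are hypothetical stand-ins for pt_regs and ftrace_ops, and
struct trace_stats is an invented example of data reached through the private
pointer.

.. code-block:: c

	#include <stddef.h>

	/* Hypothetical stand-ins for struct pt_regs and struct ftrace_ops. */
	struct model_regs {
		unsigned long ip;
	};

	struct model_ops {
		void *private;
	};

	/* Hypothetical per-ops data reached through op->private. */
	struct trace_stats {
		unsigned long calls;		/* callback invocations */
		unsigned long last_ip;		/* @ip of the last call */
		unsigned long last_parent_ip;	/* @parent_ip of that call */
		unsigned long with_regs;	/* invocations that got regs */
	};

	/* Same parameter order as the prototype above, using the model types. */
	static void model_callback(unsigned long ip, unsigned long parent_ip,
				   struct model_ops *op, struct model_regs *regs)
	{
		struct trace_stats *stats = op->private;

		stats->calls++;
		stats->last_ip = ip;
		stats->last_parent_ip = parent_ip;

		/*
		 * With FTRACE_OPS_FL_SAVE_REGS_IF_SUPPORTED the regs pointer
		 * may be NULL, so it is checked before any use.
		 */
		if (regs)
			stats->with_regs++;
	}

A callback that sets neither SAVE_REGS flag should not look at regs at all,
since the pointer may then hold garbage rather than NULL.
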
Protect your callback
=====================

As functions can be called from anywhere, and it is possible that a function
called by a callback may also be traced, and call that same callback,
recursion protection must be used. There are two helper functions that
can help in this regard. If you start your code with:

.. code-block:: c

	int bit;

	bit = ftrace_test_recursion_trylock(ip, parent_ip);
	if (bit < 0)
		return;

and end it with:

.. code-block:: c

	ftrace_test_recursion_unlock(bit);

then the code in between will be safe to use, even if it ends up calling a
function that the callback is tracing. Note, on success,
ftrace_test_recursion_trylock() will disable preemption, and
ftrace_test_recursion_unlock() will enable it again (if it was previously
enabled). The instruction pointer (ip) and its parent (parent_ip) are passed to
ftrace_test_recursion_trylock() to record where the recursion happened
(if CONFIG_FTRACE_RECORD_RECURSION is set).

Alternatively, if the FTRACE_OPS_FL_RECURSION flag is set on the ftrace_ops
(as explained below), then a helper trampoline will be used to test
for recursion for the callback and no recursion test needs to be done.
But this is at the expense of slightly more overhead from an extra
function call.

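The effect of the trylock/unlock pair can be seen in a simplified userspace
model. The helpers below are hypothetical: model_recursion_trylock() and
model_recursion_unlock() use a single flag, whereas the kernel helpers
distinguish between contexts and also deal with preemption. The sketch only
shows why a callback that calls a traced function does not recurse forever.

.. code-block:: c

	/*
	 * Hypothetical single-context recursion flag. The kernel helpers
	 * track the normal, softirq, irq and NMI contexts separately.
	 */
	static int model_in_callback;

	static int model_recursion_trylock(void)
	{
		if (model_in_callback)
			return -1;	/* already inside the callback */
		model_in_callback = 1;
		return 0;		/* value handed back to the unlock */
	}

	static void model_recursion_unlock(int bit)
	{
		(void)bit;
		model_in_callback = 0;
	}

	static unsigned long callback_runs;	 /* protected body executed */
	static unsigned long recursions_blocked; /* re-entries turned away */

	static void model_callback(void);

	/* A traced helper: entering it invokes the callback again. */
	static void traced_helper(void)
	{
		model_callback();
	}

	static void model_callback(void)
	{
		int bit;

		bit = model_recursion_trylock();
		if (bit < 0) {
			recursions_blocked++;
			return;
		}

		callback_runs++;
		traced_helper();	/* would recurse without the guard */

		model_recursion_unlock(bit);
	}

Each outer call runs the protected body once and rejects the nested call made
through traced_helper().
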
If your callback accesses any data or critical section that requires RCU
protection, it is best to make sure that RCU is "watching", otherwise
that data or critical section will not be protected as expected. In this
case add:

.. code-block:: c

	if (!rcu_is_watching())
		return;

Alternatively, if the FTRACE_OPS_FL_RCU flag is set on the ftrace_ops
(as explained below), then a helper trampoline will be used to test
for rcu_is_watching() for the callback and no other test needs to be done.
But this is at the expense of slightly more overhead from an extra
function call.


The ftrace FLAGS
================

The ftrace_ops flags are all defined and documented in include/linux/ftrace.h.
Some of the flags are used for internal infrastructure of ftrace, but the
ones that users should be aware of are the following:

FTRACE_OPS_FL_SAVE_REGS
	If the callback requires reading or modifying the pt_regs
	passed to the callback, then it must set this flag. Registering
	an ftrace_ops with this flag set on an architecture that does not
	support passing of pt_regs to the callback will fail.

FTRACE_OPS_FL_SAVE_REGS_IF_SUPPORTED
	Similar to SAVE_REGS, but registering an ftrace_ops on an
	architecture that does not support passing of regs will not fail
	with this flag set. The callback must check whether regs is NULL
	to determine if the architecture supports it.

FTRACE_OPS_FL_RECURSION
	By default, it is expected that the callback can handle recursion.
	But if the callback is not that worried about overhead, then
	setting this bit will add the recursion protection around the
	callback by calling a helper function that will do the recursion
	protection and only call the callback if it did not recurse.

	Note, if this flag is not set, and recursion does occur, it could
	cause the system to crash, and possibly reboot via a triple fault.

	Note, if this flag is set, then the callback will always be called
	with preemption disabled. If it is not set, then it is possible
	(but not guaranteed) that the callback will be called in
	preemptible context.

FTRACE_OPS_FL_IPMODIFY
	Requires FTRACE_OPS_FL_SAVE_REGS set. If the callback is to "hijack"
	the traced function (have another function called instead of the
	traced function), it requires setting this flag. This is what live
	kernel patches use. Without this flag the pt_regs->ip cannot be
	modified.

	Note, only one ftrace_ops with FTRACE_OPS_FL_IPMODIFY set may be
	registered to any given function at a time.

FTRACE_OPS_FL_RCU
	If this is set, then the callback will only be called by functions
	where RCU is "watching". This is required if the callback function
	performs any rcu_read_lock() operation.

	RCU stops watching when the system goes idle, while a CPU is taken
	down and brought back online, and when crossing from kernel to user
	space and back to kernel space. During these transitions, a callback
	may be executed and RCU synchronization will not protect it.

FTRACE_OPS_FL_PERMANENT
	If this is set on any ftrace ops, then tracing cannot be disabled by
	writing 0 to the proc sysctl ftrace_enabled. Equally, a callback with
	the flag set cannot be registered if ftrace_enabled is 0.

	Livepatch uses it so as not to lose the function redirection, so the
	system stays protected.


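The registration rules attached to the flags above can be summarised as a
small check. This is a userspace sketch of the rules as stated in this
document only: the MODEL_FL_* bit values and model_check_flags() are
hypothetical, the real values live in include/linux/ftrace.h, and how the
kernel detects or reports each violation is not modelled.

.. code-block:: c

	/* Hypothetical bit values, not the ones used by the kernel. */
	enum {
		MODEL_FL_SAVE_REGS		= 1 << 0,
		MODEL_FL_SAVE_REGS_IF_SUPPORTED	= 1 << 1,
		MODEL_FL_IPMODIFY		= 1 << 2,
		MODEL_FL_PERMANENT		= 1 << 3,
	};

	/*
	 * Return 0 if an ops with these flags satisfies the rules listed
	 * above, -1 otherwise.
	 */
	static int model_check_flags(unsigned long flags, int arch_has_regs,
				     int ftrace_enabled)
	{
		/* SAVE_REGS is a hard requirement on the architecture. */
		if ((flags & MODEL_FL_SAVE_REGS) && !arch_has_regs)
			return -1;

		/* IPMODIFY requires SAVE_REGS to be set as well. */
		if ((flags & MODEL_FL_IPMODIFY) && !(flags & MODEL_FL_SAVE_REGS))
			return -1;

		/* A PERMANENT ops cannot be registered while ftrace is off. */
		if ((flags & MODEL_FL_PERMANENT) && !ftrace_enabled)
			return -1;

		return 0;
	}

SAVE_REGS_IF_SUPPORTED never fails the check, which is why a callback using
it has to test regs for NULL at run time instead.
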
Filtering which functions to trace
==================================

If a callback is only to be called from specific functions, a filter must be
set up. The filters are added by name, or by ip if it is known.

.. code-block:: c

   int ftrace_set_filter(struct ftrace_ops *ops, unsigned char *buf,
                         int len, int reset);

@ops
	The ops to set the filter with.

@buf
	The string that holds the function filter text.

@len
	The length of the string.

@reset
	Non-zero to reset all filters before applying this filter.

Filters denote which functions should be enabled when tracing is enabled.
If @buf is NULL and @reset is set, all functions will be enabled for tracing.

The @buf can also be a glob expression to enable all functions that
match a specific pattern.

See Filter Commands in :file:`Documentation/trace/ftrace.rst`.

To just trace the schedule function:

.. code-block:: c

   ret = ftrace_set_filter(&ops, "schedule", strlen("schedule"), 0);

To add more functions, call ftrace_set_filter() more than once with the
@reset parameter set to zero. To remove the current filter set and replace it
with new functions defined by @buf, have @reset be non-zero.

To remove all the filtered functions and trace all functions:

.. code-block:: c

   ret = ftrace_set_filter(&ops, NULL, 0, 1);


Sometimes more than one function has the same name. To trace just a specific
function in this case, ftrace_set_filter_ip() can be used.

.. code-block:: c

   ret = ftrace_set_filter_ip(&ops, ip, 0, 0);

Note that the ip must be the address where the call to fentry or mcount is
located in the function. This function is used by perf and kprobes, which
get the ip address from the user (usually using debug info from the kernel).

If a glob is used to set the filter, functions can be added to a "notrace"
list that will prevent those functions from calling the callback.
The "notrace" list takes precedence over the "filter" list. If the
two lists are non-empty and contain the same functions, the callback will not
be called by any function.

An empty "notrace" list means to allow all functions defined by the filter
to be traced.

.. code-block:: c

   int ftrace_set_notrace(struct ftrace_ops *ops, unsigned char *buf,
                          int len, int reset);

This takes the same parameters as ftrace_set_filter() but will add the
functions it finds to the list of functions not to be traced. This is a
separate list from the filter list, and this function does not modify the
filter list.

A non-zero @reset will clear the "notrace" list before adding functions
that match @buf to it.

Clearing the "notrace" list is done the same way as clearing the filter list:

.. code-block:: c

  ret = ftrace_set_notrace(&ops, NULL, 0, 1);

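The way the two lists combine can be written down as a short decision
function. The following is a userspace model under stated assumptions, not the
kernel's implementation: struct model_list, model_list_has() and
model_should_trace() are hypothetical names, and only exact names are matched
(no globs).

.. code-block:: c

	#include <stddef.h>
	#include <string.h>

	/* Hypothetical model of one list of function names. */
	struct model_list {
		const char *names[8];
		size_t count;
	};

	static int model_list_has(const struct model_list *list, const char *name)
	{
		size_t i;

		for (i = 0; i < list->count; i++)
			if (strcmp(list->names[i], name) == 0)
				return 1;
		return 0;
	}

	/*
	 * Return 1 if a function named @name would call the callback:
	 * the notrace list wins over the filter list, and an empty
	 * filter list allows every function.
	 */
	static int model_should_trace(const struct model_list *filter,
				      const struct model_list *notrace,
				      const char *name)
	{
		if (model_list_has(notrace, name))
			return 0;
		if (filter->count == 0)
			return 1;
		return model_list_has(filter, name);
	}

With "schedule" in both lists and nothing else in the filter, the function
returns 0 for every name, matching the case where no function calls the
callback.
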
The filter and notrace lists may be changed at any time. If only a set of
functions should call the callback, it is best to set the filters before
registering the callback. But the changes may also happen after the callback
has been registered.

If a filter is in place, @reset is non-zero, and @buf contains a glob that
matches functions, the switch happens during the ftrace_set_filter() call.
At no time will all functions call the callback. This means that:

.. code-block:: c

   ftrace_set_filter(&ops, "schedule", strlen("schedule"), 1);

   register_ftrace_function(&ops);

   msleep(10);

   ftrace_set_filter(&ops, "try_to_wake_up", strlen("try_to_wake_up"), 1);

is not the same as:

.. code-block:: c

   ftrace_set_filter(&ops, "schedule", strlen("schedule"), 1);

   register_ftrace_function(&ops);

   msleep(10);

   ftrace_set_filter(&ops, NULL, 0, 1);

   ftrace_set_filter(&ops, "try_to_wake_up", strlen("try_to_wake_up"), 0);

because the latter has a short window, between the reset and the new setting
of the filter, in which all functions will call the callback.
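
The difference between the two sequences can be made visible with a small
userspace model. It is a sketch only: struct model_filter and
model_set_filter() are hypothetical, names are matched exactly, and each call
is treated as one indivisible update whose result is what traced functions
observe.

.. code-block:: c

	#include <stddef.h>

	#define MODEL_MAX_NAMES 8

	/* Hypothetical model of the filter list of one ops. */
	struct model_filter {
		const char *names[MODEL_MAX_NAMES];
		size_t count;		/* 0 means "trace all functions" */
		int saw_trace_all;	/* an update left the list empty */
	};

	/*
	 * Stand-in for ftrace_set_filter() with exact names only. Only the
	 * state after each call returns is observed in this model.
	 */
	static void model_set_filter(struct model_filter *f, const char *name,
				     int reset)
	{
		if (reset)
			f->count = 0;
		if (name && f->count < MODEL_MAX_NAMES)
			f->names[f->count++] = name;
		if (f->count == 0)
			f->saw_trace_all = 1;
	}

Replaying the first sequence never sets saw_trace_all, while the second sets
it at the NULL reset, which corresponds to the window in which every function
calls the callback.
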