.. SPDX-License-Identifier: GPL-2.0

====================================
File system Monitoring with fanotify
====================================

File system Error Reporting
===========================

Fanotify supports the FAN_FS_ERROR event type for file system-wide error
reporting.  It is meant to be used by file system health monitoring
daemons, which listen for these events and take actions (notify
sysadmin, start recovery) when a file system problem is detected.

By design, a FAN_FS_ERROR notification exposes sufficient information
for a monitoring tool to know that a problem in the file system has
happened.  It doesn't necessarily provide a user space application with
semantics to verify that an IO operation was successfully executed.
That is out of scope for this feature.  Instead, it is only meant as a
framework for early file system problem detection, reporting, and
recovery tools.

When a file system operation fails, it is common for dozens of kernel
errors to cascade after the initial failure, hiding the original failure
log, which is usually the most useful debug data to troubleshoot the
problem.  For this reason, FAN_FS_ERROR tries to report only the first
error that occurred for a file system since the last notification, and
it simply counts additional errors.  This ensures that the most
important pieces of information are never lost.

FAN_FS_ERROR requires the fanotify group to be set up with the
FAN_REPORT_FID flag.

At the time of this writing, the only file system that emits FAN_FS_ERROR
notifications is Ext4.

A FAN_FS_ERROR notification has the following format::

     [ Notification Metadata (Mandatory) ]
     [ Generic Error Record  (Mandatory) ]
     [ FID record            (Mandatory) ]

The order of records is not guaranteed, and new records might be added
in the future.  Therefore, applications must not rely on the order and
must be prepared to skip over unknown records.  Please refer to
``samples/fanotify/fs-monitor.c`` for an example parser.

Generic error record
--------------------

The generic error record provides enough information for a file system
agnostic tool to learn about a problem in the file system, without
providing any additional details about the problem.  This record is
identified by ``struct fanotify_event_info_header.info_type`` being set
to FAN_EVENT_INFO_TYPE_ERROR.

::

     struct fanotify_event_info_error {
         struct fanotify_event_info_header hdr;
         __s32 error;
         __u32 error_count;
     };

The ``error`` field identifies the type of error using errno values.
``error_count`` tracks the number of errors that occurred and were
suppressed to preserve the original error information, since the last
notification.

FID record
----------

The FID record can be used to uniquely identify the inode that triggered
the error through the combination of fsid and file handle.  A file system
specific application can use that information to attempt a recovery
procedure.  Errors that are not related to an inode are reported with an
empty file handle of type FILEID_INVALID.
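
The record layout described above can be walked with a loop that trusts
only the ``len`` field of each record header.  The following is a minimal
sketch, not the parser from ``samples/fanotify/fs-monitor.c``.  The
``struct fs_error_summary`` type and the ``parse_fs_error_event()``
function are hypothetical names introduced for illustration, and the
``#ifndef`` fallback assumes that older UAPI headers lack both the
constant and the structure.

```c
#include <stdint.h>
#include <string.h>
#include <linux/fanotify.h>

/*
 * Fallback for UAPI headers that predate FAN_FS_ERROR.  Assumption: the
 * constant and the structure are either both present or both missing.
 */
#ifndef FAN_EVENT_INFO_TYPE_ERROR
#define FAN_EVENT_INFO_TYPE_ERROR 5
struct fanotify_event_info_error {
	struct fanotify_event_info_header hdr;
	__s32 error;
	__u32 error_count;
};
#endif

/* Hypothetical summary type; not part of the fanotify API. */
struct fs_error_summary {
	int have_error;		/* 1 if a generic error record was seen */
	int32_t error;		/* errno value of the first error */
	uint32_t error_count;	/* value of the record's error_count field */
	int have_fid;		/* 1 if a FID record was seen */
	unsigned int skipped;	/* number of records of unknown type */
};

/*
 * Parse a single event starting at buf.  Records may come in any order
 * and unknown record types are skipped by advancing hdr.len bytes.
 * Returns 0 on success and -1 if the buffer is malformed.
 */
static int parse_fs_error_event(const void *buf, size_t buflen,
				struct fs_error_summary *out)
{
	struct fanotify_event_metadata meta;
	struct fanotify_event_info_header hdr;
	const char *p = buf;
	size_t off;

	memset(out, 0, sizeof(*out));
	if (buflen < sizeof(meta))
		return -1;
	memcpy(&meta, p, sizeof(meta));
	if (meta.metadata_len < sizeof(meta) ||
	    meta.metadata_len > meta.event_len || meta.event_len > buflen)
		return -1;

	off = meta.metadata_len;
	while (off < meta.event_len) {
		if (meta.event_len - off < sizeof(hdr))
			return -1;
		memcpy(&hdr, p + off, sizeof(hdr));
		/* A record must cover its header and fit in the event. */
		if (hdr.len < sizeof(hdr) || hdr.len > meta.event_len - off)
			return -1;

		switch (hdr.info_type) {
		case FAN_EVENT_INFO_TYPE_ERROR: {
			struct fanotify_event_info_error err;

			if (hdr.len < sizeof(err))
				return -1;
			memcpy(&err, p + off, sizeof(err));
			out->have_error = 1;
			out->error = err.error;
			out->error_count = err.error_count;
			break;
		}
		case FAN_EVENT_INFO_TYPE_FID:
			/* fsid and file handle follow the header. */
			out->have_fid = 1;
			break;
		default:
			out->skipped++;	/* unknown record: skip it */
			break;
		}
		off += hdr.len;
	}
	return 0;
}
```

A monitoring daemon would typically call such a helper once per event
while iterating over the buffer returned by ``read()`` with the
``FAN_EVENT_OK()`` and ``FAN_EVENT_NEXT()`` macros.  Copying each record
with ``memcpy()`` before inspecting it avoids relying on the alignment of
the read buffer.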