123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108 |
- APEI Error INJection
- ~~~~~~~~~~~~~~~~~~~~
- EINJ provides a hardware error injection mechanism
- It is very useful for debugging and testing of other APEI and RAS features.
- To use EINJ, make sure the following are enabled in your kernel
- configuration:
- CONFIG_DEBUG_FS
- CONFIG_ACPI_APEI
- CONFIG_ACPI_APEI_EINJ
- The user interface of EINJ is debug file system, under the
- directory apei/einj. The following files are provided.
- - available_error_type
- Reading this file returns the error injection capability of the
- platform, that is, which error types are supported. The error type
- definition is as follow, the left field is the error type value, the
- right field is error description.
- 0x00000001 Processor Correctable
- 0x00000002 Processor Uncorrectable non-fatal
- 0x00000004 Processor Uncorrectable fatal
- 0x00000008 Memory Correctable
- 0x00000010 Memory Uncorrectable non-fatal
- 0x00000020 Memory Uncorrectable fatal
- 0x00000040 PCI Express Correctable
- 0x00000080 PCI Express Uncorrectable fatal
- 0x00000100 PCI Express Uncorrectable non-fatal
- 0x00000200 Platform Correctable
- 0x00000400 Platform Uncorrectable non-fatal
- 0x00000800 Platform Uncorrectable fatal
- The format of file contents are as above, except there are only the
- available error type lines.
- - error_type
- This file is used to set the error type value. The error type value
- is defined in "available_error_type" description.
- - error_inject
- Write any integer to this file to trigger the error
- injection. Before this, please specify all necessary error
- parameters.
- - param1
- This file is used to set the first error parameter value. Effect of
- parameter depends on error_type specified.
- - param2
- This file is used to set the second error parameter value. Effect of
- parameter depends on error_type specified.
- - notrigger
- The EINJ mechanism is a two step process. First inject the error, then
- perform some actions to trigger it. Setting "notrigger" to 1 skips the
- trigger phase, which *may* allow the user to cause the error in some other
- context by a simple access to the cpu, memory location, or device that is
- the target of the error injection. Whether this actually works depends
- on what operations the BIOS actually includes in the trigger phase.
- BIOS versions based in the ACPI 4.0 specification have limited options
- to control where the errors are injected. Your BIOS may support an
- extension (enabled with the param_extension=1 module parameter, or
- boot command line einj.param_extension=1). This allows the address
- and mask for memory injections to be specified by the param1 and
- param2 files in apei/einj.
- BIOS versions using the ACPI 5.0 specification have more control over
- the target of the injection. For processor related errors (type 0x1,
- 0x2 and 0x4) the APICID of the target should be provided using the
- param1 file in apei/einj. For memory errors (type 0x8, 0x10 and 0x20)
- the address is set using param1 with a mask in param2 (0x0 is equivalent
- to all ones). For PCI express errors (type 0x40, 0x80 and 0x100) the
- segment, bus, device and function are specified using param1:
- 31 24 23 16 15 11 10 8 7 0
- +-------------------------------------------------+
- | segment | bus | device | function | reserved |
- +-------------------------------------------------+
- An ACPI 5.0 BIOS may also allow vendor specific errors to be injected.
- In this case a file named vendor will contain identifying information
- from the BIOS that hopefully will allow an application wishing to use
- the vendor specific extension to tell that they are running on a BIOS
- that supports it. All vendor extensions have the 0x80000000 bit set in
- error_type. A file vendor_flags controls the interpretation of param1
- and param2 (1 = PROCESSOR, 2 = MEMORY, 4 = PCI). See your BIOS vendor
- documentation for details (and expect changes to this API if vendors
- creativity in using this feature expands beyond our expectations).
- Example:
- # cd /sys/kernel/debug/apei/einj
- # cat available_error_type # See which errors can be injected
- 0x00000002 Processor Uncorrectable non-fatal
- 0x00000008 Memory Correctable
- 0x00000010 Memory Uncorrectable non-fatal
- # echo 0x12345000 > param1 # Set memory address for injection
- # echo 0xfffffffffffff000 > param2 # Mask - anywhere in this page
- # echo 0x8 > error_type # Choose correctable memory error
- # echo 1 > error_inject # Inject now
- For more information about EINJ, please refer to ACPI specification
- version 4.0, section 17.5 and ACPI 5.0, section 18.6.
|