sinitax/cachepc-linux - Fork of AMDESE/linux with modifications for CachePC side-channel attack

	Commit message (Collapse)	Author	Age	Files	Lines
*	crypto: ccp: Add SNP_INIT_EX support to initialize the AMD-SP for SEV-SNP	Ashish Kalra	2022-10-07	1	-5/+99
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	During the execution of SNP_INIT command, the firmware configures and enables SNP security policy enforcement in many system components. Some system components write to regions of memory reserved by early x86 firmware (e.g. UEFI). Other system components write to regions provided by the operation system, hypervisor, or x86 firmware. Such system components can only write to HV-fixed pages or Default pages. They will error when attempting to write to other page states after SNP_INIT enables their SNP enforcement. Starting in SNP firmware v1.52, the SNP_INIT_EX command takes a list of system physical address ranges to convert into the HV-fixed page states during the RMP initialization. If INIT_RMP is 1, hypervisors should provide all system physical address ranges that the hypervisor will never assign to a guest until the next RMP re-initialization. For instance, the memory that UEFI reserves should be included in the range list. This allows system components that occasionally write to memory (e.g. logging to UEFI reserved regions) to not fail due to RMP initialization and SNP enablement. Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
*	ccp: remove firmware pages in RMP table only if SNP is initialized	Ashish Kalra	2022-07-14	1	-1/+4
\| \| \| \| \| \| \|	While freeing firmware pages make sure SEV-SNP is initialized before removing the page in RMP table Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
*	ccp: add support to decrypt the page	Brijesh Singh	2022-07-13	1	-3/+30
\| \| \| \| \| \| \|	Add support to decrypt guest encrypted memory, these API interfaces can be used for example to dump VMCBs on SNP guest exit. Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
*	crypto: ccp: Provide APIs to query extended attestation report	Brijesh Singh	2022-07-13	1	-0/+43
\| \| \| \| \| \| \| \| \| \| \| \|	Version 2 of the GHCB specification defines VMGEXIT that is used to get the extended attestation report. The extended attestation report includes the certificate blobs provided through the SNP_SET_EXT_CONFIG. The snp_guest_ext_guest_request() will be used by the hypervisor to get the extended attestation report. See the GHCB specification for more details. Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
*	crypto: ccp: Add the SNP_{SET,GET}_EXT_CONFIG command	Brijesh Singh	2022-07-13	2	-0/+118
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The SEV-SNP firmware provides the SNP_CONFIG command used to set the system-wide configuration value for SNP guests. The information includes the TCB version string to be reported in guest attestation reports. Version 2 of the GHCB specification adds an NAE (SNP extended guest request) that a guest can use to query the reports that include additional certificates. In both cases, userspace provided additional data is included in the attestation reports. The userspace will use the SNP_SET_EXT_CONFIG command to give the certificate blob and the reported TCB version string at once. Note that the specification defines certificate blob with a specific GUID format; the userspace is responsible for building the proper certificate blob. The ioctl treats it an opaque blob. While it is not defined in the spec, but let's add SNP_GET_EXT_CONFIG command that can be used to obtain the data programmed through the SNP_SET_EXT_CONFIG. Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
*	crypto: ccp: Add the SNP_PLATFORM_STATUS command	Brijesh Singh	2022-07-13	1	-0/+45
\| \| \| \| \| \| \|	The command can be used by the userspace to query the SNP platform status report. See the SEV-SNP spec for more details. Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
*	crypto: ccp: Handle the legacy SEV command when SNP is enabled	Brijesh Singh	2022-07-13	2	-10/+348
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The behavior of the SEV-legacy commands is altered when the SNP firmware is in the INIT state. When SNP is in INIT state, all the SEV-legacy commands that cause the firmware to write to memory must be in the firmware state before issuing the command.. A command buffer may contains a system physical address that the firmware may write to. There are two cases that need to be handled: 1) system physical address points to a guest memory 2) system physical address points to a host memory To handle the case #1, change the page state to the firmware in the RMP table before issuing the command and restore the state to shared after the command completes. For the case #2, use a bounce buffer to complete the request. Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
*	crypto: ccp: Handle the legacy TMR allocation when SNP is enabled	Brijesh Singh	2022-07-13	1	-6/+167
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The behavior and requirement for the SEV-legacy command is altered when the SNP firmware is in the INIT state. See SEV-SNP firmware specification for more details. Allocate the Trusted Memory Region (TMR) as a 2mb sized/aligned region when SNP is enabled to satify new requirements for the SNP. Continue allocating a 1mb region for !SNP configuration. While at it, provide API that can be used by others to allocate a page that can be used by the firmware. The immediate user for this API will be the KVM driver. The KVM driver to need to allocate a firmware context page during the guest creation. The context page need to be updated by the firmware. See the SEV-SNP specification for further details. Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
*	crypto:ccp: Provide APIs to issue SEV-SNP commands	Brijesh Singh	2022-07-13	1	-0/+24
\| \| \| \| \| \| \|	Provide the APIs for the hypervisor to manage an SEV-SNP guest. The commands for SEV-SNP is defined in the SEV-SNP firmware specification. Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
*	crypto: ccp: Add support to initialize the AMD-SP for SEV-SNP	Brijesh Singh	2022-07-13	2	-0/+123
\| \| \| \| \| \| \| \| \| \|	Before SNP VMs can be launched, the platform must be appropriately configured and initialized. Platform initialization is accomplished via the SNP_INIT command. Make sure to do a WBINVD and issue DF_FLUSH command to prepare for the first SNP guest launch after INIT. Signed-off-by: Brijesh Singh <brijesh.singh@amd.com> Signed-off by: Ashish Kalra <ashish.kalra@amd.com>
*	crypto:ccp: Define the SEV-SNP commands	Brijesh Singh	2022-07-13	1	-0/+14
\| \| \| \| \| \| \| \| \| \| \| \|	AMD introduced the next generation of SEV called SEV-SNP (Secure Nested Paging). SEV-SNP builds upon existing SEV and SEV-ES functionality while adding new hardware security protection. Define the commands and structures used to communicate with the AMD-SP when creating and managing the SEV-SNP guests. The SEV-SNP firmware spec is available at developer.amd.com/sev. Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
*	crypto: s390 - do not depend on CRYPTO_HW for SIMD implementations	Jason A. Donenfeld	2022-07-06	1	-115/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Various accelerated software implementation Kconfig values for S390 were mistakenly placed into drivers/crypto/Kconfig, even though they're mainly just SIMD code and live in arch/s390/crypto/ like usual. This gives them the very unusual dependency on CRYPTO_HW, which leads to problems elsewhere. This patch fixes the issue by moving the Kconfig values for non-hardware drivers into the usual place in crypto/Kconfig. Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
*	crypto: ccp - Fix device IRQ counting by using platform_irq_count()	Tom Lendacky	2022-06-24	1	-10/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The ccp driver loops through the platform device resources array to get the IRQ count for the device. With commit a1a2b7125e10 ("of/platform: Drop static setup of IRQ resource from DT core"), the IRQ resources are no longer stored in the platform device resource array. As a result, the IRQ count is now always zero. This causes the driver to issue a second call to platform_get_irq(), which fails if the IRQ count is really 1, causing the loading of the driver to fail. Replace looping through the resources array to count the number of IRQs with a call to platform_irq_count(). Fixes: a1a2b7125e10 ("of/platform: Drop static setup of IRQ resource from DT core") Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Reviewed-by: Rob Herring <robh@kernel.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
*	virtio-crypto: enable retry for virtio-crypto-dev	lei he	2022-05-31	1	-1/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Enable retry for virtio-crypto-dev, so that crypto-engine can process cipher-requests parallelly. Cc: Michael S. Tsirkin <mst@redhat.com> Cc: Jason Wang <jasowang@redhat.com> Cc: Gonglei <arei.gonglei@huawei.com> Reviewed-by: Gonglei <arei.gonglei@huawei.com> Signed-off-by: lei he <helei.sig11@bytedance.com> Signed-off-by: zhenwei pi <pizhenwei@bytedance.com> Message-Id: <20220506131627.180784-6-pizhenwei@bytedance.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
*	virtio-crypto: adjust dst_len at ops callback	lei he	2022-05-31	1	-1/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	For some akcipher operations(eg, decryption of pkcs1pad(rsa)), the length of returned result maybe less than akcipher_req->dst_len, we need to recalculate the actual dst_len through the virt-queue protocol. Cc: Michael S. Tsirkin <mst@redhat.com> Cc: Jason Wang <jasowang@redhat.com> Cc: Gonglei <arei.gonglei@huawei.com> Reviewed-by: Gonglei <arei.gonglei@huawei.com> Signed-off-by: lei he <helei.sig11@bytedance.com> Signed-off-by: zhenwei pi <pizhenwei@bytedance.com> Message-Id: <20220506131627.180784-5-pizhenwei@bytedance.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
*	virtio-crypto: wait ctrl queue instead of busy polling	zhenwei pi	2022-05-31	4	-55/+64
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Originally, after submitting request into virtio crypto control queue, the guest side polls the result from the virt queue. This works like following: CPU0 CPU1 ... CPUx CPUy \| \| \| \| \ \ / / \--------spin_lock(&vcrypto->ctrl_lock)-------/ \| virtqueue add & kick \| busy poll virtqueue \| spin_unlock(&vcrypto->ctrl_lock) ... There are two problems: 1, The queue depth is always 1, the performance of a virtio crypto device gets limited. Multi user processes share a single control queue, and hit spin lock race from control queue. Test on Intel Platinum 8260, a single worker gets ~35K/s create/close session operations, and 8 workers get ~40K/s operations with 800% CPU utilization. 2, The control request is supposed to get handled immediately, but in the current implementation of QEMU(v6.2), the vCPU thread kicks another thread to do this work, the latency also gets unstable. Tracking latency of virtio_crypto_alg_akcipher_close_session in 5s: usecs : count distribution 0 -> 1 : 0 \| \| 2 -> 3 : 7 \| \| 4 -> 7 : 72 \| \| 8 -> 15 : 186485 \|************************\| 16 -> 31 : 687 \| \| 32 -> 63 : 5 \| \| 64 -> 127 : 3 \| \| 128 -> 255 : 1 \| \| 256 -> 511 : 0 \| \| 512 -> 1023 : 0 \| \| 1024 -> 2047 : 0 \| \| 2048 -> 4095 : 0 \| \| 4096 -> 8191 : 0 \| \| 8192 -> 16383 : 2 \| \| This means that a CPU may hold vcrypto->ctrl_lock as long as 8192~16383us. To improve the performance of control queue, a request on control queue waits completion instead of busy polling to reduce lock racing, and gets completed by control queue callback. CPU0 CPU1 ... CPUx CPUy \| \| \| \| \ \ / / \--------spin_lock(&vcrypto->ctrl_lock)-------/ \| virtqueue add & kick \| ---------spin_unlock(&vcrypto->ctrl_lock)------ / / \ \ \| \| \| \| wait wait wait wait Test this patch, the guest side get ~200K/s operations with 300% CPU utilization. Cc: Michael S. Tsirkin <mst@redhat.com> Cc: Jason Wang <jasowang@redhat.com> Cc: Gonglei <arei.gonglei@huawei.com> Reviewed-by: Gonglei <arei.gonglei@huawei.com> Signed-off-by: zhenwei pi <pizhenwei@bytedance.com> Message-Id: <20220506131627.180784-4-pizhenwei@bytedance.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
*	virtio-crypto: use private buffer for control request	zhenwei pi	2022-05-31	3	-45/+79
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Originally, all of the control requests share a single buffer( ctrl & input & ctrl_status fields in struct virtio_crypto), this allows queue depth 1 only, the performance of control queue gets limited by this design. In this patch, each request allocates request buffer dynamically, and free buffer after request, so the scope protected by ctrl_lock also get optimized here. It's possible to optimize control queue depth in the next step. A necessary comment is already in code, still describe it again: /* * Note: there are padding fields in request, clear them to zero before * sending to host to avoid to divulge any information. * Ex, virtio_crypto_ctrl_request::ctrl::u::destroy_session::padding[48] */ So use kzalloc to allocate buffer of struct virtio_crypto_ctrl_request. Potentially dereferencing uninitialized variables: Reported-by: kernel test robot <lkp@intel.com> Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Cc: Michael S. Tsirkin <mst@redhat.com> Cc: Jason Wang <jasowang@redhat.com> Cc: Gonglei <arei.gonglei@huawei.com> Reviewed-by: Gonglei <arei.gonglei@huawei.com> Signed-off-by: zhenwei pi <pizhenwei@bytedance.com> Message-Id: <20220506131627.180784-3-pizhenwei@bytedance.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
*	virtio-crypto: change code style	zhenwei pi	2022-05-31	2	-53/+59
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Use temporary variable to make code easy to read and maintain. /* Pad cipher's parameters */ vcrypto->ctrl.u.sym_create_session.op_type = cpu_to_le32(VIRTIO_CRYPTO_SYM_OP_CIPHER); vcrypto->ctrl.u.sym_create_session.u.cipher.para.algo = vcrypto->ctrl.header.algo; vcrypto->ctrl.u.sym_create_session.u.cipher.para.keylen = cpu_to_le32(keylen); vcrypto->ctrl.u.sym_create_session.u.cipher.para.op = cpu_to_le32(op); --> sym_create_session = &ctrl->u.sym_create_session; sym_create_session->op_type = cpu_to_le32(VIRTIO_CRYPTO_SYM_OP_CIPHER); sym_create_session->u.cipher.para.algo = ctrl->header.algo; sym_create_session->u.cipher.para.keylen = cpu_to_le32(keylen); sym_create_session->u.cipher.para.op = cpu_to_le32(op); The new style shows more obviously: - the variable we want to operate. - an assignment statement in a single line. Cc: Michael S. Tsirkin <mst@redhat.com> Cc: Jason Wang <jasowang@redhat.com> Cc: Gonglei <arei.gonglei@huawei.com> Reviewed-by: Gonglei <arei.gonglei@huawei.com> Signed-off-by: zhenwei pi <pizhenwei@bytedance.com> Message-Id: <20220506131627.180784-2-pizhenwei@bytedance.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
*	Merge tag 'powerpc-5.19-1' of ↵	Linus Torvalds	2022-05-28	1	-1/+1
\|\ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc updates from Michael Ellerman: - Convert to the generic mmap support (ARCH_WANT_DEFAULT_TOPDOWN_MMAP_LAYOUT) - Add support for outline-only KASAN with 64-bit Radix MMU (P9 or later) - Increase SIGSTKSZ and MINSIGSTKSZ and add support for AT_MINSIGSTKSZ - Enable the DAWR (Data Address Watchpoint) on POWER9 DD2.3 or later - Drop support for system call instruction emulation - Many other small features and fixes Thanks to Alexey Kardashevskiy, Alistair Popple, Andy Shevchenko, Bagas Sanjaya, Bjorn Helgaas, Bo Liu, Chen Huang, Christophe Leroy, Colin Ian King, Daniel Axtens, Dwaipayan Ray, Fabiano Rosas, Finn Thain, Frank Rowand, Fuqian Huang, Guilherme G. Piccoli, Hangyu Hua, Haowen Bai, Haren Myneni, Hari Bathini, He Ying, Jason Wang, Jiapeng Chong, Jing Yangyang, Joel Stanley, Julia Lawall, Kajol Jain, Kevin Hao, Krzysztof Kozlowski, Laurent Dufour, Lv Ruyi, Madhavan Srinivasan, Magali Lemes, Miaoqian Lin, Minghao Chi, Nathan Chancellor, Naveen N. Rao, Nicholas Piggin, Oliver O'Halloran, Oscar Salvador, Pali Rohár, Paul Mackerras, Peng Wu, Qing Wang, Randy Dunlap, Reza Arbab, Russell Currey, Sohaib Mohamed, Vaibhav Jain, Vasant Hegde, Wang Qing, Wang Wensheng, Xiang wangx, Xiaomeng Tong, Xu Wang, Yang Guang, Yang Li, Ye Bin, YueHaibing, Yu Kuai, Zheng Bin, Zou Wei, and Zucheng Zheng. * tag 'powerpc-5.19-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (200 commits) powerpc/64: Include cache.h directly in paca.h powerpc/64s: Only set HAVE_ARCH_UNMAPPED_AREA when CONFIG_PPC_64S_HASH_MMU is set powerpc/xics: Include missing header powerpc/powernv/pci: Drop VF MPS fixup powerpc/fsl_book3e: Don't set rodata RO too early powerpc/microwatt: Add mmu bits to device tree powerpc/powernv/flash: Check OPAL flash calls exist before using powerpc/powermac: constify device_node in of_irq_parse_oldworld() powerpc/powermac: add missing g5_phy_disable_cpu1() declaration selftests/powerpc/pmu: fix spelling mistake "mis-match" -> "mismatch" powerpc: Enable the DAWR on POWER9 DD2.3 and above powerpc/64s: Add CPU_FTRS_POWER10 to ALWAYS mask powerpc/64s: Add CPU_FTRS_POWER9_DD2_2 to CPU_FTRS_ALWAYS mask powerpc: Fix all occurences of "the the" selftests/powerpc/pmu/ebb: remove fixed_instruction.S powerpc/platforms/83xx: Use of_device_get_match_data() powerpc/eeh: Drop redundant spinlock initialization powerpc/iommu: Add missing of_node_put in iommu_init_early_dart powerpc/pseries/vas: Call misc_deregister if sysfs init fails powerpc/papr_scm: Fix leaking nvdimm_events_map elements ...
\| *	powerpc/powernv/vas: Assign real address to rx_fifo in vas_rx_win_attr	Haren Myneni	2022-05-22	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In init_winctx_regs(), __pa() is called on winctx->rx_fifo and this function is called to initialize registers for receive and fault windows. But the real address is passed in winctx->rx_fifo for receive windows and the virtual address for fault windows which causes errors with DEBUG_VIRTUAL enabled. Fixes this issue by assigning only real address to rx_fifo in vas_rx_win_attr struct for both receive and fault windows. Reported-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Haren Myneni <haren@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/338e958c7ab8f3b266fa794a1f80f99b9671829e.camel@linux.ibm.com
* \|	Merge tag 'v5.19-p1' of ↵	Linus Torvalds	2022-05-27	77	-881/+2605
\|\ \ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 Pull crypto updates from Herbert Xu: "API: - Test in-place en/decryption with two sglists in testmgr - Fix process vs softirq race in cryptd Algorithms: - Add arm64 acceleration for sm4 - Add s390 acceleration for chacha20 Drivers: - Add polarfire soc hwrng support in mpsf - Add support for TI SoC AM62x in sa2ul - Add support for ATSHA204 cryptochip in atmel-sha204a - Add support for PRNG in caam - Restore support for storage encryption in qat - Restore support for storage encryption in hisilicon/sec" * tag 'v5.19-p1' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (116 commits) hwrng: omap3-rom - fix using wrong clk_disable() in omap_rom_rng_runtime_resume() crypto: hisilicon/sec - delete the flag CRYPTO_ALG_ALLOCATES_MEMORY crypto: qat - add support for 401xx devices crypto: qat - re-enable registration of algorithms crypto: qat - honor CRYPTO_TFM_REQ_MAY_SLEEP flag crypto: qat - add param check for DH crypto: qat - add param check for RSA crypto: qat - remove dma_free_coherent() for DH crypto: qat - remove dma_free_coherent() for RSA crypto: qat - fix memory leak in RSA crypto: qat - add backlog mechanism crypto: qat - refactor submission logic crypto: qat - use pre-allocated buffers in datapath crypto: qat - set to zero DH parameters before free crypto: s390 - add crypto library interface for ChaCha20 crypto: talitos - Uniform coding style with defined variable crypto: octeontx2 - simplify the return expression of otx2_cpt_aead_cbc_aes_sha_setkey() crypto: cryptd - Protect per-CPU resource by disabling BH. crypto: sun8i-ce - do not fallback if cryptlen is less than sg length crypto: sun8i-ce - rework debugging ...
\| * \|	crypto: hisilicon/sec - delete the flag CRYPTO_ALG_ALLOCATES_MEMORY	Kai Ye	2022-05-20	1	-2/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Should not to uses the CRYPTO_ALG_ALLOCATES_MEMORY in SEC2. The SEC2 driver uses the pre-allocated buffers, including the src sgl pool, dst sgl pool and other qp ctx resources. (e.g. IV buffer, mac buffer, key buffer). The SEC2 driver doesn't allocate memory during request processing. The driver only maps software sgl to allocated hardware sgl during I/O. So here is fix it. Signed-off-by: Kai Ye <yekai13@huawei.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
\| * \|	crypto: qat - add support for 401xx devices	Giovanni Cabiddu	2022-05-20	4	-0/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	QAT_401xx is a derivative of 4xxx. Add support for that device in the qat_4xxx driver by including the DIDs (both PF and VF), extending the probe and the firmware loader. Signed-off-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com> Reviewed-by: Srinivas Kerekare <srinivas.kerekare@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
\| * \|	crypto: qat - re-enable registration of algorithms	Giovanni Cabiddu	2022-05-20	2	-14/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Re-enable the registration of algorithms after fixes to (1) use pre-allocated buffers in the datapath and (2) support the CRYPTO_TFM_REQ_MAY_BACKLOG flag. This reverts commit 8893d27ffcaf6ec6267038a177cb87bcde4dd3de. Cc: stable@vger.kernel.org Signed-off-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com> Reviewed-by: Marco Chiappero <marco.chiappero@intel.com> Reviewed-by: Adam Guerin <adam.guerin@intel.com> Reviewed-by: Wojciech Ziemba <wojciech.ziemba@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
\| * \|	crypto: qat - honor CRYPTO_TFM_REQ_MAY_SLEEP flag	Giovanni Cabiddu	2022-05-20	3	-14/+27
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If a request has the flag CRYPTO_TFM_REQ_MAY_SLEEP set, allocate memory using the flag GFP_KERNEL otherwise use GFP_ATOMIC. Signed-off-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com> Reviewed-by: Adam Guerin <adam.guerin@intel.com> Reviewed-by: Wojciech Ziemba <wojciech.ziemba@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
\| * \|	crypto: qat - add param check for DH	Giovanni Cabiddu	2022-05-20	1	-0/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Reject requests with a source buffer that is bigger than the size of the key. This is to prevent a possible integer underflow that might happen when copying the source scatterlist into a linear buffer. Cc: stable@vger.kernel.org Signed-off-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com> Reviewed-by: Adam Guerin <adam.guerin@intel.com> Reviewed-by: Wojciech Ziemba <wojciech.ziemba@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
\| * \|	crypto: qat - add param check for RSA	Giovanni Cabiddu	2022-05-20	1	-0/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Reject requests with a source buffer that is bigger than the size of the key. This is to prevent a possible integer underflow that might happen when copying the source scatterlist into a linear buffer. Cc: stable@vger.kernel.org Signed-off-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com> Reviewed-by: Adam Guerin <adam.guerin@intel.com> Reviewed-by: Wojciech Ziemba <wojciech.ziemba@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
\| * \|	crypto: qat - remove dma_free_coherent() for DH	Giovanni Cabiddu	2022-05-20	1	-49/+34
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The functions qat_dh_compute_value() allocates memory with dma_alloc_coherent() if the source or the destination buffers are made of multiple flat buffers or of a size that is not compatible with the hardware. This memory is then freed with dma_free_coherent() in the context of a tasklet invoked to handle the response for the corresponding request. According to Documentation/core-api/dma-api-howto.rst, the function dma_free_coherent() cannot be called in an interrupt context. Replace allocations with dma_alloc_coherent() in the function qat_dh_compute_value() with kmalloc() + dma_map_single(). Cc: stable@vger.kernel.org Fixes: c9839143ebbf ("crypto: qat - Add DH support") Signed-off-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com> Reviewed-by: Adam Guerin <adam.guerin@intel.com> Reviewed-by: Wojciech Ziemba <wojciech.ziemba@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
\| * \|	crypto: qat - remove dma_free_coherent() for RSA	Giovanni Cabiddu	2022-05-20	1	-77/+60
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	After commit f5ff79fddf0e ("dma-mapping: remove CONFIG_DMA_REMAP"), if the algorithms are enabled, the driver crashes with a BUG_ON while executing vunmap() in the context of a tasklet. This is due to the fact that the function dma_free_coherent() cannot be called in an interrupt context (see Documentation/core-api/dma-api-howto.rst). The functions qat_rsa_enc() and qat_rsa_dec() allocate memory with dma_alloc_coherent() if the source or the destination buffers are made of multiple flat buffers or of a size that is not compatible with the hardware. This memory is then freed with dma_free_coherent() in the context of a tasklet invoked to handle the response for the corresponding request. Replace allocations with dma_alloc_coherent() in the functions qat_rsa_enc() and qat_rsa_dec() with kmalloc() + dma_map_single(). Cc: stable@vger.kernel.org Fixes: a990532023b9 ("crypto: qat - Add support for RSA algorithm") Signed-off-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com> Reviewed-by: Adam Guerin <adam.guerin@intel.com> Reviewed-by: Wojciech Ziemba <wojciech.ziemba@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
\| * \|	crypto: qat - fix memory leak in RSA	Giovanni Cabiddu	2022-05-20	1	-11/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When an RSA key represented in form 2 (as defined in PKCS #1 V2.1) is used, some components of the private key persist even after the TFM is released. Replace the explicit calls to free the buffers in qat_rsa_exit_tfm() with a call to qat_rsa_clear_ctx() which frees all buffers referenced in the TFM context. Cc: stable@vger.kernel.org Fixes: 879f77e9071f ("crypto: qat - Add RSA CRT mode") Signed-off-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com> Reviewed-by: Adam Guerin <adam.guerin@intel.com> Reviewed-by: Wojciech Ziemba <wojciech.ziemba@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
\| * \|	crypto: qat - add backlog mechanism	Giovanni Cabiddu	2022-05-20	9	-18/+123
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The implementations of the crypto algorithms (aead, skcipher, etc) in the QAT driver do not properly support requests with the CRYPTO_TFM_REQ_MAY_BACKLOG flag set. If the HW queue is full, the driver returns -EBUSY but does not enqueue the request. This can result in applications like dm-crypt waiting indefinitely for the completion of a request that was never submitted to the hardware. Fix this by adding a software backlog queue: if the ring buffer is more than eighty percent full, then the request is enqueued to a backlog list and the error code -EBUSY is returned back to the caller. Requests in the backlog queue are resubmitted at a later time, in the context of the callback of a previously submitted request. The request for which -EBUSY is returned is then marked as -EINPROGRESS once submitted to the HW queues. The submission loop inside the function qat_alg_send_message() has been modified to decide which submission policy to use based on the request flags. If the request does not have the CRYPTO_TFM_REQ_MAY_BACKLOG set, the previous behaviour has been preserved. Based on a patch by Vishnu Das Ramachandran <vishnu.dasx.ramachandran@intel.com> Cc: stable@vger.kernel.org Fixes: d370cec32194 ("crypto: qat - Intel(R) QAT crypto interface") Reported-by: Mikulas Patocka <mpatocka@redhat.com> Reported-by: Kyle Sanderson <kyle.leet@gmail.com> Signed-off-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com> Reviewed-by: Marco Chiappero <marco.chiappero@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
\| * \|	crypto: qat - refactor submission logic	Giovanni Cabiddu	2022-05-20	6	-54/+101
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	All the algorithms in qat_algs.c and qat_asym_algs.c use the same pattern to submit messages to the HW queues. Move the submission loop to a new function, qat_alg_send_message(), and share it between the symmetric and the asymmetric algorithms. As part of this rework, since the number of retries before returning an error is inconsistent between the symmetric and asymmetric implementations, set it to a value that works for both (i.e. 20, was 10 in qat_algs.c and 100 in qat_asym_algs.c) In addition fix the return code reported when the HW queues are full. In that case return -ENOSPC instead of -EBUSY. Including stable in CC since (1) the error code returned if the HW queues are full is incorrect and (2) to facilitate the backport of the next fix "crypto: qat - add backlog mechanism". Cc: stable@vger.kernel.org Signed-off-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com> Reviewed-by: Marco Chiappero <marco.chiappero@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
\| * \|	crypto: qat - use pre-allocated buffers in datapath	Giovanni Cabiddu	2022-05-20	2	-27/+61
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In order to do DMAs, the QAT device requires that the scatterlist structures are mapped and translated into a format that the firmware can understand. This is defined as the composition of a scatter gather list (SGL) descriptor header, the struct qat_alg_buf_list, plus a variable number of flat buffer descriptors, the struct qat_alg_buf. The allocation and mapping of these data structures is done each time a request is received from the skcipher and aead APIs. In an OOM situation, this behaviour might lead to a dead-lock if an allocation fails. Based on the conversation in [1], increase the size of the aead and skcipher request contexts to include an SGL descriptor that can handle a maximum of 4 flat buffers. If requests exceed 4 entries buffers, memory is allocated dynamically. [1] https://lore.kernel.org/linux-crypto/20200722072932.GA27544@gondor.apana.org.au/ Cc: stable@vger.kernel.org Fixes: d370cec32194 ("crypto: qat - Intel(R) QAT crypto interface") Reported-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com> Reviewed-by: Marco Chiappero <marco.chiappero@intel.com> Reviewed-by: Wojciech Ziemba <wojciech.ziemba@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
\| * \|	crypto: qat - set to zero DH parameters before free	Giovanni Cabiddu	2022-05-20	1	-0/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Set to zero the context buffers containing the DH key before they are freed. This is a defense in depth measure that avoids keys to be recovered from memory in case the system is compromised between the free of the buffer and when that area of memory (containing keys) gets overwritten. Cc: stable@vger.kernel.org Fixes: c9839143ebbf ("crypto: qat - Add DH support") Signed-off-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com> Reviewed-by: Adam Guerin <adam.guerin@intel.com> Reviewed-by: Wojciech Ziemba <wojciech.ziemba@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
\| * \|	crypto: s390 - add crypto library interface for ChaCha20	Vladis Dronov	2022-05-13	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Implement a crypto library interface for the s390-native ChaCha20 cipher algorithm. This allows us to stop to select CRYPTO_CHACHA20 and instead select CRYPTO_ARCH_HAVE_LIB_CHACHA. This allows BIG_KEYS=y not to build a whole ChaCha20 crypto infrastructure as a built-in, but build a smaller CRYPTO_LIB_CHACHA instead. Make CRYPTO_CHACHA_S390 config entry to look like similar ones on other architectures. Remove CRYPTO_ALGAPI select as anyway it is selected by CRYPTO_SKCIPHER. Add a new test module and a test script for ChaCha20 cipher and its interfaces. Here are test results on an idle z15 machine: Data \| Generic crypto TFM \| s390 crypto TFM \| s390 lib size \| enc dec \| enc dec \| enc dec -----+--------------------+------------------+---------------- 512b \| 1545ns 1295ns \| 604ns 446ns \| 430ns 407ns 4k \| 9536ns 9463ns \| 2329ns 2174ns \| 2170ns 2154ns 64k \| 149.6us 149.3us \| 34.4us 34.5us \| 33.9us 33.1us 6M \| 23.61ms 23.11ms \| 4223us 4160us \| 3951us 4008us 60M \| 143.9ms 143.9ms \| 33.5ms 33.2ms \| 32.2ms 32.1ms Signed-off-by: Vladis Dronov <vdronov@redhat.com> Reviewed-by: Harald Freudenberger <freude@linux.ibm.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
\| * \|	crypto: talitos - Uniform coding style with defined variable	jianchunfu	2022-05-13	1	-5/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Use the defined variable "desc" to uniform coding style. Signed-off-by: jianchunfu <jianchunfu@cmss.chinamobile.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
\| * \|	crypto: octeontx2 - simplify the return expression of ↵	Minghao Chi	2022-05-13	1	-6/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	otx2_cpt_aead_cbc_aes_sha_setkey() Simplify the return expression. Reported-by: Zeal Robot <zealci@zte.com.cn> Signed-off-by: Minghao Chi <chi.minghao@zte.com.cn> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
\| * \|	crypto: sun8i-ce - do not fallback if cryptlen is less than sg length	Corentin Labbe	2022-05-13	1	-2/+10
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The sg length could be more than remaining data on it. So check the length requirement against the minimum between those two values. Signed-off-by: Corentin Labbe <clabbe@baylibre.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
\| * \|	crypto: sun8i-ce - rework debugging	Corentin Labbe	2022-05-13	4	-19/+96
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The "Fallback for xxx" message is annoying, remove it and store the information in the debugfs. Let's add more precise fallback stats and display it better. Signed-off-by: Corentin Labbe <clabbe@baylibre.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
\| * \|	crypto: sun8i-ce - use sg_nents_for_len	Corentin Labbe	2022-05-13	2	-18/+15
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When testing with some large SG list, the sun8i-ce drivers always fallback even if it can handle it. So use sg_nents_for_len() which permits to see less SGs than needed. Signed-off-by: Corentin Labbe <clabbe@baylibre.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
\| * \|	crypto: sun8i-ce - Add function for handling hash padding	Corentin Labbe	2022-05-13	1	-30/+65
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Move all padding work to a dedicated function. Signed-off-by: Corentin Labbe <clabbe@baylibre.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
\| * \|	crypto: sun8i-ss - do not fallback if cryptlen is less than sg length	Corentin Labbe	2022-05-13	1	-10/+9
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The sg length could be more than remaining data on it. So check the length requirement against the minimum between those two values. Signed-off-by: Corentin Labbe <clabbe@baylibre.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
\| * \|	crypto: sun8i-ss - add hmac(sha1)	Corentin Labbe	2022-05-13	3	-6/+231
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Even if sun8i-ss does not handle hmac(sha1) directly, we can provide one which use the already supported acceleration of sha1. Signed-off-by: Corentin Labbe <clabbe@baylibre.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
\| * \|	crypto: sun8i-ss - Add function for handling hash padding	Corentin Labbe	2022-05-13	1	-22/+65
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Move all padding work to a dedicated function. Signed-off-by: Corentin Labbe <clabbe@baylibre.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
\| * \|	crypto: sun8i-ss - rework debugging	Corentin Labbe	2022-05-13	4	-22/+83
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The "Fallback for xxx" message is annoying, remove it and store the information in the debugfs. In the same time, reports more fallback statistics. Signed-off-by: Corentin Labbe <clabbe@baylibre.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
\| * \|	crypto: sun8i-ss - handle requests if last block is not modulo 64	Corentin Labbe	2022-05-13	3	-10/+29
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The current sun8i-ss handle only requests with all SG length being modulo 64. But the last SG could be always handled by copying it on the pad buffer. Signed-off-by: Corentin Labbe <clabbe@baylibre.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
\| * \|	crypto: sun8i-ss - do not zeroize all pad	Corentin Labbe	2022-05-13	1	-2/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Instead of memset all pad buffer, it is faster to only put 0 where needed. Signed-off-by: Corentin Labbe <clabbe@baylibre.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
\| * \|	crypto: sun8i-ss - do not allocate memory when handling hash requests	Corentin Labbe	2022-05-13	3	-12/+17
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Instead of allocate memory on each requests, it is easier to pre-allocate buffers. This made error path easier. Signed-off-by: Corentin Labbe <clabbe@baylibre.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
\| * \|	crypto: sun8i-ss - use sg_nents_for_len	Corentin Labbe	2022-05-13	1	-13/+10
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When testing with some large SG list, the sun8i-ss drivers always fallback even if it can handle it. So use sg_nents_for_len() which permits to see less SGs than needed. Signed-off-by: Corentin Labbe <clabbe@baylibre.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
\| * \|	crypto: sun8i-ss - test error before assigning	Corentin Labbe	2022-05-13	1	-4/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The first thing we should do after dma_map_single() is to test the result. Signed-off-by: Corentin Labbe <clabbe@baylibre.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>