commit 3e99b3b13a1fc8f7354edaee4c04f73a07faba69
Author: Ming Lei
Date:   Thu Jun 6 16:34:09 2019 +0800

    scsi: core: don't preallocate small SGL in case of NO_SG_CHAIN

    The preallocated small SGL depends on SG_CHAIN so if the ARCH doesn't
    support SG_CHAIN, preallocation of small SGL can't work at all.

    Fix this issue by not using small preallocation in case of
    NO_SG_CHAIN.

    Cc: Christoph Hellwig
    Cc: Bart Van Assche
    Cc: Ewan D. Milne
    Cc: Hannes Reinecke
    Cc: Guenter Roeck
    Reported-by: Guenter Roeck
    Reviewed-by: Christoph Hellwig
    Reviewed-by: Bart Van Assche
    Tested-by: Guenter Roeck
    Signed-off-by: Ming Lei
    Signed-off-by: Martin K. Petersen

commit b79d9a09ae23c7047bdce3a15e284398334198ea
Author: Ming Lei
Date:   Thu Jun 6 16:34:08 2019 +0800

    scsi: lib/sg_pool.c: clear 'first_chunk' in case of no preallocation

    If the user doesn't ask to preallocate by passing zero
    'nents_first_chunk' to sg_alloc_table_chained, we need to make sure
    that 'first_chunk' is cleared. Otherwise, __sg_alloc_table() may still
    think that the 1st SGL should be from the preallocation.

    Fix the issue by clearing 'first_chunk' in sg_alloc_table_chained() if
    'nents_first_chunk' is zero.

    Cc: Christoph Hellwig
    Cc: Bart Van Assche
    Cc: Ewan D. Milne
    Cc: Hannes Reinecke
    Cc: Guenter Roeck
    Reported-by: Guenter Roeck
    Tested-by: Guenter Roeck
    Signed-off-by: Ming Lei
    Signed-off-by: Martin K. Petersen

commit 3dccdf53c2f38399b11085ded4447ce1467f006c
Author: Ming Lei
Date:   Sun Apr 28 15:39:32 2019 +0800

    scsi: core: avoid preallocating big SGL for data

    scsi_mq_setup_tags() preallocates a big buffer for the IO SGL. The
    size is based on scsi_mq_sgl_size() which is determined based on
    shost->sg_tablesize and SG_CHUNK_SIZE.

    Modern DMA engines are often capable of dealing with very big segments
    so the resulting scsi_mq_sgl_size() is often too big. SG_CHUNK_SIZE
    results in a static 4KB SGL allocation per command. If an HBA has lots
    of deep queues, preallocation for the sg list can consume substantial
    amounts of memory.
    For lpfc, nr_hw_queues can be 70 and each queue's depth 3781. This
    means the resulting preallocation for the data SGL is
    70*3781*2K = 517MB.

    Switch to runtime allocation for SGL for lists longer than 2 entries.
    This is the approach used by NVMe PCI so it should be reasonable for
    SCSI as well. Runtime SGL allocation has always been the case for the
    legacy I/O path so this is nothing new.

    [mkp: attempted to clarify commit desc]

    Cc: Christoph Hellwig
    Cc: Bart Van Assche
    Cc: Ewan D. Milne
    Cc: Hannes Reinecke
    Reviewed-by: Christoph Hellwig
    Signed-off-by: Ming Lei
    Reviewed-by: Bart Van Assche
    Signed-off-by: Martin K. Petersen

commit 92524fa12312d1f082a473e14c590c48b4ef3fe5
Author: Ming Lei
Date:   Sun Apr 28 15:39:31 2019 +0800

    scsi: core: avoid preallocating big SGL for protection information

    scsi_mq_setup_tags() currently preallocates a big buffer for
    protection SGL entries. scsi_mq_sgl_size() is used to determine the
    size for both data and protection information scatterlists but the
    protection buffer is usually much smaller. For example, one 512-byte
    sector needs 8 bytes of protection information. Given that the maximum
    number of sectors for one request is 2560 (BLK_DEF_MAX_SECTORS), the
    max protection information buffer size is just 20K.

    The protection information segment count generally matches the number
    of bios in the request. As a result, the typical actual number of
    segments won't be very big. And should the need arise, allocating a
    bigger SGL from slab is fast enough.

    Preallocate only one SGL entry for protection information and switch
    to runtime allocation when the number of protection information
    segments is greater than 1. This reduces memory tied up by static
    command allocations. For example, 500+ MB is saved on a single lpfc
    HBA.

    [mkp: attempted to clarify commit desc]

    Cc: Christoph Hellwig
    Cc: Bart Van Assche
    Cc: Ewan D. Milne
    Cc: Hannes Reinecke
    Reviewed-by: Christoph Hellwig
    Signed-off-by: Ming Lei
    Reviewed-by: Bart Van Assche
    Signed-off-by: Martin K. Petersen

commit 4635873c561ac57b66adfcc2487c38106b1c916c
Author: Ming Lei
Date:   Sun Apr 28 15:39:30 2019 +0800

    scsi: lib/sg_pool.c: improve APIs for allocating sg pool

    sg_alloc_table_chained() currently allows the caller to provide one
    preallocated SGL and returns if the requested number isn't bigger than
    the size of that SGL. This is used to inline an SGL for an IO request.

    However, the scatter-gather code only allows the size of the 1st
    preallocated SGL to be SG_CHUNK_SIZE (128). This means a substantial
    amount of memory (4KB) is claimed for the SGL for each IO request. If
    the I/O is small, it would be prudent to allocate a smaller SGL.

    Introduce an extra parameter to sg_alloc_table_chained() and
    sg_free_table_chained() for specifying the size of the preallocated
    SGL.

    Both __sg_free_table() and __sg_alloc_table() assume that each SGL has
    the same size except for the last one. Change the code to allow both
    functions to accept a variable size for the 1st preallocated SGL.

    [mkp: attempted to clarify commit desc]

    Cc: Christoph Hellwig
    Cc: Bart Van Assche
    Cc: Ewan D. Milne
    Cc: Hannes Reinecke
    Cc: Sagi Grimberg
    Cc: Chuck Lever
    Cc: netdev@vger.kernel.org
    Cc: linux-nvme@lists.infradead.org
    Suggested-by: Christoph Hellwig
    Signed-off-by: Ming Lei
    Reviewed-by: Christoph Hellwig
    Signed-off-by: Martin K. Petersen

commit ee5a1dbfec57cc1ffdedf2bd767c84d5e0498ed8
Author: Ming Lei
Date:   Thu Jun 6 16:34:10 2019 +0800

    scsi: esp: use sg helper to iterate over scatterlist

    Unlike the legacy I/O path, scsi-mq preallocates a large array to hold
    the scatterlist for each request. This static allocation can consume
    substantial amounts of memory on modern controllers which support a
    large number of concurrently outstanding requests.
    To facilitate a switch to a smaller static allocation combined with a
    dynamic allocation for requests that need it, we need to make sure all
    SCSI drivers handle chained scatterlists correctly.

    Convert remaining drivers that directly dereference the scatterlist
    array to using the iterator functions.

    [mkp: clarified commit message]

    Cc: Christoph Hellwig
    Cc: Bart Van Assche
    Cc: Ewan D. Milne
    Cc: Hannes Reinecke
    Cc: Finn Thain
    Cc: Guenter Roeck
    Reported-by: Guenter Roeck
    Tested-by: Guenter Roeck
    Reviewed-by: Finn Thain
    Signed-off-by: Ming Lei
    Signed-off-by: Martin K. Petersen

commit 0e9fdd2b315c0fde535336ea09499a927e415566
Author: Finn Thain
Date:   Tue Jun 18 09:37:57 2019 +0800

    scsi: NCR5380: use sg helper to iterate over scatterlist

    Unlike the legacy I/O path, scsi-mq preallocates a large array to hold
    the scatterlist for each request. This static allocation can consume
    substantial amounts of memory on modern controllers which support a
    large number of concurrently outstanding requests.

    To facilitate a switch to a smaller static allocation combined with a
    dynamic allocation for requests that need it, we need to make sure all
    SCSI drivers handle chained scatterlists correctly.

    Convert remaining drivers that directly dereference the scatterlist
    array to using the iterator functions.

    [mkp: clarified commit message]

    Cc: Michael Schmitz
    Reviewed-by: Michael Schmitz
    Reviewed-by: Christoph Hellwig
    Reviewed-by: Bart Van Assche
    Signed-off-by: Finn Thain
    Signed-off-by: Martin K. Petersen

commit c3c0fd9b108f6360ceaade790571be097df1f3ef
Author: Ming Lei
Date:   Tue Jun 18 09:37:56 2019 +0800

    scsi: wd33c93: use sg helper to iterate over scatterlist

    Unlike the legacy I/O path, scsi-mq preallocates a large array to hold
    the scatterlist for each request. This static allocation can consume
    substantial amounts of memory on modern controllers which support a
    large number of concurrently outstanding requests.
    To facilitate a switch to a smaller static allocation combined with a
    dynamic allocation for requests that need it, we need to make sure all
    SCSI drivers handle chained scatterlists correctly.

    Convert remaining drivers that directly dereference the scatterlist
    array to using the iterator functions.

    [mkp: clarified commit message]

    Reviewed-by: Christoph Hellwig
    Reviewed-by: Bart Van Assche
    Signed-off-by: Ming Lei
    Signed-off-by: Martin K. Petersen

commit 57ef4e510939e305ad905cc54b8031401ff5450e
Author: Ming Lei
Date:   Tue Jun 18 09:37:55 2019 +0800

    scsi: ppa: use sg helper to iterate over scatterlist

    Unlike the legacy I/O path, scsi-mq preallocates a large array to hold
    the scatterlist for each request. This static allocation can consume
    substantial amounts of memory on modern controllers which support a
    large number of concurrently outstanding requests.

    To facilitate a switch to a smaller static allocation combined with a
    dynamic allocation for requests that need it, we need to make sure all
    SCSI drivers handle chained scatterlists correctly.

    Convert remaining drivers that directly dereference the scatterlist
    array to using the iterator functions.

    [mkp: clarified commit message]

    Reviewed-by: Christoph Hellwig
    Reviewed-by: Bart Van Assche
    Signed-off-by: Ming Lei
    Signed-off-by: Martin K. Petersen

commit 1b3a4640106603ce6b3ad261ef4d20ea60e9790b
Author: Ming Lei
Date:   Tue Jun 18 09:37:54 2019 +0800

    scsi: pcmcia: nsp_cs: use sg helper to iterate over scatterlist

    Unlike the legacy I/O path, scsi-mq preallocates a large array to hold
    the scatterlist for each request. This static allocation can consume
    substantial amounts of memory on modern controllers which support a
    large number of concurrently outstanding requests.

    To facilitate a switch to a smaller static allocation combined with a
    dynamic allocation for requests that need it, we need to make sure all
    SCSI drivers handle chained scatterlists correctly.
    Convert remaining drivers that directly dereference the scatterlist
    array to using the iterator functions.

    [mkp: clarified commit message]

    Reviewed-by: Christoph Hellwig
    Reviewed-by: Bart Van Assche
    Signed-off-by: Ming Lei
    Signed-off-by: Martin K. Petersen

commit 79da19b48fc188f6b5186e9ed605e10e5bef3ad6
Author: Ming Lei
Date:   Tue Jun 18 09:37:53 2019 +0800

    scsi: imm: use sg helper to iterate over scatterlist

    Unlike the legacy I/O path, scsi-mq preallocates a large array to hold
    the scatterlist for each request. This static allocation can consume
    substantial amounts of memory on modern controllers which support a
    large number of concurrently outstanding requests.

    To facilitate a switch to a smaller static allocation combined with a
    dynamic allocation for requests that need it, we need to make sure all
    SCSI drivers handle chained scatterlists correctly.

    Convert remaining drivers that directly dereference the scatterlist
    array to using the iterator functions.

    [mkp: clarified commit message]

    Reviewed-by: Christoph Hellwig
    Reviewed-by: Bart Van Assche
    Signed-off-by: Ming Lei
    Signed-off-by: Martin K. Petersen

commit a7a253ba6c26490fb132ee682661c03a6c454fd5
Author: Finn Thain
Date:   Tue Jun 18 09:37:52 2019 +0800

    scsi: aha152x: use sg helper to iterate over scatterlist

    Unlike the legacy I/O path, scsi-mq preallocates a large array to hold
    the scatterlist for each request. This static allocation can consume
    substantial amounts of memory on modern controllers which support a
    large number of concurrently outstanding requests.

    To facilitate a switch to a smaller static allocation combined with a
    dynamic allocation for requests that need it, we need to make sure all
    SCSI drivers handle chained scatterlists correctly.

    Convert remaining drivers that directly dereference the scatterlist
    array to using the iterator functions.
    Finn added the change to replace SCp.buffers_residual with
    sg_is_last(), which fixes the way it was updated; a similar change has
    been applied to NCR5380.c.

    [mkp: clarified commit message]

    Signed-off-by: Finn Thain
    Signed-off-by: Ming Lei
    Signed-off-by: Martin K. Petersen

commit 013be03840c2ba3a2717f9cee457f01fdc4d8436
Author: Ming Lei
Date:   Tue Jun 18 09:37:51 2019 +0800

    scsi: s390: zfcp_fc: use sg helper to iterate over scatterlist

    Unlike the legacy I/O path, scsi-mq preallocates a large array to hold
    the scatterlist for each request. This static allocation can consume
    substantial amounts of memory on modern controllers which support a
    large number of concurrently outstanding requests.

    To facilitate a switch to a smaller static allocation combined with a
    dynamic allocation for requests that need it, we need to make sure all
    SCSI drivers handle chained scatterlists correctly.

    Convert remaining drivers that directly dereference the scatterlist
    array to using the iterator functions.

    [mkp: clarified commit message]

    Cc: Steffen Maier
    Cc: Benjamin Block
    Cc: Martin Schwidefsky
    Cc: Heiko Carstens
    Cc: linux-s390@vger.kernel.org
    Acked-by: Benjamin Block
    Reviewed-by: Christoph Hellwig
    Reviewed-by: Bart Van Assche
    Signed-off-by: Ming Lei
    Signed-off-by: Martin K. Petersen

commit da5567369fb686eb050da15364f11ef561f9a03e
Author: Ming Lei
Date:   Tue Jun 18 09:37:49 2019 +0800

    scsi: staging: unisys: visorhba: use sg helper to iterate over scatterlist

    Unlike the legacy I/O path, scsi-mq preallocates a large array to hold
    the scatterlist for each request. This static allocation can consume
    substantial amounts of memory on modern controllers which support a
    large number of concurrently outstanding requests.

    To facilitate a switch to a smaller static allocation combined with a
    dynamic allocation for requests that need it, we need to make sure all
    SCSI drivers handle chained scatterlists correctly.
    Convert remaining drivers that directly dereference the scatterlist
    array to using the iterator functions.

    [mkp: clarified commit message]

    Cc: devel@driverdev.osuosl.org
    Cc: Greg Kroah-Hartman
    Acked-by: Greg Kroah-Hartman
    Reviewed-by: Christoph Hellwig
    Reviewed-by: Bart Van Assche
    Signed-off-by: Ming Lei
    Signed-off-by: Martin K. Petersen

commit 1194b5ce57d27c2ad49ed26f8cec98757c7c23ec
Author: Ming Lei
Date:   Tue Jun 18 09:37:48 2019 +0800

    scsi: usb: image: microtek: use sg helper to iterate over scatterlist

    Unlike the legacy I/O path, scsi-mq preallocates a large array to hold
    the scatterlist for each request. This static allocation can consume
    substantial amounts of memory on modern controllers which support a
    large number of concurrently outstanding requests.

    To facilitate a switch to a smaller static allocation combined with a
    dynamic allocation for requests that need it, we need to make sure all
    SCSI drivers handle chained scatterlists correctly.

    Convert remaining drivers that directly dereference the scatterlist
    array to using the iterator functions.

    [mkp: clarified commit message]

    Cc: Oliver Neukum
    Cc: Greg Kroah-Hartman
    Cc: linux-usb@vger.kernel.org
    Reviewed-by: Bart Van Assche
    Reviewed-by: Christoph Hellwig
    Signed-off-by: Ming Lei
    Signed-off-by: Martin K. Petersen

commit 74eb7446eda5e348dbb90981e91e9caf54f2fde4
Author: Ming Lei
Date:   Tue Jun 18 09:37:47 2019 +0800

    scsi: pmcraid: use sg helper to iterate over scatterlist

    Unlike the legacy I/O path, scsi-mq preallocates a large array to hold
    the scatterlist for each request. This static allocation can consume
    substantial amounts of memory on modern controllers which support a
    large number of concurrently outstanding requests.

    To facilitate a switch to a smaller static allocation combined with a
    dynamic allocation for requests that need it, we need to make sure all
    SCSI drivers handle chained scatterlists correctly.
    Convert remaining drivers that directly dereference the scatterlist
    array to using the iterator functions.

    [mkp: clarified commit message]

    Reviewed-by: Christoph Hellwig
    Reviewed-by: Bart Van Assche
    Signed-off-by: Ming Lei
    Signed-off-by: Martin K. Petersen

commit c71ae886d1321e74f524c7c023933cf87768915d
Author: Ming Lei
Date:   Tue Jun 18 09:37:46 2019 +0800

    scsi: ipr: use sg helper to iterate over scatterlist

    Unlike the legacy I/O path, scsi-mq preallocates a large array to hold
    the scatterlist for each request. This static allocation can consume
    substantial amounts of memory on modern controllers which support a
    large number of concurrently outstanding requests.

    To facilitate a switch to a smaller static allocation combined with a
    dynamic allocation for requests that need it, we need to make sure all
    SCSI drivers handle chained scatterlists correctly.

    Convert remaining drivers that directly dereference the scatterlist
    array to using the iterator functions.

    [mkp: clarified commit message]

    Reviewed-by: Bart Van Assche
    Signed-off-by: Ming Lei
    Signed-off-by: Martin K. Petersen

commit 3c1a30df6d9c21c3235b6af5f25da2765b19d05b
Author: Ming Lei
Date:   Tue Jun 18 09:37:45 2019 +0800

    scsi: mvumi: use sg helper to iterate over scatterlist

    Unlike the legacy I/O path, scsi-mq preallocates a large array to hold
    the scatterlist for each request. This static allocation can consume
    substantial amounts of memory on modern controllers which support a
    large number of concurrently outstanding requests.

    To facilitate a switch to a smaller static allocation combined with a
    dynamic allocation for requests that need it, we need to make sure all
    SCSI drivers handle chained scatterlists correctly.

    Convert remaining drivers that directly dereference the scatterlist
    array to using the iterator functions.

    [mkp: clarified commit message and folded in build fix reported by
    zeroday]

    Reviewed-by: Christoph Hellwig
    Reviewed-by: Bart Van Assche
    Reviewed-by: Ewan D. Milne
    Signed-off-by: Ming Lei
    Signed-off-by: Martin K. Petersen

commit 46e8e475a160be5e31e99171b7c0c8a21eb4d6ad
Author: Ming Lei
Date:   Tue Jun 18 09:37:44 2019 +0800

    scsi: lpfc: use sg helper to iterate over scatterlist

    Unlike the legacy I/O path, scsi-mq preallocates a large array to hold
    the scatterlist for each request. This static allocation can consume
    substantial amounts of memory on modern controllers which support a
    large number of concurrently outstanding requests.

    To facilitate a switch to a smaller static allocation combined with a
    dynamic allocation for requests that need it, we need to make sure all
    SCSI drivers handle chained scatterlists correctly.

    Convert remaining drivers that directly dereference the scatterlist
    array to using the iterator functions.

    [mkp: clarified commit message]

    Reviewed-by: Ewan D. Milne
    Reviewed-by: Christoph Hellwig
    Reviewed-by: Bart Van Assche
    Signed-off-by: Ming Lei
    Signed-off-by: Martin K. Petersen

commit c0d0d81ad34a41e2aa562bc59f22bed3e9e98080
Author: Ming Lei
Date:   Tue Jun 18 09:37:43 2019 +0800

    scsi: advansys: use sg helper to iterate over scatterlist

    Unlike the legacy I/O path, scsi-mq preallocates a large array to hold
    the scatterlist for each request. This static allocation can consume
    substantial amounts of memory on modern controllers which support a
    large number of concurrently outstanding requests.

    To facilitate a switch to a smaller static allocation combined with a
    dynamic allocation for requests that need it, we need to make sure all
    SCSI drivers handle chained scatterlists correctly.

    Convert remaining drivers that directly dereference the scatterlist
    array to using the iterator functions.

    [mkp: clarified commit message]

    Reviewed-by: Ewan D. Milne
    Reviewed-by: Christoph Hellwig
    Reviewed-by: Bart Van Assche
    Signed-off-by: Ming Lei
    Signed-off-by: Martin K. Petersen

commit cf9648cb71d6f1a463553d2fcd4c8137587dc060
Author: Ming Lei
Date:   Tue Jun 18 09:37:42 2019 +0800

    scsi: vmw_pscsi: use sg helper to iterate over scatterlist

    Unlike the legacy I/O path, scsi-mq preallocates a large array to hold
    the scatterlist for each request. This static allocation can consume
    substantial amounts of memory on modern controllers which support a
    large number of concurrently outstanding requests.

    To facilitate a switch to a smaller static allocation combined with a
    dynamic allocation for requests that need it, we need to make sure all
    SCSI drivers handle chained scatterlists correctly.

    Convert remaining drivers that directly dereference the scatterlist
    array to using the iterator functions.

    [mkp: clarified commit message]

    Reviewed-by: Ewan D. Milne
    Reviewed-by: Christoph Hellwig
    Reviewed-by: Bart Van Assche
    Signed-off-by: Ming Lei
    Signed-off-by: Martin K. Petersen
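
The reason the driver commits in this series replace direct array dereferences with iterator helpers can be illustrated outside the kernel. What follows is a minimal Python sketch, not kernel code. The names `Entry`, `ChainLink`, `build_chained`, and `iter_segments` are hypothetical and chosen only for illustration, and the layout (a small first chunk whose final slot links to a larger chunk) is a simplified assumption about how a chained scatterlist is arranged.

```python
from dataclasses import dataclass
from typing import Iterator, List, Union


@dataclass
class Entry:
    """One data segment (hypothetical stand-in for a scatterlist entry)."""
    length: int
    is_last: bool = False


@dataclass
class ChainLink:
    """Occupies a slot but carries no data; it only points at the next chunk."""
    next_chunk: List["Slot"]


Slot = Union[Entry, ChainLink]


def build_chained(lengths: List[int], first_chunk: int, chunk: int) -> List[Slot]:
    """Lay segments out as a small first chunk chained to larger chunks."""
    if first_chunk < 2 or chunk < 2:
        raise ValueError("a chunk needs room for one entry plus a chain link")
    entries = [Entry(n) for n in lengths]
    if entries:
        entries[-1].is_last = True
    head: List[Slot] = []
    current, size, pos = head, first_chunk, 0
    while pos < len(entries):
        if len(entries) - pos <= size:
            current.extend(entries[pos:])  # the rest fits in this chunk
            break
        # Keep the final slot of a full chunk for the link to the next one.
        current.extend(entries[pos:pos + size - 1])
        pos += size - 1
        nxt: List[Slot] = []
        current.append(ChainLink(nxt))
        current, size = nxt, chunk
    return head


def iter_segments(table: List[Slot]) -> Iterator[Entry]:
    """Walk the table by following chain links instead of indexing one array."""
    chunk, idx = table, 0
    while idx < len(chunk):
        slot = chunk[idx]
        if isinstance(slot, ChainLink):
            chunk, idx = slot.next_chunk, 0  # jump to the next chunk
            continue
        yield slot
        if slot.is_last:
            return
        idx += 1


# Five segments with a two-slot first chunk: the head holds one data entry
# and one chain link, so indexing table[0..4] directly cannot reach the data.
table = build_chained([512] * 5, first_chunk=2, chunk=4)
print(len(table), type(table[1]).__name__)
print(sum(seg.length for seg in iter_segments(table)))
```

In this model, code that assumes one flat array sees only the head chunk and treats the link slot as data, while the iterator visits every segment regardless of how the chunks are split.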