zstd fixes for v5.16-rc1

Fix stack usage on parisc & improve code size bloat
 
 This PR contains 3 commits:
 
 1. Fixes a minor unused variable warning reported by Kernel test robot [0].
 2. Improves the reported code bloat (-88KB / 374KB) [1] by outlining
    some functions that are unlikely to be used in performance sensitive
    workloads.
 3. Fixes the reported excess stack usage on parisc [2] by removing -O3
    from zstd's compilation flags. -O3 triggered bugs in the hppa-linux-gnu
    gcc-8 compiler. -O2 performance is acceptable: neutral compression,
    about -1% decompression speed. We also reduce code bloat
    (-105KB / 374KB).
 
 After this commit our code bloat is cut from 374KB to 105KB with gcc-11.
 If we wanted to cut the remaining 105KB we'd likely have to trade
 significant performance, so I want to say that this is enough for now.
 
 We should be able to get further gains without sacrificing speed, but
 that will take some significant optimization effort, and isn't suitable
 for a quick fix. I've opened an upstream issue [3] to track the code size,
 and try to avoid future regressions, and improve it in the long term.
 
 [0] https://lore.kernel.org/linux-mm/202111120312.833wII4i-lkp@intel.com/T/
 [1] https://lkml.org/lkml/2021/11/15/710
 [2] https://lkml.org/lkml/2021/11/14/189
 [3] https://github.com/facebook/zstd/issues/2867
 
 Link: https://lore.kernel.org/r/20211117014949.1169186-1-nickrterrell@gmail.com/
 Link: https://lore.kernel.org/r/20211117201459.1194876-1-nickrterrell@gmail.com/
 
 Signed-off-by: Nick Terrell <terrelln@fb.com>
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEmIwAqlFIzbQodPwyuzRpqaNEqPUFAmGWw4AACgkQuzRpqaNE
 qPUXfQ/5AXp+7Ip+YD25QUa/je10OZkdGNi5/MNh1m7f6gwlOab7Pnn65mpN8qsW
 1OJbje5PAiTkC+BzJgGw6zr8JCcvgXCVVtAoPEV73uT9QLOoeEE3E2Jf4OQQxroB
 cKC+lZaxeDgqV60koIhsVBMgs4pny57ohTm4fK8yqrIi7ZV21a/FJoVxwyNLCnbU
 uRJKzN9xa3lBYESnMzlV4dF0WhKfprgI+3YXenLBjHHDhhz0nyPT7jt0sr/CoblI
 2QMq8RItlnMleV1La1v1S38ROu1E4MXvIy/MrFyu7ebBX3jDgMYtRdZxuAL/I2+1
 TfN3LfEcwjyB4ft6Ty76kk0gwEihnEORhTeRVrhqxXx8FPWgEB+tgWHo+zLd8wPp
 khqfO6gf4PZJnf6kDOlyEYF2yTuNlWNR6J41+bLW0bA104zLYjeUhejDgyh2aRR2
 WYo/xwzs2FbI4Da/rJ4iTKy4hK++AZ/Sba9b3t29Ca+TiQZJHSUp5KnjNbIW5XCr
 0jknMki6bASlG9nrg+d2EC3fIQop8nJhywNrLZV1uJYx/H5DBmIcLPmhCb4oBOSt
 AP3d/rj5EnO0+bOGGDg00qndsnuDuko7fOsAM3D9l2HoaOly7++RQtIzZqu8Y3EX
 F8L90qvg/vIWFOppnvJX+nXaWz2J55P4iooKlBKz+JQpBff7lDA=
 =kBgl
 -----END PGP SIGNATURE-----

Merge tag 'zstd-for-linus-5.16-rc1' of git://github.com/terrelln/linux

Pull zstd fixes from Nick Terrell:
 "Fix stack usage on parisc & improve code size bloat

  This contains three commits:

   1. Fixes a minor unused variable warning reported by Kernel test
      robot [0].

   2. Improves the reported code bloat (-88KB / 374KB) [1] by outlining
      some functions that are unlikely to be used in performance
      sensitive workloads.

   3. Fixes the reported excess stack usage on parisc [2] by removing
      -O3 from zstd's compilation flags. -O3 triggered bugs in the
      hppa-linux-gnu gcc-8 compiler. -O2 performance is acceptable:
      neutral compression, about -1% decompression speed. We also reduce
      code bloat (-105KB / 374KB).

  After this our code bloat is cut from 374KB to 105KB with gcc-11. If
  we wanted to cut the remaining 105KB we'd likely have to trade
  significant performance, so I want to say that this is enough for now.

  We should be able to get further gains without sacrificing speed, but
  that will take some significant optimization effort, and isn't
  suitable for a quick fix. I've opened an upstream issue [3] to track
  the code size, and try to avoid future regressions, and improve it in
  the long term"

Link: https://lore.kernel.org/linux-mm/202111120312.833wII4i-lkp@intel.com/T/ [0]
Link: https://lkml.org/lkml/2021/11/15/710 [1]
Link: https://lkml.org/lkml/2021/11/14/189 [2]
Link: https://github.com/facebook/zstd/issues/2867 [3]
Link: https://lore.kernel.org/r/20211117014949.1169186-1-nickrterrell@gmail.com/
Link: https://lore.kernel.org/r/20211117201459.1194876-1-nickrterrell@gmail.com/

* tag 'zstd-for-linus-5.16-rc1' of git://github.com/terrelln/linux:
  lib: zstd: Don't add -O3 to cflags
  lib: zstd: Don't inline functions in zstd_opt.c
  lib: zstd: Fix unused variable warning
Linus Torvalds 2021-11-18 17:09:05 -08:00
commit 4c388a8e74
4 changed files with 21 additions and 2 deletions

View file

@@ -11,8 +11,6 @@
 obj-$(CONFIG_ZSTD_COMPRESS) += zstd_compress.o
 obj-$(CONFIG_ZSTD_DECOMPRESS) += zstd_decompress.o
-ccflags-y += -O3
 zstd_compress-y := \
 zstd_compress_module.o \
 common/debug.o \

View file

@@ -16,6 +16,7 @@
 *********************************************************/
 /* force inlining */
+#if !defined(ZSTD_NO_INLINE)
 #if (defined(__GNUC__) && !defined(__STRICT_ANSI__)) || defined(__cplusplus) || defined(__STDC_VERSION__) && __STDC_VERSION__ >= 199901L /* C99 */
 # define INLINE_KEYWORD inline
 #else
@@ -24,6 +25,12 @@
 #define FORCE_INLINE_ATTR __attribute__((always_inline))
+#else
+#define INLINE_KEYWORD
+#define FORCE_INLINE_ATTR
+#endif
 /*
 On MSVC qsort requires that functions passed into it use the __cdecl calling conversion(CC).

View file

@@ -411,6 +411,8 @@ static size_t ZSTD_seqDecompressedSize(seqStore_t const* seqStore, const seqDef*
     const seqDef* sp = sstart;
     size_t matchLengthSum = 0;
     size_t litLengthSum = 0;
+    /* Only used by assert(), suppress unused variable warnings in production. */
+    (void)litLengthSum;
     while (send-sp > 0) {
         ZSTD_sequenceLength const seqLen = ZSTD_getSequenceLength(seqStore, sp);
         litLengthSum += seqLen.litLength;
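
The `(void)litLengthSum;` line in the hunk above is a common C idiom for a variable that is only read inside `assert()`. A minimal sketch of the same pattern follows; `demo_seq` and `demo_match_total` are hypothetical names invented for illustration and are not part of zstd.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a sequence record; not zstd's seqDef. */
typedef struct {
    size_t litLength;
    size_t matchLength;
} demo_seq;

/* Returns the sum of match lengths. The literal-length sum is only
 * checked by assert(), so it has no other reader in the function. */
static size_t demo_match_total(const demo_seq *seqs, size_t n, size_t expectedLits)
{
    size_t matchLengthSum = 0;
    size_t litLengthSum = 0;
    size_t i;
    /* Reference both values once so that builds where assert() expands
     * to nothing (NDEBUG defined) do not emit unused-variable warnings. */
    (void)litLengthSum;
    (void)expectedLits;
    for (i = 0; i < n; i++) {
        litLengthSum += seqs[i].litLength;
        matchLengthSum += seqs[i].matchLength;
    }
    assert(litLengthSum == expectedLits);
    return matchLengthSum;
}
```

The cast has no runtime effect; it only counts as a use of the variable, which is why it can sit right after the declaration as it does in the patch.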

View file

@@ -8,6 +8,18 @@
  * You may select, at your option, one of the above-listed licenses.
  */
+/*
+ * Disable inlining for the optimal parser for the kernel build.
+ * It is unlikely to be used in the kernel, and where it is used
+ * latency shouldn't matter because it is very slow to begin with.
+ * We prefer a ~180KB binary size win over faster optimal parsing.
+ *
+ * TODO(https://github.com/facebook/zstd/issues/2862):
+ * Improve the code size of the optimal parser in general, so we
+ * don't need this hack for the kernel build.
+ */
+#define ZSTD_NO_INLINE 1
 #include "zstd_compress_internal.h"
 #include "hist.h"
 #include "zstd_opt.h"
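
The two inlining hunks work together: the header leaves the inline macros empty when `ZSTD_NO_INLINE` is defined, and the optimal parser defines it before including that header. The sketch below reproduces the mechanism in a single file, assuming a GCC-compatible compiler. All `DEMO_*` and `demo_*` names are hypothetical, chosen only to mirror the shape of the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical switch playing the role of ZSTD_NO_INLINE. In the patch
 * the define sits at the top of one .c file, before its includes. */
#define DEMO_NO_INLINE 1

#if !defined(DEMO_NO_INLINE)
#  define DEMO_INLINE_KEYWORD inline
#  define DEMO_FORCE_INLINE_ATTR __attribute__((always_inline))
#else
/* Both macros expand to nothing, so the compiler is no longer forced
 * to inline and is free to keep a single out-of-line copy. */
#  define DEMO_INLINE_KEYWORD
#  define DEMO_FORCE_INLINE_ATTR
#endif

/* Hypothetical helper macro combining the two pieces. */
#define DEMO_FORCE_INLINE static DEMO_INLINE_KEYWORD DEMO_FORCE_INLINE_ATTR

/* With DEMO_NO_INLINE defined this is an ordinary static function. */
DEMO_FORCE_INLINE size_t demo_sum(const unsigned char *p, size_t n)
{
    size_t total = 0;
    size_t i;
    assert(p != NULL || n == 0);
    for (i = 0; i < n; i++)
        total += p[i];
    return total;
}

/* Two call sites: with forced inlining each would get its own copy of
 * the loop; without it they can share one function body. */
size_t demo_sum_twice(const unsigned char *p, size_t n)
{
    return demo_sum(p, n) + demo_sum(p, n);
}
```

Leaving the macros empty removes the forced inlining but does not forbid it; the compiler may still inline small functions under its own heuristics, so the size effect depends on the compiler and optimization level.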