twx-linux/lib
Jiong Wang 06ae48269d lib: reciprocal_div: implement the improved algorithm on the paper mentioned
The new added "reciprocal_value_adv" implements the advanced version of the
algorithm described in Figure 4.2 of the paper except when
"divisor > (1U << 31)" whose ceil(log2(d)) result will be 32 which then
requires u128 divide on host. The exception case could be easily handled
before calling "reciprocal_value_adv".

The advanced version requires more complex calculation to get the
reciprocal multiplier and other control variables, but then could reduce
the required emulation operations.

It makes no sense to use this advanced version for host divide emulation,
those extra complexities for calculating multiplier etc could completely
waive our saving on emulation operations.

However, it makes sense to use it for JIT divide code generation (for
example eBPF JIT backends) for which we are willing to trade performance of
JITed code with that of host. As shown by the following pseudo code, the
required emulation operations could go down from 6 (the basic version) to 3
or 4.

To use the result of "reciprocal_value_adv", suppose we want to calculate
n/d, the C-style pseudo code will be the following, it could be easily
changed to real code generation for other JIT targets.

  struct reciprocal_value_adv rvalue;
  u8 pre_shift, exp;

  // handle exception case.
  if (d >= (1U << 31)) {
    result = n >= d;
    return;
  }
  rvalue = reciprocal_value_adv(d, 32)
  exp = rvalue.exp;
  if (rvalue.is_wide_m && !(d & 1)) {
    // floor(log2(d & (2^32 -d)))
    pre_shift = fls(d & -d) - 1;
    rvalue = reciprocal_value_adv(d >> pre_shift, 32 - pre_shift);
  } else {
    pre_shift = 0;
  }

  // code generation starts.
  if (imm == 1U << exp) {
    result = n >> exp;
  } else if (rvalue.is_wide_m) {
    // pre_shift must be zero when reached here.
    t = (n * rvalue.m) >> 32;
    result = n - t;
    result >>= 1;
    result += t;
    result >>= rvalue.sh - 1;
  } else {
    if (pre_shift)
      result = n >> pre_shift;
    result = ((u64)result * rvalue.m) >> 32;
    result >>= rvalue.sh;
  }

Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-07-07 01:45:31 +02:00
..
842
fonts
lz4
lzo
mpi treewide: kzalloc() -> kcalloc() 2018-06-12 16:19:22 -07:00
raid6 powerpc updates for 4.17 2018-04-07 12:08:19 -07:00
reed_solomon treewide: kmalloc() -> kmalloc_array() 2018-06-12 16:19:22 -07:00
xz
zlib_deflate
zlib_inflate
zstd lib: zstd: clean up Makefile for simpler composite object handling 2018-03-26 02:01:27 +09:00
.gitignore
argv_split.c treewide: kmalloc() -> kmalloc_array() 2018-06-12 16:19:22 -07:00
ashldi3.c
ashrdi3.c
asn1_decoder.c
assoc_array.c
atomic64_test.c
atomic64.c
audit.c
bcd.c
bch.c
bitmap.c lib/bitmap.c: micro-optimization for __bitmap_complement() 2018-06-07 17:34:39 -07:00
bitrev.c
bsearch.c
btree.c btree: avoid variable-length allocations 2018-03-14 16:55:29 -07:00
bucket_locks.c mm: kvmalloc does not fallback to vmalloc for incompatible gfp flags 2018-06-07 17:34:38 -07:00
bug.c lib/bug.c: exclude non-BUG/WARN exceptions from report_bug() 2018-03-09 16:40:01 -08:00
build_OID_registry
bust_spinlocks.c
chacha20.c
check_signature.c
checksum.c
clz_ctz.c
clz_tab.c
cmdline.c
cmpdi2.c
compat_audit.c
cordic.c
cpu_rmap.c
cpumask.c lib: optimize cpumask_next_and() 2018-02-06 18:32:44 -08:00
crc4.c
crc7.c
crc8.c
crc16.c
crc32.c
crc32defs.h
crc32test.c
crc-ccitt.c
crc-itu-t.c
crc-t10dif.c
ctype.c
debug_info.c
debug_locks.c
debugobjects.c debugobjects: Avoid another unused variable warning 2018-03-14 20:20:01 +01:00
dec_and_lock.c atomic: Add irqsave variant of atomic_dec_and_lock() 2018-06-12 23:33:24 +02:00
decompress_bunzip2.c
decompress_inflate.c
decompress_unlz4.c
decompress_unlzma.c
decompress_unlzo.c
decompress_unxz.c
decompress.c
devres.c devres: combine function devm_ioremap* 2018-03-15 18:08:55 +01:00
digsig.c
div64.c
dump_stack.c printk: move dump stack related code to lib/dump_stack.c 2018-03-15 13:25:36 +01:00
dynamic_debug.c
dynamic_queue_limits.c
earlycpio.c
error-inject.c
errseq.c errseq: Always report a writeback error once 2018-04-27 08:51:26 -04:00
extable.c
fault-inject.c
fdt_empty_tree.c
fdt_ro.c
fdt_rw.c
fdt_strerror.c
fdt_sw.c
fdt_wip.c
fdt.c
find_bit_benchmark.c lib/find_bit_benchmark.c: avoid soft lockup in test_find_first_bit() 2018-05-11 17:28:45 -07:00
find_bit.c lib: optimize cpumask_next_and() 2018-02-06 18:32:44 -08:00
flex_array.c
flex_proportions.c
gcd.c
gen_crc32table.c
genalloc.c
glob.c
globtest.c
hexdump.c
hweight.c
idr.c lib/idr.c: remove simple_ida_lock 2018-06-07 17:34:39 -07:00
inflate.c
int_sqrt.c lib: Add strongly typed 64bit int_sqrt 2018-02-04 10:17:21 +00:00
interval_tree_test.c treewide: kmalloc() -> kmalloc_array() 2018-06-12 16:19:22 -07:00
interval_tree.c
iomap_copy.c
iomap.c
iommu-helper.c iommu-helper: mark iommu_is_span_boundary as inline 2018-05-09 06:55:44 +02:00
ioremap.c mm/vmalloc: add interfaces to free unmapped page table 2018-03-22 17:07:01 -07:00
iov_iter.c Merge branch 'for-4.18/mcsafe' into libnvdimm-for-next 2018-06-08 15:16:44 -07:00
irq_poll.c
irq_regs.c
is_single_threaded.c
jedec_ddr_data.c
kasprintf.c
Kconfig Move all the dma-mapping code to kernel/dma 2018-06-20 16:30:01 +09:00
Kconfig.debug fault-injection: reorder config entries 2018-06-15 07:55:24 +09:00
Kconfig.kasan kasan: depend on CONFIG_SLUB_DEBUG 2018-06-28 11:16:44 -07:00
Kconfig.kgdb
Kconfig.ubsan lib: add testing module for UBSAN 2018-04-11 10:28:35 -07:00
kfifo.c treewide: kmalloc() -> kmalloc_array() 2018-06-12 16:19:22 -07:00
klist.c
kobject_uevent.c netns: restrict uevents 2018-05-01 10:22:41 -04:00
kobject.c kobject: don't use WARN for registration failures 2018-04-23 13:14:55 +02:00
kstrtox.c
kstrtox.h
lcm.c
libcrc32c.c libcrc32c: Add crc32c_impl function 2018-03-26 15:09:38 +02:00
list_debug.c lib/list_debug.c: print unmangled addresses 2018-04-11 10:28:35 -07:00
list_sort.c
llist.c
locking-selftest-hardirq.h
locking-selftest-mutex.h
locking-selftest-rlock-hardirq.h
locking-selftest-rlock-softirq.h
locking-selftest-rlock.h
locking-selftest-rsem.h
locking-selftest-rtmutex.h
locking-selftest-softirq.h
locking-selftest-spin-hardirq.h
locking-selftest-spin-softirq.h
locking-selftest-spin.h
locking-selftest-wlock-hardirq.h
locking-selftest-wlock-softirq.h
locking-selftest-wlock.h
locking-selftest-wsem.h
locking-selftest.c
lockref.c lockref: Add lockref_put_not_zero 2018-04-12 09:41:19 -07:00
logic_pio.c lib: Add generic PIO mapping method 2018-03-21 17:18:34 -05:00
lru_cache.c treewide: kzalloc() -> kcalloc() 2018-06-12 16:19:22 -07:00
lshrdi3.c
Makefile Merge branch 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2018-06-24 19:36:16 +08:00
memory-notifier-error-inject.c
memweight.c
muldi3.c
net_utils.c
netdev-notifier-error-inject.c
nlattr.c netlink: Return extack message if attribute validation fails 2018-06-28 16:18:04 +09:00
nmi_backtrace.c
nodemask.c
notifier-error-inject.c
notifier-error-inject.h
of-reconfig-notifier-error-inject.c
oid_registry.c
once.c
parman.c
parser.c
pci_iomap.c
percpu_counter.c
percpu_ida.c lib/percpu_ida.c: don't do alloc from per-CPU list if there is none 2018-06-28 11:16:44 -07:00
percpu_test.c
percpu-refcount.c percpu_ref: Update doc to dissuade users from depending on internal RCU grace periods 2018-03-19 10:09:44 -07:00
plist.c
pm-notifier-error-inject.c
prime_numbers.c
radix-tree.c idr: fix invalid ptr dereference on item delete 2018-05-25 18:12:10 -07:00
random32.c
ratelimit.c
rational.c
rbtree_test.c treewide: kmalloc() -> kmalloc_array() 2018-06-12 16:19:22 -07:00
rbtree.c
reciprocal_div.c lib: reciprocal_div: implement the improved algorithm on the paper mentioned 2018-07-07 01:45:31 +02:00
refcount.c locking/refcounts: Implement refcount_dec_and_lock_irqsave() 2018-06-12 23:33:25 +02:00
rhashtable.c rhashtable: clean up dereference of ->future_tbl. 2018-06-22 13:43:28 +09:00
sbitmap.c treewide: kzalloc_node() -> kcalloc_node() 2018-06-12 16:19:22 -07:00
scatterlist.c for-linus-20180629 2018-06-30 10:47:46 -07:00
seq_buf.c
sg_pool.c
sg_split.c
sha1.c
sha256.c kernel/kexec_file.c: move purgatories sha256 to common code 2018-04-13 17:10:28 -07:00
show_mem.c
siphash.c
smp_processor_id.c
sort.c
stackdepot.c lib/stackdepot.c: use a non-instrumented version of memcmp() 2018-02-06 18:32:44 -08:00
stmp_device.c
string_helpers.c
string.c
strncpy_from_user.c
strnlen_user.c
syscall.c
test_bitmap.c lib/test_bitmap.c: fix bitmap optimisation tests to report errors correctly 2018-05-18 17:17:12 -07:00
test_bpf.c test_bpf: flag tests that cannot be jited on s390 2018-06-28 23:58:39 +02:00
test_debug_virtual.c
test_firmware.c treewide: Use array_size() in vzalloc() 2018-06-12 16:19:22 -07:00
test_hash.c
test_hexdump.c
test_kasan.c kasan: fix invalid-free test crashing the kernel 2018-04-11 10:28:32 -07:00
test_kmod.c treewide: Use array_size() in vzalloc() 2018-06-12 16:19:22 -07:00
test_list_sort.c
test_module.c
test_overflow.c test_overflow: fix an IS_ERR() vs NULL bug 2018-06-12 16:19:22 -07:00
test_parman.c
test_printf.c Revert "lib/test_printf.c: call wait_for_random_bytes() before plain %p tests" 2018-06-25 13:44:20 +02:00
test_rhashtable.c rhashtable: remove nulls_base and related code. 2018-06-22 13:43:27 +09:00
test_siphash.c
test_sort.c lib/test_sort.c: add module unload support 2018-02-06 18:32:45 -08:00
test_static_key_base.c
test_static_keys.c
test_string.c
test_sysctl.c
test_ubsan.c lib/test_ubsan.c: make test_ubsan_misaligned_access() static 2018-04-11 10:28:35 -07:00
test_user_copy.c treewide: simplify Kconfig dependencies for removed archs 2018-03-26 15:55:57 +02:00
test_uuid.c
test-kstrtox.c
test-string_helpers.c
textsearch.c textsearch: fix kernel-doc warnings and add kernel-api section 2018-04-16 18:53:13 -04:00
timerqueue.c
ts_bm.c
ts_fsm.c
ts_kmp.c
ubsan.c lib/ubsan: remove returns-nonnull-attribute checks 2018-02-06 18:32:46 -08:00
ubsan.h lib/ubsan: remove returns-nonnull-attribute checks 2018-02-06 18:32:46 -08:00
ucmpdi2.c Add notrace to lib/ucmpdi2.c 2018-04-23 16:39:35 +01:00
ucs2_string.c lib/ucs2_string.c: add MODULE_LICENSE() 2018-06-07 17:34:39 -07:00
usercopy.c
uuid.c
vsprintf.c Printk changes for 4.18 2018-06-06 16:04:55 -07:00
win_minmax.c
xxhash.c