twx-linux

Author	SHA1	Message	Date
Peter Zijlstra	6cbc304f2f	perf/x86/intel: Fix unwind errors from PEBS entries (mk-II) Vince reported the perf_fuzzer giving various unwinder warnings and Josh reported: > Deja vu. Most of these are related to perf PEBS, similar to the > following issue: > > b8000586c90b ("perf/x86/intel: Cure bogus unwind from PEBS entries") > > This is basically the ORC version of that. setup_pebs_sample_data() is > assembling a franken-pt_regs which ORC isn't happy about. RIP is > inconsistent with some of the other registers (like RSP and RBP). And where the previous unwinder only needed BP,SP ORC also requires IP. But we cannot spoof IP because then the sample will get displaced, entirely negating the point of PEBS. So cure the whole thing differently by doing the unwind early; this does however require a means to communicate we did the unwind early. We (ab)use an unused sample_type bit for this, which we set on events that fill out the data->callchain before the normal perf_prepare_sample(). Debugged-by: Josh Poimboeuf <jpoimboe@redhat.com> Reported-by: Vince Weaver <vincent.weaver@maine.edu> Tested-by: Josh Poimboeuf <jpoimboe@redhat.com> Tested-by: Prashant Bhole <bhole_prashant_q7@lab.ntt.co.jp> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@kernel.org>	2018-07-25 11:46:21 +02:00
Srikar Dronamraju	b6a60cf36d	sched/numa: Move task_numa_placement() closer to numa_migrate_preferred() numa_migrate_preferred() is called periodically or when task preferred node changes. Preferred node evaluations happen once per scan sequence. If the scan completion happens just after the periodic NUMA migration, then we try to migrate to the preferred node and the preferred node might change, needing another node migration. Avoid this by checking for scan sequence completion only when checking for periodic migration. Running SPECjbb2005 on a 4 node machine and comparing bops/JVM JVMS LAST_PATCH WITH_PATCH %CHANGE 16 25862.6 26158.1 1.14258 1 74357 72725 -2.19482 Running SPECjbb2005 on a 16 node machine and comparing bops/JVM JVMS LAST_PATCH WITH_PATCH %CHANGE 8 117019 113992 -2.58 1 179095 174947 -2.31 (numbers from v1 based on v4.17-rc5) Testcase Time: Min Max Avg StdDev numa01.sh Real: 449.46 770.77 615.22 101.70 numa01.sh Sys: 132.72 208.17 170.46 24.96 numa01.sh User: 39185.26 60290.89 50066.76 6807.84 numa02.sh Real: 60.85 61.79 61.28 0.37 numa02.sh Sys: 15.34 24.71 21.08 3.61 numa02.sh User: 5204.41 5249.85 5231.21 17.60 numa03.sh Real: 785.50 916.97 840.77 44.98 numa03.sh Sys: 108.08 133.60 119.43 8.82 numa03.sh User: 61422.86 70919.75 64720.87 3310.61 numa04.sh Real: 429.57 587.37 480.80 57.40 numa04.sh Sys: 240.61 321.97 290.84 33.58 numa04.sh User: 34597.65 40498.99 37079.48 2060.72 numa05.sh Real: 392.09 431.25 414.65 13.82 numa05.sh Sys: 229.41 372.48 297.54 53.14 numa05.sh User: 33390.86 34697.49 34222.43 556.42 Testcase Time: Min Max Avg StdDev %Change numa01.sh Real: 424.63 566.18 498.12 59.26 23.50% numa01.sh Sys: 160.19 256.53 208.98 37.02 -18.4% numa01.sh User: 37320.00 46225.58 42001.57 3482.45 19.20% numa02.sh Real: 60.17 62.47 60.91 0.85 0.607% numa02.sh Sys: 15.30 22.82 17.04 2.90 23.70% numa02.sh User: 5202.13 5255.51 5219.08 20.14 0.232% numa03.sh Real: 823.91 844.89 833.86 8.46 0.828% numa03.sh Sys: 130.69 148.29 140.47 6.21 -14.9% numa03.sh User: 62519.15 64262.20 63613.38 620.05 1.740% numa04.sh Real: 515.30 603.74 548.56 30.93 -12.3% numa04.sh Sys: 459.73 525.48 489.18 21.63 -40.5% numa04.sh User: 40561.96 44919.18 42047.87 1526.85 -11.8% numa05.sh Real: 396.58 454.37 421.13 19.71 -1.53% numa05.sh Sys: 208.72 422.02 348.90 73.60 -14.7% numa05.sh User: 33124.08 36109.35 34846.47 1089.74 -1.79% Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Mel Gorman <mgorman@techsingularity.net> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@surriel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1529514181-9842-20-git-send-email-srikar@linux.vnet.ibm.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2018-07-25 11:41:08 +02:00
Srikar Dronamraju	f35678b6a1	sched/numa: Use group_weights to identify if migration degrades locality On NUMA_BACKPLANE and NUMA_GLUELESS_MESH systems, tasks/memory should be consolidated to the closest group of nodes. In such a case, relying on group_fault metric may not always help to consolidate. There can always be a case where a node closer to the preferred node may have lesser faults than a node further away from the preferred node. In such a case, moving to node with more faults might avoid numa consolidation. Using group_weight would help to consolidate task/memory around the preferred_node. While here, to be on the conservative side, don't override migrate thread degrades locality logic for CPU_NEWLY_IDLE load balancing. Note: Similar problems exist with should_numa_migrate_memory and will be dealt separately. Running SPECjbb2005 on a 4 node machine and comparing bops/JVM JVMS LAST_PATCH WITH_PATCH %CHANGE 16 25645.4 25960 1.22 1 72142 73550 1.95 Running SPECjbb2005 on a 16 node machine and comparing bops/JVM JVMS LAST_PATCH WITH_PATCH %CHANGE 8 110199 120071 8.958 1 176303 176249 -0.03 (numbers from v1 based on v4.17-rc5) Testcase Time: Min Max Avg StdDev numa01.sh Real: 490.04 774.86 596.26 96.46 numa01.sh Sys: 151.52 242.88 184.82 31.71 numa01.sh User: 41418.41 60844.59 48776.09 6564.27 numa02.sh Real: 60.14 62.94 60.98 1.00 numa02.sh Sys: 16.11 30.77 21.20 5.28 numa02.sh User: 5184.33 5311.09 5228.50 44.24 numa03.sh Real: 790.95 856.35 826.41 24.11 numa03.sh Sys: 114.93 118.85 117.05 1.63 numa03.sh User: 60990.99 64959.28 63470.43 1415.44 numa04.sh Real: 434.37 597.92 504.87 59.70 numa04.sh Sys: 237.63 397.40 289.74 55.98 numa04.sh User: 34854.87 41121.83 38572.52 2615.84 numa05.sh Real: 386.77 448.90 417.22 22.79 numa05.sh Sys: 149.23 379.95 303.04 79.55 numa05.sh User: 32951.76 35959.58 34562.18 1034.05 Testcase Time: Min Max Avg StdDev %Change numa01.sh Real: 493.19 672.88 597.51 59.38 -0.20% numa01.sh Sys: 150.09 245.48 207.76 34.26 -11.0% numa01.sh User: 41928.51 53779.17 48747.06 3901.39 0.059% numa02.sh Real: 60.63 62.87 61.22 0.83 -0.39% numa02.sh Sys: 16.64 27.97 20.25 4.06 4.691% numa02.sh User: 5222.92 5309.60 5254.03 29.98 -0.48% numa03.sh Real: 821.52 902.15 863.60 32.41 -4.30% numa03.sh Sys: 112.04 130.66 118.35 7.08 -1.09% numa03.sh User: 62245.16 69165.14 66443.04 2450.32 -4.47% numa04.sh Real: 414.53 519.57 476.25 37.00 6.009% numa04.sh Sys: 181.84 335.67 280.41 54.07 3.327% numa04.sh User: 33924.50 39115.39 37343.78 1934.26 3.290% numa05.sh Real: 408.30 441.45 417.90 12.05 -0.16% numa05.sh Sys: 233.41 381.60 295.58 57.37 2.523% numa05.sh User: 33301.31 35972.50 34335.19 938.94 0.661% Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@surriel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1529514181-9842-16-git-send-email-srikar@linux.vnet.ibm.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2018-07-25 11:41:08 +02:00
Srikar Dronamraju	30619c89b1	sched/numa: Update the scan period without holding the numa_group lock The metrics for updating scan periods are local or task specific. Currently this update happens under the numa_group lock, which seems unnecessary. Hence move this update outside the lock. Running SPECjbb2005 on a 4 node machine and comparing bops/JVM JVMS LAST_PATCH WITH_PATCH %CHANGE 16 25355.9 25645.4 1.141 1 72812 72142 -0.92 Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Rik van Riel <riel@surriel.com> Acked-by: Mel Gorman <mgorman@techsingularity.net> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1529514181-9842-15-git-send-email-srikar@linux.vnet.ibm.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2018-07-25 11:41:08 +02:00
Srikar Dronamraju	2d4056fafa	sched/numa: Remove numa_has_capacity() task_numa_find_cpu() helps to find the CPU to swap/move the task to. It's guarded by numa_has_capacity(). However node not having capacity shouldn't deter a task swapping if it helps NUMA placement. Further load_too_imbalanced(), which evaluates possibilities of move/swap, provides similar checks as numa_has_capacity. Hence remove numa_has_capacity() to enhance possibilities of task swapping even if load is imbalanced. Running SPECjbb2005 on a 4 node machine and comparing bops/JVM JVMS LAST_PATCH WITH_PATCH %CHANGE 16 25657.9 25804.1 0.569 1 74435 73413 -1.37 Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Rik van Riel <riel@surriel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1529514181-9842-13-git-send-email-srikar@linux.vnet.ibm.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2018-07-25 11:41:08 +02:00
Srikar Dronamraju	0ad4e3dfe6	sched/numa: Modify migrate_swap() to accept additional parameters There are checks in migrate_swap_stop() that check if the task/CPU combination is as per migrate_swap_arg before migrating. However atleast one of the two tasks to be swapped by migrate_swap() could have migrated to a completely different CPU before updating the migrate_swap_arg. The new CPU where the task is currently running could be a different node too. If the task has migrated, numa balancer might end up placing a task in a wrong node. Instead of achieving node consolidation, it may end up spreading the load across nodes. To avoid that pass the CPUs as additional parameters. While here, place migrate_swap under CONFIG_NUMA_BALANCING. Running SPECjbb2005 on a 4 node machine and comparing bops/JVM JVMS LAST_PATCH WITH_PATCH %CHANGE 16 25377.3 25226.6 -0.59 1 72287 73326 1.437 Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Rik van Riel <riel@surriel.com> Acked-by: Mel Gorman <mgorman@techsingularity.net> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1529514181-9842-10-git-send-email-srikar@linux.vnet.ibm.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2018-07-25 11:41:07 +02:00
Srikar Dronamraju	10864a9e22	sched/numa: Remove unused task_capacity from 'struct numa_stats' The task_capacity field in 'struct numa_stats' is redundant. Also move nr_running for better packing within the struct. No functional changes. Running SPECjbb2005 on a 4 node machine and comparing bops/JVM JVMS LAST_PATCH WITH_PATCH %CHANGE 16 25308.6 25377.3 0.271 1 72964 72287 -0.92 Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Mel Gorman <mgorman@techsingularity.net> Acked-by: Rik van Riel <riel@surriel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1529514181-9842-9-git-send-email-srikar@linux.vnet.ibm.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2018-07-25 11:41:07 +02:00
Srikar Dronamraju	0ee7e74dc0	sched/numa: Skip nodes that are at 'hoplimit' When comparing two nodes at a distance of 'hoplimit', we should consider nodes only up to 'hoplimit'. Currently we also consider nodes at 'oplimit' distance too. Hence two nodes at a distance of 'hoplimit' will have same groupweight. Fix this by skipping nodes at hoplimit. Running SPECjbb2005 on a 4 node machine and comparing bops/JVM JVMS LAST_PATCH WITH_PATCH %CHANGE 16 25375.3 25308.6 -0.26 1 72617 72964 0.477 Running SPECjbb2005 on a 16 node machine and comparing bops/JVM JVMS LAST_PATCH WITH_PATCH %CHANGE 8 113372 108750 -4.07684 1 177403 183115 3.21979 (numbers from v1 based on v4.17-rc5) Testcase Time: Min Max Avg StdDev numa01.sh Real: 478.45 565.90 515.11 30.87 numa01.sh Sys: 207.79 271.04 232.94 21.33 numa01.sh User: 39763.93 47303.12 43210.73 2644.86 numa02.sh Real: 60.00 61.46 60.78 0.49 numa02.sh Sys: 15.71 25.31 20.69 3.42 numa02.sh User: 5175.92 5265.86 5235.97 32.82 numa03.sh Real: 776.42 834.85 806.01 23.22 numa03.sh Sys: 114.43 128.75 121.65 5.49 numa03.sh User: 60773.93 64855.25 62616.91 1576.39 numa04.sh Real: 456.93 511.95 482.91 20.88 numa04.sh Sys: 178.09 460.89 356.86 94.58 numa04.sh User: 36312.09 42553.24 39623.21 2247.96 numa05.sh Real: 393.98 493.48 436.61 35.59 numa05.sh Sys: 164.49 329.15 265.87 61.78 numa05.sh User: 33182.65 36654.53 35074.51 1187.71 Testcase Time: Min Max Avg StdDev %Change numa01.sh Real: 414.64 819.20 556.08 147.70 -7.36% numa01.sh Sys: 77.52 205.04 139.40 52.05 67.10% numa01.sh User: 37043.24 61757.88 45517.48 9290.38 -5.06% numa02.sh Real: 60.80 63.32 61.63 0.88 -1.37% numa02.sh Sys: 17.35 39.37 25.71 7.33 -19.5% numa02.sh User: 5213.79 5374.73 5268.90 55.09 -0.62% numa03.sh Real: 780.09 948.64 831.43 63.02 -3.05% numa03.sh Sys: 104.96 136.92 116.31 11.34 4.591% numa03.sh User: 60465.42 73339.78 64368.03 4700.14 -2.72% numa04.sh Real: 412.60 681.92 521.29 96.64 -7.36% numa04.sh Sys: 210.32 314.10 251.77 37.71 41.74% numa04.sh User: 34026.38 45581.20 38534.49 4198.53 2.825% numa05.sh Real: 394.79 439.63 411.35 16.87 6.140% numa05.sh Sys: 238.32 330.09 292.31 38.32 -9.04% numa05.sh User: 33456.45 34876.07 34138.62 609.45 2.741% While there is a regression with this change, this change is needed from a correctness perspective. Also it helps consolidation as seen from perf bench output. Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Rik van Riel <riel@surriel.com> Acked-by: Mel Gorman <mgorman@techsingularity.net> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1529514181-9842-8-git-send-email-srikar@linux.vnet.ibm.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2018-07-25 11:41:07 +02:00
Srikar Dronamraju	67d9f6c256	sched/debug: Reverse the order of printing faults Fix the order in which the private and shared numa faults are getting printed. No functional changes. Running SPECjbb2005 on a 4 node machine and comparing bops/JVM JVMS LAST_PATCH WITH_PATCH %CHANGE 16 25215.7 25375.3 0.63 1 72107 72617 0.70 Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Rik van Riel <riel@surriel.com> Acked-by: Mel Gorman <mgorman@techsingularity.net> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1529514181-9842-7-git-send-email-srikar@linux.vnet.ibm.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2018-07-25 11:41:07 +02:00
Srikar Dronamraju	f03bb6760b	sched/numa: Use task faults only if numa_group is not yet set up When numa_group faults are available, task_numa_placement only uses numa_group faults to evaluate preferred node. However it still accounts task faults and even evaluates the preferred node just based on task faults just to discard it in favour of preferred node chosen on the basis of numa_group. Instead use task faults only if numa_group is not set. Running SPECjbb2005 on a 4 node machine and comparing bops/JVM JVMS LAST_PATCH WITH_PATCH %CHANGE 16 25549.6 25215.7 -1.30 1 73190 72107 -1.47 Running SPECjbb2005 on a 16 node machine and comparing bops/JVM JVMS LAST_PATCH WITH_PATCH %CHANGE 8 113437 113372 -0.05 1 196130 177403 -9.54 (numbers from v1 based on v4.17-rc5) Testcase Time: Min Max Avg StdDev numa01.sh Real: 506.35 794.46 599.06 104.26 numa01.sh Sys: 150.37 223.56 195.99 24.94 numa01.sh User: 43450.69 61752.04 49281.50 6635.33 numa02.sh Real: 60.33 62.40 61.31 0.90 numa02.sh Sys: 18.12 31.66 24.28 5.89 numa02.sh User: 5203.91 5325.32 5260.29 49.98 numa03.sh Real: 696.47 853.62 745.80 57.28 numa03.sh Sys: 85.68 123.71 97.89 13.48 numa03.sh User: 55978.45 66418.63 59254.94 3737.97 numa04.sh Real: 444.05 514.83 497.06 26.85 numa04.sh Sys: 230.39 375.79 316.23 48.58 numa04.sh User: 35403.12 41004.10 39720.80 2163.08 numa05.sh Real: 423.09 460.41 439.57 13.92 numa05.sh Sys: 287.38 480.15 369.37 68.52 numa05.sh User: 34732.12 38016.80 36255.85 1070.51 Testcase Time: Min Max Avg StdDev %Change numa01.sh Real: 478.45 565.90 515.11 30.87 16.29% numa01.sh Sys: 207.79 271.04 232.94 21.33 -15.8% numa01.sh User: 39763.93 47303.12 43210.73 2644.86 14.04% numa02.sh Real: 60.00 61.46 60.78 0.49 0.871% numa02.sh Sys: 15.71 25.31 20.69 3.42 17.35% numa02.sh User: 5175.92 5265.86 5235.97 32.82 0.464% numa03.sh Real: 776.42 834.85 806.01 23.22 -7.47% numa03.sh Sys: 114.43 128.75 121.65 5.49 -19.5% numa03.sh User: 60773.93 64855.25 62616.91 1576.39 -5.36% numa04.sh Real: 456.93 511.95 482.91 20.88 2.930% numa04.sh Sys: 178.09 460.89 356.86 94.58 -11.3% numa04.sh User: 36312.09 42553.24 39623.21 2247.96 0.246% numa05.sh Real: 393.98 493.48 436.61 35.59 0.677% numa05.sh Sys: 164.49 329.15 265.87 61.78 38.92% numa05.sh User: 33182.65 36654.53 35074.51 1187.71 3.368% Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Mel Gorman <mgorman@techsingularity.net> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@surriel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1529514181-9842-6-git-send-email-srikar@linux.vnet.ibm.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2018-07-25 11:41:06 +02:00
Srikar Dronamraju	8cd45eee43	sched/numa: Set preferred_node based on best_cpu Currently preferred node is set to dst_nid which is the last node in the iteration whose group weight or task weight is greater than the current node. However it doesn't guarantee that dst_nid has the numa capacity to move. It also doesn't guarantee that dst_nid has the best_cpu which is the CPU/node ideal for node migration. Lets consider faults on a 4 node system with group weight numbers in different nodes being in 0 < 1 < 2 < 3 proportion. Consider the task is running on 3 and 0 is its preferred node but its capacity is full. Consider nodes 1, 2 and 3 have capacity. Then the task should be migrated to node 1. Currently the task gets moved to node 2. env.dst_nid points to the last node whose faults were greater than current node. Modify to set the preferred node based of best_cpu. Earlier setting preferred node was skipped if nr_active_nodes is 1. This could result in the task being moved out of the preferred node to a random node during regular load balancing. Also while modifying task_numa_migrate(), use sched_setnuma to set preferred node. This ensures out numa accounting is correct. Running SPECjbb2005 on a 4 node machine and comparing bops/JVM JVMS LAST_PATCH WITH_PATCH %CHANGE 16 25122.9 25549.6 1.698 1 73850 73190 -0.89 Running SPECjbb2005 on a 16 node machine and comparing bops/JVM JVMS LAST_PATCH WITH_PATCH %CHANGE 8 105930 113437 7.08676 1 178624 196130 9.80047 (numbers from v1 based on v4.17-rc5) Testcase Time: Min Max Avg StdDev numa01.sh Real: 435.78 653.81 534.58 83.20 numa01.sh Sys: 121.93 187.18 145.90 23.47 numa01.sh User: 37082.81 51402.80 43647.60 5409.75 numa02.sh Real: 60.64 61.63 61.19 0.40 numa02.sh Sys: 14.72 25.68 19.06 4.03 numa02.sh User: 5210.95 5266.69 5233.30 20.82 numa03.sh Real: 746.51 808.24 780.36 23.88 numa03.sh Sys: 97.26 108.48 105.07 4.28 numa03.sh User: 58956.30 61397.05 60162.95 1050.82 numa04.sh Real: 465.97 519.27 484.81 19.62 numa04.sh Sys: 304.43 359.08 334.68 20.64 numa04.sh User: 37544.16 41186.15 39262.44 1314.91 numa05.sh Real: 411.57 457.20 433.29 16.58 numa05.sh Sys: 230.05 435.48 339.95 67.58 numa05.sh User: 33325.54 36896.31 35637.84 1222.64 Testcase Time: Min Max Avg StdDev %Change numa01.sh Real: 506.35 794.46 599.06 104.26 -10.76% numa01.sh Sys: 150.37 223.56 195.99 24.94 -25.55% numa01.sh User: 43450.69 61752.04 49281.50 6635.33 -11.43% numa02.sh Real: 60.33 62.40 61.31 0.90 -0.195% numa02.sh Sys: 18.12 31.66 24.28 5.89 -21.49% numa02.sh User: 5203.91 5325.32 5260.29 49.98 -0.513% numa03.sh Real: 696.47 853.62 745.80 57.28 4.6339% numa03.sh Sys: 85.68 123.71 97.89 13.48 7.3347% numa03.sh User: 55978.45 66418.63 59254.94 3737.97 1.5323% numa04.sh Real: 444.05 514.83 497.06 26.85 -2.464% numa04.sh Sys: 230.39 375.79 316.23 48.58 5.8343% numa04.sh User: 35403.12 41004.10 39720.80 2163.08 -1.153% numa05.sh Real: 423.09 460.41 439.57 13.92 -1.428% numa05.sh Sys: 287.38 480.15 369.37 68.52 -7.964% numa05.sh User: 34732.12 38016.80 36255.85 1070.51 -1.704% Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Mel Gorman <mgorman@techsingularity.net> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@surriel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1529514181-9842-5-git-send-email-srikar@linux.vnet.ibm.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2018-07-25 11:41:06 +02:00
Srikar Dronamraju	5f95ba7a43	sched/numa: Simplify load_too_imbalanced() Currently load_too_imbalance() cares about the slope of imbalance. It doesn't care of the direction of the imbalance. However this may not work if nodes that are being compared have dissimilar capacities. Few nodes might have more cores than other nodes in the system. Also unlike traditional load balance at a NUMA sched domain, multiple requests to migrate from the same source node to same destination node may run in parallel. This can cause huge load imbalance. This is specially true on a larger machines with either large cores per node or more number of nodes in the system. Hence allow move/swap only if the imbalance is going to reduce. Running SPECjbb2005 on a 4 node machine and comparing bops/JVM JVMS LAST_PATCH WITH_PATCH %CHANGE 16 25058.2 25122.9 0.25 1 72950 73850 1.23 (numbers from v1 based on v4.17-rc5) Testcase Time: Min Max Avg StdDev numa01.sh Real: 516.14 892.41 739.84 151.32 numa01.sh Sys: 153.16 192.99 177.70 14.58 numa01.sh User: 39821.04 69528.92 57193.87 10989.48 numa02.sh Real: 60.91 62.35 61.58 0.63 numa02.sh Sys: 16.47 26.16 21.20 3.85 numa02.sh User: 5227.58 5309.61 5265.17 31.04 numa03.sh Real: 739.07 917.73 795.75 64.45 numa03.sh Sys: 94.46 136.08 109.48 14.58 numa03.sh User: 57478.56 72014.09 61764.48 5343.69 numa04.sh Real: 442.61 715.43 530.31 96.12 numa04.sh Sys: 224.90 348.63 285.61 48.83 numa04.sh User: 35836.84 47522.47 40235.41 3985.26 numa05.sh Real: 386.13 489.17 434.94 43.59 numa05.sh Sys: 144.29 438.56 278.80 105.78 numa05.sh User: 33255.86 36890.82 34879.31 1641.98 Testcase Time: Min Max Avg StdDev %Change numa01.sh Real: 435.78 653.81 534.58 83.20 38.39% numa01.sh Sys: 121.93 187.18 145.90 23.47 21.79% numa01.sh User: 37082.81 51402.80 43647.60 5409.75 31.03% numa02.sh Real: 60.64 61.63 61.19 0.40 0.637% numa02.sh Sys: 14.72 25.68 19.06 4.03 11.22% numa02.sh User: 5210.95 5266.69 5233.30 20.82 0.608% numa03.sh Real: 746.51 808.24 780.36 23.88 1.972% numa03.sh Sys: 97.26 108.48 105.07 4.28 4.197% numa03.sh User: 58956.30 61397.05 60162.95 1050.82 2.661% numa04.sh Real: 465.97 519.27 484.81 19.62 9.385% numa04.sh Sys: 304.43 359.08 334.68 20.64 -14.6% numa04.sh User: 37544.16 41186.15 39262.44 1314.91 2.478% numa05.sh Real: 411.57 457.20 433.29 16.58 0.380% numa05.sh Sys: 230.05 435.48 339.95 67.58 -17.9% numa05.sh User: 33325.54 36896.31 35637.84 1222.64 -2.12% Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Rik van Riel <riel@surriel.com> Acked-by: Mel Gorman <mgorman@techsingularity.net> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1529514181-9842-4-git-send-email-srikar@linux.vnet.ibm.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2018-07-25 11:41:06 +02:00
Srikar Dronamraju	305c1fac32	sched/numa: Evaluate move once per node task_numa_compare() helps choose the best CPU to move or swap the selected task. To achieve this task_numa_compare() is called for every CPU in the node. Currently it evaluates if the task can be moved/swapped for each of the CPUs. However the move evaluation is mostly independent of the CPU. Evaluating the move logic once per node, provides scope for simplifying task_numa_compare(). Running SPECjbb2005 on a 4 node machine and comparing bops/JVM JVMS LAST_PATCH WITH_PATCH %CHANGE 16 25705.2 25058.2 -2.51 1 74433 72950 -1.99 Running SPECjbb2005 on a 16 node machine and comparing bops/JVM JVMS LAST_PATCH WITH_PATCH %CHANGE 8 96589.6 105930 9.670 1 181830 178624 -1.76 (numbers from v1 based on v4.17-rc5) Testcase Time: Min Max Avg StdDev numa01.sh Real: 440.65 941.32 758.98 189.17 numa01.sh Sys: 183.48 320.07 258.42 50.09 numa01.sh User: 37384.65 71818.14 60302.51 13798.96 numa02.sh Real: 61.24 65.35 62.49 1.49 numa02.sh Sys: 16.83 24.18 21.40 2.60 numa02.sh User: 5219.59 5356.34 5264.03 49.07 numa03.sh Real: 822.04 912.40 873.55 37.35 numa03.sh Sys: 118.80 140.94 132.90 7.60 numa03.sh User: 62485.19 70025.01 67208.33 2967.10 numa04.sh Real: 690.66 872.12 778.49 65.44 numa04.sh Sys: 459.26 563.03 494.03 42.39 numa04.sh User: 51116.44 70527.20 58849.44 8461.28 numa05.sh Real: 418.37 562.28 525.77 54.27 numa05.sh Sys: 299.45 481.00 392.49 64.27 numa05.sh User: 34115.09 41324.02 39105.30 2627.68 Testcase Time: Min Max Avg StdDev %Change numa01.sh Real: 516.14 892.41 739.84 151.32 2.587% numa01.sh Sys: 153.16 192.99 177.70 14.58 45.42% numa01.sh User: 39821.04 69528.92 57193.87 10989.48 5.435% numa02.sh Real: 60.91 62.35 61.58 0.63 1.477% numa02.sh Sys: 16.47 26.16 21.20 3.85 0.943% numa02.sh User: 5227.58 5309.61 5265.17 31.04 -0.02% numa03.sh Real: 739.07 917.73 795.75 64.45 9.776% numa03.sh Sys: 94.46 136.08 109.48 14.58 21.39% numa03.sh User: 57478.56 72014.09 61764.48 5343.69 8.813% numa04.sh Real: 442.61 715.43 530.31 96.12 46.79% numa04.sh Sys: 224.90 348.63 285.61 48.83 72.97% numa04.sh User: 35836.84 47522.47 40235.41 3985.26 46.26% numa05.sh Real: 386.13 489.17 434.94 43.59 20.88% numa05.sh Sys: 144.29 438.56 278.80 105.78 40.77% numa05.sh User: 33255.86 36890.82 34879.31 1641.98 12.11% Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Mel Gorman <mgorman@techsingularity.net> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@surriel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1529514181-9842-3-git-send-email-srikar@linux.vnet.ibm.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2018-07-25 11:41:06 +02:00
Srikar Dronamraju	6e30396767	sched/numa: Remove redundant field 'numa_entry' is a struct list_head defined in task_struct, but never used. No functional change. Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Rik van Riel <riel@surriel.com> Acked-by: Mel Gorman <mgorman@techsingularity.net> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1529514181-9842-2-git-send-email-srikar@linux.vnet.ibm.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2018-07-25 11:41:06 +02:00
Yun Wang	3d6c50c27b	sched/debug: Show the sum wait time of a task group Although we can rely on cpuacct to present the CPU usage of task groups, it is hard to tell how intense the competition is between these groups on CPU resources. Monitoring the wait time or sched_debug of each process could be very expensive, and there is no good way to accurately represent the conflict with these info, we need the wait time on group dimension. Thus we introduce group's wait_sum to represent the resource conflict between task groups, which is simply the sum of the wait time of the group's cfs_rq. The 'cpu.stat' is modified to show the statistic, like: nr_periods 0 nr_throttled 0 throttled_time 0 wait_sum 2035098795584 Now we can monitor the changes of wait_sum to tell how much a a task group is suffering in the fight of CPU resources. For example: (wait_sum - last_wait_sum) * 100 / (nr_cpu * period_ns) == X% means the task group paid X percentage of period on waiting for the CPU. Signed-off-by: Michael Wang <yun.wang@linux.alibaba.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/ff7dae3b-e5f9-7157-1caa-ff02c6b23dc1@linux.alibaba.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2018-07-25 11:41:05 +02:00
Vincent Guittot	2e62c4743a	sched/fair: Remove #ifdefs from scale_rt_capacity() Reuse cpu_util_irq() that has been defined for schedutil and set irq util to 0 when !CONFIG_IRQ_TIME_ACCOUNTING. But the compiler is not able to optimize the sequence (at least with aarch64 GCC 7.2.1): free *= (max - irq); free /= max; when irq is fixed to 0 Add a new inline function scale_irq_capacity() that will scale utilization when irq is accounted. Reuse this funciton in schedutil which applies similar formula. Suggested-by: Ingo Molnar <mingo@redhat.com> Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: rjw@rjwysocki.net Link: http://lkml.kernel.org/r/1532001606-6689-1-git-send-email-vincent.guittot@linaro.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2018-07-25 11:41:05 +02:00
Ingo Molnar	4765096f4f	Merge branch 'sched/urgent' into sched/core, to pick up fixes Signed-off-by: Ingo Molnar <mingo@kernel.org>	2018-07-25 11:29:58 +02:00
Hailong Liu	f3d133ee0a	sched/rt: Restore rt_runtime after disabling RT_RUNTIME_SHARE NO_RT_RUNTIME_SHARE feature is used to prevent a CPU borrow enough runtime with a spin-rt-task. However, if RT_RUNTIME_SHARE feature is enabled and rt_rq has borrowd enough rt_runtime at the beginning, rt_runtime can't be restored to its initial bandwidth rt_runtime after we disable RT_RUNTIME_SHARE. E.g. on my PC with 4 cores, procedure to reproduce: 1) Make sure RT_RUNTIME_SHARE is enabled cat /sys/kernel/debug/sched_features GENTLE_FAIR_SLEEPERS START_DEBIT NO_NEXT_BUDDY LAST_BUDDY CACHE_HOT_BUDDY WAKEUP_PREEMPTION NO_HRTICK NO_DOUBLE_TICK LB_BIAS NONTASK_CAPACITY TTWU_QUEUE NO_SIS_AVG_CPU SIS_PROP NO_WARN_DOUBLE_CLOCK RT_PUSH_IPI RT_RUNTIME_SHARE NO_LB_MIN ATTACH_AGE_LOAD WA_IDLE WA_WEIGHT WA_BIAS 2) Start a spin-rt-task ./loop_rr & 3) set affinity to the last cpu taskset -p 8 $pid_of_loop_rr 4) Observe that last cpu have borrowed enough runtime. cat /proc/sched_debug \| grep rt_runtime .rt_runtime : 950.000000 .rt_runtime : 900.000000 .rt_runtime : 950.000000 .rt_runtime : 1000.000000 5) Disable RT_RUNTIME_SHARE echo NO_RT_RUNTIME_SHARE > /sys/kernel/debug/sched_features 6) Observe that rt_runtime can not been restored cat /proc/sched_debug \| grep rt_runtime .rt_runtime : 950.000000 .rt_runtime : 900.000000 .rt_runtime : 950.000000 .rt_runtime : 1000.000000 This patch help to restore rt_runtime after we disable RT_RUNTIME_SHARE. Signed-off-by: Hailong Liu <liu.hailong6@zte.com.cn> Signed-off-by: Jiang Biao <jiang.biao2@zte.com.cn> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: zhong.weidong@zte.com.cn Link: http://lkml.kernel.org/r/1531874815-39357-1-git-send-email-liu.hailong6@zte.com.cn Signed-off-by: Ingo Molnar <mingo@kernel.org>	2018-07-25 11:29:08 +02:00
Daniel Bristot de Oliveira	840d719604	sched/deadline: Update rq_clock of later_rq when pushing a task Daniel Casini got this warn while running a DL task here at RetisLab: [ 461.137582] ------------[ cut here ]------------ [ 461.137583] rq->clock_update_flags < RQCF_ACT_SKIP [ 461.137599] WARNING: CPU: 4 PID: 2354 at kernel/sched/sched.h:967 assert_clock_updated.isra.32.part.33+0x17/0x20 [a ton of modules] [ 461.137646] CPU: 4 PID: 2354 Comm: label_image Not tainted 4.18.0-rc4+ #3 [ 461.137647] Hardware name: ASUS All Series/Z87-K, BIOS 0801 09/02/2013 [ 461.137649] RIP: 0010:assert_clock_updated.isra.32.part.33+0x17/0x20 [ 461.137649] Code: ff 48 89 83 08 09 00 00 eb c6 66 0f 1f 84 00 00 00 00 00 55 48 c7 c7 98 7a 6c a5 c6 05 bc 0d 54 01 01 48 89 e5 e8 a9 84 fb ff <0f> 0b 5d c3 0f 1f 44 00 00 0f 1f 44 00 00 83 7e 60 01 74 0a 48 3b [ 461.137673] RSP: 0018:ffffa77e08cafc68 EFLAGS: 00010082 [ 461.137674] RAX: 0000000000000000 RBX: ffff8b3fc1702d80 RCX: 0000000000000006 [ 461.137674] RDX: 0000000000000007 RSI: 0000000000000096 RDI: ffff8b3fded164b0 [ 461.137675] RBP: ffffa77e08cafc68 R08: 0000000000000026 R09: 0000000000000339 [ 461.137676] R10: ffff8b3fd060d410 R11: 0000000000000026 R12: ffffffffa4e14e20 [ 461.137677] R13: ffff8b3fdec22940 R14: ffff8b3fc1702da0 R15: ffff8b3fdec22940 [ 461.137678] FS: 00007efe43ee5700(0000) GS:ffff8b3fded00000(0000) knlGS:0000000000000000 [ 461.137679] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 461.137680] CR2: 00007efe30000010 CR3: 0000000301744003 CR4: 00000000001606e0 [ 461.137680] Call Trace: [ 461.137684] push_dl_task.part.46+0x3bc/0x460 [ 461.137686] task_woken_dl+0x60/0x80 [ 461.137689] ttwu_do_wakeup+0x4f/0x150 [ 461.137690] ttwu_do_activate+0x77/0x80 [ 461.137692] try_to_wake_up+0x1d6/0x4c0 [ 461.137693] wake_up_q+0x32/0x70 [ 461.137696] do_futex+0x7e7/0xb50 [ 461.137698] __x64_sys_futex+0x8b/0x180 [ 461.137701] do_syscall_64+0x5a/0x110 [ 461.137703] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [ 461.137705] RIP: 0033:0x7efe4918ca26 [ 461.137705] Code: 00 00 00 74 17 49 8b 48 20 44 8b 59 10 41 83 e3 30 41 83 fb 20 74 1e be 85 00 00 00 41 ba 01 00 00 00 41 b9 01 00 00 04 0f 05 <48> 3d 01 f0 ff ff 73 1f 31 c0 c3 be 8c 00 00 00 49 89 c8 4d 31 d2 [ 461.137738] RSP: 002b:00007efe43ee4928 EFLAGS: 00000283 ORIG_RAX: 00000000000000ca [ 461.137739] RAX: ffffffffffffffda RBX: 0000000005094df0 RCX: 00007efe4918ca26 [ 461.137740] RDX: 0000000000000001 RSI: 0000000000000085 RDI: 0000000005094e24 [ 461.137741] RBP: 00007efe43ee49c0 R08: 0000000005094e20 R09: 0000000004000001 [ 461.137741] R10: 0000000000000001 R11: 0000000000000283 R12: 0000000000000000 [ 461.137742] R13: 0000000005094df8 R14: 0000000000000001 R15: 0000000000448a10 [ 461.137743] ---[ end trace 187df4cad2bf7649 ]--- This warning happened in the push_dl_task(), because __add_running_bw()->cpufreq_update_util() is getting the rq_clock of the later_rq before its update, which takes place at activate_task(). The fix then is to update the rq_clock before calling add_running_bw(). To avoid double rq_clock_update() call, we set ENQUEUE_NOCLOCK flag to activate_task(). Reported-by: Daniel Casini <daniel.casini@santannapisa.it> Signed-off-by: Daniel Bristot de Oliveira <bristot@redhat.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Juri Lelli <juri.lelli@redhat.com> Cc: Clark Williams <williams@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Luca Abeni <luca.abeni@santannapisa.it> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tommaso Cucinotta <tommaso.cucinotta@santannapisa.it> Fixes: e0367b12674b sched/deadline: Move CPU frequency selection triggering points Link: http://lkml.kernel.org/r/ca31d073a4788acf0684a8b255f14fea775ccf20.1532077269.git.bristot@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2018-07-25 11:29:08 +02:00
Isaac J. Manjarres	2610e88946	stop_machine: Disable preemption after queueing stopper threads This commit: 9fb8d5dc4b64 ("stop_machine, Disable preemption when waking two stopper threads") does not fully address the race condition that can occur as follows: On one CPU, call it CPU 3, thread 1 invokes cpu_stop_queue_two_works(2, 3,...), and the execution is such that thread 1 queues the works for migration/2 and migration/3, and is preempted after releasing the locks for migration/2 and migration/3, but before waking the threads. Then, On CPU 2, a kworker, call it thread 2, is running, and it invokes cpu_stop_queue_two_works(1, 2,...), such that thread 2 queues the works for migration/1 and migration/2. Meanwhile, on CPU 3, thread 1 resumes execution, and wakes migration/2 and migration/3. This means that when CPU 2 releases the locks for migration/1 and migration/2, but before it wakes those threads, it can be preempted by migration/2. If thread 2 is preempted by migration/2, then migration/2 will execute the first work item successfully, since migration/3 was woken up by CPU 3, but when it goes to execute the second work item, it disables preemption, calls multi_cpu_stop(), and thus, CPU 2 will wait forever for migration/1, which should have been woken up by thread 2. However migration/1 cannot be woken up by thread 2, since it is a kworker, so it is affine to CPU 2, but CPU 2 is running migration/2 with preemption disabled, so thread 2 will never run. Disable preemption after queueing works for stopper threads to ensure that the operation of queueing the works and waking the stopper threads is atomic. Co-Developed-by: Prasad Sodagudi <psodagud@codeaurora.org> Co-Developed-by: Pavankumar Kondeti <pkondeti@codeaurora.org> Signed-off-by: Isaac J. Manjarres <isaacm@codeaurora.org> Signed-off-by: Prasad Sodagudi <psodagud@codeaurora.org> Signed-off-by: Pavankumar Kondeti <pkondeti@codeaurora.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: bigeasy@linutronix.de Cc: gregkh@linuxfoundation.org Cc: matt@codeblueprint.co.uk Fixes: 9fb8d5dc4b64 ("stop_machine, Disable preemption when waking two stopper threads") Link: http://lkml.kernel.org/r/1531856129-9871-1-git-send-email-isaacm@codeaurora.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2018-07-25 11:25:08 +02:00
Yi Wang	6cd0c583b0	sched/topology: Check variable group before dereferencing it The 'group' variable in sched_domain_debug_one() is not checked when firstly used in cpumask_test_cpu(cpu, sched_group_span(group)), but it might be NULL (it is checked later in the following while loop) and may cause NULL pointer dereference. We need to check it before using to avoid NULL dereference. Signed-off-by: Yi Wang <wang.yi59@zte.com.cn> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Jiang Biao <jiang.biao2@zte.com.cn> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: zhong.weidong@zte.com.cn Link: http://lkml.kernel.org/r/1532319547-33335-1-git-send-email-wang.yi59@zte.com.cn Signed-off-by: Ingo Molnar <mingo@kernel.org>	2018-07-25 11:25:07 +02:00
Waiman Long	c0dc373a78	locking/pvqspinlock/x86: Use LOCK_PREFIX in __pv_queued_spin_unlock() assembly code The LOCK_PREFIX macro should be used in the __raw_callee_save___pv_queued_spin_unlock() assembly code, so that the lock prefix can be patched out on UP systems. Signed-off-by: Waiman Long <longman@redhat.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Joe Mario <jmario@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Will Deacon <will.deacon@arm.com> Link: http://lkml.kernel.org/r/1531858560-21547-1-git-send-email-longman@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2018-07-25 11:22:20 +02:00
Peter Rosin	7b94ea5051	i2c/mux, locking/core: Annotate the nested rt_mutex usage If an i2c topology has instances of nested muxes, then a lockdep splat is produced when when i2c_parent_lock_bus() is called. Here is an example: ============================================ WARNING: possible recursive locking detected -------------------------------------------- insmod/68159 is trying to acquire lock: (i2c_register_adapter#2){+.+.}, at: i2c_parent_lock_bus+0x32/0x50 [i2c_mux] but task is already holding lock: (i2c_register_adapter#2){+.+.}, at: i2c_parent_lock_bus+0x32/0x50 [i2c_mux] other info that might help us debug this: Possible unsafe locking scenario: CPU0 ---- lock(i2c_register_adapter#2); lock(i2c_register_adapter#2); * DEADLOCK * May be due to missing lock nesting notation 1 lock held by insmod/68159: #0: (i2c_register_adapter#2){+.+.}, at: i2c_parent_lock_bus+0x32/0x50 [i2c_mux] stack backtrace: CPU: 13 PID: 68159 Comm: insmod Tainted: G O Call Trace: dump_stack+0x67/0x98 __lock_acquire+0x162e/0x1780 lock_acquire+0xba/0x200 rt_mutex_lock+0x44/0x60 i2c_parent_lock_bus+0x32/0x50 [i2c_mux] i2c_parent_lock_bus+0x3e/0x50 [i2c_mux] i2c_smbus_xfer+0xf0/0x700 i2c_smbus_read_byte+0x42/0x70 my2c_init+0xa2/0x1000 [my2c] do_one_initcall+0x51/0x192 do_init_module+0x62/0x216 load_module+0x20f9/0x2b50 SYSC_init_module+0x19a/0x1c0 SyS_init_module+0xe/0x10 do_syscall_64+0x6c/0x1a0 entry_SYSCALL_64_after_hwframe+0x42/0xb7 Reported-by: John Sperbeck <jsperbeck@google.com> Tested-by: John Sperbeck <jsperbeck@google.com> Signed-off-by: Peter Rosin <peda@axentia.se> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Deepa Dinamani <deepadinamani@google.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Chang <dpf@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Philippe Ombredanne <pombredanne@nexb.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Will Deacon <will.deacon@arm.com> Cc: Wolfram Sang <wsa@the-dreams.de> Link: http://lkml.kernel.org/r/20180720083914.1950-3-peda@axentia.se Signed-off-by: Ingo Molnar <mingo@kernel.org>	2018-07-25 11:22:20 +02:00
Peter Rosin	62cedf3e60	locking/rtmutex: Allow specifying a subclass for nested locking Needed for annotating rt_mutex locks. Tested-by: John Sperbeck <jsperbeck@google.com> Signed-off-by: Peter Rosin <peda@axentia.se> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Deepa Dinamani <deepadinamani@google.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Chang <dpf@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Philippe Ombredanne <pombredanne@nexb.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Will Deacon <will.deacon@arm.com> Cc: Wolfram Sang <wsa@the-dreams.de> Link: http://lkml.kernel.org/r/20180720083914.1950-2-peda@axentia.se Signed-off-by: Ingo Molnar <mingo@kernel.org>	2018-07-25 11:22:19 +02:00
Masayoshi Mizuma	190bd6e98a	EDAC, sb_edac: Add support for systems with segmented PCI buses Extend the driver to check whether segment number and bus number matches when deciding how to group memory controller PCI devices to CPU sockets. Signed-off-by: Masayoshi Mizuma <m.mizuma@jp.fujitsu.com> Reviewed-by: Tony Luck <tony.luck@intel.com> Cc: Mauro Carvalho Chehab <mchehab@kernel.org> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/20180724190213.26359-1-msys.mizuma@gmail.com [ Cleanup commit message. ] Signed-off-by: Borislav Petkov <bp@suse.de>	2018-07-25 11:17:15 +02:00
Mika Westerberg	2d8ff0b586	thunderbolt: Add support for runtime PM When Thunderbolt host controller is set to RTD3 mode (Runtime D3) it is present all the time. Because of this it is important to runtime suspend the controller whenever possible. In case of ICM we have following rules which all needs to be true before the host controller can be put to D3: - The controller firmware reports to support RTD3 - All the connected devices announce support for RTD3 - There is no active XDomain connection Implement this using standard Linux runtime PM APIs so that when all the children devices are runtime suspended, the Thunderbolt host controller PCI device is runtime suspended as well. The ICM firmware then starts powering down power domains towards RTD3 but it can prevent this if it detects that there is an active Display Port stream (this is not visible to the software, though). The Thunderbolt host controller will be runtime resumed either when there is a remote wake event (device is connected or disconnected), or when there is access from userspace that requires hardware access. Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-25 10:55:29 +02:00
Colin Ian King	fa3af1cb1e	thunderbolt: Remove redundant variable 'approved' Variable 'approved' is being assigned but is never used hence it is redundant and can be removed. Cleans up clang warning: warning: variable 'approved' set but not used [-Wunused-but-set-variable] Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-25 10:15:46 +02:00
Mika Westerberg	d04522fa08	thunderbolt: Use correct ICM commands in system suspend The correct way to put the ICM into suspend state is to send it NHI_MAILBOX_DRV_UNLOADS mailbox command. NHI_MAILBOX_SAVE_DEVS is not needed on Intel Titan Ridge so we can skip it. Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-25 10:15:24 +02:00
Mika Westerberg	84db685876	thunderbolt: No need to take tb->lock in domain suspend/complete If the connection manager implementation needs to touch the domain structures it ought to take the lock itself. Currently only ICM implements these hooks and it does not need the lock because we there will be no notifications before driver ready message is sent to it. Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-25 10:15:24 +02:00
Mika Westerberg	fdd92e89a4	thunderbolt: Do not unnecessarily call ICM get route This command is not really fast and can make resume time slower. We only need to get route again if the link was changed and during initial device connected message. Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-25 10:15:24 +02:00
Mika Westerberg	dba3caf621	thunderbolt: Use 64-bit DMA mask if supported by the platform PCI defaults to 32-bit DMA mask but this device is capable of full 64-bit addressing, so make sure we first try 64-bit DMA mask before falling back to the default 32-bit. Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-25 10:15:24 +02:00
Nathan Ciobanu	c356915ebc	thunderbolt: Fix small typo in variable name Fixes small variable name typo and the associated checkpatch spelling warning. Signed-off-by: Nathan Ciobanu <nathan.d.ciobanu@linux.intel.com> Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-25 10:15:24 +02:00
Greg Kroah-Hartman	3ceefa3ffd	Second set of IIO new device support, features and cleanups. There are also a couple of fixes that can wait for the coming merge window. Core new features * Support for phase channels (used in time of flight sensors amongst other things) * Support for deep UV light channel modifier. New Device Support * AD4758 DAC - New driver and dt bindings. * adxl345 - Support the adxl375 +-200g part which is register compatible. * isl29501 Time of flight sensor. - New driver * meson-saradc - Support the Meson8m2 Socs - right now this is just an ID, but there will be additional difference in future. * mpu6050 - New ID for 6515 variant. * si1133 UV sensor. - New driver * Spreadtrum SC27xx PMIC ADC - New driver and dt bindings. Features * adxl345 - Add calibration offset readback and writing. - Add sampling frequency control. Fixes and Cleanups * ad5933 - Use a macro for the channel definition to reduce duplication. * ad9523 - Replace use of core mlock with a local lock. Part of ongoing efforts to avoid confusing the purpose of mlock which is only about iio core state changes. - Fix displayed phase which was out by a factor of 10. * adxl345 - Add a link to the datasheet. - Rework the use of the address field in the chan_spec structures to allow addition of more per channel information. * adis imu - Mark switch fall throughs. * at91-sama5d2 - Fix some casting on big endian systems. * bmp280 - Drop some DT elements that aren't used and should mostly be done from userspace rather than in DT. * hx711 - add clock-frequency dt binding and resulting delay to deal with capacitance issue on some boards. - fix a spurious unit-address in the example. * ina2xx - Avoid a possible kthread_stop with a stale task_struct. * ltc2632 - Remove some unused local variables (assigned but value never used). * max1363 - Use device_get_match_data to remove some boilerplate. * mma8452 - Mark switch fall throughs. * sca3000 - Fix a missing return in a switch statement (a bad fallthrough previously!) * sigma-delta-modulator - Drop incorrect unit address from the DT example. * st_accel - Use device_get_match_data to drop some boiler plate. - Move to probe_new for i2c driver as second parameter not used. * st_sensors library - Use a strlcpy (safe in this case). * st_lsm6dsx - Add some error logging. * ti-ads7950 - SPDX - Allow simultaneous buffered and polled reads. Needed on a Lego Mindstorms EV3 where some channels are used for power supply monitoring at a very low rate. * ti-dac5571 - Remove an unused variable. * xadc - Drop some dead code. -----BEGIN PGP SIGNATURE----- iQJFBAABCAAvFiEEbilms4eEBlKRJoGxVIU0mcT0FogFAltXZAkRHGppYzIzQGtl cm5lbC5vcmcACgkQVIU0mcT0FohYZQ//VAjpDBYjLzYTvTJy5bDt61fbh8KabhBf oxLIpwYrCeleLnpbrY7nU8shdIL7Vm755jtsHbTtQPCKSQ0RGnhLLDoqoWcmn70J rF9iVaSv+S2lZO+9+hv2eeqyX+kSM+74fkWRuLmDbaSZWYO4Jt9zFER1zizmPypY DnxLcViw1kwOLbiZKwmcaK0MqlWHRPhEEcNVKy7VGZHznbylujh8evkzzQNVWOol QrR2NG7V8BcLTflmsYCErQDvgciGjscnVZUAyY3yNLIpceGCSHZfUsE8ld6iPrS+ aPeuiIxDhHAKyoOTQwsGi9ex7KEOUOkoDHhKdR3Jr74mtfcPF5B+TxgXU0p5UZ9g GummuvSX0izYjUZ9P4keVgu3W4bvmR9Kd8oJUHNByWI1iecoXP9bQf33tEyb26R6 G1zvGSDXPNK1V7OEaGvzGkgxOY0ZAIWLRX/+wasErdJnt3lmOV9+cCSkJAFSNrk3 jQ922q2ZWLfYAL6nNIAx2dIiJirxTQ2JIq/bys2BHiYvkuvqNcKoBIDAGlQ4xBKm /c5z9Dm/DxQpdlKFQugHmc5awLEZxpq2LCTBLlgM8z6+uRWXui+slPfIrfX5RWun BHaLmPNm6tKQLadjwWCoxXYjKqgK0wm35Yq5d5He7d45d3QWKvtUgZAj33pcIgTE wKmwF5oaLiU= =T+hS -----END PGP SIGNATURE----- Merge tag 'iio-for-4.19b' of git://git.kernel.org/pub/scm/linux/kernel/git/jic23/iio into staging-next Jonathan writes: Second set of IIO new device support, features and cleanups. There are also a couple of fixes that can wait for the coming merge window. Core new features * Support for phase channels (used in time of flight sensors amongst other things) * Support for deep UV light channel modifier. New Device Support * AD4758 DAC - New driver and dt bindings. * adxl345 - Support the adxl375 +-200g part which is register compatible. * isl29501 Time of flight sensor. - New driver * meson-saradc - Support the Meson8m2 Socs - right now this is just an ID, but there will be additional difference in future. * mpu6050 - New ID for 6515 variant. * si1133 UV sensor. - New driver * Spreadtrum SC27xx PMIC ADC - New driver and dt bindings. Features * adxl345 - Add calibration offset readback and writing. - Add sampling frequency control. Fixes and Cleanups * ad5933 - Use a macro for the channel definition to reduce duplication. * ad9523 - Replace use of core mlock with a local lock. Part of ongoing efforts to avoid confusing the purpose of mlock which is only about iio core state changes. - Fix displayed phase which was out by a factor of 10. * adxl345 - Add a link to the datasheet. - Rework the use of the address field in the chan_spec structures to allow addition of more per channel information. * adis imu - Mark switch fall throughs. * at91-sama5d2 - Fix some casting on big endian systems. * bmp280 - Drop some DT elements that aren't used and should mostly be done from userspace rather than in DT. * hx711 - add clock-frequency dt binding and resulting delay to deal with capacitance issue on some boards. - fix a spurious unit-address in the example. * ina2xx - Avoid a possible kthread_stop with a stale task_struct. * ltc2632 - Remove some unused local variables (assigned but value never used). * max1363 - Use device_get_match_data to remove some boilerplate. * mma8452 - Mark switch fall throughs. * sca3000 - Fix a missing return in a switch statement (a bad fallthrough previously!) * sigma-delta-modulator - Drop incorrect unit address from the DT example. * st_accel - Use device_get_match_data to drop some boiler plate. - Move to probe_new for i2c driver as second parameter not used. * st_sensors library - Use a strlcpy (safe in this case). * st_lsm6dsx - Add some error logging. * ti-ads7950 - SPDX - Allow simultaneous buffered and polled reads. Needed on a Lego Mindstorms EV3 where some channels are used for power supply monitoring at a very low rate. * ti-dac5571 - Remove an unused variable. * xadc - Drop some dead code.	2018-07-25 10:12:07 +02:00
Daniel Thompson	633786736e	backlight: pwm_bl: Fix uninitialized variable Currently, if the DT does not define num-interpolated-steps then num_steps is undefined and the interpolation code will deploy randomly. Fix with a simple initialize to zero. Fixes: 573fe6d1c25c ("backlight: pwm_bl: Linear interpolation between brightness-levels") Reported-by: Marcel Ziswiler <marcel.ziswiler@toradex.com> Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org> Tested-by: Marcel Ziswiler <marcel.ziswiler@toradex.com> Reviewed-by: Douglas Anderson <dianders@chromium.org> Signed-off-by: Lee Jones <lee.jones@linaro.org>	2018-07-25 09:09:59 +01:00
Benjamin Herrenschmidt	2450ceaf21	ARM: dts: aspeed: Add coprocessor interrupt controller Add a node for the CVIC (the coprocessor interrupt controller) and add a label to the SRAM node so it can be referenced from the board device-tree file. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Joel Stanley <joel@jms.id.au>	2018-07-25 17:38:03 +09:30
Kalle Valo	bf9b608e63	Merge ath-next from git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/ath.git ath.git patches for 4.19. Major changes: wcn36xx * fix WEP in client mode wil6210 * add support for Talyn-MB (Talyn ver 2.0) device * add support for enhanced DMA firmware feature	2018-07-25 10:50:54 +03:00
Rafał Miłecki	299b6365a3	brcmfmac: fix regression in parsing NVRAM for multiple devices NVRAM is designed to work with Broadcom's SDK Linux kernel which fakes PCI domain 0 for all internal MMIO devices. Since official Linux kernel uses platform devices for that purpose there is a mismatch in numbering PCI domains. There used to be a fix for that problem but it was accidentally dropped during the last firmware loading rework. That resulted in brcmfmac not being able to extract device specific NVRAM content and all kind of calibration problems. Reported-by: Aditya Xavier <adityaxavier@gmail.com> Fixes: 2baa3aaee27f ("brcmfmac: introduce brcmf_fw_alloc_request() function") Cc: stable@vger.kernel.org # v4.17+ Signed-off-by: Rafał Miłecki <rafal@milecki.pl> Acked-by: Arend van Spriel <arend.vanspriel@broadcom.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org>	2018-07-25 10:30:36 +03:00
Emmanuel Grumbach	0a5257bc6d	iwlwifi: add more card IDs for 9000 series Add new device IDs for the 9000 series. Cc: stable@vger.kernel.org # 4.14 Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org>	2018-07-25 10:29:29 +03:00
Alan Chiang	a2b3bf4846	eeprom: at24: Add support for address-width property Provide a flexible way to determine the addressing bits of eeprom. Pass the addressing bits to driver through address-width property. Signed-off-by: Alan Chiang <alanx.chiang@intel.com> Signed-off-by: Andy Yeh <andy.yeh@intel.com> Signed-off-by: Bartosz Golaszewski <brgl@bgdev.pl>	2018-07-25 09:17:57 +02:00
Alan Chiang	21d0405450	dt-bindings: at24: Add address-width property Currently the only way to use a variant of a supported model with a different address width is to define a new compatible string and the corresponding chip data structure. Provide a flexible way to specify the size of the address pointer by defining a new property: address-width. Signed-off-by: Alan Chiang <alanx.chiang@intel.com> Signed-off-by: Andy Yeh <andy.yeh@intel.com> Acked-by: Sakari Ailus <sakari.ailus@linux.intel.com> Reviewed-by: Rob Herring <robh@kernel.org> [Bartosz: fixed the commit message] Signed-off-by: Bartosz Golaszewski <brgl@bgdev.pl>	2018-07-25 09:17:24 +02:00
Masahiro Yamada	0fbe9a245c	microblaze: add endianness options to LDFLAGS instead of LD With the recent syntax extension, Kconfig is now able to evaluate the compiler / toolchain capability. However, accumulating flags to 'LD' is not compatible with the way it works; 'LD' must be passed to Kconfig to call $(ld-option,...) from Kconfig files. If you tweak 'LD' in arch Makefile depending on CONFIG_CPU_BIG_ENDIAN, this would end up with circular dependency between Makefile and Kconfig. Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Signed-off-by: Michal Simek <michal.simek@xilinx.com>	2018-07-25 09:14:47 +02:00
Martin Schwidefsky	6eedfaac71	s390: reenable gcc plugins Now that the early boot rework is upstream we can enable the gcc plugins again. See git commit 72f108b308707f21499e0ac05bf7370360cf06d8 "s390: disable gcc plugins" for reference. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2018-07-25 09:07:20 +02:00
Martin Schwidefsky	2a6777a118	s390: disable gcc plugins The s390 build currently fails with the latent entropy plugin: arch/s390/kernel/als.o: In function `verify_facilities': als.c:(.init.text+0x24): undefined reference to `latent_entropy' als.c:(.init.text+0xae): undefined reference to `latent_entropy' make[3]: * [arch/s390/boot/compressed/vmlinux] Error 1 make[2]: * [arch/s390/boot/compressed/vmlinux] Error 2 make[1]: *** [bzImage] Error 2 This will be fixed with the early boot rework from Vasily, which is planned for the 4.19 merge window. For 4.18 the simplest solution is to disable the gcc plugins and reenable them after the early boot rework is upstream. Reported-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> (cherry picked from commit 2fba3573f1cf876ad94992c256c5c410039e60b4)	2018-07-25 09:07:09 +02:00
Joel Stanley	b0ddc9106c	ARM: config: multi_v5: Enable ASPEED drivers This enables the devices used in the AST2400 family of BMC SoCs: - VUART - SPI NOR - LPC controller - LPC snoop (port 80) - Ethernet - GPIO - ADC - I2C - Random number generator - IPMI KCS - IPMI BT - Fan/Tach Acked-by: Andrew Jeffery <andrew@aj.id.au> Signed-off-by: Joel Stanley <joel@jms.id.au>	2018-07-25 16:35:47 +09:30
Joel Stanley	fc2a325bbc	ARM: config: multi_v5: Refresh configuration This is the result of a make mutli_v5_defconfig && make savedefconfig. Acked-by: Andrew Jeffery <andrew@aj.id.au> Signed-off-by: Joel Stanley <joel@jms.id.au>	2018-07-25 16:35:42 +09:30
Joel Stanley	20c90af9ea	ARM: config: aspeed: Update defconfig - Enable new support: hardware random number generator FSI and client drivers DRM GFX driver - Disable unwanted features: ARM_APPENDED_DTB ARM_ATAG_DTB_COMPAT BLK_DEV_RAM - Sync G4 and G5 with OpenBMC configurations BLK_DEV_LOOP, for updater mechanic CRYPTO_HMAC, for libsdbus features CRYPTO_SHA256 CRYPTO_USER_API_HASH - Enable security related features: SLAB_FREELIST_RANDOM STRICT_KERNEL_RW CC_STACKPROTECTOR_STRONG HARDENED_USERCOPY FORTIFY_SOURCE - Increase kernel log buffer size Reviewed-by: Andrew Jeffery <andrew@aj.id.au> Signed-off-by: Joel Stanley <joel@jms.id.au>	2018-07-25 16:35:19 +09:30
Benjamin Herrenschmidt	375cac7010	fsi: master-ast-cf: Mask unused bits in RTAG/RCRC Then reading the RTAG/RCRC "registers" from the coprocessor after a command is complete, mask out the top bits, only keep the relevant bits. Microcode v5 will leave garbage in those top bits as a result of a performance optimization. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> ---	2018-07-25 16:04:53 +10:00
Jeremy Cline	e66565f3be	bpf: Add Python 3 support to selftests scripts for bpf Adjust tcp_client.py and tcp_server.py to work with Python 3 by using the print function, marking string literals as bytes, and using the newer exception syntax. This should be functionally equivalent and supports Python 3+. Signed-off-by: Jeremy Cline <jcline@redhat.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-07-25 07:46:48 +02:00
YueHaibing	2cc512c1fa	bpf: btf: fix inconsistent IS_ERR and PTR_ERR Fix inconsistent IS_ERR and PTR_ERR in get_btf, the proper pointer to be passed as argument is '*btf' This issue was detected with the help of Coccinelle. Fixes: 2d3feca8c44f ("bpf: btf: print map dump and lookup with btf info") Signed-off-by: YueHaibing <yuehaibing@huawei.com> Acked-by: David S. Miller <davem@davemloft.net> Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-07-25 07:43:31 +02:00
Daniel Borkmann	684cce1c31	Merge branch 'bpf-annotate-kv-pair' Martin KaFai Lau says: ==================== The series allows the BPF loader to figure out the btf_key_id and btf_value_id from a map's name by using BPF_ANNOTATE_KV_PAIR() similarly as in iproute2 commit f823f36012fb ("bpf: implement btf handling and map annotation"). It also removes the old 'typedef' way which requires two separate typedefs (one for the key and one for the value). By doing this, iproute2 and libbpf have one consistent way to figure out the btf_key_type_id and btf_value_type_id for a map. The first two patches are some prep/cleanup works. The last patch introduces BPF_ANNOTATE_KV_PAIR. v3: - Replace some more int_t and u* usages with the equivalent __[su]* in btf.c v2: - Fix the incorrect '&&' check on container_type in bpf_map_find_btf_info(). - Expose the existing static btf_type_by_id() instead of creating a new one. ==================== Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-07-25 07:00:27 +02:00

... 115 116 117 118 119 ...

781771 Commits