For a long time the tick was aligned to clock MONOTONIC so that the tick
event happened at a multiple of nanoseconds per tick starting from clock
MONOTONIC = 0.
At some point this changed as the refined jiffies clocksource which is
used during boot before the TSC or other clocksources becomes usable, was
adjusted with a boot offset, so that time 0 is closer to the point where
the kernel starts.
This broke the assumption in the tick code that when the tick setup
happens early on ktime_get() will return a multiple of nanoseconds per
tick. As a consequence applications which aligned their periodic
execution so that it does not collide with the tick were not longer
guaranteed that the tick period starts from time 0.
The fix for this regression was to realign the tick when it is initially
set up to a multiple of tick periods. That works as long as the
underlying tick device supports periodic mode, but breaks under certain
conditions when the tick device supports only one shot mode.
Depending on the offset, the alignment delta to clock MONOTONIC can get
in a range where the minimal programming delta of the underlying clock
event device is larger than the calculated delta to the next tick. This
results in a boot hang as the tick code tries to play catch up, but as
the tick never fires jiffies are not advanced so it keeps trying for
ever.
Solve this by moving the tick alignement into the NOHZ / HIGHRES
enablement code because at that point it is guaranteed that the
underlying clocksource is high resolution capable and not longer
depending on the tick.
This is far before user space starts, so at the point where applications
try to align their timers, the old behaviour of the tick happening at a
multiple of nanoseconds per tick starting from clock MONOTONIC = 0 is
restored.
-----BEGIN PGP SIGNATURE-----
iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmSTUNQTHHRnbHhAbGlu
dXRyb25peC5kZQAKCRCmGPVMDXSYoQQjD/4+zHK7IxRK+1k1lIStxksEudVeNuqK
7RZKNTawZFSChrxGmv5uhmqYDK9/tudnZot+uyAXI81JHnnGSXbco+VrAA2DQ2v2
w/oxhAHIAobJWDzUuCqT2YsseDVNoTGMrTTPrL4klXqhVtShVOLTI/603+macEiw
Olqmd7XxlyhCloNqoMIh18EwSNIs78kP3YAQHUmq3NoRkJT5c3wWKvZy38WOxSJ9
Jjpw75tePfcwWte9fGVq+oJ8hOWkPAKN5hFL8420RrEgdIkxKlhVmuyG48ieZKH1
LE1brB5iaU8+M91PevZMRlD7oio8u/d6xX1vT3Cqz9UBe1NLj1I0ToX6COhwSNxm
50U0HJ5wGlpodu4IyBHXTG45UAVwpOg/EgDlFigeP3Nxl6h45ugrA010r6FqE1zV
KltjNGsk/OD5d8ywW+yManU2mySm9znqmWN2zbjRGm138Tuz7vglcBhFB6jEJFUF
V8vYPLwI4J2gqifx8NoFBIfLuZp/+D4LOQiJ7XRc9CVm3JLK8Xi8fxnHGTBwJHIn
45WU4hzmY7fjR/gGhAyG7lH6tlo7VTC9/lrO+YtAy2hL2sK14+m/ZH7836I5bAhw
EgSyVlcfR6tgO55I3JgkbNIyO5TUAZPZIAY1TlyR53xXvMkb9aZTh9UMm1xMJ3uX
+sQaLAuDFNUQSg==
=/uvx
-----END PGP SIGNATURE-----
Merge tag 'timers-urgent-2023-06-21' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull timer fix from Thomas Gleixner:
"A single regression fix for a regression fix:
For a long time the tick was aligned to clock MONOTONIC so that the
tick event happened at a multiple of nanoseconds per tick starting
from clock MONOTONIC = 0.
At some point this changed as the refined jiffies clocksource which is
used during boot before the TSC or other clocksources becomes usable,
was adjusted with a boot offset, so that time 0 is closer to the point
where the kernel starts.
This broke the assumption in the tick code that when the tick setup
happens early on ktime_get() will return a multiple of nanoseconds per
tick. As a consequence applications which aligned their periodic
execution so that it does not collide with the tick were not longer
guaranteed that the tick period starts from time 0.
The fix for this regression was to realign the tick when it is
initially set up to a multiple of tick periods. That works as long as
the underlying tick device supports periodic mode, but breaks under
certain conditions when the tick device supports only one shot mode.
Depending on the offset, the alignment delta to clock MONOTONIC can
get in a range where the minimal programming delta of the underlying
clock event device is larger than the calculated delta to the next
tick. This results in a boot hang as the tick code tries to play catch
up, but as the tick never fires jiffies are not advanced so it keeps
trying for ever.
Solve this by moving the tick alignement into the NOHZ / HIGHRES
enablement code because at that point it is guaranteed that the
underlying clocksource is high resolution capable and not longer
depending on the tick.
This is far before user space starts, so at the point where
applications try to align their timers, the old behaviour of the tick
happening at a multiple of nanoseconds per tick starting from clock
MONOTONIC = 0 is restored"
* tag 'timers-urgent-2023-06-21' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
tick/common: Align tick period during sched_timer setup