If Z_RAISE_BETWEEN_PROBINGS is null or undefined the probe is currently not raised by home_offset[Z_AXIS] and zprobe_zoffset.
But when different from 0 is.
If an undefined Z_RAISE_BETWEEN_PROBINGS expands to 0 (and it does) this is the solution.
A similar asymmetry exists with the newly introduced 'short-cut' in G28 - but its the rise before anything is probed - so should not make a difference.
`dock_sled()` is never called with offset parameter - remove it.
We move x only - so only that needs to be homed. Consequence is - we can home to z-min now with a sled probe!
Feedrates are set and restored in `do_blocking_move()`.
We already checked if the probe is deployed/stowed in deploy/stow_probe.
```
if (z_loc < _Z_RAISE_PROBE_DEPLOY_STOW + 5) z_loc = _Z_RAISE_PROBE_DEPLOY_STOW;
```
makes no sense - remove.
Now the raise is the same for deploy/stow -> move before the if.
Replace the if with a ternary.
Instead writing LOW/HIGH use the boolean `stow` we already have.
There is no reason for not using the sled probe in G29/M48 with 'E'.
It takes a while but works. (tested!)
and saving ~1k memory
by limiting the `#pragma GCC optimize (3)` optimisation to `ultralcd_st7920_u8glib_rrd.h`. These optimisation was and is not done for all the other displays, is the reason for the big additionally use of memory, because the complete 'ultralcd.cpp' and 'dogm_lcd_implementation.h' was optimised (sadly i did not observe a change in speed).
Unrolling the loop in `ST7920_SWSPI_SND_8BIT()`, what i expected the optimiser to do, by hand, saved some speed by eliminating the loop variable (i) compares and increases. Every CPU cycle in this loop costs at least 0.5ms per display update because it's executed more than 1k times/s.
The delays are now pre-filled with the calculated values for 4.5V driven ST7920.
A way to simply add __your__ timing into the configuration was made.
At 4.5V
1.) The CLK signal needs to be at least 200ns high and 200ns low.
2.) The DAT pin needs to be set at least 40ns before CLK goes high and must stay at this value until 40ns after CLK went high.
A nop takes one processor cycle.
For 16MHz one nop lasts 62.5ns.
For 20MHz one not lasts 50ns.
To fulfill condition 1.) we need 200/62.5 = 3.2 => 4 cycles (200/50 = 4 => 4). For the low phase, setting the pin takes much longer. For the high phase we (theoretically) have to throw in 2 nops, because changing the CLK takes only 2 cycles.
Condition 2.) is always fulfilled because the processor needs two cycles (100 - 125ns) for switching the CLK pin.
Needs tests and feedback.
Especially i cant test 20MHz, 3DRAG and displays supplied wit less than 5V.
Are the delays right? Please experiment with longer or shorter delays. And give feedback.
Already tested are 5 displays with 4.9V - 5.1V at 16MHz where no delays are needed.