CAN bus host support for Marlin running MKS Monster 8 V1/V2 board (STM32F4) tech demo #27547

Open
wants to merge 9 commits into
base: bugfix-2.1.x

rondlh
Contributor

@rondlh rondlh commented Nov 26, 2024

Description

This is NOT a real pull request, just some code that might be useful for inspiration. I hope I don't break the rules by doing this, if so I will delete the PR asap. This is very experimental code, but it has been working well for me for quite some time. Unfortunately the code is hardware specific. I have converted 2 printers with this setup already, and am considering converting an IDEX printer too.

What is this code?
I have added CAN bus support to Marlin based on the MKS Monster8 V1/V2 board. This way the board can communicate with a BTT EBB42 V1.2 (STM32G0) board. The EBB42 also runs Marlin, but with the client side of the code (I will post it if there is interest).
The head board controls the hotend heater, auto fan, cooling fan, bed level sensor, neopixel, but NOT the extruder stepper (I could not get that working yet). The head reports back the extruder temperature, probe status, endstops, and filament detector status.
This works perfectly fine and stable, only the CAN cable (4 wires) and stepper cable go to the head. This might be reduced to only the CAN cable if I can manage to synchronize the head stepper with the host requirements.

I've made a protocol that can send gcode to the head via the CAN bus, to make this efficient I encode a lot of the gcode in the identifier of the CAN message:

CAN messages to head

The host filters the gcode, because only very few gcodes are relevant for the head. A gcode starts with an extended ID message, and additional standard ID messages are sent if there are more than 2 parameters (currently up to 7 parameters are supported). The parameter values are encoded as floats and sent via the data part of the CAN message (0-8 bytes of data).
The head converts the message back to gcode and executes it on the head.
This whole process is interrupt based, and quite fast.
On the head a Z-probe trigger causes an interrupt which creates a CAN message to inform the host, which triggers an interrupt on the host to report the probe trigger.
Hotend temperature reports are only sent about twice a second. The temperature is fully managed by the head, but the host "thinks" it is managing this.
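
As a rough illustration of the payload side of this scheme (the identifier bit layout is not reproduced here), two float parameters fit exactly into the 8 data bytes of one classic CAN frame. This is only a minimal sketch: `pack_params` and `unpack_params` are hypothetical names, not functions from the PR, and it assumes both boards share the same float format and byte order, which holds for the Cortex-M parts mentioned.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helpers, not taken from the PR: copy up to two float
// parameter values into the 8 data bytes of a classic CAN frame.
// Returns the number of payload bytes used (0, 4 or 8).
inline uint8_t pack_params(const float *values, uint8_t count, uint8_t data[8]) {
  if (count > 2) count = 2;  // one frame carries at most two floats
  std::memcpy(data, values, count * sizeof(float));
  return uint8_t(count * sizeof(float));
}

// Reverse operation on the receiving side.
// 'length' is the received data length (0-8); returns the number of floats read.
inline uint8_t unpack_params(const uint8_t data[8], uint8_t length, float *values) {
  uint8_t count = uint8_t(length / sizeof(float));
  if (count > 2) count = 2;  // never read past the 8-byte payload
  std::memcpy(values, data, count * sizeof(float));
  return count;
}
```

A gcode with more than two parameters would then need one extra frame per additional pair of values, in line with the description above.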

Some things to consider:

  1. The CAN bus module must be enabled in platform IO to enable the required CAN libraries.

  2. MPC only - I have only implemented MPC control for the hotend (no PID), which works in a transparent way. If the host receives an "M306 T" (MPC autotune) gcode, then this gcode is aborted on the host and forwarded to the head. The head will do the autotune task and then send back a report to the host about its findings. The host stores the MPC settings. The MPC (and other) settings are sent to the host at startup.

  3. Thermistor selection is done by the head! The host thermistor selection is irrelevant! This is decided at compile time and cannot be changed by sending configuration gcodes to the head. For this reason I let the head send a message to the host to inform the user which thermistor is selected.

  4. BLTouch/3D touch - They are connected to the head, but Marlin has some "underwater" activities that need to be forwarded to the host. So servo activity is forwarded to the host. I use BLtouch on the head, but it should work with a 3d touch too. (Actually I have simplified the use of the BLTouch to act more like a 3D Touch by keeping it in Switch mode all the time).

  5. If the CAN bus is selected on the MKS Monster V1/V2 then the I2C bus is disconnected from the onboard EEPROM, so it will not be available anymore. Program memory can be used instead, or 2 patch wires can restore the connection to the EEPROM (I use PA8 and PC9).

  6. The EBB42 uses an STM32G0, which has FDCAN support (more advanced), which is quite different from the CAN implementation of the host.

  7. The STM32 CAN hardware has 2 FIFO buffers with only 3 positions for the CAN messages. This is not a lot, but sufficient (especially if the head stepper driver is not used). I use the CAN message filter to store the incoming messages alternating in FIFO0 and FIFO1 so effectively 6 buffers are used. For this reason I toggle 1 bit in the message identifier ID so the receiver forwards the message to the FIFO I target.
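
The FIFO alternation from item 7 could look roughly like the sketch below. Which identifier bit is actually toggled is not stated here, so `FIFO_SELECT_BIT` is an assumption for illustration, and `FifoToggler` and `target_fifo` are hypothetical names. On the real hardware the receive-side decision is made by the CAN filter banks, not in software.

```cpp
#include <cstdint>

// Assumption for illustration only: bit 0 of the identifier selects the
// receive FIFO. The actual bit used by the PR is not stated in this thread.
constexpr uint32_t FIFO_SELECT_BIT = 0x1;

// Sender side: flip the selector bit on every message so that consecutive
// messages land in alternating FIFOs on the receiver.
struct FifoToggler {
  bool use_fifo1 = false;

  uint32_t next_id(uint32_t base_id) {
    const uint32_t id = use_fifo1 ? (base_id | FIFO_SELECT_BIT)
                                  : (base_id & ~FIFO_SELECT_BIT);
    use_fifo1 = !use_fifo1;
    return id;
  }
};

// Receiver side, shown in software for clarity: 0 means FIFO0, 1 means FIFO1.
constexpr int target_fifo(uint32_t id) {
  return (id & FIFO_SELECT_BIT) ? 1 : 0;
}
```

With two filters, one matching the bit cleared and one matching it set, a burst of up to six messages can be buffered before the receive interrupt has to drain anything.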

Requirements

MKS Monster 8 V1/V2 motherboard
BTT EBB42 V1.2 CAN bus head module

Benefits

Much easier wiring, only CAN bus (4 wires) and stepper (4 wires). The head has a TMC stepper driver on board, but I could not get it synchronized correctly with the host. It can control the hotend, auto fan, controller fan, neopixel, servo probe and report back the temperature, probe and endstop status.

Nothing really, it's a tech demo for inspiration.

Configurations

The whole configuration is available, I only made minimal changes to the source files.

Related Issues

#7735

Added experimental CAN host support to communicate with BTT EBB42 V1.2 CAN head board running Marlin
Fix typo in Configuration.,h for MOTHERBOARD
thinkyhead added a commit to MarlinFirmware/Configurations that referenced this pull request Nov 26, 2024
@thinkyhead thinkyhead force-pushed the bugfix-2.1.x branch 2 times, most recently from 16f4aea to 96cc5d3 Compare November 26, 2024 23:03
@thinkyhead
Member

I cleaned it up and got it building so you can continue to test and refine. The handling of the probe probably needs to be more specific. Users need to be able to mix-and-match endstops and probes according to their designs, but we can also add options to enable specific CAN peripherals that have known protocols and hardware such as built in Z probes and sensorless XY endstops, etc.

@rondlh
Contributor Author

rondlh commented Nov 27, 2024

I cleaned it up and got it building so you can continue to test and refine.

@thinkyhead Thanks, well done, you did a lot of good work there. You could have told me that I completely forgot to include CAN.h, sorry about that!

The handling of the probe probably needs to be more specific. Users need to be able to mix-and-match endstops and probes according to their designs, but we can also add options to enable specific

Agree, this needs some work, preparations are made, but it's currently only implemented for the Z-probe.
Virtual Endstops work like this:
On the head board the ENDSTOP_INTERRUPTS_FEATURE is enabled, which triggers an interrupt which sends a CAN message with the updated IO status to the host. On the host the incoming CAN message triggers an interrupt which causes the CAN message to be read from the CAN hardware FIFO. The read message updates CAN_io_state and "endstops.update()" is called.

Here are the masks to get the relevant bits stored in CAN_io_state:

  #define CAN_PROBE_MASK                   1 // Virtual IO bit for probe
  #define CAN_FILAMENT_MASK                2 // Virtual IO bit for filament
  #define CAN_X_ENDSTOP_MASK               4 // Virtual IO bit for X-endstop
  #define CAN_Y_ENDSTOP_MASK               8 // Virtual IO bit for Y-endstop
  #define CAN_Z_ENDSTOP_MASK              16 // Virtual IO bit for Z-endstop
  #define CAN_STRING_MESSAGE_MASK         32 // Signals the head sent a string message
  #define CAN_REQUEST_SETUP_MASK          64 // Signals the head requests setup information
  #define CAN_TMC_OT_MASK                128 // Signals the head signals a TMC Over Temp error
  #define CAN_E0_TARGET_MASK             256 // Signals E0 or E1
  #define CAN_ERROR_MASK                 512 // Signals the head encountered an error

One should be able to define which CAN virtual IO bits are relevant for the host, only these endstop results should be transferred to the host process. I'm not sure how I could transparently transfer the virtual endstop status into Marlin.
@thinkyhead Perhaps you can point me in the right direction...
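
One possible shape for such a host-side selection is sketched below. The mask values are copied from the list above, while `CAN_RELEVANT_IO_MASK`, `relevant_io` and `relevant_io_changed` are hypothetical names that do not exist in Marlin or in this PR.

```cpp
#include <cstdint>

// Mask values copied from the list above (only the IO-related ones).
constexpr uint32_t CAN_PROBE_MASK     = 1;
constexpr uint32_t CAN_FILAMENT_MASK  = 2;
constexpr uint32_t CAN_X_ENDSTOP_MASK = 4;
constexpr uint32_t CAN_Y_ENDSTOP_MASK = 8;
constexpr uint32_t CAN_Z_ENDSTOP_MASK = 16;

// Hypothetical host-side setting: the virtual IO bits this machine uses.
// Here only the probe and the filament sensor live on the head.
constexpr uint32_t CAN_RELEVANT_IO_MASK = CAN_PROBE_MASK | CAN_FILAMENT_MASK;

// Keep only the bits the host cares about.
constexpr uint32_t relevant_io(uint32_t can_io_state) {
  return can_io_state & CAN_RELEVANT_IO_MASK;
}

// True if any relevant bit differs between two reports, i.e. when an
// endstop update on the host would be worthwhile.
constexpr bool relevant_io_changed(uint32_t previous, uint32_t current) {
  return ((previous ^ current) & CAN_RELEVANT_IO_MASK) != 0;
}
```

A configuration option per bit would let users mix local and CAN-attached endstops; how the result is fed into the endstop logic is left open here.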

CAN peripherals that have known protocols and hardware such as built in Z probes and sensorless XY endstops, etc.

I believe most Z-probes should work, they are configured on the head side. Servo commands are forwarded to the head, so BLTouch/3D Touch and servo probes should work (I use a BLTouch).
Any inductive, capacitive or micro switch based probe will work, as they just rely on a digital pin status.
I'm not sure about MAGLEV4 probes, if they are triggered by gcode commands, then they should also work (relevant gcodes need to be forwarded).
BD_SENSOR will need some work, the head can send any data required to the host.

What CAN peripherals do you have in mind? Stepper drivers?

I have not considered sensorless XY endstops on the head board, I assumed to leave this on the host side. The board that controls X and Y should be the host.
For reference, here is the pinout of the head board I use, it's a tiny 42x42mm board with only 1 stepper driver.

BTT EBB42-v1 1-v1 2

I would be quite interested to get the TMC2209 and ADXL345 3-axis accelerometer working.
The TMC2209 is fully supported of course, the problem is syncing the movements between the host and the head. I tried sending the extruder gcodes to the head, which works fine, but soon the extruder motion will go out of sync. I'm not sure how to handle this, perhaps I should send the motion blocks directly to the head, or some dedicated synchronization might be required.

@thisiskeithb thisiskeithb linked an issue Nov 27, 2024 that may be closed by this pull request
@oliof

oliof commented Nov 27, 2024

you'll need clock synchronization between the can boards (and execute based on synchronized time). https://github.com/Duet3D/CANlib/blob/3.5-dev/doc/Duet3CAN-FDProtocol.md#time-synchronisation may be a useful read.

@rondlh
Contributor Author

rondlh commented Nov 27, 2024

Just to clarify, here is the current printer setup I'm running: 8 wires go to the head, 4 for the CAN bus and 4 for the extruder stepper. This is fully working already.

CAN Printer Setup

@rondlh
Contributor Author

rondlh commented Nov 27, 2024

you'll need clock synchronization between the can boards (and execute based on synchronized time). https://github.com/Duet3D/CANlib/blob/3.5-dev/doc/Duet3CAN-FDProtocol.md#time-synchronisation may be a useful read.

Great tip, thanks. I had a look, they report that FD CAN is needed, because they want to transmit more than 8 bytes per message, which is not supported by CAN. The STM32F4xx does not support FD CAN. The STM32G0 (BTT EBB42 V1.2) supports FD CAN. Currently I don't see a real reason why time synchronization could not work over CAN because two 32bit timestamps will fit within the available 8 bytes, so a NTP style sync should work.
I will give that a try and see what numbers I find for time sync accuracy, turn-around time and drift based on Marlin's 2MHz stepper clock.
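
To illustrate that two 32-bit timestamps fill exactly one classic CAN payload, here is a minimal sketch of a sync reply. The layout (host receive time in bytes 0-3, host reply time in bytes 4-7, little-endian) and all function names are assumptions for this example, not the format used by the PR.

```cpp
#include <cstdint>

// Write a 32-bit value into 4 bytes, least significant byte first.
inline void put_u32_le(uint8_t *p, uint32_t v) {
  for (int i = 0; i < 4; ++i) p[i] = uint8_t(v >> (8 * i));
}

// Read a 32-bit value back from 4 little-endian bytes.
inline uint32_t get_u32_le(const uint8_t *p) {
  uint32_t v = 0;
  for (int i = 0; i < 4; ++i) v |= uint32_t(p[i]) << (8 * i);
  return v;
}

// Hypothetical payload layout: t1 (host receive time) in bytes 0..3,
// t2 (host reply time) in bytes 4..7, both in microseconds.
inline void pack_sync_reply(uint32_t t1, uint32_t t2, uint8_t data[8]) {
  put_u32_le(data, t1);
  put_u32_le(data + 4, t2);
}

inline void unpack_sync_reply(const uint8_t data[8], uint32_t &t1, uint32_t &t2) {
  t1 = get_u32_le(data);
  t2 = get_u32_le(data + 4);
}
```

Explicit byte shifts avoid any dependence on the byte order of either MCU.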

@rondlh
Contributor Author

rondlh commented Nov 29, 2024

I'm testing an NTP-like time sync, here's the code running on the tool head. The numbers are timestamps based on micros().

  // NTP style time sync
  // t[0] = local time sync request time
  // t[1] = host time sync receive time
  // t[2] = host time sync reply time
  // t[3] = local time stamp receive time (0 = no reply pending)
  // t[4] = previous local time sync request time (0 before the first sync)
  if (t[3] != 0) {
    t[0] += time_offset; // Adjust local time stamps with time_offset
    t[3] += time_offset; // Adjust local time stamps with time_offset
    // Signed, so the adjustment can be negative when the local clock runs ahead of the host
    const int32_t local_time_adjustment = (int32_t(t[1] - t[0]) + int32_t(t[2] - t[3])) / 2;
    time_offset += local_time_adjustment;
    // Total time on the wire: local turnaround minus host processing time
    const uint32_t Round_trip_delay = (t[3] - t[0]) - (t[2] - t[1]);
    SERIAL_ECHOLNPGM("t0: ", t[0], " us");
    SERIAL_ECHOLNPGM("t1: ", t[1], " us");
    SERIAL_ECHOLNPGM("t2: ", t[2], " us");
    SERIAL_ECHOLNPGM("t3: ", t[3], " us");
    SERIAL_ECHOPGM("Local time adjustment: ", local_time_adjustment, " us");
    if (t[4]) {
      // Seconds since the previous sync, used to express the adjustment as drift
      const float elapsed_s = float(t[0] - t[4]) / 1000000.0f;
      SERIAL_ECHOLNPGM(" after ", ftostr42_52(elapsed_s), " seconds");
      SERIAL_ECHOLNPGM("Drift: ", local_time_adjustment / elapsed_s, " us/s");
    }
    else
      SERIAL_EOL();

    SERIAL_ECHOLNPGM("Time offset: ", time_offset, " us");
    SERIAL_ECHOLNPGM("Round trip delay: ", Round_trip_delay, " us");
    SERIAL_ECHOLNPGM("Host response time: ", (t[2] - t[1]), " us");

    t[4] = t[0]; // Store previous time sync request time
    t[3] = 0;    // Mark this reply as processed
  }

Here are 2 logs:

t0: 150581211 us
t1: 150582363 us
t2: 150582365 us
t3: 150581446 us
Local time adjustment: 1035 us after 74.75 seconds
Drift: 13.85 us/s
Time offset: 2677282 us
Round trip delay: 233 us
Host response time: 2 us


t0: 210998246 us
t1: 210999201 us
t2: 210999209 us
t3: 210998484 us
Local time adjustment: 840 us after 60.42 seconds
Drift: 13.90 us/s
Time offset: 2678122 us
Round trip delay: 230 us
Host response time: 8 us
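
The logged numbers can be checked with a small standalone version of the same arithmetic. This is a sketch outside of Marlin: `SyncResult` and `ntp_sync` are hypothetical names, and the inputs are the t0 to t3 values as printed, which already include the running time offset.

```cpp
#include <cstdint>

struct SyncResult {
  int32_t  offset_us;      // amount to add to the local clock
  uint32_t round_trip_us;  // time spent on the wire, both directions
};

// t0/t3 are local timestamps, t1/t2 are host timestamps, all in microseconds.
// The differences are taken modulo 2^32 and then read as signed values, so
// the result stays correct across a micros() wrap and for negative offsets.
inline SyncResult ntp_sync(uint32_t t0, uint32_t t1, uint32_t t2, uint32_t t3) {
  const int32_t d1 = int32_t(t1 - t0);  // request leg: delay + offset
  const int32_t d2 = int32_t(t2 - t3);  // reply leg:   offset - delay
  SyncResult r;
  r.offset_us = int32_t((int64_t(d1) + int64_t(d2)) / 2);
  r.round_trip_us = (t3 - t0) - (t2 - t1);
  return r;
}
```

For the first log, `ntp_sync(150581211, 150582363, 150582365, 150581446)` gives an offset of 1035 us and a round trip delay of 233 us. The second log gives 840 us and 230 us. Both match the reported values.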

I'm not sure how accurate the time synchronization needs to be, anybody?
Based on these numbers I think an accurate time sync is possible, especially if the drift is taken into account actively.
I can foresee an issue.
Even if the tool head movement start time is correct, the movement finish time could be different because of the drift. The tool head could finish the movement earlier than the host, causing stutters, or later, which could cascade to the next moves. So I guess the movement on the tool head actively needs to be adjusted (slowed down or accelerated) to counter this effect.
Any input on how to proceed is welcome.

@oliof

oliof commented Nov 29, 2024

TL;DR if you look elsewhere, RRFs time-sync protocol is precise for up to 10us.

Maybe a couple notes here:

  • the round trip time of 233 us is awfully close to Klipper's maximum round trip time allowed of 250us

  • RRFs CAN-FD has way lower round trip times than Klipper (mostly because it doesn't layer on top of USB). I will try to find where I saw the numbers.

  • A time drift of 1000us per minute is catastrophic for 3d printing. It's 0.1 seconds. In 0.1 seconds worth of deviation

    • extrusion at 30mm/sec will miss corners, and retracts and PA moves will just not happen at the right times
    • endstops and probes will report early or late, missing the mark.

Please also consider that it took RRF about 5 years to remove most, but not all limitations when using CAN-FD connected expansion boards, many of which relate to board-to-board communication delays or logic challenges.

There are some further differences between Klipper's CAN implementation and RRFs CAN-FD implementation:

  • Klipper's CAN is layered on top of USB and never can be better than USB for connecting MCUs for that reason
  • The Klipper host-to-mcu protocol sends step pulse trains; RRF sends complete moves
  • RRF has liveness checks for CAN-FD boards and you can (and must) manage when boards disappear or appear (DangerKlipper has a facility to mark boards as non critical, such as accelerometers. Usually any of the boards disappearing makes Klipper crash in an attempt to not run uncontrolled)
  • RRF allows you to assign (numeric) IDs to boards, which makes it easy to reference them for configuration. Marlin's mostly static approach to configuration probably also needs this for tunable parameters and error reporting. Trying to sort out boards by UUIDs is no fun.
  • Since I am already rambling, let me add that the bandwidth of CAN will likely be sufficient for simple setups like one main board and one or two toolheads. More complex setups like tool changers or machines with a higher number of axes that need to move in coordination (not necessarily 3d printers) may put a squeeze on CAN, and CAN-FD will give you more headspace here.

Added FDCAN tool head part
Added NTP style CAN bus time sync (test)
@rondlh
Contributor Author

rondlh commented Nov 29, 2024

Thanks a lot for the detailed reply!

TL;DR if you look elsewhere, RRFs time-sync protocol is precise for up to 10us.

OK, I will target a time-sync below 10us.

Maybe a couple notes here:
* the round trip time of 233 us is awfully close to Klipper's maximum round trip time allowed of 250us

Some background. The used CAN bus speed is 1M bit, which is the max for the STM32F4 board, the STM32G0 board supports 2M bit. A Standard ID (11 bits) CAN message with an 8 byte payload (2 x 32bit timestamp) is already 115 bits (if I counted it correctly), and thus 115us for one way of the message. I could perhaps optimize that a little bit, but not by much.
I do not understand why the round trip time is so critical (max allowed 250us, why?), I don't think there is much influence on the time sync accuracy, unless the numbers get really big.
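
The bit count can be bounded with the standard frame layout of classic CAN. The sketch below uses the usual field sizes for a data frame with an 11-bit identifier; the function names are just for this example. Stuff bits depend on the actual bit pattern, so only the range is fixed.

```cpp
#include <cstdint>

// Bits in a classic CAN data frame with an 11-bit identifier, before stuffing:
// SOF 1 + ID 11 + RTR 1 + IDE 1 + r0 1 + DLC 4 + data 8n
// + CRC 15 + CRC delimiter 1 + ACK slot 1 + ACK delimiter 1 + EOF 7
constexpr uint32_t frame_bits_unstuffed(uint32_t data_bytes) {
  return 44 + 8 * data_bytes;
}

// Worst-case number of stuff bits. Stuffing applies from SOF through the
// CRC sequence (34 + 8n bits), with at most one stuff bit per 4 bits
// after the first one.
constexpr uint32_t max_stuff_bits(uint32_t data_bytes) {
  return (34 + 8 * data_bytes - 1) / 4;
}

// Idle bits required between two frames.
constexpr uint32_t INTERMISSION_BITS = 3;

// Time on the wire in microseconds for a given bit count and bit rate (bit/s).
constexpr double frame_time_us(uint32_t bits, uint32_t bitrate) {
  return bits * 1e6 / bitrate;
}
```

For 8 data bytes this gives 108 bits without stuffing, 111 including the intermission, and up to 135 in the worst case, so roughly 111 to 135 us per frame at 1 Mbit/s.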

* RRFs CAN-FD has way lower round trip times than Klipper (mostly because it doesn't layer on top of USB). I will try to find where I saw the numbers.

* A time drift of 1000us per minute is catastrophic for 3d printing. It's 0.1 seconds. In 0.1 seconds worth of deviation

The boards I use have low cost crystal oscillators which have a typical accuracy of +/- 20ppm. The drift I found is about 14ppm which is the difference between 2 boards (not a bad result!). This will be the same for any firmware that is running on the boards. The software will have to actively compensate for the drift. And a suitable resync interval can be chosen to guarantee that the maximal deviation stays below 10us. I think that is possible, but more testing is needed.
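
As a quick worked figure for the resync interval: the clock error grows linearly with the residual drift, so the longest interval follows from a single division. `max_resync_interval_s` is a hypothetical helper name and the residual drift values in the note below are examples, not measurements.

```cpp
// Longest time between two resyncs that keeps the clock error below
// max_error_us, for a residual drift given in us/s (1 us/s equals 1 ppm).
// The residual drift must be non-zero.
constexpr double max_resync_interval_s(double max_error_us,
                                       double residual_drift_us_per_s) {
  return max_error_us /
         (residual_drift_us_per_s < 0 ? -residual_drift_us_per_s
                                      : residual_drift_us_per_s);
}
```

Without any compensation, 14 us/s of drift and a 10 us target give about 0.71 s between resyncs. If active compensation leaves, say, 0.1 us/s of residual drift, the same target allows 100 s.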

  * extrusion at 30mm/sec will miss corners, and retracts and PA moves will just not happen at the right times
  * endstops and probes will report early or late, missing the mark.

I am already using this setup for probing (BLTouch), which works very reliably. The probing process is fully interrupt driven (on both sides). The CAN bus just adds a relatively constant time delay of about 100us. As long as the time delay is constant there will be no effect on the probing results. Note that depending on your hardware and Marlin setup, probing will be done with an accuracy of 1ms (1000 samples/s), so the CAN bus should not be the limiting factor.

Please also consider that it took RRF about 5 years to remove most, but not all limitations when using CAN-FD connected expansion boards, many of which relate to board-to-board communication delays or logic challenges.

There are some further differences between Klipper's CAN implementation and RRFs CAN-FD implementation:

* Klipper's CAN is layered on top of USB and never can be better than USB for connecting MCUs for that reason

* The Klipper host-to-mcu protocol sends step pulse trains; RRF sends complete moves

That is interesting, I need to read more about this. Internally Marlin has move blocks. I would expect to send these blocks to the tool head with a start timestamp. The tool head probably needs to adjust the move block to compensate for the difference in clock frequency, but the step count must not be affected.
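
A very simplified picture of that adjustment: only the timing of a block is rescaled to the local clock, the step count is copied untouched. `TimedMove` is a hypothetical structure for this sketch and not Marlin's `block_t`. The sign convention for `drift_ppm` is also an assumption.

```cpp
#include <cstdint>

// Hypothetical, simplified move block (not Marlin's block_t).
struct TimedMove {
  uint32_t step_count;      // steps to execute, must never change
  uint32_t duration_ticks;  // planned duration in host stepper ticks
};

// Convert the planned duration from host ticks to local ticks.
// drift_ppm is positive when the local clock runs faster than the host
// clock, so the same real time spans more local ticks.
inline TimedMove to_local_clock(TimedMove move, double drift_ppm) {
  const double scaled = move.duration_ticks * (1.0 + drift_ppm * 1e-6);
  move.duration_ticks = uint32_t(scaled + 0.5);  // round to nearest tick
  return move;                                   // step_count is unchanged
}
```

With a 2 MHz stepper clock and 14 ppm of drift, a one second move (2,000,000 ticks) changes by 28 ticks. Short moves change by less than one tick, so the rounding error would have to be carried over from block to block.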

* RRF has liveness checks for CAN-FD boards and you can (and must) manage when boards disappear or appear (DangerKlipper has a facility to mark boards as non critical, such as accelerometers. Usually any of the boards disappearing makes Klipper crash in an attempt to not run uncontrolled)

* RRF allows you to assign (numeric) IDs to boards, which makes it easy to reference them for configuration. Marlin's mostly static approach to configuration probably also needs this for tunable parameters and error reporting. Trying to sort out boards by UUIDs is no fun.

* Since I am already rambling, let me add that the bandwidth of CAN will likely be sufficient for simple setups like one main board and one or two toolheads. More complex setups like tool changers or machines with a higher number of axes that need to move in coordination (not necessarily 3d printers) may put a squeeze on CAN, and CAN-FD will give you more headspace here.

Right, there is a lot of work to do, I try to start simple, then go step by step.
I just posted the tool head side of the software, so anybody can give it a try now.

@oliof

oliof commented Nov 30, 2024

I just posted the tool head side of the software, so anybody can give it a try now.

I just want to say that I am absolutely positively excited about this and I just want to make sure CAN / CAN-FD in Marlin will be as delightful an experience as in RRF (-:

@rondlh
Contributor Author

rondlh commented Nov 30, 2024

I just posted the tool head side of the CAN software, which is running on a BTT EBB42 V1.2 board (STM32G0). The board has FD CAN support at 2M bit/s, but because the host (STM32F407) only has CAN support, the FDCAN features cannot be used.

IMPORTANT NOTE:

It seems the used framework to compile Marlin for the STM32G0 does not have full support for the CAN callbacks. To solve this I needed to override a function, which means that I had to make a change to the framework. I talked to the developers about this a long time ago, but have not seen any action on it yet, perhaps it will only be done in a future version.

The framework change that is required (Windows environment in PlatformIO).
In \Users\<username>\.platformio\packages\[email protected]\libraries\SrcWrapper\src\HardwareTimer.cpp
Add "__weak" in front of "void TIM16_IRQHandler(void)"
So you get: __weak void TIM16_IRQHandler(void)

Note that this part "[email protected]" of the framework path could be different, depending on which frameworks you have installed already. An easy way to find where the framework files are located is by using the "Go to definition" function in PlatformIO (F12 or right click) on a function that is provided by the framework. For example you can right click on "HAL_FDCAN_GetTxFifoFreeLevel".

@oliof

oliof commented Nov 30, 2024 via email

@thisiskeithb
Member

I just want to say that I am absolutely positively excited about this

Me too. Fewer wires to the toolhead will be so nice!

ini/stm32g0.ini Outdated Show resolved Hide resolved
@oliof

oliof commented Nov 30, 2024

just a note that I think the deadline timeout of Klipper likely is 250msec not usec. Polling rate of USB is 125usec already, so reply/response hits those 250usec in an ideal stream without any processing. Sorry for mixing up the millis and the micros!

rondlh and others added 3 commits December 1, 2024 00:21
…vironment

Remove unneeded line from the build_flags: "-DTIMER_SERIAL=TIM4"
... and reorder so CAN is primary
@thisiskeithb
Member

For the changes I pushed in dd5a2ff, I meant EBB42, but same difference 😄 Since CAN will be more common than the filament extruder configuration, I reordered them.

I also ran a make format-pins -j to format the MONSTER8 pins in 73bb491 which takes care of the failing "Validate Pins Files" test.

Fixes an issue in the env:BTT_EBB42_V1_1_FDCAN build_flags
@rondlh
Contributor Author

rondlh commented Dec 1, 2024

Here is some new data which shows that the host and tool head time can be calculated accurately based on a few time sync cycles. Note that the earlier and shorter test periods are less accurate. Later we are within 1us even after 3 minutes.

t0: 65082290 us
t1: 65082522 us
t2: 65082532 us
t3: 65082534 us
Predicted time adjustment: 114.82 us
Local time adjustment: 115 us after 8.19 seconds

Drift: 14.04 us/s
Time offset: 2676043 us
Round_trip_delay: 234 us
Host response time: 10 us

================================

t0: 79930405 us
t1: 79930729 us
t2: 79930742 us
t3: 79930651 us
Predicted time adjustment: 208.44 us
Local time adjustment: 207 us after 14.85 seconds

Drift: 13.94 us/s
Time offset: 2676250 us
Round_trip_delay: 233 us
Host response time: 13 us

=========================

t0: 259642612 us
t1: 259645234 us
t2: 259645250 us
t3: 259642862 us
Predicted time adjustment: 2505.40 us
Local time adjustment: 2505 us after 179.71 seconds

Drift: 13.94 us/s
Time offset: 2678755 us
Round_trip_delay: 234 us
Host response time: 16 us
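
The relation between the logged drift and the predicted adjustment is a simple linear model. A sketch of it is shown below. `estimate_drift` and `predict_adjustment` are hypothetical names, and the exact drift estimate used for the logged predictions is not shown in this thread, so small differences from the logs are expected.

```cpp
// Drift in us/s, estimated from the adjustment measured at one sync and the
// time elapsed since the previous sync.
constexpr double estimate_drift(double adjustment_us, double elapsed_s) {
  return adjustment_us / elapsed_s;
}

// Adjustment expected at the next sync if the drift stays constant.
constexpr double predict_adjustment(double drift_us_per_s, double elapsed_s) {
  return drift_us_per_s * elapsed_s;
}
```

For the last log, 2505 us over 179.71 s gives a drift of about 13.94 us/s, and a drift of 13.94 us/s over the same period predicts about 2505.2 us, within 1 us of the measured 2505 us.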

@rondlh
Contributor Author

rondlh commented Dec 1, 2024

I would like to add that there are CAN-FD modules for RRF in stepstick or EXP1/EXP2 format that add CAN-FD via SPI. I believe @thinkyhead might have gotten one in the past, and if there is interest I could try to get you one.

Thanks for your kind offer, I guess you mean one of those MCP2518FD based boards. That is perhaps something to look into at a later time. I believe FDCAN is not required for this task, CAN has the needed speed and bandwidth. FDCAN can run faster and can send more data (64 bytes instead of 8 bytes) per message. That would make things easier of course.

just a note that I think the deadline timeout of Klipper likely is 250msec not usec. Polling rate of USB is 125usec already, so reply/response hits those 250usec in an ideal stream without any processing. Sorry for mixing up the millis and the micros!

I'm not sure if we are talking about the same thing. The data logs are only about an NTP-like time sync protocol I'm testing.
t0 = (client) time sync request send time
t1 = (host) time sync request receive time
t2 = (host) time sync request reply time
t3 = (client) time sync request response time
Time offset = ((t1 - t0) + (t2 - t3)) / 2
Round trip delay = (t3 - t0) - (t2 - t1)

@thisiskeithb Thanks again for the great support.

In my view there are 2 main tasks to get synchronized motion working:

  1. Time needs to be synchronized between the host and the tool head (target better than 10us). This seems possible (see previous post).
  2. The movement data needs to be transmitted from the host to the tool head.
    There are a lot of questions to be answered here. To be continued.

@rondlh
Contributor Author

rondlh commented Dec 2, 2024

Some thoughts on motion synchronization.

Even if we can accurately start a movement, the movement duration will not be the same because of the clock rate difference. We can try to adjust the movement speed, but I believe there is not enough resolution to make the required tiny adjustments.
If a tool head movement takes longer than the host movement, then the next tool head move will also start later, which can cascade to become a bigger issue. If the host is aware of this issue, then it can actively compensate for it.

Could the following approach work?

  1. The host processes a gcode, for example "G1 X10 Y20 Z30 E10 F3000", which is converted into movement blocks.

  2. The gcode and the start time of the first related block are transmitted to the tool head, which based on the gcode creates the same movement blocks and adds the movement start time to the first block (or perhaps all blocks).

  3. If a movement block has a start time then we wait until the start time has come. If the block has no start time then the block is processed as usual. This means that the first host movement has to be delayed a bit to allow the tool head to get ready.
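
Step 3 could be sketched as a small gate in front of the block execution. `QueuedBlock` and `may_start` are hypothetical names, and the start time is assumed to be expressed in synchronized microseconds.

```cpp
#include <cstdint>

// Hypothetical queue entry: a movement block with an optional start time.
struct QueuedBlock {
  bool     has_start_time;
  uint32_t start_time_us;  // synchronized time in microseconds
};

// True when the block may be executed. Blocks without a start time run as
// usual. The signed difference keeps the comparison correct when the
// 32-bit microsecond counter wraps around (about every 71.6 minutes).
inline bool may_start(const QueuedBlock &block, uint32_t now_us) {
  if (!block.has_start_time) return true;
  return int32_t(now_us - block.start_time_us) >= 0;
}
```

The stepper logic would poll this check, or arm a timer for the remaining time, before it begins a timed block.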

I will investigate the motion blocks to get more insights.
Any input is very welcome!

Development

Successfully merging this pull request may close these issues.

[FR] CAN bus
4 participants