
Daily robotics journey update (part 2)
Ok, it was time for a shitty day again
Thursday, 3rd of July
Today I kicked off training for an ACT and a SmolVLA model. In the early evening, I tried out the first checkpoints and was disappointed. The 20k-step checkpoint of the SmolVLA model doesn't work at all, and the 80k-step ACT checkpoint (the furthest one at the time) isn't much better.
SmolVLA model trained for 20k steps at a batch size of 8:
ACT model trained for 80k steps at a batch size of 8:
I need to wait for the final results, but I'm not quite sure what’s best to do if the final models aren't better tomorrow. I'm gonna think out loud:
- First, I should do a systematic evaluation of the checkpoints to understand:
  - Which of the two models is working better, and is it by a clear margin?
  - Does longer training (later checkpoints) lead to better results? → If yes, then train a model beyond 100k steps.
- The 100 samples where I only move the rooks could be confusing to the model. Does a model trained without this data do better?
- I could record another 500 episodes, train on the combined dataset, and see if this improves the model.
- I could do a literature review on how others have trained imitation learning models and see if there are any tricks to get this done.
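To make the "clear margin" question from the first bullet concrete, here is a minimal sketch of how the checkpoint evaluation could be tallied. It assumes every checkpoint gets a number of manual trials, each scored as success or failure, and it uses a 95% Wilson score interval as a rough yardstick. That interval is my choice for illustration and not something the training tooling provides. The checkpoint names and outcomes below are made-up placeholders, not measurements.

```python
import math


def wilson_interval(successes, trials, z=1.96):
    """Wilson score interval for a success rate (z=1.96 gives roughly 95%)."""
    if trials == 0:
        return (0.0, 1.0)  # no data yet: the rate could be anything
    p = successes / trials
    denom = 1 + z**2 / trials
    center = (p + z**2 / (2 * trials)) / denom
    half = z * math.sqrt(p * (1 - p) / trials + z**2 / (4 * trials**2)) / denom
    return (max(0.0, center - half), min(1.0, center + half))


def summarize(results):
    """results maps a checkpoint name to a list of bools, one per trial."""
    summary = {}
    for name, outcomes in results.items():
        n = len(outcomes)
        k = sum(outcomes)
        low, high = wilson_interval(k, n)
        summary[name] = {
            "trials": n,
            "rate": k / n if n else 0.0,
            "low": low,
            "high": high,
        }
    return summary


def clearly_better(a, b):
    """True if a's whole interval lies above b's, i.e. a clear margin."""
    return a["low"] > b["high"]


# Placeholder outcomes only (hypothetical names, invented values).
example = {
    "ckpt_a": [True] * 18 + [False] * 2,
    "ckpt_b": [True] * 2 + [False] * 18,
}
stats = summarize(example)
for name, s in stats.items():
    print(f"{name}: {s['rate']:.2f} [{s['low']:.2f}, {s['high']:.2f}] over {s['trials']} trials")
print("ckpt_a clearly better:", clearly_better(stats["ckpt_a"], stats["ckpt_b"]))
```

Running the same tally over several checkpoints of one model would also show whether later checkpoints actually improve, or whether the differences are within noise for the number of trials run.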
Ok, I just revisited this video I had seen 1–2 months ago of someone else who built a chess robot. I noticed the description says it was trained on 1500 episodes! I currently have 500 (or 400 if you only count episodes from chess games), so maybe I just need to collect more data. The robots in the video also have a better setup:
(1) The robots are in a fixed position relative to the chessboard, and
(2) Their camera is pointing at the pieces from below, such that the pieces are in view from the beginning.
Maybe the solution really is just collecting more data.