Inspired by a piece shown in class where the scene changes based on voice input, I wanted to explore a similar interaction—controlling the environment by drawing instead of speaking. My idea was to create a small drawing area where users can sketch, and the background would switch between day and night depending on what they draw. Since this was my first time working with DoodleNet, I decided to keep it simple: the system only reacts to sun, moon, or star drawings.

In my first draft, the program simply looked for the label with the highest confidence from the model and checked if it was one of the three keywords. However, this didn’t work very well—most of the time, DoodleNet recognized my doodles as unrelated objects (like “banana” or “circle”).
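For reference, here is a minimal sketch of that first draft. It assumes ml5.js's DoodleNet image classifier with the older `(error, results)` callback style; `setScene()` is a hypothetical helper that swaps the day/night background.

```js
let classifier;
let cnv;

function setup() {
  cnv = createCanvas(280, 280);
  background(255);
  classifier = ml5.imageClassifier('DoodleNet', () => console.log('DoodleNet ready'));
}

// Called after each stroke: trust only the single top label.
function classifyDrawing() {
  classifier.classify(cnv, (error, results) => {
    if (error) return console.error(error);
    const top = results[0].label; // results arrive sorted by confidence
    if (top === 'sun' || top === 'moon' || top === 'star') {
      setScene(top); // hypothetical helper that switches the background
    }
    // Anything else ("banana", "circle", ...) is simply ignored.
  });
}
```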


To improve it, I changed the identification logic (a rough code sketch follows the list):

  1. The model outputs many possible labels (e.g., “sun,” “flower,” “star,” “circle,” “moon,” …).
  2. The code filters out everything not in a whitelist: only sun, moon, and star/stars are kept.
  3. Among these, it picks the one with the highest confidence.
  4. If this label’s confidence is above a certain threshold (default 0.5) and it’s been at least 800 ms since the last switch, the system updates the background.
  5. If none of the whitelist labels appear or confidence is too low, nothing changes.
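A rough sketch of this filtering logic, under the same assumptions as above (ml5-style `{ label, confidence }` results, the hypothetical `setScene()` helper, and sun mapping to day while moon/star map to night):

```js
const WHITELIST = ['sun', 'moon', 'star', 'stars'];
const CONFIDENCE_THRESHOLD = 0.5; // default confidence threshold
const SWITCH_COOLDOWN_MS = 800;   // minimum time between background switches
let lastSwitchTime = 0;

function handleResults(results) {
  // Steps 1-2: keep only whitelisted labels from the full result list.
  const candidates = results.filter(r => WHITELIST.includes(r.label));
  if (candidates.length === 0) return; // step 5: nothing relevant was drawn

  // Step 3: pick the most confident whitelisted label.
  const best = candidates.reduce((a, b) => (b.confidence > a.confidence ? b : a));

  // Step 4: require enough confidence and enough time since the last switch.
  const now = millis();
  if (best.confidence >= CONFIDENCE_THRESHOLD && now - lastSwitchTime >= SWITCH_COOLDOWN_MS) {
    setScene(best.label === 'sun' ? 'day' : 'night');
    lastSwitchTime = now;
  }
  // Step 5 (also): low-confidence matches change nothing.
}
```

Wiring this in just means replacing the top-label check in the earlier callback with a call to `handleResults(results)`.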

This logic makes the interaction more stable—only clear, confident sketches trigger transitions. It still struggles a bit with moon recognition (probably because the crescent shape has fewer unique features), but sun and star work smoothly. Overall, I’m happy with how this final version performs.
