software engineering student @ polytechnique montreal
mote is like a calculator for your photos
i've been exploring intuitive interaction models for image-to-image generation. users select an image A and image B, representing structure and style respectively. then they "add" them together to produce a new composition.
demo video of the prototype interaction, playing automatically.
i'm currently exploring a new interaction: division. the idea is to allow users to divide long-form videos into shorter, curated clips to generate aesthetic edits. for example, dividing a 5-minute video using a reference image as a guide could produce a 45-second TikTok-style edit that visually aligns with the source aesthetic.
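a rough sketch of how division could work, not the actual prototype. it assumes the video is already cut into candidate clips and that each clip and the reference image have some visual feature vector. the `Clip` type, the `divide` function, the embeddings, and the greedy selection are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass
from math import sqrt


@dataclass
class Clip:
    start: float             # seconds into the source video
    end: float               # seconds into the source video
    embedding: list[float]   # hypothetical visual feature vector for this clip

    @property
    def duration(self) -> float:
        return self.end - self.start


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def divide(clips: list[Clip], reference: list[float], budget: float) -> list[Clip]:
    """Greedily keep the clips closest to the reference until the time budget is spent."""
    # rank clips by how closely they match the reference image (assumed metric)
    ranked = sorted(clips, key=lambda c: cosine(c.embedding, reference), reverse=True)
    chosen: list[Clip] = []
    used = 0.0
    for clip in ranked:
        # skip any clip that would push the edit past the budget
        if used + clip.duration <= budget:
            chosen.append(clip)
            used += clip.duration
    # put the kept clips back in chronological order for the final edit
    return sorted(chosen, key=lambda c: c.start)
```

calling `divide(clips, reference, 45.0)` on the clips of a 5-minute video would return the best-matching clips that fit in 45 seconds, in their original order. greedy selection is the simplest thing that could work here; a real version would probably also care about pacing and cuts.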
Update 2025-03-26: OpenAI's GPT-4o image generation is actually very good. we won't need IP-Adapters, or ControlNets, or LoRAs, or comfy workflows, or segmentation models. it'll be one model to rule them all. the question now is how long until we get a capable all-purpose model (like 4o) that is open source and runs on consumer GPUs.
it's only going to get better, and it will apply to more and more mediums - audio and video are next.