A common theme in high thinkies is that I want to record my thoughts in a way that doesn’t get in the way of my thoughts. For example, writing down notes has historically been difficult, and when the thinkies are coming in at blistering speed, it is hard to keep up. So I started to just talk and record what I said. Audio files are super small. I also started streaming to document what I was working on. All of this came about because I was learning how to say “fuck it” to being a perfectionist and started to just stream my consciousness.
The problem is, this content is pretty opaque. In order to gain insights into what I was talking about, I would need to listen back to many days of audio and video at this point. So to solve this, I integrated whisper.cpp into this platform so I can drag and drop files and get a transcription. Great, now I get to _read_ days of audio and video. AI summary of the transcript? Kind of helpful, but still missing something.
There is a tool out there that does what I want: Descript. If you haven’t seen it, it is pretty incredible. You do what I just did and get a transcription of your audio/video, but when you edit the transcript, it also edits the video. The feature that really blew me away was “remove um’s and ah’s”, and it really does work. This is what is missing from my app. I have thought about this feature for so long. I have had ChatGPT write me the function I need for it.
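To make the idea concrete, here is a minimal sketch of how a “remove um’s and ah’s” pass could work. It is an illustration under assumptions, not Descript's method and not the function mentioned above. It assumes the transcription step can hand back word-level timestamps; the `Word` shape, the `FILLERS` list, and the function names are all hypothetical. The idea: drop the filler words, then return the time ranges that should survive in the audio/video.

```python
from dataclasses import dataclass


@dataclass
class Word:
    """Hypothetical shape for one transcribed word; times are in seconds."""
    text: str
    start: float
    end: float


# Assumed filler list; a real one would likely need tuning per speaker.
FILLERS = {"um", "uh", "ah", "er", "hmm"}


def is_filler(word: Word) -> bool:
    """True if the word, ignoring case and edge punctuation, is a filler."""
    return word.text.strip().strip(".,!?;:").lower() in FILLERS


def clean_text(words: list[Word]) -> str:
    """Transcript text with the filler words removed."""
    return " ".join(w.text for w in words if not is_filler(w))


def keep_segments(words: list[Word]) -> list[tuple[float, float]]:
    """Time ranges to keep: each run of non-filler words becomes one segment."""
    segments: list[tuple[float, float]] = []
    run_start = None
    run_end = 0.0
    for w in words:
        if is_filler(w):
            # A filler closes the current run, if there is one.
            if run_start is not None:
                segments.append((run_start, run_end))
                run_start = None
            continue
        if run_start is None:
            run_start = w.start
        run_end = w.end
    if run_start is not None:
        segments.append((run_start, run_end))
    return segments


words = [
    Word("So", 0.0, 0.3),
    Word("um", 0.3, 0.8),
    Word("I", 0.9, 1.0),
    Word("started", 1.0, 1.5),
    Word("uh,", 1.5, 2.0),
    Word("talking", 2.1, 2.6),
]
print(clean_text(words))
print(keep_segments(words))
```

The segments that come back are the pieces a video tool would need to cut and join, so the transcript edit and the media edit stay in sync. In practice the cuts would probably want a little padding so words don't get clipped.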
The blog editor is in an OK state. I still need to get image upload working, but I have some ideas. The transcription editor would be so freaking cool. I need to have it.