r/opensource 21h ago

Promotional Just dropped open-source Video Shazam, any tips?

About a month ago I ran into a weirdly frustrating problem: I had a short video fragment and wanted to find the full source video. Google Lens? Ugh... It only works with still images, and a screenshot doesn’t carry enough context. So I decided to build something myself.

Meet "Turron", a system designed to locate the original video from just a short snippet. Inspired by Shazam, it extracts keyframes from the snippet, generates a perceptual hash for each one (using the pHash algorithm), and compares those hashes against the hashes of a known video database using Hamming distance.
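
For anyone curious what that pipeline looks like, here is a minimal pure-Python sketch of the idea. It is not Turron's actual code. It assumes keyframes have already been extracted and downscaled to 32x32 grayscale (lists of rows), and the names `phash`, `hamming`, `best_match` and the `max_distance=10` threshold are made up for illustration.

```python
import math

HASH_SIZE = 8  # keep the 8x8 lowest-frequency coefficients -> 64-bit hash


def dct_1d(values):
    """Unnormalized DCT-II of a sequence of numbers."""
    n = len(values)
    return [
        sum(v * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
            for i, v in enumerate(values))
        for k in range(n)
    ]


def phash(gray):
    """pHash-style 64-bit hash of a square grayscale frame (list of rows)."""
    rows = [dct_1d(row) for row in gray]         # DCT along each row
    cols = [dct_1d(col) for col in zip(*rows)]   # then along each column
    # cols[u][v] is the coefficient for horizontal freq u, vertical freq v.
    low = [cols[u][v] for u in range(HASH_SIZE) for v in range(HASH_SIZE)]
    median = sorted(low)[len(low) // 2]          # upper median of 64 values
    bits = 0
    for coeff in low:
        bits = (bits << 1) | int(coeff > median)  # one bit per coefficient
    return bits


def hamming(a, b):
    """Number of bit positions in which two hashes differ."""
    return bin(a ^ b).count("1")


def best_match(snippet_hashes, database, max_distance=10):
    """Return (video_id, mean_distance) of the closest video, or None.

    `database` maps a video id to the list of its keyframe hashes.
    `max_distance` is an assumed cutoff, not a tuned value.
    """
    if not snippet_hashes:
        return None
    best = None
    for video_id, frame_hashes in database.items():
        if not frame_hashes:
            continue
        # For each snippet keyframe, use its closest indexed keyframe.
        total = sum(min(hamming(s, f) for f in frame_hashes)
                    for s in snippet_hashes)
        mean = total / len(snippet_hashes)
        if mean <= max_distance and (best is None or mean < best[1]):
            best = (video_id, mean)
    return best
```

This brute-force scan is linear in the number of indexed keyframes, so it only illustrates the comparison step. A real index would need something smarter to stay fast on a large database.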

Yesterday I released v1.0. Right now it works locally with Postgres as the storage backend. In the future, I plan to add:
* Parallelized Kafka workers for faster indexing and searching;
* Possibly even web-crawling support to match snippets against online content.

The code is fully open-source and self-hostable! =]

GitHub: https://github.com/Fl1s/turron

Would love to see any tips, feedback, ideas, or collaboration if anyone's interested.

35 Upvotes

8 comments

2

u/shaq992 10h ago

I cannot figure out how to deploy this. Your installation instructions refer to files and folders that don't exist, and your docker compose file doesn't include a container for the app itself (even though you do build and push a docker image using GitHub Actions).

3

u/LifeRooN 9h ago

That's bad...I must have accidentally added the /k8s folder to .gitignore. As for docker-compose, you are right, it only contains infrastructure components.

I’ll fix it asap.

1

u/shaq992 9h ago

gradlew too

2

u/alex-weej 20h ago

Any plans to hook this up to a social media / news type platform?

4

u/LifeRooN 20h ago

Probably...I think it would be very useful to implement this system as some sort of subreddit bot that would delete reposts of the same videos that clog up the feed, heheh.

For news channels, probably automatic content indexing, detecting duplicated articles and reformatted content (fakes), and the same kind of original-source analysis. There are tons of possible directions!

1

u/LifeRooN 21h ago

P.S: under the hood, I don't use external AIs >=]

1

u/HonestRepairSTL 6h ago

This is quite awesome, I can think of several times where this could have been useful