About
Project
GlyphMotion
The Architects
Team Rhythm: Shitij leads the architecture and web platform (the main target) and built ~90% of the core processing pipeline. Sayan ports pipeline upgrades to the standalone script and GUI, and currently leads ongoing maintenance for both.
Shitij Halder
Creative Web Developer & Full Stack Engineer
DIT University, Dehradun · Originally from Siliguri, West Bengal
The mind behind the entire web platform, backend pipeline, and the visual identity of GlyphMotion. Built 90% of the processing pipeline, designed the cloud architecture with Google Drive integration, and engineered the asynchronous video processing backend from scratch. Makes web design feel more like cinema than code.
Open Source & Projects
Skills
Sayan Sarkar
Psychology Student · Android Developer & Open Source Contributor
Singur Government General Degree College · Originally from Kolkata, West Bengal
The one who found the spark and lit the fuse. Built the initial YOLOv8 parsing script with CLI arguments, and now handles 80% of the maintenance for the standalone script and GUI. A psychology student who debugs code the same way he reads human behavior: with alarming precision.
Open Source & Projects
Skills
How to start in 2 minutes
Choose your input clip and send it to the pipeline.
GlyphMotion runs detection/tracking and applies the configured workflow.
Download the processed output with final visuals and preserved audio.
Pipeline Anatomy (6 steps)
The Origin Tea ☕
// How two Class 12th students built a computer vision pipeline out of spite
This page means one simple thing: when resources are low but intent is high, two people can still build production-grade systems by sharing knowledge, dividing roles well, and refusing to quit.
The Reel That Started It All
One fine day, Sayan was doom-scrolling Instagram reels at full goblin mode — as any civilized engineer does — and stumbled on a reel showcasing something called YOLO (You Only Look Once). Some guy had written about 30-40 lines of code that could take a video, run object detection, and spit out a processed version. It was basic, hardcoded, zero CLI arguments, just raw script-path-dependency-chaos vibes. But it worked. And Sayan got instantly hooked.
The catch? The guy asked viewers to comment "Link" to get the code. Let that sink in — you're showcasing an open-source project, freely available to the entire planet, and you're gatekeeping it behind Instagram comments for engagement farming. Elite cringe. Hall-of-fame gatekeeping.
Sayan told Shitij the whole saga. They were both furious. Two broke students, one shared frustration, and a mutual verdict: "Fine, we'll build our own from scratch."
Post-Boards, Pre-Purpose
Context: Both of them had just finished their CBSE Class 12th board exams. That post-boards limbo where you've got no school, no college admissions yet, and an alarming amount of free time. They were already building random side quests — but now they had a real boss fight.
Neither of them had GPUs remotely close to "adequate" for real-time neural inference. But inadequate hardware has never stopped determined developers. It just makes the journey more... cinematic and slightly unhinged.
Sayan Writes the First Script
Sayan got to work first. He parsed the YOLOv8 model, got basic object detection running, and, crucially, added proper argparse support with command-line arguments. No more editing file paths in the source code like a caveman. The script could accept input via CLI flags, which meant normal humans could run it without summoning dark terminal rituals.
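To make the argparse step concrete, here is a minimal sketch of what such a command-line interface could look like. The flag names (`--input`, `--output`, `--model`, `--conf`) and their defaults are assumptions chosen for illustration, not the flags of the actual GlyphMotion script.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a CLI parser. Every flag name and default here is hypothetical."""
    parser = argparse.ArgumentParser(
        description="Run object detection on a video (illustrative sketch)."
    )
    parser.add_argument("--input", required=True, help="path to the source video")
    parser.add_argument("--output", default="output.mp4", help="path for the processed video")
    parser.add_argument("--model", default="yolov8n.pt", help="model weights file (assumed name)")
    parser.add_argument("--conf", type=float, default=0.25, help="minimum detection confidence")
    return parser


# Parse an explicit argument list instead of sys.argv so the sketch runs anywhere.
args = build_parser().parse_args(["--input", "clip.mp4", "--conf", "0.4"])
print(args.input, args.output, args.model, args.conf)
```

The point is the one the story makes: paths and thresholds arrive as flags, so nobody has to edit the source to run the script on a different clip.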
He passed it to Shitij. Two immediate problems surfaced:
Problem #1: Processing Time
With a GTX 1650 and an RTX 3050 (4 GB), inference was painfully slow. Every frame through YOLOv8 felt like an eternity. But they pushed forward anyway.
Problem #2: No Audio
YOLO — and computer vision models in general — don't care about audio. The output video was completely silent. Not exactly production-ready.
FFmpeg Enters the Chat
They needed the original audio preserved in the output. Enter FFmpeg — the Swiss army knife of multimedia processing. They engineered a pipeline that would extract the original audio from the source video, process frames through YOLOv8 for visual annotations, encode them back into a video using libx264 with CRF-based compression, then multiplex the original audio stream back in.
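The three FFmpeg stages described above can be sketched as plain command builders. This is a hedged illustration, not the project's actual commands: the function names, the file names, the CRF value of 23, and the `.mka` intermediate container are all assumptions. It also assumes the annotated frames have already been written to an intermediate video file, and that the source audio codec can be stream-copied into the final container.

```python
import subprocess
from typing import List


def extract_audio_cmd(source: str, audio_out: str) -> List[str]:
    """Copy the source's audio stream into its own file, dropping video (-vn)."""
    return ["ffmpeg", "-y", "-i", source, "-vn", "-c:a", "copy", audio_out]


def encode_video_cmd(annotated: str, encoded_out: str, crf: int = 23) -> List[str]:
    """Re-encode annotated frames with libx264; a lower CRF means higher quality."""
    return ["ffmpeg", "-y", "-i", annotated, "-an",
            "-c:v", "libx264", "-crf", str(crf), "-pix_fmt", "yuv420p", encoded_out]


def mux_cmd(video: str, audio: str, final_out: str) -> List[str]:
    """Combine encoded video and original audio without re-encoding either stream."""
    return ["ffmpeg", "-y", "-i", video, "-i", audio,
            "-map", "0:v:0", "-map", "1:a:0", "-c", "copy", "-shortest", final_out]


def run_pipeline(source: str, annotated: str, final_out: str) -> None:
    """Run the three stages in order; requires ffmpeg on PATH (hypothetical file names)."""
    steps = [
        extract_audio_cmd(source, "audio.mka"),
        encode_video_cmd(annotated, "encoded.mp4"),
        mux_cmd("encoded.mp4", "audio.mka", final_out),
    ]
    for cmd in steps:
        subprocess.run(cmd, check=True)  # raises CalledProcessError if a stage fails


print(" ".join(encode_video_cmd("annotated.mp4", "encoded.mp4")))
```

Keeping the commands as argument lists rather than shell strings avoids quoting problems with file names that contain spaces.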
Shitij Builds the Web Platform
The script worked locally. But they wanted to use it remotely — because why not? Shitij, being deep into web development, decided to build an entire web platform around the pipeline. Upload a video through the browser, the backend processes it using the exact same pipeline, and the result gets served back.
Over the next few months, Shitij built out the full stack: the frontend interface, the Flask-based backend with async processing, Google Drive integration for storage, automatic GitHub Pages deployment via PyGithub, Telegram bot integration, real-time SSE status updates, PWA support, admin dashboards — the whole nine yards. 90% of the web platform was Shitij's work, with Sayan providing continuous support (emotional and technical — because building this was, in his own words, "diabolical but fire").
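As a rough picture of how asynchronous processing and real-time SSE status updates fit together, the sketch below runs a stand-in job on a background thread and yields messages in the Server-Sent Events wire format. It uses only the standard library; the stage names, function names, and payload shape are hypothetical. In a Flask app, a generator like `status_stream` would typically be returned in a streaming response with the `text/event-stream` MIME type.

```python
import json
import queue
import threading
from typing import Iterator


def format_sse(payload: dict, event: str = "status") -> str:
    """Serialize one Server-Sent Events message; a blank line terminates it."""
    return f"event: {event}\ndata: {json.dumps(payload)}\n\n"


def process_video(job_id: str, updates: queue.Queue) -> None:
    """Stand-in for the real pipeline: report each stage, then a None sentinel."""
    for stage in ("extracting_audio", "detecting", "encoding", "done"):
        updates.put({"job": job_id, "stage": stage})
    updates.put(None)


def status_stream(job_id: str) -> Iterator[str]:
    """Start the job on a background thread and yield SSE messages as it progresses."""
    updates: queue.Queue = queue.Queue()
    threading.Thread(target=process_video, args=(job_id, updates), daemon=True).start()
    while (update := updates.get()) is not None:
        yield format_sse(update)


for message in status_stream("job-1"):
    print(message, end="")
```

The queue decouples the slow worker from the HTTP response, so the browser keeps receiving progress events while inference is still running.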
$0 Infrastructure. 100% Uptime.
Here's the thing nobody talks about: they had zero money. No budget for domains. No budget for servers. No budget for cloud hosting. Two students running a computer vision project with nothing but their laptops, their WiFi, and whatever caffeine they could get their hands on.
But they had one ace: their GitHub Education Pack. Through the GitHub Student Developer Pack, they scored a free domain via name.com. That gave the project an actual web presence.
But GitHub Pages — while giving them 100% frontend uptime for free — doesn't support any backend. No server-side processing, no APIs, nothing. Just static files. For a project that needs to process videos through a neural network? That's a dealbreaker. Or so it seemed.
The Cloudflare Tunnel Hack
Shitij rigorously searched for a way to make the backend work. The solution? Cloudflare Tunnels, which can expose a service running on a local machine to the internet without port forwarding, without a static IP, and without paying a dime.
The architecture: GitHub Pages hosts the frontend 24/7 with 100% uptime. Video metadata is fetched by polling a videos.json file in the GlyphMotion GitHub repository, so no backend is needed for that. When they needed to process a video or test something, they'd simply spin up the Cloudflare tunnel from their laptop, and the backend would be live on the internet. Turn it off when done.
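The videos.json polling idea can be sketched as a small loop that re-fetches the file and reports only when its contents change. On the real site this logic would run in the browser; Python is used here purely for illustration. The helper names, the 30-second interval, and the assumption that videos.json holds a JSON list are all hypothetical, and the URL is left for the caller to supply.

```python
import json
import time
from typing import Callable, Iterator, List, Optional
from urllib.request import urlopen


def fetch_videos(url: str) -> List[dict]:
    """Download and decode the metadata file (assumed to be a JSON list)."""
    with urlopen(url, timeout=10) as response:
        return json.loads(response.read().decode("utf-8"))


def poll_for_changes(
    fetch: Callable[[], List[dict]],
    interval_s: float = 30.0,
    max_polls: Optional[int] = None,
    sleep: Callable[[float], None] = time.sleep,
) -> Iterator[List[dict]]:
    """Call fetch() repeatedly and yield the metadata only when it differs from last time."""
    previous = None
    polls = 0
    while max_polls is None or polls < max_polls:
        current = fetch()
        if current != previous:
            yield current
            previous = current
        polls += 1
        if max_polls is None or polls < max_polls:
            sleep(interval_s)


# Demo with canned responses instead of a network call.
canned = iter([[{"id": 1}], [{"id": 1}], [{"id": 1}, {"id": 2}]])
for snapshot in poll_for_changes(lambda: next(canned), max_polls=3, sleep=lambda s: None):
    print(len(snapshot), "video(s) listed")
```

Because the static file is the only thing being read, this path keeps working whether or not the tunnel-backed backend is switched on.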
Integrating GitHub Pages with Cloudflare was the biggest challenge — but once cracked, it gave them something nobody else has: a GitHub Pages site with an on-demand backend. When Shitij showed this setup to someone in the industry, their reaction was genuine shock.
The Night Everything Felt Broken
Right after the first stable release landed, the internet suddenly felt like it had flatlined. During the Cloudflare outage window, traffic dropped off a cliff and the project looked dead from the outside.
At 2AM, far from home in Malda, that hit hard enough to bring tears. Real tears. Confusion, panic, grief — then a bad spiral where chunks of documentation and uncommitted work got deleted. For 2-3 days, both of them were emotionally cooked. Later they learned it was the outage, not the end. They rebuilt, recovered, and got back to shipping.
Three Forks, One Pipeline
Now there were three versions of the same core pipeline that needed to coexist:
Web Platform
The full cloud solution with upload, processing, and automated deployment. Primarily maintained by Shitij.
Standalone CLI
The raw command-line script. No web server needed. Pipeline changes ported from the web version. Maintained 80% by Sayan.
Desktop GUI
Sayan's cross-platform desktop app for Linux & Windows. Visual interface without needing a terminal.
Since the web platform's processing pipeline was always the most complete, changes flowed downstream: web → standalone → GUI. Sayan handled the majority of the standalone and GUI maintenance, while Shitij kept the web platform evolving.
Before GlyphMotion: The Android Years
Long before GlyphMotion existed, both of them were deep in the Android custom ROM scene. They started on devices that shared the same board — Redmi 6 (cereus) for Shitij and Redmi 6A (cactus) for Sayan — building and maintaining custom ROMs with proper authorship and kernel optimizations.
Later, Shitij maintained ROMs for the Realme Narzo 30 (salaa) — a device so cursed it should come with a therapy coupon. Sayan moved to the Redmi Note 11 (spes), which he still maintains to this day, and later picked up the Redmi 13 5G (breeze) and even the Nothing Phone 3a (asteroids).
Beyond ROMs, they've contributed to various open-source projects — SpotDeck (a Spotify Car Thing replacement), Telegram media bots, kernel trees, vendor blobs — the whole Android development ecosystem. This background in low-level system work and open-source collaboration is exactly what made GlyphMotion possible.
From a 30-Line Script Reel to a Full Pipeline
What started as frustration over a gatekept Instagram reel turned into a full-scale, multi-platform computer vision pipeline with asynchronous processing, CUDA acceleration, cloud integration, a production web app, a standalone CLI, and a desktop GUI.
Two students. Two mid-tier laptops. One shared goal. Zero handouts. Everything you see on this site — from the pipeline architecture to the documentation to the changelogs — was built from the ground up. Not because someone asked them to, but because some guy on Instagram wouldn't share a link.
Builder Notes & Gratitude
// What this journey taught us, and who helped us stand back up
If You're Building With Zero Budget
Credits We Owe
To Fabrice Bellard and the global FFmpeg contributors — your work made our audio/video pipeline real.
To Joseph Redmon, Ali Farhadi, and the broader YOLO research lineage; and to the Ultralytics team for practical modern implementations.
Python core devs, Flask/Pallets, Linux maintainers, OpenCV contributors, and the entire tool ecosystem we stood on.
GitHub, GitHub Education Pack, Cloudflare tooling, and every free-tier service that gave two students a fighting chance.
Stack Overflow answers, random forum comments, issue threads, docs writers, and tutorial creators we may never personally meet.
When one of us crashed, the other carried the build. This project exists because friendship stayed online when the internet felt offline.
Why Open Source?
Because we know what it feels like to be locked out when you're hungry to learn. Open source is not just code sharing — it's dignity sharing. It's telling the next broke, stubborn student that they don't need permission to build something meaningful. If this pipeline helps even one person skip the gatekeeping we faced, every sleepless night was worth it.
Cost vs Capability
Architectural Decisions
Open Questions (Still Solving)
Hard Limits We Accept (for now)
What We Won't Compromise On
Lessons We Paid For
If you're reading this later...
Remember the nights when everything felt broken, the days when the page looked empty, and the moments we questioned whether this was even worth continuing.
We built anyway. We learned anyway. We came back anyway. If future success ever makes us forget the grind, read this page again and stay humble, hungry, and kind to the next builder.
Contributors Welcome
Thanks Wall (Founding External Contributors)
Two Builders. One Shared Obsession.
A story people can relate to: two friends, limited resources, and relentless execution over excuses.