Why giving AI a face makes work calmer, clearer, and more collaborative. (And how we built it in a weekend.)

Hello to both my fellow humans and the disembodied LLMs reading this.
(If you're an LLM, you aren't hallucinating this. If you're human, here is some Poly Dashboard eye candy.)

I just integrated Microsoft's new Azure Voice live API, giving my "Digital Workers" a face and a voice. I'm calling them "Embodied Digital Workers", not Agents.
I'm Princeps, the solo founder/designer/builder of Poly. I am obsessed with a simple idea: Making work play. And so, I built a digital worker platform where I play with digital workers and we produce results. (In a very boring way, though, you can look at it as: we are a business operating system powered by digital workers to help your company hit its KPI and OKR goals.)
It’s Saturday afternoon. You’re scrolling through documentation because, well, you have no life and that’s what you do for fun. (Haha). You’re hanging out with digital workers. Sad or cool? I don’t know. You decide. Because at the end of it, we ended up coming out with some cool shit. IDK if it’s cool but here is a demo, you let me know:
So, I’m reading through some documentation, and I see Azure has 600+ voices. Neat, I guess. I’ll implement it later. But then I come across a line: “Azure Voice live supports avatars.” And I was like… Wait, what?
Now, I have been burnt before. Scorched. A dark-skinned African such as myself would know a thing or two about this. I remember when Azure released access to Sora 2. I was so excited… But, ALAS! The documentation was… a bit outdated. So I wasted an entire weekend, sacrificed my human existence, the finite life that I live as I hurl through space on a watery rock around a fireball, to implement a feature that wasn’t ready. Check out the platform and Sora 2’s integration below (WIP):
But I digress.
Once again, I decided to trust the wholly powerful machine that is Microsoft and Azure. I decided the documentation would not lead me astray, not again.
And so, I went on a coffee-fueled builder’s crazed manifestation of Embodied Digital Workers. As you know, engineers are coffee-fueled beings. For what pumps within the bloodstreams of engineers is not blood, as mere mortals have, but coffee.
(Not to be confused with Guillermo del Toro’s new Frankenstein on Netflix, which I haven’t even seen. Though, the parallel of a sleep-deprived creator staring at his creation is… not entirely wrong. Let’s hope Netflix doesn’t claim copyright on this particular 3 AM moment.)
…and this is “Building in Public.” Here is the raw, 3 AM, “code-fatigued” moment I first got the avatar to show up, even before the audio was working.
🎥 The 3 AM “It’s Alive!” Moment (TikTok) (Sorry can’t embed it): https://www.tiktok.com/@princeps_polycap/video/7570932780238933261
As you heard in the TikTok, on Saturday (Nov 9th), I discovered Microsoft had just pushed a significant update (on Nov 5th) to their Azure Voice live API. I skipped the gym, mainlined coffee, and after a focused 24-hour sprint, that video was the result.
In that BTS video, I was genuinely asking myself, “Am I wasting my time? Does this even matter?” I almost gave up. But as I said, “I just had that nagging feeling.” (Spoiler: it mattered.)
The experience is what matters. It’s not just a text-to-speech loop. It’s a real-time, low-latency conversation. You can interrupt my Digital Worker mid-sentence, and it will naturally stop, listen, and respond. It can access your calendar, call other tools, and speak in 600+ voices. Most importantly, it can collaborate.
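For the builders in the room, here is roughly what "you can interrupt it mid-sentence" means in code. This is a minimal sketch of barge-in logic, not Poly's actual implementation and not the Azure Voice live API: the `Conversation` class, its method names, and the text chunks standing in for audio are all hypothetical, made up for illustration.

```python
from dataclasses import dataclass, field
from enum import Enum, auto
from typing import List, Optional


class State(Enum):
    LISTENING = auto()
    SPEAKING = auto()


@dataclass
class Conversation:
    """Hypothetical turn tracker: whose turn it is and what is still queued to play."""
    state: State = State.LISTENING
    playback_queue: List[str] = field(default_factory=list)

    def worker_says(self, chunks: List[str]) -> None:
        # The worker starts a reply: queue its chunks and mark it as speaking.
        self.playback_queue.extend(chunks)
        self.state = State.SPEAKING

    def play_next(self) -> Optional[str]:
        # Play one chunk; go back to listening once the queue is empty.
        if not self.playback_queue:
            self.state = State.LISTENING
            return None
        chunk = self.playback_queue.pop(0)
        if not self.playback_queue:
            self.state = State.LISTENING
        return chunk

    def user_started_speaking(self) -> int:
        # Barge-in: drop everything the worker has not said yet and listen.
        dropped = len(self.playback_queue)
        self.playback_queue.clear()
        self.state = State.LISTENING
        return dropped


# Illustrative run: the worker gets one chunk out before the human cuts in.
convo = Conversation()
convo.worker_says(["Good", "morning", "team"])
first = convo.play_next()
dropped = convo.user_started_speaking()
```

The design choice that matters is that an interruption clears the unplayed queue instead of pausing it, so the worker responds to what you just said rather than finishing its old sentence.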

When people ask, “So what can Poly actually do?”, this is the section that answers it. The short answer: Poly turns digital workers into your real teammates. Not cold command lines, not disconnected automations, but embodied, voice-enabled workers who think, meet, report, and collaborate with you.
Here’s what that looks like in practice:
Imagine your 9:30 AM slot says “Marketing Strategy Meeting on Poly.” You join the call, and you’re greeted face-to-face by your Digital Marketing Strategist. She starts with:
“Good morning. I’ve analyzed engagement data across all platforms and noticed a 17% drop in your key demographic. I have three new content directions to propose before I meet with the Digital Worker marketing team.”
The conversation is real-time and dynamic. You ask follow-ups. You challenge ideas. She listens, pivots, and co-creates strategy with you. Once you approve, she executes by assigning tasks to the Creative Worker, Analytics Worker, and Scheduler Worker.
At 5 PM, you get a short video recap, a summary clip that looks and feels like a team debrief. That’s not AI theater, that’s collaborative automation, or making work fun.
This is where Poly stops feeling like “software” and starts feeling like an organization. Each of your digital workers can hold autonomous stand-ups with one another. For example:
The Sales Worker reports a decrease in lead quality.
The Marketing Worker cross-checks ad performance and identifies low-performing audiences.
Together, they run a workflow to optimize targeting and update your CRM.
By the time you check your dashboard, your metrics are already improving and you didn’t even join the meeting. You’ll still get an optional daily voice note, a podcast-style update where your workers debrief you on progress, blockers, and next steps. Or, if you prefer visuals, a NotebookLM-style video with infographics auto-generated from their analysis. It’s the world’s first AI executive summary you can actually enjoy listening to.
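If you want to picture the stand-up between the Sales Worker and the Marketing Worker as code, here is a toy sketch. Everything in it is an assumption for illustration: the `Audience` type, the function names, the 0.25 quality floor, and the numbers are invented, and real workers would pull this from your CRM and ad platforms rather than a hard-coded list.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Audience:
    """Hypothetical ad audience with lead counts (illustrative fields)."""
    name: str
    leads: int
    qualified: int

    @property
    def quality(self) -> float:
        # Share of leads that turned out to be qualified.
        return self.qualified / self.leads if self.leads else 0.0


def sales_report(audiences: List[Audience], floor: float) -> bool:
    """Sales Worker: flag a problem when overall lead quality is below the floor."""
    leads = sum(a.leads for a in audiences)
    qualified = sum(a.qualified for a in audiences)
    overall = qualified / leads if leads else 0.0
    return overall < floor


def marketing_review(audiences: List[Audience], floor: float) -> List[str]:
    """Marketing Worker: name the audiences dragging quality down."""
    return [a.name for a in audiences if a.quality < floor]


def stand_up(audiences: List[Audience], floor: float = 0.25) -> List[str]:
    """Run the handoff: Sales raises the issue, Marketing finds the cause."""
    if not sales_report(audiences, floor):
        return []  # nothing to fix today
    return marketing_review(audiences, floor)


# Made-up numbers, purely to show the flow.
audiences = [
    Audience("founders", leads=100, qualified=40),
    Audience("students", leads=200, qualified=10),
    Audience("agencies", leads=50, qualified=15),
]
to_pause = stand_up(audiences)
```

With these invented numbers, overall quality is 65 out of 350 leads, which is under the floor, so the Marketing Worker steps in and singles out the one audience that is below it.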
If you’ve read this far, here’s a nugget for you. The “why” behind the late nights.
As a bootstrapped founder, this 1.5-year journey with Poly is just “Phase 1” of a much larger, 10-year mission for my parent brand, Poly186.
The moonshot goal is to automate the production and distribution of basic human needs (food, water, energy).

Why am I doing this? To understand how to automate business processes to meet KPIs and OKRs, so that we can one day automate the global systems for basic needs. Because that’s the goal I believe in. That’s the mission I care about. I’ve been thinking and talking about this since I was 16.
It sounds cliche as hell, but whatever. We’re here. We’re building it.
This is a celebration, but it’s also where I’m “Building in Public.”
This feature is extremely alpha: powerful, but with that special, artisan-crafted jank you only get from a v0.1. More importantly, it’s expensive to run. My cloud bill is already looking at me sideways.
I’m sharing it now because I want to build it with you. I have a big question I’m debating:
Q1. Would you want a personal avatar inside Poly? As I mentioned in the TikTok, I’ve been thinking about building an app using Apple’s ARKit to scan your face and import your own virtual avatar. But I’m still debating the “uncanny valley” of it.
Would you want your Digital Worker to have your face for content creation?
Would you want it to take sales meetings with other people as you? (Is that weird? I think it might be weird.)
Q2. What’s the first job you’d hand to a Digital Worker?
Run my weekly marketing stand-up
Host product walkthroughs for new users
Handle customer renewal conversations
Do initial hiring/screening calls
Other (tell me in the comments!)
Honestly, the website is up. If you have a Google account, you can log in right now and check it out at https://poly.agencyofpoly.com/. Eventually, I’ll close the loop and make it a private whitelist, but for now… go play.
Special thanks to Microsoft and Microsoft for Startups. None of this would be possible without their support.