Noryen
Tech Stack:

When I was building Biometryx, I ran into a problem I didn't expect.
In a health-related context, AI can become dangerous fast.
The model can:
- suggest medication
- mention dosages
- give advice that looks correct but isnât
I wouldn't know unless I manually tested every edge case. And even then, I wouldn't catch faulty output before my users did.
It was not scalable. And definitely not safe.
How is my AI actually behaving in production?
Most of the time we focus on prompts, UX, speed... But we rarely ask the question: how does my AI behave in real-world scenarios?
Are we going to be digging through raw JSON logs? Debugging by copy-pasting prompts into ChatGPT? Or... do we simply cross our fingers and hope for the best?
I looked into existing solutions.
They were:
- heavy
- expensive
- built for large teams
I just needed something simple:
- log outputs
- flag risky ones
- compare models
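The three needs above can be sketched in a few lines of Python. This is a hypothetical illustration of the idea (the function names, risk keywords, and log format are my assumptions, not Noryen's actual code):

```python
# Hypothetical sketch: log model outputs, flag risky ones, and tag each
# record with the model name so outputs can be compared across models.
import json
import re
from datetime import datetime, timezone

# Illustrative keyword patterns for a health context (not Noryen's real rules).
RISK_PATTERNS = [
    r"\bmg\b",
    r"\bdose\b",
    r"\bdosage\b",
    r"\btake \d+\b",
    r"\bprescri",
]

def flag_risky(output: str) -> list[str]:
    """Return the risk patterns that match the model output."""
    return [p for p in RISK_PATTERNS if re.search(p, output, re.IGNORECASE)]

def log_output(model: str, prompt: str, output: str) -> dict:
    """Build a structured log record with risk flags attached."""
    record = {
        "ts": datetime.now(timezone.utc).isoformat(),
        "model": model,
        "prompt": prompt,
        "output": output,
        "flags": flag_risky(output),
    }
    # One JSON object per line, so records are easy to scan, grep, and diff.
    with open("ai_outputs.jsonl", "a") as f:
        f.write(json.dumps(record) + "\n")
    return record
```

Comparing models then reduces to filtering the log file by the `model` field and looking at which one trips the fewest flags.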
So I built it.
AI isn't deterministic. You can't fully predict what it will say. And if you're building in sensitive areas, "probably safe" is not good enough.
You can try Noryen here.


