Inventor: Implementation Details

A metal tube. 8mm in diameter. Aviation grade.

By the time it reaches the end of the line, it carries a dozen stamps across its surface. Material certificate. Heat treatment. Dimensional check. Final approval. Each one a record that a specific step happened — and passed.

The inspector’s job: read all of them. Reliably. In real time. While the tube is rotating in their hands.

Small text. Curved metal. Digits that look almost identical to each other. And somewhere nearby, a paper checklist to cross-reference against.

That’s not an edge case. That’s the daily reality for thousands of small manufacturers — the ones who can’t afford a six-figure machine vision setup, and don’t have an engineering team to maintain one.

Large factories solved this years ago. They have computer vision, AI recognition, whole departments dedicated to it.

Small manufacturers have a worker, a flashlight, and a checklist.

That gap is exactly what Inventor was built for.

The OCR Problem

There’s no shortage of OCR models. The problem: almost all of them were trained on paper.

Metal is different. Curved metal is harder still. Glare, scratches, shallow engravings, stamps pressed at odd angles — none of it looks like a scanned document.

After several weeks of testing, two directions emerged.

Custom model — built from scratch for metal surfaces.

Tuned for a narrow symbol set (a one vs. a seven matters more than full sentences)
Handled recorded video well
Real-time on iPhone: not reliable enough

Apple OCR — it worked. Almost too well.

Picked up everything in frame, including things visible for a fraction of a second
Adjacent stamps merged into one word
And yet: curved surfaces, glare, scratches — none of it slowed it down
Ran flawlessly on iPhone and iPad

So we kept Apple OCR and taught it to look in the right direction. Added a layer of intelligence on top.

That’s the foundation the pipeline is built on.
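One way such a layer could suppress text that is visible for only a fraction of a second is a persistence filter: a reading counts only once it has appeared in several recent frames. The sketch below is illustrative, not the Inventor implementation. The class name, the window size, and the hit threshold are all assumptions, and the OCR output is reduced to plain strings per frame.

```python
from collections import Counter, deque


class StableTextFilter:
    """Report a text reading only after it persists across recent frames."""

    def __init__(self, window=5, min_hits=3):
        if not 1 <= min_hits <= window:
            raise ValueError("min_hits must be between 1 and window")
        self.min_hits = min_hits
        # Each entry is the set of distinct strings read in one frame.
        self.history = deque(maxlen=window)

    def update(self, frame_texts):
        """Add one frame's OCR strings and return the stable ones."""
        self.history.append(set(frame_texts))
        counts = Counter(text for frame in self.history for text in frame)
        return {text for text, hits in counts.items() if hits >= self.min_hits}


# Hypothetical per-frame OCR output: "7" and "QC" each flash by once.
frames = [["HT-204"], ["HT-204", "7"], ["HT-204"], ["QC"]]
text_filter = StableTextFilter(window=5, min_hits=3)
for texts in frames:
    stable = text_filter.update(texts)
print(sorted(stable))  # ['HT-204']
```

A larger window tolerates dropped frames at the cost of a slower response when the tube moves on to the next stamp.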

Pipeline: Frame → Zone model → Detection → OCR → Layout → Result

Zones

A tube isn’t one flat surface. It has sections. It rotates. The same stamp can appear in frame three times in a single pass.

The zone model handles this. It segments the shape of the tube, separates distinct sections, and gives the detection pipeline a structured surface to work with — instead of a continuous blur of metal and markings.
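To illustrate what a structured surface buys the later stages, the sketch below assigns each recognized text box to the zone that covers most of it and drops text that falls outside every zone. It simplifies heavily: zones are assumed to be axis-aligned rectangles rather than segmented shapes, and the names `Box`, `assign_zone`, and `min_overlap` are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Box:
    """Axis-aligned rectangle in pixel coordinates."""
    x0: float
    y0: float
    x1: float
    y1: float

    def area(self) -> float:
        return max(0.0, self.x1 - self.x0) * max(0.0, self.y1 - self.y0)

    def intersection(self, other: "Box") -> float:
        w = min(self.x1, other.x1) - max(self.x0, other.x0)
        h = min(self.y1, other.y1) - max(self.y0, other.y0)
        return max(0.0, w) * max(0.0, h)


def assign_zone(text_box: Box, zones: Dict[str, Box],
                min_overlap: float = 0.5) -> Optional[str]:
    """Return the zone covering most of text_box, or None if coverage is too low."""
    area = text_box.area()
    if area == 0:
        return None
    best_name, best_frac = None, 0.0
    for name, zone_box in zones.items():
        frac = text_box.intersection(zone_box) / area
        if frac > best_frac:
            best_name, best_frac = name, frac
    return best_name if best_frac >= min_overlap else None


# Two hypothetical zones side by side along the tube.
zones = {"body": Box(0, 0, 100, 50), "flange": Box(100, 0, 160, 50)}
print(assign_zone(Box(90, 10, 130, 20), zones))   # flange
print(assign_zone(Box(200, 10, 220, 20), zones))  # None
```

Requiring a minimum overlap is one way to keep a stamp near a boundary from being attributed to a neighbouring section.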

Getting there was straightforward. What slowed things down was data.

The Data Reality

Clients say they have a lot of data.

Usually they mean gigabytes of process video. That’s not nothing — but it’s not the same as useful data.

For a model to work, the dataset needs variety:

Different tubes
Different surface conditions
Different edge cases
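A quick way to see whether a pile of process video actually provides that variety is to count samples per combination and list the thin spots. The sketch below assumes each annotated sample carries `tube_type` and `surface` labels. Those field names and the `min_count` threshold are illustrative, not taken from the product.

```python
from collections import Counter
from itertools import product


def coverage_gaps(samples, tube_types, surface_conditions, min_count=20):
    """Return {(tube_type, surface): count} for under-represented combinations."""
    counts = Counter((s["tube_type"], s["surface"]) for s in samples)
    return {
        combo: counts[combo]
        for combo in product(tube_types, surface_conditions)
        if counts[combo] < min_count
    }


# Hypothetical annotations: plenty of clean type-A tubes, little else.
samples = (
    [{"tube_type": "A", "surface": "clean"}] * 3
    + [{"tube_type": "A", "surface": "scratched"}]
)
gaps = coverage_gaps(samples, ["A", "B"], ["clean", "scratched"], min_count=2)
for combo, count in gaps.items():
    print(combo, count)
```

Combinations with a count of zero are the parts that were simply not on the line while the footage was recorded.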

Mid-tier manufacturers — our core audience — often can’t deliver that. Small staff. Limited active orders. The parts you need to balance the dataset simply aren’t on the line that week. Or someone would need to spend half an hour filming, and there’s no one free to do it.

We adjusted. Built storage and annotation into our own product cycle so models can stay operational while new data comes in gradually — on the client’s schedule, not ours.

Error Detection

Finding stamps is one problem. Catching errors in them is another.

Negative examples — defective or incorrectly marked parts — are rare by nature. You can’t build a reliable error-detection model from production runs alone. There simply aren’t enough failures to learn from.

There’s also the duplicate problem. A tube rotates. The same stamp appears in frame repeatedly. But sometimes the tube also carries genuine duplicates — or stamps that look nearly identical to each other. The system has to tell the difference.
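One possible way to separate repeated sightings from genuine duplicates is to place each sighting on the tube's circumference and group sightings that sit close together. The sketch below assumes every detection comes with an estimated angle around the tube axis, which the text above does not state. The function name and the tolerance value are likewise hypothetical.

```python
from collections import defaultdict


def count_physical_stamps(sightings, tolerance_deg=15.0):
    """Estimate how many physical stamps produced each text.

    sightings: iterable of (text, angle_deg) pairs, one per frame detection.
    Sightings of the same text whose angles lie within tolerance_deg of a
    neighbouring sighting are treated as one stamp seen repeatedly.
    """
    angles_by_text = defaultdict(list)
    for text, angle in sightings:
        angles_by_text[text].append(angle % 360.0)

    result = {}
    for text, angles in angles_by_text.items():
        angles.sort()
        clusters = 1
        for prev, cur in zip(angles, angles[1:]):
            if cur - prev > tolerance_deg:
                clusters += 1
        # The circle wraps around: 358 and 3 degrees are neighbours.
        if clusters > 1 and angles[0] + 360.0 - angles[-1] <= tolerance_deg:
            clusters -= 1
        result[text] = clusters
    return result


sightings = [
    ("QC-7", 10), ("QC-7", 14), ("QC-7", 12),   # one stamp, seen three times
    ("QC-7", 190), ("QC-7", 193),               # a second stamp, opposite side
    ("HT-1", 358), ("HT-1", 3),                 # one stamp across the 0 mark
]
print(count_physical_stamps(sightings))  # {'QC-7': 2, 'HT-1': 1}
```

Angle alone cannot separate two identical stamps placed side by side, which is where a model of spatial layout would have to take over.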

The answer was multiple levels of analysis: from broad zones down to individual stamps, then down to individual characters within a stamp. Layer by layer. Combined with a layout model trained specifically on the spatial arrangement of objects, this is what produces the accuracy the inspection actually requires.

Not one model doing everything. A cascade of models, each responsible for a specific level of the problem.
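The shape of such a cascade can be sketched with plain functions standing in for the models. Everything below is a stand-in: the "frame" is a dictionary that already lists its zones and stamp texts, the allowed character set is invented, and the layout step is left out. The point is only the structure, in which each stage works at one level and passes a shared record to the next.

```python
from dataclasses import dataclass, field
from typing import Dict, List

ALLOWED = frozenset("0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ-")  # assumed symbol set


@dataclass
class Inspection:
    """Shared record that each stage reads from and adds to."""
    frame: dict  # stand-in for image data: {zone_name: [stamp texts]}
    zones: List[str] = field(default_factory=list)
    stamps: Dict[str, List[str]] = field(default_factory=dict)
    issues: List[str] = field(default_factory=list)


def zone_stage(ins):
    # Stand-in for the zone model.
    ins.zones = list(ins.frame)
    if not ins.zones:
        ins.issues.append("no zones found")


def stamp_stage(ins):
    # Stand-in for detection and OCR inside each zone.
    ins.stamps = {zone: list(ins.frame[zone]) for zone in ins.zones}


def character_stage(ins):
    # Character level: flag any symbol outside the expected set.
    for zone, texts in ins.stamps.items():
        for text in texts:
            bad = sorted(set(text) - ALLOWED)
            if bad:
                ins.issues.append(f"{zone}: unexpected characters {bad} in {text!r}")


def run_cascade(frame, stages):
    """Run stages in order, stopping if there are no zones to refine."""
    ins = Inspection(frame=frame)
    for stage in stages:
        stage(ins)
        if not ins.zones:
            break
    return ins


result = run_cascade({"body": ["HT-204", "QC?7"]},
                     [zone_stage, stamp_stage, character_stage])
print(result.issues)
```

Because each stage only touches its own level, one model can be retrained or swapped as new data arrives without disturbing the others.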
