YOLOv8 · Custom Training Pipeline · 5 Data Sources

Training Dataset
& Collection Pipeline

A 3,500+ image corpus assembled from five distinct sources — lab-captured photos, MVTec-AD benchmark images, Roboflow augmentation, programmatic synthetic collages, and hard-negative mining. Together they teach YOLOv8 to reliably locate screws, nuts, and bolts across varied lighting, angle, and background conditions.

3,500+ Training Images
5 Data Sources
94.5% mAP@50
100+ FPS Inference Speed
[Image gallery: annotated training image · MVTec-AD sample · synthetic collage · hard negative · custom lab capture · bounding-box annotation]
Collection Pipeline

How we built the dataset

Step 01 · 📷 Lab Capture: Photographed real screws, nuts, and bolts under varied lighting and angles using smartphone cameras in a controlled lab setup.
Step 02 · 🌐 Public Datasets: Downloaded MVTec-AD screw images (320 normal samples) and sourced additional labeled images from the Roboflow Universe.
Step 03 · 🏷️ Manual Annotation: Labeled every bounding box in YOLO format (.txt files), one annotation file per image with normalized xywh coordinates.
Step 04 · 🔀 Augmentation: Roboflow auto-augmentation tripled the custom set with random flips, brightness jitter, and rotation, keeping the model view-invariant.
Step 05 · 🧩 Synthetic Collages: Generated 250+ multi-object scene images by compositing hardware onto diverse backgrounds to improve multi-instance detection.
Step 06 · 🚫 Negative Mining: Added hard-negative images (bolts, nails, pins — similar-looking non-targets) to cut false-positive rates during detection.
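The YOLO label format from Step 03 is simple to handle programmatically: each line is `class cx cy w h`, with all four coordinates normalized to the image size. A minimal sketch of the conversion back to pixel coordinates (the function name is ours):

```python
def yolo_to_pixels(line, img_w, img_h):
    """Convert one YOLO label line ("class cx cy w h", all normalized)
    to (class_id, x_min, y_min, x_max, y_max) in pixel coordinates."""
    cls, cx, cy, w, h = line.split()
    cx, cy, w, h = (float(v) for v in (cx, cy, w, h))
    x_min = (cx - w / 2) * img_w
    y_min = (cy - h / 2) * img_h
    x_max = (cx + w / 2) * img_w
    y_max = (cy + h / 2) * img_h
    return int(cls), x_min, y_min, x_max, y_max

# Example: a screw (class 0) centered in a 640x640 image
print(yolo_to_pixels("0 0.5 0.5 0.25 0.125", 640, 640))
# (0, 240.0, 280.0, 400.0, 360.0)
```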
Data Sources

Five-source training corpus

S1 · MVTec-AD
MVTec Anomaly Detection Dataset
320 normal screw images · industrial benchmark
The MVTec AD screw category provides high-resolution, studio-lit images of real industrial screws — all defect-free normal samples used as the EfficientAD training set. These images set the baseline appearance the anomaly model learns to expect.
S2 · Custom
Lab-Captured + Roboflow Augmented
20 raw → 60+ augmented images · self-collected
Original photographs taken in our lab, then uploaded to Roboflow for automated augmentation. Each real image generated 3× more training examples through flips, brightness shifts, and rotations — covering real-world lighting variations our camera setup encounters.
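The 3× expansion can be sketched with a few array operations. This is an illustrative stand-in for Roboflow's pipeline, not its actual code; note that flips and rotations also require remapping the box coordinates, which Roboflow handles automatically:

```python
import numpy as np

def augment_3x(img):
    """Produce three simple variants of an HxWx3 uint8 image:
    horizontal flip, brightness jitter, and a 90-degree rotation.
    (Flips/rotations also require flipping the YOLO labels.)"""
    flipped = img[:, ::-1]                      # mirror left-right
    brighter = np.clip(img.astype(np.int16) + 40, 0, 255).astype(np.uint8)
    rotated = np.rot90(img)                     # 90 degrees counter-clockwise
    return [flipped, brighter, rotated]

img = np.zeros((64, 48, 3), dtype=np.uint8)
variants = augment_3x(img)
print([v.shape for v in variants])
# [(64, 48, 3), (64, 48, 3), (48, 64, 3)]
```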
S3 · Roboflow
Roboflow Universe Dataset
Pre-labeled hardware images · community dataset
Sourced additional annotated screw, nut, and bolt images from the Roboflow Universe — a community repository of machine-vision datasets. These pre-labeled images accelerated annotation and provided diverse backgrounds and hardware variants not present in our lab collection.
S4 · Synthetic
Synthetic Multi-Object Collages
250+ programmatically generated scenes
Python-generated composite images placing multiple hardware objects onto varied backgrounds. These synthetic scenes push the model to handle crowded frames with several screws, nuts, and bolts simultaneously — the primary challenge in real industrial inspection trays.
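The generator can be sketched as pasting patches at random positions and emitting matching YOLO label lines. The real code composites cut-out hardware photos with alpha masks; here solid patches stand in, and `make_collage` is a name of our choosing:

```python
import random
import numpy as np

def make_collage(background, patches, seed=0):
    """Composite object patches onto a background at random positions
    and return the collage plus its YOLO-format label lines."""
    rng = random.Random(seed)
    canvas = background.copy()
    H, W = canvas.shape[:2]
    labels = []
    for cls, patch in patches:
        ph, pw = patch.shape[:2]
        x = rng.randint(0, W - pw)          # top-left corner, in pixels
        y = rng.randint(0, H - ph)
        canvas[y:y + ph, x:x + pw] = patch  # naive paste (no alpha blend)
        cx, cy = (x + pw / 2) / W, (y + ph / 2) / H
        labels.append(f"{cls} {cx:.6f} {cy:.6f} {pw / W:.6f} {ph / H:.6f}")
    return canvas, labels

bg = np.zeros((320, 320, 3), dtype=np.uint8)
screw = np.full((40, 20, 3), 200, dtype=np.uint8)  # stand-in "screw" patch
collage, label_lines = make_collage(bg, [(0, screw), (0, screw)])
print(label_lines)
```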
S5 · Negatives
Hard Negative Mining
Bolts, nails, pins — visually similar non-targets
Hard-negative images of objects that look similar to screws and nuts — bolts, nails, and pins. Training on negatives without bounding-box labels teaches YOLO to suppress false detections on confusable classes, directly reducing the false-positive rate during inspection.
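Adding negatives to a YOLO dataset requires no annotation work: each negative image just gets an empty label file, which the Ultralytics trainer treats as "nothing to detect here", penalizing any prediction on it as a false positive. A sketch (the helper name is ours):

```python
from pathlib import Path
import tempfile

def add_hard_negatives(labels_dir, image_names):
    """For each negative image, write an EMPTY .txt label file.
    Images with empty label files are treated as pure negatives:
    every detection on them counts as a false positive in training."""
    labels_dir = Path(labels_dir)
    labels_dir.mkdir(parents=True, exist_ok=True)
    for name in image_names:
        (labels_dir / name).with_suffix(".txt").touch()

with tempfile.TemporaryDirectory() as d:
    add_hard_negatives(d, ["nail_01.jpg", "pin_07.jpg"])
    print(sorted(p.name for p in Path(d).glob("*.txt")))
# ['nail_01.txt', 'pin_07.txt']
```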
Model Performance

YOLOv8 training results

Precision
94.3%
Box-level on val set
Recall
89.4%
Box-level on val set
mAP@50
94.5%
Mean avg. precision
Inference
4.3ms
Per image · GPU
Throughput
100+ FPS
Real-time capable
YOLOv8n — Training Setup
We selected YOLOv8n (nano) for its balance of speed and accuracy in edge-deployable inspection. The model was trained from COCO-pretrained weights for 100 epochs with an 80/20 train-validation split across our 3,500+ image corpus.

YOLO outputs bounding-box coordinates, class labels (screw / nut / bolt), and per-box confidence scores in a single forward pass under 5 ms — feeding directly into the EfficientAD anomaly stage without any inter-model bottleneck.
YOLOv8n COCO Pretrained 100 Epochs 3 Classes 640×640 Input PyTorch
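Training with the settings above needs only a small dataset config plus one CLI call. An illustrative `data.yaml` for the three-class setup (the paths and class ordering are our assumptions):

```yaml
# data.yaml — dataset config consumed by the Ultralytics trainer
path: datasets/hardware
train: images/train   # ~80% of the 3,500+ corpus
val: images/val       # ~20% held out for validation
names:
  0: screw
  1: nut
  2: bolt
```

Training then matches the tags above: `yolo detect train model=yolov8n.pt data=data.yaml epochs=100 imgsz=640` starts from COCO-pretrained YOLOv8n weights at 640×640 input.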
YOLO Annotations

Bounding box visualizations

Each image in the training set has a paired .txt annotation file containing normalized YOLO coordinates for every bounding box. The visualizations below are rendered from the raw label files — boxes drawn directly onto the training images to verify annotation quality.
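This QA rendering can be sketched in a few lines of NumPy: boxes are drawn straight from the normalized label coordinates, with no detector in the loop (function names are ours):

```python
import numpy as np

def draw_box(img, x_min, y_min, x_max, y_max, color=(0, 255, 0)):
    """Draw a 2px rectangle outline on an HxWx3 image, in place."""
    x0, y0, x1, y1 = (round(v) for v in (x_min, y_min, x_max, y_max))
    img[y0:y0 + 2, x0:x1] = color  # top edge
    img[y1 - 2:y1, x0:x1] = color  # bottom edge
    img[y0:y1, x0:x0 + 2] = color  # left edge
    img[y0:y1, x1 - 2:x1] = color  # right edge

def render_labels(img, label_text):
    """Overlay every YOLO label line onto its image for a visual QA pass."""
    H, W = img.shape[:2]
    for line in label_text.strip().splitlines():
        _, cx, cy, w, h = map(float, line.split())
        draw_box(img,
                 (cx - w / 2) * W, (cy - h / 2) * H,
                 (cx + w / 2) * W, (cy + h / 2) * H)
    return img

img = np.zeros((100, 100, 3), dtype=np.uint8)
render_labels(img, "0 0.5 0.5 0.4 0.4")
print(img[30, 50])  # a pixel on the top edge of the box: green
```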

Anomaly Detection

EfficientAD — anomaly model

Student-Teacher Architecture
EfficientAD uses a teacher-student distillation approach: a pre-trained teacher network extracts features, while a compact student network learns to replicate them only on normal samples. At inference, high discrepancy between teacher and student responses flags anomalies — no defect labels needed.

Trained exclusively on 320 normal MVTec-AD screw images, the model builds a statistical model of what a healthy screw looks like at the pixel level, then flags anything that deviates from that learned distribution.
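The scoring idea reduces to a per-pixel distance between two feature maps. A toy NumPy sketch (the real EfficientAD compares learned CNN feature maps; the zero arrays here are stand-ins):

```python
import numpy as np

def anomaly_map(teacher_feats, student_feats):
    """Per-pixel anomaly score: squared discrepancy between teacher and
    student feature maps of shape (C, H, W), averaged over channels.
    The student only ever saw normal screws, so it fails to mimic the
    teacher wherever a defect appears."""
    return ((teacher_feats - student_feats) ** 2).mean(axis=0)

# Toy example: features agree everywhere except one "defect" pixel
t = np.zeros((8, 4, 4))
s = np.zeros((8, 4, 4))
t[:, 2, 1] = 1.0                 # teacher responds to the defect
amap = anomaly_map(t, s)
print(amap.argmax())             # flat index of the hottest pixel
# 9  (row 2, column 1 of the 4x4 map)
```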
96.98%
Image AUROC
97.89%
Pixel AUROC
Two-Model Pipeline
YOLO and EfficientAD are designed to be complementary, not competing. YOLO handles what and where — detecting and locating each hardware component. EfficientAD handles whether it's defective — scoring the full image for surface anomalies.

The pipeline runs both models on every inspection image: YOLO provides bounding boxes and class labels, while EfficientAD's full-image anomaly map is sliced per bounding box to generate a per-component defect score and heatmap visualization.
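The per-box slicing can be sketched as follows; boxes are assumed to be pixel-space corner tuples, and taking the max score inside each box is our illustrative reduction:

```python
import numpy as np

def per_component_scores(amap, boxes):
    """Slice the full-image anomaly map per YOLO bounding box; each
    component's defect score is the peak anomaly value inside its box.
    `boxes` holds (x_min, y_min, x_max, y_max) pixel tuples."""
    scores = []
    for x0, y0, x1, y1 in boxes:
        crop = amap[y0:y1, x0:x1]       # per-component heatmap slice
        scores.append(float(crop.max()))
    return scores

amap = np.zeros((100, 100))
amap[40, 40] = 0.9                      # one hot "defect" pixel
print(per_component_scores(amap, [(30, 30, 50, 50), (60, 60, 90, 90)]))
# [0.9, 0.0]  -> first component flagged, second clean
```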
EfficientAD-M MVTec AD · Screw No Defect Labels Pixel-level Maps
Model Output · MVTec AD Screw · Inference Visualization
EfficientAD inference result — anomaly map and segmentation mask
Panels, left to right: Input Image · Ground Truth Mask · Anomaly Heatmap · Overlay · Predicted Mask
Reading the output
Input Image
Raw grayscale screw photo fed into the model — same format as MVTec-AD training images.
Ground Truth Mask
Human-annotated defect region — the white blob marks where the real surface defect is located.
Anomaly Heatmap
EfficientAD's raw output — red/yellow = high anomaly score, blue = normal. Generated without any defect labels.
Overlay
Heatmap blended onto the input — shows exactly which part of the screw the model flagged as anomalous.
Predicted Mask
Binary segmentation derived by applying a score threshold to the heatmap — compared against the Ground Truth mask to compute the Pixel AUROC score.
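The thresholding step can be sketched in one line; the 0.5 cut-off is illustrative, and pixel AUROC itself is computed by sweeping over all thresholds rather than fixing one:

```python
import numpy as np

def predicted_mask(amap, thresh=0.5):
    """Binarize the anomaly heatmap at a score threshold to obtain the
    predicted defect mask (1 = anomalous pixel, 0 = normal)."""
    return (amap >= thresh).astype(np.uint8)

amap = np.array([[0.1, 0.8],
                 [0.6, 0.2]])
print(predicted_mask(amap))
# [[0 1]
#  [1 0]]
```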