How to Build a Fracture Detection AI with Teacher-Student CNN & MoE

Learn how to build a fracture detection AI using a Teacher-Student CNN with MoE fusion to achieve 99% accuracy in X-ray image analysis.

🧠 Introduction: What is FractureGuard?

Fracture detection using AI has come a long way. With the right deep learning techniques, a model can now classify the X-ray images in this tutorial’s dataset as fractured or non-fractured with over 99% accuracy on a held-out test split. In this tutorial, you’ll learn how to build your own AI-powered fracture detection system using a Teacher-Student Convolutional Neural Network (CNN) architecture enhanced by a Mixture of Experts (MoE) fusion layer.

By the end, you’ll have a working model, which we call FractureGuard, that distinguishes fractured from non-fractured bones with about 99% precision and recall on this dataset’s test split. I’ll walk you through it step by step, so it suits beginners, students, or anyone looking to apply deep learning in medical imaging.


📦 Step 1: Load and Explore the Fracture Detection Dataset

We used an augmented X-ray image dataset with two classes: Fractured and Non-Fractured.

import pandas as pd
import numpy as np
import os

base_path = "/kaggle/input/x-ray-images-of-fractured-and-healthy-bones/X-ray Imaging Dataset for Detecting Fractured vs. Non-Fractured Bones/Augmented Dataset/"
categories = ["Fractured", "Non-Fractured"]

image_paths, labels = [], []
for category in categories:
    category_path = os.path.join(base_path, category)
    for image_name in os.listdir(category_path):
        image_paths.append(os.path.join(category_path, image_name))
        labels.append(category)

df = pd.DataFrame({"image_path": image_paths, "label": labels})

We verified:

  • No missing or duplicate data
  • Perfectly balanced classes (4,650 images each)

df['label'].value_counts()
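
The checks listed above can be reproduced with a few pandas calls. Here is a minimal sketch on a small stand-in DataFrame with the same two columns; the file names are hypothetical, and in the tutorial you would run the same calls on `df`. Note that `duplicated()` only catches repeated rows (paths and labels), not two different files with identical pixels.

```python
import pandas as pd

# Stand-in for the tutorial's `df`; these paths are hypothetical.
sample_df = pd.DataFrame({
    "image_path": ["a.png", "b.png", "c.png", "d.png"],
    "label": ["Fractured", "Non-Fractured", "Fractured", "Non-Fractured"],
})

missing_per_column = sample_df.isnull().sum()       # missing values per column
duplicate_rows = int(sample_df.duplicated().sum())  # fully duplicated rows
class_counts = sample_df["label"].value_counts()    # images per class

print(missing_per_column)
print("Duplicate rows:", duplicate_rows)
print(class_counts)
```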

🖼 Step 2: Visualize Sample X-rays

Let’s peek at a few X-ray samples:

import cv2
import matplotlib.pyplot as plt

num_images = 5
plt.figure(figsize=(15, 12))
for i, category in enumerate(categories):
    category_images = df[df['label'] == category]['image_path'].iloc[:num_images]
    for j, img_path in enumerate(category_images):
        img = cv2.imread(img_path)
        img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
        plt.subplot(len(categories), num_images, i * num_images + j + 1)
        plt.imshow(img)
        plt.axis('off')
        plt.title(category)
plt.tight_layout()
plt.show()

🧪 Step 3: Preprocessing and Transforms

We apply data augmentation for training robustness:

import torchvision.transforms as transforms

train_transform = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.RandomCrop(224, padding=28),
    transforms.RandomHorizontalFlip(),
    transforms.ColorJitter(0.2, 0.2, 0.2),
    transforms.RandomRotation(15),
    transforms.ToTensor(),
    transforms.Lambda(lambda x: x.repeat(3, 1, 1) if x.size(0) == 1 else x),
    transforms.Normalize((0.485, 0.456, 0.406), (0.229, 0.224, 0.225))
])

test_transform = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Lambda(lambda x: x.repeat(3, 1, 1) if x.size(0) == 1 else x),
    transforms.Normalize((0.485, 0.456, 0.406), (0.229, 0.224, 0.225))
])

🧱 Step 4: Build a Custom Dataset Class

from torch.utils.data import Dataset
from PIL import Image

class FractureDataset(Dataset):
    def __init__(self, df, transform=None):
        self.df = df
        self.transform = transform
        self.label_map = {'Non-Fractured': 0, 'Fractured': 1}

    def __len__(self):
        return len(self.df)

    def __getitem__(self, idx):
        img_path = self.df.iloc[idx]['image_path']
        label = self.label_map[self.df.iloc[idx]['label']]
        image = Image.open(img_path).convert('RGB')
        if self.transform:
            image = self.transform(image)
        return image, label

🧠 Step 5: Define the Model Architecture

🔧 MoE Fusion Module

import torch.nn as nn

class MoEFeatureFusion(nn.Module):
    def __init__(self, input_dim, hidden_dim, num_experts, top_k=1):
        ...

This module combines multiple expert networks and a shared one using a soft attention-like gating mechanism.
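
Since the module body is elided above, here is a minimal sketch of what such a fusion layer could look like. It is not the original implementation: the class name `MoEFusionSketch`, the one-layer experts, and the top-k softmax gate are assumptions chosen for illustration. Only the constructor parameter names mirror the signature shown above.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class MoEFusionSketch(nn.Module):
    """Illustrative MoE fusion: gated experts plus one always-on shared expert."""

    def __init__(self, input_dim, hidden_dim, num_experts, top_k=1):
        super().__init__()
        self.top_k = top_k

        def make_expert():
            # Assumed expert design: one linear layer followed by ReLU.
            return nn.Sequential(nn.Linear(input_dim, hidden_dim), nn.ReLU())

        self.experts = nn.ModuleList([make_expert() for _ in range(num_experts)])
        self.shared_expert = make_expert()
        self.gate = nn.Linear(input_dim, num_experts)

    def forward(self, x):
        # Score every expert, keep the top_k per sample, softmax over those only.
        scores = self.gate(x)                                 # (batch, num_experts)
        top_vals, top_idx = scores.topk(self.top_k, dim=-1)   # (batch, top_k)
        masked = torch.full_like(scores, float("-inf")).scatter(-1, top_idx, top_vals)
        weights = F.softmax(masked, dim=-1)                   # zero outside the top_k

        # Run all experts (simple, not compute-sparse) and mix by gate weight.
        expert_out = torch.stack([expert(x) for expert in self.experts], dim=1)
        mixed = (weights.unsqueeze(-1) * expert_out).sum(dim=1)
        return mixed + self.shared_expert(x)


fusion = MoEFusionSketch(input_dim=512, hidden_dim=256, num_experts=4, top_k=2)
features = torch.randn(8, 512)   # e.g. pooled ResNet-18 features
fused = fusion(features)
print(fused.shape)  # torch.Size([8, 256])
```

The shared expert sees every sample, while the gate decides which of the other experts contribute. With `top_k=1` each sample is routed to a single expert.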

🎓 Teacher-Student CNN with ResNet18 Backbone

class TeacherStudentCNN(nn.Module):
    def __init__(self, num_classes, hidden_dim, num_experts):
        ...

  • Uses a pre-trained ResNet-18.
  • Only layers 2–4 are fine-tuned.
  • Separate branches for teacher and student networks.
  • MoE handles feature fusion before final classification.

⚙️ Step 6: Training Setup

from torch.utils.data import DataLoader
from sklearn.model_selection import train_test_split

train_df, test_df = train_test_split(df, test_size=0.2, stratify=df['label'], random_state=42)
train_dataset = FractureDataset(train_df, transform=train_transform)
test_dataset = FractureDataset(test_df, transform=test_transform)

train_loader = DataLoader(train_dataset, batch_size=128, shuffle=True)
test_loader = DataLoader(test_dataset, batch_size=128)
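
As a quick sanity check on these settings, the split and batch counts can be worked out by hand. The sketch below uses only numbers already given in this tutorial (4,650 images per class, a 20% test share, batch size 128) and assumes the DataLoader default of keeping the last partial batch.

```python
import math

total_images = 2 * 4650   # two balanced classes, from Step 1
batch_size = 128

n_test = total_images * 20 // 100   # 20% test share
n_train = total_images - n_test
test_per_class = n_test // 2        # stratified split keeps classes balanced

train_batches = math.ceil(n_train / batch_size)
test_batches = math.ceil(n_test / batch_size)

print(n_train, n_test, test_per_class)  # 7440 1860 930
print(train_batches, test_batches)      # 59 15
```

So each epoch runs 59 training batches, and the test set holds 930 images per class.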

🚀 Step 7: Training the Model

def train(model, train_loader, optimizer, scheduler, num_epochs):
    model.train()
    for epoch in range(num_epochs):
        total_loss = 0
        for images, labels in train_loader:
            ...
        print(f'Epoch {epoch+1}, Loss: {total_loss / len(train_loader):.4f}')

You’ll see this kind of output:

Epoch 1, Loss: 2.1966
Epoch 2, Loss: 0.2839
...
Epoch 5, Loss: 0.0366
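
The loop body is elided above, so the exact loss used here is not shown. One common way to train a teacher-student pair is a knowledge-distillation loss: cross-entropy on the true labels plus a KL term that pulls the student's softened predictions toward the teacher's. The sketch below illustrates that general technique; the function name, `temperature`, and `alpha` are assumptions, not values from the original code.

```python
import torch
import torch.nn.functional as F


def distillation_loss(student_logits, teacher_logits, labels, temperature=2.0, alpha=0.5):
    """Cross-entropy on labels plus KL divergence to the teacher's softened outputs."""
    hard_loss = F.cross_entropy(student_logits, labels)
    soft_loss = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=1),
        F.softmax(teacher_logits.detach() / temperature, dim=1),
        reduction="batchmean",
    ) * temperature ** 2
    return alpha * hard_loss + (1 - alpha) * soft_loss


# Toy check with random logits for a batch of 4 images and 2 classes.
torch.manual_seed(0)
student_logits = torch.randn(4, 2, requires_grad=True)
teacher_logits = torch.randn(4, 2)
labels = torch.tensor([0, 1, 1, 0])

loss = distillation_loss(student_logits, teacher_logits, labels)
loss.backward()  # gradients reach the student only; the teacher is detached
print(f"loss = {loss.item():.4f}")
```

Inside a training loop, this value would be the per-batch loss that gets added to `total_loss` after `optimizer.step()`.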

📈 Step 8: Evaluate the Model

def evaluate(model, test_loader):
    ...

Results:

  • Accuracy: 99.09%
  • Confusion Matrix: [[920 10] [7 923]]
  • Classification Report: Precision, Recall, F1-score: 0.99 for both classes
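
The reported accuracy and per-class scores follow directly from the confusion matrix. Here is a small worked check in plain Python. It assumes the scikit-learn layout (rows are true classes, columns are predicted classes) with class order 0 = Non-Fractured, 1 = Fractured, as in the dataset's `label_map`.

```python
# Confusion matrix from the results above.
# Assumption: rows are true classes, columns are predictions, class order [0, 1].
cm = [[920, 10],
      [7, 923]]

total = sum(sum(row) for row in cm)
accuracy = (cm[0][0] + cm[1][1]) / total


def precision_recall_f1(cm, k):
    true_pos = cm[k][k]
    predicted_k = cm[0][k] + cm[1][k]   # column sum: everything predicted as class k
    actual_k = cm[k][0] + cm[k][1]      # row sum: everything that truly is class k
    precision = true_pos / predicted_k
    recall = true_pos / actual_k
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


print(f"accuracy = {accuracy:.4f}")  # accuracy = 0.9909
for k in (0, 1):
    p, r, f = precision_recall_f1(cm, k)
    print(f"class {k}: precision={p:.2f} recall={r:.2f} f1={f:.2f}")  # 0.99 each
```

Of the 1,860 test images, 1,843 are classified correctly, which gives the 99.09% figure.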

🧠 Why It Works So Well

  • Teacher-Student training helps transfer knowledge across branches.
  • MoE fusion smartly integrates complementary features.
  • Transfer learning with ResNet-18 makes it data-efficient.
  • Augmentation improves generalization.

🧰 Tech Stack Recap

  • 🧠 PyTorch for deep learning
  • 🏗 ResNet-18 for backbone
  • 🔀 Data Augmentation with torchvision
  • 📈 scikit-learn and seaborn for metrics/plots
  • 📦 torchinfo + torchviz for architecture insights

✅ Conclusion

FractureGuard is more than a single model: it is a step toward automated, AI-assisted medical diagnostics. By combining a Teacher-Student CNN, MoE fusion, and transfer learning in a compact implementation, it shows how deep learning can support healthcare.



🔄 Next Steps

  • Try it on real-world hospital datasets
  • Extend to multi-class classification (e.g., hairline vs compound fractures)
  • Deploy using FastAPI or Streamlit

Posted by Ananya Rajeev

Ananya Rajeev is a Kerala-born data scientist and AI enthusiast who simplifies generative and agentic AI for curious minds. B.Tech grad, code lover, and storyteller at heart.