Literature Survey

Research Domain

Background on IoT threat detection, ML-based intrusion detection, and edge-cloud hybrid architectures, grounded in the SafeNode IDS implementation.

The rapid expansion of the IoT ecosystem has greatly enlarged the attack surface. IoT devices are constrained in compute and memory, which makes traditional security mechanisms impractical to run on the devices themselves. The primary threat categories (Mirai botnet variants, DDoS amplification, replay injection, ARP spoofing, and port scanning) form the attack taxonomy that SafeNode is designed to detect.

  • → Mirai exploits default credentials to enlist devices into botnets
  • → Replay attacks re-inject captured MQTT packets to manipulate sensor state
  • → Spoofing attacks forge source MAC/IP to bypass access controls
  • → DDoS floods overwhelm broker capacity, causing denial of service

Existing IDS solutions for IoT run either entirely in the cloud (high latency, bandwidth-intensive) or entirely at the edge (limited model complexity). No widely adopted system fuses a lightweight edge ML model with a deep learning cloud model in a real-time hybrid pipeline that uses live MQTT network traffic from actual IoT hardware.

How can a multi-layer intrusion detection system that combines lightweight gradient-boosted ML models at the network edge with a deep residual neural network in the cloud accurately detect and classify IoT-targeted cyberattacks (including replay, botnet, spoofing, and DDoS) in real time, without exhausting the computational resources of IoT edge nodes?

  • → Design a Scapy-based 61-feature flow extractor deployable on Raspberry Pi
  • → Train and deploy four parallel ONNX LightGBM models for edge inference
  • → Develop a ResNet-style deep learning model for 5-class cloud classification
  • → Implement winner-takes-all confidence fusion across all edge models
  • → Build a SOC-style real-time dashboard with verdict badge visualization
  • → Validate with live traffic from Arduino UNO + ESP32 sensor nodes
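The winner-takes-all fusion objective above can be illustrated with a minimal Python sketch. It assumes that each of the four edge models emits a label and a confidence in [0, 1], and that the single most confident prediction becomes the edge verdict. The `EdgePrediction` type, the `fuse_winner_takes_all` function, the model names, and the `min_confidence` cutoff are hypothetical and are not taken from the SafeNode code.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class EdgePrediction:
    model: str         # hypothetical name of the edge model that produced this result
    label: str         # predicted class, e.g. "ddos" or "benign"
    confidence: float  # model confidence in [0, 1]


def fuse_winner_takes_all(
    predictions: Iterable[EdgePrediction],
    min_confidence: float = 0.0,  # assumed cutoff; the source does not specify one
) -> Optional[EdgePrediction]:
    """Return the most confident prediction, or None if none reaches the cutoff."""
    candidates = [p for p in predictions if p.confidence >= min_confidence]
    if not candidates:
        return None
    # max() keeps the first element on ties, so model order acts as the tie-breaker.
    return max(candidates, key=lambda p: p.confidence)


# Example: four parallel edge models scoring the same flow (illustrative values).
flow_predictions = [
    EdgePrediction("model_a", "benign", 0.55),
    EdgePrediction("model_b", "replay", 0.62),
    EdgePrediction("model_c", "ddos", 0.91),
    EdgePrediction("model_d", "spoofing", 0.40),
]
winner = fuse_winner_takes_all(flow_predictions)
print(winner.model, winner.label, winner.confidence)
```

Taking only the top-scoring model keeps fusion cost negligible on a Raspberry Pi, at the price of ignoring agreement between models. A weighted vote would be an alternative design, which the source does not describe.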

SafeNode follows a three-phase methodology: (1) data collection and feature engineering from CICIoT2023 and BoT-IoT datasets; (2) model training with progressive noise augmentation to prevent feature dominance; (3) end-to-end pipeline integration with live hardware validation.
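The source does not spell out how progressive noise augmentation works, so the following is only one plausible reading, sketched in Python. It assumes that Gaussian noise is added to every feature, scaled by that feature's standard deviation, with the noise level rising linearly over a few stages so that a model cannot lean on the exact value of any single feature. The function names, the linear schedule, and the default parameters are all assumptions.

```python
import random
import statistics
from typing import List

Row = List[float]


def noise_schedule(stages: int, max_scale: float) -> List[float]:
    """Linearly increasing noise scales, one per stage (assumed schedule)."""
    return [max_scale * (i + 1) / stages for i in range(stages)]


def add_noise(rows: List[Row], scale: float, rng: random.Random) -> List[Row]:
    """Return a noisy copy of rows; noise std is scale times each feature's std."""
    stds = [statistics.pstdev(col) for col in zip(*rows)]
    return [
        [x + rng.gauss(0.0, scale * s) for x, s in zip(row, stds)]
        for row in rows
    ]


def progressive_augment(
    rows: List[Row], stages: int = 3, max_scale: float = 0.1, seed: int = 0
) -> List[Row]:
    """Keep the original rows and append one noisy copy per stage."""
    rng = random.Random(seed)
    augmented = [list(r) for r in rows]
    for scale in noise_schedule(stages, max_scale):
        augmented.extend(add_noise(rows, scale, rng))
    return augmented


# Toy data: two features, the second one constant (so it receives zero noise).
toy_rows = [[1.0, 10.0], [2.0, 10.0], [3.0, 10.0]]
toy_augmented = progressive_augment(toy_rows)
print(len(toy_augmented))
```

Scaling the noise by each feature's own spread keeps the perturbation comparable across features with very different ranges, such as packet counts and inter-arrival times.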

  • → Edge: tshark capture → FlowExtractor.py → 4× ONNX LightGBM inference
  • → Cloud: MQTT relay → DLInferenceService.py → FastAPI fusion → Dashboard
  • → Verdict badges: ML Only | DL Only | Both | Clean

  • Edge ML: LightGBM, ONNX Runtime, Scapy, paho-mqtt
  • Cloud DL: TensorFlow/Keras (ResNet architecture), FastAPI, PostgreSQL
  • Hardware: Raspberry Pi 4, Arduino UNO + W5500, ESP32, HC-SR501 PIR, Smoke Detector
  • Dashboard: TypeScript, React, Axios, JWT Authentication
  • Broker: Mosquitto MQTT (MQTTv311) on Raspberry Pi
  • Data: CICIoT2023, BoT-IoT (Final_5_Class_Cleaned.csv)
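The four verdict badges listed in the pipeline above (ML Only, DL Only, Both, Clean) suggest a simple mapping from the two detection layers to a dashboard label. The Python sketch below assumes that each badge records which layer flagged a flow as malicious. The `verdict_badge` function and its boolean inputs are hypothetical and do not reflect the actual FastAPI fusion code.

```python
def verdict_badge(ml_flagged: bool, dl_flagged: bool) -> str:
    """Map edge ML and cloud DL detections to one of the four dashboard badges."""
    if ml_flagged and dl_flagged:
        return "Both"
    if ml_flagged:
        return "ML Only"
    if dl_flagged:
        return "DL Only"
    return "Clean"


# Enumerate every combination of layer outcomes (assumed semantics).
for ml in (False, True):
    for dl in (False, True):
        print(f"edge ML={ml!s:<5} cloud DL={dl!s:<5} -> {verdict_badge(ml, dl)}")
```

Under this reading, a "Both" badge marks the highest-confidence alerts, while "ML Only" and "DL Only" mark flows where the two layers disagree and an analyst may want to take a closer look.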
System Overview

System Architecture

End-to-end pipeline from IoT edge node to cloud inference and SOC dashboard.

[Figure: SafeNode system architecture diagram]