Object Detection Using MobileNet SSD

What is Object Detection?

Single Shot Detector (SSD)

SSD Network
  1. Extract feature maps: A CNN processes the image and produces feature maps, which capture the important regions and patterns in the image.
  2. Apply convolution filters to detect objects: Convolution filters run over the feature maps to classify the objects present in the image and predict the bounding boxes around them.
  • SSD300: 300×300 input image, lower resolution, faster.
  • SSD512: 512×512 input image, higher resolution, more accurate.

Performance

Results on the PASCAL VOC2007 test set, as reported in the SSD paper:
  • SSD300: 59 FPS with mAP 74.3%
  • SSD512: 22 FPS with mAP 76.9%
  • Faster R-CNN: 7 FPS with mAP 73.2%
  • YOLO: 45 FPS with mAP 63.4%

MobileNet

MobileNet Network
(a) Standard convolutional layer with batch normalization and ReLU. (b) Depthwise separable convolution with depthwise and pointwise layers, each followed by batch normalization and ReLU.
  1. Depthwise convolution
  2. Pointwise convolution
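
To see why splitting a convolution into these two steps saves computation, the sketch below counts multiply-add operations for both designs. The symbols `dk` (kernel size), `m` (input channels), `n` (output channels), and `df` (output feature map size) and the example layer sizes are assumptions chosen for illustration; they do not come from any specific MobileNet layer.

```python
def standard_conv_cost(dk, m, n, df):
    """Multiply-adds for a standard dk x dk convolution with m input
    channels, n output channels, and a df x df output feature map."""
    return dk * dk * m * n * df * df


def depthwise_separable_cost(dk, m, n, df):
    """Multiply-adds for a depthwise convolution followed by a pointwise one."""
    depthwise = dk * dk * m * df * df   # one dk x dk filter per input channel
    pointwise = m * n * df * df         # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


# Hypothetical layer: 3x3 kernels, 64 -> 128 channels, 56x56 output map.
dk, m, n, df = 3, 64, 128, 56
std = standard_conv_cost(dk, m, n, df)
sep = depthwise_separable_cost(dk, m, n, df)
print(f"standard:  {std:,} multiply-adds")
print(f"separable: {sep:,} multiply-adds")
print(f"ratio:     {sep / std:.4f}")  # equals 1/n + 1/dk**2
```

For this example layer the separable version needs roughly 8.4 times fewer multiply-adds, and the ratio depends only on the number of output channels and the kernel size.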

Depthwise Convolution:
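
A depthwise convolution applies one spatial filter to each input channel independently, without mixing channels. The following is a minimal NumPy sketch under simplifying assumptions (stride 1, no padding, no bias); the function name `depthwise_conv2d` and the array layout are illustrative choices, not part of any library.

```python
import numpy as np


def depthwise_conv2d(x, kernels):
    """Depthwise convolution with stride 1 and no padding.

    x:       input of shape (H, W, C)
    kernels: one k x k filter per channel, shape (k, k, C)
    Returns an array of shape (H - k + 1, W - k + 1, C).
    As in deep learning frameworks, the filter is not flipped.
    """
    h, w, c = x.shape
    k = kernels.shape[0]
    out_h, out_w = h - k + 1, w - k + 1
    out = np.zeros((out_h, out_w, c), dtype=x.dtype)
    for i in range(out_h):
        for j in range(out_w):
            patch = x[i:i + k, j:j + k, :]  # (k, k, C) window
            # Sum over the spatial axes only, so channels never mix.
            out[i, j, :] = np.sum(patch * kernels, axis=(0, 1))
    return out


x = np.arange(5 * 5 * 3, dtype=np.float64).reshape(5, 5, 3)
kernels = np.ones((3, 3, 3))
y = depthwise_conv2d(x, kernels)
print(y.shape)  # (3, 3, 3)
```

The number of output channels equals the number of input channels, which is why a second step is needed to combine information across channels.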

Pointwise Convolution:
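
A pointwise convolution is a 1×1 convolution: at every pixel it takes a weighted combination of the input channels to produce each output channel. A minimal NumPy sketch, assuming no bias term; the function name `pointwise_conv2d` and the weight layout are illustrative choices.

```python
import numpy as np


def pointwise_conv2d(x, weights):
    """1 x 1 convolution: a per-pixel linear map across channels.

    x:       input of shape (H, W, C_in)
    weights: shape (C_in, C_out)
    Returns an array of shape (H, W, C_out).
    """
    return np.einsum("hwc,cn->hwn", x, weights)


x = np.ones((4, 4, 3))
weights = np.arange(6, dtype=np.float64).reshape(3, 2)  # [[0, 1], [2, 3], [4, 5]]
y = pointwise_conv2d(x, weights)
print(y.shape)   # (4, 4, 2)
print(y[0, 0])   # [6. 9.]
```

Applied to the output of a depthwise convolution, this step mixes the per-channel results and sets the number of output channels.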

Combining MobileNet and Single Shot Detector (SSD)

Code Implementation

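import cv2
import numpy as np

thres = 0.45          # minimum confidence for a detection to be kept
nms_threshold = 0.2   # overlap threshold for non-maximum suppression

# Open the default webcam and set frame width, height, and brightness.
cap = cv2.VideoCapture(0)
cap.set(cv2.CAP_PROP_FRAME_WIDTH, 640)
cap.set(cv2.CAP_PROP_FRAME_HEIGHT, 480)
cap.set(cv2.CAP_PROP_BRIGHTNESS, 150)

# Load the class names, one per line.
classFile = 'coco.names'
with open(classFile, 'rt') as f:
    classNames = f.read().rstrip('\n').split('\n')
print(classNames)

# Load the pre-trained SSD MobileNet model and configure its input.
configPath = 'ssd_mobilenet_v3_large_coco_2020_01_14.pbtxt'
weightsPath = 'frozen_inference_graph.pb'
net = cv2.dnn_DetectionModel(weightsPath, configPath)
net.setInputSize(320, 320)
net.setInputScale(1.0 / 127.5)
net.setInputMean((127.5, 127.5, 127.5))  # with the scale above, maps pixels to [-1, 1]
net.setInputSwapRB(True)

while True:
    success, img = cap.read()
    if not success:
        break  # no frame available (camera disconnected or stream ended)

    classIds, confs, bbox = net.detect(img, confThreshold=thres)
    # Flatten the outputs so the code works whether OpenCV returns
    # (N, 1) or (N,) arrays; this differs between OpenCV versions.
    classIds = np.array(classIds).flatten()
    confs = [float(c) for c in np.array(confs).flatten()]
    bbox = [[int(v) for v in box] for box in bbox]

    # Drop overlapping boxes that cover the same object.
    indices = cv2.dnn.NMSBoxes(bbox, confs, thres, nms_threshold)
    for i in np.array(indices).flatten():
        x, y, w, h = bbox[i]
        cv2.rectangle(img, (x, y), (x + w, y + h), color=(0, 255, 0), thickness=2)
        # Class IDs returned by the model are 1-based, so subtract 1.
        cv2.putText(img, classNames[classIds[i] - 1].upper(), (x + 10, y + 30),
                    cv2.FONT_HERSHEY_COMPLEX, 1, (0, 255, 0), 2)

    cv2.imshow("Object Detection", img)
    if cv2.waitKey(1) & 0xFF == ord("q"):
        break

# Release the camera (on the capture object, not the class) and close windows.
cap.release()
cv2.destroyAllWindows()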
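
The detection loop above relies on cv2.dnn.NMSBoxes to discard duplicate boxes around the same object. To illustrate the idea, here is a simplified greedy non-maximum suppression in NumPy. It is a sketch of the general technique, not OpenCV's exact implementation; the helper names `iou` and `greedy_nms` and the sample boxes are made up for the example. Boxes use the same [x, y, w, h] layout as the code above.

```python
import numpy as np


def iou(box, others):
    """Intersection over union between one [x, y, w, h] box and an
    array of [x, y, w, h] boxes. Assumes boxes have non-zero area."""
    x1 = np.maximum(box[0], others[:, 0])
    y1 = np.maximum(box[1], others[:, 1])
    x2 = np.minimum(box[0] + box[2], others[:, 0] + others[:, 2])
    y2 = np.minimum(box[1] + box[3], others[:, 1] + others[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    union = box[2] * box[3] + others[:, 2] * others[:, 3] - inter
    return inter / union


def greedy_nms(boxes, scores, score_threshold, iou_threshold):
    """Keep the highest-scoring box, drop boxes that overlap it too much,
    and repeat with the remaining boxes. Returns indices of kept boxes."""
    boxes = np.asarray(boxes, dtype=float)
    scores = np.asarray(scores, dtype=float)
    # Candidate indices sorted by descending score, low scores removed.
    order = [i for i in np.argsort(-scores) if scores[i] > score_threshold]
    keep = []
    while order:
        best = order.pop(0)
        keep.append(int(best))
        if not order:
            break
        overlaps = iou(boxes[best], boxes[order])
        order = [i for i, o in zip(order, overlaps) if o <= iou_threshold]
    return keep


boxes = [[10, 10, 100, 100], [12, 12, 100, 100], [200, 200, 50, 50]]
scores = [0.9, 0.8, 0.7]
print(greedy_nms(boxes, scores, score_threshold=0.45, iou_threshold=0.2))  # [0, 2]
```

The second box overlaps the first almost completely and has a lower score, so it is suppressed; the third box does not overlap either and is kept. A low threshold such as 0.2 is aggressive: even moderately overlapping boxes are treated as duplicates.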

What are the drawbacks of Single Shot MultiBox Detector?

What alternative object detection frameworks can be used?
