import cv2
import numpy as np
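# Load the pre-trained YOLOv3 network from its Darknet weights and config files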
net = cv2.dnn.readNet('yolov3.weights', 'yolov3.cfg')
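# Read the COCO class names, one label per line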
classes = []
with open('coco.names', 'r') as f:
    classes = f.read().splitlines()
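# Load the input image and record its dimensions for rescaling the boxes later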
img = cv2.imread('image.jpg')
height, width, _ = img.shape
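# Preprocess the image into a 416x416 blob: scale pixel values to [0, 1] and swap BGR to RGB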
blob = cv2.dnn.blobFromImage(img, 1/255, (416, 416), (0, 0, 0), swapRB=True, crop=False)
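# Run a forward pass and collect the outputs of the YOLO output layers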
net.setInput(blob)
output_layers_names = net.getUnconnectedOutLayersNames()
layerOutputs = net.forward(output_layers_names)
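# Gather boxes, confidences and class IDs for every detection above the confidence threshold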
boxes = []
confidences = []
class_ids = []
for output in layerOutputs:
    for detection in output:
        # Each detection is [center_x, center_y, width, height, objectness, class scores...]
        scores = detection[5:]
        class_id = np.argmax(scores)
        confidence = scores[class_id]
        if confidence > 0.5:
            # Rescale the normalized box coordinates to the original image size
            center_x = int(detection[0] * width)
            center_y = int(detection[1] * height)
            w = int(detection[2] * width)
            h = int(detection[3] * height)
            # Convert from the box center to the top-left corner
            x = int(center_x - w / 2)
            y = int(center_y - h / 2)
            boxes.append([x, y, w, h])
            confidences.append(float(confidence))
            class_ids.append(class_id)
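# Non-maximum suppression drops overlapping boxes that cover the same object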
indexes = cv2.dnn.NMSBoxes(boxes, confidences, 0.5, 0.4)
font = cv2.FONT_HERSHEY_PLAIN
colors = np.random.uniform(0, 255, size=(len(boxes), 3))
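# Draw each surviving box with its class label and confidence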
# Guard against the case where NMS keeps no boxes and returns an empty result
if len(indexes) > 0:
    for i in indexes.flatten():
        x, y, w, h = boxes[i]
        label = str(classes[class_ids[i]])
        confidence = str(round(confidences[i], 2))
        color = colors[i]
        cv2.rectangle(img, (x, y), (x + w, y + h), color, 2)
        cv2.putText(img, label + " " + confidence, (x, y + 20), font, 2, (255, 255, 255), 2)
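# Show the annotated image until a key is pressed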
cv2.imshow('Image', img)
cv2.waitKey(0)
cv2.destroyAllWindows()