YOLOv2 object detection algorithm using Darkflow

This story shows how to perform object detection. It will be helpful if you plan to build an application that benefits from object detection. The YOLOv2 algorithm is used behind the scenes, but I have used an open source implementation, “Darkflow”, so you don’t need to worry about the details.
In short, you need four steps to perform object detection. There are some more details, so if you actually want to use my code, please visit my GitHub repository for this story, which contains a Jupyter notebook.

Installing Darkflow

You can visit Darkflow’s GitHub repository and explore the full installation guide. I will keep the instructions as simple as possible here.
$ pip install Cython
$ git clone https://github.com/thtrieu/darkflow.git
$ cd darkflow
$ python3 setup.py build_ext --inplace
$ pip install .

Downloading Weights

There are two ways to download the pre-trained weights. First, you can download them from the official YOLO project webpage. Second, you can download them from here, which hosts the Darkflow author’s own trained version. I decided to use the second method since I am using the Darkflow implementation. You can give the first method a shot if you want.
Red circle indicates the weight file you need to download

Building the Model

It is quite simple to build the model. First, you define an options object. Then, you instantiate a TFNet object with those options.
The options are a specification of the model and its environment. The screenshot below shows the full list of options you can specify. I am going to use the “model”, “load”, “threshold”, and “gpu” options for this story.
Darkflow Model Options
The model option specifies which model you want to use. There are pre-defined models, and I used yolo.cfg here. This file contains a full description of the model’s architecture.
The load option specifies which weight file you want to use. Since I suggested downloading yolo.weights from here, I specified that file. If you have your own pre-trained weight files, this is where you let the model know (for example, after you train on custom objects, your system will produce a specific weight file).
The gpu option is used when you have a GPU on your system. It speeds up prediction.
The threshold option is the minimum confidence value for keeping detected objects. I set it to 0.1, which is pretty low. However, I chose this value to see how many objects the model can detect, and I could filter the results later on.
After the model is fully loaded, Darkflow prints a summary of the loaded network to the console.
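The build step described in this section can be sketched roughly as follows. This is a minimal sketch, not necessarily the exact code from the notebook: the helper names make_options and build_model are hypothetical, and the paths cfg/yolo.cfg and bin/yolo.weights assume the files sit in those folders relative to where you run the code. The gpu value of 1.0 is also an assumption (Darkflow reads it as a fraction between 0.0 and 1.0).

```python
def make_options(model="cfg/yolo.cfg", load="bin/yolo.weights",
                 threshold=0.1, gpu=1.0):
    """Collect the four options discussed above into one dict.

    The default paths are assumptions; point them at your own files.
    """
    return {"model": model, "load": load, "threshold": threshold, "gpu": gpu}


def build_model(options):
    """Instantiate Darkflow's TFNet with the given options."""
    # Imported lazily so this file can be loaded without Darkflow installed.
    from darkflow.net.build import TFNet
    return TFNet(options)


options = make_options()
print(options)

# Uncomment once Darkflow is installed and the weight file is in place:
# tfnet = build_model(options)
```

Keeping the options in a plain dict makes it easy to try a different threshold or weight file without touching the rest of the code.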

Detecting Objects and Drawing Boxes on a Picture

A single call to the model’s return_predict method is everything you need to detect objects. Its only argument is an image represented as a NumPy array. If you print the result, you will see a list of objects, one per detection. What each key (label, confidence, topleft, bottomright) means is fairly straightforward.
I defined a boxing function so that I could reuse it for testing on both images and video. I used the opencv-python module to put boxes and labels on an image.
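Because the threshold was set low, it can help to filter the detection result afterwards. The sketch below assumes the result has the shape Darkflow’s return_predict produces (a list of dicts with label, confidence, topleft, and bottomright keys). The helper keep_confident and the 0.3 cut-off are hypothetical, and the sample values are made up for illustration, not real model output.

```python
def keep_confident(predictions, min_confidence=0.3):
    """Keep predictions at or above min_confidence, highest confidence first."""
    kept = [p for p in predictions if p["confidence"] >= min_confidence]
    return sorted(kept, key=lambda p: p["confidence"], reverse=True)


# Made-up values in the shape of a Darkflow detection result.
sample = [
    {"label": "dog", "confidence": 0.12,
     "topleft": {"x": 220, "y": 150}, "bottomright": {"x": 400, "y": 390}},
    {"label": "person", "confidence": 0.82,
     "topleft": {"x": 40, "y": 30}, "bottomright": {"x": 200, "y": 380}},
]

print([p["label"] for p in keep_confident(sample)])  # prints ['person']
```

With a real model, the same helper could be applied directly to the output, for example keep_confident(tfnet.return_predict(image)).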
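A boxing function along these lines could look like the sketch below. It is an illustration under assumptions rather than the exact function from the notebook: the helper describe, the min_confidence default of 0.3, the colors, and the font settings are all hypothetical choices. It relies only on cv2.rectangle and cv2.putText from opencv-python.

```python
def describe(prediction):
    """Return (top_left, bottom_right, label_text) for one prediction dict."""
    top_left = (prediction["topleft"]["x"], prediction["topleft"]["y"])
    bottom_right = (prediction["bottomright"]["x"],
                    prediction["bottomright"]["y"])
    text = "{} {:.3f}".format(prediction["label"], prediction["confidence"])
    return top_left, bottom_right, text


def boxing(image, predictions, min_confidence=0.3):
    """Draw a box and a label for each sufficiently confident prediction."""
    import cv2  # imported lazily; requires opencv-python

    annotated = image.copy()  # NumPy array copy, so the input stays untouched
    for prediction in predictions:
        if prediction["confidence"] < min_confidence:
            continue  # skip low-confidence detections
        top_left, bottom_right, text = describe(prediction)
        # Box with a 3-pixel border; the color is in the image's channel order.
        cv2.rectangle(annotated, top_left, bottom_right, (255, 0, 0), 3)
        # Label placed slightly above the top-left corner of the box.
        cv2.putText(annotated, text, (top_left[0], top_left[1] - 5),
                    cv2.FONT_HERSHEY_COMPLEX_SMALL, 0.8, (0, 230, 0), 1,
                    cv2.LINE_AA)
    return annotated
```

Returning a copy instead of drawing on the input makes it safe to call the function repeatedly on the same image with different confidence cut-offs.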
With the boxing function, you can display the resulting image.
You can also perform the same task on a video.
* Please check here for the video output formats supported on your operating system, and pass the matching code to the cv2.VideoWriter_fourcc function. In my case, I tested on Windows 10, so I specified *'DIVX'.
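The video case can be sketched as a frame loop: read a frame, detect, draw, write. The version below is only one possible arrangement. The names process_frames and annotate_video are hypothetical, draw stands for a boxing-style function that takes (frame, predictions), and the default 'DIVX' codec is the one mentioned in the note above, which may not suit every operating system.

```python
def process_frames(read_frame, write_frame, detect, draw):
    """Run detect and draw on every frame; return how many frames were written."""
    count = 0
    while True:
        ok, frame = read_frame()
        if not ok:  # end of the stream, or a read error
            break
        write_frame(draw(frame, detect(frame)))
        count += 1
    return count


def annotate_video(tfnet, draw, src_path, dst_path, codec="DIVX"):
    """Write a copy of the video at src_path with detections drawn on it."""
    import cv2  # imported lazily; requires opencv-python

    capture = cv2.VideoCapture(src_path)
    fps = capture.get(cv2.CAP_PROP_FPS)
    size = (int(capture.get(cv2.CAP_PROP_FRAME_WIDTH)),
            int(capture.get(cv2.CAP_PROP_FRAME_HEIGHT)))
    writer = cv2.VideoWriter(dst_path, cv2.VideoWriter_fourcc(*codec), fps, size)
    try:
        return process_frames(capture.read, writer.write,
                              tfnet.return_predict, draw)
    finally:
        # Release the files even if detection fails midway.
        capture.release()
        writer.release()
```

Splitting the loop from the OpenCV setup keeps the per-frame logic easy to check on its own, without needing a video file.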

Results and Evaluation

Here are some of the results from my experiments.
The results look good on various kinds of pictures. However, I see some drawbacks in the second and fourth pictures.
In the second picture, it looks like YOLO doesn’t expect a dog to be in the middle of a picture. I assume that many of the training pictures showed dogs on the ground.
YOLO is actually very good at detecting people. In the fourth picture, however, people in dynamic, unexpected poses do not seem to be easily detectable. Also, when too many objects overlap, it appears difficult to detect every single one of them.
