Utilizing Video Analytics to Solve the Bark Beetle Infestation in Sweden

Wipro Tech Blogs
5 min read · Sep 24, 2021


Bark Beetle Infestation

Sweden is a country dominated by forests, and forestry forms the backbone of the national economy. The European spruce bark beetle, Ips typographus, measures only about 4 millimeters, yet in 2018 it damaged more wood than the record wildfires that ravaged Sweden. According to the Swedish Forest Agency, in 2018 alone this species destroyed about 3–4 million cubic meters of lumber, denting the national economy by around 625 million dollars [1][2][3]. Minimizing the damage from bark beetle infestation is therefore of paramount importance.

Photo by Geran de Klerk on Unsplash

Wipro has taken the initiative and set up an experiment that uses object detection to identify infected trees for treatment and to track forest health, enabling proactive action. Our primary goal in this experiment is to detect spruce trees infested with bark beetles. Healthy trees are green, infected or weakened trees turn pale yellow, and trees at the terminal stage of infection appear white. While conceptualizing and designing the cognition engine of our video-analytics pipeline, we experimented with a multitude of frameworks, such as OpenVINO, MediaPipe, enterprise cloud pipelines, OpenCV-based pipelines, and Rocket. Only Rocket could fulfill our requirements while providing a variety of novel features, which we elucidate below.
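The color bands above (green, pale yellow, white) can be illustrated with a toy classifier. Everything below is a hypothetical sketch: the thresholds and function name are our own illustration, not the detection logic actually used in the experiment, which relies on a trained object detection model.

```python
# Illustrative sketch only: map a tree crown's mean RGB colour to an
# infestation stage. The thresholds are hypothetical, chosen to mirror the
# green / pale-yellow / white progression described in the text.

def classify_crown(rgb):
    """Classify a mean crown colour (r, g, b in 0-255) by infestation stage."""
    r, g, b = rgb
    if g > r and g > b:                      # predominantly green foliage
        return "healthy"
    if r > 180 and g > 180 and b > 180:      # washed-out, near-white crown
        return "terminal"
    if r > g >= b:                           # pale-yellow discolouration
        return "weakened"
    return "unknown"

print(classify_crown((60, 140, 55)))    # prints "healthy"
print(classify_crown((210, 200, 90)))   # prints "weakened"
print(classify_crown((230, 230, 225)))  # prints "terminal"
```

In practice a learned detector is far more robust than fixed color thresholds, which is precisely why the experiment uses an object detection model rather than rules like these.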

What is Rocket?

Object detection is one of the most fundamental and vital computer vision tasks. The last few years have seen a large influx of detection models (e.g., YOLO), but they come with large computing and network costs [4]. The inference model is a crucial piece of the puzzle, yet there are other key components: video ingestion, frame preprocessing, resource management, modular serverless deployment, insights, and so on [4]. Moreover, video analytics use cases increasingly require efficient inference at the edge (in terms of computing and network resources) while maintaining adequate throughput with minimum latency, something most current video analytics frameworks struggle to deliver. Rocket [5], pioneered by Microsoft Research, solves these problems and more. Rocket is an end-to-end cascaded pipeline for efficiently processing and analyzing live video streams.

Why Rocket? Some Salient Features:

Rocket offers some significant advantages over its peers and is a clear winner for live video analytics due to the following features [4]:

  • Computing resources on the edge and cloud are used efficiently via a cascading pipeline of lightweight and heavy-duty models.
  • A variety of pipeline presets and deployment options are available for use cases ranging from object counting and object detection to object alerts.
  • Rocket can be deployed as a container on the edge, with REST endpoints for downstream data consumption and upstream return. Microsoft also provides the ability to augment, customize, or enhance the stock video pipelines into a tailored solution with the required model architecture.
  • Once deployed on the edge, Rocket can be used in conjunction with Azure IoT Hub and other Azure services to procure, analyze, and store the metadata of detected objects for further insights.
  • Microsoft is also working on novel methods like Focus and Ekya to incorporate fast querying and continuous learning, respectively. Continuous learning is an essential component, as any machine learning model built to emulate real-world scenarios suffers decreasing accuracy over time.
  • The basic pipeline consists of video preprocessing ➪ background subtraction ➪ cheap CNN, followed by a heavy model on the cloud or edge.
  • For example, in the object counting pipeline:

(a) Initially, background subtraction uses image processing to detect movement in each video frame. It is a low-cost filter and can give real-time results, even on CPUs.

(b) On detecting a change in the area of interest, a cheap CNN (i.e. a low-cost model with less accuracy and computation time) is used to detect the object of interest.

(c) If the confidence of the cheap CNN falls below a predefined threshold, a heavy, highly accurate CNN is called to confirm the results. This model may run on the edge or in the cloud, depending on the required latency and resources. Pipelines can also be split across CPU and GPU, invoking cheap models on the CPU and computationally heavy models on the GPU.

(d) The ability to segregate CPU and GPU computation across various edges is beneficial for optimization, efficient resource usage, scalability, and data privacy.
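Steps (a) through (c) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Rocket's actual API: frames are flat lists of pixel intensities, the "models" are plain functions returning a (label, confidence) pair, and the threshold value is hypothetical.

```python
# Minimal sketch of the cascaded pipeline: cheap filters run first, and the
# heavy model is invoked only when the cheap model is not confident enough.
# All names and values here are illustrative stand-ins, not Rocket's API.

CONF_THRESHOLD = 0.8   # hypothetical cut-off for trusting the cheap model


def background_changed(prev_frame, frame, pixel_delta=10, min_fraction=0.05):
    """Crude background subtraction: did enough pixels change noticeably?"""
    changed = sum(abs(a - b) > pixel_delta for a, b in zip(prev_frame, frame))
    return changed / len(frame) > min_fraction


def process(prev_frame, frame, cheap_cnn, heavy_cnn):
    """Run the cascade on one frame; return a label, or None if dropped."""
    if not background_changed(prev_frame, frame):
        return None                       # (a) drop static frames early
    label, confidence = cheap_cnn(frame)
    if confidence >= CONF_THRESHOLD:
        return label                      # (b) cheap CNN is confident enough
    return heavy_cnn(frame)[0]            # (c) escalate to the heavy model


# Toy demo: every pixel changes, the cheap model is unsure, so the heavy
# model makes the final call.
prev = [0] * 100
cur = [30] * 100
cheap = lambda f: ("spruce", 0.6)
heavy = lambda f: ("infected spruce", 0.95)
print(process(prev, cur, cheap, heavy))   # prints "infected spruce"
```

The payoff of this structure is that the expensive model only runs on the small fraction of frames the cheap stages cannot resolve, which is what makes real-time edge inference feasible.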

Rocket Workflow

How does Wipro utilize Rocket?

Experiment Architecture Diagram

Wipro uses Rocket as a vital component of our cognition engine for the bark beetle infestation experiment described above. We have integrated Rocket in our video analytics workflow and created an end-to-end insight-driven pipeline wherein we:

  • Orchestrate various video analytics pipelines using Airflow for containers.
  • Trigger the Rocket container on our simulated Nvidia edge to initiate the object detection pipeline and fetch the drone video stream.
  • Use the YOLOv3 [6] object detection pipeline to detect objects as per the required parameters, and then pass the inference JSON downstream to IoT Hub.
  • Capture the inference JSON from IoT Hub using Stream Analytics to filter, clean, and store it.
  • Obtain deep insights to interpret the inference data using the Power BI dashboard with interactive graphs and a Q&A prompt to get answers using natural language.
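The filtering step in the list above can be sketched as follows. The message shape is a hypothetical stand-in modeled on typical YOLO output; the exact schema Rocket emits may differ, and the class names and threshold are our own illustration.

```python
# Hypothetical sketch of the stream-side filtering step: parse an inference
# JSON message and keep only confident detections of the class of interest.
# The message schema, class names, and threshold are illustrative only.
import json

message = json.dumps({
    "frame": 1042,
    "detections": [
        {"class": "infected_spruce", "confidence": 0.91, "bbox": [12, 40, 96, 180]},
        {"class": "healthy_spruce", "confidence": 0.88, "bbox": [210, 35, 80, 160]},
        {"class": "infected_spruce", "confidence": 0.42, "bbox": [400, 60, 70, 150]},
    ],
})


def filter_detections(raw, wanted_class, min_conf=0.5):
    """Keep detections of wanted_class at or above the confidence floor."""
    payload = json.loads(raw)
    return [d for d in payload["detections"]
            if d["class"] == wanted_class and d["confidence"] >= min_conf]


hits = filter_detections(message, "infected_spruce")
print(len(hits))   # prints 1: only the confident infected-spruce detection survives
```

In our actual deployment this cleaning happens in Stream Analytics before the data reaches storage and Power BI, but the logic is conceptually the same.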

Issues with Rocket:

Despite its brilliant and novel architecture, there are some cons you should know before you start working with Rocket:

  • Rocket is intertwined with the Azure ecosystem, which makes porting it to another stack strenuous.
  • Despite the detailed setup instructions, creating the containerized pipeline is not user-friendly.
  • Some features, like Focus and Ekya, are still in development and not yet open-sourced. With concept drift being an imminent issue in real-life ML/DL systems, continuous learning is essential.
  • Like many other video-analytics frameworks, Rocket fails to provide any explanation for the outputs of its detection models.
  • The default pipeline ships with models trained on the COCO dataset, although it supports custom model deployments. However, custom models must be deployed in ONNX format, a task our team found painstakingly difficult and counterintuitive even after weighing the benefits of ONNX. Furthermore, custom model deployment required changes to rather esoteric source code.

So what do you think about Rocket? Are you interested in trying it out? Write your thoughts in the comments below!
