Tracking Library for the Web

The story of how I built a Computer Vision library for the web in JavaScript during my master's degree.


My first bachelor's degree was in Electrical and Telecommunications Engineering. I enjoyed learning all the math, but I never worked in the area. I started working with software very early and have never stopped. Working alongside real computer scientists, I always felt like an outsider, perhaps a manifestation of impostor syndrome.

So, back in 2012, I decided to apply for an academic Master's program in Computer Science at one of the best universities in Brazil. I called a friend who was finishing the Master's program at the time and asked: "Who is the brightest professor at this university?" He answered, without hesitation, Silvio de Barros Melo, a Computer Graphics specialist. I promptly scheduled a meeting with the professor and presented my proposal.

He was excited by how much experience I already had building software for the web, and he mentioned that they were trying to solve a tough problem: building a Computer Vision library for the web using Flash with C/C++ bindings. I told him I could build it and, not only that, I could do it in pure JavaScript. I was secretly counting on the Media Capture and Streams specification, still a draft at the time, to capture frames from the user's camera and process them in a canvas element in real time.
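To make that idea concrete, here is a minimal sketch of the capture-and-process loop. It is not code from tracking.js: the names `findColorBounds` and `trackFromCamera`, the tolerance value, and the naive per-channel color match are all assumptions made for illustration. Only the browser calls (`navigator.mediaDevices.getUserMedia`, `drawImage`, `getImageData`, `requestAnimationFrame`) are standard APIs.

```javascript
// Hypothetical helper (not part of tracking.js): returns the bounding box of
// every pixel whose R, G and B values are within `tolerance` of `target`.
// `rgba` is a flat RGBA array, like the `.data` of a canvas ImageData object.
function findColorBounds(rgba, width, height, target, tolerance) {
  let minX = width, minY = height, maxX = -1, maxY = -1;
  for (let y = 0; y < height; y++) {
    for (let x = 0; x < width; x++) {
      const i = (y * width + x) * 4; // 4 bytes per pixel: R, G, B, A
      const match =
        Math.abs(rgba[i] - target.r) <= tolerance &&
        Math.abs(rgba[i + 1] - target.g) <= tolerance &&
        Math.abs(rgba[i + 2] - target.b) <= tolerance;
      if (match) {
        if (x < minX) minX = x;
        if (x > maxX) maxX = x;
        if (y < minY) minY = y;
        if (y > maxY) maxY = y;
      }
    }
  }
  if (maxX < 0) return null; // no pixel matched
  return { x: minX, y: minY, width: maxX - minX + 1, height: maxY - minY + 1 };
}

// Browser-only wiring (assumed setup: the caller passes a <video> and a
// <canvas> element). Each animation frame is drawn to the canvas, read back
// as pixels, and handed to the helper above.
async function trackFromCamera(video, canvas, target, onBounds) {
  const stream = await navigator.mediaDevices.getUserMedia({ video: true });
  video.srcObject = stream;
  await video.play();
  const ctx = canvas.getContext('2d');
  function loop() {
    ctx.drawImage(video, 0, 0, canvas.width, canvas.height);
    const frame = ctx.getImageData(0, 0, canvas.width, canvas.height);
    // Tolerance of 40 is an arbitrary illustrative value.
    onBounds(findColorBounds(frame.data, frame.width, frame.height, target, 40));
    requestAnimationFrame(loop);
  }
  requestAnimationFrame(loop);
}
```

Keeping the pixel logic in a pure function separate from the camera wiring means it can be exercised on any RGBA buffer without a browser or a webcam.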

Example 1 — Face tracking controlling 3D rendering at 44 FPS in the browser
Example 2 — Face detection using Viola Jones in the browser

After a strict selection process, I was accepted into the full-time program. I was working for Liferay as a Software Engineer at the time. I needed to make money to pay my bills and couldn't stop working, so I had to do both a full-time academic program and a full-time job. I had three long years ahead of me. Fortunately, I finished the program with honors. After the thesis presentation, one of the board members made the following comment: "This thesis is the difference between good and great." I am not sure if this translates well to English, but I was proud of myself when I heard such a compliment.

Eduardo Lundgren, Alex Barros, Silvio de Barros Melo, Veronica Teichrieb, João Marcelo, after thesis presentation — August 2013

The work resulted in a JavaScript library called tracking.js. I no longer actively maintain it, though it became popular, with 8,300 stars on GitHub and around 140,000 results on Google. In this post, I am publishing the final thesis as an embedded PDF so that people can use it as a reference.

Screenshot of
Tracking.js presentation tools — August 2013

The JavaScript implementation is open-source and available on GitHub; see the links below for the implementation of the different algorithms.

If you are interested in how I built this library, take a look at the thesis below or download it here; it explains all the details.

“In this dissertation, I have designed and implemented a tracking library for the web that brings different Computer Vision (CV) algorithms and techniques into the browser environment without requiring third-party plugins installation. By using new HTML5 specifications, it enables users to do real-time color tracking, face detection, and much more, all that with a lightweight core (~7 KB) and intuitive interface. It provides a common infrastructure to develop applications and to accelerate the use of CV techniques on the web in commercial products. The library involves different CV algorithms and technologies in the browser environment. Among the several methods available, some algorithms can be used for various applications, such as detecting faces, identifying objects and colors, and tracking moving objects.”