A Little Help Here? AI Comes to the Rescue of Video-Overloaded Construction Projects

A big construction project is like a celebrity. It gets a lot of attention. Everyone takes pictures and videos. For stakeholders, it would be nice if all those images could be stored in one place, labeled and shared. It would be even nicer if someone could study the collected images and videos to monitor the project’s progress; check on key milestones; spot potential problems, including safety issues; and so on.

Figure 1: Smartvid.io spotted workers not wearing safety colors in this photo. It will recognize objects and features at a construction site and label them automatically. (Image courtesy of Smartvid.io).

This may be the initial plan at many construction sites, but the sheer amount of video or pictures taken on a big construction project can quickly get out of hand. So, Josh Kanner, CEO and founder of Smartvid.io, has come up with what he thinks is the best idea of all. Instead of depending on people to pore over all the imagery, why not have the images analyzed by computer? Smartvid.io has developed a “smart photo and video management platform” that uses synthetic vision and deep learning to tell you important things about your project. For example:

  • Safety related: Are people on the job site wearing hard hats or safety glasses?
  • Management: You can monitor quality and track projects.

Figure 2. A camera on a drone can take a lot of video, all of which can be uploaded to the Smartvid.io platform where AI can apply “smart labels” to detect construction-related objects. (Image courtesy of Smartvid.io.)

I could find no stated limit on Smartvid.io’s website to the amount of imagery it can accept, so I’ll assume you can throw at it all the photos and video from multiple drone missions, GoPro cameras, your smartphones … whatever. You can view the results for the imagery you’ve uploaded on a mobile device. All the number crunching (AI shape recognition) is done in the cloud.

Smartvid.io uses deep learning to recognize hard hats, ductwork and other objects found on construction sites. It applies “smart tags” as its algorithms analyze images, so a particular stakeholder can easily find imagery related to their trade or interest, such as all the bits of video that show ductwork.
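
To make the tagging idea concrete, here is a minimal sketch of how tags attached to media could be turned into a lookup by trade or interest. It is not Smartvid.io's implementation or API: the file names, the labels and the functions `build_tag_index` and `find_media` are all hypothetical, and the detections are typed in by hand rather than produced by a deep learning model.

```python
from collections import defaultdict

# Hypothetical detections: media file -> labels a recognition model assigned.
# File names and labels are invented for illustration only.
DETECTIONS = {
    "drone_pass_01.mp4": ["ductwork", "hard hat"],
    "site_walk_02.jpg": ["hard hat", "ladder"],
    "level3_ceiling.jpg": ["ductwork"],
}


def build_tag_index(detections):
    """Invert a file -> tags mapping into tag -> sorted list of files."""
    index = defaultdict(set)
    for media, tags in detections.items():
        for tag in tags:
            index[tag].add(media)
    return {tag: sorted(files) for tag, files in index.items()}


def find_media(index, tag):
    """Return every media file carrying the given tag (empty list if none)."""
    return index.get(tag, [])


TAG_INDEX = build_tag_index(DETECTIONS)

# A mechanical contractor asks for everything that shows ductwork.
print(find_media(TAG_INDEX, "ductwork"))
```

Building the index once, as the tags are applied, means each stakeholder's search is a simple lookup rather than a fresh pass over all the imagery.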

A compute-intensive operation, deep learning requires some serious horsepower of the kind offered by banks of GPUs. Smartvid.io makes use of NVIDIA GPUs. The company gained some recognition as a deep learning practitioner when it was declared runner-up for the top prize at the GPU Technology Conference (GTC) 2017, NVIDIA’s annual developer meeting.

“We got a $125K prize out of that,” said Kanner.

The Autodesk Connection

Smartvid.io works with Autodesk’s BIM 360 Field. Autodesk was so enamored with Smartvid.io that it invested in the company, making Smartvid.io one of only 10 Forge partners in which it has taken a financial stake.

“We see what Josh is doing with Smartvid.io as the future of technology in construction,” said Sarah Hodges, Autodesk’s director of AEC products. “The amount of data that is created on construction sites has exploded in the last couple of years with the use of smartphones and mobile devices. Also, this is very much aligned with our research in big data.”

Along with the undisclosed financial investment Autodesk has made, Smartvid.io will now have access to Autodesk’s research labs, noted Hodges.

Autodesk had been taking its own stab at mining construction project data using machine learning and analytics with its Project IQ, although the data it was mining does not appear to include video. Smartvid.io may now let Autodesk complete the picture (pun intended).

(Kanner’s previous venture, Vela, was sold to Autodesk and went on to become BIM 360 Field.)

The Future of AI Image Analysis

“The library of shapes it can recognize is growing,” said Kanner. “We’re developing that as fast as we can.”

The technology to find things in ordinary photos is already pretty advanced. “Did you know you can find every picture of a dog on your iPhone?” asked Kanner.

I did not. I ask Siri. Sure enough, 50 pictures of pugs pop up.

Asked if future editions of Smartvid.io will use 3D scanning such as Google’s Tango, instead of or in addition to 2D imagery, Kanner said that is the plan. He may just be being polite, but emboldened by a vendor who appears to be taking requests, we press on. Can Smartvid.io point out intruders, or flag company personnel who are in restricted areas? I understand this would require facial recognition, but it doesn’t hurt to ask, right?
