What is CV?

Computer Vision Technology Explained

A couple of years ago I moved into a new apartment. I was renting at the time, and the place came semi-furnished. Near the end of the walk-through, my landlord told me to do my best to keep the houseplants that came with the place alive. Simple if you know plants, I guess, but I didn’t, and that was all the info I was given. I wouldn’t say I’m not a plant person, but I had never really kept houseplants, so I was a bit nervous about that requirement.

To keep my nerves in check, I started googling to find out what kind of plants they were. This included (now hilarious) searches for phrases like “houseplant with teardrop shaped leaves”, “plant with teardrop shaped leaves -vine”, and “common houseplants”, and the list goes on. After about 20 minutes (okay, maybe an hour), these searches had narrowed it down a bit. Even after my rigorous research, I was still unsure how to take care of the plants and when to water them.

It was at this point that I remembered an app I had seen on my phone for this exact situation. I downloaded “PictureThis – Plant Identifier” [1] from the iPhone App Store, opened it, and followed the instructions to take a photo of the plant. The app handled the rest. Within 2 minutes, it showed me about 3 plants that mine might be. From there I was able to choose the one that matched perfectly and learn about the plant and how to care for it. I was amazed, and then ashamed that my lackluster Google skills had not turned this up sooner. Fast forward to now: after doing some research, I found that this technology is called Computer Vision.

[2]

(Side note: I have since moved out of that place, but the plants live on!)

[3]

Computer Vision (CV) is not a new technology. It grew out of university research into Artificial Intelligence (AI) in the late 1960s. Yes! Long before the internet was in our hands. As a form of AI, CV aims to mimic the way humans see so that computers or “robots” can develop intelligent behavior. This is either extremely interesting or terrifying, depending on your outlook. Anyway, now that I have you thinking about an army of killer AI robots that can see you and analyze you with their eyes, let’s jump into how and why CV is actually used today.

Without getting into extreme detail (that will follow in part 2 of this blog), I will try to explain how and why CV is used today. Before researching the topic, I was unaware of many of the processes CV is used in. It can simplify many aspects of our lives. The biggest example that comes to mind is self-driving cars, such as those made by Tesla. They use a form of CV that recognizes and reads road signs and objects picked up by the 360-degree camera views on the vehicle. More commonly, CV is used in simple applications that can make everyday (or maybe every-other-day...) life simpler. Apps like PictureThis - Plant Identifier can save you or your pets from eating toxic plants (provided, of course, you have cell service).

CV can use machine learning techniques, a colossal topic in and of itself. With a machine learning approach, programs are developed to identify patterns in pictures or videos. The machine learning algorithm (the calculation path that the code follows) sorts the media and detects the objects within it.
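
To make "identifying patterns" a little more concrete, here is a minimal sketch in Python. It is not how any real app works. It assumes tiny 3x3 "images" that I made up, and it uses one of the simplest possible learning methods (averaging the examples for each label, then picking the closest average). The names `TRAINING`, `train`, and `classify` are my own inventions for illustration.

```python
# Toy "images": 3x3 grids flattened to 9 brightness values (0 = dark, 1 = bright).
# All data and labels here are invented for illustration.
TRAINING = {
    "vertical_bar": [
        [0, 1, 0, 0, 1, 0, 0, 1, 0],
        [0, 1, 0, 0, 1, 0, 0, 1, 1],
    ],
    "horizontal_bar": [
        [0, 0, 0, 1, 1, 1, 0, 0, 0],
        [0, 0, 1, 1, 1, 1, 0, 0, 0],
    ],
}


def average_image(images):
    """Average the examples pixel by pixel to get one 'typical' image."""
    count = len(images)
    return [sum(pixels) / count for pixels in zip(*images)]


def train(examples_by_label):
    """Learn one average image per label from the labeled examples."""
    return {label: average_image(images) for label, images in examples_by_label.items()}


def distance(a, b):
    """Sum of squared pixel-by-pixel differences between two images."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def classify(model, image):
    """Return the label whose average image is closest to the new image."""
    return min(model, key=lambda label: distance(model[label], image))


model = train(TRAINING)
new_photo = [0, 1, 0, 0, 1, 0, 1, 1, 0]  # a slightly noisy vertical bar
print(classify(model, new_photo))
```

The new photo does not exactly match any training example, but it is still labeled by whichever learned pattern it resembles most.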

Another technique CV uses is deep learning (another colossal topic, and itself a branch of machine learning). With a deep learning approach, apps are built around a neural network, a model made of layers of simple connected units. These applications work by identifying patterns in given examples. The more examples given, the more accurately the program can identify what it is looking at. This is the typical approach for app development because it is easier to deploy.
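
For the curious, here is a toy sketch of what "layers" means in a neural network. Big assumptions: the "image" is only two pixels, and the weights are hand-picked by me rather than learned from examples, which is the part a real deep learning system would do on its own. Every name here is hypothetical.

```python
def step(value):
    """Activation: the unit 'fires' (1) only if its input is positive."""
    return 1 if value > 0 else 0


def run_layer(inputs, weights, biases):
    """Each unit takes a weighted sum of the inputs, adds a bias, then activates."""
    return [
        step(sum(w * x for w, x in zip(unit_weights, inputs)) + bias)
        for unit_weights, bias in zip(weights, biases)
    ]


# Hand-picked (not learned) weights for a toy two-pixel "image".
HIDDEN_WEIGHTS = [[1, 1], [1, 1]]
HIDDEN_BIASES = [-0.5, -1.5]  # unit 1: "any pixel lit"; unit 2: "both pixels lit"
OUTPUT_WEIGHTS = [[1, -1]]
OUTPUT_BIASES = [-0.5]        # fires when "any lit" is on but "both lit" is off


def exactly_one_pixel_lit(image):
    """Pass the image through the hidden layer, then the output layer."""
    hidden = run_layer(image, HIDDEN_WEIGHTS, HIDDEN_BIASES)
    return run_layer(hidden, OUTPUT_WEIGHTS, OUTPUT_BIASES)[0]


for image in ([0, 0], [0, 1], [1, 0], [1, 1]):
    print(image, "->", exactly_one_pixel_lit(image))
```

The first layer picks out simple patterns and the second layer combines them into a more specific one. Real networks stack many more layers and tune the weights automatically from examples.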

CV works through the following steps:

Recognition/Detection

The computer works to identify and interpret what it is "looking at". This could be road signs, words, plants, household items, etc.

Reconstruction/Generation

Using the visuals it is analyzing, the computer detects shapes, motions, perspectives and more to map out models.

Registration/Classification

The computer puts different sets of media/data into one classification system. This is the "meat and potatoes" of how CV recognizes what it is looking at in the future. The more registered and classified data it has or has learned from, the better it will be at recognition and detection.

Reorganization/Similarity Estimation

In this step, the computer works to reorganize an image into items or objects that it recognizes. This helps remove background "noise" from an image so that only what the user wants analyzed gets analyzed. For example, if two objects are pictured in the image you want CV to analyze, it separates the image and classifies either both objects or the more important/showcased one.
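
As a rough illustration of that last step, here is a small Python sketch that separates a picture into objects and picks the more prominent one. It assumes a made-up grayscale grid and a brightness cutoff I chose myself (`THRESHOLD`). Grouping touching bright pixels is just one simple technique for this, not necessarily what any particular app does.

```python
# Toy grayscale image: 0 = dark background, higher = brighter. Values are invented.
IMAGE = [
    [0, 0, 0, 0, 0, 0],
    [0, 9, 9, 0, 0, 0],
    [0, 9, 9, 0, 1, 0],
    [0, 9, 0, 0, 8, 0],
    [0, 0, 0, 0, 8, 0],
]
THRESHOLD = 5  # assumed cutoff: anything at or below this counts as background noise


def find_objects(image, threshold):
    """Group bright pixels that touch (up/down/left/right) into separate objects."""
    rows, cols = len(image), len(image[0])
    seen = set()
    objects = []
    for r in range(rows):
        for c in range(cols):
            if image[r][c] <= threshold or (r, c) in seen:
                continue
            # Flood fill outward from this bright pixel to collect its neighbors.
            stack, pixels = [(r, c)], []
            seen.add((r, c))
            while stack:
                y, x = stack.pop()
                pixels.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and image[ny][nx] > threshold
                            and (ny, nx) not in seen):
                        seen.add((ny, nx))
                        stack.append((ny, nx))
            objects.append(pixels)
    return objects


objects = find_objects(IMAGE, THRESHOLD)
main_object = max(objects, key=len)  # treat the largest object as the showcased one
print(len(objects), "objects found; the largest covers", len(main_object), "pixels")
```

The faint pixel with value 1 is dropped as noise, and the two remaining blobs are handed on as separate objects.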

Demonstrations of everyday use of CV

Unfortunately, I don't own a Tesla or know anyone who does, so I will focus the following section on demonstrating a few applications I can simply download on my iPhone.

#1 PictureThis - Plant Identifier

The PictureThis - Plant Identifier app uses CV to analyze photos you take of plants. The app guides you through the process, which is very simple. The real work is done on the backend, where AI analyzes your photo and compares it to database results to try to identify what plant it is looking at. Once the AI narrows down what it is looking at, it pulls information about the resulting plant from its database and supplies it to the application for the viewer to interact with.
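
I have no idea how PictureThis does this internally, but the general idea of "compare the photo to a database and return a few candidates" can be sketched like this. Everything below is an assumption: the `PLANT_DATABASE` entries, the three-number "feature" lists standing in for what a real model would extract from a photo, and the choice of cosine similarity as the comparison.

```python
import math

# Hypothetical database: each plant is summarized by a made-up feature list.
PLANT_DATABASE = {
    "Pothos": [0.9, 0.2, 0.7],
    "Peace lily": [0.3, 0.9, 0.4],
    "Snake plant": [0.1, 0.3, 0.95],
    "Rubber plant": [0.6, 0.5, 0.5],
}


def cosine_similarity(a, b):
    """Score from 0 to 1 (for non-negative features): higher means more alike."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_matches(photo_features, database, count=3):
    """Rank every plant by similarity to the photo and keep the best few."""
    scored = [
        (cosine_similarity(photo_features, features), name)
        for name, features in database.items()
    ]
    scored.sort(reverse=True)
    return [name for _, name in scored[:count]]


photo = [0.85, 0.25, 0.65]  # invented features for the photo I just took
print(top_matches(photo, PLANT_DATABASE))
```

Returning the top three rather than a single answer mirrors my experience with the app: it showed me a few likely plants and let me pick the one that matched.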

#2 Google Translate - Word Lens

The Word Lens feature in the Google Translate app uses CV to translate text on the fly. It starts by recognizing the words it "sees" and then pastes the translation onto the camera image the user is looking at. The user can select the language the sign is in (or auto-detect) and the language the sign or text should be translated to. As long as the displayed text is clear, this use of CV is very precise, due to Google's large translation database.

#3 Neural Vision

The Neural Vision app uses CV to detect objects in images and then classify them. The object detection and image classification models can be swapped out for whatever machine learning model the user would like to test, making this app more of a testing or developer platform.
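
To show what "swapping out the model" might look like in code, here is a hedged sketch. It is not Neural Vision's actual design. The two "models" are trivial stand-in functions I wrote (`brightness_model` and `shape_model`), not real machine learning models. The point is only that the app code stays the same as long as every model accepts an image and returns a label.

```python
from typing import Callable, List

Image = List[List[int]]            # rows of pixel values (0 = dark, 1 = bright)
Classifier = Callable[[Image], str]


def brightness_model(image: Image) -> str:
    """Stand-in model: labels the scene by average brightness."""
    total = sum(sum(row) for row in image)
    pixels = sum(len(row) for row in image)
    return "bright scene" if total / pixels > 0.5 else "dark scene"


def shape_model(image: Image) -> str:
    """Stand-in model: labels the image by its proportions."""
    height, width = len(image), len(image[0])
    return "wide image" if width > height else "tall or square image"


def analyze(image: Image, model: Classifier) -> str:
    """The app code: it does not care which model it was handed."""
    return model(image)


sample = [[1, 1, 1, 0], [1, 1, 0, 0]]
for model in (brightness_model, shape_model):
    print(model.__name__, "->", analyze(sample, model))
```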

As a conclusion to this blog post (remember it's 1 of 2 posts on CV), I have included a video below describing some more aspects and uses of CV. Thanks for reading!