Sometimes, as humans, we take all of our senses for granted. For instance, we can see the blue ocean waves crashing down on a brown sandy beach, all the while feeling the cold sea air wrap around us while we hear our loved ones play and laugh.
Computers are not so lucky. They only understand numbers, and we have to use mathematics to replicate the sensations we experience for them. Of course, there are also some very cool AI techniques involved!
For now, let’s only focus on how we allow computers to see and make sense of things.
Computers do not have eyes, so we need to upload pictures of the world for them to interpret what is going on. This is where the maths comes in. A picture saved on a computer is actually just a bunch of numbers stored in rows and columns. Each of these numbers represents the intensity of the pixel at that row and column. We call this grid of rows and columns a matrix.
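To make this concrete, here is a minimal sketch (using a made-up 4×4 grayscale image, not any real picture) of what a computer actually stores:

```python
import numpy as np

# A hypothetical 4x4 grayscale image: each number is the intensity of
# one pixel, from 0 (black) to 255 (white).
image = np.array([
    [  0,  50, 100, 150],
    [ 50, 100, 150, 200],
    [100, 150, 200, 250],
    [150, 200, 250, 255],
])

print(image.shape)   # (4, 4): 4 rows and 4 columns
print(image[2, 1])   # the intensity of the pixel in row 2, column 1
```

A real photograph works the same way, just with millions of pixels (and three matrices for colour images: one each for red, green, and blue).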
So with this knowledge, we now know that a computer does not really see an image, but sees its matrix. Because we are working with matrices, we can apply heaps of linear algebra, calculus, and AI techniques to them to create a new single row of numbers called a vector. The techniques used will ensure that images (matrices) that look alike will have similar output vectors. We can then compare these output vectors to find similar images or classify what an image is. It is very common to refer to this vector as the embedding of the image. A very fancy fingerprint, I must say!
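As a toy illustration of "matrix in, short vector out", here is a sketch that condenses an image into a two-number embedding using two hand-picked features, average brightness and contrast. This is an assumption for illustration only; real engines like AIIS learn far richer embeddings with AI models, but the shape of the idea is the same:

```python
import numpy as np

def embed(image):
    """A toy embedding: squash the whole pixel matrix down to two numbers,
    average brightness and contrast (standard deviation of the pixels)."""
    return np.array([image.mean(), image.std()])

# A made-up bright, low-contrast image.
beach = np.array([[200, 210, 190, 205],
                  [195, 200, 210, 200],
                  [180, 190, 200, 195],
                  [185, 195, 205, 200]], dtype=float)

print(embed(beach))  # a 2-number vector: the image's "fingerprint"
```

Whatever the size of the input matrix, the output vector has the same fixed length, which is what makes embeddings easy to compare.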
In the world of embeddings, we only care about distances. If one embedding is very close to another, we say they represent similar things; if they are far apart, they do not.
Luckily, this is very easy to do. Let's imagine that each of the embeddings we created consists of only two numbers. We can then plot these embeddings on a graph, as shown below. By just inspecting the graph, it is clear that John and Alexander are more alike than John and Ben, for instance. But for a computer, it's not so obvious. A computer has to calculate the distance between embeddings using techniques such as the Euclidean distance. This also only works if the embeddings you generate for each image are an accurate reflection of what is going on in the image.
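The calculation the computer does is short. Here is a sketch using made-up two-number embeddings for John, Alexander, and Ben (the actual values are assumptions for illustration):

```python
import numpy as np

# Hypothetical 2-number embeddings for the three people in the example.
john      = np.array([1.0, 2.0])
alexander = np.array([1.5, 2.2])
ben       = np.array([8.0, 9.0])

def distance(a, b):
    """Euclidean distance: the straight-line distance between two points
    on the graph."""
    return np.linalg.norm(a - b)

print(distance(john, alexander))  # small: similar
print(distance(john, ben))        # large: different
```

The smaller the distance, the more alike the two embeddings, and therefore the two people, are.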
All of the methods above are exactly what Scouts' groundbreaking AIIS engine is built to do. The AIIS engine is a custom-built embedding generation engine that uses very sophisticated techniques from the AI world to create embeddings for all types of data, not just images. We create a fingerprint (embedding) of all our Scouts so that we can find each brand the perfect match.
Seeing as AIIS is an AI engine, it teaches itself how to generate the perfect embedding. Below we show AIIS learning how to generate embeddings that can accurately represent the tweets made by a handful of celebrities. The plot shows the embeddings created by AIIS, coloured according to which celebrity made the tweet.
Below we see the final embeddings that AIIS learned for our celebrities. Now all that we still need to do is generate a fingerprint for any brand, and then we will be in a position to find the perfect Scout, all thanks to AIIS!
In the next blog in the series, we will go deeper into these AI techniques so that anyone can understand the AIIS engine.
- Ruan Van der Merwe: Master Machine Learning Engineer at Alphawave Scout.