INTRODUCTION

In the previous post, we talked about the significance of numbers and their representations in AI. If you haven’t read it yet, I strongly recommend that you do, because this post builds on the same idea. It explains why numbers and their geometric representations can be applied to almost anything: words, images, audio, statistics, and more.

THINK BACK TO THE PREVIOUS POST

You might recall seeing the following image:

What we have here is 2D geometry.

  • There are (x,y) points here, such as (5,1) and (10,2).
  • There is a line that marks a pattern of interaction across all the points. 

If we’re using geometric representations, then all geometric operations should make sense, right? For example, distance between two points. That should make sense. 

Consider 2 points from the image, say (5,1) and (10,2).

  • (5,1) = service score: 5, food score: 1.
  • (10,2) = service score: 10, food score: 2.

These are not physical positions. 

(5,1) does not mean that you went 5 units in the x direction (horizontal direction) and then 1 unit in the y direction (vertical direction) to get to the coordinate.

So what does distance between two points mean then? What does distance between one (service, food) point and another (service, food) point mean? 

At first, it seems like distance between 2 points means nothing in this case. But, when you look closely, distance does have meaning here. Yes, (service, food) points are not coordinates in the most natural sense. But there is a notion of distance.

For example, between (5,1) and (10,2):

  • There is a distance of 5 in the x direction (10 – 5).
  • This simply means that x = 5 (service score: 5) is 5 steps away from reaching x = 10 (service score: 10).
  • This is distance.
  • Similarly, there is a distance of 1 in the y direction (2 – 1).
  • This means that y = 1 (food score: 1) is 1 step away from reaching y = 2 (food score: 2).
  • Again, this is distance too.

Thus, we see that even when (x,y) don’t actually represent physical position coordinates, there is a notion of distance between points. 
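
The per-axis steps described above can be sketched in a few lines of Python. This is only an illustration: the function names `axis_steps` and `straight_line_distance` are made up for this sketch, and combining the two axes into a single number with the Euclidean formula is an assumption, since the text only talks about steps along each axis.

```python
import math

def axis_steps(p, q):
    """Return the absolute difference along each axis between points p and q."""
    return tuple(abs(b - a) for a, b in zip(p, q))

def straight_line_distance(p, q):
    """Euclidean distance: one common way to combine the per-axis steps."""
    return math.dist(p, q)

a = (5, 1)    # service score 5, food score 1
b = (10, 2)   # service score 10, food score 2

print(axis_steps(a, b))              # (5, 1)
print(straight_line_distance(a, b))  # sqrt(5**2 + 1**2) = sqrt(26), about 5.1
```

The first result matches the bullet points: 5 steps in the service direction and 1 step in the food direction.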

Just like distance, other geometric operations also have meaningful notions for almost anything. Therefore, it does make sense to represent almost anything on a geometric plane. 

IT DOESN’T STOP THERE

Recently, someone used a geometric representation to demonstrate equivalence between (Germany, Berlin) and (France, Paris). 

It seems as though there is no way that distance or any other geometric concept could apply to country names and capital names. But it did.

The algorithm’s representation would have captured something like this: Germany is roughly the same distance away from Berlin, and in roughly the same direction, as France is from Paris. So, without any concept of what a country or a capital is, the algorithm was able to establish equivalence between (Germany, Berlin) and (France, Paris).
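
To make the idea concrete, here is a toy sketch in Python. The four 2D vectors below are invented purely for illustration and do not come from any real model; a real system would learn vectors with many more dimensions, and the two offsets would be only approximately equal rather than identical. The helper name `offset` is also hypothetical.

```python
# Hypothetical 2D vectors, invented for illustration only.
# A real model would learn its vectors from large amounts of text.
vectors = {
    "Germany": (1.0, 4.0),
    "Berlin":  (3.0, 1.0),
    "France":  (2.0, 5.0),
    "Paris":   (4.0, 2.0),
}

def offset(src, dst):
    """Per-axis difference from the src vector to the dst vector."""
    return tuple(d - s for s, d in zip(vectors[src], vectors[dst]))

germany_to_berlin = offset("Germany", "Berlin")
france_to_paris = offset("France", "Paris")

print(germany_to_berlin)                     # (2.0, -3.0)
print(france_to_paris)                       # (2.0, -3.0)
print(germany_to_berlin == france_to_paris)  # True
```

In this toy setup, getting from a country to its capital always means the same move on the plane, which is the kind of equivalence described above.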

This is just one example of how numbers and their geometrical representations can be used to meaningfully model anything — country names, capital names, food scores, service scores — you name it. 

LET’S DO ANOTHER EXAMPLE 

Suppose you want to decide whether you have disease A. Let’s say disease A depends on two factors: fever and virus A. Note here that you don’t care about the degree of the fever. You just care whether there is or isn’t a fever. Similarly, you don’t care about the degree of virus A. You just care whether virus A is or isn’t there.

Logically, you are only interested in the following possibilities:

  1. Fever = False and Virus A = False –> Disease A = False.
  2. Fever = False and Virus A = True –> Disease A = False.
  3. Fever = True and Virus A = False –> Disease A = False.
  4. Fever = True and Virus A = True –> Disease A = True.

So, the only way that disease A is found is if both fever and virus A are found together. 

Now, here’s the kicker. People model this problem geometrically too! They use the following diagram:

Here, x stands for fever and y stands for virus A, and each of them is either 0 (not found) or 1 (found). Notice that the only way that x + y can be greater than 1.5 is if both x and y are 1. Therefore, it is the same as saying that the only way for the disease to be found is if both x (fever) and y (virus A) are 1.
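
A quick way to convince yourself of this is to check all four cases in code. The sketch below assumes the encoding used in this section (0 for not found, 1 for found); the function names `disease_by_logic` and `disease_by_geometry` are made up for the example.

```python
from itertools import product

def disease_by_logic(fever, virus_a):
    """The rule from the list of possibilities: both must be True."""
    return fever and virus_a

def disease_by_geometry(x, y, threshold=1.5):
    """The geometric rule: the point (x, y) must satisfy x + y > 1.5."""
    return x + y > threshold

# Walk through every combination of fever and virus A.
for fever, virus_a in product([False, True], repeat=2):
    x, y = int(fever), int(virus_a)  # False -> 0, True -> 1
    assert disease_by_logic(fever, virus_a) == disease_by_geometry(x, y)
    print((x, y), "x + y =", x + y, "disease A:", disease_by_geometry(x, y))
```

Only the point (1, 1) gives x + y = 2, which is above 1.5, so the geometric rule and the logical rule agree on every case.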

But what about distance? What does the distance between (0,0) and (1,0) mean? Note that (0,0) here means that neither fever nor virus A was found. Similarly, (1,0) means that fever was found, but virus A was not. Is there a valid notion of geometric distance in this case?

Well, yes, there is. When you think about it, fever not found is 1 step away from becoming fever found. That is why (0,0) and (1,0) are 1 step apart in the x direction. Similarly, fever found and virus A not found, (1,0), is 1 step away (in the y direction this time) from becoming fever found and virus A found, (1,1).
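
The step counting above can also be written down directly. The sketch below counts steps as the sum of the per-axis differences (often called Manhattan distance); treating that as the distance is an assumption made for this example, and the name `steps_apart` is hypothetical.

```python
def steps_apart(p, q):
    """Count how many single-axis steps separate points p and q."""
    return sum(abs(a - b) for a, b in zip(p, q))

print(steps_apart((0, 0), (1, 0)))  # 1: fever flips from not found to found
print(steps_apart((1, 0), (1, 1)))  # 1: virus A flips from not found to found
print(steps_apart((0, 0), (1, 1)))  # 2: both have to flip
```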

Thus, even with a problem definition such as “if fever = true and virus A = true, then disease A = true”, a geometric representation like x + y > 1.5 makes perfect sense.

OPEN RESEARCH AREAS

There are very few things in the world to which numbers and their geometric representations cannot yet be applied. For example, what is the distance between blessings and film? As yet, no geometric space has been found whose dimensions (x, y, z, etc.) sensibly capture the difference between blessings and film. Such examples fall under open research areas in AI.

SUMMARY

In this post, we tried to solidify the importance of numbers and their representations in AI, especially in terms of near universal applicability. The moral of the story is that you can capture almost any kind of relationship using geometric representations of data.