[Epistemic status: I’m sure that this happens in me. Not so sure why; take all my speculations and extrapolations (to AI especially) with a grain of salt.]

Ever since learning more about history in the past year or two, I’ve started having a really weird sensory experience. I can see history. What I mean by this is that when I hear a date when something happened, I can place it in my field of view. The timeline also has different levels of zoom on a sort of logarithmic scale.

I think this experience/model all started when I learned the US Presidents using Anki. For every president, I learned the years they served, their picture (picture → name), and the party they belonged to. I did this not because I’m some crazy history buff, but just because I wanted to test how well Anki works. I found that Anki works remarkably well. I was able to memorize all the presidents and, because of the nature of the spaced-repetition algorithm, I still remember most (>95%) of them today.

As a side effect of memorizing all the presidents, I developed this weird timeline sense. I (semi-)consciously tried to place the presidents on the timeline when remembering them. I think that visualizing this timeline in my head every day cemented it. I started to place other dates and historical facts that I learned on the same timeline. It became a self-organizing way to dump all my historical knowledge.

Here’s how it works for me: When thinking in the context of the 20th and 21st centuries, which are the two most common centuries for me to think about, the 20th century starts at the left of my field of view, maybe at 70° W (if N is straight ahead). The turn of the 21st century is at around 20° E. But I can also zoom in or out. For example, when I zoom in on just the 21st century, I see the year 2000 at the far left of my field of vision and 2023 at the far right. I can also zoom out from 0 AD to today, with similar locations. What I can’t do, and I find this pretty interesting, is zoom in on the 5th century. Sure, I can pretend to visualize a timeline of it in my head, but it doesn’t feel the same as visualizing later centuries. I think this is because I don’t know many events that can ground the timeline in the 5th century; I know many more events in later centuries, which would let me visualize their timelines in more detail.
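The mapping I just described is simple enough to sketch in code. Here’s a toy Python version, assuming the angles I gave (the start of the window at 70° W, the end at 20° E, with negative degrees meaning west of straight ahead) and treating “zoom” as nothing more than picking different start and end years for the same field of view:

```python
def year_to_angle(year, start_year, end_year, left_deg=-70.0, right_deg=20.0):
    """Map a year inside the visible window to a horizontal viewing angle
    in degrees (negative = west of straight-ahead north).

    The left/right angles are the rough ones from my own timeline;
    "zooming" just means choosing a different (start_year, end_year) window.
    """
    frac = (year - start_year) / (end_year - start_year)
    return left_deg + frac * (right_deg - left_deg)


# Zoomed out to the 20th century: 1900 sits at 70° W, 2000 at 20° E.
print(year_to_angle(1900, 1900, 2000))  # -70.0
print(year_to_angle(1950, 1900, 2000))  # -25.0
# Zoomed in on the 21st century: the same angles now span 2000-2023.
print(year_to_angle(2023, 2000, 2023))  # 20.0
```

This is only a linear slice; the logarithmic feel comes from the fact that the windows I actually pick get exponentially wider as they reach further back in time.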

I wonder how much of this sense I should attribute to Anki. Anki has clearly both provided facts to populate the timeline and given my brain a motivation to form it. In fact, I’ve been thinking about the idea that intelligence is just knowledge compression. I’m pretty sure that one of the reasons my brain created the timeline was to make it easier to memorize the dates I was learning. Learning the presidents in order is actually the perfect application of a timeline: every point on the timeline is filled in, the years of service mostly follow a simple pattern (4 or 8 years), and I linearly added to the end of the timeline starting from 1789. This year in school I took AP US History, where I learned a LOT more dates with Anki (I made at least one flashcard for everything my teacher said; I have 1586 cards tagged apush or history). And sure enough, I placed these on the timeline as well. My brain found a really good way to compress temporal, factual knowledge (which humans aren’t that good at remembering) by linking it to visual knowledge (which humans are really good at dealing with).

How this relates to AI

[This is where the epistemic status applies the most; I don’t know if anything I’m saying here is true.]

AI is not just metaphorically trying to compress knowledge; LLMs are literally trained to compress the entire internet into their weights. Given that gradient descent is probably better at this than the process my brain used to make my timeline, I suspect that in a multimodal model, if one modality is slightly more effective at storing knowledge than the others (like vision is for humans), a lot of knowledge that isn’t even in that format will be stored and accessed in that format, because it is more efficient for the model to learn it that way. We could test this with mechanistic interpretability research: if we see that a model is using the same algorithms to access two totally different formats of data, that would be evidence for this hypothesis. I don’t really know much about multimodal models or how they work, but I’m assuming they are the next logical step from LLMs, so this hypothesis could be testable soon.

Contact me

If you have this same sense or something similar to it, please email me (jacoblevgw at gmail). I’m really curious how this manifests in others. Humans have really diverse patterns of thought.