Aton Kamanda

I recently completed my master’s degree at Mila and am currently involved in various research projects.

My current goal is to advance my expertise in machine learning engineering. I believe that the ability to apply cutting-edge deep learning research will play a pivotal role in the forthcoming technological revolution, and that is where I am focusing my efforts.

Feel free to contact me if you are interested in collaborating.

About me

Applying state-of-the-art research in production.

  • My primary strength is a deep and broad understanding of the deep learning literature, ranging from low-level optimization of neural network computations to theoretical machine learning, by way of reinforcement learning, Bayesian probability, and computational neuroscience. I can quickly gain an in-depth understanding of any paper from the top machine learning conferences (NeurIPS/ICLR/ICML) and reproduce it or implement it efficiently in production, which I think is a very rare and valuable skill.

What I believe

The advantage of digital intelligence over biological intelligence.

  • One significant distinction between digital intelligence and biological intelligence is that the weights of biological neural networks are physical: the hardware is tied to the software. In that sense, humans and animals are mortal computers. This distinction underscores the unique advantage of digital intelligence: it can run identical weights on diverse physical substrates, which provides interesting properties unattainable by its biological counterparts.

  • This property is very important: because all individual copies of the model share the same weights, they can communicate what they have learned from their individual training data by sharing gradients. Current large neural networks leverage this property through parallelism to process vast amounts of data and acquire a tremendous amount of knowledge. That is how something like GPT-4 is able to read "everything that is on the internet", something that no human today is able to do.

  • As an illustration, consider an artificial neural network radiologist that has analyzed tens of thousands of patients, compared to a human counterpart who has encountered only a few thousand. The artificial neural network radiologist will attain a significantly higher level of understanding and efficiency. This mirrors how AlphaGo became the premier Go player through extensive gameplay, far beyond what any human could achieve in a single lifetime.

  • A common myth is that current artificial neural networks are less statistically efficient than humans. This misconception often arises from unfair comparisons, such as pitting the learning abilities of a university undergraduate against a blank neural network without accounting for the wealth of prior knowledge humans possess. Recent studies that conduct a more rigorous comparison, together with the demonstrated few-shot learning ability of recent models, refute this myth, showing that artificial neural networks are just as statistically efficient as humans.
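A minimal sketch of the gradient-sharing idea described above, under simplifying assumptions: a one-parameter model y = w * x, a mean squared error loss, and three hypothetical data shards standing in for the copies' individual training data. The names `gradient`, `train`, and `shards` are illustrative only, and real systems perform the averaging step with a distributed all-reduce rather than a Python loop.

```python
# Toy data-parallel training: identical weight copies, each seeing different
# data, stay in sync by averaging their gradients before every update.

def gradient(w, shard):
    """Gradient of the mean squared error of y = w * x on one data shard."""
    return sum(2 * (w * x - y) * x for x, y in shard) / len(shard)

def train(shards, steps=200, lr=0.05):
    w = 0.0  # the single set of weights shared by every copy
    for _ in range(steps):
        # Each copy computes a gradient on its own shard only.
        local_grads = [gradient(w, shard) for shard in shards]
        # Sharing step: average the gradients across copies.
        shared_grad = sum(local_grads) / len(local_grads)
        # Every copy applies the same update, so the weights stay identical.
        w -= lr * shared_grad
    return w

# Hypothetical data: three copies, each seeing different examples of y = 3x.
shards = [
    [(1.0, 3.0), (2.0, 6.0)],
    [(0.5, 1.5), (1.5, 4.5)],
    [(3.0, 9.0)],
]
w = train(shards)
print(round(w, 3))
```

No copy ever sees another copy's examples, yet the shared weight ends up fitting all of them, because each update carries information from every shard.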

About a theory of intelligence and cognition.

  • The free energy principle is currently the most convincing theory of cognition. It is very powerful in that it does not only try to explain what cognition is; it also gives the mathematical rules that any agent endowed with cognition should follow in any possible world. The brain can be viewed as a particular solution nature found to implement approximate Bayesian inference in biological organisms, and AI research can be viewed as finding the best ways to implement it in machines, which is what Solomonoff's algorithmic probability theory is about.

  • A lot of complex mental phenomena previously thought to be mysterious make a lot more sense under this paradigm, such as perception and action, consciousness and its ineffability, emotions, and selfhood, as well as mental disorders such as schizophrenia, addiction, and maybe even depression.

  • These theoretical works, among others, lead me to think that we have a decent understanding of the principles underlying cognition and intelligence. The current focus appears to be shifting towards the engineering aspects, specifically the practicalities of constructing these systems. We are in a situation similar to understanding the principles of electromagnetism and the nature of a photon but still needing to figure out how to build a laser.
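For reference, the central quantity of the free energy principle mentioned above is the variational free energy. One standard way to write it is given below; the notation is an assumption made here, not taken from this page: o for observations, s for hidden states, q(s) for the agent's approximate posterior, and p for its generative model.

```latex
\begin{aligned}
F[q] &= \mathbb{E}_{q(s)}\left[\log q(s) - \log p(o, s)\right] \\
     &= D_{\mathrm{KL}}\left(q(s) \,\|\, p(s \mid o)\right) - \log p(o)
\end{aligned}
```

Since the KL term is non-negative, F is an upper bound on the surprise -log p(o), and minimizing F over q pushes q(s) towards the true posterior p(s | o). This is the sense in which minimizing free energy amounts to approximate Bayesian inference.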