If we understand learning to be the development of neural connections in the brain, then there cannot be true "aha" moments (or, more accurately, every moment is an "aha" moment).
Let's suppose that a child has a (flawed) model of how something works. Each time they are presented with information, they build new connections between neurons in their brain, while also occasionally (usually while they sleep) removing connections that are not used. Over time, this gradually results in the child having competing models for understanding how something works.
At some point, the model that works best becomes the one that is used, the neural connections that represent the unused model are eventually severed, and we might say that the child is exclusively using the new model.
At no point does a child suddenly flip from using one model exclusively to using an entirely new model, because this would require making many, many different neural connections simultaneously for the new model to function. This sudden flipping is what is commonly called an "aha" moment, and it doesn’t exist. Learning is not the sudden acquisition of new models for understanding the world, but the gradual shifting between competing models, none of which probably describes the world completely.
Aside: No one has a perfect model for understanding the world, because that requires a complete set of all possible "true" states of the world.
Different presentations of the same basic information could lead to different models being used. For example, suppose I asked students to solve 4 + 5 verbally, as compared to presenting the same question symbolically in writing. It could be that students use one of their competing models to answer the verbal form of a question and a different model to answer the written form, and in some cases arrive at different results. It could even be true that different people asking the same question results in different models being used!
When my son was in grade one, I participated in a student-led conference in which he laboriously demonstrated, using a regrouping method, how 3 + 9 = 11. In the context of his classroom and with a written question, he answered the question with one model. I did nothing to offer feedback on his model at the time. Ten minutes later, we were driving in the car, and we played a number puzzle game where we took turns saying numbers and trying to figure out how to get that number using arithmetic operations. I said 12, and my son responded with 1 + 11 is 12, 2 + 10 is 12, 3 + 9 is 12, and so on. My son used a different model, and arrived at a different result.
If this theory is correct (and to be clear, it is just a theory), then it has implications for instruction. The first implication is that in order to help students develop models, we need to introduce them to different representations of ideas (i.e., representations of other people’s models for understanding), from different people (i.e., teachers, students, and parents), and through different modes of processing that information (verbal, written, symbolic, manipulatives, etc.). It also suggests that we should not assume that because a child can respond once, in one context, in a way that suggests they have a solid model of understanding, they actually have such a model. It also means that what children know how to do, or do not know how to do, is unlikely to be successfully captured by a system that assumes binary understanding of concepts (concepts are not simply known or unknown; we have models which seem to work in some contexts and may not in others).
Note: My use of the word "model" is shorthand for the set of neural connections we use in our brains to process and store information, and is almost certainly an enormous simplification of those processes.