A semantic gap is the difference between a thing being modeled and the
model's representation of that thing.
Models, by their very nature, are simple systems constructed in the hope of easing understanding of a more complex system -- the semantic gap between them is the complexity that's lost in the model. Models with a low semantic gap produce wonderfully realistic results, but may take vast amounts of time to implement and end up too complex to understand themselves. Models with a high semantic gap are much easier to devise and test, but may give results that are wildly inaccurate or even complete nonsense.
As an example of a well-balanced semantic gap, consider a relatively high-level economic equation that, given the population, unemployment rate, and average annual income, predicts how much money will be spent during the Christmas season. The semantic gap lies between the representation of the population, a single integer, and the population being represented, millions of unique, unpredictable human beings. Even with such a numbingly huge difference between the model and the thing being modeled, the equation might predict to within fifteen percent if it were well researched.
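To make that gap concrete, here is a minimal sketch of what such a model might look like in Python. The function name and the spending coefficients are pure invention for the sake of illustration, not drawn from any real economic study; the point is only that "population" collapses millions of unpredictable people into one integer argument.

    # A toy holiday-spending model. All coefficients below are hypothetical,
    # chosen only to show the shape of such an equation.
    def predicted_holiday_spending(population, unemployment_rate, avg_income):
        """Estimate total Christmas-season spending.

        population        -- number of people, reduced to a single integer
        unemployment_rate -- fraction of the workforce unemployed (0.0 to 1.0)
        avg_income        -- average annual income per person
        """
        # Assumed for illustration: the employed spend about 2% of their annual
        # income over the season, the unemployed about 1%.
        employed_share = 1.0 - unemployment_rate
        per_person = avg_income * (0.02 * employed_share + 0.01 * unemployment_rate)
        return population * per_person

    # Example: 300 million people, 5% unemployment, $40,000 average income.
    print(predicted_holiday_spending(300_000_000, 0.05, 40_000))

However the coefficients are tuned, the individual decisions that the integer threw away are gone; that discarded complexity is the semantic gap.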
Working to lower the semantic gap of a given model is the aim of all true science, while work done to raise it is what allows the layperson to understand their reality. These are important considerations, since many of the systems we wish to understand (weather, brain function, etc.) are of nearly infinite complexity, while we humans can only grasp so much. With a semantic gap that's too high, we run the risk of generating results that are completely unlike reality, like Freudian psychoanalysis or the "World" section of your daily newspaper. Conversely, if we try to lower the gap too much, the complexity of our model quickly escapes comprehensibility and leaves us with little insight gained, as with some flavors of particle physics and neural network models.
One classic semantic gap bears further mention, since most Computer Science types associate it exclusively with the term. In the mid-1970s, hardware manufacturers were concerned that there was too great a difference between the machine-level instructions a computer could execute and the high-level languages that had to be compiled down to them. That is, there was an ugly semantic gap between the high-level language representation of a program and the program as actually executed by the machine. With optimizing compilers still in their infancy, it was thought that a simple compiler and a high-level instruction set could free the programmer from the burden of optimization.
So the original CISCs (Complex Instruction Set Computers) were born, and the semantic gap was for the most part bridged. Digital's VAX architecture led the pack, with single instructions whose function would have required ten or twenty instructions on Digital's earlier PDP architecture. Years later Intel took a similar pseudo-CISC tack, and the x86/Pentium architecture was the result. CISC proponents realized that these giant instructions would be slower to execute, but justified the expense with the number of smaller instructions that would no longer be needed.
Unfortunately, they didn't realize that no instruction set -- no matter how large -- could cover every possible situation. That is, some high-level code couldn't be represented by a single machine-level operation and had to be compiled into sequences of instructions that could express it. This result was called the semantic clash, and it spelled death for CISC architectures. It turned out that even when the compilers were choosing the best instructions for the job at hand, those instructions still carried too much overhead from unused features to execute quickly. Ten slow, powerful instructions, each nominally doing the work of a dozen fast ones, simply can't execute as quickly as fifty fast ones put in the right order by the compiler. Hardware designers realized this slowly, but when they did, modern RISC (Reduced Instruction Set Computer) designs were the result.
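As a back-of-the-envelope version of that arithmetic -- with cycle counts invented purely for the sake of the example, since real latencies varied widely from machine to machine:

    # Hypothetical comparison of the two approaches for one stretch of code.
    CISC_INSTRUCTIONS = 10   # complex instructions picked by a simple compiler
    CISC_CYCLES_EACH = 8     # assumed cost of a microcoded, do-everything instruction

    RISC_INSTRUCTIONS = 50   # simple instructions scheduled by an optimizing compiler
    RISC_CYCLES_EACH = 1     # assumed cost of a simple, single-cycle instruction

    cisc_total = CISC_INSTRUCTIONS * CISC_CYCLES_EACH   # 80 cycles
    risc_total = RISC_INSTRUCTIONS * RISC_CYCLES_EACH   # 50 cycles

    print(f"CISC: {cisc_total} cycles, RISC: {risc_total} cycles")

Under these assumed numbers the "wasteful" stream of fifty simple instructions finishes first, because none of its instructions pays for features the code never asked for.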