What is the math behind natural language processing?

If I remember correctly from Foundations of Statistical Natural Language Processing, it's a lot of stochastic methods, probability theory, and maybe some linear programming. Discrete maths lurks everywhere in the background, as it does in computer science in general. But yes, it’s much more about statistics than anything else.
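To give a feel for what "statistics" means here, this is a rough sketch of one of the simplest probabilistic ideas in the field: estimating how likely a word is given the word before it, just by counting. The function name `bigram_probabilities` and the toy corpus are made up for illustration and are not taken from the book.

```python
from collections import Counter


def bigram_probabilities(tokens):
    """Maximum-likelihood estimate of P(next | previous) from a token list."""
    # Count every adjacent word pair in the sequence.
    pair_counts = Counter(zip(tokens, tokens[1:]))
    # Count how often each word appears as the first word of a pair.
    context_counts = Counter(tokens[:-1])
    # Relative frequency: count(prev, next) / count(prev).
    return {
        (prev, nxt): count / context_counts[prev]
        for (prev, nxt), count in pair_counts.items()
    }


# Toy corpus, invented for illustration only.
corpus = "the cat sat on the mat".split()
probs = bigram_probabilities(corpus)
print(probs[("the", "cat")])  # "the" is followed by "cat" once out of two times
```

Real systems add smoothing for unseen word pairs and use far larger corpora, but the underlying idea is the same conditional probability estimated from counts.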

