What is the most minimal language?

Artificial languages are where you’d look of course, and there are much simpler languages than Esperanto. Basic English was renowned for having a small vocab. My own favourite, with a comparably small vocab and a much tighter grammar, is Interglossa (as opposed to its revival Glosa).

Natural semantic metalanguage has an extremely small number of concepts, but is parasitic on natural language grammar, and is not meant for communication but for definitions. So it’s not quite the same thing.

Is there a graph showing the percentage of usage covered by the X most common words in a language?

I think what you are asking for is graphs illustrating Zipf’s law. Google that. The links at the bottom of the Wikipedia page give graphs from various languages using online corpora.

Regrettably the Zipf’s Law topic here doesn’t have any content yet.

I'm not sure the graphs would look essentially different, whatever the register of language is: the curve drops off pretty steeply into a long tail anyway, which is the point of it being a power-law distribution (a straight line on a log-log plot).
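If you'd rather compute the coverage curve than hunt for a graph, here is a minimal sketch in Python. The function names (coverage_by_rank, zipf_coverage) and the crude regex tokeniser are my own assumptions for illustration, not anything from a particular corpus tool. The first function measures coverage on whatever text you feed it; the second gives the idealised figure if word frequency were exactly proportional to 1/rank.

```python
from collections import Counter
import re


def coverage_by_rank(text):
    """Cumulative fraction of all tokens covered by the top 1, 2, 3, ... words."""
    # Crude tokeniser (an assumption): lowercase runs of letters and apostrophes.
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    total = sum(counts.values())
    running = 0
    coverage = []
    # most_common() yields (word, count) pairs from most to least frequent.
    for _, n in counts.most_common():
        running += n
        coverage.append(running / total)
    return coverage


def zipf_coverage(top, vocab_size, s=1.0):
    """Idealised coverage of the `top` most common words, assuming
    frequency is proportional to 1 / rank**s over `vocab_size` words."""
    weights = [1 / rank**s for rank in range(1, vocab_size + 1)]
    return sum(weights[:top]) / sum(weights)
```

So coverage_by_rank(some_text)[99] would be the share of running text covered by the 100 most common words (given at least 100 distinct words), and zipf_coverage(100, 50000) is the same figure for an ideal Zipfian vocabulary of 50,000 words. Real corpora only approximate the ideal curve, so treat the second number as a rough guide.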

In Indo-European languages using a Latin alphabet, what’s up with these two letters “ch” that are pronounced (phonetics) so differently?

Digraphs in the Roman alphabet began with the ones Latin used to transcribe the Greek aspirated consonants: <ch th ph>. So <ch> was available very early on to languages using the Roman alphabet, as a way to represent new sounds.

Palatal sounds are notoriously unstable phonologically: once /k/ goes to [c] (as it did in late Latin), it can then move on to any of [tɕ, tʃ, ʃ, s].

As a back consonant, <c> could be used to convey anything velar or palatal, or even palatoalveolar, given the possible targets of phonetic change for a fronted /k/.

  • So <ch> could end up being conscripted as something velar, like the velar fricative /x/ in German.
  • Or it can be used to mean that the velar is velar and not palatalised, like <chi> in Italian.
  • Or it can be used to mean something palatal instead of velar, like the palatal stop /c/ in Old French.
  • Or it can be used to represent any phoneme that the unstable /c/ ends up sounding like, including the palatoalveolar /ʃ/ in Modern French, or /tʃ/ in English and Spanish.

Can you say anything using a vocabulary of 100 words?

The claim of Natural semantic metalanguage is that you can with around 60. It was a party trick of Australian linguistics undergrads to speak in NSM; it becomes very stilted very quickly, but in principle you can define a lot of notions with a limited vocabulary, as the asker alludes to. NSM is of course a definition language, rather than a communicative language, but that seems to be what OP is after.

Basic English tries with 850 words, and xkcd’s Up Goer Five English that Robert Collins mentions seems to be of the lineage of Basic English.