I finally worked this out, by reading half of Ancient Greek accent – Wikipedia. (Reading the other half confirms it, but I’m still proud of myself.)
The answer is: the second element if acute, the first if circumflex.
Let’s take this slow.
The explanation of the distinction between acute and circumflex in the Wikipedia article is based not on contours on a vowel, but on high/low contrast on morae (what a long vowel has two of, and a short vowel has one of). And I gotta admit, that’s the first time an explanation of Greek accent has made sense to me.
So. Let’s ignore grave. Short vowels can only take an acute. That is a high pitch on a single mora:
έ = ˥e.
A long vowel can take an acute. That is interpreted as a high pitch on the second mora:
ή = ɛ˥ɛ
μή = mɛ˥ɛ
You’re going from neutral pitch to high pitch. That will of course sound like rising pitch.
A long vowel can instead take a circumflex. That is interpreted as a high pitch on the first mora:
ῆ = ˥ɛɛ
In context, a circumflexed vowel gives you a neutral pitch mora (the end of the preceding syllable), followed by a high pitch mora, followed by a neutral pitch mora; e.g.
καλῆτε = ka.l˥ɛɛ.te
That will sound like a circumflex: rising then falling.
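If it helps to see the rule spelled out mechanically, here is a minimal sketch in Python. The function name `mark_pitch` and the list-of-morae input are my own inventions for illustration, not anything from the Wikipedia article; the only rule it encodes is the one above: acute puts the high pitch on the last mora, circumflex on the first of two.

```python
HIGH = "˥"  # IPA extra-high tone letter, placed before the high mora as in the examples above


def mark_pitch(morae, accent):
    """Join the morae of one vowel, putting HIGH before the high-pitched mora.

    morae:  list of vowel symbols, one per mora (one for short, two for long)
    accent: "acute" or "circumflex" (grave is ignored, as in the text)
    """
    if accent == "acute":
        # Short vowel: the only mora. Long vowel: the second mora.
        peak = len(morae) - 1
    elif accent == "circumflex":
        if len(morae) < 2:
            raise ValueError("a circumflex needs two morae")
        peak = 0  # high pitch on the first mora
    else:
        raise ValueError("expected 'acute' or 'circumflex'")
    return "".join(HIGH + m if i == peak else m for i, m in enumerate(morae))


print(mark_pitch(["e"], "acute"))            # short έ
print(mark_pitch(["ɛ", "ɛ"], "acute"))       # long ή
print(mark_pitch(["ɛ", "ɛ"], "circumflex"))  # long ῆ
```

The three calls reproduce the transcriptions given above for έ, ή and ῆ.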
So. Diphthongs involve two short vowels. (There are also long diphthongs, which are the things with iota subscripts.)
Two short vowels are two morae.
So it’s the same. αί has high pitch on the second mora (i.e. second vowel):
αί = a˥i
αῖ has high pitch on the first mora (i.e. first vowel):
αῖ = ˥ai
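The same rule can be read straight off the Greek spelling. The following is a rough sketch, not a proper transliterator: `high_mora` is a hypothetical helper of mine, and it assumes the input is a plain two-vowel diphthong (no iota subscript) carrying either an acute or a circumflex. It leans on Unicode canonical decomposition (NFD) to split the accent from the letter.

```python
import unicodedata

ACUTE = "\u0301"       # combining acute: what tonos/oxia decompose to under NFD
CIRCUMFLEX = "\u0342"  # combining Greek perispomeni


def high_mora(diphthong):
    """Return (base letters, 0-based index of the high-pitched mora).

    Assumes a two-vowel diphthong where each vowel counts as one mora.
    """
    decomposed = unicodedata.normalize("NFD", diphthong)
    # Drop every combining mark (accents, breathings) to get the bare vowels.
    base = "".join(c for c in decomposed if not unicodedata.combining(c))
    if len(base) != 2:
        raise ValueError("expected exactly two vowels")
    if CIRCUMFLEX in decomposed:
        return base, 0  # circumflex: high pitch on the first vowel
    if ACUTE in decomposed:
        return base, 1  # acute: high pitch on the second vowel
    raise ValueError("no acute or circumflex found")


print(high_mora("αί"))  # acute
print(high_mora("αῖ"))  # circumflex
```

Note that the accent is written on the second letter in both spellings; the function only looks at which accent it is, not where it sits, which is the whole point of the mora analysis.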
Now, your question was, if we use contour tones rather than pitch peaks, how do we transcribe it in IPA?
At that point, I myself would prefer to just go with convention, and put the contour tone symbol on the second letter, because that’s what Greek does. But the point here is that the contour tone, in both cases, starts on the first vowel = first mora. So arguably putting it on the first vowel is more accurate.