The Decalogue of Nick #2: I’ve trained as a linguist, and I have done computational linguistics stuff

For Audrey Ackerman and Brian Collins and Zeibura S. Kathau.

Ask a Greek what they can tell you about Byzantium, and they won’t tell you what the millennium of the East Roman Empire achieved. They won’t tell you about the Palaeologan Renaissance, or the ambivalence about the Classical past, or the edifices of Roman Law, or the architectural marvels.

All they’ll tell you is that the Empire fell. And they don’t even pick the right instance when it fell. (The Empire at 1453 wasn’t worth saving.)

Well, so it is with my linguistics career. Scratch me just a little bit, and I will lament the defining woe of my life, that I did not become a professional linguist: Nick Nicholas’ answer to What is your personal experience with obtaining a linguistics degree? (If I’m feeling prudent, I’ll admit that the outcome was the right one: Nick Nicholas’ answer to What are your 3 worst mistakes? Would you fix any of them if you could go back in time? But only if I’m feeling prudent.)

What’s harder for me to do, as a glass-half-empty kinda guy, is admit how much I gained in the experience.

Linguistics gave me a sense of purpose when I had none. Linguistics gave me friends and companionship and stimulation. Linguistics gave me the place where I could act as glue between my peers—that trait that Clarissa Lohr continues to find approval-worthy in me. Linguistics gave me the opportunity to teach—and once I’d gotten to teach, nunc dimittis: I could have died a happy man, even if it was just three semesters.

And in truth, Linguistics gave me the opportunity to turn away from it, to say that no, I deserved better than to be strung along, and to regain myself even at the cost of losing myself.

I sidled into linguistics and out of electrical engineering towards the end of my undergraduate degree. Through a Master’s I did enough of the bridging undergraduate courses that they let me into the PhD programme. They recognised, I guess, that I had some talent.

The Master’s was in discourse theory: Rhetorical Structure Theory, to be exact. Discourse structure and implicature was my early fascination in linguistics, and RST was all the rage back then (1994) in text generation. (Do people even call it text generation any more?)

When time came for me to pick a PhD topic, though, I wanted to go back to historical linguistics. Actually I wanted to go forward from discourse theory to historical linguistics: grammaticalisation gave a way for implicature to motivate language change that intrigued me. While I initially wanted to work on Tibetan (because Squiggles), I had a conversation with the doyen of grammaticalisation theory, Elizabeth C. Traugott, during which she asked, wouldn’t it be fascinating to look at how grammaticalisation interacted with diglossia.

Two years into my thesis, I’d worked out that no, that was a dumb idea. But I was too far into my work by then. My topic was the development of the modern relativiser pu, and how it had diversified in meaning. I’d intended to work backward, and unearth all sorts of awesome instances of implicature and analogy from Early Modern Greek.

But theses don’t go as you’d planned. On the way towards internal reconstruction, I became captive to the diversity of Greek dialect—nature’s historical linguistic laboratory: they have a common starting point with dozens of divergent endpoints, so you can get an amazing sense of what is possible in language change. By the time I’d worked out what was happening in all the dialects, and detoured into what was happening in the rest of the Balkans, I’d run out of time. And space: the Balkan chapter ended up on the cutting room floor.

From the thesis, I’d gained an encyclopaedic familiarity with modern Greek dialect; a good knowledge of Early Modern Greek anyway (which I put to use in my later coauthored monograph, An Entertaining Tale of Quadrupeds); and a smattering of Balkan linguistics. I’d planned to use my knowledge to write a reference grammar of Early Modern Greek by the time I was 50. That isn’t happening; and the guys who were working on it (Greek Grammar to fill the gap) have run out of funding and have retired.

What I did not get is any Ancient Greek; I don’t have any formal training in that, although once you’re a linguist, you can make sense of a grammar book just fine. (And I picked up what I needed to later.)

I also picked up a fair bit of linguistic typology from a decade of working as a research assistant, mainly under John Hajek. It was a rocky relationship, as you can well imagine from someone with my ego in a second fiddle role. But it was a good schooling too. And working on a phonological survey of Papua New Guinea, I got at least some of the phonology I did not get from the department.

I wrote a bunch of papers after I finished the PhD. Some got published. Some got submitted at the time journals got switched over from paper to electronic submission, and got lost in the mail. It was fun to write the papers; but it was also writing in a vacuum. I didn’t really have a network of peers to care about what I was writing (part of the problem of not being in Europe), and the problems I was working on seem to have been too obscure to have stimulated any interest anyway. In fact, the papers that generated the most interest were about social history (the Greek colony in Corsica). I have 8 finished unsubmitted papers, and 8 more incomplete, from when I stopped writing in 2008. I’m not strongly motivated to do anything with them.

I got more interaction, if anything, out of the Ἡλληνιστεύκοντος blog I used to do (and will do again, if Quora disappears in a puff of smoke). And some of my favourite questions on Quora are when I do my own detective work, to solve a linguistic problem I don’t already know the answer to.


For Amy Dakin.

I have also done some computational linguistic stuff. Most of it has been at the Thesaurus Linguae Graecae, where I worked from February 1999 through to June 2016.

I’ve been reluctant to go publicly into the specifics of why I’m no longer employed there, until now. But then again, my time at the TLG should not suffer the same fate as Byzantine History: what I achieved (what *I* achieved) is more important than the way I ceased to.

I have very high regard for my fellow programmer of 13 years Nishad Prakash; and if anything even more regard for my fellow programmer of 4 years, John Salatas, who is still working there. They are far better craftsmen than I am. And I don’t mean to take anything away from their achievements by what I’m about to say.

But anything you see at the TLG that involves linguistics? Me. Anything that involves stylometrics? Me. Anything that involves Natural Language Processing? Formatting? Peculiar sigla? Comparison of texts? Me.

There are a lot of computer science things that I’m proud of working out while there. Some algorithmic refinements to recursive Longest Common Subsequence detection, to work out common phrases between passages. Some fiendish DFA and NDFA work, to deal with the quirky ASCII encoding of Greek we have in character-by-character and wildcard search. A lot of cleverness in contextual grammatical disambiguation, that I’m not confident will ever see the light of day (or will be highlighted for users if it does).
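
For anyone who hasn’t met the technique, here is a minimal sketch of plain Longest Common Subsequence over word tokens. It is the textbook dynamic-programming version, not the refined recursive code described above; the function name and the sample sentences are made up for illustration.

```python
def longest_common_subsequence(a, b):
    """Return one longest common subsequence of the token lists a and b."""
    rows, cols = len(a), len(b)
    # table[i][j] holds the LCS length of a[i:] and b[j:].
    table = [[0] * (cols + 1) for _ in range(rows + 1)]
    for i in range(rows - 1, -1, -1):
        for j in range(cols - 1, -1, -1):
            if a[i] == b[j]:
                table[i][j] = table[i + 1][j + 1] + 1
            else:
                table[i][j] = max(table[i + 1][j], table[i][j + 1])

    # Walk the table from the start to recover the shared tokens in order.
    shared = []
    i = j = 0
    while i < rows and j < cols:
        if a[i] == b[j]:
            shared.append(a[i])
            i += 1
            j += 1
        elif table[i + 1][j] >= table[i][j + 1]:
            i += 1
        else:
            j += 1
    return shared


# Hypothetical passages, tokenised by whitespace.
passage_one = "the quick brown fox".split()
passage_two = "the slow brown fox jumps".split()
print(longest_common_subsequence(passage_one, passage_two))
```

Working on word tokens rather than characters is what makes the output read as shared phrasing; any real system would presumably also need to normalise spelling and inflection first.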

And my crowning work: the morphological analyser of Greek. It originated in Perseus’ Morpheus, but I have stretched and pulled and broadened and narrowed and reranked it over the past 15 years, to deal with all the stages of Greek the TLG has thrown at it, from Homer through to misspelled 17th century Cretan land deeds—and to still yield some semblance of order. In the process, I dare say I have developed as intuitive a sense of what grammatical wackiness Byzantine authors could indulge in as anyone living: I’ve had to deal with it all.

I didn’t get to write the reference grammar of Early Modern Greek. But the morphological analyser I curated, with all the proper names of Athenian courtesans and Albanian chieftains, of Egyptian decans and minor saints, with all the mangled Byzantine optatives and grammarians’ fictional conjugations, with every last utterance of Sappho accounted for, and as much of Theodore Metochites’ as I could disentangle: that has been just as great an achievement.

Which I now no longer can contribute to.

But those of you with access to a TLG subscription: click on some words’ analyses, and do some parallel text comparisons, and look up some of the online lexica. And taste some of the joy of the Greek language and the Greek literary corpus, that I got to savour in my time.

And you’re welcome.

What do linguists think of the movie Arrival?

You have waited a long time, Hansolophontes, for me to answer this A2A. I did not read any spoilers. I did not read any of the other answers (which may make this look silly this late).

I finally watched Arrival last night. Very well made movie: great sense of atmosphere, and fear, and awe. I was annoyed at the plot twist: it’s annoying and cheap whenever it shows up in science fiction (it was a letdown whenever it was used in Star Trek). But given that it was going to happen, I have to say, it was handled poetically by the movie. As long as you don’t think about the plot holes (and associated plot laziness) too closely.

What did I think about it as a linguist?

  • They fast-forwarded the best part, how Louise worked out the language past the first two words. They got the start of the process, but not the heart of the process. But that’s OK: not many people would have found it cinematic.
  • The start of the process of working out the aliens’ language was beautifully handled.
  • The whole non-linearity thing about the aliens’ language? Shoehorned in to connect to the plot twist. It wasn’t explained so as to make sense: all I could see was a circle with a bunch of words in it, I wasn’t persuaded there was anything intrinsically non-linear going on.
  • Movies with any degree of complexity have an obligatory whiteboard scene. The whiteboard scene was well done: the questions Louise was raising about what basic concepts they had to establish were rattled through rather quickly, but they all made sense, and were well thought through.
  • The derision of Ian wanting to talk to the aliens in maths was silly. And mercifully, the guys in Australia did not think it was silly. It’s been accepted for decades that if you want to confirm alien sentience, you use maths that does not occur in nature. (Although that means primes, not the Fibonacci sequence.)
  • What sort of a linguist was she? There are hints she’s an historical linguist (she knows some Sanskrit, and she knows both the anecdote about kangaroo = “I don’t understand”, and the fact that when someone bothered to record the language where Cook landed, it turned out not to be true). But what historical linguist has a photo of fricking Chomsky at her desk? Chomsky is a big part of the reason why historical linguists don’t get jobs.
  • That was probably a freshman lecture on linguistics; the textbook she was cradling certainly looked like Linguistics 101. But what Linguistics 101 course dedicates a whole lecture to Portuguese (outside the Lusosphere)? And who the hell explains Portuguese by saying that the mediaeval Galicians thought language was art? You don’t say Onde é o banheiro? in Portuguese as an act of art.
  • I’m amused that Ian brought up the Sapir-Whorf hypothesis, and Louise didn’t immediately start guffawing. My own opinion is that there is a little bit (a little bit) to the hypothesis. But derision of Sapir-Whorf within linguistics is universal, and in fact is something of a shibboleth: it is ideologically driven because of how linguistics currently thinks of language. It’s only non-linguists who take Sapir-Whorf seriously.
  • Oh, the army guy dumping the tape recording and saying “translate this”?! Come on. Even grunts aren’t that silly…

The director has obviously talked to linguists, and has certainly read up on linguistics. The details (such as the university setting) did look to have come from someone who wasn’t that clear about how university linguistics actually works. The linguistically challenging bits were swept under the carpet. But the core scenes about establishing communication (including the whiteboard scene) were right.

OK, now to read what everybody else said…

Was Greece created by Germany?

Minority view here, and I’m astonished no one’s picked up on it.

The Modern Greek state was established in 1829; and while Greeks like to think they won the Greek state with their sword, the Greek War of Independence had pretty much been quelled by 1827. It was the Great Powers’ intervention at the Battle of Navarino that guaranteed an independent Greek state, because the Great Powers thought that would be handy. Don’t take my word for it: First Hellenic Republic – Wikipedia.

The Great Powers were Britain, France and Russia. The major political parties of independent Greece, for decades, were the British party, the French party, and the Russian party. There was no Germany, and Germany did not create independent Greece.

There was, however, Bavaria, and the Kingdom of Greece was established in 1832 under a Bavarian king, Otto of Greece. Otto brought Bavarian administrators with him, and they ran the country for the first five years of the kingdom. Per Wikipedia:

During the early years of his reign a group of Bavarian Regents ruled in his name, and made themselves very unpopular by trying to impose German ideas of rigid hierarchical government on the Greeks, while keeping most significant state offices away from them. Nevertheless, they laid the foundations of a Greek administration, army, justice system and education system. […]

The Bavarian Regents ruled until 1837, when at the insistence of Britain and France, they were recalled and Otto thereafter appointed Greek ministers, although Bavarian officials still ran most of the administration and the army. But Greece still had no legislature and no constitution. Greek discontent grew until a revolt broke out in Athens in September 1843. […] Power then passed into the hands of a group of politicians, most of whom had been commanders in the War of Independence against the Ottomans.

So Germany did not create Greece; but the Kingdom of Greece was certainly initially set up by Bavarians.

How offensive is the word “cunt” in Australia?

Just to round off what others have said: yes, it is mostly a more vulgar counterpart of the Australian term bastard, and it almost always refers to men rather than women. (The reductionist misogynist use of cunt to refer to women is unknown here. I only discovered it a few years ago.)

Just like bastard, if it is qualified by an adjective, it is typically informal, jocular, or dismissive, rather than outright offensive, in “lower” social contexts. (Australia does have classes, but it also has a lot of mobility between class registers: the new money millionaire can float between low and high class discourse. Old Money doesn’t, but Old Money isn’t as prominent as it used to be.)

Used on its own, though, it is still vicious. When someone called me a cunt because my dog crapped on his nature strip? He was getting ready to punch me, the roid rage rising to his head, the fists clenching; and cunt was the most hostile term he could spit out at me.

And you do have to judge your registers for appropriateness. There is a jocular, low register with ribbing and swearing and no actual harm done. But that’s not 24/7, even for the so-called lower socioeconomics.

How do I tell a girl she has a nice rack?

When I was 12, I found in my local library a copy of Brush up your pidgin.

It’s a textbook of Tok Pisin, the pidgin of Papua New Guinea, played for laughs. It is hardly a serious textbook: the protagonists are a clueless British missionary and his sex starved wife, the Tok Pisin is respelled to look more familiar to English speakers, it pokes fun (though not, from memory, sneeringly) at the local culture.

Even though it was played for laughs, I actually learned a lot from that book. You could tell, even from that book, that Tok Pisin is a language with its own internal genius, which is quite far removed from English — even if its vocabulary is deceptively English baby talk. It may well have gotten me started as a linguist.

The final dialogue of the book introduces an Australian pilot, who flies the couple into the interior. Up to that point the dialogues are bilingual, British English and Tok Pisin. With the pilot, Australian English is also introduced.

And the pilot sees fit to comment to the missionary’s sex starved wife as follows:

  • Australian English: Geez, you got a beaut pair of norks!
  • Tok Pisin: Mi laikim susu bilong yu.
  • British English: … You have a lovely blouse.