Meaning, Authorship, and Originality: Insights from Forensic Linguistics

Presentation to Association of Document Examiners, November 8, 2009

When does a lawyer need a linguist?

More basically, what is a linguist?

There are many different kinds, dozens of areas of emphasis.  Ultimately one becomes one’s own kind of linguist, depending on where one’s interest and preferences lead.

I’m always interested in real data – not interested in voguishly Chomskian sterile theorizing.  My honors thesis was a study of African-American dialect in the fiction of Richard Wright, James Baldwin, Ralph Ellison, and others.

My doctoral thesis was an analysis of actual speech data (Hawaiian English), recorded, as spontaneous as it could be with an observer present.  The social sciences are a clear case of observer influence – you don’t have to resort to quantum physics to see it.

And in my speechwriting career, I was concerned with the speaker’s style – again, based on the data.  I’ve always been called upon, by the methods of scientific inquiry, to LOOK AT THE DATA.

As Roger Shuy, one of the pre-eminent forensic linguists, has observed, the interpretation and application of the law are overwhelmingly about language.

Thus, there are many situations in which the expertise of a linguist – someone trained in the precise description and analysis of language (but not necessarily a person who knows many languages) – can make substantial contributions to a case, providing evidence one way or the other or simply clarifying the linguistic principles, problems, and processes that the case involves (many examples to follow).

Varieties of forensic linguistics

The term “forensic linguistics” covers a wide variety of disciplines, some of them focused on the language of the courtroom (e.g., how the judge’s instructions to the jury are understood)… the veracity of witness statements… the meaning of legal interrogations.  Other sub-disciplines involve computers and statistics.

The qualitative analysis described here, although perhaps not as well known as handwriting and document analysis in the popular imagination or within the legal community, has a long and honorable history.

According to Gerald R. McMenamin (Forensic Linguistics: Advances in Forensic Stylistics, pp. 86-7),

…hundreds of studies – in the form of journal articles and books – have been done on style, stylistics, and questioned authorship. German studies of Old Testament authorship date back at least to the middle of the 19th century.

In addition, evidence has been presented in multiple court cases, and numerous judicial opinions have been documented based on evidence of forensic stylistics.

These cases date as far back as the 1728 Trial of William Hales in England and the 1846 Pate v. People case in the US.

Although I had been occasionally consulted by attorneys during my professor years, I decided to focus on forensic linguistics in 1998, right after Shakespeare prof Don Foster correctly identified the author of the anonymous novel Primary Colors.

Upon reading his book Author Unknown, I realized that he’s NOT doing forensic linguistics, but literary forensics – e.g., often speculating on what the writer may have been thinking or where he/she might have picked up a certain image or piece of information.


Ways of accessing the data


I stick to the data.  What the attorney is relying on is my experience with literally thousands of documents, examining and analyzing them with the tools, methods, techniques, and concepts of linguistics.

Thus, if I say that a certain grammatical feature is rare or indicative of a foreign language speaker, my statement is based on my extensive experience in such matters.


Of course, I don’t rely entirely on my brain – I do Internet research (very useful in establishing the linguistic status of a word or phrase in copyright/trademark infringement cases).  I occasionally use electronic plagiarism checkers.  I often consult multiple dictionaries to determine the conventional meaning of a word – in other words, I augment my brain in any way necessary, but the idea is always to stick to and interpret the data.


Thus, forensic linguistics, though it is not as sensational as other forensic disciplines that fire the popular imagination, can be highly revealing.


I have been consulted in two high-profile criminal cases: the so-called Son of Sam murders of the ’70s and the West Coast Zodiac killings of a roughly similar nature. These cases do not hinge on writing, but in the Son of Sam case, I opined that David Berkowitz was the author of more than one of the letters, and in the Zodiac case, that there was more than one author of the letters sent to various people.

Though I have yet to appear in a CSI episode, I have had some very interesting cases over the years.


Weight Watchers?


One of my earliest consultations had to do with an alleged trademark infringement by someone who wanted to call his restaurant the Weight Watcher’s Restaurant.

Of course, he was sued by the international organization that’s been around for many years.


What do you think?  Was there trademark infringement?


The phrase is a very common one, grammatically: object noun + agent noun, e.g., computer operator.  The key here is that weight watcher was not a phrase in the lexicon UNTIL it was trademarked.


So no, the gentleman is not allowed to call his restaurant “Weight Watcher’s.”

Some other examples:

  • Expert opinion on plagiarism of song lyrics (copyright litigation involving musical group The Who).
  • Authorship analysis of e-mails in Florida internal union dispute.
  • Expert opinion on plagiarism of online home-study course. Here the copier had tried to disguise his work by adding and rearranging material, but the fact that he had the same questions and the same multiple choice answers – well, that’s too much similarity to be accidental.
  • Authorship analysis of anonymous letters of complaint to a corporation’s Board of Directors.
  • Expert opinion on the semantics of trademark infringement in litigation by an apparel firm.
  • Authorship analysis of anonymous letters (possibly written by disgruntled employees) for major Midwestern corporation.
  • Authorship analysis of derogatory emails to website of a “cult deprogrammer.”
  • Expert opinion on linguistic similarities between plaintiff’s and defendant’s trademarks.
  • Authorship analysis of defamatory emails written to an individual in a corporation.
  • Authorship advice on a possibly forged stock transfer document.
  • Interpretation of contract language regarding the disposition of acquired corporate entities.
  • Interpretation of equipment rental contract language.  (I’ll discuss this one in more detail a little later.)


For specific legal areas to which qualitative stylistics is applicable, see When a Lawyer Needs a Linguist elsewhere in this collection of articles.


These cases break down into five categories:


(1) DOCUMENT INTERPRETATION.   I examine contracts, wills, or other binding documents.  I analyze specific words, phrases, clauses, sentences, and other units, including the entire document, to offer informed judgments on clarity, comprehensibility, and (un)ambiguity.


For example, a man suffered damages from defective rental equipment; he did not know that he had released the company from liability by signing a contract that was, in my judgment, too complex to understand.


(2) PLAGIARISM.  I examine texts to determine the likelihood of plagiarism.

Example: The creator of an online course found it stolen and being offered by someone else. On the basis of similarities in how the two courses phrased their questions, I helped substantiate his charges.


Some observations:


A plagiarism charge is often leveled as part of a fishing expedition or to “get” someone by impugning their honesty.


Plagiarism requires the intent to deceive – inept quoting doesn’t count. (Example: A college student was – inappropriately, in my view – accused of plagiarism because his footnoting mechanics were erratic.  He knew he had to attribute; he just didn’t do it perfectly.)  Somewhat less common is honest error: the person was unaware that he/she got an idea somewhere else.


Some cases of paraphrase may be evidence of plagiarism, though most plagiarists aren’t skilled enough to create complete paraphrases.  Others may be examples of independent creation (it does happen) or separate borrowing from the same source.


(3) PROTECTABILITY OF COPYRIGHT/TRADEMARK.  I help attorneys define the semantic points at issue, and I offer informed judgment on the genericity, specificity, and/or protectability of contested material. I also assess the similarity of names/marks to offer informed judgments in infringement litigation.


For example, by comparing phonetics, orthography and frequency of consonant clusters,  I demonstrated that another marketer’s brand name was, on several linguistic levels, similar to that of the Plaintiff.


(4) LIBEL. I evaluate the accuracy and appropriateness of language. In one case, a candidate’s campaign materials referred to his opponent as a “convicted racketeer,” and his lawyers produced complex legal precedents to show how the words could actually apply to the other politician, who was a “convicted racketeer” only by a most implausible stretch of the conventional meanings of those words.  I argued that the real issue is: how will the phrase be understood in context, by native speakers of English?


(5) AUTHORSHIP. I analyze two or more language samples — the grammar, lexicon, and other features (MANY other features) — to offer informed judgment on the authorship of anonymous, forged, or otherwise disputed documents.

I was once asked to give my opinion on the authorship of anonymous letters of complaint to a corporation’s Board of Directors. In one of the anonymous letters and in a sample from a suspect writer, I found the same feature – a feature which simply does not occur in American English. Unfortunately, such egregious clues are rare.

Typical questions:

  • Which of n authors wrote the anonymous document?
  • Did Person X write it (client typically supplies samples of X’s writing)?
  • What does the writing reveal about the writer?

I examine every possible similarity and difference at every level of language.

As you have no doubt noted, there are many ways to write the date and salutation in a letter.  This is just the beginning.  A text, such as an impromptu, minimally edited or unedited email, involves a maelstrom of choices, many of which have to be made simultaneously – what word to choose, how to spell it, how to write a date or time, whether to connect two sentences with and or simply end the first and start with a capital letter…the list goes on and on…and on.   One writer was identifiable because of her habit of referring to the addressee somewhere in every email, e.g., Bob, I really think…

Every one of these choices helps define the writer and reveals itself to the linguist’s eye.

Quantitative measurements

I may also do some basic calculations, e.g., “lexical density,” the ratio of words with dictionary meaning (as opposed to grammatical function words) to the total number of words.

This, as well as the percentage of unique words, is a measure of the sophistication of the writer’s vocabulary.

Words per sentence is another.
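
As a rough illustration of how such calculations might be done, here is a minimal Python sketch. It is not the procedure used in actual casework: the short function-word list, the name style_metrics, and the naive rule that a sentence ends at a period, question mark, or exclamation point are all assumptions made for the example.

```python
import re

# Assumption: a small, illustrative set of English function words.
# A real analysis would need a much fuller, carefully chosen list.
FUNCTION_WORDS = frozenset("""
a an the and or but if than that which who whom whose this these those
i you he she it we they me him her us them my your his its our their
am is are was were be been being have has had do does did
will would shall should can could may might must
in on at by for from of to with as not no so
""".split())


def style_metrics(text):
    """Return three simple style measures for a text sample."""
    # Naive sentence split on ., ! and ? (abbreviations will mislead it).
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    # Lowercased words; allow one internal apostrophe (straight or curly).
    words = re.findall(r"[a-z]+(?:['’][a-z]+)?", text.lower())
    if not words or not sentences:
        raise ValueError("text contains no words")

    content_words = [w for w in words if w not in FUNCTION_WORDS]
    return {
        # Words with dictionary meaning divided by total words.
        "lexical_density": len(content_words) / len(words),
        # Distinct words divided by total words.
        "unique_word_ratio": len(set(words)) / len(words),
        # Average sentence length in words.
        "words_per_sentence": len(words) / len(sentences),
    }


if __name__ == "__main__":
    sample = "The cat sat on the mat. The dog barked!"
    for name, value in style_metrics(sample).items():
        print(f"{name}: {value:.2f}")
```

Figures like these mean little in isolation; they become useful only when the same measures are compared across a questioned document and known writing samples of similar length and type.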

There’s a host of other features – misspellings, apostrophe usage, peculiarities of idiom and vocabulary (one writer used PUKE as noun, verb and adjective), and much more.

I also like to look at the author’s methods for demarcating sentences, whether sentence-break (and I note how the break is punctuated), conjunction (number of different conjunctions is also of interest), or subordination (the most sophisticated).

Why does forensic stylistics work?

A forensic linguist must be exquisitely sensitive to nuances of text.  Where a synonym exists, the very choice of a word represents a decision on the part of the author.  Superimposed upon that is the way the word is spelled, hyphenated, abbreviated or capitalized. Truly, a text is a tangle of choices made on the fly.

Sometimes these choices are both involuntary and strikingly obvious.

Forensic stylistics works because linguists have learned to examine and describe language at a level of exactitude unimaginable to the layperson – who is generally not even aware of his/her linguistic habits at all. That’s why I never tell anyone what features I look for – unless it’s the attorney for whom I’m writing a report.

In the past 20 years, I have encountered an astonishing number of ways in which people can come into conflict and practice deception through language.  That is one reason why they resort to lawyers.

They are willing to pay to find out who wrote that nasty note.  They don’t even need certainty – they are often willing to settle for “maybe,” if it supports their case.

One thing has remained constant, for me: I am one of the lucky few people in the world who are actually happy to hear from lawyers.