Names can be such fickle things.  A couple of years ago, I published a paper, and in the acknowledgements, thanked a colleague: “ … and J. Smith provided valuable comments.”  I was surprised to hear back from the copy editor before receiving the proofs, demanding that I provide Ms. Smith’s middle initial, as I had for all the other people in the acknowledgements.  I explained that Ms. Smith did not have a middle name (a rare, but not unheard of case), whereupon I was told that her first name would have to be spelled out in its entirety.  The acknowledgements then listed people like this: “A.B. Jones, C.D. Williams, Jane Smith, and W.B. Yeates”.  Quite odd, to say the least.

In last week’s Nature, Antoine Louchart pointed out that French aristocratic names are often misinterpreted by non-Francophones.  Names like “Pierre Teilhard de Chardin”, he points out, are often cited as “de Chardin”, rather than “Teilhard de Chardin”.  And in an age where citation counts (for good or ill) are viewed as a metric worth noting by grant, tenure, and promotion committees, that can be a problem.

But rather than rant about the problems with counting citations (both in the way it’s done, and why it’s done in the first place), I thought I’d talk about names, since the phenomenon described by Louchart isn’t unique to les Français(es).

An obvious comparison is Spanish (and Portuguese).  A researcher named “Alejandro Martínez de Silva” could be shortened to “A.M. Silva”, “A.M. de Silva” or “A. Martínez de Silva”.  The last of these is the correct version, as “Martínez de Silva” is the last name.

The same can be said of Portuguese, where “Alex da Sousa” would be “da Sousa, A.”, and “Alex Rodrigues da Sousa” would be “Rodrigues da Sousa, A.”.

In Dutch, things are fairly straightforward.  Jan van Smith would usually be abbreviated “J. van Smith”, and sorted under “v”, but this is by no means universal (think Ludwig van Beethoven).  I’ve been informed by a reliable source that, in many cases where Dutch families move to English-speaking countries, the spaces in a name can disappear, resulting in names like “van den Berg” changing to “Vandenberg” (below, I also discuss the Anglicisation of Spanish names).

In Russian, another issue pops up – some Russian letters don’t have a single Roman equivalent.  That’s why we have Nikolai and Nikolay – the same name, just different transliteration.  But the Russian letter Я, for example, is the equivalent of “Ya”, so often gets abbreviated to “Y” (though the Russian for “Y” is “У”, which is entirely different).  Same goes for “Ю”, which is pronounced “Yu”.  So that’s three letters all Anglicised to “Y”.  This is undoubtedly the case whenever a non-Roman alphabet is used.

And these are just a few examples.  Think, too, about diacritical marks (accents: the French “e” is different from é, è, ê, and ë), or the German letter “ß” (pronounced “sz”, and which isn’t really a letter, but a ligature of a long s, “ſ”, and “z”)*.  There’s also the habit of replacing “ü” with “ue” (Müller to Mueller) in English.  And then throw in “ø” or “å” in Scandinavia, or the Polish “Ł”, to say nothing of informal names that can cause initial confusion (Edward/Ted, Anthony/Tony, William/Bill), or everyone’s favourite: the voiceless postalveolar fricative in several languages in eastern Europe and the Balkans (better known as “Š”, pronounced as “sh”), and its cousin, the voiceless postalveolar affricate consonant (“Č”, pronounced as “ch” like in “chocolate”).
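To see why these characters make matching author names awkward, here is a minimal Python sketch (my own illustration, not taken from any bibliographic tool) of the common shortcut of stripping accents via Unicode decomposition.  The function name strip_diacritics is hypothetical.

```python
import unicodedata


def strip_diacritics(text: str) -> str:
    """Drop combining marks after canonical decomposition (NFD).

    This is a naive shortcut: it only affects letters that Unicode
    decomposes into a base letter plus a combining accent.
    """
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


for name in ["Müller", "Martínez", "Šimić", "Łukasz", "Søren", "Weiß"]:
    print(name, "->", strip_diacritics(name))
```

Accented letters such as “ü”, “í”, “Š” and “ć” lose their marks (so “Müller” becomes “Muller”, not “Mueller”), whereas “Ł”, “ø” and “ß” pass through unchanged because they are not a base letter plus an accent.  Either way, the same person can end up under several spellings.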

And, as pointed out in 2008 by Nalini Puniamoorthy (or rather, by Naliini), many south Indian authors have only one name.

Given the increased Anglicisation of research, what’s the solution?  Authors could indicate their “last name” using capital letters when they submit a manuscript (e.g., “Alejandro MARTÍNEZ DE SILVA”), and journals could include a recommended citation on the first page of the journal article (or make sure the names are properly arranged in cases where readers can download a citation for common bibliographic software).
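As a rough illustration of how a journal system might read the capital-letter convention, here is a small Python sketch.  The function name split_marked_name is hypothetical, and it assumes given names are written out in ordinary mixed case (an initial such as “J.” would be mistaken for part of the last name).

```python
def split_marked_name(name: str) -> tuple[str, str]:
    """Split a name whose family name is written entirely in capitals.

    Returns (given_names, family_name). Assumes given names are spelled
    out in mixed case; this is an illustrative convention, not a standard.
    """
    given, family = [], []
    for token in name.split():
        # Tokens written wholly in upper case belong to the family name.
        if token.isupper():
            family.append(token)
        else:
            given.append(token)
    return " ".join(given), " ".join(family)


print(split_marked_name("Alejandro MARTÍNEZ DE SILVA"))
```

With this convention the particle “DE” stays attached to the last name, which is exactly the part that tends to get lost.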

In programs like Endnote, this can be dealt with by using the format “Lastname, Firstname” – the comma separates the two.  If I enter “Alejandro Martínez de Silva”, then Endnote will just assume that the last word is the last name, and I’ll get “Silva, A.M.d.”.  But if I enter “Martínez de Silva, Alejandro” then Endnote will produce “Martínez de Silva, A.”.**
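The behaviour described above can be mimicked in a few lines of Python.  This is only a sketch of the heuristic as I’ve described it, not Endnote’s actual code, and the function names split_name and cite are my own.

```python
def split_name(entry: str) -> tuple[str, str]:
    """Return (last_name, initials) for one author entry.

    With a comma, everything before it is the last name.
    Without one, the final word is assumed to be the last name.
    """
    if "," in entry:
        last, _, given = entry.partition(",")
        last = last.strip()
        given_parts = given.split()
    else:
        *given_parts, last = entry.split()
    # Every remaining word is reduced to its first letter plus a full stop.
    initials = "".join(part[0] + "." for part in given_parts)
    return last, initials


def cite(entry: str) -> str:
    """Format an entry as 'Lastname, Initials' (or just 'Lastname')."""
    last, initials = split_name(entry)
    return f"{last}, {initials}" if initials else last


print(cite("Alejandro Martínez de Silva"))   # Silva, A.M.d.
print(cite("Martínez de Silva, Alejandro"))  # Martínez de Silva, A.
```

An entry that ends in a comma is treated as a last name with no initials at all.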

An important part of keeping this all straight is being consistent with how you wish to be identified professionally.  I made a conscious decision to be listed as my full first name, middle initial, and last name on all my publications, even though I go by an abbreviated form of my first name in everyday life.

Lastly, I think there are two issues at play here: how non-Roman names (or non-English forms) are transliterated/translated, and how those names are shortened / cited / abbreviated.  In Spain, colleagues of mine have dropped one of their first names, or one of their last names, opting to abbreviate their name professionally (like the aforementioned Mr. Martínez de Silva going by “Alejandro Martínez”, or a researcher named “Maria Manuela Jiminez Alonso” dropping “Maria”, and hyphenating Jiminez-Alonso).  The situation is also tricky for Chinese, Japanese, Korean, or Arab researchers (at least Russian and other Cyrillic languages map pretty closely to a Roman alphabet with a few exceptions, as I pointed out above).

You can also check out this post at The Scholarly Kitchen that implores authors and journals to use full names.  But as one of the commenters pointed out, there can be a sex (and perhaps even race) bias in evaluating the quality of a paper based on the perceived race and gender of the author.  Some believe that the solution to author ambiguity is to use an ORCID, much like most articles have a DOI.

So, as others have pointed out, a common researcher ID, equivalent to a Digital Object Identifier (DOI), is probably the way to go.  Such a system would need to be flexible enough to accommodate all the various issues I’ve pointed out above, and those raised by others.  But given that others have called for this approach since at least 2008, how close are we?



Thanks to various Spanish (DXS), Portuguese (PLM), and Dutch (A-LK) colleagues for clarifying the intricacies of names in their respective languages.

*The long s (ſ) is often mistaken for a lowercase “f” in older English texts, in the same way as “Ye” is a corruption of “þe” (with a thorn), an abbreviation for “the”, and not pronounced with a modern y-sound at all.  In fact, þ is still used in Iceland, like in the last name Þorisson (“Thorisson”).

**Note: this trick can be used for “corporate authors” – entering “Government of Canada,” (i.e., with a comma at the end) fools Endnote into thinking that the “last name” is “Government of Canada” and that there’s no first initial.

Fun fact: in Spanish, the accent doesn’t imply a different sound as it does in French, but indicates the stressed syllable.  Same for the apostrophes that can crop up in transliterated Russian names (e.g., the famous Russian ornithologist, and author of “Birds of the USSR” G.P. Dementiev is often written as “G.P. Dement’ev”, pronounced “deMENTyev”), where it replaces ь, the “soft sign”.