UPDATE: Be sure to read the comments below, and my response
As a newly-minted PhD student, I was talking with a friend about writing papers. “Use LaTeX”, he said. I thought he meant the rubbery material commonly found in lab gloves. But apparently not. LaTeX (pronounced “lay-tech”) is typesetting software that he used for writing papers.
Eager to be on the cutting edge of scholarship, I spent a few days learning how LaTeX worked, how to insert symbols, figures, and tables. I even produced my thesis proposal with it. But my supervisor used Word exclusively, and I had no compelling reason to use LaTeX over Word, so I switched back.
Fast-forward a few years. Now, everyone should be using markdown in a plain text editor, doing statistics in R, uploading versions to github or figshare, and managing citations with JabRef, BibTex or Mendeley. Apparently, Word, Excel, Endnote, and SPSS are things of the past. Special sessions at the 2013 Ecological Society of America meeting seem to be the nail in the proverbial coffin. Some are even calling these new tools essential pieces of software for students.
There is a movement afoot to move the process of writing science out of Microsoft Word, and into other “better” formats like LaTeX, or Markdown with the argument that “researchers shouldn’t waste time on formatting, just the text of what they’re writing”. They can then keep version control using something like GitHub, and invite collaborators to do the same. This also keeps science open, since scientists aren’t beholden to a proprietary file format.
But in my mind, there are two arguments: the practical (A is tangibly better than B), and the philosophical (A is better than B because of ethical, moral, or philosophical reasons). These are both important discussions to have, but in this post, I’m going to focus on the first.
Learning Curve
I’ve used Word for my typing needs since about 1997 (prior to which, I used Clarisworks, and WordPerfect, two functionally similar programs). I know how to easily insert commonly used non-Roman letter symbols (like β), and most of my work (>95%) doesn’t extend beyond simple mathematical symbols or diacritical marks (like ±, Σ, or é). I use minimal formatting in Word (bold, italics, line numbers, maybe changing the font size of the title), and after almost 20 years, I’ve gotten pretty good at Ctrl-B (or, in the last 10 years, command-B).
Coauthor inertia
The vast majority of my work is collaborative to some degree. Whether it’s a supervisor or boss, or a larger group of other researchers, someone’s going to read, comment on, revise, and critique any paper I write before it goes to the journal. Word is ubiquitous, while these other methods are not. And like me, my coauthors are most familiar with Word, and use its Track Changes feature to make suggestions, comment on text, and insert their own edits.
Reference integration
This is really the deal-breaker for me. Since 2005, I’ve used Endnote to manage my reference papers, and I use the “Cite While You Write” feature in every paper. Basically, this means I can write something like “Birds have feathers, and can fly (Gill 2007)”, and Endnote will drop the full citation (in the specified format) in the Literature Cited section. How cool is that? It also makes reformatting for different journals relatively easy. Yes, there are other types of programs that can do that for you (e.g., BibTex), but there’s a learning curve, and many hours updating citation keys so that there aren’t 4 “Jones2007”s.
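The duplicate-key chore mentioned above is the kind of thing a short script can take over. What follows is only a minimal sketch in Python, not how BibTeX, JabRef, or Endnote handle keys: the function name `unique_keys` and the sample references are made up for illustration, and it assumes keys are built as AuthorYear with a letter suffix added only where two keys would collide.

```python
from collections import Counter
from string import ascii_lowercase


def unique_keys(refs):
    """Return one citation key per (author, year) pair.

    A letter suffix (a, b, c, ...) is added only where the plain
    AuthorYear key would otherwise occur more than once.
    Hypothetical helper: supports at most 26 duplicates per key.
    """
    base = [f"{author}{year}" for author, year in refs]
    totals = Counter(base)  # how often each plain key occurs overall
    seen = Counter()        # how many suffixed copies handed out so far
    keys = []
    for key in base:
        if totals[key] == 1:
            keys.append(key)
        else:
            keys.append(key + ascii_lowercase[seen[key]])
            seen[key] += 1
    return keys


# Made-up reference list with four colliding "Jones2007" entries
refs = [("Jones", 2007), ("Gill", 2007), ("Jones", 2007),
        ("Jones", 2007), ("Jones", 2007)]
keys = unique_keys(refs)
print(keys)
```

The suffixes follow the order of the input list, so sorting the references first (by title, say) would make the keys stable between runs.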
Cost & Access
Word (and to a lesser extent, Endnote) are readily available at most research organizations, or are relatively cheaply obtained (let’s say a maximum of $200). If you want to keep your projects private, GitHub will run you $7/month (or about $200 over 2 years), while the rest are free. Word and Endnote are perpetual licenses. True, universities and research organizations pay for these, but it’s unlikely that will change since the programs are used by non-academic staff, too.
Academic hipsters
The following was just tweeted from the 2013 ESA conference
Do it MT @ucfagls @recology_ : “throw away MS Word and pick up Markdown” – great advice in the reproducible research workshop #ESA2013
— Andrew MacDonald (@polesasunder) August 5, 2013
@thelabandfield But some of us want to have reproducible research so embed R or Python for the analysis in paper @polesasunder @recology_
— Gavin Simpson (@ucfagls) August 5, 2013
The implication, whether intended or not, is that those of us still using Word aren’t doing reproducible research.
Now before folks get their open sources all in a knot, I’m not just being a Luddite. I use R regularly. I’ve also used LaTeX for one manuscript. I’m not advocating against using any of these tools if they’re the right tools for the job. What I’m saying is don’t use them for the sake of using them–a form of what I could call academic hipsterism.
Feel like I should write an R package. I don’t have anything that needs doing, it just feels like it’s what all the cool kids are doing now.
— Steven Hamblin (@BehavEcology) August 5, 2013
Case in point.
My experience with other early-career researchers, collaborators, supervisors, and grad students is that 99% of them will keep their data in Excel, write the manuscript in Word, and some will integrate references using Endnote (important point: the same applies to non-Microsoft products like Apple’s Pages and Numbers, OpenOffice etc.).
And for a good chunk of the statistical analyses I do, or that are in papers I read, review, and co-author, it doesn’t matter if they were done in R, or SPSS, or SAS, or Minitab, or JMP, or many other common statistical programs.
Are there issues with all these pieces of software? Yes. Are there issues with any piece of software? Yes. Has a manuscript in ecology/zoology been rejected because the authors used a particular program to compose their text? I don’t think so.
Jeremy Fox at Dynamic Ecology wrote about how he keeps on top of the literature. His point was that his system works for him, and yes, there are other systems out there. The interface that I set up on my computer between Word and Endnote when I started my MSc aeons ago still works for me. It also works with my coauthors, all of whom use Word as a primary text editor for manuscripts, and it works for journals, all of which accept submissions in Word format, or the easily-generated PDF.
Are tools like markdown, LaTeX, and github useful? To some, they are. But they’re not yet useful to me. If they look useful to you, check them out – they just may be. But don’t feel beholden to adopt the latest software trend.
30 years ago, John Wiens wrote in The Auk on the perils of word processors:
Has word processing improved how science is disseminated? Of course. Perhaps we could say the same for the current crop of new tools in manuscript writing and statistics. But not for me, at least not yet.
I’m not saying these new pieces of software are terrible and useless. I’m saying that I’m not inclined to use them because I don’t see how they are materially better than my current system. Sometimes, it seems like the argument from the non-Word proponents is that “our way is better than yours in every case” (see the tweets quoted above), which isn’t the case.
For what it’s worth, I’m going to have a lengthy skype chat with Andrew MacDonald later this month about the advantages of Markdown, and integrating it with BibTex. I might even try it. I’ll let you all know how it goes.
— — —
As a quick note, I’m off to the Society of Canadian Ornithologists meeting in Winnipeg, and won’t be as quick to approve new commenters, or respond to comments. Thanks for your patience. -AB
manuelinor said:
Great post! I’ve been grumbling various monologues of this nature to myself for a while and was planning a post about it too! I personally don’t care who uses what program for what, but I do think we should be able to choose what we want to use. As you say, what works for one person might not work for another – everyone has different levels of technological capability, and “Technology” really isn’t an essential part of some scientific disciplines…and it definitely shouldn’t become an indicator of the “quality” of a researcher or their research output.
Falko Buschke said:
Of course, I use MSWord, but I do so ironically. I get all the benefits of the easy-to-use interface while still shunning the mainstream: hipster-heaven!
But seriously, I can only imagine that LaTeX is more efficient if the time you spend formatting your text cuts into your writing time. This isn’t true for me. I spend most of my time staring at a blinking cursor while paralysed by writer’s block… if LaTeX can help with that, then I’ll make the switch immediately.
Lastly, thanks for pointing out the correct pronunciation of LaTeX. Now I’ll be able to sound knowledgeable when I’m talking to the other cool kids.
Falko Buschke said:
Also, I noticed this on the Instruction to Authors page for Oikos:
We as well as reviewers have problems in handling LaTex files, please avoid this format.
Tim Lucas said:
The problem with the combination of ‘use what works for you’ and ‘try and use what your collaborators use’ is that you end up with fairly unavoidable inertia. I agree there are pros and cons to all software, but even if one piece of software turned up that was better across the board, this inertia would still make uptake of the software extremely difficult.
Two more specific points. In reply to Falko Buschke, LaTeX can easily produce a pdf. If reviewers can’t handle a pdf then there’s an issue. Oikos not being able to handle the .tex file is more understandable but at the end of the day they are a publisher and possibly we could expect them to try and stay up to date with changes in software.
Finally, some even go so far as to pronounce it LARtech, but that’s just silly.
Alex Bond said:
Thanks for pointing out the .tex -> .pdf issue. I guess it depends on the journal/publisher? Perhaps journals pay more to be able to accept .tex files? Ah, the mysterious world of journal-publisher arrangements. Small society journals, like those I’m involved with editorially, don’t have the capacity to take .tex files, so .doc(x) and .pdf it is.
The challenge is defining what’s “better across the board” because that implies that it’s recognized as such. I don’t see Markdown/Pandoc as better than Word, just like I don’t see Word as better than Markdown/Pandoc – they’re just different means to the same end (hopefully a published paper!).
Robert Flight said:
I agree, people should use what works for them. However, given the way analyses have been done (data into Excel, hand manipulate, copy paste results to Word, etc), and the interest in making papers more reproducible and keeping errors to a minimum, other approaches can be very useful.
For example, the combination of R and Markdown or LaTeX makes it much easier to insert the results of calculations (whether values, tables, or figures) directly into the text, without copying and pasting. If there is an error in the programming / scripting, it is possible to go back, correct it, and update the manuscript relatively easily, vs going through the manuscript and manually changing every instance. This is where these other methods really start to matter to me. If collaborators are using Word, Pandoc can do some nice conversion of HTML to Word as well.
In my own work, we had actually done a conference submission in Word, and I decided to convert it to R / Markdown to see how hard it was (actually wasn’t really too bad). In the process, I discovered at least 2 errors in numbers that had been misquoted in going from data analysis to writing.
Finally, systems like Github with a text submission are enabling open peer review that looks like this: https://github.com/cwcon/push/pull/2
ucfagls said:
OK, I’m going to push back on this, as one of the “hipsters” you refer to. (This is the *first* time I’ve ever been linked with the word “hip”!)
You read into my reproducibility tweet something that wasn’t there – I was specifically referring to the “us” among the community that want to do reproducible research in such a way that embedding our code/data analysis makes sense for us. You read into the tweet all that rubbish about how if you aren’t using new tool X you aren’t doing reproducible research. That’s on you. Clearly there are scales of reproducibility which extend from documenting methods and experiments via notes in your lab notebook, through the data input, manipulation and analysis stage, through to presentation. However, there are workflows that are more immediately reproducible than others; we have to recognise this, and that it would be *fantastic* (if somewhat unrealistically idealistic) if we all used these more efficient workflows or means to achieve reproducibility.
Your suggestion of placing code to reproduce a paper’s results in supplementary materials is fine. I don’t think it is the best way to do this but I’d far rather see people do that than not show anything at all.
The problem that you don’t seem to appreciate is that to do what you are doing (or suggesting) is inefficient. I *do* know how to use these other tools and they *do* allow me to be more efficient in my workflows. And that in turn allows those that also know how to use these tools to be equally efficient in working with my research outputs. By embedding the data analysis code in my papers or blog posts or whatever, I avoid the need to go through my text replacing results, tables, and figures and uploading a new R script, just because I tweaked something in the analysis or my data changed. You *would* have to do all that if you followed your current workflow. I find that inefficient and error-prone, in my experience(!) and I’m very happy that I have the tools, and know how to use them, that allow me to be more efficient.
If one doesn’t know how to use these tools or even that these tools exist, it is very easy to view them as akin to magic. When I, and others I interact with, tweet or blog about our workflows, we are *not* doing it to beat up on our colleagues for their using “old fashioned” approaches. We are doing it to try to demystify the process, to dispel this idea that using these tools or workflows is hard. And yes, we’re trying to advocate for tools and workflows so that colleagues know they exist. I fully agree with your point about using what works for you. But if you are never exposed to new ideas or tools or workflows how will you ever be informed enough to find out “what *really* works for you”?
OK, so much for my push back on the “evangelism” aspects of your post. Now to the inaccuracies. Somewhere along the line using MS Word or EndNote involves a cost, and considerably more than you estimate. I’d be very surprised if the total cost to you *and* *your* institution for using Word *and* EndNote is $200. The perpetuity issue you raise is wrong. How many of us are still using Windows 3.1, or 95, or Office XP (the last version I bought myself as a student licence a long while ago)? Try working today with Office XP when your colleagues are using Office 2010 or 2013 and sending you docx files. Even if you can get Microsoft’s add-on to read these new, native formats in old versions of Word, as soon as one of your colleagues uses one of the new Office features you are screwed. Buying into the whole Word ecosystem locks you into a way of working that implies that you and your colleagues/collaborators need to regularly spend $100s just to keep up with the Joneses. Whilst you might have a licence to use your copy of Office XP indefinitely, in a practical sense “indefinitely” only extends a few years. Add on top the fact that Word and EndNote aren’t that cheap once you are no longer a student or lucky enough to work at an institution that can negotiate and pay for a Campus Agreement or similar “site” licence and your argument on cost is looking decidedly dodgy.
EndNote may work for you, but what about the person using ReferenceManager? Or Mendeley? Or Papers? None of these things works well together; I know, I’ve worked with lead authors using EndNote that leave a pile of gibberish in fieldcodes when I view those files in Word (or latterly LibreOffice).
I have no idea where the issue of rejection came from! No-one I’ve ever spoken to about these tools has, to the best of my knowledge, ever intimated that we should reject a paper for not using LaTeX or MD. This is just plain FUD and certainly not something relevant to this discussion nor the tweets that sparked your post.
Finally, I’d really love it if you would point out in the discussion that sparked your post where I or Andrew or someone else said “our way is better than yours in every case”. I’m pushing back, at length (sorry, I know!), because you’ve misrepresented my position on these new OpenScience tools, conflated a whole pile of issues (many of which I’m not addressing here), and dragged in FUD which just muddies the waters, (potentially) antagonises people, *and* which presents to your readers a real misrepresentation of what we OpenScience/Open Source advocates are trying to convey.
I like to view my advocating OpenScience as similar to Maslow’s Hammer; if the only tool you have is a hammer, everything looks like a nail. When we advocate for these new tools and ways of working, we’re trying to make people aware of the whole array of tools out there that could be in one’s individual toolbox. Far from advocating a “one workflow for all”, I see what we do as actively promoting a heterogeneous ecosystem of workflows. Importantly, however, I see this as an ecosystem of tools and workflows that fosters interoperation and easy, efficient collaboration, something that Office- or EndNote-use doesn’t and, one could reasonably argue, actively operates against.
Alex Bond said:
Thanks for the great & detailed response. Sorry if I’ve mis-perceived some things, but perception is half the battle, no?
I think that Markdown and LaTeX look like fantastic tools. Sorry if I read into your tweet something that wasn’t there, but that’s how it came across (and how some others have felt), and was one in a number of similarly-interpreted messages I’ve heard on this in the last few months.
Much of the work I do involves some pretty basic stats (linear models). If there were a way to link the F-ratios and p-values from my lm() or glm() command in R into the text of a manuscript, I’m all ears! Hopefully Andrew and I can discuss this when we chat.
I’m fortunate enough (or perhaps not, depending on one’s perspective) to work somewhere where I’ve been issued a work computer with Word and Endnote. The decision to purchase these pieces of software for all staff (in the case of Word) is many (many, many) levels above my pay grade. The influence of a temporary pseudo-employee who won’t be here in 3-5 years (or at least I hope so!) to change organization-wide computing is small. So Word/Endnote still work for me. As does having someone “manage” the ms through Endnote. Otherwise, as you’ve pointed out, chaos can ensue.
The very first post I wrote for L&F was on whether the species or study site should drive research questions. By analogy, if I have a ms that has lots of equations or maths in it, I’ll switch over to LaTeX (much to the annoyance of 99% of my coauthors). I seek out tools when I encounter problems. That’s how I started using (and becoming proficient at) R, for example. Ditto for Endnote. Like it or not, new pieces of software (especially command-line) are intimidating for many, and I’ve had some sour experiences when these people (a younger me included) encountered R whizzes and LaTeX gurus.
Making the products of scientific inquiry (manuscripts, data, analyses) accessible and archivable is important, I agree. In a sense, one could argue that the best way to do so is for everything to be printed physically on paper and deposited in physical archives. But for me (and for some others, too), this need for accessibility is balanced against the time already invested in reference databases, learning current software, dealing with collaborators. If there is a way to manage data, write manuscripts, deal with inline citations and bibliographies from my existing database AND keep coauthors content when reviewing manuscripts, then count me in!
Why don’t we have a chat when I’m back, and see if we can turn my workflow into something outside Word/Endnote? Then I can truly compare the two
Yihui said:
It is easy to link the F-ratios and P-values in your manuscript to lm()/glm() in R, as long as you stay with plain-text document formats like LaTeX, Markdown, and HTML, etc. Give a man a number, he will live for a day; give him a dynamic document containing code, he will live forever (I’m a little bit exaggerating). That is the problem knitr (also previously Sweave) tries to solve.
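To make the idea concrete without relying on knitr itself, here is a rough sketch in Python rather than R. It is not knitr's API; it only mimics the principle that the number in the sentence is computed from the data each time the document is built. The function `f_ratio`, the variable names, and the data are all invented for illustration. It handles only a simple linear regression with one predictor, and it leaves out the p-value because the Python standard library has no F distribution.

```python
def f_ratio(x, y):
    """F-ratio and degrees of freedom for a simple linear regression of y on x.

    Hypothetical helper for illustration; assumes one predictor and n > 2.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual, total, and regression sums of squares
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    ss_reg = ss_tot - ss_res
    df_reg, df_res = 1, n - 2
    return (ss_reg / df_reg) / (ss_res / df_res), df_reg, df_res


# Made-up data, not from any real study
wing = [1.0, 2.0, 3.0, 4.0, 5.0]
mass = [2.0, 4.1, 5.9, 8.2, 9.8]

f, df1, df2 = f_ratio(wing, mass)

# The manuscript sentence is generated from the analysis, never typed by hand
sentence = f"Mass increased with wing length (F{df1},{df2} = {f:.1f})."
print(sentence)
```

If the data change, rerunning the script rewrites the sentence, which is the property being described here: nothing has to be copied from a statistics window into the text.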
ucfagls said:
Thanks for the reply Alex.
The point about the level of statistical analyses in the paper or their complexity is a red herring. Even if I wrote a paper with no or very little stats, I would still want to automate the inclusion of whatever data summaries or simple stats there were. Or even just the figures; it is liberating to draft publication quality figures in R. Once you have code that describes any analyses, data presentational elements and/or figures, you save yourself lots of time when things change or need to be tweaked; efficiency also extends to not missing updating a figure or table or critical statistic in the text. Whatever the level of statistical analysis in your work, you can benefit from automation and reproducible research.
I do agree that individuals need to choose the tools and workflows that work best for them, but those same individuals need to be sufficiently informed of the available tools and their pros and cons (there is a certain level of retraining needed if you don’t know R, LaTeX, Markdown, git etc., though you don’t need to learn all these at once to start benefiting from individual tools). They also need to be afforded training opportunities so they can make better judgements about what to use.
Andy Farke said:
I mostly agree, particularly with Word vs. LaTeX. . .in that case, it just doesn’t make a lot of sense for our field. My spouse, who is a physicist, of course disagrees…but the culture in her world is quite different from that in evolutionary biology.
As for R, I would strongly encourage everyone to learn it. At the end of the day, R just has a lot of stuff relevant for biologists that other programs don’t have, particularly when it comes to dealing with phylogenetic effects. Where LaTeX in bio is a little susceptible to hipsterism, I think R mostly escapes this problem (mostly).
Alex Bond said:
One of the biggest merits of R in my books is its pedagogical usefulness. I understood the statistics I was running better when I ran them in R.
Carl Boettiger (@cboettig) said:
Nice perspective, but the same could be said of most methods or technology sections — I don’t need these fancy statistical methods because I don’t work with that kind of data. I don’t need the data management workshops because I have a system that works for me. I don’t need supercomputers because my laptop is good enough for my science. If what you have works for you, perhaps you are not the target audience of these workshops and sessions?
I agree entirely that it would be silly to adopt these tools simply to be “hip”. But it is just as disingenuous to characterize the motivation of those using or seeking to learn these tools as being “hipster” rather than as solving problems you don’t have.
Florian Hartig said:
About the references: JabRef will import your Endnote database in no time and you can use all your references in LaTeX in exactly the same way as in Endnote, including auto-completing and popping up the full citation (assuming you use a proper editor).
Other than that, I think you build up an old/new divide that doesn’t reflect very well what’s going on in reality with these two types of software. LaTeX is as old as Microsoft Word, they have coexisted for a long time, and they will probably continue to do so, because they do different things. No one claimed that you WRITE better with LaTeX than with Word (except for formulas), but LaTeX produces a publishing quality output, and Word doesn’t, it’s as simple as that. In the case of a journal publication, someone (=society) pays a commercial publisher a few thousand dollars to convert the Word file into a proper-looking text again, so there is no immediate problem for you, but what about lecture notes, books, etc.? LaTeX has been an invaluable asset for the scientific community to produce quality publications cheaply, including innumerable PhD theses, so branding this as a “hipster” program really seems to miss the point to me.
Pingback: Academic hipsters redux (and why open science is like going to the dentist) | The Lab and Field
weberc2 said:
I wonder how many people are actually using Markdown because it’s “trendy”. My guess is there are a lot of people using it for a lot of very good reasons, but a lot of get-off-my-lawn types are choosing to ignore that possibility because confronting it might mean needing to learn a new technology (even one as simple as Markdown).
And there are a lot of very good, practical reasons to use Markdown over Word.
* Word isn’t human-or-machine readable; Markdown is both.
* While Word is inexpensive and widely available, plain text is even less expensive and completely ubiquitous.
* Every version control software with which I am familiar supports plain text contextually; only a small subset of proprietary version control systems support Word as more than a binary format (hugely important if you’re collaborating–otherwise your merge choices will be exclusively “mine” or “theirs”).
* Markdown can be automatically rendered onto the web; Word pretty much requires copy/paste and lots of manual reformatting. But maybe some people like downloading their blog entries as docx or pdfs?
* Markdown can be [easily] converted to Word formats. I don’t know of any Word->Markdown converters, and if any exist, I would be very suspicious of their quality, particularly for documents with complex formatting, etc.
* I already know Word.
* [a link](url)
* `preformatting`
* `* an unordered list element`
* `1. an ordered list element`
* `**bold**` and `_italic_` text
* ^ congrats, you now know Markdown as well
I could probably go on. The question isn’t, “What requirements should convince me to use Markdown instead of Word”; the simplest, least-restrictive format should be used by default unless you have some requirement that it can’t easily meet.
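The version control point is easy to see in a small example. This is only an illustrative sketch in Python using the standard library's `difflib`, not what git or any particular version control system runs internally; the two drafts and the file names `mine.md` and `theirs.md` are made up.

```python
import difflib

# Two versions of the same plain-text Markdown draft (invented content)
draft_v1 = """# Methods

We sampled 40 nests in 2012.
Eggs were measured with calipers.
""".splitlines()

draft_v2 = """# Methods

We sampled 42 nests in 2012.
Eggs were measured with calipers.
""".splitlines()

# A line-by-line comparison: only the changed sentence is flagged,
# with the unchanged lines around it shown as context
diff = list(difflib.unified_diff(
    draft_v1, draft_v2,
    fromfile="mine.md", tofile="theirs.md",
    lineterm="",
))
print("\n".join(diff))
```

Because the drafts are plain text, the comparison can single out the one sentence that changed. With a binary file, a tool can generally only report that the two files differ, which is where the “mine” or “theirs” choice comes from.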
TBOU said:
Could not agree more. Honestly, I spend a large bulk of my time programming research (Engineering and Stats). I don’t see the appeal of LaTex or desire for it. Some claim that it looks better, but I’ve found it either looks the same as word or, more frequently, much much worse (grey and eye straining fonts, tables in places I don’t want them and can’t move them).
In my classes (grad level engineering statistics) I won’t accept it for assignments (MS Word or a 0 – stated on day 1).
We sometimes get students who write their thesis and a draft journal paper in Latex. Well, the problem is they leave and then want to collaborate. Our director (my boss) doesn’t know LaTex, I don’t either, and so to collaborate we either have to use Adobe popup boxes to send comments (which are easily not seen), write out and scan, rely on the student (doesn’t always happen), or copy and paste into word (then we’re good to go).
We had one professor in the department that was a die hard LatEx fan. He was smugly writing up his brilliance in one journal paper (in LaTex of course…), only to find out that the top tier journal he wanted to send it didn’t accept LaTex. Man-o-man that was a fun day for us…
John Klein said:
“grey and eye straining fonts, tables in places I don’t want them and can’t move them”
When I use MS Word my tables fly all over the place! I hit return in one area and the whole document shifts around. Talk about frustrating.
LaTeX allows fixed positioning of everything in the document. The disadvantage to Word is that we cannot see all of the hidden code it uses to space and format everything. LaTeX gives you the freedom to design the page from the ground up. Theoretically, every pixel in the document is under your control. Let me emphasize this is not more work. I make style sheets and commands that affect every component in the paper across the board (figures, tables, captions, formulas, paragraphs, margins, headings, footers, etc.). A little learning curve and you have a lot of power at your fingertips.
Back to the original topic: I do agree that LaTeX users may develop a kind of smugness. I confess that when I first learned LaTeX I was somewhat of a snob. However, comparing my documents written in Word to those written in LaTeX, the level of readability and professional feel to the typography went up exponentially. Extensive use of equations and formulae is easy in LaTeX.
How easy is it to embed formulas in-text in Word?
There is a lot of wasted time clicking away with the mouse in Word. I prefer to do everything on the keyboard. It is like programming for documentation.