Saturday, July 22, 2017

The Prince

I have arrived in Shenzhen, China, for the International Botanical Congress. I meant to upload a few pictures of the Luohu district today but it seems as if my cell phone does not want to talk to my laptop, so perhaps I can do that when I am back.

On the flights I was unable to get much work done beyond making corrections to a manuscript, so I read a book and watched movies of varying quality: Star Wars Rogue One, Suicide Squad, and Throne of Elves. It is all a matter of expectations; they weren't high, so I enjoyed all three, although the last of them partly for being so different from how a European would have done it, and while happily ignoring the humongous plot holes of the second. The funny thing about Rogue One is that it is actually in part a reasonably good attempt at rationalising why the heck the Empire would have built the Death Star with such an idiotic weakness, although it still remains implausible that nobody else noticed it during construction and just added another wall on the way.

Ah well. Anyway, the book I read nearly through on the flights - because it is not actually all that long - is Machiavelli's Il Principe. The book needs no introduction as it is a classic, but I had never read it until I happened to pick it up in a German retranslation at the last book fair I visited.

The scholar who wrote the foreword stresses that Machiavelli's reputation is undeservedly bad, that his work is really a groundbreaking piece of political philosophy. With Il Principe and its sister work on republics he is considered to have pioneered political writing that sees humans as capable of influencing history within certain realistic limitations instead of being the passive objects of divine providence, and that argues for a pragmatic approach to politics instead of an unachievable spiritual ideal or political utopia.

And yes, I can see where that is coming from, although given my political socialisation I always remain sceptical of seeing history as a chain of outstanding people having influential ideas. (I think it is much more likely that if Machiavelli had not written this book others would still have organically moved towards more pragmatic political philosophy, as that was simply the Zeitgeist.)

But I can also see clearly where his bad reputation comes from. Not only is he fairly open about criticising past politicians and military leaders, including popes, for their personal and public failures, which would obviously invite opprobrium. He also matter-of-factly advises the audience to betray their allies for political gain and to murder the entire family of a previous ruler so that their bloodline is extinguished and no remaining heir can challenge the new order.

Again, both Machiavelli and the author of the foreword argue that this is just realistic. If you want to secure power and strengthen your state then this is what you must do. Machiavelli also doesn't see any issues with such behaviour because he has a very dim view of humanity in general. For example, to him it is no problem to break treaties because your treaty partners are, well, humans and as such should be expected to break the treaty themselves at the first good opportunity. That's just how dastardly humans are, fide Machiavelli at least.

Now realism is one thing. I can understand Machiavelli's advice in many cases, for example when he considers whether it is more important and easier to have the general population on one's side or the nobility (in today's context, the one percenters), and how to achieve either. And I also understand that one has to be realistic about the established rules one is subject to; if everybody habitually lies then a single honest person will indeed perish where another liar may have prospered. But I think he and that modern scholar miss to what a large degree opportunistic breaking of rules changes the rules for the worse, and what the consequences are.

Be it keeping true to treaties or showing mercy to one's enemies, the point of following rules or gentlemen's agreements is that only then can you expect that others will follow them to your benefit. When, for example, it became customary in the early to high middle ages of Germany that nobles competing for the crown would not eradicate the opposing family but instead force the male members to retire into monasteries, and only kill them if they blew that chance by coming back and raising another army, the idea was presumably that once the shoe is on the other foot one would also be given the chance to leave politics instead of coming home to find one's wife and underage children face down in puddles of blood, as Machiavelli would have it.

In other words, I am coming away from Il Principe with the impression that he was too clever by half. He took pragmatism just far enough to come out on the other side and fall back into short-sightedness.

Wednesday, July 19, 2017

How to use the reference manager Zotero

(Updated 19 July 2017 regarding Zotero 5.)

Inspired by an email exchange with a colleague I thought I would write a longer post on how to use the open source reference manager Zotero. Obviously all the information will in some form be available in its documentation, but at least to me the things I really need to look up are often the needle in the haystack of the obvious and the irrelevant. So here is what I think one needs to know to start using Zotero, in what is hopefully a logical order.

All of this is based on Zotero 4, as I have not yet used the newest version, Zotero 5. From version 5 onwards the standalone program is required instead of Zotero running through the browser alone. See the relevant comment under installation below.

How it works

Zotero is available for Win, Mac and Linux and for LibreOffice or MS Word. It integrates into the browser (I use Firefox, but it also seems to work with Chrome, Safari and Opera), and the usual way to use it is through the browser. This means that the browser has to be open at the same time as the word processor, but there is also a standalone version that I am not familiar with.

Installation

To be honest, this is the only point on which I am a bit confused at the moment because it has been a bit since I installed one of my instances. There are three items that may have to be installed: Zotero itself on the download page, for which there are specific instructions. Then there is the "connector" for the browser you have. Finally, it may be necessary to install a word processor plug-in for your browser. What confuses me is that I seem to remember only installing the latter two last time, so either I misremember or something has changed with the newest Zotero version (?).

Either way, while the installation of Zotero into the browser itself is easy, I have noticed that sometimes the word processor plug-in does not take on the first attempt. In that case I merely repeated the installation and restarted everything, and then it worked.

Update: The colleague who has now started using Zotero has, of course, installed the newest version and kindly adds the following:
The new version of Zotero needs the standalone program installed. This is because Mozilla has dropped the engine that supported a lot of extensions (like Zotero). It was deemed that allowing the browser to carry out low-level functions on the host computer introduced inherent vulnerabilities, and so Firefox versions after 48 have very much restricted what the browser is allowed to do (no longer can it communicate directly with databases, and carry out a lot of file handling functions). The problem is not restricted to Zotero, for example Gnome desktop extensions used this functionality and have also had to change the way they do things. The long and the short is that nowadays you need the standalone application installed.

Using Zotero in the browser and building your reference library

When you have Zotero installed there are two new buttons in the browser: a "Z" that opens your reference library and a symbol right next to it that you can click to import a journal article into that library. Given that the library will at first be empty let's look at that latter function first.

The Zotero buttons in the browser
For example, I may have found a journal article through a Google Scholar search. Ideally, I am now looking at the abstract or HTML fulltext on the journal website, because that page will have all the metadata I want. I now click on the paper symbol to the right of the Z, and Zotero automatically grabs all the fields it can find and saves a new entry into my reference library; if it can get a PDF it will even download that.
Viewing paper abstract
Now I click on the Z button to bring up my library, if I didn't have it open yet. In some cases I may notice that something went wrong. The typical scenario is that the title of the paper is in Title Case or ALL CAPS. This is easily rectified by right-clicking on the title field and selecting "sentence case".
A reference has been added to the library
If something needs to be edited manually we can do so by left-clicking onto the relevant field. For paper titles, the usual problems would be having to re-capitalise names after correcting for Title Case or adding the HTML tags for italics round organism names. In the present case, however, I find that the title contains HTML codes for single quotation marks instead of the actual quotation mark characters, so I quickly correct that. Manual entry is, of course, also possible for an entire reference, for example if it isn't available online. In that case simply click on the plus in the green circle and select the appropriate publication type.
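For illustration, the two clean-up steps described above (decoding stray HTML codes and converting a title to sentence case) can also be sketched outside of Zotero. The following Python snippet is only a rough sketch and not how Zotero does it internally; the function name `tidy_title` and the example title are made up. As in Zotero, names still have to be re-capitalised by hand afterwards.

```python
import html


def tidy_title(raw_title):
    """Decode HTML character codes and convert a title to sentence case.

    Hypothetical helper for illustration only; not part of Zotero.
    """
    # Turn codes such as &#8216; and &#8217; into the actual quotation marks.
    decoded = html.unescape(raw_title)
    # Naive sentence case: lower-case everything, then capitalise the first letter.
    lowered = decoded.lower()
    return lowered[:1].upper() + lowered[1:]


# Invented example title in ALL CAPS with HTML codes for single quotation marks.
print(tidy_title("A NEW &#8216;GIANT&#8217; DINOSAUR FROM BRAZIL"))
```

Note that the country name comes out in lower case, which is exactly the kind of thing that then needs a quick manual correction.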

So much for importing references. It is also possible to bulk-import from the Google Scholar search results, but I would not recommend that as Google sometimes mixes up the metadata.

The style repository

The first time we try to add a reference to a manuscript, we are asked what reference style should be used. Zotero comes with only a few standard styles installed, but many more are available at the Zotero style repository. One of the few downsides of Zotero, in my eyes, is that it has fewer styles than EndNote, but often it is possible to get the relevant one under a different name. If, for example, you are preparing a manuscript for Phytotaxa, the Zootaxa style should serve just as well.

Installing a new style is as easy as finding it in the style repository, clicking on its name, and confirming that it should be installed.

Selecting a reference style

Using Zotero in the word processor

Again, note that the browser needs to be running while we are adding references to a paper. The following assumes LibreOffice, but except for where to find the buttons everything is the same in MS Word.

In LibreOffice you will have new buttons for inserting and editing references, for inserting the reference list, and for changing document settings, in particular the reference style. To insert a reference, click on the button that seems to read r."Z. You can now enter an author name or even just a word from the title, as in my example here, and Zotero will suggest anything that fits.
Adding a reference to a manuscript
Another downside of Zotero, at least as of version 4 which I am still using, is that it doesn't do a reference like "Bronzati (2017)". Instead you can either have "(Bronzati 2017)" or reduce the reference to "(2017)". For this click on the reference in the field where you were asked to select it (if you have already entered it simply use the edit reference button showing r." and pencil) and select "suppress author". Then you have to type the author name(s) yourself outside of the brackets, which is obviously a bit annoying.

Author names outside of brackets have to be added manually
Once we have added a few references, we obviously need to add the reference list. This is as easy as clicking the third button in the Zotero field. The only others that are usually important are the two arrows (refresh) to update the reference list (although it does so automatically when the document is reloaded) and the cogwheel (document preferences) that allows changing the reference style across the document.

In LibreOffice I have sometimes found that adding or updating references changes the format of the entire paragraph they are embedded in. This seems to happen if the default text style is at variance with the text format actually used in the manuscript. Selecting a piece of manuscript text and setting the default style to fit its format has always rectified the situation for me.

Syncing

It is useful to get an account at the Zotero website and use it to sync one's reference library across computers. Again, this works cross-platform. I do it between a Windows computer at work and my personal Linux computer at home. Note, however, that it only syncs the metadata, not any fulltext PDFs that have been saved.

To sync, go into the browser and click on the Z symbol to open Zotero. Now click the cogwheel and select preferences. The preferences window has a sync tab where you can enter your username and password. Do the same on two computers and they should share their reference libraries.

Sunday, July 16, 2017

Free-association word salad is not the same as analysis

I find it remarkable what kinds of pieces are sometimes published by otherwise serious news organisations. Today during breakfast I made the mistake of trying to read something hilariously filed under "analysis" at the ABC website, Are we sleepwalking to World War III?

It starts with the claim that WW3 is coming and that Australia will be invaded:
All certainty will be lost, our economy will be devastated, our land seized, our system of government upended.
It is backed up by what a single former military commander said to the author over lunch:
This isn't mere idle speculation or the rantings of a doomsday cult, this is the warning from a man who has made it his life's work to prepare for just this scenario.
I may be missing something here, but unless there is a bit more at least circumstantial evidence I would still file this warning as mere speculation; that is kind of what the word means.

Then the author randomly quotes out of context Mark Twain ("History doesn't repeat but it does rhyme"), Alexis de Tocqueville as writing that the French Revolution was inevitable, a claim that can very conveniently be made about any historical event after the fact because it is always untestable, and then quickly moves to a historian's work on the beginning of World War I (while spelling the name of that source in two different ways).

In this latter case at least an actual argument can be discerned: Britain and Germany were trade partners and still went to war, so we should not assume that two countries today would stay at peace just because they are trade partners.

The author accelerates his already breathtaking pace to name-check a Harvard scholar and, before that person gets to say anything useful, the ancient historian Thucydides. He seems to imply that the USA might be forced into starting WW3 to stop the rise of China, as Sparta was forced to start the Peloponnesian War when Athens became too powerful. (I read Thucydides years ago, and I seem to remember it was a bit more complicated than that.)

The text descends into gibberish for a bit:
Any clash between the US and China is potentially catastrophic, but as much as we may try to wish it away, right now military strategists in Beijing and Washington are preparing for just an eventuality.
Perhaps: "just such an eventuality"?
Global think tank the Rand Corporation prepared a report in 2015 for the American military, its title could not have been more direct -- War with China: Thinking Through the Unthinkable.
Yeah, that's the job of strategists and (serious) think tanks.
It concluded that China would suffer greater casualties than the US if war was to break out now. However, it cautioned, that as China's military muscle increased so would the prospect of a prolonged destructive war.
How... what... huh? If I picked a fight with my neighbor now, I could be hurt, BUT (!) if I picked the fight an hour later, the fight could take longer. That doesn't even begin to make sense as a sentence. Even if we try to speculate about what the author may have meant here, for example that China would lose a war now but may have a better chance of winning a few decades in the future, one would have to point out that suffering greater casualties may not be incompatible with winning now either, cf. USSR in WW2. Also, why interrupt the sentence with a comma after the main verb? Did nobody proof-read this?

Having established to his satisfaction that war could happen, the author now moves to the question of what precise incident could precipitate WW3 in Asia. Again a historian is cited, and again only so superficially that it is impossible for the reader to judge if what they say can be backed up. The islands of the South China Sea and other islands disputed between China and Japan are mentioned as the most likely causes of war. Okay, so I am not a military strategist, and I appreciate how useful symbolic conflicts can be to fire up nationalism when a government is in domestic trouble, but are these really the kinds of issues where a government would say, hey, let's needlessly blow up our entire economy and get hundreds of thousands killed over a practically worthless heap of rock? (Or sand, as the case may be.)

But of course we have to move on immediately. Cyber warfare! Thucydides! (Again.) Name-checking a Chinese scholar who does think that China and the USA are too economically interdependent to go to war, so at least we have an isolated counterpoint. Then the former military commander from the beginning opines that it would be helpful if politicians would also consider the risks of going to war; I am sure nobody in the history of humanity has ever had that idea before.

The piece ends with the author claiming to be more optimistic than his interview partner, only to end on a very depressing note. He takes this as an opportunity to quote Shakespeare, I presume in case the mention of Twain, de Tocqueville and Thucydides wasn't enough to signal deep erudition.

Now don't get me wrong, I am also rather pessimistic about the future. Overpopulation, resource limits and climate change may well combine to throw the world into a new dark age, with starvation, mass migrations, widespread collapse of most institutional order, and warlords duking it out with the few Byzantine Empire-like islands of stability that are left.

But that is what I would expect a serious analysis of future trends to look like: citing empirical evidence of risk factors like crop failures, water availability or shifting alliances and how they can produce unsolvable dilemmas for all involved. Merely name-checking historians in a meandering, stream-of-consciousness text without any real information or data isn't it.

Thursday, July 13, 2017

Botany picture #248: Gleichenia dicarpa


This is one of my favourite fern photographs: Gleichenia dicarpa (Gleicheniaceae) forming a large thicket at Jervis Bay, Australia, 2011.

While the group does not occur in Germany I have seen quite a few Gleicheniaceae during field work in South America. They are often aggressive colonisers, especially after major disturbances such as landslides, but are said to be very difficult or impossible to cultivate. Also, while I did my PhD there was another PhD student at the same institute who conducted a taxonomic revision of a genus of Gleicheniaceae in the Neotropics, so all things considered these odd-looking ferns were not new to me when I arrived on this continent.

The specific epithet of Gleichenia dicarpa means "two-fruited". Obviously ferns do not have fruits, but this is a reference to the fact that each little pocket on the lower leaf side contains two, and only two, tiny sporangia when the plant is fertile. Given that ferns often produce clusters (sori) of numerous sporangia this low number is rather peculiar in itself.

Sunday, July 9, 2017

What philosophy is "good for"

There is a very strange discussion popping up from time to time in some of the blogs that I read, where somebody will claim that philosophy is useless because it has not contributed anything to our understanding of the natural world or "to society" in recent times. Although I think that the charge of scientism - empirical science is all we need, every other field of scholarship is useless - is usually, mostly a straw man, it seems that there is a vocal minority of people who really think like that.

For starters, to the degree that this is about philosophy contributing to our understanding of the natural world this is clearly the wrong question to ask. What have bus drivers, as a profession, lately contributed to that endeavour? Nothing; but that does not mean the profession is useless, merely that it has a different job. Conversely, everybody who does contribute to our understanding of nature is by definition a scientist, so the claim that only scientists directly contribute to that understanding is true but trivially so.

The question could then be rephrased more generally as: what do philosophers actually do? What is philosophy good for?

Now I am not a philosopher myself, and the question would perhaps be best answered by a member of that profession. But it so happens that just before I saw that remarkably nihilistic discussion about the value of philosophy I saw a use of philosophy outside of the academic context that provides a very good example of the kind of thing that the field is "good for".

In this post on his website Why Evolution Is True, Jerry Coyne had taken a completely consequentialist stand on the issue of punishment:
If you're a determinist about behavior and a consequentialist about punishment, as I am, then you punish people only if it's for the good of society. (My view is that at the moment of the slaughter, Gutierrez had no "choice" to not kill the birds.)

And there are three social goods to come from punishments like incarceration: deterrence of others, sequestration of someone who could be dangerous to society, and reformation of a criminal so he doesn't repeat his offense when freed. All three of these apply to Gutierrez: jailing him will probably deter others who want to kill wild animals, people who do that tend to be murderous psychopaths who could kill again (maybe people next time) and so need to be put away, but such people may be susceptible to reformation [...].

If none of these reasons obtain, there's no reason to imprison anyone; or can you give me one? But surely deterrence and sequestration apply in most cases--though not capital punishment, which data show isn't a deterrent. And if no social good results from imprisonment, in what sense would Gutierrez still "deserve" to be imprisoned? To satisfy a sense of vengefulness? That, to me, is not a good reason, for it caters to our baser instincts--the same instincts and feelings that make people favor executions. So, if Gutierrez can be reformed, poses a danger to society, or can be a deterrent to others, yes, he "deserves" punishment. But he doesn't deserve it just because he needs to be "paid back" for what he did.
In short, locking somebody up is to be justified (only) by good societal outcomes, while that person "deserving" to be locked up is not a just and reasonable concept (because JC believes that the existence of cause-and-effect is incompatible with personal responsibility). To this the commenter cjwinstead replied as follows:
Suppose we have strong justification to believe that punishing Gutierrez's mother will satisfy the goals of deterrence and reformation; and keeping her hostage would be as effective as sequestration (maybe he really cares a lot about his mother). If we have evidence that this will be more effective in those goals, is there any reason not to punish her? What if she gladly volunteers to receive the punishment on his behalf? I would say that Christian Gutierrez deserves to be the subject of punishment in a way that his mother does not. Proxy punishments do happen in our justice system, and they are arguably effective at deterrence and reformation. Should they be supported if they work?
This, right there, is one of the things that philosophy is "good for". This is not science, obviously, as no empirical data are involved in any but the most remote ways. What cjwinstead has done is propose a thought experiment - a classical method of analytic philosophy - to lay bare our instincts about something (here, that we would consider punishing the mother unjust), to start a conversation about where those instincts come from and what, if anything, they mean to us, and perhaps in particular to demonstrate the absurd consequences of a position (here, basing moral philosophy entirely on consequentialism).

Of course, you may disagree with cjwinstead in this instance. What is more, while his comment sparked a very long discussion, nearly all the people who replied to it missed its point in a way that is somewhere between spectacular and hilarious. But again, this is the kind of thing that is philosophy, and it is useful and necessary to hash out issues that cannot be adjudicated based on empirical studies alone.

We may do science to find out if deterrence and reformation work or not, but science alone cannot necessarily tell us if we should prefer consequentialism to deontology, for example. And even if some scientismist were to argue that it can, using analytic philosophy to point out an internal contradiction or absurdity in an argument still saves us the major investment of conducting a large scientific study to test it.

Saturday, July 8, 2017

Botany picture #247: Androsace villosa


Androsace villosa (Primulaceae), France, 2014. A cute little alpine plant whose leaf rosettes remind me somewhat of Sempervivum. I assume the colour of the throat, which can even in this photo be seen to vary between red and yellow, signals to insects whether the flower is in the right stage to be visited.

Thursday, July 6, 2017

Multi-access keys need a different approach than dichotomous keys

I am close to deploying a reasonably large online multi-access (Lucid) key and find myself fretting about how it will be received by the user community. Obviously people may have different preferences for how exactly a key should look and what features it should or should not have, but one concern I have in particular is that taxonomists used to writing traditional dichotomous keys may be disappointed with some of the choices I made.

To recapitulate, just in case it isn't immediately clear, there are two very common types of identification keys in systematics. The traditional ones are dichotomous and single-entry, because that is what works in books. As an example, consider the Craspedia key in the KeyBase repository (click on bracketed or indented to see the full key). The user has to start at couplet 1 and then answer one pair of leads after the other.

Crucially, to allow all species to be keyed out in such a dichotomous key the author has to find enough characters so that every single species differs in some clear way from at least one other species. There may consequently be lots of characters mentioned in the key, but it doesn't look that way because few of them are mentioned for all species. In the present case, couplet 9 asks if the leaves are sticky-glandular to differentiate Craspedia adenophora, but the trait is irrelevant for all other couplets because only the leaves of that one species are sticky.

The other, increasingly common type of key is multi-access and electronic. As an example, I have just basically at random clicked on the key to the Restionaceae of Western Australia. The user can enter whatever characters they have at hand in whatever order they want, and the key software will kick out all species that don't match. In this case there are also options to narrow the selection down by geography, flowering time or genus (if already known).
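To make the principle concrete, here is a minimal sketch in Python of how such a key narrows down the candidates. It is a toy under stated assumptions, not how Lucid or any other key software is actually implemented: the species names, characters and states are invented, and real keys also have to deal with variable or unscored characters.

```python
# Toy character matrix; species names and character states are invented.
SPECIES = {
    "Species A": {"leaves": "hairy", "flower colour": "yellow", "habitat": "alpine"},
    "Species B": {"leaves": "glabrous", "flower colour": "yellow", "habitat": "alpine"},
    "Species C": {"leaves": "hairy", "flower colour": "white", "habitat": "lowland"},
    "Species D": {"leaves": "hairy", "flower colour": "yellow", "habitat": "lowland"},
}


def remaining_species(matrix, observations):
    """Return the species that match every observed character state."""
    return [
        name
        for name, states in matrix.items()
        if all(states.get(char) == state for char, state in observations.items())
    ]


# Characters can be entered in any order, and not all of them have to be scored.
print(remaining_species(SPECIES, {"flower colour": "yellow"}))
print(remaining_species(SPECIES, {"habitat": "alpine", "leaves": "hairy"}))
```

The first query leaves three candidates, the second only one; there is no fixed starting couplet, which is the whole difference to the dichotomous approach.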

While working on my multi-access key (and a previous one before that) I have had conversations with colleagues along the lines of "what about this character, have you considered using that?" for characters that are sometimes very obscure and accordingly hard on the user or, and that is my main point here, serve only to differentiate a single species.

A character like that is often very important when writing a dichotomous key. Imagine the taxonomist working away, shuffling species around like so and so, perhaps ending up with a stubborn pair of species that clearly go together in the key but are hard to differentiate. And then they realise, ah, one of them has woolly hairs on the bracts, and the other doesn't! We have a contrast!

And that is great. But if, for example, the species with the woolly hairs on the bracts is the only one in the entire group with that trait, then the character works only to differentiate that one species. That is not a problem in the dichotomous key because the character is only presented to the user at the moment where it is actually relevant, while they will never see it in any other part of the key.

But in a multi-access key all the characters will be visible right from the start, even the ones that only work to differentiate a single species from the other 99 or so. And if we try to do that for all species we end up with a hundred characters, plus dozens of characters that differentiate 40 from 60 or suchlike. And now imagine the poor user being faced with a table of a bazillion characters - they won't even know where to start, the key will just look terribly daunting.
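One way to spot such characters before they clutter the interface is to count how many species share each state. The Python sketch below is only an illustration with an invented matrix and made-up function names; it assumes that every species is scored for every character, and it simply flags characters for which some state occurs in a single species.

```python
from collections import Counter

# Toy matrix; all names and states are invented for illustration.
MATRIX = {
    "Species A": {"bract hairs": "woolly", "leaf shape": "linear"},
    "Species B": {"bract hairs": "absent", "leaf shape": "linear"},
    "Species C": {"bract hairs": "absent", "leaf shape": "ovate"},
    "Species D": {"bract hairs": "absent", "leaf shape": "ovate"},
}


def smallest_group(matrix, character):
    """Size of the smallest group of species sharing a state of this character."""
    # Assumes every species has a score for the character.
    counts = Counter(states[character] for states in matrix.values())
    return min(counts.values())


def single_species_characters(matrix):
    """Characters for which some state occurs in only one species."""
    characters = sorted({char for states in matrix.values() for char in states})
    return [char for char in characters if smallest_group(matrix, char) == 1]


print(single_species_characters(MATRIX))
```

Here the bract hairs are flagged because they only pick out one species, whereas leaf shape splits the group evenly and would be the more useful character to show up front.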

There is a reason why the ideal couplet in a dichotomous key is commonly said to mention perhaps two to three characters; when faced with too much choice or too much information at the same time the human brain just goes into Blue Screen of Death mode. For shopping decisions, for example, there seems to be some evidence that consumers are less likely to make a purchasing decision at all if a shop presents too many options.

What is more, the beauty of electronic multi-access keys is that it is not necessary to differentiate all species from each other. Yes, that is necessary in dichotomous keys printed on paper, but in our newfangled multi-access keys it is all about reducing the number of possible species to a comfortable three to five, and then the user can look at pictures and click on links or species profiles to make the final decision.

Well, I shall see what feedback I will get, but what I want to say here is that the habits that work for one type of key cannot simply be transferred onto a completely different type. The user experience would actually suffer from overloading the interface with dozens of characters each of which will hardly ever be useful.

Sunday, July 2, 2017

Seems as if time-calibration must be working to some degree

This week's journal club discussion covered McIntyre et al. 2017, Global biogeography since Pangaea, Proc. R. Soc. B 284: 20170716.

The authors set out to compare continental break-up (and, to a lesser degree, collision) times as estimated from palaeomagnetic data with species divergence times as estimated from phylogenetic analyses using molecular clocks. They selected 42 vertebrate sister taxa for their presumed lack of dispersibility to exclude groups whose distribution may have been influenced by long-distance dispersal. Even among the selected taxa they tried to account for dispersibility by, if I understand correctly, extending the error bars around the divergence times for lineages that seemed more dispersible.

In the end they arrived at a very nice correlation between continental break-up times and divergence times. What does this tell us?

There were some concerns in our group about the argumentation being somewhat circular. I do not actually see that myself; one dataset was palaeomagnetic, and times in the other would presumably have been based on fossils and nucleotide substitution rates, so really two independent data sources would have been compared. (The time-calibrated phylogenies were sourced from the timetree.org database, which I have not yet used myself.)

To the degree that I found the methodology odd it is because of the decision to extend error bars when dispersal was considered somewhat probable. Yes, admittedly the immediately obvious way of identifying confounding dispersal - comparing divergence times against continental break-up times - would be circular in a study explicitly setting out to compare those two; using that approach would have amounted to massaging the data. But I would still find it more logical to have some way of categorically identifying suspected cases of dispersal and kicking them out of the dataset instead of leaving them in but making the relevant data points fuzzy.

What I found most puzzling, however, is that the paper is not actually very clear on what the research question was. It is thus somewhat up to the reader to draw a conclusion. If you already trust time-calibrated phylogenies, you could take the study to confirm the reliability of palaeomagnetic data. If you already believe that palaeomagnetics works but are somewhat skeptical about time-calibrated phylogenies, this study should at least show that molecular clocks can't be that bad after all, otherwise they wouldn't have got such a neat correlation out at the end.

And this is also what I take away from our reading, especially in the light of the criticism of molecular clocks that is still regularly advanced by vicariance biogeographers and panbiogeographers. Yes, this study did show that the fit is pretty good except where there is reason to suspect dispersal. And that brings us to the last point:

The present paper carefully excluded cases of suspected dispersal to examine only cases of vicariance, so the authors must be biogeographers (and geologists) who accept the existence of both long-distance dispersal and vicariance. And the same was true of our journal club. Nobody I know has any problem whatsoever reading a paper that concludes "this pattern is best explained by vicariance" if that is indeed what the data say.

But let's be clear here, it does not work the other way. Just read the papers I discussed a few weeks ago; pan- and vicariance biogeographers generally do have a problem reading a paper that concludes "this pattern is best explained by long-distance dispersal" and will instinctively start questioning the methodology.

The situation is just not symmetrical. The "dispersalist" who tries to explain every pattern with dispersal, no matter what the data say, is a non-existent straw-man. I have never met or read such a colleague. The panbiogeographer who tries to explain every pattern with vicariance, no matter what the data say, does, however, seem to be alive and kicking.

Wednesday, June 28, 2017

In praise of Linux

I am not saying I want to proselytise or anything. I completely understand Windows users; after all, I was a happy Windows user myself until they produced Windows 8. And if somebody is into gaming, for example, then Windows is the obvious choice. But I am really, really happy using Ubuntu now.

For starters, we do not have to worry about the most common cybersecurity issues, such as the Petya attack that is currently making the rounds. Admittedly neither my wife nor I would open a suspicious attachment anyway, but still, it is nice to know that everything that attacks the most common operating system is irrelevant to us.

More to the point of what I did this evening, I program a lot in Python these days. Well, for certain values of "a lot". I am obviously not a programmer, but the language is very useful for many tasks in science, from quickly reformatting a large data file to scripting complex analyses.

And the thing is, Windows makes it unnecessarily hard to use Python (or most programming languages, really). First, I need to install the language. Okay, perhaps understandable. But then I need to figure out how to tell Windows to look for Python in the Python folder whenever I try to run a Python program. Then perhaps I need to install a specialist Python library like BioPython to run certain analyses, and that is where things really go downhill, because it usually doesn't install because some dependency is missing or whatnot.

Now compare the Linux variant Ubuntu. First, it comes with Python already installed. Second, it is clever enough to automatically access Python from any folder where you start a Python script. Third, Linux makes it really easy to install things on top of Python, because it is usually smart enough to recognise dependencies and install them also. In fact that is such an obvious advantage that it seems bizarre in retrospect that Windows won't do it.
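For what it is worth, the three points can be checked from within Python itself. The following is only a small sketch using the standard library; the function name is my own invention, and "Bio" is the name under which Biopython is imported.

```python
import importlib.util
import shutil
import sys


def describe_python_setup(module_name="Bio"):
    """Collect a few facts about the interpreter and one optional library."""
    return {
        # Full path of the interpreter that is running this script.
        "interpreter": sys.executable,
        # Where the shell would find 'python3' on its search path, or None.
        "python3_on_path": shutil.which("python3"),
        # True if the module could be imported ('Bio' is Biopython's package).
        "module_available": importlib.util.find_spec(module_name) is not None,
    }


if __name__ == "__main__":
    for key, value in describe_python_setup().items():
        print(f"{key}: {value}")
```

If the second entry comes back as None, that is the search path problem; if the third is False, the library still needs to be installed.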

Anyway, today I decided to spend the evening coding a simple script. I was able to just plop down in front of a computer that had Ubuntu installed two weeks ago, and I did not need to do anything in preparation. So. Nice.

Saturday, June 24, 2017

Botany picture #246: Tanacetum vulgare


No energy to write something substantial at the moment, and instead I find myself thinking about plants. Here, Tanacetum vulgare (tansy, Asteraceae), Germany, 2016. It is in the Chamomile tribe of the daisy family. It has been on my mind because I was recently looking through our herbarium for specimens that have mature fruit on them, and while we have quite a few specimens there aren't any that fulfill that particular criterion.

Many daisies are usually collected in flower because they look rather less attractive in fruit, which is rather ironic given that there are many subgroups where fruits are extremely important for identification, e.g. among the dandelion tribe. But often you will at least get mature fruits as by-catch; the whole plant is collected because one head was in flower, but others lower down are already fruiting. In this case, however, the specimen is generally a single stem, and all its heads in the terminal corymbose panicle are at about the same stage, meaning none of them are fruiting. A bit frustrating.

Monday, June 19, 2017

Botany picture #245: Salvia patens


Today's botany picture is Salvia patens (Lamiaceae), a New World sage species photographed at the Botanic Gardens of Goettingen University while I was doing my PhD there. This plant has the most amazing blue flower colour, and consequently I was rather bemused to find a few years later that plant breeders have selected and were selling a white variant of this species. What is the point of that? It is like breeding an onion without taste, or a rose that doesn't produce flowers.

Okay, cranky get off my lawn mode deactivate.

Sunday, June 18, 2017

To publish or not to publish (locality information for rare species)

This week our journal club discussed Lindenmayer & Scheele, Do not publish (Science 356(6340): 800-801). While acknowledging the trade-offs involved, the paper argues for researchers, journals and data providers to self-censor locality information for rare species to keep them safe.

The problem, in short, is that some rare species are highly valued by professional poachers and private collectors, and they may in short order wipe out a rare species if they know where to find it. The article itself mentions a rare Chinese gecko; participants in our discussion provided other astonishing examples from various parts of the planet. It did not surprise me to learn that there are people digging up cycads to sell them to wealthy home-owners who want to adorn their front gardens, but I was definitely surprised to learn that rare beetles are traded for hundreds of dollars apiece by a demented subculture of beetle enthusiasts.

Nobody really disagreed with the sentiment of the article per se, but obviously people immediately raised scenarios where making the data available actually helped conservation. A particular concern is that it has to be known that a rare species exists in a spot when there is a development proposal; what is the use of keeping the information safe from poachers only to have an open-cut mine wipe out the species?

A comparison was made with medical data. While biodiversity researchers are used to having all data openly available, the medical research community has long had strict procedures for keeping safe medical information of individual people, but they still manage to do research. In other words, biodiversity science should not suffer from more restricted access to locality information if the right procedures are adopted. That being said, some raised the concern that this would simply add another layer of bureaucracy to a field already burdened with often unreasonable procedures around collecting permits and specimen exchange.

What the article and our discussion were mostly about are specimen data typed off the specimen labels and made available through databases such as GBIF or Australia's ALA. The idea would then be to have those data providers make the locality descriptions and GPS coordinates just fuzzy enough that nobody can find the exact spot where a species was seen or collected, while still providing that information to legitimate and trusted researchers. What should not be overlooked, however, is that currently a major push is underway to photograph the actual specimens and make those photos available online. Has anybody thought about systematically blurring out such locality information for rare species on the photographed labels? Not sure I have ever heard that discussed before.
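To illustrate what "just fuzzy enough" could mean for coordinates, here is a minimal sketch that snaps a point to the centre of a coarse grid cell. This is merely one conceivable approach and not a description of what GBIF, ALA or anybody else actually does; the function name, the cell size of 0.1 degrees and the example coordinates are all my own assumptions.

```python
import math


def generalise_coordinates(lat, lon, cell_size=0.1):
    """Snap a point to the centre of its grid cell (cell_size in decimal degrees)."""

    def snap(value):
        # Index of the grid cell containing the value, then that cell's midpoint.
        return (math.floor(value / cell_size) + 0.5) * cell_size

    # Round away floating-point noise; six decimals is far finer than the grid.
    return round(snap(lat), 6), round(snap(lon), 6)


# Invented example coordinates, not a real specimen record.
print(generalise_coordinates(-35.27834, 149.10912))
```

Every record from the same cell ends up with identical published coordinates, so the exact spot cannot be recovered from the public data, while the original values can still be kept on file for trusted researchers.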

Finally, there was some agreement that it would be good to have a global policy recommendation on this instead of leaving it up to individuals to self-censor without guidelines. Given that there are working groups agreeing on data formats etc. it should surely be possible to find agreement on this problem.

An off-topic excursus on hobgoblins

In this context it was interesting that somebody said, "consistency is the hobgoblin of small minds", a phrase that I have run into before. Of course, the idea here was that while a rule or recommendation is nice to have, people will still have to weigh trade-offs, and even if the recommendation would be to generally blur the data one may in some cases need to publish it (see a few paragraphs earlier).

And yes, I see where that is coming from. The fundamentalist wants a clear rule and wants to apply it blindly, whether it makes sense or not; the intellectually mature realise that rules were introduced to achieve a good, and if applying the rule hurts that very same good then one should not apply the rule.

But still, throwing a phrase like that around makes me a bit uncomfortable. In most cases consistency is important. When we are talking rules it should be clear that consistency is usually just another word for fairness. People who want to apply rules inconsistently would have to provide a very good reason for why they should not simply be seen as trying to get away with something that they would not let others get away with.

When we are talking argumentation, discussion and logic, intellectual consistency is the very first hurdle somebody has to clear to be taken seriously, and only then is it worth the investment to look into whether they have evidence on their side or not. People who are proud of being inconsistent in this sense (because it makes them Not Small Minds, you see) would have to explain carefully how they are not simply somewhere on the spectrum from slightly confused to totally insane, or alternatively on the spectrum from obfuscating the issue to gaslighting their conversation partner.

Monday, June 12, 2017

ANBG impressions

Although it is winter, and although it was foggy the first half of our visit, the Australian National Botanic Gardens always have something to see.


Moss cushion on a tree branch.


Golden everlasting flower-head waiting for the sun to come out.


Shadows cast onto a bridge in the rain forest gully.


Spider's web covered with dew.

Sunday, June 11, 2017

How the sausage is made: peer reviewing edition

One of the aspects of working as a scientist that I find most intriguing is peer reviewing each other's work. The main issue is that while how to write the actual manuscripts is explicitly and formally taught and further supported by style guides, helpful books and journals' instructions to authors, there is much less formal instruction on how to write a reviewer's report.

Essentially one is limited to (1) relatively vague journals' instructions to reviewers usually on the lines of "be constructive" or "be charitable", (2) deducing what matters to the editor from the questions asked in the reviewer's report form, and (3) emulating the style of the comments one receives on one's own papers. Apart from generic, often system-generated thank you messages there is generally no feedback on whether and to what degree the editors found my reviewer's reports appropriate and helpful or on how they compared with other reports.

In other words, most of it is learning by doing; after years of practice I now have a good overview of what reviewer reports in my field look like, but as a beginner I had very little to go by.

It is then no surprise that the style and tone in which reviewers in my field write their reports can differ quite a lot from person to person. There is a general pattern of first discussing general issues and broad suggestions and then minor suggestions line-by-line on phrasing, word choice or typos, and there is clearly the expectation of being reasonable and civil. But:
  • Some reviewers may summarise the manuscript abstract-style before they start their evaluation, while others assume that the editor does not need that information given that they have the actual abstract of the paper available to them.
  • Some stick to evaluating the scientific accuracy of the paper, while others obsess about wording and phrasing and regularly ask authors who are native speakers of English to have a native speaker of English check their manuscript.
  • Some stick to judging whether the analysis chosen by the authors is suitable to answer the study question, while others see an opportunity to suggest the addition of five totally irrelevant analyses just because they happen to know they are possible. And sometimes they recommend cutting another 2,000 words from the text despite suggesting those additions, as if those would come without text.
  • Some unashamedly use the reviewer's report for self-promotion by suggesting that some of their own publications be cited, relevant or not.
  • Some use a professional tone and make constructive suggestions on the particular manuscript in question, but others apparently cannot help disparaging the authors themselves. Luckily that behaviour is rare.
  • Some write one paragraph even when they recommend major revision (meaning they could have been more explicit about what and how to revise), others write six pages of suggestions even when their recommendation is rather positive.
Certainly then scientists in my field will have very different ways of approaching the task right from the start. Nonetheless, and for what it is worth, this is how I generally find it useful to do it.

First, I like to print the manuscript - I am old-fashioned like that. I try to begin reading it fairly soon after I accept the job, and for obvious reasons I also try to read it through more or less within one day. Often I will read when I need a break from some other task like computer work, on a bus or in the evening at home.

Already on the first read I attempt to thoroughly mark everything I notice. Using a red or blue pen I mark minor issues a bit like a teacher correcting a dictation, while making little notes on the margins where I have more general concerns ("poorly explained", "circular", "what about geographic outliers?").

Usually the following day I order my thoughts on the manuscript and start a very rough report draft by first typing out all the minor suggestions. (I would prefer to use tracked changes on a Word document for that, but unfortunately we generally only get a PDF, and I find annotating those even more tedious than just writing things out.) Then I start on the general concerns, if any, merely by writing single sentences on each point but do not expand just then.

In particular if the study is valuable but has some weaknesses I prefer to sleep on it at this stage for 2-3 nights or, if the task has turned out to be a bit unpleasant, even a few days more, and then look at it again with fresh eyes. That helps me to avoid being overly negative; in fact I tend to start out rather bluntly and then, with some distance, rephrase and expand my comments to be more polite and constructive.

That being said, if the manuscript is nearly flawless or totally unsalvageable I usually finish my review very quickly. If I remember correctly my record is something like 45 min after being invited to review, because the study was just that deeply flawed. In that case I saw no reason to spend a lot of time on trying to develop a list of minor suggestions.

More generally I have over the years come to the conclusion that it cannot be the role of a peer reviewer to check if all papers in the reference list have really been cited or to suggest language corrections in each paragraph, although some colleagues seem to get a kick out of that. If there are more than a handful of language issues I would simply say that the language needs work instead of pointing out each instance, and if there are issues with the references I would suggest the authors consider using a reference manager such as Zotero, done. Really from my perspective the point of peer review is to check if the science is sound, and everything else is at best a distant secondary concern.

At any rate, after having slept on the manuscript a bit I will return to it and write the general comments out into more fluent text. I aim to do the usual sandwich: start with a positive paragraph that summarises the main contribution made by the manuscript and what I particularly like about it. If necessary, this is followed by something to the effect of "nonetheless I have some concerns" or "unfortunately, some changes are required before this useful contribution can be published".

Then comes the major stuff that I would suggest to change, delete or add, in each case with a concrete recommendation of what could be done to improve the manuscript. I follow a logical order through the text but usually end with what I consider most important, or repeat that point if it was already covered earlier. To end the general comments on something positive I will have another paragraph stressing how valuable the manuscript would be, that I hope it will ultimately be published, etc. Even if I feel I have to suggest rejection I try to stress a positive element of the work.

Finally, and as mentioned above, there is the list of minor suggestions. Most other reviewers I have run into seem to structure their reports similarly.

When submitting the report, however, one does not only have to provide the text I have discussed so far, although it is certainly the most useful from the authors' perspective. No, nearly all journals have a field of "reviewer blind comments to the editor", which I rarely find necessary to use, and a number of questions that the reviewer has to answer. The latter are typically on the following lines:
  • Is the language acceptable or is revision required?
  • Are the conclusions sound and do they follow logically from the results?
  • Are all the tables and figures necessary?
And so on. The problem I usually have is that these questions are binary, but I would like to write something like "mostly yes, except for that instance here which really needs to be dealt with".

Thursday, June 8, 2017

Botany picture #244: Primula veris


Perhaps one of the most artsy pictures I have ever taken, this shows a Primula veris (Primulaceae) at the Zurich Botanic Gardens, taken in 2009. I was at that time involved in pollination experiments on the species.

Saturday, June 3, 2017

Reading up on biogeography part 5: time-slices

Today finishes up, at least for the moment and until the next special issue comes out, the little series on panbiogeography and vicariance biogeography. The last paper is

Corral-Ross V, Morrone JJ, 2017. Analysing the assembly of cenocrons in the Mexican transition zone through a time-sliced cladistic biogeographic analysis. Australian Systematic Botany 29: 489-501.

It uses area cladograms, as did one of the papers already discussed, but as the title indicates it does so in a way that examines different "time slices".

Before I get to the methodology I would like to establish an analogy.

Imagine you read a recipe for what are supposed to be very amazing pancakes. The instructions are as follows: (1) mix eggs and milk; (2) place the concrete in the deep freezer; (3) pour the mixture into the frying pan. Looking at such instructions you may well wonder first what concrete has to do with anything - not only would you not expect concrete to be part of pancake-making in the first place, but it does not even seem to be used for anything. Next you may notice that there is no mention of flour, although some kind of flour would appear to be necessary to produce pancakes.

There are now at least three possibilities. One is that the authors of this recipe have no idea how to make pancakes and merely pretend they do. Another is that they do in fact know what they are doing in the kitchen but merely wrote very incomplete and confusing instructions. Finally, there is the possibility that we, the readers, are just too ignorant or blinkered to understand the brilliance of the approach.

Some of the papers in cladistic biogeography and panbiogeography are very clear in their methodology, and I can immediately understand what they did and perhaps even what their logic is, even if I may have concerns. But with others I feel as if I am confronted with the above pancake recipe. Either the authors have no idea how to do biogeography, or the methods section could be clearer, or I have no idea how to do biogeography.

In the present case, the authors assembled 49 phylogenies ("cladograms") of various groups of organisms occurring in the Mexican biota that they were interested in. They then, as usual for area cladistics, replaced the names of the terminal taxa in the phylogenies with the areas those terminals occur in, and then extracted "paralogy-free subtrees" for analysis.
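The relabelling step, at least, is simple enough to sketch in a few lines of Python. The tree is written as nested tuples, and the taxa, areas and function name are invented for illustration; this is my own reading of the step and not the authors' software, and I make no attempt here at the extraction of paralogy-free subtrees.

```python
def to_area_cladogram(tree, areas):
    """Replace each terminal taxon in a nested-tuple tree by the area it occurs in."""
    if isinstance(tree, tuple):
        # Internal node: relabel every child clade recursively.
        return tuple(to_area_cladogram(child, areas) for child in tree)
    # Terminal: look up the area of this taxon.
    return areas[tree]


# Invented taxa and areas, purely for illustration.
phylogeny = (("taxon1", "taxon2"), ("taxon3", ("taxon4", "taxon5")))
areas = {"taxon1": "A", "taxon2": "B", "taxon3": "A", "taxon4": "C", "taxon5": "B"}

print(to_area_cladogram(phylogeny, areas))
```

Note that in this toy example areas A and B each turn up more than once in the result.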

To this point it is the same approach as in the previous area cladistics paper, and once again I am a bit uncertain how precisely it works and, more importantly, how it could possibly be justified. When a molecular phylogeneticist removes paralogous alleles from the analysis they do so because we understand a lot about gene duplication, gene families, pseudogenes and suchlike. When an area cladist picks subtrees out of a larger area cladogram, what is the parallel? What is the theory behind it? How do they explain the existence of what they call paralogy in a way that does not make the whole idea of having area cladograms appear absurd? I cannot help but wonder if it is anything more than "this is too complicated so we will ignore it". Maybe I just haven't seen the proper justification, but the papers I have looked at so far seem to limit themselves to saying that paralogy exists and needs to be removed.

But now for the time-slicing. This is now really the pancake recipe: if it works the way I believe I understand the methods then it doesn't make any sense. But if it works in a different way that actually makes sense then it isn't explained well enough for me, at least, to understand. The way it looks to me is that the authors assigned each group of organisms for which they had a phylogeny in their analysis to a "cenocron", which they defined as a "set of taxa that share the same biogeographical history, which is recognised as a subset within a biota".

They then conducted three different analyses supposedly corresponding to the Miocene, the Pliocene and the Pleistocene, using phylogenies from only one, two and then all three cenocrons, respectively. In other words (again, if I understand correctly), the idea seems to be that only organisms from one of the cenocrons would have been in the study area in the Miocene, with the others arriving later. I think.

The conclusion after all this effort is that "the Mexican Transition Zone is a complex area that differs in delimitation from one analysis to another. The present study showed that the results may depend on the assemblage of the taxa analysed, with time-slicing being an adequate strategy for deconstructing complex patterns in cladistic biogeography". That is not exactly the most concrete conclusion I have ever seen. What is more, the second paragraph of the introduction already explained that "the Mexican transition zone, as defined by Halffter (1987), is a complex area", so at this moment I am not really on top of what new insights the analysis produced.

But my more important question is this one: How does the claim work that the "Miocene analysis" examines the Miocene time-slice when the authors appear to have used phylogenies of contemporary (that is Pleistocene) species? The Miocene was 5 to 23 million years ago. The species in the phylogeny would not have existed yet, only their distant ancestors would have, with potentially very different geographic ranges. We are talking the time of our common ancestor with the chimps and waaay beyond!

Do the authors assume that all contemporary species existed 20 million years ago and have remained utterly static since that time? Where is the flour that one would absolutely need to get a pancake out of this? (Time-calibration of all the phylogenies they used followed by ancestral area estimation, in case that isn't immediately clear.)

Again: maybe I am missing something, perhaps even something that will be obvious to many others, which would make sense of this approach. But at the moment I pointlessly have a lump of concrete sitting in the freezer, and the promised pancake looks very much like a watery omelet to me.

Wednesday, May 31, 2017

Cogent Spam and, while we are at it, ARTOAJ spam

In the last two weeks several of the blogs I read have discussed an attempted 'hoax' publication that aimed to repeat for gender studies what Alan Sokal did for postmodern cultural studies in general when he made up a nonsense paper and got it published in a well-respected journal catering to that field. In the present case, Peter Boghossian and James Lindsay made up a deliberately nonsensical paper on the "conceptual penis as a social construct", but that is where the parallels end.

It seems as if they first submitted it to a relatively low ranking journal, were actually rejected, and then got it published in an even more obscure and, crucially, pay-to-play journal called Cogent Social Sciences. They then immediately went public explaining their hoax and declaring, "we suspected that gender studies is crippled academically by an overriding almost-religious belief that maleness is the root of all evil. On the evidence, our suspicion was justified". However, many people immediately pointed out it is not as easy as that.

They also, and actually first, discuss the problem of crappy pay-to-play journals, but as has been discussed elsewhere, this one experiment cannot be used to prove both potential reasons for the paper's publication at the same time. If it was published because Cogent Social Sciences is such a low quality journal that it will accept anything, then the stunt proves nothing about gender studies as a whole. Conversely, if the paper was published because the field of gender studies has no standards except the requirement to see maleness as evil, then it proves nothing about crappy publishers.

More concerning, however, seems to be the discussion that the 'hoax' has spawned. From a feminist perspective I have seen a blog post and an essay that both argued that the people celebrating it as a success can only be motivated by a hatred of feminism and a fear of women in power. I am not really sure where that comes from; the possibility should at least be entertained that the underlying motivation is the one that is stated, i.e. being fed up with postmodernist gibberish and the politicisation of academia.

On the other hand, I found it really frustrating how many people who are supposedly rationalists, skeptics, and generally science-savvy do not appear to understand at all the problem of crappy pay-to-play journals. Over and over, even in comments under articles that explained in detail what is going on here, people would write something to the effect of "but a social science journal accepted it, so there". Argh. If some guy operated a website called International Journal of Evolutionary Biology Research out of his garage, and a creationist paid him $200 in publication fee to get a nonsensical paper posted on that website, would that show that all of evolutionary biology is nonsense? Quite so. Then why the failure to appreciate the same problem in this case? Blind tribalism?

I was a bit torn at first when I had a look at the Cogent Open Access website myself. It looks much more professional than most obscure pay-to-play publishers I have seen so far, and I could at that moment not remember them spamming. Then again, I also never heard of that publisher before. Looking into a few papers they published in an area that I can judge I was not exactly overwhelmed, but okay. Just a few days ago, however, on some whim I looked into my junk mail folder, and what would I see but a spam eMail from Cogent Biology?


Again, it is not the worst I have seen, but let's count the ways in which it raises red flags for me:
  1. Well, first of all, it is a spam message, randomly soliciting papers from huge numbers of researchers who did not sign up to receive these messages. This is not a practice generally associated with serious publishers.
  2. Promise of quick review and, in particular, surprisingly fast publication after acceptance.
  3. Suspiciously broad scope of the journal.
  4. Bragging about being 'indexed' in services that either are the usual suspects, or I have never heard of, or are mere utterly non-discriminating search engines like Google Scholar, as if any of that were a mark of quality.
  5. Unprofessional random bolding, italicisation, and colouring of words across the text of the message.
Add "greetings of the day!" and two more font colours and it would be utterly standard for the field. In other words, this clinches it, at least for me: I think publishing the 'hoax' paper in Cogent Social Sciences demonstrates absolutely nothing about gender studies (and nothing that we didn't already know about publishing).

Note, by the way, that the two following statements are completely independent:
  • This so-called hoax was a dud, and the people who celebrate it either don't understand its problems or exhibit a disconcerting failure of skepticism.
  • Gender studies as currently practiced is largely bollocks.
It is entirely possible to believe both at the same time, although personally I do not consider myself qualified to have an opinion on the second statement. More to the point, even if the second statement could be proved beyond doubt, it would not at all disprove the first. It is curious how many people do not seem to appreciate that, as they appear to try to demonstrate the success of the hoax by pointing at completely different papers that they also consider ridiculous.

---

While on the topic of science spam, on Monday I received a particularly hilarious instance:
Good Morning.....!
What a professional salutation.
Can we have your article for successful release of Volume 6 Issue 5 in our Journal?
Wait, what article are you talking about, specifically? Also: no.
In fact, we are in need of one article to accomplish the Issue prior 10th June; we hope that the single manuscript should be yours. If this is a short notice please do send 2 page opinion/mini review/case report, we hope 2 page article isn't time taken for eminent people like you.
So basically: send us whatever you want, we just want stuff!
Your trust in my efforts is the highest form of our motivation, 
Gibberish alert!
I believe in you that you are eminent manuscript brings out the best citation to our Journal.
I believe in you that you are poor at constructing English sentences. And this is just beautiful: they come right out and say that this is about what is best for their randomly capitalised "Journal" as opposed to what is best for science or the author. Ye gods.
Anticipate for your promising response.
Ding! Gibberish!
Regards,
Sophia Mathis
If that is really the name of the author of that message I will eat my hat.
Agricultural Research & Technology: Open Access Journal (ARTOAJ)
The names keep getting more ridiculous. I guess all the good ones are taken? Now for the finale:
*Note: Wanna get more citations for your articles publish with us as i-books, e-books & Videos.
"Wanna". Somebody thought they could emulate what a serious academic publisher would write and they came up with "wanna get more citations". The mind reels.

Monday, May 29, 2017

Botany picture #243: Cnicothamnus


It occurs to me that it has been quite some time since I posted a plant picture. Should do more of those again.

I took this particular photo on a field trip to Bolivia in 2007, without knowing what it was; I merely thought that it was a rather unusual-looking daisy. Years later I then read a phylogenetic study of the Gochnatieae, and when I saw its figure 2B I went, "hey, that looks familiar!"

I cannot be sure which species it is, but it certainly seems to be a Cnicothamnus (Asteraceae); the combination of unusual capitulum size and colour, many greyish capitular bracts, and geographic provenance seems to make it a safe conclusion.

Saturday, May 27, 2017

Reading up on biogeography part 4: track analysis for bioregionalisation

With two papers left, I was wondering whether there would still be any point to going on. The last two use track analysis and area cladograms, respectively, and those were already used by the first and second paper, so would there be any new insights into the methodology of pan- and vicariance biogeography?

However, the next paper,

Martinez et al., 2017. Biogeographical relationships and new regionalisation of high-altitude grasslands and woodlands of the central Pampean Ranges (Argentina), based on vascular plants and vertebrates. Australian Systematic Botany 29: 473-488.

... uses track analysis at least partly to do something different than the previous instance. There is the question of the "relationship" of a biome, but then there is also bioregionalisation. So that is a new angle.

The idea seems to be relatively simple. As before, the panbiogeographer looks at the occurrences of species, draws minimum-distance lines ("tracks") between them, and then identifies areas where the tracks of several species overlap as "generalised tracks". In the present case, a very short generalised track is then "used to recognise natural areas in terms of their biota because they result from more or less consistent overlapping distributions of two or more endemic taxa".

Okay, same question as always: does this make sense?

Well, more than the claim that generalised tracks are always evidence of vicariance, which this paper kind of only makes in passing (while, weirdly, explaining the panbiogeographic reasoning in words so identical to those used in the Romano et al paper that I wonder if they were in both cases copy-pasted from Croizat). To me the approach just seems part unnecessarily complicated, part not data-rich enough.

As for the first, yes, an area with several endemic taxa may well deserve recognition as a natural unit, a vegetation zone, a biome (whatever) in some area classification. But if the idea is to identify areas defined by endemic species, why do we need a track analysis as an intermediate step? Why not simply plot the occurrences of endemic species? At that point all the information is there, and tracks, generalised or not, do not add anything.

As for the second, as I mentioned before there are several other methods available for bioregionalisation. Some use clustering approaches to group grid cells or other small areas into larger areas based on shared species content or even the relatedness of those species. The newest ones use modularity or map equation analyses to examine networks of species and the grid cells they occur in; in contrast to clustering, where it is the researcher's somewhat subjective choice how many clusters to accept, these network approaches have algorithms for deciding more objectively how many truly distinct units there are.
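
To make the clustering idea concrete, here is a minimal sketch, not taken from any of the papers discussed. It groups hypothetical grid cells by shared species content using Jaccard similarity and a simple single-linkage rule. The cell names, species names and the threshold are all invented for illustration, and the threshold is exactly the kind of somewhat subjective choice mentioned above.

```python
from itertools import combinations

# Hypothetical grid cells and the species recorded in each (invented data).
CELLS = {
    "c1": {"sp_a", "sp_b", "sp_c"},
    "c2": {"sp_a", "sp_b"},
    "c3": {"sp_x", "sp_y"},
    "c4": {"sp_x", "sp_y", "sp_z"},
}


def jaccard(a, b):
    """Shared species divided by all species found in either cell."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def regions(cells, threshold=0.5):
    """Single-linkage grouping: cells join a region if similarity >= threshold."""
    parent = {c: c for c in cells}

    def find(c):
        # Follow parent links to the representative of c's group.
        while parent[c] != c:
            parent[c] = parent[parent[c]]
            c = parent[c]
        return c

    for a, b in combinations(sorted(cells), 2):
        if jaccard(cells[a], cells[b]) >= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for c in cells:
        groups.setdefault(find(c), set()).add(c)
    return sorted(sorted(g) for g in groups.values())


print(regions(CELLS, threshold=0.5))
print(regions(CELLS, threshold=0.7))
```

Changing the threshold changes how many regions come out, which is the subjectivity that the network-based approaches try to avoid.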

In other words, in my eyes track analysis seems to be superfluous to requirements if we are merely interested in the simple measure of shared endemics, and it is unable to provide the depth of information that could be obtained from examining other shared distribution patterns.

Sunday, May 21, 2017

Reading up on biogeography part 3: Hopping between islands yes, hopping from continent to island no?

The third vicariance biogeography / panbiogeography paper in the special issue is

Grehan JR, 2017. Biogeographic relationships between Macaronesia and the Americas. Australian Systematic Botany 29: 447-472.

Despite being very long, its gist is easily summarised:

The mainstream explanation for the occurrence of plants and animals on the Macaronesian islands (Canary Islands, Madeira, etc.) is that they must have got there via long-distance dispersal, often from Africa but sometimes from the Americas, because the islands are of relatively young volcanic origin and distant from other land masses. However, the "model-based approaches" that this conclusion is based on cannot be accepted because they supposedly assume dispersal and ignore the possibility of vicariance.

This is followed by many pages of example cases of plants and animals illustrated with maps and phylogenies. It is not clear to me what that is supposed to show, because without a time axis it doesn't move the inference either way; at best it could show that some of the groups have a pattern that is consistent with vicariance, but if a lineage is too young then vicariance is still out, and the same if the lineage is much older than the island.

Finally, there is some speculation, again illustrated with maps, about whether there were always volcanic islands in the same area, all through from the time when the Atlantic started to open. They would have been transient on a geological scale, so the local lineages supposedly produced by vicariance when Africa and the Americas started moving apart would have had to island-hop as new volcanoes rose and older ones eroded away, over more than 100 million years.

In contrast to the previous two papers I did not really gain new insights into the methodologies favoured by vicariance biogeographers. In a sense the present paper is closer to an opinion piece or perhaps a review article than to a research study.

The supposed assumptions of "model-based approaches"

The paper claims
"Model-based approaches to Maccaronesian biogeography assume the that the [sic] sequence of phylogenetic relationships reflects a sequence of chance dispersal. Although often cited as Hennig's progression rule, it is not a rule but an assumption that does not address the equal applicability of sequential differentiation across a widespread ancestor."
And further on:
"Model-based methods use chance dispersal to explain divergence and allopatry, ..."
Unfortunately this claim at least is demonstrably false. There are various models available to do ancestral area inference (see this graphic as an example), and DIVA and the very popular DEC, for example, include vicariance. That's what the V in the acronym DIVA means! If a model-based analysis with a model that allows vicariance infers no vicariance then we can assume it is not because the model does not allow vicariance, but because the data didn't support that conclusion.

I am also reasonably certain that Hennig's progression rule does not only apply to long distance ("chance") dispersal but would just as well apply to a series of range expansions followed by speciation events across a single land mass. It simply applies the principle of parsimony to historical biogeography, arguing that if several lineages along a grade occur in an area then that would probably, all else being equal, have been (at least part of) the ancestral range, because other explanations require more dispersal and/or extinction events.

It is interesting, by the way, how the word "model" seems to be used in this context, as if a mathematical description of a system is something bad.

What distribution patterns would we expect under vicariance and long-distance dispersal, respectively?

"The progression rule also assumes that a 'basal' grade is located in the source region or centre of origin, but some Macaronesian clades are basal to large continental clades, and there are also clades with 'reciprocal monophyly' in which a diverse Macaronesian clade is the sister group to a diverse continental clade. These phylogenetic and geographic incongruities do not arise in a vicariance interpretation of phylogeny, because a basal clade or grade marks only the location of the intial phylogenetic break or breaks within a widespread ancestral range."
I don't really understand the reasoning here. The idea seems to be that if an island clade is nested within a continental grade, then it may make sense to conclude dispersal, but if an island clade and a continental clade are sister to each other then it is somehow "incongruent" (with what?) and can only be explained by vicariance. Why?

I would look at the nearest outgroup to get more information, but even if that occurred in neither region then we would still have to ask if additional continental or island lineages may have simply gone extinct. The key questions are whether the lineage split is so recent that it happened considerably after continental break-up and whether an island lineage is older than the island(s). Really I don't see how we can conclude anything with confidence without a time axis.

Perhaps the idea is to equate "distribution of the species along a basal grade is evidence of a centre of origin" with "absence of such a basal grade is evidence of absence of a centre of origin"? If so, that would not be logical; absence of evidence for A is not evidence for not-A.

The paper also discusses other patterns, in this case non-overlapping ranges of related species (allopatry):
"Model-based methods use chance dispersal to explain divergence and allopatry, and yet allopatric divergence requires isolation, which cannot exist if there is effective dispersal."
The second half of this sentence sets up a false dichotomy between dispersal so frequent that it makes speciation impossible and no dispersal at all. It seems obvious to me that the excluded middle is dispersal that does happen but is rare enough for speciation to remain possible.
"In the same way that allopatric lineages within Tarentola are incongruent with the expectations of chance dispersal, so too is the allopatry of Tarentola and its New World sister group."
Again this makes no sense to me whatsoever, and again there seems to be some very black-and-white reasoning behind it: if species can disperse to distant islands everything should occur everywhere; but we observe that not all species occur everywhere, so we have to conclude that dispersal is completely impossible. But this is one-to-one equivalent to the argument that you cannot produce random numbers with a die because when you cast it the second time it came up with a different number than the first time. Really, that seems to be the logic here.

One might also add that there is another fairly obvious reason why one would find patterns of allopatry even if the same region was reached two or three times by the same lineage: competitive exclusion. It is a well established, empirically tested insight of biogeography that islands (and by extension restricted areas in general) have a carrying capacity, both in overall diversity and in the number of species trying to occupy about the same ecological space. In the case of islands in particular, their species diversity is a function of size (the more land, the more species, mostly because smaller area increases extinction rate) and distance from the nearest larger land mass (the closer, the more species, mostly because of higher immigration / dispersal rates filling up the species pool).

This makes a lot of intuitive sense. Assume you have a seed of a continental shrub species blown onto an island that so far has only been colonised by mosses, lichen, one species of grass, and a bunch of insects eating the former. Your shrub niche is still free, and there is nothing on the island that is adapted to eating you, so even if at first you are in a bit of trouble genetically (inbreeding) and ecologically (not used to this soil and climate) you have a reasonable chance of establishing. Now fast forward 500,000 years, and the single seed of that shrub has diversified into six species occupying every niche on the island that they could adapt to in that time, forming thick scrubland from coastal dunes to the highest peak. A new seed of a related continental shrub species ends up on the island - but now everything is occupied by relatives that have become well-adapted to this new environment. Are we really surprised that the second comer will have a harder time establishing?
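
The size-and-distance relationship described above can be illustrated with a toy equilibrium calculation. This is only a sketch under assumptions of my own choosing: immigration is taken to fall linearly as the island fills up and to scale with 1/distance, extinction to rise linearly with species number and to scale with 1/area. The function name, the parameters and these functional forms are hypothetical and not fitted to any real islands.

```python
def equilibrium_species(area, distance, pool=100, i_max=1.0, e_max=1.0):
    """Species number at which assumed immigration and extinction balance.

    Assumed rates for an island currently holding S of `pool` species:
        immigration(S) = (i_max / distance) * (1 - S / pool)
        extinction(S)  = (e_max / area) * (S / pool)
    Setting the two equal and solving for S gives the value returned here.
    """
    immigration = i_max / distance  # assumption: farther islands get fewer colonists
    extinction = e_max / area       # assumption: smaller islands lose species faster
    return pool * immigration / (immigration + extinction)


# Same distance, different size: the larger island holds more species.
print(equilibrium_species(area=10, distance=10))
print(equilibrium_species(area=100, distance=10))

# Same size, different distance: the nearer island holds more species.
print(equilibrium_species(area=10, distance=5))
print(equilibrium_species(area=10, distance=50))
```

As the island approaches this equilibrium, the assumed immigration term shrinks towards the extinction term, which is a crude way of saying that a late arrival finds most of the room already taken.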

Time-calibration of phylogenies, again

We had that one already in the Ung et al paper, but once more:
"Model-based methods, with rare exceptions, present molecular divergence ages as falsifications of early origins, at or before continental breakup, even though they are calibrated by fossils that can generate only minimal divergence dates. Although it is widely claimed that molecular-clock analyses are generate [sic] evidence of dispersal (Sanmartin et al., 2008), molecular divergence estimates artifically constrain the maximum age of taxa that may be much older than their oldest fossil or the age of the current island they occupy (Heads 2009a, 2012, 2014a, 2014b, 2016)."
I like the little caveat "with rare exceptions", although it is unclear what it refers to. But it is not a method but the researcher using a method who would conclude that a lineage diverging 12 Mya did not diverge because of a tectonic event that happened 120 Mya. And yes, that conclusion makes a lot of sense to me, and no, "model-based" methods do not magically transform minimum ages into maximum ages. This has been discussed repeatedly in rebuttals to Heads' papers. What is more, people have run analyses using the alternative approach suggested by Heads and in the present paper and found that the results are generally absurd, such as pushing the age of the daisy family back before the origin of multi-cellular life.
"The timing of ancestral differentiation may be assessed either by fossils (including molecular extrapolations) or tectonic-biogeographic correlation."
First, fossil calibration and the use of estimated substitution rates are really two completely different data sources, so the former does not really "include" the latter. Second, using continental breakup to calibrate splits in the phylogeny would, as mentioned before, be circular reasoning. It would build the assumption of vicariance into the analysis to subsequently conclude vicariance as a result. I think that's not how science is supposed to work.
"Fossil data provide only the minimum known-age of taxa and [sic] fossils are often lacking for clades of interest to Macaronesia. In tectonic correlation, the estimate of clade age is more precise, because it refers to a particular, dated event, rather than a minimal (fossil-calibrated) age."
Yes, a fossil provides a minimum age. But unless I severely misunderstand something, a continental break-up could, at best, provide only a maximum age, if we assume that divergence would not have been possible before break-up. (And even that seems fishy to me, given that there are plenty of speciation events on the same landmass.) If it were to be taken as "precise" that would, once more, automatically exclude the possibility that the divergence happened later, after dispersal from one continent to the other, and that would be circular reasoning.

Even the vicariance approach would need long distance dispersal to work

Finally, I am puzzled by the idea of how the lineages would have stayed in place after the supposed vicariance event that would have happened long before the present islands came into existence:
"Island biota survives erosion and subsidence of island habitats by local dispersal onto newer volcanoes"
What I don't get is this: if a vicariance biogeographer can accept that a species hops across the ocean from one volcanic island to another, why can they not accept that it hops across the ocean from Africa onto one of the volcanic islands? What's the difference? Why is this discussion taking place again? I must be missing something very subtle here.

Friday, May 19, 2017

Reading up on biogeography part 2: Panbiogeographic Track Analysis

The second paper in this little series of posts is
Romano MG et al, 2017. Track analysis of agaricoid fungi of the Patagonian forests. Australian Systematic Botany 29: 440-446.
What I appreciated about reading it was first that it was concisely written, and second that it gave me insight into the Panbiogeographic methodology of Track Analysis. It had so far been merely a bunch of arcane terms to me, which of course makes it impossible to judge its meaning. And in contrast to the previous paper, which left out most of the details of its methodology and instead referenced earlier papers, this one gives a clear explanation. This kind of stuff is exactly why I am reading through the journal issue.

So, how exactly does Track Analysis work?

First, you need species with disjunct areas of distribution - or at least species that are poorly enough sampled that they appear to be disjunct. Then you draw a line along the shortest distance between any two of their occurrences. Let's assume we have a species occurring on two islands of this little landscape I just generated in GIMP:


Panbiogeographers call this red line, with the occurrences of the species forming the end points, a Track.

If you have more than one species showing the same Track, you promote that line on the map to a Generalised Track:


To cite the present paper, in panbiogeographic logic "a generalised track ... allows inference of the existence of an ancestral biota widely distributed and fragmented by vicariance events, suggesting a shared history."

Now you may come up with other tracks in the same study group that do not run parallel. Where generalised tracks cross each other, panbiogeographers draw a circle with an X in it and call that place a Node, like this:


In this case, their interpretation is that this is "a complex area, where different ancestral biotic and geological fragments interrelate in space-time as a consequence of terrain collision, docking or suturing".

Aaaaand... that was it, really. Draw some lines on the map, conclude vicariance and "complexity". The rest of the conclusions in the present paper are largely about the need for more sampling, and that fungi can also be used as a study group.
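
For what it is worth, the three drawing steps can be sketched in a few lines of code. This is my own minimal reading of the procedure as described above, not the software or exact algorithm used in the paper. I am assuming that occurrences are snapped to named localities on a flat map, that a species with more than two localities is connected by a minimum spanning tree, and that a track counts as "generalised" when at least two species share the same segment. The localities and species are invented.

```python
from collections import Counter
from itertools import combinations
from math import dist

# Hypothetical localities (name -> x, y) and species occurrences (invented data).
LOCALITIES = {"A": (0, 0), "B": (4, 4), "C": (0, 4), "D": (4, 0)}
OCCURRENCES = {
    "sp1": {"A", "B"},
    "sp2": {"A", "B"},
    "sp3": {"C", "D"},
    "sp4": {"C", "D"},
    "sp5": {"A", "C"},
}


def individual_track(names, coords):
    """Connect one species' localities by a minimum spanning tree (Prim)."""
    names = sorted(names)
    if len(names) < 2:
        return set()
    in_tree, edges = {names[0]}, set()
    while len(in_tree) < len(names):
        # Shortest link from any connected locality to any unconnected one.
        a, b = min(
            ((i, j) for i in in_tree for j in names if j not in in_tree),
            key=lambda e: dist(coords[e[0]], coords[e[1]]),
        )
        edges.add(frozenset((a, b)))
        in_tree.add(b)
    return edges


def generalised_tracks(occurrences, coords, min_species=2):
    """Segments that appear in the tracks of at least `min_species` species."""
    counts = Counter()
    for names in occurrences.values():
        counts.update(individual_track(names, coords))
    return {edge for edge, n in counts.items() if n >= min_species}


def _orient(a, b, c):
    """Sign tells on which side of the line a-b the point c lies."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])


def nodes(tracks, coords):
    """Pairs of generalised tracks whose segments properly cross each other."""
    found = []
    for t1, t2 in combinations(sorted(tracks, key=sorted), 2):
        p1, p2 = (coords[n] for n in sorted(t1))
        q1, q2 = (coords[n] for n in sorted(t2))
        if (_orient(q1, q2, p1) * _orient(q1, q2, p2) < 0
                and _orient(p1, p2, q1) * _orient(p1, p2, q2) < 0):
            found.append((t1, t2))
    return found


tracks = generalised_tracks(OCCURRENCES, LOCALITIES)
print(sorted(sorted(t) for t in tracks))
print(len(nodes(tracks, LOCALITIES)))
```

Note that nothing in this procedure looks at phylogeny, age or habitat; the output depends only on where the dots are.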

Does this approach make sense?

Unfortunately, I don't really see it. The logic behind the panbiogeographic interpretation of Generalised Tracks is that patterns of disjunction shared by several taxa are evidence of vicariance, presumably because they assume that chance dispersal would have to be utterly random and create different distributional patterns in each and every species.

But a little contemplation should blow that idea out of the water. There are several other good reasons why disjunct ranges can be shared across taxa. One would be an a priori lack of alternative habitat - if you have two wet patches and otherwise only steppe, then all wetland species will be restricted to those two patches, even if one of the two wetlands was colonised from the other entirely through long distance dispersal. And that restriction alone will produce a shared history, without vicariance. Another option would be prevailing wind or ocean currents, which make long distance dispersal decidedly more probable in some directions even as it is still a stochastic process (dice, but a bit loaded) and, more importantly, not vicariance.

The interpretation of Nodes as showing things like terrain collision also seems to be missing a few crucial steps, at least in my eyes. Don't get me wrong, I am as aware of fossil ranges being an important part of evidence in geology as the next biologist, but still I'd actually prefer to consult a geologist instead of trying to deduce geological history from patterns of distribution alone.

Finally, this whole approach appears to have a weakness that seems quite critical. Science does not proceed by knowing how to confirm, it proceeds by knowing how to reject a hypothesis. Now the question here is this. Yes, panbiogeographic track analysis is apparently designed to conclude vicariance and an area being "complex". But if a disjunction really has not been caused by vicariance, how would a panbiogeographer conclude that? Would they ever do so?

That, alas, is left unexplained, at least in this paper.