Saturday, March 17, 2018

Bioregionalisation part 5: Cleaning point distribution data in R

I should finally complete my series on bioregionalisation. What is missing is a post on how to do a network (Modularity) analysis in R. But first I thought I would write a bit about how to efficiently do some cleaning of point distribution data in R. As often I write this because it may be useful to somebody who finds it via search engine, but also because I can then look it up myself if I need it after not having done it for months.

The assumption is that we start our spatial or biogeographic analyses by obtaining point distribution data by querying e.g. for the genus or family that we want to study on an online biodiversity database or aggregator such as GBIF or Atlas of Living Australia. We download the record list in CSV format and now presumably have a large file with many columns, most of them irrelevant to our interests.

One problem that we may find is that there are numerous cases of records occurring in implausible locations. They may represent geospatial data entry errors such as land plants supposedly occurring in the ocean, or vouchers collected from plants in botanic gardens where the databasers for some reason entered the garden's coordinates instead of those of the source location, or other outliers that we suspect to be misidentifications. What follows assumes that such records have at least been dealt with already (and that is hard to automate anyway), but we can use R to help us with a few other problems.

We start up R and begin by reading in our data, in this case all lycopod records downloaded from ALA. (One of the advantages of that group is that very few of them are cultivated in botanic gardens, and I did not want to do that kind of data clean-up for a blog post.)
rawdata <- read.csv("Lycopodiales.csv", sep=",", na.strings = "", header=TRUE, row.names=NULL)
We now want to remove all records that lack any of the data we need for spatial and biogeographic analyses, i.e. identification to the species level, latitude and longitude. Other filtering may be desired, e.g. of records with too little geocode precision, but we will leave it at that for the moment. In my case the relevant columns are called genus, specificEpithet, decimalLatitude, and decimalLongitude, but that may of course be different in other data sources and require appropriate adjustment of the commands below.
rawdata <- rawdata[!(is.na(rawdata$decimalLatitude) | rawdata$decimalLatitude==""), ]
rawdata <- rawdata[!(is.na(rawdata$decimalLongitude) | rawdata$decimalLongitude==""), ]
rawdata <- rawdata[!(is.na(rawdata$genus) | rawdata$genus==""), ]
rawdata <- rawdata[!(is.na(rawdata$specificEpithet.1) | rawdata$specificEpithet.1==""), ]
All the records missing those data should be gone now. Next we make a new data frame containing only the data we are actually interested in.
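The four filtering steps above can also be folded into a single logical test. What follows is only a minimal sketch on a small made-up data frame; the toy records and the helper name has_value are my own assumptions for illustration, and the column names would need adjusting to the actual download.

```r
# Hypothetical toy records standing in for the downloaded table
toy <- data.frame(
  genus = c("Lycopodium", "Huperzia", NA, "Lycopodiella"),
  specificEpithet.1 = c("deuterodensum", "", "varia", "lateralis"),
  decimalLatitude = c(-35.3, -36.4, -37.1, NA),
  decimalLongitude = c(149.1, 148.3, 147.2, 146.0),
  stringsAsFactors = FALSE
)

# Hypothetical helper: TRUE where a value is neither NA nor an empty string
has_value <- function(x) !(is.na(x) | x == "")

needed <- c("genus", "specificEpithet.1", "decimalLatitude", "decimalLongitude")

# Keep only rows in which every needed column has a value
keep <- Reduce(`&`, lapply(toy[needed], has_value))
toy_clean <- toy[keep, ]
print(toy_clean)
```

Only the first toy record has all four values, so it is the only one that survives.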
lat <- rawdata$decimalLatitude
long <- rawdata$decimalLongitude
species <- paste( as.character(rawdata$genus), as.character(rawdata$specificEpithet.1), sep=" " )
mydata <- data.frame(species, lat, long)
mydata$species <- as.character(mydata$species)
Unfortunately at this stage there are still records that we may not want for our analysis, but they can mostly be recognised by having more than the two usual name elements of genus name and specific epithet: hybrids (something like "Huperzia prima x secunda" or "Huperzia x tertia") and undescribed phrase name taxa that may or may not actually be distinct species ("Lycopodiella spec. Mount Farewell"). At the same time we may want to check the list of species in our data table with unique(mydata$species) to see if we recognise any other problems that actually have two name elements, such as "Lycopodium spec." or "Lycopodium Undesignated". If there are any of those, we place them into a vector:
kickout <- c("Lycopodium spec.", "Lycopodium Undesignated")
Then we loop through the data to get rid of all these problematic entries.
myflags <- rep(TRUE, nrow(mydata))
for (i in seq_along(myflags)) {
  # flag names that do not have exactly two elements or that are listed in 'kickout'
  if ( (length(strsplit(mydata$species[i], split=" ")[[1]]) != 2) || (mydata$species[i] %in% kickout) ) {
    myflags[i] <- FALSE
  }
}
mydata <- mydata[myflags, ]
If there is no 'kickout' vector for undesirable records with two name elements, we do the same but adjust the if command accordingly to not expect its existence.
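For larger tables the same filter can be written without a loop. This is a sketch under the assumption that the species column is a character vector; the toy names are made up, and an empty kickout vector (character(0)) would work just as well, so the test does not have to be rewritten when there is nothing to kick out.

```r
# Hypothetical species names covering the problem cases described above
species <- c("Huperzia prima", "Huperzia prima x secunda", "Huperzia x tertia",
             "Lycopodiella spec. Mount Farewell", "Lycopodium spec.",
             "Lycopodium Undesignated", "Lycopodium quartum")
toy <- data.frame(species, stringsAsFactors = FALSE)

kickout <- c("Lycopodium spec.", "Lycopodium Undesignated")  # may also be character(0)

# Number of space-separated name elements per record
n_elements <- lengths(strsplit(toy$species, split = " "))

# Keep records with exactly two name elements that are not listed in kickout
keep <- n_elements == 2 & !(toy$species %in% kickout)
toy <- toy[keep, , drop = FALSE]
print(toy$species)
```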

Check again unique(mydata$species) to see if the situation has improved. If there are instances of name variants or outdated taxonomy that need to be corrected, that is surprisingly easy with a command along the following lines:
mydata$species[mydata$species == "Outdatica fastigiata"] = "Valida fastigiata"
In that way we can efficiently harmonise the names so that one species does not get scored as two just because some specimens still have an outdated or misspelled name.
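If there are many names to correct, one option is to collect them in a lookup table instead of writing one command per name. The following sketch assumes a named character vector that I call synonyms; all names in it are invented.

```r
# Hypothetical records; "Outdatica fastigiata" and "Valida fastigata" are
# made-up variants of the made-up accepted name "Valida fastigiata"
species <- c("Outdatica fastigiata", "Valida fastigiata", "Valida fastigata",
             "Planta vulgaris")

# Named vector: names are the variants, values are the accepted names
synonyms <- c("Outdatica fastigiata" = "Valida fastigiata",
              "Valida fastigata"     = "Valida fastigiata")

# Replace every name that appears in the lookup table, leave the rest alone
hit <- species %in% names(synonyms)
species[hit] <- unname(synonyms[species[hit]])
print(unique(species))
```

The advantage is mostly one of bookkeeping: the lookup table can be kept in its own file and reused when the data are downloaded again.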

Although we assume that we have checked for geographic outliers, we may now still want to limit our analysis to a specific area. In my case I want to get rid of non-Australian records, so I remove every record outside of a box of 9.5 to 44.5 degrees south and 111 to 154 degrees east around the continent. It turns out that this leaves parts of New Guinea in, but that is fine with me for present purposes; we don't want to over-complicate this now.
mydata <- mydata[mydata$long<154, ]
mydata <- mydata[mydata$long>111, ]
mydata <- mydata[mydata$lat>(-44.5), ]
mydata <- mydata[mydata$lat<(-9.5), ]
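The four box commands can also be combined into one test, which makes it easier to see the whole box at a glance. A sketch with a few made-up points (the coordinates are approximate and only serve as an illustration):

```r
# Hypothetical points with approximate coordinates
toy <- data.frame(
  site = c("Canberra", "Hobart", "Auckland", "Singapore"),
  lat  = c(-35.3, -42.9, -36.8, 1.35),
  long = c(149.1, 147.3, 174.8, 103.8),
  stringsAsFactors = FALSE
)

# Same box as above: 9.5 to 44.5 degrees south, 111 to 154 degrees east
in_box <- toy$lat  > -44.5 & toy$lat  < -9.5 &
          toy$long > 111   & toy$long < 154

# which() silently drops NA comparisons instead of returning all-NA rows
toy <- toy[which(in_box), ]
print(toy$site)
```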
At this stage we may want to save the cleaned up data for future use, just in case.
write.table(mydata, file = "Lycopodiales_records_cleaned.csv", sep=",")
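A small variation, in case the file is meant to be read by other software: write.csv with row.names = FALSE leaves out the extra column of row numbers. The sketch below uses made-up records and a temporary file (so nothing is overwritten) and checks that the table survives the round trip.

```r
# Hypothetical cleaned records
toy <- data.frame(species = c("Planta vulgaris", "Planta rara"),
                  lat = c(-26.45, -27.08), long = c(145.29, 144.88),
                  stringsAsFactors = FALSE)

# A temporary file is used here so that nothing is overwritten
outfile <- tempfile(fileext = ".csv")
write.csv(toy, file = outfile, row.names = FALSE)

# Reading the file back should return the same table
reread <- read.csv(outfile, stringsAsFactors = FALSE)
print(all.equal(toy, reread))
```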
And now, finally, we can actually turn the point distribution data into grid cells and conduct a network analysis, but that will be the next (and final) post of the series.

Saturday, March 10, 2018

Reading The Varieties of Religious Experience: Lecture 2

In his second lecture, James defines what he would consider 'religion' to be for the purposes of the lecture series.

He stresses right at the beginning that religion is such a complex phenomenon that anybody who thinks they can come up with a clear and simple definition is fooling themselves. He then mentions two aspects, the organisational structure (the church with its office holders and buildings) and the personal beliefs and feelings of each believer, and he excludes the former from consideration to focus his efforts on the latter.

That is unsurprising, given his psychological approach, and fair enough. A historian would perhaps be most comfortable addressing religion as an organised body while excluding personal psychology from their considerations. What I find interesting to observe, however, is that one aspect of religion as I see it is not even mentioned. To me, schools of thought that make truth claims, be they ideologies, religions, or scientific, philosophical, scholarly, and engineering communities, have three main components:
  • The people who adhere to the school of thought; they are the focus of James' lectures,
  • The institutional framework (research institutions, churches, political parties, think tanks, journals, internet fora, conferences, etc.); this James mentioned but excluded from consideration, and
  • The actual body of knowledge or belief system; it appears to remain unexamined so far.
Because 90% of the lectures are still to follow I don't want to dwell on this too much, but I find it interesting even at this stage that James appears curiously incurious about the first question that would come to my mind when faced with a school of thought: are its beliefs true? I guess I will see if he will go there later or if he will remain completely uninterested in that question throughout.

After having settled on the personal relationship of an individual human to the divine as his focus, James clarifies that believing in an actual personal god is not a criterion for him. He mentions 'Emersonianism' and Buddhism as examples of systems that work to produce religious feelings without having personalised deities. I had never heard of Emersonianism, but it appears to be a variant of pantheism, seeing the whole universe as divine and (believe it or not) benign.

Finally, James spends an astonishingly large part of his second lecture on discussing what mindsets he considers truly religious and what mindsets he does not. Again and again he negatively contrasts the philosophical, Stoicist acceptance of the way the world is with the Christian ideal of a joyous embrace of whatever happens, no matter how terrible. Although he sometimes calls the ascetic or highly spiritual Christian 'extreme', the language he uses leaves no doubt that he considers mindless exultation in the face of, say, seeing a loved one dying terribly to be an admirable state of mind, as evidence that religion is a positive force for humanity.

Again I hesitate to immediately reject his argumentation given how little I have progressed into this book, but even here I cannot help wondering whether this view relies quite a bit on a conflation of many different injustices or tribulations to which, really, we would be justified to react in very different ways. We are not merely talking about "the universe is unfair, and a truly wise person will accept that they can only do their best and be happier for it". No, depending on what we are talking about and if we assume gods to exist we may reasonably take very different stances - and I would actually say that religious bliss is the appropriate stance in none of the various cases.

We cannot always get all we wanted. Some things are unachievable, and sometimes we have to compromise with other people. Accepting that is just a sign of maturity. (Embracing such compromises joyously would seem to be a bit twee, though.)

Then there are the evils we do to each other, such as theft, bullying, rape, murder, etc. Really one of the most frustrating facets of human existence is how much needless misery we cause each other, both deliberately and accidentally, given that we would have quite enough misery left to deal with even if we were all perfectly nice to each other (see next point). Point is, in this case the perpetrators generally have a moral responsibility to do better, and joyously accepting their bad deeds is both unreasonable and counterproductive, as it will set perverse incentives and reward bad actors.

What James must really be talking about, however, would have to be 'natural evils', harm to us that is no other human's fault, everything ranging from having to die of old age across natural disasters to people being born with a genetic disorder. Under the (atheist) assumption that there is no god behind these phenomena, that they just happen, James' preferred stance of a joyous embrace would be ridiculous. Stoicist acceptance of what cannot be undone while trying one's best to undo these evils is a more sensible approach.

But what if we assume that natural evils are caused or at least allowed to happen by an omnipotent god who could, with the snap of their metaphorical finger, deliver us from such needless suffering? Does it make sense, under this assumption, to write, "dear superior intelligence running the universe, please accept my heartfelt thanks for making me slowly die of an untreatable, incredibly painful disease; and while on that topic, thanks also for that landslide that crushed my best friend when we were twelve years old"?

I can't say that this would feel sane to me. I would have some very serious questions about the moral character and motivations of such gods, if I believed for a moment that they existed. But then again, James acknowledges himself that there are some people who are unable to have religious feelings as he defined them. I assume I am one of those people, for better or for worse.

And note also that there are presumably many people who would consider themselves religious but who do not feel what James considers to be the religious impulse at its most pure.

Thursday, March 8, 2018

Alpha diversity and beta diversity

At today's journal club meeting, we discussed Alexander Pyron's opinion piece We don't need to save endangered species - extinction is part of evolution. I mentioned it in passing before and still think that his core argument, which is also reflected in the title, is logically equivalent to saying that murder is okay because all humans are going to die of natural causes one day anyway. But reading his piece more thoroughly than before, I now notice a few other, um, problems. The highlights:
Species constantly go extinct, and every species that is alive today will one day follow suit. There is no such thing as an "endangered species," except for all species.
What weirds me out here is the lack of a phylogenetic perspective in a piece written by a systematist - species are discussed as individuals that pop out of thin air and then disappear again. Of course, in the very long run every species will one day go extinct when the sun expands and boils off the oceans. But until then, in the time frame that Pyron discussed, no, not every species will go extinct, quite a few of them will diversify and survive as numerous descendant species, as did the ancestor of all land vertebrates or the ancestor of all insects in the past. They thus become effectively immortal (until, once more, the sun explodes anyway, etc.).
Yet we are obsessed with reviving the status quo ante. The Paris Accords aim to hold the temperature to under two degrees Celsius above preindustrial levels, even though the temperature has been at least eight degrees Celsius warmer within the past 65 million years. Twenty-one thousand years ago, Boston was under an ice sheet a kilometer thick. We are near all-time lows for temperature and sea level; whatever effort we make to maintain the current climate will eventually be overrun by the inexorable forces of space and geology.
This is sadly a classic of climate change denialism. Yes, there was change in the past too, but there are some major differences. One is the rate of change - the impacts we are having are coming much faster than most natural changes (excepting e.g. meteorite strikes and similarly sudden events), so that animals and plants have less of a chance to migrate or to adapt than they had in past cycles of warm and ice ages. Second, they have even less of a chance to migrate because we have fragmented their available habitats by putting roads, towns, croplands and pastures into their way. Third, past changes did not affect a highly urbanised human population of more than seven billion people; the potential of global change producing catastrophic results even just for us is much greater now than when we were just a few million widely dispersed hunter-gatherers. So yes, it is true that we cannot freeze the status quo in place forever, but I think we would do well to slow the rate of change as far as possible.
Infectious diseases are most prevalent and virulent in the most diverse tropical areas. Nobody donates to campaigns to save HIV, Ebola, malaria, dengue and yellow fever, but these are key components of microbial biodiversity, as unique as pandas, elephants and orangutans, all of which are ostensibly endangered thanks to human interference.
I just don't even. What is the logic here? "Nobody cares about conserving diseases that horribly kill us humans, so we should not care about conserving harmless pandas either?" How does that follow?
And if biodiversity is the goal of extinction fearmongers, how do they regard South Florida, where about 140 new reptile species accidentally introduced by the wildlife trade are now breeding successfully? No extinctions of native species have been recorded, and, at least anecdotally, most natives are still thriving. The ones that are endangered, such as gopher tortoises and indigo snakes, are threatened mostly by habitat destruction. Even if all the native reptiles in the Everglades, about 50, went extinct, the region would still be gaining 90 new species -- a biodiversity bounty. If they can adapt and flourish there, then evolution is promoting their success. If they outcompete the natives, extinction is doing its job.
And this is perhaps what frustrates me most, because while this is not an uncommon argument against biosecurity measures one would expect a biologist to know about different types of biodiversity instead of confusing them. To explain more clearly what is going on, consider the following diagrams. First, we have three areas, roundland, squareland, and hexagonland, with two endemic species each.

Then humans recklessly move species between the areas, allowing them to invade each other's natural ranges. It turns out that three of the species are particularly competitive and prosper at the cost of the other three, driving them to extinction.

Now there are three types of diversity to consider. The first is alpha-diversity, which means simply the number of species in a given place. As we see it has gone up by 50% in all three areas, from two to three species. Yay, more diversity! This is what Pyron proudly points at in Florida.

What is lost, however, is beta-diversity or turnover, that is the heterogeneity you observe as you move between areas. It was very high originally, as every area had its unique species, but now it has been wiped out entirely. Beta-diversity in the second diagram is precisely zero. Under the first scenario a squarelander can go on a holiday trip to roundland and admire the unique flora of that part of the world; under the second scenario they will travel to roundland and merely see the same few weeds that they have growing in their own front yard back home. And the endemic plants of hexagonland have all gone extinct, a 100% loss of that area's irreplaceable evolutionary history.

(Note that beta-diversity would also be zero if all six species survived everywhere. But that is clearly not a realistic assumption, as it would require each area to have such a high carrying capacity that they should each have evolved more than two species to begin with. We would not expect that all the plant species of the world could survive next to each other in, say, Patagonia, even if they were all introduced there.)

Finally, in our example global diversity has of course also been reduced, by 50%. So yeah, great to have more alpha-diversity in Florida, but does that make up for a massive net loss in both beta-diversity and global diversity? The argument seems rather misguided.
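For those who prefer numbers to diagrams, the example can be recomputed in a few lines of R. This is only a sketch under my own assumptions: the six species are simply labelled A to F, the three survivors are taken to be A, B and C (so that both hexagonland endemics are lost), and beta-diversity is measured as global diversity divided by mean alpha-diversity, minus one, which is just one of many available measures.

```r
# Presence (1) / absence (0) of six hypothetical species A-F in the three areas
areas <- c("roundland", "squareland", "hexagonland")
before <- matrix(c(1,1,0,0,0,0,
                   0,0,1,1,0,0,
                   0,0,0,0,1,1),
                 nrow = 3, byrow = TRUE, dimnames = list(areas, LETTERS[1:6]))

# After the invasions A, B and C occur everywhere; D, E and F are extinct
after <- matrix(rep(c(1,1,1,0,0,0), times = 3),
                nrow = 3, byrow = TRUE, dimnames = list(areas, LETTERS[1:6]))

diversity <- function(m) {
  alpha <- mean(rowSums(m))       # mean number of species per area
  global <- sum(colSums(m) > 0)   # number of species across all areas
  c(alpha = alpha, global = global, beta = global / alpha - 1)
}

print(rbind(before = diversity(before), after = diversity(after)))
```

With these assumptions alpha-diversity goes from 2 to 3, global diversity from 6 to 3, and beta-diversity from 2 to 0, which are the changes described above.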

Sunday, March 4, 2018

Reading The Varieties of Religious Experience: Lecture 1

I have started reading William James' The Varieties of Religious Experience. Published first in 1902, this collection of twenty lectures is considered to be a classic of the study of religion. It approaches the subject with a psychological as opposed to theological, historical, or apologetic angle, but appears to remain rather charitable towards religious beliefs.

This becomes clear already in the first lecture, much of which is spent assuring the believing reader that they have no reason to be offended by a psychological examination of religious experience.

James calls 'medical materialism' the idea that religion originated as the hallucinations and ravings of 'psychopaths' and 'degenerates' and can therefore be dismissed. (His words; see e.g. the interpretation of Saint Paul's vision of Jesus as the result of an epileptic seizure.) He argues that the value of a phenomenon, here religious truth claims, cannot be deduced from its origins; as an argumentum ad absurdum he points out that a scientific insight would be judged on its own merits even if the scientist who gained it was suffering from some mental disorder. By their fruits ye shall know them, not by their roots.

Well, fair enough, one might say. But while I cannot tell what the state of the discussion was around the year 1900, it seems as if this argument would miss the point of 'medical materialism' as it is applied today. Taking the position of an atheist, it is not the case that they attempt to answer the question of what to think of religious truth claims by looking at how they originated. They would most likely argue that that particular question has already been answered by applying the same criteria as James would (or at least the empirical one, see further down). They already take it as given that religious claims are largely false, and true only by lucky accident:

There is no evidence that there is something to us that lives on after death, and indeed the study of brain damages suggests that all there is to our personality is an emergent property of the physical. There is no evidence that the universe was created by a higher intelligence, and indeed it looks very much as if it wasn't. There is no evidence that the universe was created for our benefit, and indeed it looks very much as if it wasn't. There is no evidence that prayer works; and so on. There is also the small matter that hundreds of religions made and continue to make contradictory claims, meaning that only such a small percentage of them could be true as to be too close to zero percent to matter.

So given that background, the atheist now asks not what to think of a religious claim, but instead: How and why would people come up with something as wrong as that? And here hallucinations are a decent explanation for divine visions. That is why I feel that James' central argument in the first lecture misses its mark. But then again, he seemed to be more interested in reassuring religious readers than in criticising atheist ones anyway.

In this context it is also fascinating to examine what 'fruit' criteria James accepts as valid for judging spiritual and theological claims, now that he has rejected the 'root' criterion. He names three: immediate luminousness, philosophical reasonableness, and moral helpfulness.

Immediate luminousness is also described as based on 'our immediate feeling' upon being exposed to the claim. This seems rather oddly subjective and emotional, and at least in my eyes falls flat as a useful criterion.

Philosophical reasonableness is to be understood as based on how the claim relates to 'the rest of what we hold as true'. This is the most sensible of the three criteria, because that is also how we do it in science. If, for example, somebody presents us with the theories underlying homeopathy, such as water memory, we may consider in comparison what we believe we already understand about physics and chemistry. We then find that either large bodies of scientific knowledge supported by numerous experiments and empirical observations must all be utterly, mind-bogglingly wrong, or that, alternatively, homeopathy must be nonsense. At this stage it should be easy to figure out which of the two options strains our credulity less.

Still, in the context of religious truth claims, this approach appears unsatisfactory. How, after all, are any religious truth claims justified? If they are justified based on fitting into our body of scientific knowledge they are simply more scientific truth claims. If not, as is of course the case, then each religion constitutes a network of beliefs that may (or may not) be internally consistent but that is completely unmoored from other such networks and from observable reality. The philosophical reasonableness criterion will have a Christian accept a vision of Jesus in heaven as true and reject a vision of the imminent death of the sun as false, and it will have a precolumbian Aztec reject the former as false and accept the latter as true, with exactly the same justification. How useful.

Finally, moral helpfulness suffers from exactly the same flaw as the previous does in a religious context. Unless the belief system is at some point anchored on empirical, observable reality, it is turtles all the way down.

Monday, February 5, 2018

Botany picture #255: Exocarpos nanus

Currently we are back in Kosciuszko National Park for field work, and for the first time I have consciously seen Exocarpos nanus (Santalaceae), although it is so tiny that I may have previously stepped on it without noticing. Like its larger congeners it is a hemiparasite.

Sunday, February 4, 2018

Bioregionalisation part 4: networks

Having examined a clustering approach to bioregionalisation, today I will try to illustrate the increasingly popular alternative of network analysis.

Consider again our hypothetical study area of five cells with five taxa, where we want to know how to delimit bioregions (or phytoregions, given that the taxa are plant species) in an objective way:

The first step in the analysis is to interpret these data as a network. Specifically, as we have two different types of elements, what we are dealing with is called a bipartite network. Each type of element is connected directly only to elements of the other type, and to elements of its own type only via the other. In this case, the plant species are connected to all cells they occur in, and cells are connected to all plant species occurring in them:

Once we have scored this kind of network structure in a way that the software of our choice understands (either a list of connections or a matrix with 0s and 1s), we can use an algorithm that divides the network into modules. This algorithm tries to maximise connections within a module and to minimise the connections between modules, which in bioregion terms again means to maximise endemism.
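To make the two ways of scoring the network a bit more concrete, here is a minimal sketch in R. The species and cell names and their connections are invented for illustration and are not the data from the figure; the point is only that a list of connections and a matrix with 0s and 1s hold the same information.

```r
# Hypothetical occurrences: one row per species-cell connection (edge list)
edges <- data.frame(
  species = c("sp1", "sp1", "sp2", "sp2", "sp3", "sp4", "sp4", "sp5", "sp5"),
  cell    = c("c1",  "c2",  "c1",  "c2",  "c3",  "c3",  "c4",  "c4",  "c5"),
  stringsAsFactors = FALSE
)

# The same network as a species-by-cell matrix with 0s and 1s
incidence <- unclass(table(edges$species, edges$cell))
incidence[incidence > 1] <- 1   # duplicate records still count as one connection
print(incidence)
```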

As indicated in the posts on clustering, network analysis has the great advantage that it does not only produce groups, it also provides a reproducible and objective answer for the question about the optimal number of groups, whereas in clustering analysis the user still has to make a subjective decision.

That being said, it is always possible to take a large module by itself and explore its internal structure, if so desired, although of course the answer may be that there are no meaningful subdivisions any more.

Either way, any such algorithm will return modules, and what we are mostly interested in is what cells belong to what module. Nonetheless we would also be able to infer what species belong to what module, and depending on the type of network analysis we may be able to get other statistics that may be of interest for the network and for each individual module or even each element.

There are two main approaches to network analysis that have been explored in bioregionalisation. The first is called the Map Equation, developed by Rosvall et al. (2009) and promoted with a sleek, eponymous website. It was first applied to bioregionalisation by Vilhena & Antonelli (2015). One of its advantages is that it is the faster of the two, which may be particularly attractive if one's dataset is large and complex.

The second is Modularity Analysis (Newman, 2006). This is the approach that I prefer personally, after colleagues at my institution conducted a study comparing the two and clustering against each other (Bloomfield et al., 2017). It is slower than the Map Equation, but it seems to be better at recognising the transitional nature of cells situated between two 'pure' modules, which the Map Equation appears to tend to group into distinct modules in their own right.
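To give an idea of what is being maximised, the sketch below computes the modularity score Q in the general form given by Newman (2006) for a partition that is already given. Several things here are assumptions of mine: the small network is invented, it is treated as an ordinary undirected network although it is bipartite (variants designed for bipartite networks exist), and the code only scores a partition; it does not search for the best one, which is the hard part that the actual software takes care of.

```r
# Hypothetical species-by-cell incidence matrix (1 = species occurs in cell)
incidence <- matrix(c(1,1,0,0,0,
                      1,1,0,0,0,
                      0,0,1,0,0,
                      0,0,1,1,0,
                      0,0,0,1,1),
                    nrow = 5, byrow = TRUE,
                    dimnames = list(paste0("sp", 1:5), paste0("c", 1:5)))

# Square adjacency matrix of the whole network (species first, then cells)
n_sp <- nrow(incidence)
n_cell <- ncol(incidence)
A <- rbind(cbind(matrix(0, n_sp, n_sp), incidence),
           cbind(t(incidence), matrix(0, n_cell, n_cell)))

# Q = 1/(2m) * sum over node pairs in the same module of (A_ij - k_i * k_j / (2m))
modularity_q <- function(A, module) {
  k <- rowSums(A)                       # number of connections of each node
  two_m <- sum(A)                       # twice the number of connections
  same <- outer(module, module, "==")   # TRUE where two nodes share a module
  sum((A - outer(k, k) / two_m)[same]) / two_m
}

# Two candidate partitions; node order is sp1..sp5 followed by c1..c5
two_modules <- c(1, 1, 2, 2, 2,  1, 1, 2, 2, 2)
one_module  <- rep(1, 10)
print(modularity_q(A, two_modules))
print(modularity_q(A, one_module))
```

Lumping everything into a single module gives a score of zero, whereas the split into two modules, which cuts only one connection, scores clearly higher.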

Next time, how to do modularity analysis in practice.


Bloomfield NJ, Knerr N, Encinas-Viso F, 2017. A comparison of network and clustering methods to detect biogeographical regions. Ecography 41: 1-10.

Newman MEJ, 2006. Modularity and community structure in networks. Proceedings of the National Academy of Sciences, USA 103: 8577-8582.

Rosvall M, Axelsson D, Bergstrom CT, 2009. The map equation. arXiv: 0906.1405 [physics.soc-ph]

Vilhena DA, Antonelli A, 2015. A network approach for identifying and delimiting biogeographical regions. Nature Communications 6: 6848.

Saturday, January 27, 2018

Bioregionalisation part 3: clustering with Biodiverse

Biodiverse is software for spatial analysis of biodiversity, in particular for calculating diversity scores for regions and for bioregionalisation. As mentioned in previous posts, the latter is done with clustering. Biodiverse is freely available and extremely powerful. Just about the only minor issues are that the terminology used can sometimes be a bit confusing, and that it is not always easy to intuit where to find a given function. As so often, a post like this might also help me to remember some detail when getting back to a program after a few months or so...

The following is about how to do bioregionalisation analysis in Biodiverse. First, the way I usually enter my spatial data is as one line per sample. So if you have coordinates, the relevant comma separated value file could look something like this:
Planta vulgaris,-26.45,145.29
Planta vulgaris,-27.08,144.88
To use equal area grid cells you may have reprojected the data so that lat and long values are in meters, but the format is of course the same. Alternatively, you may have only one column for the spatial information if your cells are not going to be coordinate-based but, for example, political units or bioregions:
Planta vulgaris,Western Australia
Planta vulgaris,Northern Territory
Just for the sake of completeness, different formats such as a tsv would also work. Now to the program itself. You are running Biodiverse and choose 'Basedata -> Import' from the menus.

Navigate to your file and select it. Note where you can choose the format of the data file in the lower right corner. Then click 'next'.

The following dialogue can generally be ignored, click 'next' once more.

But the third dialogue box is crucial. Here you need to tell Biodiverse how to interpret the data. The species (or other taxa) need to be interpreted as 'label', which is Biodiversian for the things that are found in regions. The coordinates need to be interpreted as 'group', the Biodiversian term for information that defines regions. For the grouping information the software also needs to be told if it is dealing with degrees for example, and what the size of the cells is supposed to be. In this case we have degrees and want one degree squared cells, but we could just as well have meters and want 100,000 m x 100,000 m cells.
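To illustrate what sorting points into one degree cells amounts to, here is a small sketch in R. It is not how Biodiverse does it internally; the records are made up, and the assumption that cells start at whole multiples of the cell size is mine.

```r
# Hypothetical records in decimal degrees
toy <- data.frame(species = c("Planta vulgaris", "Planta vulgaris", "Planta rara"),
                  lat  = c(-26.45, -26.80, -27.08),
                  long = c(145.29, 145.91, 144.88),
                  stringsAsFactors = FALSE)

cell_size <- 1  # one degree; would be 100000 for data projected in meters

# Centre of the cell containing each point (cells start at multiples of cell_size)
toy$cell_long <- floor(toy$long / cell_size) * cell_size + cell_size / 2
toy$cell_lat  <- floor(toy$lat  / cell_size) * cell_size + cell_size / 2

# One row per unique species-cell combination
cells <- unique(toy[, c("species", "cell_long", "cell_lat")])
print(cells)
```

The first two records fall into the same cell, so three records turn into two species-cell combinations.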

After this we find ourselves confronted with yet another dialogue box and learn that despite telling Biodiverse which column is lat and which one is long it still doesn't understand that the stuff we just identified as long is meant to be on the x axis of a map. Arrange the two on the right so that long is above lat, and you are ready to click OK.

The result should be something like this: under a tab called 'outputs' we now have our input, i.e. our imported spatial data.

Double-clicking on the name of this dataset will produce another tab in which we can examine it. Clicking on a species name will mark its distribution on the map below. Clicking onto a cell on the map will show how similar other cells are to it in their species content. This will, of course, be much less clear if your cells are just region names, because in that case they will not be plotted in a nice two-dimensional map.

Now it is time to start our clustering analysis. Select 'Analyses -> cluster' from the menu. A third tab will open where you can select analysis parameters. Here I have chosen S2 dissimilarity as the metric. If there are ties during clustering it makes sense to break them by maximising endemism (because that is the whole point of the analysis anyway), so I set it to use Corrected Weighted Endemism first and then Weighted Endemism next if the former still does not resolve the situation. One could use random tie-breaks, but that would mean an analysis is not reproducible. All other settings were left as defaults.
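As far as I understand it, S2 is a Simpson-type dissimilarity: one minus the number of shared species divided by the sum of the shared species and the smaller of the two numbers of unshared species. The sketch below spells that out in R for two cells at a time; the function name and the species lists are hypothetical, and it is meant as an illustration of the formula, not as Biodiverse's own code.

```r
# S2 dissimilarity between two cells, given the species found in each
s2_dissimilarity <- function(cell1, cell2) {
  n_shared <- length(intersect(cell1, cell2))  # species in both cells
  n_only1  <- length(setdiff(cell1, cell2))    # species only in the first cell
  n_only2  <- length(setdiff(cell2, cell1))    # species only in the second cell
  1 - n_shared / (n_shared + min(n_only1, n_only2))
}

# Hypothetical species lists
cell_x <- c("sp1", "sp2", "sp3")
cell_y <- c("sp2", "sp3", "sp4", "sp5", "sp6")
cell_z <- c("sp7", "sp8")

print(s2_dissimilarity(cell_x, cell_y))
print(s2_dissimilarity(cell_x, cell_z))
```

Because only the smaller number of unshared species enters the formula, a species-poor cell nested within a species-rich one is not treated as very different from it, which is presumably why this kind of metric is attractive when species richness varies a lot between cells.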

After the analysis is completed, you can have the results displayed immediately. Alternatively, you can always go back to the first tab, where you will now find the analysis listed, and double-click it to get the display.

As we can see there is a dendrogram on the right and a map on the left. There are two ways of exploring nested clusters: Either change the number of clusters in the box at the bottom, or drag the thick blue line into a different position on the dendrogram; I find the former preferable. Note that if you increase the number too much Biodiverse will at a certain point run out of colours to display the clusters.

The results map is good, but you may want to use the cluster assignments of the cells for downstream analyses in different software or simply to produce a better map somewhere else. How do you export the results? Not from the display interface. Instead, go back to the outputs tab, click the relevant analysis name, and then click 'export' on the right.

You now have an interface where you can name your output file, navigate to the desired folder, and select the number of clusters to be recognised under the 'number of groups' parameter on the left.

The reward should be a csv file like the following, where 'ELEMENT' is the name of each cell and 'NAME' is the column indicating what cluster each cell belongs to.

Again, very powerful; one only has to keep in mind that your bioregions, for example, are variously called clusters, groups, and NAME depending on what part of the program you are dealing with.