Time for a spring clean? Episode 1: Taxa and Groups
Spring is in the air here in the UK and I for one have forgotten my new year’s resolutions. Blow away the cobwebs and embrace the new season feeling virtuous by spending a little time spring-cleaning your StrataBugs database. All databases need ongoing tidying up, organising and pruning. It need not be a huge job – a little dusting here and there to keep things in shape will go a long way if performed frequently. Perhaps get into the habit of performing these tasks at the end of each project.
There is another installment of this list to come!
1. Run the updater
If it’s been a while since you ran the updater, now is a good time. We have added lots of tweaks and fixes over the past few months.
The updater has ‘production’ and ‘test’ areas. We recommend most users download from ‘test’ – this is where we put everything new including bug fixes. We aim to release a ‘production’ version roughly quarterly and this should be used in environments where it is difficult to update more regularly.
If your update needs to be run by a system administrator (this may be the case if your StrataBugs install is in a shared location), then it might be worth scheduling it every few months, for example with a recurring calendar reminder.
2. Tackle taxa
Tidying your entire taxon dictionary is probably going to be more involved than a quick spring spruce-up. However, you should be aware of the tools because a little trim here and there, when executed frequently, will keep your database in good shape. A tidy database (garden vs. jungle) is important for (a) being able to dig deep (what are the occurrence patterns of this species in all the wells in my database?); (b) general consistency – after all, this is the whole point of using a taxon dictionary rather than a spreadsheet; and (c) sanity – both for you and whoever receives your data.
- Familiarise yourself with naming conventions and suggested use of qualifiers. The StrataBugs data model is undoubtedly a compromise, designed to capture as neatly as possible the taxonomic concepts of all disciplines. There will always be areas which work better than others, and room for disagreements. We have documented our conventions here – for example the expected use of “sp” vs. “spp”. You should of course agree conventions amongst the colleagues with whom you will be sharing your dictionary; you don’t want to end up with duplicates just because you have slightly different preferences.
- The new Genus window is a good place to start if you want to tackle the dictionary one genus at a time. In Taxon database go to Taxa > Genera. Here you can open a ‘search’ similar to the taxon search.
You can see at a glance how many species belong to each genus, and you may be able to identify duplicates here. Drag and drop a genus onto the Taxa window to browse species.
- Use the Find similar taxa feature in Taxon Database. This searches your entire dictionary for taxa whose ‘name’ comes out the same, but which are not duplicates because they use slightly different fields.
Here are some of the results on the demo database. Lots of these I wouldn’t want to merge – they are in different categories – but some are just mistakes. Take ‘AL Algal cysts (reticulate)’ – these two simply have a slightly different configuration of species names and are clearly candidates for a merge.
- Regularly review new entries. You can search the taxon dictionary by created/modified date (Taxa > Advanced search in dictionary). Perhaps nominate somebody to do this every week, or have a ‘quality control’ rota, to review these additions and delete or merge where necessary. We’ve even toyed with the idea of having a scheduled task to email out regular reports of new species added. Please get in touch or leave a comment below if this is a feature you would like to see developed.
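To make the idea behind Find similar taxa concrete, here is a minimal Python sketch of that kind of check: build a display name from each record's fields, normalise it, and report names that more than one record shares. The record layout and the `genus` and `species` field names are hypothetical and chosen only for illustration; they are not the StrataBugs schema, and the real feature may well work differently.

```python
from collections import defaultdict

# Hypothetical, simplified taxon records. The field names are illustrative
# assumptions, not the real StrataBugs data model.
taxa = [
    {"id": 1, "genus": "Algal cysts", "species": "(reticulate)"},
    {"id": 2, "genus": "Algal", "species": "cysts (reticulate)"},
    {"id": 3, "genus": "Algal cysts", "species": "(smooth)"},
]


def display_name(taxon):
    """Join the name fields, collapse whitespace and lower-case the result."""
    joined = " ".join([taxon["genus"], taxon["species"]])
    return " ".join(joined.split()).lower()


def find_similar(records):
    """Map each shared display name to the ids of the records that produce it."""
    by_name = defaultdict(list)
    for rec in records:
        by_name[display_name(rec)].append(rec["id"])
    # Keep only names produced by more than one record.
    return {name: ids for name, ids in by_name.items() if len(ids) > 1}


similar = find_similar(taxa)
print(similar)
```

Records 1 and 2 split the same words differently between the two fields, so they come out with the same name and are flagged together; record 3 is left alone. As with the real tool, a match is only a candidate: whether to merge is still a human decision.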
3. Organise groups
There are a few things you can do (since v2.1) to better organise your groups and sets. Make your groups sidebar manageable again by trying these steps (in order).
- Delete groups you no longer need. You can check whether a group is referenced by charts by selecting it and choosing Group > Show Chart Panels or Show Legacy charts. There are probably a few ‘picklists’ from old projects which are now redundant (don’t forget that if you needed to you could easily grab a list of all the taxa used in a well or project using Taxa > Select taxa from wells/outcrops). If you don’t remember what the group was for, you probably don’t need it anymore*!
I’m going to cull ‘Aggluts.’, ‘Aggluts2’ and ‘Important’ from this list as they are clearly junk. Right-click > Delete.
*Clearly it’s a good idea to check with your colleagues if you are unsure! I do not accept responsibility for arguments…
- Move groups into projects. The icon next to the group name shows whether it is global or belongs to a project. You can change the project by right-clicking > Edit details.
In the top right-hand corner of the Taxon Database window, there is a project selector. This will automatically change when you move a group into another project. Your groups list will show you ‘global’ groups and groups from the selected project, with the project groups sorted to the bottom of the list.
My groups list is looking more manageable now.
- It’s a nice idea to add descriptions to groups, while you’re there. Imagine how easy this job would be if all your groups were well-documented…
- Move all remaining global groups to an Archive project. If you’ve had your StrataBugs database for a while then I suspect you will have many, many groups, only a tiny number of which are currently in use. When your database was converted to v2.1, all groups were added to the ‘global’ project. It might be easier to bulk-move all of these groups into an ‘archive project’, which will be hidden most of the time. The following SQL will move all global groups into the project with the name ‘Archive’ (you must create this first):
UPDATE TXGROUP SET PROJ_ID=(SELECT ID FROM SBWLLST WHERE NAME='Archive') WHERE PROJ_ID IS NULL;
There should now be nothing in your groups list. You can select the ‘Archive’ project and edit any groups which are truly applicable across projects to ‘global’.
- Use ordered group sets. In the past it was necessary to prefix group names with numbers so that they appeared in the correct order on charts. You don’t need to do this anymore thanks to ‘ordered’ group sets which I mentioned in the Genera in Groups post.
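If you would like to rehearse the bulk move to ‘Archive’ before touching a real database, the sketch below runs the same update against a throwaway in-memory SQLite database. It assumes heavily cut-down versions of the TXGROUP and SBWLLST tables: only ID, NAME and PROJ_ID in SBWLLST and TXGROUP are taken from the statement above, while the NAME column on TXGROUP and all of the sample rows are made up. The helper `archive_global_groups` is hypothetical too. It adds one safeguard: it refuses to run unless exactly one project with the given name exists.

```python
import sqlite3


def archive_global_groups(conn, project_name="Archive"):
    """Move groups with no project into the named project; return rows moved."""
    rows = conn.execute(
        "SELECT ID FROM SBWLLST WHERE NAME = ?", (project_name,)
    ).fetchall()
    # Guard: a missing or duplicated project name should stop the update.
    if len(rows) != 1:
        raise ValueError(
            f"expected exactly one project named {project_name!r}, found {len(rows)}"
        )
    cur = conn.execute(
        "UPDATE TXGROUP SET PROJ_ID = ? WHERE PROJ_ID IS NULL", (rows[0][0],)
    )
    return cur.rowcount


# Throwaway in-memory database with simplified, assumed table layouts.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE SBWLLST (ID INTEGER PRIMARY KEY, NAME TEXT);
CREATE TABLE TXGROUP (ID INTEGER PRIMARY KEY, NAME TEXT, PROJ_ID INTEGER);
INSERT INTO SBWLLST VALUES (1, 'North Sea'), (2, 'Archive');
INSERT INTO TXGROUP VALUES
    (10, 'Old picklist', NULL),
    (11, 'Project picklist', 1),
    (12, 'Key taxa', NULL);
""")

moved = archive_global_groups(conn)
remaining = conn.execute(
    "SELECT COUNT(*) FROM TXGROUP WHERE PROJ_ID IS NULL"
).fetchone()[0]
print(moved, remaining)
```

The `?` placeholders are SQLite's parameter style; your own database engine and client may use a different one. Whatever tool you use, counting the global groups before and after the update is a cheap way to confirm that it did what you expected.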
More jobs to follow next week!