Overview of Scampi Process
Workflow
- Slides are prepared by a laboratory for palynological analysis in the usual way.
- Prepared slides are scanned using a high-resolution slide scanner. This might be done at the laboratory where the slides were prepared, or by a separate scanning contractor (see below for a discussion of scanner compatibility). The digital slides are then uploaded to Scampi.
- The slide images are processed. Scampi distills the slide images into rectangular "crops" (usually 10-25,000 per slide). Each crop is then analysed using Scampi’s embedding models. Processing can be initiated by an admin user in the Scampi Administration Portal.
- When processing is finished, the slides can be viewed in the Scampi application. The user can search through the entire set of crop images using one or more images as a query. Search results and labels can be saved as distribution charts.
- Distribution charts and selected images can be exported to CSV or PDF, or imported directly into StrataBugs for geological interpretation.
- At the end of the project, an admin user triggers the deletion of data from Scampi.
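As a rough illustration of the distribution chart step in the workflow above, the sketch below counts labelled crops per sample depth and writes the counts as CSV. It is not Scampi's export format: the column names, the taxon labels, the depths and the helper functions `distribution` and `to_csv` are all hypothetical.

```python
import csv
import io
from collections import Counter

# Hypothetical labelled crops: (sample depth in metres, label given by the palynologist).
labelled_crops = [
    (1200.0, "Taxon A"),
    (1200.0, "Taxon A"),
    (1200.0, "Taxon B"),
    (1250.0, "Taxon B"),
]

def distribution(crops):
    """Count labelled crops for each (depth, label) pair."""
    return Counter(crops)

def to_csv(counts):
    """Write the counts as CSV text, one row per (depth, label) pair."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["depth_m", "label", "count"])  # assumed column names
    for (depth, label), n in sorted(counts.items()):
        writer.writerow([depth, label, n])
    return buffer.getvalue()

chart_csv = to_csv(distribution(labelled_crops))
print(chart_csv)
```

Each row gives the number of crops with a given label at a given depth, which is the kind of table a distribution chart is drawn from.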
How does it work?
Scampi was originally developed over several years by a collaboration of data scientists, software engineers and biostratigraphers from Equinor and the University of Tromsø, Norway. The research basis is detailed in their paper, "The Fossil Frontier: An answer to the 3-billion fossil question", and demonstrated in practice in the EAGE extended abstract "Scampi - Exploring The Fossil Frontier". StrataData are delighted to have joined this collaboration because we support Scampi’s design philosophy.
The Scampi embedding models produce a numerical fingerprint of every image’s visual features. From these fingerprints, we can compute a 'distance' between any two images. A smaller distance implies that the images are more 'similar'. We calculate the distance between each image and every other image just once, during the processing step, and store the results in a searchable data structure. This makes searching fast, because the expensive comparisons are not repeated at query time.
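The idea of comparing every pair once and then searching the stored results can be sketched in a few lines. This is a minimal illustration, not Scampi's implementation: the three-number fingerprints, the crop names, the choice of Euclidean distance and the dictionary used as the "searchable data structure" are all assumptions made for the example.

```python
import math

# Hypothetical fingerprints; real embeddings would have many more dimensions.
fingerprints = {
    "crop_a": [0.9, 0.1, 0.0],
    "crop_b": [0.8, 0.2, 0.1],
    "crop_c": [0.0, 0.9, 0.4],
    "crop_d": [0.1, 0.8, 0.5],
}

def distance(u, v):
    """Euclidean distance between two fingerprints (an assumed metric)."""
    return math.dist(u, v)

def build_index(fps):
    """Compare every crop with every other crop once and store, for each crop,
    the names of the other crops ordered from nearest to farthest."""
    index = {}
    for name, vec in fps.items():
        others = [
            (distance(vec, other_vec), other_name)
            for other_name, other_vec in fps.items()
            if other_name != name
        ]
        index[name] = [other_name for _, other_name in sorted(others)]
    return index

def search(index, query, k=2):
    """Return the k crops most similar to the query crop by simple lookup."""
    return index[query][:k]

index = build_index(fingerprints)   # done once, during "processing"
print(search(index, "crop_a", k=1))  # query time: no distances are recomputed
```

In this toy data, `crop_a` and `crop_b` have nearly identical fingerprints, so each is the other's nearest neighbour. Whether two such crops show the same species is still a judgment for the palynologist.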
In simple terms, we could assume that where we find 'similar' images, they are examples of the same species. In practice, it takes knowledge and experience to make this decision. Even if we had enough data to train models to automatically classify all palynomorphs down to the species level, and we could account for variations in preservation, differences in preparation and the specifics of every scanner – the concept of the species itself is often disputed!
By leaving identification to the experts, Scampi allows for continuing developments in the field of palynology. We give our existing experts a tool with which to make new discoveries. By making palynology more efficient, and therefore more cost-effective, we help the discipline to stay relevant. We also provide a valuable tool with which to train the next generation of palynologists.
FAQ
Do I need a StrataBugs license to use Scampi?
No. Scampi is an entirely standalone product. However, if you do use StrataBugs, you will easily be able to pull in your Scampi results, along with some of the crop images.
Does StrataData provide preparation and scanning?
No. Slides must be prepared by a laboratory and scanned using an appropriate scanner.
Your scanning contractor will be able to upload their scans directly to Scampi.
Does it matter which scanner we use?
There are now many microscopy scanners on the market, which are commonly used in the pathology industry. They vary enormously in quality and price! Unfortunately, there is no standard format for their images, so support for each scanner’s format must be implemented in Scampi.
In addition to this, the (complex) scanning parameters must be finely tuned to create the best scan images for Scampi to process. Arriving at the best configuration for each scanner requires experimentation and careful analysis.
If you have a scanner and you are interested in providing scanning services for Scampi, please get in touch with us.
How long does the processing take?
Once the scans are uploaded, the processed slides should be available for viewing within a few hours, depending on the number of slides and the existing workload.
Will StrataData store my data in the long term?
Scampi is not designed to be a long-term data store. Monthly charges will apply for every slide in the system. Once the project is complete, all image data, including the original scans, will be deleted. If you need to retain the scans in the long term, you will need to make your own arrangements for this.
Will my input be used to train the system?
No. Scampi’s embedding model is already trained. The processing step uses the Scampi model to perform analysis on the slide data. The results of the processing are used to provide the search results. The crop images and any labels you add to them remain your private data, and can be extracted from Scampi to use as you wish.
Can we try it out?
Yes. A demonstration dataset is available; please contact us to request access.
Our per-slide pricing model means you can try it out on your own data without committing to a subscription license.
Where are the data held?
The Scampi instance is physically located in the EU.
Does Scampi have applications beyond industrial biostratigraphy?
Palynology itself has many applications, and Scampi could be applied to any of them. For example, the "axis" of depth could just as easily be time.