- Introduction: Why grayscale images might affect anomaly detection.
- Anomaly detection, grayscale images: A quick recap of the two main topics discussed in this article.
- Experiment setup: What we compare and how.
- Performance results: How grayscale images affect model performance.
- Speed results: How grayscale images affect inference speed.
- Conclusion
1. Introduction
In this article, we'll explore how grayscale images affect the performance of anomaly detection models and examine how this choice influences inference speed.
In computer vision, it is well established that fine-tuning pre-trained classification models on grayscale images can lead to degraded performance. But what about anomaly detection models? These models don't require fine-tuning, but they use pre-trained classification models such as WideResNet or EfficientNet as feature extractors. This raises an important question: do these feature extractors produce less relevant features when applied to a grayscale image?

This question is not just academic, but one with real-world implications for anyone working on automating industrial visual inspection in manufacturing. For example, you might find yourself wondering whether a color camera is necessary or whether a cheaper grayscale one would be sufficient. Or you might have concerns about inference speed and want to use every opportunity to increase it.
2. Anomaly detection, grayscale images
If you are already familiar with both anomaly detection in computer vision and the basics of digital image representation, feel free to skip this section. Otherwise, it provides a brief overview and links for further exploration.
Anomaly detection
In computer vision, anomaly detection is a fast-evolving field within deep learning that focuses on identifying unusual patterns in images. Typically, these models are trained using only images without defects, allowing the model to learn what "normal" looks like. During inference, the model can flag images that deviate from this learned representation as abnormal. Such anomalies often correspond to various defects that may appear in a production environment but were not seen during training. For a more detailed introduction, see this link.
Grayscale images
For humans, color and grayscale images look quite similar (apart from the lack of color). But for computers, an image is an array of numbers, so it gets a little more complicated. A grayscale image is a two-dimensional array of numbers, typically ranging from 0 to 255, where each value represents the intensity of a pixel, with 0 being black and 255 being white.
In contrast, color images are typically composed of three such separate grayscale images (called channels) stacked together to form a three-dimensional array. Each channel (red, green, and blue) describes the intensity of the respective color, and their combination creates a color image. You can learn more about this here.
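In NumPy terms, the difference between the two representations is just one extra dimension (shapes here are purely illustrative):

```python
import numpy as np

# A grayscale image is a 2-D array of pixel intensities: 0 = black, 255 = white.
gray = np.zeros((224, 224), dtype=np.uint8)
gray[0, 0] = 255  # one white pixel in the corner

# A color image stacks three such channels (red, green, blue) into a 3-D array.
rgb = np.stack([gray, gray, gray], axis=-1)

print(gray.shape)  # (224, 224)
print(rgb.shape)   # (224, 224, 3)
```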
3. Experiment setup
Models
We will use four state-of-the-art anomaly detection models: PatchCore, Reverse Distillation, FastFlow, and GLASS. These models represent different types of anomaly detection algorithms and, at the same time, are widely used in practical applications thanks to their fast training and inference speeds. The first three models use the implementation from the Anomalib library; for GLASS, we use the official implementation.

Dataset
For our experiments, we use the VisA dataset with 12 categories of objects, which provides a variety of images and has no color-dependent defects.

Metrics
We will use image-level AUROC to see whether the whole image was classified correctly without needing to pick a particular threshold, and pixel-level AUPRO, which shows how well we localize defective regions in the image. Speed will be evaluated using the frames-per-second (FPS) metric. For all metrics, higher values correspond to better results.
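Image-level AUROC can be computed directly from per-image anomaly scores; here is a toy sketch with scikit-learn (the labels and scores are made-up numbers, not from our experiments):

```python
from sklearn.metrics import roc_auc_score

# Toy example: 0 = normal image, 1 = anomalous image,
# plus the model's per-image anomaly scores (made-up numbers).
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]

# AUROC is threshold-free: it measures how well the scores
# rank anomalous images above normal ones.
print(roc_auc_score(labels, scores))  # 0.75
```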
Grayscale conversion
To make an image grayscale, we'll use torchvision transforms.

For one channel, we also modify the feature extractors using the in_chans parameter in the timm library.

The code for adapting Anomalib to use one channel is available here.
4. Performance results
RGB
These are regular images with red, green, and blue channels.

Grayscale, three channels
Images were converted to grayscale using the torchvision transform Grayscale with three channels.

Grayscale, one channel
Images were converted to grayscale using the same torchvision transform Grayscale with one channel.

Comparison
We can see that PatchCore and Reverse Distillation produce close results across all three experiments for both image- and pixel-level metrics. FastFlow becomes somewhat worse, and GLASS becomes noticeably worse. Results are averaged across the 12 categories of objects in the VisA dataset.
What about results per category of objects? Maybe some of them perform worse and some better, causing the averaged results to appear the same? Here is the visualization of results for PatchCore across all three experiments, showing that results are quite stable within categories as well.

The same visualization for GLASS shows that some categories can be slightly better while others can be much worse. However, this is not necessarily caused by the grayscale transformation alone; some of it can be ordinary result fluctuation due to how the model is trained. The averaged results show a clear tendency: for this model, RGB images produce the best results, grayscale with three channels somewhat worse ones, and grayscale with one channel the worst.

Bonus
How do results change per category? It is possible that some categories are simply better suited to RGB or grayscale images, even when there are no color-dependent defects.
Here is the visualization of the difference between RGB and grayscale with one channel for all the models. We can see that only the pipe_fryum category becomes slightly (or strongly) worse for every model. The rest of the categories become worse or better depending on the model.

Additional bonus
If you are curious what this pipe_fryum looks like, here are a couple of examples with GLASS model predictions.

5. Speed results
The number of channels affects only the first layer of the model; the rest remains unchanged. The speed improvement appears to be negligible, highlighting that first-layer feature extraction is only a small part of the computation performed by these models. GLASS shows a somewhat noticeable improvement, but at the same time it shows the worst decline in metrics, so caution is required if you want to speed it up by switching to one channel.
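For reference, FPS can be estimated with a simple timing loop. This is a minimal CPU sketch (the stand-in model and input shape are illustrative, not one of the article's detectors; a real benchmark should use the device, batch size, and resolution you run in production):

```python
import time
import torch

def measure_fps(model: torch.nn.Module, input_shape, n_iters: int = 100) -> float:
    """Rough FPS estimate: average single-image forward-pass time."""
    x = torch.randn(1, *input_shape)
    model.eval()
    with torch.no_grad():
        for _ in range(10):  # warm-up iterations, excluded from timing
            model(x)
        start = time.perf_counter()
        for _ in range(n_iters):
            model(x)
        elapsed = time.perf_counter() - start
    return n_iters / elapsed

# Example with a tiny stand-in model and a single-channel input
fps = measure_fps(torch.nn.Conv2d(1, 8, kernel_size=3), (1, 64, 64))
print(f"{fps:.1f} FPS")
```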

6. Conclusion
So how does using grayscale images affect visual anomaly detection? It depends, but RGB seems to be the safer bet. The impact varies with the model and the data. PatchCore and Reverse Distillation generally handle grayscale inputs well, but you need to be more careful with FastFlow and especially GLASS, which shows some speed improvement but also the most significant drop in performance metrics. If you want to use grayscale input, be sure to test it and compare it with RGB on your specific data.
The Jupyter notebook with the Anomalib code: link.
Follow the author on LinkedIn for more on industrial visual anomaly detection.
References
1. C. Hughes, Transfer Learning on Greyscale Images: How to Fine-Tune Pretrained Models (2022), towardsdatascience.com
2. S. Wehkamp, A practical guide to image-based anomaly detection using Anomalib (2022), blog.ml6.eu
3. A. Baitieva, Y. Bouaouni, A. Briot, D. Ameln, S. Khalfaoui, and S. Akcay, Beyond Academic Benchmarks: Critical Analysis and Best Practices for Visual Industrial Anomaly Detection (2025), CVPR Workshop on Visual Anomaly and Novelty Detection (VAND)
4. Y. Zou, J. Jeong, L. Pemula, D. Zhang, and O. Dabeer, SPot-the-Difference Self-Supervised Pre-training for Anomaly Detection and Segmentation (2022), ECCV
5. S. Akcay, D. Ameln, A. Vaidya, B. Lakshmanan, N. Ahuja, and U. Genc, Anomalib (2022), ICIP