The task of an imaging sensor is to convert an optical signal into an electrical signal. The principle is based on the so-called photovoltaic effect, which describes how photons interact with a material to free electrons, resulting in the buildup of charge. In the majority of cameras silicon is the substrate used for this purpose. In all cases an electron is released from its bond by the absorption of a photon.
The natural properties of silicon make it ideal to use as the major component of the elementary unit of most imaging sensors: the pixel.
Independent of the type of sensor one can consider the pixel as the basic unit. The main element of a pixel in turn is the photo-sensitive photodiode with silicon coupled to an electron storage well (Fig. 1). Silicon is responsible for generating the electrons which then can be collected, moved, and finally converted into a digital signal. Additional components of a pixel include electrical control circuits and pigment layers to exclude unwanted or destructive wavelengths.
During the imaging process, photons hitting the photodiodes are converted into electrons. These electrons are stored in the electron storage wells for subsequent transfer – the readout – to the amplifier (Fig. 2). The amplifier reads the accumulated electrons and transforms them into a voltage, whereas the adjacent analog-digital (AD) converter does the digitization and produces equivalent digital signals.
The charge generated in a pixel is directly proportional to the number of photons striking it, which depends on the duration of light exposure (integration time), the detected wavelength and, most importantly, the light intensity. As a rule of thumb, the pixel size defines the number of electrons that can be collected without saturating a pixel. For microscopy imaging sensors the pixel edge length typically varies between 2 and 24 µm.
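The proportionality and the saturation limit described above can be sketched in a few lines. This is only an illustration: the function name, the photon flux values and the `conversion_fraction` parameter (the share of photons that actually yields an electron) are assumptions, not values from a particular sensor.

```python
def collected_electrons(photon_flux, exposure_s, conversion_fraction, full_well_e):
    """Electrons stored in one pixel after a single exposure.

    photon_flux         -- photons per second reaching the pixel (hypothetical input)
    exposure_s          -- integration time in seconds
    conversion_fraction -- share of photons converted into electrons (0..1, assumed)
    full_well_e         -- maximum number of electrons the pixel can store
    """
    generated = photon_flux * exposure_s * conversion_fraction
    # A pixel cannot hold more charge than its storage well allows.
    return min(generated, full_well_e)


# Doubling the exposure time doubles the charge until the pixel saturates.
short = collected_electrons(10_000, 0.5, 0.6, 18_000)
longer = collected_electrons(10_000, 1.0, 0.6, 18_000)
saturated = collected_electrons(10_000, 5.0, 0.6, 18_000)
print(short, longer, saturated)
```

The last call returns the storage limit itself, which is why longer exposures or brighter samples add no further information once a pixel is saturated.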
Due to typical pixel architectures, not the entire surface of a pixel is photo-sensitive. The fill factor of an image sensor describes the relation of a pixel’s light-sensitive area to its overall area. Microlenses can be added to a pixel to better focus the light onto the photosensitive regions improving the fill factor.
A complete digital imaging sensor consists of millions of pixels organized in a geometrical array. Very often the number of pixels is confused with “resolution”. It is worth noting that the resolution of the camera chip is defined not simply by the number of pixels but by their size. In general, smaller pixels give a higher resolution than large ones. In the end, the resolution of a microscopy system depends not only on the sensor array but on the complete optical system.
Noise and signal-to-noise ratio
Unfortunately, noise is a fundamental physical phenomenon that affects all signals. The impact and the type of dominant noise vary between sensor types. Generally, camera noise can be divided into three major classes according to its source:
Dark noise originates from the dark current and is a fundamental noise present in the sensor. It is caused by thermal energy in the silicon randomly generating electrons in the pixels. The dark current builds up in pixels with exposure time and is expressed in electrons per pixel per second (e-/px/s). It is less of a concern for fast applications with short exposure times. For long exposure times, e.g. one second or more for weak fluorescent signals, this noise type can become a major issue. Dark noise is reduced by cooling the sensor: the dark current is halved with every 8 degrees of cooling (Fig. 3).
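The halving rule can be written as a small calculation. The sketch below assumes a hypothetical reference dark current of 1 e-/px/s at 20 °C; only the 8-degree halving step comes from the text, and real sensors will deviate from this simple rule.

```python
def dark_current(reference_rate, reference_temp_c, sensor_temp_c, halving_step_c=8.0):
    """Dark current (e-/px/s) at sensor_temp_c.

    Assumes the dark current halves for every halving_step_c degrees of cooling
    below the temperature at which reference_rate was measured.
    """
    cooling = reference_temp_c - sensor_temp_c
    return reference_rate * 0.5 ** (cooling / halving_step_c)


def dark_electrons(rate, exposure_s):
    """Thermally generated electrons accumulated in one pixel during an exposure."""
    return rate * exposure_s


# Hypothetical sensor: 1 e-/px/s at 20 degrees C, cooled in steps of 8 degrees.
for temp_c in (20, 12, 4, -4):
    rate = dark_current(1.0, 20, temp_c)
    print(temp_c, rate, dark_electrons(rate, exposure_s=10))
```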
Read noise originates from the electrical readout circuitry of the sensor involved in quantifying the signal. As a rule of thumb, the read noise can be reduced by reducing the pixel readout rate. The pixel readout rate defines how fast charge can be read out from the sensor (unit: MHz). Because it also determines the frame rate of the camera, read noise has to be taken into account for fast experiments like high-speed time-lapse imaging of living cells. Some cameras allow the readout rate to be changed, so that the camera can be optimized either for a fast readout mode or for a slower, low-noise mode for low light applications. Read noise is expressed in e- and is independent of integration time. Read noise together with dark noise can be used to decide whether a particular camera is suitable for low light fluorescence applications.
Photon shot noise, another source of noise, is based on the uncertainty in counting the incoming photons. In other words, it arises from the stochastic nature of photon impacts on the sensor and is not introduced by the sensor itself. It is best explained by imagining that you are trying to catch rain drops in buckets. Even if every bucket is of identical size and shape, not every bucket will catch exactly the same number of drops. The detection of photons on the chip can therefore be described by a Poisson distribution.
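For a Poisson-distributed count the standard deviation equals the square root of the mean, so the shot noise can be estimated directly from the signal. The following minimal sketch uses illustrative signal levels only.

```python
import math


def shot_noise(signal_electrons):
    """Shot noise in electrons: square root of the mean Poisson-distributed count."""
    return math.sqrt(signal_electrons)


# The noise grows with the signal, but more slowly than the signal itself,
# so the ratio of signal to shot noise improves for brighter signals.
for signal in (100, 10_000):
    noise = shot_noise(signal)
    print(signal, noise, signal / noise)
```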
Under low light conditions such as fluorescence imaging when the signal intensity is low, the different noise sources can have a major impact on the quality of the image as they impact the signal-to-noise ratio. Using the right camera for the application is therefore essential for capturing good images.
The signal-to-noise ratio (SNR) is a measure of the overall quality of an image and is heavily influenced by the sensor type. Broadly speaking, it can be regarded as the sensitivity of the sensor. Although the details can be rather complex, the SNR expresses how well a signal of interest is distinguished from the background noise (Fig. 4). Several factors contribute: the signal depends on the number of photons arriving at the sensor, on the sensor's ability to convert those photons into a signal, and on how well the camera can suppress unwanted noise. That is why, for example, the fill factor and microlenses play an important role here, as well as the quantum efficiency of the sensor (see section “Quantum Efficiency”).
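A common way to combine the three noise classes is to add them in quadrature, assuming they are independent. The sketch below follows that textbook form; it is not taken from a specific camera datasheet, and the example numbers are hypothetical.

```python
import math


def snr(signal_e, dark_rate, exposure_s, read_noise_e):
    """Signal-to-noise ratio of one pixel, assuming independent noise sources.

    signal_e     -- photo-generated electrons collected during the exposure
    dark_rate    -- dark current in e-/px/s
    exposure_s   -- integration time in seconds
    read_noise_e -- read noise in electrons (independent of exposure time)
    """
    shot_variance = signal_e                # Poisson statistics of the signal
    dark_variance = dark_rate * exposure_s  # Poisson statistics of thermal electrons
    read_variance = read_noise_e ** 2
    return signal_e / math.sqrt(shot_variance + dark_variance + read_variance)


# Weak signal: read noise dominates. Strong signal: shot noise dominates.
print(snr(signal_e=50, dark_rate=0.1, exposure_s=1.0, read_noise_e=6))
print(snr(signal_e=5_000, dark_rate=0.1, exposure_s=1.0, read_noise_e=6))
```

With no dark current and no read noise the expression reduces to the square root of the signal, which is the shot-noise limit.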
Finally it is important to mention that optical noise from the sample, auto fluorescence or poor staining is often the dominant noise source in an image. The use of advanced sensors cannot help you to overcome a poorly prepared sample.
The full-well capacity is largely dependent on the physical size of the pixel. It refers to the charge storage capacity of a single pixel. This is the maximum number of electrons it can collect before saturation. Reaching the full-well capacity can be compared to a bucket filled with water (Fig. 5).
Larger pixels have a greater full-well capacity than small pixels (typically 18,000 e- for a 6.45 µm pixel vs. 300,000 e- for a 24 µm pixel). Spatial resolution is sacrificed for the larger full-well capacity, which in turn influences the dynamic range (see section “Dynamic Range”).
Electrons exceeding the full-well capacity cannot be quantified. In some instances the charge can leak into adjacent pixels causing an effect known as blooming (Fig. 6). Some sensors contain anti blooming electronics which attempt to bleed off the excess charge to suppress blooming artifacts.
Fig. 6: Blooming artefacts. In the left image the bucket’s volume is sufficient to hold all the incoming water drops. A corresponding microscopic image is shown right next to it. If the incoming water exceeds the bucket’s capacity, water will overflow and fill adjacent containers. The overflowing electrons can lead to blooming artefacts which can be seen on the microscopic image.
A characteristic directly connected to full-well capacity is the dynamic range. This describes the sensor’s ability to simultaneously record low and high intensity signals. In a practical sense this means that the weakest signal is not lost in the noise and the brightest signal does not saturate the sensor. In mathematical terms, the dynamic range is defined as the full-well capacity (FWC) divided by the camera noise.
It is often described in decibel units (dB):
Dynamic range (dB) = 20 × log10(FWC / camera noise)
The dynamic range improves if the full-well capacity is higher and the camera noise is lower. To a first approximation, the following parameters therefore affect the dynamic range:
- Pixel size (Full-well capacity)
- Temperature (Dark noise)
- Readout rate (Readout noise)
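The definition of the dynamic range can be checked with a short calculation. The sketch uses the full-well capacities quoted earlier in this article; the noise values are hypothetical, and the decibel form assumes the usual convention of 20 times the base-10 logarithm of the ratio.

```python
import math


def dynamic_range(full_well_e, noise_e):
    """Dynamic range as a plain ratio: full-well capacity divided by camera noise."""
    return full_well_e / noise_e


def dynamic_range_db(full_well_e, noise_e):
    """Dynamic range in decibels, assuming the 20 * log10(ratio) convention."""
    return 20 * math.log10(dynamic_range(full_well_e, noise_e))


# Full-well capacities from the text, combined with assumed noise values.
for full_well, noise in ((18_000, 6), (300_000, 15)):
    print(full_well, noise, dynamic_range(full_well, noise),
          dynamic_range_db(full_well, noise))
```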
For fluorescence applications a large dynamic range is a major benefit to document bright fluorescence signals against a dark background (Fig. 7), especially when quantifying signals.
The dynamic range is directly affected by the applied gain. The term “gain” here is used to express the amplification of a generated signal. If, for example, you double the gain of your sensor, you effectively halve the full-well capacity, which in turn decreases the dynamic range. Thus a trade-off between sensitivity and dynamic range is often required.
If the inherent dynamic range of the sensor is not sufficient for the application or the specimen, one may consider a “high-dynamic range” (HDR) acquisition. During this procedure a series of images is acquired with varying exposure intensities. The resulting image is then calculated by applying different algorithms (Fig. 8). The drawback of this approach is the longer time needed to acquire the images. It is therefore not preferable for fast-moving or light-sensitive samples.
Fig. 8: HDR acquisition. This specimen (Tilia spec.) has areas with a strong fluorescence signal (upper part) and a weak fluorescence signal (lower part). The camera’s dynamic range is not sufficient to record dark areas simultaneously with bright areas. Therefore exposure intensities can only be optimized to image either the strong (left) or the weak (middle) fluorescence signal. An HDR picture (right) consists of a series of images acquired with varying exposure intensities combined together to one image.
In an ideal world one would assume that 100 photons are capable of generating 100 electrons. In reality, when interacting with a sensor, photons may be absorbed, reflected or even pass straight through. The ability of a sensor to absorb and convert light of a certain wavelength into electrons is known as its quantum efficiency (QE).
The quantum efficiency of a sensor is affected by a number of factors including:
- Fill factor
- Addition / performance of microlenses
- Anti-reflective coating
- Sensor format (back- or front-illuminated)
The quantum efficiency is always a function of the wavelength of the incoming light. Silicon detectors most commonly used in scientific imaging are able to detect wavelengths just beyond the range of visible light (~400 to 1000 nm). By looking at a QE curve you can see how efficient a particular sensor is at converting a particular wavelength into signal (Fig. 9).
The majority of camera sensors are front-illuminated: incident light enters from the front of the pixel and has to pass semi-opaque layers containing the pixel’s circuitry before hitting the photo-sensitive silicon (Fig. 1). These layers cause some light loss, so front-illuminated sensors typically have maximum QEs of around 50-60%. As the electronics on the surface of the sensor are only able to generate a localized electrical field, they are not able to manipulate the charge that forms deeper in the silicon wafer (Fig. 10).
In the case of a back-illuminated sensor, light directly hits the photo-sensitive silicon from the “back” without having to pass through the pixel circuitry, offering maximum QEs approaching 95%. To manufacture back-illuminated sensors, also known as back-thinned sensors, the additional silicon is ground away in an expensive process to create an extremely thin silicon layer in which all of the charge can be manipulated by the pixels’ electronics.
The bit depth can be related to, but should not be confused with, dynamic range, and refers to how the analog signal is digitized – or chopped up – into gray scale values or gray levels. The dynamic range of a digital camera sensor depends on its FWC and noise. Bit depth depends on the AD converter’s ability to transform the number of generated electrons into gray scale values. The more gray scales it can output, the more details can be reproduced (Fig. 11).
Some cameras offer more gray scale values than the maximum number of electrons that can be generated by photons (e.g. 16 bit digitization chops the signal into ~65K gray scale units). In extreme circumstances the sensor may saturate below 1,000 photons/pixel, yet the image still shows 65,000 gray scale values. Moreover, computer screens are typically only able to display 8-bit data. That is why a camera signal with more than 8 bit has to be scaled down to be displayed. Users can influence this process with the help of the look-up table (LUT). Adjusting it can often reveal hidden detail in an image.
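The relation between bit depth, gray levels and display scaling can be illustrated as follows. This is a sketch under simple assumptions: the LUT is modelled as a linear mapping between a user-chosen black point and white point, and the function names are hypothetical rather than taken from any camera software.

```python
def gray_levels(bits):
    """Number of gray levels an AD converter with the given bit depth can output."""
    return 2 ** bits


def electrons_per_gray_level(full_well_e, bits):
    """Electrons represented by one gray level if the full well spans the whole range."""
    return full_well_e / gray_levels(bits)


def linear_lut_to_8bit(value, black, white):
    """Map a raw camera value to 0..255 with a linear LUT, clipping outside the range."""
    if white <= black:
        raise ValueError("white point must be larger than black point")
    scaled = (value - black) / (white - black)
    scaled = min(max(scaled, 0.0), 1.0)
    return round(scaled * 255)


print(gray_levels(12), gray_levels(16))
# A full well of 18,000 e- digitized with 16 bit yields less than one electron per level.
print(electrons_per_gray_level(18_000, 16))
# Narrowing the LUT to the range 0..1000 spreads dim raw values over the full display range.
print(linear_lut_to_8bit(800, 0, 65_535), linear_lut_to_8bit(800, 0, 1_000))
```

Narrowing the displayed range changes only the on-screen representation; the stored raw values remain untouched.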
Fig. 11: Dynamic Range vs. Bit Depth. The dynamic range of a sensor refers to its ability to simultaneously record low and high intensity signals. This goes back to its pixels’ FWC and its noise properties. A high FWC is good for the detection of high-intensity signals where many photons strike the pixel. On the other side a low noise is good for the detection of low intensity signals. Whereas the dynamic range mainly refers to the characteristics of pixels, bit depth is a property of the AD converter. The greater the bit depth, the better the full dynamic range of an image can be resolved. With a 2 bit AD converter a digital imaging sensor can output 4 grey levels, with a 4 bit AD converter 16 etc.
Imaging speed and binning
The imaging speed of a digital camera is expressed as the frame rate in frames per second (fps). This is the number of images (frames) the camera can acquire in one second. Many factors can affect the maximum achievable frame rate of a camera. At a given exposure time the following parameters need to be considered:
- Pixel count
- Pixel readout rate
- Computer interface (USB 2.0 / USB 3.0 / Camera Link etc.)
The easiest way to increase frame rates is to reduce the number of pixels being read out by switching to a smaller region of interest (ROI). As the frame rate increases, the number of photons hitting the sensor per frame decreases, so depending on the sample type there comes a point at which additional sensitivity is required. One trick that can be employed to boost speed and reduce noise is “on-chip binning”.
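The effect of the pixel count and the readout rate on speed can be estimated with a rough model. The sketch below assumes a single readout node, exposure followed by readout without overlap, and no interface or overhead limits; real cameras will differ, so the result is only an upper-bound estimate.

```python
def max_frame_rate(width_px, height_px, readout_rate_hz, exposure_s=0.0):
    """Rough upper bound for the frame rate in frames per second.

    Assumes every pixel is read out one after the other at readout_rate_hz
    and that exposure and readout do not overlap (simplifying assumptions).
    """
    readout_time_s = width_px * height_px / readout_rate_hz
    return 1.0 / (readout_time_s + exposure_s)


# Hypothetical 1000 x 1000 pixel sensor read out at 20 MHz.
full_frame = max_frame_rate(1000, 1000, 20e6)
roi = max_frame_rate(500, 500, 20e6)
print(full_frame, roi)  # the ROI with a quarter of the pixels reads out four times faster
```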
During binning instead of reading out the data from every pixel separately, the data from several adjacent pixels is combined on chip in the serial register and read out as a “Super-Pixel”. In this manner the data from 2x2, 3x3 or 4x4 and more pixels can be combined (Fig. 12).
Binning improves the signal-to-noise ratio at the expense of resolution. Assume each pixel contains 100 electrons and the read noise is 10 electrons. Read out one by one, the signal-to-noise ratio is 10:1. If binned 2x2, the signal being read out is now 400 electrons while the read noise is still 10 electrons, so the signal-to-noise ratio increases dramatically to 40:1. As the readout electronics have to process fewer data points (4 times fewer in the case of 2x2 binning), the frame rate can also increase. The major drawback of binning is the loss of resolution, as the effective pixel area increases by the square of the bin value (Fig. 13).
Working with binning is standard for fast fluorescence imaging, e.g. fast time-lapse. The aim is to reduce the noise, the data size and the exposure time. The latter is especially worth mentioning since it reduces bleaching and light-induced damage of living cells.
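The arithmetic of this example can be reproduced with a small sketch. It sums neighbouring pixels in software purely for illustration; on a real sensor the charge is combined on chip before readout, which is why the read noise is applied only once per super-pixel. As in the example, only read noise is considered.

```python
def bin_pixels(image, factor):
    """Sum factor x factor blocks of a 2D list of electron counts into super-pixels."""
    rows, cols = len(image), len(image[0])
    if rows % factor or cols % factor:
        raise ValueError("image dimensions must be multiples of the bin factor")
    return [
        [
            sum(image[r + dr][c + dc] for dr in range(factor) for dc in range(factor))
            for c in range(0, cols, factor)
        ]
        for r in range(0, rows, factor)
    ]


def read_noise_limited_snr(signal_e, read_noise_e):
    """Signal-to-noise ratio when read noise is the only noise source considered."""
    return signal_e / read_noise_e


image = [[100] * 4 for _ in range(4)]   # 4 x 4 pixels with 100 electrons each
binned = bin_pixels(image, 2)           # 2 x 2 super-pixels

print(read_noise_limited_snr(image[0][0], 10))   # unbinned pixel
print(read_noise_limited_snr(binned[0][0], 10))  # 2x2 super-pixel
```

The binned image has half as many pixels along each axis, which is the loss of resolution mentioned in the text.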
For brightfield applications like documentation of stained pathology tissues, binning is often applied to the live image allowing a smooth on-screen image whilst the microscope stage is moved.
Types of sensors
Most of the features and parameters described above are generic for all types of imaging sensors in microscopy. However, as a result of historical developments and technical improvements, the microscopist can choose between different types of sensors and cameras. They differ in their principal architecture (e.g. CCD vs. CMOS), their ability to enhance signals (e.g. EMCCDs vs. CCDs) and the quality of their images (e.g. CMOS vs. sCMOS).
CCD sensors - Charge-Coupled Device: Cameras based on this sensor type are the workhorses for brightfield and fluorescence imaging. Characteristically, the charge generated in the pixels is moved from one pixel to the next across the surface into the serial register (Fig. 14). From the serial register charges are passed one by one to the readout electronics, where the signal is converted into a voltage, amplified, quantified, and digitized. All the data within a CCD sensor is therefore usually read out through a single output node.
EMCCD sensors - Electron Multiplying CCD: EMCCD sensors are basically CCD sensors with the addition of an EM gain register between the sensor and the readout electronics. The EM gain register amplifies the signal before it encounters the readout electronics. In addition, EMCCD cameras employ back-thinned sensor technologies, typically offering peak QE >90%. These types of cameras are used for extreme low light applications and can be single photon sensitive. The price of these cameras is typically significantly higher than that of regular CCD-based cameras.
CMOS - Complementary Metal Oxide Semiconductor: Originally used in cell phones and low-end cameras, CMOS technology has improved significantly in recent years and has become an important imaging device for standard brightfield applications in microscopy. The major difference compared to CCDs is the intra-pixel electronics and the time-saving sensor readout principle with thousands of readout nodes, compared to the single readout node used in traditional CCD sensors.
sCMOS - scientific CMOS: Introduced a couple of years ago, this type of sensor overcomes common drawbacks of CMOS sensors such as a high noise level. It is used for high-end fluorescence imaging, which benefits from the fast frame rates, high dynamic range and low noise.
Modern optical microscopy is unthinkable without digital camera technology. Most microscope users either want to watch their specimen live on a monitor or want to save and process their discoveries on a computer. Moreover, some microscopic techniques such as localization microscopy would not even have been possible without the rise of digital camera sensors. The reader of this article should have learned how digital microscopic images are produced. This in turn will help in using a digital camera properly and in interpreting the generated data correctly.