Multiplicative Scatter Correction (MSC) Preprocessing: Principle, Formulas, and Python Implementation

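Multiplicative Scatter Correction (MSC) is a preprocessing technique for spectral data. It is mainly used to remove the baseline shifts and amplitude variations that light scattering and differences in particle size introduce into spectra.

The main goal of MSC is to correct each measured spectrum toward a reference spectrum, thereby reducing this irrelevant variation. The reference spectrum is usually the mean of all measured spectra, since the mean spectrum is a good representative of the data set as a whole.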

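The MSC procedure consists of three steps:

① Compute the mean of all n measured spectra and use it as the reference spectrum:

X_{ref}=\frac{1}{n}\sum_{i=1}^{n}X_{i}

② Regress each measured spectrum on the reference spectrum:

For each measured spectrum X_{i}, fit a linear model by ordinary least squares, where the slope a_{i} captures the multiplicative scatter effect and the intercept b_{i} captures the additive baseline offset:

X_{i}=a_{i}\cdot X_{ref}+b_{i}

③ Compute the corrected spectrum:

Use the fitted a_{i} and b_{i} to correct the original spectrum:

X_{i}^{MSC}=\frac{X_{i}-b_{i}}{a_{i}}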
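
The three steps above can be sketched compactly with NumPy. This is a minimal illustration rather than a definitive implementation: the function name `msc_polyfit` and the toy matrix `X_demo` are made up for this example, and `np.polyfit` with degree 1 stands in for the least-squares fit of step ②.

```python
import numpy as np

def msc_polyfit(X):
    """MSC sketch: degree-1 least-squares fit of each spectrum on the mean spectrum."""
    X = np.asarray(X, dtype=float)
    x_ref = X.mean(axis=0)                          # step 1: reference spectrum
    X_msc = np.empty_like(X)
    for i, x_i in enumerate(X):
        a_i, b_i = np.polyfit(x_ref, x_i, deg=1)    # step 2: x_i ~ a_i * x_ref + b_i
        X_msc[i] = (x_i - b_i) / a_i                # step 3: remove offset, then gain
    return X_msc

# Toy data (hypothetical values): 3 spectra (rows) at 5 wavelength points (columns)
X_demo = np.array([
    [1.0, 2.0, 4.0, 3.0, 1.5],
    [1.3, 2.5, 4.9, 3.7, 1.9],
    [0.8, 1.5, 2.9, 2.2, 1.15],
])
X_demo_msc = msc_polyfit(X_demo)
print(np.round(X_demo_msc, 3))
```

After correction, regressing any corrected spectrum on the reference again yields a slope of 1 and an intercept of 0, which is a convenient sanity check.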

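Note: a_{i} and b_{i} are computed as follows, where p is the number of wavelength points and k indexes them:

a. Compute the means: the mean \overline{X_{i}} of each measured spectrum X_{i} and the mean \overline{X_{ref}} of the reference spectrum, both taken over the wavelength points.

b. Compute the covariance between the measured spectrum and the reference spectrum:

Cov(X_{i},X_{ref})=\frac{1}{p-1}\sum_{k=1}^{p}(X_{i}[k]-\overline{X_{i}})(X_{ref}[k]-\overline{X_{ref}})

c. Compute the variance of the reference spectrum:

Var(X_{ref})=\frac{1}{p-1}\sum_{k=1}^{p}(X_{ref}[k]-\overline{X_{ref}})^{2}

d. Compute the regression coefficients a_{i} and b_{i}:

a_{i}=\frac{Cov(X_{i},X_{ref})}{Var(X_{ref})}     b_{i}=\overline{X_{i}}-a_{i}\cdot \overline{X_{ref}}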
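
As an illustration of steps a to d, the closed-form coefficients can be computed directly from the sample covariance and variance. The sketch below is hedged: the helper name `regression_coefficients` and the input values are hypothetical, chosen so that the measured spectrum is an exact linear distortion of the reference with a known gain and offset.

```python
import numpy as np

def regression_coefficients(x_i, x_ref):
    """Slope and intercept of x_i regressed on x_ref via covariance / variance."""
    x_i = np.asarray(x_i, dtype=float)
    x_ref = np.asarray(x_ref, dtype=float)
    # ddof=1 gives the 1/(number of points - 1) normalisation used in the formulas
    cov = np.cov(x_i, x_ref, ddof=1)[0, 1]
    var = np.var(x_ref, ddof=1)
    a = cov / var
    b = x_i.mean() - a * x_ref.mean()
    return a, b

# Hypothetical reference spectrum and a distorted copy (gain 1.2, offset 0.3)
x_ref = np.array([0.5, 1.0, 2.0, 1.5, 0.7])
x_i = 1.2 * x_ref + 0.3

a, b = regression_coefficients(x_i, x_ref)
print(f"a = {a:.4f}, b = {b:.4f}")
```

Because the distortion here is exactly linear, the recovered slope and intercept match the gain and offset used to build `x_i`, and they agree with what `np.polyfit(x_ref, x_i, 1)` returns.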

Worked example

Suppose a reference spectrum X_{ref} and a measured spectrum X_{i} take the following values at five wavelength points:

X_{ref}=[2.0,2.1,2.2,2.3,2.4]    X_{i}=[1.8,1.9,2.0,2.1,2.2]

First compute the means:

\overline{X_{ref}}=\frac{2.0+2.1+2.2+2.3+2.4}{5}=2.2   \overline{X_{i}}=\frac{1.8+1.9+2.0+2.1+2.2}{5}=2.0

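Compute the covariance and the variance:

Cov(X_{i},X_{ref})=\frac{1}{4}[(1.8-2.0)(2.0-2.2)+(1.9-2.0)(2.1-2.2)+(2.0-2.0)(2.2-2.2)+(2.1-2.0)(2.3-2.2)+(2.2-2.0)(2.4-2.2)]=\frac{0.1}{4}=0.025

Var(X_{ref})=\frac{1}{4}[(2.0-2.2)^2+(2.1-2.2)^2+(2.2-2.2)^2+(2.3-2.2)^2+(2.4-2.2)^2]=\frac{0.1}{4}=0.025

Compute the regression coefficients:

a_{i}=0.025\div0.025=1.0   b_{i}=2.0-1.0\cdot 2.2=-0.2

Finally, compute the corrected spectrum: X_{i}^{MSC}=\frac{X_{i}-(-0.2)}{1.0}=[2.0,2.1,2.2,2.3,2.4]

The corrected spectrum coincides with X_{ref}, as expected, because X_{i} differs from X_{ref} only by a constant offset of 0.2.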
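
The arithmetic of this example is easy to check numerically. The short sketch below recomputes the covariance, variance, and regression coefficients from the two vectors X_{ref} and X_{i} given above; the variable names are chosen for this illustration only.

```python
import numpy as np

x_ref = np.array([2.0, 2.1, 2.2, 2.3, 2.4])
x_i = np.array([1.8, 1.9, 2.0, 2.1, 2.2])

# Sample covariance and variance with the 1/(number of points - 1) factor
cov = np.sum((x_i - x_i.mean()) * (x_ref - x_ref.mean())) / (x_ref.size - 1)
var = np.sum((x_ref - x_ref.mean()) ** 2) / (x_ref.size - 1)

# Regression coefficients and corrected spectrum
a = cov / var
b = x_i.mean() - a * x_ref.mean()
x_msc = (x_i - b) / a

print(f"cov={cov:.3f}, var={var:.3f}, a={a:.3f}, b={b:.3f}")
print(np.round(x_msc, 3))
```

For these inputs the covariance and variance are both 0.025, so a = 1 and b = -0.2, and the corrected spectrum equals the reference spectrum up to floating-point rounding.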

Python implementation (two ways: using scikit-learn's LinearRegression, and a manual implementation):

import numpy as np
from sklearn.linear_model import LinearRegression

def msc(X):
    """
    Perform Multiplicative Scatter Correction (MSC) on the given spectral data.
    
    Parameters:
    X (ndarray): The input spectral data matrix, where rows are samples and columns are wavelengths.
    
    Returns:
    ndarray: The MSC corrected spectral data.
    """
    # Work in floating point so an integer input is not truncated on assignment
    X = np.asarray(X, dtype=float)
    
    # Calculate the mean spectrum (reference)
    X_ref = np.mean(X, axis=0)
    
    # LinearRegression expects a 2-D feature matrix of shape (n_wavelengths, 1)
    X_ref_reshaped = X_ref.reshape(-1, 1)
    
    # Initialize the corrected spectra matrix
    X_msc = np.zeros_like(X)
    
    # Perform MSC on each spectrum
    for i in range(X.shape[0]):
        # Fit X[i] = a * X_ref + b by ordinary least squares
        model = LinearRegression().fit(X_ref_reshaped, X[i])
        a = model.coef_[0]
        b = model.intercept_
        
        # Apply correction
        X_msc[i] = (X[i] - b) / a
    
    return X_msc

def msc_manual(X):
    """
    Perform Multiplicative Scatter Correction (MSC) on the given spectral data manually.
    
    Parameters:
    X (ndarray): The input spectral data matrix, where rows are samples and columns are wavelengths.
    
    Returns:
    ndarray: The MSC corrected spectral data.
    """
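    # Work in floating point so an integer input is not truncated on assignment
    X = np.asarray(X, dtype=float)
    
    # Calculate the mean spectrum (reference) and its mean over wavelengths
    X_ref = np.mean(X, axis=0)
    mean_X_ref = np.mean(X_ref)
    
    # Variance of the reference spectrum. np.mean uses a 1/N factor (N = number
    # of wavelength points) rather than 1/(N-1); the factor cancels in the
    # ratio covariance / variance below, so the slope is unaffected.
    variance = np.mean((X_ref - mean_X_ref) ** 2)
    
    # Initialize the corrected spectra matrix
    X_msc = np.zeros_like(X)
    
    # Perform MSC on each spectrum
    for i in range(X.shape[0]):
        X_i = X[i]
        mean_X_i = np.mean(X_i)
        
        # Covariance between the measured spectrum and the reference
        covariance = np.mean((X_i - mean_X_i) * (X_ref - mean_X_ref))
        
        # Calculate regression coefficients
        a = covariance / variance
        b = mean_X_i - a * mean_X_ref
        
        # Apply correction
        X_msc[i] = (X_i - b) / a
    
    return X_msc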
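
To see what the correction does on data with a known answer, the sketch below applies a vectorised variant of the manual approach to synthetic spectra. Everything here is an assumption made for illustration: the function name `msc_vectorized`, the Gaussian band shape, and the per-sample gains and offsets are not taken from a real data set.

```python
import numpy as np

def msc_vectorized(X):
    """Vectorised MSC sketch: regress every row on the mean spectrum at once."""
    X = np.asarray(X, dtype=float)
    x_ref = X.mean(axis=0)
    ref_centered = x_ref - x_ref.mean()
    X_centered = X - X.mean(axis=1, keepdims=True)
    # Slopes: covariance / variance (normalisation factors cancel), shape (n_samples,)
    a = X_centered @ ref_centered / (ref_centered @ ref_centered)
    # Intercepts, shape (n_samples,)
    b = X.mean(axis=1) - a * x_ref.mean()
    return (X - b[:, None]) / a[:, None]

# Synthetic spectra: one Gaussian band distorted by a per-sample gain and offset
wavelengths = np.linspace(0.0, 1.0, 50)
base = np.exp(-((wavelengths - 0.5) ** 2) / 0.02)
gains = np.linspace(0.8, 1.2, 6)        # multiplicative scatter (assumed values)
offsets = np.linspace(-0.1, 0.1, 6)     # additive baseline shift (assumed values)
X_sim = gains[:, None] * base + offsets[:, None]

X_corr = msc_vectorized(X_sim)

# Largest across-sample standard deviation at any wavelength, before and after
spread_before = X_sim.std(axis=0).max()
spread_after = X_corr.std(axis=0).max()
print(f"spread before: {spread_before:.4f}, after: {spread_after:.2e}")
```

Because every synthetic spectrum is an exact linear function of the same band shape, all corrected spectra collapse onto the mean spectrum. Real spectra also differ in chemical information, so some spread remains after correction.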
