Convolutional Neural Networks


2025.03

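What Is a CNN (Convolutional Neural Network)?

A convolutional neural network (CNN) is a type of deep neural network used mainly for image tasks such as classification, object detection, and segmentation. CNNs are designed to learn spatial hierarchies of features from input images automatically and adaptively. Compared with the fully connected layers of a traditional neural network, they capture local dependencies efficiently and need far fewer parameters, because each filter's weights are shared across all positions in the image.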
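
To make the parameter savings concrete, here is a minimal sketch that counts weights and biases for one fully connected layer and one convolutional layer. The helper names and the sizes (a 224x224 RGB image, 64 units or filters, 3x3 kernels) are assumptions chosen for illustration, not values from this article.

```python
def dense_params(in_h, in_w, in_c, units):
    """Weights plus biases for a fully connected layer on a flattened image."""
    return in_h * in_w * in_c * units + units


def conv_params(kernel_h, kernel_w, in_c, filters):
    """Weights plus biases for a convolutional layer (independent of image size)."""
    return kernel_h * kernel_w * in_c * filters + filters


# Hypothetical sizes: 224x224 RGB input, 64 output units / 64 filters of 3x3.
dense_count = dense_params(224, 224, 3, 64)
conv_count = conv_params(3, 3, 3, 64)
print(dense_count)  # 9633856
print(conv_count)   # 1792
```

The convolutional layer's count does not depend on the image size at all, which is where most of the savings come from.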

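Key components of a CNN:

Convolutional layer:
This layer applies a convolution operation to its input, extracting features such as edges, textures, and patterns from the image. The operation slides a filter (also called a kernel) across the input image.
Pooling layer:
A pooling layer downsamples the feature maps, shrinking their spatial dimensions so the network is cheaper to compute. It also adds a degree of translation invariance, meaning the network can still recognize an object when its position in the image shifts slightly.
Fully connected layer:
After the convolutional and pooling layers, fully connected layers classify the features extracted by the earlier layers. For classification tasks, the final output layer usually uses a softmax or sigmoid activation.
Activation function (ReLU):
An activation function such as ReLU (Rectified Linear Unit) usually follows each convolutional or fully connected layer. It introduces nonlinearity into the model so that it can learn more complex patterns.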
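
The softmax and sigmoid activations mentioned for the output layer can be sketched in a few lines of NumPy. This is an illustrative sketch only; the function names and the example scores are assumptions, and subtracting the maximum before exponentiating is a common numerical-stability trick rather than something this article prescribes.

```python
import numpy as np


def softmax(logits):
    """Turn a vector of raw scores into probabilities that sum to 1."""
    shifted = logits - np.max(logits)  # avoids overflow in exp
    exp = np.exp(shifted)
    return exp / np.sum(exp)


def sigmoid(x):
    """Squash a raw score into the range (0, 1), e.g. for binary classification."""
    return 1.0 / (1.0 + np.exp(-x))


scores = np.array([2.0, 1.0, 0.1])  # hypothetical raw scores for three classes
probs = softmax(scores)
print(probs, probs.sum())
print(sigmoid(0.0))  # 0.5
```

Softmax is the usual choice when exactly one of several classes applies; sigmoid suits binary or multi-label outputs.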

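Example CNN architecture:

Input layer: a single image or a batch of images.
Convolutional layer 1: applies a set of convolutional filters (kernels).
ReLU activation: applies ReLU to introduce nonlinearity.
Pooling layer 1: max pooling or average pooling.
Convolutional layer 2: applies further convolutions.
Fully connected layer: flattens the output and feeds it into a fully connected layer for classification.
Output layer: uses a softmax or sigmoid activation to produce the final classification.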
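
A quick way to sanity-check an architecture like the one above is to trace how the spatial size changes from layer to layer. The sketch below assumes a hypothetical 28x28 input, 3x3 kernels with no padding, 2x2 pooling with stride 2, and 8 filters in the second convolutional layer; none of these numbers come from the article.

```python
def conv_output_size(size, kernel, stride=1, padding=0):
    """Spatial size after a convolution along one dimension."""
    return (size + 2 * padding - kernel) // stride + 1


def pool_output_size(size, pool, stride):
    """Spatial size after pooling along one dimension."""
    return (size - pool) // stride + 1


size = 28                                  # hypothetical 28x28 input
size = conv_output_size(size, kernel=3)    # convolutional layer 1 -> 26
size = pool_output_size(size, 2, 2)        # pooling layer 1       -> 13
size = conv_output_size(size, kernel=3)    # convolutional layer 2 -> 11
flattened = size * size * 8                # assuming 8 filters in layer 2
print(size, flattened)  # 11 968
```

The flattened length is what determines the number of input weights the fully connected layer needs.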

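Implementing a CNN from Scratch (Without Frameworks Such as TensorFlow or PyTorch)

Below is a simple CNN implemented with NumPy. It is meant to show how the individual operations in a CNN (convolution, ReLU, pooling, and so on) work.

We will implement a basic CNN consisting of:

One convolutional layer
One ReLU activation layer
One pooling layer
One fully connected layer

This is a heavily simplified CNN that covers only the forward pass. It leaves out training as well as features such as batch normalization and dropout.

Step 1: Convolutional Layer

We implement the convolution operation, which slides a filter (kernel) across the input image.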

import numpy as np

def convolve2d(input_image, kernel):
    # Note: like most deep learning code, this does not flip the kernel,
    # so strictly speaking it computes a cross-correlation.
    kernel_height, kernel_width = kernel.shape
    image_height, image_width = input_image.shape

    # Output dimensions with no padding and a stride of 1
    output_height = image_height - kernel_height + 1
    output_width = image_width - kernel_width + 1

    output = np.zeros((output_height, output_width))

    # Slide the kernel across the input image
    for i in range(output_height):
        for j in range(output_width):
            region = input_image[i:i+kernel_height, j:j+kernel_width]
            output[i, j] = np.sum(region * kernel)  # element-wise multiply, then sum
    return output
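
The nested loops make the sliding-window idea explicit but are slow on larger images. As a sketch of an alternative, the same computation can be vectorized with `numpy.lib.stride_tricks.sliding_window_view`, which requires NumPy 1.20 or later. The name `convolve2d_vectorized` and the test values are assumptions made for illustration.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def convolve2d_vectorized(input_image, kernel):
    """Loop-free sliding-window sum of products (no padding, stride 1)."""
    # windows has shape (out_h, out_w, kernel_h, kernel_w)
    windows = sliding_window_view(input_image, kernel.shape)
    # Multiply every window by the kernel and sum over the kernel axes
    return np.einsum("ijkl,kl->ij", windows, kernel)


image = np.arange(16, dtype=float).reshape(4, 4)  # image[i, j] = 4*i + j
kernel = np.array([[1.0, 0.0],
                   [0.0, -1.0]])
result = convolve2d_vectorized(image, kernel)
print(result)  # every entry is image[i, j] - image[i+1, j+1] = -5
```

Like the loop version, this does not flip the kernel. Passing `kernel[::-1, ::-1]` instead would give a convolution in the strict mathematical sense.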

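Step 2: ReLU Activation

ReLU is applied element-wise to the convolution output: negative values become 0 and positive values pass through unchanged.

def relu(input_image):
    return np.maximum(0, input_image)  # ReLU: max(0, x) for every element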
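
As a small worked example of the element-wise behavior, the sketch below applies the same `np.maximum` call to a hand-picked array. The input values are assumptions chosen to include negative, zero, and positive entries.

```python
import numpy as np

x = np.array([[-6.0, 12.0],
              [0.0, -0.5]])  # hypothetical feature-map values
activated = np.maximum(0, x)  # negatives are clipped to 0, the rest pass through
print(activated)
```

Only the 12.0 survives here; the two negative entries are clipped to zero and the existing zero is unchanged.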

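Step 3: Pooling Layer (Max Pooling)

We implement a simple max pooling layer with a 2x2 window and a stride of 2. Rows and columns that do not fill a complete window are dropped.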

def max_pooling(input_image, pool_size=2, stride=2):
    image_height, image_width = input_image.shape
    output_height = (image_height - pool_size) // stride + 1
    output_width = (image_width - pool_size) // stride + 1

    output = np.zeros((output_height, output_width))

    # Take the maximum of each pooling window
    for i in range(0, image_height - pool_size + 1, stride):
        for j in range(0, image_width - pool_size + 1, stride):
            region = input_image[i:i+pool_size, j:j+pool_size]
            output[i // stride, j // stride] = np.max(region)

    return output
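
To illustrate the translation tolerance that pooling provides, the sketch below pools two inputs in which the same strong activation sits one pixel apart. The helper `max_pool_2x2` is a hypothetical reshape-based variant fixed to a 2x2 window with stride 2; it is an assumption for this example, not the article's function.

```python
import numpy as np


def max_pool_2x2(x):
    """2x2 max pooling with stride 2, implemented with a reshape."""
    h, w = x.shape
    trimmed = x[: h - h % 2, : w - w % 2]  # drop an odd trailing row/column
    return trimmed.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))


a = np.zeros((4, 4))
a[0, 0] = 9.0  # strong activation in the top-left corner
b = np.zeros((4, 4))
b[1, 1] = 9.0  # same activation shifted by one pixel, still inside the same window

same = np.array_equal(max_pool_2x2(a), max_pool_2x2(b))
print(same)  # True
```

The invariance is only local: a shift that moves the activation into a different pooling window does change the output.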

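Step 4: Fully Connected Layer

The fully connected layer is a simple dense layer: it takes the output of the previous layer, flattens it, and computes a weighted sum plus a bias.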

def fully_connected(input_image, weights, bias):
    # Flatten the input (in case it is multi-dimensional)
    flattened_input = input_image.flatten()

    # Weighted sum of the inputs plus the bias
    output = np.dot(flattened_input, weights) + bias
    return output
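
With a one-dimensional weight vector the layer produces a single number. For multi-class classification the weights would instead form a matrix with one column per class. The sketch below shows that shape arithmetic; the 2x2 feature map, the three classes, and the seeded random generator are all assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)            # seeded so the example is repeatable
features = rng.standard_normal((2, 2))    # stand-in for a pooled feature map
num_classes = 3                           # hypothetical number of classes

# One column of weights per class, one bias per class
weights = rng.standard_normal((features.size, num_classes))
bias = np.zeros(num_classes)

logits = features.flatten() @ weights + bias
print(logits.shape)  # (3,)
```

Each entry of `logits` is a raw score for one class; a softmax would turn these scores into probabilities.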

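Step 5: Putting It All Together

We now build a small example that takes an image, applies convolution, ReLU, and pooling, and then passes the result through the fully connected layer to produce an output. The weights are random, so the final number is not a meaningful prediction.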

# Example image (5x5)
image = np.array([[1, 2, 3, 0, 1],
                  [4, 5, 6, 1, 2],
                  [7, 8, 9, 2, 3],
                  [1, 2, 3, 4, 5],
                  [6, 7, 8, 5, 6]])

# A simple 3x3 kernel that responds to horizontal intensity changes (vertical edges)
kernel = np.array([[1, 0, -1],
                   [1, 0, -1],
                   [1, 0, -1]])

# Convolution
conv_output = convolve2d(image, kernel)
print("Convolution output:")
print(conv_output)

# Apply the ReLU activation
relu_output = relu(conv_output)
print("ReLU output:")
print(relu_output)

# Apply max pooling
pool_output = max_pooling(relu_output)
print("Max pooling output:")
print(pool_output)

# Fully connected layer (flattened input, 1-D weights and a scalar bias)
weights = np.random.randn(pool_output.size)  # random weights
bias = np.random.randn()  # random bias
fc_output = fully_connected(pool_output, weights, bias)
print("Fully connected output:")
print(fc_output)
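
The deterministic part of this pipeline can be checked by hand. The sketch below recomputes the convolution, ReLU, and pooling results for the same image and kernel in compact form, hard-coding the 3x3 output size and the single 2x2 pooling window. The variable names are assumptions, and the fully connected step is omitted because its weights are random.

```python
import numpy as np

image = np.array([[1, 2, 3, 0, 1],
                  [4, 5, 6, 1, 2],
                  [7, 8, 9, 2, 3],
                  [1, 2, 3, 4, 5],
                  [6, 7, 8, 5, 6]])
kernel = np.array([[1, 0, -1],
                   [1, 0, -1],
                   [1, 0, -1]])

# Each entry is (sum of the window's left column) - (sum of its right column)
conv = np.array([[np.sum(image[i:i+3, j:j+3] * kernel) for j in range(3)]
                 for i in range(3)])
activated = np.maximum(0, conv)
# A 3x3 input with a 2x2 window and stride 2 has room for only one window
pooled = activated[:2, :2].max()

print(conv)       # rows: [-6 12 12], [-6 8 8], [-6 6 6]
print(activated)  # rows: [0 12 12], [0 8 8], [0 6 6]
print(pooled)     # 12
```

Because the ReLU output is 3x3, pooling keeps only the top-left 2x2 window and drops the last row and column, so the fully connected layer receives a single value.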

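What happens at each stage:

Input image: we create a simple 5x5 image.
Convolution: we apply a 3x3 filter to the image, which yields a smaller 3x3 output.
ReLU: we apply the ReLU function to introduce nonlinearity.
Pooling: we downsample with max pooling, using a 2x2 window and a stride of 2. On a 3x3 input this leaves a single value.
Fully connected: we flatten the pooled output and apply a fully connected layer with random weights and a random bias.

This is a heavily simplified CNN, but it should give you a basic understanding of how convolution, ReLU, pooling, and fully connected layers work.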

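Conclusion

This implementation is a starting point for understanding how a CNN works under the hood. In practice, frameworks such as TensorFlow, Keras, and PyTorch handle many optimizations, efficiency improvements, and additional features (for example batch normalization and dropout) that make CNNs more powerful and easier to train on large datasets.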

