How Much VRAM Does Your Neural Network Need?

October 30, 2024

Camille X

Estimation

\text{VRAM} = \text{model.parameters()} + \text{middle\_variables} \times \text{batch\_size}

(https://www.slideshare.net/slideshow/cs231n-2017-lecture9-cnn-architecture/83282815#34)

Taking float32 precision as an example: a VGG16 model has 138M parameters, and its intermediate variables (activations) take about 96 MB per image. With a batch size of 32, one batch needs roughly

(0.138 \times 4) + (0.096 \times 2 \times 32) \approx 0.55 + 6.14 \approx 6.7\ \text{GB}

where the first term is the 138M float32 weights at 4 bytes each, and the second doubles the 0.096 GB of per-image forward activations to account for the tensors kept for the backward pass.
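
This back-of-the-envelope estimate is easy to script. Below is a minimal sketch: the parameter count is taken from torchvision, while the 0.096 GB/image activation figure is copied from the CS231n slide above rather than computed.

import torchvision.models as models

model = models.vgg16()
n_params = sum(p.numel() for p in model.parameters())  # ~138M for VGG16

bytes_per_param = 4        # float32
batch_size = 32
act_gb_per_image = 0.096   # forward activations per image (from the CS231n slide)

param_gb = n_params * bytes_per_param / 1024 ** 3
act_gb = act_gb_per_image * 2 * batch_size  # x2: activations are kept for the backward pass
print(f"Estimated VRAM: {param_gb + act_gb:.1f} GB")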

Testing in code

torchsummary

import torch
import torchvision.models as models
from torchsummary import summary

# Define a simple model (ResNet18 as an example) and move it to the GPU
model = models.resnet18().cuda()

# Print the model summary
summary(model, (3, 256, 256))  # input size: a 256x256 RGB image

# Query the current VRAM usage
def get_memory_usage():
    allocated_memory = torch.cuda.memory_allocated()  # memory allocated by live tensors
    reserved_memory = torch.cuda.memory_reserved()    # memory reserved by the caching allocator
    return allocated_memory, reserved_memory

# Run one forward pass to see how much VRAM the intermediate variables take
input_tensor = torch.randn(1, 3, 256, 256).cuda()  # a random input
allocated_before = get_memory_usage()[0]  # VRAM allocated before the forward pass
output = model(input_tensor)              # forward pass
allocated_after = get_memory_usage()[0]   # VRAM allocated after the forward pass

print(f"Allocated before forward: {allocated_before / (1024 ** 2):.2f} MB")
print(f"Allocated after forward: {allocated_after / (1024 ** 2):.2f} MB")
print(f"Allocated by the forward pass: {(allocated_after - allocated_before) / (1024 ** 2):.2f} MB")
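
Note that torch.cuda.memory_allocated() only counts tensors that are still alive when it is called, so the forward-pass delta above misses the backward pass and optimizer state. A minimal sketch of measuring peak VRAM over one full training step, reusing the same ResNet18 (the SGD optimizer and the summed output used as a dummy loss are placeholder choices):

import torch
import torchvision.models as models

model = models.resnet18().cuda()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

torch.cuda.reset_peak_memory_stats()  # reset the peak counter to the current usage
input_tensor = torch.randn(32, 3, 256, 256).cuda()
loss = model(input_tensor).sum()      # dummy loss, only to trigger backward
loss.backward()
optimizer.step()

peak_mb = torch.cuda.max_memory_allocated() / (1024 ** 2)
print(f"Peak VRAM during one training step: {peak_mb:.2f} MB")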

torchinfo

import torchvision.models as models
from torchinfo import summary

model = models.resnet18().cuda()
summary(model, input_size=(1, 3, 256, 256))
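
As a cross-check, torchinfo's report ends with size estimates (input size, forward/backward pass size, params size, and an estimated total size), which map onto the terms of the formula at the top of this post.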
  • next day
  • leetcode * 2
  • Evaluate zarr vs. hdf5; hand-write a dataloader, focusing on the preprocessing code, including randomization
    • Work out the sampler and replay buffer code
  • Read the dexterous-hand papers
    • Think about model improvement ideas and tricks for humanoid robots
  • Finish this article