This folder contains training recipes and model readme files for each model. The folder structure and the naming rules for model configurations are as follows:
```text
├── configs
│   ├── model_a                        // model name in lower case with _ separator
│   │   ├── model_a_small_ascend.yaml  // training recipe denoted as {model_name}_{specification}_{hardware}.yaml
│   │   ├── model_a_large_gpu.yaml
│   │   ├── README.md                  // readme file containing performance results and pretrained weight urls
│   │   └── README_CN.md               // readme file in Chinese
│   ├── model_b
│   │   ├── model_b_32_ascend.yaml
│   │   ├── model_b_16_ascend.yaml
│   │   ├── README.md
│   │   └── README_CN.md
│   └── README.md                      // this file
```
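If you are contributing a new model, a quick way to scaffold a config folder that follows this layout is sketched below. This is only an illustration; the model name `my_model` and the file names are hypothetical placeholders, not real recipes in this repository.

```shell
# Hypothetical example: create a new model folder following the layout above
mkdir -p configs/my_model
touch configs/my_model/my_model_base_ascend.yaml \
      configs/my_model/README.md \
      configs/my_model/README_CN.md
```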
Note: Our training recipes are verified on specific hardware, and the hardware suffix (ascend or gpu) in the file name of a training recipe indicates the target hardware. Since MindSpore operators can differ in precision and performance across hardware, different training recipes may be required on different hardware. However, if you want to train on other hardware (e.g., GPU) using a training recipe written for specific hardware (e.g., Ascend), you usually only need minor or no adjustments to the hyperparameters, because the training recipes generalize reasonably well across hardware.
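For example, a minimal way to adapt an existing recipe to another hardware target is to copy it under the naming rule above and tune the copy; the file names below are illustrative placeholders only.

```shell
# Illustrative sketch: reuse an Ascend recipe on GPU by copying it and, if needed,
# adjusting a few hyperparameters (e.g. learning rate, batch size) inside the copy
cp configs/model_a/model_a_small_ascend.yaml configs/model_a/model_a_small_gpu.yaml
```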
The model readme file in each sub-folder provides the introduction, reproduced results, and running guidelines for each model.
Please follow the outline structure and table format shown in densenet/README.md when contributing your models :)
| model name  | params(M) | cards | batch size | resolution | jit level | graph compile | ms/step | img/s   | acc@top1 | acc@top5 | recipe | weight  |
| ----------- | --------- | ----- | ---------- | ---------- | --------- | ------------- | ------- | ------- | -------- | -------- | ------ | ------- |
| densenet121 | 8.06      | 8     | 32         | 224x224    | O2        | 300s          | 47.34   | 5446.81 | 75.67    | 92.77    | yaml   | weights |
Illustration:
- model name: model name in lower case with _ separator.
- top-1 and top-5: Accuracy reported on the validation set of ImageNet-1K. Keep 2 digits after the decimal point.
- params(M): # of model parameters in millions (10^6). Keep 2 digits after the decimal point.
- batch size: Training batch size.
- cards: # of cards.
- ms/step: Time used on training per step in ms (see the throughput sanity check after this list).
- jit level: JIT level of the MindSpore context, which contains 3 levels: O0/O1/O2.
- recipe: Training recipe/configuration linked to a yaml config file.
- weight: URL of the pretrained model weights.
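As a rough sanity check on a reported row, throughput can be approximated from the other columns. The relation below is an assumption for illustration only, not how the table values are produced, and the measured img/s will differ slightly.

```shell
# Assumed relation for a quick plausibility check (not an official formula):
#   img/s ≈ cards × batch_size × 1000 / (ms/step)
python -c "print(8 * 32 * 1000 / 47.34)"   # ≈ 5407.7, in the same ballpark as the reported 5446.81
```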
The checkpoint (i.e., model weight) name should follow this format: `{model_name}_{specification}-{sha256sum}.ckpt`, e.g., `poolformer_s12-5be5c4e4.ckpt`.
You can run the following command and take the first 8 characters of the result as the sha256sum value in the checkpoint name.

```shell
sha256sum your_model.ckpt
```
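For instance, one way to apply the naming rule in a single step is sketched below; the checkpoint file name is hypothetical.

```shell
# Hypothetical example: take the first 8 characters of the sha256 hash and rename the checkpoint
HASH=$(sha256sum poolformer_s12.ckpt | cut -c1-8)
mv poolformer_s12.ckpt "poolformer_s12-${HASH}.ckpt"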
For consistency, it is recommended to provide distributed training commands based on `msrun --bind_core=True --worker_num {num_devices} python train.py`, instead of using a shell script such as `distributed_train.sh`.
```shell
# standalone training on a single NPU device
python train.py --config configs/densenet/densenet_121_ascend.yaml --data_dir /path/to/dataset --distribute False

# distributed training on NPU devices
msrun --bind_core=True --worker_num 8 python train.py --config configs/densenet/densenet_121_ascend.yaml --data_dir /path/to/imagenet
```
Please use absolute paths in the hyperlinks or URLs that link to target resources in the readme files and tables.