We currently release the code and models for:

- ImageNet-1K pretraining
- ImageNet-1K pretraining + Token Labeling
- Large resolution fine-tuning
- Lightweight models
**05/21/2022** Lightweight models are released, which surpass MobileViT, PVTv2 and EfficientNet.

**03/06/2022** Some models with `head_dim=64` are released, which reduce the memory cost for downstream tasks.

**01/19/2022**

- Pretrained models on ImageNet-1K with Token Labeling.
- Large resolution fine-tuning.

**01/13/2022** Pretrained models on ImageNet-1K are released.
The following models and logs can be downloaded from Google Drive: total_models, total_logs.
We also release the models on Baidu Cloud: total_models (bdkq), total_logs (ttub).
| Model | Top-1 | Resolution | #Param. | FLOPs | Model | Log | Shell |
| --- | --- | --- | --- | --- | --- | --- | --- |
| UniFormer-XXS | 76.8 | 128x128 | 10.2M | 0.43G | | | run.sh |
| UniFormer-XXS | 79.1 | 160x160 | 10.2M | 0.67G | | | run.sh |
| UniFormer-XXS | 79.9 | 192x192 | 10.2M | 0.96G | | | run.sh |
| UniFormer-XXS | 80.6 | 224x224 | 10.2M | 1.3G | | | run.sh |
| UniFormer-XS | 81.5 | 192x192 | 16.5M | 1.4G | | | run.sh |
| UniFormer-XS | 82.0 | 224x224 | 16.5M | 2.0G | | | run.sh |
For these lightweight models, we train with a longer schedule (600 epochs) and weaker data augmentation. In addition, to avoid NaN loss, we do not use mixed-precision training.
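The NaN issue is the usual FP16 overflow failure mode, and the fix is simply to keep the forward/backward pass in full FP32. A minimal sketch of the two modes in PyTorch (function and argument names are illustrative, not the repo's actual training code):

```python
import torch


def train_step(model, images, labels, optimizer, criterion, use_amp=False):
    """One optimization step, with mixed precision optionally enabled."""
    optimizer.zero_grad()
    if use_amp:
        # FP16 autocast + loss scaling; this is the path that can overflow
        # to NaN for the lightweight models, so the repo disables it.
        scaler = torch.cuda.amp.GradScaler()
        with torch.cuda.amp.autocast():
            loss = criterion(model(images), labels)
        scaler.scale(loss).backward()
        scaler.step(optimizer)
        scaler.update()
    else:
        # Plain FP32 forward/backward: slower, but numerically safer.
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()
    return loss.item()
```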
The following models and logs can be downloaded from Google Drive: total_models, total_logs.
We also release the models on Baidu Cloud: total_models (bdkq), total_logs (ttub).
| Model | Top-1 | #Param. | FLOPs | Model | Log | Shell |
| --- | --- | --- | --- | --- | --- | --- |
| UniFormer-S | 82.9 | 22M | 3.6G | | | run.sh |
| UniFormer-S† | 83.4 | 24M | 4.2G | | | run.sh |
| UniFormer-B | 83.8 | 50M | 8.3G | | - | run.sh |
| UniFormer-B+Layer Scale | 83.9 | 50M | 8.3G | | | run.sh |
Although Layer Scale is helpful for training deep models, we encountered problems when fine-tuning on video datasets. Hence, we only use the models trained without it for video tasks.
Since UniFormer-S† uses `head_dim=32`, which incurs a high memory cost for downstream tasks, we re-train it with `head_dim=64`. All models are trained at 224x224 resolution.
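The memory saving follows from how multi-head attention scales: each head materializes an N×N attention map, and the number of heads is `embed_dim / head_dim`, so doubling `head_dim` halves the attention-map memory per layer. A back-of-envelope sketch (the 196-token grid and 320-dim stage width are illustrative values, not taken from any specific config):

```python
def attention_map_floats(seq_len: int, embed_dim: int, head_dim: int) -> int:
    """Floats stored in the attention matrices of one multi-head
    self-attention layer: one (seq_len x seq_len) map per head."""
    num_heads = embed_dim // head_dim
    return num_heads * seq_len * seq_len


# For a 14x14 token grid (196 tokens) and a 320-dim stage:
# head_dim=32 gives 10 heads, head_dim=64 gives 5 heads,
# i.e. half the attention-map memory. The gap grows quadratically
# with resolution, which is why it matters for downstream tasks.
print(attention_map_floats(196, 320, 32))  # 10 * 196 * 196
print(attention_map_floats(196, 320, 64))  # 5 * 196 * 196
```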
| Model | Top-1 | #Param. | FLOPs | Model | Log | Shell |
| --- | --- | --- | --- | --- | --- | --- |
| UniFormer-S† | 83.4 | 24M | 4.2G | | | run.sh |
The following models and logs can be downloaded from Google Drive: total_models, total_logs.
We also release the models on Baidu Cloud: total_models (p05h), total_logs (wsvi).
We follow LV-ViT to train our models with Token Labeling. Please see token_labeling for more details.
| Model | Top-1 | #Param. | FLOPs | Model | Log | Shell |
| --- | --- | --- | --- | --- | --- | --- |
| UniFormer-S | 83.4 (+0.5) | 22M | 3.6G | | | run.sh |
| UniFormer-S† | 83.9 (+0.5) | 24M | 4.2G | | | run.sh |
| UniFormer-B | 85.1 (+1.3) | 50M | 8.3G | | | run.sh |
| UniFormer-L+Layer Scale | 85.6 | 100M | 12.6G | | | run.sh |
Since UniFormer-S/S†/B use `head_dim=32`, which incurs a high memory cost for downstream tasks, we re-train these models with `head_dim=64`. All models are trained at 224x224 resolution.
| Model | Top-1 | #Param. | FLOPs | Model | Log | Shell |
| --- | --- | --- | --- | --- | --- | --- |
| UniFormer-S | 83.4 (+0.5) | 22M | 3.6G | | | run.sh |
| UniFormer-S† | 83.6 (+0.2) | 24M | 4.2G | | | run.sh |
| UniFormer-B | 84.8 (+1.0) | 50M | 8.3G | | | run.sh |
The following models and logs can be downloaded from Google Drive: total_models, total_logs.
We also release the models on Baidu Cloud: total_models (p05h), total_logs (wsvi).
We fine-tune the above models with Token Labeling at a resolution of 384x384. Please see token_labeling for more details.
| Model | Top-1 | #Param. | FLOPs | Model | Log | Shell |
| --- | --- | --- | --- | --- | --- | --- |
| UniFormer-S | 84.6 | 22M | 11.9G | | | run.sh |
| UniFormer-S† | 84.9 | 24M | 13.7G | | | run.sh |
| UniFormer-B | 86.0 | 50M | 27.2G | | | run.sh |
| UniFormer-L+Layer Scale | 86.3 | 100M | 39.2G | | | run.sh |
Our repository is built based on the DeiT repository, and we add some useful features:
- Calculating accurate FLOPs and parameters with fvcore (see check_model.py).
- Auto-resuming.
- Saving best models and backup models.
- Generating training curve (see generate_tensorboard.py).
- Clone this repo:

  ```bash
  git clone https://github.com/Sense-X/UniFormer.git
  cd UniFormer
  ```

- Install PyTorch 1.7.0+ and torchvision 0.8.1+:

  ```bash
  conda install -c pytorch pytorch torchvision
  ```

- Install other packages:

  ```bash
  pip install timm
  pip install fvcore
  ```
Simply run the training scripts in `exp` as follows:

```bash
bash ./exp/uniformer_small/run.sh
```
If the training is interrupted abnormally, you can simply rerun the script for auto-resuming. If the checkpoint was not saved properly, set the resumed model via `--resume ${work_path}/ckpt/backup.pth`.
Simply run the evaluation scripts in `exp` as follows:

```bash
bash ./exp/uniformer_small/test.sh
```

It evaluates the last model by default. You can specify other models via `--resume`.
You can generate the training curves as follows:

```bash
python3 generate_tensorboard.py
```

Note that you should install `tensorboardX` first.
You can calculate the FLOPs and parameters via:

```bash
python3 check_model.py
```
This repository is built using the timm library and the DeiT repository.