Sliding window approach feature in image_demo.py for finetuning Grounding Dino #12290

shvardhan1994 · 2025-01-13T15:39:49Z

I have finetuned the Grounding DINO model on a custom dataset for two-class object detection. The training images were 512x512 crops. At inference time with image_demo.py, the model classifies and localizes objects almost correctly on similar cropped images.
However, when I run inference on an original full-size image (5464x3640 in my case), the detection quality is very poor.
I believe sliding window inference would help here, and I would appreciate help modifying image_demo.py to support it.
Currently, sliding window inference is available through large_image_demo.py, but it only handles Faster R-CNN variant architectures and not Grounding DINO.
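
To make the request concrete, here is a minimal, framework-independent sketch of the sliding window idea in Python with NumPy: tile the image into overlapping 512x512 patches, run the detector on each patch, shift the boxes back to full-image coordinates, and merge duplicates with per-class NMS. It is not MMDetection code. `detect_fn` is a hypothetical callable that would wrap whatever inference call the finetuned model uses and return `(boxes, scores, labels)` for a single patch. The function names and the `overlap` and `iou_thr` defaults are assumptions chosen for illustration.

```python
import numpy as np


def window_origins(size, patch, stride):
    """Start offsets so that windows of length `patch` cover [0, size)."""
    if size <= patch:
        return [0]
    starts = list(range(0, size - patch + 1, stride))
    if starts[-1] != size - patch:
        starts.append(size - patch)  # extra window flush with the far edge
    return starts


def nms(boxes, scores, iou_thr):
    """Greedy NMS on (N, 4) boxes in x1, y1, x2, y2 format; returns kept indices."""
    order = np.argsort(-scores)
    keep = []
    while order.size > 0:
        i, rest = order[0], order[1:]
        keep.append(int(i))
        x1 = np.maximum(boxes[i, 0], boxes[rest, 0])
        y1 = np.maximum(boxes[i, 1], boxes[rest, 1])
        x2 = np.minimum(boxes[i, 2], boxes[rest, 2])
        y2 = np.minimum(boxes[i, 3], boxes[rest, 3])
        inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
        area_i = (boxes[i, 2] - boxes[i, 0]) * (boxes[i, 3] - boxes[i, 1])
        area_r = (boxes[rest, 2] - boxes[rest, 0]) * (boxes[rest, 3] - boxes[rest, 1])
        iou = inter / (area_i + area_r - inter + 1e-9)
        order = rest[iou <= iou_thr]  # drop boxes that overlap the kept one
    return keep


def sliding_window_detect(image, detect_fn, patch=512, overlap=0.25, iou_thr=0.5):
    """Run the hypothetical `detect_fn(crop)` on overlapping patches and merge results."""
    h, w = image.shape[:2]
    stride = max(1, int(patch * (1 - overlap)))
    all_boxes, all_scores, all_labels = [], [], []
    for y0 in window_origins(h, patch, stride):
        for x0 in window_origins(w, patch, stride):
            crop = image[y0:y0 + patch, x0:x0 + patch]
            boxes, scores, labels = detect_fn(crop)
            boxes = np.asarray(boxes, dtype=np.float32).reshape(-1, 4)
            if len(boxes) == 0:
                continue
            # Shift patch-local coordinates back into the full image frame.
            all_boxes.append(boxes + np.array([x0, y0, x0, y0], dtype=np.float32))
            all_scores.append(np.asarray(scores, dtype=np.float32).reshape(-1))
            all_labels.append(np.asarray(labels).reshape(-1))
    if not all_boxes:
        return np.zeros((0, 4), np.float32), np.zeros(0, np.float32), np.zeros(0, int)
    boxes = np.concatenate(all_boxes)
    scores = np.concatenate(all_scores)
    labels = np.concatenate(all_labels)
    keep = []
    for c in np.unique(labels):  # per-class NMS removes duplicates from overlaps
        idx = np.flatnonzero(labels == c)
        keep.extend(idx[nms(boxes[idx], scores[idx], iou_thr)])
    keep = np.array(sorted(keep), dtype=int)
    return boxes[keep], scores[keep], labels[keep]
```

With these assumed defaults (stride 384), a 5464x3640 image would be covered by 14 x 10 = 140 patches. Objects larger than one patch or cut by a patch border are not handled well by plain NMS merging, so the overlap would need to be at least as large as the biggest expected object.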

mm-assistant bot assigned Czm369 Jan 13, 2025
