Auto tagging--is there a list of possible "objects" and "scenes" somewhere?

Sat Apr 12 09:56:09 BST 2025

 Hi Paul,

Wow, thank you for finding that. You did a much better job in your searches than I did! Based on a few of the seemingly random tags I've gotten, it does look like this must be the 1,000 objects list.

Reusing your search terms, with yolo+coco....category in place of efficientnet+b7...classes, led me to a post that appears to have the "80 objects" list. Again, some of it seems kind of random, but at least now I have some idea of what to expect.

https://tech.amikelive.com/node-718/what-object-categories-labels-are-in-coco-dataset/
https://gist.github.com/rcland12/dc48e1963268ff98c8b2c4543e7a9be8
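
For anyone wondering why a fixed list has to exist at all: a classifier typically outputs one score per class index, and the tag text comes from looking that index up in a label list like the ones linked above. Below is a minimal sketch of that lookup in Python. It is a generic illustration, not digiKam's actual code; the function name, the threshold, the max_tags limit, and the example scores are all made up for the example, and only the first ten COCO class names are included.

```python
# First ten of the 80 COCO class names, in the usual index order.
# (Excerpt only; the full list is in the links above.)
COCO_LABELS_EXCERPT = [
    "person", "bicycle", "car", "motorcycle", "airplane",
    "bus", "train", "truck", "boat", "traffic light",
]


def tags_from_scores(scores, labels, threshold=0.5, max_tags=3):
    """Map per-class scores to tag names (hypothetical helper).

    scores[i] is assumed to be the model's confidence for labels[i].
    Returns at most max_tags names whose score meets the threshold,
    highest score first.
    """
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    return [label for score, label in ranked if score >= threshold][:max_tags]


# Made-up scores for one image, one value per label above.
example_scores = [0.05, 0.10, 0.92, 0.03, 0.01, 0.61, 0.02, 0.40, 0.00, 0.07]
print(tags_from_scores(example_scores, COCO_LABELS_EXCERPT))
```

Under this kind of scheme a model can never produce a tag that is not in its label list, which would explain why the tags look like a fixed and somewhat arbitrary vocabulary.
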

Billy

On Saturday, April 12, 2025 at 05:17:37 PM GMT+9, Paul A. Norman <paul at paulanorman.info> wrote:

 Hi Billy,

Yes, I could not believe how much digging and refining of search terms it took, but this might be something like what you are looking for:

‘IMAGENET 1000 Class List’

https://deeplearning.cms.waikato.ac.nz/user-guide/class-maps/IMAGENET/

— specific to:

“This is used by most pretrained models included in WekaDeeplearning4j”

Search was:

efficientnet b7 1000 object classes list names

https://www.google.com/search?q=efficientnet+b7+1000+object+classes+list+names

Kind regards,
Paul
https://PaulANorman.info

On 12 April 2025 4:12:20 pm NZST, Billy <mrgreatzot at yahoo.com> wrote:
 Hi,

I've been trying out the auto-tagging feature since yesterday (so far just the EfficientNet B7 option), but the documentation is rather vague. It says one model detects "1,000 different objects and scenes", while the others "detect 80 different objects". Ok...but what exactly are the objects and scenes these models can detect? The tags have to come from somewhere, so there must be a list somewhere, right? I tried running a few searches online but so far found none. As it is I have no clear idea of what the models are intended to look for or what exactly is different between them.

https://docs.digikam.org/en/left_sidebar/tags_view.html#auto-tagging-images

"The default model is EfficientNet B7. The EfficientNet B7 model is a general-purpose model that can detect 1,000 different objects and scenes. The YOLOv11-Nano model is faster and uses less memory than the EfficientNet B7 model. The YOLOv11-Nano model is recommended for users with limited memory or slower processors, and YOLOv11-XLarge is recommended for users with more memory and faster processors. Both YOLOv11 models are trained to detect 80 different objects based on the COCO dataset."

I'd appreciate whatever additional info you can provide.
--Billy  
