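The previous post tested Clip Interrogator Mixlab; this one tests Deepseek Janus Pro, a model that was only just released on 2025.01.28 (the day of this post).
A few days ago I was still wondering whether a better image-to-text model existed, and today this one came out. Truly impressive.
Test images (the same five as before):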
![]() | ![]() | ![]() | ![]() | ![]() |
GPU: RTX3060 12GB; RAM: 64GB DDR4
Deepseek Janus Pro Evaluation
1. Janus-Pro-1b model
Model: Janus-Pro-1b | Output (English) | Output (Chinese translation via Google) | Notes |
--- | --- | --- | --- |
![]() | The image depicts a cute, animated frog sitting in a small, round pond. The frog has large, expressive eyes and a green, spotted body. It appears to be enjoying the water, with its front legs partially submerged and its mouth slightly open as if it's about to drink or splash. The pond is surrounded by lush green plants and trees, creating a serene, natural environment. The sky above is bright blue with fluffy white clouds, and there are a few small birds flying in the background. The overall atmosphere is peaceful and whimsical, with glowing particles floating around the water, adding a magical touch to the scene. | 圖像描繪了一隻可愛的動畫青蛙坐在一個圓形的小池塘裡。這種青蛙有一雙大而富有表現力的眼睛和綠色斑點的身體。它看起來很享受水,前腿部分浸入水中,嘴巴微微張開,好像要喝水或潑水。池塘周圍綠樹成蔭,環境寧靜自然。頭頂的天空蔚藍,飄浮著蓬鬆的白雲,背景有幾隻小鳥在飛翔。整體氛圍是平靜而奇妙的,發光的粒子漂浮在水面上,為場景增添了神奇的感覺。 | Generation time: 5.06s |
![]() | The image depicts a cartoon-style character who appears to be a young boy. He has short, spiky dark hair and is wearing a yellow short-sleeved shirt with a small emblem on the left side of the chest. He is also wearing dark shorts and red sneakers with white soles and blue laces. The boy is holding a barbell with both hands, seemingly lifting it as if he is about to perform a weightlifting exercise. The barbell has several large black weight plates stacked on it. The background is plain and light-colored, with some small rocks or debris scattered around the ground. The boy's expression is one of excitement or determination, with his mouth open and eyes wide. | 該圖像描繪的是一個卡通風格的人物,看起來像是一個小男孩。他有一頭短短的、尖尖的黑髮,身穿黃色短袖襯衫,胸部左側有一個小徽章。他還穿著深色短褲和白色鞋底、藍色鞋帶的紅色運動鞋。男孩雙手握住槓鈴,看起來像是要舉重練習一樣。槓鈴上堆放著幾塊黑色的大重量盤。背景樸素、色彩淺淡,地面上散落著一些小石頭或碎片。男孩的表情充滿興奮或堅定,嘴巴張開,眼睛睜得大大的。 | Generation time: 6.44s |
![]() | The image depicts a heartwarming scene featuring two animated characters, a boy and a girl, holding hands and facing each other. The boy is dressed in a school uniform with a backpack, while the girl is wearing a red jacket and has bunny ears on her head. They are surrounded by floating hearts in a warm, glowing light, suggesting a romantic or affectionate atmosphere. The background appears to be a cityscape with blurred figures in the distance, enhancing the focus on the two characters. The overall mood is tender and loving. | 圖片描繪的是兩個動畫人物,一個男孩和一個女孩,手牽手,面對面的溫馨場景。男孩身著校服,背著書包,女孩則身穿紅色外套,頭上戴著兔耳朵。它們周圍環繞著漂浮的心形圖案,散發著溫暖而明亮的光芒,暗示著一種浪漫而深情的氛圍。背景看起來像是城市景觀,遠處有模糊的人物,增強了對兩個角色的關注。整體氛圍是溫柔而充滿愛意的。 | Generation time: 4.56s |
![]() | The image depicts a dramatic scene where a person is emerging from a large cauldron filled with a bubbling, golden liquid. The individual appears to be in a state of distress or surprise, with water splashing around them as they emerge from the cauldron. The background is engulfed in flames, suggesting a fire or intense heat. The person is shirtless, and their expression is one of shock or fear. The scene is set outdoors, with the ground covered in sand or dirt. The overall atmosphere is intense and chaotic, with bright orange and yellow flames dominating the background. | 這張圖描繪了一個戲劇性的場景:一個人從一個裝滿冒泡的金色液體的大鍋中走出來。當這個人從大鍋裡出來時,周圍水花四濺,看起來似乎處於痛苦或驚訝的狀態。背景被火焰吞沒,暗示著火災或酷熱。此人赤裸上身,表情震驚或恐懼。場景設置在戶外,地面覆蓋著沙子或泥土。整體氛圍緊張而混亂,明亮的橙色和黃色火焰佔據了背景。 | Generation time: 5.05s |
![]() | The image depicts a large, anthropomorphic bear standing on a rocky path. The bear has a muscular build and is covered in brown fur. It is positioned slightly to the right of the image, appearing to be in motion, possibly walking or running. In front of the bear, there is a large rock balanced precariously on another rock, creating a precarious yet stable pose. The background consists of a cloudy sky and some grassy terrain with scattered rocks and small pebbles. The overall scene gives a sense of adventure and exploration. | 圖像描繪了一隻巨大的擬人化熊站在一條岩石小路上。這隻熊肌肉發達,身穿棕色皮毛。它位於影像的右側稍遠處,看起來正在移動,可能是在走路或跑步。在熊的前方,有一塊大石頭搖搖欲墜地壓在另一塊石頭上,形成一種岌岌可危但又穩定的姿勢。背景由多雲的天空和一些草地組成,上面散落著岩石和小鵝卵石。整體場景給人一種冒險和探索的感覺。 | Generation time: 4.62s |
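Conclusion: the 1B model performed unexpectedly well. All five descriptions were correct, the generation time is acceptable (roughly 4 to 6 seconds), and the output covers both what is in the scene and descriptive detail. Overall it did better than the llama 7B-based approach. For now it is the best choice.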
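2. Janus-Pro-7b (peak memory 45GB, steady state 21.3GB)
While loading the model, RAM usage briefly climbed to around 45GB; once loading finished, it settled at about 21.3GB.
Model: Janus-Pro-7b | Output (English) | Output (Chinese translation via Google) | Notes |
--- | --- | --- | --- |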
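As a rough sanity check on these numbers, the size of a model's weights can be estimated as parameter count times bytes per parameter. The sketch below is only a back-of-the-envelope estimate: the parameter counts are nominal values read off the model names (1e9 and 7e9), the dtype sizes are the standard ones, and activations, caches, and loader overhead are ignored. Which dtype the workflow actually loads the model in is not something I measured, so treat the result as an assumption-laden estimate.

```python
# Rough weight-memory estimate: parameter count x bytes per parameter.
# Parameter counts are nominal values taken from the model names (assumption);
# activations, caches, and loader overhead are deliberately ignored.
BYTES_PER_PARAM = {"float32": 4, "bfloat16": 2, "int8": 1}

VRAM_GB = 12  # RTX3060 used in this post


def weight_gb(num_params: float, dtype: str) -> float:
    """Return the approximate size of the weights in GB (10**9 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


for name, params in {"Janus-Pro-1b": 1e9, "Janus-Pro-7b": 7e9}.items():
    for dtype in BYTES_PER_PARAM:
        size = weight_gb(params, dtype)
        verdict = "fits in" if size < VRAM_GB else "exceeds"
        print(f"{name} {dtype}: ~{size:.0f} GB of weights, {verdict} {VRAM_GB} GB VRAM")
```

Under these assumptions, the 7B weights alone exceed 12GB at 16-bit precision or higher, while the 1B weights fit comfortably at every precision listed.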
![]() | The image depicts a cute, animated frog sitting in a small pond surrounded by lush greenery. The frog has large, expressive eyes and a friendly, smiling face. Its body is predominantly green with some lighter patches, and it has orange-tipped toes. The frog is sitting upright with its front legs resting on its belly. The pond is bordered by a low stone wall, and there are several rocks and pebbles around the water. The background features a bright, sunny sky with fluffy white clouds and a few birds flying. The surrounding vegetation includes green leaves and branches, creating a serene and natural environment. | 圖像描繪了一隻可愛的動畫青蛙坐在被茂密的綠色植物包圍的小池塘裡。青蛙有一雙大而富有表現力的眼睛和一張友好而微笑的臉。它的身體主要呈綠色,帶有一些淺色斑點,腳趾尖呈橙色。青蛙直立坐著,前腿擱在腹部。 池塘周圍有一堵低矮的石牆,水邊有幾塊岩石和鵝卵石。背景是明亮、晴朗的天空,有蓬鬆的白雲和幾隻飛翔的鳥兒。周圍植被綠葉、枝條茂盛,營造出寧靜自然的環境。 | Generation time: 126.92s |
![]() | The image depicts a cartoon-style character, a young boy with spiky dark hair and expressive brown eyes. He is wearing a yellow short-sleeved shirt with a small emblem on the left side and dark shorts. The boy is lifting a barbell with both hands, showcasing his strength and determination. The barbell is loaded with weight plates on each end, and there are small pieces of debris flying off the ground, indicating the intensity of his effort. The character is wearing red and white sneakers with blue accents. The background is a simple gradient, emphasizing the focus on the boy and his action. | 該圖像描繪的是一個卡通風格的人物,一個有著黑色尖刺髮和富有表現力的棕色眼睛的年輕男孩。他穿著黃色短袖襯衫,左側有一個小徽章,下方穿著深色短褲。男孩雙手舉起槓鈴,展現他的力量和決心。槓鈴的兩端都裝著重量片,地上飛出的小碎片顯示了他努力的強度。這個角色穿著帶有藍色裝飾的紅白運動鞋。背景是簡單的漸變,強調對男孩和他的動作的關注。 | Generation time: 89.03s |
![]() | This image depicts a romantic scene between two animated characters. They are standing close to each other, holding hands, and facing each other with expressions of affection. The boy is wearing a white shirt with a dark vest and has a backpack on his shoulders. The girl is dressed in a red coat with a hood and also carries a backpack. In the background, there are numerous glowing hearts floating in the air, creating a warm and romantic atmosphere. The scene appears to be set in an outdoor urban environment, possibly a street or a park, with buildings visible in the distance. The overall mood of the image is one of love and tenderness, emphasized by the glowing hearts and the characters' close interaction. | 這張圖描繪的是兩個動畫人物之間的浪漫場景。他們緊緊地站在一起,手牽手,四目相對,滿臉深情。男孩身穿白色襯衫,搭配深色背心,肩上背著背包。女孩身穿帶兜帽的紅色外套,還背著背包。 背景中,無數顆發光的心漂浮在空中,營造出溫馨浪漫的氛圍。場景似乎設定在戶外城市環境中,可能是街道或公園,遠處可見建築物。 影像的整體基調是愛與溫柔,透過熾熱的心和人物的親密互動得到強調。 | Generation time: 105.36s |
![]() | The image depicts a dramatic scene where a person is submerged in a large black cauldron filled with water. The cauldron is placed on a bed of burning logs, and flames surround the entire scene, creating a fiery backdrop. The person appears to be exerting effort, as water splashes out from the cauldron, indicating vigorous movement. The overall atmosphere is intense and dynamic, with the bright orange and yellow hues of the fire contrasting sharply with the dark silhouette of the person and the cauldron. | 圖片描繪的是一個戲劇性的場景:一個人被浸泡在裝滿水的黑色大鍋裡。大鍋子被放置在燃燒的木頭上,火焰環繞著整個場景,創造了一個火熱的背景。這個人似乎在用力,因為水從大鍋中濺出,這表明動作很激烈。整體氛圍強烈而動感,火焰的明亮橙色和黃色與人物和大鍋的深色輪廓形成鮮明對比 | Generation time: 89.17s |
![]() | The image depicts a large, anthropomorphic bear standing on a rocky path in a mountainous area. The bear has a muscular build and is covered in brown fur. Its eyes are a striking yellow, and it has a determined expression on its face. The bear is positioned in the center of the image, facing forward. In the foreground, there are several large rocks, with one particularly prominent rock balancing on top of another smaller rock. The path the bear is standing on is surrounded by grass and smaller stones. The background shows a cloudy sky and the outline of distant mountains, adding to the natural and rugged setting of the scene. | 圖像描繪了一隻巨大的擬人化熊站在山區的岩石小路上。這隻熊肌肉發達,身穿棕色皮毛。它的眼睛是醒目的黃色,臉上帶著堅定的表情。熊位於影像的中央,面向前方。 前景中有幾塊大石頭,其中一塊特別顯眼的石頭壓在另一塊較小的石頭上。熊站立的小路周圍是草地和小石頭。背景顯示出多雲的天空和遠處山脈的輪廓,增添了場景的自然和崎嶇感。 | Generation time: 97.85s |
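Conclusion: the descriptions are very detailed, and all five test images were described correctly. The drawback is speed: each result takes roughly 90 to 130 seconds. I also noticed that after running this model, other nodes in the same workflow could no longer run, apparently because it had consumed all available resources. Not recommended for users with only 12GB of VRAM. For now, between the two, I recommend the 1B model.
The 1B version of this model far exceeded my expectations; it can replace the other image-description models I was using before.
Next up: testing the Florence2 series.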
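To put the speed gap in one place, here is a small sketch that summarizes the generation times from the two tables. The numbers are copied directly from the tables in this post; the helper name `summarize` is just something made up for illustration.

```python
from statistics import mean

# Generation times in seconds, copied from the two result tables in this post.
times_1b = [5.06, 6.44, 4.56, 5.05, 4.62]
times_7b = [126.92, 89.03, 105.36, 89.17, 97.85]


def summarize(times):
    """Return the min, max, and mean of a list of timings (hypothetical helper)."""
    return {"min": min(times), "max": max(times), "mean": mean(times)}


summary_1b = summarize(times_1b)
summary_7b = summarize(times_7b)

# How many times slower the 7B model is on average, on this machine.
slowdown = summary_7b["mean"] / summary_1b["mean"]

for label, s in (("Janus-Pro-1b", summary_1b), ("Janus-Pro-7b", summary_7b)):
    print(f"{label}: min {s['min']:.2f}s, max {s['max']:.2f}s, mean {s['mean']:.2f}s")
print(f"7b / 1b mean slowdown: {slowdown:.1f}x")
```

On this hardware the 7B model averages about 102 seconds per image against about 5 seconds for the 1B, so it is roughly 20 times slower. That ratio only reflects these five images on an RTX3060 12GB and should not be read as a general benchmark.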
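If you repost or quote this article, please notify me first and keep the original source and link. (For simple sharing or non-commercial use, keeping the original source is enough; no notification needed.)
Original link: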
https://blog.aidec.tw/post/comfyui-image-description-janus-pro-deepseek