Enhancing object pose estimation for RGB images in cluttered scenes.


Authors: Metwalli Al-Selwi, Yan Chao, Yin Gao, Jun Li, Qiming Li, Huang Ning

Language: eng

Classification: 627.12 Rivers and streams

Publication information: England : Scientific Reports, 2025

Physical description:

Collection: NCBI

ID: 711442

Estimating the 6D pose of objects is crucial for robots to interact with their environment, and 6D object pose estimation from RGB images remains challenging in cluttered scenes with heavy occlusions. Most existing methods estimate object pose in two stages: first extract object features, then recover the pose with a PnP/RANSAC solver. However, most of these techniques merely localize a set of keypoints by regressing their coordinates, which makes them vulnerable to occlusion and limits their performance in multi-object pose estimation; moreover, they cannot regress the 6D pose directly from a loss during training. In this paper, we propose a framework based on a convolutional neural network (CNN) and a self-attention mechanism as an end-to-end method for single- and multi-object 6D pose estimation from RGB images with low computational cost. Our method uses feature fusion to extract local features and combines multi-head self-attention (MHSA) with iterative refinement to improve pose estimation performance. Furthermore, the method can be scaled according to available computational resources. Our experiments show that the method performs well on the benchmark Linemod and Occlusion Linemod datasets, achieving 97.45% and 84.84% on the ADD(-S) metric, respectively.
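To make the pipeline described in the abstract concrete, the snippet below is a minimal, hypothetical sketch (not the authors' released code): a CNN backbone extracts features, a multi-head self-attention block refines the flattened feature tokens over a few iterations, and a linear head regresses the pose directly so a pose loss can be applied end to end. The backbone choice (ResNet-18), the embedding width, the number of refinement iterations, and the 6D-rotation-plus-translation output parameterization are assumptions made for illustration, not details taken from the paper.

```python
# Hypothetical sketch of a CNN + MHSA pose regressor (assumptions noted above).
import torch
import torch.nn as nn
import torchvision


class PoseRegressor(nn.Module):
    def __init__(self, num_heads: int = 8, embed_dim: int = 256, num_refine: int = 2):
        super().__init__()
        # CNN feature extractor (ResNet-18 chosen here only for illustration).
        backbone = torchvision.models.resnet18(weights=None)
        self.features = nn.Sequential(*list(backbone.children())[:-2])  # B x 512 x h x w
        self.proj = nn.Conv2d(512, embed_dim, kernel_size=1)            # condense channels

        # Multi-head self-attention over the flattened feature map, applied
        # num_refine times as a simple iterative refinement loop.
        self.mhsa = nn.MultiheadAttention(embed_dim, num_heads, batch_first=True)
        self.norm = nn.LayerNorm(embed_dim)
        self.num_refine = num_refine

        # Direct pose head: 6D continuous rotation representation + 3D translation.
        self.pose_head = nn.Linear(embed_dim, 6 + 3)

    def forward(self, rgb: torch.Tensor) -> torch.Tensor:
        feat = self.proj(self.features(rgb))        # B x C x h x w
        tokens = feat.flatten(2).transpose(1, 2)    # B x (h*w) x C
        for _ in range(self.num_refine):
            attn, _ = self.mhsa(tokens, tokens, tokens)
            tokens = self.norm(tokens + attn)       # residual refinement step
        pooled = tokens.mean(dim=1)                 # global descriptor
        return self.pose_head(pooled)               # B x 9 (rotation 6D + translation 3D)


if __name__ == "__main__":
    model = PoseRegressor()
    pose = model(torch.randn(2, 3, 256, 256))
    print(pose.shape)  # torch.Size([2, 9])
```

Because the head outputs the pose directly, a pose-level loss (e.g., a point-matching or ADD-style loss) can be backpropagated through the whole network, which is the end-to-end property the abstract contrasts with two-stage keypoint + PnP/RANSAC pipelines.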
