Networks”, X. Wang et. al., CVPR2018 [2] Asymmetric Non-local Block: “Asymmetric Non-local Neural Networks for Semantic Segmentation”, Z. Zhu et. al., ICCV2019 [3] “Normalized Object Coordinate Space for Category-Level 6D Object Pose and Size Estimation”, H. Wang et. al., CVPR2019 [4] CenterNet: “Objects as Points”, X. Zhou et. al., arXiv2019 [5] DORN: “Deep Ordinal Regression Network for Monocular Depth Estimation.”, H. Fu et. al., CVPR2018 [6] PSMNet: “Pyramid stereo matching network.”, J. Chang et. al., CVPR2018 [7] AVOD: “Joint 3d proposal generation and object detection from view aggregation.”, J. Ku et. al., IROS2018 [8] F-POINTNET: “Frustum pointnets for 3d object detection from rgb-d data.”, C. R. Qi et. al., CVPR2018 [9] “Dynamic Graph Message Passing Networks”, L. Zhang et. al., CVPR2020 [10] Focal loss: “Focal Loss for Dense Object Detection”, T. Y. Lin et. al., ICCV2017 [11] “Center3D: Center-based Monocular 3D Object Detection with Joint Depth Understanding”, Y. Tang et. al., arXiv2020 [12] PointPillars: “PointPillars: Fast encoders for object detection from point clouds”, A. H. Lang et. al., CVPR2019 [13] BTS: “From big to small: Multi-scale local planar guidance for monocular depth estimation”, J. H. Lee et. al., arXiv2019 [14] “Softsort: A continuous relaxation for the argsort operator”, S. Prillo et. al., ICML2020 [15] AP-Loss: “AP-Loss for accurate one-stage object detection.”, K. Chen et. al., TPAMI2020 [16] “Soft-NMS‒improving object detection with one line of code.”, N. Bodla et. al., ICCV2017 [17] “Distance-normalized unified representation for monocular 3D object detection.”, X. Shi et. al., ECCV2020 References