Abstract—Deep learning is a subfield of machine learning, and computer vision is one of the fields that applies it to image processing, object classification, and object detection. Many object detection models with different characteristics have been developed, so choosing a model that suits the needs of a project takes time because the candidates must be compared against one another. This study compares three deep learning models for detecting cars and buses: Faster-RCNN with ResNet50, SSD with MobileNet, and EfficientDet D0. Each model was run using the TensorFlow Object Detection API and trained for 3000 steps on a custom dataset of 52 images. In the experiments, Faster-RCNN ResNet50 achieved the highest mAP (0.453) and EfficientDet D0 the lowest (0.274). Faster-RCNN ResNet50 also achieved the highest Average Recall (0.337), and EfficientDet D0 the lowest (0.190). In terms of model size, EfficientDet D0 was the smallest at 290 MB, and Faster-RCNN ResNet50 the largest at 1280 MB.