Employees should be competent and have expertise in their respective fields. Evaluation is needed to maintain the quality of employee performance, and one way to carry it out is to observe employees' activity during working hours. This research addresses the classification of employee activity in desk work. Classification is investigated using ResNet and the Cyclical Learning Rate (CLR) method on a novel vision-based employee activity dataset. Three types of employee activity are considered: talking on the phone, using a PC, and playing with a smartphone. The best result in this research is obtained by ResNet50 using CLR with an image input of 224x224x3, which achieves 87.01% accuracy and a 12.99% error rate for talking on the phone, 99.95% accuracy and a 0.05% error rate for using a PC, and 81.67% accuracy and an 18.83% error rate for playing with a smartphone, together with a decreasing loss value. In addition, this research shows that the cyclical learning rate significantly affects model performance.
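
As an illustration of the cyclical learning rate idea, the following is a minimal sketch of a triangular schedule, in which the learning rate rises linearly from a lower bound to an upper bound and falls back again over each cycle. The abstract does not state which CLR policy, bounds, or cycle length were used, so the triangular form and the names and values `base_lr`, `max_lr`, and `step_size` are assumptions chosen only for illustration.

```python
import math


def triangular_clr(iteration, base_lr=1e-4, max_lr=1e-2, step_size=2000):
    """Return the learning rate for a given training iteration.

    Assumed (hypothetical) settings, not taken from the paper:
      base_lr   -- lower bound of the learning rate
      max_lr    -- upper bound of the learning rate
      step_size -- iterations in half a cycle (rise or fall)
    """
    # Index of the current cycle, starting at 1.
    cycle = math.floor(1 + iteration / (2 * step_size))
    # Distance from the peak of the current cycle, scaled to [0, 1].
    x = abs(iteration / step_size - 2 * cycle + 1)
    # Linear interpolation between the bounds.
    return base_lr + (max_lr - base_lr) * max(0.0, 1.0 - x)


if __name__ == "__main__":
    # Print the schedule at a few points across two cycles.
    for it in (0, 1000, 2000, 3000, 4000, 6000):
        print(it, triangular_clr(it))
```

In a training loop, a function of this kind would be evaluated once per mini-batch and its value assigned to the optimizer's learning rate, so that the rate sweeps between the two bounds rather than staying fixed or only decaying.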