Microplastic pollution has emerged as a critical environmental threat, yet traditional spectrum-based detection methods remain limited by high costs, complex preprocessing, and poor generalizability. To address these challenges, we propose Yolov7CS, a deep learning framework that enhances the YOLOv7 architecture with a plug-and-play hybrid attention mechanism comprising channel and spatial attention modules. We developed a high-resolution image dataset of seven microplastic types under diverse background conditions, along with a real-world and robustness testing platform that simulates practical water environments. Yolov7CS performs strongly across controlled, noise-degraded, and real-world datasets, achieving a mean average precision (mAP@50) of 0.955 and an F1-score of 0.938. It also outperforms four state-of-the-art models, including DETR and YOLOv5 variants, in both accuracy and robustness. Despite a moderate increase in computational cost, the model’s accuracy, adaptability, and generalization capacity make it promising for scalable microplastic monitoring applications.
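For readers unfamiliar with the reported metrics, the following is a minimal sketch of how they are conventionally defined in object detection, not the evaluation code used in this work. It assumes boxes in (x1, y1, x2, y2) format; the function names `iou`, `is_true_positive`, and `f1_score` are illustrative. The "@50" in mAP@50 refers to counting a predicted box as correct when its intersection over union (IoU) with a ground-truth box is at least 0.5, and the F1-score is the harmonic mean of precision and recall.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2)."""
    # Corners of the overlapping rectangle, if any.
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    # Clamp at zero so disjoint boxes give zero intersection.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_true_positive(pred_box, gt_box, threshold=0.5):
    """A prediction matches a ground-truth box when IoU >= threshold (0.5 for mAP@50)."""
    return iou(pred_box, gt_box) >= threshold


def f1_score(tp, fp, fn):
    """Harmonic mean of precision and recall from true/false positive and false negative counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```

mAP@50 is then the mean, over all classes (here, the seven microplastic types), of the per-class average precision computed with this 0.5 IoU matching rule.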