Motivation

Impaired drivers (inattentive, drowsy, or under the influence) contribute to around 50% of road accidents in the U.S. These accidents account for around 15,000 deaths per year in the U.S. (275,000 worldwide), which makes this an important problem to address. For our project, we set out to detect such drivers using computer vision on footage from an in-car dashcam, so that the system can issue a warning to the driver, or even lock the vehicle, to prevent these accidents before they happen.

Approach and Implementation

Impaired Driver

To accomplish our goal, we implemented a pipeline of two Convolutional Neural Networks (CNNs) followed by a post classifier. We designed both CNN architectures ourselves and trained them on data we found (linked at the bottom) using Keras with TensorFlow in Python. The first CNN is a regression model that detects the driver in the image. The second CNN is a binary classification model that takes the detected driver as input and determines whether the driver is attentive to the road (a forward-facing head counts as attentive). If this model determines the driver to be attentive, the cropped image of the driver is sent through the post classifier, which determines whether the driver is drowsy or under the influence by looking for yawns, closed eyes, or head nodding. It does this by counting pixels in the expected locations of the mouth and eyes.

We split the problem into two models and a post classifier because we struggled to find suitable datasets. Most datasets for classifying the state of a face contain close-up images of the face, so a model trained on them would struggle when given an image from a dashcam. We also tried joining multiple smaller dashcam datasets focused on inattentive or drowsy drivers. However, this resulted in heavy class imbalance and inconsistency in the camera position from which the driver was photographed. Thus, we decided to split the bigger problem into smaller ones and solve each with the respective smaller dataset that we found.
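The control flow of this pipeline can be sketched as follows. This is an illustrative sketch rather than the project's actual code: the three stages are passed in as plain callables, and the names `classify_frame`, `DriverState`, the `(left, top, right, bottom)` box format, and the 0.5 decision threshold (with higher scores meaning "attentive") are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple

Image = Sequence[Sequence[float]]   # grayscale rows; a 2-D NumPy array also fits
Box = Tuple[int, int, int, int]     # assumed format: (left, top, right, bottom) in pixels


@dataclass
class DriverState:
    attentive: bool
    impaired: bool


def crop(image: Image, box: Box) -> List[List[float]]:
    """Cut the driver region out of the full dashcam frame."""
    left, top, right, bottom = box
    return [list(row[left:right]) for row in image[top:bottom]]


def classify_frame(
    image: Image,
    detect_driver: Callable[[Image], Box],          # stage 1: localization (regression CNN)
    attention_score: Callable[[Image], float],      # stage 2: binary attention CNN
    looks_impaired: Callable[[Image], bool],        # stage 3: post classifier
    threshold: float = 0.5,                         # assumed cut-off, not from the project
) -> DriverState:
    driver = crop(image, detect_driver(image))
    attentive = attention_score(driver) >= threshold
    # The post classifier only runs when the driver is judged attentive.
    impaired = looks_impaired(driver) if attentive else False
    return DriverState(attentive=attentive, impaired=impaired)


# Usage with stand-in stages (real models would replace the lambdas).
frame = [[0.0] * 8 for _ in range(6)]
state = classify_frame(
    frame,
    detect_driver=lambda img: (2, 1, 6, 5),
    attention_score=lambda driver: 0.9,
    looks_impaired=lambda driver: False,
)
print(state)
```

Keeping the stages as separate callables mirrors the reason for the split: each stage can be trained or tuned on its own small dataset and swapped out without touching the others.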

This is a diagram of what our pipeline looks like:

This is a sample flow of images:

Issues we faced along the way

Until the midterm report

Our initial plan was to rely on OpenCV’s Haar cascade classifier for face detection to find the driver, and then develop and train our own model for classifying the state of the driver. However, the Haar cascade proved unreliable for images captured from a dashboard perspective. Although effective with standard, front-facing portraits, the classifier struggled with the varied angles, lighting, and partial occlusions typical of in-car dashcam footage. To address this limitation, we pivoted to training our own TensorFlow Keras CNN for driver localization using a small labeled Roboflow dataset. This pivot introduced new challenges of model complexity and computational limits. As mentioned in the section above, dataset limitations led us at this point to build two models, one for driver detection and one for classifying the state of the driver.
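Since the localization model is a regression model, its output has to be turned into a crop, and its quality has to be scored against the labeled boxes. The sketch below shows one way this could look. It assumes the network predicts a normalized `(x_min, y_min, x_max, y_max)` box in the range 0 to 1, which is an assumption for illustration and not necessarily the format used in the project. Intersection over union (IoU) is a common localization metric and is used here only as an example. Both helper names are hypothetical.

```python
def to_pixel_box(pred, width, height):
    """Scale an assumed normalized (x_min, y_min, x_max, y_max) prediction to pixels."""
    # Regression outputs are unbounded, so clamp them into the image first.
    x0, y0, x1, y1 = (min(max(v, 0.0), 1.0) for v in pred)
    # Sort each pair so the box stays valid even if the outputs come out swapped.
    left, right = sorted((round(x0 * width), round(x1 * width)))
    top, bottom = sorted((round(y0 * height), round(y1 * height)))
    return left, top, right, bottom


def iou(box_a, box_b):
    """Intersection over union of two (left, top, right, bottom) boxes."""
    inter_w = max(0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    inter_h = max(0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    intersection = inter_w * inter_h
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - intersection
    return intersection / union if union else 0.0


# Usage: compare a predicted box with a labeled one on a 640x480 frame.
predicted = to_pixel_box((0.25, 0.5, 0.75, 1.2), width=640, height=480)
labeled = (150, 230, 470, 480)
print(predicted, iou(predicted, labeled))
```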

Until Final Presentation

Our initial plan for the driver state classification model was to classify the driver as attentive, inattentive, drowsy, or under the influence in a single model. However, we ran into issues developing and training a model with more than two classes, largely due to dataset limitations and the imbalance that arose when combining multiple datasets. Increasing model complexity for improved accuracy was also not viable due to compute limitations, as we trained everything on our personal machines. So at this point we pivoted to training a model that classifies the driver as either attentive (facing the road) or inattentive, and to developing a post classifier that determines whether the driver is drowsy or under the influence.
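A pixel-counting post classifier of the kind described in the approach section could look roughly like the sketch below. Everything specific in it is an assumption for illustration: the idea that open eyes show dark pupil pixels while a yawn shows a large dark mouth cavity, the fixed eye and mouth regions, the intensity threshold of 60 on a 0 to 255 scale, and the two fraction cut-offs. Head nodding is not covered here.

```python
def dark_fraction(gray, region, threshold=60):
    """Fraction of pixels inside `region` that are darker than `threshold`.

    `gray` is a grayscale image as rows of 0-255 values (nested lists or a
    2-D NumPy array); `region` is (left, top, right, bottom) in pixels.
    """
    left, top, right, bottom = region
    values = [v for row in gray[top:bottom] for v in row[left:right]]
    if not values:
        return 0.0
    return sum(1 for v in values if v < threshold) / len(values)


def looks_impaired(gray, eye_region, mouth_region,
                   eye_min_dark=0.10, mouth_max_dark=0.40):
    """Flag closed eyes or a yawn from dark-pixel counts (assumed heuristic)."""
    # Few dark pixels where the eyes should be: pupils hidden, eyes likely closed.
    eyes_closed = dark_fraction(gray, eye_region) < eye_min_dark
    # Many dark pixels where the mouth should be: mouth wide open, likely a yawn.
    yawning = dark_fraction(gray, mouth_region) > mouth_max_dark
    return eyes_closed or yawning


# Usage on a synthetic 10x10 "face": bright skin with a few dark pupil pixels.
face = [[200] * 10 for _ in range(10)]
eyes, mouth = (2, 2, 8, 4), (3, 6, 7, 9)
for x in (3, 4, 6):
    face[2][x] = 20
print(looks_impaired(face, eyes, mouth))
```

Fixed regions only work if the driver crop is consistent from frame to frame, which is one reason the crop from the localization model matters for this stage.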

Results

In this image the driver is facing forward but their eyes are closed, so we expect the post classifier to flag the driver as impaired, and it does:

In this image the driver is turned to the left, so we expect our attention classification model to report the driver as inattentive, and it does:

In this image the driver is attentive, so we expect the driver to be marked as attentive and not impaired, and they are:

In this image the driver is inattentive, looking at something on the left, so we again expect our attention classification model to report the driver as inattentive, and it does:

Further ideas

If we had additional time to extend our completed system, we would gather better data and broaden our classifier from binary classification to a multi-class model capable of distinguishing not only attentiveness but also drowsiness and other types of impairment, so that we would only need one model end to end.

If this is not possible and we were to keep our pipeline, we would improve our post classifier to use Eye Aspect Ratio (EAR) and Mouth Aspect Ratio (MAR) so that it detects drowsiness and being under the influence more accurately.
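For reference, EAR is usually computed from six landmarks around one eye as (|p2 - p6| + |p3 - p5|) / (2 |p1 - p4|), where p1 and p4 are the eye corners and the other points lie on the upper and lower lids. The sketch below implements that formula. The MAR function uses one simple variant (mouth opening divided by mouth width); definitions of MAR vary, so treat it as an assumption. Obtaining the landmarks requires a separate facial landmark detector, which is outside this sketch, and any decision thresholds would need tuning on real dashcam data.

```python
from math import dist


def eye_aspect_ratio(eye):
    """EAR from six (x, y) eye landmarks ordered p1..p6.

    p1 and p4 are the horizontal eye corners; p2, p3 lie on the upper lid
    and p6, p5 on the lower lid directly opposite them.
    """
    p1, p2, p3, p4, p5, p6 = eye
    return (dist(p2, p6) + dist(p3, p5)) / (2.0 * dist(p1, p4))


def mouth_aspect_ratio(left, right, top, bottom):
    """Simple MAR variant (an assumption): mouth opening over mouth width."""
    return dist(top, bottom) / dist(left, right)


# Usage with made-up landmark coordinates.
open_eye = [(0, 0), (1, 1), (2, 1), (3, 0), (2, -1), (1, -1)]
closed_eye = [(0, 0), (1, 0.1), (2, 0.1), (3, 0), (2, -0.1), (1, -0.1)]
print(eye_aspect_ratio(open_eye), eye_aspect_ratio(closed_eye))
print(mouth_aspect_ratio((0, 0), (4, 0), (2, 1), (2, -1)))
```

EAR drops toward zero as the eye closes and MAR rises as the mouth opens, so both give a continuous signal that is less sensitive to lighting than raw pixel counts.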

We also see an opportunity to evolve our existing inference pipeline into a fully real-time application by running inference on live webcam footage. Further polish could include a more sophisticated visualization interface featuring color-coded state predictions, supplemental cues such as EAR and MAR charts (if we switch to using those), and exportable video clips that could serve as evidence after a crash. These enhancements would make our system more practical, interpretable, and deployable in real driving environments.

Sources and Data Links

    Sources

  • Hou, Junjian, et al. “A Systematic Review for the Fatigue Driving Behavior Recognition Method.” Journal of Intelligent & Fuzzy Systems: Applications in Engineering and Technology, 18 Nov. 2023. Link to report
  • Liu, Fan, et al. “A Review of Driver Fatigue Detection and Its Advances on the Use of RGB-D Camera and Deep Learning.” Key Laboratory of Water Big Data Technology of Ministry of Water Resources, Hohai University, Nanjing, China, Nov. 2022. Link to report
  • Soultana, Abdelfettah, et al. “A Systematic Literature Review of Driver Inattention Monitoring Systems for Smart Car.” International Journal of Interactive Mobile Technologies (iJIM), Laboratory of Modeling and Information Technology Faculty of Sciences Ben M’SIK University Hassan II, 31 Aug. 2022. Link to report

    Data

  • Data for face detection - Link
  • Dataset for classification - Link

    Reports

  • Project Proposal - Link
  • Midterm Report - Link