Impaired Driver Detection

Motivation

Impaired drivers (inattentive, drowsy, or under the influence) contribute to around 50% of road accidents in the U.S. These accidents contribute to around 15,000 deaths per year in the U.S. (275,000 worldwide), making this an important issue that we hope to address. For our project, we therefore want to detect such drivers using computer vision on footage from a dashcam in the car, and issue a warning to the driver, or even lock the vehicle, to prevent these accidents before they happen.

State of the Art

Drowsy Driver

There is currently no all-encompassing, ready-to-use solution for detecting impaired drivers that works in any vehicle. One solution is Tesla's interior camera, which monitors the driver when FSD is turned on. However, from what Tesla has released, it only attempts to detect whether the driver is looking towards the road, and does not attempt to detect other types of impairment such as drowsiness. Additionally, it is only available in Teslas. Samsara released a comprehensive approach for detecting drowsy drivers in 2024, targeted towards truck drivers. However, their solution only detects drowsy drivers by looking for cues like yawns and head nods, and does not try to detect other types of impairment. Thus, this solution is limited and less applicable for everyday use. Another approach is to monitor steering. However, poor steering might not be detected quickly enough to prevent an accident, and steering is not a reliable indicator of impairment, since a driver may have slowed reaction time (due to impairment) but still maintain acceptable steering patterns.

Approach and Implementation

Impaired Driver

To accomplish our goal, we implemented a pipeline of two Convolutional Neural Networks (CNNs) followed by a post classifier. We designed both architectures ourselves and trained them on data we found (linked at the bottom) using Keras with TensorFlow in Python. Our first CNN is a regression model that locates the driver in the image. Our second CNN is a binary classification model that takes the detected driver as input and determines whether the driver is attentive to the road (head facing forward means attentive). If this model determines the driver to be attentive, the cropped image of the driver is sent through the post classifier, which determines whether the driver is drowsy or under the influence by looking for yawns, closed eyes, or head nodding, counting pixels in the expected locations of the mouth and eyes.

We split the problem into two models and a post classifier because we struggled to find suitable datasets. Most datasets for classifying the state of a face are framed tightly on the face, so a model trained on them would struggle when given an image from a dashcam. We also attempted to combine multiple smaller datasets of dashcam images focused on inattentive or drowsy drivers. However, this resulted in heavy class imbalance and inconsistency in the location from which the image of the driver was taken. Thus, we decided to split the bigger problem into smaller ones and solve each with the respective smaller dataset that we found.
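The routing between the three stages described in this section can be sketched as follows. This is only an illustration of the control flow, not our actual code: the class name DriverPipeline, the three callables, and the (x, y, width, height) box format are hypothetical, and the stages are injected as plain functions so that trained Keras models (or stubs) can be plugged in.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Image = List[List[int]]          # grayscale image as rows of pixel values
Box = Tuple[int, int, int, int]  # (x, y, width, height), an assumed format


@dataclass
class DriverPipeline:
    """Hypothetical wrapper that chains the three stages."""
    locate_driver: Callable[[Image], Box]   # stage 1: regression CNN
    is_attentive: Callable[[Image], bool]   # stage 2: binary CNN
    is_impaired: Callable[[Image], bool]    # stage 3: post classifier

    def classify(self, frame: Image) -> str:
        # Stage 1: find the driver and crop the frame to that box.
        x, y, w, h = self.locate_driver(frame)
        crop = [row[x:x + w] for row in frame[y:y + h]]

        # Stage 2: a driver who is not facing the road is inattentive.
        if not self.is_attentive(crop):
            return "inattentive"

        # Stage 3: only attentive drivers reach the post classifier.
        if self.is_impaired(crop):
            return "impaired"
        return "attentive"


if __name__ == "__main__":
    # Stub stages stand in for the trained models.
    frame = [[0] * 8 for _ in range(6)]
    pipeline = DriverPipeline(
        locate_driver=lambda img: (2, 1, 4, 3),
        is_attentive=lambda crop: True,
        is_impaired=lambda crop: False,
    )
    print(pipeline.classify(frame))  # prints "attentive"
```

Keeping the stages as separate callables mirrors the reason for the split: each stage can be trained and swapped independently on its own smaller dataset.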

This is a diagram of what our pipeline looks like:

This is a sample flow of images:

Issues we faced along the way

Until the midterm report

Our initial plan was to rely on OpenCV’s Haar cascade classifier for face detection to find the driver, and then develop and train our own model for classifying the state of the driver. However, the Haar cascade proved unreliable for images captured from a dashboard perspective. Although effective with standard, front-facing portraits, the classifier struggled with the varied angles, lighting, and partial occlusions typical of in-car dashcam footage. To address this limitation, we pivoted to training our own TensorFlow Keras CNN for driver localization using a small labeled Roboflow dataset. This pivot introduced new challenges of model complexity and computational limits. As mentioned in the section above, due to dataset limitations, we decided at this point to build two models, one for driver detection and one for classifying the state of the driver.

Until Final Presentation

Our initial plan for our driver state classification model was to classify if the driver was attentive, inattentive, drowsy, or under the influence in one model. However, we ran into issues with developing and training a model with more than two classes largely due to the dataset limitations and imbalance when combining multiple datasets. Also, increasing model complexity for improved accuracy was not viable due to compute power limitations, as we trained everything on our personal machines. So we decided here to pivot towards training a model to classify the driver as attentive and facing the road, or inattentive, and develop a post classifier for determining if the driver was drowsy or under the influence.
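The pixel-counting idea behind the post classifier, described under Approach and Implementation, could look roughly like the sketch below. It is not our exact implementation. It assumes a grayscale crop, fixed eye and mouth regions given as (x, y, width, height), and two illustrative cues: an open eye shows some dark pixels (pupil and iris), and a yawning mouth shows many dark pixels. The function names, regions, and thresholds are all hypothetical and would need tuning on real data.

```python
from typing import List, Tuple

Image = List[List[int]]             # grayscale crop, values 0-255
Region = Tuple[int, int, int, int]  # (x, y, width, height), assumed layout


def dark_fraction(image: Image, region: Region, dark_below: int = 60) -> float:
    """Return the fraction of pixels in the region darker than dark_below."""
    x, y, w, h = region
    pixels = [p for row in image[y:y + h] for p in row[x:x + w]]
    if not pixels:
        return 0.0
    return sum(1 for p in pixels if p < dark_below) / len(pixels)


def looks_impaired(
    crop: Image,
    eye_region: Region,
    mouth_region: Region,
    eye_min_dark: float = 0.10,    # hypothetical threshold
    mouth_max_dark: float = 0.40,  # hypothetical threshold
) -> bool:
    """Flag closed eyes (too few dark pixels) or a yawn (too many)."""
    eyes_closed = dark_fraction(crop, eye_region) < eye_min_dark
    yawning = dark_fraction(crop, mouth_region) > mouth_max_dark
    return eyes_closed or yawning
```

Because the regions are fixed, a check like this only works when the driver crop is consistently framed, which is one reason the driver is localized first.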

Results

In this image the driver is facing forward but their eyes are closed, so we expect the post classifier to flag the driver as impaired, and it does:

In this image the driver is turned to the left, so we expect our attention classification model to detect inattentive, and it does:

In this image the driver is attentive, so we expect the driver to be marked as attentive and not impaired, and they are:

In this image the driver is inattentive looking at something on the left, so we again expect our attention classification model to detect inattentive, and it does:

Further ideas

If we had additional time to extend our completed system, we would gather better data and broaden our classifier from binary classification into a multiclass model capable of distinguishing not only attentiveness but also drowsiness and other types of impairment, so that we would only need one model end to end.

If this is not possible and we were to keep our pipeline, we would improve our post classifier to use Eye Aspect Ratio (EAR) and Mouth Aspect Ratio (MAR) so that it detects drowsiness and being under the influence more accurately.
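For reference, EAR is commonly computed from six landmarks around one eye as (|p2 - p6| + |p3 - p5|) / (2 * |p1 - p4|), where p1 and p4 are the eye corners and the other points lie on the upper and lower eyelids. The sketch below shows that calculation. It assumes the landmarks come from some facial landmark detector, which is not shown and which our current pipeline does not include. The 0.2 threshold is a hypothetical starting point, not a tuned value. MAR can be defined in a similar way on mouth landmarks.

```python
from math import dist
from typing import Sequence, Tuple

Point = Tuple[float, float]


def eye_aspect_ratio(landmarks: Sequence[Point]) -> float:
    """EAR from six eye landmarks ordered p1..p6.

    p1 and p4 are the horizontal eye corners; p2, p3 are on the upper
    eyelid and p6, p5 are the matching points on the lower eyelid.
    """
    p1, p2, p3, p4, p5, p6 = landmarks
    horizontal = dist(p1, p4)
    if horizontal == 0:
        raise ValueError("eye corners coincide; cannot compute EAR")
    vertical = dist(p2, p6) + dist(p3, p5)
    return vertical / (2.0 * horizontal)


def is_eye_closed(landmarks: Sequence[Point], threshold: float = 0.2) -> bool:
    """Treat the eye as closed when EAR drops below a hypothetical threshold."""
    return eye_aspect_ratio(landmarks) < threshold


if __name__ == "__main__":
    open_eye = [(0, 0), (1, 1), (2, 1), (3, 0), (2, -1), (1, -1)]
    print(is_eye_closed(open_eye))  # prints "False"
```

In practice, a drowsiness check would look at EAR over several consecutive frames rather than a single frame, so that normal blinks are not flagged.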

We also see an opportunity to evolve our existing inference pipeline into a fully real-time application by running inference on live webcam footage. Further polish could include a more sophisticated visualization interface, featuring color-coded state predictions, supplemental cues like EAR and MAR charts if we switch to using those, and exportable video clips that could serve as evidence after a crash. These enhancements would make our system more practical, interpretable, and deployable in real driving environments.

Sources and Data Links

Sources

Hou, Junjian, et al. “A Systematic Review for the Fatigue Driving Behavior Recognition Method.” Journal of Intelligent & Fuzzy Systems: Applications in Engineering and Technology, 18 Nov. 2023. Link to report
Liu, Fan, et al. “A Review of Driver Fatigue Detection and Its Advances on the Use of RGB-D Camera and Deep Learning.” Key Laboratory of Water Big Data Technology of Ministry of Water Resources, Hohai University, Nanjing, China, Nov. 2022. Link to report
Soultana, Abdelfettah, et al. “A Systematic Literature Review of Driver Inattention Monitoring Systems for Smart Car.” International Journal of Interactive Mobile Technologies (iJIM), Laboratory of Modeling and Information Technology, Faculty of Sciences Ben M’SIK, University Hassan II, 31 Aug. 2022. Link to report

Data

Data for face detection - Link
Dataset for classification - Link

Reports

Project Proposal - Link
Midterm Report - Link
