Synthesis AI for Driver Monitoring
Overview
Synthesis AI offers enhanced capabilities to simulate driver behavior in the car cabin across a broad set of settings and scenarios without compromising customer privacy. Beginning in 2022, all new cars entering the EU market must be equipped with advanced safety systems. Among the mandatory safety measures are distraction recognition and alert systems (more in our Synthetic Data for Safe Driving blog post).
We provide the ability to mimic driver behavior in virtual car environments so you can quickly test and iterate machine learning models across a broader set of settings and situations without building and deploying fleets of vehicles. This includes thousands of unique identities with granular control of emotion, gaze angle, head pose, accessories, gestures, lighting environments, camera systems (e.g., RGB, NIR, TOF), and more.
Since the data is synthetically generated, the image data comes with an expanded set of pixel-perfect labels, including 3D facial landmarks, 3D body landmarks, gaze angle, depth maps, sub-segmentation, and surface normals.
This document aims to provide copy-and-paste input examples to our API that serve many of the use-cases in the car-cabin scenario, including:
- Typical Camera Placements
- NIR Settings and lighting
- Eye Gaze Direction
- Closing Eyes, Yawning Mouths
- Head Turns and Body Poses
Workflow
The standard workflow for the Synthesis AI Face API is:
- Register Account, Download & Setup CLI
- Select Identities for the Face API with the Identities API
- Create Face API job with input JSON
- Download job output with CLI
- Parse outputs into your ML pipeline
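Job specs are plain JSON, so a quick pre-flight check before creating a job (step 3) can catch malformed input early. A minimal sketch in Python; the field names come from the examples later in this document, and the checks themselves are illustrative, not an official schema:

```python
import json

def sanity_check_job(spec: dict) -> list:
    """Return a list of problems found in a Face API job spec (empty list = looks OK)."""
    problems = []
    if spec.get("version") != 1:
        problems.append('expected "version": 1')
    humans = spec.get("humans") or []
    if not humans:
        problems.append("no 'humans' entries")
    for i, human in enumerate(humans):
        if not human.get("identities", {}).get("ids"):
            problems.append(f"humans[{i}]: no identity ids")
        if not human.get("camera_and_light_rigs"):
            problems.append(f"humans[{i}]: no camera rigs")
    return problems

spec = json.loads('{"version": 1, "humans": [{"identities": {"ids": [80]}, '
                  '"camera_and_light_rigs": [{"preset_name": "dashboard"}]}]}')
print(sanity_check_job(spec))  # []
```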
Hello Car World
Most machine learning tasks benefit from a wide distribution of human features. To make it easy to get started, the JSON provided contains the following features:
- 1 identity (selected via the Identities API), 3 renders per identity across 2 cameras (6 images total)
- Dashboard camera placement
- NIR & RGB dashboard cameras
- Eye gaze: angles within +/- 20 degrees (all images)
- Head Turn: Several combinations of pitch, yaw, roll together
To process the outputs more easily, we provide a Python library, `synthesisai`, along with output documentation.
Quickstart JSON for Car Interior
{
"version": 1,
"humans": [
{
"identities": {"ids": [80],"renders_per_identity": 3},
"3d_location": [{"specifications": {"type": "vehicle"}}],
"skin": {"highest_resolution": true},
"facial_attributes": {
"gaze": [{
"horizontal_angle": {
"type": "range",
"values": {"min": -20,"max": -15}
},
"vertical_angle": {
"type": "range",
"values": {"min": -15,"max": -10}
},
"percent": 100
}],
"head_turn": [{
"pitch": {"type": "range","values": {"min": -10,"max": 0}},
"yaw": {"type": "range","values": {"min": -10,"max": 10}},
"roll": {"type": "range","values": {"min": -10,"max": 0}},
"percent": 100
}]
},
"environment": {
"hdri": {"name": ["vatican_road"],
"intensity": { "type": "list","values": [1]},
"rotation": {"type": "list","values": [-120]}}
},
"camera_and_light_rigs": [
{
"type": "preset_location",
"preset_name": "dashboard",
"cameras": [
{
"name": "dashboard_rgb",
"specifications": {
"resolution_h": 1024,
"resolution_w": 1024,
"focal_length": {"type": "list","values": [30]},
"wavelength": "visible"
},
"relative_location": {
"y": {"type": "list","values": [0.25]},
"z": {"type": "list","values": [0.01]}
}
}
]
},{
"type": "preset_location",
"preset_name": "dashboard",
"cameras": [
{
"name": "dashboard_nir",
"specifications": {
"resolution_h": 1024,
"resolution_w": 1024,
"focal_length": {"type": "list","values": [30]},
"wavelength": "nir"
},
"relative_location": {
"y": {"type": "list","values": [0.25]},
"z": {"type": "list","values": [0.01]}
}
}
]
}
]
}
]
}
Visual examples of Hello Car World Outputs:
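Each camera in each rig produces one image per render, so the quickstart spec above yields 1 identity × 3 renders × 2 cameras = 6 images. A small helper to estimate the output size of any job (the one-image-per-camera-per-render assumption is ours; confirm against your actual job output):

```python
def expected_images(spec: dict) -> int:
    """Estimate output image count: identities x renders x total cameras across all rigs."""
    total = 0
    for human in spec["humans"]:
        n_ids = len(human["identities"]["ids"])
        renders = human["identities"].get("renders_per_identity", 1)
        n_cams = sum(len(rig.get("cameras", [])) for rig in human.get("camera_and_light_rigs", []))
        total += n_ids * renders * n_cams
    return total

quickstart = {"humans": [{"identities": {"ids": [80], "renders_per_identity": 3},
                          "camera_and_light_rigs": [{"cameras": [{"name": "dashboard_rgb"}]},
                                                    {"cameras": [{"name": "dashboard_nir"}]}]}]}
print(expected_images(quickstart))  # 6
```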
Camera Placement
Our API for in-car cabins allows for flexible camera system placement. Two common driver monitoring camera placements are the dashboard and rearview_mirror locations, so we provide an example of each as NIR images in the following JSON.
Note: we adjust the driver's seat position depending on the body height and weight of the incoming driver, just as a person would adjust their seat in real life.
JSON example of camera placement
{
"version": 1,
"humans": [
{
"identities": {"ids": [80],"renders_per_identity": 1},
"3d_location": [{"specifications": {"type": "vehicle"}}],
"skin": {"highest_resolution": true},
"environment": {
"hdri": {"name": ["vatican_road"],
"intensity": { "type": "list","values": [1]},
"rotation": {"type": "list","values": [-120]}}},
"camera_and_light_rigs": [
{
"type": "preset_location",
"preset_name": "dashboard",
"cameras": [
{
"name": "dashboard",
"specifications": {
"resolution_h": 1024,
"resolution_w": 1024,
"focal_length": {"type": "list","values": [22.61]},
"wavelength": "nir"
}
}
]
},{
"type": "preset_location",
"preset_name": "rearview_mirror",
"cameras": [
{
"name": "rearview_mirror",
"specifications": {
"resolution_h": 1024,
"resolution_w": 1024,
"focal_length": {"type": "list","values": [1.82]},
"sensor_width": {"type": "list","values": [8.46666]},
"wavelength": "nir"
}
}
]
}
]
}
]
}
Visual examples of driver monitoring camera placements:
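The two focal lengths above produce very different fields of view. Under the standard pinhole model, horizontal FOV = 2·atan(sensor_width / (2·focal_length)); the rearview_mirror camera (1.82 mm lens, 8.46666 mm sensor) therefore covers roughly 133 degrees, a wide-angle view of the cabin. The dashboard camera omits sensor_width, so its FOV depends on the API's default sensor size:

```python
import math

def horizontal_fov_deg(focal_length_mm: float, sensor_width_mm: float) -> float:
    """Horizontal field of view (degrees) of an ideal pinhole camera."""
    return math.degrees(2 * math.atan(sensor_width_mm / (2 * focal_length_mm)))

# The rearview_mirror camera from the JSON above:
print(round(horizontal_fov_deg(1.82, 8.46666)))  # 133
```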
NIR Settings
Most driver monitoring modules have at least one NIR emitter & sensor for monitoring the driver in nighttime conditions. Our API lets you customize the primary emitter location, as well as up to three additional emitter locations, to get pinpoint reflections off the front of the eye. The API also lets you customize the intensity and size of each emitter to fine-tune the look and feel to match your particular camera module.
For a single emitter, use the `"wavelength": "nir"` and `colocated_emitter` parameters.
For multiple NIR emitters, place additional white directional lights where the NIR emitters would be in the scene. An example JSON follows.
External HDRI settings can also be adjusted to match the camera system and daytime/nighttime conditions. Many NIR cameras do not see the outside world, so we recommend the `vignaioli` HDRI at intensity 0.1.
Finally, to have matching visual spectrum (RGB) and NIR images, please contact us (support@synthesis.ai) as we have beta tools to specifically match scenes and sensor settings.
2 NIR Emitter locations
"lights": [{
"type": "rect",
"color": {
"red": {
"type": "list",
"values": [255]
},
"blue": {
"type": "list",
"values": [255]
},
"green": {
"type": "list",
"values": [255]
}
},
"intensity": {
"type": "list",
"values": [0.7]
},
"wavelength": "nir",
"size_meters": {
"type": "list",
"values": [0.02]
},
"relative_location": {
"x": {
"type": "list",
"values": [-0.372]
},
"y": {
"type": "list",
"values": [-0.117]
},
"z": {
"type": "range",
"values": {
"min": 1,
"max": 1
}
},
"yaw": {
"type": "list",
"values": [-10]
},
"roll": {
"type": "list",
"values": [0]
},
"pitch": {
"type": "list",
"values": [-20]
}
}
},
{
"type": "rect",
"color": {
"red": {
"type": "list",
"values": [255]
},
"blue": {
"type": "list",
"values": [255]
},
"green": {
"type": "list",
"values": [255]
}
},
"intensity": {
"type": "list",
"values": [0.7]
},
"wavelength": "nir",
"size_meters": {
"type": "list",
"values": [0.05]
},
"relative_location": {
"x": {
"type": "list",
"values": [-0.372]
},
"y": {
"type": "list",
"values": [-0.117]
},
"z": {
"type": "range",
"values": {
"min": 1,
"max": 1
}
},
"yaw": {
"type": "list",
"values": [-3]
},
"roll": {
"type": "list",
"values": [0]
},
"pitch": {
"type": "list",
"values": [15]
}
}
}]
Eye Closure
Detecting when the driver's eyes close more frequently, or stay closed for extended periods of time, is key to any driver monitoring system. Our API provides expressions which allow for closure of the eyes.
These expressions are compatible with both head turns and glasses, which serve as effective confounds to train against.
Expressions which close the eyes include "eyes_closed", "eye_closed_max_left", "eye_closed_max_right", "sad", and "disgusted".
Eye Closure Expressions
{
"version": 1,
"humans": [
{
"identities": {"ids": [80],"renders_per_identity": 3},
"3d_location": [{"specifications": {"type": "vehicle"}}],
"skin": {"highest_resolution": true},
"facial_attributes": {
"expression": [{
"name": ["eyes_closed"],
"intensity": { "type": "list", "values": [0,0.5,1]},
"percent": 100
}]
},
"environment": {
"hdri": {"name": ["vatican_road"],
"intensity": { "type": "list","values": [1]},
"rotation": {"type": "list","values": [-120]}}},
"camera_and_light_rigs": [
{
"type": "preset_location",
"preset_name": "dashboard",
"cameras": [
{
"name": "dashboard",
"specifications": {
"resolution_h": 1024,
"resolution_w": 1024,
"focal_length": {"type": "list","values": [12]},
"sensor_width": {"type": "list","values": [8.46666]},
"wavelength": "nir"
},
"relative_location": {
"y": {"type": "list","values": [0.25]},
"z": {"type": "list","values": [0.01]}
}
}
]
}
]
}
]
}
Visual examples of eye closure:
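The intensity list above sweeps "eyes_closed" from neutral (0) to fully closed (1). To run the same sweep over several eye-closing expressions at once, the expression stanzas can be generated programmatically. A sketch; it assumes that multiple stanzas partition renders via their percent fields, as the glasses example later in this document does:

```python
def expression_sweep(names, intensities):
    """Build an 'expression' list with one stanza per expression, percent split evenly."""
    share, remainder = divmod(100, len(names))
    stanzas = []
    for i, name in enumerate(names):
        stanzas.append({
            "name": [name],
            "intensity": {"type": "list", "values": list(intensities)},
            # Give any leftover percent to the last stanza so the total is exactly 100.
            "percent": share + (remainder if i == len(names) - 1 else 0),
        })
    return stanzas

sweep = expression_sweep(["eyes_closed", "eye_closed_max_left", "eye_closed_max_right"], [0, 0.5, 1])
print(sum(s["percent"] for s in sweep))  # 100
```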
Mouth Openness
Detecting when the driver is yawning is also a key feature to train in driver monitoring systems. Our API provides expressions which allow for opening of the mouth.
These expressions are compatible with both head turns and face masks, which serve as effective confounds to train against.
Expressions which open the mouth include "aa", "ae_ax_ah", "ao", "aw", "ay", "disgusted", "d_t_n_19", "er_05", "ey_eh_uh_04", "happy", "h_12", "k_g_ng_20", "mouth_large_opened", "mouth_little_opened", "mouth_wide_opened", "opened_closedteeth_left", "opened_closedteeth_right", "opened_global_left", "opened_global_right", "revulsion", and "scared".
JSON example of mouth openness
{
"version": 1,
"humans": [
{
"identities": {"ids": [80],"renders_per_identity": 3},
"3d_location": [{"specifications": {"type": "vehicle"}}],
"skin": {"highest_resolution": true},
"facial_attributes": {
"expression": [{
"name": ["mouth_little_opened"],
"intensity": { "type": "list", "values": [0,0.5,1] },
"percent": 100
}]
},
"environment": {
"hdri": {"name": ["vatican_road"],
"intensity": { "type": "list","values": [1]},
"rotation": {"type": "list","values": [-120]}}},
"camera_and_light_rigs": [
{
"type": "preset_location",
"preset_name": "dashboard",
"cameras": [
{
"name": "dashboard",
"specifications": {
"resolution_h": 1024,
"resolution_w": 1024,
"focal_length": {"type": "list","values": [12]},
"sensor_width": {"type": "list","values": [8.46666]},
"wavelength": "nir"
},
"relative_location": {
"y": {"type": "list","values": [0.25]},
"z": {"type": "list","values": [0.01]}
}
}
]
}
]
}
]
}
Visual examples of driver monitoring mouth openness:
Gaze Direction
Detecting where a driver is looking is difficult under many lighting conditions, particularly when NIR is involved. Our API provides programmatic control of eye gaze, allowing limitless combinations of pupil direction, and each image comes with 3D gaze vectors and 3D landmarks.
These gazes are compatible with head turns, glasses, and NIR emitters, which all serve as effective confounds to train against.
JSON example of gaze direction
{
"version": 1,
"humans": [
{
"identities": {"ids": [80],"renders_per_identity": 4},
"3d_location": [{"specifications": {"type": "vehicle"}}],
"skin": {"highest_resolution": true},
"facial_attributes": {
"gaze": [{
"vertical_angle": { "type": "list", "values": [-25,0,25]},
"horizontal_angle": { "type": "list", "values": [-25,0,25]},
"percent": 100
}]
},
"environment": {
"hdri": {"name": ["vatican_road"],
"intensity": { "type": "list","values": [1]},
"rotation": {"type": "list","values": [-120]}}},
"camera_and_light_rigs": [
{
"type": "preset_location",
"preset_name": "dashboard",
"cameras": [
{
"name": "dashboard",
"specifications": {
"resolution_h": 1024,
"resolution_w": 1024,
"focal_length": {"type": "list","values": [12]},
"sensor_width": {"type": "list","values": [8.46666]},
"wavelength": "nir"
},
"relative_location": {
"y": {"type": "list","values": [0.25]},
"z": {"type": "list","values": [0.01]}
}
}
]
}
]
}
]
}
Visual examples of eye gaze variation:
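Each output image also carries a 3D gaze vector label. When sanity-checking labels against the requested angles, it helps to convert a horizontal/vertical angle pair into a unit vector. A sketch; the axis convention used here (+x right, +y up, +z out of the face) is an assumption for illustration, so consult the output documentation for the actual frame:

```python
import math

def gaze_vector(horizontal_deg: float, vertical_deg: float):
    """Unit gaze vector from horizontal/vertical gaze angles.
    Assumed convention: +x right, +y up, +z out of the face; (0, 0) is straight ahead."""
    h, v = math.radians(horizontal_deg), math.radians(vertical_deg)
    return (math.cos(v) * math.sin(h), math.sin(v), math.cos(v) * math.cos(h))

print(gaze_vector(0, 0))  # (0.0, 0.0, 1.0)
```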
Head Turn
Just as with eye gaze, head turn is key to understanding where a driver's attention is focused, and can also be a sign of drowsiness or even loss of consciousness. Our API provides programmatic control of head turn, allowing simulation of many scenarios, and each image comes with both face and body 3D landmarks.
These head turns are compatible with hats, glasses, and face masks, which serve as effective confounds to train against.
JSON example of head turn
{
"version": 1,
"humans": [
{
"identities": {"ids": [80],"renders_per_identity": 6},
"3d_location": [{"specifications": {"type": "vehicle"}}],
"skin": {"highest_resolution": true},
"facial_attributes": {
"head_turn": [{
"pitch": {"type": "list", "values": [-30,0,30]},
"yaw": {"type": "list", "values": [-30,0,30]},
"roll": {"type": "list", "values": [-10,0,10]},
"percent": 100
}]
},
"environment": {
"hdri": {"name": ["vatican_road"],
"intensity": { "type": "list","values": [1]},
"rotation": {"type": "list","values": [-120]}}},
"camera_and_light_rigs": [
{
"type": "preset_location",
"preset_name": "dashboard",
"cameras": [
{
"name": "dashboard",
"specifications": {
"resolution_h": 1024,
"resolution_w": 1024,
"focal_length": {"type": "list","values": [12]},
"sensor_width": {"type": "list","values": [8.46666]},
"wavelength": "nir"
},
"relative_location": {
"y": {"type": "list","values": [0.25]},
"z": {"type": "list","values": [0.01]}
}
}
]
}
]
}
]
}
Visual examples of driver monitoring head turns:
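Pitch, yaw, and roll together define a single head rotation. When comparing requested head turns against the 3D landmark labels, it is handy to build the corresponding rotation matrix. A sketch; the axis assignment and composition order (yaw, then pitch, then roll) are assumptions for illustration, not the API's documented convention:

```python
import math

def head_rotation(pitch_deg: float, yaw_deg: float, roll_deg: float):
    """3x3 rotation matrix, composed as roll(z) * pitch(x) * yaw(y)."""
    p, y, r = (math.radians(a) for a in (pitch_deg, yaw_deg, roll_deg))
    rx = [[1, 0, 0], [0, math.cos(p), -math.sin(p)], [0, math.sin(p), math.cos(p)]]
    ry = [[math.cos(y), 0, math.sin(y)], [0, 1, 0], [-math.sin(y), 0, math.cos(y)]]
    rz = [[math.cos(r), -math.sin(r), 0], [math.sin(r), math.cos(r), 0], [0, 0, 1]]

    def matmul(a, b):
        return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)] for i in range(3)]

    return matmul(rz, matmul(rx, ry))

# A pure 30-degree yaw leaves the vertical (y) axis fixed:
R = head_rotation(0, 30, 0)
print(round(R[1][1], 6))  # 1.0
```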
Putting it All Together
The great part about the API is that you get complete programmatic control. Take a look at our GitHub appendix for more samples.
This particular sample runs the gamut of camera placements, NIR and RGB settings, eye closure and mouth openness, head turns and gaze directions, as well as accessories, so you can pare it down to exactly what you need.
{
"version": 1,
"humans": [
{
"identities": {"ids": [80],"renders_per_identity": 1},
"3d_location": [{"specifications": {"type": "vehicle"}}],
"skin": {"highest_resolution": true},
"accessories": {
"glasses": [{
"style": ["all"],
"transparency": {"type":"list", "values": [1]},
"metalness": {"type":"list", "values": [0.03]},
"percent": 33
},{
"style": ["all"],
"transparency": {"type":"list", "values": [0.95]},
"metalness": {"type":"list", "values": [0.4]},
"percent": 33
},{
"style": ["none"],
"percent": 34
}]
},
"facial_attributes": {
"head_turn": [{
"pitch": {"type": "list", "values": [-30,0,30]},
"yaw": {"type": "list", "values": [-30,0,30]},
"roll": {"type": "list", "values": [-10,0,10]},
"percent": 100
}],
"gaze": [{
"vertical_angle": { "type": "list", "values": [-15, 0, 15]},
"horizontal_angle": { "type": "list", "values": [-15, 0, 15]},
"percent": 100
}],
"expression": [{
"name": ["eyes_closed","mouth_little_opened"],
"intensity": { "type": "list", "values": [0,0.5,1] },
"percent": 100
}]
},
"environment": {"hdri": {"name": ["vatican_road"],"intensity": { "type": "list","values": [1]},"rotation": {"type": "list","values": [-120]}}},
"camera_and_light_rigs": [
{
"type": "preset_location",
"preset_name": "dashboard",
"cameras": [
{
"name": "dashboard_rgb",
"specifications": {
"resolution_h": 1024,
"resolution_w": 1024,
"focal_length": {"type": "list","values": [30]},
"wavelength": "visible"
},
"relative_location": {
"y": {"type": "list","values": [0.25]},
"z": {"type": "list","values": [0.01]}
}
}
]
},{
"type": "preset_location",
"preset_name": "dashboard",
"cameras": [
{
"name": "dashboard_nir",
"specifications": {
"resolution_h": 1024,
"resolution_w": 1024,
"focal_length": {"type": "list","values": [30]},
"wavelength": "nir"
},
"relative_location": {
"y": {"type": "list","values": [0.25]},
"z": {"type": "list","values": [0.01]}
}
}
]
}
]
}
]
}
Visual examples of putting it all together:
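When one attribute lists several variants, as the three glasses stanzas above do, their percent fields should partition the renders. A small checker; the sum-to-100 requirement is our reading of the examples here, not a documented constraint:

```python
def percent_sums(human: dict) -> dict:
    """Map each attribute with percent-weighted stanzas to the sum of its percents."""
    groups = {}
    groups.update(human.get("accessories", {}))
    groups.update(human.get("facial_attributes", {}))
    return {attr: sum(s.get("percent", 0) for s in stanzas)
            for attr, stanzas in groups.items()
            if isinstance(stanzas, list)}

# The glasses stanzas from the combined example above:
human = {"accessories": {"glasses": [{"percent": 33}, {"percent": 33}, {"percent": 34}]}}
print(percent_sums(human))  # {'glasses': 100}
```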