[SWIFT] ARKit 4 LiDAR Depth API

ARKit 3.5, released alongside the new iPad Pro (the first iOS device to feature a LiDAR scanner), provided a reconstructed 3D mesh built using LiDAR, **but the depth data that must have been used in that computation could not be accessed**.


With **ARKit 4 / iOS 14 (and iPadOS 14), announced afterward, it finally became possible to acquire the depth measured by LiDAR**. This depth is apparently also called "scene depth" to distinguish it from conventional depth.

How to get LiDAR depth

Depth data derived from LiDAR can be obtained by specifying the .sceneDepth option in the frameSemantics property when using ARWorldTrackingConfiguration.

let session = ARSession()
let configuration = ARWorldTrackingConfiguration()

//Check if the sceneDepth option is available
if type(of: configuration).supportsFrameSemantics(.sceneDepth) {
   // Activate sceneDepth
   configuration.frameSemantics = .sceneDepth
}
session.run(configuration)

Depth data measured by LiDAR can be acquired from the sceneDepth property newly added to ARFrame.

func session(_ session: ARSession, didUpdate frame: ARFrame) {
   guard let depthData = frame.sceneDepth else { return }
   // Use depth data
}
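
The ARDepthData obtained here carries the measurements in its depthMap, a CVPixelBuffer of 32-bit float distances in meters, along with an optional per-pixel confidenceMap. As a minimal sketch continuing from the guard above (the buffer size in the comment is just what current devices happen to produce, not something guaranteed by the API):

// Continuing inside session(_:didUpdate:) after the guard above.
let depthMap = depthData.depthMap              // CVPixelBuffer of Float32 depth in meters
let width = CVPixelBufferGetWidth(depthMap)
let height = CVPixelBufferGetHeight(depthMap)
print("sceneDepth: \(width) x \(height)")      // e.g. 256 x 192 on current LiDAR devices

if let confidenceMap = depthData.confidenceMap {
   // Each pixel holds an ARConfidenceLevel raw value (.low / .medium / .high)
   print("confidence: \(CVPixelBufferGetWidth(confidenceMap)) x \(CVPixelBufferGetHeight(confidenceMap))")
}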

Available configurations

Since sceneDepth is specified via the frameSemantics property, which is defined on the ARConfiguration base class, the API allows it to be set on any configuration. However, as far as I can tell from the sceneDepth section of the API reference, it does not appear to be available in configurations other than world tracking.

https://developer.apple.com/documentation/arkit/arconfiguration/framesemantics/3516902-scenedepth

If you enable this option on a world-tracking configuration's frameSemantics, ARKit includes depth information for regions of the current frame's capturedImage. The world-tracking configuration exposes depth information in the sceneDepth property that it updates every frame.

Hardware constraints

Also from the sceneDepth reference page.

ARKit supports scene depth only on LiDAR-capable devices

There appears to be no requirement for an A12 or later chip. (Presumably any device that ships with LiDAR inevitably carries a high-performance chip anyway.)

Difference from conventional depth data

It has always been possible (under certain conditions) to access depth data when using ARKit: ARFrame has long had the capturedDepthData and estimatedDepthData properties. So how does the depth API added this time differ?

Depth data type and acquisition conditions

First of all, the type of the depth data is naturally different, and along with that, the conditions under which it can be obtained also differ. capturedDepthData is depth data derived from the TrueDepth camera and can only be acquired while using face tracking.

estimatedDepthData can be obtained when the personSegmentationWithDepth option is specified in frameSemantics. As the property name suggests, it is not depth derived from a dual camera or the TrueDepth camera but depth estimated by the framework, and it requires an A12 chip or later.

As shown above, sceneDepth can be obtained when .sceneDepth is specified in frameSemantics; the AR configuration must be world tracking and the device must have a LiDAR scanner.

Difference in frame rate / generation algorithm

Since capturedDepthData is depth data derived from the TrueDepth camera, it is updated less frequently than ARFrame's color data (the capturedImage property). When applying a depth-based effect, the depth sometimes could not keep up when the subject (the face, since this is face tracking) moved quickly.
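
One way to observe this is to compare ARFrame's timestamp with its capturedDepthDataTimestamp during a face-tracking session. A rough sketch (the exact update rates depend on the device, so I will not state fixed numbers):

func session(_ session: ARSession, didUpdate frame: ARFrame) {
   // During face tracking, the TrueDepth depth buffer is refreshed less often
   // than the color image, so these two timestamps frequently differ.
   let colorTime = frame.timestamp
   let depthTime = frame.capturedDepthDataTimestamp
   if depthTime < colorTime {
      print("Depth lags the color frame by \(colorTime - depthTime) s")
   }
}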

estimatedDepthData, on the other hand, is estimated using machine learning, which is why it requires the high-performance A12 chip or later, and it is updated at the same frame rate as the camera frames.

sceneDepth is also generated using a machine learning algorithm based on the depth data acquired by LiDAR and the color data acquired from the wide-angle camera. It runs at 60fps, which means it's also updated every time an ARFrame is available.

(From the "Explore ARKit 4" session at WWDC 2020)

The colored RGB image from the wide-angle camera and the depth ratings from the LiDAR scanner are fused together using advanced machine learning algorithms to create a dense depth map that is exposed through the API.

This operation runs at 60 times per second with the depth map available on every AR frame.

Difference in type

capturedDepthData is provided as AVDepthData, estimatedDepthData as a raw CVPixelBuffer, and sceneDepth as the new ARDepthData type.
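
To line the three up, here is a sketch of pulling each of them from the same ARFrame in the session delegate. Which ones are non-nil depends on the configuration and frameSemantics in use (note that the ARFrame property behind the estimated person depth is named estimatedDepthData):

func session(_ session: ARSession, didUpdate frame: ARFrame) {
   // Face tracking + TrueDepth camera
   let faceDepth: AVDepthData? = frame.capturedDepthData

   // .personSegmentationWithDepth: ML-estimated depth as a raw pixel buffer
   let personDepth: CVPixelBuffer? = frame.estimatedDepthData

   // .sceneDepth on a LiDAR device: depthMap + confidenceMap
   let lidarDepth: ARDepthData? = frame.sceneDepth

   // Which of these is non-nil depends on the configuration in use
   print(faceDepth != nil, personDepth != nil, lidarDepth != nil)
}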

ARDepthData

As mentioned above, the LiDAR-derived depth data exposed by the sceneDepth property comes as ARDepthData, a new class added in iOS 14.

https://developer.apple.com/documentation/arkit/ardepthdata?changes=latest_major
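
For example, here is a rough sketch of a helper (my own, not part of the API) that reads the distance in meters at the center of an ARDepthData depth map, assuming the usual 32-bit float depth format:

// Hypothetical helper: distance in meters at the center of the depth map.
// Assumes depthMap uses kCVPixelFormatType_DepthFloat32.
func centerDepth(of depthData: ARDepthData) -> Float {
   let depthMap = depthData.depthMap
   CVPixelBufferLockBaseAddress(depthMap, .readOnly)
   defer { CVPixelBufferUnlockBaseAddress(depthMap, .readOnly) }

   guard let base = CVPixelBufferGetBaseAddress(depthMap) else { return .nan }
   let width = CVPixelBufferGetWidth(depthMap)
   let height = CVPixelBufferGetHeight(depthMap)
   let rowBytes = CVPixelBufferGetBytesPerRow(depthMap)

   let row = base.advanced(by: (height / 2) * rowBytes)
                 .assumingMemoryBound(to: Float32.self)
   return row[width / 2]
}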

There is an official sample that makes good use of this class, so I will read through it and write a detailed explanation in a separate article.

LiDAR accuracy

Although it is not from WWDC 2020, a Tech Talk called "Advanced Scene Understanding in AR" was published around the time ARKit 3.5 was released, and the LiDAR scanner's measurement range is mentioned in it.

The new iPad Pro comes equipped with a LiDAR Scanner. This is used to determine distance by measuring at nanosecond speeds how long it takes for light to reach an object in front of you and reflect back. This is effective up to five meters away and operates both indoors and outdoors.

I have been asked several times how many meters the LiDAR can measure, and since I had the impression that this was never officially stated, I would answer, "You should actually try it with a sample." This quote is valuable official information that serves as a guideline, so I am noting it here.

personSegmentationWithDepth and scene depth

When the personSegmentationWithDepth option is specified in the frameSemantics property of ARConfiguration, scene depth is apparently obtained automatically as well, provided the device is capable of it.

let session = ARSession()
let configuration = ARWorldTrackingConfiguration()

// Set required frame semantics
let semantics: ARConfiguration.FrameSemantics = .personSegmentationWithDepth
       
// Check if configuration and device supports the required semantics
if type(of: configuration).supportsFrameSemantics(semantics) {
   // Activate .personSegmentationWithDepth
   configuration.frameSemantics = semantics
}
session.run(configuration)

Moreover, there is no additional power cost.

Additionally if you have an AR app that uses the people occlusion feature, and therefore the personSegmentationWithDepth frameSemantic, then you will automatically get sceneDepth on devices that support the sceneDepth frameSemantic with no additional power cost to your application.
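
In other words, on a LiDAR-equipped device, setting only personSegmentationWithDepth as above should still leave frame.sceneDepth populated. A quick sanity check might look like this (a sketch under that assumption):

func session(_ session: ARSession, didUpdate frame: ARFrame) {
   // Only .personSegmentationWithDepth was requested, but on a LiDAR device
   // sceneDepth should be filled in as well, at no extra power cost.
   if let sceneDepth = frame.sceneDepth {
      print("sceneDepth came along for free: \(CVPixelBufferGetWidth(sceneDepth.depthMap)) px wide")
   }
}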

smoothedSceneDepth

The sceneDepth API described above was already available at the time of WWDC 2020 (as of iOS 14 beta 1), but in iOS 14 beta 5 an API called smoothedSceneDepth suddenly appeared.

https://twitter.com/shu223/status/1295968108352479232

At the time (August 2020) I went to check the documentation right away, but no details had been written yet.

Looking at it now, the explanation has been filled in properly, and the difference from **sceneDepth** is also spelled out. The relevant passages are quoted below.

smoothedSceneDepth: ARConfiguration.FrameSemantics

Type property of ARConfiguration.FrameSemantics

https://developer.apple.com/documentation/arkit/arconfiguration/framesemantics/3674208-smoothedscenedepth

An option that provides the distance from the device to real-world objects, averaged across several frames.

Declaration

static var smoothedSceneDepth: ARConfiguration.FrameSemantics { get }

Discussion

Enable this option on a world-tracking configuration (ARWorldTrackingConfiguration) to instruct ARKit to provide your app with the distance between the user's device and the real-world objects pictured in the frame's capturedImage. ARKit samples this distance using the LiDAR scanner and provides the results through the smoothedSceneDepth property on the session's currentFrame.

It is explicitly stated here that the LiDAR scanner is used.

To minimize the difference in LiDAR readings across frames, ARKit processes the data as an average. The averaged readings reduce flickering to create a smoother motion effect when depicting objects with depth, as demonstrated in Creating a Fog Effect Using Scene Depth. Alternatively, to access a discrete LiDAR reading at the instant the framework creates the current frame, use sceneDepth.

This is the most important part: the difference from sceneDepth is clearly stated, along with the benefit and how to choose between them.

ARKit supports scene depth only on LiDAR-capable devices, so call supportsFrameSemantics(_:) to ensure device support before attempting to enable scene depth.
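
Putting the two reference pages together, enabling and reading the smoothed variant looks just like sceneDepth with the smoothedSceneDepth names swapped in. A sketch (enable either or both semantics depending on whether you want the temporally averaged map, the raw per-frame reading, or a comparison of the two):

let session = ARSession()
let configuration = ARWorldTrackingConfiguration()

// Check device support before enabling, as the reference recommends
if ARWorldTrackingConfiguration.supportsFrameSemantics(.smoothedSceneDepth) {
   configuration.frameSemantics.insert(.smoothedSceneDepth)
}
session.run(configuration)

// Later, in session(_:didUpdate:), frame.smoothedSceneDepth holds the
// temporally averaged ARDepthData; the raw per-frame reading is still
// available from frame.sceneDepth if the .sceneDepth semantic is also enabled.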

smoothedSceneDepth: ARDepthData

ARFrame's smoothedSceneDepth property

https://developer.apple.com/documentation/arkit/arframe/3674209-smoothedscenedepth

An average of distance measurements between a device's rear camera and real-world objects that creates smoother visuals in an AR experience.

Declaration

var smoothedSceneDepth: ARDepthData? { get }

Discussion

This property describes the distance between a device's camera and objects or areas in the real world, including ARKit's confidence in the estimated distance. This is similar to sceneDepth except that the framework smoothes the depth data over time to lessen its frame-to-frame delta.

This property is nil by default. Add the smoothedSceneDepth frame semantic to your configuration's frameSemantics to instruct the framework to populate this value with ARDepthData captured by the LiDAR scanner.

Call supportsFrameSemantics(_:) on your app's configuration to support smoothed scene depth on select devices and configurations.

Related

I'm writing a book on ARKit and a book on iOS Depth.

Practice ARKit (BOOTH)
Depth in Depth: Detailed iOS Depth (BOOTH)

Advent calendar

This article is the day-20 entry in the iOS Advent Calendar. As of December 23, 2020, the person who had signed up for that day had not posted anything, so after checking with them I posted this article in their place.
