[SWIFT] Reproducing "You are standing before the King of Laputa" with ARKit + SceneKit + Metal

I often see AR demos where 3D models emerge from walls and floors, so I tried it myself. The theme is "Laputa: Castle in the Sky": the scene where Muska and Sheeta appear from the ceiling in front of General Mouro in Laputa's observation room. (I'm not sure a room with a hole in the floor should be called an "observation room", but I follow the wording on the wiki.)

(Setting aside that the model is a red panda...) Finished result: demo.png, demo.gif

The hard part of the reproduction was the light yellow area at the boundary between the character and the ceiling (Laputa's technology may or may not really make the boundary light yellow, but I use light yellow throughout here). The reproduction method is explained below.

Reproduction method

① Animate the Muska and Sheeta nodes up and down.
② Create depth information (hereafter "depth") for drawing the light yellow surface at the ceiling boundary. Create the following three:
・ Depth of the ceiling boundary plane
・ Depth of the character drawn with cullMode = back
・ Depth of the character drawn with cullMode = front
③ Determine the character's cross section at the boundary surface from the information in ②, and add light yellow to the image from ①.

Rendering passes (Xcode Capture GPU Frame): renderpass.png

Let's look at each step below.

① Animate the Muska and Sheeta nodes up and down

This is set up in Xcode's Scene Editor.
・ Place the characters (the character model is borrowed from the WWDC 2017 SceneKit demo). The characters hang under a grouping node, char_parent.
・ Place the boundary surface node slice_plane at the same level as char_parent. This boundary surface node is not animated.

→ Make its color almost transparent. slice_plane_1.png
→ Lower its Rendering Order value so that it is drawn before the character; this keeps the part of the character behind the surface from being drawn. slice_plane_2.png
The Category bit mask is also set here. It is used later, when generating the depth, to distinguish the character from the boundary surface. Set 4 for the boundary surface and 2 for the character.

→ Set the animation. scene_animation.png
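The same settings can also be made in code instead of the Scene Editor. The following is a minimal sketch, not what I actually did: the nodes are hypothetical stand-ins created on the spot, and the transparency, movement distance, and duration are made-up values.

```swift
import SceneKit

// Hypothetical stand-ins for the nodes that are built in the Scene Editor.
let charParent = SCNNode()
charParent.name = "char_parent"
charParent.addChildNode(SCNNode(geometry: SCNSphere(radius: 0.1)))

let slicePlane = SCNNode(geometry: SCNPlane(width: 1.0, height: 1.0))
slicePlane.name = "slice_plane"

// Category bit masks: 4 for the boundary surface, 2 for the character.
// categoryBitMask is a per-node property, so it is set on every node
// in the character hierarchy.
slicePlane.categoryBitMask = 4
charParent.enumerateHierarchy { node, _ in
    node.categoryBitMask = 2
}

// Nearly transparent boundary surface, drawn before the character.
slicePlane.geometry?.firstMaterial?.transparency = 0.01
slicePlane.renderingOrder = -1

// Up-and-down animation (distance and duration are made-up values).
let down = SCNAction.moveBy(x: 0, y: -0.3, z: 0, duration: 3.0)
let loop = SCNAction.repeatForever(SCNAction.sequence([down, down.reversed()]))
charParent.runAction(loop)
```

Only slice_plane and char_parent are names from the scene; everything else here is for illustration.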

② Create depth information for drawing the light yellow surface at the ceiling boundary

Acquire the depth of the character's front side (the visible side) and the depth of its back side (the hidden side); the range between the two is the solid interior of the character.

  1. Get the depth of the character's back side → Specifying cullMode = front draws only the back faces (described later). The character's cross section has a larger depth value than the one obtained here.
  2. Get the depth of the character's front side → Specifying cullMode = back draws only the front faces (described later). The character's cross section has a smaller depth value than the one obtained here.
  3. Get the depth of the boundary surface (ceiling plane). The character's cross section is wherever the depth of this boundary surface falls between the depths from 1) and 2).
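The three conditions above boil down to one comparison per pixel. The following is a small CPU-side sketch in Swift of that comparison, using the same direction as the mix_fragment shader shown later (whose comments imply that a larger depth value is nearer to the camera). The function name and the sample values are made up for illustration.

```swift
/// Returns true when the boundary surface lies inside the character at one pixel.
/// - ds: depth of the boundary surface
/// - db: depth of the character's front faces (drawn with cullMode = back)
/// - df: depth of the character's back faces (drawn with cullMode = front)
func isCrossSection(ds: Float, db: Float, df: Float) -> Bool {
    return df < ds && ds < db
}

// Boundary surface between the back faces (0.3) and the front faces (0.7).
print(isCrossSection(ds: 0.5, db: 0.7, df: 0.3))   // true
// Boundary surface in front of the whole character.
print(isCrossSection(ds: 0.8, db: 0.7, df: 0.3))   // false
```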

Each of the three depths above is generated by multipass rendering with SCNTechnique. The multipass rendering definition is as follows.


{
    "targets" : {
        "color_scene"     : { "type" : "color" },
        "depth_slice"     : { "type" : "depth" },
        "depth_cullback"  : { "type" : "depth" },
        "depth_cullfront" : { "type" : "depth" }
    },
    "passes" : {
        "pass_scene" : {
            "draw"    : "DRAW_SCENE",
            "outputs" : {
                "color" : "color_scene"
            }
        },
        "pass_slice" : {
            "draw"                : "DRAW_NODE",
            "includeCategoryMask" : 4,
            "outputs" : {
                "depth" : "depth_slice"
            },
            "depthStates" : {
                "clear" : true,
                "func" : "less"
            }
        },
        "pass_cullback" : {
            "draw"                : "DRAW_NODE",
            "includeCategoryMask" : 2,
            "cullMode"            : "back",
            "outputs" : {
                "depth" : "depth_cullback"
            },
            "depthStates" : {
                "clear" : true,
                "func" : "less"
            }
        },
        "pass_cullfront" : {
            "draw"                : "DRAW_NODE",
            "includeCategoryMask" : 2,
            "cullMode"            : "front",
            "outputs" : {
                "depth" : "depth_cullfront"
            },
            "depthStates" : {
                "clear" : true,
                "func" : "less"
            }
        },
        "pass_mix" : {
            "draw"   : "DRAW_QUAD",
            "inputs" : {
                "colorScene"     : "color_scene",
                "depthSlice"     : "depth_slice",
                "depthCullBack"  : "depth_cullback",
                "depthCullFront" : "depth_cullfront"
            },
            "metalVertexShader"   : "mix_vertex",
            "metalFragmentShader" : "mix_fragment",
            "outputs" : {
                "color" : "COLOR"
            },
            "colorStates" : {
                "clear"      : "true",
                "clearColor" : "0.0 0.0 0.0 0.0"
            }
        }
    },
    "sequence" : [
        "pass_scene",
        "pass_slice",
        "pass_cullback",
        "pass_cullfront",
        "pass_mix"
    ]
}

Let's go through it piece by piece.

        "pass_scene" : {
            "draw"    : "DRAW_SCENE",
            "outputs" : {
                "color" : "color_scene"
            }
        },

This pass draws the entire scene. Specifying DRAW_SCENE for draw renders the camera image plus the characters. Only the color output is kept, stored in a buffer named color_scene.

        "pass_slice" : {
            "draw"                : "DRAW_NODE",
            "includeCategoryMask" : 4,
            "outputs" : {
                "depth" : "depth_slice"
            },
            "depthStates" : {
                "clear" : true,
                "func" : "less"
            }
        },

This pass draws the ceiling boundary surface. includeCategoryMask is 4, so only the boundary surface is drawn. No color is needed from this pass; only the depth is stored, in a buffer named depth_slice.
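As I understand it, a pass draws a node when the node's Category bit mask shares at least one bit with includeCategoryMask. The following is a plain Swift sketch of that selection rule; NodeInfo and nodesDrawn are hypothetical names for illustration and are not SceneKit API.

```swift
// Hypothetical stand-in for a scene node: just a name and its category bit mask.
struct NodeInfo {
    let name: String
    let categoryBitMask: Int
}

/// A node is included when its mask shares at least one bit with the pass mask.
func nodesDrawn(by includeCategoryMask: Int, in nodes: [NodeInfo]) -> [String] {
    return nodes
        .filter { ($0.categoryBitMask & includeCategoryMask) != 0 }
        .map { $0.name }
}

let nodes = [
    NodeInfo(name: "slice_plane", categoryBitMask: 4),
    NodeInfo(name: "char_parent", categoryBitMask: 2),
]
print(nodesDrawn(by: 4, in: nodes))   // ["slice_plane"]
print(nodesDrawn(by: 2, in: nodes))   // ["char_parent"]
print(nodesDrawn(by: 6, in: nodes))   // ["slice_plane", "char_parent"]
```

Using separate bits (2 and 4) rather than consecutive numbers is what lets each pass pick exactly one of the two.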

        "pass_cullback" : {
            "draw"                : "DRAW_NODE",
            "includeCategoryMask" : 2,
            "cullMode"            : "back",
            "outputs" : {
                "depth" : "depth_cullback"
            },
            "depthStates" : {
                "clear" : true,
                "func" : "less"
            }
        },

This pass acquires the depth of the character as seen from the front. includeCategoryMask is 2, so only the character is drawn. cullMode is back, which draws the visible faces and skips the hidden (back) faces (back is the default). No color is needed from this pass; only the depth is stored, in a buffer named depth_cullback.

        "pass_cullfront" : {
            "draw"                : "DRAW_NODE",
            "includeCategoryMask" : 2,
            "cullMode"            : "front",
            "outputs" : {
                "depth" : "depth_cullfront"
            },
            "depthStates" : {
                "clear" : true,
                "func" : "less"
            }
        },

This pass acquires the depth of the character's back side. It is the same as pass_cullback except that cullMode is front.

        "pass_mix" : {
            "draw"   : "DRAW_QUAD",
            "inputs" : {
                "colorScene"     : "color_scene",
                "depthSlice"     : "depth_slice",
                "depthCullBack"  : "depth_cullback",
                "depthCullFront" : "depth_cullfront"
            },
            "metalVertexShader"   : "mix_vertex",
            "metalFragmentShader" : "mix_fragment",
            "outputs" : {
                "color" : "COLOR"
            },
            "colorStates" : {
                "clear"      : "true",
                "clearColor" : "0.0 0.0 0.0 0.0"
            }
        }

This pass produces the final image: camera image + characters + character cross section. The outputs (color, depth) of the passes listed in inputs are combined into the final image by the mix_fragment fragment shader (described later), which is named in metalFragmentShader. Specifying COLOR for color in outputs draws the result to the screen.
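Because every pass refers to targets and other passes by name, a typo in the definition is easy to miss. The following is a rough sketch of a name check that could be run on the dictionary before handing it to SCNTechnique. techniqueProblems is a hypothetical helper, the embedded JSON is a cut-down example with two deliberate mistakes, and the check is simplified (it only handles inputs and outputs written as plain strings).

```swift
import Foundation

/// Returns human-readable problems found in a technique dictionary.
func techniqueProblems(in dict: [String: Any]) -> [String] {
    let passes = dict["passes"] as? [String: Any] ?? [:]
    let targets = dict["targets"] as? [String: Any] ?? [:]
    let symbols = dict["symbols"] as? [String: Any] ?? [:]
    let sequence = dict["sequence"] as? [String] ?? []
    var problems: [String] = []
    if sequence.isEmpty { problems.append("sequence is empty") }
    for name in sequence {
        guard let pass = passes[name] as? [String: Any] else {
            problems.append("sequence names unknown pass \(name)")
            continue
        }
        let outputs = pass["outputs"] as? [String: String] ?? [:]
        let inputs = pass["inputs"] as? [String: String] ?? [:]
        // "COLOR" and "DEPTH" are the built-in targets; inputs may also name symbols.
        for target in Array(outputs.values) + Array(inputs.values)
        where target != "COLOR" && target != "DEPTH"
            && targets[target] == nil && symbols[target] == nil {
            problems.append("pass \(name) uses undeclared target \(target)")
        }
    }
    return problems
}

let json = """
{
  "targets": { "depth_slice": { "type": "depth" } },
  "passes": {
    "pass_slice": {
      "draw": "DRAW_NODE",
      "includeCategoryMask": 4,
      "outputs": { "depth": "depth_slice" }
    },
    "pass_mix": {
      "draw": "DRAW_QUAD",
      "inputs": { "depthSlice": "depth_slice", "depthCullBack": "depth_cullback" },
      "outputs": { "color": "COLOR" }
    }
  },
  "sequence": ["pass_slice", "pass_mix", "pass_missing"]
}
"""
let dict = (try? JSONSerialization.jsonObject(with: Data(json.utf8))) as? [String: Any] ?? [:]
let problems = techniqueProblems(in: dict)
problems.forEach { print($0) }
```

With the sample above, the check reports the undeclared depth_cullback target and the unknown pass_missing entry.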

③ Determine the character's cross section at the boundary surface from the information in ②, and add light yellow to the image from ①

This is done in the mix_fragment shader mentioned above. As the comments in the source describe, it decides from the depth information whether to show light yellow, then adds that color to the color of the whole scene.

fragment half4 mix_fragment(MixColorInOut vert [[stage_in]],
                            constant SCNSceneBuffer& scn_frame [[buffer(0)]],  // Per-frame drawing information
                            texture2d<float, access::sample> colorScene [[texture(0)]],
                            depth2d<float,   access::sample> depthSlice [[texture(1)]],
                            depth2d<float,   access::sample> depthCullBack [[texture(2)]],
                            depth2d<float,   access::sample> depthCullFront [[texture(3)]])
{
    // Depth of the ceiling boundary surface
    float ds = depthSlice.sample(s, vert.uv);
    // Depth of the polygons facing the viewpoint
    float db = depthCullBack.sample(s, vert.uv);
    // Depth of the polygons facing away from the viewpoint
    float df = depthCullFront.sample(s, vert.uv);

    float4 sliceColor = float4(0.0, 0.0, 0.0, 0.0);
    if (df < ds) {
        // The boundary surface is in front of the character's back side
        if (ds < db) {
            // In addition, the boundary surface is behind the character's front side
            sliceColor = float4(0.5, 0.5, 0.0, 0.0);    // Light yellow
        }
    }
    // Add the boundary color to the whole scene image, including the camera image
    float4 fragment_color = colorScene.sample(s, fract(vert.uv));
    fragment_color += sliceColor;   // I think this is a rough way to do it, but I'm not familiar with color handling, so I'll review it if I have a chance.
    return half4(fragment_color);
}
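Since the shader simply adds the boundary color, a bright camera pixel can end up above 1.0 in the red and green channels. The following is a CPU-side Swift sketch of the arithmetic for two alternatives I have not tried in the shader: adding and then clamping, and mixing toward a tint. RGBA, addClamped, and blend are hypothetical names, and the pixel values and the tint are made up.

```swift
struct RGBA {
    var r, g, b, a: Float
}

func clamp01(_ x: Float) -> Float {
    return min(max(x, 0), 1)
}

/// Additive blend as in the shader, followed by a clamp to 0...1.
func addClamped(_ base: RGBA, _ tint: RGBA) -> RGBA {
    return RGBA(r: clamp01(base.r + tint.r),
                g: clamp01(base.g + tint.g),
                b: clamp01(base.b + tint.b),
                a: clamp01(base.a + tint.a))
}

/// Linear interpolation of the color channels toward the tint; alpha is kept.
func blend(_ base: RGBA, toward tint: RGBA, amount t: Float) -> RGBA {
    return RGBA(r: base.r + (tint.r - base.r) * t,
                g: base.g + (tint.g - base.g) * t,
                b: base.b + (tint.b - base.b) * t,
                a: base.a)
}

let camera = RGBA(r: 0.8, g: 0.8, b: 0.8, a: 1.0)       // made-up bright pixel
let sliceColor = RGBA(r: 0.5, g: 0.5, b: 0.0, a: 0.0)   // value used in mix_fragment
let added = addClamped(camera, sliceColor)
let mixed = blend(camera, toward: RGBA(r: 1.0, g: 1.0, b: 0.5, a: 1.0), amount: 0.5)
print(added, mixed)
```

The mix keeps the result inside 0...1 by construction, at the cost of darkening or lightening the underlying pixel rather than only brightening it.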

This is the end of the explanation.

I couldn't find a way to color the cross section where two geometries intersect by searching Google. Through trial and error I got something that looks roughly right, but this method creates only two depth buffers for the characters (front and back) in addition to the depth of the boundary surface. As a result, when one character is behind another, the depth of the rear character is overwritten by the depth of the front character, and the rear character's cross section is not drawn. There are probably better ways, so please let me know if you know one. Below are the things I investigated and tried along the way.
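The overlap problem can be reproduced with numbers. The following Swift sketch assumes that each depth buffer keeps only the surface nearest to the camera at a pixel and, as in the shader comparison, that a larger depth value is nearer. Solid, buffers, and all values are made up for illustration.

```swift
// Depths of one character's front and back faces at a single pixel.
struct Solid {
    let front: Float
    let back: Float
}

func isCrossSection(ds: Float, db: Float, df: Float) -> Bool {
    return df < ds && ds < db
}

/// Collapses all solids on one pixel into single front/back depths,
/// keeping only the nearest surface of each kind (larger value = nearer here).
func buffers(for solids: [Solid]) -> (db: Float, df: Float) {
    let db = solids.map { $0.front }.max() ?? 0
    let df = solids.map { $0.back }.max() ?? 0
    return (db, df)
}

let nearCharacter = Solid(front: 0.9, back: 0.8)
let rearCharacter = Solid(front: 0.5, back: 0.4)
let ds: Float = 0.45   // the boundary surface cuts through the rear character

let alone = buffers(for: [rearCharacter])
print(isCrossSection(ds: ds, db: alone.db, df: alone.df))   // true

let both = buffers(for: [nearCharacter, rearCharacter])
print(isCrossSection(ds: ds, db: both.db, df: both.df))     // false
```

With both characters on the pixel, the buffers hold only the near character's 0.9 and 0.8, so the rear character's cross section at 0.45 is missed.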

Methods I tried for the reproduction

  1. Run many hit tests to probe the shape of the character at the boundary surface
     Line up 100 calls to SCNNode's hitTestWithSegment(from:to:options:) along the boundary surface, hit-test from the front of the character toward the back and from the back toward the front to find the surface positions of the character, and build a cross section from them.
     → The accuracy of hitTestWithSegment was below what I expected, and the result differed somewhat from the shape of the geometry, so it was unusable. Especially in small areas such as the ears and feet, the hit positions deviated greatly from the visible shape. I don't think this usage is far from the method's intended purpose, though.

  2. Create the boundary surface geometry in real time
     ・ Flatten the character's geometry at the boundary surface and color that part light yellow.
       → When the geometry touches the boundary surface in several places, as with feet, flattening the geometry separately for the right foot and the left foot looks quite troublesome, so I haven't tried it. Also, with low-poly geometry the edge would probably look jagged, so the geometry might need to be subdivided by tessellation (?).
     ・ Create a new planar geometry where the character's geometry touches the boundary surface, and place it on the boundary surface.
       → Again, even if the vertices of the geometry near the boundary surface can be obtained, building a closed planar geometry from them looks troublesome (could normal information help?).
       → Searching for "mesh slicing" turned up several methods, but they looked difficult and I gave up.
         ・ Algorithm or software for slicing a mesh
         → UE4 apparently can do Mesh Slice in real time. SceneKit doesn't seem to have it.
         ・ https://unrealengine.hatenablog.com/entry/2016/09/12/002115
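For reference, the segment hit-test idea in method 1 can be sketched roughly as follows. This is not the code I used: crossSectionSamples is a hypothetical helper, the segment length of ±10 and the box used as a stand-in character are made-up values, and it assumes a horizontal boundary surface at height y with segments running along the z axis in the node's local coordinate space.

```swift
import SceneKit

/// Casts `count` parallel segments through `node` at height `y`, spread along x,
/// and returns the largest and smallest hit z for each segment that hits something.
func crossSectionSamples(of node: SCNNode, y: Float,
                         xRange: ClosedRange<Float>, count: Int) -> [(x: Float, maxZ: Float, minZ: Float)] {
    // Report every intersection, including back faces, not only the closest one.
    let options: [String: Any] = [
        SCNHitTestOption.backFaceCulling.rawValue: false,
        SCNHitTestOption.searchMode.rawValue: SCNHitTestSearchMode.all.rawValue
    ]
    var samples: [(x: Float, maxZ: Float, minZ: Float)] = []
    for i in 0..<count {
        let t = count > 1 ? Float(i) / Float(count - 1) : 0.5
        let x = xRange.lowerBound + (xRange.upperBound - xRange.lowerBound) * t
        let hits = node.hitTestWithSegment(from: SCNVector3(x, y, 10),
                                           to: SCNVector3(x, y, -10),
                                           options: options)
        let zs = hits.map { Float($0.worldCoordinates.z) }
        if let maxZ = zs.max(), let minZ = zs.min() {
            samples.append((x: x, maxZ: maxZ, minZ: minZ))
        }
    }
    return samples
}

// A unit box stands in for the character.
let box = SCNNode(geometry: SCNBox(width: 1, height: 1, length: 1, chamferRadius: 0))
let samples = crossSectionSamples(of: box, y: 0, xRange: -0.25...0.25, count: 3)
print(samples)
```

For a simple box this works well enough; the accuracy problem described above showed up on thin parts of the real model.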

Whole source code

・ Swift


import UIKit
import ARKit
import SceneKit

class ViewController: UIViewController, ARSCNViewDelegate {

    @IBOutlet weak var scnView: ARSCNView!
    private let device = MTLCreateSystemDefaultDevice()!
    private var charNode: SCNNode!
    private var isTouching = false      // Touch detection

    override func viewDidLoad() {
        super.viewDidLoad()
        // Load the character. Borrowed from the WWDC 2017 SceneKit demo https://developer.apple.com/videos/play/wwdc2017/604/
        guard let scene = SCNScene(named: "art.scnassets/scene.scn"),
              let charNode = scene.rootNode.childNode(withName: "char_node", recursively: true) else { return }
        self.charNode = charNode
        self.charNode.isHidden = true
        // Assumption: the loaded scene is attached to the view here (this line is missing from the listing)
        self.scnView.scene = scene
        // Set up the SCNTechnique
        self.setupSCNTechnique()
        // Start the AR session
        self.scnView.delegate = self
        let configuration = ARWorldTrackingConfiguration()
        configuration.planeDetection = [.horizontal]
        self.scnView.session.run(configuration, options: [.removeExistingAnchors, .resetTracking])
    }

    // Called every frame
    func renderer(_ renderer: SCNSceneRenderer, updateAtTime _: TimeInterval) {
        if isTouching {
            // The screen was touched
            isTouching = false
            DispatchQueue.main.async {
                // Skip if the characters are already displayed
                guard self.charNode.isHidden else { return }
                let bounds = self.scnView.bounds
                let screenCenter = CGPoint(x: bounds.midX, y: bounds.midY)
                let results = self.scnView.hitTest(screenCenter, types: [.existingPlaneUsingGeometry])
                guard let existingPlaneUsingGeometryResult = results.first(where: { $0.type == .existingPlaneUsingGeometry }),
                      let _ = existingPlaneUsingGeometryResult.anchor as? ARPlaneAnchor else {
                    // There is no plane at the center of the screen, so do nothing
                    return
                }
                // Place the Muska and Sheeta node at the center of the screen
                let position = existingPlaneUsingGeometryResult.worldTransform.columns.3
                self.charNode.simdPosition = SIMD3<Float>(position.x, position.y, position.z)
                self.charNode.isHidden = false
            }
        }
    }

    private func setupSCNTechnique() {
        guard let path = Bundle.main.path(forResource: "technique", ofType: "json") else { return }
        let url = URL(fileURLWithPath: path)
        guard let techniqueData = try? Data(contentsOf: url),
              let dict = try? JSONSerialization.jsonObject(with: techniqueData) as? [String: AnyObject] else { return }
        // Enable multipass rendering
        let technique = SCNTechnique(dictionary: dict)
        scnView.technique = technique
    }

    override func touchesBegan(_ touches: Set<UITouch>, with event: UIEvent?) {
        guard let _ = touches.first else { return }
        isTouching = true
    }
}

・ Metal

#include <metal_stdlib>
using namespace metal;
#include <SceneKit/scn_metal>

// Types passed from SceneKit to the shader
// For the definitions, see https://developer.apple.com/documentation/scenekit/scnprogram
struct VertexInput {
    float4 position [[attribute(SCNVertexSemanticPosition)]];   // Vertex coordinates
};

struct MixColorInOut {
    float4 position [[position]];
    float2 uv;
};

vertex MixColorInOut mix_vertex(VertexInput in [[stage_in]],
                                constant SCNSceneBuffer& scn_frame [[buffer(0)]])
{
    MixColorInOut out;
    out.position = in.position;
    // Convert coordinates from -1.0 ~ 1.0 to the 0.0 ~ 1.0 range. The y-axis is inverted
    // (y actually lands in -1.0 ~ 0.0 and is wrapped when sampling).
    out.uv = float2((in.position.x + 1.0) * 0.5 , (in.position.y + 1.0) * -0.5);
    return out;
}

// The rest of this sampler declaration is cut off in the listing. address::repeat and
// filter::nearest are assumed here (repeat lets the negative uv.y wrap into 0.0 ~ 1.0).
constexpr sampler s = sampler(coord::normalized,
                              address::repeat,
                              filter::nearest);

fragment half4 mix_fragment(MixColorInOut vert [[stage_in]],
                            constant SCNSceneBuffer& scn_frame [[buffer(0)]],  // Per-frame drawing information
                            texture2d<float, access::sample> colorScene [[texture(0)]],
                            depth2d<float,   access::sample> depthSlice [[texture(1)]],
                            depth2d<float,   access::sample> depthCullBack [[texture(2)]],
                            depth2d<float,   access::sample> depthCullFront [[texture(3)]])
{
    // Depth of the ceiling boundary surface
    float ds = depthSlice.sample(s, vert.uv);
    // Depth of the polygons facing the viewpoint
    float db = depthCullBack.sample(s, vert.uv);
    // Depth of the polygons facing away from the viewpoint
    float df = depthCullFront.sample(s, vert.uv);

    float4 sliceColor = float4(0.0, 0.0, 0.0, 0.0);
    if (df < ds) {
        // The boundary surface is in front of the character's back side
        if (ds < db) {
            // In addition, the boundary surface is behind the character's front side
            sliceColor = float4(0.5, 0.5, 0.0, 0.0);    // Light yellow
        }
    }
    // Add the boundary color to the whole scene image, including the camera image
    float4 fragment_color = colorScene.sample(s, fract(vert.uv));
    fragment_color += sliceColor;   // I think this is a rough way to do it, but I'm not familiar with color handling, so I'll review it if I have a chance.
    return half4(fragment_color);
}

