WebSpatial API for Spatialized HTML/CSS and Spatialized PWAs on spatial and multimodal AI devices

Dexter Yang, Ruoya Sheng, Siyaman Sadagopan (PICO OS, ByteDance)

TPAC 2025
Kobe, Japan & online
10–14 November 2025

Today's Agenda

  1. The Problems We're Facing
    (and What We Currently Have, Why That's Not Enough)
  2. What Solutions or APIs Are Required to Solve These Problems
  3. Discussion and To-Dos
  4. Real-World Practices

1. The Problems We're Facing
(and What We Currently Have, Why That's Not Enough)

Big Problem 1 of 3:
AI and Next-Gen Personal Computing Are Reshaping Apps, While Web Is Falling Behind

What's Happening?

Handheld Devices -> Wearable Devices -> Hands-free Wearable (Head-mounted)

Because AI is in such high demand, next-gen personal computing devices and operating systems need much better multimodal interaction.

AI Glasses

What you can see/hear, AI should also see/hear -- Hands-free. Always available. Multimodal Interactive Devices

Why is the next-gen personal device for AI basically a spatial computing device?

AR Glasses

Integrate with the surrounding physical space

This image shows the key point - it highlights why AI demands more from the web and introduces the emergence of spatial GUI environments.

MR Goggles (MR Headsets)

More well-rounded and near-term app needs we’re about to face.

All apps - not just 3D games - need to shift to spatial use cases

Spatialized 2D App

Examples for Spatialized 2D GUI

Spatial UI Benefit - Split
Split the Web Page,
Free the UI

Spatial UI Benefit - Depth
Elevate HTML Elements,
Unlock Depth

Spatial UI Benefit - Multiple Scenes
Multiple Scene Containers,
Native Power

Spatial UI Benefit - 3D Content
Add True 3D Content,
Blend Dimensions

Next-gen operating systems have a far greater need than mobile OSes for "install-free apps" based on open standards.

Instant Apps

* "Install-free apps" are large-scale and hard to catalog; they launch via a link, run on demand, are disposable by default, and can be upgraded to installed apps when needed

Why

The sole "super app" of the desktop era - the browser - is making a comeback, but in new forms

ChatGPT app is a browser

ChatGPT app

Chat boxes are replacing address bars.
Message feeds are replacing tabs.
Camera app is a browser

TikTok/Snapchat camera

XR see-through views are replacing address bars.
Window containers with spatial layout are replacing tabs.

"Install-free apps" are not inherently bound to the Open Web

We already saw this split happen in China's mobile internet market, where non-standard "mini-app" ecosystems inside super apps like WeChat haven't just taken over most native app needs, but have also pretty much wiped out the Open Web in China.

Mini Apps

What's the Problem

Apple and Google are extending their native 2D GUI frameworks and platform-specific app ecosystems to support experiences that go beyond screens, blend with spatial environments, and still meet mainstream app needs, whereas the Open Web today cannot satisfy these demands concurrently.

visionOS
  • Compatible iPhone and iPad apps -> visionOS apps
  • SwiftUI + RealityKit + ARKit
Android XR
  • XR compatible large screen app -> XR differentiated app
  • Jetpack Compose + SceneCore + ARCore

Switch to Siyaman

The Web might once again fall way behind native apps

The web fell behind during the paradigm shift from desktop to mobile.

Switch back to Ruoya

What We Currently Have and Why That's Not Enough

If the status quo persists, as AI/AR glasses and visionOS/Android XR devices take off, web devs will be forced to switch stacks - moving to native 2D GUI stacks or to hybrid ecosystems like React Native (React Native visionOS) / Mini-apps that diverge from the mainstream web.

Native

Development outcomes accumulate in closed, platform-exclusive walled gardens.

Native / React Native

Loss of the web's core advantages, like URLs, no-install, on-demand access.

Mini-apps

It neither inherits from nor integrates with the existing web ecosystem; it starts anew.

Big Problem 2 of 3:
Mainstream Web Stack Lacks New UI Capabilities for Spatial Apps

What's Happening?

Switch to Siyaman

Paradigm shift in XR OS:
From Compositor-based Architecture to Unified Rendering Architecture

Through the visionOS platform, Apple pioneered a Unified Rendering architecture and a Shared Space model for multi-app coexistence, setting a new standard in the industry.

Apple announced Shared Space at WWDC 2023

Shared Space

Multi-apps in Shared Space

Unified Rendering

Unified Rendering

With visionOS, Apple has established industry design patterns for spatial 2D+3D hybrid GUIs:

Patterns for 2D+3D GUI

Examples for natural interaction

What We Currently Have for Spatial UI and Unified Rendering

Switch to Ruoya

The current Immersive Web Working Group focuses on the WebXR API standard.

Immersive Web

Switch back to Siyaman

Why We Should Do More

Like OpenXR, the WebXR API takes over the XR device's full stereo view and the entire space, renders on its own using low-level 3D graphics APIs (WebGL/WebGPU), submits only final frames to the OS compositor, and requires building natural interaction from scratch.

What We Currently Have for Spatial UI and Unified Rendering

HTML/CSS/DOM are still screen-based today, even on devices that have no screen.

HTML/CSS/DOM

Why Screen-based HTML/CSS Isn't Enough

Current layout systems in HTML/CSS/DOM only support the X and Y axes.

Why Screen-based HTML/CSS Isn't Enough

All HTML elements are just flat 2D panels with no volume.

Why Screen-based HTML/CSS Isn't Enough

Only fixed, solid colors are available, and CSS styles can only be manually authored based on static device states (media queries)

Why Screen-based HTML/CSS Isn't Enough

Existing web window–related APIs are insufficient for spatial scene containers.

Why Screen-based HTML/CSS Isn't Enough

The web platform supports only low-level JS interaction events based on 2D positions, without natural gestures or spatial position tracking.

The spatial web features added to Safari on visionOS

Model Element 1

The spatial web features added to Safari on visionOS

Fullscreen API 1 Fullscreen API 2

Why We Should Do More

Model element

  • The <model> element can only render volumetric 3D content inside a "hole" in the page plane. It can't appear in the space in front of the page like a native SwiftUI app's Model3D view.
  • The 3D content in the <model> element can only come from pre-made 3D model files; you can't program it dynamically (no mainstream Web 3D engine features)

    Why We Should Do More

    Immersive Media

  • As with a WebXR session, you have to call the Fullscreen API to switch into a special mode to view spatial photos and videos, instead of viewing that spatial content right in the webpage window.

    Spatial Browsing

  • The Spatial Browsing capability introduced in Safari for visionOS 26 doesn't introduce or rely on new Web APIs, so it's merely an app-level, reader-mode-like feature available only on a small set of qualifying, article-centric pages.

    Spatial Browsing

    Why We Should Do More

    Spatial Browsing-like auto-conversion of 2D pages into spatial UIs

    Why We Should Do More

    Spatial features in traditional browser UIs, like Spatial Browsing:

    A Current Technology Bypasses Traditional Browser UI Limits

    PWA

    PWA on Quest

    Why We Should Do More

    PWA

    Big Problem 3 of 3:
    Web 3D is Still Hard and Lacks Developers for General Computing Needs

    What's Happening?

    Paradigm shift in XR Development: From 3D containing 2D to 2D containing 3D

    3D vs 2D

    Traditional 3D development: "3D containing 2D"

    3D containing 2D

    Spatial Development: "2D containing 3D"

    2D containing 3D

    2D containing 3D in SwiftUI

    What We Currently Have for 3D Development

    Why Canvas Isn't Enough

    The existing <canvas> element in HTML isn't really part of this new "2D containing 3D" paradigm.

    Why Web 3D Engines Aren't Enough

    Web 3D Engines

    Why WebXR Isn't Enough

    WebXR

    2. What Solutions or APIs Are Required to Solve These Problems

    Break WebXR's limits in Spatial UI and Spatial Development:

    Solution Pillar 1 of 3:
    A New Dev Paradigm Enabling Unified Rendering and Developer Friendliness

    We need a new XR development system, distinct from WebXR - one that completely avoids custom rendering, lets the OS understand the content, fits into the Unified Rendering architecture and Shared Space, and is designed for regular 2D web developers by building on the familiar mainstream web ecosystem and mindset. It should continue to offer ease of use and high efficiency, while still offering enough spatial and 3D development power.

    WebSpatial Logo

    Close the gaps with native 2D GUI frameworks and native spatial apps:

    Solution Pillar 2 of 3:
    Enabling Spatial UI in the Mainstream HTML/CSS Web Ecosystem

    The only solution is to introduce these spatial UI capabilities, which are currently exclusive to native 2D GUI frameworks, into the standardized, open, and mainstream HTML/CSS-based Web ecosystem at the earliest opportunity.

    WebSpatial Logo

    Systematically extend spatial web and spatial browsing trends toward a true "2D containing 3D" paradigm:

    Solution Pillar 3 of 3:
    Blend into the 2D HTML/CSS/DOM APIs with Minimal Additions

    A minimal but systematic HTML/CSS/JS API set, similar to SwiftUI's new APIs in visionOS, is required to extend existing 2D HTML/CSS/JS APIs specifically for spatial UI requirements. This extension should integrate seamlessly with the original 2D APIs, enabling spatial functionality to be implemented with precision and flexibility while preserving existing content and development experiences with minimal cost and disruption.

    WebSpatial Logo

    Proposed API 1

    Fix the issue that HTML/CSS layouts don't really have a Z-axis.

    Proposed API 1: Depth

    Depth Layout

    Code example of Depth Layout
    Example of Depth Layout

    Proposed API 1: Depth

    Spatial Transform

    Code example of spatial transform
    Example of spatial transform

    Proposed API 2

    Fix the issue where HTML/CSS only support fixed solid colors and you cannot remove the window background and border.

    Proposed API 2: Material

    Code example of material background
    Example of material background

    Proposed API 3

    Fix the problem that HTML cannot support natural interactions in 3D space.

    Proposed API 3: Spatial Events

    Example of spatial events

    Proposed API 4

    Fix the problem where enabling spatial UI in traditional browsers forces you into a special dedicated mode or makes you install a PWA first

    Proposed API 4: Install-free PWA

    Reuse and extend the standalone window experience provided by the Web App Manifest and PWAs, without sacrificing the web's advantage in delivering install-free apps.

    PWA Menu example

    Proposed API 5

    Fix the problem where current web window APIs don't support initializing spatial scene containers or handling different scene types.

    Proposed API 5: Spatial Scenes

    Initialize the Start Scene

    • Web App Manifest
      • "start_url"
      • "start_scene"
        • "type": "window" | "volume" | "stage"
        • "defaultSize"

    Create/Manage More Scenes

    • <a href={newSceneUrl} target="_blank">
    • <a href={newSceneUrl} target="newSceneName">
    • window.open(newSceneUrl);
    • window.open(newSceneUrl, "newSceneName");
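
As a rough illustration of the manifest fields listed above, the sketch below models a manifest with a "start_scene" entry and rejects unknown scene types. "start_scene" and "defaultSize" are the names proposed on this slide and are not part of today's Web App Manifest standard. The shape of "defaultSize" (width/height/depth) and the helper names `isSceneType` and `makeManifest` are assumptions made only for this example.

```typescript
// Scene types as listed on the slide.
type SceneType = "window" | "volume" | "stage";

// ASSUMPTION: the slide names "defaultSize" but does not define its shape.
interface DefaultSize {
  width: number;
  height: number;
  depth?: number;
}

interface StartScene {
  type: SceneType;
  defaultSize?: DefaultSize;
}

// Only the fields relevant to this sketch; a real manifest has many more.
interface SpatialManifest {
  name: string;
  start_url: string;
  start_scene: StartScene;
}

const SCENE_TYPES: SceneType[] = ["window", "volume", "stage"];

// Hypothetical helper: narrows an arbitrary string to a known scene type.
function isSceneType(value: string): value is SceneType {
  return (SCENE_TYPES as string[]).indexOf(value) !== -1;
}

// Hypothetical helper: builds a manifest object, rejecting unknown scene types.
function makeManifest(
  name: string,
  startUrl: string,
  type: string,
  size?: DefaultSize
): SpatialManifest {
  if (!isSceneType(type)) {
    throw new Error("Unknown scene type: " + type);
  }
  const startScene: StartScene = size ? { type, defaultSize: size } : { type };
  return { name, start_url: startUrl, start_scene: startScene };
}

const manifest = makeManifest("Example App", "/", "volume", {
  width: 1,
  height: 1,
  depth: 1,
});
console.log(JSON.stringify(manifest, null, 2));
```

Additional scenes would then be opened with the existing link and `window.open` forms listed above, with the scene name passed as the target.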

    Scene Initialization

    Scene Status (Size)

    Proposed API 6

    Expand what the existing model element can do, and fix the issue that HTML only has 2D elements and no real 3D elements

    Proposed API 6: 3D Containers

    3D Containers

    Proposed API 6: 3D Containers

    Example of 3D Containers

    Proposed API 7

    Fix the issue where the current canvas element can't serve as the bridge between the 2D and 3D worlds in a "2D containing 3D" development paradigm.

    Proposed API 7: 3D Engine Elements

    Example of 3D Engine Code

    Proposed API 7: 3D Engine Elements

    Asset Declaration

    • <asset>
      • or (<prefab>, <texture>)
      • with <source />
    • <material />
    • <div attachment="info">

    Entities with Built-in Components

    • Root: <world>
    • Primitives: <box>, <sphere>, <plane>, <cone>, <cylinder>
    • Attachments: <attachment name="info">
    • Models: <entity prefab="exampleModel" />

    Entity Properties for Built-in Components

    • Transform: <entity position="0 0 0" rotation="0 0 0" />
    • Enable Physics Engine
    • Enable Particle Effects

    ...
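
To make the declarative shape of these elements more concrete, here is a minimal sketch that describes a small scene as plain data and serializes it into markup using the element names from this slide (`<world>`, `<box>`, `<entity>`). The `SceneNode` type and the `render` function are hypothetical helpers for illustration, not part of the proposal. Combining `prefab`, `position`, and `rotation` on a single entity is also an assumption.

```typescript
// Hypothetical in-memory description of one element in the scene.
interface SceneNode {
  tag: string;
  attrs?: { [name: string]: string };
  children?: SceneNode[];
}

// Serializes a node and its children into indented markup.
// Attribute values are written as-is (no escaping) to keep the sketch short.
function render(node: SceneNode, indent: string = ""): string {
  const attrs = node.attrs || {};
  const attrText = Object.keys(attrs)
    .map((key) => " " + key + '="' + attrs[key] + '"')
    .join("");
  const children = node.children || [];
  if (children.length === 0) {
    return indent + "<" + node.tag + attrText + " />";
  }
  const inner = children
    .map((child) => render(child, indent + "  "))
    .join("\n");
  return (
    indent + "<" + node.tag + attrText + ">\n" +
    inner + "\n" +
    indent + "</" + node.tag + ">"
  );
}

// A root <world> holding one primitive and one model entity with a transform.
const scene: SceneNode = {
  tag: "world",
  children: [
    { tag: "box" },
    {
      tag: "entity",
      attrs: { prefab: "exampleModel", position: "0 1 0", rotation: "0 0 0" },
    },
  ],
};

const markup = render(scene);
console.log(markup);
```

The point of the sketch is that the scene is ordinary markup nested under a root element, so it can sit inside a 2D page the same way any other element tree does.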

    More missing APIs

    Explore cutting-edge APIs

    3. Discussion and To-Dos

    Switch to Ruoya and break for discussion

    4. Real-World Practices

    https://webspatial.dev

    Build an open-source SDK project called WebSpatial SDK based on existing declarative web frameworks and web build tools. This way, developers can start using WebSpatial APIs right away in HTML/CSS-style APIs like JSX and CSS-in-JS, without having to wait for browser engines to support them.

    Hybrid technology will be used to implement the WebSpatial Runtime across different spatial app platforms. On platforms where the WebSpatial Runtime can be integrated into the browser, web developers just need to run the site URL as a PWA to enable spatial UI. For platforms where the browser can't be modified, developers can add a packaging step to their workflow (similar to using Electron) to pre-package the site as a PWA, with the WebSpatial Runtime bundled in.

    https://github.com/webspatial/webspatial-sdk

    WebSpatial SDK Supports

    WebSpatial Builder provides full support for packaging, simulator-based debugging, real device testing, and submission to the visionOS App Store, eliminating the need to interact with Xcode or native app shells throughout the entire process.

    Demo - Techshop

    https://github.com/webspatial/sample-techshop

    Just by adding some CSS styles and a bit of if-else logic to an existing website, you can have a completely different spatialized UI on visionOS.
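
As a rough illustration of what "some CSS styles and a bit of if-else logic" might look like, the sketch below picks between a flat and a spatial style object for one component. Everything in it is hypothetical: the `spatial` flag, the `cardStyle` function, and the `--depth-offset` custom property are placeholders for illustration and are not WebSpatial SDK APIs.

```typescript
// Hypothetical environment flag; how a real app detects a spatial
// runtime is not specified here.
interface RenderEnv {
  spatial: boolean;
}

type StyleMap = { [property: string]: string };

// Styles shared by every platform.
const baseCardStyle: StyleMap = {
  padding: "16px",
  borderRadius: "12px",
};

// Returns the style for a product card: the ordinary 2D look by default,
// and a variant with extra placeholder spatial styles when env.spatial is true.
function cardStyle(env: RenderEnv): StyleMap {
  if (env.spatial) {
    return {
      ...baseCardStyle,
      background: "transparent",
      // Placeholder custom property standing in for a real depth style.
      "--depth-offset": "40px",
    };
  } else {
    return { ...baseCardStyle, background: "#ffffff" };
  }
}

console.log(cardStyle({ spatial: false }));
console.log(cardStyle({ spatial: true }));
```

The existing 2D styles stay untouched in the `else` branch, which is what keeps the change to an existing site small.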

    Techshop Demo

    Demo - Widget Generator

    https://github.com/webspatial/widget-generator

    Real App Store Apps Built with WebSpatial From the Developer Community

    https://youtu.be/VaoG2SETVKw?si=EVQa1hBmCMIMUwue

    Wait! The video plays automatically inline; don't click the YouTube link.

    Real App Store Apps Built with WebSpatial From the Developer Community

    https://youtu.be/6qTAYwV8B5M?si=vj05bJB8EfF1G4hI

    Cuire Demo

    Early Feedback and Data

    https://tpac2025.webspatial.dev