2 changes: 1 addition & 1 deletion README.md
Original file line number Diff line number Diff line change
Expand Up @@ -39,7 +39,7 @@ view inside the editor.

## Features

- Supports streaming both iOS simulators and Android emulators
- Supports streaming iOS simulators and Android emulators, including WebRTC audio
- Full simulator control & inspection using private iOS accessibility APIs and Android UIAutomator - available using `simdeck` CLI
- Real-time screen `describe` command using accessibility view tree - available in token-efficient format for agents
- Profiling built-in: CPU, memory, disk writes, network throughput, hang signals, and stack sampling
Expand Down
13 changes: 11 additions & 2 deletions docs/api/rest.md
Expand Up @@ -173,7 +173,7 @@ Performance query parameters:
| `GET` | `/api/simulators/{udid}/control` | Alias for input control WebSocket |
| `POST` | `/api/simulators/{udid}/refresh` | Request a fresh frame or keyframe |

For normal clients, copy the browser behavior instead of hand-coding a raw decoder. The UI supports WebRTC first and H.264 WebSocket fallback.
For normal clients, copy the browser behavior instead of hand-coding a raw decoder. The UI tries WebRTC first and falls back to H.264 over WebSocket. WebRTC carries H.264 video and, when the offer includes an audio receiver, a PCMU simulator-audio track sourced from the selected simulator or emulator process tree. The H.264 WebSocket fallback is video-only.

Minimal WebRTC request:

Expand All @@ -194,7 +194,16 @@ Response:
```json
{
"type": "answer",
"sdp": "v=0..."
"sdp": "v=0...",
"audio": {
"codec": "PCMU",
"sampleRate": 8000,
"channels": 1
},
"video": {
"width": 1179,
"height": 2556
}
}
```
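
A client might read the optional `audio` and `video` metadata from the answer along these lines. This is a minimal sketch that assumes only the field names shown in the response above; `StreamAnswer` and `summarizeAnswer` are hypothetical names, not part of the SimDeck client, and both metadata objects are treated as optional.

```typescript
// Hypothetical shape mirroring the documented answer payload.
interface StreamAnswer {
  type: string;
  sdp: string;
  audio?: { codec?: string; sampleRate?: number; channels?: number };
  video?: { width?: number; height?: number };
}

// Build a short, human-readable summary of the negotiated media.
// Missing metadata is reported rather than treated as an error.
function summarizeAnswer(answer: StreamAnswer): string {
  const video =
    answer.video?.width && answer.video?.height
      ? `${answer.video.width}x${answer.video.height}`
      : "unknown size";
  const audio = answer.audio?.codec
    ? `${answer.audio.codec} ${answer.audio.sampleRate ?? "?"} Hz x${answer.audio.channels ?? "?"}`
    : "no audio";
  return `video ${video}, audio ${audio}`;
}

const example: StreamAnswer = JSON.parse(`{
  "type": "answer",
  "sdp": "v=0...",
  "audio": { "codec": "PCMU", "sampleRate": 8000, "channels": 1 },
  "video": { "width": 1179, "height": 2556 }
}`);

console.log(summarizeAnswer(example));
// prints "video 1179x2556, audio PCMU 8000 Hz x1"
```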

Expand Down
15 changes: 15 additions & 0 deletions docs/guide/video.md
Expand Up @@ -4,6 +4,10 @@ SimDeck streams live device video to the browser. Local sessions default to high

iOS simulator H.264 uses VideoToolbox for hardware encoding and x264 for software encoding.

WebRTC streams also include simulator audio. The browser menu exposes a Sound
toggle so viewers can keep playback muted until they want to hear the device.
H.264 WebSocket fallback remains video-only.

## When encoding runs

SimDeck starts encoding when a browser stream needs H.264 frames. The server
Expand Down Expand Up @@ -73,6 +77,17 @@ simdeck service restart --video-codec software --low-latency

The browser tries WebRTC first. If WebRTC cannot render a frame, the UI can fall back to H.264 over WebSocket when the browser supports WebCodecs.

Audio is carried on the WebRTC path using a browser-compatible PCMU track. On
macOS 14.2 and newer, SimDeck uses Core Audio process taps over the selected
simulator or emulator process tree, then routes that tap through a private
aggregate device into the WebRTC audio track. If macOS has not granted system
audio recording access, video still streams and the server logs the
audio-capture failure. While the tap is being read, Core Audio mutes the tapped
simulator process at the hardware output; browser playback is controlled by the
Sound toggle. SimDeck launches Android emulators with host audio enabled, so
restart any older emulator process that was started without audio before
testing Android sound.
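
For a sense of scale: PCMU (G.711 μ-law) stores one byte per sample, so the 8000 Hz mono track reported in the REST answer works out to 64 kbit/s before RTP overhead. The sketch below only does that arithmetic. The 20 ms packet duration is an assumption (a common RTP default), not something SimDeck documents, and both function names are hypothetical.

```typescript
// PCMU (G.711 mu-law) encodes each audio sample as a single byte.
const BYTES_PER_SAMPLE = 1;

// Raw audio bitrate in bits per second, excluding RTP/UDP/IP headers.
function pcmuBitrateBps(sampleRate: number, channels: number): number {
  return sampleRate * channels * BYTES_PER_SAMPLE * 8;
}

// Payload bytes carried by one packet of the given duration.
// The packet duration is an assumed value, not a documented SimDeck setting.
function pcmuPayloadBytes(
  sampleRate: number,
  channels: number,
  packetMs: number,
): number {
  return (sampleRate * channels * BYTES_PER_SAMPLE * packetMs) / 1000;
}

console.log(pcmuBitrateBps(8000, 1)); // 64000
console.log(pcmuPayloadBytes(8000, 1, 20)); // 160
```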

Force a mode while debugging:

```text
Expand Down
21 changes: 20 additions & 1 deletion packages/client/src/app/AppShell.tsx
Expand Up @@ -54,7 +54,10 @@ import {
simulatorUsesInsetChromeButtons,
} from "../features/simulators/simulatorDisplay";
import { useSimulatorList } from "../features/simulators/useSimulatorList";
import { sendWebRtcControlMessage } from "../features/stream/streamWorkerClient";
import {
sendWebRtcControlMessage,
setActiveStreamAudioMuted,
} from "../features/stream/streamWorkerClient";
import type {
StreamConfig,
StreamEncoder,
Expand Down Expand Up @@ -560,6 +563,8 @@ export function AppShell({
const [streamTransport, setStreamTransport] = useState<StreamTransport>(
initialStreamTransportRef.current,
);
const [streamAudioMuted, setStreamAudioMuted] = useState(true);
const streamAudioMutedRef = useRef(streamAudioMuted);
const [streamConfigApplyKey, setStreamConfigApplyKey] = useState(0);
const [streamConfigReady, setStreamConfigReady] = useState(false);
const [touchIndicators, setTouchIndicators] = useState<TouchIndicator[]>([]);
Expand Down Expand Up @@ -812,6 +817,7 @@ export function AppShell({
streamBackend,
streamCanvasKey,
} = useLiveStream({
audioMuted: streamAudioMuted,
canvasElement: streamCanvasElement,
paused: !streamConfigReady,
remote: remoteStream,
Expand Down Expand Up @@ -877,6 +883,17 @@ export function AppShell({
[remoteStream],
);

const toggleStreamAudioMuted = useCallback(() => {
const next = !streamAudioMutedRef.current;
streamAudioMutedRef.current = next;
setActiveStreamAudioMuted(next);
setStreamAudioMuted(next);
}, []);

useEffect(() => {
streamAudioMutedRef.current = streamAudioMuted;
}, [streamAudioMuted]);

useEffect(() => {
if (
!selectedSimulator ||
Expand Down Expand Up @@ -2931,6 +2948,7 @@ export function AppShell({
onStreamFpsChange={updateStreamFps}
onStreamQualityChange={updateStreamQuality}
onStreamTransportChange={updateStreamTransport}
onToggleStreamAudioMuted={toggleStreamAudioMuted}
onShutdown={() => {
if (!selectedSimulator) {
return;
Expand Down Expand Up @@ -2989,6 +3007,7 @@ export function AppShell({
!selectedSimulatorTransitionKind,
)}
streamConfig={streamConfig}
streamAudioMuted={streamAudioMuted}
streamTransport={streamTransport}
simulatorMenuOpen={simulatorMenuOpen}
simulatorMenuRef={simulatorMenuRef}
Expand Down
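
Review note on the `AppShell.tsx` change: `toggleStreamAudioMuted` keeps a ref next to the React state so the handler can read the latest value and update the active stream synchronously, inside the click itself. A framework-free sketch of that ordering follows. `MuteToggleModel` and its fields are hypothetical stand-ins, and unlike real React state, `rendered` is updated immediately here.

```typescript
// Hypothetical model of the toggle:
//   latest   stands in for streamAudioMutedRef.current
//   rendered stands in for the streamAudioMuted React state
//   applied  records the values that would reach setActiveStreamAudioMuted
class MuteToggleModel {
  private latest = true; // streams start muted
  rendered = true;
  readonly applied: boolean[] = [];

  toggle(): void {
    const next = !this.latest;
    this.latest = next; // synchronous copy, safe to read on the next click
    this.applied.push(next); // applied to the stream inside the click handler
    this.rendered = next; // UI state follows
  }
}

const model = new MuteToggleModel();
model.toggle(); // unmute
model.toggle(); // mute again, without waiting for a re-render in between
console.log(model.applied); // [false, true]
```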
12 changes: 12 additions & 0 deletions packages/client/src/features/simulators/SimulatorMenu.tsx
Expand Up @@ -34,6 +34,7 @@ interface SimulatorMenuProps {
onStreamFpsChange: (fps: StreamFps) => void;
onStreamQualityChange: (quality: StreamQualityPreset) => void;
onStreamTransportChange: (transport: StreamTransport) => void;
onToggleStreamAudioMuted: () => void;
onToggleAppearance: () => void;
onToggleDebug: () => void;
onToggleMenu: () => void;
Expand All @@ -47,6 +48,7 @@ interface SimulatorMenuProps {
showBootButton: boolean;
showStopButton: boolean;
streamConfig: StreamConfig;
streamAudioMuted: boolean;
streamTransport: StreamTransport;
touchOverlayVisible: boolean;
}
Expand Down Expand Up @@ -74,6 +76,7 @@ export function SimulatorMenu({
onStreamFpsChange,
onStreamQualityChange,
onStreamTransportChange,
onToggleStreamAudioMuted,
onToggleAppearance,
onToggleDebug,
onToggleMenu,
Expand All @@ -87,6 +90,7 @@ export function SimulatorMenu({
showBootButton,
showStopButton,
streamConfig,
streamAudioMuted,
streamTransport,
touchOverlayVisible,
}: SimulatorMenuProps) {
Expand Down Expand Up @@ -200,6 +204,14 @@ export function SimulatorMenu({
)}
</select>
</label>
<label className="menu-toggle">
<input
checked={!streamAudioMuted}
onChange={() => onToggleStreamAudioMuted()}
type="checkbox"
/>
<span>Sound</span>
</label>
</div>
<div className="menu-divider" />
<div className="menu-actions">
Expand Down
1 change: 1 addition & 0 deletions packages/client/src/features/stream/streamTypes.ts
@@ -1,6 +1,7 @@
import type { Size } from "../viewport/types";

export interface StreamConnectTarget {
audioMuted?: boolean;
clientId?: string;
platform?: string;
remote?: boolean;
Expand Down
106 changes: 103 additions & 3 deletions packages/client/src/features/stream/streamWorkerClient.ts
Expand Up @@ -97,6 +97,10 @@ export function sendWebRtcStreamControl(options: {
);
}

export function setActiveStreamAudioMuted(muted: boolean) {
activeStreamClient?.setAudioMuted(muted);
}

function sendStreamQualityConfig(config: StreamConfig): boolean {
const encoded = JSON.stringify({
config: streamQualityPayload(config),
Expand Down Expand Up @@ -230,6 +234,7 @@ function compareVideoToImage(
export function buildStreamTarget(
udid: string,
options: {
audioMuted?: boolean;
clientId?: string;
platform?: string;
remote?: boolean;
Expand All @@ -238,6 +243,7 @@ export function buildStreamTarget(
} = {},
): StreamConnectTarget {
return {
audioMuted: options.audioMuted,
clientId: options.clientId,
platform: options.platform,
remote: options.remote,
Expand Down Expand Up @@ -290,6 +296,7 @@ interface StreamClientBackend {
disconnect(): void;
applyStreamConfig?(config?: StreamConfig): void | Promise<void>;
sendControl?(payload: unknown): boolean;
setAudioMuted?(muted: boolean): void;
}

export interface VisualArtifactSample {
Expand Down Expand Up @@ -389,6 +396,11 @@ interface WebCodecsVideoDecoderConstructor {
}

interface WebRtcAnswerPayload extends RTCSessionDescriptionInit {
audio?: {
channels?: number;
codec?: string;
sampleRate?: number;
};
video?: {
height?: number;
width?: number;
Expand Down Expand Up @@ -1295,6 +1307,8 @@ function hexByte(byte: number): string {
}

class WebRtcStreamClient implements StreamClientBackend {
private audioElement: HTMLAudioElement | null = null;
private audioMuted = true;
private animationFrame = 0;
private canvas: HTMLCanvasElement | null = null;
private canvasContext: CanvasRenderingContext2D | null = null;
Expand Down Expand Up @@ -1408,6 +1422,7 @@ class WebRtcStreamClient implements StreamClientBackend {
this.shouldReconnect = true;
this.remoteMode = Boolean(target.remote);
this.streamTarget = target;
this.audioMuted = target.audioMuted ?? true;
if (!wasReconnecting) {
this.reconnectDelayMs = WEBRTC_RECONNECT_BASE_DELAY_MS;
}
Expand Down Expand Up @@ -1435,6 +1450,14 @@ class WebRtcStreamClient implements StreamClientBackend {
const useRgbaTransport = shouldUseLocalAndroidRgbaWebRtc(target);
this.rgbaMode = useRgbaTransport;
this.attachDiagnostics(peerConnection, target, generation);
const audioTransceiver = peerConnection.addTransceiver("audio", {
direction: "recvonly",
});
configureAudioReceiverCodecPreferences(audioTransceiver);
configureLowLatencyReceiver(
audioTransceiver.receiver,
receiverBufferSeconds(target),
);
if (!useRgbaTransport) {
this.startReceiverStatsPolling(peerConnection, target, generation);
const transceiver = peerConnection.addTransceiver("video", {
Expand Down Expand Up @@ -1485,17 +1508,21 @@ class WebRtcStreamClient implements StreamClientBackend {
};

peerConnection.ontrack = (event) => {
if (useRgbaTransport) {
if (generation !== this.connectGeneration) {
return;
}
if (generation !== this.connectGeneration) {
if (event.track.kind === "audio") {
this.attachAudioTrack(event.track, generation);
return;
}
if (useRgbaTransport || event.track.kind !== "video") {
return;
}
event.track.contentHint = "motion";
for (const receiver of peerConnection.getReceivers()) {
configureLowLatencyReceiver(receiver, receiverBufferSeconds(target));
}
const stream = event.streams[0] ?? new MediaStream([event.track]);
const stream = new MediaStream([event.track]);
const video = document.createElement("video");
video.autoplay = true;
video.className = "stream-video";
Expand Down Expand Up @@ -1606,6 +1633,19 @@ class WebRtcStreamClient implements StreamClientBackend {
return sendDataChannelMessage(this.controlChannel, JSON.stringify(payload));
}

setAudioMuted(muted: boolean) {
this.audioMuted = muted;
if (!this.audioElement) {
return;
}
this.audioElement.muted = muted;
if (!muted) {
void this.audioElement.play().catch(() => {
// Some browsers only allow unmuted playback after a user gesture in the page,
// so a rejected play() is ignored here.
});
}
}

async applyStreamConfig(config?: StreamConfig) {
if (!config) {
return;
Expand Down Expand Up @@ -1703,6 +1743,12 @@ class WebRtcStreamClient implements StreamClientBackend {
this.video.remove();
}
this.video = null;
this.audioElement?.pause();
if (this.audioElement) {
this.audioElement.srcObject = null;
this.audioElement.remove();
}
this.audioElement = null;
this.reportedVideoHeight = 0;
this.reportedVideoWidth = 0;
this.controlChannel?.close();
Expand Down Expand Up @@ -2122,6 +2168,36 @@ class WebRtcStreamClient implements StreamClientBackend {
}
}

private attachAudioTrack(track: MediaStreamTrack, generation: number) {
this.audioElement?.pause();
if (this.audioElement) {
this.audioElement.srcObject = null;
this.audioElement.remove();
}
const audio = document.createElement("audio");
audio.autoplay = true;
audio.muted = this.audioMuted;
audio.preload = "auto";
audio.srcObject = new MediaStream([track]);
audio.style.display = "none";
document.body.appendChild(audio);
this.audioElement = audio;
const startPlayback = () => {
if (
generation !== this.connectGeneration ||
audio !== this.audioElement
) {
return;
}
void audio.play().catch(() => {
// Muted autoplay is best effort; unmuting from the menu retries playback.
});
};
audio.addEventListener("loadedmetadata", startPlayback);
audio.addEventListener("canplay", startPlayback);
startPlayback();
}
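
Review note: `attachAudioTrack` ignores playback callbacks that belong to an older connection by comparing a captured `generation` with the current one. The same guard, reduced to a DOM-free sketch, looks like this. `GenerationGuard` is a hypothetical helper for illustration, not a class in this PR.

```typescript
// Hypothetical minimal model of a connection that can be restarted.
class GenerationGuard {
  private generation = 0;

  // Call when a new connection attempt starts; invalidates older callbacks.
  begin(): number {
    this.generation += 1;
    return this.generation;
  }

  // Wrap a callback so it only runs while its generation is still current.
  wrap(generation: number, callback: () => void): () => void {
    return () => {
      if (generation !== this.generation) {
        return; // stale: a newer connection has replaced this one
      }
      callback();
    };
  }
}

const guard = new GenerationGuard();
const log: string[] = [];

const first = guard.wrap(guard.begin(), () => log.push("first"));
first(); // runs: generation 1 is current

const second = guard.wrap(guard.begin(), () => log.push("second"));
first(); // ignored: generation 1 is now stale
second(); // runs: generation 2 is current

console.log(log); // ["first", "second"]
```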

private attachRgbaDataChannel(channel: RTCDataChannel, generation: number) {
this.rgbaChannel?.close();
this.rgbaChannel = channel;
Expand Down Expand Up @@ -2756,6 +2832,26 @@ function configureReceiverCodecPreferences(transceiver: RTCRtpTransceiver) {
]);
}

function configureAudioReceiverCodecPreferences(
transceiver: RTCRtpTransceiver,
) {
if (!transceiver.setCodecPreferences) {
return;
}
const capabilities = RTCRtpReceiver.getCapabilities("audio");
const codecs = capabilities?.codecs ?? [];
const preferred = codecs.filter(
(codec) => codec.mimeType.toLowerCase() === "audio/pcmu",
);
if (preferred.length === 0) {
return;
}
transceiver.setCodecPreferences([
...preferred,
...codecs.filter((codec) => codec.mimeType.toLowerCase() !== "audio/pcmu"),
]);
}
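
Review note: the reordering in `configureAudioReceiverCodecPreferences` is a stable partition, with PCMU entries first and every other codec kept in its original order. A DOM-free sketch of just that step is below. `CodecLike` and `preferCodec` are hypothetical names, and unlike the function above, the helper returns an unchanged copy when nothing matches instead of skipping the `setCodecPreferences` call.

```typescript
// Minimal stand-in for the codec entries in RTCRtpReceiver.getCapabilities().
interface CodecLike {
  mimeType: string;
}

// Stable partition: codecs matching mimeType (case-insensitive) come first,
// all others keep their original relative order.
function preferCodec<T extends CodecLike>(
  codecs: readonly T[],
  mimeType: string,
): T[] {
  const wanted = mimeType.toLowerCase();
  const preferred = codecs.filter(
    (codec) => codec.mimeType.toLowerCase() === wanted,
  );
  if (preferred.length === 0) {
    return [...codecs]; // nothing to prefer; leave the order alone
  }
  return [
    ...preferred,
    ...codecs.filter((codec) => codec.mimeType.toLowerCase() !== wanted),
  ];
}

const ordered = preferCodec(
  [
    { mimeType: "audio/opus" },
    { mimeType: "audio/PCMU" },
    { mimeType: "audio/PCMA" },
  ],
  "audio/pcmu",
);
console.log(ordered.map((codec) => codec.mimeType));
// ["audio/PCMU", "audio/opus", "audio/PCMA"]
```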

function safariBaselineH264Offer(
offer: RTCSessionDescriptionInit,
): RTCSessionDescriptionInit {
Expand Down Expand Up @@ -3033,6 +3129,10 @@ export class StreamWorkerClient {
);
}

setAudioMuted(muted: boolean) {
this.backend?.setAudioMuted?.(muted);
}

applyStreamConfig(config?: StreamConfig) {
try {
const result = this.backend?.applyStreamConfig?.(config);
Expand Down