Media Technologies

Explore the integration of media technologies within your app. Discuss working with audio, video, camera, and other media functionalities.

Photos & Camera

Audio

General

Streaming

Video

All subtopics

Post

Replies

Boosts

Views

Activity

Swift Array Out of Bounds Crash in VTFrameProcessor when using VTLowLatencyFrameInterpolationParameters

Hi everyone, Our team is encountering a reproducible crash when using VTLowLatencyFrameInterpolation on iOS 26.3 while processing a live LL-HLS input stream. 🤖 Environment Device: iPhone 16 OS: iOS 26.3 Xcode: Xcode 26.3 Framework: VideoToolbox 💥 Crash Details The application crashes with the following fatal error: Fatal error: Swift/ContiguousArrayBuffer.swift:184: Array index out of range The stack trace highlights the following: VTLowLatencyFrameInterpolationImplementation processWithParameters:frameOutputHandler: Called from VTFrameProcessor.process(parameters:) Here is the simplified implementation block where the crash occurs. (Note: PrismSampleBuffer and PrismLLFIError are our internal custom wrapper types). // Create `VTFrameProcessorFrame` for the source (previous) frame. let sourcePTS = sourceSampleBuffer.presentationTimeStamp var sourceFrame: VTFrameProcessorFrame? if let pixelBuffer = sourceSampleBuffer.imageBuffer { sourceFrame = VTFrameProcessorFrame(buffer: pixelBuffer, presentationTimeStamp: sourcePTS) } // Validate the source VTFrameProcessorFrame. guard let sourceFrame else { throw PrismLLFIError.missingImageBuffer } // Create `VTFrameProcessorFrame` for the next frame. let nextPTS = nextSampleBuffer.presentationTimeStamp var nextFrame: VTFrameProcessorFrame? if let pixelBuffer = nextSampleBuffer.imageBuffer { nextFrame = VTFrameProcessorFrame(buffer: pixelBuffer, presentationTimeStamp: nextPTS) } // Validate the next VTFrameProcessorFrame. guard let nextFrame else { throw PrismLLFIError.missingImageBuffer } // Calculate interpolation intervals and allocate destination frame buffers. let intervals = interpolationIntervals() let destinationFrames = try framesBetween(firstPTS: sourcePTS, lastPTS: nextPTS, interpolationIntervals: intervals) let interpolationPhase: [Float] = intervals.map { Float($0) } // Create VTLowLatencyFrameInterpolationParameters. // This sets up the configuration required for temporal frame interpolation between the previous and current source frames. guard let parameters = VTLowLatencyFrameInterpolationParameters( sourceFrame: nextFrame, previousFrame: sourceFrame, interpolationPhase: interpolationPhase, destinationFrames: destinationFrames ) else { throw PrismLLFIError.failedToCreateParameters } try await send(sourceSampleBuffer) // Process the frames. // Using progressive callback here to get the next processed frame as soon as it's ready, // preventing the system from waiting for the entire batch to finish. for try await readOnlyFrame in self.frameProcessor.process(parameters: parameters) { // Create an interpolated sample buffer based on the output frame. let newSampleBuffer: PrismSampleBuffer = try readOnlyFrame.frame.withUnsafeBuffer { pixelBuffer in try PrismLowLatencyFrameInterpolation.createSampleBuffer(from: pixelBuffer, readOnlyFrame.timeStamp) } // Pass the newly generated frame to the output stream. try await send(newSampleBuffer) } 🙋 Questions Are there any known limitations or bugs regarding VTLowLatencyFrameInterpolation when handling live 60fps streams? Are there any undocumented constraints we should be aware of regarding source/previous frame timing, pixel buffer attributes, or how destinationFrames and interpolationPhase arrays must be allocated? Is a "warm-up" sequence recommended after startSession() before making the first process(parameters:) call?

Media Technologies Streaming VideoToolbox HTTP Live Streaming

464

Question about PT Framework channel tone behaviour

I've been wondering if there is a way to modify or even disable tones for indicating channel states. The behaviour regarding tones seems like a black box with little documentation. During migration to Apple's PT Framework we've noticed that there are few scenarios where a tone is played which doesn't match certain certifications. For example; moving from a channel to another produces a tone which would fail a test case. I understand the reasoning fully, as it marks that the channel is ready to transmit or receive, but this doesn't mirror the behaviour of TETRA which would be wanted in this case. I'm also wondering if there would be any way to directly communicate feedback regarding PT Framework?

Media Technologies Audio Push To Talk

418

Oct ’25

Lock screen media controls for MusicKit/ ApplicationMusicPlayer

Hi, when using ApplicationMusicPlayer from MusicKit my app automatically gets the media controls on the lock screen: Play/ Pause, Skip Buttons, Playback Position etc. I would like to customize these. Tried a bunch of things, e.g. using MPRemoteCommandCenter. So far I haven't had any success. Does anyone know how I can customize the media controls of ApplicationMusicPlayer. Thank you.

Media Technologies Audio MusicKit

535

Sep ’25

AVAssetDownloadConfiguration: How many video variants are actually downloaded when multiple variants exist in the HLS master playlist?

Hi, I’m trying to better understand how AVAssetDownloadConfiguration selects video variants when downloading HLS content for offline playback. Suppose I have an HLS master playlist (.m3u8) that contains several video variants defined with #EXT-X-STREAM-INF. For example, the master playlist may contain multiple video streams like this: Same resolution, different BANDWIDTH Or different resolutions (for example 720p, 1080p, etc.) My question is: How many video variants are actually downloaded when using AVAssetDownloadConfiguration without specifying any variantQualifiers? In other words: If the master playlist contains multiple video variants, will the download task fetch only one variant, or multiple variants? Does the behavior differ depending on whether the variants differ only by BANDWIDTH or also by RESOLUTION? What I observed in testing In my tests, I always end up with only one video variant downloaded, specifically the one with the highest BANDWIDTH parameter. In the m3u8 files I tested, all video variants had identical parameters (resolution, codec, frame rate, etc.) and differed only by the BANDWIDTH attribute in the master playlist. However, when inspecting the downloaded .movpkg, I noticed something interesting in boot.xml. It lists two video streams: one with complete="true" (the one with highest bandwidth) another with complete="no" (the one with lowest bandwidth) I actually had 3 video streams listed in m3u8, but the one with middle bandwidth wasn't listed in boot.xml file at all. There are also additional streams for audio and subtitles in boot.xml file. This made me wonder whether the system initially attempts to download another video variant (possibly a lower bitrate one), but then switches to the highest-quality variant and only completes that one. Additional question about variantQualifiers If I provide a predicate such as: NSPredicate(format: "peakBitRate > 0") which should theoretically match all variants, will the download task attempt to download all matching video variants, or will it still select only one? Summary So the main questions are: Without variantQualifiers, does AVAssetDownloadConfiguration always download a single video variant, and if so, how is it chosen? Does the behavior differ if variants have different resolutions vs only different bitrates? When a predicate matches multiple variants, can multiple video variants actually be downloaded in a single .movpkg? Why might boot.xml list multiple video streams when only one appears to be fully downloaded? Any clarification on the intended behavior would be greatly appreciated. Thanks!

Media Technologies Video Video HTTP Live Streaming AVFoundation

298

Mar ’26

On iOS 18, Mandarin is read aloud as Cantonese

Please include the line below in follow-up emails for this request. Case-ID: 11089799 When using AVSpeechUtterance and setting it to play in Mandarin, if Siri is set to Cantonese on iOS 18, it will be played in Cantonese. There is no such issue on iOS 17 and 16. 1.let utterance = AVSpeechUtterance(string: textView.text) let voice = AVSpeechSynthesisVoice(language: "zh-CN") utterance.voice = voice 2.In the phone settings, Siri is set to Cantonese

Media Technologies Audio iOS Siri and Voice AVFoundation

817

Mar ’26

Improving Speech Analyzer Transcription for technical terms

I am developing an app with transcription and I am exploring ways to improve the transcription from the SpeechAnalyzer/Transcriber for technical terms. SFSpeech... recognition had the capability of being augmented by contextualStrings. Does something similar exist for SpeechAnalyzer/Transcriber? If so please point me towards the documentation and any sample code that may exist for this. If there are other options, please let me know.

Media Technologies Audio Speech

313

Sep ’25

Only fetch PHAssets from the current user's iCloud Library

I'm working on a photo app and I want to allow the user to display, edit and delete photos. I can fetch all photos using PHAsset.fetchAssets(with: options). This works as intended. However, I can't seem to find a way to prevent the user from seeing photos from a Shared Library. The PHAssetSourceType only contains typeCloudShared to only show items from a specific album; not library. How can I filter by iCloud Shared Library?

Media Technologies Photos & Camera Photos and Imaging PhotoKit

518

Sep ’25

Has the `externalMetadata` property of `AVPlayerItem` been removed?

(This only started happening as of Xcode 26.) I know macOS and watchOS don't support this property, but all other platforms do (did?) up until I upgraded Xcode. Now when I compile I get this: Value of type 'AVPlayerItem' has no member 'externalMetadata'

Media Technologies Video AVFoundation

275

Sep ’25

Audio session activation occasionally fails from CarPlay

I'm working on adding CarPlay support to an audio app and am running into an issue. Occasionally, when a user opens the app from CarPlay while the main app scene is either not connected or is currently in the background, I will receive an error when attempting to activate the audio session. The code below mimics my setup: do { try AVAudioSession.sharedInstance().setCategory(.playback, mode: .spokenAudio) try AVAudioSession.sharedInstance().setActive(true) } catch { print(error) // NSOSStatusErrorDomain - 560557684: Session activation failed } That error code maps to AVAudioSession.ErrorCode.cannotInterruptOthers. Once in this state, all subsequent attempts to play different pieces of content will fail. However, things will start working normally if the user opens the app on their phone and tries again from CarPlay (while the app is in the foreground on their phone). I'm not sure why it would behave this way and want to note that I do have the audio background mode capability enabled. Has anyone else encountered this? Are there any workarounds or changes I could make to prevent this from happening?

Media Technologies Audio CarPlay Audio AVAudioSession

202

Apr ’25

AVSpeechSynthesizer system voices (SLA clarification)

Hello, I am building an iOS-only, commercial app that uses AVSpeechSynthesizer with system voices, strictly using the APIs provided by Apple. Before distributing the app, I want to ensure that my current implementation does not conflict with the iOS Software License Agreement (SLA) and is aligned with Apple’s intended usage. For a better playback experience (more accurate estimation of utterance duration and smoother skip forward/backward during playback), I currently synthesize speech using: AVSpeechSynthesizer.write(_:toBufferCallback:) Converting the received AVAudioPCMBuffer buffers into audio data Storing the audio inside the app sandbox Playing it back using AVAudioPlayer / AVAudioEngine The cached audio is: Generated fully on-device using system voices Stored only inside the app’s private container Used only for internal playback controls (timeline, seek, skip ±5 seconds) Never shared, exported, uploaded, or exposed outside the app The alternative approaches would be: Keeping the generated audio entirely in memory (RAM) for playback purposes, without writing it to the file system at any point Or using AVSpeechSynthesizer.speak(_:) and playing speech strictly in real time which has a poorer user experience compared to my approach I have reviewed the current iOS Software License Agreement: https://www.apple.com/legal/sla/docs/iOS18_iPadOS18.pdf In particular, section (f) mentions restrictions around System Characters, Live Captions, and Personal Voice, including the following excerpt: “…use … only for your personal, non-commercial use… No other creation or use of the System Characters, Live Captions, or Personal Voice is permitted by this License, including but not limited to the use, reproduction, display, performance, recording, publishing or redistribution in a … commercial context.” I do not see a specific reference in the SLA to system text-to-speech voices used via AVSpeechSynthesizer, and I want to be certain that temporarily caching synthesized speech for internal, non-exported playback is acceptable in a commercial app. My question is: Is caching AVSpeechSynthesizer system-voice output inside the app sandbox for internal playback acceptable, or is Apple’s recommended approach to rely only on real-time playback (speak(_:)) or strictly in-memory buffering without file storage? If this question falls outside DTS technical scope and is instead a policy or licensing matter, I would appreciate guidance on the authoritative Apple documentation or the correct Apple team/contact. Thank you.

Media Technologies Audio Speech Audio

447

Feb ’26

Are frames returned in presentation or decode order with AVAssetReader

I read somewhere that the frames are returned in decode order instead of presentation order when using AVAssetReader. The documentation seems sparse on the subject. I have so far failed to find a video file where the frames are not returned in presentation order. Can anyone confirm the frames are actually returned in decode order?

Media Technologies Video AVFoundation

115

Feb ’26

AVCaptureMetadataOutput .face detection not working on iOS 26 Beta with high sessionPreset

In iOS 26 (Developer Beta), the AVCaptureMetadataOutputObjectsDelegate no longer receives callbacks when metadataOutput.metadataObjectTypes = [.face] is set. On earlier iOS versions the issue does not occur. Interestingly, face detection works if I set the sessionPreset to .medium, but not with .high — except on the iPhone 16 Pro Max, where it works regardless.

Media Technologies Photos & Camera AVFoundation

370

Sep ’25

Configuring capture pipeline with ProResRAW codec

I am unable to find any clearcut documentation on configuring AVCaptureSession pipeline to capture video with proResRAW codec type, which is 16 bit format. Is it supported only with AVCaptureMovieFileOutput or one can have AVCaptureVideoDataOutput emitting 16-bit sample buffers that can be vended to AVAssetWriter?

Media Technologies Photos & Camera AVFoundation

1.2k

Feb ’26

offline with FairPlay

I am working on offline license support for FairPlay. I am trying to identify offline vs. live streaming requests. How do I do that? Is there any way to identify the request type (offline or streaming) from SPC?

Media Technologies Streaming FairPlay Streaming

Clarification on SPC Version 3 Availability and Requirements (SDK 26 Certificate Bundle)

Hello, I’m using a valid certificate bundle generated with SDK 26 (combined RSA‑1024 + RSA‑2048). However, all my devices currently still generate SPC v2 during playback, including my iPhone 16 under iOS 26.2. Apple staff mentioned that future iOS versions will send SPC v3 when using an SDK 26 certificate bundle. Could you please clarify: Which iOS/macOS versions will first support SPC v3? Are there any additional client‑side requirements (Safari version, playback APIs, headers, etc.) to trigger SPC v3? Is there any way to test SPC v3 today, e.g., using beta builds? Thank you!

Media Technologies Streaming FairPlay Streaming

471

Feb ’26

AVAudioSession.outputVolume not reporting correctly in iOS 18+ devices

I’m using the shared instance of AVAudioSession. After activating it with .setActive(true), I observe the outputVolume, and it correctly reports the device’s volume. However, after deactivating the session using .setActive(false), changing the volume, and then reactivating it again, the outputVolume returns the previous volume (before deactivation), not the current device volume. The correct volume is only reported after the user manually changes it again using physical buttons or Control Center, which triggers the observer. What I need is a way to retrieve the actual current device volume immediately after reactivating the audio session, even on the second and subsequent activations. Disabling and re-enabling the audio session is essential to how my application functions. I’ve tested this behavior with my colleagues, and the issue is consistently reproducible on iOS 18.0.1, iOS 18.1, iOS 18.3, iOS 18.5 and iOS 18.6.2. On devices running iOS 17.6.1 and iOS 16.0.3, outputVolume correctly reflects the current volume immediately after calling .setActive(true) multiple times.

Media Technologies Audio Audio AVAudioSession

353

Feb ’26

Get device Voice Isolation status via Core Audio?

Is there any feasible way to get a Core Audio device's system effect status (Voice Isolation, Wide Spectrum)? AVCaptureDevice provides convenience properties for system effects for video devices. I need to get this status for Core Audio input devices.

Media Technologies Audio Core Audio

893

Nov ’25

Disabling Hardware OIS via AVFoundation — Clarification on AVCaptureVideoStabilizationMode

Hello everyone, I'm looking for a definitive clarification on how to completely disable all video stabilization, including the hardware OIS, using AVFoundation. The goal is to achieve a completely raw, unstabilized video feed, which is crucial when using external equipment like gimbals to avoid conflicting stabilization motions. My research points to using the AVCaptureConnection property preferredVideoStabilizationMode and setting it to AVCaptureVideoStabilizationMode.off. The documentation for the .off case states: A mode that doesn’t stabilize video capture. This description is slightly ambiguous. It's unclear whether this only affects software-level stabilization (EIS, EIS+OIS, etc) or if it guarantees the complete deactivation of the physical OIS module. For professional video applications, this is a critical distinction. So, I'd like to ask the community: Has anyone been able to definitively confirm that setting preferredVideoStabilizationMode to .off also disables the hardware OIS? Are there any known tests or documentation that prove this behavior? Is there an alternative or more direct method to ensure the OIS module is physically inactive during video capture? What is the community's best practice for ensuring absolutely no stabilization is applied to the video pipeline? Any insights or shared experiences on this topic would be greatly appreciated. Thank you!

Media Technologies Video Camera AVFoundation

311

Sep ’25

New FairPlay Keys

Hello, My company has an in-store app with FPS SDK 4.x (1024) keys. We've handed those keys over to a trusted third-party and we do not have them. We've been in-store for several years. The person that created the keys in our organization mistakenly stored them encrypted to our third-party's PGP keys, so we cannot decrypt them, and the third party also has no mechanism to provide us with the keys even though it is in their runtime environment. They only have secure mechanisms for us to upload keys onto their servers. We are trying to migrate to a different third-party DRM provider, and would like to obtain new keys. Unfortunately, the developer portal won't let me create new keys, saying that we have exceeded the number of keys allowed, which I assume is one. Additionally, the new DRM provider can only support SDK 4.x keys, and it appears that we can only request SDK 5.x keys on the Apple Developer portal, as the SDK 4.0 option is grayed out. Regardless, it seems that we are not able to request any keys. We've submitted a request to the support e-mail address and received an automated e-mail that the response should take a few days, but may take longer on occasion. It's now been a month. The e-mail says that the reply address is not monitored. Is there any way we can accelerate this? Thank you, Carlos

Media Technologies Streaming FairPlay Streaming

291

Aug ’25

Always audio from latest connected external USB mic

Hello! I've two mics connected to a USB-hub. The USB-hub is then connected to my iPad. Both mics are part of the audio session's list of available inputs. The problem is that regardless of which mic I select in my app (using setPreferredInput() on the audio session), the audio keeps coming from the mic that was last connected to the USB-hub. Anyone that knows if this is a limitation in iPadOS/iOS?

Media Technologies Audio

209

Jul ’25

Swift Array Out of Bounds Crash in VTFrameProcessor when using VTLowLatencyFrameInterpolationParameters