Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

📢 Version 0.6 Development Status #330

Open
4 tasks
pdeljanov opened this issue Dec 27, 2024 · 10 comments
Open
4 tasks

📢 Version 0.6 Development Status #330

pdeljanov opened this issue Dec 27, 2024 · 10 comments
Assignees
Labels
help wanted Extra attention is needed
Milestone

Comments

@pdeljanov
Copy link
Owner

pdeljanov commented Dec 27, 2024

Hey all,

As some may have noticed, most of the development activity on Symphonia is occurring in the dev-0.6 branch. This is because I believe the 0.5.x series has reached a developmental dead-end where it is not possible to satisfactorily fix many of Symphonia's issues due to API constraints. Therefore, I've decided to focus my limited time on completing version 0.6 instead.

Version 0.6 includes many breaking changes to support popular requests such as making it easier to get audio out of Symphonia, or returning None when the end of the media is reached, but it also includes forward looking enhancements such as stubbing out video and subtitle decoding support.

Development Plan

My development plan is as follows:

  1. For the time being, I will be keeping 0.6 development in the dev-0.6 branch until there are no more breaking changes being made.
  2. Once the APIs are stabilised, I will cut a final 0.5.5 release from the mainline branch that includes some fixes for annoyances people have had.
  3. Once 0.5.5 is released, I will merge the 0.6 development branch into mainline and begin more thorough testing.
  4. Once 0.6 has been tested (including some fuzzing time), it will be released.

Timeline

I am hopeful for an early 2025 release, but some help from the community would improve the odds!

Help Wanted

You can help the development and release of version 0.6 by picking up any of these tasks:

  • Add fuzz testing harnesses to each format and codec crate.
  • Begin playing around with version 0.6 and submitting feedback and suggestions on the new APIs. Note: At this time, some of the new APIs are still in flux, but you can still get an idea, and different perspectives will be helpful to me.
  • Read, critique, fix, or update documentation.
  • Any help fixing decode or demuxing issues are always welcome.

I will add to this list over time...

Thanks all!

@pdeljanov pdeljanov added the help wanted Extra attention is needed label Dec 27, 2024
@pdeljanov pdeljanov added this to the v0.6.0 milestone Dec 27, 2024
@pdeljanov pdeljanov self-assigned this Dec 27, 2024
@pdeljanov pdeljanov pinned this issue Dec 27, 2024
@pdeljanov pdeljanov changed the title Version 0.6 Development Status 📢 Version 0.6 Development Status Dec 27, 2024
@crutchcorn
Copy link

Hey! I'm using the 0.6 branch against my very very very small Tauri application, since it fixes many issues with FLAC currently

Just wanted to say that you're crushing it and I'm extremely happy as a downstream consumer of this package and that the API changes to 0.6 were pretty easy to adapt to

I'm still extremely green to Rust in general, so might not be a huge help, but if I end up with some spare cycles and find something critically blocking I'm happy to jump in

@caass
Copy link

caass commented Jan 2, 2025

One thing that could help with no_std support (which seems to have stalled) would be to move the core API away from std::io traits; for example, the tokio project has the bytes crate which provides APIs similar to ReadBytes.

If you'd like to maintain the fallibility of the API (bytes panics implicitly instead), we could instead be generic over the error type, since Error has been stabilized in core for a couple releases now.

Unfortunately, the Read and Write (and Seek) traits are probably not going to get moved to core any time soon, and I saw here that it's impossible to copy those traits over due to licensing issues.

@sscobici
Copy link

sscobici commented Jan 5, 2025

Adding some points that would be good to address or consider before the 0.6 release:

  1. Packet's Track ID: This is a unique identifier for the track. Previously, it was used as an index for the tracks array, but this is no longer the case, at least for MP4. Should we add a track_num field to the Packet structure?

  2. Packet PTS/DTS: The introduction of Packet Presentation Time Stamp (PTS) and Decode Time Stamp (DTS) fields is needed.

  3. Packet Struct Design: Should we keep the Packet struct for all types of packets (Audio, Video, Subtitles), or should we switch to an Enum, similar to how CodecParameters is structured?

  4. Track Metadata: The Track struct currently lacks a TrackType field and a default flag. While analyzing CodecParams can be an alternative, it is not always convenient. Including a default flag would also provide useful information for users.

@hasezoey
Copy link

hasezoey commented Jan 9, 2025

(symphonia 0.6 (516f1d4), project termusic)
I could not find any changelog or migration guide, so i had to try and guess what the new values would be, mostly symphonia-play was helpful, but there are still some questions i have:

  • What happened to ProbeResult and more specifically its metadata value? (which, IIRC, was part of the container's metadata, right?)
  • What happened to SampleBuffer and what should be used as the official alternative?
  • Why is AudioSpec not Copy anymore?

Regarding SampleBuffer, i at first though i could use AudioBuffer, but it is not actually possible because of The trait "AsMut<[u8]>" is not implemented for "AudioBuffer<i16>", which is required by "&mut AudioBuffer<i16>: AsMut<[u8]>" (for more see the stack traces below in the details), the only function that was not complaining about types was decoded.copy_to(&mut buffer), but it errored at runtime with mismatching frame lengths.
But for now i could work-around it by using plain Vec<i16> similar to what symphonia-play uses and just cache the additional things like frames() from decoded.

Full compiler output
error[E0277]: the trait bound `AudioBuffer<i16>: AsMut<[u8]>` is not satisfied
   --> playback/src/rusty_backend/decoder/mod.rs:239:45
    |
239 |         decoded.copy_bytes_interleaved(&mut buffer);
    |                 ----------------------      ^^^^^^ the trait `AsMut<[u8]>` is not implemented for `AudioBuffer<i16>`, which is required by `&mut AudioBuffer<i16>: AsMut<[u8]>`
    |                 |
    |                 required by a bound introduced by this call
    |
    = note: required for `&mut AudioBuffer<i16>` to implement `AsMut<[u8]>`
note: required by a bound in `GenericAudioBufferRef::<'_>::copy_bytes_interleaved`
   --> /mnt/ssd/projects/rust/Symphonia/symphonia-core/src/audio/generic.rs:395:14
    |
392 |     pub fn copy_bytes_interleaved<Sout, Dst>(&self, dst: Dst)
    |            ---------------------- required by a bound in this associated function
...
395 |         Dst: AsMut<[u8]>,
    |              ^^^^^^^^^^^ required by this bound in `GenericAudioBufferRef::<'_>::copy_bytes_interleaved`

or

error[E0277]: the trait bound `AudioBuffer<i16>: AsMut<[_]>` is not satisfied
   --> playback/src/rusty_backend/decoder/mod.rs:239:48
    |
239 |         decoded.copy_to_slice_interleaved(&mut buffer);
    |                 -------------------------      ^^^^^^ the trait `AsMut<[_]>` is not implemented for `AudioBuffer<i16>`, which is required by `&mut AudioBuffer<i16>: AsMut<[_]>`
    |                 |
    |                 required by a bound introduced by this call
    |
    = note: required for `&mut AudioBuffer<i16>` to implement `AsMut<[_]>`
note: required by a bound in `GenericAudioBufferRef::<'_>::copy_to_slice_interleaved`
   --> /mnt/ssd/projects/rust/Symphonia/symphonia-core/src/audio/generic.rs:349:14
    |
346 |     pub fn copy_to_slice_interleaved<Sout, Dst>(&self, dst: Dst)
    |            ------------------------- required by a bound in this associated function
...
349 |         Dst: AsMut<[Sout]>,
    |              ^^^^^^^^^^^^^ required by this bound in `GenericAudioBufferRef::<'_>::copy_to_slice_interleaved`

where

let mut buffer = AudioBuffer::<i16>::new(decoded.spec().clone(), decoded.capacity());

Finally, there is also this warning when i use symphonia by path:

warning: field `bitrate` is never read
  --> /mnt/ssd/projects/rust/Symphonia/symphonia-bundle-mp3/src/common.rs:99:9
   |
96 | pub struct FrameHeader {
   |            ----------- field in this struct
...
99 |     pub bitrate: u32,
   |         ^^^^^^^
   |
   = note: `FrameHeader` has a derived impl for the trait `Debug`, but this is intentionally ignored during dead code analysis
   = note: `#[warn(dead_code)]` on by default

warning: `symphonia-bundle-mp3` (lib) generated 1 warning

Full result of my symphonia 0.6 try can be seen in this compare or in this repository & branch (note that i might not keep that branch around forever).

@pdeljanov
Copy link
Owner Author

One thing that could help with no_std support (which seems to have stalled) would be to move the core API away from std::io traits; for example, the tokio project has the bytes crate which provides APIs similar to ReadBytes.

If you'd like to maintain the fallibility of the API (bytes panics implicitly instead), we could instead be generic over the error type, since Error has been stabilized in core for a couple releases now.

Unfortunately, the Read and Write (and Seek) traits are probably not going to get moved to core any time soon, and I saw here that it's impossible to copy those traits over due to licensing issues.

I think, for the sake of getting 0.6 this year, I will leave no_std to the next major release. :)

Once we have 0.6 released, then we can start making internal changes (though not semver breaking), such as using alloc/core instead of std, transitioning lookup tables over to phf, etc.

The Error trait looks like it stabilised fairly recently: 1.81. That's a bit new for Symphonia. I recently changed the MSRV to 1.77 and even that felt like pushing it. It'll probably be old enough for 0.7, though.

ReadBytes should cover all of Read's use-cases, and I think we could also add our own Seek trait as well, or just use MediaSource for that.

@pdeljanov
Copy link
Owner Author

Updating Packet is one of the last changes that need to go in for 0.6. Provisioning any fields for video and subtitle tracks will be important.

Adding some points that would be good to address or consider before the 0.6 release:

  1. Packet's Track ID: This is a unique identifier for the track. Previously, it was used as an index for the tracks array, but this is no longer the case, at least for MP4. Should we add a track_num field to the Packet structure?

The track ID should never be assumed to be a track index. It should always be matched against Track::id.

Perhaps it's better to replace it with the track index/number instead. What do you think? I've warmed up to having both in the packet, however, I'm not sure how useful the track ID itself is in practice, especially since it can essentially be a random number.

Additionally, I think we need to extend track IDs to u64 since for some demuxer's like MKV it's being truncated right now.

  1. Packet PTS/DTS: The introduction of Packet Presentation Time Stamp (PTS) and Decode Time Stamp (DTS) fields is needed.

Agreed. We can have both to replace ts right now. For audio we will simply have pts = dts.

  1. Packet Struct Design: Should we keep the Packet struct for all types of packets (Audio, Video, Subtitles), or should we switch to an Enum, similar to how CodecParameters is structured?

Hard to say, but I'm not seeing a clear need right now. It depends what other fields we must carry for a video track.

  1. Track Metadata: The Track struct currently lacks a TrackType field and a default flag. While analyzing CodecParams can be an alternative, it is not always convenient. Including a default flag would also provide useful information for users.

The Track::flags bitflags does have a default flag. Along with a few others. Agreed that TrackType would be easier.

@pdeljanov
Copy link
Owner Author

pdeljanov commented Jan 15, 2025

Hi @hasezoey,

Hopefully will have a migration guide and updated examples when the API changes are finalised. I do recommend running cargo doc and reading the updated API docs in the meantime, I think they are quite good.

  • What happened to ProbeResult and more specifically its metadata value? (which, IIRC, was part of the container's metadata, right?)

The metadata returned in the ProbeResult in 0.5.x was below was metadata the probe operation found before the container was detected. This was typical for ID3v2 tags since it's just sort of glued on to the MPEG audio stream.

This metadata is now passed to the format reader and you can access it through the metadata function.

  • What happened to SampleBuffer and what should be used as the official alternative?

You can now use the copy_to/copy_bytes_to family of functions on AudioBuffer, AudioSlice, GenericAudioBuffer, or GenericAudioBufferRef to extract the audio data in a format of your choice.

Here's a description of the different types of copy functions.

First, we have sample-oriented copy functions:

  • copy_to_slice_interleaved copies all audio frames to a &mut [S] of the same or differing sample format in interleaved order.
  • copy_to_slice_planar copies all audio planes to a &mut [&mut [S]] of the same or differing sample format.
  • copy_to_vec_interleaved/copy_to_vec_planar same as the other two, except takes a &mut Vec<S> /&mut Vec<Vec<S>> and resizes them accordingly. The slice versions need to be exactly sized, much like copy_from_slice in the standard library.

Next we have byte-oriented equivalents of the above:

  • copy_bytes_interleaved copies interleaved audio to a &mut [u8] in the original sample format
  • copy_bytes_interleaved_as same as above, except copies to a differing sample format.
  • copy_bytes_planar copies all audio planes to a &mut [&mut [u8]] in the original sample format.
  • copy_bytes_planar_as same as above, except copies to a differing sample format.
  • copy_bytes_to_vec_interleaved/copy_bytes_to_vec_interleaved_as/copy_bytes_to_vecs_planar/copy_bytes_to_vecs_planar_as same as the other four except takes a &mut Vec<u8> /&mut Vec<Vec<u8>> and resizes them accordingly.

Note: I missed some of these copy-to-vec functions, so you should pull!

Therefore, you can simply replace (Raw)SampleBuffer with the equivalent code below. However, I recommend you read over the docs to see if something works better for you now. For example, if you have your own buffers, you could use the copy-to-slice functions.

// SampleBuffer
let mut samples: Vec<f32> = Default::default();
audio.copy_to_vec_interleaved(&mut samples);

// RawSampleBuffer
let mut raw_samples: Vec<u8> = Default::default();
audio.copy_bytes_to_vec_interleaved_as::<f32>(&mut raw_samples);
  • Why is AudioSpec not Copy anymore?

In previous versions, audio channels were represented using a set of bitflags. However, that was limiting for more advanced audio codecs, so channels can now also be represented as a Vec of channel labels. Since a Vec is not copyable, AudioSpec can no longer be Copy. However, if you don't use these more advanced channel layouts, then there is no additional overhead despite it being a "clone".

@hasezoey
Copy link

This metadata is now passed to the format reader and you can access it through the metadata function.

Thanks for clarifying, this makes it a lot easier to deal with metadata. If i understand this correctly, you mean that it is included in this function Box<dyn FormatReader>::metadata.

I do recommend running cargo doc and reading the updated API docs in the meantime, I think they are quite good.

That was what i was doing, though it only contained the current state not how to migrate from old types. (though yes, i have not run into any old docs yet)

Therefore, you can simply replace (Raw)SampleBuffer with the equivalent code below. However, I recommend you read over the docs to see if something works better for you now. For example, if you have your own buffers, you could use the copy-to-slice functions.

That is basically what i am doing now, just wanted clarification if a plain Vec<_> is the replacement, or if i had missed anything.

For context, i had made a wrapper buffer struct, as we need more values than the raw samples (like frame_len).

TL;DR: the copy function looks like this:

/// Copy passed [`GenericAudioBufferRef`] into our Buffer
#[inline]
fn copy_buffers(&mut self, decoded: &GenericAudioBufferRef<'_>) {
    let required_capacity = decoded.byte_len_as::<i16>();
    self.buf.resize(required_capacity, 0);

    decoded.copy_to_vec_interleaved(&mut self.buf);
    self.frame_len = decoded.frames();
}

PS: completely unrealted, but because i dont know much about audio, would it be better to switch from i16 to something higher like f32 or any good material to read up on that?

@pdeljanov
Copy link
Owner Author

This metadata is now passed to the format reader and you can access it through the metadata function.

Thanks for clarifying, this makes it a lot easier to deal with metadata. If i understand this correctly, you mean that it is included in this function Box<dyn FormatReader>::metadata.

Yup.

That is basically what i am doing now, just wanted clarification if a plain Vec<_> is the replacement, or if i had missed anything.

I see, yeah what you have looks like it'd work!

PS: completely unrealted, but because i dont know much about audio, would it be better to switch from i16 to something higher like f32 or any good material to read up on that?

For all intents and purposes, i16 is good enough since it has plenty of dynamic range. However, it's pretty typical for systems use f32 internally for their audio processing pipelines and mixer so I'd recommend using that since many codecs decode to f32 so you can avoid the f32 -> i16 -> f32 conversion.

A classic intro video to digital media is this one: https://xiph.org/video/vid1.shtml.

If you want to get into more detail, I found this article to be quite detailed as well: https://mu.krj.st/wave/.

@edwloef
Copy link

edwloef commented Jan 18, 2025

I personally would appreciate being able to create iterators over the various AudioBuffer-style types, without having to put my own Vec in the middle and copying to and from that. I'm not entirely sure how that would look in planar form, but at least interleaved sounds like a fairly trivial implementation to me.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
help wanted Extra attention is needed
Projects
None yet
Development

No branches or pull requests

6 participants