Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support multiple streams #20

Open
wants to merge 11 commits into
base: master
Choose a base branch
from

Conversation

berysaidi
Copy link

@berysaidi berysaidi commented Apr 24, 2024

Hey @RouquinBlanc, thank you for this nice library.

disclaimer: I started learning RTSP last week, so please forgive any stupid mistakes.

I needed to support video and audio (or just audio) for a project and I found that I might be able to extend your work and the pending PR #13

I basically ran a test with ffmpeg, recorded the rtsp messages exchanged, and read a little bit of the various RFCs and tried to replicate that behavior with the building blocks you have provided us with.
I have tested this running examples/camera_feed.py on two Veezoom Cameras and so far, the results are good. I can use ffmpeg to play the h264 video, and the raw PCM audio samples.
stream_tcp.log
stream_udp.log

this work is mainly motivated by the fact that the veezoom cameras, when requested for audio only, they would still send video on interleaved channel 0-1, and audio on interleaved channel 2-3 (even though the acknowledge the setup request for audio only on interleaved 0-1)

I am ordering a couple of new cameras by another vendor just to confirm things work as expected.

some of the unit tests were modified on separate commits to replicate this multi stream scenario, but even without changes, they all passed. so if you don't like them, I can drop them from this PR.

There's definitely room for improvement on this; for instance, on the fact that if a camera has two audio channels (stereo), we only select the first. it could be addressed later if you're content with these changes.

I appreciate your feedback on this

@RouquinBlanc
Copy link
Collaborator

Hi there!

Thanks a lot for the contribution ❤️

I really like the idea of being able to also support audio - historically, all my work is 100% on video, but as you've seen, the low level is not too far from being capable of supporting multiple streams.

I'm going on vacation now (posting from the airport!), so please expect some delays, but here is what I can already say.

  • My first point would be about backward compatibility. Ideally, I see it going 2 ways:
    • either we can make this API compatible, and push a minor change to keep all clients working
    • or we accept that some parts are better changed, and then we work on a v2. But in that case, I would rather work (with you?) based on the dev/v2 branch already started, which also aims at bringing RTSP server capability (acting as an RTSP server and serving clients)
  • For the UDP transport, I'm wondering if we can just keep a single socket pair on the client side, and just multiplex based on remote ports?

I will need to test this against the different models of cameras I have as well as proxies like Mediamtx, but that's definitely a good start, thx!

@berysaidi
Copy link
Author

berysaidi commented Apr 24, 2024

Thank you for the quick feedback @RouquinBlanc
I took a quick glance at the RFC and it doesn't look like it's mandated that we can't use the same socket pair for multiple streams. However, as you already know https://datatracker.ietf.org/doc/html/rfc2326#section-14.2 the exchange captured here uses a socket pair for each stream (negotiated by the client)
But out of curiosity, I tried it with this diff (basically by forcing num_streams to be equal to 1)

diff --git a/aiortsp/transport/udp.py b/aiortsp/transport/udp.py
index 04b7b3c..58b4016 100644
--- a/aiortsp/transport/udp.py
+++ b/aiortsp/transport/udp.py
@@ -112,6 +112,7 @@ class UDPTransport(RTPTransport):
 
         self.receive_buffer = kwargs.get('receive_buffer', DEFAULT_BUFFER_SIZE)
         self.send_buffer = kwargs.get('send_buffer', DEFAULT_BUFFER_SIZE)
+        self.num_streams = 1
         self.rtp_sinks = [None] * self.num_streams
         self.rtcp_sinks = [None] * self.num_streams
         self.server_rtp = [None] * self.num_streams
@@ -212,11 +213,13 @@ class UDPTransport(RTPTransport):
         self.handle_rtcp_data(data)
 
     def on_transport_request(self, headers: dict, stream_number=0):
+        stream_number = 0
         rtp_port = self.rtp_sinks[stream_number].local_port
         rtcp_port = self.rtcp_sinks[stream_number].local_port
         headers['Transport'] = f'RTP/AVP;unicast;client_port={rtp_port}-{rtcp_port}'
 
     def on_transport_response(self, headers: dict, stream_number=0):
+        stream_number = 0
         if 'transport' not in headers:
             raise RTSPError('error on SETUP: Transport not found')
 
@@ -270,5 +273,6 @@ class UDPTransport(RTPTransport):
                 self.send_upstream(self.rtcp_sinks[i], self.connection.host, self.server_rtcp[i])
 
     async def send_rtcp_report(self, rtcp: RTCP, stream_number=0):
+        stream_number = 0
         if self.rtcp_sender:
             self.rtcp_sinks[stream_number].sendto(bytes(rtcp), self.rtcp_sender)

sadly, the veezoom cameras that I have only send back media on the last socket pair chosen in SETUP
multistream_single_sock_pair.log
I receive new cameras tomorrow, so I'll be able to know if this is peculiar to veezoom, or a general assumption.
Veezoom allows multiple streams with the same session id
veezom.log

2024-04-26 13:15:06,325 - DEBUG - RESPONSE received: <Response status=200 msg="OK"" headers={'cseq': '4', 'transport': 'RTP/AVP/TCP;interleaved=0-1;unicast;mode=play', 'session': '88016000;timeout=60'} content-length=0>
2024-04-26 13:15:06,365 - DEBUG - RESPONSE received: <Response status=200 msg="OK"" headers={'cseq': '5', 'transport': 'RTP/AVP/TCP;interleaved=2-3;unicast;mode=play', 'session': '88016000;timeout=60'} content-length=0>

Reolink uses different session id per stream
reolink.log

2024-04-26 13:15:43,854 - DEBUG - RESPONSE received: <Response status=200 msg="OK"" headers={'cseq': '4', 'date': 'Fri, Apr 26 2024 17:15:44 GMT', 'transport': 'RTP/AVP/TCP;unicast;destination=192.168.0.4;source=192.168.0.162;interleaved=0-1', 'session': '92B48581;timeout=65'} content-length=0>
2024-04-26 13:15:43,888 - DEBUG - RESPONSE received: <Response status=200 msg="OK"" headers={'cseq': '5', 'date': 'Fri, Apr 26 2024 17:15:44 GMT', 'transport': 'RTP/AVP/TCP;unicast;destination=192.168.0.4;source=192.168.0.162;interleaved=2-3', 'session': '12B16894;timeout=65'} content-length=0>

In terms of backward compatiblity: the reader itself would still work as it used to. the only exception is that we pass an array of the media_types (by default ['video'] instead of 'video')
and for the iterator, we return a pair of ('media_type', pkt). we can definitely just return the packet and let the user filter based on pkt.pt. but in order to know which is which, we have to look at the SDP and figure it out. That's why I went with (media_type, pkt)
an alternative is to have add a method to PKT (rtp.py) which returns the media type based on https://www.rfc-editor.org/rfc/rfc3551#section-6 ?
If you are leaning one way or the other, I can keep/revert those changes.
for V2, I can definitely take a look at the work done, and I am quite interested in adding server capabilities too!
Have a great vacation

Thanks,

@berysaidi berysaidi force-pushed the multiple-streams branch 3 times, most recently from ac42099 to fdfd76e Compare April 25, 2024 08:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants