Create WebRTC session

WebRTC (Web Real-Time Communication) is a real-time communication solution for the Web that lets applications set up audio and video channels through the browser's built-in APIs.

In short, the two peers first negotiate their media and communication parameters over a signaling channel, and then transmit audio and video data over a media channel.

The three main objects used in JavaScript are:

  • MediaStream: captures and renders audio and video streams
  • RTCPeerConnection: carries audio and video media data between peers
  • RTCDataChannel: carries application-level data between peers
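
Before using these objects, a page can check that the browser actually exposes them. The following is a minimal sketch, not part of the original demo; the function name missingWebRtcApis and the constant REQUIRED_APIS are made up for illustration.

```javascript
// Names of the three constructors listed above.
const REQUIRED_APIS = ['MediaStream', 'RTCPeerConnection', 'RTCDataChannel'];

// Returns the names of the APIs that are missing from the given global object.
// `scope` defaults to globalThis, which is `window` in a browser page.
function missingWebRtcApis(scope = globalThis) {
  return REQUIRED_APIS.filter((name) => typeof scope[name] !== 'function');
}

const missingApis = missingWebRtcApis();
console.log(missingApis.length === 0
  ? 'WebRTC APIs are available'
  : `Missing WebRTC APIs: ${missingApis.join(', ')}`);
```

Passing the scope as a parameter keeps the check easy to exercise outside a browser, where none of the three constructors normally exist.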

For the media transport layer, WebRTC specifies that ICE/STUN/TURN is used to establish connectivity, DTLS is used to negotiate the SRTP keys, SRTP is used to transmit media data, and SCTP is used to transmit application data.
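
This protocol stack shows up in the proto field of each SDP media line. As a rough sketch (the function describeMediaLine and the table PROTO_LAYERS are illustrative names, and only the two proto values commonly seen in WebRTC offers are covered), a media line can be mapped to its layers like this. ICE itself is not named in the proto field; it runs underneath to find a working path.

```javascript
// Transport stacks for two `proto` values commonly seen in WebRTC offers.
// Layers are listed from the top of the stack down to the network transport.
const PROTO_LAYERS = {
  'UDP/TLS/RTP/SAVPF': ['SRTP (media)', 'DTLS (key negotiation)', 'UDP'],
  'UDP/DTLS/SCTP': ['SCTP (application data)', 'DTLS', 'UDP'],
};

// Parses an SDP media line such as "m=audio 9 UDP/TLS/RTP/SAVPF 111 103".
function describeMediaLine(line) {
  const match = /^m=(\S+) (\d+) (\S+) (.*)$/.exec(line.trim());
  if (!match) {
    throw new Error(`Not an SDP media line: ${line}`);
  }
  const [, kind, port, proto, formats] = match;
  return {
    kind,
    port: Number(port),
    proto,
    layers: PROTO_LAYERS[proto] || [], // empty when the proto is not in the table
    formats: formats.split(' '),
  };
}

console.log(describeMediaLine('m=audio 9 UDP/TLS/RTP/SAVPF 111 103').layers);
```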

For the signaling layer, WebRTC does not specify a protocol: each application can use whichever signaling protocol it prefers for media negotiation. Typically, the session description is written in SDP and carried over HTTP, WebSocket or SIP.
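
Because the signaling format is left to the application, each project ends up defining its own message envelope. The sketch below shows one possible JSON envelope; the field names (kind, from, to, payload) and the set of message kinds are assumptions made for this example, not something WebRTC defines.

```javascript
// Hypothetical message kinds for this sketch; WebRTC itself does not define them.
const SIGNAL_KINDS = new Set(['offer', 'answer', 'candidate', 'bye']);

// Wraps a payload so it can be sent as text over HTTP, WebSocket or another channel.
function encodeSignal(kind, from, to, payload) {
  if (!SIGNAL_KINDS.has(kind)) {
    throw new Error(`Unknown signal kind: ${kind}`);
  }
  return JSON.stringify({ kind, from, to, payload });
}

// Parses and validates a message received from the signaling channel.
function decodeSignal(text) {
  const message = JSON.parse(text);
  if (!SIGNAL_KINDS.has(message.kind)) {
    throw new Error(`Unknown signal kind: ${message.kind}`);
  }
  return message;
}

// Usage: Alice sends her offer to Bob. The SDP text here is shortened.
const wireText = encodeSignal('offer', 'alice', 'bob', { type: 'offer', sdp: 'v=0\r\n' });
console.log(decodeSignal(wireText).payload.type); // prints "offer"
```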

If we want to have video chat, the basic call process is roughly as follows:

WebRTC flow

  1. Capture the local media sources (microphone, camera) as a MediaStream
  2. The two endpoints set up a signaling channel between them
  3. Exchange session descriptions (SDP) over the signaling channel
  4. Use the ICE/STUN/TURN protocols to negotiate a connectable candidate pair and establish the PeerConnection
  5. Once the PeerConnection is established, the audio and video data is encapsulated in SRTP for transmission
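
The SDP exchange in step 3 follows an offer/answer pattern, which RTCPeerConnection tracks in its signalingState property. The sketch below models only the basic path (provisional answers and rollback are left out); the table TRANSITIONS and the function nextSignalingState are illustrative names, not browser APIs.

```javascript
// A subset of the RTCPeerConnection.signalingState transitions, covering a
// plain offer/answer exchange.
const TRANSITIONS = {
  'stable': {
    'setLocal:offer': 'have-local-offer',
    'setRemote:offer': 'have-remote-offer',
  },
  'have-local-offer': { 'setRemote:answer': 'stable' },
  'have-remote-offer': { 'setLocal:answer': 'stable' },
};

// Returns the state reached by applying `action`, or throws if the action is
// not allowed in the current state.
function nextSignalingState(state, action) {
  const next = (TRANSITIONS[state] || {})[action];
  if (!next) {
    throw new Error(`Cannot apply ${action} in state ${state}`);
  }
  return next;
}

// Alice is the caller, Bob the callee.
let aliceState = 'stable';
let bobState = 'stable';
aliceState = nextSignalingState(aliceState, 'setLocal:offer');  // Alice creates and applies her offer
bobState = nextSignalingState(bobState, 'setRemote:offer');     // Bob receives it over signaling
bobState = nextSignalingState(bobState, 'setLocal:answer');     // Bob creates and applies his answer
aliceState = nextSignalingState(aliceState, 'setRemote:answer'); // Alice receives the answer
console.log(aliceState, bobState); // prints "stable stable"
```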

In short, both sides of the communication need to know two pieces of information:

  1. ICE Candidates: the address information that can be used for communication
  2. Session Description: the media types, codecs, formats, etc.
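
To make the first item concrete, an ICE candidate is exchanged as a single text line. The sketch below splits such a line into its fields; parseIceCandidate is a made-up helper name, and the addresses in the usage line are documentation-only example addresses.

```javascript
// Parses a candidate line of the form
//   candidate:<foundation> <component> <transport> <priority> <address> <port> typ <type> [raddr <addr> rport <port>]
function parseIceCandidate(line) {
  const text = line.startsWith('a=') ? line.slice(2) : line;
  const fields = text.trim().split(/\s+/);
  if (fields.length < 8 || !fields[0].startsWith('candidate:') || fields[6] !== 'typ') {
    throw new Error(`Not an ICE candidate line: ${line}`);
  }
  const candidate = {
    foundation: fields[0].slice('candidate:'.length),
    component: Number(fields[1]), // 1 = RTP, 2 = RTCP
    protocol: fields[2].toLowerCase(),
    priority: Number(fields[3]),
    address: fields[4],
    port: Number(fields[5]),
    type: fields[7], // host, srflx, prflx or relay
  };
  // Optional "name value" pairs; only raddr/rport are picked up here.
  for (let i = 8; i + 1 < fields.length; i += 2) {
    if (fields[i] === 'raddr') candidate.relatedAddress = fields[i + 1];
    if (fields[i] === 'rport') candidate.relatedPort = Number(fields[i + 1]);
  }
  return candidate;
}

console.log(parseIceCandidate(
  'candidate:842163049 1 udp 1677729535 203.0.113.5 46154 typ srflx raddr 192.0.2.10 rport 46154'
));
```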

ICE stands for "Interactive Connectivity Establishment": a protocol for traversing network address translation (NAT).

The general process is as follows. To chat with Bob online (text, voice and video), Alice needs to go through these steps. It looks complex, so let's break it down in detail.
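
ICE ranks the candidates it gathers by priority before testing pairs. The sketch below follows the priority formula and the recommended type preference values from the ICE specification (RFC 8445); the names candidatePriority and TYPE_PREFERENCE are illustrative, and real browsers may choose a different local preference than the maximum assumed here.

```javascript
// Recommended type preferences: direct host addresses rank highest,
// relayed (TURN) addresses lowest.
const TYPE_PREFERENCE = { host: 126, prflx: 110, srflx: 100, relay: 0 };

// priority = 2^24 * typePreference + 2^8 * localPreference + (256 - componentId)
function candidatePriority(type, componentId, localPreference = 65535) {
  const typePreference = TYPE_PREFERENCE[type];
  if (typePreference === undefined) {
    throw new Error(`Unknown candidate type: ${type}`);
  }
  return typePreference * 2 ** 24 + localPreference * 2 ** 8 + (256 - componentId);
}

console.log(candidatePriority('host', 1));  // prints 2130706431
console.log(candidatePriority('relay', 1)); // prints 16777215
```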

call flow example

Here are two examples:

  1. Local peer connection
  2. Remote peer connection

1. Local peer-to-peer connection

<!DOCTYPE html>
<html xmlns="">
<!-- ... omit the imported css and js files -->
<!-- <a href=""><img style="position: absolute; top: 0; left: 0; border: 0; z-index: 1001;" src="" alt="Fork me on GitHub"></a> -->
<nav class="navbar navbar-default navbar-static-top"></nav>
<div class="container">
    <div class="row">
        <div class="col-lg-12">
            <div class="page-header">
                <h1>WebRTC example of Peer Connection</h1>
            </div>
            <div class="container" id="details">
                <div class="row">
                    <div class="col-lg-12">
                        <p>Click the button to open or close connection</p>
                        <button class="btn btn-default" autocomplete="off" id="startButton">Start Video</button>
                        <button class="btn btn-default" autocomplete="off" id="stopButton">Stop Video</button>
                        <button class="btn btn-default" autocomplete="off" id="callButton">Call</button>
                        <button class="btn btn-default" autocomplete="off" id="hangupButton">Hangup</button>
                    </div>
                    <div class="col-lg-12">
                        <div class="col-lg-6"><video id="localVideo" autoplay></video></div>
                        <div class="col-lg-6"><video id="remoteVideo" autoplay></video></div>
                    </div>
                    <div class="box">
                        <span>SDP Semantics:</span>
                        <select id="sdpSemantics">
                            <option selected value="">Default</option>
                            <option value="unified-plan">Unified Plan</option>
                            <option value="plan-b">Plan B</option>
                        </select>
                        <button class="btn btn-default" autocomplete="off" id="sdpButton">Display SDP</button>
                        <textarea id="output"></textarea>
                    </div>
                    <pre>
interface RTCOfferAnswerOptions {
    voiceActivityDetection?: boolean;
}
interface RTCOfferOptions extends RTCOfferAnswerOptions {
    iceRestart?: boolean;
    offerToReceiveAudio?: boolean;
    offerToReceiveVideo?: boolean;
}
                    </pre>
                </div>
            </div>
        </div>
    </div>
</div>
<!-- Omit several HTML Fragment -->
<script type="text/javascript" src="js/local_peer_connection_demo.js"></script>
</html>
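// js/local_peer_connection_demo.js
// Note: weblog() is not defined in this listing. It is assumed to be a logging
// helper from one of the omitted js files that writes timestamped lines to the page.
'use strict';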
const startButton = document.getElementById('startButton');
const stopButton = document.getElementById('stopButton');
const callButton = document.getElementById('callButton');
const hangupButton = document.getElementById('hangupButton');
const sdpButton = document.getElementById('sdpButton');
const outputTextarea = document.querySelector('textarea#output');
stopButton.disabled = true;
callButton.disabled = true;
hangupButton.disabled = true;
startButton.addEventListener('click', start);
stopButton.addEventListener('click', stop);
callButton.addEventListener('click', call);
hangupButton.addEventListener('click', hangup);
sdpButton.addEventListener('click', displaySdp);
let startTime;
const localVideo = document.getElementById('localVideo');
const remoteVideo = document.getElementById('remoteVideo');
localVideo.addEventListener('loadedmetadata', function() {
  console.log(`Local video videoWidth: ${this.videoWidth}px,  videoHeight: ${this.videoHeight}px`);
});
remoteVideo.addEventListener('loadedmetadata', function() {
  console.log(`Remote video videoWidth: ${this.videoWidth}px,  videoHeight: ${this.videoHeight}px`);
});
remoteVideo.addEventListener('resize', () => {
  console.log(`Remote video size changed to ${remoteVideo.videoWidth}x${remoteVideo.videoHeight}`);
  // We'll use the first resize callback as an indication that video has started
  // playing out.
  if (startTime) {
    const elapsedTime = window.performance.now() - startTime;
    console.log('Setup time: ' + elapsedTime.toFixed(3) + 'ms');
    startTime = null;
  }
});
let localStream;
let pc1;
let pc2;
const offerOptions = {
  offerToReceiveAudio: true,
  offerToReceiveVideo: true,
  voiceActivityDetection: true
};
function getName(pc) {
  return (pc === pc1) ? 'pc1' : 'pc2';
}
function getOtherPc(pc) {
  return (pc === pc1) ? pc2 : pc1;
}
// Start the local video stream
async function start() {
  console.log('Requesting local stream');
  startButton.disabled = true;
  stopButton.disabled = false;
  try {
    const stream = await navigator.mediaDevices.getUserMedia({audio: true, video: true});
    weblog('Received local stream');
    localVideo.srcObject = stream;
    localStream = stream;
    callButton.disabled = false;
  } catch (e) {
    alert(`getUserMedia() error: ${e.name}`);
  }
}
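// Stop the local video stream and release the camera and microphone
function stop(e) {
  const stream = localVideo.srcObject;
  const tracks = stream.getTracks();
  stopButton.disabled = true;
  startButton.disabled = false;
  callButton.disabled = true;
  tracks.forEach(function(track) {
    track.stop();
  });
  localVideo.srcObject = null;
}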
// Build the RTCPeerConnection configuration from the "SDP Semantics" dropdown
function getSelectedSdpSemantics() {
  const sdpSemanticsSelect = document.querySelector('#sdpSemantics');
  const option = sdpSemanticsSelect.options[sdpSemanticsSelect.selectedIndex];
  return option.value === '' ? {} : {sdpSemantics: option.value};
}
// Call the remote peer
async function call() {
  callButton.disabled = true;
  hangupButton.disabled = false;
  weblog('Starting call');
  startTime = window.performance.now();
  const videoTracks = localStream.getVideoTracks();
  const audioTracks = localStream.getAudioTracks();
  if (videoTracks.length > 0) {
    weblog(`Using video device: ${videoTracks[0].label}`);
  }
  if (audioTracks.length > 0) {
    weblog(`Using audio device: ${audioTracks[0].label}`);
  }
  const configuration = getSelectedSdpSemantics();
  weblog('RTCPeerConnection configuration:', configuration);
  pc1 = new RTCPeerConnection(configuration);
  weblog('Created local peer connection object pc1');
  pc1.addEventListener('icecandidate', e => onIceCandidate(pc1, e));
  pc2 = new RTCPeerConnection(configuration);
  weblog('Created remote peer connection object pc2');
  pc2.addEventListener('icecandidate', e => onIceCandidate(pc2, e));
  pc1.addEventListener('iceconnectionstatechange', e => onIceStateChange(pc1, e));
  pc2.addEventListener('iceconnectionstatechange', e => onIceStateChange(pc2, e));
  pc2.addEventListener('track', gotRemoteStream);
  localStream.getTracks().forEach(track => pc1.addTrack(track, localStream));
  weblog('Added local stream to pc1');
  try {
    weblog('pc1 createOffer start');
    const offer = await pc1.createOffer(offerOptions);
    await onCreateOfferSuccess(offer);
  } catch (e) {
    onCreateSessionDescriptionError(e);
  }
}
function onCreateSessionDescriptionError(error) {
  console.log(`Failed to create session description: ${error.toString()}`);
}
// pc1 applies its offer locally, pc2 applies it as the remote description and answers
async function onCreateOfferSuccess(desc) {
  weblog(`Offer from pc1\n${desc.sdp}`);
  weblog('pc1 setLocalDescription start');
  try {
    await pc1.setLocalDescription(desc);
    onSetLocalSuccess(pc1);
  } catch (e) {
    onSetSessionDescriptionError(e);
  }
  weblog('pc2 setRemoteDescription start');
  try {
    await pc2.setRemoteDescription(desc);
    onSetRemoteSuccess(pc2);
  } catch (e) {
    onSetSessionDescriptionError(e);
  }
  weblog('pc2 createAnswer start');
  // Since the 'remote' side has no media stream we need
  // to pass in the right constraints in order for it to
  // accept the incoming offer of audio and video.
  try {
    const answer = await pc2.createAnswer();
    await onCreateAnswerSuccess(answer);
  } catch (e) {
    onCreateSessionDescriptionError(e);
  }
}
function onSetLocalSuccess(pc) {
  weblog(`${getName(pc)} setLocalDescription complete`);
}
function onSetRemoteSuccess(pc) {
  weblog(`${getName(pc)} setRemoteDescription complete`);
}
function onSetSessionDescriptionError(error) {
  weblog(`Failed to set session description: ${error.toString()}`);
}
// Attach the stream received by pc2 to the remote video element
function gotRemoteStream(e) {
  if (remoteVideo.srcObject !== e.streams[0]) {
    remoteVideo.srcObject = e.streams[0];
    weblog('pc2 received remote stream');
  }
}
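// pc2 applies its answer locally, pc1 applies it as the remote description
async function onCreateAnswerSuccess(desc) {
  weblog(`Answer from pc2:\n${desc.sdp}`);
  weblog('pc2 setLocalDescription start');
  try {
    await pc2.setLocalDescription(desc);
    onSetLocalSuccess(pc2);
  } catch (e) {
    onSetSessionDescriptionError(e);
  }
  console.log('pc1 setRemoteDescription start');
  try {
    await pc1.setRemoteDescription(desc);
    onSetRemoteSuccess(pc1);
  } catch (e) {
    onSetSessionDescriptionError(e);
  }
}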
// Hand each candidate gathered by one peer connection directly to the other one
async function onIceCandidate(pc, event) {
  try {
    await (getOtherPc(pc).addIceCandidate(event.candidate));
    onAddIceCandidateSuccess(pc);
  } catch (e) {
    onAddIceCandidateError(pc, e);
  }
  console.log(`${getName(pc)} ICE candidate:\n${event.candidate ? event.candidate.candidate : '(null)'}`);
}
function onAddIceCandidateSuccess(pc) {
  weblog(`${getName(pc)} addIceCandidate success`);
}
function onAddIceCandidateError(pc, error) {
  weblog(`${getName(pc)} failed to add ICE Candidate: ${error.toString()}`);
}
function onIceStateChange(pc, event) {
  if (pc) {
    weblog(`${getName(pc)} ICE state: ${pc.iceConnectionState}`);
    weblog('ICE state change event: ', event);
  }
}
// Close both peer connections and reset the buttons
function hangup() {
  weblog('Ending call');
  pc1.close();
  pc2.close();
  pc1 = null;
  pc2 = null;
  hangupButton.disabled = true;
  callButton.disabled = false;
}
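// Create a throwaway peer connection just to show the offer SDP in the textarea
async function displaySdp() {
  const configuration = getSelectedSdpSemantics();
  const peerConnection = new RTCPeerConnection(configuration);
  const offer = await peerConnection.createOffer(offerOptions);
  await peerConnection.setLocalDescription(offer);
  outputTextarea.value = offer.sdp;
  // Release the temporary connection once the SDP has been read
  peerConnection.close();
}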
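
In this local demo both peer connections live in the same page, so each candidate can be handed straight to the other side. When the peers are on different machines, candidates may arrive over signaling before setRemoteDescription() has finished, and addIceCandidate() rejects while there is no remote description. A common workaround is to buffer early candidates. The class below is a minimal sketch of that idea; CandidateQueue and its method names are invented for this example and are not part of the demo or of the WebRTC API.

```javascript
// Buffers ICE candidates until the remote description has been applied.
class CandidateQueue {
  // `addCandidate` is a function such as (c) => pc.addIceCandidate(c).
  constructor(addCandidate) {
    this.addCandidate = addCandidate;
    this.pending = [];
    this.ready = false;
  }

  // Call for every candidate that arrives over the signaling channel.
  async push(candidate) {
    if (this.ready) {
      await this.addCandidate(candidate);
    } else {
      this.pending.push(candidate);
    }
  }

  // Call once setRemoteDescription() has resolved; flushes buffered candidates in order.
  async markReady() {
    this.ready = true;
    while (this.pending.length > 0) {
      await this.addCandidate(this.pending.shift());
    }
  }
}
```

In a page, the queue would be created with `new CandidateQueue(c => pc.addIceCandidate(c))`, and markReady() would be called right after `await pc.setRemoteDescription(...)`.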

The entire program implementation can be accessed here

The page logs every step of the connection setup:

[49.461] Received local stream
[72.743] Starting call
[72.743] Using video device: USB Video Device (046d:081d)
[72.743] Using audio device: default - Microphone (USB Audio Device) (046d:081d)
[72.743] RTCPeerConnection configuration:
[72.745] Created local peer connection object pc1
[72.746] Created remote peer connection object pc2
[72.746] Added local stream to pc1
[72.747] pc1 createOffer start
[72.765] Offer from pc1
    v=0
    o=- 8212739043455445815 2 IN IP4
    s=-
    t=0 0
    a=group:BUNDLE 0 1
    a=msid-semantic: WMS xlhFA5NOFj9VpQ7Z1ylg9jmfNytu6l7jTKhQ
    m=audio 9 UDP/TLS/RTP/SAVPF 111 103 104 9 0 8 106 105 13 110 112 113 126
    c=IN IP4
    a=rtcp:9 IN IP4
    a=ice-ufrag:nTn7
    a=ice-pwd:1XyHlJ5xBTJ7NBuU5Y5mBqCn
    a=ice-options:trickle
    a=fingerprint:sha-256 51:62:BB:13:05:A7:38:05:47:78:BA:70:A6:A7:64:29:6C:45:00:AC:B3:7F:92:45:80:F5:5A:4B:10:7A:36:42
    a=setup:actpass
    a=mid:0
    a=extmap:1 urn:ietf:params:rtp-hdrext:ssrc-audio-level
    [...truncated for brevity]
    a=rtpmap:111 opus/48000/2
    a=rtcp-fb:111 transport-cc
    a=fmtp:111 minptime=10;useinbandfec=1
    a=rtpmap:8 PCMA/8000
    [...audio parameters truncated...]
    m=video 9 UDP/TLS/RTP/SAVPF 96 97 98 99 100 101 102 121 127 120 125 107 108 109 124 119 123 118 114 115 116
    [...video parameters truncated...]
    urn:ietf:params:rtp-hdrext:sdes:mid
    [...video codec parameters truncated...]
    a=rtpmap:96 VP8/90000
    a=rtcp-fb:96 goog-remb
    a=rtcp-fb:96 transport-cc
    level-asymmetry-allowed=1;packetization-mode=1;profile-level-id=42001f
    [...H264 parameters truncated...]
    a=rtcp-fb:124 nack
    a=rtcp-fb:124 nack pli
    [...remaining video parameters truncated...]
    mslabel:xlhFA5NOFj9VpQ7Z1ylg9jmfNytu6l7jTKhQ
    a=ssrc:4014440443 label:906550e7-7a71-4c6a-a7ca-8f81fa0efe6c
[72.765] pc1 setLocalDescription start
[72.772] pc1 setLocalDescription complete
[72.772] pc2 setRemoteDescription start
[72.937] pc2 received remote stream
[72.937] pc2 setRemoteDescription complete
[72.937] pc2 createAnswer start
[72.959] Answer from pc2:
    v=0
    o=- 1094912348166165889 2 IN IP4
    s=-
    t=0 0
    a=group:BUNDLE 0 1
    a=msid-semantic: WMS
    m=audio 9 UDP/TLS/RTP/SAVPF 111 103 104 9 0 8 106 105 13 110 112 113 126
    [...audio/video parameters truncated...]
    a=recvonly
    a=rtcp-mux
[72.959] pc2 setLocalDescription start
[72.960] pc1 addIceCandidate success
[72.961] pc1 addIceCandidate success
[72.961] pc1 addIceCandidate success
[72.964] pc1 addIceCandidate success
[72.964] pc1 addIceCandidate success
[72.964] pc1 addIceCandidate success
[72.964] pc1 addIceCandidate success
[72.965] pc1 addIceCandidate success
[72.965] pc1 addIceCandidate success
[72.965] pc1 addIceCandidate success
[72.965] pc1 addIceCandidate success
[72.965] pc1 addIceCandidate success
[72.965] pc1 addIceCandidate success
[73.292] pc2 setLocalDescription complete
[73.294] pc2 ICE state: checking
[73.294] ICE state change event:
[73.295] pc1 ICE state: checking
[73.295] ICE state change event:
[73.295] pc2 ICE state: connected
[73.295] ICE state change event:
[73.297] pc1 ICE state: connected
[73.297] ICE state change event:
[73.297] pc1 setRemoteDescription complete
[73.305] pc2 addIceCandidate success
[73.305] pc2 addIceCandidate success
[73.305] pc2 addIceCandidate success
[73.306] pc2 addIceCandidate success
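
The offer and answer in this log are plain SDP text, one attribute per line, so the negotiated codecs can be read out with a few lines of code. The sketch below is a simplified reader that only looks at m= and a=rtpmap lines; summarizeSdp is a made-up helper name, and the sample SDP is a shortened excerpt built from lines that appear in the offer above.

```javascript
// Lists the media sections of an SDP and the codecs declared in each one.
function summarizeSdp(sdp) {
  const sections = [];
  let current = null;
  for (const line of sdp.split(/\r?\n/)) {
    if (line.startsWith('m=')) {
      // m=<kind> <port> <proto> <payload types...>
      const [kind, , proto, ...payloadTypes] = line.slice(2).split(' ');
      current = { kind, proto, payloadTypes, codecs: {} };
      sections.push(current);
    } else if (current && line.startsWith('a=rtpmap:')) {
      // a=rtpmap:<payload type> <codec name>/<clock rate>[/<channels>]
      const [payloadType, encoding] = line.slice('a=rtpmap:'.length).split(' ');
      current.codecs[payloadType] = encoding;
    }
  }
  return sections;
}

const sampleSdp = [
  'v=0',
  'm=audio 9 UDP/TLS/RTP/SAVPF 111 8',
  'a=rtpmap:111 opus/48000/2',
  'a=rtpmap:8 PCMA/8000',
  'm=video 9 UDP/TLS/RTP/SAVPF 96',
  'a=rtpmap:96 VP8/90000',
].join('\r\n');

for (const section of summarizeSdp(sampleSdp)) {
  console.log(section.kind, section.codecs);
}
```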

