Why FreeSWITCH Mod_Verto is the Superior Choice for Browser-Based WebRTC
When developers set out to build browser-based phones or communication applications with FreeSWITCH, the instinctual approach is often to go the "SIP over WebSockets" route. This involves relying on well-known client-side libraries like JsSIP or SIP.js to create browser SIP endpoints.
While this approach works and is heavily documented across the web, FreeSWITCH has a vastly superior, purpose-built alternative for browser environments: Mod_Verto and the Verto protocol.
In this article, we will dive into what makes Mod_Verto the better architectural choice for real-time web applications, focusing specifically on session recovery, protocol overhead, and browser-centric design.
The Problem with SIP in the Browser
Session Initiation Protocol (SIP) is the undisputed king of VoIP signaling. It was designed for dedicated hardware phones and persistent background processes running on desktops or servers. These devices are generally static—they maintain stable network connections and don't arbitrarily disappear.
Web browsers, however, are chaotic environments:
- Users accidentally close tabs or hit the "Back" button.
- Mobile browsers aggressively suspend background tabs to save battery.
- Network transitions (e.g., Wi-Fi to Cellular) disrupt WebSocket connections.
When a standard SIP over WebSocket client (like JsSIP) experiences a sudden disconnection or a page reload, the SIP session state on the client side is lost. A SIP dialog is identified by its Call-ID plus the local and remote tags, and every in-dialog request must carry the next CSeq number; in a browser, all of that lives in page memory. A freshly loaded client therefore cannot simply reconnect and say "I'm still here." The call drops, and the usual browser SIP stacks offer no built-in way to recover the media session without starting a completely new call.
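To make the state problem concrete, here is a minimal sketch of the per-dialog state a browser SIP stack has to hold. The `SipDialogState` shape and the `inDialogHeaders` helper are hypothetical names made up for illustration, not the API of JsSIP or SIP.js; only the header names come from SIP itself.

```typescript
// Illustrative model of the state a SIP user agent keeps for one dialog.
// Field names mirror RFC 3261 headers; this is not a JsSIP or SIP.js type.
interface SipDialogState {
  callId: string;    // Call-ID header
  localTag: string;  // tag parameter on our From header
  remoteTag: string; // tag parameter on the peer's To header
  localCSeq: number; // last CSeq number we used; must grow with each request
}

// Build the identifying headers for the next in-dialog request (re-INVITE, BYE, ...).
// Without a stored dialog there is nothing valid to send, so the call cannot be resumed.
function inDialogHeaders(dialog: SipDialogState | undefined, method: string): string[] | null {
  if (dialog === undefined) {
    return null;
  }
  return [
    "Call-ID: " + dialog.callId,
    "From: <sip:web@example.invalid>;tag=" + dialog.localTag,
    "To: <sip:1000@example.invalid>;tag=" + dialog.remoteTag,
    "CSeq: " + (dialog.localCSeq + 1) + " " + method,
  ];
}

// Before the reload: the dialog lives in a plain in-memory object.
const liveDialog: SipDialogState = {
  callId: "a84b4c76e66710",
  localTag: "1928301774",
  remoteTag: "314159",
  localCSeq: 1,
};
const beforeReload = inDialogHeaders(liveDialog, "BYE");

// After the reload: the script starts from scratch and the object is gone.
const afterReload = inDialogHeaders(undefined, "BYE");
```

A client could in principle persist this state somewhere that survives a reload, but then it also has to restore transactions and media on its own, which is exactly the work Verto moves to the server.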
The Verto Solution: Built for the Web
Verto is an RTC (Real-Time Communication) signaling protocol developed directly by the FreeSWITCH team. Instead of cramming a heavy, telecom-era textual protocol (SIP) into a WebSocket, Verto is built entirely on lightweight JSON-RPC.
Verto was designed from day one with the realities of the modern web browser in mind. Here is why it outperforms SIP over WebSockets.
1. Seamless Session Recovery (verto.attach)
The single biggest advantage of Verto is its built-in session recovery mechanic.
When a Verto WebSocket drops (due to a page refresh or network blip), the call does not die. FreeSWITCH retains the session state and the active media streams on the server side (CF_RECOVERING flag) for a configurable timeout period, set with detach-timeout-sec in verto.conf.xml.
When the Verto client reconnects and authenticates under the same session ID, FreeSWITCH detects the orphaned call. The two sides then renegotiate media through a verto.attach JSON-RPC exchange carrying fresh SDP, and FreeSWITCH maps the new WebSocket connection back to the existing call and reattaches the media.
- The user's audio and video seamlessly resume.
- No new calls need to be dialed.
- The UI state can be completely restored.
This level of resiliency is very hard to achieve gracefully with SIP over WSS, because the client library would have to persist and restore its own dialog state across the reload.
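The detach-and-reattach idea can be sketched as a small bookkeeping model. This is a toy illustration, not mod_verto's implementation: the `Call` shape, the function names, and the assumption that calls are matched by session ID are all made up for this example.

```typescript
// Toy model of server-side call bookkeeping. Not mod_verto code.
interface Call {
  callId: string;
  sessionId: string;         // assumed: the ID the client presents at login
  detachedAt: number | null; // timestamp (ms) when the socket was lost, or null if attached
}

// The WebSocket dropped: keep the calls, just record when they lost their socket.
function detach(calls: Call[], sessionId: string, now: number): void {
  for (const c of calls) {
    if (c.sessionId === sessionId && c.detachedAt === null) {
      c.detachedAt = now;
    }
  }
}

// Periodic sweep: drop calls that stayed detached longer than the timeout.
function expire(calls: Call[], now: number, timeoutMs: number): Call[] {
  return calls.filter((c) => c.detachedAt === null || now - c.detachedAt <= timeoutMs);
}

// The client logged in again with the same session ID: reattach what survived.
function reattach(calls: Call[], sessionId: string): string[] {
  const ids: string[] = [];
  for (const c of calls) {
    if (c.sessionId === sessionId && c.detachedAt !== null) {
      c.detachedAt = null;
      ids.push(c.callId);
    }
  }
  return ids;
}

// A page refresh that takes five seconds, with a 30-second timeout.
let calls: Call[] = [{ callId: "call-1", sessionId: "sess-A", detachedAt: null }];
detach(calls, "sess-A", 0);
calls = expire(calls, 5000, 30000);
const resumed = reattach(calls, "sess-A"); // ["call-1"]
```

If the client had come back after the timeout instead, `expire` would already have removed the call and `reattach` would return an empty list, which corresponds to the call having been hung up.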
2. Drastically Reduced Payload Size and Parsing Overhead
SIP is a verbose, text-based protocol that carries a lot of legacy baggage. A standard SIP INVITE with SDP can easily exceed 1,000 bytes, and a WebRTC SDP body alone often runs to several kilobytes. Each proxy the request traverses adds further headers (Via, Record-Route), bloating the payload even more.
Verto utilizes minimal JSON. A Verto verto.invite payload contains strictly what is necessary to establish the WebRTC call: the dialog ID, the destination number, caller ID data, and the WebRTC SDP.
Because JSON is the native language of the web:
- Smaller payloads mean faster transmission over constrained mobile networks.
- Native parsing: Browsers parse JSON with the built-in JSON.parse(), which runs in the engine's native code. SIP messages have to be decoded by the client library's own parser, written in JavaScript, which costs extra CPU and battery on mobile devices.
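As a rough illustration of the size difference, the sketch below builds a SIP INVITE and a verto.invite that carry the same SDP and compares their lengths. Both messages are hypothetical: the SIP header values, host names, and the shortened placeholder SDP are invented for this example, so treat the result as a ballpark rather than a measurement.

```typescript
// Placeholder SDP; a real WebRTC SDP is far larger and is the same in both cases.
const sdp = "v=0\r\no=- 0 0 IN IP4 127.0.0.1\r\ns=-\r\nt=0 0\r\n";

// Hypothetical SIP INVITE headers, roughly what a browser client sends over WSS.
const sipHeaders = [
  "INVITE sip:1000@example.invalid SIP/2.0",
  "Via: SIP/2.0/WSS df7jal23ls0d.invalid;branch=z9hG4bK776asdhds",
  "Max-Forwards: 70",
  "To: <sip:1000@example.invalid>",
  "From: \"Web User\" <sip:1000@example.invalid>;tag=1928301774",
  "Call-ID: a84b4c76e66710",
  "CSeq: 1 INVITE",
  "Contact: <sip:1000@df7jal23ls0d.invalid;transport=ws>",
  "Allow: INVITE,ACK,CANCEL,BYE,UPDATE,OPTIONS",
  "Supported: outbound",
  "User-Agent: example-client",
  "Content-Type: application/sdp",
  "Content-Length: " + sdp.length,
  "", // blank line that separates headers from the body
  "",
].join("\r\n");
const sipMessage = sipHeaders + sdp;

// The same call expressed as a verto.invite request.
const vertoMessage = JSON.stringify({
  jsonrpc: "2.0",
  id: 1,
  method: "verto.invite",
  params: {
    dialogParams: {
      destination_number: "1000",
      caller_id_name: "Web User",
      caller_id_number: "1000",
    },
    sdp: sdp,
  },
});

// Both strings are pure ASCII, so string length equals size in bytes.
const sipSize = sipMessage.length;
const vertoSize = vertoMessage.length;

// Decoding needs no protocol-specific parser.
const roundTrip = JSON.parse(vertoMessage);
console.log({ sipSize, vertoSize, sdpSurvived: roundTrip.params.sdp === sdp });
```

Two caveats apply. JSON escaping makes the SDP slightly longer inside the Verto message, and with a real multi-kilobyte SDP the body dominates both messages, so the relative saving is smaller than this toy comparison suggests.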
3. Tighter Integration with Web Paradigms
Because Verto is JSON-RPC, it feels exactly like a native web API. You aren't forcing front-end developers to learn telecom concepts like CSeq, Max-Forwards, or SIP response codes (e.g., 487 Request Terminated).
Instead, developers work with intuitive JSON parameters:
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "verto.invite",
  "params": {
    "dialogParams": {
      "destination_number": "1000",
      "caller_id_name": "Web User",
      "caller_id_number": "1000"
    },
    "sdp": "v=0\r\no=-..."
  }
}
This dramatically lowers the barrier to entry for web developers integrating voice and video into their applications, abstracting the telecom complexity into familiar paradigms.
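Because every request is plain JSON-RPC 2.0, matching replies to requests takes only a few lines. The sketch below is a minimal illustration, not the official Verto client: `createRpcClient` and its `send` callback are hypothetical names, the transport is left abstract so that a WebSocket's send method can be plugged in, and the empty `result` in the simulated reply is an assumption about the response shape.

```typescript
interface JsonRpcRequest {
  jsonrpc: "2.0";
  id: number;
  method: string;
  params: object;
}

interface JsonRpcResponse {
  jsonrpc: "2.0";
  id: number;
  result?: unknown;
  error?: { code: number; message: string };
}

// Minimal JSON-RPC 2.0 client: numbers each request and routes replies by id.
function createRpcClient(send: (text: string) => void) {
  let nextId = 1;
  const pending: { [id: number]: (reply: JsonRpcResponse) => void } = {};
  return {
    call(method: string, params: object, onReply: (reply: JsonRpcResponse) => void): number {
      const request: JsonRpcRequest = { jsonrpc: "2.0", id: nextId++, method, params };
      pending[request.id] = onReply;
      send(JSON.stringify(request));
      return request.id;
    },
    // Feed every incoming text frame here. Server-initiated requests are ignored in this sketch.
    handleMessage(text: string): void {
      const reply = JSON.parse(text) as JsonRpcResponse;
      const callback = pending[reply.id];
      if (callback) {
        delete pending[reply.id];
        callback(reply);
      }
    },
  };
}

// Usage with a fake transport that just records what would go over the wire.
const sent: string[] = [];
const rpc = createRpcClient((text) => { sent.push(text); });

let outcome = "pending";
const inviteId = rpc.call(
  "verto.invite",
  {
    dialogParams: { destination_number: "1000", caller_id_name: "Web User", caller_id_number: "1000" },
    sdp: "v=0\r\no=-...", // placeholder, as in the example above
  },
  (reply) => { outcome = reply.error ? "failed" : "ok"; }
);

// Simulate the server's reply arriving on the socket.
rpc.handleMessage(JSON.stringify({ jsonrpc: "2.0", id: inviteId, result: {} }));
```

In a real page, `send` would wrap the WebSocket's send method and `handleMessage` would be called from its message event handler.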
4. Effortless Custom Variables
Passing custom data (like user context, application state, or CRM IDs) via SIP usually requires hacking custom X- headers into the SIP request, which must be carefully stripped and parsed on the server.
Verto handles this elegantly. You can map any JSON property directly to FreeSWITCH channel variables by using the variables JSON object in your request. This data is instantly accessible in the FreeSWITCH dialplan or event socket, allowing for incredibly tight integration between your web frontend and your backend routing logic.
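Channel variables are plain strings, so nested application state is easiest to pass once it has been flattened. The helper below is one possible way to do that; `flattenInto` and its underscore naming scheme are hypothetical choices for this sketch, and the exact key and position of the variables object in the request should be checked against your FreeSWITCH version.

```typescript
type VariableMap = { [name: string]: string };

// Flatten nested application context into string-valued entries,
// joining nested keys with "_" and arrays with ",".
function flattenInto(out: VariableMap, context: { [key: string]: unknown }, prefix: string): VariableMap {
  for (const key of Object.keys(context)) {
    const value = context[key];
    const name = prefix + key;
    if (value === null || value === undefined) {
      continue; // nothing to send for empty values
    }
    if (Array.isArray(value)) {
      out[name] = value.join(",");
    } else if (typeof value === "object") {
      flattenInto(out, value as { [key: string]: unknown }, name + "_");
    } else {
      out[name] = String(value);
    }
  }
  return out;
}

const variables = flattenInto(
  {},
  { crm_id: 4821, user: { tier: "gold", verified: true }, tags: ["vip", "en"] },
  ""
);
// variables is now:
// { crm_id: "4821", user_tier: "gold", user_verified: "true", tags: "vip,en" }
```

The resulting object can then be sent as the variables object of the request, and each entry becomes available to the dialplan under its flattened name.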
Conclusion
While SIP over WebSockets is a functional approach for legacy interoperability, it brings heavy telecom baggage into an environment where it doesn't belong.
If you are building a modern WebRTC application backed by FreeSWITCH, Mod_Verto is the architectural path forward. Its lightweight JSON-RPC structure significantly improves performance, simplifies front-end development, and most importantly, offers robust session recovery that saves calls from the chaotic nature of web browsers.