Prathamesh Dukare

Dec 15, 2024 • 3 min read • 

ChatGPT Uses SSE & Why it's a Great Idea

Learn how ChatGPT uses server-sent events to stream chats in real time


Today, while casually checking ChatGPT's network requests, I stumbled upon something interesting. The /conversation HTTP request was receiving its data as a stream of events that formed the ChatGPT response in real time. I used to think they might be using WebSockets for this, but this shows they’re not!

Geeking out on that, I explored some more network requests and got to know how their chat flow works, how they use SSE (Server-Sent Events), and how they resume a conversation later.

Let's see why using SSE (server-sent events) is a good idea in this case. But first, I’ll give a basic overview of SSE.

How Does SSE (Server-Sent Events) Work

Server-Sent Events is a web standard built on plain HTTP (and therefore on an underlying TCP connection). The client opens one long-lived connection, and the server uses it to push data in chunks, one way only. In the browser, the EventSource object (polyfills can also be used) parses the stream and hands each event to your code as it arrives; the application then combines the streamed delta chunks in order.
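To make the wire format concrete, here is a minimal sketch of what a client does with a text/event-stream body: read it line by line, collect the data: fields, and dispatch an event at every blank line. The function name parse_sse and the sample stream are my own illustration, and the sketch ignores fields such as id and retry.

```python
from typing import Iterable, Iterator


def parse_sse(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one dict per event from the lines of a text/event-stream body."""
    event_type, data_lines = "message", []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line ends the current event; dispatch it if it has data.
            if data_lines:
                yield {"event": event_type, "data": "\n".join(data_lines)}
            event_type, data_lines = "message", []
            continue
        if line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        field, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]  # the format strips one leading space
        if field == "data":
            data_lines.append(value)
        elif field == "event":
            event_type = value
        # Other fields ("id", "retry") are ignored in this sketch.


# Hypothetical stream: one named event, one comment, one two-line event.
stream = [
    "event: greeting\n",
    "data: Hello\n",
    "\n",
    ": keep-alive\n",
    "data: first line\n",
    "data: second line\n",
    "\n",
]
events = list(parse_sse(stream))
```

In a browser you rarely write this yourself, since EventSource does the same job; the sketch is only meant to show how little there is to the format.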

Image source:

https://static.apiseven.com/uploads/2024/01/29/VM79w4PW_SSE-1.png?imageMogr2/format/webp

You can learn more about it here.

How ChatGPT Uses SSE for Real-time Streaming

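ChatGPT's /conversation API uses a special header, Content-Type: text/event-stream, which the server sets on the response to mark it as an event stream (the client can ask for this with Accept: text/event-stream in the initial request), as you can see in the screenshot below.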

The server is configured to understand this and return the response on the go in the form of chunks as the AI model generates it. The client combines these delta chunks in the same order and keeps forming a response on the page until the connection is marked as closed by the server. This is a brilliant implementation of SSE for streaming server response. You can see how the response comes in the sequence below.
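The "combine the delta chunks in order" step can be sketched as follows. This is an illustration, not ChatGPT's actual payload format: the JSON shape with a "delta" field and the name stream_reply are assumptions. The loop simply ends when the iterable is exhausted, which stands in for the server closing the connection.

```python
import json
from typing import Iterable, Iterator


def stream_reply(payloads: Iterable[str]) -> Iterator[str]:
    """Yield the reply text built so far after each delta chunk arrives."""
    reply = ""
    for payload in payloads:
        chunk = json.loads(payload)
        reply += chunk["delta"]  # append in arrival order ("delta" is hypothetical)
        yield reply  # a UI would re-render the message here


# Hypothetical payloads, one per SSE event, in the order the server sent them.
incoming = ['{"delta": "Server"}', '{"delta": "-sent"}', '{"delta": " events"}']
snapshots = list(stream_reply(incoming))
final_reply = snapshots[-1]
```

Each yielded snapshot corresponds to one repaint of the message on the page, which is why the answer appears to be typed out live.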

The simplicity of SSE is what makes it interesting, but it has some limitations as well, especially when compared to other real-time communication methods like WebSockets and polling.

  1. One-way communication - limited to one-way streaming, not useful if you need bi-directional communication.

  2. Connection limits on HTTP/1.1 - browsers typically allow only 6 concurrent connections per domain, so several open SSE streams can exhaust the limit. (Using HTTP/2 solves this, as multiplexing allows a single connection to handle multiple streams, which effectively removes the per-domain connection limit.)

  3. No binary support - SSE events carry UTF-8 text only, so you cannot send raw binary data; it has to be encoded as text first.
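On the binary limitation: because the stream is text, one common workaround is to encode the bytes as Base64 before putting them in a data: field and decode them on the client. A rough sketch, with hypothetical helper names, assuming a single-line data: field:

```python
import base64

PREFIX = "data: "


def binary_to_sse(blob: bytes) -> str:
    """Wrap raw bytes in one SSE event by encoding them as Base64 text."""
    encoded = base64.b64encode(blob).decode("ascii")
    return f"{PREFIX}{encoded}\n\n"  # the blank line terminates the event


def sse_to_binary(frame: str) -> bytes:
    """Recover the original bytes from a frame made by binary_to_sse."""
    payload = frame.strip()[len(PREFIX):]
    return base64.b64decode(payload)


blob = bytes([0, 255, 16, 128])  # bytes that are not valid UTF-8 text
frame = binary_to_sse(blob)
restored = sse_to_binary(frame)
```

The cost is size: Base64 output is roughly a third larger than the input, so for heavy binary traffic WebSockets are usually the better fit.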

Despite these cons, SSE is great for scenarios like live updates, notifications, or event streaming when the requirements fit within its limitations.

Why SSE is a Great Choice for ChatGPT

SSE offers a lightweight solution for real-time one-way streaming because it relies on a simple HTTP connection. WebSockets need a separate protocol with an upgrade handshake and are comparatively heavier, whereas SSE is straightforward and efficient for scenarios where only server-to-client updates are needed.
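To show how little the server side needs, here is a sketch of framing generated tokens as SSE events. The names sse_frames and SSE_HEADERS are hypothetical, and a real server would write each frame to the open HTTP response and flush it as soon as the token is available instead of joining them into one string.

```python
from typing import Iterable, Iterator

# Response headers a server would typically send for an event stream.
SSE_HEADERS = {"Content-Type": "text/event-stream", "Cache-Control": "no-cache"}


def sse_frames(tokens: Iterable[str]) -> Iterator[str]:
    """Turn each token into one SSE event: data: lines plus a blank line."""
    for token in tokens:
        # A payload containing newlines needs one data: field per line.
        fields = "".join(f"data: {line}\n" for line in token.split("\n"))
        yield fields + "\n"


# Hypothetical tokens standing in for model output.
body = "".join(sse_frames(["Hello", " world", "two\nlines"]))
```

There is no handshake and no custom framing beyond these text lines, which is the main reason SSE is so easy to put behind existing HTTP infrastructure.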

In ChatGPT's case, where a response is constructed in real time as the model generates it, SSE shines. The user's prompt goes up in an ordinary HTTP request and only the reply needs to stream back, so SSE fits this use case better than a heavier bidirectional protocol.


If you have any thoughts or questions do share them in the comments below.

Thanks for reading till the end, stay curious :)
