WebSockets guarantee order - so why are my messages scrambled?
25 points by friendlysock
From an API perspective this underlying API doesn’t make a whole lot of sense – the underlying websocket stream is ordered, so why is there an await point to receive the body of a message?
And so I double checked. It appears you can configure websocket.binaryType = 'arraybuffer', in which case onmessage is called only after a complete message arrives, which should make things serial as you’d expect. No need to build this queuing abstraction on top of it, yuck.
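A minimal sketch of that configuration, for illustration only. The helper name receiveInOrder and the onBytes callback are hypothetical; binaryType and onmessage are the standard WebSocket properties.

```javascript
// Hypothetical helper: deliver binary messages as ArrayBuffers so the
// handler has no await point between receiving and processing a message.
function receiveInOrder(socket, onBytes) {
  // With 'arraybuffer', event.data for binary messages is an ArrayBuffer
  // that is already fully in memory.
  socket.binaryType = 'arraybuffer';
  socket.onmessage = (event) => {
    // Text messages arrive as strings; this sketch simply ignores them.
    if (event.data instanceof ArrayBuffer) {
      onBytes(new Uint8Array(event.data)); // synchronous, so arrival order is kept
    }
  };
  return socket;
}
```

In a page this would be called with a real socket, e.g. receiveInOrder(new WebSocket(url), handleBytes), where url and handleBytes are placeholders.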
From an API perspective this underlying API doesn’t make a whole lot of sense – the underlying websocket stream is ordered, so why is there an await point to receive the body of a message?
You’re not waiting to “receive the body,” you’re waiting to parse. You could call text() (MDN docs) instead to read UTF-8, for example, or stream() (MDN docs) for a ReadableStream. You should only set a binaryType on the entire stream if you want to parse every message the same way.
Not quite: if you set the binaryType to arraybuffer, onmessage is called after the message body is read from the socket and the body will always be stored in memory as an ArrayBuffer. If you set binaryType to blob, onmessage is called after only the message header is read from the socket, and the message body hasn’t been received yet.
If you don’t need to fully read the message (maybe the body is an archive and you only want to extract one file from it) or you have a decoder that supports ReadableStreams (like a SAX XML parser), you might want to set the binaryType to blob to keep memory usage down. In the author’s case it would make much more sense to set the binaryType to arraybuffer, as evmar suggested.
What I think is going on in the post is that the first onmessage is called, then the handler asynchronously reads the body into an ArrayBuffer. After the body has been fully read and stored in an ArrayBuffer, the websocket handler has also read the next message’s header, so both the yield point in the current message’s await and the next message’s onmessage callback are scheduled. Then it’s a matter of which of these coroutines the scheduler will call first.
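To make that interleaving concrete, here is a rough sketch of one way to keep asynchronous handlers in arrival order when the payload stays a Blob: chain each handler call onto the previous one. The names makeSerialHandler and handle are made up for illustration, and this is not the queue from the post.

```javascript
// Hypothetical helper: wraps an async message handler so that each call
// starts only after the previous call has settled.
function makeSerialHandler(handle) {
  let tail = Promise.resolve(); // end of the chain of pending handler calls
  return (event) => {
    const run = tail.then(() => handle(event));
    // Swallow rejections on the chain itself so one failure does not block
    // later messages; the caller still sees the rejection through `run`.
    tail = run.catch(() => {});
    return run;
  };
}

// Usage sketch (socket is assumed to be an open WebSocket with binaryType 'blob'):
// socket.onmessage = makeSerialHandler(async (event) => {
//   const bytes = new Uint8Array(await event.data.arrayBuffer());
//   // ...process bytes; the next message waits until this function returns
// });
```

The trade-off is that a slow message delays every message behind it, which is exactly the ordering guarantee being asked for.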
If you set binaryType to blob, onmessage is called after only the message header is read from the socket, and the message body hasn’t been received yet.
Is that right? I thought onmessage always gets called after the final frame of the message was received.
In the author’s case it would make much more sense to set the binaryType to arraybuffer, as evmar suggested.
Setting binaryType works in the author’s case. The underlying browser API seems fine, though?
Is that right? I thought onmessage always gets called after the final frame of the message was received.
You’re right, I went to check the spec myself and this is what it says:
When a WebSocket message has been received with type type and data data, the user agent must queue a task to follow these steps: […] 5. Fire an event named message at the WebSocket object
RFC6455 defines “a WebSocket message has been received” as (conceptually) after the message’s body has been recv’d, so I stand corrected.
Then I suppose the default binaryType being blob would allow the browser to handle the message’s body in a more efficient way internally, since the Blob cannot be directly indexed as an array like the ArrayBuffer (through a DataView) but must be interacted with using async methods?
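For what it’s worth, the difference in access style looks roughly like this. The function names are hypothetical; Blob.slice, Blob.arrayBuffer and DataView are the standard APIs. The sketch only shows how the two types are read, not whether browsers actually store Blob bodies more efficiently.

```javascript
// Read the first byte of a payload in both representations.
function firstByteOfBuffer(buffer) {
  // ArrayBuffer: the bytes are in memory, so a DataView indexes them synchronously.
  return new DataView(buffer).getUint8(0);
}

async function firstByteOfBlob(blob) {
  // Blob: the contents are only reachable through promise-returning methods.
  // slice() itself is synchronous and just describes a sub-range.
  const buffer = await blob.slice(0, 1).arrayBuffer();
  return new DataView(buffer).getUint8(0);
}
```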
A small nuance that might need some clarification: Promise.all only guarantees the order of results; it does not guarantee the order of execution. It is a parallel operation. Use a for...of loop with await, or some other sequential method, if you need ordered execution.
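A small sketch of the distinction, assuming made-up jobs that just sleep for a given number of milliseconds (job, withPromiseAll and sequentially are illustrative names, not anything from the post):

```javascript
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Hypothetical job: finishes after `delay` ms and records when it finished.
async function job(id, delay, finished) {
  await sleep(delay);
  finished.push(id);
  return id;
}

async function withPromiseAll(jobs) {
  const finished = [];
  // All jobs start immediately; the results array follows input order anyway.
  const results = await Promise.all(
    jobs.map(([id, delay]) => job(id, delay, finished))
  );
  return { results, finished };
}

async function sequentially(jobs) {
  const finished = [];
  const results = [];
  for (const [id, delay] of jobs) {
    // The next job starts only after this one has completed.
    results.push(await job(id, delay, finished));
  }
  return { results, finished };
}
```

With jobs [['a', 40], ['b', 5]], withPromiseAll gives results in the order a, b but a completion order of b, a, while sequentially gives a, b for both.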
That’s… what TFA says? And wants?
Promise.all to preserve result ordering. This design allows for parallel processing, but ensures the final results are handled in the correct order.
(although saying that it allows parallel processing is incorrect, it allows concurrent processing, as the only way to do parallelism for in-browser JS is to use workers).
Yes, but I feel it should be emphasized that the ordering comes from the message queue, which stays ordered because messages are pushed into an array as they arrive. This is in line with the opening discussion of TCP synchronization.
However, using const messages = await Promise.all(toProcess.map(async (message) afterwards is a different beast. It works in this particular case because processing is expected to be out of order from the start, which not only falls outside the initial ordering assumption but is also quite tailored to this particular case of using timeouts and expecting some randomness. To be clear, this pattern is a classic trap, and if used wrong it might lead you into a situation similar to the one with the array buffer that triggered the post.
Anyway all good as long as you know where you are going to be at.