Fwd: Comment on fibers and async library interfaces
Nils Henrik Lorentzen
nils.lorentzen at gmail.com
Fri May 24 20:06:23 UTC 2019
---------- Forwarded message ---------
From: Nils Henrik Lorentzen <nils.lorentzen at gmail.com>
Date: Fri, May 24, 2019 at 10:05 PM
Subject: Re: Comment on fibers and async library interfaces
To: Ron Pressler <ron.pressler at oracle.com>
.. perhaps a custom fiber scheduler that interoperates somehow with the
message passing queue would help here so that if the message processing
calls blocking I/O as detected by the scheduler, it would re-schedule the
message queue on another fiber. Perhaps possible though somewhat hackish
again (but isolated to the custom fiber scheduler).
In Erlang the receive op that gets swapped is calling that blocking
operation on the mailbox aka message queue, so seems the Erlang process
switching is sort of integrated with the message queue processing. Don't
know all the details though. Erlang is a slightly different model from state
machines, however one could model async reactiveness as state machines there
too, just receive any message on the mailbox and forward it to the current state.
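
The "receive any message on the mailbox and forward to current state" idea could look roughly like this in Java. It is only a single-threaded sketch: MailboxLoop, its State values and the message strings are made-up names for illustration, not anything taken from Erlang or Loom.

```java
import java.util.ArrayDeque;
import java.util.Queue;

// Illustrative only: all names here are hypothetical.
public class MailboxLoop {
    enum State { IDLE, IN_CALL, ENDED }

    private final Queue<String> mailbox = new ArrayDeque<>();
    private State current = State.IDLE;

    /** Non-blocking send: the message is only queued. */
    void send(String message) {
        mailbox.add(message);
    }

    State current() {
        return current;
    }

    /** Drains the mailbox, forwarding each message to whatever state is current. */
    void drain() {
        String message;
        while ((message = mailbox.poll()) != null) {
            current = transition(current, message);
        }
    }

    /** The current state decides what a message means; unknown messages are ignored. */
    static State transition(State state, String message) {
        switch (state) {
            case IDLE:
                return message.equals("setup") ? State.IN_CALL : State.IDLE;
            case IN_CALL:
                return message.equals("teardown") ? State.ENDED : State.IN_CALL;
            default:
                return State.ENDED;
        }
    }

    public static void main(String[] args) {
        MailboxLoop loop = new MailboxLoop();
        loop.send("mute");      // ignored while IDLE
        loop.send("setup");     // IDLE -> IN_CALL
        loop.send("teardown");  // IN_CALL -> ENDED
        loop.drain();
        System.out.println(loop.current());
    }
}
```

Nothing in drain() ever waits, so the loop stays responsive as long as no state handler blocks.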
On Fri, May 24, 2019 at 9:26 PM Nils Henrik Lorentzen <
nils.lorentzen at gmail.com> wrote:
> I think the issue is when passing messages through a state machine message
> passing framework (or probably any such completely async message passing),
> fibers will indeed "block" the processing for the duration of the call to
> the fiber since one cannot process other events in the meantime. It would
> swap out the whole call stack of the message passing framework basically,
> and put message processing on hold until result comes back.
> "Instead of trying to recast fibers as state machines, think of how you’d
> program if creating and blocking threads were essentially free."
> For a telco server I would still model it as statemachines since they need
> to be able to handle events from multiple sources (call teardown from each
> end, no media, call transfer request, mute, unmute, call on hold,
> disconnect from web console, etc) all occuring simultaneously or in any
> order, thus it just does not follow a blocking pattern by nature, one can't
> block on all of them. State machines model these cases well since reactive
> and one can easier verify by code that cases are covered and give errors
> for cases that are not.
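
A minimal sketch of what "verify by code that cases are covered" might mean: the transitions live in a table, an unhandled state/event pair raises an error, and the gaps can be counted. The states, events and the CallStateTable class are invented for illustration and are not from any real telco stack.

```java
import java.util.EnumMap;
import java.util.Map;

// Illustrative only: states, events and class name are hypothetical.
public class CallStateTable {
    enum State { RINGING, CONNECTED, ON_HOLD, ENDED }
    enum Event { ANSWER, HOLD, UNHOLD, TEARDOWN }

    private final Map<State, Map<Event, State>> table = new EnumMap<>(State.class);

    CallStateTable() {
        add(State.RINGING, Event.ANSWER, State.CONNECTED);
        add(State.RINGING, Event.TEARDOWN, State.ENDED);
        add(State.CONNECTED, Event.HOLD, State.ON_HOLD);
        add(State.CONNECTED, Event.TEARDOWN, State.ENDED);
        add(State.ON_HOLD, Event.UNHOLD, State.CONNECTED);
        add(State.ON_HOLD, Event.TEARDOWN, State.ENDED);
    }

    private void add(State from, Event event, State to) {
        table.computeIfAbsent(from, s -> new EnumMap<Event, State>(Event.class)).put(event, to);
    }

    /** Looks up the next state; a missing table entry is reported as an error. */
    State next(State from, Event event) {
        Map<Event, State> row = table.get(from);
        State to = (row == null) ? null : row.get(event);
        if (to == null) {
            throw new IllegalStateException("No transition for " + event + " in state " + from);
        }
        return to;
    }

    /** Counts the state/event combinations that have no table entry. */
    int uncoveredCount() {
        int missing = 0;
        for (State s : State.values()) {
            for (Event e : Event.values()) {
                Map<Event, State> row = table.get(s);
                if (row == null || !row.containsKey(e)) {
                    missing++;
                }
            }
        }
        return missing;
    }
}
```

Whether an uncovered pair should be an error or a deliberate "ignore" entry is a design choice; the point is only that the table makes the gaps visible.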
> Lightweight threads are indeed a nice way to handle and increase
> throughput for the case when the fiber initiator is calling out to perform
> a number of tasks.
> For telco that is the opposite, the outside world pounds an ongoing call or
> conference with events in any order, and the event processor just has to
> handle those reactively, it can't decide order.
> So for the message passing case here I think the problem is not lack of
> asynchronicity in execution of lightweight threads, but that they no
> matter what are blocking by nature as the path of execution is swapped out
> for some time till a response comes back, meanwhile eg. in a phone call
> lots of other events may have occurred but have not been duly processed from
> the incoming message queue. One could perhaps spawn another fiber to
> re-trigger message queue processing if one knows that state machine action
> might block but that seems very hackish.
> Erlang has selective receive on the process mailbox which helps with all
> this, it was made for telco but I am not sure how to do that with Java
> fibers since one would have to wait for multiple fibers. Not saying it
> can't be done, but I certainly don't see it as a good fit.
> State machines/message passing is no matter what a tried and tested
> pattern and I don't think one should unnecessarily narrow down the
> possibilities for how to design an application. Imagine an employer that
> has a number of people that are very well versed in async programming and
> state machines, they would have to retrain to another paradigm that might
> not even be the best fit to express the problem domain.
> Nils Henrik Lorentzen
> On Fri, May 24, 2019 at 8:43 PM Ron Pressler <ron.pressler at oracle.com>
>> But the underlying (JDK core-library) IO used by the driver is
>> fiber-friendly, i.e. you use it with a synchronous API, but it’s entirely
>> non blocking.
>> You can schedule as many fibers as you want onto as few kernel threads as
>> you want (even a single one).
>> I don’t see anything of what you said being hampered by fibers; if
>> anything, they assist.
>> And if you happen to find async APIs easier to understand, it’s easier to
>> expose (fiber) blocking code as an async API than vice versa.
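
One way to picture "exposing blocking code as an async API" is the sketch below. It uses an ordinary thread pool as a stand-in for a fiber scheduler, since the fiber API itself is not shown in this thread; AsyncFacade and blockingLookup are hypothetical names.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Supplier;

// Illustrative only: a plain thread pool stands in for lightweight threads.
public class AsyncFacade {
    private final ExecutorService executor;

    AsyncFacade(ExecutorService executor) {
        this.executor = executor;
    }

    /** Wraps any blocking call so that callers get a future instead of waiting. */
    <T> CompletableFuture<T> async(Supplier<T> blockingCall) {
        return CompletableFuture.supplyAsync(blockingCall, executor);
    }

    /** Hypothetical blocking operation (simulated with a short sleep). */
    static String blockingLookup(String key) {
        try {
            Thread.sleep(10);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return "value-for-" + key;
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newCachedThreadPool();
        try {
            AsyncFacade facade = new AsyncFacade(pool);
            CompletableFuture<String> result = facade.async(() -> blockingLookup("a"));
            System.out.println(result.join());
        } finally {
            pool.shutdown();
        }
    }
}
```

With cheap threads the wrapper costs little per call; with kernel threads the same code works but each pending call holds a full thread.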
>> I’ve found that people who know async well often find it harder to
>> understand lightweight threads than those who don’t.
>> Instead of trying to recast fibers as state machines, think of how you’d
>> program if creating and blocking threads were essentially free.
>> On May 24, 2019 at 7:29:18 PM, Nils Henrik Lorentzen (
>> nils.lorentzen at gmail.com) wrote:
>> After thinking about how it had to be implemented under the hood, I
>> figured too it had to be async/non-blocking in the implementation
>> (otherwise it would have to spawn a thread to do I/O...).
>> Also documented on wiki
>> The implementation of the networking APIs in the java.net
>>> and java.nio.channels packages have been updated so that fibers
>>> doing blocking I/O operations park, rather than block in a system call,
>>> when a socket is not ready for I/O. When a socket is not ready for I/O it
>>> is registered with a background multiplexer thread. The fiber is then
>>> unparked when the socket is ready for I/O. These same blocking I/O
>>> operations are also updated to support cancellation. If a fiber is
>>> cancelled while in a blocking I/O operation then it will abort with an
>>> IOException.
>> "In what cases will fibers not suffice?"
>> One probably said the same about threads back in the day too :) The
>> unknown unknowns thing.
>> Async message passing can be one. What I mean by pure async server (very
>> poorly explained) is for example a telecom server where everything is
>> solely based on messages passed asynchronously between state machines.
>> There is no blocking I/O invoked at all, I wrote such a thing many years
>> ago (in C++ though)
>> - a single mainloop using select() or epoll()
>> - every implemented telco protocol adapter registered similar to
>> nonblock NIO selectors, but with method to detect complete-length inbound
>> protocol message and to decode such message.
>> - all communication in business logic (like call menus - "press 1 for
>> xyz") on non-blocking message queues.
>> - no blocking receive either (unlike Erlang).
>> The nature of async I/O allowed the whole system to do call setup,
>> teardown, business logic and media conferencing all on one single
>> thread, with smooth media mixing. Shows the power of async processing.
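
A rough Java NIO equivalent of such a single-threaded main loop is sketched below. It is an illustration under assumptions: a Pipe stands in for a real socket, the Adapter interface is a hypothetical callback, and message framing (detecting a complete-length protocol message) is left out.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.Pipe;
import java.nio.channels.ReadableByteChannel;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Illustrative only: Adapter and the class name are hypothetical.
public class SingleThreadMainLoop {
    /** Callback a protocol adapter would register with the loop. */
    interface Adapter {
        void onData(String data);
    }

    /** One pass of the loop: wait for readiness, then dispatch to the attached adapters. */
    static int runOnce(Selector selector, long timeoutMillis) throws IOException {
        int dispatched = 0;
        selector.select(timeoutMillis);
        Iterator<SelectionKey> it = selector.selectedKeys().iterator();
        while (it.hasNext()) {
            SelectionKey key = it.next();
            it.remove();
            if (key.isReadable()) {
                ByteBuffer buffer = ByteBuffer.allocate(256);
                int n = ((ReadableByteChannel) key.channel()).read(buffer);
                if (n > 0) {
                    buffer.flip();
                    String data = StandardCharsets.UTF_8.decode(buffer).toString();
                    ((Adapter) key.attachment()).onData(data);
                    dispatched++;
                }
            }
        }
        return dispatched;
    }

    /** Registers one non-blocking channel, writes to it, and runs the loop once. */
    static List<String> demo() {
        List<String> received = new ArrayList<>();
        try (Selector selector = Selector.open()) {
            Pipe pipe = Pipe.open();
            pipe.source().configureBlocking(false);
            pipe.source().register(selector, SelectionKey.OP_READ, (Adapter) received::add);
            pipe.sink().write(ByteBuffer.wrap("setup".getBytes(StandardCharsets.UTF_8)));
            runOnce(selector, 1000);
            pipe.sink().close();
            pipe.source().close();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return received;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

A real server would call runOnce in a loop and register one adapter per protocol connection; the thread only ever blocks inside select().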
>> One could still configure on the side how many threads, what protocol ran
>> in which thread, etc., which separates processing completely from threading.
>> Not a bad programming model, statemachines do well for reasoning about
>> concurrent events from multiple sources, as long as one avoids "state
>> explosion" problem (which fibers admittedly could help avoid, but purely
>> reactive state machines are easy to reason about as all state transitions
>> are expressed in the code as state tables, easy to see if one missed a
>> state/signal combination).
>> *Now at long last to the issue*
>> In this case, I'd probably want to integrate any protocol as completely
>> If this was Java and I wanted to integrate say JDBC into this, I might
>> want to have an async JDBC driver where I could get the selector, add it to
>> the main loop of the server and then perform queries async from
>> statemachines, all of it happening on the same thread. So setting the
>> statemachine to QUERYING state and to QUERIED or similar after received
>> The issue of using fibers here and call a blocking API is that the
>> statemachine would not be able to process other events while the query to
>> DB was going on, that call chain would get swapped out until DB returned
>> response, no matter what.
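
To make the QUERYING/QUERIED idea concrete, here is a small sketch in which the query result arrives as just another event, so the state machine keeps handling other events while the query is outstanding. The event names and the class are hypothetical, and the async driver call itself is only indicated by a comment since no such driver API is given here.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

// Illustrative only: event names and class are hypothetical.
public class QueryingStateMachine {
    enum State { IDLE, QUERYING, QUERIED }

    private final Queue<String> events = new ArrayDeque<>();
    private final List<String> handledMeanwhile = new ArrayList<>();
    private State state = State.IDLE;

    void post(String event) {
        events.add(event);
    }

    State state() {
        return state;
    }

    List<String> handledMeanwhile() {
        return handledMeanwhile;
    }

    /** Processes every queued event on the calling thread, never waiting for I/O. */
    void runUntilEmpty() {
        String event;
        while ((event = events.poll()) != null) {
            handle(event);
        }
    }

    private void handle(String event) {
        if (event.equals("start-query") && state == State.IDLE) {
            // A hypothetical async driver call would be issued here and return at once;
            // its completion would later be posted as "query-result".
            state = State.QUERYING;
        } else if (event.equals("query-result") && state == State.QUERYING) {
            state = State.QUERIED;
        } else {
            // Any other event is still processed while the query is in flight.
            handledMeanwhile.add(state + ":" + event);
        }
    }
}
```

Posting "start-query", "mute", "unmute" and then "query-result" leaves the machine in QUERIED with both call events handled during QUERYING, which is the behavior a blocking call in the handler would prevent.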
>> Now one could perhaps work around it by spawning threads instead, but I
>> am not quite sure how it would work out. A model where all processing is
>> done by message passing statemachines or protocol adapters is very simple
>> and clean, once used to the idea.
>> There are probably other such cases that one has not thought about, they
>> tend to turn up after a while...
>> The point here is that if main network entry points are implemented async
>> (eg. JDBC drivers) then one can easily implement reactive/statemachine
>> patterns on top of them while also easily add a synchronous API. The other
>> way around would require spawning threads, which seems needless given that
>> one could have the driver be async directly.
>> Thus as I see it now at least, one way (sync driver) has a risk of
>> somewhat shutting out or make more difficult perhaps rare but quite doable
>> implementation patterns, the other allows for both.
>> If driver is already async I/O, fiber has to do nothing particular but
>> for sync, yield when it hits the object.wait() that waits for completion of
>> async JDBC (if I have understood fibers correctly).
>> Nils Henrik Lorentzen
>> On Fri, May 24, 2019 at 6:56 PM Ron Pressler <ron.pressler at oracle.com>
>>> There are two different issues here: async IO and async APIs (or
>>> programming style).
>>> Fibers do async IO automatically given a synchronous API. I don’t think
>>> I understand your concern about deemphasizing async APIs.
>>> Is a server that uses blocking APIs with fibers considered a “pure async
>>> server” or not? If not, why not?
>>> Under the covers, only async IO is used. This is the same as in Erlang
>>> and Go.
>>> You also write that "The opposite requires threads to be spawned and
>>> that defeats the purpose of async for scalability/throughput”,
>>> but the whole point of lightweight threads is that spawning (and
>>> blocking) them is cheap so that it does not harm scalability/throughput.
>>> In what cases will fibers not suffice?
>>> On May 24, 2019 at 10:11:32 AM, Nils Henrik Lorentzen (
>>> nils.lorentzen at gmail.com) wrote:
>>> I am a longtime Java developer just becoming aware of project Loom and
>>> lightweight threads. It seems like an idea well worth implementing in the
>>> core JVM/libraries to make writing scalable server applications easier.
>>> From reading the proposal at
>>> https://cr.openjdk.java.net/~rpressler/loom/Loom-Proposal.html, I do
>>> have a
>>> few concerns though, if I have understood fibers correctly.
>>> Not subscribed to the list so sending this as food for thought.
>>> These are mainly related to that there probably will be corner cases
>>> one has to write async code, and fibers just will not suffice. These
>>> might not be known yet but will surface in the future.
>>> What is of concern is the statement "In addition to making concurrent
>>> applications simpler and/or more scalable, this will make life easier for
>>> library authors, as there will no longer be a need to provide both
>>> synchronous and asynchronous APIs for a different simplicity/performance
>>> I understand this is written with the best intentions, who wouldn't want to
>>> make life easier for library writers, and I have no intention to
>>> the author on this.
>>> What I am wary of here is that this might discourage providing async
>>> even at the low level, which will then make it way more difficult to
>>> pure async servers if need be. Or one just prefers that way of
>>> (better logs solve much of the no-proper-stacktrace issues, and better
>>> capabilities are also a plus in production deployed systems)
>>> Consider JDBC as an example, one is now at long last working on providing
>>> async JDBC drivers that can be useful for high throughput processing and
>>> reactive/async apps.
>>> When it comes to network communication, similarly to what the proposal
>>> states that async/await can be easily implemented by continuations, so
>>> a synchronous network driver API easily be made on top of an asynchronous
>>> driver. The opposite requires threads to be spawned and that defeats the
>>> purpose of async for scalability/throughput.
>>> Keep in mind that even for request/response protocols, the base
>>> communication is always async by nature. There is no blocking operation
>>> an ethernet card :) Thus async operation on top of a sync driver means
>>> async network => sync API => thread to simulate asynchronicity => async
>>> application, which is a long chain for something that was asynchronous in
>>> the first place.
>>> I would argue that for essential drivers (especially proprietary ones
>>> JDBC), one should always implement an async API at the base using NIO and
>>> then just have a generic sync wrapper on top.
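
The "generic sync wrapper on top" could be as small as the sketch below. AsyncDriver is a hypothetical interface invented for illustration, not the API of any real JDBC driver, and CompletableFuture is just one possible way an async driver might report completion.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Illustrative only: AsyncDriver is a made-up interface.
public class SyncOverAsync {
    /** Hypothetical async driver: returns immediately, completes the future later. */
    interface AsyncDriver {
        CompletableFuture<String> query(String sql);
    }

    /** Generic blocking wrapper: waits for the async result, with a timeout. */
    static String querySync(AsyncDriver driver, String sql, long timeoutMillis) {
        try {
            return driver.query(sql).get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting for query", e);
        } catch (ExecutionException | TimeoutException e) {
            throw new IllegalStateException("Query failed or timed out", e);
        }
    }

    public static void main(String[] args) {
        // Stand-in driver that completes at once; a real one would complete from its I/O loop.
        AsyncDriver driver = sql -> CompletableFuture.completedFuture("rows for: " + sql);
        System.out.println(querySync(driver, "SELECT 1", 1000));
    }
}
```

The wrapper needs no extra thread of its own: the caller simply waits on the future, which is where a thread (or a fiber) would be suspended.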
>>> Async driver at the core does imply either spawning a thread in the
>>> for its own select() mainloop or an API for integrating NIO Selectors
>>> another mainloop (eg. of an application server) but should be manageable.
>>> An example of this architecture is Erlang. From what I can tell, socket
>>> communication is non-blocking and done via message passing between
>>> processes. The trick (and elegance) of Erlang is that it has a "selective
>>> receive for messages" and from what I can tell, 'receive' is pretty much
>>> the only place in all of Erlang that it would suspend lightweight threads
>>> (probably a setjmp()/longjmp() libc call at that place in its VM).
>>> For an async network driver in Java would be the blocking API doing
>>> Object.wait()/notify() for threads. For suspend in fibers, the underlying
>>> sync/async wrapper implementation could continue the fiber when there is
>>> input (or on writeability for writes).
>>> Just raising a flag here a bit because even if it is not such now, it
>>> become a classic case of group think where async becomes discouraged, and
>>> then at some point one figures one needs it anyways. Except that all APIs
>>> have adopted synchronous functioning and it would be even more difficult
>>> convince someone provide async network drivers as they would argue that
>>> fibers should solve it, so no need for it.
>>> Lightweight threads have a bright future but hopefully not at the expense
>>> of tried and proven patterns for high throughput servers :)
>>> Kind regards,
>>> Nils Henrik Lorentzen