Compute needs are rapidly growing, and individual CPU speeds have not kept pace with that increase. The simple fact is they aren’t ever going to be able to - we face the everpresent constraint of the speed of light, regardless of how many transistors we can cram onto a chip. Sure, stacking CPU cores solves the raw processing needs, but just cobbling more cores into a single server can’t provide the redundancy for zero-downtime or the scale for massive computing jobs.
The Problem: Feeding data into the CPU
The fundamental problem is aligning data with compute. After all, the CPU needs to have some numbers to crunch. Moving data to the CPU for computation in a single machine is doable, but scaling such that we can execute computation on a multiplicity of machines presents a cornucopia of new challenges
We need a way to run code efficiently in a concurrent and distributed fashion.
Today’s State-of-the-Art: Event-driven architectures and their limitations
Today’s best-in-class architectures rely on a message/event-driven model. However, all event-driven system ultimately rely on centralized databases to store new data and make it available to other systems. That compels you either towards a monolithic database (and single point-of-failure) or islands of disconnected state that don’t reconcile with each other.
For many applications, relying on a centralized database is not an insurmountable hurdle. Modern databases are well-optimized and downright speedy. Latency can be overcome by vertical or horizontal scaling of your database. So what’s the problem? Data distribution. The best centralized database in the world is useless if client applications can’t connect to it (for example, during an outage). Large companies have recourse to the infrastructure necessary to mitigate downtime from database availability failures this, but most users sure don’t. Resiliency also becomes an increasing concern with this model, especially as you consider hardware outside of the datacenter.
Distributed Actors: Battle-tested for decades and ready for tomorrow’s challenges
Enter the distributed actor.
Actors are a programming model that addresses the requirements of concurrent computation by defining what each actor in a system can do and how they interact with one another. It allows for the breaking apart of massive and complex computation tasks into individual ones that can be distributed across multiple machines in multiple data centers.
And sure, if you squint, actors do look like messaging systems. After all, actors send messages to one another and react to the messages that they’ve received.
But there is an important difference in how actors and event systems operate. Actors define their identity in a stateful and evolving fashion. In other words, an actor contains both logic (code that describes how to manipulate data passed via a message) and the most recent data computed by the actor (also known as “state”).
For example, if we create a new actor and set its value to zero, the state of that actor is 0. If we send the actor an "add 1" message, the actor will add 1 to 0 and record its new state as 1. The "add 1" logic is the actor's code, the current value of the actor is its state.
Actors shine in a number of contexts, specifically because of this bundling of logic and state. With actors, code executes using the actor’s existing internal data (its state) in combination with new data passed to the actor. Everything the actor needs to do its job is locally-accessible - the actor has no need to connect to an external data source such as a message bus or database.
Building with an actor model provides an excellent foundation for first-class offline support - allowing applications to continue to function even disconnected from the network.
Actors at scale: Constructing infinitely complex systems from simple building blocks
In the distributed actor model, the developer defines the system. The system then defines the environment that an actor operates in. The actor itself is a fully self-contained atomic unit that is only responsible for its one job. Actors operate sequentially by default, but support concurrent execution, as well. The actor is only concerned with the messages it receives and how it reacts. It doesn’t share memory with other actors and other actors can never directly alter its state.
Composing actors together creates a complex system out of simple building blocks. Each actor is a block of encapsulated logic and state, and developers can utilize as many of these blocks as needed. If one actor fails, that compute job is rescheduled to an available node. When an actor fails, the system notifies other concerned actors of the failure. Those concerned actors can then restart themselves or otherwise react to the notification.
In addition to the complexity and resiliency benefits of actors, the actor model also provides comparatively lower latency. While starting and stopping atomic processes by the millions may seem computationally slow and expensive, actors are incredibly lightweight and they cold start and shut down in mere microseconds.
Distributed Stateful Actors: An architectural paradigm shift to meet tomorrow’s needs
Great speed, resiliency, availability, and offline support make actors a compelling model for the next generation of applications.
Mycelial believes that the future is highly concurrent. We need to be ready for problems that involve a medley of machines, latency tolerances and connectivity stories. Actors elegantly empower developers to tackle these challenges in a performant and flexible fashion.