Functional
Roc is designed to have a small number of simple language primitives. This goal leads Roc to be a functional language, while its performance goals lead to some design choices that are uncommon in functional languages.
Immutable by default
By default, Roc values are semantically immutable. In many languages, everything is mutable by default, and it's up to the programmer to "defensively" clone to avoid undesirable modification. Roc's approach means that cloning happens automatically, which can be less error-prone than defensive cloning (which might be forgotten), but which—to be fair—can also increase unintentional cloning. It's a different default with different tradeoffs.
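To make the tradeoff concrete, here is a minimal sketch of the mutable-by-default situation described above. It is written in Python rather than Roc, and the function names are hypothetical. The first function mutates its argument, so callers have to remember a defensive clone; the second returns a new value, which is roughly the behavior that semantic immutability gives without the caller doing anything.

```python
import copy


def add_tag_in_place(tags, tag):
    """Mutates the caller's list, as is typical in mutable-by-default code."""
    tags.append(tag)
    return tags


def add_tag_immutably(tags, tag):
    """Returns a new list and leaves the argument untouched."""
    return tags + [tag]


# Without a defensive clone, the caller's list changes underneath it.
original = ["a", "b"]
add_tag_in_place(original, "c")
assert original == ["a", "b", "c"]

# The defensive clone that the programmer has to remember to write.
safe_input = ["a", "b"]
add_tag_in_place(copy.copy(safe_input), "c")
assert safe_input == ["a", "b"]

# With an immutable-style function, no defensive clone is needed.
fresh_input = ["a", "b"]
result = add_tag_immutably(fresh_input, "c")
assert fresh_input == ["a", "b"]
assert result == ["a", "b", "c"]
```

The immutable-style version always builds a new list here, which is the "unintentional cloning" cost mentioned above.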
A reliability benefit of semantic immutability is that it rules out data races. These concurrency bugs can be difficult to reproduce and time-consuming to debug, and they are only possible through direct mutation.
Direct mutation primitives have benefits too. Some algorithms are more concise or otherwise easier to read when written with direct mutation, and direct mutation can make the performance characteristics of some operations clearer. To address this, Roc provides opt-in mutable variables (described in the next section), while keeping immutability as the default.
As such, Roc's design means that data races and reference cycles can be ruled out for the vast majority of code, and that functions tend to be more amenable to chaining, while mutable variables provide an escape hatch for algorithms where direct mutation leads to clearer code.
No reassignment or shadowing by default
In some languages, the following is allowed.
x = 1
x = 2
In Roc, that code is reported as an error, and you can only execute it by passing the --allow-errors flag.
That flag is intended to give you the freedom to quickly debug something or try something out even though some parts of the code contain errors.
For cases where reassignment is the most natural way to express something, Roc provides mutable variables. These are declared with var and marked with a $ prefix. For example, var $count = 0 declares a mutable variable that can later be reassigned with $count = $count + 1. The $ prefix makes it immediately clear at every use site that a value might change, preserving the readability benefits of immutability by default while providing a convenient way to express algorithms that are more natural with mutation.
Avoiding regressions
A benefit of this design is that it makes Roc code easier to rearrange without causing regressions. Consider this code:
func = |arg|
    greeting = "Hello"
    welcome = |name| "${greeting}, ${name}!"

    # …

    message = welcome("friend")

    # …
Suppose I decide to extract the welcome function to the top level, so I can reuse it elsewhere:
func = |arg|
    # …

    message = welcome("Hello", "friend")

    # …

welcome = |prefix, name| "${prefix}, ${name}!"
Even without knowing the rest of func, we can be confident this change will not alter the code's behavior.
In contrast, suppose Roc allowed reassignment. Then it's possible something in the # … parts of the code could have modified greeting before it was used in the message = declaration. For example:
func = |arg|
    greeting = "Hello"
    welcome = |name| "${greeting}, ${name}!"

    # …

    if someCondition
        greeting = "Hi"
        # …
    else
        # …

    # …
    message = welcome("friend")
    # …
If we didn't read the whole function and notice that greeting was sometimes (but not always) reassigned from "Hello" to "Hi", we might not have known that changing it to message = welcome("Hello", "friend") would cause a regression due to having the greeting always be "Hello".
Even if Roc disallowed reassignment but allowed shadowing, a similar regression could happen if the welcome function were shadowed between when it was defined here and when message later called it in the same scope. Because Roc allows neither shadowing nor reassignment for regular bindings, these regressions can't happen, and rearranging code can be done with more confidence. (Mutable variables, with their $ prefix, make it obvious which names can change.)
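The regression described above can be reproduced in a language that does allow reassignment. The following is a minimal Python sketch, not Roc; the function names func_with_reassignment and func_after_refactor are hypothetical. Python closures read the current value of greeting when they are called, so they stand in for the imagined "Roc with reassignment".

```python
def func_with_reassignment(some_condition):
    greeting = "Hello"

    def welcome(name):
        # Reads whatever value `greeting` holds when welcome is called.
        return f"{greeting}, {name}!"

    if some_condition:
        greeting = "Hi"  # reassignment that is easy to miss in a long function

    return welcome("friend")


def welcome(prefix, name):
    """The extracted top-level version, taking the greeting as an argument."""
    return f"{prefix}, {name}!"


def func_after_refactor(some_condition):
    # The refactor hard-codes "Hello" and silently drops the "Hi" case.
    return welcome("Hello", "friend")


# The two versions agree when the reassignment does not happen...
assert func_with_reassignment(False) == func_after_refactor(False)

# ...and disagree when it does: this is the regression.
assert func_with_reassignment(True) == "Hi, friend!"
assert func_after_refactor(True) == "Hello, friend!"
```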
Mutable variables work naturally with Roc's for loop syntax. For example, here's a function that sums a list of numbers:
sum = |num_list| {
    var $total = 0

    for num in num_list {
        $total = $total + num
    }

    $total
}
Looping can also be done with convenience functions like List.walk or with recursion (Roc implements tail-call optimization).
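As a rough illustration of how these three styles relate, here is the same sum written three ways in Python (not Roc). This is a sketch under the assumption that functools.reduce plays a role comparable to a fold such as List.walk; the function names are hypothetical.

```python
from functools import reduce


def sum_with_loop(num_list):
    """Loop with a reassigned accumulator, like the mutable-variable version."""
    total = 0
    for num in num_list:
        total = total + num
    return total


def sum_with_fold(num_list):
    """Fold: the accumulator is threaded through a function, never reassigned."""
    return reduce(lambda total, num: total + num, num_list, 0)


def sum_with_recursion(num_list, index=0, total=0):
    """Tail-recursive style: the recursive call is the last thing that happens."""
    if index == len(num_list):
        return total
    return sum_with_recursion(num_list, index + 1, total + num_list[index])


numbers = [1, 2, 3, 4]
assert sum_with_loop(numbers) == 10
assert sum_with_fold(numbers) == 10
assert sum_with_recursion(numbers) == 10
```

Python does not perform tail-call optimization, so the recursive version here is limited by the interpreter's recursion depth; that limit is what tail-call optimization removes.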
Managed effects over side effects
Many languages support first-class asynchronous effects, which can improve a system's throughput (usually at the cost of some latency), especially in the presence of long-running I/O operations like network requests.
Asynchronous effects are commonly represented by a value such as a Promise or Future (Roc calls these Tasks), which represents an effect to be performed. Tasks can be composed together, potentially while customizing concurrency properties and supporting I/O interruptions like cancellation and timeouts.
Most languages also have a separate system for synchronous effects, namely side effects. Having two different ways to perform every I/O operation—one synchronous and one asynchronous—can lead to a lot of duplication across a language's ecosystem.
Instead of having side effects, Roc functions exclusively use managed effects in which they return descriptions of effects to run, in the form of Tasks. Tasks can be composed and chained together, until they are ultimately handed off (usually via a main function or something similar) to an effect runner outside the program, which actually performs the effects the tasks describe.
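The idea of returning descriptions of effects and handing them to a runner can be sketched in a few lines. The following is an illustrative Python model, not Roc's Task API: the WriteLine, ReadLine, Done, and Then types and the run function are all hypothetical names invented for this sketch. For simplicity the runner lives in the same program here, whereas the text above places it outside the program.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class WriteLine:
    text: str  # describes "print this line"; printing happens in the runner


@dataclass(frozen=True)
class ReadLine:
    pass  # describes "read one line of input"


@dataclass(frozen=True)
class Done:
    value: Any  # a task that performs nothing and yields a value


@dataclass(frozen=True)
class Then:
    first: Any                  # the task to run first
    next: Callable[[Any], Any]  # builds the following task from its result


def greet():
    """Builds a description of effects and performs none of them."""
    return Then(
        WriteLine("What's your name?"),
        lambda _: Then(
            ReadLine(),
            lambda name: Then(WriteLine(f"Hello, {name}!"), lambda _: Done(name)),
        ),
    )


def run(task, read_line, write_line):
    """Effect runner: the only place where effects are actually performed."""
    if isinstance(task, Done):
        return task.value
    if isinstance(task, WriteLine):
        write_line(task.text)
        return None
    if isinstance(task, ReadLine):
        return read_line()
    if isinstance(task, Then):
        result = run(task.first, read_line, write_line)
        return run(task.next(result), read_line, write_line)
    raise TypeError(f"unknown task: {task!r}")


# Running the description with simulated effects instead of real I/O.
output = []
inputs = iter(["Ada"])
result = run(greet(), read_line=lambda: next(inputs), write_line=output.append)
assert result == "Ada"
assert output == ["What's your name?", "Hello, Ada!"]
```

Because greet only builds a value, swapping the simulated handlers for real ones (for example input and print) changes what happens without changing greet itself.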
Having only (potentially asynchronous) managed effects and no (synchronous) side effects both simplifies the language's ecosystem and makes certain guarantees possible. For example, the combination of managed effects and semantically immutable values means all Roc functions are pure—that is, they have no side effects and always return the same answer when called with the same arguments.
Pure functions
Pure functions have some valuable properties, such as referential transparency and being trivial to memoize. They also have testing benefits; for example, Roc tests that either use simulated effects or do not involve Tasks at all can never flake. They either consistently pass or consistently fail. Because of this, their results can be cached, so roc test can skip re-running them unless their source code (including dependencies) changed. (This caching has not yet been implemented, but is planned.)
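As a small illustration of why purity makes memoization trivial, here is a Python sketch (not Roc) using the standard functools.lru_cache decorator. The shipping_cost_cents function and its formula are hypothetical. Because the result depends only on the arguments, returning a cached result is indistinguishable from recomputing it.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def shipping_cost_cents(weight_kg, distance_km):
    """Pure: the result depends only on the two arguments."""
    return 150 * weight_kg + 2 * distance_km


first = shipping_cost_cents(10, 250)   # computed
second = shipping_cost_cents(10, 250)  # served from the cache

assert first == second == 2000

# lru_cache records that the second call never re-ran the function body.
info = shipping_cost_cents.cache_info()
assert info.misses == 1
assert info.hits == 1
```

The same reasoning is what would allow a test runner to reuse a previous result for a test whose inputs and source have not changed.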
Roc does support tracing via the dbg keyword, an essential debugging tool which is unusual among side effects in that using it should not affect the behavior of the program. As such, it typically does not impact the guarantees of pure functions in practice.
Pure functions are notably amenable to compiler optimizations, and Roc already takes advantage of them to implement function-level dead code elimination. Here are some other examples of optimizations that will benefit from this in the future; these are planned, but not yet implemented:
- Loop fusion, which can do things like combining consecutive List.map calls (potentially intermingled with other operations that traverse the list) into one pass over the list.
- Hoisting, which moves certain operations outside loops to prevent them from being re-evaluated unnecessarily on each step of the loop. It's always safe to hoist calls to pure functions, and in some cases they can be hoisted all the way to the top level, at which point they become eligible for compile-time evaluation.
There are other optimizations (some of which have yet to be considered) that pure functions enable; this is just a sample!
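To show what loop fusion and hoisting do, here are both transformations applied by hand in Python (not Roc). This is only a sketch of the before and after shapes, with hypothetical function names; it says nothing about how a compiler would implement them. Both rewrites are safe here only because the functions involved are pure.

```python
import math


def double(x):
    return x * 2


def increment(x):
    return x + 1


def two_passes(nums):
    """Before fusion: two traversals and an intermediate list."""
    doubled = [double(n) for n in nums]
    return [increment(n) for n in doubled]


def fused(nums):
    """After fusion: one traversal, no intermediate list."""
    return [increment(double(n)) for n in nums]


def scale_unhoisted(nums, factor):
    """Before hoisting: sqrt(factor) is recomputed on every step."""
    out = []
    for n in nums:
        norm = math.sqrt(factor)
        out.append(n * norm)
    return out


def scale_hoisted(nums, factor):
    """After hoisting: the loop-invariant pure call runs once."""
    norm = math.sqrt(factor)
    return [n * norm for n in nums]


assert two_passes([1, 2, 3]) == fused([1, 2, 3]) == [3, 5, 7]
assert scale_unhoisted([1, 2], 4) == scale_hoisted([1, 2], 4) == [2.0, 4.0]
```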
Get started
If this design sounds interesting to you, you can give Roc a try by heading over to the tutorial!