This post was published by Yison on the ScalaCool team blog.

The previous article introduced the mindset of functional programming; this time we will look at where its charm comes from.

As we’ve said, the heart of functional programming is “composition.” But what exactly are these composable “functional components”? In other words, what laws and principles must they conform to?

What we do know is that functional programming is a process that closely resembles mathematical reasoning. So let’s ask: what characterizes reasoning in mathematics?

We soon discover one of its greatest advantages: as long as each step of the reasoning is correct, the conclusion is guaranteed to be true.

In fact, this is exactly one of the great advantages of functional programming that this article describes: “equational reasoning.”

So let’s go one step further and ask: what principles and mechanisms make functional programming so special?

Referential transparency

The answer is referential transparency, which has roughly the same definition in both mathematics and computer science.

An expression is said to be referentially transparent if it can be replaced with its corresponding value without changing the program’s behavior. As a result, evaluating a referentially transparent function gives the same value for the same arguments. Such functions are called pure functions.

In other words, an expression can be replaced by its value anywhere in a program without affecting the result. A function that always produces the same result for the same input is referentially transparent and can be called a “pure function.”

Here’s an example:

def f(x: Int, y: Int) = x + y

println(f(2, 3))

We could substitute 5 for f(2, 3) anywhere in the program, and it wouldn’t make any difference.

That’s easy enough to understand, but what does the opposite, a non-referentially-transparent expression, look like?

Here’s another example:

var a = 1

def count(x: Int) = {
  a = a + 1
  x + a
}

count(1) // 3
count(1) // 4

In the code above, repeated calls to count(1) return different results. This is because count mutates the external variable a, and that mutation is what we call a side effect.
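
As a side note, here is a minimal sketch of one way to remove that side effect: pass the state in and return the new state alongside the result. The name pureCount is just for illustration; given the same inputs, it always returns the same pair.

// A hypothetical pure alternative: the state is threaded through explicitly.
def pureCount(x: Int, a: Int): (Int, Int) = {
  val next = a + 1
  (x + next, next)
}

pureCount(1, 1) // (3, 2)
pureCount(1, 1) // (3, 2): same inputs, same result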

Side effects

A side effect is any observable change to the world outside the function itself, for example:

  • Changing the value of an external variable
  • IO operations, such as writing data to disk
  • UI operations, such as changing the enabled state of a button

It’s not hard to see that side effects usually go hand in hand with “mutable data” and “shared state.” A familiar example is lock contention, which shows up as soon as we use multiple threads to handle high concurrency. In functional programming, by contrast, thanks to referential transparency and immutable data, we can even compose two functions that return asynchronous results, which improves our ability to reason about the code and reduces the complexity of the system.

In summary, referential transparency guarantees the independence of our “functional components”: they are isolated from the outside world, can be analyzed separately, and are therefore easy to compose and reason about.

Note: such an asynchronous function could, for example, wrap a database read or write; we will explain this in a later article.
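
To give a flavor of what composing asynchronous results can look like, here is a minimal sketch using Scala’s standard Future (which is eager and not strictly referentially transparent, so treat this purely as an illustration); findUserName and findOrderCount are hypothetical lookups.

import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

// Hypothetical asynchronous lookups; in a real system these could be database reads.
def findUserName(id: Int): Future[String] = Future(s"user-$id")
def findOrderCount(name: String): Future[Int] = Future(name.length)

// Each step depends only on its input, so the two async results
// compose declaratively in a for-comprehension.
val report: Future[String] =
  for {
    name  <- findUserName(42)
    count <- findOrderCount(name)
  } yield s"$name has $count orders"

println(Await.result(report, 1.second))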

Immutability

We have already mentioned that immutability is a key ingredient of referential transparency. In Haskell, every binding is immutable; in Scala we can declare immutable values with val (instead of var).

More and more programming languages support this feature, such as let in Swift and const in ES6, and so do well-known open source projects such as Facebook’s Immutable.js.
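
As a quick Scala illustration (the commented-out line is intentionally invalid):

// An immutable binding: reassignment is a compile-time error.
val limit = 10
// limit = 11   // does not compile: reassignment to val

// A mutable binding: allowed, but it opens the door to side effects.
var counter = 10
counter = 11

// Scala's default collections are immutable too: "updating" returns a new value.
val xs = List(1, 2, 3)
val ys = 0 :: xs // List(0, 1, 2, 3); xs is unchanged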

So, are we done with the “referential transparency” part?

Wait. A key point of referential transparency, as mentioned earlier, is producing the same result for the same input. Let’s take that one step further: what does “same result” actually mean? Is it just returning the same value?

The substitution model

Let’s take a look at this code, which conforms to what we call referential transparency:

def f1(x: Int, y: Int) = x
def f2(x: Int): Int = f2(x)
f1(1, f2(2))

This is suicidal code: if we run it, f2 calls itself over and over again, resulting in an infinite loop.

So it seems we have an answer: the so-called “same result” can also be an infinite loop…

A Haskell programmer walks by, smiles, and takes ten seconds to translate it into the following version:

f1 :: Int -> Int -> Int
f1 x y = x

f2 :: Int -> Int
f2 x = f2 x

Load these functions in GHCi and call f1 1 (f2 2), and to our surprise the result 1 is returned successfully. What’s going on here?

Applicative order vs. normal order

Many developers have probably never thought about this question: what is the evaluation strategy for expressions in a programming language?

In fact, there are two different substitution models in programming languages: applicative order and normal order.

The languages most of us are familiar with, such as Scala, C, and Java, use applicative order: before a procedure is executed, its arguments are evaluated first. This is exactly why the Scala code above loops forever: f2 keeps calling itself.

Haskell, however, works differently: it delays evaluating an argument until it is actually needed. This is known as normal order, or lazy evaluation. When we call f1 1 (f2 2), f2 2 is never evaluated, because f1 does not need y at all; the call simply returns x, which is 1.

Note: you may know the distinction above as “call by value” versus “call by name.”
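
In Scala we can opt into something similar per parameter with by-name parameters. Here is a small sketch showing how the earlier example terminates once y is passed by name:

// y is a by-name parameter: it is evaluated only if (and when) the body uses it.
def f1(x: Int, y: => Int): Int = x

def f2(x: Int): Int = f2(x) // still loops forever if it is ever evaluated

f1(1, f2(2)) // returns 1; f2(2) is never evaluated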

So what are the benefits of this?

Lazy evaluation

Haskell is lazily evaluated by default, and in other languages (such as Scala and Swift) we can also declare lazy values using the lazy keyword.

Lazy evaluation has many advantages, such as the “infinite list” structures some readers may know. Of course, it also brings problems: it makes the program’s evaluation model more complex, and abusing it can hurt performance.
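
As a small sketch (assuming Scala 2.13 or later for LazyList), lazy val delays and caches a computation, and LazyList gives us an infinite list where only the demanded elements are ever computed:

// Evaluated only on first use, then cached.
lazy val settings: String = { println("loading settings"); "ok" }

// An infinite list of natural numbers, built lazily.
val naturals: LazyList[Int] = LazyList.from(0)

// Only the elements we actually demand are computed.
println(naturals.filter(_ % 2 == 0).take(5).toList) // List(0, 2, 4, 6, 8)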

We won’t weigh the pros and cons of lazy evaluation here; it is not an easy question. So why introduce it at all?

Because it is closely connected to what we want to discuss next.

How to compose side effects

The functional programming mindset is about abstracting and composing everything, including real-world side effects.

So how can common side effects, such as IO operations, be composed?

Here’s an example:

println("I am a IO operation.")

Obviously, println is not a pure function, so it does not compose well. How can we solve this problem?

Take a look at how lazy evaluation is implemented in Haskell.

Thunk

A thunk is a value that is yet to be evaluated. It is used in Haskell systems that implement non-strict semantics by lazy evaluation.

Lazy evaluation in Haskell is implemented with a mechanism called the thunk. We can simulate a similar effect in other programming languages by wrapping a computation in a thunk function.

Thunks are easy to understand. For example, we can make the impure function above lazy:

object Pure {
  def println(msg: String): () => Unit = () => Predef.println(msg)
}

Now, when our program calls Pure.println("I am an IO operation."), it simply returns a function that will print when invoked; the value itself is lazy and substitutable. In this way, we can compose these IO operations throughout the program and execute them only at the end.
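
Here is a minimal sketch of what that could look like with the Pure object above (the composition is hand-rolled here just for illustration):

// Two pure descriptions of IO; nothing is printed yet.
val step1 = Pure.println("I am an IO operation.")
val step2 = Pure.println("I am another IO operation.")

// Compose the descriptions into a single program...
val program: () => Unit = () => { step1(); step2() }

// ...and run the side effects only at the very end.
program()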

You might also wonder when these thunk functions actually get called, and how we can avoid thunks scattered all over our business logic if we write code this way.

This has to do with the so-called Free Monad, which we’ll cover in a future article.

Conclusion

This second article has explored several more features and advantages of functional programming, and still hasn’t mentioned Cats. Be patient. In the next installment we’ll get down to business, and we plan to start with “higher-kinded types.”