Response to the RIO monad

This is a response to The RIO Monad. It currently makes the most sense if read side-by-side with that post.

pointwise

What’s happening in Stack?

This seems like a great description of how Stack bootstraps itself – each layer of config informing the next. But I feel like the description belies the need for any kind of hierarchy. E.g., once we load up the logging config, we should turn it into a function that does all the right coloring and forget about the config that mentions coloring and log level. It’s in the function now. We should also no longer care about “do we use the system-wide GHC” (Bool or otherwise). After we read that, we just have a path (or richer structure) to the GHC we’re using.
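A minimal sketch of "it’s in the function now", using hypothetical names (Level, LogConfig, render, mkLogger) rather than anything from Stack’s actual code base. The config is consumed once to build a logging function, and nothing downstream needs to see the config again.

```haskell
-- Hypothetical names throughout; this is not Stack's actual API.
data Level = Debug | Info | Warn | Error
  deriving (Eq, Ord, Show)

data LogConfig = LogConfig
  { minLevel :: Level
  , useColor :: Bool
  }

-- Pure rendering: 'Nothing' means the message is below the threshold.
render :: LogConfig -> Level -> String -> Maybe String
render cfg lvl msg
  | lvl < minLevel cfg = Nothing
  | useColor cfg       = Just (color lvl ++ label ++ "\ESC[0m " ++ msg)
  | otherwise          = Just (label ++ " " ++ msg)
  where
    label = "[" ++ show lvl ++ "]"
    color Error = "\ESC[31m"
    color Warn  = "\ESC[33m"
    color _     = "\ESC[36m"

-- Consume the config once; callers only ever see the resulting function.
mkLogger :: LogConfig -> (Level -> String -> IO ())
mkLogger cfg lvl msg = maybe (pure ()) putStrLn (render cfg lvl msg)
```

Code that needs to log would then take a Level -> String -> IO () argument and never learn that a coloring flag or a log level setting existed.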

“Some parts of our code base are pure. We’ll ignore those for now, and focus only on the parts that perform some kind of IO.” – this is an ominous statement. Even in a rich GUI application, very little of the code should have anything to do with IO. Also, the introduction of IO doesn’t mean you should change anything else about your code … but you should still minimize where IO shows up. It’s a big hammer.

Pass individual values

Pass composite values

ReaderT IO

Monad transformers aren’t a solution – this is a whole thing. Instead of dealing with stacks, they’re another place we should be using the pain of awkward types to point us toward a refactoring. Notice the readConfig example above. There is IO in some places and Reader (aka (->)) in others, but neither exists in both. This gives you a much more testable API, among other things.

What transformers are is a newtype for selecting a particular Monad instance.

The example in this section finally introduces why someFunc has IO – logging. We’ll come back to that.

MonadLogger

This section is absolutely right.

Custom ReaderT

Nope. No no no.

The LoggingT situation leading to here is exactly the kind of thing that should make you rethink what you’re doing, but newtyping a newtype (which isn’t completely unreasonable in some cases) is not the way to do it.

Be more general!

Monad* classes are a patch over transformers. For most of them, there is one “best” representation, akin to the notion of a universal construction. E.g., MonadIO says, “there is IO in this stack.” If you take the advice from earlier and use stacks as a hint to refactor, “there is IO in this stack” and IO are isomorphic.

Has type classes

HasFoo a ~ Foo :<: a ~ Lens' a Foo

and there’s also

AsFoo a ~ a :>: Foo ~ Prism' a Foo

Look at that nice duality, and the lack of type class proliferation. Each one requires a lawful optic.

I don’t think this part is necessarily bad (other than wanting to avoid the type class proliferation), but I don’t think the composite values have been justified quite yet.

Exception handling

Please god, not MonadBaseControl.

But more generally – why are you even using exceptions? Either is the correct answer here. And this could get into a whole thing about duoids (like the relationship Either and Validation have to each other).
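To make the Either/Validation relationship concrete, here is a small sketch. Validation is defined locally for illustration, and User, checkName, checkAge, userE and userV are hypothetical names. The same checks are combined twice: Either stops at the first failure, while Validation collects all of them.

```haskell
-- 'Validation' is defined locally for illustration only.
data Validation e a = Failure e | Success a
  deriving (Eq, Show)

instance Functor (Validation e) where
  fmap _ (Failure e) = Failure e
  fmap f (Success a) = Success (f a)

-- Unlike Either, this Applicative accumulates every failure.
instance Semigroup e => Applicative (Validation e) where
  pure = Success
  Failure e1 <*> Failure e2 = Failure (e1 <> e2)
  Failure e1 <*> Success _  = Failure e1
  Success _  <*> Failure e2 = Failure e2
  Success f  <*> Success a  = Success (f a)

data User = User { name :: String, age :: Int }
  deriving (Eq, Show)

checkName :: String -> Either [String] String
checkName n = if null n then Left ["empty name"] else Right n

checkAge :: Int -> Either [String] Int
checkAge a = if a < 0 then Left ["negative age"] else Right a

toValidation :: Either e a -> Validation e a
toValidation = either Failure Success

-- Sequential composition: stops at the first error.
userE :: String -> Int -> Either [String] User
userE n a = User <$> checkName n <*> checkAge a

-- Parallel composition: reports all errors.
userV :: String -> Int -> Validation [String] User
userV n a = User <$> toValidation (checkName n) <*> toValidation (checkAge a)
```

Neither version throws; failure is an ordinary value that the caller has to handle.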

Introducing MonadUnliftIO

Nooooooooo!

Avoid MonadCatch m => StateT s m a problems by avoiding state stacks, not by adding more magic in the stack classes.

More concrete

“We’re not writing a generally-consumable library. We’re writing an application.” – This is the wrong mindset. It forces contributors (both different teams on a corporate project and someone trying to make a PR in OSS) to have much more of the project in their head in order to make a change. An application should be thought of as a number of interacting libraries with a relatively thin layer that really turns it into an application.

“We’ve already added a MonadUnliftIO constraint on our functions” – but you shouldn’t have. Again, those classes are just duct tape on code that should really be refactored.

Should we be that general?

Sure – not much here.

Do we need a transformer?

No. Lowering the transformer here totally makes sense.

The m*n instance problem

Same. Monad* is totally an anti-pattern. Glad to hear it.

Why not ReaderT?

So, I agreed that ReaderT doesn’t buy us anything (and, in general, transformers should be seen as instance selectors), but I don’t see how ReaderT is a problem compared to RIO. If we’re using

Some notes on the Has typeclasses

Can we get the superclass behavior with :<:?

class element :<: container where
  get :: container -> element
  set :: container -> element -> container

-- | Every instance of `(:<:)` must form a lawful `Lens'`.
-- >>> view (has @Runner) someConfig :: Runner
has :: element :<: container => Lens' container element
has = lens get set

-- | covers `HasRunner Runner`, `HasConfig Config`, etc.
instance a :<: a where
  get = id
  set = const id

-- | reversed version of `class HasRunner env => HasConfig env`, which simultaneously provides `HasRunner Config`, `HasRunner BuildConfig`, etc.
instance Config :<: a => Runner :<: a where
  get = configRunner . get
  -- update the nested record, then write it back into the container
  set old new = set old ((get old) { configRunner = new })

instance BuildConfig :<: a => Config :<: a where
  get = buildConfigConfig . get
  set old new = set old ((get old) { buildConfigConfig = new })

instance EnvConfig :<: a => BuildConfig :<: a where
  get = envConfigBuildConfig . get
  set old new = set old ((get old) { envConfigBuildConfig = new })

Now you only need to define the highest instance in the tree – and often not even that. E.g., we already have `Runner :<: EnvConfig` in the above definitions.

Have all the desirable behavior, fewer type classes, more laws, and fewer instances!

How about pure code?

The last option here is what I strive for. Again, the rest are usually hints that you need to refactor. My overarching point is that you shouldn’t stop doing it this way when IO shows up. It’s even more important to not have stacks when IO is around.

Using RIO in your application

Don’t.

Additional points

Eliminating boolean blindness

I think the simplest example of this is probably filter. Its type has a Bool in it: filter :: (a -> Bool) -> [a] -> [a]. There’s only one boolean, but people still often forget what it means (i.e., does True mean “filter it out” or “keep it”?). We can answer the question decisively by changing the type to filter :: (a -> Maybe b) -> [a] -> [b]. The only things we can keep in the list are bs (b may be the same type as a), but we need a Just to keep something. If we weakened the type just a bit, by replacing b with a, we still have a better hint than the original Bool version, but it’s still possible to implement it either way:

-- filter _out_ the elements that return `Just`
filter test =
  foldr
    (\a acc -> case test a of
        Nothing -> a : acc
        Just _  -> acc)
    []

That implementation is counter-intuitive, but still legit. The version with b makes it impossible.
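The Maybe-returning type already exists in base as Data.Maybe.mapMaybe :: (a -> Maybe b) -> [a] -> [b]. A short sketch of using it (halveEvens and keepIf are hypothetical names for illustration):

```haskell
import Data.Maybe (mapMaybe)

-- Keep only the even numbers, and keep them as their halves.
halveEvens :: [Int] -> [Int]
halveEvens = mapMaybe (\n -> if even n then Just (n `div` 2) else Nothing)

-- A Bool-style filter recovered from the Maybe-typed one:
-- 'Just' can only mean "keep", because it is the only source of output values.
keepIf :: (a -> Bool) -> [a] -> [a]
keepIf p = mapMaybe (\a -> if p a then Just a else Nothing)
```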

There are a few similar tricks like this for eliminating Bool (and other enum-like cases). Another one is that if you have some flag that triggers one side or the other of a branch,

foo :: Bool -> W
foo myFlag =
  ...
  if myFlag
    then doA x y z
    else doB x y
  ...

you can give control to the caller by accepting a function instead of the Bool.

foo :: (X -> Y -> Z -> V) -> Q
foo fn =
  ...
  fn x y z
  ...

doA, doB :: X -> Y -> Z -> V
doA x y z = ...
doB x y _z = ...
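As a concrete, entirely hypothetical instance of the same move (greetFlag and greetWith are made-up names), compare a flag-taking function with one that takes the behavior itself:

```haskell
import Data.Char (toUpper)

-- Bool version: the caller has to remember what 'True' means.
greetFlag :: Bool -> String -> String
greetFlag shout name =
  "hello, " ++ (if shout then map toUpper name else name)

-- Function version: the caller passes the behavior, so there is no flag to misread.
greetWith :: (String -> String) -> String -> String
greetWith style name = "hello, " ++ style name
```

A call like greetWith (map toUpper) "bob" says what it does at the call site, where greetFlag True "bob" does not.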

There are already blog posts on this … should just link to one or two of those.

Logging should be outside of your pure code

There are multiple kinds of logging, but they’re often conflated. I feel like it’s important to distinguish at least two kinds here (even though their solution is similar), because Stack’s kind is not the one that I think comes up more often in practice.

First, some points that apply to both:

Those two points may sound contradictory, so let’s look at how we can apply them both.

service debugging

Most logging is used in running distributed services to try to figure out what happened when, and make it possible to replicate some failure or other behavior. This is the more common case, I think, and so I’ll cover it first.

Actually fixing a bug in a distributed service is something that should be done locally … or at least on machines distinct from prod. In order to make that possible, you need to start by creating an environment where you can reliably replicate the bug. Then you can fix it, ensure it’s fixed, and deploy a new version.

Logging mostly exists to make it possible to figure out how to create an environment that matches the one where the error appeared.

Since pure functions don’t admit IO (and therefore, don’t admit logging as described in our earlier points), the “deepest” that we can log is the point at which we call a pure function. So, let’s do that … right around the call to a pure function, we log its inputs and its result.

Something goes wrong. We look at the logs and see that the result of the call to that pure function is not what we’d expect given the inputs. Well, darn … now we know something is wrong inside that function, but we have no logging information inside it, so we’re stuck. Right?

Not at all! It’s a pure function, which means that when we call it with those same arguments, we should get the same result. We quickly load the module into a REPL (make sure you have the same SHA as the deployed service!) and try it out. Sure enough, we get the same incorrect result. We’ve managed to perfectly replicate enough of the server environment (basically none) to reliably reproduce the bug. Now it’s a simple matter of local debugging, with all our usual local tooling (including tracing). No need for adding any effects to our pure functions.
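A rough sketch of logging "right around the call to a pure function". logCall and percentOf are hypothetical names, and the logger is passed in as a plain function so that no particular logging library is assumed.

```haskell
-- Hypothetical helper: log a pure function's input and result at the call site.
logCall :: (Show a, Show b) => (String -> IO ()) -> String -> (a -> b) -> a -> IO b
logCall logger label f x = do
  let y = f x
  logger (label ++ " input=" ++ show x ++ " result=" ++ show y)
  pure y

-- A stand-in pure function with a deliberate bug (it should divide by 100).
percentOf :: (Int, Int) -> Int
percentOf (pct, total) = pct * total `div` 10
```

With a log line recording the input (50,200) and a result that is clearly wrong, running percentOf (50, 200) in a REPL reproduces the bug with no server environment at all.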

user information

The Stack use case for logging is different. It’s providing information to the user (instead of the developer) to let them see what’s happening along the way. They don’t care what part of your code is pure or not, they just want to see various pieces of information as they’re available. Often this information is not stuff that gets returned directly, but perhaps has already been processed into some other form by the time you’re back at a level with IO.

How to refactor your stacks away

This is really a whole separate blog post, but when you see yourself with a monad stack, don’t try to hide it; understand that the types are telling you to refactor your code. It’s generally quite easy to get IO out of a stack:

All of these things should be pushed out from otherwise pure functions, and generally each of them should also have a separate function that is either String -> Either E A or A -> String to convert to/from some more structured data than whatever IO gives you.
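One way this separation could look, as a sketch with hypothetical names (Port, PortError, parsePort, renderPort, readPortFile): the String -> Either E A and A -> String functions are pure, and the only IO is a thin wrapper that fetches the raw String.

```haskell
import Text.Read (readMaybe)

-- Hypothetical structured type and error type, for illustration only.
newtype Port = Port Int
  deriving (Eq, Show)

data PortError = NotANumber String | OutOfRange Int
  deriving (Eq, Show)

-- String -> Either E A
parsePort :: String -> Either PortError Port
parsePort s = case readMaybe s of
  Nothing -> Left (NotANumber s)
  Just n
    | n < 1 || n > 65535 -> Left (OutOfRange n)
    | otherwise          -> Right (Port n)

-- A -> String
renderPort :: Port -> String
renderPort (Port n) = show n

-- The only IO: read the raw String and hand it straight to the pure parser.
readPortFile :: FilePath -> IO (Either PortError Port)
readPortFile path = parsePort <$> readFile path
```

Everything worth testing lives in parsePort and renderPort, which need no files, mocks, or monad stack to exercise.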

Greg Pfeil 26 February 2020