Transducers are coming to Clojure

  • This sort of reminds me of the Church-encoded form of a list.

        newtype Fold a = Fold (forall r . (a -> r -> r) -> r -> r)
    
        fold :: [a] -> Fold a
        fold xs = Fold (spin xs) where
          spin []     cons nil = nil
          spin (a:as) cons nil = cons a (spin as cons nil)
    
        refold :: Fold a -> [a]
        refold (Fold f) = f (:) []
    
    Notably, since `fold` and `refold` are isomorphisms then we can do everything we can do to `[a]` to `Fold a`

        map :: (a -> b) -> (Fold a -> Fold b)
        map x (Fold f) = Fold $ \cons nil -> f (cons . x) nil
    
        filter :: (a -> Bool) -> Fold a -> Fold a
        filter p (Fold f) =
          Fold $ \cons nil -> f (\a r -> if p a then cons a r else r) nil
    
    but all of this work is done without concrete reference to `(:)` and `[]`... you instead just use stand-ins I've been calling cons and nil. What's nice about this is that `Fold` can be used to build anything which can be "constructed from the left"

        foldSet :: Fold a -> Set a
        foldSet (Fold f) = f Set.insert Set.empty
    
    It's sort of dual to the stuff I was exploring in Swift here [0]. It also creates laziness for free because you can't really execute the chain until the end—Church-encoding is really a form of continuation passing.

    The downside of this idea is that each time you "consume" a Fold you redo work—there's no place to put caching necessarily.

    Maybe that's what they're solving with the Fold transformers representation.

    [0] http://tel.github.io/2014/07/30/immutable_enumeration_in_swi...

  • I'm not quite sure what this means, so here's my attempt to translate this into Python. A reducer is a function such as `add`:

        def add(sum, num):
          return sum + num
      
    Of course you can plug `add` directly in `reduce(add, [1, 2, 3], 0)` which gives `6`.

    A transducer is an object returned by a call such as `map(lambda x: x + 1)`.

    You can now apply the transducer to a reducer and get another reducer.

        map_inc = map(lambda x: x + 1)
        add_inc = map_inc(add)
        
    Our first reducer simply added, but the next one increments and then adds. We can use it as `reduce(add_inc, [1, 2, 3], 0)` which gives, I'm guessing, `9`.

    Since the transducer returns a reducer as well, we can compose transducers:

         r1 = filter(is_even)(map(increment)(add))
         # use r1 in reduce()
         
    It seems in clojure, reduce() isn't the only useful function that works with reducers, there are others which makes this all worthwhile.

    Is my translation accurate?

  • I think that a good way to understand transducers is to look at their implementation (shortened a bit). Here it is for map:

        ([f]
        (fn [f1]
          (fn
            ([result input]
               (f1 result (f input)))
            ([result input & inputs]
               (f1 result (apply f input inputs))))))
    
    filter:

        ([pred]
        (fn [f1]
          (fn
            ([result input]
               (if (pred input)
                 (f1 result input)
                 result)))))
    
    And it gets more interesting with take:

        ([n]
         (fn [f1]
           (let [na (atom n)]
             (fn
               ([result input]
                  (let [n @na
                        nn (swap! na dec)
                        result (if (pos? n)
                                 (f1 result input)
                                 result)]
                    (if (not (pos? nn))
                      (reduced result) ; a terminal value indicating "don't reduce further"
                      result)))))))
    
    The transducer is supplied with the reducer next in the chain (f1) and returns a reducer function that gets fed with the reduced value by the preceding reduction (result) and the next element (input). Note how the take transducer maintains internal state with an atom. This could get a little tricky for more elaborate reductions, as how the internal state is maintained might have a significant effect on performance, depending on exactly how the reduction is performed. For example, if the reduction is done in parallel (say, with fork-join), then an internal state that's updated with locks (like refs) might significantly slow down -- or even deadlock -- the reduction.

    AFAICT mapcat still only returns lazy-seqs.

  • As someone who tried Clojure and failed, serious question: Does anyone actually use all these crazy features/patterns that keep getting added/discovered and talked about?

    I ask because even though I can imagine someone smart mastering these things and programming faster, I can't imagine a second person being able to understand his code, maintain it, and generally be productive. I imagine the second person losing a lot of time trying to understand what is going on, or even thinking he understood but in reality he didn't and messing things up.

    So how do you even form a Clojure team?

  • I just saw this morning that many functions in core.async are marked "Deprecated - this function will be removed. Use transformer instead." I guess Tranducers will provide a generic replacement for those. Looking forward to seeing some examples.

  • This looks exciting, but I'm confused about the decision to add extra arity to collection-manipulating functions. "filter" that returns a collection or a transducer depending only on arity seems a little counter-intuitive.

  • I'm sorry, but from the OP I can't be at all sure I can understand notation such as:

         ;;reducing function signature
         whatever, input -> whatever
    
    or

         ;;transducer signature
         (whatever, input -> whatever) -> (whatever, input ->   whatever)
    
    Or, in mathematics there is some notation

         f: A --> B
    
    where A and B are sets and f is a function. This notation means that for each element x in set A, function f returns value f(x) in set B. Seems clear enough. Maybe the notation in the OP is related? How I can't be sure I can guess at all correctly.

  • Clojure transducers are exactly signal functions from Haskell FRP literature, for those interested in such a connection.

  • Tentative benchmark results have surfaced: https://github.com/thheller/transduce-bench

    Add salt according to taste.

  • This is about as close as I could get in Haskell so far. It uses a slight twist on (x -> a -> x) called a Fold (which has a lot of great properties—it's a profunctor, an applicative, and a comonad).

    Nicely, this construction lets us write `take` purely!

        {-# LANGUAGE GADTs         #-}
        {-# LANGUAGE RankNTypes    #-}
        {-# LANGUAGE TypeOperators #-}
    
        import           Control.Arrow
        import           Control.Category
        import qualified Prelude
        import           Prelude hiding (id, (.))
    
        data Fold a r where
          Fold :: (a -> x -> x) -> x -> (x -> r) -> Fold a r
    
        data Pair a b = Pair !a !b
    
        pfst :: Pair a b -> a
        pfst (Pair a b) = a
    
        psnd :: Pair a b -> b
        psnd (Pair a b) = b
    
        newtype (~>) a b = Arr (forall r . Fold b r -> Fold a r)
    
        instance Category (~>) where
          id = Arr id
          Arr f . Arr g = Arr (g . f)
    
        amap :: (a -> b) -> (a ~> b)
        amap f = Arr (\(Fold cons nil fin) -> Fold (cons . f) nil fin)
    
        afilter :: (a -> Bool) -> (a ~> a)
        afilter p = Arr $ \(Fold cons nil fin) ->
          let cons' = \a x -> if p a then cons a x else x
          in Fold cons' nil fin
    
        fold :: Fold a r -> [a] -> r
        fold (Fold cons nil fin) = fin . spin where
          spin []     = nil
          spin (a:as) = cons a (spin as)
    
        asequence :: (a ~> b) -> ([a] -> [b])
        asequence (Arr f) = fold (f (Fold (:) [] id))
        
        aflatmap :: (a -> [b]) -> (a ~> b)
        aflatmap f = Arr $ \(Fold cons nil fin) ->
          Fold (\a x -> foldr cons x (f a)) nil fin
        
        atake :: Int -> (a ~> a)
        atake n = Arr $ \(Fold cons nil fin) ->
          let cons' = \a x n -> if n > 0 then cons a (x (n-1)) else x n
          in Fold cons' (const nil) (\x -> fin (x n))

  • Not sure I understand - so Clojure is getting first class support for lazy collections and curried combinators? Or am I missing the important part?

  • > "these transformers were never exposed a la carte, instead being encapsulated by the macrology of reducers."

    What does 'macrology' mean in this context? Is this a common usage? Or a novel application of a word that ordinarily means "long and tedious talk without much substance"