Akka (18): Stream: Composite Data Stream, Component - Graph components

Posted by ziong on Sat, 01 Jun 2019 21:27:54 +0200

An akka-stream data stream is assembled from components. These components are collectively called Graphs: each one describes how data flows and how it is processed. Source, Flow and Sink are the most basic Graphs, and more complex composite Graphs are built by combining them. If every port of a Graph (input and output) is connected, it is a closed graph, a RunnableGraph; otherwise it is an open graph, a partial Graph. A complete (runnable) data stream is a RunnableGraph. A Graph's input and output ports are described by its Shape:

/**
 * A Shape describes the inlets and outlets of a [[Graph]]. In keeping with the
 * philosophy that a Graph is a freely reusable blueprint, everything that
 * matters from the outside are the connections that can be made with it,
 * otherwise it is just a black box.
 */
abstract class Shape {
  /**
   * Scala API: get a list of all input ports
   */
  def inlets: immutable.Seq[Inlet[_]]

  /**
   * Scala API: get a list of all output ports
   */
  def outlets: immutable.Seq[Outlet[_]]

...

The abstract methods inlets and outlets of Shape return the input and output ports of a Graph, respectively. Several ready-made shapes provided by akka-stream are listed below:

final case class SourceShape[+T](out: Outlet[T @uncheckedVariance]) extends Shape {...}
final case class FlowShape[-I, +O](in: Inlet[I @uncheckedVariance], out: Outlet[O @uncheckedVariance]) extends Shape {...}
final case class SinkShape[-T](in: Inlet[T @uncheckedVariance]) extends Shape {...}
sealed abstract class ClosedShape extends Shape
/**
 * A bidirectional flow of elements that consequently has two inputs and two
 * outputs, arranged like this:
 *
 * {{{
 *        +------+
 *  In1 ~>|      |~> Out1
 *        | bidi |
 * Out2 <~|      |<~ In2
 *        +------+
 * }}}
 */
final case class BidiShape[-In1, +Out1, -In2, +Out2](
  in1:  Inlet[In1 @uncheckedVariance],
  out1: Outlet[Out1 @uncheckedVariance],
  in2:  Inlet[In2 @uncheckedVariance],
  out2: Outlet[Out2 @uncheckedVariance]) extends Shape {...}
object UniformFanInShape {
  def apply[I, O](outlet: Outlet[O], inlets: Inlet[I]*): UniformFanInShape[I, O] =
    new UniformFanInShape(inlets.size, FanInShape.Ports(outlet, inlets.toList))
}
object UniformFanOutShape {
  def apply[I, O](inlet: Inlet[I], outlets: Outlet[O]*): UniformFanOutShape[I, O] =
    new UniformFanOutShape(outlets.size, FanOutShape.Ports(inlet, outlets.toList))
}

Shape is a parameter of the Graph type:

trait Graph[+S <: Shape, +M] {
  /**
   * Type-level accessor for the shape parameter of this graph.
   */
  type Shape = S @uncheckedVariance
  /**
   * The shape of a graph is all that is externally visible: its inlets and outlets.
   */
  def shape: S
...
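
As a small sketch of what shape exposes, the following counts the inlets and outlets of a few built-in graphs. The object name ShapeInspection and the helper ports are illustrative assumptions, not part of akka-stream; no ActorSystem is needed because nothing is run:

```scala
import akka.stream.{Graph, Shape}
import akka.stream.scaladsl.{Flow, Sink, Source}

// Hypothetical helper object, used only for this illustration.
object ShapeInspection {
  // Number of (inlets, outlets) exposed by any graph through its Shape.
  def ports(g: Graph[Shape, Any]): (Int, Int) =
    (g.shape.inlets.size, g.shape.outlets.size)

  val sourcePorts: (Int, Int) = ports(Source.single(1))     // SourceShape: one outlet
  val flowPorts: (Int, Int)   = ports(Flow[Int].map(_ + 1)) // FlowShape: one inlet, one outlet
  val sinkPorts: (Int, Int)   = ports(Sink.ignore)          // SinkShape: one inlet

  // Once every port is connected the shape is ClosedShape, with no ports left.
  val closedPorts: (Int, Int) = ports(Source.single(1).to(Sink.ignore))
}
```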

The Shape of a RunnableGraph is ClosedShape:

/**
 * Flow with attached input and output, can be executed.
 */
final case class RunnableGraph[+Mat](override val traversalBuilder: TraversalBuilder) extends Graph[ClosedShape, Mat] {
  override def shape = ClosedShape

  /**
   * Transform only the materialized value of this RunnableGraph, leaving all other properties as they were.
   */
  def mapMaterializedValue[Mat2](f: Mat ⇒ Mat2): RunnableGraph[Mat2] =
    copy(traversalBuilder.transformMat(f.asInstanceOf[Any ⇒ Any]))

  /**
   * Run this flow and return the materialized instance from the flow.
   */
  def run()(implicit materializer: Materializer): Mat = materializer.materialize(this)
...
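
A minimal sketch of mapMaterializedValue and run(), assuming Akka 2.5.x with ActorMaterializer as in the rest of this article. The object name MatValueDemo and its demo method are hypothetical:

```scala
import akka.actor.ActorSystem
import akka.stream.ActorMaterializer
import akka.stream.scaladsl.{Keep, RunnableGraph, Sink, Source}

import scala.concurrent.{Await, Future}
import scala.concurrent.duration._

// Hypothetical demo object, used only for this illustration.
object MatValueDemo {
  // A closed blueprint: Keep.right keeps the sink's Future[Int]. Nothing runs yet.
  val summing: RunnableGraph[Future[Int]] =
    Source(1 to 10).toMat(Sink.fold[Int, Int](0)(_ + _))(Keep.right)

  def demo(): String = {
    implicit val sys: ActorSystem = ActorSystem("matValue")
    implicit val mat: ActorMaterializer = ActorMaterializer()
    import sys.dispatcher // ExecutionContext for Future.map

    // Only the materialized value is transformed; the stream itself is unchanged.
    val described: RunnableGraph[Future[String]] =
      summing.mapMaterializedValue(_.map(total => s"total=$total"))

    // run() materializes the graph and hands back the materialized value.
    try Await.result(described.run(), 3.seconds)
    finally sys.terminate()
  }
}
```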

We can build a Graph with the GraphDSL provided by akka-stream. GraphDSL inherits the create methods of GraphApply, and GraphDSL.create(...) is how a Graph is built:

object GraphDSL extends GraphApply {...}
trait GraphApply {
  /**
   * Creates a new [[Graph]] by passing a [[GraphDSL.Builder]] to the given create function.
   */
  def create[S <: Shape]()(buildBlock: GraphDSL.Builder[NotUsed] ⇒ S): Graph[S, NotUsed] = {
    val builder = new GraphDSL.Builder
    val s = buildBlock(builder)

    createGraph(s, builder)
  }
...
def create[S <: Shape, Mat](g1: Graph[Shape, Mat])(buildBlock: GraphDSL.Builder[Mat] ⇒ (g1.Shape) ⇒ S): Graph[S, Mat] = {...}
def create[S <: Shape, Mat, M1, M2](g1: Graph[Shape, M1], g2: Graph[Shape, M2])(combineMat: (M1, M2) ⇒ Mat)(buildBlock: GraphDSL.Builder[Mat] ⇒ (g1.Shape, g2.Shape) ⇒ S): Graph[S, Mat] = {...}
...
def create[S <: Shape, Mat, M1, M2, M3, M4, M5](g1: Graph[Shape, M1], g2: Graph[Shape, M2], g3: Graph[Shape, M3], g4: Graph[Shape, M4], g5: Graph[Shape, M5])(combineMat: (M1, M2, M3, M4, M5) ⇒ Mat)(buildBlock: GraphDSL.Builder[Mat] ⇒ (g1.Shape, g2.Shape, g3.Shape, g4.Shape, g5.Shape) ⇒ S): Graph[S, Mat] = {
...}

The buildBlock function has the type GraphDSL.Builder[Mat] ⇒ (g1.Shape, g2.Shape, ..., g5.Shape) ⇒ S, where each g? is a partial graph that is merged into the new Graph. Here are some of the most basic examples of Graph construction:

import akka.actor._
import akka.stream._
import akka.stream.scaladsl._

object SimpleGraphs extends App{

  implicit val sys = ActorSystem("streamSys")
  implicit val ec = sys.dispatcher
  implicit val mat = ActorMaterializer()

  val source = Source(1 to 10)
  val flow = Flow[Int].map(_ * 2)
  val sink = Sink.foreach(println)


  val sourceGraph = GraphDSL.create(){implicit builder =>
    import GraphDSL.Implicits._
    val src = source.filter(_ % 2 == 0)
    val pipe = builder.add(Flow[Int])
    src ~> pipe.in
    SourceShape(pipe.out)
  }

  Source.fromGraph(sourceGraph).runWith(sink).andThen{case _ => } // sys.terminate()}

  val flowGraph = GraphDSL.create(){implicit builder =>
    import GraphDSL.Implicits._

    val pipe = builder.add(Flow[Int])
    FlowShape(pipe.in,pipe.out)
  }

  val (_,fut) = Flow.fromGraph(flowGraph).runWith(source,sink)
  fut.andThen{case _ => } //sys.terminate()}


  val sinkGraph = GraphDSL.create(){implicit builder =>
     import GraphDSL.Implicits._
     val pipe = builder.add(Flow[Int])
     pipe.out.map(_ * 3) ~> Sink.foreach(println)
     SinkShape(pipe.in)
  }

  val fut1 = Sink.fromGraph(sinkGraph).runWith(source)

  Thread.sleep(1000)
  sys.terminate()
}
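
The examples above only use the no-argument create(). As a sketch of the create(g1)(buildBlock) overload listed earlier, the following passes a Sink into create so that its materialized value becomes the materialized value of the new Graph. It assumes Akka 2.5.x; CreateWithMat, sumSink and tripledSum are illustrative names:

```scala
import akka.actor.ActorSystem
import akka.stream.{ActorMaterializer, SinkShape}
import akka.stream.scaladsl.{Flow, GraphDSL, Sink, Source}

import scala.concurrent.{Await, Future}
import scala.concurrent.duration._

// Hypothetical demo object, used only for this illustration.
object CreateWithMat {
  val sumSink: Sink[Int, Future[Int]] = Sink.fold[Int, Int](0)(_ + _)

  // sumSink is imported into the builder; buildBlock receives its shape (sink),
  // and the resulting graph materializes sumSink's Future[Int].
  val tripledSum: Sink[Int, Future[Int]] = Sink.fromGraph(
    GraphDSL.create(sumSink) { implicit builder => sink =>
      import GraphDSL.Implicits._
      val triple = builder.add(Flow[Int].map(_ * 3))
      triple.out ~> sink.in
      SinkShape(triple.in) // the only port left open is the input of triple
    })

  def demo(): Int = {
    implicit val sys: ActorSystem = ActorSystem("createWithMat")
    implicit val mat: ActorMaterializer = ActorMaterializer()
    try Await.result(Source(1 to 4).runWith(tripledSum), 3.seconds)
    finally sys.terminate()
  }
}
```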

Above we showed how to write Source, Flow and Sink Graphs, using Flow[Int] as the common base component in each. An akka-stream Graph can be assembled from simpler partial Graphs, and every Graph is ultimately a combination of basic graphs such as Source, Flow and Sink. In the examples above, builder.add(...) adds a Flow Graph to an empty Graph template and returns its Shape, pipe, which exposes the input and output ports of the added Graph. We then design the data flow by connecting the ports of pipe according to what the target Graph has to do. Running the examples shows that these Graphs behave as expected. Next, we can try to write a custom Pipe-like Graph to get a more detailed understanding of how Graphs are composed. Every basic component (core Graph) must define a Shape that describes its input and output ports, and a GraphStageLogic inside its GraphStage that describes how stream elements are read and written.

import akka.actor._
import akka.stream._
import akka.stream.scaladsl._
import scala.collection.immutable

case class PipeShape[In,Out](
    in: Inlet[In],
    out: Outlet[Out]) extends Shape {

  override def inlets: immutable.Seq[Inlet[_]] = in :: Nil

  override def outlets: immutable.Seq[Outlet[_]] = out :: Nil

  override def deepCopy(): Shape = 
    PipeShape(
      in = in.carbonCopy(),
      out = out.carbonCopy()
    )
}
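
A quick check of what deepCopy does, as a sketch assuming Akka 2.5 or later, where deepCopy is the only abstract method of Shape besides inlets and outlets. PipeShape is repeated so that the block is self-contained; PipeShapeCheck and its vals are illustrative names:

```scala
import akka.stream.{Inlet, Outlet, Shape}
import scala.collection.immutable

// Hypothetical demo object, used only for this illustration.
object PipeShapeCheck {
  case class PipeShape[In, Out](in: Inlet[In], out: Outlet[Out]) extends Shape {
    override def inlets: immutable.Seq[Inlet[_]] = in :: Nil
    override def outlets: immutable.Seq[Outlet[_]] = out :: Nil
    override def deepCopy(): Shape = PipeShape(in.carbonCopy(), out.carbonCopy())
  }

  val original: PipeShape[Int, String] =
    PipeShape(Inlet[Int]("Pipe.in"), Outlet[String]("Pipe.out"))
  val duplicate: Shape = original.deepCopy()

  // The copy has the same number of ports as the original...
  val sameArity: Boolean =
    duplicate.inlets.size == 1 && duplicate.outlets.size == 1
  // ...but owns new port instances, so the same blueprint can be reused.
  val freshPorts: Boolean =
    !(duplicate.inlets.head eq original.in) && !(duplicate.outlets.head eq original.out)
}
```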

PipeShape has one input port and one output port. Because it extends the Shape class, it must implement Shape's abstract methods. Suppose we design a Graph that applies a user-supplied function to each input element, as in source.via(ApplyPipe(myFunc)).runWith(sink). Of course, we could write source.map(r => myFunc(r)).runWith(sink) directly, but ApplyPipe may bundle many preset common functions, with myFunc being only one part of the code; with map(...), the user would have to provide all of that code. ApplyPipe has the shape PipeShape. Here is its GraphStage design:

  class Pipe[In, Out](f: In => Out) extends GraphStage[PipeShape[In, Out]] {
    val in = Inlet[In]("Pipe.in")
    val out = Outlet[Out]("Pipe.out")

    override def shape = PipeShape(in, out)

    override def initialAttributes: Attributes = Attributes.none

    override def createLogic(inheritedAttributes: Attributes): GraphStageLogic =
      new GraphStageLogic(shape) with InHandler with OutHandler {

        private def decider =
          inheritedAttributes.get[SupervisionStrategy].map(_.decider).getOrElse(Supervision.stoppingDecider)
        
        override def onPull(): Unit = pull(in)

        override def onPush(): Unit = {
          try {
            push(out, f(grab(in)))
          }
          catch {
            case NonFatal(ex) ⇒ decider(ex) match {
              case Supervision.Stop ⇒ failStage(ex)
              case _ ⇒ pull(in)
            }
          }
        }

        setHandlers(in,out, this)
      }
  }
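
To see the decider branch in action, here is a self-contained sketch assuming Akka 2.5.x. MapStage is a hypothetical variant of Pipe that uses the built-in FlowShape instead of PipeShape, only to keep the example short; the supervision logic is the same as above:

```scala
import akka.actor.ActorSystem
import akka.stream._
import akka.stream.ActorAttributes.SupervisionStrategy
import akka.stream.scaladsl.{Sink, Source}
import akka.stream.stage.{GraphStage, GraphStageLogic, InHandler, OutHandler}

import scala.collection.immutable
import scala.concurrent.Await
import scala.concurrent.duration._
import scala.util.control.NonFatal

// Hypothetical demo object, used only for this illustration.
object SupervisedStageDemo {
  class MapStage[In, Out](f: In => Out) extends GraphStage[FlowShape[In, Out]] {
    val in: Inlet[In] = Inlet[In]("MapStage.in")
    val out: Outlet[Out] = Outlet[Out]("MapStage.out")
    override val shape: FlowShape[In, Out] = FlowShape(in, out)

    override def createLogic(inheritedAttributes: Attributes): GraphStageLogic =
      new GraphStageLogic(shape) with InHandler with OutHandler {
        // Use the supervision strategy from the attributes, or stop by default.
        private val decider = inheritedAttributes.get[SupervisionStrategy]
          .map(_.decider).getOrElse(Supervision.stoppingDecider)

        override def onPull(): Unit = pull(in)

        override def onPush(): Unit =
          try push(out, f(grab(in)))
          catch {
            case NonFatal(ex) => decider(ex) match {
              case Supervision.Stop => failStage(ex)
              case _                => pull(in) // drop the element, request the next one
            }
          }

        setHandlers(in, out, this)
      }
  }

  def demo(): immutable.Seq[Int] = {
    implicit val sys: ActorSystem = ActorSystem("supervised")
    implicit val mat: ActorMaterializer = ActorMaterializer()
    // 10 / 0 throws ArithmeticException; the resuming decider skips that element.
    val result = Source(List(1, 2, 0, 5))
      .via(new MapStage[Int, Int](10 / _))
      .withAttributes(ActorAttributes.supervisionStrategy(Supervision.resumingDecider))
      .runWith(Sink.seq)
    try Await.result(result, 3.seconds)
    finally sys.terminate()
  }
}
```

Without the withAttributes call, the default stoppingDecider applies and the stream fails at the third element.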

In this definition of the Pipe GraphStage, the input and output ports in and out are defined first, and then createLogic defines a GraphStageLogic that mixes in InHandler and OutHandler. InHandler and OutHandler handle the events on the input and output ports respectively:

/**
 * Collection of callbacks for an input port of a [[GraphStage]]
 */
trait InHandler {
  /**
   * Called when the input port has a new element available. The actual element can be retrieved via the
   * [[GraphStageLogic.grab()]] method.
   */
  @throws(classOf[Exception])
  def onPush(): Unit

  /**
   * Called when the input port is finished. After this callback no other callbacks will be called for this port.
   */
  @throws(classOf[Exception])
  def onUpstreamFinish(): Unit = GraphInterpreter.currentInterpreter.activeStage.completeStage()

  /**
   * Called when the input port has failed. After this callback no other callbacks will be called for this port.
   */
  @throws(classOf[Exception])
  def onUpstreamFailure(ex: Throwable): Unit = GraphInterpreter.currentInterpreter.activeStage.failStage(ex)
}

/**
 * Collection of callbacks for an output port of a [[GraphStage]]
 */
trait OutHandler {
  /**
   * Called when the output port has received a pull, and therefore ready to emit an element, i.e. [[GraphStageLogic.push()]]
   * is now allowed to be called on this port.
   */
  @throws(classOf[Exception])
  def onPull(): Unit

  /**
   * Called when the output port will no longer accept any new elements. After this callback no other callbacks will
   * be called for this port.
   */
  @throws(classOf[Exception])
  def onDownstreamFinish(): Unit = {
    GraphInterpreter
      .currentInterpreter
      .activeStage
      .completeStage()
  }
}
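
The default onUpstreamFinish simply completes the stage. As a sketch of overriding it, the following hypothetical CountStage swallows every element and emits the total only when the upstream finishes (assuming Akka 2.5.x; all names are illustrative):

```scala
import akka.actor.ActorSystem
import akka.stream.{ActorMaterializer, Attributes, FlowShape, Inlet, Outlet}
import akka.stream.scaladsl.{Sink, Source}
import akka.stream.stage.{GraphStage, GraphStageLogic, InHandler, OutHandler}

import scala.concurrent.Await
import scala.concurrent.duration._

// Hypothetical demo object, used only for this illustration.
object CountStageDemo {
  class CountStage[T] extends GraphStage[FlowShape[T, Int]] {
    val in: Inlet[T] = Inlet[T]("CountStage.in")
    val out: Outlet[Int] = Outlet[Int]("CountStage.out")
    override val shape: FlowShape[T, Int] = FlowShape(in, out)

    override def createLogic(inheritedAttributes: Attributes): GraphStageLogic =
      new GraphStageLogic(shape) with InHandler with OutHandler {
        private var count = 0

        // Consume each element without pushing anything downstream yet.
        override def onPush(): Unit = { grab(in); count += 1; pull(in) }

        // The first downstream pull starts the upstream flowing.
        override def onPull(): Unit = pull(in)

        // Instead of completing immediately, emit the total and complete afterwards.
        override def onUpstreamFinish(): Unit =
          emit(out, count, () => completeStage())

        setHandlers(in, out, this)
      }
  }

  def demo(n: Int): Int = {
    implicit val sys: ActorSystem = ActorSystem("countStage")
    implicit val mat: ActorMaterializer = ActorMaterializer()
    try Await.result(Source(1 to n).via(new CountStage[Int]).runWith(Sink.head), 3.seconds)
    finally sys.terminate()
  }
}
```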

The input and output processing of an akka-stream Graph implements the Reactive Streams protocol, so it is best to implement the abstract methods onPull and onPush with the ready-made pull and push provided by akka-stream. setHandlers then registers the handler for the input and output ports of the GraphStage:

  /**
   * Assign callbacks for linear stage for both [[Inlet]] and [[Outlet]]
   */
  final protected def setHandlers(in: Inlet[_], out: Outlet[_], handler: InHandler with OutHandler): Unit = {
    setHandler(in, handler)
    setHandler(out, handler)
  }
 /**
   * Assigns callbacks for the events for an [[Inlet]]
   */
  final protected def setHandler(in: Inlet[_], handler: InHandler): Unit = {
    handlers(in.id) = handler
    if (_interpreter != null) _interpreter.setHandler(conn(in), handler)
  }
  /**
   * Assigns callbacks for the events for an [[Outlet]]
   */
  final protected def setHandler(out: Outlet[_], handler: OutHandler): Unit = {
    handlers(out.id + inCount) = handler
    if (_interpreter != null) _interpreter.setHandler(conn(out), handler)
  }
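
When the input and output logic are kept apart, the two setHandler overloads can be called separately instead of setHandlers. The sketch below follows the filter pattern from the Akka documentation; FilterStage and FilterStageDemo are illustrative names, and Akka 2.5.x is assumed:

```scala
import akka.actor.ActorSystem
import akka.stream.{ActorMaterializer, Attributes, FlowShape, Inlet, Outlet}
import akka.stream.scaladsl.{Sink, Source}
import akka.stream.stage.{GraphStage, GraphStageLogic, InHandler, OutHandler}

import scala.collection.immutable
import scala.concurrent.Await
import scala.concurrent.duration._

// Hypothetical demo object, used only for this illustration.
object FilterStageDemo {
  class FilterStage[A](p: A => Boolean) extends GraphStage[FlowShape[A, A]] {
    val in: Inlet[A] = Inlet[A]("FilterStage.in")
    val out: Outlet[A] = Outlet[A]("FilterStage.out")
    override val shape: FlowShape[A, A] = FlowShape(in, out)

    override def createLogic(inheritedAttributes: Attributes): GraphStageLogic =
      new GraphStageLogic(shape) {
        // Input side: forward matching elements, otherwise ask for the next one.
        setHandler(in, new InHandler {
          override def onPush(): Unit = {
            val elem = grab(in)
            if (p(elem)) push(out, elem) else pull(in)
          }
        })
        // Output side: downstream demand is passed straight upstream.
        setHandler(out, new OutHandler {
          override def onPull(): Unit = pull(in)
        })
      }
  }

  def demo(): immutable.Seq[Int] = {
    implicit val sys: ActorSystem = ActorSystem("filterStage")
    implicit val mat: ActorMaterializer = ActorMaterializer()
    try Await.result(
      Source(1 to 10).via(new FilterStage[Int](_ % 2 == 0)).runWith(Sink.seq), 3.seconds)
    finally sys.terminate()
  }
}
```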

With Shape and GraphStage, we can build a Graph:

def applyPipe[In,Out](f: In => Out) = GraphDSL.create() {implicit builder =>
    val pipe = builder.add(new Pipe(f))
    FlowShape(pipe.in,pipe.out)
  }

The Pipe stage can also be used directly as a component of a composite Graph:

  RunnableGraph.fromGraph(
    GraphDSL.create(){implicit builder =>
      import GraphDSL.Implicits._

      val source = Source(1 to 10)
      val sink = Sink.foreach(println)
      val f: Int => Int = _ * 3
      val pipeShape = builder.add(new Pipe[Int,Int](f))
      source ~> pipeShape.in
      pipeShape.out ~> sink
      ClosedShape

    }
  ).run()
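
Composite Graphs are not limited to linear pipelines. As a sketch with built-in junctions (assuming Akka 2.5.x; FanOutFanInDemo is an illustrative name), the following broadcasts each element to two branches and zips the branches back together. builder.add returns a UniformFanOutShape for Broadcast and a fan-in shape with ports in0, in1 and out for ZipWith:

```scala
import akka.actor.ActorSystem
import akka.stream.{ActorMaterializer, ClosedShape}
import akka.stream.scaladsl.{Broadcast, GraphDSL, RunnableGraph, Sink, Source, ZipWith}

import scala.collection.immutable
import scala.concurrent.{Await, Future}
import scala.concurrent.duration._

// Hypothetical demo object, used only for this illustration.
object FanOutFanInDemo {
  val graph: RunnableGraph[Future[immutable.Seq[Int]]] = RunnableGraph.fromGraph(
    GraphDSL.create(Sink.seq[Int]) { implicit builder => sink =>
      import GraphDSL.Implicits._

      val bcast = builder.add(Broadcast[Int](2))              // one inlet, two outlets
      val zip = builder.add(ZipWith[Int, Int, Int](_ + _))    // two inlets, one outlet

      Source(1 to 3) ~> bcast.in
      bcast.out(0).map(_ * 10) ~> zip.in0                     // first branch: times ten
      bcast.out(1) ~> zip.in1                                 // second branch: unchanged
      zip.out ~> sink.in
      ClosedShape                                             // every port is connected
    })

  def demo(): immutable.Seq[Int] = {
    implicit val sys: ActorSystem = ActorSystem("fanOutFanIn")
    implicit val mat: ActorMaterializer = ActorMaterializer()
    try Await.result(graph.run(), 3.seconds)
    finally sys.terminate()
  }
}
```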

The source code of the whole example is as follows:

import akka.actor._
import akka.stream._
import akka.stream.scaladsl._
import akka.stream.ActorAttributes._
import akka.stream.stage._

import scala.collection.immutable
import scala.util.control.NonFatal

object PipeOps {

  case class PipeShape[In, Out](
                                 in: Inlet[In],
                                 out: Outlet[Out]) extends Shape {

    override def inlets: immutable.Seq[Inlet[_]] = in :: Nil

    override def outlets: immutable.Seq[Outlet[_]] = out :: Nil

    override def deepCopy(): Shape =
      PipeShape(
        in = in.carbonCopy(),
        out = out.carbonCopy()
      )
  }

  class Pipe[In, Out](f: In => Out) extends GraphStage[PipeShape[In, Out]] {
    val in = Inlet[In]("Pipe.in")
    val out = Outlet[Out]("Pipe.out")

    override def shape = PipeShape(in, out)

    override def initialAttributes: Attributes = Attributes.none

    override def createLogic(inheritedAttributes: Attributes): GraphStageLogic =
      new GraphStageLogic(shape) with InHandler with OutHandler {

        private def decider =
          inheritedAttributes.get[SupervisionStrategy].map(_.decider).getOrElse(Supervision.stoppingDecider)

        override def onPull(): Unit = pull(in)

        override def onPush(): Unit = {
          try {
            push(out, f(grab(in)))
          }
          catch {
            case NonFatal(ex) ⇒ decider(ex) match {
              case Supervision.Stop ⇒ failStage(ex)
              case _ ⇒ pull(in)
            }
          }
        }

        setHandlers(in,out, this)
      }
  }

  def applyPipe[In,Out](f: In => Out) = GraphDSL.create() {implicit builder =>
    val pipe = builder.add(new Pipe(f))
    FlowShape(pipe.in,pipe.out)
  }

}

object ShapeDemo1 extends App {
  import PipeOps._
  implicit val sys = ActorSystem("streamSys")
  implicit val ec = sys.dispatcher
  implicit val mat = ActorMaterializer()

  RunnableGraph.fromGraph(
    GraphDSL.create(){implicit builder =>
      import GraphDSL.Implicits._

      val source = Source(1 to 10)
      val sink = Sink.foreach(println)
      val f: Int => Int = _ * 3
      val pipeShape = builder.add(new Pipe[Int,Int](f))
      source ~> pipeShape.in
      pipeShape.out ~> sink
      ClosedShape

    }
  ).run()


  val fut = Source(1 to 10).via(applyPipe[Int,Int](_ * 2)).runForeach(println)

  scala.io.StdIn.readLine()

  sys.terminate()


}
