Deep Understanding of Spark 2.1 Core (X): Shuffle Map End Principle and Source Code Analysis

Posted by Fuzzy Wobble on Wed, 19 Jun 2019 02:14:07 +0200

In the previous article, "Understanding Spark 2.1 Core (9): Iterative Computing and Principles and Source Code Analysis of Shuffle", we stopped at the following call in SortShuffleWriter.write:
    // Sort the data and write it into the in-memory buffer.
    // If the sorted results exceed the spill threshold,
    // spill them to a disk data file.
    sorter.insertAll(records)

Let's start with a macro view of the Map side. Depending on whether an aggregation function (aggregator.isDefined) and an ordering (ordering.isDefined) are defined, the write path falls into one of three categories (a small sketch follows the list):

  • Neither aggregation nor sorting: data is first written to different files by partition, and the files are finally merged into a single file in partition order. Suitable when the number of partitions is small. Merging multiple buckets into one file reduces the number of map output files, saving disk I/O and improving performance.
  • Sorting but no aggregation: data is sorted in the cache by partition (and possibly by key), then merged into a single file in partition order. Suitable when the number of partitions is large. Merging multiple buckets into one file reduces the number of map output files, saving disk I/O and improving performance. When the cache exceeds a threshold, its contents are spilled to disk.
  • Both aggregation and sorting: data is first aggregated by key in the cache, then sorted by partition (and possibly by key), and finally merged into a single file in partition order. Merging multiple buckets into one file reduces the number of map output files, saving disk I/O and improving performance. When the cache exceeds a threshold, its contents are spilled to disk. Reading and aggregating records one at a time keeps memory usage low.
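
As a rough sketch (illustrative names only, not Spark's actual code), the three cases can be pictured as a simple decision over the two flags; the rest of the article walks through the details of each path:

  // Illustrative sketch: how the two flags steer the Map-side write path.
  sealed trait WritePath
  case object AggregateThenSort extends WritePath   // aggregator defined: combine per (partition, key) in a map
  case object BufferAndSortByKey extends WritePath  // no aggregator, ordering defined: buffer, then sort by (partition, key)
  case object BufferByPartition extends WritePath   // neither defined: buffer, then sort by partition only

  def choosePath(hasAggregator: Boolean, hasOrdering: Boolean): WritePath =
    if (hasAggregator) AggregateThenSort
    else if (hasOrdering) BufferAndSortByKey
    else BufferByPartition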

Let's take a closer look at insertAll:

  def insertAll(records: Iterator[Product2[K, V]]): Unit = {
  // If an aggregation function is defined, shouldCombine is true
    val shouldCombine = aggregator.isDefined

    // Does the external sort need map-side aggregation?
    if (shouldCombine) {      
      // mergeValue is a merge function of Value
      val mergeValue = aggregator.get.mergeValue
      // createCombiner is a function that generates Combiner
      val createCombiner = aggregator.get.createCombiner
      var kv: Product2[K, V] = null
      // update is a closure that merges a new value into an existing combiner (or creates one)
      val update = (hadValue: Boolean, oldValue: C) => {
       // If a value already exists for the key, merge oldValue with the new value kv._2;
       // otherwise create a new combiner from kv._2.
        if (hadValue) mergeValue(oldValue, kv._2) else createCombiner(kv._2)
      }
      while (records.hasNext) {
        addElementsRead()
        kv = records.next()
        // Use the AppendOnlyMap to aggregate
        // values in memory by (partition, key)
        map.changeValue((getPartition(kv._1), kv._1), update)
        // Write to disk when exceeding threshold
        maybeSpillCollection(usingMap = true)
      }
    } else {
      // No aggregation: insert records directly into the buffer
      while (records.hasNext) {
        addElementsRead()
        val kv = records.next()
        buffer.insert(getPartition(kv._1), kv._1, kv._2.asInstanceOf[C])
        maybeSpillCollection(usingMap = false)
      }
    }
  }

Here createCombiner can be seen as creating an initial combiner value from kv._2, and mergeValue can be understood as the combiner in MapReduce: a map-side reduce that pre-aggregates the values belonging to the same key.
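
To make this concrete, here is a small self-contained example (not from Spark's sources; the names mirror the aggregator's fields) of what createCombiner and mergeValue would look like for a reduceByKey(_ + _) style word count over (String, Int) records:

  // Hypothetical aggregator functions for reduceByKey(_ + _) on (word, count) pairs.
  val createCombiner: Int => Int = (v: Int) => v                 // the first value seen for a key becomes the combiner
  val mergeValue: (Int, Int) => Int = (c: Int, v: Int) => c + v  // later values are folded into the combiner

  // The update closure in insertAll then behaves like:
  def update(hadValue: Boolean, oldValue: Int, newValue: Int): Int =
    if (hadValue) mergeValue(oldValue, newValue) else createCombiner(newValue)

  // Feeding the values 1, 1, 1 for the same key:
  // update(false, 0, 1) == 1, update(true, 1, 1) == 2, update(true, 2, 1) == 3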

Aggregation algorithm

Let's take a closer look at the aggregation operation section:

Call stack:

  • util.collection.SizeTrackingAppendOnlyMap.changeValue
    • util.collection.AppendOnlyMap.changeValue
      • util.collection.AppendOnlyMap.incrementSize
        • util.collection.AppendOnlyMap.growTable
    • util.collection.SizeTracker.afterUpdate
      • util.collection.SizeTracker.takeSample

First, the changeValue function of SizeTrackingAppendOnlyMap:

util.collection.SizeTrackingAppendOnlyMap.changeValue

  override def changeValue(key: K, updateFunc: (Boolean, V) => V): V = {
    // Using aggregation algorithm to get new Value
    val newValue = super.changeValue(key, updateFunc)
    // Update Sampling of AppendOnlyMap Size
    super.afterUpdate()
    // Return results
    newValue
  }

util.collection.AppendOnlyMap.changeValue

Aggregation algorithm:

  def changeValue(key: K, updateFunc: (Boolean, V) => V): V = {
    assert(!destroyed, destructionMessage)
    val k = key.asInstanceOf[AnyRef]
    if (k.eq(null)) {
      if (!haveNullValue) {
        incrementSize()
      }
      nullValue = updateFunc(haveNullValue, nullValue)
      haveNullValue = true
      return nullValue
    }
    // pos is computed by rehashing k's hashCode and applying the mask
    // 2*pos is where k should be
    // 2*pos+1 is the location of v corresponding to k
    var pos = rehash(k.hashCode) & mask
    var i = 1
    while (true) {
    // Get the value curKey at the location of k in data
      val curKey = data(2 * pos)
      if (curKey.eq(null)) {
      // If the slot is empty,
      // create a new value (a fresh combiner) from kv._2.
        val newValue = updateFunc(false, null.asInstanceOf[V])
        data(2 * pos) = k
        data(2 * pos + 1) = newValue.asInstanceOf[AnyRef]
        // Grow the table if the size threshold is exceeded
        incrementSize()
        return newValue
      } else if (k.eq(curKey) || k.equals(curKey)) {
      // If k and curKey are equal,
      // aggregate oldValue (data(2 * pos + 1)) with the new value (kv._2)
        val newValue = updateFunc(true, data(2 * pos + 1).asInstanceOf[V])
        data(2 * pos + 1) = newValue.asInstanceOf[AnyRef]
        return newValue
      } else {
      // If curKey is not null and not equal to k,
      // we have a hash collision,
      // so keep probing forward until one of the two cases above occurs.
        val delta = i
        pos = (pos + delta) & mask
        i += 1
      }
    }
    null.asInstanceOf[V] // Never reached, but keeps the compiler happy
  }
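
The flat layout above (key at data(2 * pos), value at data(2 * pos + 1)) and the probing on collisions can be reproduced in a stripped-down form. The following is a simplified sketch under several assumptions (fixed power-of-two capacity, no rehash function, no null-key handling, no growth or size tracking); it is not the real AppendOnlyMap:

  // Simplified open-addressing map mimicking AppendOnlyMap's flat array layout.
  // Capacity must be a power of two so that (hash & mask) is a valid index.
  class TinyAppendOnlyMap[K, V](capacity: Int = 64) {
    private val mask = capacity - 1
    private val data = new Array[AnyRef](2 * capacity)

    def changeValue(key: K, updateFunc: (Boolean, V) => V): V = {
      var pos = key.hashCode() & mask
      var i = 1
      while (true) {
        val curKey = data(2 * pos)
        if (curKey == null) {                            // empty slot: create a fresh combiner
          val newValue = updateFunc(false, null.asInstanceOf[V])
          data(2 * pos) = key.asInstanceOf[AnyRef]
          data(2 * pos + 1) = newValue.asInstanceOf[AnyRef]
          return newValue
        } else if (curKey == key.asInstanceOf[AnyRef]) { // same key: merge into the existing value
          val newValue = updateFunc(true, data(2 * pos + 1).asInstanceOf[V])
          data(2 * pos + 1) = newValue.asInstanceOf[AnyRef]
          return newValue
        } else {                                         // collision: probe forward
          pos = (pos + i) & mask
          i += 1
        }
      }
      null.asInstanceOf[V] // unreachable
    }
  }

From a starting slot h the probe visits h + 1, h + 3, h + 6, h + 10, ... (mod capacity), the same triangular-number sequence produced by the delta = i step in changeValue above.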

util.collection.AppendOnlyMap.incrementSize

Let's take another look at the implementation of capacity expansion:

  private def incrementSize() {
    curSize += 1
    // When curSize is greater than the threshold growThreshold,
    // Call growTable()
    if (curSize > growThreshold) {
      growTable()
    }
  }

util.collection.AppendOnlyMap.growTable

  protected def growTable() {
    // Create newData with double the capacity
    val newCapacity = capacity * 2
    require(newCapacity <= MAXIMUM_CAPACITY, s"Can't contain more than ${growThreshold} elements")
    val newData = new Array[AnyRef](2 * newCapacity)
    // Generating newMask
    val newMask = newCapacity - 1
    var oldPos = 0
    while (oldPos < capacity) {
      // Recompute each entry's position using newMask
      // and copy it into newData
      if (!data(2 * oldPos).eq(null)) {
        val key = data(2 * oldPos)
        val value = data(2 * oldPos + 1)
        var newPos = rehash(key.hashCode) & newMask
        var i = 1
        var keepGoing = true
        while (keepGoing) {
          val curKey = newData(2 * newPos)
          if (curKey.eq(null)) {
            newData(2 * newPos) = key
            newData(2 * newPos + 1) = value
            keepGoing = false
          } else {
            val delta = i
            newPos = (newPos + delta) & newMask
            i += 1
          }
        }
      }
      oldPos += 1
    }
    // Update references to the new table
    data = newData
    capacity = newCapacity
    mask = newMask
    growThreshold = (LOAD_FACTOR * newCapacity).toInt
  }
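
Note that the capacity is always kept a power of two, so a mask of capacity - 1 turns any (rehashed) hash code into a valid index; a quick self-contained check:

  // With a power-of-two capacity, (hash & mask) always lands in [0, capacity),
  // which is why growTable only needs to recompute newMask = newCapacity - 1.
  val capacity = 64                 // a power of two
  val mask = capacity - 1           // binary 111111
  Seq(123456789, -987654321, 0, Int.MinValue).foreach { h =>
    assert((h & mask) >= 0 && (h & mask) < capacity)
  }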

util.collection.SizeTracker.afterUpdate

Let's look back at SizeTrackingAppendOnlyMap.changeValue, which calls super.afterUpdate() to sample the size of the AppendOnlyMap. Size sampling estimates how much the AppendOnlyMap changes per update. Measuring the map's actual size after every single operation, such as an insert or update, would severely degrade performance, so a sampling-based estimation is used instead:

  protected def afterUpdate(): Unit = {
    numUpdates += 1
    // If numUpdates reaches the sampling threshold,
    // take a sample
    if (nextSampleNum == numUpdates) {
      takeSample()
    }
  }

util.collection.SizeTracker.takeSample

  private def takeSample(): Unit = {
    samples.enqueue(Sample(SizeEstimator.estimate(this), numUpdates))
    // Only two samples are used.
    if (samples.size > 2) {
      samples.dequeue()
    }
    val bytesDelta = samples.toList.reverse match {
    // Estimate the amount of change per update
      case latest :: previous :: tail =>
        (latest.size - previous.size).toDouble / (latest.numUpdates - previous.numUpdates)
      // If less than 2 samples, no change is assumed.
      case _ => 0
    }
    // Update the per-update growth estimate
    bytesPerUpdate = math.max(0, bytesDelta)
    // Schedule the next sample (sample points grow exponentially)
    nextSampleNum = math.ceil(numUpdates * SAMPLE_GROWTH_RATE).toLong
  }

Let's look at the function that estimates the size of AppendOnlyMap:

  def estimateSize(): Long = {
    assert(samples.nonEmpty)
    // Extrapolate the total growth since the last sample
    val extrapolatedDelta = bytesPerUpdate * (numUpdates - samples.last.numUpdates)
    // Last sampled size plus the extrapolated growth
    (samples.last.size + extrapolatedDelta).toLong
  }

Write Buffer

Now let's go back to insertAll and look at how values are inserted directly into the buffer when no aggregation is needed.

Call stack:

  • util.collection.PartitionedPairBuffer.insert
    • util.collection.PartitionedPairBuffer.growArray

util.collection.PartitionedPairBuffer.insert

  def insert(partition: Int, key: K, value: V): Unit = {
    // When the buffer is full, call growArray()
    if (curSize == capacity) {
      growArray()
    }
    data(2 * curSize) = (partition, key.asInstanceOf[AnyRef])
    data(2 * curSize + 1) = value.asInstanceOf[AnyRef]
    curSize += 1
    afterUpdate()
  }

util.collection.PartitionedPairBuffer.growArray

  private def growArray(): Unit = {
    if (capacity >= MAXIMUM_CAPACITY) {
      throw new IllegalStateException(s"Can't insert more than ${MAXIMUM_CAPACITY} elements")
    }
    val newCapacity =
      if (capacity * 2 < 0 || capacity * 2 > MAXIMUM_CAPACITY) { // Overflow
        MAXIMUM_CAPACITY
      } else {
        capacity * 2
      }
    // Allocate a new array with double the capacity
    val newArray = new Array[AnyRef](2 * newCapacity)
    // copy
    System.arraycopy(data, 0, newArray, 0, 2 * capacity)
    data = newArray
    capacity = newCapacity
    resetSamples()
  }
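
The capacity * 2 < 0 check guards against Int overflow: doubling a capacity of 2^30 or more wraps around to a negative number, as this tiny example shows.

  // Why growArray treats a negative doubled capacity as overflow:
  val capacity = 1 << 30            // 1073741824
  val doubled  = capacity * 2       // wraps to -2147483648 (Int.MinValue)
  assert(doubled < 0)               // so the code clamps to MAXIMUM_CAPACITY instead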

Spill

Now let's go back to insertAll to see how to write to disk when the threshold is exceeded:

Call stack:

  • util.collection.ExternalSorter.maybeSpillCollection
    • util.collection.Spillable.maybeSpill
      • util.collection.Spillable.spill
        • util.collection.ExternalSorter.spillMemoryIteratorToDisk

util.collection.ExternalSorter.maybeSpillCollection

  private def maybeSpillCollection(usingMap: Boolean): Unit = {
    var estimatedSize = 0L
    if (usingMap) {
      estimatedSize = map.estimateSize()
      if (maybeSpill(map, estimatedSize)) {
        map = new PartitionedAppendOnlyMap[K, C]
      }
    } else {
      estimatedSize = buffer.estimateSize()
      if (maybeSpill(buffer, estimatedSize)) {
        buffer = new PartitionedPairBuffer[K, C]
      }
    }

    if (estimatedSize > _peakMemoryUsedBytes) {
      _peakMemoryUsedBytes = estimatedSize
    }
  }

util.collection.Spillable.maybeSpill

  protected def maybeSpill(collection: C, currentMemory: Long): Boolean = {
    var shouldSpill = false
    if (elementsRead % 32 == 0 && currentMemory >= myMemoryThreshold) {
      // If the collection's size exceeds the current threshold,
      // amountToRequest is the additional memory to ask for
      val amountToRequest = 2 * currentMemory - myMemoryThreshold
      val granted = acquireMemory(amountToRequest)
      myMemoryThreshold += granted
      // If too little memory was granted
      // (either the acquire returned 0,
      // or the collection already uses more than the new myMemoryThreshold),
      // then currentMemory >= myMemoryThreshold still holds
      // and we should spill
      shouldSpill = currentMemory >= myMemoryThreshold
    }
    // Also spill if the number of elements read exceeds the force-spill threshold
    shouldSpill = shouldSpill || _elementsRead > numElementsForceSpillThreshold
    if (shouldSpill) {
      // Increment the spill count
      _spillCount += 1
      logSpillage(currentMemory)
      // Spill operation
      spill(collection)
      // Reset the element-read counter
      _elementsRead = 0
      // Record the spilled memory in the metrics,
      // then release the memory
      _memoryBytesSpilled += currentMemory
      releaseMemory()
    }
    shouldSpill
  }
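
A worked example of the request arithmetic, with made-up numbers: suppose the current threshold is 5 MB and the collection is estimated at 6 MB.

  // Illustrative numbers for maybeSpill's memory request:
  val myMemoryThreshold = 5L * 1024 * 1024                       // 5 MB already granted
  val currentMemory     = 6L * 1024 * 1024                       // estimated collection size
  val amountToRequest   = 2 * currentMemory - myMemoryThreshold  // ask for 7 MB, aiming at a 2x footprint

  // If the full 7 MB is granted, the threshold becomes 12 MB and no spill happens yet.
  // If only 1 MB is granted, the threshold becomes 6 MB, currentMemory >= myMemoryThreshold
  // still holds, and the collection is spilled to disk.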

util.collection.Spillable.spill

spill writes the in-memory collection to a sorted file. SortShuffleWriter.write later calls sorter.writePartitionedFile to merge these spill files.

  override protected[this] def spill(collection: WritablePartitionedPairCollection[K, C]): Unit = {
  // Build a sorted iterator over the in-memory collection
  // (this part is explained later)
    val inMemoryIterator = collection.destructiveSortedWritablePartitionedIterator(comparator)
    // Generate spill file,
    // And add it to the array
    val spillFile = spillMemoryIteratorToDisk(inMemoryIterator)
    spills += spillFile
  }

util.collection.ExternalSorter.spillMemoryIteratorToDisk

  private[this] def spillMemoryIteratorToDisk(inMemoryIterator: WritablePartitionedIterator)
      : SpilledFile = {
    // Create a temporary file and its blockId
    val (blockId, file) = diskBlockManager.createTempShuffleBlock()

    // These values are reset after each flush
    var objectsWritten: Long = 0
    val spillMetrics: ShuffleWriteMetrics = new ShuffleWriteMetrics
    val writer: DiskBlockObjectWriter =
      blockManager.getDiskWriter(blockId, file, serInstance, fileBufferSize, spillMetrics)

    // Record the size of each batch, in the order written to disk
    val batchSizes = new ArrayBuffer[Long]

    // Record how many elements each partition has
    val elementsPerPartition = new Array[Long](numPartitions)

    // Flush writer content to disk,
    // And update related variables
    def flush(): Unit = {
      val segment = writer.commitAndGet()
      batchSizes += segment.length
      _diskBytesSpilled += segment.length
      objectsWritten = 0
    }

    var success = false
    try {
      // Traverse the in-memory collection
      while (inMemoryIterator.hasNext) {
        val partitionId = inMemoryIterator.nextPartition()
        require(partitionId >= 0 && partitionId < numPartitions,
          s"partition Id: ${partitionId} should be in the range [0, ${numPartitions})")
        inMemoryIterator.writeNext(writer)
        elementsPerPartition(partitionId) += 1
        objectsWritten += 1

        // When the number of objects written reaches the serializer batch size,
        // flush
        if (objectsWritten == serializerBatchSize) {
          flush()
        }
      }
      if (objectsWritten > 0) {
        // Flush whatever remains after the traversal
        flush()
      } else {
        writer.revertPartialWritesAndClose()
      }
      success = true
    } finally {
      if (success) {
        writer.close()
      } else {
        writer.revertPartialWritesAndClose()
        if (file.exists()) {
          if (!file.delete()) {
            logWarning(s"Error deleting ${file}")
          }
        }
      }
    }

    SpilledFile(file, blockId, batchSizes.toArray, elementsPerPartition)
  }

Sort and merge

Let's go back to SortShuffleWriter.write:

      // During the external sort,
      // part of the results may still be in memory
      // and the rest in one or more spill files;
      // they need to be merged into a single large file
      val partitionLengths = sorter.writePartitionedFile(blockId, tmp)

Call stack:

  • util.collection.ExternalSorter.writePartitionedFile
    • util.collection.ExternalSorter.destructiveSortedWritablePartitionedIterator
    • util.collection.ExternalSorter.partitionedIterator
      • partitionedDestructiveSortedIterator

util.collection.ExternalSorter.writePartitionedFile

Let's take a closer look at writePartitionedFile, which writes all the data added to the ExternalSorter into a disk file:

  def writePartitionedFile(
      blockId: BlockId,
      outputFile: File): Array[Long] = {

    // Track the length of each partition's range in the output file
    val lengths = new Array[Long](numPartitions)
    val writer = blockManager.getDiskWriter(blockId, outputFile, serInstance, fileBufferSize,
      context.taskMetrics().shuffleWriteMetrics)

    if (spills.isEmpty) {
      // When all the data is still in memory (no spills)
      val collection = if (aggregator.isDefined) map else buffer
      val it = collection.destructiveSortedWritablePartitionedIterator(comparator)
      while (it.hasNext) {
        val partitionId = it.nextPartition()
        while (it.hasNext && it.nextPartition() == partitionId) {
          it.writeNext(writer)
        }
        val segment = writer.commitAndGet()
        lengths(partitionId) = segment.length
      }
    } else {
      // Otherwise, merge-sort must be done.
      // Get a partition iterator
      // And write all the data directly
      for ((id, elements) <- this.partitionedIterator) {
        if (elements.hasNext) {
          for (elem <- elements) {
            writer.write(elem._1, elem._2)
          }
          val segment = writer.commitAndGet()
          lengths(id) = segment.length
        }
      }
    }

    writer.close()
    context.taskMetrics().incMemoryBytesSpilled(memoryBytesSpilled)
    context.taskMetrics().incDiskBytesSpilled(diskBytesSpilled)
    context.taskMetrics().incPeakExecutionMemory(peakMemoryUsedBytes)

    lengths
  }

util.collection.ExternalSorter.destructiveSortedWritablePartitionedIterator

In writePartitionedFile, an iterator is generated with destructiveSortedWritablePartitionedIterator:

val it = collection.destructiveSortedWritablePartitionedIterator(comparator)

It also appeared above, in util.collection.Spillable.spill:

val inMemoryIterator = collection.destructiveSortedWritablePartitionedIterator(comparator)

Let's look at destructiveSortedWritablePartitionedIterator:

  def destructiveSortedWritablePartitionedIterator(keyComparator: Option[Comparator[K]])
    : WritablePartitionedIterator = {
    // Generate the underlying sorted iterator
    val it = partitionedDestructiveSortedIterator(keyComparator)
    new WritablePartitionedIterator {
      private[this] var cur = if (it.hasNext) it.next() else null

      def writeNext(writer: DiskBlockObjectWriter): Unit = {
        writer.write(cur._1._2, cur._2)
        cur = if (it.hasNext) it.next() else null
      }

      def hasNext(): Boolean = cur != null

      def nextPartition(): Int = cur._1._1
    }
  }

You can see that WritablePartitionedIterator is essentially a proxy for the iterator returned by partitionedDestructiveSortedIterator. Instead of returning values, it takes a DiskBlockObjectWriter and writes the current record to it. We'll set partitionedDestructiveSortedIterator aside for the moment and keep reading downward.
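
As a hedged sketch of the same proxy idea (illustrative names, with a collecting "writer" standing in for DiskBlockObjectWriter), this shows how a caller drives such an iterator record by record:

  // A stand-in writer that just collects (key, value) pairs.
  class CollectingWriter {
    val out = scala.collection.mutable.ArrayBuffer.empty[(Any, Any)]
    def write(key: Any, value: Any): Unit = out += ((key, value))
  }

  // Drive an iterator of ((partitionId, key), value) the way writePartitionedFile does.
  def drain[K, V](it: Iterator[((Int, K), V)], writer: CollectingWriter): Unit = {
    var cur: ((Int, K), V) = if (it.hasNext) it.next() else null
    while (cur != null) {
      writer.write(cur._1._2, cur._2)              // strip the partition id, write (key, value)
      cur = if (it.hasNext) it.next() else null
    }
  }

  val w = new CollectingWriter
  drain(Iterator(((0, "a"), 1), ((1, "b"), 2)), w)
  // w.out == ArrayBuffer((a,1), (b,2))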

util.collection.ExternalSorter.partitionedIterator

Unlike the other branch, this branch calls partitionedIterator to get a per-partition iterator and writes all the data directly. Let's take a closer look at partitionedIterator:

  def partitionedIterator: Iterator[(Int, Iterator[Product2[K, C]])] = {
    val usingMap = aggregator.isDefined
    val collection: WritablePartitionedPairCollection[K, C] = if (usingMap) map else buffer
    if (spills.isEmpty) {
      // When there are no spills.
      // Following the flow we traced above, this branch is not taken here.
      if (!ordering.isDefined) {
        // If keys don't need to be sorted,
        // order by partition only
        groupByPartition(destructiveIterator(collection.partitionedDestructiveSortedIterator(None)))
      } else {
        // Otherwise sort by both partition and key
        groupByPartition(destructiveIterator(
          collection.partitionedDestructiveSortedIterator(Some(keyComparator))))
      }
    } else {
      // When there are spills,
      // merge the spilled temporary files with the data still in memory
      merge(spills, destructiveIterator(
        collection.partitionedDestructiveSortedIterator(comparator)))
    }
  }

Let's first look at the spills.isEmpty branch, which covers two cases:

  • Sort by partition only:
    partitionedDestructiveSortedIterator is passed None, meaning the keys are not sorted. Sorting by partition is done by default inside partitionedDestructiveSortedIterator; we will come back to it later.
groupByPartition(destructiveIterator(collection.partitionedDestructiveSortedIterator(None)))

After sorting by partition, the records are grouped by partition:

  private def groupByPartition(data: Iterator[((Int, K), C)])
      : Iterator[(Int, Iterator[Product2[K, C]])] =
  {
    val buffered = data.buffered
    (0 until numPartitions).iterator.map(p => (p, new IteratorForPartition(p, buffered)))
  }

IteratorForPartition is an iterator over a single partition:

  private[this] class IteratorForPartition(partitionId: Int, data: BufferedIterator[((Int, K), C)])
    extends Iterator[Product2[K, C]]
  {
    override def hasNext: Boolean = data.hasNext && data.head._1._1 == partitionId

    override def next(): Product2[K, C] = {
      if (!hasNext) {
        throw new NoSuchElementException
      }
      val elem = data.next()
      (elem._1._2, elem._2)
    }
  }
  • Sort by both partition and key:
groupByPartition(destructiveIterator(
          collection.partitionedDestructiveSortedIterator(Some(keyComparator))))

Here keyComparator is passed into partitionedDestructiveSortedIterator:

  private val keyComparator: Comparator[K] = ordering.getOrElse(new Comparator[K] {
    override def compare(a: K, b: K): Int = {
      val h1 = if (a == null) 0 else a.hashCode()
      val h2 = if (b == null) 0 else b.hashCode()
      if (h1 < h2) -1 else if (h1 == h2) 0 else 1
    }
  })

By default, keys are compared by their hashCode, and groupByPartition then groups the sorted records by partition. Ordering by hashCode is enough to bring identical keys together, but it is not a total ordering on the keys themselves.
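
A tiny self-contained illustration (made-up data, not Spark code) of how a partition-sorted iterator is split into per-partition iterators, in the spirit of groupByPartition and IteratorForPartition:

  // Records already sorted by partition id; keys are strings, values are counts.
  val sorted = Iterator(((0, "a"), 1), ((0, "b"), 2), ((1, "c"), 3), ((2, "d"), 4))
  val buffered = sorted.buffered
  val numPartitions = 3

  // For each partition id, expose only the records whose leading partition id matches.
  val grouped: Iterator[(Int, Iterator[(String, Int)])] =
    (0 until numPartitions).iterator.map { p =>
      (p, new Iterator[(String, Int)] {
        def hasNext: Boolean = buffered.hasNext && buffered.head._1._1 == p
        def next(): (String, Int) = { val e = buffered.next(); (e._1._2, e._2) }
      })
    }

  grouped.foreach { case (p, it) => println(s"partition $p -> ${it.toList}") }
  // partition 0 -> List((a,1), (b,2))
  // partition 1 -> List((c,3))
  // partition 2 -> List((d,4))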

For spills, we use comparator:

  private def comparator: Option[Comparator[K]] = {
  // If sorting or aggregation is required
    if (ordering.isDefined || aggregator.isDefined) {
      Some(keyComparator)
    } else {
      None
    }
  }

partitionedDestructiveSortedIterator

Now let's take a look at partitionedDestructiveSortedIterator. It is a method declared in the trait WritablePartitionedPairCollection, which is implemented by both PartitionedAppendOnlyMap and PartitionedPairBuffer. In partitionedIterator you can see that:

    val usingMap = aggregator.isDefined
    val collection: WritablePartitionedPairCollection[K, C] = if (usingMap) map else buffer

If aggregation is required, the PartitionedAppendOnlyMap is used; otherwise, the PartitionedPairBuffer.

util.collection.PartitionedPairBuffer.partitionedDestructiveSortedIterator

Let's start with the simpler PartitionedPairBuffer.partitionedDestructiveSortedIterator:

  override def partitionedDestructiveSortedIterator(keyComparator: Option[Comparator[K]])
    : Iterator[((Int, K), V)] = {
    val comparator = keyComparator.map(partitionKeyComparator).getOrElse(partitionComparator)
    // Sort the data
    new Sorter(new KVArraySortDataFormat[(Int, K), AnyRef]).sort(data, 0, curSize, comparator)
    iterator
  }

We can see that:

 val comparator = keyComparator.map(partitionKeyComparator).getOrElse(partitionComparator)

If a keyComparator is supplied, it is wrapped by partitionKeyComparator, which performs a two-level sort by partition and then by key. If the incoming keyComparator is None, only the partition is sorted (partitionComparator):

  def partitionKeyComparator[K](keyComparator: Comparator[K]): Comparator[(Int, K)] = {
    new Comparator[(Int, K)] {
      override def compare(a: (Int, K), b: (Int, K)): Int = {
        val partitionDiff = a._1 - b._1
        if (partitionDiff != 0) {
          partitionDiff
        } else {
          keyComparator.compare(a._2, b._2)
        }
      }
    }
  }
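
A quick illustrative check of this two-level ordering, reusing the comparison rule from the listing above with a plain string comparison for keys:

  // Partition id decides first; only within the same partition does the key comparator matter.
  def compare(a: (Int, String), b: (Int, String)): Int = {
    val partitionDiff = a._1 - b._1
    if (partitionDiff != 0) partitionDiff else a._2.compareTo(b._2)
  }
  assert(compare((0, "z"), (1, "a")) < 0)  // different partitions: the partition id wins
  assert(compare((1, "a"), (1, "b")) < 0)  // same partition: fall back to key order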

The data is then sorted with the Sorter, which uses TimSort internally; we will explain it in depth in a future post.

Finally, the iterator is returned, which simply traverses the data in pairs:

  private def iterator(): Iterator[((Int, K), V)] = new Iterator[((Int, K), V)] {
    var pos = 0

    override def hasNext: Boolean = pos < curSize

    override def next(): ((Int, K), V) = {
      if (!hasNext) {
        throw new NoSuchElementException
      }
      val pair = (data(2 * pos).asInstanceOf[(Int, K)], data(2 * pos + 1).asInstanceOf[V])
      pos += 1
      pair
    }
  }

util.collection.PartitionedAppendOnlyMap.partitionedDestructiveSortedIterator

  def partitionedDestructiveSortedIterator(keyComparator: Option[Comparator[K]])
    : Iterator[((Int, K), V)] = {
    val comparator = keyComparator.map(partitionKeyComparator).getOrElse(partitionComparator)
    destructiveSortedIterator(comparator)
  }

util.collection.AppendOnlyMap.destructiveSortedIterator

  def destructiveSortedIterator(keyComparator: Comparator[K]): Iterator[(K, V)] = {
    destroyed = true
    // Compact all non-null entries to the front of the array
    var keyIndex, newIndex = 0
    while (keyIndex < capacity) {
      if (data(2 * keyIndex) != null) {
        data(2 * newIndex) = data(2 * keyIndex)
        data(2 * newIndex + 1) = data(2 * keyIndex + 1)
        newIndex += 1
      }
      keyIndex += 1
    }
    assert(curSize == newIndex + (if (haveNullValue) 1 else 0))

    new Sorter(new KVArraySortDataFormat[K, AnyRef]).sort(data, 0, newIndex, keyComparator)

    // Return the new Iterator
    new Iterator[(K, V)] {
      var i = 0
      var nullValueReady = haveNullValue
      def hasNext: Boolean = (i < newIndex || nullValueReady)
      def next(): (K, V) = {
        if (nullValueReady) {
          nullValueReady = false
          (null.asInstanceOf[K], nullValue)
        } else {
          val item = (data(2 * i).asInstanceOf[K], data(2 * i + 1).asInstanceOf[V])
          i += 1
          item
        }
      }
    }
  }
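
The compaction step at the top of this method can be illustrated on its own (toy data, same flat array layout as above): non-null (key, value) slots are shifted to the front, and only the first newIndex pairs are then sorted.

  // Toy illustration of the in-place compaction before sorting.
  val data = Array[AnyRef]("a", Int.box(1), null, null, "b", Int.box(2), null, null)
  val capacity = data.length / 2
  var keyIndex, newIndex = 0
  while (keyIndex < capacity) {
    if (data(2 * keyIndex) != null) {
      data(2 * newIndex) = data(2 * keyIndex)
      data(2 * newIndex + 1) = data(2 * keyIndex + 1)
      newIndex += 1
    }
    keyIndex += 1
  }
  // data now begins with "a", 1, "b", 2 and newIndex == 2,
  // so only the first two (key, value) pairs would be passed to the Sorter.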

Topics: Spark