Class ReplicationTracker

  • All Implemented Interfaces:
    java.util.function.LongSupplier, IndexShardComponent

    public class ReplicationTracker
    extends AbstractIndexShardComponent
    implements java.util.function.LongSupplier
    This class is responsible for tracking the replication group with its progress and safety markers (local and global checkpoints). The global checkpoint is the highest sequence number for which all lower (or equal) sequence numbers have been processed on all shards that are currently active. Since shards count as "active" when the master starts them, and before this primary shard has been notified of this fact, we also include shards that have completed recovery. These shards have received all old operations via the recovery mechanism and are kept up to date by the various replication actions. The set of shards taken into account for the global checkpoint calculation is called the "in-sync shards".

    The global checkpoint is maintained by the primary shard and is replicated to all the replicas (via GlobalCheckpointSyncAction).
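
    As a rough illustration of the definition above, the following minimal sketch computes a global checkpoint as the minimum of the local checkpoints reported by the in-sync shard copies. It is a simplified model, not the actual implementation: the class GlobalCheckpointSketch, its method, and the map-based bookkeeping are hypothetical, the constant value is assumed for this sketch, and it assumes that each copy's local checkpoint is the highest sequence number up to which that copy has processed every operation.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified model for illustration; not the Elasticsearch implementation.
class GlobalCheckpointSketch {

    // Stand-in for SequenceNumbers.UNASSIGNED_SEQ_NO (value assumed for this sketch).
    static final long UNASSIGNED_SEQ_NO = -2L;

    /**
     * Returns the smallest local checkpoint among the in-sync copies, keyed by
     * allocation ID. If the map is empty or any copy has not reported a
     * checkpoint yet, no global checkpoint can be derived.
     */
    static long computeGlobalCheckpoint(Map<String, Long> inSyncLocalCheckpoints) {
        if (inSyncLocalCheckpoints.isEmpty()) {
            return UNASSIGNED_SEQ_NO;
        }
        long minimum = Long.MAX_VALUE;
        for (long localCheckpoint : inSyncLocalCheckpoints.values()) {
            if (localCheckpoint == UNASSIGNED_SEQ_NO) {
                return UNASSIGNED_SEQ_NO;
            }
            minimum = Math.min(minimum, localCheckpoint);
        }
        return minimum;
    }

    public static void main(String[] args) {
        Map<String, Long> inSync = new HashMap<>();
        inSync.put("primary-alloc", 42L);
        inSync.put("replica-alloc-1", 40L);
        inSync.put("replica-alloc-2", 37L);
        // The slowest in-sync copy bounds the global checkpoint.
        System.out.println(computeGlobalCheckpoint(inSync));
    }
}
```

    In this model a single lagging in-sync copy holds the global checkpoint back, which is why only copies that have completed recovery are included in the calculation.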

    • Constructor Detail

      • ReplicationTracker

        public ReplicationTracker​(ShardId shardId,
                                  java.lang.String allocationId,
                                  IndexSettings indexSettings,
                                  long operationPrimaryTerm,
                                  long globalCheckpoint,
                                  java.util.function.LongConsumer onGlobalCheckpointUpdated,
                                  java.util.function.LongSupplier currentTimeMillisSupplier,
                                  java.util.function.BiConsumer<RetentionLeases,​ActionListener<ReplicationResponse>> onSyncRetentionLeases)
        Initialize the global checkpoint service. The specified global checkpoint should be set to the last known global checkpoint, or SequenceNumbers.UNASSIGNED_SEQ_NO.
        Parameters:
        shardId - the shard ID
        allocationId - the allocation ID
        indexSettings - the index settings
        operationPrimaryTerm - the current primary term
        globalCheckpoint - the last known global checkpoint for this shard, or SequenceNumbers.UNASSIGNED_SEQ_NO
        onSyncRetentionLeases - a callback when a new retention lease is created or an existing retention lease expires
    • Method Detail

      • getRetentionLeases

        public RetentionLeases getRetentionLeases()
        Get all retention leases tracked on this shard.
        Returns:
        the retention leases
      • getRetentionLeases

        public Tuple<java.lang.Boolean,​RetentionLeases> getRetentionLeases​(boolean expireLeases)
        If expireLeases is false, gets all retention leases tracked on this shard. Otherwise, first calculates the expiration of existing retention leases, and then gets all non-expired retention leases tracked on this shard. Note that only the primary shard calculates which leases are expired, and if any have expired, it syncs the retention leases to any replicas. If expireLeases is true, this replication tracker must be in primary mode.
        Returns:
        a tuple indicating whether or not any retention leases were expired, and the non-expired retention leases
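
        The two behaviours of this method can be sketched as follows. This is a hypothetical, simplified model: the Lease class, the retentionMillis parameter, and the rule that a lease expires once its age exceeds that period are assumptions made for illustration, and a Map.Entry stands in for the Tuple return type. Syncing to replicas and the primary-mode check are omitted.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical, simplified model for illustration; not the Elasticsearch implementation.
class LeaseExpirySketch {

    // Minimal stand-in for a retention lease (fields assumed for this sketch).
    static final class Lease {
        final String id;
        final long retainingSequenceNumber;
        final long timestampMillis;

        Lease(String id, long retainingSequenceNumber, long timestampMillis) {
            this.id = id;
            this.retainingSequenceNumber = retainingSequenceNumber;
            this.timestampMillis = timestampMillis;
        }
    }

    /**
     * Returns (anyExpired, leases). With expireLeases == false, all leases are
     * returned unchanged. Otherwise, leases older than retentionMillis are dropped.
     */
    static Map.Entry<Boolean, List<Lease>> getLeases(
            List<Lease> leases, boolean expireLeases, long nowMillis, long retentionMillis) {
        if (!expireLeases) {
            List<Lease> copy = new ArrayList<>(leases);
            return new AbstractMap.SimpleImmutableEntry<>(false, copy);
        }
        List<Lease> nonExpired = new ArrayList<>();
        for (Lease lease : leases) {
            // Assumed expiry rule: a lease is kept while its age is within the retention period.
            if (nowMillis - lease.timestampMillis <= retentionMillis) {
                nonExpired.add(lease);
            }
        }
        boolean anyExpired = nonExpired.size() != leases.size();
        return new AbstractMap.SimpleImmutableEntry<>(anyExpired, nonExpired);
    }

    public static void main(String[] args) {
        List<Lease> leases = new ArrayList<>();
        leases.add(new Lease("old", 5L, 0L));
        leases.add(new Lease("fresh", 9L, 900L));
        Map.Entry<Boolean, List<Lease>> result = getLeases(leases, true, 1000L, 500L);
        System.out.println(result.getKey() + " " + result.getValue().size());
    }
}
```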
      • addRetentionLease

        public RetentionLease addRetentionLease​(java.lang.String id,
                                                long retainingSequenceNumber,
                                                java.lang.String source,
                                                ActionListener<ReplicationResponse> listener)
        Adds a new retention lease.
        Parameters:
        id - the identifier of the retention lease
        retainingSequenceNumber - the retaining sequence number
        source - the source of the retention lease
        listener - the callback when the retention lease is successfully added and synced to replicas
        Returns:
        the new retention lease
        Throws:
        java.lang.IllegalArgumentException - if the specified retention lease already exists
      • renewRetentionLease

        public RetentionLease renewRetentionLease​(java.lang.String id,
                                                  long retainingSequenceNumber,
                                                  java.lang.String source)
        Renews an existing retention lease.
        Parameters:
        id - the identifier of the retention lease
        retainingSequenceNumber - the retaining sequence number
        source - the source of the retention lease
        Returns:
        the renewed retention lease
        Throws:
        RetentionLeaseNotFoundException - if the specified retention lease does not exist
        java.lang.IllegalArgumentException - if the new retaining sequence number is lower than the retaining sequence number of the current retention lease.
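
        The error conditions documented for addRetentionLease and renewRetentionLease can be sketched with a small in-memory registry. This is a hypothetical model for illustration only: the class name and the map of lease ID to retaining sequence number are assumptions, NoSuchElementException stands in for RetentionLeaseNotFoundException, and the source, listener, and replica sync are left out.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.NoSuchElementException;

// Hypothetical, simplified model for illustration; not the Elasticsearch implementation.
class RetentionLeaseContractSketch {

    // Maps lease ID to its current retaining sequence number.
    private final Map<String, Long> retainingSeqNoById = new HashMap<>();

    /** Adds a lease; rejects an ID that is already present. */
    long add(String id, long retainingSequenceNumber) {
        if (retainingSeqNoById.containsKey(id)) {
            throw new IllegalArgumentException("retention lease [" + id + "] already exists");
        }
        retainingSeqNoById.put(id, retainingSequenceNumber);
        return retainingSequenceNumber;
    }

    /** Renews a lease; the retaining sequence number may stay equal or move forward, never backward. */
    long renew(String id, long retainingSequenceNumber) {
        Long current = retainingSeqNoById.get(id);
        if (current == null) {
            // Stand-in for RetentionLeaseNotFoundException.
            throw new NoSuchElementException("retention lease [" + id + "] not found");
        }
        if (retainingSequenceNumber < current) {
            throw new IllegalArgumentException(
                "retaining sequence number [" + retainingSequenceNumber + "] is lower than [" + current + "]");
        }
        retainingSeqNoById.put(id, retainingSequenceNumber);
        return retainingSequenceNumber;
    }

    public static void main(String[] args) {
        RetentionLeaseContractSketch leases = new RetentionLeaseContractSketch();
        leases.add("peer-recovery", 10L);
        leases.renew("peer-recovery", 15L);
        try {
            leases.renew("peer-recovery", 12L);
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```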
      • removeRetentionLease

        public void removeRetentionLease​(java.lang.String id,
                                         ActionListener<ReplicationResponse> listener)
        Removes an existing retention lease.
        Parameters:
        id - the identifier of the retention lease
        listener - the callback when the retention lease is successfully removed and synced to replicas
      • updateRetentionLeasesOnReplica

        public void updateRetentionLeasesOnReplica​(RetentionLeases retentionLeases)
        Updates retention leases on a replica.
        Parameters:
        retentionLeases - the retention leases
      • loadRetentionLeases

        public RetentionLeases loadRetentionLeases​(java.nio.file.Path path)
                                            throws java.io.IOException
        Loads the latest retention leases from their dedicated state file.
        Parameters:
        path - the path to the directory containing the state file
        Returns:
        the retention leases
        Throws:
        java.io.IOException - if an I/O exception occurs reading the retention leases
      • persistRetentionLeases

        public void persistRetentionLeases​(java.nio.file.Path path)
                                    throws java.io.IOException
        Persists the current retention leases to their dedicated state file. If this version of the retention leases is already persisted, then persistence is skipped.
        Parameters:
        path - the path to the directory containing the state file
        Throws:
        java.io.IOException - if an exception occurs writing the state file
      • assertRetentionLeasesPersisted

        public boolean assertRetentionLeasesPersisted​(java.nio.file.Path path)
                                               throws java.io.IOException
        Throws:
        java.io.IOException
      • getInSyncGlobalCheckpoints

        public com.carrotsearch.hppc.ObjectLongMap<java.lang.String> getInSyncGlobalCheckpoints()
        Get the local knowledge of the global checkpoints for all in-sync allocation IDs.
        Returns:
        a map from allocation ID to the local knowledge of the global checkpoint for that allocation ID
      • isPrimaryMode

        public boolean isPrimaryMode()
        Returns whether the replication tracker is in primary mode, i.e., whether the current shard is acting as primary from the point of view of replication.
      • getOperationPrimaryTerm

        public long getOperationPrimaryTerm()
        Returns the current operation primary term.
        Returns:
        the primary term
      • setOperationPrimaryTerm

        public void setOperationPrimaryTerm​(long operationPrimaryTerm)
        Sets the current operation primary term. This method should be invoked only when no other operations are possible on the shard. That is, either from the constructor of IndexShard or while holding all permits on the IndexShard instance.
        Parameters:
        operationPrimaryTerm - the new operation primary term
      • isRelocated

        public boolean isRelocated()
        Returns whether the replication tracker has relocated away to another shard copy.
      • getReplicationGroup

        public ReplicationGroup getReplicationGroup()
        Returns the current replication group for the shard.
        Returns:
        the replication group
      • getGlobalCheckpoint

        public long getGlobalCheckpoint()
        Returns the global checkpoint for the shard.
        Returns:
        the global checkpoint
      • getAsLong

        public long getAsLong()
        Specified by:
        getAsLong in interface java.util.function.LongSupplier
      • updateGlobalCheckpointOnReplica

        public void updateGlobalCheckpointOnReplica​(long globalCheckpoint,
                                                    java.lang.String reason)
        Updates the global checkpoint on a replica shard after it has been updated by the primary.
        Parameters:
        globalCheckpoint - the global checkpoint
        reason - the reason the global checkpoint was updated
      • updateGlobalCheckpointForShard

        public void updateGlobalCheckpointForShard​(java.lang.String allocationId,
                                                   long globalCheckpoint)
        Update the local knowledge of the global checkpoint for the specified allocation ID.
        Parameters:
        allocationId - the allocation ID to update the global checkpoint for
        globalCheckpoint - the global checkpoint
      • activatePrimaryMode

        public void activatePrimaryMode​(long localCheckpoint)
        Initializes the global checkpoint tracker in primary mode (see primaryMode). Called on primary activation or promotion.
      • updateFromMaster

        public void updateFromMaster​(long applyingClusterStateVersion,
                                     java.util.Set<java.lang.String> inSyncAllocationIds,
                                     IndexShardRoutingTable routingTable,
                                     java.util.Set<java.lang.String> pre60AllocationIds)
        Notifies the tracker of the current allocation IDs in the cluster state.
        Parameters:
        applyingClusterStateVersion - the cluster state version being applied when updating the allocation IDs from the master
        inSyncAllocationIds - the allocation IDs of the currently in-sync shard copies
        routingTable - the shard routing table
        pre60AllocationIds - the allocation IDs of shards that are allocated to pre-6.0 nodes
      • initiateTracking

        public void initiateTracking​(java.lang.String allocationId)
        Called when the recovery process for a shard has opened the engine on the target shard. Ensures that the right data structures have been set up locally to track local checkpoint information for the shard and that the shard is added to the replication group.
        Parameters:
        allocationId - the allocation ID of the shard for which recovery was initiated
      • markAllocationIdAsInSync

        public void markAllocationIdAsInSync​(java.lang.String allocationId,
                                             long localCheckpoint)
                                      throws java.lang.InterruptedException
        Marks the shard with the provided allocation ID as in-sync with the primary shard. This method will block until the local checkpoint on the specified shard advances above the current global checkpoint.
        Parameters:
        allocationId - the allocation ID of the shard to mark as in-sync
        localCheckpoint - the current local checkpoint on the shard
        Throws:
        java.lang.InterruptedException
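
        The blocking behaviour described above can be sketched with a monitor that waits until the copy's local checkpoint has caught up. This is a hypothetical, simplified model: the class and field names are assumptions, the global checkpoint is fixed for the lifetime of the object, and "caught up" is read here as the local checkpoint being at least the global checkpoint, which is an interpretation rather than something the text above states exactly.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical, simplified model for illustration; not the Elasticsearch implementation.
class InSyncWaitSketch {

    private final Map<String, Long> localCheckpoints = new HashMap<>();
    private final Set<String> inSyncIds = new HashSet<>();
    private final long globalCheckpoint; // kept constant to keep the sketch small

    InSyncWaitSketch(long globalCheckpoint) {
        this.globalCheckpoint = globalCheckpoint;
    }

    /** Blocks the caller until the copy's local checkpoint has caught up, then marks it in-sync. */
    synchronized void markAsInSync(String allocationId, long localCheckpoint) throws InterruptedException {
        localCheckpoints.merge(allocationId, localCheckpoint, Long::max);
        while (localCheckpoints.get(allocationId) < globalCheckpoint) {
            wait(); // released by updateLocalCheckpoint via notifyAll
        }
        inSyncIds.add(allocationId);
    }

    /** Records a (possibly higher) local checkpoint and wakes up any waiting callers. */
    synchronized void updateLocalCheckpoint(String allocationId, long localCheckpoint) {
        localCheckpoints.merge(allocationId, localCheckpoint, Long::max);
        notifyAll();
    }

    synchronized boolean isInSync(String allocationId) {
        return inSyncIds.contains(allocationId);
    }

    public static void main(String[] args) throws InterruptedException {
        InSyncWaitSketch tracker = new InSyncWaitSketch(10L);
        Thread recovery = new Thread(() -> {
            try {
                tracker.markAsInSync("replica-alloc", 3L); // lags behind, so this blocks
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        recovery.start();
        tracker.updateLocalCheckpoint("replica-alloc", 10L); // replica catches up
        recovery.join();
        System.out.println(tracker.isInSync("replica-alloc"));
    }
}
```

        Because the caller's thread is parked until another thread reports progress, a method with this contract has to be invoked from a thread that is allowed to block.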
      • updateLocalCheckpoint

        public void updateLocalCheckpoint​(java.lang.String allocationId,
                                          long localCheckpoint)
        Notifies the service to update the local checkpoint for the shard with the provided allocation ID. If the checkpoint is lower than the currently known one, this is a no-op. If the allocation ID is not tracked, it is ignored.
        Parameters:
        allocationId - the allocation ID of the shard to update the local checkpoint for
        localCheckpoint - the local checkpoint for the shard
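
        A minimal sketch of the update rules stated above, where an untracked allocation ID is ignored and a checkpoint that does not advance is a no-op. The class, the track helper, and the boolean return value are hypothetical and exist only to make the two cases observable.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified model for illustration; not the Elasticsearch implementation.
class LocalCheckpointUpdateSketch {

    // Maps tracked allocation IDs to their last known local checkpoint.
    private final Map<String, Long> tracked = new HashMap<>();

    /** Hypothetical helper: starts tracking an allocation ID with an initial checkpoint. */
    void track(String allocationId, long initialLocalCheckpoint) {
        tracked.put(allocationId, initialLocalCheckpoint);
    }

    /** Returns true only if the stored checkpoint actually advanced. */
    boolean updateLocalCheckpoint(String allocationId, long localCheckpoint) {
        Long current = tracked.get(allocationId);
        if (current == null) {
            return false; // allocation ID is not tracked: ignored
        }
        if (localCheckpoint <= current) {
            return false; // checkpoint did not advance: no-op
        }
        tracked.put(allocationId, localCheckpoint);
        return true;
    }

    Long getLocalCheckpoint(String allocationId) {
        return tracked.get(allocationId);
    }

    public static void main(String[] args) {
        LocalCheckpointUpdateSketch tracker = new LocalCheckpointUpdateSketch();
        tracker.track("replica-alloc", 5L);
        System.out.println(tracker.updateLocalCheckpoint("replica-alloc", 8L)); // advances
        System.out.println(tracker.updateLocalCheckpoint("replica-alloc", 6L)); // stale, no-op
        System.out.println(tracker.updateLocalCheckpoint("unknown-alloc", 9L)); // untracked, ignored
    }
}
```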
      • startRelocationHandoff

        public ReplicationTracker.PrimaryContext startRelocationHandoff()
        Initiates a relocation handoff and returns the corresponding primary context.
      • abortRelocationHandoff

        public void abortRelocationHandoff()
        Fails a relocation handoff attempt.
      • completeRelocationHandoff

        public void completeRelocationHandoff()
        Marks a relocation handoff attempt as successful. Moves the tracker into replica mode.
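
        The three handoff methods above can be read as transitions of a small state machine. The sketch below is a hypothetical model of that reading: the flags and the IllegalStateException checks are assumptions for illustration, and the PrimaryContext returned by the real startRelocationHandoff is omitted.

```java
// Hypothetical, simplified model for illustration; not the Elasticsearch implementation.
class RelocationHandoffSketch {

    private boolean primaryMode = true;
    private boolean handoffInProgress = false;
    private boolean relocated = false;

    /** Begins a handoff; only a primary with no handoff already running may start one. */
    void startRelocationHandoff() {
        if (!primaryMode || handoffInProgress) {
            throw new IllegalStateException("cannot start handoff in the current state");
        }
        handoffInProgress = true;
    }

    /** Fails the attempt; the tracker stays in primary mode and may try again. */
    void abortRelocationHandoff() {
        if (!handoffInProgress) {
            throw new IllegalStateException("no handoff in progress");
        }
        handoffInProgress = false;
    }

    /** Completes the attempt; the tracker leaves primary mode and counts as relocated. */
    void completeRelocationHandoff() {
        if (!handoffInProgress) {
            throw new IllegalStateException("no handoff in progress");
        }
        handoffInProgress = false;
        primaryMode = false;
        relocated = true;
    }

    boolean isPrimaryMode() {
        return primaryMode;
    }

    boolean isRelocated() {
        return relocated;
    }

    public static void main(String[] args) {
        RelocationHandoffSketch tracker = new RelocationHandoffSketch();
        tracker.startRelocationHandoff();
        tracker.abortRelocationHandoff();    // first attempt fails, still primary
        tracker.startRelocationHandoff();
        tracker.completeRelocationHandoff(); // second attempt succeeds
        System.out.println(tracker.isPrimaryMode() + " " + tracker.isRelocated());
    }
}
```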
      • activateWithPrimaryContext

        public void activateWithPrimaryContext​(ReplicationTracker.PrimaryContext primaryContext)
        Activates the global checkpoint tracker in primary mode (see primaryMode). Called on primary relocation target during primary relocation handoff.
        Parameters:
        primaryContext - the primary context used to initialize the state
      • pendingInSync

        public boolean pendingInSync()
        Whether there are shards blocking global checkpoint advancement. Used by tests.
      • getTrackedLocalCheckpointForShard

        public ReplicationTracker.CheckpointState getTrackedLocalCheckpointForShard​(java.lang.String allocationId)
        Returns the local checkpoint information tracked for a specific shard. Used by tests.