Monday, August 17, 2009

Retrying Operations in Java

There are many cases in which you may wish to retry an operation a certain number of times. Examples are database failures, network communication failures or file IO problems.

Approach 1
This is the traditional approach and involves a counter and a loop.

final int numberOfRetries = 5 ;
final long timeToWait = 1000 ;

for (int i=0; i<numberOfRetries; i++) {
 //perform the operation
 try {
  Naming.lookup("rmi://localhost:2106/MyApp");
  break;
 }
 catch (Exception e) {
  logger.error("Retrying...",e);
  try {
   Thread.sleep(timeToWait);
  }
  catch (InterruptedException i) {
  }
 }
}
Approach 2
In this approach, we hide the retry counter in a separate class called RetryStrategy and call it like this:
public class RetryStrategy
{
 public static final int DEFAULT_NUMBER_OF_RETRIES = 5;
 public static final long DEFAULT_WAIT_TIME = 1000;

 private int numberOfRetries; //total number of tries
 private int numberOfTriesLeft; //number left
 private long timeToWait; //wait interval

 public RetryStrategy()
 {
  this(DEFAULT_NUMBER_OF_RETRIES, DEFAULT_WAIT_TIME);
 }

 public RetryStrategy(int numberOfRetries, long timeToWait)
 {
  this.numberOfRetries = numberOfRetries;
  numberOfTriesLeft = numberOfRetries;
  this.timeToWait = timeToWait;
 }

 /**
  * @return true if there are tries left
  */
 public boolean shouldRetry()
 {
  return numberOfTriesLeft > 0;
 }

 /**
  * This method should be called if a try fails.
  *
  * @throws RetryException if there are no more tries left
  */
 public void errorOccured() throws RetryException
 {
  numberOfTriesLeft --;
  if (!shouldRetry())
  {
   throw new RetryException(numberOfRetries +
     " attempts to retry failed at " + getTimeToWait() +
     "ms interval");
  }
  waitUntilNextTry();
 }

 /**
  * @return time period between retries
  */
 public long getTimeToWait()
 {
  return timeToWait ;
 }

 /**
  * Sleeps for the duration of the defined interval
  */
 private void waitUntilNextTry()
 {
  try
  {
   Thread.sleep(getTimeToWait());
  }
  catch (InterruptedException ignored) {}
 }

 public static void main(String[] args) {
  RetryStrategy retry = new RetryStrategy();
  while (retry.shouldRetry()) {
   try {
    Naming.lookup("rmi://localhost:2106/MyApp");
    break;
   }
   catch (Exception e) {
    try {
     retry.errorOccured();
    }
    catch (RetryException e1) {
     e.printStackTrace();
    }
   }
  }
 }
}
Approach 3
Approach 2, although cleaner, hasn't really reduced the number of lines of code we have to write. In the next approach, we hide the retry loop and all logic in a separate class called RetriableTask. We make the operation that we are going to retry Callable and wrap it in a RetriableTask which then handles all the retrying for us, behind-the-scenes:
public class RetriableTask<T> implements Callable<T> {

 private Callable<T> task;
 public static final int DEFAULT_NUMBER_OF_RETRIES = 5;
 public static final long DEFAULT_WAIT_TIME = 1000;

 private int numberOfRetries; // total number of tries
 private int numberOfTriesLeft; // number left
 private long timeToWait; // wait interval

 public RetriableTask(Callable<T> task) {
  this(DEFAULT_NUMBER_OF_RETRIES, DEFAULT_WAIT_TIME, task);
 }

 public RetriableTask(int numberOfRetries, long timeToWait,
                      Callable<T> task) {
  this.numberOfRetries = numberOfRetries;
  numberOfTriesLeft = numberOfRetries;
  this.timeToWait = timeToWait;
  this.task = task;
 }

 public T call() throws Exception {
  while (true) {
   try {
    return task.call();
   }
   catch (InterruptedException e) {
    throw e;
   }
   catch (CancellationException e) {
    throw e;
   }
   catch (Exception e) {
    numberOfTriesLeft--;
    if (numberOfTriesLeft == 0) {
     throw new RetryException(numberOfRetries +
     " attempts to retry failed at " + timeToWait +
     "ms interval", e);
    }
    Thread.sleep(timeToWait);
   }
  }
 }

 public static void main(String[] args) {
  Callable<Remote> task = new Callable<Remote>() {
   public Remote call() throws Exception {
    String url="rmi://localhost:2106/MyApp";
    return (Remote) Naming.lookup(url);
   }
  };

  RetriableTask<Remote> r = new RetriableTask<Remote>(task);
  try {
   r.call();
  }
  catch (Exception e) {
   e.printStackTrace();
  }
 }
}
Also see: References:

5 comments:

  1. Hi

    I read this post two times.

    I like it so much, please try to keep posting.

    Let me introduce other material that may be good for our community.

    Source: Operations interview questions

    Best regards
    Henry

    ReplyDelete
  2. This is great stuff. I would even suggest that RetriableTask could accept a RetryStrategy instance so as control how the/when the retry is handled, e.g., different tasks might need different strategies. For example a strategy might only retry on a specific set of possible exceptions and fail-fast on others.

    ReplyDelete
  3. In the last sample, it seems should not be while (true),
    It should be:
    while(numberOfTriesLeft)

    right?

    ReplyDelete
  4. Hi Fahd,
    I am currently looking at some code in one of our products and we find that part of is is derived from some code you
    posted on your blog : http://fahdshariff.blogspot.co.uk/2009/08/retrying-operations-in-java.html
    I need to know what terms your source code is licensed under, if you authored 100% of the code
    on this blog or if someone else authored it .
    Is the code licensed under a standard open source license? (which one?)
    If not , do you give us permission to perform, distribute, and creative derivative works.

    Thanks

    ReplyDelete
  5. what is fail fast? what is fail safe? what is difference between fail fast and fail safe? this is very important interview question. you can find the details on http://www.itsoftpoint.com/?page_id=2736

    ReplyDelete