What is Redis Pipeline

Redis clients and servers communicate using RESP (REdis Serialization Protocol), which runs over TCP. Communication follows a request/response model: the client sends a request, and the server processes the command while the client blocks waiting for the response. Now consider a case where we want to run hundreds of SET or GET commands. If we go the regular route, each command pays its own round-trip time (RTT), and that cost is repeated for every command, which is not optimal. In cases like this, we can use a Redis pipeline.
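To put rough numbers on the round-trip cost, here is a back-of-the-envelope sketch. The function name and the figures (a 1 ms RTT and 10 µs of server time per command) are assumptions chosen for illustration, not measurements, and the model ignores bandwidth and client-side overhead.

```python
def total_latency_us(num_commands: int, rtt_us: int, server_us: int,
                     pipelined: bool) -> int:
    """Estimate the total time to run num_commands, in microseconds.

    Hypothetical model: every network round trip costs rtt_us and every
    command costs server_us on the server. A pipelined batch is assumed
    to need a single round trip.
    """
    round_trips = 1 if pipelined else num_commands
    return round_trips * rtt_us + num_commands * server_us


# Assumed figures: 100 commands, 1 ms RTT, 10 microseconds per command.
one_by_one = total_latency_us(100, 1000, 10, pipelined=False)
batched = total_latency_us(100, 1000, 10, pipelined=True)
print(one_by_one, batched)
```

Under these assumptions the network dominates: almost all of the one-by-one time is spent waiting on round trips rather than on Redis doing work.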

Pipelines provide a way to transmit multiple commands to the Redis server in one transmission, that is, in one network call. This is convenient for batch processing, such as saving all the values in a list to Redis or getting multiple values one after another. Pipelining is basically a network optimization, where all commands are grouped on the client side and sent at once. Let's see how the Redis pipeline really works internally. Before moving forward, let's create some entries that we will retrieve later using the Redis pipeline.

➜  ~ redis-cli
127.0.0.1:6379> set 'foo' apple
OK
127.0.0.1:6379> set 'bar' pie
OK
127.0.0.1:6379> 
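Setting values in Redis

import redis

conn = redis.Redis()
# transaction=False: send the queued commands without a MULTI/EXEC wrapper
pipeline = conn.pipeline(transaction=False)
# These calls only queue the commands on the client; nothing is sent yet
pipeline.get('foo')
pipeline.get('bar')
# execute() sends every queued command at once and returns the replies in order
pipeline.execute()

Initiating a Redis pipeline and getting values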

The Python code above generates the following payload, which is sent to the Redis server over a socket connection; the replies are then read back and returned.

'*2\r\n$3\r\nGET\r\n$3\r\nfoo\r\n*2\r\n$3\r\nGET\r\n$3\r\nbar\r\n'
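To see where that string comes from, here is a minimal sketch of RESP command encoding. `encode_command` is a hypothetical helper written for this post, not part of redis-py; it only illustrates the wire format: an array header (`*<count>`) followed by one bulk string (`$<length>`) per argument.

```python
def encode_command(*args: str) -> bytes:
    """Encode one command as a RESP array of bulk strings."""
    out = [b"*%d\r\n" % len(args)]          # array header: number of arguments
    for arg in args:
        data = arg.encode("utf-8")
        # bulk string: byte length, CRLF, payload, CRLF
        out.append(b"$%d\r\n%s\r\n" % (len(data), data))
    return b"".join(out)


# A pipeline is simply the encoded commands concatenated back to back.
payload = encode_command("GET", "foo") + encode_command("GET", "bar")
print(payload)
```

Note that nothing in the payload marks it as a pipeline; the server just reads one command after another from the same connection.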

Since the Redis server understands this payload as is, let's use netcat to send the same bytes that were generated in the step above and see what the server sends back.

➜  ~ echo -e '*2\r\n$3\r\nGET\r\n$3\r\nfoo\r\n*2\r\n$3\r\nGET\r\n$3\r\nbar\r\n' | nc localhost 6379
$5
apple
$3
pie
➜  ~ 
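The replies come back as one stream, in the same order as the commands. Here is a minimal sketch of how a client could split that stream into individual replies. `parse_reply` and `parse_all` are hypothetical helpers, not redis-py functions, and they assume the whole response is already in memory; error replies and partial reads are not handled.

```python
def parse_reply(buf: bytes, pos: int = 0):
    """Parse one RESP reply starting at pos; return (value, next_pos)."""
    end = buf.index(b"\r\n", pos)
    kind, header = buf[pos:pos + 1], buf[pos + 1:end]
    pos = end + 2
    if kind == b"+":                      # simple string, e.g. +OK
        return header.decode(), pos
    if kind == b":":                      # integer
        return int(header), pos
    if kind == b"$":                      # bulk string, $-1 means null
        length = int(header)
        if length == -1:
            return None, pos
        return buf[pos:pos + length], pos + length + 2
    if kind == b"*":                      # array of nested replies
        count = int(header)
        if count == -1:
            return None, pos
        items = []
        for _ in range(count):
            item, pos = parse_reply(buf, pos)
            items.append(item)
        return items, pos
    raise ValueError(f"unsupported reply type: {kind!r}")


def parse_all(buf: bytes) -> list:
    """Parse every reply in buf, one per pipelined command."""
    replies, pos = [], 0
    while pos < len(buf):
        reply, pos = parse_reply(buf, pos)
        replies.append(reply)
    return replies


print(parse_all(b"$5\r\napple\r\n$3\r\npie\r\n"))
```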

And voilà! We see the response come back. The same happens in client libraries: all commands are grouped together and sent at once.

Transactions vs Pipeline in Redis.

We have seen that a pipeline is a way to send all commands together from the client side. The same applies to Redis transactions as well, so how are they different?

The difference is that pipelines are not atomic, whereas transactions are. The commands of a transaction run as one uninterrupted block, so two transactions never run at the same time, while the commands of multiple pipelines can be executed by the Redis server in an interleaved fashion.

Redis Transaction vs Pipeline

Going back to the code: in the example above, note that when we created the pipeline we passed transaction=False, which tells the Redis client not to wrap the commands in MULTI/EXEC. Let's see what happens when we pass transaction=True. Since the transaction parameter defaults to True, omitting it is the same as passing True.

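import redis

conn = redis.Redis()
# transaction=True (the default): the client wraps the queued commands in MULTI/EXEC
pipeline = conn.pipeline(transaction=True)
pipeline.get('foo')
pipeline.get('bar')
# Everything, including MULTI and EXEC, is still sent together
pipeline.execute()

The equivalent payload, sent with netcat:

➜  ~ echo -e '*1\r\n$5\r\nMULTI\r\n*2\r\n$3\r\nGET\r\n$3\r\nfoo\r\n*2\r\n$3\r\nGET\r\n$3\r\nbar\r\n*1\r\n$4\r\nEXEC\r\n' | nc localhost 6379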
+OK
+QUEUED
+QUEUED
*2
$5
apple
$3
pie
➜  ~

We can see that even in a transaction all the commands still went out together, the same as in a pipeline. However, a transaction is blocking in nature: only one transaction can run at a time, and that is the cost of using a transaction rather than a plain pipeline.
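The transactional payload used with netcat above can be rebuilt with a small sketch. Both helpers are hypothetical and written only for illustration (they are not redis-py internals): the commands are encoded exactly as in the plain pipeline, with MULTI in front and EXEC at the end.

```python
def encode_command(*args: str) -> bytes:
    """Encode one command as a RESP array of bulk strings."""
    out = [b"*%d\r\n" % len(args)]
    for arg in args:
        data = arg.encode("utf-8")
        out.append(b"$%d\r\n%s\r\n" % (len(data), data))
    return b"".join(out)


def encode_transaction(commands) -> bytes:
    """Wrap the given commands in MULTI ... EXEC and encode them as one payload."""
    wrapped = [("MULTI",), *commands, ("EXEC",)]
    return b"".join(encode_command(*cmd) for cmd in wrapped)


payload = encode_transaction([("GET", "foo"), ("GET", "bar")])
print(payload)
```

The only difference on the wire is the two extra commands, which is why the server answers with +OK and one +QUEUED per command before the array of real results.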

When pipelines can't be used!

  • Writes followed by reads whose results are needed before the next command can be issued are not possible within a pipeline, as the results for all the commands come back together at the end.
  • In Redis cluster mode, if the keys do not all map to the same slot, Redis throws an error like CROSSSLOT Keys in request don't hash to the same slot. This error can, however, be handled with hash tags. With hash tags we can force keys to be mapped to the same slot, and therefore to the same shard.
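To illustrate how hash tags work, here is a minimal sketch of the slot calculation, assuming the scheme described in the Redis Cluster specification: CRC16 of the key modulo 16384, where only the text between the first `{` and the following `}` is hashed if that text is non-empty. `hash_tag` and `key_slot` are hypothetical helper names, and Python's `binascii.crc_hqx` is used here as a stand-in for the CRC16 variant Redis uses.

```python
import binascii


def hash_tag(key: bytes) -> bytes:
    """Return the part of the key that is hashed."""
    start = key.find(b"{")
    if start != -1:
        end = key.find(b"}", start + 1)
        # Only a non-empty tag counts; "{}" means the whole key is hashed.
        if end != -1 and end != start + 1:
            return key[start + 1:end]
    return key


def key_slot(key: str) -> int:
    """Compute the cluster slot (0..16383) for a key."""
    return binascii.crc_hqx(hash_tag(key.encode("utf-8")), 0) % 16384


# Keys sharing the tag {user1} land in the same slot, so they can be
# used together in cluster mode.
print(key_slot("{user1}:foo"), key_slot("{user1}:bar"))
print(key_slot("foo"), key_slot("bar"))
```

On a live cluster, the CLUSTER KEYSLOT command reports the slot the server itself computes, which is the authoritative check.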

Conclusion:

  • Pipelines are implemented by clients and can differ between libraries. In Redis's Python library, we have seen that if we don't explicitly specify transaction=False, commands are wrapped in MULTI/EXEC, which turns the pipeline into a transaction; this is not always desirable. Do check your client's implementation.
  • Also, when using pipelines with Redis in cluster mode, make sure every key is slotted on the same shard. Use hash tags to force this.
  • Pipelines help not only by eliminating multiple round trips but also by reducing the Redis server's system calls to the kernel, as these system calls are batched too.
