nsq-py, pynsq, gnsq compared - Est's Blog

nsq-py, pynsq, gnsq compared

nsq is a lightweight message queue like Kafka/RabbitMQ/RocketMQ/RedisQueue/Celery.

For Python bindings there are three libraries out there:

  • pynsq: the official build, but requires tornado
  • gnsq: the gevent-only consumer/publisher; it might have a race condition in its state machine. Don't use.
  • nsq-py: select() based, but compatible with gevent/tornado/threading+select.

So far nsq-py looked really promising because it works with more concurrency options, but when I used it with a massive number of TCP connections to nsqd, gevent threw a "ValueError: filedescriptor out of range in select()" error.

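Upon further investigation, it turned out to be a select(2) limitation: select() cannot handle any file descriptor numbered 1024 or higher, TCP socket or otherwise. The limit comes from FD_SETSIZE in libc, not from the kernel itself.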

See CPython source of selectmodule.c:

    if (!_PyIsSelectable_fd(v)) {
        PyErr_SetString(PyExc_ValueError,
                    "filedescriptor out of range in select()");
        goto finally;
    }

And _PyIsSelectable_fd is defined as

    #define _PyIsSelectable_fd(FD) ((unsigned int)(FD) < (unsigned int)FD_SETSIZE)

And the constant FD_SETSIZE is described in the select(2) man page:

The Linux kernel imposes no fixed limit, but the glibc implementation makes fd_set a fixed-size type, with FD_SETSIZE defined as 1024, and the FD_*() macros operating according to that limit. To monitor file descriptors greater than 1023, use poll(2) or epoll(7) instead.
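You can see the range check fire without opening a thousand sockets. This is a minimal sketch, assuming CPython on Linux or another platform where FD_SETSIZE is 1024; the helper name select_rejects is made up for illustration. It hands select.select() a bare descriptor number and reports whether CPython refuses it before the syscall is even made.

```python
import select


def select_rejects(fd: int) -> bool:
    """Return True if select.select() refuses this descriptor number outright.

    Hypothetical helper for illustration; assumes FD_SETSIZE is 1024.
    """
    try:
        # Timeout 0 means "poll once and return", so this never blocks.
        select.select([fd], [], [], 0)
    except ValueError:
        # Raised by CPython's own range check (the code quoted above),
        # before select(2) is called.
        return True
    except OSError:
        # The number passed the range check but is not an open descriptor.
        return False
    return False


if __name__ == "__main__":
    for fd in (1024, 5000):
        print(fd, select_rejects(fd))
```

Note the descriptor does not have to be open: the number alone is enough to trigger the ValueError, which is why raising ulimit -n does not help.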

Lesson I learned today: don't use libraries based on select.select() if you have too many connections. Even with gevent's monkey patch, where select() won't block the other greenlets, it still will not allow a fileno greater than or equal to 1024. Sad.
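For code you control, the standard library's selectors module is one way around the limit. The sketch below is not what nsq-py does, and the helper name wait_readable is made up for illustration. It relies on selectors.DefaultSelector, which picks epoll, kqueue or poll where the platform offers them and only falls back to select() as a last resort. I have not checked how it behaves under gevent's monkey patching, so treat that part as an open question.

```python
import selectors
import socket


def wait_readable(socks, timeout=1.0):
    """Return the sockets in `socks` that become readable within `timeout` seconds.

    Hypothetical helper for illustration only.
    """
    # DefaultSelector chooses the most efficient mechanism available;
    # epoll/kqueue/poll are not bound by FD_SETSIZE.
    with selectors.DefaultSelector() as sel:
        for s in socks:
            sel.register(s, selectors.EVENT_READ)
        return [key.fileobj for key, _events in sel.select(timeout)]


if __name__ == "__main__":
    a, b = socket.socketpair()
    try:
        b.sendall(b"ping")
        ready = wait_readable([a])
        print("a is readable:", ready == [a])
    finally:
        a.close()
        b.close()
```

select.poll() would work as well on Linux; selectors just saves you from choosing the backend per platform.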
