tornado: disable epoll and add a test that shows why.

Here's a simple test program that shows the problem with epoll.

import socket, select, os
s1, s2 = socket.socketpair()
fd = os.dup(s1.fileno())
e = select.epoll()
e.register(s1, select.EPOLLIN)
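
The fragment above stops right after registering s1.  As a sketch of one
way it might continue (not the original test; the function name
stale_epoll_events is made up for illustration, and this assumes Linux with
Python 3), the following closes the registered fd while a duplicate is still
open and then asks epoll what it sees:

```python
import os
import select
import socket


def stale_epoll_events():
    """Return (old_fd, events) after closing a registered fd without unregistering."""
    s1, s2 = socket.socketpair()
    old_fd = s1.fileno()
    dup_fd = os.dup(old_fd)   # stands in for the copy a subprocess would inherit
    ep = select.epoll()
    ep.register(old_fd, select.EPOLLIN)

    s1.close()                # close the fd *without* telling epoll first
    s2.send(b"x")             # the peer makes the underlying socket readable

    # The open file is still alive through dup_fd, so epoll keeps its
    # registration and reports the event under the old, now-closed fd number.
    events = ep.poll(1.0)

    ep.close()
    os.close(dup_fd)
    s2.close()
    return old_fd, events


if __name__ == "__main__":
    fd, events = stale_epoll_events()
    print("closed fd:", fd, "epoll events:", events)
```

The event comes back for an fd number this process has already closed, so
there is nothing left to read from or unregister.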

libcurl apparently still has a bug where it will sometimes close and replace
its underlying file descriptor *before* telling epoll (or not even telling
epoll at all).  select() would have no problem with this, since the fd
number is the same before and after, but epoll remembers the identity of
the fd, which persists even if the file no longer exists in this process
(like if it was inherited by a subprocess).  Unfortunately, epoll keeps
reporting events for the original fd, not the new one, which is both
useless and confusing.  And to make things worse, it doesn't poll on the
new file descriptor that inherited the old number!

This causes these possible symptoms:

1. if a subprocess inherits our libcurl fd, we'll keep getting epoll events
   on it under some conditions, even after we think we've closed it.  When
   the remote http server dies, we'll just get endless EOF events on a fd
   we can't possibly read anymore since it doesn't even exist in this
   process.

2. if the remote server *doesn't* close our
   connection right away, catawampus might just sit forever (or until
   a local timeout if one exists) waiting for the server to respond on the
   new socket.  The server might in fact respond, but we'd never get an
   event about it.  This could explain strange catawampus delays.  This could
   happen regardless of whether a subprocess has a copy of the original fd
   or not; the main thing is epoll is not listening on the new fd.
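
The second symptom can be reproduced without libcurl.  This is only a
sketch under assumptions (Linux, Python 3; compare_after_fd_replacement is a
hypothetical name, and os.dup2() stands in for whatever libcurl does when it
replaces its socket): a new socket takes over the old fd number, data
arrives on it, and select() notices while epoll does not.

```python
import os
import select
import socket


def compare_after_fd_replacement():
    """Return (seen_by_select, epoll_events) after an fd number is reused."""
    old, old_peer = socket.socketpair()
    new, new_peer = socket.socketpair()

    fd = old.detach()               # take ownership of the raw fd number
    ep = select.epoll()
    ep.register(fd, select.EPOLLIN)

    # Replace the file behind the fd number without telling epoll.
    # dup2() silently closes the old socket and points fd at the new one.
    os.dup2(new.fileno(), fd)
    new_peer.send(b"x")             # the "server" responds on the new socket

    # select() looks up the fd number at call time, so it sees the new socket.
    readable, _, _ = select.select([fd], [], [], 1.0)
    # epoll registered the old file, so it has nothing to report.
    epoll_events = ep.poll(0.1)

    ep.close()
    os.close(fd)
    for s in (new, new_peer, old_peer):
        s.close()
    return fd in readable, epoll_events


if __name__ == "__main__":
    print(compare_after_fd_replacement())
```

Because select() is handed the fd numbers on every call, it has no stale
state to go wrong, which is the property this change relies on.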

The "right" fix would be to overhaul libcurl to always unregister its fds
before closing or replacing them, but that's too hard (because libcurl's fd
logic is very confusing).  Instead, let's just bludgeon it by switching to
good old-fashioned select(), which is more resilient to such things.

select() is less scalable than epoll() if you have zillions of fds (which is
what tornado was originally built to handle), but our use case doesn't need
that.


Change-Id: I1d57a7d8defa48e0b003e10b59afa474d4c4e2bd
diff --git a/tornado/ b/tornado/
index a332100..6507c06 100644
--- a/tornado/
+++ b/tornado/
@@ -327,8 +327,8 @@
                     handler = self._handlers[fd]
                 except KeyError:
-                    logging.error("Handler for fd %s no longer exists, closing",
-                                  fd, exc_info=True)
+                    logging.error("Handler for fd %s (ev=0x%x) no longer exists, closing",
+                                  fd, events, exc_info=True)
@@ -341,8 +341,8 @@
                         logging.error("Exception in I/O handler for fd %s",
                                       fd, exc_info=True)
                 except Exception:
-                    logging.error("Exception in I/O handler for fd %s",
-                                  fd, exc_info=True)
+                    logging.error("Exception in I/O handler for fd %s (ev=0x%x)",
+                                  fd, events, exc_info=True)
         # reset the stopped flag so another start/stop pair can be issued
         self._stopped = False
         if self._blocking_signal_threshold is not None:
@@ -672,7 +672,7 @@
 # Choose a poll implementation. Use epoll if it is available, fall back to
 # select() for non-Linux platforms
-if hasattr(select, "epoll"):
+if False and hasattr(select, "epoll"):
     # Python 2.6+ on Linux
     _poll = select.epoll
 elif hasattr(select, "kqueue"):