[glib: 2/3] Add G_MAIN_CONTEXT_FLAGS_OWNERLESS_POLLING




commit e26a8a59813ce651c881fe223e7d1a5034f2f816
Author: Vinícius dos Santos Oliveira <vini ipsmaker gmail com>
Date:   Mon Apr 12 20:03:32 2021 -0300

    Add G_MAIN_CONTEXT_FLAGS_OWNERLESS_POLLING
    
    It fixes a race. g_source_attach() had the following check to ensure a
    loop blocked on poll() would wakeup.
    
      if (do_wakeup && context->owner && context->owner != G_THREAD_SELF)
        g_wakeup_signal (context->wakeup);
    
    However it doesn't contemplate an implementation where poll()ing is a
    non-blocking operation that will be scheduled while the thread is
    released to perform other tasks. This scenario opens up several
    different possibilities where the condition would fail to hold true. I
    experienced two of such races.
    
    The first race pertains to a mono-threaded application. Do keep in mind
    that integrating GLib to a foreign event loop will make GLib act as a
    slave in the new event loop. When you post a new work unit to execute in
    the thread managed by the foreign event loop, you don't use
    g_main_context_invoke(). In fact the only reason to integrate
    GMainContext in a foreign event loop is to make the two of them
    communicate. So from time to time, the foreign event loop will execute
    callbacks that manipulate the GMainContext loop. An illustration
    follows.
    
      // in this callback we translate an event from the foreign event loop
      // to an event in the GMainContext event loop (that runs in the same
      // thread)
      static void my_event_loop_callback(void* data)
      {
        GMainContext* ctx = /* ... */;
        // ...
        g_source_attach(source, ctx);
      }
    
      int main()
      {
        // ...
        my_event_loop_invoke(my_event_loop_callback, data);
        // ...
    
        // this function has all mechanisms in place to run the foreign
        // event loop and the hooks to call
        // g_main_context_{prepare,query,check,dispatch}
        my_event_loop_run();
      }
    
    In this case, you would have the following series of calls:
    
    1. g_main_context_prepare()
    2. g_main_context_query()
    3. A callback to my_event_loop is registered when any fd on the set is
       ready or the timeout is reached.
    4. The thread is released to perform other tasks.
    5. One of the tasks executed wishes to communicate with my_event_loop
       and enters my_event_loop_callback.
    6. g_source_attach() is called.
    7. g_source_attach() detects do_wakeup=TRUE, context->owner != NULL, and
       context->owner == G_THREAD_SELF so g_wakeup_signal() is skipped.
    8. None of the fds on the GLib poll() set becomes ready nor the GLib
       timeout expires. The my_event_loop callback that would call
       g_main_context_check() is never executed. Deadlock.
    
    A shallow analysis will fail to detect the race here. The explanation
    seems to showcase a scenario that will deterministically fail with a
    deadlock every time. However do keep in mind that my_event_loop_callback
    could be invoked before or after g_main_context_prepare(). There is an
    _event_ race here. Furthermore, some GLib libraries such as GDBus will
    initialize objects from extra threads (GAsyncInitable interface) and
    invoke the result on the original thread when ready (g_source_attach()
    will eventually be called). Now you have scenarios closer to standard
    race examples.
    
    The other scenario where a race would manifest happens in a
    multi-threaded application that has a concurrency design similar to the
    actor model. No actor executes in two threads simultaneously, but it's
    not guaranteed that it'll always wake-up in the same thread. It'd
    perform steps 1-4 just as in the previous example, but before thread
    control is returned to the pool, it'd call g_main_context_release(). Now
    g_source_attach() would skip g_wakeup_signal() for a different reason:
    
    7. g_source_attach() detects do_wakeup=TRUE, context->owner == NULL so
       g_wakeup_signal() is skipped.
    8. Same as before.
    
    Certainly there are other concurrency designs where this optimization
    would cause a deadlock, but all of them have origin in the same place:
    the optimization assumes the poll() implementation is a blocking
    operation and the thread will never be released to perform other tasks
    (possibly involving GLib calls) while result is not ready. They share
    not only the same problem, but also the same solution: do not make
    assumptions and just call g_wakeup_signal().
    
    This patch implements this solution by introducing the
    G_MAIN_CONTEXT_FLAGS_OWNERLESS_POLLING flag. This flag will force a call
    to g_wakeup_signal() and fix the race on foreign event loops. The reason
    to prevent changing this option after creation is to avoid other races
    that would lead to event loss. Construction is the only proper time to
    set this option.
    
    The implementation design means we do not change **any** semantics for
    current working code. If you don't set the new flag, the code won't
    enter in different branches and current behavior won't be affected. The
    patch is small and easy to follow too.

 glib/gmain.c          | 17 +++++++++++++++--
 glib/gmain.h          |  5 +++++
 glib/tests/mainloop.c | 52 +++++++++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 72 insertions(+), 2 deletions(-)
---
diff --git a/glib/gmain.c b/glib/gmain.c
index 69fca6260..f43d0afa3 100644
--- a/glib/gmain.c
+++ b/glib/gmain.c
@@ -177,6 +177,15 @@
  * g_main_context_prepare(), g_main_context_query(),
  * g_main_context_check() and g_main_context_dispatch().
  *
+ * If the event loop thread releases #GMainContext ownership until the results
+ * required by g_main_context_check() are ready you must create a context with
+ * the flag %G_MAIN_CONTEXT_FLAGS_OWNERLESS_POLLING or else you'll lose
+ * g_source_attach() notifications. This happens for instance when you integrate
+ * the GLib event loop into implementations that follow the proactor pattern
+ * (i.e. in these contexts the `poll()` implementation will reclaim the thread for
+ * other tasks until the results are ready). One example of the proactor pattern
+ * is the Boost.Asio library.
+ *
  * ## State of a Main Context # {#mainloop-states}
  *
  * The operation of these functions can best be seen in terms
@@ -1267,8 +1276,12 @@ g_source_attach_unlocked (GSource      *source,
   /* If another thread has acquired the context, wake it up since it
    * might be in poll() right now.
    */
-  if (do_wakeup && context->owner && context->owner != G_THREAD_SELF)
-    g_wakeup_signal (context->wakeup);
+  if (do_wakeup &&
+      (context->flags & G_MAIN_CONTEXT_FLAGS_OWNERLESS_POLLING ||
+       (context->owner && context->owner != G_THREAD_SELF)))
+    {
+      g_wakeup_signal (context->wakeup);
+    }
 
   g_trace_mark (G_TRACE_CURRENT_TIME, 0,
                 "GLib", "g_source_attach",
diff --git a/glib/gmain.h b/glib/gmain.h
index 8e7d77290..22103414c 100644
--- a/glib/gmain.h
+++ b/glib/gmain.h
@@ -41,6 +41,10 @@ typedef enum /*< flags >*/
 /**
  * GMainContextFlags:
  * @G_MAIN_CONTEXT_FLAGS_NONE: Default behaviour.
+ * @G_MAIN_CONTEXT_FLAGS_OWNERLESS_POLLING: Assume that polling for events will
+ * free the thread to process other jobs. That's useful if you're using
+ * `g_main_context_{prepare,query,check,dispatch}` to integrate GMainContext in
+ * other event loops.
  *
  * Flags to pass to g_main_context_new_with_flags() which affect the behaviour
  * of a #GMainContext.
@@ -51,6 +55,7 @@ GLIB_AVAILABLE_TYPE_IN_2_72
 typedef enum /*< flags >*/
 {
   G_MAIN_CONTEXT_FLAGS_NONE = 0,
+  G_MAIN_CONTEXT_FLAGS_OWNERLESS_POLLING = 1
 } GMainContextFlags;
 
 
diff --git a/glib/tests/mainloop.c b/glib/tests/mainloop.c
index ab1317644..7d4fc2d7e 100644
--- a/glib/tests/mainloop.c
+++ b/glib/tests/mainloop.c
@@ -158,6 +158,56 @@ test_mainloop_basic (void)
   g_main_loop_unref (loop);
 }
 
+static void
+test_ownerless_polling (gconstpointer test_data)
+{
+  gboolean attach_first = GPOINTER_TO_INT (test_data);
+  GMainContext *ctx = g_main_context_new_with_flags (
+    G_MAIN_CONTEXT_FLAGS_OWNERLESS_POLLING);
+
+  GPollFD fds[20];
+  gint fds_size;
+  gint max_priority;
+  GSource *source = NULL;
+
+  g_assert_true (ctx != g_main_context_default ());
+
+  g_main_context_push_thread_default (ctx);
+
+  /* Drain events */
+  for (;;)
+    {
+      gboolean ready_to_dispatch = g_main_context_prepare (ctx, &max_priority);
+      gint timeout, nready;
+      fds_size = g_main_context_query (ctx, max_priority, &timeout, fds, G_N_ELEMENTS (fds));
+      nready = g_poll (fds, fds_size, /*timeout=*/0);
+      if (!ready_to_dispatch && nready == 0)
+        {
+          if (timeout == -1)
+            break;
+          else
+            g_usleep (timeout * 1000);
+        }
+      ready_to_dispatch = g_main_context_check (ctx, max_priority, fds, fds_size);
+      if (ready_to_dispatch)
+        g_main_context_dispatch (ctx);
+    }
+
+  if (!attach_first)
+    g_main_context_pop_thread_default (ctx);
+
+  source = g_idle_source_new ();
+  g_source_attach (source, ctx);
+  g_source_unref (source);
+
+  if (attach_first)
+    g_main_context_pop_thread_default (ctx);
+
+  g_assert_cmpint (g_poll (fds, fds_size, 0), >, 0);
+
+  g_main_context_unref (ctx);
+}
+
 static gint a;
 static gint b;
 static gint c;
@@ -2145,6 +2195,8 @@ main (int argc, char *argv[])
 #endif
   g_test_add_func ("/mainloop/nfds", test_nfds);
   g_test_add_func ("/mainloop/steal-fd", test_steal_fd);
+  g_test_add_data_func ("/mainloop/ownerless-polling/attach-first", GINT_TO_POINTER (TRUE), 
test_ownerless_polling);
+  g_test_add_data_func ("/mainloop/ownerless-polling/pop-first", GINT_TO_POINTER (FALSE), 
test_ownerless_polling);
 
   return g_test_run ();
 }


[Date Prev][Date Next]   [Thread Prev][Thread Next]   [Thread Index] [Date Index] [Author Index]