Re: [RFC PATCH v2 2/2] platform: add support for WireGuard links



Hi Thomas,

On Tue, Apr 03, 2018 at 09:17:05PM +0200, Thomas Haller wrote:
Hi,

it looks quite good!! :)
thanks

Much appreciated! Your review comments have been very helpful.

diff --git a/libnm-core/nm-core-types-internal.h b/libnm-core/nm-
core-types-internal.h
index 4d43aaf45..1c4d74ca9 100644
--- a/libnm-core/nm-core-types-internal.h
+++ b/libnm-core/nm-core-types-internal.h

@@ -26,11 +26,42 @@
 #error Cannot use this header.
 #endif
 
+#include "nm-utils/c-list.h"
+
 typedef struct {
    guint32 from;
    guint32 to;
 } NMVlanQosMapping;
 
+typedef struct {
+   guint16 family;

Family always either 2 (AF_INET) or 10 (AF_INET6).
Can you make it a gint8, and move it together with "cidr" field?

Sure.

+   union {
+           struct in_addr ip4;
+           struct in6_addr ip6;
+   } ip;

NMIPAddr?

D'oh - of course this would already be available somewhere in NM.
Thank you!

+typedef struct {
+   guint8 public_key[NM_WG_PUBLIC_KEY_LEN];
+   guint8 preshared_key[NM_WG_SYMMETRIC_KEY_LEN];
+   union {
+           struct sockaddr addr;
+           struct sockaddr_in addr4;
+           struct sockaddr_in6 addr6;
+   } endpoint;
+   struct timespec last_handshake_time;
+   guint64 rx_bytes, tx_bytes;
+   guint16 persistent_keepalive_interval;
+
+   CList allowedips_lst_head;

CList is a great data structure (+1 for simplicity). But in this case,
can you not just allocate an array?

        guint allowed_ips_len;
        const NMWireguardAllowedIP *allowed_ips;

I would see a point in making NMWireguardAllowedIP individually
allocated structures, if they'd be reference-counted. But embedding a
CList in the structure, kinda prevents meaningfully sharing it, because
the entire list would be shared too (at which point, there is no need
to make individual elements shared/ref-counted/malloc-ed).

The reason why I had a CList was just so we could quickly perform
allowedip coalescing during genl parsing. (Sometimes all allowedips for
a peer won't fit in one genl message, so they will be sent in batches).

But in any case yeah, peers/allowedip are always received/sent all in
one go. There's no reason why this can't be a plain array-of-structs
after the parsing is done, and I understand how that is preferable.

I'll make that change.

--- a/src/platform/nm-linux-platform.c
+++ b/src/platform/nm-linux-platform.c
@@ -173,6 +173,40 @@ G_STATIC_ASSERT (RTA_MAX == (__RTA_MAX - 1));
 
 /*******************************************************************
**********/
 
+#define WG_CMD_GET_DEVICE 0
+#define WG_CMD_SET_DEVICE 1
+
+#define IFLA_WG_DEVICE_UNSPEC           0
+#define IFLA_WG_DEVICE_IFINDEX          1
+#define IFLA_WG_DEVICE_IFNAME           2
+#define IFLA_WG_DEVICE_PRIVATE_KEY      3
+#define IFLA_WG_DEVICE_PUBLIC_KEY       4
+#define IFLA_WG_DEVICE_FLAGS            5
+#define IFLA_WG_DEVICE_LISTEN_PORT      6
+#define IFLA_WG_DEVICE_FWMARK           7
+#define IFLA_WG_DEVICE_PEERS            8
+#define __IFLA_WG_DEVICE_MAX            9
+
+#define IFLA_WG_PEER_UNSPEC                        0
+#define IFLA_WG_PEER_PUBLIC_KEY                    1
+#define IFLA_WG_PEER_PRESHARED_KEY                 2
+#define IFLA_WG_PEER_FLAGS                         3
+#define IFLA_WG_PEER_ENDPOINT                      4
+#define IFLA_WG_PEER_PERSISTENT_KEEPALIVE_INTERVAL 5
+#define IFLA_WG_PEER_LAST_HANDSHAKE_TIME           6
+#define IFLA_WG_PEER_RX_BYTES                      7
+#define IFLA_WG_PEER_TX_BYTES                      8
+#define IFLA_WG_PEER_ALLOWEDIPS                    9

WireGuard's uapi headers seem to call these fields differently.
E.g. WGPEER_A_LAST_HANDSHAKE_TIME instead of
IFLA_WG_PEER_LAST_HANDSHAKE_TIME.

These compat defines are here, so we can compile against older kernel
headers. The name should be identical to the kernel API, except,
maybe a NM_ prefix, to distinuish them from the real name. The NM-
prefix is used if the define is put to a header, otherwise, it is not
used.

Oh I see now - yeah I've made a mess of these then.
I thought IFLA_* was some naming convention within NM, didn't realize
it's actually kernel uapi.

I'll sort it out.

+static void
+_wireguard_peer_free (gpointer data)
+{
+   NMWireguardPeer *peer = data;
+   NMWireguardAllowedIP *allowedip;
+
+   c_list_for_each_entry (allowedip, &peer-
allowedips_lst_head, allowedips_lst)
+           g_free (allowedip);

this would be wrong, because you'd need c_list_for_each_entry_safe().
Anyway, let's not make this a C-List.

True.

+static NMPObject *
+_parse_lnk_wireguard (const char *kind)
+{
+   nm_auto_nmpobj NMPObject *obj = NULL;
+   NMPObject *obj_result;
+   NMPObjectLnkWireguard *wgobj;
+
+   if (!nm_streq0 (kind, "wireguard"))
+           return NULL;
+
+   obj = nmp_object_new (NMP_OBJECT_TYPE_LNK_WIREGUARD, NULL);
+   wgobj = &obj->_lnk_wireguard;
+
+   wgobj->peers = g_ptr_array_sized_new (1);
+   g_ptr_array_set_free_func (wgobj->peers,
_wireguard_peer_free);

this doesn't seem useful. _parse_lnk_wireguard() doesn't really parse
anything. Just drop.

Yeah it's a leftover from refactoring.
Gone.

@@ -1875,6 +1944,9 @@ _new_from_nl_link (NMPlatform *platform, const
NMPCache *cache, struct nlmsghdr
    case NM_LINK_TYPE_VXLAN:
            lnk_data = _parse_lnk_vxlan (nl_info_kind,
nl_info_data);
            break;
+   case NM_LINK_TYPE_WIREGUARD:
+           lnk_data = _parse_lnk_wireguard (nl_info_kind);

here, no need to create an empty lnk_data instance.

As above.

@@ -1914,6 +1986,15 @@ _new_from_nl_link (NMPlatform *platform, const
NMPCache *cache, struct nlmsghdr
                            obj->link.tx_packets = link_cached-
link.tx_packets;
                            obj->link.tx_bytes = link_cached-
link.tx_bytes;
                    }
+           } else {

I think, it's not really if-else.

By default, you'd assume that the cache already contains a suitable
NMPObjectLnkWireguard instance. Just look into the cache (the lines
with the "if", and if it's there that's it.

Only if no NMPObjectLnkWireguard can be found in the cache, fetch it
the first time. As said, this leaves out some notification mechanism to
 refetch the data as necessary. For now, just first do the if-part
(always), and then always refetch via genl. Then, compare the two, and
if they differ, keep the newly received one.

I have to admit I got confused about the cache code on this function.
But yeah on second inspection this code is not laid out correctly. For
example, if cache == NULL then we'd never query genl.

+                   /* TODO: we have no mechanism to detect
changes to cached WireGuard
+                    * interfaces yet, so any changes will not
be picked up. */
+                   switch (obj->link.type) {
+                   case NM_LINK_TYPE_WIREGUARD:
+                           nm_platform_wireguard_get_link_prope
rties (platform, &obj->link, lnk_data);

this can be just a static function inside nm-linux-platform.c. Do you
expect this to be useful from somewhere else?

Not really. I'll just make it static.

+   if (tbp[IFLA_WG_PEER_PUBLIC_KEY])
+           memcpy (&peer->public_key, nla_data
(tbp[IFLA_WG_PEER_PUBLIC_KEY]), sizeof (peer->public_key));
+   if (tbp[IFLA_WG_PEER_PRESHARED_KEY])
+           memcpy (&peer->preshared_key, nla_data
(tbp[IFLA_WG_PEER_PRESHARED_KEY]), sizeof (peer->preshared_key));
+   if (tbp[IFLA_WG_PEER_ENDPOINT])
+           memcpy (&peer->endpoint, nla_data
(tbp[IFLA_WG_PEER_ENDPOINT]), sizeof (peer->endpoint));

this seems wrong. The size of this attribute depends on addr_family.

Good catch. Will fix.

+static gboolean
+wireguard_get_link_properties (NMPlatform *platform, const
NMPlatformLink *link, NMPObject *obj)
+{
+   NMLinuxPlatformPrivate *priv = NM_LINUX_PLATFORM_GET_PRIVATE
(platform);
+   nm_auto_nlmsg struct nl_msg *msg = NULL;
+   struct nl_cb cb = {
+           .valid_cb = _wireguard_read_device_cb,
+           .valid_arg = obj,
+   };
+
+   g_return_val_if_fail (priv->wireguard_family_id, FALSE);

g_return_*() is an assertion. I guess there can be run-time situations,
where this doesn't hold.

I'll address this (together with next comment).


@@ -7066,6 +7356,16 @@ constructed (GObject *_object)
                                    (EVENT_CONDITIONS |
ERROR_CONDITIONS | DISCONNECT_CONDITIONS),
                                     event_handler, platform);
 
+   priv->genlh = nl_socket_alloc ();
+   g_assert (priv->genlh);
+
+   nle = nl_connect (priv->genlh, NETLINK_GENERIC);
+   g_assert (!nle);
+   _LOGD ("Generic netlink socket established: port=%u, fd=%d",
nl_socket_get_local_port (priv->genlh), nl_socket_get_fd (priv-
genlh));
+
+   priv->wireguard_family_id = genl_ctrl_resolve (priv->genlh,
"wireguard");
+   _LOG2D ("kernel-support: wireguard: %s", priv-
wireguard_family_id ? "detected" : "not detected");

I wonder, what happens if the wireguard module is not loaded initially.
I guess, caching the family-id is right, but you need to retry if it's
not avilable. For example, in wireguard_get_link_properties()

Makes sense.

Should we attempt a refetch once per RTM_*LINK message then? Or is there
some reliable way to get notified when a new genl family becomes
available?

+static void
+_vt_cmd_obj_hash_update_lnk_wireguard (const NMPObject *obj,
NMHashState *h)
+{
+   nm_assert (NMP_OBJECT_GET_TYPE (obj) ==
NMP_OBJECT_TYPE_LNK_WIREGUARD);
+
+   nm_platform_lnk_wireguard_hash_update (&obj->lnk_wireguard,
h);
+   nm_hash_update_val (h, obj->_lnk_wireguard.peers);

You must hash the entire content, not just the peers-pointer.

This will become irrelevant when we have a simple array-of-structs, but:
if copy_lnk_wireguard was implemented by simply grabbing a ref to the
same GPtrArray, would it still be wrong to take a shortcut here by
hashing/testing just the pointer?

+static int
+_vt_cmd_obj_cmp_lnk_wireguard (const NMPObject *obj1, const
NMPObject *obj2)
+{
+   int c;
+
+   c = nm_platform_lnk_wireguard_cmp (&obj1->lnk_wireguard,
&obj2->lnk_wireguard);
+   if (c)
+           return c;
+
+   return obj1->_lnk_wireguard.peers < obj2-
_lnk_wireguard.peers;

As with hash-update, you must compare the content of peers.

Sidenote: even if you really wanted to compare the pointers, then "<"
would only return 0 and 1. But cmp must return -1, 0, and 1.

...and that's simply a bug.

diff --git a/src/platform/nmp-object.h b/src/platform/nmp-object.h
index 0dd2687a3..e64ac6614 100644
--- a/src/platform/nmp-object.h
+++ b/src/platform/nmp-object.h
@@ -213,6 +213,12 @@ typedef struct {
    NMPlatformLnkVxlan _public;
 } NMPObjectLnkVxlan;
 
+typedef struct {
+   NMPlatformLnkWireguard _public;
+
+   GPtrArray *peers;

It makes some sense that @peers is GPtrArray, because that type is ref-
counted, and possibly multiple NMPlatformLnkWireguard instances could
share it.

However, see that NMPObjectLnkVlan doesn't try to share the
ingress_qos_map. It doesn't because, what is instead shared is the
lnk_vlan instance itself.

See how
  - _parse_lnk_vlan() always creates a new NMPObjectLnkVlan instance.
    Since it parses the netlink attribute at once, it just allocates
    a suitable obj->_lnk_vlan.ingress_qos_map.
  - then, it will reach line
    
https://cgit.freedesktop.org/NetworkManager/NetworkManager/tree/src/platform/nm-linux-platform.c?id=eb8257dea5802a004af9cccacb30af98440e2172#n1894
    if the newly created object is identical to the already cached one,
    the newly created object is thrown away and the cached one is kept.
What is shared and ref-countable is the NMPObjectLnkVlan instance itself. Not
the data pointers to ingress_qos_map.

the same is for NMPObjectLnkWireguard.peers. Just make it a plain array, possibly
also add a "peers_len" field.

Ah - it's starting to click now.
I'll simplify this before the next iteration.

Thanks for your time and patience! :)


[Date Prev][Date Next]   [Thread Prev][Thread Next]   [Thread Index] [Date Index] [Author Index]