Re: [BuildStream] Nuking the pickle jobber.



Hi Tristan,

Angelos responded on !1965 (https://gitlab.com/BuildStream/buildstream/-/merge_requests/1965#note_361444144). I'll quote his reply below for completeness.

Cheers,

Sander

I think at this point, having this feature in its present state serves mostly as an obstacle to any progress and refactoring, although it would be great if someone wanted to own this feature, fix it, and eventually complete #1021.

Unfortunately Mac, Windows, and "plugin replay" support didn't draw the huge amount of interest I'd hoped for, so perhaps this is for the best. It's nice to see code become less complicated at least.

Here are some other points in favour of keeping these changes, in case one of them is persuasive :)

First, it may not be obvious that this is also a step back for support on Mac. As of Python 3.8, spawn is the default method of starting child processes on macOS, as fork is considered unsafe there; more details here. That means that state must be pickled over to the child jobs.
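As a rough illustration of what that constraint means in practice (this is not BuildStream code; JobState and can_cross_spawn_boundary are hypothetical names), the sketch below checks whether an object survives the pickle round trip that a spawned child would need. It uses the plain pickle module rather than starting a real process.

```python
import pickle
import threading


class JobState:
    """Hypothetical stand-in for the state a child job needs."""

    def __init__(self, element_name, action):
        self.element_name = element_name
        self.action = action


def can_cross_spawn_boundary(obj):
    """Return True if obj survives a pickle round trip.

    With the spawn start method the child starts from a fresh
    interpreter, so anything handed to it has to be pickled first.
    """
    try:
        pickle.loads(pickle.dumps(obj))
        return True
    except (pickle.PicklingError, TypeError, AttributeError):
        return False


# Plain data is fine.
plain = JobState("base.bst", "build")

# A lock cannot be pickled, so this state could not be sent to a
# spawned child as-is.
with_lock = JobState("base.bst", "build")
with_lock.lock = threading.Lock()

print(can_cross_spawn_boundary(plain))
print(can_cross_spawn_boundary(with_lock))
```

With fork the second object would simply be inherited by the child, which is why this kind of problem only shows up on platforms that spawn.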

Second, I personally found "plugin replays" useful for debugging plugins. I used the pickling support of jobs to "save" a job to disk, and wrote a small utility to "replay" those saves - so I could repeatedly run and debug a plugin in-process. I know that when I last worked on the project, pdb wasn't something most folks were interested in, so ymmv. This is something that would be useful to customers of BuildStream though, not just an internal debugging tool.
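A minimal sketch of the save and replay idea, assuming a job object that pickles cleanly. RecordedJob, save_job and replay_job are hypothetical names for illustration and are not the utility described above.

```python
import os
import pickle
import tempfile


class RecordedJob:
    """Hypothetical job: just enough state to re-run one plugin action."""

    def __init__(self, element_name, command):
        self.element_name = element_name
        self.command = command

    def run(self):
        # A real job would call into the plugin here.
        return f"{self.element_name}: {self.command}"


def save_job(job, path):
    """Write the pickled job to disk so it can be replayed later."""
    with open(path, "wb") as f:
        pickle.dump(job, f)


def replay_job(path):
    """Load a saved job and run it in the current process."""
    with open(path, "rb") as f:
        job = pickle.load(f)
    return job.run()


with tempfile.TemporaryDirectory() as tmpdir:
    saved = os.path.join(tmpdir, "job.pickle")
    save_job(RecordedJob("hello.bst", "build"), saved)
    # The same save can be replayed as often as needed, e.g. under pdb.
    print(replay_job(saved))
    print(replay_job(saved))
```

Because the replay runs in-process, a breakpoint set inside run() is hit directly, without going through the scheduler.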

Third, it seems the concern for making things picklable was also shared with the effort to split UI and scheduler. Wherever you see __getstate__ and __setstate__ methods, those are for standard pickling support, rather than something specific to job pickling. Perhaps there are other things in current plans that might benefit from this. That said, I'm more of a "you ain't gonna need it" person, so if there aren't current plans then I recommend removing.
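For readers less familiar with the standard protocol, here is a generic sketch of __getstate__ and __setstate__. The Element class and its _cache attribute are made up for the example and do not mirror BuildStream's classes.

```python
import pickle


class Element:
    """Hypothetical class with one attribute that should not be pickled."""

    def __init__(self, name):
        self.name = name
        self._cache = {}  # transient data, cheap to rebuild

    def __getstate__(self):
        # Copy the instance dict and drop what should stay behind.
        state = self.__dict__.copy()
        del state["_cache"]
        return state

    def __setstate__(self, state):
        # Restore the pickled attributes and recreate the transient one.
        self.__dict__.update(state)
        self._cache = {}


original = Element("base.bst")
original._cache["key"] = "expensive result"

clone = pickle.loads(pickle.dumps(original))
print(clone.name, clone._cache)
```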

Unfortunately errors during pickling are notoriously tricky to debug, which certainly adds drag. One thing I started but didn't finish was to add a flag for a 'debugging pickler' which would print out the graph of things being pickled as it went. I could probably fish out the WIP if it's of interest. Similar to this SO answer. I didn't try dill.detect, but it describes itself as 'Methods for detecting objects leading to pickling failures.', which sounds handy. I haven't experimented with gc.get_referrers() but that also sounds interesting.
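The unfinished WIP is not reproduced here. As one possible shape for such a tool, the sketch below subclasses the standard pickle.Pickler and uses its persistent_id hook, which is consulted for every object, to record what gets visited. TracingPickler and trace_pickle are hypothetical names, and the trace is a flat visiting order rather than a full graph.

```python
import io
import pickle
import threading


class TracingPickler(pickle.Pickler):
    """Record the type of every object the pickler visits."""

    def __init__(self, file, protocol=None):
        super().__init__(file, protocol)
        self.trace = []

    def persistent_id(self, obj):
        # Called for each object before it is pickled. Returning None
        # tells pickle to carry on and pickle the object as usual.
        self.trace.append(type(obj).__name__)
        return None


def trace_pickle(obj):
    """Try to pickle obj; return (visited type names, error or None)."""
    pickler = TracingPickler(io.BytesIO())
    error = None
    try:
        pickler.dump(obj)
    except Exception as exc:  # report any failure alongside the trace
        error = exc
    return pickler.trace, error


trace, error = trace_pickle({"name": "base.bst", "lock": threading.Lock()})
print(" -> ".join(trace))
print("failed on:", trace[-1], "|", error)
```

Since the hook runs before each object is pickled, the last entry in the trace is the object that was being handled when the failure occurred.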

One thing I did do to make it fail a bit earlier was to identify some things which should never be pickled. For example, it's easy to indirectly reference e.g. the Scheduler by holding on to some other object that references it. At best the Scheduler adds a lot of extra state to be pickled; at worst it won't work, because it holds on to things that aren't picklable.
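One way to get that kind of early failure, sketched under assumptions: a mixin that refuses to be pickled, so an accidental indirect reference raises immediately with a clear message. NeverPickled and the toy Scheduler, Element and Job classes are hypothetical, and this is not necessarily how it was done in BuildStream.

```python
import pickle


class NeverPickled:
    """Mixin for objects that must not end up in a pickle."""

    def __reduce_ex__(self, protocol):
        raise pickle.PicklingError(
            f"{type(self).__name__} must never be pickled; "
            "something is holding a reference to it"
        )


class Scheduler(NeverPickled):
    """Toy scheduler standing in for a large, unpicklable object."""


class Element:
    def __init__(self, name, scheduler=None):
        self.name = name
        self.scheduler = scheduler  # the accidental reference


class Job:
    def __init__(self, element):
        self.element = element


clean_job = Job(Element("base.bst"))
leaky_job = Job(Element("base.bst", scheduler=Scheduler()))

pickle.dumps(clean_job)  # works

try:
    pickle.dumps(leaky_job)
except pickle.PicklingError as exc:
    print(exc)
```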

Once I had changed things such that it was possible to pickle the things we needed to, I found git bisect to be a very valuable tool for tracking down where undesirable references to e.g. the Scheduler were introduced.

The difference between __getstate__ and get_state_for_child_job_pickling is that:

I wish I had time to finish Windows and Mac support. In personal work I was hoping to use BuildStream + some helpers in place of HomeBrew on my Mac, and similarly on Windows. In the enterprise I think this multi-platform support would be a big differentiator and allow more folks to adopt BuildStream.

I'm happy to answer q's going forward, but I can't offer to push it forward unfortunately.

Hope this helps, good luck!


On Mon, Jun 15, 2020 at 10:38 AM Tristan Van Berkom <tristan vanberkom codethink co uk> wrote:
Hi,

The pickle jobber is a part of #1021[0], a great initiative towards
supporting BuildStream on win32 platforms natively (not WSL).

I'd love to see #1021 completed, but I think the pickle jobber is
acting as a stick thrown directly in our front wheel while we speed
ahead and try to get things done.

I've filed !1965[1] today, with a long list of complaints detailing why
I think it is imperative that we rid ourselves of this drag energy in
the short term.

Please discuss.

Best Regards,
    -Tristan


[0]: https://gitlab.com/BuildStream/buildstream/-/issues/1021
[1]: https://gitlab.com/BuildStream/buildstream/-/merge_requests/1965


_______________________________________________
buildstream-list mailing list
buildstream-list gnome org
https://mail.gnome.org/mailman/listinfo/buildstream-list

