[gjs: 1/6] initial commit of heapgraph scripts
- From: Philip Chimento <pchimento src gnome org>
- To: commits-list gnome org
- Cc:
- Subject: [gjs: 1/6] initial commit of heapgraph scripts
- Date: Mon, 9 Apr 2018 21:16:56 +0000 (UTC)
commit 32b307463a6fb2c62f985417ea8442c86800b0e6
Author: Andy Holmes <andrew g r holmes gmail com>
Date: Sat Apr 7 21:32:04 2018 -0700
initial commit of heapgraph scripts
tools/README.md | 174 ++++++++++++++
tools/heapdot.py | 269 ++++++++++++++++++++++
tools/heapgraph.py | 659 +++++++++++++++++++++++++++++++++++++++++++++++++++++
3 files changed, 1102 insertions(+)
---
diff --git a/tools/README.md b/tools/README.md
new file mode 100644
index 00000000..16aac275
--- /dev/null
+++ b/tools/README.md
@@ -0,0 +1,174 @@
+# gjs-heapgraph
+
+A heap analyzer for Gjs based on https://github.com/amccreight/heapgraph to aid
+in debugging and plugging memory leaks.
+
+## Resource Usage
+
+Be aware that parsing a heap can take a fair amount of RAM depending on the
+heap size and time depending on the amount of target objects and path length.
+
+Examples of approximate memory and time required to build DOT graphs on an
+IvyBridge i7:
+
+| Heap Size | RAM | Targets | Time |
+|-----------|-------|---------|-------------|
+| 5MB | 80MB | 1500 | 1.5 Minutes |
+| 30MB | 425MB | 7700 | 40 Minutes |
+
+## Basic Usage
+
+### Getting a Heap Dump
+
+Currently the only way to dump a heap is from within a script, although in the
+near future it will be possible to send `SIGUSR1` to a GJS process with the env
+variable `GJS_DEBUG_HEAP_OUTPUT` set:
+
+```
+$ GJS_DEBUG_HEAP_OUTPUT=myApp.heap gjs myApp.js
+$ kill -s SIGUSR1 <gjs-pid>
+```
+
+An example of dumping a heap from within a script:
+
+```js
+const System = imports.system;
+
+// Dumping the heap before the "leak" has happened
+System.dumpHeap('/home/user/myApp1.heap.');
+
+// Code presumably resulting in a leak...
+
+// Running the garbage collector before dumping can avoid some false positives
+System.gc();
+
+// Dumping the heap after the "leak" has happened
+System.dumpHeap('/home/user/myApp2.heap.');
+```
+
+### Output
+
+The default output of `./heapgraph.py` is a tiered tree of paths from root to
+rooted objects. If the output is being sent to a terminal (TTY) some minimal
+ANSI styling is used to make the output more readable. Additionally, anything
+that isn't part of the graph will be sent to `stderr` so the output can be
+directed to a file as plain text. Below is a snippet:
+
+```
+$ ./heapgraph.py file.heap Object > file.tree
+Parsing file.heap...done
+Found 343 targets with type "Object"
+
+$ cat file.tree
+├─[vm_stack[1]]─➤ [Object jsobj@0x7fce60683440]
+│
+├─[vm_stack[1]]─➤ [Object jsobj@0x7fce606833c0]
+│
+├─[exact-Object]─➤ [Object jsobj@0x7fce60683380]
+│
+├─[exact-Object]─➤ [GjsGlobal jsobj@0x7fce60680060]
+│ ├─[Debugger]─➤ [Function Debugger jsobj@0x7fce606a4540]
+│ │ ╰─[Object]─➤ [Function Object jsobj@0x7fce606a9cc0]
+│ │ ╰─[prototype]─➤ [Object (nil) jsobj@0x7fce60681160]
+│ │
+...and so on
+```
+
+`heapgraph.py` can also output DOT graphs that can be a useful way to visualize
+the heap graph, especially if you don't know exactly what you're looking for.
+Passing the `--dot-graph` option will output a DOT graph to `<input-file>.dot`
+in the current working directory.
+
+There are a few choices for viewing dot graphs, and many utilities for
+converting them to other formats like PDF, Tex or GraphML. For Gnome desktops
+[`xdot`](https://github.com/jrfonseca/xdot.py) is a nice lightweight
+Python/Cairo viewer available on PyPi and in most distributions.
+
+```sh
+$ ./heapgraph.py --dot-graph /home/user/myApp2.heap Object
+Parsing file.heap...done
+Found 343 targets with type "Object"
+
+$ xdot myApp2.heap.dot
+```
+
+### Excluding Nodes from the Graph
+
+The exclusion switch you are most likely to use is `--diff-heap` which will
+exclude all nodes in the graph common to that heap, allowing you to easily
+see what's not being collected between two states.
+
+```sh
+$ ./heapgraph --diff-heap myApp1.heap myApp2.heap GObject
+```
+
+You can also exclude Gray Roots, WeakMaps, nodes with a heap address or nodes
+with labels containing a string. Because GObject addresses are part of the node
+label, these can be excluded with `--hide-node` as well.
+
+By default the global object (GjsGlobal aka `window`), imports (GjsModule,
+GjsFileImporter), and namespaces (GIRepositoryNamespace) aren't shown in the
+graph since these are less useful and can't be garbage collected anyways.
+
+```sh
+$ ./heapgraph.py --hide-addr 0x7f6ef022c060 \
+ --hide-node 'self-hosting-global' \
+ --no-gray-roots \
+ /home/user/myApp2.heap Object
+$ ./heapgraph.py --hide-node 0x55e93cf5deb0 /home/user/myApp2.heap Object
+```
+
+### Command-Line Arguments
+
+> **NOTE:** Command line arguments are subject to change; Check
+> `./heapgraph.py --help` before running.
+
+```
+usage: heapgraph.py [-h] [--edge | --function | --string] [--count]
+ [--dot-graph] [--no-addr] [--diff-heap FILE]
+ [--no-gray-roots] [--no-weak-maps] [--show-global]
+ [--show-imports] [--hide-addr ADDR] [--hide-node LABEL]
+ FILE TARGET
+
+Find what is rooting or preventing an object from being collected in a GJS
+heap using a shortest-path breadth-first algorithm.
+
+positional arguments:
+ FILE Garbage collector heap from System.dumpHeap()
+ TARGET Heap address (eg. 0x7fa814054d00) or type prefix (eg.
+ Array, Object, GObject, Function...)
+
+optional arguments:
+ -h, --help show this help message and exit
+ --edge, -e Treat TARGET as a function name
+ --function, -f Treat TARGET as a function name
+ --string, -s Treat TARGET as a string literal or String()
+
+Output Options:
+ --count, -c Only count the matches for TARGET
+ --dot-graph, -d Output a DOT graph to FILE.dot
+ --no-addr, -na Don't show addresses
+
+Node/Root Filtering:
+ --diff-heap FILE, -dh FILE
+ Don't show roots common to the heap FILE
+ --no-gray-roots, -ng Don't show gray roots (marked to be collected)
+ --no-weak-maps, -nwm Don't show WeakMaps
+ --show-global, -g Show the global object (eg. window/GjsGlobal)
+ --show-imports, -i Show import and module nodes (eg. imports.foo)
+ --hide-addr ADDR, -ha ADDR
+ Don't show roots with the heap address ADDR
+ --hide-node LABEL, -hn LABEL
+ Don't show nodes with labels containing LABEL
+```
+
+## See Also
+
+Below are some links to information relevant to SpiderMonkey garbage collection
+and heap parsing:
+
+* [GC.cpp Comments](https://searchfox.org/mozilla-central/source/js/src/gc/GC.cpp)
+* [How JavaScript Objects Are
Implemented](https://www.infoq.com/presentations/javascript-objects-spidermonkey)
+* [Tracing garbage collection](https://en.wikipedia.org/wiki/Tracing_garbage_collection#Tri-color_marking)
on Wikipedia
+* [SpiderMonkey Memory](https://gitlab.gnome.org/GNOME/gjs/blob/master/doc/SpiderMonkey_Memory.md) via GJS
Repo
+
diff --git a/tools/heapdot.py b/tools/heapdot.py
new file mode 100644
index 00000000..e9333ce7
--- /dev/null
+++ b/tools/heapdot.py
@@ -0,0 +1,269 @@
+#!/usr/bin/env python3
+
+# This Source Code Form is subject to the terms of the Mozilla Public
+# License, v. 2.0. If a copy of the MPL was not distributed with this
+# file, You can obtain one at http://mozilla.org/MPL/2.0/.
+#
+# heapdot.py - DOT Graph output
+
+
+import re
+
+func_regex = re.compile('Function(?: ([^/]+)(?:/([<|\w]+))?)?')
+gobj_regex = re.compile('([^ ]+) (\(nil\)|0x[a-fA-F0-9]+$)')
+
+
+###############################################################################
+# Shape Compression
+###############################################################################
+
+def findi(m, x):
+ if not x in m:
+ m[x] = [x, 0]
+ return m[x]
+ if m[x][0] == x:
+ return m[x]
+ z = findi (m, m[x][0])
+ m[x] = z
+ return z
+
+
+def find(m, x):
+ return findi(m, x)[0]
+
+
+def union(m, rep, x, y):
+ xp = findi (m, x)
+ yp = findi (m, y)
+ if xp == yp:
+ return
+ if xp[1] < yp[1]:
+ rep[yp[0]] = rep.get(xp[0], xp[0])
+ if xp[0] in rep:
+ del rep[xp[0]]
+ m[xp[0]][0] = yp[0]
+ elif xp[1] > yp[1]:
+ m[yp[0]][0] = xp[0]
+ else:
+ m[yp[0]][0] = xp[0]
+ m[xp[0]][1] += 1
+
+
+def compress_shapes(graph, nodes, edges):
+ shape_merge = {}
+ shape_rep = {}
+
+ def canon_node(x):
+ y = find(shape_merge, x)
+ if y in shape_rep:
+ y = shape_rep[y]
+ return y
+
+ for x in nodes:
+ if graph.node_labels.get(x, '') != 'shape':
+ continue
+ if not x in edges:
+ continue
+ for y in edges[x]:
+ if graph.node_labels.get(y, '') != 'shape' and graph.node_labels.get(y, '') != 'base_shape':
+ continue
+ union(shape_merge, shape_rep, y, x)
+ break
+
+ # Remove merged away nodes
+ for x in shape_merge.keys():
+ if canon_node(x) != x:
+ nodes.remove(x)
+
+ # Update the edges for merging
+ new_edges = {}
+ for x, dsts in edges.items():
+ new_dsts = set([])
+ for y in dsts:
+ new_dsts.add(canon_node(y))
+ x = canon_node(x)
+ if x in new_dsts:
+ new_dsts.remove(x)
+ new_edges[x] = new_edges.get(x, set([])).union(new_dsts)
+ edges = new_edges
+
+
+###############################################################################
+# DOT Graph Output
+###############################################################################
+
+dot_graph_paths = []
+
+
+def add_dot_graph_path(path):
+ dot_graph_paths.append(path)
+
+
+def output_dot_file(args, graph, targs, fname):
+
+ # build the set of nodes
+ nodes = set([])
+ for p in dot_graph_paths:
+ for x in p:
+ nodes.add(x)
+
+ # build the edge map
+ edges = {}
+
+ for p in dot_graph_paths:
+ prevNode = None
+ for x in p:
+ if prevNode:
+ edges.setdefault(prevNode, set([])).add(x)
+ prevNode = x
+
+ # Shape Compression
+ compress_shapes(graph, nodes, edges)
+
+
+ # Write out the DOT graph
+ outf = open(fname, 'w')
+ outf.write('digraph {\n')
+
+ # Nodes
+ for addr in nodes:
+ label = graph.node_labels.get(addr, '')
+ color = 'black'
+ style = 'solid'
+ shape = 'rect'
+ native = ''
+
+ if label.endswith('<no private>'):
+ label = label[:-13]
+
+ # Lookup the edge label for this node
+ elabel = ''
+
+ for origin in graph.edge_labels.values():
+ if addr in origin:
+ elabels = origin[addr]
+ elabel = elabels[0]
+ break
+
+
+ # GObject or something else with a native address
+ gm = gobj_regex.match(label)
+
+ if gm:
+ label = gm.group(1)
+ color = 'orange'
+ style = 'bold'
+
+ if not args.no_addr:
+ native = gm.group(2)
+
+ # Some kind of GObject
+ if label.startswith('GObject_'):
+ shape = 'circle'
+
+ if elabel in ['prototype', 'group_proto']:
+ style += ',dashed'
+ # Another object native to Gjs
+ elif label.startswith('Gjs') or label.startswith('GIR'):
+ shape = 'octagon'
+ elif label.startswith('Function'):
+ fm = func_regex.match(label)
+
+ if fm.group(2) == '<':
+ label = 'Function via {}()'.format(fm.group(1))
+ elif fm.group(2):
+ label = 'Function {} in {}'.format(fm.group(2), fm.group(1))
+ else:
+ if len(label) > 10:
+ label = label[9:]
+ label += '()'
+
+ color = 'green'
+ style = 'bold,rounded'
+ # A function context
+ elif label == 'Call' or label == 'LexicalEnvironment':
+ color = 'green'
+ style = 'bold,dashed'
+ # A file/script reference
+ elif label.startswith('script'):
+ label = label[7:].split('/')[-1]
+ shape = 'note'
+ color = 'blue'
+ # A WeakMap
+ elif label.startswith('WeakMap'):
+ label = 'WeakMap'
+ style = 'dashed'
+ # Mostly uninteresting objects
+ elif label in ['base_shape', 'object_group', 'type_object']:
+ style = 'dotted'
+ if label == 'base_shape':
+ label = 'shape'
+ elif label == 'type_object':
+ label = 'type'
+
+ # Only mark the target if it's a single match
+ if addr == targs[0] and len(targs) == 1:
+ color = 'red'
+ style = 'bold'
+
+ if args.no_addr:
+ outf.write(' node [label="{0}", color={1}, shape={2}, style="{3}"] q{4};\n'.format(label,
color, shape, style, addr))
+ else:
+ if native:
+ outf.write(' node [label="{0}\\njsobj@{4}\\nnative@{5}", color={1}, shape={2}, style="{3}"]
q{4};\n'.format(label, color, shape, style, addr, native))
+ else:
+ outf.write(' node [label="{0}\\njsobj@{4}", color={1}, shape={2}, style="{3}"]
q{4};\n'.format(label, color, shape, style, addr))
+
+ # Edges (relationships)
+ for origin, destinations in edges.items():
+ for destination in destinations:
+ labels = graph.edge_labels.get(origin, {}).get(destination, [])
+ ll = []
+
+ for l in labels:
+ if len(l) == 2:
+ l = l[0]
+ if l.startswith('**UNKNOWN SLOT '):
+ continue
+ ll.append(l)
+
+ label = ''
+ style = 'solid'
+ color = 'black'
+
+ if len(ll) == 1:
+ label = ll[0]
+
+ # Object children
+ if label.startswith('objects['):
+ label = label[7:]
+ # Array elements
+ elif label.startswith('objectElements['):
+ label = label[14:]
+ # prototype/constructor function
+ elif label in ['prototype', 'group_proto']:
+ color = 'orange'
+ style = 'bold,dashed'
+ # fun_environment
+ elif label == 'fun_environment':
+ label = ''
+ color = 'green'
+ style = 'bold,dashed'
+ elif label == 'script':
+ label = ''
+ color = 'blue'
+ # Signals
+ # TODO: better heap label via gi/closure.cpp & gi/object.cpp
+ elif label == 'signal connection':
+ color = 'red'
+ style = 'bold,dashed'
+
+ if len(label) > 18:
+ label = label[:8] + '...' + label[-8:]
+ else:
+ label = ',\\n'.join(ll)
+
+ outf.write(' q{0} -> q{1} [label="{2}", color={3}, style="{4}"];\n'.format(origin, destination,
label, color, style))
+
+ outf.write('}\n')
+ outf.close()
diff --git a/tools/heapgraph.py b/tools/heapgraph.py
new file mode 100755
index 00000000..4b54b1f5
--- /dev/null
+++ b/tools/heapgraph.py
@@ -0,0 +1,659 @@
+#!/usr/bin/env python3
+
+# This Source Code Form is subject to the terms of the Mozilla Public
+# License, v. 2.0. If a copy of the MPL was not distributed with this
+# file, You can obtain one at http://mozilla.org/MPL/2.0/.
+#
+# heapgraph.py - Top-level script for interpreting Garbage Collector heaps
+
+import argparse
+import copy
+from collections import namedtuple
+from collections import deque
+import os
+import re
+import sys
+
+try:
+ from heapdot import output_dot_file
+ from heapdot import add_dot_graph_path
+except ImportError:
+ sys.stderr.write('DOT graph output not available\n')
+
+
+########################################################
+# Command line arguments.
+########################################################
+
+parser = argparse.ArgumentParser(description='Find what is rooting or preventing an object from being
collected in a GJS heap using a shortest-path breadth-first algorithm.')
+
+parser.add_argument('heap_file', metavar='FILE',
+ help='Garbage collector heap from System.dumpHeap()')
+
+parser.add_argument('target', metavar='TARGET',
+ help='Heap address (eg. 0x7fa814054d00) or type prefix (eg. Array, Object, GObject,
Function...)')
+
+### Target Options
+targ_opts = parser.add_mutually_exclusive_group()
+
+
+targ_opts.add_argument('--edge', '-e', dest='edge_target', action='store_true',
+ default=False,
+ help='Treat TARGET as a function name')
+
+targ_opts.add_argument('--function', '-f', dest='func_target', action='store_true',
+ default=False,
+ help='Treat TARGET as a function name')
+
+targ_opts.add_argument('--string', '-s', dest='string_target', action='store_true',
+ default=False,
+ help='Treat TARGET as a string literal or String()')
+
+### Output Options
+out_opts = parser.add_argument_group('Output Options')
+
+out_opts.add_argument('--count', '-c', dest='count', action='store_true',
+ default=False,
+ help='Only count the matches for TARGET')
+
+out_opts.add_argument('--dot-graph', '-d', dest='dot_graph',
+ action='store_true', default=False,
+ help='Output a DOT graph to FILE.dot')
+
+out_opts.add_argument('--no-addr', '-na', dest='no_addr',
+ action='store_true', default=False,
+ help='Don\'t show addresses')
+
+### Node and Root Filtering
+filt_opts = parser.add_argument_group('Node/Root Filtering')
+
+filt_opts.add_argument('--diff-heap', '-dh', dest='diff_heap', action='store',
+ metavar='FILE',
+ help='Don\'t show roots common to the heap FILE')
+
+filt_opts.add_argument('--no-gray-roots', '-ng', dest='no_gray_roots',
+ action='store_true', default=False,
+ help='Don\'t show gray roots (marked to be collected)')
+
+filt_opts.add_argument('--no-weak-maps', '-nwm', dest='no_weak_maps',
+ action='store_true', default=False,
+ help='Don\'t show WeakMaps')
+
+filt_opts.add_argument('--show-global', '-g', dest='show_global',
+ action='store_true', default=False,
+ help='Show the global object (eg. window/GjsGlobal)')
+
+filt_opts.add_argument('--show-imports', '-i', dest='show_imports',
+ action='store_true', default=False,
+ help='Show import and module nodes (eg. imports.foo)')
+
+filt_opts.add_argument('--hide-addr', '-ha', dest='hide_addrs', action='append',
+ metavar='ADDR', default=[],
+ help='Don\'t show roots with the heap address ADDR')
+
+filt_opts.add_argument('--hide-node', '-hn', dest='hide_nodes', action='append',
+ metavar='LABEL', default=['GIRepositoryNamespace', 'GjsFileImporter', 'GjsGlobal',
'GjsModule'],
+ help='Don\'t show nodes with labels containing LABEL')
+
+
+###############################################################################
+# Heap Patterns
+###############################################################################
+
+GraphAttribs = namedtuple('GraphAttribs', 'edge_labels node_labels roots root_labels weakMapEntries')
+WeakMapEntry = namedtuple('WeakMapEntry', 'weakMap key keyDelegate value')
+
+
+addr_regex = re.compile('[A-F0-9]+$|0x[a-f0-9]+$')
+node_regex = re.compile ('((?:0x)?[a-fA-F0-9]+) (?:(B|G|W) )?([^\r\n]*)\r?$')
+edge_regex = re.compile ('> ((?:0x)?[a-fA-F0-9]+) (?:(B|G|W) )?([^\r\n]*)\r?$')
+wme_regex = re.compile ('WeakMapEntry map=([a-zA-Z0-9]+|\(nil\)) key=([a-zA-Z0-9]+|\(nil\))
keyDelegate=([a-zA-Z0-9]+|\(nil\)) value=([a-zA-Z0-9]+)\r?$')
+
+func_regex = re.compile('Function(?: ([^/]+)(?:/([<|\w]+))?)?')
+gobj_regex = re.compile('([^ ]+) (0x[a-fA-F0-9]+$)')
+
+###############################################################################
+# Heap Parsing
+###############################################################################
+
+def parse_roots(fobj):
+ """Parse the roots portion of a garbage collector heap."""
+
+ roots = {}
+ root_labels = {}
+ weakMapEntries = []
+
+ for line in fobj:
+ node = node_regex.match(line)
+
+ if node:
+ addr = node.group(1)
+ color = node.group(2)
+ label = node.group(3)
+
+ # Only overwrite an existing root with a black root.
+ if addr not in roots or color == 'B':
+ roots[addr] = (color == 'B')
+ # It would be classier to save all the root labels, though then
+ # we have to worry about gray vs black.
+ root_labels[addr] = label
+ else:
+ wme = wme_regex.match(line)
+
+ if wme:
+ weakMapEntries.append(WeakMapEntry(weakMap=wme.group(1),
+ key=wme.group(2),
+ keyDelegate=wme.group(3),
+ value=wme.group(4)))
+ # Skip comments, arenas, compartments and zones
+ elif line[0] == '#':
+ continue
+ # Marks the end of the roots section
+ elif line[:10] == '==========':
+ break
+ else:
+ sys.stderr.write('Error: unknown line {}\n'.format(line))
+ exit(-1)
+
+ return [roots, root_labels, weakMapEntries]
+
+
+def parse_graph(fobj):
+ """Parse the node and edges of a garbage collector heap."""
+
+ edges = {}
+ edge_labels = {}
+ node_labels = {}
+
+ def addNode (addr, node_label):
+ edges[addr] = {}
+ edge_labels[addr] = {}
+
+ if node_label != '':
+ node_labels[addr] = node_label
+
+ def addEdge(source, target, edge_label):
+ edges[source][target] = edges[source].get(target, 0) + 1
+
+ if edge_label != '':
+ edge_labels[source].setdefault(target, []).append(edge_label)
+
+ node_addr = None
+
+ for line in fobj:
+ e = edge_regex.match(line)
+
+ if e:
+ if node_addr not in args.hide_addrs:
+ addEdge(node_addr, e.group(1), e.group(3))
+ else:
+ node = node_regex.match(line)
+
+ if node:
+ node_addr = node.group(1)
+ node_color = node.group(2)
+ node_label = node.group(3)
+
+ # Use this opportunity to map hide_nodes to addresses
+ for hide_node in args.hide_nodes:
+ if hide_node in node_label:
+ args.hide_addrs.append(node_addr)
+ break
+ else:
+ addNode(node_addr, node_label)
+ # Skip comments, arenas, compartments and zones
+ elif line[0] == '#':
+ continue
+ else:
+ sys.stderr.write('Error: Unknown line: {}\n'.format(line[:-1]))
+
+ # yar, should pass the root crud in and wedge it in here, or somewhere
+ return [edges, edge_labels, node_labels]
+
+
+def parse_heap(fname):
+ """Parse a garbage collector heap."""
+
+ try:
+ fobj = open(fname, 'r')
+ except:
+ sys.stderr.write('Error opening file {}\n'.format(fname))
+ exit(-1)
+
+ [roots, root_labels, weakMapEntries] = parse_roots(fobj)
+ [edges, edge_labels, node_labels] = parse_graph(fobj)
+ fobj.close()
+
+ graph = GraphAttribs(edge_labels=edge_labels, node_labels=node_labels,
+ roots=roots, root_labels=root_labels,
+ weakMapEntries=weakMapEntries)
+
+ return (edges, graph)
+
+
+def find_nodes(fname):
+ """Parse a garbage collector heap and return a list of node addresses."""
+
+ addrs = [];
+
+ try:
+ fobj = open(fname, 'r')
+ sys.stderr.write('Parsing {0}...'.format(fname))
+ except:
+ sys.stderr.write('Error opening file {}\n'.format(fname))
+ exit(-1)
+
+ # Whizz past the roots
+ for line in fobj:
+ if '==========\n' in line:
+ break
+
+ for line in fobj:
+ node = node_regex.match(line)
+
+ if node:
+ addrs.append(node.group(1))
+
+ fobj.close()
+
+ sys.stderr.write('done\n')
+ sys.stderr.flush()
+
+ return addrs
+
+
+
+# Some applications may not care about multiple edges.
+# They can instead use a single graph, which is represented as a map
+# from a source node to a set of its destinations.
+def to_single_graph(edges):
+ single_graph = {}
+
+ for origin, destinations in edges.items():
+ d = set([])
+ for destination, distance in destinations.items():
+ d.add(destination)
+ single_graph[origin] = d
+
+ return single_graph
+
+
+def load_graph(fname):
+ sys.stderr.write('Parsing {0}...'.format(fname))
+ (edges, graph) = parse_heap(fname)
+ edges = to_single_graph(edges)
+ sys.stderr.write('done\n')
+
+ sys.stderr.flush()
+
+ return (edges, graph)
+
+
+###############################################################################
+# Path printing
+###############################################################################
+
+tree_graph_paths = {}
+
+
+class style:
+ DIM = '\033[30m'
+ BOLD = '\033[1m'
+ ITALIC = '\033[3m'
+ UNDERLINE = '\033[4m'
+ END = '\033[0m'
+
+
+def get_edge_label(graph, origin, destination):
+ elabel = lambda l: l[0] if len(l) == 2 else l
+ labels = graph.edge_labels.get(origin, {}).get(destination, [])
+
+ if len(labels) == 1:
+ label = labels[0]
+
+ if label == 'signal connection':
+ return 'GSignal'
+ else:
+ return label
+ elif len(labels) > 1:
+ return ', '.join([elabel(l) for l in labels])
+ else:
+ return ''
+
+
+def get_node_label(graph, addr):
+ label = graph.node_labels[addr]
+
+ if label.endswith(' <no private>'):
+ label = label[:-13]
+
+ if label.startswith('Function '):
+ fm = func_regex.match(label)
+
+ if fm.group(2) == '<':
+ return 'Function via {}'.format(fm.group(1))
+ elif fm.group(2):
+ return 'Function {} in {}'.format(fm.group(2), fm.group(1))
+ else:
+ return label
+ if label.startswith('script'):
+ label = label[7:].split('/')[-1]
+ elif label.startswith('WeakMap'):
+ label = 'WeakMap'
+ elif label == 'base_shape':
+ label = 'shape'
+ elif label == 'type_object':
+ label = 'type'
+
+ return label[:50]
+
+
+def get_root_label(graph, root):
+ # This won't work on Windows.
+ #label = re.sub(r'0x[0-9a-f]{8}', '*', graph.root_labels[root])
+ label = graph.root_labels[root]
+ return '(ROOT) {}'.format(label)
+
+
+def output_tree_graph(graph, data, base='', parent=''):
+ while data:
+ addr, children = data.popitem()
+
+ # Labels
+ if parent:
+ edge = get_edge_label(graph, parent, addr)
+ else:
+ edge = graph.root_labels[addr]
+
+ node = get_node_label(graph, addr)
+ has_native = gobj_regex.match(node)
+
+ # Color/Style
+ if os.isatty(1):
+ if parent:
+ edge = style.DIM + edge + style.END
+ else:
+ edge = style.ITALIC + edge + style.END
+
+ orig = style.UNDERLINE + 'jsobj@' + addr + style.END
+
+ if has_native:
+ node = style.BOLD + has_native.group(1) + style.END
+ orig += ' ' + style.UNDERLINE + 'native@' + has_native.group(2) + style.END
+ else:
+ node = style.BOLD + node + style.END
+ else:
+ orig = 'jsobj@' + addr
+
+ if has_native:
+ node = has_native.group(1)
+ orig += ' native@' + has_native.group(2)
+
+ # Print the path segment
+ if args.no_addr:
+ if data:
+ print('{0}├─[{1}]─➤ [{2}]'.format(base, edge, node))
+ else:
+ print('{0}╰─[{1}]─➤ [{2}]'.format(base, edge, node))
+ else:
+ if data:
+ print('{0}├─[{1}]─➤ [{2} {3}]'.format(base, edge, node, orig))
+ else:
+ print('{0}╰─[{1}]─➤ [{2} {3}]'.format(base, edge, node, orig))
+
+ # Print child segments
+ if children:
+ if data:
+ output_tree_graph(graph, children, base + '│ ', addr)
+ else:
+ output_tree_graph(graph, children, base + ' ', addr)
+ else:
+ if data:
+ print(base + '│ ')
+ else:
+ print(base + ' ')
+
+
+def add_tree_graph_path(owner, path):
+ o = owner.setdefault(path.pop(0), {})
+ if path:
+ add_tree_graph_path(o, path)
+
+
+def add_path(args, graph, path):
+ if args.dot_graph:
+ add_dot_graph_path(path)
+ else:
+ add_tree_graph_path(tree_graph_paths, path)
+
+
+###############################################################################
+# Breadth-first shortest path finding.
+###############################################################################
+
+def find_roots_bfs(args, edges, graph, target):
+ workList = deque()
+ distances = {}
+
+ def traverseWeakMapEntry(dist, k, m, v, label):
+ if not k in distances or not m in distances:
+ # Haven't found either the key or map yet.
+ return
+
+ if distances[k][0] > dist or distances[m][0] > dist:
+ # Either the key or the weak map is farther away, so we
+ # must wait for the farther one before processing it.
+ return
+
+ if v in distances:
+ return
+
+ distances[v] = (dist + 1, k, m, label)
+ workList.append(v)
+
+
+ # For now, ignore keyDelegates.
+ weakData = {}
+ for wme in graph.weakMapEntries:
+ weakData.setdefault(wme.weakMap, set([])).add(wme)
+ weakData.setdefault(wme.key, set([])).add(wme)
+ if wme.keyDelegate != '0x0':
+ weakData.setdefault(wme.keyDelegate, set([])).add(wme)
+
+ # Unlike JavaScript objects, GObjects can be "rooted" by their refcount so
+ # we have to use a fake root (repetitively)
+ startObject = 'FAKE START OBJECT'
+ rootEdges = set([])
+
+ for addr, isBlack in graph.roots.items():
+ if isBlack or not args.no_gray_roots:
+ rootEdges.add(addr)
+
+ #FIXME:
+ edges[startObject] = rootEdges
+ distances[startObject] = (-1, None)
+ workList.append(startObject)
+
+ # Search the graph.
+ while workList:
+ origin = workList.popleft()
+ dist = distances[origin][0]
+
+ # Found the target, stop digging
+ if origin == target:
+ continue
+
+ # origin does not point to any other nodes.
+ if not origin in edges:
+ continue
+
+ for destination in edges[origin]:
+ if destination not in distances:
+ distances[destination] = (dist + 1, origin)
+ workList.append(destination)
+
+ if origin in weakData:
+ for wme in weakData[origin]:
+ traverseWeakMapEntry(dist, wme.key, wme.weakMap, wme.value,
+ "value in WeakMap " + wme.weakMap)
+ traverseWeakMapEntry(dist, wme.keyDelegate, wme.weakMap, wme.key,
+ "key delegate in WeakMap " + wme.weakMap)
+
+
+ # Print out the paths by unwinding backwards to generate a path,
+ # then print the path. Accumulate any weak maps found during this
+ # process into the printWorkList queue, and print out what keeps
+ # them alive. Only print out why each map is alive once.
+ printWorkList = deque()
+ printWorkList.append(target)
+ printedThings = set([target])
+
+ while printWorkList:
+ p = printWorkList.popleft()
+ path = []
+ while p in distances:
+ path.append(p)
+ dist = distances[p]
+ if len(dist) == 2:
+ [_, p] = dist
+ else:
+ # The weak map key is probably more interesting,
+ # so follow it, and worry about the weak map later.
+ [_, k, m, label] = dist
+
+ graph.edge_labels[k].setdefault(p, []).append(label)
+ p = k
+ if not m in printedThings and not args.no_weak_maps:
+ printWorkList.append(m)
+ printedThings.add(m)
+
+ if path:
+ path.pop()
+ path.reverse()
+ add_path(args, graph, path)
+
+
+########################################################
+# Target selection
+########################################################
+
+def target_edge(graph, target):
+ targets = []
+
+ for origin, destinations in graph.edge_labels.items():
+ for destination in destinations:
+ if target in graph.edge_labels[origin][destination]:
+ targets.append(destination)
+
+ sys.stderr.write('Found {} objects with edge label of {}\n'.format(len(targets), target))
+ return targets
+
+
+def target_func(graph, target):
+ targets = []
+
+ for addr, label in graph.node_labels.items():
+ if not label[:9] == 'Function ':
+ continue
+
+ if label[9:] == target:
+ targets.append(addr)
+
+ sys.stderr.write('Found {} functions named "{}"\n'.format(len(targets), target))
+ return targets
+
+
+def target_gobject(graph, target):
+ targets = []
+
+ for addr, label in graph.node_labels.items():
+ if label.endswith(target):
+ targets.append(addr)
+
+ sys.stderr.write('Found GObject with address of {}\n'.format(target))
+ return targets
+
+
+def target_string(graph, target):
+ targets = []
+
+ for addr, label in graph.node_labels.items():
+ if label[:7] == 'string ' and target in label[7:]:
+ targets.append(addr)
+ elif label[:10] == 'substring ' and target in label[10:]:
+ targets.append(addr)
+
+ sys.stderr.write('Found {} strings containing "{}"\n'.format(len(targets), target))
+ return targets
+
+
+def target_type(graph, target):
+ targets = []
+
+ for addr in edges.keys():
+ if graph.node_labels.get(addr, '')[0:len(args.target)] == args.target:
+ targets.append(addr)
+
+ sys.stderr.write('Found {} targets with type "{}"\n'.format(len(targets), target))
+ return targets
+
+
+def select_targets(args, edges, graph):
+ if args.edge_target:
+ return target_edge(graph, args.target)
+ elif args.func_target:
+ return target_func(graph, args.target)
+ elif args.string_target:
+ return target_string(graph, args.target)
+ # If args.target seems like an address search for a JS Object, then GObject
+ elif addr_regex.match(args.target):
+ if args.target in edges:
+ sys.stderr.write('Found object with address "{}"\n'.format(args.target))
+ return [args.target]
+ else:
+ return target_gobject(graph, args.target)
+
+ # Fallback to looking for JavaScript objects by class name
+ return target_type(graph, args.target)
+
+
+if __name__ == "__main__":
+ args = parser.parse_args()
+
+ # Node and Root Filtering
+ if args.show_global:
+ args.hide_nodes.remove('GjsGlobal')
+ if args.show_imports:
+ args.hide_nodes.remove('GjsFileImporter')
+ args.hide_nodes.remove('GjsModule')
+ args.hide_nodes.remove('GIRepositoryNamespace')
+
+ # Make sure we don't filter an explicit target
+ if args.target in args.hide_nodes:
+ args.hide_nodes.remove(args.target)
+
+ # Heap diffing; we do these addrs separately due to the sheer amount
+ diff_addrs = []
+
+ if args.diff_heap:
+ diff_addrs = find_nodes(args.diff_heap)
+
+ # Load the graph
+ (edges, graph) = load_graph(args.heap_file)
+ targets = select_targets(args, edges, graph)
+
+ if len(targets) == 0:
+ sys.stderr.write('No targets found for "{}".\n'.format(args.target))
+ sys.exit(-1)
+ elif args.count:
+ sys.exit(-1);
+
+ for addr in targets:
+ if addr in edges and addr not in diff_addrs:
+ find_roots_bfs(args, edges, graph, addr)
+
+ if args.dot_graph:
+ output_dot_file(args, graph, targets, args.heap_file + ".dot")
+ else:
+ output_tree_graph(graph, tree_graph_paths)
+
[
Date Prev][
Date Next] [
Thread Prev][
Thread Next]
[
Thread Index]
[
Date Index]
[
Author Index]