# How to profile Coq?
I (Pierre-Marie Pédrot) mainly use two OCaml branches to profile Coq, whether I
want to profile time or memory consumption. AFAIK, this only works for Linux.
## Time
In Coq source folder:
opam switch 4.05.0+trunk+fp
./configure -local -debug
make
perf record -g bin/coqc file.v
perf report -g fractal,callee --no-children
To profile only part of a file, first load it using
bin/coqtop -l file.v
and plug into the process
perf record -g -p PID
### Per-component [flame graphs](https://github.com/brendangregg/FlameGraph)
I (Andres Erbsen) have found it useful to look at library-wide flame graphs of
coq time consumption. As the Ltac interpreter stack is reflected in the OCaml
stack, calls to the same primitive can appear on top of multiple essentially
equivalent stacks. To make the profiles more readable, one could either try to
edit the stack trace to merge "equivalent" frames, or simply look at the
aggregate profile on a component-by-component basis. Here is how to do the
second for the standard library ([example output](https://cdn.rawgit.com/andres-erbsen/b29b29cb6480dfc6a662062e4fcd0ae3/raw/304fc3fea9630c8e453929aa7920ca8a2a570d0b/stdlib_categorized_outermost.svg)).
~~~~~
#!/bin/bash
make -f Makefile.dune clean
make -f Makefile.dune states
perf record -F99 `# ~1GB of data` --call-graph=dwarf -- make -f Makefile.dune world
perf script --time '0%-100%' |
stackcollapse-perf.pl |
grep Coqtop__compile |
sed -rf <(cat <<'EOF'
s/;caml/;/g
s/_[0-9]*;/;/g
s/Logic_monad__fun;//g
s/_apply[0-9];//g
s/;System/@&@/
s/;Hashcons/@&@/
s/;Grammar/@&@/
s/;Declaremods/@&@/
s/;Tactics/@&@/
s/;Pretyping/@&@/
s/;Typeops/@&@/
s/;Reduction/@&@/
s/;Unification/@&@/
s/;Evarutil/@&@/
s/;Evd/@&@/
s/;EConstr/@&@/
s/;Constr/@&@/
s/;Univ/@&@/
s/;Ugraph/@&@/
s/;UState/@&@/
s/;Micromega/@&@/
s/;Omega/@&@/
s/;Auto/@&@/
s/;Ltac_plugin__Tacinterp/@&@/
s/;Ltac_plugin__Rewrite/@&@/
s/[^@]*@;([^@]*)@/\1;\1/
s/@//g
:a; s/;([^;]+);\1;/;\1;/g;ta
EOF
) |
flamegraph.pl
~~~~~
## Memory
You first need a few commits atop trunk for this to work.
git remote add ppedrot https://github.com/ppedrot/coq.git
git fetch ppedrot
git checkout ppedrot/allocation-profiling
git rebase master
Then:
opam switch 4.00.1+alloc-profiling
./configure -local -debug
make
Note that linking the coqtop binary takes quite an amount of time with this
branch, so do not worry too much. There are more recent branches of
alloc-profiling on mshinwell's repo which can be found at:
https://github.com/mshinwell/opam-repo-dev
### For memory dump:
CAMLRUNPARAM=T,mj bin/coqc file.v
In another terminal:
pkill -SIGUSR1 $COQTOPPID
...
pkill -SIGUSR1 $COQTOPPID
dev/decode-major-heap.sh heap.$COQTOPPID.$N bin/coqtop
where $COQTOPPID is coqtop pid and $N the index of the call to pkill.
First column is the memory taken by the objects (in words), second one is the
number of objects and third is the place where the objects where allocated.
### For complete memory graph:
CAMLRUNPARAM=T,gr bin/coqc file.v
In another terminal:
pkill -SIGUSR1 $COQTOPPID
...
pkill -SIGUSR1 $COQTOPPID
ocaml dev/decodegraph.ml edge.$COQTOPPID.$N bin/coqtop > memory.dot
dot -Tpdf -o memory.pdf memory.dot
where $COQTOPPID is coqtop pid and $N the index of the call to pkill.
The pdf produced by the last command gives a compact graphical representation of
the various objects allocated.
¤ Dauer der Verarbeitung: 0.22 Sekunden
(vorverarbeitet)
¤
|
Haftungshinweis
Die Informationen auf dieser Webseite wurden
nach bestem Wissen sorgfältig zusammengestellt. Es wird jedoch weder Vollständigkeit, noch Richtigkeit,
noch Qualität der bereit gestellten Informationen zugesichert.
Bemerkung:
Die farbliche Syntaxdarstellung ist noch experimentell.
|