|
| 1 | +--- |
| 2 | +title: Poor Performance - Using Linux Perf |
| 3 | +layout: docs.hbs |
| 4 | +--- |
| 5 | + |
| 6 | +# Using Linux Perf |
| 7 | + |
| 8 | +[Linux Perf](https://perf.wiki.kernel.org/index.php/Main_Page) provides low-level CPU profiling with JavaScript, |
| 9 | +native, and OS-level frames. |
| 10 | + |
| 11 | +**Important**: this tutorial is only available on Linux. |
| 12 | + |
| 13 | +## How To |
| 14 | + |
| 15 | +Linux Perf is usually available through the `linux-tools-common` package. Starting a Node.js application with either |
| 16 | +`--perf-basic-prof` or `--perf-basic-prof-only-functions` enables support for _perf_events_. |
| 17 | + |
| 18 | +`--perf-basic-prof` always writes to a file (`/tmp/perf-PID.map`), which can lead to unbounded disk growth. |
| 19 | +If that's a concern, use either the [linux-perf](https://www.npmjs.com/package/linux-perf) module |
| 20 | +or `--perf-basic-prof-only-functions`. |
| 21 | + |
| 22 | +The main difference between the two is that `--perf-basic-prof-only-functions` produces less output, which makes it a |
| 23 | +viable option for production profiling. |
| 24 | + |
| 25 | +```console |
| 26 | +# Launch the application and get the PID |
| 27 | +$ node --perf-basic-prof-only-functions index.js & |
| 28 | +[1] 3870 |
| 29 | +``` |
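
The `index.js` launched above is not shown in this guide. As a stand-in, here is a minimal sketch of a CPU-bound script that gives `perf` something to sample. The function names `fib` and `burn` and the `BUSY_SECONDS` environment variable are hypothetical, chosen only for this example.

```javascript
'use strict';

// Deliberately slow recursive Fibonacci so it dominates the CPU samples.
function fib(n) {
  return n < 2 ? n : fib(n - 1) + fib(n - 2);
}

// Keep the CPU busy for roughly `seconds` seconds and report how many
// times fib(25) was evaluated.
function burn(seconds) {
  const deadline = Date.now() + seconds * 1000;
  let calls = 0;
  while (Date.now() < deadline) {
    fib(25);
    calls += 1;
  }
  return calls;
}

// BUSY_SECONDS is a hypothetical knob for this sketch; unset means no work.
const seconds = Number(process.env.BUSY_SECONDS || 0);
const calls = burn(seconds);
console.log(`pid ${process.pid}: ran fib(25) ${calls} times in ~${seconds}s`);
```

Running it as `BUSY_SECONDS=60 node --perf-basic-prof-only-functions index.js &` should leave about a minute to attach `perf` to the printed PID.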
| 30 | + |
| 31 | +Then record events at the desired frequency: |
| 32 | + |
| 33 | +```console |
| 34 | +$ sudo perf record -F 99 -p 3870 -g |
| 35 | +``` |
| 36 | + |
| 37 | +During this phase, you may want to run a load test against the application to generate more samples for a reliable |
| 38 | +analysis. When the job is done, stop the perf process by sending a SIGINT (Ctrl-C) to the command. |
| 39 | + |
| 40 | +`perf record` writes its samples to a `perf.data` file in the current directory. Node.js, in turn, writes `/tmp/perf-PID.map` |
| 41 | +(in the above example: `/tmp/perf-3870.map`), which maps JIT-compiled code addresses to function names so `perf` can symbolize them. |
| 42 | + |
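
The `/tmp/perf-PID.map` file is plain text. Assuming the usual perf map layout of one `START SIZE NAME` entry per line, with `START` and `SIZE` in hexadecimal without a `0x` prefix, a single entry could be parsed roughly as below. The helper name `parsePerfMapLine` and the sample symbol are hypothetical and only illustrate the shape of the data.

```javascript
// Parse one line of a perf map file into { start, size, name }.
// Returns null when the line does not match the assumed layout.
function parsePerfMapLine(line) {
  const match = /^([0-9a-f]+) ([0-9a-f]+) (.*)$/i.exec(line.trim());
  if (match === null) {
    return null;
  }
  return {
    start: BigInt('0x' + match[1]), // addresses can exceed 2^53, so use BigInt
    size: parseInt(match[2], 16),   // length of the code region in bytes
    name: match[3],                 // symbol name; may itself contain spaces
  };
}

// Illustrative entry, not taken from a real profile.
const entry = parsePerfMapLine('3ef414c0 398 LazyCompile:~fib /app/index.js:4');
console.log(entry.name, entry.size);
```

Matching the name with a greedy `(.*)` keeps symbols that contain spaces, such as a function name followed by its file path, in one piece.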
| 43 | +To write those results to a file as text, run: |
| 44 | + |
| 45 | +```console |
| 46 | +$ sudo perf script > perfs.out |
| 47 | +``` |
| 48 | + |
| 49 | +```console |
| 50 | +$ cat ./perfs.out |
| 51 | +node 3870 25147.878454: 1 cycles: |
| 52 | + ffffffffb5878b06 native_write_msr+0x6 ([kernel.kallsyms]) |
| 53 | + ffffffffb580d9d5 intel_tfa_pmu_enable_all+0x35 ([kernel.kallsyms]) |
| 54 | + ffffffffb5807ac8 x86_pmu_enable+0x118 ([kernel.kallsyms]) |
| 55 | + ffffffffb5a0a93d perf_pmu_enable.part.0+0xd ([kernel.kallsyms]) |
| 56 | + ffffffffb5a10c06 __perf_event_task_sched_in+0x186 ([kernel.kallsyms]) |
| 57 | + ffffffffb58d3e1d finish_task_switch+0xfd ([kernel.kallsyms]) |
| 58 | + ffffffffb62d46fb __sched_text_start+0x2eb ([kernel.kallsyms]) |
| 59 | + ffffffffb62d4b92 schedule+0x42 ([kernel.kallsyms]) |
| 60 | + ffffffffb62d87a9 schedule_hrtimeout_range_clock+0xf9 ([kernel.kallsyms]) |
| 61 | + ffffffffb62d87d3 schedule_hrtimeout_range+0x13 ([kernel.kallsyms]) |
| 62 | + ffffffffb5b35980 ep_poll+0x400 ([kernel.kallsyms]) |
| 63 | + ffffffffb5b35a88 do_epoll_wait+0xb8 ([kernel.kallsyms]) |
| 64 | + ffffffffb5b35abe __x64_sys_epoll_wait+0x1e ([kernel.kallsyms]) |
| 65 | + ffffffffb58044c7 do_syscall_64+0x57 ([kernel.kallsyms]) |
| 66 | + ffffffffb640008c entry_SYSCALL_64_after_hwframe+0x44 ([kernel.kallsyms]) |
| 67 | +.... |
| 68 | +``` |
| 69 | + |
| 70 | +The raw output can be hard to read, so it is typically used to generate flame graphs for better |
| 71 | +visualization. |
| 72 | + |
| 73 | + |
| 74 | + |
| 75 | +To generate a flamegraph from this result, follow [this tutorial](https://nodejs.org/en/docs/guides/diagnostics-flamegraph/#create-a-flame-graph-with-system-perf-tools) |
| 76 | +from step 6. |
| 77 | + |
| 78 | +Because `perf` is not a Node.js-specific tool, its output might be affected by how JavaScript code is optimized in |
| 79 | +Node.js. See [perf output issues](https://nodejs.org/en/docs/guides/diagnostics-flamegraph/#perf-output-issues) for |
| 80 | +further reference. |
| 81 | + |
| 82 | +## Useful Links |
| 83 | + |
| 84 | +* https://nodejs.org/en/docs/guides/diagnostics-flamegraph/ |
| 85 | +* https://www.brendangregg.com/blog/2014-09-17/node-flame-graphs-on-linux.html |
| 86 | +* https://perf.wiki.kernel.org/index.php/Main_Page |
| 87 | +* https://blog.rafaelgss.com.br/node-cpu-profiler |