Commit 173eba1

doc: include poor-performance diagnostic (#4928)
1 parent 2e65cbe commit 173eba1

3 files changed: +124 -0 lines changed

locale/en/docs/guides/diagnostics/index.md

Lines changed: 1 addition & 0 deletions
@@ -17,5 +17,6 @@ This is the available set of diagnostics guides:

* [Memory](/en/docs/guides/diagnostics/memory)
* [Live Debugging](/en/docs/guides/diagnostics/live-debugging)
+* [Poor Performance](/en/docs/guides/diagnostics/poor-performance)

[Diagnostics Working Group]: https://github.com/nodejs/diagnostics
Lines changed: 36 additions & 0 deletions
@@ -0,0 +1,36 @@
---
title: Poor Performance - Diagnostics
layout: docs.hbs
---

# Poor Performance

This document explains how to profile a Node.js process.

* [Poor Performance](#poor-performance)
  * [My application has poor performance](#my-application-has-poor-performance)
    * [Symptoms](#symptoms)
    * [Debugging](#debugging)

## My application has poor performance

### Symptoms

My application's latency is high, and I have already confirmed that the
bottleneck is not in my dependencies, such as databases and downstream services.
So I suspect that my application spends significant time running code or
processing information.

You are satisfied with your application's performance in general but would like
to understand which parts of the application can be improved to run faster or
more efficiently. This can be useful when you want to improve the user
experience or save computation cost.
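
For the first symptom, one rough way to check whether time goes into running code rather than waiting on I/O is to
compare CPU time with wall-clock time around a suspicious piece of code. The following is a minimal sketch under
stated assumptions: `measure` and `busyWork` are hypothetical names, the measured function is synchronous, and
`process.cpuUsage()` reports CPU time for the whole process, so other threads can inflate the number.

```javascript
// Hypothetical helper: compares CPU time with wall-clock time for a
// synchronous function. A CPU time close to the wall time suggests the
// code is CPU-bound rather than waiting on I/O.
function measure(fn) {
  const wallStart = process.hrtime.bigint();
  const cpuStart = process.cpuUsage();
  fn();
  // Passing the previous reading returns the delta, in microseconds.
  const cpu = process.cpuUsage(cpuStart);
  const wallNs = process.hrtime.bigint() - wallStart;
  return {
    wallMs: Number(wallNs) / 1e6,
    cpuMs: (cpu.user + cpu.system) / 1000,
  };
}

// Stand-in for application code that burns CPU (illustrative only).
function busyWork(iterations) {
  let acc = 0;
  for (let i = 0; i < iterations; i++) {
    acc += Math.sqrt(i);
  }
  return acc;
}

const result = measure(() => busyWork(5e6));
console.log(`wall: ${result.wallMs.toFixed(1)} ms, cpu: ${result.cpuMs.toFixed(1)} ms`);
```

This only hints at where to look. A profiler is still needed to find out which functions consume the CPU time.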

### Debugging

In this use case, we are interested in the pieces of code that use more CPU
cycles than others. When we do this locally, we usually try to optimize our code.

This document provides two simple ways to profile a Node.js application:

* [Using V8 Sampling Profiler](https://nodejs.org/en/docs/guides/simple-profiling/)
* [Using Linux Perf](/en/docs/guides/diagnostics/poor-performance/using-linux-perf)
Lines changed: 87 additions & 0 deletions
@@ -0,0 +1,87 @@
---
title: Poor Performance - Using Linux Perf
layout: docs.hbs
---

# Using Linux Perf

[Linux Perf](https://perf.wiki.kernel.org/index.php/Main_Page) provides low-level CPU profiling with JavaScript,
native, and OS-level frames.

**Important**: this tutorial applies only to Linux.

## How To

Linux Perf is usually available through the `linux-tools-common` package. Starting a Node.js application with either
`--perf-basic-prof` or `--perf-basic-prof-only-functions` enables support for _perf_events_.

`--perf-basic-prof` always writes to a file (`/tmp/perf-PID.map`), which can lead to unbounded disk growth.
If that's a concern, use either the [linux-perf](https://www.npmjs.com/package/linux-perf) module
or `--perf-basic-prof-only-functions`.

The main difference between the two flags is that `--perf-basic-prof-only-functions` produces less output, which makes
it a viable option for production profiling.

```console
# Launch the application and get the PID
$ node --perf-basic-prof-only-functions index.js &
[1] 3870
```

Then record events at the desired sampling frequency (here 99 Hz), attaching to the process with `-p` and capturing
call graphs with `-g`:

```console
$ sudo perf record -F 99 -p 3870 -g
```

During this phase, you may want to run a load test against the application to generate more samples for a reliable
analysis. When the job is done, stop the perf process by sending a SIGINT (Ctrl-C) to the command.

`perf record` writes the recorded samples to a `perf.data` file in the current directory. The Node.js process itself
writes a file inside the `/tmp` folder, usually called `/tmp/perf-PID.map` (in the example above: `/tmp/perf-3870.map`),
which maps the addresses of JIT-compiled code to function names so that `perf` can resolve JavaScript frames.
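
To get a feel for what the map file holds, it can be parsed with a few lines of JavaScript. This is a minimal sketch,
assuming each line has the form `<hex start address> <hex size> <symbol name>`. `parsePerfMap` and `resolveAddress`
are hypothetical helper names, and the sample lines are made up for illustration.

```javascript
// Sketch: parse the text of a perf map file into entries sorted by address.
// Assumes each line looks like "<hex start> <hex size> <symbol name>".
function parsePerfMap(text) {
  const entries = [];
  for (const line of text.split('\n')) {
    const match = /^([0-9a-fA-F]+) ([0-9a-fA-F]+) (.+)$/.exec(line.trim());
    if (!match) continue; // skip blank or malformed lines
    entries.push({
      start: BigInt('0x' + match[1]),
      size: BigInt('0x' + match[2]),
      name: match[3],
    });
  }
  return entries.sort((a, b) => (a.start < b.start ? -1 : a.start > b.start ? 1 : 0));
}

// Find the symbol whose [start, start + size) range contains the address.
function resolveAddress(entries, address) {
  const hit = entries.find((e) => address >= e.start && address < e.start + e.size);
  return hit ? hit.name : null;
}

// Made-up sample content, for illustration only.
const sample = [
  '3a2b1c00 1f0 LazyCompile:*handleRequest /app/index.js:10',
  '3a2b2000 80 Builtin:ArrayPrototypeMap',
].join('\n');

const entries = parsePerfMap(sample);
console.log(resolveAddress(entries, 0x3a2b1c10n));
```

For a real file, the text could be read with `fs.readFileSync('/tmp/perf-3870.map', 'utf8')` and passed to
`parsePerfMap`. `perf` performs this lookup itself, so the sketch is only meant to show what the file contains.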

To write the recorded samples to a text file, execute:

```console
$ sudo perf script > perfs.out
```

```console
$ cat ./perfs.out
node 3870 25147.878454: 1 cycles:
    ffffffffb5878b06 native_write_msr+0x6 ([kernel.kallsyms])
    ffffffffb580d9d5 intel_tfa_pmu_enable_all+0x35 ([kernel.kallsyms])
    ffffffffb5807ac8 x86_pmu_enable+0x118 ([kernel.kallsyms])
    ffffffffb5a0a93d perf_pmu_enable.part.0+0xd ([kernel.kallsyms])
    ffffffffb5a10c06 __perf_event_task_sched_in+0x186 ([kernel.kallsyms])
    ffffffffb58d3e1d finish_task_switch+0xfd ([kernel.kallsyms])
    ffffffffb62d46fb __sched_text_start+0x2eb ([kernel.kallsyms])
    ffffffffb62d4b92 schedule+0x42 ([kernel.kallsyms])
    ffffffffb62d87a9 schedule_hrtimeout_range_clock+0xf9 ([kernel.kallsyms])
    ffffffffb62d87d3 schedule_hrtimeout_range+0x13 ([kernel.kallsyms])
    ffffffffb5b35980 ep_poll+0x400 ([kernel.kallsyms])
    ffffffffb5b35a88 do_epoll_wait+0xb8 ([kernel.kallsyms])
    ffffffffb5b35abe __x64_sys_epoll_wait+0x1e ([kernel.kallsyms])
    ffffffffb58044c7 do_syscall_64+0x57 ([kernel.kallsyms])
    ffffffffb640008c entry_SYSCALL_64_after_hwframe+0x44 ([kernel.kallsyms])
....
```

The raw output can be hard to read, so the raw file is typically used to generate flamegraphs for better
visualization.

![Example nodejs flamegraph](https://user-images.githubusercontent.com/26234614/129488674-8fc80fd5-549e-4a80-8ce2-2ba6be20f8e8.png)

To generate a flamegraph from this result, follow [this tutorial](https://nodejs.org/en/docs/guides/diagnostics-flamegraph/#create-a-flame-graph-with-system-perf-tools)
from step 6.
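
Before a flamegraph can be drawn, the samples are usually folded so that identical stacks are counted together. The
following is a minimal sketch of that folding step, not a replacement for the tools used in the tutorial. It assumes
each sample in the `perf script` text is a header line followed by frame lines of the form
`<address> <symbol>+<offset> (<dso>)`, innermost frame first, with a blank line between samples. `collapseStacks` is
a hypothetical name, and the second sample in the input is made up for illustration.

```javascript
// Sketch: fold `perf script` text into a Map of "root;...;leaf" -> sample count.
function collapseStacks(text) {
  const counts = new Map();
  let frames = null; // frames of the sample being read, leaf first

  const flush = () => {
    if (frames && frames.length > 0) {
      const key = frames.slice().reverse().join(';'); // root first
      counts.set(key, (counts.get(key) || 0) + 1);
    }
    frames = null;
  };

  for (const rawLine of text.split('\n')) {
    const line = rawLine.trim();
    if (line === '') {
      flush(); // a blank line ends the current sample
      continue;
    }
    // Frame line: "<address> <symbol>[+<offset>] (<dso>)"
    const frame = /^[0-9a-f]+ (.+?)(?:\+0x[0-9a-f]+)? \((.*)\)$/.exec(line);
    if (frame && frames) {
      frames.push(frame[1]);
    } else if (!frame) {
      flush();
      frames = []; // any other non-blank line is treated as a sample header
    }
  }
  flush();
  return counts;
}

// Two samples with the same stack; the second timestamp is made up.
const sampleOutput = `node 3870 25147.878454: 1 cycles:
    ffffffffb5b35980 ep_poll+0x400 ([kernel.kallsyms])
    ffffffffb5b35a88 do_epoll_wait+0xb8 ([kernel.kallsyms])

node 3870 25147.888555: 1 cycles:
    ffffffffb5b35980 ep_poll+0x400 ([kernel.kallsyms])
    ffffffffb5b35a88 do_epoll_wait+0xb8 ([kernel.kallsyms])
`;

for (const [stack, count] of collapseStacks(sampleOutput)) {
  console.log(`${stack} ${count}`); // prints "do_epoll_wait;ep_poll 2"
}
```

Dedicated flamegraph tooling handles many more cases, such as symbol names containing parentheses, so this sketch is
best read as an explanation of the data flow rather than as production code.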

Because `perf` is not a Node.js-specific tool, its output might have issues with how JavaScript code is optimized in
Node.js. See [perf output issues](https://nodejs.org/en/docs/guides/diagnostics-flamegraph/#perf-output-issues) for
further reference.

## Useful Links

* https://nodejs.org/en/docs/guides/diagnostics-flamegraph/
* https://www.brendangregg.com/blog/2014-09-17/node-flame-graphs-on-linux.html
* https://perf.wiki.kernel.org/index.php/Main_Page
* https://blog.rafaelgss.com.br/node-cpu-profiler
