Making FlameGraphs with Containerized Java – Alice Goldfuss

2019.01.02

Alice Goldfuss Reading time ~5 minutes

Making FlameGraphs with Containerized Java

About a month ago, I had the pleasure of taking a tutorial led by the fantastic Brendan Gregg on creating FlameGraphs using the Linux perf toolset. I recommend reading his many blog posts on the subject, but in short: while perf is an excellent resource for debugging kernel and user space processes, FlameGraphs make the data even easier to consume.

Now, if the process you’re trying to profile is Java, there are some extra hoops to jump through, which Brendan has also detailed online.

But if the Java process is in a container, it’s even more annoying. That’s where this post comes in.

Some context

As explained in Brendan’s blog post here, perf doesn’t work out of the box on Java, because Java doesn’t automatically expose stacks and method names. Running perf without these gives you something like this:

Notice the nondescript frame dedicated to “java”? Not very helpful.

Running Java with the option -XX:+PreserveFramePointer (starting in JDK8u60) will expose the stacks. However, without the method name symbols, you get this:

You need to also collect and dump the symbols of the running Java process, so perf can apply them to the correct stacks. This is made easier by Johannes Rudolph’s perf-map-agent repo. It has some scripts that will dump the Java process symbols and even integrate with the FlameGraph repo to make the graphs for you with one command. It’s pretty slick.
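As a quick sanity check on the -XX:+PreserveFramePointer step, something like the following could confirm that the option appears on a JVM's command line. This is only a sketch: has_frame_pointer_flag is a hypothetical helper name, and it only inspects the command-line string, so it will miss the flag if it was supplied another way (for example through the JAVA_TOOL_OPTIONS environment variable).

```shell
# Sketch: report whether a JVM command line contains -XX:+PreserveFramePointer.
# has_frame_pointer_flag is a hypothetical helper, not part of perf or the JDK.
has_frame_pointer_flag() {
    # $1 is the full command line as one string,
    # for example the output of: ps -o args= -p PID
    case " $1 " in
        *" -XX:+PreserveFramePointer "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Example usage (PID is a placeholder for the Java process ID):
#   has_frame_pointer_flag "$(ps -o args= -p PID)" && echo "flag is set"
```

A miss here is not proof that frame pointers are off, only a hint to look at how the JVM was started.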

Enter containers.

Containers

Containers, for all their hype and mystery, are still processes on a host. Run ps on the host and you can see container processes listed the same as non-containerized ones.

$ ps -ef | grep java
103  88834  88800 33 Jan27 ?        10:05:13 /usr/java/default/bin/java

That Java process is running inside a Docker container, and from the point of view of the host, it has PID 88834 and UID 103.

Inside the container, that Java process has PID 27 and is owned by the cassandra user.

$ ps -ef | grep java
cassand+     27      1 33 Jan27 ?        10:05:20 /usr/java/default/bin/java
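The two PIDs can also be matched up from the host without entering the container. On Linux 4.1 or later, /proc/<pid>/status includes an NSpid line listing the process ID in each nested PID namespace, outermost first. The sketch below reads the last value on that line; container_pid_from_status is a hypothetical helper name, and older kernels will not have the field at all.

```shell
# Sketch: print the PID a process has inside its innermost PID namespace,
# taken from the NSpid line of a /proc/<pid>/status file.
# container_pid_from_status is a hypothetical helper name.
container_pid_from_status() {
    # $1 is the path to a status file, e.g. /proc/88834/status
    # Prints nothing if the kernel does not expose NSpid.
    awk '/^NSpid:/ { print $NF }' "$1"
}

# Example usage on the host, with the host PID from this post:
#   container_pid_from_status /proc/88834/status
```

For the process in this post, this should report the PID seen inside the container (27), assuming the kernel provides NSpid.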

Herein lies the issue. Due to a bug in Java, you must dump the process symbols while operating as the owner of the Java process. The perf-map-agent scripts require it. But the process owner (cassandra) only exists within the container. Meanwhile, the perf toolkit must be run as root, and it’s common practice not to allow root within running containers.

So, how can you dump the symbols?

The hack

The hack (“workaround” is too elegant a word) is to run perf outside on the host, dump the symbols inside the container, and marry the two resulting files in the same space to make a FlameGraph.

More specifically:

  1. Set up the FlameGraph repo on your host and the perf-map-agent repo inside the container where the Java process owner can access it. I also had to alter /etc/passwd inside the container to give my cassandra user a shell (use vipw for safety).
  2. Capture a system profile on the host with something like
     sudo perf record -F 99 -a -g -- sleep 30
  3. From inside the container (easier to have this running already in another shell) dump the symbols for the Java process with
     java -cp attach-main.jar:$JAVA_HOME/lib/tools.jar \
       net.virtualvoid.perf.AttachOnce PID
  4. You will now have a perf-PID.map file inside /tmp of the container. Move this file to the host (I used a mounted volume).
  5. Now on the host, rename the perf-PID.map file to match the PID of the Java process as seen by the host. For example, my file was named perf-27.map but the host sees that process as PID 88834, so I renamed it to perf-88834.map.
  6. Move the renamed perf-PID.map file to your host’s /tmp directory and chown it to root.
  7. You can now proceed with the directions as though containers are not involved. So, create a FlameGraph with
     sudo perf script | stackcollapse-perf.pl | flamegraph.pl \
       --color=java --hash > flamegraph.svg

You will need to alter this command depending on where your perf.data file resides in relation to the FlameGraph repo.
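Steps 5 and 6 are easy to get wrong by hand, so here is a minimal sketch of them as a shell function. install_perf_map is a hypothetical helper name, and the source directory is whatever volume you used to get the file out of the container. It copies rather than moves the file, so the original stays in place if something goes wrong.

```shell
# Sketch of steps 5 and 6: place the map file dumped inside the container
# under the name perf expects on the host.
# install_perf_map is a hypothetical helper, not part of perf-map-agent.
install_perf_map() {
    src_dir=$1            # directory holding the container's perf-PID.map
    container_pid=$2      # PID as seen inside the container, e.g. 27
    host_pid=$3           # PID of the same process on the host, e.g. 88834
    dest_dir=${4:-/tmp}   # perf reads map files from /tmp

    src="$src_dir/perf-$container_pid.map"
    dest="$dest_dir/perf-$host_pid.map"

    [ -f "$src" ] || { echo "missing $src" >&2; return 1; }
    cp "$src" "$dest" || return 1
    echo "$dest"          # print the new path so the caller can chown it
}

# Example usage on the host (the volume path is an assumption):
#   map=$(install_perf_map /path/to/volume 27 88834) && sudo chown root "$map"
```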

Voila! A containerized Java FlameGraph.

Tips:

  • Let Java warm up before profiling it to ensure less churn in symbol creation. I let mine run for 15 minutes.
  • Run the perf profile before dumping the symbols. Switching the order might result in empty stacks, because the symbols were created in the JVM after the perf-PID.map file.
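A rough way to check the ordering from the second tip is to compare file modification times: the map file should be newer than perf.data. The sketch below assumes a shell whose test builtin supports -nt (bash and dash both do); map_newer_than_profile is a hypothetical helper name. Modification time is only a proxy, since copying the map file between container and host resets it.

```shell
# Sketch: succeed only if the symbol map was written after the perf profile.
# map_newer_than_profile is a hypothetical helper name.
map_newer_than_profile() {
    # $1: path to perf.data, $2: path to the perf-PID.map file
    [ -f "$1" ] && [ -f "$2" ] && [ "$2" -nt "$1" ]
}

# Example usage on the host, with the file names from this post:
#   map_newer_than_profile perf.data /tmp/perf-88834.map || echo "re-dump the symbols"
```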

Why?

Why is this hack needed? Why can’t we dump the symbols outside the container?

At first glance, it seems easy enough to just create a cassandra user on the host with UID 103. But trying to dump the Java symbols gives us an error:

[cassandra@hostname]$ java -cp attach-main.jar:$JAVA_HOME/lib/tools.jar net.virtualvoid.perf.AttachOnce 88834
Exception in thread "main" com.sun.tools.attach.AttachNotSupportedException: Unable to open socket file: target process not responding or HotSpot VM not loaded
    at sun.tools.attach.LinuxVirtualMachine.<init>(LinuxVirtualMachine.java:106)
    at sun.tools.attach.LinuxAttachProvider.attachVirtualMachine(LinuxAttachProvider.java:63)
    at com.sun.tools.attach.VirtualMachine.attach(VirtualMachine.java:208)
    at net.virtualvoid.perf.AttachOnce.loadAgent(AttachOnce.java:37)
    at net.virtualvoid.perf.AttachOnce.main(AttachOnce.java:33)

This is the same behavior you get if you try to dump the symbols as a user who doesn’t own the Java process. So, the host’s cassandra user can’t attach to a socket. What kind of socket? JMX or UNIX? Not sure. The documentation isn’t super clear.

Even nsenter fails here:

[root@hostname]$ nsenter -t 88834 -n java -cp attach-main.jar:$JAVA_HOME/lib/tools.jar net.virtualvoid.perf.AttachOnce 88834
Exception in thread "main" com.sun.tools.attach.AttachNotSupportedException: Unable to open socket file: target process not responding or HotSpot VM not loaded
    at sun.tools.attach.LinuxVirtualMachine.<init>(LinuxVirtualMachine.java:106)
    at sun.tools.attach.LinuxAttachProvider.attachVirtualMachine(LinuxAttachProvider.java:63)
    at com.sun.tools.attach.VirtualMachine.attach(VirtualMachine.java:208)
    at net.virtualvoid.perf.AttachOnce.loadAgent(AttachOnce.java:37)
    at net.virtualvoid.perf.AttachOnce.main(AttachOnce.java:33)

Entering the same network namespace as process 88834 still doesn’t give access to the socket.
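The -n flag only switches the network namespace, so one thing worth checking when poking at this is which other namespaces still differ between your shell and the target process. The sketch below compares the /proc/<pid>/ns links for two processes. same_namespace is a hypothetical helper name, and reading another process's links generally needs root. It does not explain the failure above; it is only a diagnostic aid.

```shell
# Sketch: succeed if two processes share a given namespace.
# same_namespace is a hypothetical helper name.
same_namespace() {
    # $1, $2: PIDs to compare; $3: namespace type (net, mnt, pid, ...)
    # Returns 0 if shared, 1 if different, 2 if a link could not be read.
    ns_a=$(readlink "/proc/$1/ns/$3") || return 2
    ns_b=$(readlink "/proc/$2/ns/$3") || return 2
    [ "$ns_a" = "$ns_b" ]
}

# Example usage as root on the host, with the host PID from this post:
#   for ns in net mnt pid; do
#       same_namespace $$ 88834 "$ns" || echo "$ns namespace differs"
#   done
```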

I talked to several people about this and each conversation ended in puzzlement. Usually I would only post once I had all the answers, but I think it’s good to illustrate that everyone gets stuck sometimes. And it’s better to get the hack out there as a stopgap in the meantime, clunky though it might be. I look forward to a more elegant solution.

Special thanks

I want to thank Brendan Gregg, Johannes Rudolph, and Nitsan Wakart for creating and maintaining the FlameGraph and perf-map-agent repos, as well as helping me initially troubleshoot. Thank you to Jérôme Petazzoni for his unique container systems knowledge and my colleague Mike Hix for poking at namespaces. I am proud to work with all of you and delighted to occasionally stump you.
