Skip to content

Conversation

r-oldenburg
Copy link

@r-oldenburg r-oldenburg commented Nov 18, 2024

Hey there,

I wanted to share this contribution to this amazing project.
First thing: I am not a native GO programmer. SORRY, if I did break anything or violated any best practices / code conventions or whatever. Feel free to give such feedback.

We just changed the logging in our k8s deployments to JSON in order to make use of structured logging. Reading those in k9s is a pain, so I searched for solutions. And only found plugin solutions that are not giving the same UX as the usual Log View.

The current log UX in k8s is amazing, so I started experimenting with "jq" calls from within k9s. And eventually I found "gojq" which now even allows running without a "jq" installation on the system.

The main idea is to extract 3 fields: datetime, log level, message

Features:

  • Configure global expressions to be used in expressions (i.e. coloring by log level, padding, etc.)
  • Configure templates (collections of jq expression for time / log-level / message)
  • Toggle switch for JSON on/off ("J")
  • Iterating through the available templates ("Ctrl+J")
  • Editing of templates/their expressions while in Log View ("Shift+J")
    • this even with realtime compiling and instant error messages

Example Templates

... that I can provide (as follows):

  • "General" (simply selecting typical fields)
  • "Spring" / Logback (typical logback logging of Spring Boot Apps, even inserting arguments into placeholders)
  • "Spring + Stacktrace" same as above but with traditional Stacktrace visualization
  • "Mongo" Typical fields from Mongo DB logs
k9s:
  logger:
    tail: 7000
    buffer: 50000
    sinceSeconds: -1
    textWrap: false
    showTime: true
    decodeJson: true
    json:
      debug: false
      globalExpressions: |
        def lpad(n;p): . | tostring | if (n > length) then ((n - length) * (p|tostring)) + . else . end;
        def lpad(n): lpad(n;" "); 
        def rpad(n;p): . | tostring | if (n > length) then . + ((n - length) * (p|tostring)) else . end;
        def rpad(n): rpad(n;" ");
        def dateTimeFromTsNs($ts;$nanos): if $ts then ($ts/1000 | strflocaltime("%FT%T.")) + 
          (if $nanos then $nanos | lpad(9;0) else $ts%1000 | lpad(3;0) end) else "" end + "Z";
        def color($content;$logLvl): ({
          "ERROR": "\u001b[31m", 
          "E": "\u001b[31m", 
          "F": "\u001b[31m", 
          "WARN":"\u001b[33m", 
          "W":"\u001b[33m", 
          "INFO":"",
          "I":"",
          "DEBUG":"\u001b[34m", 
          "D1":"\u001b[34m",
          "TRACE":"\u361b[0m"
        }) as $lvlColor | $lvlColor.[$logLvl] + ($content | tostring) + "\u001b[0m";
        def insertSpringArgs(args): if (args | not) then . else reduce args[] as $arg (.; . |= sub("{}"; $arg)) end;
        def formatThrowable: .className + (if .message and .message != "null" then ": " + .message else "" end) + "\n" + 
        (.stepArray | [ .[]? | "    at " + .className + "." + .methodName + "(" + .fileName + ":" + (.lineNumber|tostring) + ")" ] | join("\n")) + 
        if .commonFramesCount then "\n    ... " + (.commonFramesCount | tostring) + " common frames omitted" else "" end;
        def formatThrowableWithCause: formatThrowable + (if .cause then "\nCaused by: " + (.cause | formatThrowableWithCause) else "" end)
      defaultTemplate: Spring
      templates:
        - name: Generic
          loglevel: '.s // .level // ."log.level"'
          datetime: '."@timestamp" // .t."$date" // .timestamp // .timeMillis'
          message: '.message // "\(.msg) \(.attr)"'
        - name: Spring
          loglevel: '.level // ."log.level" | color("[" + . + "]" | lpad(7); .)'
          datetime: '."@timestamp" // dateTimeFromTsNs(.timestamp//.timeMillis; .nanoseconds) // $k8sts'
          message: '. as $j | .message | color(insertSpringArgs($j.arguments); $j.level // $j."log.level")'
        - name: Spring + Stacktraces
          loglevel: '.level // ."log.level" | color("[" + . + "]" | lpad(7); .)'
          datetime: '."@timestamp" // dateTimeFromTsNs(.timestamp//.timeMillis; .nanoseconds)'
          message: '. as $j | .message | color(insertSpringArgs($j.arguments); $j.level // $j."log.level") +
            (if $j.throwable then "\n" + ($j.throwable | formatThrowableWithCause) else "" end)'
        - name: Mongo
          loglevel: '.s | color("[" + . + "]" | lpad(4); .)'
          datetime: '."@timestamp" // .t."$date" // dateTimeFromTsNs(.timestamp//.timeMillis; .nanoseconds)'
          message: '. as $j | "\(.msg) \(.attr)" | color(. ;$j.s // $j.level // $j."log.level")'

Screenshots

Classic log view

image

JSON decode turned on

image

Edit/select templates

image

@derailed
Copy link
Owner

derailed commented Dec 8, 2024

@r-oldenburg Thank you for this update Roland! I like the concept but it would require a lot of TLC to map out the various logging strategies to pull the required fields. That imho should live outside the main k9s config. Similarly to k9s plugins this would give us a way to quickly create and ingest contributions.
I am not super keen on using a dialog to edit the configs as a full fledge editor would be best suited for this task.
Lastly, the elephant in the room... having additional overhead to process each log line could be a deal killer.
It would be great to understand what is the cost of the overhead to colorize json via benchmarks.
Does this make sense or am I missing it?

@derailed derailed added enhancement New feature or request needs-tlc Pr needs additional updates labels Dec 8, 2024
@r-oldenburg
Copy link
Author

r-oldenburg commented Dec 9, 2024

Hey there @derailed,
this is great feedback! Thank you.
And thank you also for at least considering my addition.

Sorry for this post being so long. I hope you find the time to read it... :-D

TL;DR: You are absolutely right. Only the aspect performance may not really an issue. But hey, maybe I am missing something.

Now the long version:
I must admit, I don't know what you mean with "TLC", but I assume it means test driven refactoring or something?

If I get you right, your concerns are

  1. too many logging specifics in the main config
  2. using a dialog for editing the logging config
  3. performance overhead for the processing

Please let me elaborate on them.

First some general thoughts:

  • The functionality has been growing while I was trying to parse and read our own application logs
  • My addition is not about colorizing json. Instead it is about parsing (and somewhat compacting) json log messages
  • Colorizing can be achieved along the way (as I am doing it with bash color codes), but it is not the main goal
  • The main idea is to simplify log entries to the most compact and comprehensive text form leaving out irrelevant or "less important" info
  • Ever so often there is a line that is not json, then it is simply passed through
  • Json logs are most often very long and hard to grasp, even with coloring

Here an example for some log messages from MongoDB:

2024-12-09T07:39:10.887131339Z {"t":{"$date":"2024-12-09T08:39:10.887+01:00"},"s":"I",  "c":"REPL",     "id":5123008, "ctx":"main","msg":"Successfully registered PrimaryOnlyService","attr":{"service":"ShardSplitDonorService","namespace":"config.tenantSplitDonors"}}
2024-12-09T07:41:20.301194135Z {"t":{"$date":"2024-12-09T08:41:20.300+01:00"},"s":"I",  "c":"NETWORK",  "id":51800,   "ctx":"conn180","msg":"client metadata","attr":{"remote":"10.1.19.49:37652","client":"conn180","negotiatedCompressors":[],"doc":{"application":{"name":"mongosh 2.2.6"},"driver":{"name":"nodejs|mongosh","version":"6.6.2|2.2.6"},"platform":"Node.js v20.12.2, LE","os":{"name":"linux","architecture":"arm64","version":"4.14.209-160.339.amzn2.aarch64","type":"Linux"},"env":{"container":{"runtime":"docker","orchestrator":"kubernetes"}}}}}

Now in most cases you don't need all the fields. What you actually need would most likely be this:

2024-12-09T08:39:10.887 INFO Successfully registered PrimaryOnlyService
2024-12-09T08:41:20.299 INFO client metadata <maybe some additional args from json...>

Another example from java/spring:

2024-12-09T07:39:15.296118675Z {"sequenceNumber":0,"timestamp":1733729955242,"nanoseconds":242924133,"level":"INFO","threadName":"main","loggerName":"com.coremedia.springframework.component.ComponentLoader","context":{"name":"default","birthdate":1733729951147,"properties":{}},"mdc": {},"message":"loading bean definitions from root component resource location {} into application context {}","arguments": ["classpath:/META-INF/resources/WEB-INF/applicationContext.xml","org.springframework.boot.web.servlet.context.AnnotationConfigServletWebServerApplicationContext@34f7234e, started on Thu Jan 01 01:00:00 CET 1970"],"throwable":null}
2024-12-09T07:39:24.274689929Z {"sequenceNumber":0,"timestamp":1733729964274,"nanoseconds":274549220,"level":"INFO","threadName":"main","loggerName":"com.coremedia.springframework.component.cae.CaeComponentAutoConfiguration$LegacyCookieProcessorConfiguration","context":{"name":"default","birthdate":1733729951147,"properties":{}},"mdc": {},"message":"Configuring cookie processor of type '{}'.","arguments": ["class org.apache.tomcat.util.http.Rfc6265CookieProcessor"[],"throwable":null}

Here the arguments in the message are even given in a separate list property (arguments). Which makes it even harder to read and distinguish the log messages. Timestamp is just a number, and so on...

You actually would want to see this (leading datetime taken from the timestamp field):

2024-12-09T08:39:15.242924133Z  [INFO] loading bean definitions from root component resource location classpath:/META-INF/resources/WEB-INF/applicationContext.xml into application context org.springframework.boot.web.servlet.context.AnnotationConfigServletWebServerApplicationContext@34f7234e, started on Thu Jan 01 01:00:00 CET 1970
2024-12-09T08:39:24.274549220Z  [INFO] Configuring cookie processor of type 'class org.apache.tomcat.util.http.Rfc6265CookieProcessor'.

The common ground for all logs usually is: Time, Log-Level and the Message
This is the main rationale behind my code: You should be able to define JQ expressions that extract these 3 fields from your specific json logs. The message field of course can be constructed from several json fields. JQ allows nearly everything in this regard.

That way you can achieve to read the log in a regular "logfile style" while still having JSON logging turned on. For our k8s ENVs this turned out to be very(!) helpful.

Now about your (perfectly reasonable) concerns (in reverse order):

  • 3: Performance: For starters: No one has to turn it on.
    There is this "decodeJson" config flag that sets the initial state for log view. When you leave it to default/false it is turned off by default and there is (obviously) no impact. So anyone who doesn't need it, would simply leave it off.
    When turned on the parsing itself of course has a performance impact. How could it not. (Still I must say that I could not really see any difference in the behaviour on my Macbook. It just works. I am using buffer: 50000 and tail: 7000 right now). Clearly not everyone will have a fast machine and it may slow things down. True.
    But: If one turns it on, he probably needs the functionality (as do I). And that outweighs any short delay by far as it drastically improves readability. As a shortcut for non-json lines we could introduce a check for a leading "{" or "[" which would denote valid json.
  • 2: Dialog: You are right. That may not be the best way. It evolved simply out of the need for me to test and adjust the expressions during runtime. It turned out to be quite annoying to always restart k9s in order to just test a slightly different expression. To be honest in my eyes a "full fledge editor" may be a little over-the-top. Currently you can select one of the templates, adjust them and see the effect on the log in realtime (even with parsing errors when debug = true).
  • 1: Too much in main config: You are absolutely right. I guess especially the expressions/templates should be moved to a dedicated config file as they are very specific for the given environments / applications / frameworks.

I would be happy to adjust the code according to your guidance.
Let me know what you think! :-)

@Nahuel92
Copy link

I'm glad to find that I'm not the only one struggling with structured logging as well. Hope this can be merged and released soon 🚀

@r-oldenburg
Copy link
Author

Hey @derailed,
I guess you are very busy... Do you have any thoughts on my last post that you want to share?
Would be great to get some direction what to improve and where.
Thanks again

@r-oldenburg r-oldenburg force-pushed the telekom-mms/jq-for-log-view branch 2 times, most recently from 3ea3434 to 99b617d Compare March 24, 2025 08:25
@r-oldenburg
Copy link
Author

Hey @derailed,
I now extracted the "config" part into another file (json.yaml) even with a default template.
Is there anything else I can do?

@r-oldenburg
Copy link
Author

And @derailed an important follow-up after re-reading your feedback:
This PR is NOT about coloring JSON output (although that is possible).
Instead it is about parsing JSON and making those lines readable again. This is being done by selecting fields, converting, merging and interpreting field values. You see the difference when you look at the screenshots (which show the very same log).

@r-oldenburg r-oldenburg force-pushed the telekom-mms/jq-for-log-view branch from 99b617d to feffd1b Compare April 15, 2025 06:47
Copy link

This PR is stale because it has been open for 30 days with no activity.

@github-actions github-actions bot added the stale label May 17, 2025
@r-oldenburg
Copy link
Author

r-oldenburg commented May 21, 2025

JSON structured logging will (very likely) increase in adaption and the demand for readable live logs will grow...

@derailed closed the related ticket and this PR is getting stale.
Frankly that is a little disappointing to me. :-(

I can perfectly understand when time is short, but maybe @derailed could at least give a feedback on how this topic will be handled in future. Even a "not to be considered in near future" would help all interested parties to plan ahead.

@r-oldenburg
Copy link
Author

P.S.: For all those interested: I regularly do share binary builds of my fork here: https://github.com/r-oldenburg/k9s/releases

@r-oldenburg r-oldenburg force-pushed the telekom-mms/jq-for-log-view branch from 110ec45 to 04c0229 Compare May 21, 2025 08:36
@github-actions github-actions bot removed the stale label May 22, 2025
Copy link

This PR is stale because it has been open for 30 days with no activity.

@github-actions github-actions bot added the stale label Jun 26, 2025
@github-actions github-actions bot closed this Jul 4, 2025
@r-oldenburg
Copy link
Author

@derailed I still want to ask for your feedback on this topic... And I am still wondering what the actual problem is.
Could you at least respond and give some feedback on how this will be handled in future?

Up to now the only response was closing the ticket and the PR without any further explanation. :-(

@guillaume86
Copy link

That's sad to see, especially considering @r-oldenburg politeness. Thx for your work...

@kwstannard
Copy link

I wouldn't mind something like this, but it might be less maintenance and more flexible to specify a shell program that takes a log line as stdin and outputs what is to be displayed.

@pshchelo
Copy link

pshchelo commented Sep 8, 2025

I second last comment, providing possibility for k9s to pipe logs through some external program before displaying them would be a nice boon - everyone has their favorite log parsers/highlighters/colorizers (mine is currently tailspin), so this way it is generic enough for k9s and potentially suits everyone.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request needs-tlc Pr needs additional updates stale
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants