#log-file — Public Fediverse posts
Live and recent posts from across the Fediverse tagged #log-file, aggregated by home.social.
-
Have you noticed the user agent ‘Mozilla/5.0 (X11; Linux x86_64; rv:109.0) Gecko/20100101 Firefox/118.0’ in your log files? Thousands of such requests come in from 25 servers at a German provider. They look like search queries on the website as if using a form, but all quite pointless.
This has been going on for months. It doesn't really burden the server, but is a waste of resources. I don't have a good explanation. An AI bot?
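One way to size up traffic like this is to count, per client IP, how many requests carry exactly that user agent. The following is only a minimal sketch: it assumes the server writes the Combined Log Format with the user agent as the last double-quoted field, and the sample lines and IP addresses are made up for illustration.

```python
from collections import Counter

# The user agent string quoted in the post.
SUSPECT_UA = (
    "Mozilla/5.0 (X11; Linux x86_64; rv:109.0) "
    "Gecko/20100101 Firefox/118.0"
)


def user_agent_of(line):
    """Return the last double-quoted field of a log line, or None."""
    parts = line.rstrip("\n").rsplit('"', 2)
    return parts[1] if len(parts) == 3 else None


def count_ips_for_agent(lines, user_agent=SUSPECT_UA):
    """Count requests per client IP (first field) for one exact user agent."""
    hits = Counter()
    for line in lines:
        if user_agent_of(line) == user_agent:
            hits[line.split(" ", 1)[0]] += 1
    return hits


# Invented sample lines using documentation IP ranges.
SAMPLE = [
    f'203.0.113.7 - - [01/Mar/2024:10:00:00 +0000] "GET /?s=a HTTP/1.1" 200 512 "-" "{SUSPECT_UA}"',
    f'203.0.113.7 - - [01/Mar/2024:10:00:01 +0000] "GET /?s=b HTTP/1.1" 200 498 "-" "{SUSPECT_UA}"',
    f'203.0.113.8 - - [01/Mar/2024:10:00:02 +0000] "GET /?s=c HTTP/1.1" 200 505 "-" "{SUSPECT_UA}"',
    '198.51.100.4 - - [01/Mar/2024:10:00:03 +0000] "GET / HTTP/1.1" 200 900 "-" "curl/8.5.0"',
]

for ip, total in count_ips_for_agent(SAMPLE).most_common():
    print(ip, total)
```

The split on the last two quote characters breaks if a user agent itself contains an escaped quote, so a real log may need a stricter parser.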
-
Have you also noticed the user agent "Mozilla/5.0 (X11; Linux x86_64; rv:109.0) Gecko/20100101 Firefox/118.0" in your log files? Thousands of such requests come in from 25 servers at a German provider. All quite pointless. Every time, all resources of the respective web page are loaded.
This has been going on for months. It doesn't really burden the server, but is ultimately a waste. I don't have a good explanation. An AI bot?
-
It's nothing new that (AI) bots 🤖 roam around the open data portal. But today I came across a particularly strange case that I'd like to report on.
-
I've already done a webhook similar to #discord's and will also implement #logfile #monitoring to replace #Fail2Ban.
-
I came across the #Fail2Ban #docker image from #linuxserverio and thought to myself that it's finally time to set up Fail2Ban. I admit i never used it before and it was a bit difficult to add it to my #playbook as all of my #servers have different services and therefore different #logfile paths, but that's nothing #jinja #templating can't fix.
Now that i've got #Discord notifications for banned #IPs, it's time to work on actual #IPblocking. I also want to use the #IPComplaint and #AbuseIPDB actions as i really like the idea of reporting abuse (even though i have no idea how effective that may be).
I may also want to replace the discord #webhook with #email notifications later as that's mostly the reason i've set up a #mailserver ( #stalwart ) in the first place.
I mean, most of my services are only accessible from #tailscale or my #homenetwork, but since #Ansible makes it so much easier to apply higher standards, i just can't resist. My #homelab is changing every day and i think setting up additional #security, even though i don't need it yet, is never a bad idea.
#networking #badactors #firewall #automation #linux #selfhosting #homeserver
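The idea of one template filled with per-server log paths can be sketched roughly as follows. This is not the poster's playbook: it uses Python's `string.Template` as a stand-in for Jinja, and the host names, jail choices, and paths are invented for illustration. Only `enabled` and `logpath` are taken from Fail2Ban's jail options.

```python
from string import Template

# Stand-in for a Jinja template: one Fail2Ban jail stanza with a per-host log path.
JAIL_TEMPLATE = Template("[$jail]\nenabled = true\nlogpath = $logpath\n")

# Hypothetical inventory; host names, jails, and paths are made up.
HOSTS = {
    "web01": {"jail": "nginx-http-auth", "logpath": "/var/log/nginx/error.log"},
    "ssh01": {"jail": "sshd", "logpath": "/var/log/auth.log"},
}


def render_jails(hosts):
    """Render one jail stanza per host from the shared template."""
    return {name: JAIL_TEMPLATE.substitute(values) for name, values in hosts.items()}


rendered = render_jails(HOSTS)
print(rendered["web01"])
```

In an Ansible setup the same mapping would typically live in host or group variables, with the template module doing the rendering.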
-
lnav - a terminal log file viewer that's a must-see for anyone who doesn't want to wear themselves out with "tail -f"
https://softantenna.com/blog/lnav/
-
Day 51 of #100DaysOfCode
Visualize Google bot visits through log file data (can easily be adapted to all other bots).
🔵 Requests by device type
🔵 Requests to first and second directories of site
🔵 Status code, time, response size
Get the interactive HTML chart here:
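Independently of the linked chart, the kind of aggregation behind such a view can be sketched with the standard library alone. The sketch assumes Combined Log Format lines and treats any user agent containing the token `Googlebot` as a Google bot visit (no IP verification); the function name and sample lines are made up for illustration.

```python
import re
from collections import Counter

# Combined Log Format: ip ident user [time] "METHOD path proto" status size "referer" "agent"
LINE_RE = re.compile(
    r'^\S+ \S+ \S+ \[[^\]]+\] "\S+ (?P<path>\S+)[^"]*" '
    r'(?P<status>\d{3}) \S+ "[^"]*" "(?P<ua>[^"]*)"'
)


def summarize_bot(lines, bot_token="Googlebot"):
    """Count status codes and first path segments for one bot's requests."""
    by_status = Counter()
    by_first_segment = Counter()
    for line in lines:
        match = LINE_RE.match(line)
        if not match or bot_token not in match.group("ua"):
            continue
        by_status[match.group("status")] += 1
        # First path segment, e.g. "/blog/a" -> "blog"; the root "/" stays "/".
        segment = match.group("path").split("?")[0].strip("/").split("/")[0]
        by_first_segment[segment or "/"] += 1
    return by_status, by_first_segment


GOOGLEBOT = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
SAMPLE = [
    f'66.249.66.1 - - [01/Mar/2024:10:00:00 +0000] "GET /blog/a HTTP/1.1" 200 1000 "-" "{GOOGLEBOT}"',
    f'66.249.66.1 - - [01/Mar/2024:10:00:01 +0000] "GET /blog/b HTTP/1.1" 404 200 "-" "{GOOGLEBOT}"',
    '198.51.100.9 - - [01/Mar/2024:10:00:02 +0000] "GET /shop/x HTTP/1.1" 200 300 "-" "Mozilla/5.0 (X11; Linux x86_64)"',
    f'66.249.66.1 - - [01/Mar/2024:10:00:03 +0000] "GET / HTTP/1.1" 200 50 "-" "{GOOGLEBOT}"',
]

status_counts, segment_counts = summarize_bot(SAMPLE)
print(status_counts, segment_counts)
```

Passing a different `bot_token` adapts the same counting to other crawlers.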
-
Day 41 of #100DaysOfCode:
Tried to parse a log file with 4.6 Billion lines until it crashed my computer. There should be a better way to do that.
I settled for 100 Million lines. (16 minutes, and one line of code)
You can get started here:
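As a general illustration of the "better way" (this is not the one-liner from the post), a very large log can be aggregated line by line so that memory use does not grow with the file. The sketch assumes Common or Combined Log Format lines; the function names are hypothetical.

```python
from collections import Counter


def count_status_codes(lines):
    """Count the status field of access-log lines, consuming one line at a time."""
    counts = Counter()
    for line in lines:
        try:
            # The status code is the first token after the closing quote of the request.
            status = line.split('" ', 1)[1].split(" ", 1)[0]
        except IndexError:
            continue  # skip lines that do not look like access-log entries
        counts[status] += 1
    return counts


def count_file(path):
    """Stream a file from disk; iterating the handle reads it lazily."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        return count_status_codes(handle)


# Demo with a generator, so the 1,000 synthetic lines are never held in memory at once.
demo = count_status_codes(
    f'192.0.2.1 - - [01/Mar/2024:10:00:00 +0000] "GET /p{i} HTTP/1.1" '
    f'{500 if i % 10 == 0 else 200} 123'
    for i in range(1000)
)
print(demo)
```

Only the running counts are kept, so the same function works whether the input has a thousand lines or billions; the run time still grows with the input.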
-
Day 14 of #100DaysOfCode:
Created a tutorial on analyzing millions of URLs:
🔵 2.4M URLs from a web server log file
🔵 Splitting into their components creates a 5.7GB (giga) DataFrame
🔵 Using the new output_file parameter saves the same data in a 67MB (mega) file
🔵 Read only the columns you want, while filtering for a subset of rows
🔵 Enjoy!
Notebook and video:
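To give a rough idea of what "splitting into their components" means, here is a small standard-library sketch. It is not the tutorial's code and does not use the library from the post; the function name and the example URL are made up for illustration.

```python
from urllib.parse import parse_qs, urlsplit


def url_components(url):
    """Split one URL into scheme, host, path, directories, query parameters, and fragment."""
    parts = urlsplit(url)
    return {
        "scheme": parts.scheme,
        "netloc": parts.netloc,
        "path": parts.path,
        # Non-empty path segments, in order.
        "dirs": [segment for segment in parts.path.split("/") if segment],
        # parse_qs maps each parameter name to a list of its values.
        "query": parse_qs(parts.query),
        "fragment": parts.fragment,
    }


example = url_components("https://example.com/blog/2024/post?utm_source=feed&page=2#top")
print(example)
```

Each URL turns into many fields, which is why a table of split URLs can take up far more memory than the raw strings.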
-
Day 8 of #100DaysOfCode:
Added the option to specify custom date formats for log files:
🔵 advertools.logs_to_df will attempt to convert datetime columns to a datetime type according to default formats
🔵 Supply your own date format if your logs have a different one (or if you decide to change it)
🔵 Date format will be using the strftime format spec
🔵 Coming to adv v.0.15.0
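For readers unfamiliar with the strftime directives, this standard-library sketch shows how a format string turns a log timestamp into a datetime. It does not call the library from the post; `parse_log_time` is a hypothetical helper, and the default shown is the timestamp layout commonly found in Apache and nginx access logs.

```python
from datetime import datetime, timezone

# Typical access-log timestamp layout, e.g. 26/Feb/2024:12:00:01 +0000
APACHE_FORMAT = "%d/%b/%Y:%H:%M:%S %z"


def parse_log_time(text, date_format=APACHE_FORMAT):
    """Parse a timestamp with the default format, or with a caller-supplied one."""
    return datetime.strptime(text, date_format)


default_parsed = parse_log_time("26/Feb/2024:12:00:01 +0000")
custom_parsed = parse_log_time("2024-02-26 12:00:01", "%Y-%m-%d %H:%M:%S")
print(default_parsed.isoformat(), custom_parsed.isoformat())
```

Note that `%b` matches month abbreviations in the current locale, so English month names are assumed here.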
-
GoAccess; lnav; agrind
We’ve all got ’em, and most of us dread having to look at them, as it usually means something’s gone awry.
Yep. We’re talking about log files. Dozens. Hundreds. Thousands…of log files. Of every shape, sort, and size.
Today we present three resources (one from “back in my day”, that is still cranking through text files, today) to help you get a handle on what your logs might be saying to you. Though, if you actually hear them saying something (and, you’re not using a screen reader), you have far more issues than what may lie in those files.
There should be something for any developers, system administrators, and data crunchers who regularly work with log files to troubleshoot issues, monitor systems, or analyze application behavior.
GoAccess
GoAccess (GH) is, for me, a blast from the past. It’s a pretty spiffy log file analyzer that offers a real-time, terminal-based, and web-based interface for monitoring web server statistics. It’s designed to be fast, and with it, you can parse virtually any web log format, including — but not limited to — Common Log Format (CLF), Combined Log Format (XLF/ELF), W3C format (IIS), and Amazon CloudFront (Download Distribution). This flexibility means we can analyze logs from a wide variety of sources without the need for extensive configuration or setup.
One neat feature of GoAccess is its ability to generate real-time, interactive reports that can be viewed in a web browser. This is achieved through its own websocket server, which pushes the latest data to the browser, allowing users to see up-to-the-minute information about their web traffic. This real-time analysis is particularly useful for quickly diagnosing issues or understanding traffic patterns as they happen.
GoAccess also supports incremental log processing. This means that it can process logs in chunks, keep track of what it has already analyzed, and then continue from where it left off. This feature is handy when analyzing large log files or for continuous monitoring over long periods. The tool can also output its data in various formats, including HTML, JSON, and CSV, providing flexibility in how the analyzed data is consumed and shared.
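The general idea of incremental processing (remember how far you got, then resume from there) can be sketched in a few lines. This is not how GoAccess implements it; the function name, the JSON state file, and the demo file names are all assumptions made for illustration.

```python
import json
import os
import tempfile


def read_new_lines(log_path, state_path):
    """Return lines appended to log_path since the last call, remembering the byte offset."""
    offset = 0
    if os.path.exists(state_path):
        with open(state_path, encoding="utf-8") as state:
            offset = json.load(state)["offset"]
    if offset > os.path.getsize(log_path):
        offset = 0  # the file shrank (for example after rotation): start over
    # Binary mode keeps seek()/tell() positions as exact byte offsets.
    with open(log_path, "rb") as log:
        log.seek(offset)
        new_lines = [raw.decode("utf-8", errors="replace").rstrip("\n") for raw in log]
        offset = log.tell()
    with open(state_path, "w", encoding="utf-8") as state:
        json.dump({"offset": offset}, state)
    return new_lines


# Demo: two passes over a growing file only see each line once.
with tempfile.TemporaryDirectory() as tmp:
    demo_log = os.path.join(tmp, "access.log")
    demo_state = os.path.join(tmp, "offset.json")
    with open(demo_log, "wb") as handle:
        handle.write(b"first\nsecond\n")
    first_batch = read_new_lines(demo_log, demo_state)
    with open(demo_log, "ab") as handle:
        handle.write(b"third\n")
    second_batch = read_new_lines(demo_log, demo_state)

print(first_batch, second_batch)
```

A production version would also need to handle a partially written last line and detect rotation more reliably than by file size alone.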
lnav
The Logfile Navigator, lnav (GH), is an “enhanced log file viewer that takes advantage of any semantic information that can be gleaned from the files being viewed, such as timestamps and log levels. Using this extra semantic information, lnav can do things like interleaving messages from different files, generate histograms of messages over time, and providing hotkeys for navigating through the file.” This terminal-based application also lets us merge, tail, search, filter, and query log files with ease. There’s no server to set up, no complicated configuration; just point it to a directory, and it takes care of the rest. The section header image shows it slurping up all my access logs.
It has direct knowledge of three particular, and one generic, log sources:
access_log: Apache common access log format
syslog_log: Syslog format
strace_log: Strace log format
generic_log: ‘Generic’ log format. This table contains messages from files that have a very simple format with a leading timestamp followed by the message.
The tool also has support for performing SQL queries on log files using the SQLite3 “virtual” table feature. For all supported log file types, lnav will create tables that can be queried using the subset of SQL that is supported by SQLite3. For example, to get the top ten URLs being accessed in any loaded Apache log files, we can execute:
;SELECT cs_uri_stem, count(*) AS total FROM access_log GROUP BY cs_uri_stem ORDER BY total DESC LIMIT 10;
Here’s the sad result on mine:
I really dislike staring at linux journalctl logs, but with journalctl | lnav they become way easier to triage.
Honestly, there’s so much packed into this tool that you really just have to try it out, which you can do without installing anything! Just do ssh [email protected] in a terminal and follow along with the tutorial.
Make sure to keep the extensive documentation link handy.
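The top-ten query from this section is plain SQLite SQL, so its shape can be tried outside lnav as well. The sketch below uses Python's `sqlite3` module with an ordinary in-memory table standing in for lnav's virtual `access_log` table; the rows are made up, and only the `cs_uri_stem` column from the query is modeled.

```python
import sqlite3

# Made-up request paths standing in for parsed access-log rows.
ROWS = [
    ("/index.html",), ("/about",), ("/index.html",),
    ("/feed",), ("/index.html",), ("/feed",),
]

conn = sqlite3.connect(":memory:")
# An ordinary table that imitates the one column the query needs.
conn.execute("CREATE TABLE access_log (cs_uri_stem TEXT)")
conn.executemany("INSERT INTO access_log VALUES (?)", ROWS)

top_urls = conn.execute(
    "SELECT cs_uri_stem, count(*) AS total FROM access_log "
    "GROUP BY cs_uri_stem ORDER BY total DESC LIMIT 10"
).fetchall()
conn.close()

for stem, total in top_urls:
    print(stem, total)
```

Inside lnav no table setup is needed, since the tool exposes the loaded log files as tables itself.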
agrind
NOTE: the proper name of this tool is angle-grinder.
I cannot find better words than the author’s intro to the tool so here that is:
The [Rust-based] ag utility lets us “parse, aggregate, sum, average, min/max, percentile, and sort [our] data. [We] can see it, live-updating, in [our] terminal[s]. [It’s] designed for when, for whatever reason, [we] don’t have [our] data in graphite/ honeycomb/ kibana/ sumologic/ splunk/ etc. but still want to be able to do sophisticated analytics”.
“It can process well above 1M rows per second (simple pipelines as high as 5M), so it’s usable for fairly meaty aggregation. The results will live update in your terminal as data is processed. [What’s more, ag bundles a] bare bones functional programming language coupled with a pretty terminal UI.”
The basic premise is similar to that of jq: you feed it lines of text and filter + perform operations on them in a script you fit between quotes:
$ agrind '<filter1> [... <filterN>] | operator1 | operator2 | operator3 | ...'
Examples speak louder than templates.
I have to admit it was rather fun watching it live-update the counts of HTTP status codes across 53 (~220MB) of my rud.is main web server access logs:
$ time cat rud.is.access.log*| agrind '* | apache | count by status'
status        _count
----------------------------
200           909037
301           98643
202           41206
304           39099
404           34667
206           8425
302           4615
403           3958
499           3427
405           1333
101           749
204           508
502           370
503           239
400           106
201           6
500           4
409           1
8.71s user 0.82s system 156% cpu 6.073 total
It supports defining fields as named capture groups in regular expressions, which is pretty cool. For example, we can pick out the timestamp and path from all the GET requests in this synthetic Go Gin app log:
2024-02-26T12:00:01Z | INFO | 200 | 90ms | 192.168.1.1 | GET /api/v1/users
2024-02-26T12:00:02Z | INFO | 201 | 45ms | 192.168.1.2 | POST /api/v1/users
2024-02-26T12:00:03Z | INFO | 404 | 10ms | 192.168.1.3 | GET /api/v1/unknown
2024-02-26T12:00:04Z | INFO | 500 | 120ms | 192.168.1.4 | PUT /api/v1/users/123
2024-02-26T12:00:05Z | INFO | 200 | 78ms | 192.168.1.5 | GET /api/v1/posts
2024-02-26T12:00:06Z | INFO | 403 | 12ms | 192.168.1.6 | DELETE /api/v1/users/123
2024-02-26T12:00:07Z | INFO | 200 | 65ms | 192.168.1.7 | GET /api/v1/comments
2024-02-26T12:00:08Z | INFO | 422 | 47ms | 192.168.1.8 | POST /api/v1/posts
2024-02-26T12:00:09Z | INFO | 200 | 89ms | 192.168.1.9 | GET /api/v1/users/123/posts
2024-02-26T12:00:10Z | INFO | 204 | 15ms | 192.168.1.10 | DELETE /api/v1/posts/123
via:
$ cat gin | agrind '"GET " | parse regex "^(?P<ts>[^|]+).*GET (?P<path>.*)"'
[path=/api/v1/users]        [ts=2024-02-26T12:00:01Z]
[path=/api/v1/unknown]        [ts=2024-02-26T12:00:03Z]
[path=/api/v1/posts]        [ts=2024-02-26T12:00:05Z]
[path=/api/v1/comments]        [ts=2024-02-26T12:00:07Z]
[path=/api/v1/users/123/posts]        [ts=2024-02-26T12:00:09Z]
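The `(?P<name>...)` named-group syntax is shared with Python's `re` module, so the same pattern can be tried there. This is only a sketch of the named-capture idea, not of how angle-grinder works internally; the function name is hypothetical and the lines are the first three from the synthetic log above.

```python
import re

# Same pattern as in the agrind command: "ts" is everything before the first "|",
# "path" is everything after "GET ".
PATTERN = re.compile(r"^(?P<ts>[^|]+).*GET (?P<path>.*)")

LOG_LINES = [
    "2024-02-26T12:00:01Z | INFO | 200 | 90ms | 192.168.1.1 | GET /api/v1/users",
    "2024-02-26T12:00:02Z | INFO | 201 | 45ms | 192.168.1.2 | POST /api/v1/users",
    "2024-02-26T12:00:03Z | INFO | 404 | 10ms | 192.168.1.3 | GET /api/v1/unknown",
]


def extract_get_requests(lines):
    """Keep lines containing "GET " and return their named groups as dicts."""
    results = []
    for line in lines:
        if "GET " not in line:  # plays the role of the "GET " filter
            continue
        match = PATTERN.search(line)
        if match:
            # The raw "ts" group ends with the space before the first "|", so trim it.
            results.append({"ts": match.group("ts").strip(), "path": match.group("path")})
    return results


get_requests = extract_get_requests(LOG_LINES)
for item in get_requests:
    print(item)
```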
This example is from the README, but it shows how much nicer it is to do the processing in ag vs jq:
curl https://api.github.com/repos/rcoh/angle-grinder/releases | \
  jq '.[] | .assets | .[]' -c | \
  agrind '* | json | parse "download/*/" from browser_download_url as version | sum(download_count) by version | sort by version desc'
version       _sum
-----------------------
v0.6.2        0
v0.6.1        4
v0.6.0        5
v0.5.1        0
v0.5.0        4
v0.4.0        0
v0.3.3        0
v0.3.2        2
v0.3.1        9
v0.3.0        7
v0.2.1        0
v0.2.0        1
There are plenty more examples in the repo, and the author has a pretty cool blog post on the Rust journey taken to build the tool.
FIN
Remember, you can follow and interact with the full text of The Daily Drop’s free posts on Mastodon via @[email protected] ☮️
-
GoAccess; lnav; agrind
We’ve all got’em, and most of us dread needing to have to look at them, as it usually means something’s gone awry.
Yep. We’re talking about log files. Dozens. Hundreds. Thousands…of log files. Of every shape, sort, and size.
Today we present three resources (one from “back in my day”, that is still cranking through text files, today) to help you get a handle on what your logs might be saying to you. Though, if you actually hear them saying something (and, you’re not using a screen reader), you have far more issues that what may lie in those files.
There should be something for any developers, system administrators, and data crunchers who regularly work with log files to troubleshoot issues, monitor systems, or analyze application behavior.
GoAccess
GoAccess (GH) is, for me, a blast from the past. It’s a pretty spiffy log file analyzer that offers a real-time, terminal-based, and web-based interface for monitoring web server statistics. It’s designed to be a fast, and with it, you can parse virtually any web log format, including — but not limited to — Common Log Format (CLF), Combined Log Format (XLF/ELF), W3C format (IIS), and Amazon CloudFront (Download Distribution). This flexibility means we can analyze logs from a wide variety of sources without the need for extensive configuration or setup.
One neat feature of GoAccess is its ability to generate real-time, interactive reports that can be viewed in a web browser. This is achieved through its own websocket server, which pushes the latest data to the browser, allowing users to see up-to-the-minute information about their web traffic. This real-time analysis is particularly useful for quickly diagnosing issues or understanding traffic patterns as they happen.
GoAccess also supports incremental log processing. This means that it can process logs in chunks, keep track of what it has already analyzed, and then continue from where it left off. This feature is handy when analyzing large log files or for continuous monitoring over long periods. The tool can also output its data in various formats, including HTML, JSON, and CSV, providing flexibility in how the analyzed data is consumed and shared.
lnav
The Logfile Navigator, lnav (GH), is an “enhanced log file viewer that takes advantage of any semantic information that can be gleaned from the files being viewed, such as timestamps and log levels. Using this extra semantic information, lnav can do things like interleaving messages from different files, generate histograms of messages over time, and providing hotkeys for navigating through the file.” This terminal-based application also lets us merge, tail, search, filter, and query log files with ease. There’s no server to set up, no complicated configuration; just point it to a directory, and it takes care of the rest. The section header is it slupring up all my access logs.
It has direct knowledge of three particular, and one generic, log sources:
access_log: Apache common access log formatsyslog_log: Syslog formatstrace_log: Strace log formatgeneric_log: ‘Generic’ log format. This table contains messages from files that have a very simple format with a leading timestamp followed by the message.
The tool also has support for performing SQL queries on log files using the SQLite3 “virtual” table feature. For all supported log file types, lnav will create tables that can be queried using the subset of SQL that is supported by SQLite3. For example, to get the top ten URLs being accessed in any loaded Apache log files, we can execute:
;SELECT cs_uri_stem, count(*) AS total FROM access_log GROUP BY cs_uri_stem ORDER BY total DESC LIMIT 10;
Here’s the sad result on mine:
I really dislike staring at linux
journalctllogs, but withjournalctl | lnavthey become way easier to triage.Honestly, there’s so much packed into this tool that you really just have to try it out, which you can do without installing anything! Just do
ssh [email protected]in a terminal and follow along with the tutorial.Make sure to keep the extensive documentation link handy.
agrind
Photo by Maria Orlova on Pexels.comNOTE: the proper name of this tool is angle-grinder.
I cannot find better words than the author’s intro to the tool so here that is:
The [Rust-based]
agutility lets us “parse, aggregate, sum, average, min/max, percentile, and sort [our] data. [We] can see it, live-updating, in [our] terminal[s]. [It’s] designed for when, for whatever reason, [we] don’t have [our] data in graphite/ honeycomb/ kibana/ sumologic/ splunk/ etc. but still want to be able to do sophisticated analytics”.“It can process well above 1M rows per second (simple pipelines as high as 5M), so it’s usable for fairly meaty aggregation. The results will live update in your terminal as data is processed. [What’s more,
agbundles a] bare bones functional programming language coupled with a pretty terminal UI.”The basic premise is similar to that of
jq: you feed it lines of text and filter + perform operations on them in a script you fit between quotes:$ agrind '<filter1> [... <filterN>] | operator1 | operator2 | operator3 | ...'
Examples speak louder than templates.
I have to admit it was rather fun watching it live-update the counts of HTTP status codes across 53 (~220MB) of my
rud.ismain web server access logs$ time cat rud.is.access.log*| agrind '* | apache | count by status'status _count----------------------------200 909037301 98643202 41206304 39099404 34667206 8425302 4615403 3958499 3427405 1333101 749204 508502 370503 239400 106201 6500 4409 18.71s user 0.82s system 156% cpu 6.073 total
It supports defnining fields as named capture groups in regular expressions, which is pretty cool. For example, we can pick out the timestamp and path from all the
GETrequests in this synthetic Go Gin app log:2024-02-26T12:00:01Z | INFO | 200 | 90ms | 192.168.1.1 | GET /api/v1/users2024-02-26T12:00:02Z | INFO | 201 | 45ms | 192.168.1.2 | POST /api/v1/users2024-02-26T12:00:03Z | INFO | 404 | 10ms | 192.168.1.3 | GET /api/v1/unknown2024-02-26T12:00:04Z | INFO | 500 | 120ms | 192.168.1.4 | PUT /api/v1/users/1232024-02-26T12:00:05Z | INFO | 200 | 78ms | 192.168.1.5 | GET /api/v1/posts2024-02-26T12:00:06Z | INFO | 403 | 12ms | 192.168.1.6 | DELETE /api/v1/users/1232024-02-26T12:00:07Z | INFO | 200 | 65ms | 192.168.1.7 | GET /api/v1/comments2024-02-26T12:00:08Z | INFO | 422 | 47ms | 192.168.1.8 | POST /api/v1/posts2024-02-26T12:00:09Z | INFO | 200 | 89ms | 192.168.1.9 | GET /api/v1/users/123/posts2024-02-26T12:00:10Z | INFO | 204 | 15ms | 192.168.1.10 | DELETE /api/v1/posts/123
via:
$ cat gin | agrind '"GET " | parse regex "^(?P<ts>[^|]+).*GET (?P<path>.*)"'[path=/api/v1/users] [ts=2024-02-26T12:00:01Z][path=/api/v1/unknown] [ts=2024-02-26T12:00:03Z][path=/api/v1/posts] [ts=2024-02-26T12:00:05Z][path=/api/v1/comments] [ts=2024-02-26T12:00:07Z][path=/api/v1/users/123/posts] [ts=2024-02-26T12:00:09Z]
This example is from the README, but it shows how much nicer it is to do the processing in
agcsjq:curl https://api.github.com/repos/rcoh/angle-grinder/releases | \ jq '.[] | .assets | .[]' -c | \ agrind '* | json | parse "download/*/" from browser_download_url as version | sum(download_count) by version | sort by version desc'version _sum-----------------------v0.6.2 0v0.6.1 4v0.6.0 5v0.5.1 0v0.5.0 4v0.4.0 0v0.3.3 0v0.3.2 2v0.3.1 9v0.3.0 7v0.2.1 0v0.2.0 1
There are plenty more examples in the repo, and the author has a pretty cool blog post on the Rust journey taken to build the tool.
FIN
Remember, you can follow and interact with the full text of The Daily Drop’s free posts on Mastodon via
@[email protected]☮️ -
GoAccess; lnav; agrind
We’ve all got ’em, and most of us dread having to look at them, as it usually means something’s gone awry.
Yep. We’re talking about log files. Dozens. Hundreds. Thousands…of log files. Of every shape, sort, and size.
Today we present three resources (one from “back in my day” that is still cranking through text files today) to help you get a handle on what your logs might be saying to you. Though, if you actually hear them saying something (and you’re not using a screen reader), you have far more issues than what may lie in those files.
There should be something for any developers, system administrators, and data crunchers who regularly work with log files to troubleshoot issues, monitor systems, or analyze application behavior.
GoAccess
GoAccess (GH) is, for me, a blast from the past. It’s a pretty spiffy log file analyzer that offers both a real-time terminal interface and a web-based one for monitoring web server statistics. It’s designed to be fast, and it can parse virtually any web log format, including, but not limited to, Common Log Format (CLF), Combined Log Format (XLF/ELF), W3C format (IIS), and Amazon CloudFront (Download Distribution). This flexibility means we can analyze logs from a wide variety of sources without the need for extensive configuration or setup.
One neat feature of GoAccess is its ability to generate real-time, interactive reports that can be viewed in a web browser. This is achieved through its own websocket server, which pushes the latest data to the browser, allowing users to see up-to-the-minute information about their web traffic. This real-time analysis is particularly useful for quickly diagnosing issues or understanding traffic patterns as they happen.
GoAccess also supports incremental log processing. This means that it can process logs in chunks, keep track of what it has already analyzed, and then continue from where it left off. This feature is handy when analyzing large log files or for continuous monitoring over long periods. The tool can also output its data in various formats, including HTML, JSON, and CSV, providing flexibility in how the analyzed data is consumed and shared.
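The idea behind incremental processing can be sketched in plain shell. This is not how GoAccess implements it; it is only a minimal illustration, assuming a hypothetical state file that stores how many bytes of the log have already been handled. The function name process_new is made up for this sketch.

```shell
#!/bin/sh
# process_new LOG STATE: print only the bytes appended to LOG since the last call.
# STATE is a hypothetical file that remembers the byte offset already processed.
process_new() {
    log=$1
    state=$2
    offset=0
    if [ -f "$state" ]; then
        offset=$(cat "$state")
    fi
    size=$(wc -c < "$log" | tr -d ' ')
    # If the log shrank (for example after rotation), start over from the beginning.
    if [ "$size" -lt "$offset" ]; then
        offset=0
    fi
    # tail -c +N starts at byte N (1-based); head -c caps the output at the size
    # measured above, so lines written during this call are left for the next one.
    if [ "$size" -gt "$offset" ]; then
        tail -c "+$((offset + 1))" "$log" | head -c "$((size - offset))"
    fi
    printf '%s\n' "$size" > "$state"
}
```

Each call then hands only the new lines to whatever does the analysis, for example `process_new access.log access.offset | wc -l` (both file names are placeholders).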
lnav
The Logfile Navigator, lnav (GH), is an “enhanced log file viewer that takes advantage of any semantic information that can be gleaned from the files being viewed, such as timestamps and log levels. Using this extra semantic information, lnav can do things like interleaving messages from different files, generate histograms of messages over time, and providing hotkeys for navigating through the file.” This terminal-based application also lets us merge, tail, search, filter, and query log files with ease. There’s no server to set up, no complicated configuration; just point it to a directory, and it takes care of the rest. The section header image shows it slurping up all my access logs.
It has direct knowledge of three particular, and one generic, log sources:
access_log: Apache common access log format
syslog_log: Syslog format
strace_log: Strace log format
generic_log: ‘Generic’ log format. This table contains messages from files that have a very simple format with a leading timestamp followed by the message.
The tool also has support for performing SQL queries on log files using the SQLite3 “virtual” table feature. For all supported log file types, lnav will create tables that can be queried using the subset of SQL that is supported by SQLite3. For example, to get the top ten URLs being accessed in any loaded Apache log files, we can execute:
;SELECT cs_uri_stem, count(*) AS total FROM access_log GROUP BY cs_uri_stem ORDER BY total DESC LIMIT 10;
Here’s the sad result on mine: [screenshot of the query results]
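For a quick cross-check outside lnav, roughly the same tally can be sketched with standard tools. This assumes Apache common or combined format, where the request path is the seventh whitespace-separated field, and it strips the query string to approximate cs_uri_stem. The function name top_urls is invented for illustration.

```shell
#!/bin/sh
# top_urls [N]: read Apache common/combined access-log lines on stdin and
# print the N most requested paths as "count path" (default N=10).
top_urls() {
    awk '{ path = $7; sub(/\?.*/, "", path); hits[path]++ }
         END { for (p in hits) print hits[p], p }' |
        sort -k1,1nr -k2,2 |
        head -n "${1:-10}"
}
```

It would be called as `top_urls 10 < access.log`, where access.log is a placeholder file name.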
I really dislike staring at Linux journalctl logs, but with journalctl | lnav they become way easier to triage.
Honestly, there’s so much packed into this tool that you really just have to try it out, which you can do without installing anything! Just do ssh [email protected] in a terminal and follow along with the tutorial.
Make sure to keep the extensive documentation link handy.
agrind
Photo by Maria Orlova on Pexels.com
NOTE: the proper name of this tool is angle-grinder.
I cannot find better words than the author’s intro to the tool so here that is:
The [Rust-based] ag utility lets us “parse, aggregate, sum, average, min/max, percentile, and sort [our] data. [We] can see it, live-updating, in [our] terminal[s]. [It’s] designed for when, for whatever reason, [we] don’t have [our] data in graphite/ honeycomb/ kibana/ sumologic/ splunk/ etc. but still want to be able to do sophisticated analytics”.
“It can process well above 1M rows per second (simple pipelines as high as 5M), so it’s usable for fairly meaty aggregation. The results will live update in your terminal as data is processed. [What’s more, ag bundles a] bare bones functional programming language coupled with a pretty terminal UI.”
The basic premise is similar to that of jq: you feed it lines of text and filter + perform operations on them in a script you fit between quotes:
    $ agrind '<filter1> [... <filterN>] | operator1 | operator2 | operator3 | ...'
Examples speak louder than templates.
I have to admit it was rather fun watching it live-update the counts of HTTP status codes across 53 (~220MB) of my rud.is main web server access logs:
    $ time cat rud.is.access.log* | agrind '* | apache | count by status'
    status        _count
    ----------------------------
    200           909037
    301           98643
    202           41206
    304           39099
    404           34667
    206           8425
    302           4615
    403           3958
    499           3427
    405           1333
    101           749
    204           508
    502           370
    503           239
    400           106
    201           6
    500           4
    409           1
    8.71s user 0.82s system 156% cpu 6.073 total
It supports defining fields as named capture groups in regular expressions, which is pretty cool. For example, we can pick out the timestamp and path from all the GET requests in this synthetic Go Gin app log:
    2024-02-26T12:00:01Z | INFO | 200 | 90ms | 192.168.1.1 | GET /api/v1/users
    2024-02-26T12:00:02Z | INFO | 201 | 45ms | 192.168.1.2 | POST /api/v1/users
    2024-02-26T12:00:03Z | INFO | 404 | 10ms | 192.168.1.3 | GET /api/v1/unknown
    2024-02-26T12:00:04Z | INFO | 500 | 120ms | 192.168.1.4 | PUT /api/v1/users/123
    2024-02-26T12:00:05Z | INFO | 200 | 78ms | 192.168.1.5 | GET /api/v1/posts
    2024-02-26T12:00:06Z | INFO | 403 | 12ms | 192.168.1.6 | DELETE /api/v1/users/123
    2024-02-26T12:00:07Z | INFO | 200 | 65ms | 192.168.1.7 | GET /api/v1/comments
    2024-02-26T12:00:08Z | INFO | 422 | 47ms | 192.168.1.8 | POST /api/v1/posts
    2024-02-26T12:00:09Z | INFO | 200 | 89ms | 192.168.1.9 | GET /api/v1/users/123/posts
    2024-02-26T12:00:10Z | INFO | 204 | 15ms | 192.168.1.10 | DELETE /api/v1/posts/123
via:
    $ cat gin | agrind '"GET " | parse regex "^(?P<ts>[^|]+).*GET (?P<path>.*)"'
    [path=/api/v1/users] [ts=2024-02-26T12:00:01Z]
    [path=/api/v1/unknown] [ts=2024-02-26T12:00:03Z]
    [path=/api/v1/posts] [ts=2024-02-26T12:00:05Z]
    [path=/api/v1/comments] [ts=2024-02-26T12:00:07Z]
    [path=/api/v1/users/123/posts] [ts=2024-02-26T12:00:09Z]
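To see what the two named groups capture, the same extraction can be approximated with sed. sed has no named groups, so this sketch uses numbered ones instead (\1 for the timestamp, \2 for the path). It is only an illustration of the regex and not how agrind works internally; the function name gin_get_fields is invented here.

```shell
#!/bin/sh
# gin_get_fields: read Gin-style log lines on stdin; for each line containing
# "GET ", print the path and the leading timestamp in the layout shown above.
gin_get_fields() {
    # ([^| ]+)  -> timestamp, up to the first space or pipe
    # .*GET     -> skip everything up to the request method
    # (.*)$     -> the rest of the line is the path
    sed -nE 's/^([^| ]+)[^|]*\|.*GET (.*)$/[path=\2] [ts=\1]/p'
}
```

Running `gin_get_fields < gin` on the sample log should print only the five GET lines, since lines without "GET " do not match the pattern and are suppressed by -n.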
This example is from the README, but it shows how much nicer it is to do the processing in ag vs. jq:
    curl https://api.github.com/repos/rcoh/angle-grinder/releases | \
      jq '.[] | .assets | .[]' -c | \
      agrind '* | json | parse "download/*/" from browser_download_url as version | sum(download_count) by version | sort by version desc'
    version       _sum
    -----------------------
    v0.6.2        0
    v0.6.1        4
    v0.6.0        5
    v0.5.1        0
    v0.5.0        4
    v0.4.0        0
    v0.3.3        0
    v0.3.2        2
    v0.3.1        9
    v0.3.0        7
    v0.2.1        0
    v0.2.0        1
There are plenty more examples in the repo, and the author has a pretty cool blog post on the Rust journey taken to build the tool.
FIN
Remember, you can follow and interact with the full text of The Daily Drop’s free posts on Mastodon via @[email protected] ☮️ -
While evaluating the access figures, there was a surprise in the number of unique visitors: from November 22 on, suddenly what looked like thousands.
But: all the requests come from two IP ranges in China. The actions seem totally random, and the user agent seems to be picked at random too, hence the high number. It is far too little for a DoS. In the meantime the whole episode is over again. No idea what that was about… -
At 500 h/s, pretty slow on the work computer (#Arbeitsrechner); under #WSL the #MSR module is not available, which, along with the #CPU (Intel Core i5 7400, 3.00GHz), is probably the bottleneck (#Flaschenhals). The virus scanner (#Virenscanner) raised an alarm (#angeschlagen), but I was able to remove the entries from the #Logfile. Not that anyone would check (#kontrollieren).
-
How To Change Default Sudo Log File In Linux #sudo #logfile #Linux #Linuxtips #Linuxcommands
https://ostechnix.com/how-to-change-default-sudo-log-file-in-linux/ -
I need to #compress a lot of #logfile's, so, again, tested #pbzip2 #lbzip2 #lzip #xz #pixz #pigz and #zstd.
The current winner of the time/size race was "lzip -7" (which is the same as plzip -7 and uses all available threads). Its #Compression rate is slightly below pixz, but it's hugely faster (and also has the better file format 😉).
It might have been zstd, if only it were simple to figure out the right combination of its 22 options.
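A comparison like this can be sketched as a small shell loop. This is not the poster's actual test setup; it is a rough illustration that assumes each tool accepts "-c FILE" to write compressed data to stdout (true of gzip, xz, zstd, and lzip, for example), runs each at its default level, and times only to whole seconds. The function name bench_compress is made up.

```shell
#!/bin/sh
# bench_compress FILE TOOL...: compress FILE to stdout with each TOOL that is
# installed and print the tool name, elapsed seconds, and compressed size.
bench_compress() {
    file=$1
    shift
    for tool in "$@"; do
        # Skip tools that are not installed on this machine.
        command -v "$tool" >/dev/null 2>&1 || continue
        start=$(date +%s)
        bytes=$("$tool" -c "$file" | wc -c | tr -d ' ')
        end=$(date +%s)
        printf '%s\t%ss\t%s bytes\n' "$tool" "$((end - start))" "$bytes"
    done
}
```

For example, `bench_compress big.log gzip xz zstd lzip` would print one line per installed tool (big.log is a placeholder). Comparing specific levels such as "lzip -7" would need the loop extended to pass extra options.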