Let’s walk through how to extract meaningful fields like IP address, port, error level, and message content from a raw PHP warning log using regular expressions and Splunk’s rex command.
Step 1: The Raw Log
Here’s a sample of the raw log we’re working with:
From this log, we want to extract:
- IP Address
- Port
- Error Level (e.g., Warning, Notice, etc.)
- Full message content after the error level
Step 2: Crafting the Regex Pattern
We used regexr.com to help test and refine our regular expression. After experimenting, we arrived at this pattern:
To use this effectively in Splunk and extract named fields, though, we need to rewrite the regex with Splunk’s named capture group syntax:
Let’s break it down:
- (?<ip>.*) captures the IP address.
- (?<port>\d+) captures the port number.
- (?<level>\w+) captures the PHP error level (e.g., Warning).
- (?<message>.*) captures the rest of the message.
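To sanity-check the four named groups outside Splunk, a minimal Python sketch can help. Everything in it is an assumption made for illustration: the log line is a hypothetical Apache-style PHP warning (not the sample from this post), and the pattern is one plausible way to combine the groups. Note that Python's re module writes named groups as (?P<name>...) where Splunk uses (?<name>...), and that the ip group here uses [^:\]]+ rather than .* so it stops at the colon before the port.

```python
import re

# Hypothetical log line for illustration only; not the sample from this post.
SAMPLE = (
    "[Tue Mar 04 10:15:32 2025] [error] [client 203.0.113.7:51234] "
    "PHP Warning:  Undefined variable: user in /var/www/html/index.php on line 12"
)

# Assumed pattern. In Splunk the groups would be written (?<ip>...), (?<port>...),
# and so on; Python requires the (?P<name>...) spelling.
PATTERN = re.compile(
    r"\[client (?P<ip>[^:\]]+):(?P<port>\d+)\] "
    r"PHP (?P<level>\w+):\s+(?P<message>.*)"
)


def extract_fields(line: str) -> dict:
    """Return the named groups as a dict, or an empty dict if the line does not match."""
    match = PATTERN.search(line)
    return match.groupdict() if match else {}


fields = extract_fields(SAMPLE)
for name, value in fields.items():
    print(f"{name}: {value}")
```

A line that does not contain the client/PHP structure simply yields an empty dict, which mirrors how Splunk leaves the fields unset when rex finds no match.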
Step 3: The Splunk Query
Here’s how we use this in a Splunk search:
What’s Happening Here:
- rex extracts the fields using the regex pattern.
- table presents the extracted data in a readable format.
- search ip=* filters events that contain an IP.
- stats count by message, source aggregates the data to show how many times each message occurred per source.
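The extract, filter, and aggregate steps described above can be mimicked offline in a few lines of Python. This is only a sketch of the logic, not a substitute for the Splunk search: the events, source paths, and pattern below are hypothetical, and the function name is made up for this example.

```python
import re
from collections import Counter

# Hypothetical (source, raw line) pairs standing in for Splunk events.
EVENTS = [
    ("/var/log/apache2/error.log",
     "[client 203.0.113.7:51234] PHP Warning:  Undefined variable: user"),
    ("/var/log/apache2/error.log",
     "[client 198.51.100.9:40022] PHP Warning:  Undefined variable: user"),
    ("/var/log/apache2/error.log",
     "[client 203.0.113.7:51300] PHP Notice:  Undefined index: id"),
    ("/var/log/apache2/other.log", "server restarted"),  # no client, so no ip field
]

# Assumed pattern; Python spells named groups (?P<name>...) where Splunk uses (?<name>...).
PATTERN = re.compile(
    r"\[client (?P<ip>[^:\]]+):(?P<port>\d+)\] "
    r"PHP (?P<level>\w+):\s+(?P<message>.*)"
)


def count_by_message_and_source(events):
    """Count matching events per (message, source) pair."""
    counts = Counter()
    for source, raw in events:
        match = PATTERN.search(raw)   # like rex: try to extract the fields
        if match is None:             # like search ip=*: skip events with no ip
            continue
        counts[(match["message"], source)] += 1  # like stats count by message, source
    return counts


counts = count_by_message_and_source(EVENTS)
for (message, source), n in counts.most_common():
    print(n, message, source)
```

The event without a client address never produces an ip field, so it is dropped before counting, just as search ip=* would drop it.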
A Quick Note on Regex Syntax in Splunk
When working with regex in Splunk, named capturing groups should follow the syntax (?<field_name>pattern). This tells Splunk to assign the content matched by pattern to the field field_name.
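One practical wrinkle when testing such patterns locally: Python's re module does not accept the (?<field_name>...) spelling and expects (?P<field_name>...) instead. The sketch below shows a small, hypothetical helper (splunk_to_python is a name invented here) that rewrites the group prefix while leaving lookbehinds, which also begin with (?<, untouched. The pattern and input string are made up for illustration.

```python
import re


def splunk_to_python(pattern: str) -> str:
    """Rewrite Splunk-style (?<name>...) groups as Python's (?P<name>...).

    Lookbehinds, (?<=...) and (?<!...), also start with "(?<" and are left as-is.
    """
    return re.sub(r"\(\?<(?![=!])", "(?P<", pattern)


# Hypothetical pattern: one named group called field_name that captures digits.
splunk_pattern = r"code=(?<field_name>\d+)"
python_pattern = splunk_to_python(splunk_pattern)

match = re.search(python_pattern, "status code=404 returned")
print(python_pattern)
print(match.groupdict())
```

The dictionary returned by groupdict() maps the group name to the matched text, which is the same name-to-content assignment Splunk performs when it creates the field.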
Final Thoughts
Regex might look intimidating at first, but tools like Regexr make it easier to visualize and test. With consistent practice, crafting expressions and extracting fields in Splunk become second nature, and both are essential skills for log analysis and troubleshooting.
So keep practicing, and soon enough, writing regex in Splunk will feel as routine as checking your logs.