Splunk - Any Question. Any Data. One Splunk.

Splunk : how to list indexes ?

With a dedicated search query in the web interface (source) :

| metadata type=sourcetypes index=* | table index sourcetype
this lists sourcetypes : it shows the data I uploaded, with its custom sourcetype, but the index field is empty
| eventcount summarize=false index=* | dedup index | fields index
lists the predefined indexes : history, main and summary

From the CLI

  • export SPLUNK_HOME='/opt/splunk'; $SPLUNK_HOME/bin/splunk list index
    Your session is invalid.  Please login.
    Splunk username: bob
    Password: password
    _audit
    	/opt/splunk/var/lib/splunk/audit/db
    	/opt/splunk/var/lib/splunk/audit/colddb
    	/opt/splunk/var/lib/splunk/audit/thaweddb
    _internal
    	/opt/splunk/var/lib/splunk/_internaldb/db
    	/opt/splunk/var/lib/splunk/_internaldb/colddb
    	/opt/splunk/var/lib/splunk/_internaldb/thaweddb
    _introspection
    	/opt/splunk/var/lib/splunk/_introspection/db
    	/opt/splunk/var/lib/splunk/_introspection/colddb
    	/opt/splunk/var/lib/splunk/_introspection/thaweddb
    
  • same without prompting for credentials :
    export SPLUNK_HOME='/opt/splunk'; $SPLUNK_HOME/bin/splunk list index -auth bob:password
    • after this, credentials are cached (duration ?)
    • source : $SPLUNK_HOME/bin/splunk help auth | less
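When only the index names are needed (e.g. in a script), the paths can be filtered out of that listing. This is a minimal sketch, assuming index names start at the beginning of a line while their paths are indented, as in the output above. index_names is a hypothetical helper, not a Splunk command :

```shell
# Hypothetical helper: keep only the lines starting with a non-blank character,
# i.e. the index names (the indented db / colddb / thaweddb paths are dropped).
index_names() {
    grep -E '^[^[:space:]]'
}

# Demo on a shortened copy of the listing above.
# With a real instance, pipe the CLI output instead :
#   $SPLUNK_HOME/bin/splunk list index -auth bob:password | index_names
printf '_audit\n\t/opt/splunk/var/lib/splunk/audit/db\n_internal\n\t/opt/splunk/var/lib/splunk/_internaldb/db\n' | index_names
```

On the demo input this should print _audit and _internal, one per line.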

Splunk glossary

index
a directory where data (events) is stored (how to list indexes ?). It is good practice to use multiple indexes to segregate data (e.g. one for "web data", one for "security data", ...). This makes it possible to :
  • have less data to search through when running queries, which makes them faster
  • limit access to some data to specific roles for security reasons

Splunk : how to delete all the uploaded data ?

As anybody having sufficient privileges :

How to import Varnish logs into Splunk ?

Situation

Splunk newbie here, there may be simpler ways to do this...

Details

Since I've not found / understood how to "detect" data fields in the uploaded log files, I'll go with a workaround : changing log files into CSV before feeding them to Splunk.

Solution

A typical log entry looks like :
12.34.56.78 - - [12/Dec/2018:02:19:04 +0100] "GET http://www.example.com/index.html HTTP/1.1" 200 345 "-" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:67.0) Gecko/20100101 Firefox/67.0"
This corresponds to the default Varnish log format :
%h %l %u %t "%r" %s %b "%{Referer}i" "%{User-agent}i"
Because we'll process log entries with awk, let's consider them as space-separated fields. This is not strictly true : the timestamp is split across fields 4 and 5, and the user agent across fields 12 to NF. We'll have to output fields as follows :
1, 4-5, 6, 7, 8, 9, 10, 11, 12-NF
...which is done by :

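resultFile='data.csv'
# CSV header : one column per output field
echo 'sourceIp,timestamp,httpMethod,url,httpVersion,httpStatusCode,responseSizeBytes,httpReferer,userAgent' > "$resultFile"
for logFile in *log*; do
    # sed : replace characters that would break the CSV (commas) or awk's printf ('%' sequences)
    # awk : output fields 1, 4-5, 6 to 11, then 12-NF re-joined with spaces
    # tr  : remove double quotes
    sed -r -e 's/%[0-9a-fA-F]{2}/_XX/g' -e 's/,/-COMMA-/g' -e 's/\*/-ASTERISK-/g' -e 's/%s:80/-NOIP-:80/g' -e 's/%[a-z]{1,2}/-HEX-/g' "$logFile" \
        | awk '{printf $1","$4" "; for(i=5; i<=11; i++) {printf $i","}; for(i=12; i<=NF; i++) {printf $i" "}; print""}' \
        | tr -d '"' >> "$resultFile"
    echo -n '.'    # progress indicator : one dot per processed file
done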

Notes : This should output a well-formatted CSV file ready to be uploaded into Splunk.
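To see what the awk + tr stage does to a single entry, it can be run on the sample log line shown earlier. This is a sketch rather than the exact command above : varnish_to_csv is a hypothetical helper, and it uses explicit "%s" formats in printf (a variation on the original), so the fields are never interpreted as format strings :

```shell
# Hypothetical helper (not part of the original command) :
# reads log entries on stdin, writes CSV rows on stdout.
varnish_to_csv() {
    awk '{
        printf "%s,%s ", $1, $4                      # source IP, then first half of the timestamp
        for (i = 5; i <= 11; i++) printf "%s,", $i   # second half of the timestamp up to the referer
        for (i = 12; i <= NF; i++) printf "%s ", $i  # user agent, re-joined with spaces
        print ""
    }' | tr -d '"'
}

entry='12.34.56.78 - - [12/Dec/2018:02:19:04 +0100] "GET http://www.example.com/index.html HTTP/1.1" 200 345 "-" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:67.0) Gecko/20100101 Firefox/67.0"'
printf '%s\n' "$entry" | varnish_to_csv
```

The resulting row has 9 comma-separated fields, matching the CSV header. Note that the timestamp keeps its square brackets and that the user agent ends with a trailing space.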

Last-minute notes regarding Splunk itself :

  • You may want to flush all data before retrying.
  • If you're experimenting with Splunk Free, don't forget you're limited to indexing 500MB per day. Use small datasets for trial-and-error or you'll have to wait until the next day to upload + process new data.
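To stay well below the daily quota while experimenting, a small sample of the CSV can be uploaded first. This is a minimal sketch, assuming the first line of the file is the CSV header. sample_csv and the file names are hypothetical :

```shell
# Hypothetical helper: print the header line plus the first N data rows of a CSV file.
#   $1 : CSV file, $2 : number of data rows to keep
sample_csv() {
    head -n "$(($2 + 1))" "$1"
}

# Demo on a throwaway file
tmpCsv=$(mktemp)
printf 'sourceIp,httpStatusCode\n1.1.1.1,200\n2.2.2.2,404\n3.3.3.3,500\n' > "$tmpCsv"
sample_csv "$tmpCsv" 2
rm -f "$tmpCsv"
```

With the CSV built above, this could be used as : sample_csv data.csv 1000 > sample.csv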

Splunk : setup on Debian Stretch

Splunk is proprietary software available via paid license, currently existing in several editions (source) :
  • Splunk Enterprise : full-featured "on-premises" edition
  • Splunk Cloud
  • Splunk Light
  • Splunk Free : a slightly downgraded "enterprise" edition with quotas on the amount of data it can index and the number of users (see features comparison). The doc says : "After 60 days you can convert to a perpetual free license or purchase a Splunk Enterprise license to continue using the expanded functionality designed for enterprise-scale deployments." This is the version we'll install here.
  • ...

Install procedure :

  1. the Splunk Free edition is actually "sold" for your contact information : you'll have to create an account on https://www.splunk.com/ before downloading
  2. you can then download an RPM, DEB or tar.gz package. The download page even outputs a download command (that may allow downloading anonymously) :
    wget -O splunk-7.3.0-657388c7a488-linux-2.6-amd64.deb 'https://www.splunk.com/bin/splunk/DownloadActivityServlet?architecture=x86_64&platform=linux&version=7.3.0&product=splunk&filename=splunk-7.3.0-657388c7a488-linux-2.6-amd64.deb&wget=true'
  3. install, as root :
    dpkg -i splunk-7.3.0-657388c7a488-linux-2.6-amd64.deb
  4. have a look at these conditions about the default shell on Debian
  5. you're done
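
The download and install steps above can be scripted for another version. This is a sketch only : splunk_download_url is a hypothetical helper, the URL pattern is copied from the wget command above, and it is an assumption that this pattern stays valid for other versions or builds :

```shell
# Hypothetical helper: rebuild the download URL for a given version and package file name.
# The URL pattern comes from the wget command printed by the download page (see above);
# whether it works for other versions is an assumption.
#   $1 : Splunk version, $2 : package file name
splunk_download_url() {
    printf 'https://www.splunk.com/bin/splunk/DownloadActivityServlet?architecture=x86_64&platform=linux&version=%s&product=splunk&filename=%s&wget=true\n' "$1" "$2"
}

pkg='splunk-7.3.0-657388c7a488-linux-2.6-amd64.deb'
splunk_download_url '7.3.0' "$pkg"

# Then, as in steps 2 and 3 :
#   wget -O "$pkg" "$(splunk_download_url '7.3.0' "$pkg")"
#   dpkg -i "$pkg"        # as root
```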

Start Splunk Enterprise for the first time (source) :

  1. cd /opt/splunk/bin/ && ./splunk start
  2. agree with the license
  3. Please enter an administrator username:
    	admin
    Please enter a new password:
    	password
  4. it will then :
    1. generate RSA private keys
    2. make certs
    3. start the Splunk daemon : splunkd
    4. spawn a web server and finally display :
      The Splunk web interface is at http://myDebianStretchHost:8000

Start Splunk at boot time with systemd (source) :

As root :
SPLUNK_HOME=/opt/splunk && $SPLUNK_HOME/bin/splunk enable boot-start -systemd-managed 1
Systemd unit file installed at /etc/systemd/system/Splunkd.service.
(note : this is Splunkd, with a capital S)
Configured as systemd managed service.
Check daemon status :
  • $SPLUNK_HOME/bin/splunk status
    splunkd is running (PID: 1739).
    splunk helpers are running (PIDs: 1799 1812 1937 1997).
    This also supports start, stop, ...
  • systemctl status Splunkd
     Splunkd.service - Systemd service file for Splunk, generated by 'splunk enable boot-start'
       Loaded: loaded (/etc/systemd/system/Splunkd.service; enabled; vendor preset: enabled)
       Active: active (running) since Thu 2019-06-27 12:12:16 CEST; 2s ago
     Main PID: 27173 (splunk)
        Tasks: 3 (limit: 4915)
       Memory: 7.9M (limit: 996.4M)
          CPU: 1.715s
       CGroup: /system.slice/Splunkd.service
               ├─27173 /opt/splunk/bin/splunk _internal_launch_under_systemd
               ├─27234 sh -c btool server list general --no-log
               └─27235 /opt/splunk/bin/splunkd btool server list general --no-log
    
    Jun 27 12:12:18 Stretch splunk[27173]:         Checking configuration... Done.
    Jun 27 12:12:18 Stretch splunk[27173]:         Checking critical directories...        Done
    Jun 27 12:12:18 Stretch splunk[27173]:         Checking indexes...
    

Next step ?

  1. read some docs. I guess you'll want to load some data in and start playing
  2. open the web interface
  3. load some data and let the journey begin...