ELASTIC SIEM: Kibana Query Language (KQL) 

Sharing is caring
This entry is part 11 of 21 in the series Threat Detection Engineering

Views: 55

Kibana supports two types of syntax languages for querying in Kibana: KQL (Kibana Query Language) and Lucene Query Syntax.

  • Kibana Query Language (KQL) is a user-friendly query language developed by Elastic specifically for Kibana. It provides autocomplete suggestions and supports filtering using various operators and functions.

  • The Lucene Query Syntax is another query language powered by an open-source search engine library used as a backend for search engines, including Elasticsearch. It is more powerful than KQL but is harder to learn for beginners.

Certain characters are reserved in ELK queries and must be escaped before usage. Reserved characters in ELK include +-=&&||&| and !. For instance, using the + character in a query will result in an error; to escape this character, precede it with a backslash (e.g. \+).

For example, say you’re searching for documents that contain the term “User+1” in the “username” field. Simply typing username:User+1 in the query bar will result in an error because the plus symbol is reserved. To escape it, type username:User\+1, and the query will return the desired result.

Wildcards match specific characters within a field value. For example, using the * wildcard will match any number of characters, while using the ? wildcard will match a single character.

product_name:monit*

name:J?n

Sometimes, values in a data set are nested like in a JSON format. Nested queries allow us to search within these objects without needing an external JSON parser.

Example Dataset

record_idincident_typeaffected_systemscomments
1DDoS[{“system”: “web-server”}, {“system”: “database”}][{“author”: “Alice”, “text”: “Mitigated DDoS attack”}, {“author”: “Bob”, “text”: “Checked logs, found suspicious IPs”}]
2Malware[{“system”: “web-server”}, {“system”: “file-server”}][{“author”: “Charlie”, “text”: “Removed malware”}, {“author”: “Eve”, “text”: “Updated antivirus software”}]
3Data breach[{“system”: “database”}][{“author”: “Alice”, “text”: “Patched vulnerability”}, {“author”: “Eve”, “text”: “Reset all user passwords”}]
4Phishing[{“system”: “email-server”}][{“author”: “Bob”, “text”: “Blocked phishing email”}, {“author”: “Charlie”, “text”: “Sent warning to all users”}]
5Insider threat[{“system”: “file-server”}, {“system”: “database”}][{“author”: “Eve”, “text”: “Investigating employee activity”}, {“author”: “Alice”, “text”: “Implementing stricter access controls”}]
#Nested Queries

#Example One
comments.author:"Alice" AND comments.text:attack

#Example Two
affected_systems.affected_files.file_name:marketing_strategy* AND affected_systems.system_name:"file-server*"

affected_systems.system_type: "web server" AND affected_systems.logged_on_users: "admin" AND incident_comments: "true positive"

Range queries allow us to search for documents with field values within a specified range. 

Example Dataset:

alert_idalert_typeresponse_time_seconds
1Malware Detection120
2Unusual Login Attempt240
3Suspicious Traffic600
4Unauthorized File Access300
5Phishing Email180
#To search for all documents where the "response_time_seconds" field is greater than or equal to 100, then the query for you to use is:

response_time_seconds >= 100

#The one for less than 300:

response_time_seconds < 300

#with AND operator

response_time_seconds >= 100 AND response_time_seconds < 300

Querying for specific date or date ranges

@timestamp<"yyyy-MM-ddTHH:mm:ssZ"

@timestamp>yyyy-MM-dd

#Nested query

response_time_seconds >= 100 AND response_time_seconds < 300

Sample Queries:

severity_level >= 9 AND incident_type: "Data Leak" 

Output:

@timestamp<"2022-11-30" AND affected_systems.system_type : email, web AND team_members.name : AJohnston

Output:

incident_id >= 1 AND incident_id < 501 and incident_type: "Data Leak" AND affected_systems.system_name : "file-server-65"

Output:

Fuzzy searching is beneficial when searching for documents with inconsistencies or typos in the data. 

For example, if you want to search for “server”, you can use a fuzzy search to return documents containing “serber”, “server01”, and “server001”. See below:

host_namestatus
server01online
serber01online
sirbir01offline
sorvor01online
workstation01offline
workstation001offline

A fuzziness of 1 returns,

host_name:server01~1
{
  "host_name": "server01",
  "status": "online"
},
{
  "host_name": "serber01",
  "status": "online"
}
Output

A fuzziness of 2 returns,

host_name:server01~2
{ "host_name": "server01", "status": "online" }, { "host_name": "serber01", "status": "online" }, { "host_name": "sorvor01", "status": "online" }, 
Output

NOTE: Fuzzy searching does not work on nested data and only matches on one-word strings. Despite the limitations, it is still useful, especially for finding typos.

Turn “OFF” KQL as boosting only works with Lu

Proximity searches allow you to search for documents where the field values contain two or more terms within a specified distance. In KQL, you can use the match_phrase query with the slop parameter to perform a proximity search. The slop parameter sets the maximum distance that the terms can be from each other. For example, a slop value of 2 means that the words can be up to 2 positions away.

field_name:"search term"~slop_value

Sample Dataset:

log_idlog_message
1Server error: failed login attempt.
2Login server – failed on startup with error.
3Login to server failed successfully.
4Server: Detected error in connection.

Example Query:

log_message:"server error"~1

Output:

{ "log_id": 1, "log_message": "Server error: failed login attempt." }, { "log_id": 4, "log_message": "Server: Detected error in connection." }
Please note in the results above that “server” and “error” have one word or less in between them.

Another Query:

log_message:"failed login"~0

Output:

{
  "log_id": 1,
  "log_message": "Server error: failed login attempt."
}
Note: Slop value is set to 0

Another Query:

log_message:"server error"~4

A Nested Query:

log_message:"server error"~1 OR "login server"~1

Output:

Another Query:

incident_comments: "data leak true negative"~3

Output:

Another Query:

incident_comments: "detected negative"~2 AND team_members.name: AJohnston 

Output:

Regular expressions

Regular expressions (or regex, regexp) allow you to use a pattern to match field values. We can use regexp in Kibana to search for complex patterns that cannot easily be found using simple query strings or wildcards.

Consider the following dataset:

IDDateEvent TypeDescriptionSource IPDestination IPURL
12023-04-10DDoS AttackDistributed denial of service attack on a company’s website192.168.1.10203.0.113.1http://www.example1.com
22023-04-12PhishingPhishing email attempting to steal user credentials192.168.1.11203.0.113.2http://www.example2.com/login
32023-04-15Malware InfectionMalware infection on a user’s computer192.168.1.12203.0.113.3http://www.example3.com/download
42023-04-16XSS AttackCross-site scripting attack on a web application192.168.1.13203.0.113.4http://www.example4.com/comment
52023-04-20SQL InjectionSQL injection attack on a company’s database192.168.1.14203.0.113.5http://www.example5.com/query

Match all Event Types:

Event_Type:/.*/

Match Event Types that start with S or M:

Event_Type:/(S|M).*/

Another Query based on Description field:

Description:/.*/

Another query to get values starting with s or m in the Description field:

Description:/(s|m).*/

A nested query:

Description:/(s|m).*/ AND /user.*/

Query to detect Comments containing the word RANSOMWARE:

ncident_comments: /(r).*/ AND client_list*

Series Navigation<< SOC Tools and Useful LinksSOC Home LAB: Elastic SIEM Installation >>