An Overview of Query Languages Used in Cybersecurity and SIEM

Query languages play an essential role in threat detection, incident response, and overall cybersecurity management. Security professionals use them to sift through vast amounts of data, extracting insights and manipulating data from various internal and external sources. 

This blog provides a technical overview of the various query languages used in cybersecurity. It discusses their business and operational contexts, presents sample use cases, and explores the emerging role of artificial intelligence (AI) and natural language processing (NLP) in enhancing their effectiveness.

Types of Query Languages in Cybersecurity

1. Structured Query Language (SQL)

Technical overview: Structured Query Language (SQL) is the standard language for managing and manipulating relational databases, which store structured data. SQL lets users query databases; insert, update, and delete records; and manage database schemas. In cybersecurity, SQL is often used to interact with large databases that store security-related information, such as logs, user data, and access records. It is well established, having been in use for approximately 50 years.

Business context: From a business perspective, SQL is critical for managing and analyzing security data stored in relational databases. It helps organizations track and audit user activities, monitor access to sensitive information, and ensure compliance with regulatory requirements. Effective use of SQL can enhance data-driven decision-making and support strategic planning in cybersecurity initiatives.

Operational context: Operationally, SQL queries are employed to generate reports on security events, identify anomalies, and investigate incidents. Security analysts use SQL to extract relevant data from logs and databases to conduct forensic investigations and generate actionable insights.

Sample Use Cases

Unauthorized access detection: SQL can be used to query access logs to detect unauthorized access. For example, a query might return instances from the past seven days where a user accessed a resource that is not on their list of permitted resources.


-- T-SQL; NOT EXISTS stays correct even if allowed_resources contains NULLs
SELECT a.user_id, a.access_time, a.resource
FROM access_logs AS a
WHERE a.access_time > DATEADD(day, -7, GETDATE())
AND NOT EXISTS (SELECT 1 FROM user_permissions AS p
                WHERE p.user_id = a.user_id AND p.allowed_resources = a.resource);
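To try the permission check without a production database, the same idea can be sketched in Python with an in-memory SQLite database. This is a minimal illustration, not a reference implementation: the table contents are hypothetical sample data, the schema is assumed from the query above, and the seven-day window is left out so the result does not depend on the current date.

```python
import sqlite3


def find_unpermitted_access(conn):
    """Return (user_id, resource) rows that have no matching permission entry."""
    query = """
        SELECT a.user_id, a.resource
        FROM access_logs AS a
        WHERE NOT EXISTS (
            SELECT 1
            FROM user_permissions AS p
            WHERE p.user_id = a.user_id
              AND p.allowed_resources = a.resource
        )
        ORDER BY a.user_id, a.resource
    """
    return conn.execute(query).fetchall()


# Hypothetical sample data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE access_logs (user_id TEXT, access_time TEXT, resource TEXT);
    CREATE TABLE user_permissions (user_id TEXT, allowed_resources TEXT);
    INSERT INTO user_permissions VALUES ('alice', 'hr_db'), ('bob', 'wiki');
    INSERT INTO access_logs VALUES
        ('alice', '2024-01-10 09:00:00', 'hr_db'),
        ('alice', '2024-01-10 09:05:00', 'finance_db'),
        ('bob',   '2024-01-10 10:00:00', 'wiki');
""")

unpermitted = find_unpermitted_access(conn)
print(unpermitted)  # [('alice', 'finance_db')]
```

Only alice's access to finance_db is returned, because it is the one log entry without a matching row in user_permissions.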

Audit trail analysis: SQL queries can help generate audit trails for compliance purposes. By querying the database for user activities and changes, security teams can ensure that all actions are properly logged and meet regulatory standards.


-- Half-open range: includes all of January 31 even when action_time has a time of day
SELECT user_id, action_type, action_time
FROM audit_logs
WHERE action_time >= '2024-01-01' AND action_time < '2024-02-01';
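When action_time carries a time of day, the way the upper bound is written matters: an inclusive bound of '2024-01-31' stops at midnight and misses events later that day. The following sketch, assuming hypothetical sample rows in an in-memory SQLite database, compares an inclusive range with a half-open one and passes the dates as query parameters rather than string literals.

```python
import sqlite3

# Hypothetical audit data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE audit_logs (user_id TEXT, action_type TEXT, action_time TEXT);
    INSERT INTO audit_logs VALUES
        ('alice', 'login',  '2024-01-01 00:00:00'),
        ('bob',   'delete', '2024-01-31 15:30:00'),
        ('carol', 'update', '2024-02-01 00:00:00');
""")


def actions_half_open(conn, start, end):
    """Rows with start <= action_time < end (the end bound is exclusive)."""
    query = """
        SELECT user_id, action_type, action_time
        FROM audit_logs
        WHERE action_time >= ? AND action_time < ?
        ORDER BY action_time
    """
    return conn.execute(query, (start, end)).fetchall()


def count_inclusive(conn, start, end):
    """Number of rows matched by an inclusive BETWEEN on the same column."""
    query = "SELECT COUNT(*) FROM audit_logs WHERE action_time BETWEEN ? AND ?"
    return conn.execute(query, (start, end)).fetchone()[0]


january = actions_half_open(conn, "2024-01-01", "2024-02-01")
inclusive_count = count_inclusive(conn, "2024-01-01", "2024-01-31")
print(len(january), inclusive_count)  # 2 1
```

The half-open range returns both January events, while the inclusive range ending at '2024-01-31' drops bob's afternoon deletion on the last day of the month.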

Anomaly detection in user behavior: SQL can analyze user behavior patterns to identify anomalies, such as users who have unusually high access or changes in a short period.


SELECT user_id, COUNT(*) AS num_accesses
FROM access_logs
WHERE access_time BETWEEN DATEADD(hour, -1, GETDATE()) AND GETDATE()
GROUP BY user_id
HAVING COUNT(*) > 100;
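The threshold check can be exercised end to end with a small Python sketch. The data is synthetic, the function name busy_users is hypothetical, and the "current time" is passed in as a parameter (an assumption made so the example is repeatable) instead of calling GETDATE().

```python
import sqlite3
from datetime import datetime, timedelta

TS_FORMAT = "%Y-%m-%d %H:%M:%S"


def busy_users(conn, now, threshold):
    """Users with more than `threshold` accesses in the hour ending at `now`."""
    start = (now - timedelta(hours=1)).strftime(TS_FORMAT)
    end = now.strftime(TS_FORMAT)
    query = """
        SELECT user_id, COUNT(*) AS num_accesses
        FROM access_logs
        WHERE access_time BETWEEN ? AND ?
        GROUP BY user_id
        HAVING COUNT(*) > ?
        ORDER BY user_id
    """
    return conn.execute(query, (start, end, threshold)).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE access_logs (user_id TEXT, access_time TEXT, resource TEXT)")

now = datetime(2024, 1, 10, 12, 0, 0)
rows = []
# 150 accesses by 'mallory', all inside the last hour.
for i in range(150):
    rows.append(("mallory", (now - timedelta(seconds=20 * i)).strftime(TS_FORMAT), "file_share"))
# 5 accesses by 'alice' inside the last hour (below the threshold).
for i in range(5):
    rows.append(("alice", (now - timedelta(minutes=10 * i)).strftime(TS_FORMAT), "wiki"))
# 200 accesses by 'bob' three hours ago (outside the window).
for i in range(200):
    rows.append(("bob", (now - timedelta(hours=3, seconds=i)).strftime(TS_FORMAT), "wiki"))
conn.executemany("INSERT INTO access_logs VALUES (?, ?, ?)", rows)

flagged = busy_users(conn, now, 100)
print(flagged)  # [('mallory', 150)]
```

Only mallory is flagged: alice stays under the threshold, and bob's burst falls outside the one-hour window.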

Because SQL requires specialized technical skills, it is not a practical tool for non-technical professionals. As discussed later in this post, AI and NLP offer ways around that.

2. Security Information and Event Management (SIEM) Query Languages

Technical overview: SIEM systems use specific query languages to analyze and correlate security event data from various sources. Common examples include Splunk’s Search Processing Language (SPL) and Elastic’s Event Query Language (EQL). Many vendors, particularly the larger ones, have their own proprietary query languages.

Business context: SIEM query languages enable organizations to conduct comprehensive security monitoring and analysis. They help businesses correlate events from different systems, detect sophisticated threats, and ensure compliance with security policies. Crafting effective queries for SIEM systems also facilitates enhanced situational awareness and faster incident response.

Operational context: In an operational setting, security operations teams use SIEM query languages to create dashboards, set up alerts, and generate reports based on security events. They also use these queries to monitor real-time data, investigate incidents, and perform threat-hunting activities.

Sample Use Cases

Real-time threat detection: SIEM queries can detect threats such as distributed denial-of-service (DDoS) attacks in real time by correlating data from various sources. The search below flags source IP addresses with more than 1,000 denied firewall connections.


index=network_logs source="firewall" action="deny"
| stats count by src_ip
| where count > 1000

Alerting on suspicious activity: SIEM query languages can be used to set up alerts for suspicious activity, such as multiple failed login attempts, which could indicate that a brute-force attack is underway.


index=auth_logs action="failure"
| stats count by user_id, src_ip
| where count > 10
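For readers new to SPL, the pipeline above filters events, counts them per (user_id, src_ip) pair, and keeps the pairs above a threshold. A rough Python equivalent of that logic is sketched below. The event dictionaries, the IP addresses, and the function name failed_login_alerts are hypothetical, and a real SIEM would also apply a time window to the search.

```python
from collections import Counter


def failed_login_alerts(events, threshold=10):
    """Count failed logins per (user_id, src_ip) and keep pairs above the threshold."""
    counts = Counter(
        (event["user_id"], event["src_ip"])
        for event in events
        if event["action"] == "failure"
    )
    return {pair: n for pair, n in counts.items() if n > threshold}


# Hypothetical events: 12 failures for jdoe from one address, 3 for asmith,
# and one successful login that the filter should ignore.
events = (
    [{"user_id": "jdoe", "src_ip": "203.0.113.7", "action": "failure"}] * 12
    + [{"user_id": "asmith", "src_ip": "198.51.100.4", "action": "failure"}] * 3
    + [{"user_id": "jdoe", "src_ip": "203.0.113.7", "action": "success"}]
)

alerts = failed_login_alerts(events)
print(alerts)  # {('jdoe', '203.0.113.7'): 12}
```

Grouping by both user and source address, as the SPL search does, separates one address hammering a single account from ordinary scattered login mistakes.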

Incident investigation: SIEM queries can aid incident investigation by retrieving relevant log data, such as all events related to a specific user during a given timeframe.


index=all_logs user="jdoe" earliest=-24h latest=now
| stats count by event_type, source

3. Query Languages in Network Security

Technical overview: Network security tools often use specialized query languages to analyze traffic data and detect anomalies. Examples include Wireshark’s display filter language and the Zeek (formerly Bro) scripting language.

Business context: These query languages help businesses monitor network traffic, identify malicious activities, and protect critical infrastructure. They provide visibility into network behavior and help prevent data breaches and service disruptions.

Operational context: Security analysts use network security queries to filter and analyze network traffic data, set up intrusion detection rules, and troubleshoot network issues. They are also useful for ensuring network integrity and responding to potential threats.

Sample Use Cases

Traffic analysis: Security professionals may use Wireshark’s display filter language to isolate traffic of interest and uncover suspicious patterns or potential threats. The filter below shows only packets sent from 192.168.1.100 on TCP port 80.


ip.src == 192.168.1.100 && tcp.port == 80
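One detail worth knowing is that tcp.port in a Wireshark display filter matches either the source or the destination port. The sketch below mimics that matching logic in Python. The Packet class and its field names are hypothetical stand-ins for parsed TCP packets and are not part of Wireshark.

```python
from dataclasses import dataclass


@dataclass
class Packet:
    """Hypothetical, simplified view of a parsed TCP packet."""
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int


def matches_filter(pkt, src_ip="192.168.1.100", port=80):
    """Mimic `ip.src == <src_ip> && tcp.port == <port>`.

    tcp.port is true when either the source or the destination port matches.
    """
    return pkt.src_ip == src_ip and port in (pkt.src_port, pkt.dst_port)


packets = [
    Packet("192.168.1.100", "93.184.216.34", 51000, 80),   # client request to port 80
    Packet("192.168.1.100", "192.168.1.50", 80, 52000),    # reply sent from port 80
    Packet("192.168.1.100", "93.184.216.34", 51001, 443),  # wrong port
    Packet("192.168.1.50", "192.168.1.100", 52000, 80),    # wrong source address
]

matched = [p for p in packets if matches_filter(p)]
print(len(matched))  # 2
```

To restrict the match to traffic headed to port 80 only, Wireshark provides the more specific tcp.dstport field.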

Intrusion detection: Zeek (formerly Bro) scripts can implement custom rules for detecting specific types of network intrusions.


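event connection_established(c: connection) {
    # Endpoint addresses and ports live in the c$id record; port constants need a protocol suffix.
    if ( c$id$orig_h == 192.168.1.100 && c$id$resp_p == 80/tcp ) {
        print fmt("Suspicious connection from %s to port %s", c$id$orig_h, c$id$resp_p);
    }
}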

Anomaly detection: Security tools use queries to detect anomalies in network traffic, such as unusual data transfers or unexpected protocol usage.


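# ftp_request is raised for each command that Zeek's FTP analyzer parses.
event ftp_request(c: connection, command: string, arg: string) {
    if ( c$id$resp_h == 10.10.10.10 ) {
        print fmt("Unusual FTP traffic detected: %s sent %s to %s", c$id$orig_h, command, c$id$resp_h);
    }
}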

The Role of AI and NLP

AI and NLP are transforming the use of query languages in cybersecurity. AI can analyze vast amounts of data quickly, identifying patterns and anomalies that traditional methods may miss. NLP lets even non-technical users query data and interpret results in plain language.

Value add:

  • Simplification: NLP enables security professionals to formulate queries using plain language, reducing the need for specialized knowledge of query languages. This democratizes access to powerful data analysis tools.
  • Enhanced insights: AI-driven analytics can uncover insights and correlations that might not be immediately apparent, providing a deeper understanding of security data.
  • Automation: AI can automate the creation and execution of queries, streamline incident response processes, and improve the efficiency of security operations.
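To make the simplification point concrete, here is a deliberately tiny sketch of turning a plain-language request into an SPL-style search string. It is a toy under stated assumptions: it uses a single regular expression and not a language model, the function name to_query is hypothetical, and the index and field names are borrowed from the sample searches earlier in this post. Production NLP interfaces are far more flexible than this.

```python
import re

# Recognizes one narrow phrasing only; real systems handle open-ended language.
REQUEST_PATTERN = re.compile(
    r"show (?:me )?failed logins for (?P<user>\w+) "
    r"in the last (?P<n>\d+) (?P<unit>hour|day)s?",
    re.IGNORECASE,
)

UNIT_SUFFIX = {"hour": "h", "day": "d"}


def to_query(request):
    """Translate a supported plain-language request into an SPL-style string."""
    match = REQUEST_PATTERN.search(request)
    if match is None:
        raise ValueError(f"Unsupported request: {request!r}")
    suffix = UNIT_SUFFIX[match.group("unit").lower()]
    return (
        f'index=auth_logs action="failure" user_id="{match.group("user")}" '
        f'earliest=-{match.group("n")}{suffix}'
    )


query = to_query("Show me failed logins for jdoe in the last 24 hours")
print(query)  # index=auth_logs action="failure" user_id="jdoe" earliest=-24h
```

Even in this toy, the user never needs to know the index name, the field names, or the time-modifier syntax, which is the learning-curve reduction described above.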

Future implications: As AI and NLP technologies continue to evolve, we can expect significant advancements in query languages for cybersecurity:

  • More intuitive interfaces: Future query systems will likely offer more intuitive interfaces powered by NLP, allowing users to generate and refine queries using natural language.
  • Increased automation: AI-driven automation will handle routine query tasks and provide real-time threat analysis, reducing the manual workload for security teams.
  • Predictive capabilities: AI will enhance predictive analytics, helping organizations anticipate potential threats and take proactive measures based on data patterns.

How Anomali is Transforming Cybersecurity Queries

While query languages are fairly low in the cybersecurity stack, they are vital, enabling professionals to analyze and manage security data effectively. SQL, SIEM query languages (including vendor-specific variants), and network security query languages each play a crucial role in different aspects of security management. They provide the means to detect, analyze, and respond to threats, making them indispensable for both business and operational contexts. 

The increasing use of NLP is reducing the technical sophistication required to build effective queries. Anomali is already ahead of the game here. Instead of using a query language to craft sample queries like those in this blog, technical and non-technical analysts, and even business users, can simply enter a natural language prompt, such as “Show me all internal systems in the US that are potentially exposed to BlackSuit ransomware over the past three days.”

Using NLP reduces the learning curve and significantly speeds up threat investigation. For example, Anomali can search a petabyte of data in seconds, a task that previously took hours. This translates into huge and immediate productivity gains, reducing the workload for overtaxed security teams while helping organizations stay ahead of emerging threats.