Splunk count by field value


stats

Description

Calculates aggregate statistics, such as average, count, and sum, over the results set. This is similar to SQL aggregation. If the stats command is used without a BY clause, only one row is returned, which is the aggregation over the entire incoming result set. If a BY clause is used, one row is returned for each distinct value specified in the BY clause.
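
The row-count behavior described above can be modeled outside Splunk. The following Python sketch is only an illustration of the semantics, not Splunk's implementation. The events, the field names host and status, and the helper stats_count are all hypothetical.

```python
from collections import Counter

# Hypothetical events; the field names are illustrative only.
events = [
    {"host": "www1", "status": "200"},
    {"host": "www1", "status": "404"},
    {"host": "www2", "status": "200"},
    {"host": "www1", "status": "200"},
]

def stats_count(rows, by=None):
    """Model `stats count` (no BY clause) or `stats count BY <field>`."""
    if by is None:
        # No BY clause: a single row aggregating the whole result set.
        return [{"count": len(rows)}]
    # In this sketch, events that lack the BY field are skipped.
    counts = Counter(row[by] for row in rows if by in row)
    # One row per distinct value of the BY field, ordered by value.
    return [{by: value, "count": n} for value, n in sorted(counts.items())]

total = stats_count(events)              # one row for everything
per_host = stats_count(events, by="host")  # one row per distinct host
```

Here total holds a single row, while per_host holds one row for each distinct host value.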

The stats command can be used for several SQL-like operations. If you are familiar with SQL but new to SPL, see Splunk SPL for SQL users.

Difference between stats and eval commands

The stats command calculates statistics based on fields in your events. The eval command creates new fields in your events by using existing fields and an arbitrary expression.

An image that shows two tables and an example of the stats command in between the tables. The top table shows 2 columns: Time and Event. There are two rows in the table that show sample events. There are timestamps in the Time column. The Event column shows the beginning of the events. The first row shows a GET with an item added to a cart. The second row shows a POST.

Syntax

Simple:
stats (stats-function(field) [AS field]) [BY field-list]

Complete:
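Required syntax: the stats command name and at least one <stats-agg-term> or <sparkline-agg-term>. All other arguments are optional.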

| stats
[partitions=<num>]
[allnum=<bool>]
[delim=<string>]
( <stats-agg-term> | <sparkline-agg-term> )
[<by-clause>]
[<dedup_splitvals>]

Required arguments

stats-agg-term
Syntax: <stats-func>(<evaled-field> | <wc-field>) [AS <wc-field>]
Description: A statistical aggregation function. See Stats function options. The function can be applied to an eval expression, or to a field or set of fields. Use the AS clause to place the result into a new field with a name that you specify. You can use wild card characters in field names. For more information on eval expressions, see Types of eval expressions in the Search Manual.
sparkline-agg-term
Syntax: <sparkline-agg> [AS <wc-field>]
Description: A sparkline aggregation function. Use the AS clause to place the result into a new field with a name that you specify. You can use wild card characters in the field name.

Optional arguments

allnum
Syntax: allnum=<bool>
Description: If true, computes numerical statistics on each field if and only if all of the values of that field are numerical.
Default: false
by-clause
Syntax: BY <field-list>
Description: The name of one or more fields to group by. You cannot use a wildcard character to specify multiple fields with similar names. You must specify each field separately. The BY clause returns one row for each distinct value in the BY clause fields. If no BY clause is specified, the stats command returns only one row, which is the aggregation over the entire incoming result set.
dedup_splitvals
Syntax: dedup_splitvals=<boolean>
Description: Specifies whether to remove duplicate values in multivalued BY clause fields.
Default: false
delim
Syntax: delim=<string>
Description: Specifies how the values in the list() or values() aggregation are delimited.
Default: a single space
partitions
Syntax: partitions=<num>
Description: If specified, partitions the input data based on the split-by fields for multithreaded reduce. The partitions argument runs the reduce step (in parallel reduce processing) with multiple threads in the same search process on the same machine. Compare that with parallel reduce, using the redistribute command, that runs the reduce step in parallel on multiple machines.
Default: 1

Stats function options

stats-func
Syntax: The syntax depends on the function that you use. Refer to the table below.
Description: Statistical and charting functions that you can use with the stats command. Each time you invoke the stats command, you can use one or more functions. However, you can only use one BY clause. See Usage.
The following table lists the supported functions by type of function. Use the links in the table to see descriptions and examples for each function. For an overview about using functions with commands, see Statistical and charting functions.

Sparkline function options

Sparklines are inline charts that appear within table cells in search results to display time-based trends associated with the primary key of each row. Read more about how to "Add sparklines to your search results" in the Search Manual.

sparkline-agg
Syntax: sparkline (count(<wc-field>), <span-length>) | sparkline (<sparkline-func>(<wc-field>), <span-length>)
Description: A sparkline specifier, which takes an aggregation function on a field as its first argument and an optional timespan specifier as its second. If no timespan specifier is used, an appropriate timespan is chosen based on the time range of the search. If the sparkline is not scoped to a field, only the count aggregator is permitted. You can use wildcard characters in the field name. See the Usage section.
sparkline-func
Syntax: c() | count() | dc() | mean() | avg() | stdev() | stdevp() | var() | varp() | sum() | sumsq() | min() | max() | range()
Description: Aggregation function to use to generate sparkline values. Each sparkline value is produced by applying this aggregation to the events that fall into each particular time bin.

Usage

The stats command is a transforming command. See Command types.

Eval expressions with statistical functions

When you use the stats command, you must specify either a statistical function or a sparkline function. When you use a statistical function, you can use an eval expression as part of the statistical function.

Statistical functions that are not applied to specific fields

With the exception of the count function, when you pair the stats command with functions that are not applied to specific fields or eval expressions that resolve into fields, the search head processes it as if it were applied to a wildcard for all fields. In other words, when you have | stats avg in a search, it returns results for | stats avg(*).

This "implicit wildcard" syntax is officially deprecated, however. Make the wildcard explicit. Write | stats <function>(*) when you want a function to apply to all possible fields.

Numeric calculations

During calculations, numbers are treated as double-precision floating-point numbers, subject to all the usual behaviors of floating point numbers. If the calculation results in the floating-point special value NaN, it is represented as "nan" in your results. The special values for positive and negative infinity are represented in your results as "inf" and "-inf" respectively. Division by zero results in a null field.

There are situations where the results of a calculation contain more digits than can be represented by a floating-point number. In those situations precision might be lost on the least significant digits. For an example of how to correct this, see Example 2 of the basic examples for the sigfig(X) function.
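
The double-precision behaviors mentioned above are not specific to Splunk. As a rough illustration, Python floats use the same IEEE 754 double-precision format, so a short sketch can show the kind of precision loss to expect. The variable names are illustrative only, and nothing here reproduces how Splunk formats its results.

```python
import math

# Python floats are IEEE 754 doubles, the same format described above.
a = 0.1 + 0.2
exact = (a == 0.3)            # False: 0.1 and 0.2 are not exactly representable
close = math.isclose(a, 0.3)  # True: compare with a tolerance instead

# Above 2**53, not every integer can be represented exactly.
big = float(2**53)
lost = (big + 1.0 == big)     # True: the added 1 is lost in rounding

# The special values render much like the "nan", "inf", and "-inf" strings above.
specials = [str(float("nan")), str(float("inf")), str(float("-inf"))]
```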

Functions and memory usage

Some functions are inherently more expensive, from a memory standpoint, than other functions. For example, the distinct_count function requires far more memory than the count function. The values and list functions can also consume a lot of memory.

If you are using the distinct_count function without a split-by field or with a low-cardinality split-by field, consider replacing the distinct_count function with the estdc function (estimated distinct count). The estdc function might result in significantly lower memory usage and run times.

Memory and stats search performance

A pair of settings strike a balance between the performance of searches and the amount of memory they use during the search process, in RAM and on disk. If your searches are consistently slow to complete you can adjust these settings to improve their performance, but at the cost of increased search-time memory usage, which can lead to search failures.

If you use Splunk Cloud Platform, you need to file a Support ticket to change these settings.

For more information, see Memory and stats search performance in the Search Manual.

Event order functions

Using the first and last functions when searching based on time does not produce accurate results.

  • To locate the first value based on time order, use the earliest function instead of the first function.
  • To locate the last value based on time order, use the latest function instead of the last function.
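
The difference between order-based and time-based selection can be sketched in Python. This is a hypothetical model, not Splunk code: the events list stands for results in the order they reach the command, and the _time values and user names are invented for illustration.

```python
# Hypothetical results, listed in the order they reach the command.
# _time is an epoch timestamp in seconds; the values are made up.
events = [
    {"_time": 300, "user": "carol"},
    {"_time": 100, "user": "alice"},
    {"_time": 200, "user": "bob"},
]

# Order-based: depends only on the position in the result set.
first_user = events[0]["user"]
last_user = events[-1]["user"]

# Time-based: depends only on the timestamp, whatever the result order is.
earliest_user = min(events, key=lambda e: e["_time"])["user"]
latest_user = max(events, key=lambda e: e["_time"])["user"]
```

With this input the order-based picks are carol and bob, while the time-based picks are alice and carol, which is why the time-based functions are the safer choice whenever the result order is not guaranteed to follow time.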


When you use the stats and eventstats commands to order events based on time, replace the first and last functions with the earliest and latest functions.

Wildcards in BY clauses

The stats command does not support wildcard characters in field values in BY clauses.

For example, you cannot specify | stats count BY source*.

Renaming fields

You cannot rename one field with multiple names. For example, if you have field A, you cannot rename A as B and A as C in the same search.

Basic examples

1. Return the average transfer rate for each host

2. Search the access logs, and return the total number of hits from the top values of "referer_domain"

Search the access logs, and return the total number of hits from the top values of "referer_domain". The "top" command returns a count and percent value for each "referer_domain".

3. Calculate the average time for each hour for similar fields using wildcard characters

Return the average, for each hour, of any unique field that ends with the string "lay". For example, delay, xdelay, relay, etc.

4. Remove duplicates in the result set and return the total count for the unique results

Remove duplicates of results with the same "host" value and return the total count of the remaining results.

5. In a multivalue BY field, remove duplicate values

For each unique value of the multivalue BY field, return the average value of another field. The dedup_splitvals argument deduplicates the values in the multivalue BY field.

Extended examples

1. Compare the difference between using the stats and chart commands

This example uses the sample data from the Search Tutorial but should work with any format of Apache web access log. To try this example on your own Splunk instance, you must download the sample data and follow the instructions to get the tutorial data into Splunk. Use the time range All time when you run the search.

This search uses the stats command to count the number of events for each combination of HTTP status code value and host.

The BY clause returns one row for each distinct value in the BY clause fields. In this search, because two fields are specified in the BY clause, every unique combination of status and host is listed on a separate row.

The results appear on the Statistics tab and look something like this:

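[Table: columns status, host, and count, with one row for each combination of status and host (hosts www1, www2, and www3). The status and count values were not preserved.]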

If you click the Visualization tab, the status field forms the X-axis and the host and count fields form the data series. The problem with this chart is that the host values (www1, www2, www3) are strings and cannot be measured in a chart.

Substitute the chart command for the stats command in the search.

With the chart command, the two fields specified after the BY clause change the appearance of the results on the Statistics tab. The BY clause also makes the results suitable for display in a chart visualization.

  • The first field you specify is referred to as the <row-split> field. In the table, the values in this field become the labels for each row. In the chart, this field forms the X-axis.
  • The second field you specify is referred to as the <column-split> field. In the table, the values in this field are used as headings for each column. In the chart, this field forms the data series.
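
The row-split and column-split layout described in the two bullets above amounts to a pivot of the per-combination counts. The Python sketch below models that pivot under stated assumptions: the (status, host) pairs are invented sample data, and filling missing combinations with 0 is a choice made for this illustration.

```python
from collections import Counter

# Hypothetical (status, host) pairs, one per event.
events = [
    ("200", "www1"), ("200", "www2"), ("404", "www1"),
    ("200", "www1"), ("503", "www3"),
]

# Count every (row-split, column-split) combination, as a two-field BY would.
counts = Counter(events)

row_values = sorted({status for status, _ in events})  # row labels
col_values = sorted({host for _, host in events})      # column headings

# Pivot: one row per status, one column per host, 0 where no events matched.
table = {
    status: {host: counts.get((status, host), 0) for host in col_values}
    for status in row_values
}
```

Each key of table is a row label, and each inner dictionary holds one numeric cell per column heading, which is the shape a chart needs for its data series.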

The results appear on the Statistics tab and look something like this:

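[Table: columns status, www1, www2, and www3, with one row for each status value. Most cell values were not preserved; combinations with no events show 0.]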

If you click the Visualization tab, the status field forms the X-axis, the values in the host field form the data series, and the Y-axis shows the count.

2. Use eval expressions to count the different types of requests against each Web server

This example uses the sample data from the Search Tutorial but should work with any format of Apache web access log. To try this example on your own Splunk instance, you must download the sample data and follow the instructions to get the tutorial data into Splunk. Use the time range All time when you run the search.

This search uses the stats command to determine the number of different page requests, GET and POST, that occurred for each Web server.

This example uses eval expressions to specify the different field values for the stats command to count.

  • The first clause uses the count() function to count the Web access events that contain the method field value GET. Then, using the AS keyword, the field that represents these results is renamed GET.
  • The second clause does the same for POST events.
  • The counts of both types of events are then separated by the web server, using the BY clause with the host field.
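
The three steps above describe conditional counting grouped by a field. A minimal Python model of that logic follows. The sample events and the field names host and method are assumptions made for the sketch, and the code is not a translation of any particular search.

```python
from collections import defaultdict

# Hypothetical web access events.
events = [
    {"host": "www1", "method": "GET"},
    {"host": "www1", "method": "POST"},
    {"host": "www2", "method": "GET"},
    {"host": "www1", "method": "GET"},
]

# One output row per host, with a separate counter for each condition.
rows = defaultdict(lambda: {"GET": 0, "POST": 0})
for event in events:
    row = rows[event["host"]]
    if event["method"] == "GET":     # first clause: count only GET events
        row["GET"] += 1
    if event["method"] == "POST":    # second clause: count only POST events
        row["POST"] += 1

result = {host: dict(row) for host, row in sorted(rows.items())}
```

Each condition gets its own named column, and a host that never matched a condition keeps a count of 0 rather than disappearing from the output.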

The results appear on the Statistics tab and look something like this:

[Table: columns host, GET, and POST, with one row for each host (www1, www2, and www3). The counts were not preserved.]

You can substitute the chart command for the stats command in this search. You can then click the Visualization tab to see a chart of the results.

3. Calculate a wide range of statistics by a specific field

Count the number of earthquakes that occurred for each magnitude range

This search uses recent earthquake data downloaded from the USGS Earthquakes website. The data is a comma separated ASCII text file that contains magnitude (mag), coordinates (latitude, longitude), region (place), etc., for each earthquake recorded.

You can download a current CSV file from the USGS Earthquake Feeds and upload the file to your Splunk instance. This example uses the All Earthquakes data from the past 30 days.

This search calculates the number of earthquakes that occurred in each magnitude range. The data set comprises events over a 30-day period.

  • This search uses the span argument to define each of the ranges for the magnitude field, mag.
  • The rename command is then used to rename the mag field to "Magnitude Range".


The results appear on the Statistics tab and look something like this:

[Table: columns Magnitude Range and Number of Earthquakes, with one row for each magnitude range. The row values were not fully preserved.]

Click the Visualization tab to see the result in a chart.


Calculate aggregate statistics for the magnitudes of earthquakes in an area

Search for earthquakes in and around California. Calculate the number of earthquakes that were recorded. Use statistical functions to calculate the minimum, maximum, range (the difference between the min and max), and average magnitudes of the recent earthquakes. List the values by magnitude type.

The results appear on the Statistics tab and look something like this:

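[Table: columns magType, count, max(mag), min(mag), range(mag), and avg(mag), with one row for each magType value (H, MbLg, Md, Me, Ml, Mw, and ml). The cell values were not fully preserved.]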


Find the mean, standard deviation, and variance of the magnitudes of the recent quakes

Search for earthquakes in and around California. Calculate the number of earthquakes that were recorded. Use statistical functions to calculate the mean, standard deviation, and variance of the magnitudes for recent earthquakes. List the values by magnitude type.

The results appear on the Statistics tab and look something like this:

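[Table: columns magType, count, mean(mag), std(mag), and var(mag), with one row for each magType value (H, MbLg, Md, Me, Ml, Mw, and ml). The cell values were not fully preserved.]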

The mean values should be exactly the same as the values calculated using avg().

4. In a table display items sold by ID, type, and name and calculate the revenue for each product

This example uses the sample dataset from the Search Tutorial and a field lookup to add more information to the event data.
  • Download the data set from Add data tutorial and follow the instructions to load the tutorial data.
  • Download the CSV file from Use field lookups tutorial and follow the instructions to set up the lookup definition to add price and productName to the events.

After you configure the field lookup, you can run this search using the time range, All time.

Create a table that displays the items sold at the Buttercup Games online store by their ID, type, and name. Also, calculate the revenue for each product.

This example uses the values() function to display the corresponding type and product name values for each product ID. Then, it uses the sum() function to calculate a running total of the values of the price field.

Also, this example renames the various fields, for better display. For the stats functions, the renames are done inline with an "AS" clause. The rename command is used to change the name of the product ID field, since the stats syntax does not let you rename a split-by field.

Finally, the results are piped into an eval expression to reformat the Revenue field values so that they read as currency, with a dollar sign and commas.

This returns the following table of results:

This image shows the results on the Statistics tab. There are 14 results, organized by Product ID. There are 4 columns in the results: Product ID, Type, Product Name, and Revenue.

5. Determine how much email comes from each domain

This example uses sample email data. You should be able to run this search on any email data by replacing the sourcetype with the sourcetype value for your data and the mailfrom field with the email address field name in your data. For example, the email field might be To, From, or Cc.

Find out how much of the email in your organization comes from .com, .net, .org or other top level domains.

The eval command in this search contains two expressions, separated by a comma.

  • The first part of this search uses the eval command to break up the email address in the mailfrom field. The from_domain is defined as the portion of the mailfrom field after the @ symbol.
    • The split() function is used to break the mailfrom field into a multivalue field called accountname. The first value of accountname is everything before the "@" symbol, and the second value is everything after.
    • The mvindex() function is used to set from_domain to the second value in the multivalue field accountname.
  • The results are then piped into the stats command. The count() function is used to count the results of the eval expression.
  • The eval uses the match() function to compare the from_domain to a regular expression that looks for the different suffixes in the domain. If the value of from_domain matches the regular expression, the count is updated for each suffix, .com, .net, and .org. Other domain suffixes are counted as other.
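
The split, pick, and match steps above can be modeled in a few lines of Python. This is a sketch of the logic only. The sample addresses, the helper suffix_bucket, the variable name from_domain, and the exact regular expression are assumptions for illustration, and the sketch assumes each address contains exactly one "@".

```python
import re
from collections import Counter

# Hypothetical sender addresses.
addresses = [
    "ann@example.com", "bob@example.net", "cy@example.org",
    "dee@example.io", "eve@mail.example.com",
]

def suffix_bucket(address):
    """Return '.com', '.net', '.org', or 'other' for one address."""
    parts = address.split("@")   # [everything before "@", everything after]
    from_domain = parts[1]       # second value: the domain
    match = re.search(r"\.(com|net|org)$", from_domain)
    return "." + match.group(1) if match else "other"

# Count how many addresses fall into each suffix bucket.
counts = Counter(suffix_bucket(address) for address in addresses)
```

The result is a single set of counters, one per suffix, which corresponds to a one-row table with a column for each suffix and a final column for everything else.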

The results appear on the Statistics tab and look something like this:

[Table: a single row with columns .com, .net, .org, and other. The counts were not preserved.]

6. Search Web access logs for the total number of hits from the top 10 referring domains

This example uses the sample data from the Search Tutorial but should work with any format of Apache web access log. To try this example on your own Splunk instance, you must download the sample data and follow the instructions to get the tutorial data into Splunk. Use the time range Yesterday when you run the search.

This example searches the web access logs and returns the total number of hits from the top 10 referring domains.

This search uses the top command to find the ten most common referer domains, which are values of the referer field. Some events might use referer_domain instead of referer. The top command returns a count and percent value for each referer.

This image shows the total number of times each referrer accesses the web site.

You can then use the stats command to calculate a total for the top 10 referrer accesses.

The sum() function adds the values in the count field to produce the total number of times the top 10 referrers accessed the web site.
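
The two stages described here, ranking the most common values and then summing their counts, can be sketched in Python. The referer values are invented, and the limit is set to 3 instead of 10 only to keep the sample small.

```python
from collections import Counter

# Hypothetical referer values, one per event.
referers = ["a.com"] * 5 + ["b.com"] * 3 + ["c.com"] * 2 + ["d.com"]

LIMIT = 3  # the text above uses 10; 3 keeps this sample small

# Stage 1: the most common values with their counts, highest count first.
top = Counter(referers).most_common(LIMIT)

# Stage 2: add up the count column of the stage 1 results.
total = sum(count for _, count in top)
```

The total covers only the values that made it into the top list, so it can be lower than the overall number of events (10 of 11 in this sample).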

This image shows a single numeric value for the total.

See also

Functions
Statistical and charting functions
Commands
eventstats
rare
sistats
streamstats
top
Blogs
Getting started with stats, eventstats and streamstats
Search commands > stats, chart, and timechart
Smooth operator | Searching for multiple field values
Source: https://docs.splunk.com/Documentation/Splunk//SearchReference/Stats

I'm working on an antivirus correlation rule, and I'm running into a few issues. I want to make sure dest, signature, file_path, and file_hash are all in my notable event so I can call those variables in adaptive responses.

Below is the current search I have and it works very well as far as grouping multiple file_paths with the destination so when I call the variable, it shows them both. The issue I have is that the count always goes off of whatever the biggest field is in the row. I want to only show count for the risk_signature field.

Please see screenshot for additional information.


Edit 1:

The sum(count) by dest or by anything else changes some numbers but most stay the same. There are some weird entries per the screenshot below. The top one is the original search and the second one is the sum(count) search.


Edit 2:

I think I figured it out. If I do a dc(signature), I get a count and then I can just modify it where total_signatures > 1.
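
For anyone reading later, the logic of that approach (a distinct count of signature per dest, then a filter on total_signatures > 1) can be modeled in Python. This is only a sketch of the idea, not the search itself. The sample events and values are invented.

```python
from collections import defaultdict

# Hypothetical antivirus events.
events = [
    {"dest": "pc-01", "signature": "Trojan.A"},
    {"dest": "pc-01", "signature": "Trojan.A"},
    {"dest": "pc-01", "signature": "Worm.B"},
    {"dest": "pc-02", "signature": "Trojan.A"},
]

# Collect the distinct signatures seen on each dest.
seen = defaultdict(set)
for event in events:
    seen[event["dest"]].add(event["signature"])

# Distinct count per dest, independent of how many paths or hashes each row has.
total_signatures = {dest: len(sigs) for dest, sigs in seen.items()}

# Keep only the dests with more than one distinct signature.
flagged = {dest: n for dest, n in total_signatures.items() if n > 1}
```

Repeated events for the same signature do not inflate the number, which is the difference from a plain count.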

Source: https://community.splunk.com/t5/Splunk-Search/How-do-you-do-a-stats-count-by-a-specific-field/m-p/

Hi. Been trying to work this one out for hours. I'm close!!!

We are Splunking data such that each host has a field "SomeText", which is some arbitrary string, and that string may be repeated on that host any number of times. It may also appear on other hosts. Basically, think of something like a syslog file: your crond message can be any number of different strings.

Let's say that Host1 has the following strings:

"The quick brown fox" shows up 5 times

"jumps over the" shows up 2 times

"lazy dog" shows up 10 times

"My dog has fleas" shows up 2 times

"So does yours" also shows up 2 times

I want a chart that shows me:

Host1   10   "lazy dog"
        5    "The quick brown fox"
        2    "jumps over the"
        2    "My dog has fleas"
        2    "So does yours"

But what I GET is this:

Host1   1    "lazy dog"
        2    "The quick brown fox"
        5    "jumps over the"
             "My dog has fleas"
             "So does yours"

(I think the string column is actually sorted alphabetically).

This is a mockup of the search I'm running, with field names obviously simplified:

index=myindex earliest=h | stats count(SomeText) as textCount by SomeText host | stats values(textCount) as Count,values(SomeText) as "Text" by host

What am I missing? How can I marry up the # of times a message appears with that message?

Thanks for any ideas.
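
A likely cause, sketched in Python: values() returns the distinct values of each field in lexicographical order, so the two columns produced by the second stats are built independently and the pairing between a count and its text is lost. The model below uses the Host1 numbers from the question. It is an illustration of the behavior, not SPL.

```python
# (textCount, SomeText) rows for Host1 after the first stats, from the question.
pairs = [
    (5, "The quick brown fox"),
    (2, "jumps over the"),
    (10, "lazy dog"),
    (2, "My dog has fleas"),
    (2, "So does yours"),
]

# What a per-field distinct, lexicographically sorted list does to each column.
count_col = sorted({str(count) for count, _ in pairs})  # duplicates collapse
text_col = sorted({text for _, text in pairs})          # sorted on its own

# Keeping each pair together and sorting by count gives the wanted layout.
paired = sorted(pairs, key=lambda p: -p[0])
```

count_col ends up with three entries and text_col with five, so the rows can no longer line up. One option, offered as a suggestion only, is to drop the second stats and sort the output of the first one by host and textCount, so each count stays on the same row as its text.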

Source: https://community.splunk.com/t5/Splunk-Search/Counting-distinct-field-values-and-dislaying-count-and-value/m-p/