Prometheus plugin
The Prometheus data source enables you to query data from your Prometheus environment for use in your dashboards, analytics and monitoring.
The data source supports the following types of data streams:
- PromQL
- Simple query (an easy-to-use alternative to PromQL for common queries)
- Alerts
- Status of the Prometheus instance
In addition to data streams, the data source can index labels in your Prometheus database as objects in SquaredUp. Indexing objects makes it easier to query data, correlate Prometheus data with other data sources, and explore the data interactively. Object indexing is explained further in the Object Indexing section.
The Prometheus data source is available as a Cloud data source and also as an On-prem data source. Use the On-prem data source if your Prometheus instance is on a private network and is not publicly accessible.
Object Indexing
Prometheus time series are described by a set of labels. These labels can be indexed by SquaredUp to make it easier to query, explore and correlate metrics.
The indexed labels are stored in SquaredUp as objects. For example, when (at a minimum) the job label is indexed, an object is created for each unique value, such as prometheus-local.
If Index Open Telemetry Semantic Convention labels is selected when you Add a Prometheus data source to SquaredUp, then additional labels that conform to the Open Telemetry Semantic Conventions for resources are also indexed. For example, the k8s_pod_name label will be indexed as Kubernetes Pod objects. See https://opentelemetry.io/docs/specs/semconv/resource/ for more examples of resource labels.
These objects can then be selected in the tile editor when using the object-aware version of the PromQL and Simple Query data streams. When selected, the data streams are automatically scoped to the required label values, so there is no need to create "label matcher" expressions in PromQL.
They can also be viewed in the map, and you can even configure custom correlations between Prometheus objects and objects from other data sources, for example between Kubernetes Pod objects from Prometheus and AWS EKS Kubernetes Pod objects.
To index custom labels that are not automatically indexed with Open Telemetry Semantic Conventions, use the Index custom labels option when you Add a Prometheus data source to SquaredUp.
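As a rough illustration of the indexing described above, the following Python sketch derives one object per unique value of each indexed label. The object type name "Prometheus Job", the sample series, and the function name are assumptions made for this example; they are not taken from the plugin.

```python
from collections import defaultdict

# Hypothetical mapping of indexed label name -> object type.
# "job" is the minimum indexed label; the other entries stand in for a
# Semantic Conventions label and a custom index.
INDEXED_LABELS = {
    "job": "Prometheus Job",
    "k8s_pod_name": "Kubernetes Pod",
    "app_instance": "App Instance",
}


def index_objects(series_labels, indexed_labels=INDEXED_LABELS):
    """Return {object type: sorted unique label values} for the given series."""
    objects = defaultdict(set)
    for labels in series_labels:
        for label, object_type in indexed_labels.items():
            if label in labels:
                objects[object_type].add(labels[label])
    return {object_type: sorted(values) for object_type, values in objects.items()}


# Illustrative label sets, as they might appear on three time series.
series = [
    {"__name__": "up", "job": "prometheus-local", "instance": "localhost:9090"},
    {"__name__": "up", "job": "node", "k8s_pod_name": "web-1"},
    {"__name__": "up", "job": "node", "k8s_pod_name": "web-2"},
]
objects = index_objects(series)
```

Labels that never appear on any series, such as app_instance in this sample, produce no objects.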
To add a data source click on the + next to Data Sources on the left-hand menu in SquaredUp. Search for the data source and click on it to open the Configure data source page.
Before you start
Prerequisite for the On-prem version
The On-prem version requires an agent installed on a host that has access to your private network.
If you have already created an agent in SquaredUp that you can use for this data source, you can skip this step and choose the agent group you want to use while adding the data source.
Configuring and deploying an agent
If you have already created an agent in SquaredUp that you can use for this data source, you can skip this step and choose the agent group you want to use while Configuring the data source.
See one of the following, depending on your platform type:
Configuring the data source
- Display Name: Enter a display name for the data source. If connecting to multiple Prometheus instances, you may want to give this data source a more descriptive name such as Prometheus Prod.
- Agent Group: If configuring the On-prem version, select the Agent Group from the list.
- Prometheus URL: Enter the URL of your Prometheus instance. Use the On-prem version to access a Prometheus instance that is not publicly accessible.
- Authentication Type: Select the authentication type used to access your server:
- None: No authentication is required.
- API Token: You must enter the API authentication token in the API Token field.
- Basic Authentication: You must enter a Username and Password in the corresponding fields.
- Ignore Certificate errors: If you activate this checkbox, the data source will ignore certificate errors when accessing the server. This is useful if you have self-signed certificates.
- Import Semantic Conventions compatible labels: Select whether to index Open Telemetry Semantic Conventions labels. For example, the k8s_pod_name label will be indexed as Kubernetes Pod objects.
- Enable object indexing: Select whether to index labels as objects. Indexing labels as objects makes it easier to correlate, explore and query your Prometheus data (see Object indexing). Selecting this option will, at a minimum, index Prometheus jobs (the "job" label). Selecting this option reveals additional indexing options.
- Index Open Telemetry labels: Select whether to index Open Telemetry Semantic Conventions labels. For example, the k8s_pod_name label will be indexed as Kubernetes Pod objects. For more information on Open Telemetry Semantic Conventions, see https://opentelemetry.io/docs/specs/semconv/resource/.
- Index custom labels: Select whether to index custom labels. These are additional labels that will be indexed as objects in SquaredUp.
- Click Add a custom index.
- Enter one or more labels to index as an object, for example app_instance. Use commas to separate multiple labels, for example app_instance_namespace, app_instance_id.
- Enter the type of object that will be created by the index, for example App Instance.
- Click Add another custom index to add further indexes.
- Restrict access to this data source: You can enable this option if you only want certain users or groups to have access to the data source, or the permission to link it to new workspaces. See data source access control for more information.
The term data source here really means data source instance. For example, a user may configure two instances of the AWS data source, one for their development environment and one for production. In that case, each data source instance has its own access control settings.
By default, Restrict access to this data source is set to off. The data source can be viewed, edited and administered by anyone. If you would like to control who has access to this data source, switch Restrict access to this data source to on.
Use the Restrict access to this data source dropdown to control who has access to the data source:
- By default, the user setting the permissions for the data source will be given Full Control and the Everyone group will be given Link to workspace permissions.
- Tailor access to the data source, as required, by selecting individual users or user groups from the dropdown and giving them Link to workspace or Full Control permissions.
- If the user is not available from the dropdown, you are able to invite them to the data source by typing in their email address and then clicking Add. The new user will then receive an email inviting them to create an account on SquaredUp. Once the account has been created, they will gain access to the organization.
- At least one user or group must be given Full Control.
- Admin users can edit the configuration, modify the Access Control List (ACL) and delete the data source, regardless of the ACL chosen.
See Access control for more information.
- Click Test and add to validate the data source configuration.
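Before clicking Test and add, it can be useful to check the URL and credentials from a machine on the same network. The sketch below builds such a request with the Python standard library. It assumes the API token is sent as a Bearer token, which is a common convention but is not confirmed by this documentation, and the host name and credentials are placeholders. The /api/v1/status/buildinfo path is part of the Prometheus HTTP API.

```python
import base64
import ssl
import urllib.request


def build_request(base_url, auth_type="none", token=None, username=None, password=None):
    """Build (but do not send) a request to the Prometheus build info endpoint."""
    url = base_url.rstrip("/") + "/api/v1/status/buildinfo"
    request = urllib.request.Request(url)
    if auth_type == "token":
        # Assumption: the token is presented as an HTTP Bearer token.
        request.add_header("Authorization", f"Bearer {token}")
    elif auth_type == "basic":
        credentials = base64.b64encode(f"{username}:{password}".encode()).decode()
        request.add_header("Authorization", f"Basic {credentials}")
    return request


def build_ssl_context(ignore_certificate_errors=False):
    """Create a TLS context, optionally accepting self-signed certificates."""
    context = ssl.create_default_context()
    if ignore_certificate_errors:
        # Hostname checking must be disabled before verification is turned off.
        context.check_hostname = False
        context.verify_mode = ssl.CERT_NONE
    return context


# Placeholder host and credentials, for illustration only.
request = build_request(
    "https://prometheus.example.com/", "basic", username="user", password="pass"
)
context = build_ssl_context(ignore_certificate_errors=True)
# To send it: urllib.request.urlopen(request, context=context, timeout=10)
```

Disabling certificate verification is only appropriate for trusted hosts with self-signed certificates.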
Next steps
PromQL vs Simple Query
PromQL is commonly used for querying Prometheus, and many tutorials and example queries are available. However, PromQL can be difficult to get right, even for experienced users.
The Simple Query data streams offer an alternative, easier-to-use method of querying Prometheus data for common queries.
Simple Query is the recommended data stream for users who are not experienced with PromQL.
Global vs object-aware
The PromQL and Simple Query data streams both have global and object-aware versions. Object-aware versions use the objects indexed from the database labels to make it easier to scope your queries to the required labels. For example, an object-aware simple query can be scoped to Kubernetes Nodes by browsing and selecting the nodes during the Objects step of the tile editor.
Object-aware versions are recommended.
Using object-aware data streams requires "Enable object indexing" to be configured.
Query results
With the exception of the Alerts data stream, all data streams return data in a tabular format, with the following columns:
- Timestamp: timestamp of the metric value. If single values are requested (typically an ‘instant’ query), the timestamp should be the current time. For ‘range’ queries, the timestamps will be at regular intervals across the configured timeframe.
- Value: the metric value. This is displayed with 5 decimal places by default. Use the Columns step of the tile editor to change the number of decimal places.
- Auto Label: a column automatically created by the data stream to be used as a series label on graphs and other visualizations. It is constructed from the labels that uniquely identify a distinct time series in the returned data. This column is not stored in Prometheus and cannot be used for aggregations and filtering.
- Labels: the remaining columns are all the labels for the Prometheus time series.
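One way to picture the Auto Label column is as the set of labels whose values differ between the returned series. The Python sketch below follows that idea; the "name=value" output format and the function name are assumptions for illustration, and the plugin's actual format may differ.

```python
def auto_labels(series_labels):
    """Build one display label per series from the labels whose values differ."""
    names = sorted({name for labels in series_labels for name in labels})
    # A label distinguishes series only if it takes more than one value.
    varying = [
        name
        for name in names
        if len({labels.get(name) for labels in series_labels}) > 1
    ]
    return [
        ", ".join(f"{name}={labels.get(name, '')}" for name in varying)
        for labels in series_labels
    ]


# Illustrative series: "job" is constant, so it is left out of the label.
series = [
    {"job": "node", "instance": "a:9100", "mode": "idle"},
    {"job": "node", "instance": "a:9100", "mode": "user"},
    {"job": "node", "instance": "b:9100", "mode": "idle"},
]
labels = auto_labels(series)
```

Because the label is derived from the returned data, the same series can receive a different Auto Label when the query returns a different set of series.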
Troubleshooting query timeouts
Queries commonly time out if they attempt to return a large dataset. To reduce the chance of timeouts, follow these best practices:
- Scope queries to only the required labels. Use object selections (with object-aware data streams) and label filters to filter the query to the data of interest.
- Aggregate by labels. A common cause of timeouts is a query that attempts to return a large number of unique time series. For example, a query for an HTTP response time may try to return a time series for each unique client IP address. Aggregating across client IP addresses, so that the client IP label is dropped from the result, will reduce the number of time series returned.
- Increase the time interval. If queries are still causing timeouts, try increasing the time interval to reduce the number of data points returned.
If the query causes a timeout and you are not sure why, try changing the query to an "instant" or "single value" query that returns just the latest data point for each time series. Use this to filter and aggregate labels to reduce the number of time series returned by the query.
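To see why these practices help, it can be useful to estimate how many data points a range query returns: roughly the number of time series multiplied by the number of evaluation steps in the timeframe. The Python sketch below does that arithmetic; the series counts are made-up figures for illustration.

```python
def estimated_points(series_count, timeframe_seconds, interval_seconds):
    """Approximate number of data points returned by a range query."""
    # A range query is evaluated at the start time and then once per interval.
    steps = timeframe_seconds // interval_seconds + 1
    return series_count * steps


SEVEN_DAYS = 7 * 24 * 3600

# 5,000 unique client IPs over 7 days at a 60 second interval.
unaggregated = estimated_points(5000, SEVEN_DAYS, 60)
# One aggregated series over the same timeframe at a 1 hour interval.
aggregated = estimated_points(1, SEVEN_DAYS, 3600)
```

In this example, aggregating the series and widening the interval reduces the result from about 50 million points to 169.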
Data streams
The following data streams are installed with this plugin.
Data streams standardize data from all the different shapes and formats your tools use into a straightforward tabular format.
While creating a tile you can tweak data streams by grouping or aggregating specific columns.
Depending on the kind of data, SquaredUp will automatically suggest how to visualize the result, for example as a table or line graph.
Data streams can be either global or scoped:
- Global data streams are unscoped and return information of a general nature (e.g. "Get the current number of unused hosts").
- A scoped data stream gets information relevant to the specific set of objects supplied in the tile scope (e.g. "Get the current session count for these hosts").
See Data Streams for more information.
- In the tile editor, filter by the Prometheus data source, select the PromQL data stream and then click Next.
You can either select the object-aware PromQL data stream or the global PromQL data stream.
- If you selected the object-aware PromQL data stream, select the objects you want to use and then click Next.
- Enter a Query, for example:
sum by (instance) (rate(kubedns_probe_dnsmasq_latency_ms_sum[1m])) / sum by (instance) (rate(kubedns_probe_dnsmasq_latency_ms_count[1m]))
If using the object-aware version, label matchers that scope the query to the selected objects can be substituted using {{{matchers}}}. For example:
rate(rpc_server_duration_milliseconds_sum{ {{{matchers}}} }[5m]) / rate(rpc_server_duration_milliseconds_count{ {{{matchers}}} }[5m])
Mustache parameters are supported when scoped to an object. As Prometheus uses curly brackets in PromQL, an additional space is needed before and after the brackets of a mustache parameter. You can see an example query above.
A mustache parameter is a dynamic value; the actual value is inserted in place of the field in curly braces. For example, {{timeframe.start}} inserts the start time based on the timeframe configured within the tile, and {{name}} inserts the name of the object(s) in scope.
This data stream supplies scoped objects individually for mustache parameters. When there are multiple objects in scope, this data source will send the query multiple times, once for each object. The results are then displayed together, for example in a single table.
You can use properties of objects as mustache parameters by writing them between curly braces, for example {{name}}. Whenever you use mustache parameters, you need to use a scope of objects that contain the property you're referencing.
For example, if objects of type "host" have a property called name, you can use {{name}}. This will resolve {{name}} to the value of the name property of the different "host" objects used in the scope.
- Select the query type. Typically, Range queries are used for graphing results but some Instant queries can also return results over a time range:
- Range: Evaluates over the timeframe at regular intervals.
- Instant: Evaluates once. Typically this will be at the current time, but may be a past evaluation time if the timeframe is set to the past (for example, last month).
- If Range is selected, select the time interval for the Range query:
- Automatic: The interval is selected automatically based on the timeframe. For example, 60 seconds for a timeframe of the last hour, and 1 hour for a timeframe of the last 7 days.
- Custom: Specify a custom interval. The interval can be specified as the number of seconds, or a Prometheus time duration (a number followed by s, m, h, d, w or y for seconds, minutes, hours, days, weeks or years).
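The {{{matchers}}} substitution used in the Query step can be pictured with the Python sketch below. It is a simplified model, not the plugin's implementation: the function names are hypothetical, the exact matcher text SquaredUp generates is an assumption, and label values are assumed to contain no regular expression metacharacters.

```python
def build_matchers(selected):
    """Turn {label: [values]} from the selected objects into PromQL label matchers."""
    parts = []
    for label, values in sorted(selected.items()):
        if len(values) == 1:
            parts.append(f'{label}="{values[0]}"')
        else:
            # Several values for one label become a regex alternation.
            alternation = "|".join(values)
            parts.append(f'{label}=~"{alternation}"')
    return ",".join(parts)


def render(query, matchers):
    """Replace the {{{matchers}}} placeholder with the generated matchers."""
    return query.replace("{{{matchers}}}", matchers)


# The spaces around the placeholder keep it apart from the PromQL braces.
query = "rate(rpc_server_duration_milliseconds_sum{ {{{matchers}}} }[5m])"
matchers = build_matchers({"k8s_pod_name": ["web-1", "web-2"], "job": ["node"]})
rendered = render(query, matchers)
```

The rendered query is valid PromQL because whitespace inside the selector braces is ignored.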
- In the tile editor, filter by the Prometheus data source, select the Simple Query data stream and then click Next.
You can either select the object-aware Simple Query data stream or the global Simple Query data stream. The object-aware version is recommended.
- If you selected the object-aware data stream, select the objects you want to scope to and then click Next.
- Metric:
Select a metric. If the object-aware data stream is used, the available metrics for the objects selected are listed, otherwise all metrics are listed. The list shows the metric name followed by the unit and metric type in brackets. Underscores and "total" are removed from the underlying Prometheus metric names.
- The other options depend on the type of metric selected. Prometheus supports four types of metrics: counter, gauge, histogram and summary.
Counter:
- Select the type of query:
- Total count: this uses the PromQL increase function
- Rate per second over time (typically used for graphing): this uses the PromQL rate function
- Select the label to optionally filter by, and the values to filter. The values will be logical ORed (the label may match any of the values). Regex expressions are supported.
- Select the type of label aggregation:
- None (default)
- Average
- Minimum
- Maximum
- Sum
- Top
- Bottom
- If Top or Bottom are selected, specify the number of results to return
- Select the labels by which you want to aggregate
- None: Aggregates over all labels
- Selected labels: Aggregates by the specific labels
- All labels except selected: Aggregates by all labels except the specified labels
- If Selected labels or All labels except selected is selected, specify the labels for the aggregation
- If Rate per second over time is selected as the query type, select the time interval for the Range query:
- Automatic: The interval is selected automatically based on the timeframe. For example, 60 seconds for a timeframe of the last hour, and 1 hour for a timeframe of the last 7 days.
- Custom: Specify a custom interval. The interval can be specified as the number of seconds, or a Prometheus time duration (a number followed by s, m, h, d, w or y for seconds, minutes, hours, days, weeks or years).
Gauge:
- Select the type of query:
- Single value: Equivalent to a PromQL instant query.
- Values over time: Equivalent to a PromQL range query.
- Select the time-based aggregation. For a single value query this is evaluated over the whole timeframe. For a values over time query this is evaluated over each time interval (step).
- Latest (most recent, or last value in the timeframe or interval)
- Average
- Minimum
- Maximum
- Select the label to optionally filter by, and the values to filter. The values will be logical ORed (the label may match any of the values). Regex expressions are supported.
- Select the type of label aggregation:
- None (default)
- Average
- Minimum
- Maximum
- Sum
- Top
- Bottom
- If Top or Bottom are selected, specify the number of results to return
- Select the labels by which you want to aggregate
- None: Aggregates over all labels
- Selected labels: Aggregates by the specific labels
- All labels except selected: Aggregates by all labels except the specified labels.
- If Selected labels or All labels except selected is selected, specify the labels for the aggregation.
- If Values over time is selected as the query type, select the time interval for the Range query:
- Automatic: The interval is selected automatically based on the timeframe. For example, 60 seconds for a timeframe of the last hour, and 1 hour for a timeframe of the last 7 days.
- Custom: Specify a custom interval. The interval can be specified as the number of seconds, or a Prometheus time duration (a number followed by s, m, h, d, w or y for seconds, minutes, hours, days, weeks or years).
Histogram:
- Select the type of query:
- Total count of samples
- Rate per second of samples over time
- Sample values over time
- If Sample values over time is selected as the query type, choose how to aggregate the individual sample values over each time interval (step):
- Average
- Quantile (a quantile determines the value for which a specified percentage of samples are within).
- If Quantile is selected, specify the quantile percentage. For example, 95% will calculate the value that 95% of samples are within. The value calculated by Prometheus may be approximate.
- If Quantile is selected, select which labels to aggregate the histograms over.
- Select the label to optionally filter by, and the values to filter. The values will be logical ORed (the label may match any of the values). Regex expressions are supported.
- If Total count of samples or Rate per second of samples over time is selected as the query type, select the type of label aggregation:
- None (default)
- Average
- Minimum
- Maximum
- Sum
- Top
- Bottom
- If Top or Bottom are selected, specify the number of results to return.
- Select the labels by which you want to aggregate:
- None: Aggregates over all labels
- Selected labels: Aggregates by the specific labels (see below)
- All labels except selected: Aggregates by all labels except the specified labels (see below)
- If Selected labels or All labels except selected is selected, specify the labels for the aggregation.
- If Rate per second of samples over time or Sample values over time is selected as the query type, select the time interval for the Range query:
- Automatic: The interval is selected automatically based on the timeframe. For example, 60 seconds for a timeframe of the last hour, and 1 hour for a timeframe of the last 7 days.
- Custom: Specify a custom interval. The interval can be specified as the number of seconds, or a Prometheus time duration (a number followed by s, m, h, d, w or y for seconds, minutes, hours, days, weeks or years).
Summary:
- Select the type of query:
- Total count of samples
- Rate per second of samples over time
- Sample values over time
- If Sample values over time is selected, choose how to aggregate the individual sample values over each time interval (step):
- Average
- Quantile (a quantile determines the value for which a specified percentage of samples are within.)
- If Quantile is selected, specify the quantile percentage. For example, 95% will calculate the value that 95% of samples are within. Available quantile percentages are determined by the metric source (scrape target).
- Select the label to optionally filter by, and the values to filter. The values will be logical ORed (the label may match any of the values). Regex expressions are supported.
- If Total count of samples or Rate per second of samples over time is selected as the query type, select the type of label aggregation:
- None (default)
- Average
- Minimum
- Maximum
- Sum
- Top
- Bottom
- If Top or Bottom are selected, specify the number of results to return
- Select the labels by which you want to aggregate:
- None: Aggregates over all labels
- Selected labels: Aggregates by the specific labels (see below)
- All labels except selected: Aggregates by all labels except the specified labels (see below)
- If Selected labels or All labels except selected is selected, specify the labels for the aggregation.
- If Rate per second of samples over time or Sample values over time is selected as the query type, select the time interval for the Range query:
- Automatic: The interval is selected automatically based on the timeframe. For example, 60 seconds for a timeframe of the last hour, and 1 hour for a timeframe of the last 7 days.
- Custom: Specify a custom interval. The interval can be specified as the number of seconds, or a Prometheus time duration (a number followed by s, m, h, d, w or y for seconds, minutes, hours, days, weeks or years).
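For readers who want to relate the Simple Query options to PromQL, the Python sketch below shows how the counter options might translate into a query string. The text above only states that Total count uses increase and Rate per second uses rate; the rest of the mapping, the function name, and the metric name are assumptions for illustration.

```python
def counter_query(metric, query_type, window, filter_label=None,
                  filter_values=(), aggregation=None, by_labels=()):
    """Compose a PromQL expression from Simple Query style options for a counter."""
    selector = metric
    if filter_label and filter_values:
        # Filter values are ORed together, expressed here as a regex alternation.
        alternation = "|".join(filter_values)
        selector += "{" + f'{filter_label}=~"{alternation}"' + "}"
    # "total" maps to increase(); anything else is treated as rate per second.
    function = "increase" if query_type == "total" else "rate"
    expression = f"{function}({selector}[{window}])"
    if aggregation:
        grouping = f" by ({', '.join(by_labels)})" if by_labels else ""
        expression = f"{aggregation}{grouping} ({expression})"
    return expression


# Hypothetical metric and labels, for illustration only.
rate_by_job = counter_query(
    "http_requests_total", "rate", "5m",
    filter_label="status", filter_values=["500", "503"],
    aggregation="sum", by_labels=["job"],
)
total = counter_query("http_requests_total", "total", "1h")
```

Seeing the equivalent PromQL can help when moving a Simple Query tile to the PromQL data stream for more control.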
The Alerts data stream returns all active alerts. This data stream does not have any parameters, but you can use the Shaping step of the tile editor to optionally filter and aggregate.
The Status data streams provide data about the Prometheus instance:
- Status / Targets: The health of the Prometheus targets.
- Status / TSDB: The health of the Prometheus Time Series Database.
- Status / Runtime: The runtime info of the Prometheus instance.
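The Alerts and Status data streams resemble endpoints of the Prometheus HTTP API, which can be useful when checking the same information directly against Prometheus. The paths below are real Prometheus API paths, but pairing them with these data stream names is an assumption, as is the helper function.

```python
from urllib.parse import urljoin

# Prometheus HTTP API paths; the mapping to data stream names is an assumption.
STATUS_ENDPOINTS = {
    "Alerts": "api/v1/alerts",
    "Status / Targets": "api/v1/targets",
    "Status / TSDB": "api/v1/status/tsdb",
    "Status / Runtime": "api/v1/status/runtimeinfo",
}


def endpoint_url(base_url, stream):
    """Return the full URL of the API endpoint assumed to back a data stream."""
    # A trailing slash keeps any path prefix in the base URL when joining.
    return urljoin(base_url.rstrip("/") + "/", STATUS_ENDPOINTS[stream])


targets_url = endpoint_url("http://localhost:9090", "Status / Targets")
runtime_url = endpoint_url("https://example.com/prometheus/", "Status / Runtime")
```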