Dashboards reference
This document contains a complete reference on Sourcegraph's available dashboards, as well as details on how to interpret the panels and metrics.
To learn more about Sourcegraph's metrics and how to view these dashboards, see our metrics guide.
Frontend
Serves all end-user browser and API requests.
To see this dashboard, visit /-/debug/grafana/d/frontend/frontend on your Sourcegraph instance.
Frontend: Search at a glance
frontend: 99th_percentile_search_request_duration
99th percentile successful search request duration over 5m
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=100000 on your Sourcegraph instance.
Technical details
Query:
histogram_quantile(0.99, sum by (le)(rate(src_search_streaming_latency_seconds_bucket{source="browser"}[5m])))
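This query computes a percentile from a Prometheus histogram: the le label carries the cumulative bucket upper bounds, and histogram_quantile interpolates the 99th percentile from the per-second rate of bucket increments over the 5m window. As an illustrative variation (not part of the dashboard), a query like the following could be run in the Grafana Explore view to look at the median of the same latency histogram:

# hypothetical variation: 50th percentile (median) of the same streaming search latency histogram
histogram_quantile(0.50, sum by (le)(rate(src_search_streaming_latency_seconds_bucket{source="browser"}[5m])))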
frontend: 90th_percentile_search_request_duration
90th percentile successful search request duration over 5m
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=100001 on your Sourcegraph instance.
Technical details
Query:
histogram_quantile(0.90, sum by (le)(rate(src_search_streaming_latency_seconds_bucket{source="browser"}[5m])))
frontend: timeout_search_responses
Timeout search responses every 5m
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=100010 on your Sourcegraph instance.
Technical details
Query:
sum(increase(src_search_streaming_response{status=~"timeout|partial_timeout",source="browser"}[5m])) / sum(increase(src_search_streaming_response{source="browser"}[5m])) * 100
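The pattern here divides the 5m increase of timed-out responses by the 5m increase of all browser search responses and multiplies by 100, so the panel reads as a percentage. To see the raw counts behind the ratio, a sketch like the following (not a dashboard panel) shows the same counter broken out by status:

# hypothetical drill-down: raw count of browser search responses by status over 5m
sum by (status)(increase(src_search_streaming_response{source="browser"}[5m]))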
frontend: hard_error_search_responses
Hard error search responses every 5m
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=100011 on your Sourcegraph instance.
Technical details
Query:
sum(increase(src_search_streaming_response{status="error",source="browser"}[5m])) / sum(increase(src_search_streaming_response{source="browser"}[5m])) * 100
frontend: search_no_results
Searches with no results every 5m
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=100012 on your Sourcegraph instance.
Technical details
Query:
sum(increase(src_search_streaming_response{status="no_results",source="browser"}[5m])) / sum(increase(src_search_streaming_response{source="browser"}[5m])) * 100
frontend: search_alert_user_suggestions
Search alert user suggestions shown every 5m
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=100013 on your Sourcegraph instance.
Technical details
Query:
sum by (alert_type)(increase(src_search_streaming_response{status="alert",alert_type!~"timed_out",source="browser"}[5m])) / ignoring(alert_type) group_left sum(increase(src_search_streaming_response{source="browser"}[5m])) * 100
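In this query the numerator keeps an alert_type label while the denominator has none, so the division uses ignoring(alert_type) group_left to match each per-alert-type series against the single aggregate total. The same vector-matching idiom can be applied to any labeled/unlabeled counter pair, for example (a hypothetical drill-down, not a dashboard panel):

# hypothetical example of the same ignoring()/group_left matching:
# share of browser streaming responses per status, divided by the unlabeled total
sum by (status)(increase(src_search_streaming_response{source="browser"}[5m])) / ignoring(status) group_left sum(increase(src_search_streaming_response{source="browser"}[5m])) * 100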
frontend: page_load_latency
90th percentile page load latency over all routes over 10m
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=100020 on your Sourcegraph instance.
Technical details
Query:
histogram_quantile(0.9, sum by(le) (rate(src_http_request_duration_seconds_bucket{route!="raw",route!="blob",route!~"graphql.*"}[10m])))
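The route filters exclude the raw, blob, and graphql* routes so that file downloads and API traffic do not skew the page-load percentile. To inspect a single route in isolation, a variation such as the following could be used (a sketch; the route value "search" is only a hypothetical example and may not match your instance's route labels):

# hypothetical: 90th percentile request duration for one route only
histogram_quantile(0.9, sum by (le)(rate(src_http_request_duration_seconds_bucket{route="search"}[10m])))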
Frontend: Search-based code intelligence at a glance
frontend: 99th_percentile_search_codeintel_request_duration
99th percentile code-intel successful search request duration over 5m
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=100100 on your Sourcegraph instance.
Technical details
Query:
histogram_quantile(0.99, sum by (le)(rate(src_graphql_field_seconds_bucket{type="Search",field="results",error="false",source="browser",request_name="CodeIntelSearch"}[5m])))
frontend: 90th_percentile_search_codeintel_request_duration
90th percentile code-intel successful search request duration over 5m
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=100101 on your Sourcegraph instance.
Technical details
Query:
histogram_quantile(0.90, sum by (le)(rate(src_graphql_field_seconds_bucket{type="Search",field="results",error="false",source="browser",request_name="CodeIntelSearch"}[5m])))
frontend: hard_timeout_search_codeintel_responses
Hard timeout search code-intel responses every 5m
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=100110 on your Sourcegraph instance.
Technical details
Query:
(sum(increase(src_graphql_search_response{status="timeout",source="browser",request_name="CodeIntelSearch"}[5m])) + sum(increase(src_graphql_search_response{status="alert",alert_type="timed_out",source="browser",request_name="CodeIntelSearch"}[5m]))) / sum(increase(src_graphql_search_response{source="browser",request_name="CodeIntelSearch"}[5m])) * 100
frontend: hard_error_search_codeintel_responses
Hard error search code-intel responses every 5m
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=100111 on your Sourcegraph instance.
Technical details
Query:
sum by (status)(increase(src_graphql_search_response{status=~"error",source="browser",request_name="CodeIntelSearch"}[5m])) / ignoring(status) group_left sum(increase(src_graphql_search_response{source="browser",request_name="CodeIntelSearch"}[5m])) * 100
frontend: partial_timeout_search_codeintel_responses
Partial timeout search code-intel responses every 5m
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=100112 on your Sourcegraph instance.
Technical details
Query:
sum by (status)(increase(src_graphql_search_response{status="partial_timeout",source="browser",request_name="CodeIntelSearch"}[5m])) / ignoring(status) group_left sum(increase(src_graphql_search_response{status="partial_timeout",source="browser",request_name="CodeIntelSearch"}[5m])) * 100
frontend: search_codeintel_alert_user_suggestions
Search code-intel alert user suggestions shown every 5m
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=100113 on your Sourcegraph instance.
Technical details
Query:
sum by (alert_type)(increase(src_graphql_search_response{status="alert",alert_type!~"timed_out",source="browser",request_name="CodeIntelSearch"}[5m])) / ignoring(alert_type) group_left sum(increase(src_graphql_search_response{source="browser",request_name="CodeIntelSearch"}[5m])) * 100
Frontend: Search GraphQL API usage at a glance
frontend: 99th_percentile_search_api_request_duration
99th percentile successful search API request duration over 5m
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=100200 on your Sourcegraph instance.
Technical details
Query:
histogram_quantile(0.99, sum by (le)(rate(src_graphql_field_seconds_bucket{type="Search",field="results",error="false",source="other"}[5m])))
frontend: 90th_percentile_search_api_request_duration
90th percentile successful search API request duration over 5m
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=100201 on your Sourcegraph instance.
Technical details
Query:
histogram_quantile(0.90, sum by (le)(rate(src_graphql_field_seconds_bucket{type="Search",field="results",error="false",source="other"}[5m])))
frontend: hard_error_search_api_responses
Hard error search API responses every 5m
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=100210 on your Sourcegraph instance.
Technical details
Query:
sum by (status)(increase(src_graphql_search_response{status=~"error",source="other"}[5m])) / ignoring(status) group_left sum(increase(src_graphql_search_response{source="other"}[5m]))
frontend: partial_timeout_search_api_responses
Partial timeout search API responses every 5m
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=100211 on your Sourcegraph instance.
Technical details
Query:
sum(increase(src_graphql_search_response{status="partial_timeout",source="other"}[5m])) / sum(increase(src_graphql_search_response{source="other"}[5m]))
frontend: search_api_alert_user_suggestions
Search API alert user suggestions shown every 5m
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=100212 on your Sourcegraph instance.
Technical details
Query:
sum by (alert_type)(increase(src_graphql_search_response{status="alert",alert_type!~"timed_out",source="other"}[5m])) / ignoring(alert_type) group_left sum(increase(src_graphql_search_response{status="alert",source="other"}[5m]))
Frontend: Site configuration client update latency
frontend: frontend_site_configuration_duration_since_last_successful_update_by_instance
Duration since last successful site configuration update (by instance)
The duration since the configuration client used by the "frontend" service last successfully updated its site configuration. Long durations could indicate issues updating the site configuration.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=100300 on your Sourcegraph instance.
Technical details
Query:
src_conf_client_time_since_last_successful_update_seconds{job=~`(sourcegraph-)?frontend`,instance=~`${internalInstance:regex}`}
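The ${internalInstance:regex} placeholder is a Grafana dashboard variable (using the :regex format option) that Grafana substitutes before sending the query to Prometheus; it is not valid PromQL on its own. If you copy a query from this page directly into the Prometheus UI, replace the variable with a concrete regex, for example (the instance name below is hypothetical):

# hypothetical: same query with the Grafana variable replaced by a literal instance regex
src_conf_client_time_since_last_successful_update_seconds{job=~"(sourcegraph-)?frontend",instance=~"sourcegraph-frontend-0.*"}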
frontend: frontend_site_configuration_duration_since_last_successful_update_by_instance
Maximum duration since last successful site configuration update (all "frontend" instances)
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=100301 on your Sourcegraph instance.
Technical details
Query:
max(max_over_time(src_conf_client_time_since_last_successful_update_seconds{job=~`(sourcegraph-)?frontend`,instance=~`${internalInstance:regex}`}[1m]))
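Here max_over_time(...[1m]) takes the worst value each series reported in the last minute, and the outer max collapses that to the single worst "frontend" instance. If you prefer a per-instance view of the same data, a small variation would be (a sketch, with the Grafana instance variable dropped):

# hypothetical: worst value per instance instead of a single overall maximum
max by (instance)(max_over_time(src_conf_client_time_since_last_successful_update_seconds{job=~"(sourcegraph-)?frontend"}[1m]))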
Frontend: Codeintel: Precise code intelligence usage at a glance
frontend: codeintel_resolvers_total
Aggregate graphql operations every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=100400 on your Sourcegraph instance.
Technical details
Query:
sum(increase(src_codeintel_resolvers_total{job=~"^(frontend|sourcegraph-frontend).*"}[5m]))
frontend: codeintel_resolvers_99th_percentile_duration
Aggregate successful graphql operation duration distribution over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=100401 on your Sourcegraph instance.
Technical details
Query:
sum by (le)(rate(src_codeintel_resolvers_duration_seconds_bucket{job=~"^(frontend|sourcegraph-frontend).*"}[5m]))
frontend: codeintel_resolvers_errors_total
Aggregate graphql operation errors every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=100402 on your Sourcegraph instance.
Technical details
Query:
sum(increase(src_codeintel_resolvers_errors_total{job=~"^(frontend|sourcegraph-frontend).*"}[5m]))
frontend: codeintel_resolvers_error_rate
Aggregate graphql operation error rate over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=100403 on your Sourcegraph instance.
Technical details
Query:
sum(increase(src_codeintel_resolvers_errors_total{job=~"^(frontend|sourcegraph-frontend).*"}[5m])) / (sum(increase(src_codeintel_resolvers_total{job=~"^(frontend|sourcegraph-frontend).*"}[5m])) + sum(increase(src_codeintel_resolvers_errors_total{job=~"^(frontend|sourcegraph-frontend).*"}[5m]))) * 100
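All of the "error rate" panels on this page share the same shape: the 5m increase of the errors counter divided by the sum of the two counters' increases, times 100. As a worked example, if the errors counter grew by 5 and the operations counter by 95 over the window, the panel shows 5 / (95 + 5) * 100 = 5%. The same formula can be evaluated over a longer window when investigating an incident, for example (a hypothetical variation, not a dashboard panel):

# hypothetical variation: the same error-rate formula over a 30m window instead of 5m
sum(increase(src_codeintel_resolvers_errors_total{job=~"^(frontend|sourcegraph-frontend).*"}[30m])) / (sum(increase(src_codeintel_resolvers_total{job=~"^(frontend|sourcegraph-frontend).*"}[30m])) + sum(increase(src_codeintel_resolvers_errors_total{job=~"^(frontend|sourcegraph-frontend).*"}[30m]))) * 100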
frontend: codeintel_resolvers_total
Graphql operations every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=100410 on your Sourcegraph instance.
Technical details
Query:
sum by (op)(increase(src_codeintel_resolvers_total{job=~"^(frontend|sourcegraph-frontend).*"}[5m]))
frontend: codeintel_resolvers_99th_percentile_duration
99th percentile successful graphql operation duration over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=100411 on your Sourcegraph instance.
Technical details
Query:
histogram_quantile(0.99, sum by (le,op)(rate(src_codeintel_resolvers_duration_seconds_bucket{job=~"^(frontend|sourcegraph-frontend).*"}[5m])))
frontend: codeintel_resolvers_errors_total
Graphql operation errors every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=100412 on your Sourcegraph instance.
Technical details
Query:
sum by (op)(increase(src_codeintel_resolvers_errors_total{job=~"^(frontend|sourcegraph-frontend).*"}[5m]))
frontend: codeintel_resolvers_error_rate
Graphql operation error rate over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=100413 on your Sourcegraph instance.
Technical details
Query:
sum by (op)(increase(src_codeintel_resolvers_errors_total{job=~"^(frontend|sourcegraph-frontend).*"}[5m])) / (sum by (op)(increase(src_codeintel_resolvers_total{job=~"^(frontend|sourcegraph-frontend).*"}[5m])) + sum by (op)(increase(src_codeintel_resolvers_errors_total{job=~"^(frontend|sourcegraph-frontend).*"}[5m]))) * 100
Frontend: Codeintel: Auto-index enqueuer
frontend: codeintel_autoindex_enqueuer_total
Aggregate enqueuer operations every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=100500 on your Sourcegraph instance.
Technical details
Query:
sum(increase(src_codeintel_autoindex_enqueuer_total{job=~"^(frontend|sourcegraph-frontend).*"}[5m]))
frontend: codeintel_autoindex_enqueuer_99th_percentile_duration
Aggregate successful enqueuer operation duration distribution over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=100501 on your Sourcegraph instance.
Technical details
Query:
sum by (le)(rate(src_codeintel_autoindex_enqueuer_duration_seconds_bucket{job=~"^(frontend|sourcegraph-frontend).*"}[5m]))
frontend: codeintel_autoindex_enqueuer_errors_total
Aggregate enqueuer operation errors every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=100502 on your Sourcegraph instance.
Technical details
Query:
sum(increase(src_codeintel_autoindex_enqueuer_errors_total{job=~"^(frontend|sourcegraph-frontend).*"}[5m]))
frontend: codeintel_autoindex_enqueuer_error_rate
Aggregate enqueuer operation error rate over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=100503 on your Sourcegraph instance.
Technical details
Query:
sum(increase(src_codeintel_autoindex_enqueuer_errors_total{job=~"^(frontend|sourcegraph-frontend).*"}[5m])) / (sum(increase(src_codeintel_autoindex_enqueuer_total{job=~"^(frontend|sourcegraph-frontend).*"}[5m])) + sum(increase(src_codeintel_autoindex_enqueuer_errors_total{job=~"^(frontend|sourcegraph-frontend).*"}[5m]))) * 100
frontend: codeintel_autoindex_enqueuer_total
Enqueuer operations every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=100510 on your Sourcegraph instance.
Technical details
Query:
sum by (op)(increase(src_codeintel_autoindex_enqueuer_total{job=~"^(frontend|sourcegraph-frontend).*"}[5m]))
frontend: codeintel_autoindex_enqueuer_99th_percentile_duration
99th percentile successful enqueuer operation duration over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=100511 on your Sourcegraph instance.
Technical details
Query:
histogram_quantile(0.99, sum by (le,op)(rate(src_codeintel_autoindex_enqueuer_duration_seconds_bucket{job=~"^(frontend|sourcegraph-frontend).*"}[5m])))
frontend: codeintel_autoindex_enqueuer_errors_total
Enqueuer operation errors every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=100512 on your Sourcegraph instance.
Technical details
Query:
sum by (op)(increase(src_codeintel_autoindex_enqueuer_errors_total{job=~"^(frontend|sourcegraph-frontend).*"}[5m]))
frontend: codeintel_autoindex_enqueuer_error_rate
Enqueuer operation error rate over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=100513 on your Sourcegraph instance.
Technical details
Query:
sum by (op)(increase(src_codeintel_autoindex_enqueuer_errors_total{job=~"^(frontend|sourcegraph-frontend).*"}[5m])) / (sum by (op)(increase(src_codeintel_autoindex_enqueuer_total{job=~"^(frontend|sourcegraph-frontend).*"}[5m])) + sum by (op)(increase(src_codeintel_autoindex_enqueuer_errors_total{job=~"^(frontend|sourcegraph-frontend).*"}[5m]))) * 100
Frontend: Codeintel: dbstore stats
frontend: codeintel_uploads_store_total
Aggregate store operations every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=100600 on your Sourcegraph instance.
Technical details
Query:
sum(increase(src_codeintel_uploads_store_total{job=~"^(frontend|sourcegraph-frontend).*"}[5m]))
frontend: codeintel_uploads_store_99th_percentile_duration
Aggregate successful store operation duration distribution over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=100601 on your Sourcegraph instance.
Technical details
Query:
sum by (le)(rate(src_codeintel_uploads_store_duration_seconds_bucket{job=~"^(frontend|sourcegraph-frontend).*"}[5m]))
frontend: codeintel_uploads_store_errors_total
Aggregate store operation errors every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=100602 on your Sourcegraph instance.
Technical details
Query:
sum(increase(src_codeintel_uploads_store_errors_total{job=~"^(frontend|sourcegraph-frontend).*"}[5m]))
frontend: codeintel_uploads_store_error_rate
Aggregate store operation error rate over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=100603 on your Sourcegraph instance.
Technical details
Query:
sum(increase(src_codeintel_uploads_store_errors_total{job=~"^(frontend|sourcegraph-frontend).*"}[5m])) / (sum(increase(src_codeintel_uploads_store_total{job=~"^(frontend|sourcegraph-frontend).*"}[5m])) + sum(increase(src_codeintel_uploads_store_errors_total{job=~"^(frontend|sourcegraph-frontend).*"}[5m]))) * 100
frontend: codeintel_uploads_store_total
Store operations every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=100610 on your Sourcegraph instance.
Technical details
Query:
sum by (op)(increase(src_codeintel_uploads_store_total{job=~"^(frontend|sourcegraph-frontend).*"}[5m]))
frontend: codeintel_uploads_store_99th_percentile_duration
99th percentile successful store operation duration over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=100611 on your Sourcegraph instance.
Technical details
Query:
histogram_quantile(0.99, sum by (le,op)(rate(src_codeintel_uploads_store_duration_seconds_bucket{job=~"^(frontend|sourcegraph-frontend).*"}[5m])))
frontend: codeintel_uploads_store_errors_total
Store operation errors every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=100612 on your Sourcegraph instance.
Technical details
Query:
sum by (op)(increase(src_codeintel_uploads_store_errors_total{job=~"^(frontend|sourcegraph-frontend).*"}[5m]))
frontend: codeintel_uploads_store_error_rate
Store operation error rate over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=100613 on your Sourcegraph instance.
Technical details
Query:
sum by (op)(increase(src_codeintel_uploads_store_errors_total{job=~"^(frontend|sourcegraph-frontend).*"}[5m])) / (sum by (op)(increase(src_codeintel_uploads_store_total{job=~"^(frontend|sourcegraph-frontend).*"}[5m])) + sum by (op)(increase(src_codeintel_uploads_store_errors_total{job=~"^(frontend|sourcegraph-frontend).*"}[5m]))) * 100
Frontend: Workerutil: lsif_indexes dbworker/store stats
frontend: workerutil_dbworker_store_total
Store operations every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=100700 on your Sourcegraph instance.
Technical details
Query:
sum(increase(src_workerutil_dbworker_store_total{domain='codeintel_index_jobs',job=~"^(frontend|sourcegraph-frontend).*"}[5m]))
frontend: workerutil_dbworker_store_99th_percentile_duration
Aggregate successful store operation duration distribution over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=100701 on your Sourcegraph instance.
Technical details
Query:
sum by (le)(rate(src_workerutil_dbworker_store_duration_seconds_bucket{domain='codeintel_index_jobs',job=~"^(frontend|sourcegraph-frontend).*"}[5m]))
frontend: workerutil_dbworker_store_errors_total
Store operation errors every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=100702 on your Sourcegraph instance.
Technical details
Query:
sum(increase(src_workerutil_dbworker_store_errors_total{domain='codeintel_index_jobs',job=~"^(frontend|sourcegraph-frontend).*"}[5m]))
frontend: workerutil_dbworker_store_error_rate
Store operation error rate over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=100703 on your Sourcegraph instance.
Technical details
Query:
sum(increase(src_workerutil_dbworker_store_errors_total{domain='codeintel_index_jobs',job=~"^(frontend|sourcegraph-frontend).*"}[5m])) / (sum(increase(src_workerutil_dbworker_store_total{domain='codeintel_index_jobs',job=~"^(frontend|sourcegraph-frontend).*"}[5m])) + sum(increase(src_workerutil_dbworker_store_errors_total{domain='codeintel_index_jobs',job=~"^(frontend|sourcegraph-frontend).*"}[5m]))) * 100
Frontend: Codeintel: lsifstore stats
frontend: codeintel_uploads_lsifstore_total
Aggregate store operations every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=100800 on your Sourcegraph instance.
Technical details
Query:
sum(increase(src_codeintel_uploads_lsifstore_total{job=~"^(frontend|sourcegraph-frontend).*"}[5m]))
frontend: codeintel_uploads_lsifstore_99th_percentile_duration
Aggregate successful store operation duration distribution over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=100801 on your Sourcegraph instance.
Technical details
Query:
sum by (le)(rate(src_codeintel_uploads_lsifstore_duration_seconds_bucket{job=~"^(frontend|sourcegraph-frontend).*"}[5m]))
frontend: codeintel_uploads_lsifstore_errors_total
Aggregate store operation errors every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=100802 on your Sourcegraph instance.
Technical details
Query:
sum(increase(src_codeintel_uploads_lsifstore_errors_total{job=~"^(frontend|sourcegraph-frontend).*"}[5m]))
frontend: codeintel_uploads_lsifstore_error_rate
Aggregate store operation error rate over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=100803 on your Sourcegraph instance.
Technical details
Query:
sum(increase(src_codeintel_uploads_lsifstore_errors_total{job=~"^(frontend|sourcegraph-frontend).*"}[5m])) / (sum(increase(src_codeintel_uploads_lsifstore_total{job=~"^(frontend|sourcegraph-frontend).*"}[5m])) + sum(increase(src_codeintel_uploads_lsifstore_errors_total{job=~"^(frontend|sourcegraph-frontend).*"}[5m]))) * 100
frontend: codeintel_uploads_lsifstore_total
Store operations every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=100810 on your Sourcegraph instance.
Technical details
Query:
sum by (op)(increase(src_codeintel_uploads_lsifstore_total{job=~"^(frontend|sourcegraph-frontend).*"}[5m]))
frontend: codeintel_uploads_lsifstore_99th_percentile_duration
99th percentile successful store operation duration over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=100811 on your Sourcegraph instance.
Technical details
Query:
histogram_quantile(0.99, sum by (le,op)(rate(src_codeintel_uploads_lsifstore_duration_seconds_bucket{job=~"^(frontend|sourcegraph-frontend).*"}[5m])))
frontend: codeintel_uploads_lsifstore_errors_total
Store operation errors every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=100812 on your Sourcegraph instance.
Technical details
Query:
sum by (op)(increase(src_codeintel_uploads_lsifstore_errors_total{job=~"^(frontend|sourcegraph-frontend).*"}[5m]))
frontend: codeintel_uploads_lsifstore_error_rate
Store operation error rate over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=100813 on your Sourcegraph instance.
Technical details
Query:
sum by (op)(increase(src_codeintel_uploads_lsifstore_errors_total{job=~"^(frontend|sourcegraph-frontend).*"}[5m])) / (sum by (op)(increase(src_codeintel_uploads_lsifstore_total{job=~"^(frontend|sourcegraph-frontend).*"}[5m])) + sum by (op)(increase(src_codeintel_uploads_lsifstore_errors_total{job=~"^(frontend|sourcegraph-frontend).*"}[5m]))) * 100
Frontend: Codeintel: gitserver client
frontend: gitserver_client_total
Aggregate client operations every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=100900 on your Sourcegraph instance.
Technical details
Query:
sum(increase(src_gitserver_client_total{job=~"^(frontend|sourcegraph-frontend).*"}[5m]))
frontend: gitserver_client_99th_percentile_duration
Aggregate successful client operation duration distribution over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=100901 on your Sourcegraph instance.
Technical details
Query:
sum by (le)(rate(src_gitserver_client_duration_seconds_bucket{job=~"^(frontend|sourcegraph-frontend).*"}[5m]))
frontend: gitserver_client_errors_total
Aggregate client operation errors every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=100902 on your Sourcegraph instance.
Technical details
Query:
sum(increase(src_gitserver_client_errors_total{job=~"^(frontend|sourcegraph-frontend).*"}[5m]))
frontend: gitserver_client_error_rate
Aggregate client operation error rate over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=100903 on your Sourcegraph instance.
Technical details
Query:
sum(increase(src_gitserver_client_errors_total{job=~"^(frontend|sourcegraph-frontend).*"}[5m])) / (sum(increase(src_gitserver_client_total{job=~"^(frontend|sourcegraph-frontend).*"}[5m])) + sum(increase(src_gitserver_client_errors_total{job=~"^(frontend|sourcegraph-frontend).*"}[5m]))) * 100
frontend: gitserver_client_total
Client operations every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=100910 on your Sourcegraph instance.
Technical details
Query:
sum by (op)(increase(src_gitserver_client_total{job=~"^(frontend|sourcegraph-frontend).*"}[5m]))
frontend: gitserver_client_99th_percentile_duration
99th percentile successful client operation duration over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=100911 on your Sourcegraph instance.
Technical details
Query:
histogram_quantile(0.99, sum by (le,op)(rate(src_gitserver_client_duration_seconds_bucket{job=~"^(frontend|sourcegraph-frontend).*"}[5m])))
frontend: gitserver_client_errors_total
Client operation errors every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=100912 on your Sourcegraph instance.
Technical details
Query:
sum by (op)(increase(src_gitserver_client_errors_total{job=~"^(frontend|sourcegraph-frontend).*"}[5m]))
frontend: gitserver_client_error_rate
Client operation error rate over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=100913 on your Sourcegraph instance.
Technical details
Query:
sum by (op)(increase(src_gitserver_client_errors_total{job=~"^(frontend|sourcegraph-frontend).*"}[5m])) / (sum by (op)(increase(src_gitserver_client_total{job=~"^(frontend|sourcegraph-frontend).*"}[5m])) + sum by (op)(increase(src_gitserver_client_errors_total{job=~"^(frontend|sourcegraph-frontend).*"}[5m]))) * 100
Frontend: Codeintel: uploadstore stats
frontend: codeintel_uploadstore_total
Aggregate store operations every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=101000 on your Sourcegraph instance.
Technical details
Query:
sum(increase(src_codeintel_uploadstore_total{job=~"^(frontend|sourcegraph-frontend).*"}[5m]))
frontend: codeintel_uploadstore_99th_percentile_duration
Aggregate successful store operation duration distribution over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=101001 on your Sourcegraph instance.
Technical details
Query:
sum by (le)(rate(src_codeintel_uploadstore_duration_seconds_bucket{job=~"^(frontend|sourcegraph-frontend).*"}[5m]))
frontend: codeintel_uploadstore_errors_total
Aggregate store operation errors every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=101002 on your Sourcegraph instance.
Technical details
Query:
sum(increase(src_codeintel_uploadstore_errors_total{job=~"^(frontend|sourcegraph-frontend).*"}[5m]))
frontend: codeintel_uploadstore_error_rate
Aggregate store operation error rate over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=101003 on your Sourcegraph instance.
Technical details
Query:
sum(increase(src_codeintel_uploadstore_errors_total{job=~"^(frontend|sourcegraph-frontend).*"}[5m])) / (sum(increase(src_codeintel_uploadstore_total{job=~"^(frontend|sourcegraph-frontend).*"}[5m])) + sum(increase(src_codeintel_uploadstore_errors_total{job=~"^(frontend|sourcegraph-frontend).*"}[5m]))) * 100
frontend: codeintel_uploadstore_total
Store operations every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=101010 on your Sourcegraph instance.
Technical details
Query:
sum by (op)(increase(src_codeintel_uploadstore_total{job=~"^(frontend|sourcegraph-frontend).*"}[5m]))
frontend: codeintel_uploadstore_99th_percentile_duration
99th percentile successful store operation duration over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=101011 on your Sourcegraph instance.
Technical details
Query:
histogram_quantile(0.99, sum by (le,op)(rate(src_codeintel_uploadstore_duration_seconds_bucket{job=~"^(frontend|sourcegraph-frontend).*"}[5m])))
frontend: codeintel_uploadstore_errors_total
Store operation errors every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=101012 on your Sourcegraph instance.
Technical details
Query:
sum by (op)(increase(src_codeintel_uploadstore_errors_total{job=~"^(frontend|sourcegraph-frontend).*"}[5m]))
frontend: codeintel_uploadstore_error_rate
Store operation error rate over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=101013 on your Sourcegraph instance.
Technical details
Query:
sum by (op)(increase(src_codeintel_uploadstore_errors_total{job=~"^(frontend|sourcegraph-frontend).*"}[5m])) / (sum by (op)(increase(src_codeintel_uploadstore_total{job=~"^(frontend|sourcegraph-frontend).*"}[5m])) + sum by (op)(increase(src_codeintel_uploadstore_errors_total{job=~"^(frontend|sourcegraph-frontend).*"}[5m]))) * 100
Frontend: Gitserver: Gitserver Client
frontend: gitserver_client_total
Aggregate client operations every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=101100 on your Sourcegraph instance.
Technical details
Query:
sum(increase(src_gitserver_client_total{job=~"^(frontend|sourcegraph-frontend).*"}[5m]))
frontend: gitserver_client_99th_percentile_duration
Aggregate successful client operation duration distribution over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=101101 on your Sourcegraph instance.
Technical details
Query:
sum by (le)(rate(src_gitserver_client_duration_seconds_bucket{job=~"^(frontend|sourcegraph-frontend).*"}[5m]))
frontend: gitserver_client_errors_total
Aggregate client operation errors every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=101102 on your Sourcegraph instance.
Technical details
Query:
sum(increase(src_gitserver_client_errors_total{job=~"^(frontend|sourcegraph-frontend).*"}[5m]))
frontend: gitserver_client_error_rate
Aggregate client operation error rate over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=101103 on your Sourcegraph instance.
Technical details
Query:
sum(increase(src_gitserver_client_errors_total{job=~"^(frontend|sourcegraph-frontend).*"}[5m])) / (sum(increase(src_gitserver_client_total{job=~"^(frontend|sourcegraph-frontend).*"}[5m])) + sum(increase(src_gitserver_client_errors_total{job=~"^(frontend|sourcegraph-frontend).*"}[5m]))) * 100
frontend: gitserver_client_total
Client operations every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=101110 on your Sourcegraph instance.
Technical details
Query:
sum by (op,scope)(increase(src_gitserver_client_total{job=~"^(frontend|sourcegraph-frontend).*"}[5m]))
frontend: gitserver_client_99th_percentile_duration
99th percentile successful client operation duration over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=101111 on your Sourcegraph instance.
Technical details
Query:
histogram_quantile(0.99, sum by (le,op,scope)(rate(src_gitserver_client_duration_seconds_bucket{job=~"^(frontend|sourcegraph-frontend).*"}[5m])))
frontend: gitserver_client_errors_total
Client operation errors every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=101112 on your Sourcegraph instance.
Technical details
Query:
sum by (op,scope)(increase(src_gitserver_client_errors_total{job=~"^(frontend|sourcegraph-frontend).*"}[5m]))
frontend: gitserver_client_error_rate
Client operation error rate over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=101113 on your Sourcegraph instance.
Technical details
Query:
sum by (op,scope)(increase(src_gitserver_client_errors_total{job=~"^(frontend|sourcegraph-frontend).*"}[5m])) / (sum by (op,scope)(increase(src_gitserver_client_total{job=~"^(frontend|sourcegraph-frontend).*"}[5m])) + sum by (op,scope)(increase(src_gitserver_client_errors_total{job=~"^(frontend|sourcegraph-frontend).*"}[5m]))) * 100
Frontend: Gitserver: Gitserver Repository Service Client
frontend: gitserver_repositoryservice_client_total
Aggregate client operations every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=101200 on your Sourcegraph instance.
Technical details
Query:
sum(increase(src_gitserver_repositoryservice_client_total{job=~"^(frontend|sourcegraph-frontend).*"}[5m]))
frontend: gitserver_repositoryservice_client_99th_percentile_duration
Aggregate successful client operation duration distribution over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=101201 on your Sourcegraph instance.
Technical details
Query:
sum by (le)(rate(src_gitserver_repositoryservice_client_duration_seconds_bucket{job=~"^(frontend|sourcegraph-frontend).*"}[5m]))
frontend: gitserver_repositoryservice_client_errors_total
Aggregate client operation errors every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=101202 on your Sourcegraph instance.
Technical details
Query:
sum(increase(src_gitserver_repositoryservice_client_errors_total{job=~"^(frontend|sourcegraph-frontend).*"}[5m]))
frontend: gitserver_repositoryservice_client_error_rate
Aggregate client operation error rate over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=101203 on your Sourcegraph instance.
Technical details
Query:
sum(increase(src_gitserver_repositoryservice_client_errors_total{job=~"^(frontend|sourcegraph-frontend).*"}[5m])) / (sum(increase(src_gitserver_repositoryservice_client_total{job=~"^(frontend|sourcegraph-frontend).*"}[5m])) + sum(increase(src_gitserver_repositoryservice_client_errors_total{job=~"^(frontend|sourcegraph-frontend).*"}[5m]))) * 100
frontend: gitserver_repositoryservice_client_total
Client operations every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=101210 on your Sourcegraph instance.
Technical details
Query:
sum by (op,scope)(increase(src_gitserver_repositoryservice_client_total{job=~"^(frontend|sourcegraph-frontend).*"}[5m]))
frontend: gitserver_repositoryservice_client_99th_percentile_duration
99th percentile successful client operation duration over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=101211 on your Sourcegraph instance.
Technical details
Query:
histogram_quantile(0.99, sum by (le,op,scope)(rate(src_gitserver_repositoryservice_client_duration_seconds_bucket{job=~"^(frontend|sourcegraph-frontend).*"}[5m])))
frontend: gitserver_repositoryservice_client_errors_total
Client operation errors every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=101212 on your Sourcegraph instance.
Technical details
Query:
sum by (op,scope)(increase(src_gitserver_repositoryservice_client_errors_total{job=~"^(frontend|sourcegraph-frontend).*"}[5m]))
frontend: gitserver_repositoryservice_client_error_rate
Client operation error rate over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=101213 on your Sourcegraph instance.
Technical details
Query:
sum by (op,scope)(increase(src_gitserver_repositoryservice_client_errors_total{job=~"^(frontend|sourcegraph-frontend).*"}[5m])) / (sum by (op,scope)(increase(src_gitserver_repositoryservice_client_total{job=~"^(frontend|sourcegraph-frontend).*"}[5m])) + sum by (op,scope)(increase(src_gitserver_repositoryservice_client_errors_total{job=~"^(frontend|sourcegraph-frontend).*"}[5m]))) * 100
Frontend: Batches: dbstore stats
frontend: batches_dbstore_total
Aggregate store operations every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=101300 on your Sourcegraph instance.
Technical details
Query:
sum(increase(src_batches_dbstore_total{job=~"^(frontend|sourcegraph-frontend).*"}[5m]))
frontend: batches_dbstore_99th_percentile_duration
Aggregate successful store operation duration distribution over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=101301 on your Sourcegraph instance.
Technical details
Query:
sum by (le)(rate(src_batches_dbstore_duration_seconds_bucket{job=~"^(frontend|sourcegraph-frontend).*"}[5m]))
frontend: batches_dbstore_errors_total
Aggregate store operation errors every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=101302 on your Sourcegraph instance.
Technical details
Query:
sum(increase(src_batches_dbstore_errors_total{job=~"^(frontend|sourcegraph-frontend).*"}[5m]))
frontend: batches_dbstore_error_rate
Aggregate store operation error rate over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=101303 on your Sourcegraph instance.
Technical details
Query:
sum(increase(src_batches_dbstore_errors_total{job=~"^(frontend|sourcegraph-frontend).*"}[5m])) / (sum(increase(src_batches_dbstore_total{job=~"^(frontend|sourcegraph-frontend).*"}[5m])) + sum(increase(src_batches_dbstore_errors_total{job=~"^(frontend|sourcegraph-frontend).*"}[5m]))) * 100
frontend: batches_dbstore_total
Store operations every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=101310 on your Sourcegraph instance.
Technical details
Query:
sum by (op)(increase(src_batches_dbstore_total{job=~"^(frontend|sourcegraph-frontend).*"}[5m]))
frontend: batches_dbstore_99th_percentile_duration
99th percentile successful store operation duration over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=101311 on your Sourcegraph instance.
Technical details
Query:
histogram_quantile(0.99, sum by (le,op)(rate(src_batches_dbstore_duration_seconds_bucket{job=~"^(frontend|sourcegraph-frontend).*"}[5m])))
frontend: batches_dbstore_errors_total
Store operation errors every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=101312 on your Sourcegraph instance.
Technical details
Query:
sum by (op)(increase(src_batches_dbstore_errors_total{job=~"^(frontend|sourcegraph-frontend).*"}[5m]))
frontend: batches_dbstore_error_rate
Store operation error rate over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=101313 on your Sourcegraph instance.
Technical details
Query:
sum by (op)(increase(src_batches_dbstore_errors_total{job=~"^(frontend|sourcegraph-frontend).*"}[5m])) / (sum by (op)(increase(src_batches_dbstore_total{job=~"^(frontend|sourcegraph-frontend).*"}[5m])) + sum by (op)(increase(src_batches_dbstore_errors_total{job=~"^(frontend|sourcegraph-frontend).*"}[5m]))) * 100
Frontend: Batches: service stats
frontend: batches_service_total
Aggregate service operations every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=101400 on your Sourcegraph instance.
Technical details
Query:
sum(increase(src_batches_service_total{job=~"^(frontend|sourcegraph-frontend).*"}[5m]))
frontend: batches_service_99th_percentile_duration
Aggregate successful service operation duration distribution over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=101401 on your Sourcegraph instance.
Technical details
Query:
sum by (le)(rate(src_batches_service_duration_seconds_bucket{job=~"^(frontend|sourcegraph-frontend).*"}[5m]))
frontend: batches_service_errors_total
Aggregate service operation errors every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=101402 on your Sourcegraph instance.
Technical details
Query:
sum(increase(src_batches_service_errors_total{job=~"^(frontend|sourcegraph-frontend).*"}[5m]))
frontend: batches_service_error_rate
Aggregate service operation error rate over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=101403 on your Sourcegraph instance.
Technical details
Query:
sum(increase(src_batches_service_errors_total{job=~"^(frontend|sourcegraph-frontend).*"}[5m])) / (sum(increase(src_batches_service_total{job=~"^(frontend|sourcegraph-frontend).*"}[5m])) + sum(increase(src_batches_service_errors_total{job=~"^(frontend|sourcegraph-frontend).*"}[5m]))) * 100
frontend: batches_service_total
Service operations every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=101410 on your Sourcegraph instance.
Technical details
Query:
sum by (op)(increase(src_batches_service_total{job=~"^(frontend|sourcegraph-frontend).*"}[5m]))
frontend: batches_service_99th_percentile_duration
99th percentile successful service operation duration over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=101411 on your Sourcegraph instance.
Technical details
Query:
histogram_quantile(0.99, sum by (le,op)(rate(src_batches_service_duration_seconds_bucket{job=~"^(frontend|sourcegraph-frontend).*"}[5m])))
frontend: batches_service_errors_total
Service operation errors every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=101412 on your Sourcegraph instance.
Technical details
Query:
sum by (op)(increase(src_batches_service_errors_total{job=~"^(frontend|sourcegraph-frontend).*"}[5m]))
frontend: batches_service_error_rate
Service operation error rate over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=101413 on your Sourcegraph instance.
Technical details
Query:
sum by (op)(increase(src_batches_service_errors_total{job=~"^(frontend|sourcegraph-frontend).*"}[5m])) / (sum by (op)(increase(src_batches_service_total{job=~"^(frontend|sourcegraph-frontend).*"}[5m])) + sum by (op)(increase(src_batches_service_errors_total{job=~"^(frontend|sourcegraph-frontend).*"}[5m]))) * 100
Frontend: Batches: HTTP API File Handler
frontend: batches_httpapi_total
Aggregate http handler operations every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=101500 on your Sourcegraph instance.
Technical details
Query:
sum(increase(src_batches_httpapi_total{job=~"^(frontend|sourcegraph-frontend).*"}[5m]))
frontend: batches_httpapi_99th_percentile_duration
Aggregate successful http handler operation duration distribution over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=101501 on your Sourcegraph instance.
Technical details
Query:
sum by (le)(rate(src_batches_httpapi_duration_seconds_bucket{job=~"^(frontend|sourcegraph-frontend).*"}[5m]))
frontend: batches_httpapi_errors_total
Aggregate http handler operation errors every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=101502 on your Sourcegraph instance.
Technical details
Query:
sum(increase(src_batches_httpapi_errors_total{job=~"^(frontend|sourcegraph-frontend).*"}[5m]))
frontend: batches_httpapi_error_rate
Aggregate http handler operation error rate over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=101503 on your Sourcegraph instance.
Technical details
Query:
sum(increase(src_batches_httpapi_errors_total{job=~"^(frontend|sourcegraph-frontend).*"}[5m])) / (sum(increase(src_batches_httpapi_total{job=~"^(frontend|sourcegraph-frontend).*"}[5m])) + sum(increase(src_batches_httpapi_errors_total{job=~"^(frontend|sourcegraph-frontend).*"}[5m]))) * 100
frontend: batches_httpapi_total
Http handler operations every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=101510 on your Sourcegraph instance.
Technical details
Query:
sum by (op)(increase(src_batches_httpapi_total{job=~"^(frontend|sourcegraph-frontend).*"}[5m]))
frontend: batches_httpapi_99th_percentile_duration
99th percentile successful http handler operation duration over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=101511 on your Sourcegraph instance.
Technical details
Query:
histogram_quantile(0.99, sum by (le,op)(rate(src_batches_httpapi_duration_seconds_bucket{job=~"^(frontend|sourcegraph-frontend).*"}[5m])))
frontend: batches_httpapi_errors_total
Http handler operation errors every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=101512 on your Sourcegraph instance.
Technical details
Query:
sum by (op)(increase(src_batches_httpapi_errors_total{job=~"^(frontend|sourcegraph-frontend).*"}[5m]))
frontend: batches_httpapi_error_rate
Http handler operation error rate over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=101513 on your Sourcegraph instance.
Technical details
Query:
sum by (op)(increase(src_batches_httpapi_errors_total{job=~"^(frontend|sourcegraph-frontend).*"}[5m])) / (sum by (op)(increase(src_batches_httpapi_total{job=~"^(frontend|sourcegraph-frontend).*"}[5m])) + sum by (op)(increase(src_batches_httpapi_errors_total{job=~"^(frontend|sourcegraph-frontend).*"}[5m]))) * 100
Frontend: Out-of-band migrations: up migration invocation (one batch processed)
frontend: oobmigration_total
Migration handler operations every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=101600 on your Sourcegraph instance.
Technical details
Query:
sum(increase(src_oobmigration_total{op="up",job=~"^(frontend|sourcegraph-frontend).*"}[5m]))
frontend: oobmigration_99th_percentile_duration
Aggregate successful migration handler operation duration distribution over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=101601 on your Sourcegraph instance.
Technical details
Query:
sum by (le)(rate(src_oobmigration_duration_seconds_bucket{op="up",job=~"^(frontend|sourcegraph-frontend).*"}[5m]))
frontend: oobmigration_errors_total
Migration handler operation errors every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=101602 on your Sourcegraph instance.
Technical details
Query:
sum(increase(src_oobmigration_errors_total{op="up",job=~"^(frontend|sourcegraph-frontend).*"}[5m]))
frontend: oobmigration_error_rate
Migration handler operation error rate over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=101603 on your Sourcegraph instance.
Technical details
Query:
sum(increase(src_oobmigration_errors_total{op="up",job=~"^(frontend|sourcegraph-frontend).*"}[5m])) / (sum(increase(src_oobmigration_total{op="up",job=~"^(frontend|sourcegraph-frontend).*"}[5m])) + sum(increase(src_oobmigration_errors_total{op="up",job=~"^(frontend|sourcegraph-frontend).*"}[5m]))) * 100
Frontend: Out-of-band migrations: down migration invocation (one batch processed)
frontend: oobmigration_total
Migration handler operations every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=101700 on your Sourcegraph instance.
Technical details
Query:
sum(increase(src_oobmigration_total{op="down",job=~"^(frontend|sourcegraph-frontend).*"}[5m]))
frontend: oobmigration_99th_percentile_duration
Aggregate successful migration handler operation duration distribution over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=101701 on your Sourcegraph instance.
Technical details
Query:
sum by (le)(rate(src_oobmigration_duration_seconds_bucket{op="down",job=~"^(frontend|sourcegraph-frontend).*"}[5m]))
frontend: oobmigration_errors_total
Migration handler operation errors every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=101702 on your Sourcegraph instance.
Technical details
Query:
sum(increase(src_oobmigration_errors_total{op="down",job=~"^(frontend|sourcegraph-frontend).*"}[5m]))
frontend: oobmigration_error_rate
Migration handler operation error rate over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=101703 on your Sourcegraph instance.
Technical details
Query:
sum(increase(src_oobmigration_errors_total{op="down",job=~"^(frontend|sourcegraph-frontend).*"}[5m])) / (sum(increase(src_oobmigration_total{op="down",job=~"^(frontend|sourcegraph-frontend).*"}[5m])) + sum(increase(src_oobmigration_errors_total{op="down",job=~"^(frontend|sourcegraph-frontend).*"}[5m]))) * 100
Frontend: Zoekt Configuration GRPC server metrics
frontend: zoekt_configuration_grpc_request_rate_all_methods
Request rate across all methods over 2m
The number of gRPC requests received per second across all methods, aggregated across all instances.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=101800 on your Sourcegraph instance.
Technical details
Query:
sum(rate(grpc_server_started_total{instance=~`${internalInstance:regex}`,grpc_service=~"sourcegraph.zoekt.configuration.v1.ZoektConfigurationService"}[2m]))
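grpc_server_started_total counts every RPC the server begins handling, so rate(...[2m]) yields requests per second over a 2-minute window. A variation that could help during debugging (a sketch, not a dashboard panel, with the Grafana instance variable dropped) breaks the same traffic out per instance instead of aggregating it:

# hypothetical: request rate per instance for the zoekt-configuration gRPC service
sum by (instance)(rate(grpc_server_started_total{grpc_service=~"sourcegraph.zoekt.configuration.v1.ZoektConfigurationService"}[2m]))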
frontend: zoekt_configuration_grpc_request_rate_per_method
Request rate per-method over 2m
The number of gRPC requests received per second broken out per method, aggregated across all instances.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=101801 on your Sourcegraph instance.
Technical details
Query:
sum(rate(grpc_server_started_total{grpc_method=~`${zoekt_configuration_method:regex}`,instance=~`${internalInstance:regex}`,grpc_service=~"sourcegraph.zoekt.configuration.v1.ZoektConfigurationService"}[2m])) by (grpc_method)
frontend: zoekt_configuration_error_percentage_all_methods
Error percentage across all methods over 2m
The percentage of gRPC requests that fail across all methods, aggregated across all instances.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=101810 on your Sourcegraph instance.
Technical details
Query:
(100.0 * ( (sum(rate(grpc_server_handled_total{grpc_code!="OK",instance=~`${internalInstance:regex}`,grpc_service=~"sourcegraph.zoekt.configuration.v1.ZoektConfigurationService"}[2m]))) / (sum(rate(grpc_server_handled_total{instance=~`${internalInstance:regex}`,grpc_service=~"sourcegraph.zoekt.configuration.v1.ZoektConfigurationService"}[2m]))) ))
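The error percentage treats any RPC that completed with a gRPC status code other than OK as a failure: non-OK completions per second divided by all completions per second, times 100. To see which status codes are actually being returned, a drill-down along these lines could be used (hypothetical, not a dashboard panel, Grafana instance variable dropped):

# hypothetical drill-down: rate of non-OK responses broken out by gRPC status code
sum by (grpc_code)(rate(grpc_server_handled_total{grpc_code!="OK",grpc_service=~"sourcegraph.zoekt.configuration.v1.ZoektConfigurationService"}[2m]))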
frontend: zoekt_configuration_grpc_error_percentage_per_method
Error percentage per-method over 2m
The percentage of gRPC requests that fail per method, aggregated across all instances.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=101811 on your Sourcegraph instance.
Technical details
Query:
(100.0 * ( (sum(rate(grpc_server_handled_total{grpc_method=~`${zoekt_configuration_method:regex}`,grpc_code!="OK",instance=~`${internalInstance:regex}`,grpc_service=~"sourcegraph.zoekt.configuration.v1.ZoektConfigurationService"}[2m])) by (grpc_method)) / (sum(rate(grpc_server_handled_total{grpc_method=~`${zoekt_configuration_method:regex}`,instance=~`${internalInstance:regex}`,grpc_service=~"sourcegraph.zoekt.configuration.v1.ZoektConfigurationService"}[2m])) by (grpc_method)) ))
frontend: zoekt_configuration_p99_response_time_per_method
99th percentile response time per method over 2m
The 99th percentile response time per method, aggregated across all instances.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=101820 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.99, sum by (le, name, grpc_method)(rate(grpc_server_handling_seconds_bucket{grpc_method=~`${zoekt_configuration_method:regex}`,instance=~`${internalInstance:regex}`,grpc_service=~"sourcegraph.zoekt.configuration.v1.ZoektConfigurationService"}[2m])))
frontend: zoekt_configuration_p90_response_time_per_method
90th percentile response time per method over 2m
The 90th percentile response time per method, aggregated across all instances.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=101821 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.90, sum by (le, name, grpc_method)(rate(grpc_server_handling_seconds_bucket{grpc_method=~`${zoekt_configuration_method:regex}`,instance=~`${internalInstance:regex}`,grpc_service=~"sourcegraph.zoekt.configuration.v1.ZoektConfigurationService"}[2m])))
frontend: zoekt_configuration_p75_response_time_per_method
75th percentile response time per method over 2m
The 75th percentile response time per method, aggregated across all instances.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=101822 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.75, sum by (le, name, grpc_method)(rate(grpc_server_handling_seconds_bucket{grpc_method=~`${zoekt_configuration_method:regex}`,instance=~`${internalInstance:regex}`,grpc_service=~"sourcegraph.zoekt.configuration.v1.ZoektConfigurationService"}[2m])))
frontend: zoekt_configuration_p99_9_response_size_per_method
99.9th percentile total response size per method over 2m
The 99.9th percentile total per-RPC response size per method, aggregated across all instances.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=101830 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.999, sum by (le, name, grpc_method)(rate(src_grpc_server_sent_bytes_per_rpc_bucket{grpc_method=~`${zoekt_configuration_method:regex}`,instance=~`${internalInstance:regex}`,grpc_service=~"sourcegraph.zoekt.configuration.v1.ZoektConfigurationService"}[2m])))
frontend: zoekt_configuration_p90_response_size_per_method
90th percentile total response size per method over 2m
The 90th percentile total per-RPC response size per method, aggregated across all instances.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=101831 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.90, sum by (le, name, grpc_method)(rate(src_grpc_server_sent_bytes_per_rpc_bucket{grpc_method=~`${zoekt_configuration_method:regex}`,instance=~`${internalInstance:regex}`,grpc_service=~"sourcegraph.zoekt.configuration.v1.ZoektConfigurationService"}[2m])))
frontend: zoekt_configuration_p75_response_size_per_method
75th percentile total response size per method over 2m
The 75th percentile total per-RPC response size per method, aggregated across all instances.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=101832 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.75, sum by (le, name, grpc_method)(rate(src_grpc_server_sent_bytes_per_rpc_bucket{grpc_method=~`${zoekt_configuration_method:regex}`,instance=~`${internalInstance:regex}`,grpc_service=~"sourcegraph.zoekt.configuration.v1.ZoektConfigurationService"}[2m])))
frontend: zoekt_configuration_p99_9_invididual_sent_message_size_per_method
99.9th percentile individual sent message size per method over 2m
The 99.9th percentile size of each individual protocol buffer message sent by the service per method, aggregated across all instances.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=101840 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.999, sum by (le, name, grpc_method)(rate(src_grpc_server_sent_individual_message_size_bytes_per_rpc_bucket{grpc_method=~`${zoekt_configuration_method:regex}`,instance=~`${internalInstance:regex}`,grpc_service=~"sourcegraph.zoekt.configuration.v1.ZoektConfigurationService"}[2m])))
frontend: zoekt_configuration_p90_invididual_sent_message_size_per_method
90th percentile individual sent message size per method over 2m
The 90th percentile size of each individual protocol buffer message sent by the service per method, aggregated across all instances.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=101841 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.90, sum by (le, name, grpc_method)(rate(src_grpc_server_sent_individual_message_size_bytes_per_rpc_bucket{grpc_method=~`${zoekt_configuration_method:regex}`,instance=~`${internalInstance:regex}`,grpc_service=~"sourcegraph.zoekt.configuration.v1.ZoektConfigurationService"}[2m])))
frontend: zoekt_configuration_p75_invididual_sent_message_size_per_method
75th percentile individual sent message size per method over 2m
The 75th percentile size of each individual protocol buffer message sent by the service per method, aggregated across all instances.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=101842 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.75, sum by (le, name, grpc_method)(rate(src_grpc_server_sent_individual_message_size_bytes_per_rpc_bucket{grpc_method=~`${zoekt_configuration_method:regex}`,instance=~`${internalInstance:regex}`,grpc_service=~"sourcegraph.zoekt.configuration.v1.ZoektConfigurationService"}[2m])))
frontend: zoekt_configuration_grpc_response_stream_message_count_per_method
Average streaming response message count per-method over 2m
The average number of response messages sent during a streaming RPC method, broken out per method, aggregated across all instances.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=101850 on your Sourcegraph instance.
Technical details
Query:
SHELL((sum(rate(grpc_server_msg_sent_total{grpc_type="server_stream",instance=~`${internalInstance:regex}`,grpc_service=~"sourcegraph.zoekt.configuration.v1.ZoektConfigurationService"}[2m])) by (grpc_method))/(sum(rate(grpc_server_started_total{grpc_type="server_stream",instance=~`${internalInstance:regex}`,grpc_service=~"sourcegraph.zoekt.configuration.v1.ZoektConfigurationService"}[2m])) by (grpc_method)))
frontend: zoekt_configuration_grpc_all_codes_per_method
Response codes rate per-method over 2m
The rate of all generated gRPC response codes per method, aggregated across all instances.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=101860 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(rate(grpc_server_handled_total{grpc_method=~`${zoekt_configuration_method:regex}`,instance=~`${internalInstance:regex}`,grpc_service=~"sourcegraph.zoekt.configuration.v1.ZoektConfigurationService"}[2m])) by (grpc_method, grpc_code)
Frontend: Zoekt Configuration GRPC "internal error" metrics
frontend: zoekt_configuration_grpc_clients_error_percentage_all_methods
Client baseline error percentage across all methods over 2m
The percentage of gRPC requests that fail across all methods (regardless of whether or not there was an internal error), aggregated across all "zoekt_configuration" clients.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=101900 on your Sourcegraph instance.
Technical details
Query:
SHELL(100.0 * ((((sum(rate(src_grpc_method_status{grpc_service=~"sourcegraph.zoekt.configuration.v1.ZoektConfigurationService",grpc_code!="OK"}[2m])))) / ((sum(rate(src_grpc_method_status{grpc_service=~"sourcegraph.zoekt.configuration.v1.ZoektConfigurationService"}[2m])))))))
frontend: zoekt_configuration_grpc_clients_error_percentage_per_method
Client baseline error percentage per-method over 2m
The percentage of gRPC requests that fail per method (regardless of whether or not there was an internal error), aggregated across all "zoekt_configuration" clients.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=101901 on your Sourcegraph instance.
Technical details
Query:
SHELL(100.0 * ((((sum(rate(src_grpc_method_status{grpc_service=~"sourcegraph.zoekt.configuration.v1.ZoektConfigurationService",grpc_method=~"${zoekt_configuration_method:regex}",grpc_code!="OK"}[2m])) by (grpc_method))) / ((sum(rate(src_grpc_method_status{grpc_service=~"sourcegraph.zoekt.configuration.v1.ZoektConfigurationService",grpc_method=~"${zoekt_configuration_method:regex}"}[2m])) by (grpc_method))))))
frontend: zoekt_configuration_grpc_clients_all_codes_per_method
Client baseline response codes rate per-method over 2m
The rate of all generated gRPC response codes per method (regardless of whether or not there was an internal error), aggregated across all "zoekt_configuration" clients.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=101902 on your Sourcegraph instance.
Technical details
Query:
SHELL(sum(rate(src_grpc_method_status{grpc_service=~"sourcegraph.zoekt.configuration.v1.ZoektConfigurationService",grpc_method=~"${zoekt_configuration_method:regex}"}[2m])) by (grpc_method, grpc_code))
frontend: zoekt_configuration_grpc_clients_internal_error_percentage_all_methods
Client-observed gRPC internal error percentage across all methods over 2m
The percentage of gRPC requests that appear to fail due to gRPC internal errors across all methods, aggregated across all "zoekt_configuration" clients.
Note: Internal errors are ones that appear to originate from the https://github.com/grpc/grpc-go library itself, rather than from any user-written application code. These errors can be caused by a variety of issues, and can originate from either the code-generated "zoekt_configuration" gRPC client or gRPC server. These errors might be solvable by adjusting the gRPC configuration, or they might indicate a bug in Sourcegraph's use of gRPC.
When debugging, knowing that a particular error comes from the grpc-go library itself (an internal error) as opposed to normal application code can be helpful when trying to fix it.
Note: Internal errors are detected via a very coarse heuristic (seeing if the error starts with grpc:, etc.). Because of this, it's possible that some gRPC-specific issues might not be categorized as internal errors.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=101910 on your Sourcegraph instance.
Technical details
Query:
SHELL(100.0 * ((((sum(rate(src_grpc_method_status{grpc_service=~"sourcegraph.zoekt.configuration.v1.ZoektConfigurationService",grpc_code!="OK",is_internal_error="true"}[2m])))) / ((sum(rate(src_grpc_method_status{grpc_service=~"sourcegraph.zoekt.configuration.v1.ZoektConfigurationService"}[2m])))))))
frontend: zoekt_configuration_grpc_clients_internal_error_percentage_per_method
Client-observed gRPC internal error percentage per-method over 2m
The percentage of gRPC requests that appear to fail due to gRPC internal errors per method, aggregated across all "zoekt_configuration" clients.
Note: Internal errors are ones that appear to originate from the https://github.com/grpc/grpc-go library itself, rather than from any user-written application code. These errors can be caused by a variety of issues, and can originate from either the code-generated "zoekt_configuration" gRPC client or gRPC server. These errors might be solvable by adjusting the gRPC configuration, or they might indicate a bug in Sourcegraph's use of gRPC.
When debugging, knowing that a particular error comes from the grpc-go library itself (an internal error) as opposed to normal application code can be helpful when trying to fix it.
Note: Internal errors are detected via a very coarse heuristic (seeing if the error starts with grpc:, etc.). Because of this, it's possible that some gRPC-specific issues might not be categorized as internal errors.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=101911 on your Sourcegraph instance.
Technical details
Query:
SHELL(100.0 * ((((sum(rate(src_grpc_method_status{grpc_service=~"sourcegraph.zoekt.configuration.v1.ZoektConfigurationService",grpc_method=~"${zoekt_configuration_method:regex}",grpc_code!="OK",is_internal_error="true"}[2m])) by (grpc_method))) / ((sum(rate(src_grpc_method_status{grpc_service=~"sourcegraph.zoekt.configuration.v1.ZoektConfigurationService",grpc_method=~"${zoekt_configuration_method:regex}"}[2m])) by (grpc_method))))))
frontend: zoekt_configuration_grpc_clients_internal_error_all_codes_per_method
Client-observed gRPC internal error response code rate per-method over 2m
The rate of gRPC internal-error response codes per method, aggregated across all "zoekt_configuration" clients.
Note: Internal errors are ones that appear to originate from the https://github.com/grpc/grpc-go library itself, rather than from any user-written application code. These errors can be caused by a variety of issues, and can originate from either the code-generated "zoekt_configuration" gRPC client or gRPC server. These errors might be solvable by adjusting the gRPC configuration, or they might indicate a bug in Sourcegraph's use of gRPC.
When debugging, knowing that a particular error comes from the grpc-go library itself (an internal error) as opposed to normal application code can be helpful when trying to fix it.
Note: Internal errors are detected via a very coarse heuristic (seeing if the error starts with grpc:, etc.). Because of this, it's possible that some gRPC-specific issues might not be categorized as internal errors.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=101912 on your Sourcegraph instance.
Technical details
Query:
SHELL(sum(rate(src_grpc_method_status{grpc_service=~"sourcegraph.zoekt.configuration.v1.ZoektConfigurationService",is_internal_error="true",grpc_method=~"${zoekt_configuration_method:regex}"}[2m])) by (grpc_method, grpc_code))
Frontend: Zoekt Configuration GRPC retry metrics
frontend: zoekt_configuration_grpc_clients_retry_percentage_across_all_methods
Client retry percentage across all methods over 2m
The percentage of gRPC requests that were retried across all methods, aggregated across all "zoekt_configuration" clients.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=102000 on your Sourcegraph instance.
Technical details
Query:
SHELL(100.0 * ((((sum(rate(src_grpc_client_retry_attempts_total{grpc_service=~"sourcegraph.zoekt.configuration.v1.ZoektConfigurationService",is_retried="true"}[2m])))) / ((sum(rate(src_grpc_client_retry_attempts_total{grpc_service=~"sourcegraph.zoekt.configuration.v1.ZoektConfigurationService"}[2m])))))))
frontend: zoekt_configuration_grpc_clients_retry_percentage_per_method
Client retry percentage per-method over 2m
The percentage of gRPC requests that were retried aggregated across all "zoekt_configuration" clients, broken out per method.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=102001 on your Sourcegraph instance.
Technical details
Query:
SHELL(100.0 * ((((sum(rate(src_grpc_client_retry_attempts_total{grpc_service=~"sourcegraph.zoekt.configuration.v1.ZoektConfigurationService",is_retried="true",grpc_method=~"${zoekt_configuration_method:regex}"}[2m])) by (grpc_method))) / ((sum(rate(src_grpc_client_retry_attempts_total{grpc_service=~"sourcegraph.zoekt.configuration.v1.ZoektConfigurationService",grpc_method=~"${zoekt_configuration_method:regex}"}[2m])) by (grpc_method))))))
frontend: zoekt_configuration_grpc_clients_retry_count_per_method
Client retry count per-method over 2m
The count of gRPC requests that were retried, aggregated across all "zoekt_configuration" clients, broken out per method.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=102002 on your Sourcegraph instance.
Technical details
Query:
SHELL(sum(rate(src_grpc_client_retry_attempts_total{grpc_service=~"sourcegraph.zoekt.configuration.v1.ZoektConfigurationService",grpc_method=~"${zoekt_configuration_method:regex}",is_retried="true"}[2m])) by (grpc_method))
Frontend: Internal Api GRPC server metrics
frontend: internal_api_grpc_request_rate_all_methods
Request rate across all methods over 2m
The number of gRPC requests received per second across all methods, aggregated across all instances.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=102100 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(rate(grpc_server_started_total{instance=~`${internalInstance:regex}`,grpc_service=~"api.internalapi.v1.ConfigService"}[2m]))
frontend: internal_api_grpc_request_rate_per_method
Request rate per-method over 2m
The number of gRPC requests received per second broken out per method, aggregated across all instances.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=102101 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(rate(grpc_server_started_total{grpc_method=~`${internal_api_method:regex}`,instance=~`${internalInstance:regex}`,grpc_service=~"api.internalapi.v1.ConfigService"}[2m])) by (grpc_method)
frontend: internal_api_error_percentage_all_methods
Error percentage across all methods over 2m
The percentage of gRPC requests that fail across all methods, aggregated across all instances.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=102110 on your Sourcegraph instance.
Technical details
Query:
SHELL(100.0 * ( (sum(rate(grpc_server_handled_total{grpc_code!="OK",instance=~`${internalInstance:regex}`,grpc_service=~"api.internalapi.v1.ConfigService"}[2m]))) / (sum(rate(grpc_server_handled_total{instance=~`${internalInstance:regex}`,grpc_service=~"api.internalapi.v1.ConfigService"}[2m]))) ))
frontend: internal_api_grpc_error_percentage_per_method
Error percentage per-method over 2m
The percentage of gRPC requests that fail per method, aggregated across all instances.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=102111 on your Sourcegraph instance.
Technical details
Query:
SHELL(100.0 * ( (sum(rate(grpc_server_handled_total{grpc_method=~`${internal_api_method:regex}`,grpc_code!="OK",instance=~`${internalInstance:regex}`,grpc_service=~"api.internalapi.v1.ConfigService"}[2m])) by (grpc_method)) / (sum(rate(grpc_server_handled_total{grpc_method=~`${internal_api_method:regex}`,instance=~`${internalInstance:regex}`,grpc_service=~"api.internalapi.v1.ConfigService"}[2m])) by (grpc_method)) ))
frontend: internal_api_p99_response_time_per_method
99th percentile response time per method over 2m
The 99th percentile response time per method, aggregated across all instances.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=102120 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.99, sum by (le, name, grpc_method)(rate(grpc_server_handling_seconds_bucket{grpc_method=~`${internal_api_method:regex}`,instance=~`${internalInstance:regex}`,grpc_service=~"api.internalapi.v1.ConfigService"}[2m])))
frontend: internal_api_p90_response_time_per_method
90th percentile response time per method over 2m
The 90th percentile response time per method, aggregated across all instances.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=102121 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.90, sum by (le, name, grpc_method)(rate(grpc_server_handling_seconds_bucket{grpc_method=~`${internal_api_method:regex}`,instance=~`${internalInstance:regex}`,grpc_service=~"api.internalapi.v1.ConfigService"}[2m])))
frontend: internal_api_p75_response_time_per_method
75th percentile response time per method over 2m
The 75th percentile response time per method, aggregated across all instances.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=102122 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.75, sum by (le, name, grpc_method)(rate(grpc_server_handling_seconds_bucket{grpc_method=~`${internal_api_method:regex}`,instance=~`${internalInstance:regex}`,grpc_service=~"api.internalapi.v1.ConfigService"}[2m])))
frontend: internal_api_p99_9_response_size_per_method
99.9th percentile total response size per method over 2m
The 99.9th percentile total per-RPC response size per method, aggregated across all instances.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=102130 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.999, sum by (le, name, grpc_method)(rate(src_grpc_server_sent_bytes_per_rpc_bucket{grpc_method=~`${internal_api_method:regex}`,instance=~`${internalInstance:regex}`,grpc_service=~"api.internalapi.v1.ConfigService"}[2m])))
frontend: internal_api_p90_response_size_per_method
90th percentile total response size per method over 2m
The 90th percentile total per-RPC response size per method, aggregated across all instances.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=102131 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.90, sum by (le, name, grpc_method)(rate(src_grpc_server_sent_bytes_per_rpc_bucket{grpc_method=~`${internal_api_method:regex}`,instance=~`${internalInstance:regex}`,grpc_service=~"api.internalapi.v1.ConfigService"}[2m])))
frontend: internal_api_p75_response_size_per_method
75th percentile total response size per method over 2m
The 75th percentile total per-RPC response size per method, aggregated across all instances.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=102132 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.75, sum by (le, name, grpc_method)(rate(src_grpc_server_sent_bytes_per_rpc_bucket{grpc_method=~`${internal_api_method:regex}`,instance=~`${internalInstance:regex}`,grpc_service=~"api.internalapi.v1.ConfigService"}[2m])))
frontend: internal_api_p99_9_invididual_sent_message_size_per_method
99.9th percentile individual sent message size per method over 2m
The 99.9th percentile size of each individual protocol buffer message sent by the service per method, aggregated across all instances.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=102140 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.999, sum by (le, name, grpc_method)(rate(src_grpc_server_sent_individual_message_size_bytes_per_rpc_bucket{grpc_method=~`${internal_api_method:regex}`,instance=~`${internalInstance:regex}`,grpc_service=~"api.internalapi.v1.ConfigService"}[2m])))
frontend: internal_api_p90_invididual_sent_message_size_per_method
90th percentile individual sent message size per method over 2m
The 90th percentile size of each individual protocol buffer message sent by the service per method, aggregated across all instances.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=102141 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.90, sum by (le, name, grpc_method)(rate(src_grpc_server_sent_individual_message_size_bytes_per_rpc_bucket{grpc_method=~`${internal_api_method:regex}`,instance=~`${internalInstance:regex}`,grpc_service=~"api.internalapi.v1.ConfigService"}[2m])))
frontend: internal_api_p75_invididual_sent_message_size_per_method
75th percentile individual sent message size per method over 2m
The 75th percentile size of each individual protocol buffer message sent by the service per method, aggregated across all instances.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=102142 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.75, sum by (le, name, grpc_method)(rate(src_grpc_server_sent_individual_message_size_bytes_per_rpc_bucket{grpc_method=~`${internal_api_method:regex}`,instance=~`${internalInstance:regex}`,grpc_service=~"api.internalapi.v1.ConfigService"}[2m])))
frontend: internal_api_grpc_response_stream_message_count_per_method
Average streaming response message count per-method over 2m
The average number of response messages sent during a streaming RPC method, broken out per method, aggregated across all instances.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=102150 on your Sourcegraph instance.
Technical details
Query:
SHELL((sum(rate(grpc_server_msg_sent_total{grpc_type="server_stream",instance=~`${internalInstance:regex}`,grpc_service=~"api.internalapi.v1.ConfigService"}[2m])) by (grpc_method))/(sum(rate(grpc_server_started_total{grpc_type="server_stream",instance=~`${internalInstance:regex}`,grpc_service=~"api.internalapi.v1.ConfigService"}[2m])) by (grpc_method)))
frontend: internal_api_grpc_all_codes_per_method
Response codes rate per-method over 2m
The rate of all generated gRPC response codes per method, aggregated across all instances.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=102160 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(rate(grpc_server_handled_total{grpc_method=~`${internal_api_method:regex}`,instance=~`${internalInstance:regex}`,grpc_service=~"api.internalapi.v1.ConfigService"}[2m])) by (grpc_method, grpc_code)
Frontend: Internal Api GRPC "internal error" metrics
frontend: internal_api_grpc_clients_error_percentage_all_methods
Client baseline error percentage across all methods over 2m
The percentage of gRPC requests that fail across all methods (regardless of whether or not there was an internal error), aggregated across all "internal_api" clients.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=102200 on your Sourcegraph instance.
Technical details
Query:
SHELL(100.0 * ((((sum(rate(src_grpc_method_status{grpc_service=~"api.internalapi.v1.ConfigService",grpc_code!="OK"}[2m])))) / ((sum(rate(src_grpc_method_status{grpc_service=~"api.internalapi.v1.ConfigService"}[2m])))))))
frontend: internal_api_grpc_clients_error_percentage_per_method
Client baseline error percentage per-method over 2m
The percentage of gRPC requests that fail per method (regardless of whether or not there was an internal error), aggregated across all "internal_api" clients.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=102201 on your Sourcegraph instance.
Technical details
Query:
SHELL(100.0 * ((((sum(rate(src_grpc_method_status{grpc_service=~"api.internalapi.v1.ConfigService",grpc_method=~"${internal_api_method:regex}",grpc_code!="OK"}[2m])) by (grpc_method))) / ((sum(rate(src_grpc_method_status{grpc_service=~"api.internalapi.v1.ConfigService",grpc_method=~"${internal_api_method:regex}"}[2m])) by (grpc_method))))))
frontend: internal_api_grpc_clients_all_codes_per_method
Client baseline response codes rate per-method over 2m
The rate of all generated gRPC response codes per method (regardless of whether or not there was an internal error), aggregated across all "internal_api" clients.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=102202 on your Sourcegraph instance.
Technical details
Query:
SHELL(sum(rate(src_grpc_method_status{grpc_service=~"api.internalapi.v1.ConfigService",grpc_method=~"${internal_api_method:regex}"}[2m])) by (grpc_method, grpc_code))
frontend: internal_api_grpc_clients_internal_error_percentage_all_methods
Client-observed gRPC internal error percentage across all methods over 2m
The percentage of gRPC requests that appear to fail due to gRPC internal errors across all methods, aggregated across all "internal_api" clients.
Note: Internal errors are ones that appear to originate from the https://github.com/grpc/grpc-go library itself, rather than from any user-written application code. These errors can be caused by a variety of issues, and can originate from either the code-generated "internal_api" gRPC client or gRPC server. These errors might be solvable by adjusting the gRPC configuration, or they might indicate a bug in Sourcegraph's use of gRPC.
When debugging, knowing that a particular error comes from the grpc-go library itself (an internal error) as opposed to normal application code can be helpful when trying to fix it.
Note: Internal errors are detected via a very coarse heuristic (seeing if the error starts with grpc:, etc.). Because of this, it's possible that some gRPC-specific issues might not be categorized as internal errors.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=102210 on your Sourcegraph instance.
Technical details
Query:
SHELL(100.0 * ((((sum(rate(src_grpc_method_status{grpc_service=~"api.internalapi.v1.ConfigService",grpc_code!="OK",is_internal_error="true"}[2m])))) / ((sum(rate(src_grpc_method_status{grpc_service=~"api.internalapi.v1.ConfigService"}[2m])))))))
frontend: internal_api_grpc_clients_internal_error_percentage_per_method
Client-observed gRPC internal error percentage per-method over 2m
The percentage of gRPC requests that appear to fail due to gRPC internal errors per method, aggregated across all "internal_api" clients.
Note: Internal errors are ones that appear to originate from the https://github.com/grpc/grpc-go library itself, rather than from any user-written application code. These errors can be caused by a variety of issues, and can originate from either the code-generated "internal_api" gRPC client or gRPC server. These errors might be solvable by adjusting the gRPC configuration, or they might indicate a bug in Sourcegraph's use of gRPC.
When debugging, knowing that a particular error comes from the grpc-go library itself (an internal error) as opposed to normal application code can be helpful when trying to fix it.
Note: Internal errors are detected via a very coarse heuristic (seeing if the error starts with grpc:, etc.). Because of this, it's possible that some gRPC-specific issues might not be categorized as internal errors.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=102211 on your Sourcegraph instance.
Technical details
Query:
SHELL(100.0 * ((((sum(rate(src_grpc_method_status{grpc_service=~"api.internalapi.v1.ConfigService",grpc_method=~"${internal_api_method:regex}",grpc_code!="OK",is_internal_error="true"}[2m])) by (grpc_method))) / ((sum(rate(src_grpc_method_status{grpc_service=~"api.internalapi.v1.ConfigService",grpc_method=~"${internal_api_method:regex}"}[2m])) by (grpc_method))))))
frontend: internal_api_grpc_clients_internal_error_all_codes_per_method
Client-observed gRPC internal error response code rate per-method over 2m
The rate of gRPC internal-error response codes per method, aggregated across all "internal_api" clients.
Note: Internal errors are ones that appear to originate from the https://github.com/grpc/grpc-go library itself, rather than from any user-written application code. These errors can be caused by a variety of issues, and can originate from either the code-generated "internal_api" gRPC client or gRPC server. These errors might be solvable by adjusting the gRPC configuration, or they might indicate a bug in Sourcegraph's use of gRPC.
When debugging, knowing that a particular error comes from the grpc-go library itself (an internal error) as opposed to normal application code can be helpful when trying to fix it.
Note: Internal errors are detected via a very coarse heuristic (seeing if the error starts with grpc:, etc.). Because of this, it's possible that some gRPC-specific issues might not be categorized as internal errors.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=102212 on your Sourcegraph instance.
Technical details
Query:
SHELL(sum(rate(src_grpc_method_status{grpc_service=~"api.internalapi.v1.ConfigService",is_internal_error="true",grpc_method=~"${internal_api_method:regex}"}[2m])) by (grpc_method, grpc_code))
Frontend: Internal Api GRPC retry metrics
frontend: internal_api_grpc_clients_retry_percentage_across_all_methods
Client retry percentage across all methods over 2m
The percentage of gRPC requests that were retried across all methods, aggregated across all "internal_api" clients.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=102300 on your Sourcegraph instance.
Technical details
Query:
SHELL(100.0 * ((((sum(rate(src_grpc_client_retry_attempts_total{grpc_service=~"api.internalapi.v1.ConfigService",is_retried="true"}[2m])))) / ((sum(rate(src_grpc_client_retry_attempts_total{grpc_service=~"api.internalapi.v1.ConfigService"}[2m])))))))
frontend: internal_api_grpc_clients_retry_percentage_per_method
Client retry percentage per-method over 2m
The percentage of gRPC requests that were retried aggregated across all "internal_api" clients, broken out per method.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=102301 on your Sourcegraph instance.
Technical details
Query:
SHELL(100.0 * ((((sum(rate(src_grpc_client_retry_attempts_total{grpc_service=~"api.internalapi.v1.ConfigService",is_retried="true",grpc_method=~"${internal_api_method:regex}"}[2m])) by (grpc_method))) / ((sum(rate(src_grpc_client_retry_attempts_total{grpc_service=~"api.internalapi.v1.ConfigService",grpc_method=~"${internal_api_method:regex}"}[2m])) by (grpc_method))))))
frontend: internal_api_grpc_clients_retry_count_per_method
Client retry count per-method over 2m
The count of gRPC requests that were retried, aggregated across all "internal_api" clients, broken out per method.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=102302 on your Sourcegraph instance.
Technical details
Query:
SHELL(sum(rate(src_grpc_client_retry_attempts_total{grpc_service=~"api.internalapi.v1.ConfigService",grpc_method=~"${internal_api_method:regex}",is_retried="true"}[2m])) by (grpc_method))
Frontend: Internal service requests
frontend: internal_indexed_search_error_responses
Internal indexed search error responses every 5m
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=102400 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by(code) (increase(src_zoekt_request_duration_seconds_count{code!~"2.."}[5m])) / ignoring(code) group_left sum(increase(src_zoekt_request_duration_seconds_count[5m])) * 100
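The ignoring(code) group_left division in the query above turns the per-code error counts into a percentage of all internal indexed search requests. If you only want the raw non-2xx counts by status code, a minimal sketch using the same metric is:
SHELLsum by(code) (increase(src_zoekt_request_duration_seconds_count{code!~"2.."}[5m]))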
frontend: internal_unindexed_search_error_responses
Internal unindexed search error responses every 5m
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=102401 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by(code) (increase(searcher_service_request_total{code!~"2.."}[5m])) / ignoring(code) group_left sum(increase(searcher_service_request_total[5m])) * 100
frontend: 99th_percentile_gitserver_duration
99th percentile successful gitserver query duration over 5m
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=102410 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.99, sum by (le,category)(rate(src_gitserver_request_duration_seconds_bucket{job=~"(sourcegraph-)?frontend"}[5m])))
frontend: gitserver_error_responses
Gitserver error responses every 5m
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=102411 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (category)(increase(src_gitserver_request_duration_seconds_count{job=~"(sourcegraph-)?frontend",code!~"2.."}[5m])) / ignoring(code) group_left sum by (category)(increase(src_gitserver_request_duration_seconds_count{job=~"(sourcegraph-)?frontend"}[5m])) * 100
frontend: observability_test_alert_warning
Warning test alert metric
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=102420 on your Sourcegraph instance.
Technical details
Query:
SHELLmax by(owner) (observability_test_metric_warning)
frontend: observability_test_alert_critical
Critical test alert metric
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=102421 on your Sourcegraph instance.
Technical details
Query:
SHELLmax by(owner) (observability_test_metric_critical)
Frontend: Authentication API requests
frontend: sign_in_rate
Rate of API requests to sign-in
Rate (QPS) of requests to sign-in
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=102500 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(irate(src_http_request_duration_seconds_count{route="sign-in",method="post"}[5m]))
frontend: sign_in_latency_p99
99th percentile of sign-in latency
99th percentile of sign-in request latency
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=102501 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.99, sum(rate(src_http_request_duration_seconds_bucket{route="sign-in",method="post"}[5m])) by (le))
frontend: sign_in_error_rate
Percentage of sign-in requests by http code
Percentage of sign-in requests grouped by http code
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=102502 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (code)(irate(src_http_request_duration_seconds_count{route="sign-in",method="post"}[5m]))/ ignoring (code) group_left sum(irate(src_http_request_duration_seconds_count{route="sign-in",method="post"}[5m]))*100
frontend: sign_up_rate
Rate of API requests to sign-up
Rate (QPS) of requests to sign-up
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=102510 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(irate(src_http_request_duration_seconds_count{route="sign-up",method="post"}[5m]))
frontend: sign_up_latency_p99
99th percentile of sign-up latency
99th percentile of sign-up request latency
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=102511 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.99, sum(rate(src_http_request_duration_seconds_bucket{route="sign-up",method="post"}[5m])) by (le))
frontend: sign_up_code_percentage
Percentage of sign-up requests by http code
Percentage of sign-up requests grouped by http code
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=102512 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (code)(irate(src_http_request_duration_seconds_count{route="sign-up",method="post"}[5m]))/ ignoring (code) group_left sum(irate(src_http_request_duration_seconds_count{route="sign-out"}[5m]))*100
frontend: sign_out_rate
Rate of API requests to sign-out
Rate (QPS) of requests to sign-out
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=102520 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(irate(src_http_request_duration_seconds_count{route="sign-out"}[5m]))
frontend: sign_out_latency_p99
99th percentile of sign-out latency
99th percentile of sign-out request latency
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=102521 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.99, sum(rate(src_http_request_duration_seconds_bucket{route="sign-out"}[5m])) by (le))
frontend: sign_out_error_rate
Percentage of sign-out requests that return non-303 http code
Percentage of sign-out requests grouped by http code
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=102522 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (code)(irate(src_http_request_duration_seconds_count{route="sign-out"}[5m]))/ ignoring (code) group_left sum(irate(src_http_request_duration_seconds_count{route="sign-out"}[5m]))*100
frontend: account_failed_sign_in_attempts
Rate of failed sign-in attempts
Failed sign-in attempts per minute
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=102530 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(rate(src_frontend_account_failed_sign_in_attempts_total[1m]))
frontend: account_lockouts
Rate of account lockouts
Account lockouts per minute
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=102531 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(rate(src_frontend_account_lockouts_total[1m]))
Frontend: External HTTP Request Rate
frontend: external_http_request_rate_by_host
Rate of external HTTP requests by host over 1m
Shows the rate of external HTTP requests made by Sourcegraph to other services, broken down by host.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=102600 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (host) (rate(src_http_client_external_request_count{host=~`${httpRequestHost:regex}`}[1m]))
frontend: external_http_request_rate_by_host_by_code
Rate of external HTTP requests by host and response code over 1m
Shows the rate of external HTTP requests made by Sourcegraph to other services, broken down by host and response code.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=102610 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (host, status_code) (rate(src_http_client_external_request_count{host=~`${httpRequestHost:regex}`}[1m]))
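The httpRequestHost variable is supplied by the dashboard. When querying Prometheus directly you can substitute a concrete host value; in this sketch, github.com is only a placeholder for whichever hosts your instance actually talks to:
SHELLsum by (status_code) (rate(src_http_client_external_request_count{host="github.com"}[1m]))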
Frontend: Cody API requests
frontend: cody_api_rate
Rate of API requests to cody endpoints (excluding GraphQL)
Rate (QPS) of requests to Cody-related endpoints. completions.stream covers the conversational endpoints; completions.code covers the code auto-complete endpoints.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=102700 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (route, code)(irate(src_http_request_duration_seconds_count{route=~"^completions.*"}[5m]))
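Because the route label distinguishes the Cody endpoints, you can isolate a single one. A minimal sketch that charts only the code auto-complete endpoint mentioned above (assuming the route value is exactly completions.code) is:
SHELLsum by (code)(irate(src_http_request_duration_seconds_count{route="completions.code"}[5m]))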
Frontend: Cloud KMS and cache
frontend: cloudkms_cryptographic_requests
Cryptographic requests to Cloud KMS every 1m
Refer to the alerts reference for 2 alerts related to this panel.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=102800 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_cloudkms_cryptographic_total[1m]))
frontend: encryption_cache_hit_ratio
Average encryption cache hit ratio per workload
Encryption cache hit ratio (hits/(hits+misses)) - minimum across all instances of a workload.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=102801 on your Sourcegraph instance.
Technical details
Query:
SHELLmin by (kubernetes_name) (src_encryption_cache_hit_total/(src_encryption_cache_hit_total+src_encryption_cache_miss_total))
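Because the query above divides raw counters, it reports the hit ratio over the lifetime of each instance. A hedged variation that looks only at recent traffic (the same metrics, wrapped in a 5m rate) is:
SHELLmin by (kubernetes_name) (rate(src_encryption_cache_hit_total[5m]) / (rate(src_encryption_cache_hit_total[5m]) + rate(src_encryption_cache_miss_total[5m])))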
frontend: encryption_cache_evictions
Rate of encryption cache evictions - sum across all instances of a given workload
Rate of encryption cache evictions (caused by the cache exceeding its maximum size) - sum across all instances of a workload.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=102802 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (kubernetes_name) (irate(src_encryption_cache_eviction_total[5m]))
Frontend: Periodic Goroutines
frontend: running_goroutines
Number of currently running periodic goroutines
The number of currently running periodic goroutines by name and job.
A value of 0 indicates the routine isn't currently running; it is awaiting its next scheduled run.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=102900 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (name, job_name) (src_periodic_goroutine_running{job=~".*frontend.*"})
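Since a value of 0 means a routine is idle between runs, a minimal sketch for listing only the routines that are currently idle is:
SHELLsum by (name, job_name) (src_periodic_goroutine_running{job=~".*frontend.*"}) == 0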
frontend: goroutine_success_rate
Success rate for periodic goroutine executions
The rate of successful executions of each periodic goroutine. A low or zero value could indicate that a routine is stalled or encountering errors.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=102901 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (name, job_name) (rate(src_periodic_goroutine_total{job=~".*frontend.*"}[5m]))
frontend: goroutine_error_rate
Error rate for periodic goroutine executions
The rate of errors encountered by each periodic goroutine. A sustained high error rate may indicate a problem with the routine's configuration or dependencies.
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=102910 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (name, job_name) (rate(src_periodic_goroutine_errors_total{job=~".*frontend.*"}[5m]))
frontend: goroutine_error_percentage
Percentage of periodic goroutine executions that result in errors
The percentage of executions that result in errors for each periodic goroutine. A value above 5% indicates that a significant portion of routine executions are failing.
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=102911 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (name, job_name) (rate(src_periodic_goroutine_errors_total{job=~".*frontend.*"}[5m])) / sum by (name, job_name) (rate(src_periodic_goroutine_total{job=~".*frontend.*"}[5m]) > 0) * 100
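The > 0 filter on the denominator avoids dividing by zero for routines that did not run during the window. To focus on a single routine, a hedged sketch is shown below; example-routine is a placeholder, so substitute a name value reported by the panels above:
SHELLsum by (name, job_name) (rate(src_periodic_goroutine_errors_total{name="example-routine",job=~".*frontend.*"}[5m])) / sum by (name, job_name) (rate(src_periodic_goroutine_total{name="example-routine",job=~".*frontend.*"}[5m]) > 0) * 100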
frontend: goroutine_handler_duration
95th percentile handler execution time
The 95th percentile execution time for each periodic goroutine handler. Longer durations might indicate increased load or processing time.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=102920 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.95, sum by (name, job_name, le) (rate(src_periodic_goroutine_duration_seconds_bucket{job=~".*frontend.*"}[5m])))
frontend: goroutine_loop_duration
95th percentile loop cycle time
The 95th percentile loop cycle time for each periodic goroutine (excluding sleep time). This represents how long a complete loop iteration takes before sleeping for the next interval.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=102921 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.95, sum by (name, job_name, le) (rate(src_periodic_goroutine_loop_duration_seconds_bucket{job=~".*frontend.*"}[5m])))
frontend: tenant_processing_duration
95th percentile tenant processing time
The 95th percentile processing time for individual tenants within periodic goroutines. Higher values indicate that tenant processing is taking longer and may affect overall performance.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=102930 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.95, sum by (name, job_name, le) (rate(src_periodic_goroutine_tenant_duration_seconds_bucket{job=~".*frontend.*"}[5m])))
frontend: tenant_processing_max
Maximum tenant processing time
The maximum processing time for individual tenants within periodic goroutines. Consistently high values might indicate problematic tenants or inefficient processing.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=102931 on your Sourcegraph instance.
Technical details
Query:
SHELLmax by (name, job_name) (rate(src_periodic_goroutine_tenant_duration_seconds_sum{job=~".*frontend.*"}[5m]) / rate(src_periodic_goroutine_tenant_duration_seconds_count{job=~".*frontend.*"}[5m]))
frontend: tenant_count
Number of tenants processed per routine
The number of tenants processed by each periodic goroutine. Unexpected changes can indicate tenant configuration issues or scaling events.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=102940 on your Sourcegraph instance.
Technical details
Query:
SHELLmax by (name, job_name) (src_periodic_goroutine_tenant_count{job=~".*frontend.*"})
frontend: tenant_success_rate
Rate of successful tenant processing operations
The rate of successful tenant processing operations. A healthy routine should maintain a consistent processing rate.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=102941 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (name, job_name) (rate(src_periodic_goroutine_tenant_success_total{job=~".*frontend.*"}[5m]))
frontend: tenant_error_rate
Rate of tenant processing errors
The rate of tenant processing operations that result in errors. Consistent errors indicate problems with specific tenants.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=102950 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (name, job_name) (rate(src_periodic_goroutine_tenant_errors_total{job=~".*frontend.*"}[5m]))
frontend: tenant_error_percentage
Percentage of tenant operations resulting in errors
The percentage of tenant operations that result in errors. Values above 5% indicate significant tenant processing problems.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=102951 on your Sourcegraph instance.
Technical details
Query:
SHELL(sum by (name, job_name) (rate(src_periodic_goroutine_tenant_errors_total{job=~".*frontend.*"}[5m])) / (sum by (name, job_name) (rate(src_periodic_goroutine_tenant_success_total{job=~".*frontend.*"}[5m])) + sum by (name, job_name) (rate(src_periodic_goroutine_tenant_errors_total{job=~".*frontend.*"}[5m])))) * 100
Frontend: Database connections
frontend: max_open_conns
Maximum open
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=103000 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (app_name, db_name) (src_pgsql_conns_max_open{app_name="frontend"})
frontend: open_conns
Established
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=103001 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (app_name, db_name) (src_pgsql_conns_open{app_name="frontend"})
frontend: in_use
Used
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=103010 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (app_name, db_name) (src_pgsql_conns_in_use{app_name="frontend"})
frontend: idle
Idle
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=103011 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (app_name, db_name) (src_pgsql_conns_idle{app_name="frontend"})
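The four panels above can be combined into a single pool-utilization figure. A minimal sketch (the same metrics, expressed as in-use connections over the configured maximum) is:
SHELLsum by (app_name, db_name) (src_pgsql_conns_in_use{app_name="frontend"}) / sum by (app_name, db_name) (src_pgsql_conns_max_open{app_name="frontend"}) * 100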
frontend: mean_blocked_seconds_per_conn_request
Mean blocked seconds per conn request
Refer to the alerts reference for 2 alerts related to this panel.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=103020 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (app_name, db_name) (increase(src_pgsql_conns_blocked_seconds{app_name="frontend"}[5m])) / sum by (app_name, db_name) (increase(src_pgsql_conns_waited_for{app_name="frontend"}[5m]))
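As a worked example: if connection requests spent a combined 2.5 seconds blocked while 50 requests waited during the 5m window, this panel reports 0.05 seconds per request. A hedged variation over a longer window, which can help smooth short spikes, is:
SHELLsum by (app_name, db_name) (increase(src_pgsql_conns_blocked_seconds{app_name="frontend"}[30m])) / sum by (app_name, db_name) (increase(src_pgsql_conns_waited_for{app_name="frontend"}[30m]))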
frontend: closed_max_idle
Closed by SetMaxIdleConns
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=103030 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (app_name, db_name) (increase(src_pgsql_conns_closed_max_idle{app_name="frontend"}[5m]))
frontend: closed_max_lifetime
Closed by SetConnMaxLifetime
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=103031 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (app_name, db_name) (increase(src_pgsql_conns_closed_max_lifetime{app_name="frontend"}[5m]))
frontend: closed_max_idle_time
Closed by SetConnMaxIdleTime
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=103032 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (app_name, db_name) (increase(src_pgsql_conns_closed_max_idle_time{app_name="frontend"}[5m]))
Frontend: (frontend|sourcegraph-frontend) (CPU, Memory)
frontend: cpu_usage_percentage
CPU usage
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=103100 on your Sourcegraph instance.
Technical details
Query:
SHELLcadvisor_container_cpu_usage_percentage_total{name=~"^(frontend|sourcegraph-frontend).*"}
frontend: memory_usage_percentage
Memory usage percentage (total)
An estimate for the active memory in use, which includes anonymous memory, file memory, and kernel memory. Some of this memory is reclaimable, so high usage does not necessarily indicate memory pressure.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=103101 on your Sourcegraph instance.
Technical details
Query:
SHELLcadvisor_container_memory_usage_percentage_total{name=~"^(frontend|sourcegraph-frontend).*"}
frontend: memory_working_set_bytes
Memory usage bytes (total)
An estimate for the active memory in use in bytes, which includes anonymous memory, file memory, and kernel memory. Some of this memory is reclaimable, so high usage does not necessarily indicate memory pressure.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=103102 on your Sourcegraph instance.
Technical details
Query:
SHELLmax by (name) (container_memory_working_set_bytes{name=~"^(frontend|sourcegraph-frontend).*"})
frontend: memory_rss
Memory (RSS)
The total anonymous memory in use by the application, which includes Go stack and heap. This memory is non-reclaimable, and high usage may trigger OOM kills. Note: the metric is named RSS to match the cadvisor name, but anonymous is more accurate.
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=103110 on your Sourcegraph instance.
Technical details
Query:
SHELLmax(container_memory_rss{name=~"^(frontend|sourcegraph-frontend).*"} / container_spec_memory_limit_bytes{name=~"^(frontend|sourcegraph-frontend).*"}) by (name) * 100.0
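The query above expresses anonymous memory as a percentage of the container memory limit. To see the absolute values instead, a minimal sketch using the same underlying metric is:
SHELLmax by (name) (container_memory_rss{name=~"^(frontend|sourcegraph-frontend).*"})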
frontend: memory_total_active_file
Memory usage (active file)
This metric shows the total active file-backed memory currently in use by the application. Some of it may be reclaimable, so high usage does not necessarily indicate memory pressure.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=103111 on your Sourcegraph instance.
Technical details
Query:
SHELLmax(container_memory_total_active_file_bytes{name=~"^(frontend|sourcegraph-frontend).*"} / container_spec_memory_limit_bytes{name=~"^(frontend|sourcegraph-frontend).*"}) by (name) * 100.0
frontend: memory_kernel_usage
Memory usage (kernel)
The kernel usage metric shows the amount of memory used by the kernel on behalf of the application. Some of it may be reclaimable, so high usage does not necessarily indicate memory pressure.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=103112 on your Sourcegraph instance.
Technical details
Query:
SHELLmax(container_memory_kernel_usage{name=~"^(frontend|sourcegraph-frontend).*"} / container_spec_memory_limit_bytes{name=~"^(frontend|sourcegraph-frontend).*"}) by (name) * 100.0
Frontend: Container monitoring (not available on server)
frontend: container_missing
Container missing
This value is the number of times a container has not been seen for more than one minute. If you observe this value change independent of deployment events (such as an upgrade), it could indicate pods are being OOM killed or terminated for some other reason.
- Kubernetes:
  - Determine if the pod was OOM killed using kubectl describe pod (frontend|sourcegraph-frontend) (look for OOMKilled: true) and, if so, consider increasing the memory limit in the relevant Deployment.yaml.
  - Check the logs before the container restarted to see if there are panic: messages or similar using kubectl logs -p (frontend|sourcegraph-frontend).
- Docker Compose:
  - Determine if the container was OOM killed using docker inspect -f '{{json .State}}' (frontend|sourcegraph-frontend) (look for "OOMKilled":true) and, if so, consider increasing the memory limit of the (frontend|sourcegraph-frontend) container in docker-compose.yml.
  - Check the logs before the container restarted to see if there are panic: messages or similar using docker logs (frontend|sourcegraph-frontend) (note this will include logs from the previous and currently running container).
A condensed shell sketch of these checks is shown after the query below.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=103200 on your Sourcegraph instance.
Technical details
Query:
SHELLcount by(name) ((time() - container_last_seen{name=~"^(frontend|sourcegraph-frontend).*"}) > 60)
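The troubleshooting steps above condense into a few commands. A minimal shell sketch, assuming a pod/container named sourcegraph-frontend (real pod names carry a deployment-specific suffix):
SHELL# Kubernetes: was the pod OOM killed, and did it panic before restarting?
kubectl describe pod sourcegraph-frontend | grep -i oomkilled   # non-empty output means an OOM kill was recorded
kubectl logs -p sourcegraph-frontend | grep 'panic:'
# Docker Compose: the equivalent checks (look for "OOMKilled":true in the state JSON)
docker inspect -f '{{json .State}}' sourcegraph-frontend
docker logs sourcegraph-frontend 2>&1 | grep 'panic:'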
frontend: container_cpu_usage
Container cpu usage total (1m average) across all cores by instance
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=103201 on your Sourcegraph instance.
Technical details
Query:
SHELLcadvisor_container_cpu_usage_percentage_total{name=~"^(frontend|sourcegraph-frontend).*"}
frontend: container_memory_usage
Container memory usage by instance
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=103202 on your Sourcegraph instance.
Technical details
Query:
SHELLcadvisor_container_memory_usage_percentage_total{name=~"^(frontend|sourcegraph-frontend).*"}
frontend: fs_io_operations
Filesystem reads and writes rate by instance over 1h
This value indicates the number of filesystem read and write operations by containers of this service. When extremely high, this can indicate a resource usage problem, or can cause problems with the service itself, especially if high values or spikes correlate with (frontend|sourcegraph-frontend) issues.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=103203 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by(name) (rate(container_fs_reads_total{name=~"^(frontend|sourcegraph-frontend).*"}[1h]) + rate(container_fs_writes_total{name=~"^(frontend|sourcegraph-frontend).*"}[1h]))
Frontend: Provisioning indicators (not available on server)
frontend: provisioning_container_cpu_usage_long_term
Container cpu usage total (90th percentile over 1d) across all cores by instance
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=103300 on your Sourcegraph instance.
Technical details
Query:
SHELLquantile_over_time(0.9, cadvisor_container_cpu_usage_percentage_total{name=~"^(frontend|sourcegraph-frontend).*"}[1d])
frontend: provisioning_container_memory_usage_long_term
Container memory usage (1d maximum) by instance
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=103301 on your Sourcegraph instance.
Technical details
Query:
SHELLmax_over_time(cadvisor_container_memory_usage_percentage_total{name=~"^(frontend|sourcegraph-frontend).*"}[1d])
frontend: provisioning_container_cpu_usage_short_term
Container cpu usage total (5m maximum) across all cores by instance
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=103310 on your Sourcegraph instance.
Technical details
Query:
SHELLmax_over_time(cadvisor_container_cpu_usage_percentage_total{name=~"^(frontend|sourcegraph-frontend).*"}[5m])
frontend: provisioning_container_memory_usage_short_term
Container memory usage (5m maximum) by instance
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=103311 on your Sourcegraph instance.
Technical details
Query:
SHELLmax_over_time(cadvisor_container_memory_usage_percentage_total{name=~"^(frontend|sourcegraph-frontend).*"}[5m])
frontend: container_oomkill_events_total
Container OOMKILL events total by instance
This value indicates the total number of times the container main process or child processes were terminated by OOM killer. When it occurs frequently, it is an indicator of underprovisioning.
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=103312 on your Sourcegraph instance.
Technical details
Query:
SHELLmax by (name) (container_oom_events_total{name=~"^(frontend|sourcegraph-frontend).*"})
Frontend: Golang runtime monitoring
frontend: go_goroutines
Maximum active goroutines
A high value here indicates a possible goroutine leak.
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=103400 on your Sourcegraph instance.
Technical details
Query:
SHELLmax by(instance) (go_goroutines{job=~".*(frontend|sourcegraph-frontend)"})
frontend: go_gc_duration_seconds
Maximum go garbage collection duration
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=103401 on your Sourcegraph instance.
Technical details
Query:
SHELLmax by(instance) (go_gc_duration_seconds{job=~".*(frontend|sourcegraph-frontend)"})
Frontend: Kubernetes monitoring (only available on Kubernetes)
frontend: pods_available_percentage
Percentage pods available
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=103500 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by(app) (up{app=~".*(frontend|sourcegraph-frontend)"}) / count by (app) (up{app=~".*(frontend|sourcegraph-frontend)"}) * 100
Frontend: Search: Ranking
frontend: total_search_clicks
Total number of search clicks over 6h
The total number of search clicks across all search types over a 6 hour window.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=103600 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (ranked) (increase(src_search_ranking_result_clicked_count[6h]))
frontend: percent_clicks_on_top_search_result
Percent of clicks on top search result over 6h
The percent of clicks that were on the top search result, excluding searches with very few results (3 or fewer).
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=103601 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (ranked) (increase(src_search_ranking_result_clicked_bucket{le="1",resultsLength=">3"}[6h])) / sum by (ranked) (increase(src_search_ranking_result_clicked_count[6h])) * 100
frontend: percent_clicks_on_top_3_search_results
Percent of clicks on top 3 search results over 6h
The percent of clicks that were on the first 3 search results, excluding searches with very few results (3 or fewer).
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=103602 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (ranked) (increase(src_search_ranking_result_clicked_bucket{le="3",resultsLength=">3"}[6h])) / sum by (ranked) (increase(src_search_ranking_result_clicked_count[6h])) * 100
frontend: distribution_of_clicked_search_result_type_over_6h_in_percent
Distribution of clicked search result type over 6h
The distribution of clicked search results by result type. At every point in time, the values should sum to 100.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=103610 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_search_ranking_result_clicked_count{type="repo"}[6h])) / sum(increase(src_search_ranking_result_clicked_count[6h])) * 100
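The query above covers only the repo result type; the panel repeats the same expression once per result type so that the series sum to 100 at every point in time. A sketch of one more series, assuming content is another value of the type label:
SHELLsum(increase(src_search_ranking_result_clicked_count{type="content"}[6h])) / sum(increase(src_search_ranking_result_clicked_count[6h])) * 100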
frontend: percent_zoekt_searches_hitting_flush_limit
Percent of zoekt searches that hit the flush time limit
The percent of Zoekt searches that hit the flush time limit. These searches don't visit all matches, so they could be missing relevant results or be non-deterministic.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=103611 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(zoekt_final_aggregate_size_count{reason="timer_expired"}[1d])) / sum(increase(zoekt_final_aggregate_size_count[1d])) * 100
Frontend: Email delivery
frontend: email_delivery_failures
Email delivery failure rate over 30 minutes
Refer to the alerts reference for 2 alerts related to this panel.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=103700 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_email_send{success="false"}[30m])) / sum(increase(src_email_send[30m])) * 100
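When this rate is non-zero, breaking failures down by source can show which product feature is affected. A hedged sketch, assuming the email_source label is also set on failed sends:
SHELLsum by (email_source) (increase(src_email_send{success="false"}[30m]))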
frontend: email_deliveries_total
Total emails successfully delivered every 30 minutes
Total emails successfully delivered.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=103710 on your Sourcegraph instance.
Technical details
Query:
SHELLsum (increase(src_email_send{success="true"}[30m]))
frontend: email_deliveries_by_source
Emails successfully delivered every 30 minutes by source
Emails successfully delivered by source, i.e. product feature.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=103711 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (email_source) (increase(src_email_send{success="true"}[30m]))
Frontend: Sentinel queries (only on sourcegraph.com)
frontend: mean_successful_sentinel_duration_over_2h
Mean successful sentinel search duration over 2h
Mean search duration for all successful sentinel queries
Refer to the alerts reference for 2 alerts related to this panel.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=103800 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(rate(src_search_response_latency_seconds_sum{source=~`searchblitz.*`, status=`success`}[2h])) / sum(rate(src_search_response_latency_seconds_count{source=~`searchblitz.*`, status=`success`}[2h]))
frontend: mean_sentinel_stream_latency_over_2h
Mean successful sentinel stream latency over 2h
Mean time to first result for all successful streaming sentinel queries
Refer to the alerts reference for 2 alerts related to this panel.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=103801 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(rate(src_search_streaming_latency_seconds_sum{source=~"searchblitz.*"}[2h])) / sum(rate(src_search_streaming_latency_seconds_count{source=~"searchblitz.*"}[2h]))
frontend: 90th_percentile_successful_sentinel_duration_over_2h
90th percentile successful sentinel search duration over 2h
90th percentile search duration for all successful sentinel queries
Refer to the alerts reference for 2 alerts related to this panel.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=103810 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.90, sum by (le)(label_replace(rate(src_search_response_latency_seconds_bucket{source=~"searchblitz.*", status="success"}[2h]), "source", "$1", "source", "searchblitz_(.*)")))
frontend: 90th_percentile_sentinel_stream_latency_over_2h
90th percentile successful sentinel stream latency over 2h
90th percentile time to first result for all successful streaming sentinel queries
Refer to the alerts reference for 2 alerts related to this panel.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=103811 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.90, sum by (le)(label_replace(rate(src_search_streaming_latency_seconds_bucket{source=~"searchblitz.*"}[2h]), "source", "$1", "source", "searchblitz_(.*)")))
frontend: mean_successful_sentinel_duration_by_query
Mean successful sentinel search duration by query
Mean search duration for successful sentinel queries, broken down by query. Useful for debugging whether a slowdown is limited to a specific type of query.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=103820 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(rate(src_search_response_latency_seconds_sum{source=~"searchblitz.*", status="success"}[$sentinel_sampling_duration])) by (source) / sum(rate(src_search_response_latency_seconds_count{source=~"searchblitz.*", status="success"}[$sentinel_sampling_duration])) by (source)
frontend: mean_sentinel_stream_latency_by_query
Mean successful sentinel stream latency by query
Mean time to first result for successful streaming sentinel queries, broken down by query. Useful for debugging whether a slowdown is limited to a specific type of query.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=103821 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(rate(src_search_streaming_latency_seconds_sum{source=~"searchblitz.*"}[$sentinel_sampling_duration])) by (source) / sum(rate(src_search_streaming_latency_seconds_count{source=~"searchblitz.*"}[$sentinel_sampling_duration])) by (source)
frontend: 90th_percentile_successful_sentinel_duration_by_query
90th percentile successful sentinel search duration by query
90th percentile search duration for successful sentinel queries, broken down by query. Useful for debugging whether a slowdown is limited to a specific type of query.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=103830 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.90, sum(rate(src_search_response_latency_seconds_bucket{source=~"searchblitz.*", status="success"}[$sentinel_sampling_duration])) by (le, source))
frontend: 90th_percentile_successful_stream_latency_by_query
90th percentile successful sentinel stream latency by query
90th percentile time to first result for successful streaming sentinel queries, broken down by query. Useful for debugging whether a slowdown is limited to a specific type of query.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=103831 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.90, sum(rate(src_search_streaming_latency_seconds_bucket{source=~"searchblitz.*"}[$sentinel_sampling_duration])) by (le, source))
frontend: 90th_percentile_unsuccessful_duration_by_query
90th percentile unsuccessful sentinel search duration by query
90th percentile search duration of unsuccessful sentinel queries (by error or timeout), broken down by query. Useful for debugging how the performance of failed requests affects UX.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=103840 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.90, sum(rate(src_search_response_latency_seconds_bucket{source=~`searchblitz.*`, status!=`success`}[$sentinel_sampling_duration])) by (le, source))
frontend: 75th_percentile_successful_sentinel_duration_by_query
75th percentile successful sentinel search duration by query
75th percentile search duration of successful sentinel queries, broken down by query. Useful for debugging whether a slowdown is limited to a specific type of query.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=103850 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.75, sum(rate(src_search_response_latency_seconds_bucket{source=~"searchblitz.*", status="success"}[$sentinel_sampling_duration])) by (le, source))
frontend: 75th_percentile_successful_stream_latency_by_query
75th percentile successful sentinel stream latency by query
75th percentile time to first result for successful streaming sentinel queries, broken down by query. Useful for debugging whether a slowdown is limited to a specific type of query.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=103851 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.75, sum(rate(src_search_streaming_latency_seconds_bucket{source=~"searchblitz.*"}[$sentinel_sampling_duration])) by (le, source))
frontend: 75th_percentile_unsuccessful_duration_by_query
75th percentile unsuccessful sentinel search duration by query
75th percentile search duration of unsuccessful sentinel queries (by error or timeout), broken down by query. Useful for debugging how the performance of failed requests affects UX.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=103860 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.75, sum(rate(src_search_response_latency_seconds_bucket{source=~`searchblitz.*`, status!=`success`}[$sentinel_sampling_duration])) by (le, source))
frontend: unsuccessful_status_rate
Unsuccessful status rate
The rate of unsuccessful sentinel queries, broken down by failure type.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=103870 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(rate(src_graphql_search_response{source=~"searchblitz.*", status!="success"}[$sentinel_sampling_duration])) by (status)
Frontend: Incoming webhooks
frontend: p95_time_to_handle_incoming_webhooks
P95 time to handle incoming webhooks
p95 response time to incoming webhook requests from code hosts.
Increases in response time can point to too much load on the database to keep up with the incoming requests.
See this documentation page for more details on webhook requests: https://sourcegraph.com/docs/admin/config/webhooks/incoming
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=103900 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.95, sum (rate(src_http_request_duration_seconds_bucket{route=~"webhooks|github.webhooks|gitlab.webhooks|bitbucketServer.webhooks|bitbucketCloud.webhooks"}[5m])) by (le, route))
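To check whether a latency increase simply coincides with a spike in webhook traffic, the request volume can be read from the count series of the same histogram. A minimal sketch using the same route matcher:
SHELLsum by (route) (rate(src_http_request_duration_seconds_count{route=~"webhooks|github.webhooks|gitlab.webhooks|bitbucketServer.webhooks|bitbucketCloud.webhooks"}[5m]))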
Frontend: Search aggregations: proactive and expanded search aggregations
frontend: insights_aggregations_total
Aggregate search aggregations operations every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=104000 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_insights_aggregations_total{job=~"^(frontend|sourcegraph-frontend).*"}[5m]))
frontend: insights_aggregations_99th_percentile_duration
Aggregate successful search aggregations operation duration distribution over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=104001 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (le)(rate(src_insights_aggregations_duration_seconds_bucket{job=~"^(frontend|sourcegraph-frontend).*"}[5m]))
frontend: insights_aggregations_errors_total
Aggregate search aggregations operation errors every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=104002 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_insights_aggregations_errors_total{job=~"^(frontend|sourcegraph-frontend).*"}[5m]))
frontend: insights_aggregations_error_rate
Aggregate search aggregations operation error rate over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=104003 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_insights_aggregations_errors_total{job=~"^(frontend|sourcegraph-frontend).*"}[5m])) / (sum(increase(src_insights_aggregations_total{job=~"^(frontend|sourcegraph-frontend).*"}[5m])) + sum(increase(src_insights_aggregations_errors_total{job=~"^(frontend|sourcegraph-frontend).*"}[5m]))) * 100
frontend: insights_aggregations_total
Search aggregations operations every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=104010 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (op,extended_mode)(increase(src_insights_aggregations_total{job=~"^(frontend|sourcegraph-frontend).*"}[5m]))
frontend: insights_aggregations_99th_percentile_duration
99th percentile successful search aggregations operation duration over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=104011 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.99, sum by (le,op,extended_mode)(rate(src_insights_aggregations_duration_seconds_bucket{job=~"^(frontend|sourcegraph-frontend).*"}[5m])))
frontend: insights_aggregations_errors_total
Search aggregations operation errors every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=104012 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (op,extended_mode)(increase(src_insights_aggregations_errors_total{job=~"^(frontend|sourcegraph-frontend).*"}[5m]))
frontend: insights_aggregations_error_rate
Search aggregations operation error rate over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/frontend/frontend?viewPanel=104013 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (op,extended_mode)(increase(src_insights_aggregations_errors_total{job=~"^(frontend|sourcegraph-frontend).*"}[5m])) / (sum by (op,extended_mode)(increase(src_insights_aggregations_total{job=~"^(frontend|sourcegraph-frontend).*"}[5m])) + sum by (op,extended_mode)(increase(src_insights_aggregations_errors_total{job=~"^(frontend|sourcegraph-frontend).*"}[5m]))) * 100
Git Server
Stores, manages, and operates Git repositories.
To see this dashboard, visit /-/debug/grafana/d/gitserver/gitserver on your Sourcegraph instance.
gitserver: go_routines
Go routines
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=100000 on your Sourcegraph instance.
Technical details
Query:
SHELLgo_goroutines{app="gitserver", instance=~`${shard:regex}`}
gitserver: disk_space_remaining
Disk space remaining
Indicates disk space remaining for each gitserver instance. When disk space is low, gitserver may experience slowdowns or fail to fetch repositories.
Refer to the alerts reference for 2 alerts related to this panel.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=100001 on your Sourcegraph instance.
Technical details
Query:
SHELL(src_gitserver_disk_space_available{instance=~`${shard:regex}`} / src_gitserver_disk_space_total{instance=~`${shard:regex}`}) * 100
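The panel shows a percentage; for capacity planning it can also be useful to watch absolute free space. A minimal sketch using the same underlying gauge and shard selector:
SHELLsrc_gitserver_disk_space_available{instance=~`${shard:regex}`}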
gitserver: cpu_throttling_time
Container CPU throttling time %
- A high value indicates that the container is spending too much time waiting for CPU cycles.
Refer to the alerts reference for 2 alerts related to this panel.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=100010 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (container_label_io_kubernetes_pod_name) ((rate(container_cpu_cfs_throttled_periods_total{container_label_io_kubernetes_container_name="gitserver", container_label_io_kubernetes_pod_name=~`${shard:regex}`}[5m]) / rate(container_cpu_cfs_periods_total{container_label_io_kubernetes_container_name="gitserver", container_label_io_kubernetes_pod_name=~`${shard:regex}`}[5m])) * 100)
gitserver: cpu_usage_seconds
CPU usage seconds
- This value should not exceed 75% of the CPU limit over a longer period of time.
- We cannot alert on this as we don't know the resource allocation.
- If this value is high for a longer time, consider increasing the CPU limit for the container.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=100011 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (container_label_io_kubernetes_pod_name) (rate(container_cpu_usage_seconds_total{container_label_io_kubernetes_container_name="gitserver", container_label_io_kubernetes_pod_name=~`${shard:regex}`}[5m]))
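Because the dashboard cannot know your resource allocation, it does not compare this value against a limit. A hedged sketch for applying the 75% guideline yourself, assuming cAdvisor exposes container_spec_cpu_quota and container_spec_cpu_period for the gitserver container (these are absent when no CPU limit is set):
SHELLsum by (container_label_io_kubernetes_pod_name) (rate(container_cpu_usage_seconds_total{container_label_io_kubernetes_container_name="gitserver", container_label_io_kubernetes_pod_name=~`${shard:regex}`}[5m])) / max by (container_label_io_kubernetes_pod_name) (container_spec_cpu_quota{container_label_io_kubernetes_container_name="gitserver", container_label_io_kubernetes_pod_name=~`${shard:regex}`} / container_spec_cpu_period{container_label_io_kubernetes_container_name="gitserver", container_label_io_kubernetes_pod_name=~`${shard:regex}`}) * 100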
gitserver: memory_major_page_faults
Gitserver page faults
The number of major page faults in a 5 minute window for gitserver. If this number increases significantly, it indicates that more git API calls need to load data from disk. There may not be enough memory to efficiently support the number of API requests served concurrently.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=100020 on your Sourcegraph instance.
Technical details
Query:
SHELLrate(container_memory_failures_total{failure_type="pgmajfault", name=~"^gitserver.*"}[5m])
gitserver: high_memory_git_commands
Number of git commands that exceeded the threshold for high memory usage
This graph tracks the number of git subcommands that gitserver ran that exceeded the threshold for high memory usage. This graph in itself is not an alert, but it is used to learn about the memory usage of gitserver.
If gitserver frequently serves requests where the status code is KILLED, this graph might help to correlate that with the high memory usage.
A spike in this graph is not necessarily a problem. But when subcommands or the whole gitserver service are getting OOM killed and this graph shows spikes, increasing the memory might be useful.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=100021 on your Sourcegraph instance.
Technical details
Query:
SHELLsort_desc(sum(sum_over_time(src_gitserver_exec_high_memory_usage_count{instance=~`${shard:regex}`}[2m])) by (cmd))
gitserver: running_git_commands
Git commands running on each gitserver instance
A high value signals load.
Refer to the alerts reference for 2 alerts related to this panel.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=100030 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (instance, cmd) (src_gitserver_exec_running{instance=~`${shard:regex}`})
gitserver: git_commands_received
Rate of git commands received
Per-second rate per command
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=100031 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (cmd) (rate(src_gitserver_exec_duration_seconds_count{instance=~`${shard:regex}`}[5m]))
gitserver: echo_command_duration_test
Echo test command duration
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=100040 on your Sourcegraph instance.
Technical details
Query:
SHELLmax(src_gitserver_echo_duration_seconds)
gitserver: repo_corrupted
Number of times a repo corruption has been identified
A non-null value here indicates that a problem has been detected with the gitserver repository storage. Repository corruptions are never expected. This is a real issue. Gitserver should try to recover from them by recloning repositories, but this may take a while depending on repo size.
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=100041 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(rate(src_gitserver_repo_corrupted[5m]))
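When this alert fires, it usually helps to narrow the corruption down to a shard. A minimal per-instance variant of the same query, assuming the instance label is populated as it is on the other gitserver metrics:
SHELLsum by (instance) (rate(src_gitserver_repo_corrupted{instance=~`${shard:regex}`}[5m]))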
gitserver: repository_clone_queue_size
Repository clone queue size
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=100050 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(src_gitserver_clone_queue)
gitserver: src_gitserver_client_concurrent_requests
Number of concurrent requests running against gitserver client
This metric is only for informational purposes. It indicates the current number of requests running concurrently against gitserver gRPC, broken down by client process.
It does not indicate any problems with the instance, but can give a good indication of load spikes or request throttling.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=100051 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (job, instance) (src_gitserver_client_concurrent_requests)
Git Server: Gitservice for internal cloning
gitserver: gitservice_request_duration
95th percentile gitservice request duration per shard
A high value means any internal service trying to clone a repo from gitserver is slowed down.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=100100 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.95, sum(rate(src_gitserver_gitservice_duration_seconds_bucket{instance=~`${shard:regex}`}[5m])) by (le, gitservice))
gitserver: gitservice_request_rate
Gitservice request rate per shard
Per shard gitservice request rate
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=100101 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(rate(src_gitserver_gitservice_duration_seconds_count{instance=~`${shard:regex}`}[5m])) by (gitservice)
gitserver: gitservice_requests_running
Gitservice requests running per shard
Per shard gitservice requests running
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=100102 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(src_gitserver_gitservice_running{instance=~`${shard:regex}`}) by (gitservice)
Git Server: Gitserver cleanup jobs
gitserver: janitor_tasks_total
Total housekeeping tasks by type and status
The rate of housekeeping tasks performed in repositories, broken down by task type and success/failure status
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=100200 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(rate(src_gitserver_janitor_tasks_total{instance=~`${shard:regex}`}[5m])) by (housekeeping_task, status)
gitserver: p90_janitor_tasks_latency_success_over_5m
90th percentile latency of successful tasks by type over 5m
The 90th percentile latency of successful housekeeping tasks, broken down by task type
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=100210 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.90, sum(rate(src_gitserver_janitor_tasks_latency_bucket{instance=~`${shard:regex}`, status="success"}[5m])) by (le, housekeeping_task))
gitserver: p95_janitor_tasks_latency_success_over_5m
95th percentile latency of successful tasks by type over 5m
The 95th percentile latency of successful housekeeping tasks, broken down by task type
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=100211 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.95, sum(rate(src_gitserver_janitor_tasks_latency_bucket{instance=~`${shard:regex}`, status="success"}[5m])) by (le, housekeeping_task))
gitserver: p99_janitor_tasks_latency_success_over_5m
99th percentile latency of successful tasks by type over 5m
The 99th percentile latency of successful housekeeping tasks, broken down by task type
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=100212 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.99, sum(rate(src_gitserver_janitor_tasks_latency_bucket{instance=~`${shard:regex}`, status="success"}[5m])) by (le, housekeeping_task))
gitserver: p90_janitor_tasks_latency_failure_over_5m
90th percentile latency of failed tasks by type over 5m
The 90th percentile latency of failed housekeeping tasks, broken down by task type
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=100220 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.90, sum(rate(src_gitserver_janitor_tasks_latency_bucket{instance=~`${shard:regex}`, status="failure"}[5m])) by (le, housekeeping_task))
gitserver: p95_janitor_tasks_latency_failure_over_5m
95th percentile latency of failed tasks by type over 5m
The 95th percentile latency of failed housekeeping tasks, broken down by task type
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=100221 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.95, sum(rate(src_gitserver_janitor_tasks_latency_bucket{instance=~`${shard:regex}`, status="failure"}[5m])) by (le, housekeeping_task))
gitserver: p99_janitor_tasks_latency_failure_over_5m
99th percentile latency of failed tasks by type over 5m
The 99th percentile latency of failed housekeeping tasks, broken down by task type
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=100222 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.99, sum(rate(src_gitserver_janitor_tasks_latency_bucket{instance=~`${shard:regex}`, status="failure"}[5m])) by (le, housekeeping_task))
gitserver: pruned_files_total_over_5m
Files pruned by type over 5m
The rate of files pruned during cleanup, broken down by file type
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=100230 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(rate(src_gitserver_janitor_pruned_files_total{instance=~`${shard:regex}`}[5m])) by (filetype)
gitserver: data_structure_count_over_5m
Data structure counts over 5m
The count distribution of various Git data structures in repositories
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=100240 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.95, sum(rate(src_gitserver_janitor_data_structure_count_bucket{instance=~`${shard:regex}`}[5m])) by (le, data_structure))
gitserver: janitor_data_structure_size
Data structure sizes
The size distribution of various Git data structures in repositories
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=100250 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.95, sum(rate(src_gitserver_janitor_data_structure_size_bucket{instance=~`${shard:regex}`}[5m])) by (le, data_structure))
gitserver: janitor_time_since_optimization
Time since last optimization
The time elapsed since last optimization of various Git data structures
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=100260 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.95, sum(rate(src_gitserver_janitor_time_since_last_optimization_seconds_bucket{instance=~`${shard:regex}`}[5m])) by (le, data_structure))
gitserver: janitor_data_structure_existence
Data structure existence
The rate at which data structures are reported to exist in repositories
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=100270 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(rate(src_gitserver_janitor_data_structure_existence_total{instance=~`${shard:regex}`, exists="true"}[5m])) by (data_structure)
Git Server: Git Command Corruption Retries
gitserver: git_command_retry_attempts_rate
Rate of git command corruption retry attempts over 5m
The rate of git command retry attempts due to corruption detection. A non-zero value indicates that gitserver is detecting potential corruption and attempting retries. This metric helps track how often the retry mechanism is triggered.
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=100300 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(rate(src_gitserver_retry_attempts_total{instance=~`${shard:regex}`}[5m]))
gitserver: git_command_retry_success_rate
Rate of successful git command corruption retries over 5m
The rate of git commands that succeeded after retry attempts. This indicates how effective the retry mechanism is at resolving transient corruption issues.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=100301 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(rate(src_gitserver_retry_success_total{instance=~`${shard:regex}`}[5m]))
gitserver: git_command_retry_failure_rate
Rate of failed git command corruption retries over 5m
The rate of git commands that failed even after all retry attempts were exhausted. These failures will result in repository corruption marking and potential recloning.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=100310 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(rate(src_gitserver_retry_failure_total{instance=~`${shard:regex}`}[5m]))
gitserver: git_command_retry_different_error_rate
Rate of corruption retries that failed with non-corruption errors over 5m
The rate of retry attempts that failed with errors other than corruption. This indicates that repository state or environment changed between the original command and retry attempt. Common causes include network issues, permission changes, or concurrent repository modifications.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=100311 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(rate(src_gitserver_retry_different_error_total{instance=~`${shard:regex}`}[5m]))
gitserver: git_command_retry_success_ratio
Ratio of successful corruption retries to total corruption retry attempts over 5m
The percentage of retry attempts that ultimately succeeded. A high ratio indicates that most corruption errors are transient and resolved by retries. A low ratio may indicate persistent corruption issues requiring investigation.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=100312 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(rate(src_gitserver_retry_success_total{instance=~`${shard:regex}`}[5m])) / sum(rate(src_gitserver_retry_attempts_total{instance=~`${shard:regex}`}[5m]))
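A low ratio is often driven by a single shard. A minimal per-instance variant of the same ratio, reusing the metrics and shard selector from the query above:
SHELLsum by (instance) (rate(src_gitserver_retry_success_total{instance=~`${shard:regex}`}[5m])) / sum by (instance) (rate(src_gitserver_retry_attempts_total{instance=~`${shard:regex}`}[5m]))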
Git Server: Periodic Goroutines
gitserver: running_goroutines
Number of currently running periodic goroutines
The number of currently running periodic goroutines by name and job.
A value of 0 indicates the routine isn't currently running; it is awaiting its next scheduled run.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=100400 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (name, job_name) (src_periodic_goroutine_running{job=~".*gitserver.*"})
gitserver: goroutine_success_rate
Success rate for periodic goroutine executions
The rate of successful executions of each periodic goroutine. A low or zero value could indicate that a routine is stalled or encountering errors.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=100401 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (name, job_name) (rate(src_periodic_goroutine_total{job=~".*gitserver.*"}[5m]))
gitserver: goroutine_error_rate
Error rate for periodic goroutine executions
The rate of errors encountered by each periodic goroutine. A sustained high error rate may indicate a problem with the routine's configuration or dependencies.
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=100410 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (name, job_name) (rate(src_periodic_goroutine_errors_total{job=~".*gitserver.*"}[5m]))
gitserver: goroutine_error_percentage
Percentage of periodic goroutine executions that result in errors
The percentage of executions that result in errors for each periodic goroutine. A value above 5% indicates that a significant portion of routine executions are failing.
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=100411 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (name, job_name) (rate(src_periodic_goroutine_errors_total{job=~".*gitserver.*"}[5m])) / sum by (name, job_name) (rate(src_periodic_goroutine_total{job=~".*gitserver.*"}[5m]) > 0) * 100
gitserver: goroutine_handler_duration
95th percentile handler execution time
The 95th percentile execution time for each periodic goroutine handler. Longer durations might indicate increased load or processing time.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=100420 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.95, sum by (name, job_name, le) (rate(src_periodic_goroutine_duration_seconds_bucket{job=~".*gitserver.*"}[5m])))
gitserver: goroutine_loop_duration
95th percentile loop cycle time
The 95th percentile loop cycle time for each periodic goroutine (excluding sleep time). This represents how long a complete loop iteration takes before sleeping for the next interval.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=100421 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.95, sum by (name, job_name, le) (rate(src_periodic_goroutine_loop_duration_seconds_bucket{job=~".*gitserver.*"}[5m])))
gitserver: tenant_processing_duration
95th percentile tenant processing time
The 95th percentile processing time for individual tenants within periodic goroutines. Higher values indicate that tenant processing is taking longer and may affect overall performance.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=100430 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.95, sum by (name, job_name, le) (rate(src_periodic_goroutine_tenant_duration_seconds_bucket{job=~".*gitserver.*"}[5m])))
gitserver: tenant_processing_max
Maximum tenant processing time
The maximum processing time for individual tenants within periodic goroutines. Consistently high values might indicate problematic tenants or inefficient processing.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=100431 on your Sourcegraph instance.
Technical details
Query:
SHELLmax by (name, job_name) (rate(src_periodic_goroutine_tenant_duration_seconds_sum{job=~".*gitserver.*"}[5m]) / rate(src_periodic_goroutine_tenant_duration_seconds_count{job=~".*gitserver.*"}[5m]))
gitserver: tenant_count
Number of tenants processed per routine
The number of tenants processed by each periodic goroutine. Unexpected changes can indicate tenant configuration issues or scaling events.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=100440 on your Sourcegraph instance.
Technical details
Query:
SHELLmax by (name, job_name) (src_periodic_goroutine_tenant_count{job=~".*gitserver.*"})
gitserver: tenant_success_rate
Rate of successful tenant processing operations
The rate of successful tenant processing operations. A healthy routine should maintain a consistent processing rate.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=100441 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (name, job_name) (rate(src_periodic_goroutine_tenant_success_total{job=~".*gitserver.*"}[5m]))
gitserver: tenant_error_rate
Rate of tenant processing errors
The rate of tenant processing operations that result in errors. Consistent errors indicate problems with specific tenants.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=100450 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (name, job_name) (rate(src_periodic_goroutine_tenant_errors_total{job=~".*gitserver.*"}[5m]))
gitserver: tenant_error_percentage
Percentage of tenant operations resulting in errors
The percentage of tenant operations that result in errors. Values above 5% indicate significant tenant processing problems.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=100451 on your Sourcegraph instance.
Technical details
Query:
SHELL(sum by (name, job_name) (rate(src_periodic_goroutine_tenant_errors_total{job=~".*gitserver.*"}[5m])) / (sum by (name, job_name) (rate(src_periodic_goroutine_tenant_success_total{job=~".*gitserver.*"}[5m])) + sum by (name, job_name) (rate(src_periodic_goroutine_tenant_errors_total{job=~".*gitserver.*"}[5m])))) * 100
Git Server: Gitserver (CPU, Memory)
gitserver: cpu_usage_percentage
CPU usage
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=100500 on your Sourcegraph instance.
Technical details
Query:
SHELLcadvisor_container_cpu_usage_percentage_total{name=~"^gitserver.*"}
gitserver: memory_usage_percentage
Memory usage percentage (total)
An estimate for the active memory in use, which includes anonymous memory, file memory, and kernel memory. Some of this memory is reclaimable, so high usage does not necessarily indicate memory pressure.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=100501 on your Sourcegraph instance.
Technical details
Query:
SHELLcadvisor_container_memory_usage_percentage_total{name=~"^gitserver.*"}
gitserver: memory_working_set_bytes
Memory usage bytes (total)
An estimate for the active memory in use in bytes, which includes anonymous memory, file memory, and kernel memory. Some of this memory is reclaimable, so high usage does not necessarily indicate memory pressure.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=100502 on your Sourcegraph instance.
Technical details
Query:
SHELLmax by (name) (container_memory_working_set_bytes{name=~"^gitserver.*"})
gitserver: memory_rss
Memory (RSS)
The total anonymous memory in use by the application, which includes the Go stack and heap. This memory is non-reclaimable, and high usage may trigger OOM kills. Note: the metric is named RSS to match the cadvisor name, but anonymous memory is more accurate.
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=100510 on your Sourcegraph instance.
Technical details
Query:
SHELLmax(container_memory_rss{name=~"^gitserver.*"} / container_spec_memory_limit_bytes{name=~"^gitserver.*"}) by (name) * 100.0
gitserver: memory_total_active_file
Memory usage (active file)
This metric shows the total active file-backed memory currently in use by the application. Some of it may be reclaimable, so high usage does not necessarily indicate memory pressure.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=100511 on your Sourcegraph instance.
Technical details
Query:
SHELLmax(container_memory_total_active_file_bytes{name=~"^gitserver.*"} / container_spec_memory_limit_bytes{name=~"^gitserver.*"}) by (name) * 100.0
gitserver: memory_kernel_usage
Memory usage (kernel)
The kernel usage metric shows the amount of memory used by the kernel on behalf of the application. Some of it may be reclaimable, so high usage does not necessarily indicate memory pressure.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=100512 on your Sourcegraph instance.
Technical details
Query:
SHELLmax(container_memory_kernel_usage{name=~"^gitserver.*"} / container_spec_memory_limit_bytes{name=~"^gitserver.*"}) by (name) * 100.0
Git Server: Network I/O pod metrics (only available on Kubernetes)
gitserver: network_sent_bytes_aggregate
Transmission rate over 5m (aggregate)
The rate of bytes sent over the network across all pods
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=100600 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(rate(container_network_transmit_bytes_total{container_label_io_kubernetes_pod_name=~`.*gitserver.*`}[5m]))
gitserver: network_received_packets_per_instance
Transmission rate over 5m (per instance)
The rate of bytes sent over the network by individual pods
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=100601 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (container_label_io_kubernetes_pod_name) (rate(container_network_transmit_bytes_total{container_label_io_kubernetes_pod_name=~`${instance:regex}`}[5m]))
gitserver: network_received_bytes_aggregate
Receive rate over 5m (aggregate)
The rate of bytes received from the network across all pods
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=100610 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(rate(container_network_receive_bytes_total{container_label_io_kubernetes_pod_name=~`.*gitserver.*`}[5m]))
gitserver: network_received_bytes_per_instance
Receive rate over 5m (per instance)
The rate of bytes received from the network by individual pods
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=100611 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (container_label_io_kubernetes_pod_name) (rate(container_network_receive_bytes_total{container_label_io_kubernetes_pod_name=~`${instance:regex}`}[5m]))
gitserver: network_transmitted_packets_dropped_by_instance
Transmit packet drop rate over 5m (by instance)
An increase in dropped packets could be a leading indicator of network saturation.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=100620 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (container_label_io_kubernetes_pod_name) (rate(container_network_transmit_packets_dropped_total{container_label_io_kubernetes_pod_name=~`${instance:regex}`}[5m]))
gitserver: network_transmitted_packets_errors_per_instance
Errors encountered while transmitting over 5m (per instance)
An increase in transmission errors could indicate a networking issue
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=100621 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (container_label_io_kubernetes_pod_name) (rate(container_network_transmit_errors_total{container_label_io_kubernetes_pod_name=~`${instance:regex}`}[5m]))
gitserver: network_received_packets_dropped_by_instance
Receive packet drop rate over 5m (by instance)
An increase in dropped packets could be a leading indicator of network saturation.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=100622 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (container_label_io_kubernetes_pod_name) (rate(container_network_receive_packets_dropped_total{container_label_io_kubernetes_pod_name=~`${instance:regex}`}[5m]))
gitserver: network_transmitted_packets_errors_by_instance
Errors encountered while receiving over 5m (per instance)
An increase in errors while receiving could indicate a networking issue.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=100623 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (container_label_io_kubernetes_pod_name) (rate(container_network_receive_errors_total{container_label_io_kubernetes_pod_name=~`${instance:regex}`}[5m]))
Git Server: VCS Clone metrics
gitserver: vcs_syncer_999_successful_clone_duration
99.9th percentile successful Clone duration over 1m
The 99.9th percentile duration for successful Clone VCS operations. This is the time taken to clone a repository from the upstream source.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=100700 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.999, sum by (type, le) (rate(vcssyncer_clone_duration_seconds_bucket{type=~`${vcsSyncerType:regex}`, success="true"}[1m])))
gitserver: vcs_syncer_99_successful_clone_duration
99th percentile successful Clone duration over 1m
The 99th percentile duration for successful Clone VCS operations. This is the time taken to clone a repository from the upstream source.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=100701 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.99, sum by (type, le) (rate(vcssyncer_clone_duration_seconds_bucket{type=~`${vcsSyncerType:regex}`, success="true"}[1m])))
gitserver: vcs_syncer_95_successful_clone_duration
95th percentile successful Clone duration over 1m
The 95th percentile duration for successful Clone VCS operations. This is the time taken to clone a repository from the upstream source.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=100702 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.95, sum by (type, le) (rate(vcssyncer_clone_duration_seconds_bucket{type=~`${vcsSyncerType:regex}`, success="true"}[1m])))
gitserver: vcs_syncer_successful_clone_rate
Rate of successful Clone VCS operations over 1m
The rate of successful Clone VCS operations. Clone is the operation that clones a repository from the upstream source.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=100710 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (type) (rate(vcssyncer_clone_duration_seconds_count{type=~`${vcsSyncerType:regex}`, success="true"}[1m]))
gitserver: vcs_syncer_999_failed_clone_duration
99.9th percentile failed Clone duration over 1m
The 99.9th percentile duration for failed Clone VCS operations. This is the time taken to clone a repository from the upstream source.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=100720 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.999, sum by (type, le) (rate(vcssyncer_clone_duration_seconds_bucket{type=~`${vcsSyncerType:regex}`, success="false"}[1m])))
gitserver: vcs_syncer_99_failed_clone_duration
99th percentile failed Clone duration over 1m
The 99th percentile duration for failed Clone VCS operations. This is the time taken to clone a repository from the upstream source.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=100721 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.99, sum by (type, le) (rate(vcssyncer_clone_duration_seconds_bucket{type=~`${vcsSyncerType:regex}`, success="false"}[1m])))
gitserver: vcs_syncer_95_failed_clone_duration
95th percentile failed Clone duration over 1m
The 95th percentile duration for failed Clone VCS operations. This is the time taken to clone a repository from the upstream source.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=100722 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.95, sum by (type, le) (rate(vcssyncer_clone_duration_seconds_bucket{type=~`${vcsSyncerType:regex}`, success="false"}[1m])))
gitserver: vcs_syncer_failed_clone_rate
Rate of failed Clone VCS operations over 1m
The rate of failed Clone VCS operations. A Clone operation clones a repository from the upstream source.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=100730 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (type) (rate(vcssyncer_clone_duration_seconds_count{type=~`${vcsSyncerType:regex}`, success="false"}[1m]))
Git Server: VCS Fetch metrics
gitserver: vcs_syncer_999_successful_fetch_duration
99.9th percentile successful Fetch duration over 1m
The 99.9th percentile duration for successful Fetch VCS operations. This is the time taken to fetch a repository from the upstream source.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=100800 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.999, sum by (type, le) (rate(vcssyncer_fetch_duration_seconds_bucket{type=~`${vcsSyncerType:regex}`, success="true"}[1m])))
gitserver: vcs_syncer_99_successful_fetch_duration
99th percentile successful Fetch duration over 1m
The 99th percentile duration for successful Fetch VCS operations. This is the time taken to fetch a repository from the upstream source.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=100801 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.99, sum by (type, le) (rate(vcssyncer_fetch_duration_seconds_bucket{type=~`${vcsSyncerType:regex}`, success="true"}[1m])))
gitserver: vcs_syncer_95_successful_fetch_duration
95th percentile successful Fetch duration over 1m
The 95th percentile duration for successful Fetch VCS operations. This is the time taken to fetch a repository from the upstream source.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=100802 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.95, sum by (type, le) (rate(vcssyncer_fetch_duration_seconds_bucket{type=~`${vcsSyncerType:regex}`, success="true"}[1m])))
gitserver: vcs_syncer_successful_fetch_rate
Rate of successful Fetch VCS operations over 1m
The rate of successful Fetch VCS operations. A Fetch operation fetches a repository from the upstream source.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=100810 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (type) (rate(vcssyncer_fetch_duration_seconds_count{type=~`${vcsSyncerType:regex}`, success="true"}[1m]))
gitserver: vcs_syncer_999_failed_fetch_duration
99.9th percentile failed Fetch duration over 1m
The 99.9th percentile duration for failed Fetch VCS operations. This is the time taken to fetch a repository from the upstream source.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=100820 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.999, sum by (type, le) (rate(vcssyncer_fetch_duration_seconds_bucket{type=~`${vcsSyncerType:regex}`, success="false"}[1m])))
gitserver: vcs_syncer_99_failed_fetch_duration
99th percentile failed Fetch duration over 1m
The 99th percentile duration for failed Fetch VCS operations. This is the time taken to fetch a repository from the upstream source.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=100821 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.99, sum by (type, le) (rate(vcssyncer_fetch_duration_seconds_bucket{type=~`${vcsSyncerType:regex}`, success="false"}[1m])))
gitserver: vcs_syncer_95_failed_fetch_duration
95th percentile failed Fetch duration over 1m
The 95th percentile duration for failed Fetch VCS operations. This is the time taken to fetch a repository from the upstream source.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=100822 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.95, sum by (type, le) (rate(vcssyncer_fetch_duration_seconds_bucket{type=~`${vcsSyncerType:regex}`, success="false"}[1m])))
gitserver: vcs_syncer_failed_fetch_rate
Rate of failed Fetch VCS operations over 1m
The rate of failed Fetch VCS operations. A Fetch operation fetches a repository from the upstream source.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=100830 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (type) (rate(vcssyncer_fetch_duration_seconds_count{type=~`${vcsSyncerType:regex}`, success="false"}[1m]))
Git Server: VCS Is_cloneable metrics
gitserver: vcs_syncer_999_successful_is_cloneable_duration
99.9th percentile successful Is_cloneable duration over 1m
The 99.9th percentile duration for successful Is_cloneable VCS operations. This is the time taken to check to see if a repository is cloneable from the upstream source.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=100900 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.999, sum by (type, le) (rate(vcssyncer_is_cloneable_duration_seconds_bucket{type=~`${vcsSyncerType:regex}`, success="true"}[1m])))
gitserver: vcs_syncer_99_successful_is_cloneable_duration
99th percentile successful Is_cloneable duration over 1m
The 99th percentile duration for successful Is_cloneable VCS operations. This is the time taken to check to see if a repository is cloneable from the upstream source.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=100901 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.99, sum by (type, le) (rate(vcssyncer_is_cloneable_duration_seconds_bucket{type=~`${vcsSyncerType:regex}`, success="true"}[1m])))
gitserver: vcs_syncer_95_successful_is_cloneable_duration
95th percentile successful Is_cloneable duration over 1m
The 95th percentile duration for successful Is_cloneable VCS operations. This is the time taken to check to see if a repository is cloneable from the upstream source.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=100902 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.95, sum by (type, le) (rate(vcssyncer_is_cloneable_duration_seconds_bucket{type=~`${vcsSyncerType:regex}`, success="true"}[1m])))
gitserver: vcs_syncer_successful_is_cloneable_rate
Rate of successful Is_cloneable VCS operations over 1m
The rate of successful Is_cloneable VCS operations. An Is_cloneable operation checks whether a repository is cloneable from the upstream source.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=100910 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (type) (rate(vcssyncer_is_cloneable_duration_seconds_count{type=~`${vcsSyncerType:regex}`, success="true"}[1m]))
gitserver: vcs_syncer_999_failed_is_cloneable_duration
99.9th percentile failed Is_cloneable duration over 1m
The 99.9th percentile duration for failed Is_cloneable VCS operations. This is the time taken to check to see if a repository is cloneable from the upstream source.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=100920 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.999, sum by (type, le) (rate(vcssyncer_is_cloneable_duration_seconds_bucket{type=~`${vcsSyncerType:regex}`, success="false"}[1m])))
gitserver: vcs_syncer_99_failed_is_cloneable_duration
99th percentile failed Is_cloneable duration over 1m
The 99th percentile duration for failed Is_cloneable VCS operations. This is the time taken to check to see if a repository is cloneable from the upstream source.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=100921 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.99, sum by (type, le) (rate(vcssyncer_is_cloneable_duration_seconds_bucket{type=~`${vcsSyncerType:regex}`, success="false"}[1m])))
gitserver: vcs_syncer_95_failed_is_cloneable_duration
95th percentile failed Is_cloneable duration over 1m
The 95th percentile duration for failed Is_cloneable VCS operations. This is the time taken to check to see if a repository is cloneable from the upstream source.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=100922 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.95, sum by (type, le) (rate(vcssyncer_is_cloneable_duration_seconds_bucket{type=~`${vcsSyncerType:regex}`, success="false"}[1m])))
gitserver: vcs_syncer_failed_is_cloneable_rate
Rate of failed Is_cloneable VCS operations over 1m
The rate of failed Is_cloneable VCS operations. An Is_cloneable operation checks whether a repository is cloneable from the upstream source.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=100930 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (type) (rate(vcssyncer_is_cloneable_duration_seconds_count{type=~`${vcsSyncerType:regex}`, success="false"}[1m]))
Git Server: Gitserver: Gitserver Backend
gitserver: concurrent_backend_operations
Number of concurrently running backend operations
The number of requests that are currently being handled by the gitserver backend layer, at the point in time of scraping.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=101000 on your Sourcegraph instance.
Technical details
Query:
SHELLsrc_gitserver_backend_concurrent_operations
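A point-in-time gauge like this is typically maintained by incrementing when an operation starts and decrementing when it finishes, so each scrape reports the in-flight count. A minimal Go sketch, assuming Prometheus client instrumentation (illustrative only, not Sourcegraph's actual code; the metric name is taken from the query above):
GO
package main

import "github.com/prometheus/client_golang/prometheus"

// Illustrative only: a gauge shaped like src_gitserver_backend_concurrent_operations.
// Each scrape observes however many operations are in flight at that instant.
var concurrentOps = prometheus.NewGauge(prometheus.GaugeOpts{
	Name: "src_gitserver_backend_concurrent_operations",
	Help: "Number of requests currently being handled by the gitserver backend layer.",
})

func init() { prometheus.MustRegister(concurrentOps) }

// withConcurrencyGauge wraps one backend operation, bumping the gauge for its duration.
func withConcurrencyGauge(do func()) {
	concurrentOps.Inc()
	defer concurrentOps.Dec()
	do()
}

func main() {}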
gitserver: gitserver_backend_total
Aggregate operations every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=101010 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_gitserver_backend_total{job=~"^gitserver.*"}[5m]))
gitserver: gitserver_backend_99th_percentile_duration
Aggregate successful operation duration distribution over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=101011 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (le)(rate(src_gitserver_backend_duration_seconds_bucket{job=~"^gitserver.*"}[5m]))
gitserver: gitserver_backend_errors_total
Aggregate operation errors every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=101012 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_gitserver_backend_errors_total{job=~"^gitserver.*"}[5m]))
gitserver: gitserver_backend_error_rate
Aggregate operation error rate over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=101013 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_gitserver_backend_errors_total{job=~"^gitserver.*"}[5m])) / (sum(increase(src_gitserver_backend_total{job=~"^gitserver.*"}[5m])) + sum(increase(src_gitserver_backend_errors_total{job=~"^gitserver.*"}[5m]))) * 100
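Note that the denominator adds the two counter increases together before dividing, so the result is expressed as a percentage of the combined count rather than a ratio of the two series. A small Go illustration of the arithmetic, using made-up numbers:
GO
package main

import "fmt"

func main() {
	// Made-up 5m counter increases, standing in for the two terms in the query above.
	errors := 3.0 // increase(src_gitserver_backend_errors_total[5m])
	total := 97.0 // increase(src_gitserver_backend_total[5m])

	// errors / (total + errors) * 100, as in the panel query.
	errorRate := errors / (total + errors) * 100
	fmt.Printf("error rate: %.1f%%\n", errorRate) // prints "error rate: 3.0%"
}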
gitserver: gitserver_backend_total
Operations every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=101020 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (op)(increase(src_gitserver_backend_total{job=~"^gitserver.*"}[5m]))
gitserver: gitserver_backend_99th_percentile_duration
99th percentile successful operation duration over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=101021 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.99, sum by (le,op)(rate(src_gitserver_backend_duration_seconds_bucket{job=~"^gitserver.*"}[5m])))
gitserver: gitserver_backend_errors_total
Operation errors every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=101022 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (op)(increase(src_gitserver_backend_errors_total{job=~"^gitserver.*"}[5m]))
gitserver: gitserver_backend_error_rate
Operation error rate over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=101023 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (op)(increase(src_gitserver_backend_errors_total{job=~"^gitserver.*"}[5m])) / (sum by (op)(increase(src_gitserver_backend_total{job=~"^gitserver.*"}[5m])) + sum by (op)(increase(src_gitserver_backend_errors_total{job=~"^gitserver.*"}[5m]))) * 100
Git Server: Gitserver: Gitserver Client
gitserver: gitserver_client_total
Aggregate client operations every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=101100 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_gitserver_client_total{job=~"^*.*"}[5m]))
gitserver: gitserver_client_99th_percentile_duration
Aggregate successful client operation duration distribution over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=101101 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (le)(rate(src_gitserver_client_duration_seconds_bucket{job=~"^*.*"}[5m]))
gitserver: gitserver_client_errors_total
Aggregate client operation errors every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=101102 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_gitserver_client_errors_total{job=~"^*.*"}[5m]))
gitserver: gitserver_client_error_rate
Aggregate client operation error rate over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=101103 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_gitserver_client_errors_total{job=~"^*.*"}[5m])) / (sum(increase(src_gitserver_client_total{job=~"^*.*"}[5m])) + sum(increase(src_gitserver_client_errors_total{job=~"^*.*"}[5m]))) * 100
gitserver: gitserver_client_total
Client operations every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=101110 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (op,scope)(increase(src_gitserver_client_total{job=~"^*.*"}[5m]))
gitserver: gitserver_client_99th_percentile_duration
99th percentile successful client operation duration over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=101111 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.99, sum by (le,op,scope)(rate(src_gitserver_client_duration_seconds_bucket{job=~"^*.*"}[5m])))
gitserver: gitserver_client_errors_total
Client operation errors every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=101112 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (op,scope)(increase(src_gitserver_client_errors_total{job=~"^*.*"}[5m]))
gitserver: gitserver_client_error_rate
Client operation error rate over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=101113 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (op,scope)(increase(src_gitserver_client_errors_total{job=~"^*.*"}[5m])) / (sum by (op,scope)(increase(src_gitserver_client_total{job=~"^*.*"}[5m])) + sum by (op,scope)(increase(src_gitserver_client_errors_total{job=~"^*.*"}[5m]))) * 100
Git Server: Gitserver: Gitserver Repository Service Client
gitserver: gitserver_repositoryservice_client_total
Aggregate client operations every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=101200 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_gitserver_repositoryservice_client_total{job=~"^*.*"}[5m]))
gitserver: gitserver_repositoryservice_client_99th_percentile_duration
Aggregate successful client operation duration distribution over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=101201 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (le)(rate(src_gitserver_repositoryservice_client_duration_seconds_bucket{job=~"^*.*"}[5m]))
gitserver: gitserver_repositoryservice_client_errors_total
Aggregate client operation errors every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=101202 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_gitserver_repositoryservice_client_errors_total{job=~"^*.*"}[5m]))
gitserver: gitserver_repositoryservice_client_error_rate
Aggregate client operation error rate over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=101203 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_gitserver_repositoryservice_client_errors_total{job=~"^*.*"}[5m])) / (sum(increase(src_gitserver_repositoryservice_client_total{job=~"^*.*"}[5m])) + sum(increase(src_gitserver_repositoryservice_client_errors_total{job=~"^*.*"}[5m]))) * 100
gitserver: gitserver_repositoryservice_client_total
Client operations every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=101210 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (op,scope)(increase(src_gitserver_repositoryservice_client_total{job=~"^*.*"}[5m]))
gitserver: gitserver_repositoryservice_client_99th_percentile_duration
99th percentile successful client operation duration over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=101211 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.99, sum by (le,op,scope)(rate(src_gitserver_repositoryservice_client_duration_seconds_bucket{job=~"^*.*"}[5m])))
gitserver: gitserver_repositoryservice_client_errors_total
Client operation errors every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=101212 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (op,scope)(increase(src_gitserver_repositoryservice_client_errors_total{job=~"^*.*"}[5m]))
gitserver: gitserver_repositoryservice_client_error_rate
Client operation error rate over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=101213 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (op,scope)(increase(src_gitserver_repositoryservice_client_errors_total{job=~"^*.*"}[5m])) / (sum by (op,scope)(increase(src_gitserver_repositoryservice_client_total{job=~"^*.*"}[5m])) + sum by (op,scope)(increase(src_gitserver_repositoryservice_client_errors_total{job=~"^*.*"}[5m]))) * 100
Git Server: Repos disk I/O metrics
gitserver: repos_disk_reads_sec
Read request rate over 1m (per instance)
The number of read requests that were issued to the device per second.
Note: Disk statistics are per device, not per service. In certain environments (such as common docker-compose setups), gitserver could be one of many services using this disk. These statistics are best interpreted as the load experienced by the device gitserver is using, not the load gitserver is solely responsible for causing.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=101300 on your Sourcegraph instance.
Technical details
Query:
SHELL(max by (instance) (gitserver_mount_point_info{mount_name="reposDir",instance=~`${shard:regex}`} * on (device, nodename) group_left() (max by (device, nodename) (rate(node_disk_reads_completed_total{instance=~`node-exporter.*`}[1m])))))
gitserver: repos_disk_writes_sec
Write request rate over 1m (per instance)
The number of write requests that were issued to the device per second.
Note: Disk statistics are per device, not per service. In certain environments (such as common docker-compose setups), gitserver could be one of many services using this disk. These statistics are best interpreted as the load experienced by the device gitserver is using, not the load gitserver is solely responsible for causing.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=101301 on your Sourcegraph instance.
Technical details
Query:
SHELL(max by (instance) (gitserver_mount_point_info{mount_name="reposDir",instance=~`${shard:regex}`} * on (device, nodename) group_left() (max by (device, nodename) (rate(node_disk_writes_completed_total{instance=~`node-exporter.*`}[1m])))))
gitserver: repos_disk_read_throughput
Read throughput over 1m (per instance)
The amount of data that was read from the device per second.
Note: Disk statistics are per device, not per service. In certain environments (such as common docker-compose setups), gitserver could be one of many services using this disk. These statistics are best interpreted as the load experienced by the device gitserver is using, not the load gitserver is solely responsible for causing.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=101310 on your Sourcegraph instance.
Technical details
Query:
SHELL(max by (instance) (gitserver_mount_point_info{mount_name="reposDir",instance=~`${shard:regex}`} * on (device, nodename) group_left() (max by (device, nodename) (rate(node_disk_read_bytes_total{instance=~`node-exporter.*`}[1m])))))
gitserver: repos_disk_write_throughput
Write throughput over 1m (per instance)
The amount of data that was written to the device per second.
Note: Disk statistics are per device, not per service. In certain environments (such as common docker-compose setups), gitserver could be one of many services using this disk. These statistics are best interpreted as the load experienced by the device gitserver is using, not the load gitserver is solely responsible for causing.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=101311 on your Sourcegraph instance.
Technical details
Query:
SHELL(max by (instance) (gitserver_mount_point_info{mount_name="reposDir",instance=~`${shard:regex}`} * on (device, nodename) group_left() (max by (device, nodename) (rate(node_disk_written_bytes_total{instance=~`node-exporter.*`}[1m])))))
gitserver: repos_disk_read_duration
Average read duration over 1m (per instance)
The average time for read requests issued to the device to be served. This includes the time spent by the requests in queue and the time spent servicing them.
Note: Disk statistics are per device, not per service. In certain environments (such as common docker-compose setups), gitserver could be one of many services using this disk. These statistics are best interpreted as the load experienced by the device gitserver is using, not the load gitserver is solely responsible for causing.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=101320 on your Sourcegraph instance.
Technical details
Query:
SHELL(((max by (instance) (gitserver_mount_point_info{mount_name="reposDir",instance=~`${shard:regex}`} * on (device, nodename) group_left() (max by (device, nodename) (rate(node_disk_read_time_seconds_total{instance=~`node-exporter.*`}[1m])))))) / ((max by (instance) (gitserver_mount_point_info{mount_name="reposDir",instance=~`${shard:regex}`} * on (device, nodename) group_left() (max by (device, nodename) (rate(node_disk_reads_completed_total{instance=~`node-exporter.*`}[1m])))))))
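The query divides the per-second rate of time spent servicing reads by the per-second rate of completed reads, which yields an average per-request duration in seconds. A small Go illustration of the arithmetic, using made-up numbers:
GO
package main

import "fmt"

func main() {
	// Made-up 1m rates, standing in for the two terms in the query above.
	readSecondsPerSec := 0.4 // rate(node_disk_read_time_seconds_total[1m])
	readsPerSec := 80.0      // rate(node_disk_reads_completed_total[1m])

	// Average read duration = seconds spent on reads per second / reads completed per second.
	avgSeconds := readSecondsPerSec / readsPerSec
	fmt.Printf("average read duration: %.1f ms\n", avgSeconds*1000) // prints "average read duration: 5.0 ms"
}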
gitserver: repos_disk_write_duration
Average write duration over 1m (per instance)
The average time for write requests issued to the device to be served. This includes the time spent by the requests in queue and the time spent servicing them.
Note: Disk statistics are per device, not per service. In certain environments (such as common docker-compose setups), gitserver could be one of many services using this disk. These statistics are best interpreted as the load experienced by the device gitserver is using, not the load gitserver is solely responsible for causing.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=101321 on your Sourcegraph instance.
Technical details
Query:
SHELL(((max by (instance) (gitserver_mount_point_info{mount_name="reposDir",instance=~`${shard:regex}`} * on (device, nodename) group_left() (max by (device, nodename) (rate(node_disk_write_time_seconds_total{instance=~`node-exporter.*`}[1m])))))) / ((max by (instance) (gitserver_mount_point_info{mount_name="reposDir",instance=~`${shard:regex}`} * on (device, nodename) group_left() (max by (device, nodename) (rate(node_disk_writes_completed_total{instance=~`node-exporter.*`}[1m])))))))
gitserver: repos_disk_read_request_size
Average read request size over 1m (per instance)
The average size of read requests that were issued to the device.
Note: Disk statistics are per device, not per service. In certain environments (such as common docker-compose setups), gitserver could be one of many services using this disk. These statistics are best interpreted as the load experienced by the device gitserver is using, not the load gitserver is solely responsible for causing.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=101330 on your Sourcegraph instance.
Technical details
Query:
SHELL(((max by (instance) (gitserver_mount_point_info{mount_name="reposDir",instance=~`${shard:regex}`} * on (device, nodename) group_left() (max by (device, nodename) (rate(node_disk_read_bytes_total{instance=~`node-exporter.*`}[1m])))))) / ((max by (instance) (gitserver_mount_point_info{mount_name="reposDir",instance=~`${shard:regex}`} * on (device, nodename) group_left() (max by (device, nodename) (rate(node_disk_reads_completed_total{instance=~`node-exporter.*`}[1m])))))))
gitserver: repos_disk_write_request_size
Average write request size over 1m (per instance)
The average size of write requests that were issued to the device.
Note: Disk statistics are per device, not per service. In certain environments (such as common docker-compose setups), gitserver could be one of many services using this disk. These statistics are best interpreted as the load experienced by the device gitserver is using, not the load gitserver is solely responsible for causing.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=101331 on your Sourcegraph instance.
Technical details
Query:
SHELL(((max by (instance) (gitserver_mount_point_info{mount_name="reposDir",instance=~`${shard:regex}`} * on (device, nodename) group_left() (max by (device, nodename) (rate(node_disk_written_bytes_total{instance=~`node-exporter.*`}[1m])))))) / ((max by (instance) (gitserver_mount_point_info{mount_name="reposDir",instance=~`${shard:regex}`} * on (device, nodename) group_left() (max by (device, nodename) (rate(node_disk_writes_completed_total{instance=~`node-exporter.*`}[1m])))))))
gitserver: repos_disk_reads_merged_sec
Merged read request rate over 1m (per instance)
The number of read requests merged per second that were queued to the device.
Note: Disk statistics are per device, not per service. In certain environments (such as common docker-compose setups), gitserver could be one of many services using this disk. These statistics are best interpreted as the load experienced by the device gitserver is using, not the load gitserver is solely responsible for causing.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=101340 on your Sourcegraph instance.
Technical details
Query:
SHELL(max by (instance) (gitserver_mount_point_info{mount_name="reposDir",instance=~`${shard:regex}`} * on (device, nodename) group_left() (max by (device, nodename) (rate(node_disk_reads_merged_total{instance=~`node-exporter.*`}[1m])))))
gitserver: repos_disk_writes_merged_sec
Merged write request rate over 1m (per instance)
The number of write requests merged per second that were queued to the device.
Note: Disk statistics are per device, not per service. In certain environments (such as common docker-compose setups), gitserver could be one of many services using this disk. These statistics are best interpreted as the load experienced by the device gitserver is using, not the load gitserver is solely responsible for causing.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=101341 on your Sourcegraph instance.
Technical details
Query:
SHELL(max by (instance) (gitserver_mount_point_info{mount_name="reposDir",instance=~`${shard:regex}`} * on (device, nodename) group_left() (max by (device, nodename) (rate(node_disk_writes_merged_total{instance=~`node-exporter.*`}[1m])))))
gitserver: repos_disk_average_queue_size
Average queue size over 1m (per instance)
The number of I/O operations that were being queued or being serviced. See https://blog.actorsfit.com/a?ID=00200-428fa2ac-e338-4540-848c-af9a3eb1ebd2 for background (avgqu-sz).
Note: Disk statistics are per device, not per service. In certain environments (such as common docker-compose setups), gitserver could be one of many services using this disk. These statistics are best interpreted as the load experienced by the device gitserver is using, not the load gitserver is solely responsible for causing.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=101350 on your Sourcegraph instance.
Technical details
Query:
SHELL(max by (instance) (gitserver_mount_point_info{mount_name="reposDir",instance=~`${shard:regex}`} * on (device, nodename) group_left() (max by (device, nodename) (rate(node_disk_io_time_weighted_seconds_total{instance=~`node-exporter.*`}[1m])))))
Git Server: Git Service GRPC server metrics
gitserver: git_service_grpc_request_rate_all_methods
Request rate across all methods over 2m
The number of gRPC requests received per second across all methods, aggregated across all instances.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=101400 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(rate(grpc_server_started_total{instance=~`${shard:regex}`,grpc_service=~"gitserver.v1.GitserverService"}[2m]))
gitserver: git_service_grpc_request_rate_per_method
Request rate per-method over 2m
The number of gRPC requests received per second broken out per method, aggregated across all instances.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=101401 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(rate(grpc_server_started_total{grpc_method=~`${git_service_method:regex}`,instance=~`${shard:regex}`,grpc_service=~"gitserver.v1.GitserverService"}[2m])) by (grpc_method)
gitserver: git_service_error_percentage_all_methods
Error percentage across all methods over 2m
The percentage of gRPC requests that fail across all methods, aggregated across all instances.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=101410 on your Sourcegraph instance.
Technical details
Query:
SHELL(100.0 * ( (sum(rate(grpc_server_handled_total{grpc_code!="OK",instance=~`${shard:regex}`,grpc_service=~"gitserver.v1.GitserverService"}[2m]))) / (sum(rate(grpc_server_handled_total{instance=~`${shard:regex}`,grpc_service=~"gitserver.v1.GitserverService"}[2m]))) ))
gitserver: git_service_grpc_error_percentage_per_method
Error percentage per-method over 2m
The percentage of gRPC requests that fail per method, aggregated across all instances.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=101411 on your Sourcegraph instance.
Technical details
Query:
SHELL(100.0 * ( (sum(rate(grpc_server_handled_total{grpc_method=~`${git_service_method:regex}`,grpc_code!="OK",instance=~`${shard:regex}`,grpc_service=~"gitserver.v1.GitserverService"}[2m])) by (grpc_method)) / (sum(rate(grpc_server_handled_total{grpc_method=~`${git_service_method:regex}`,instance=~`${shard:regex}`,grpc_service=~"gitserver.v1.GitserverService"}[2m])) by (grpc_method)) ))
gitserver: git_service_p99_response_time_per_method
99th percentile response time per method over 2m
The 99th percentile response time per method, aggregated across all instances.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=101420 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.99, sum by (le, name, grpc_method)(rate(grpc_server_handling_seconds_bucket{grpc_method=~`${git_service_method:regex}`,instance=~`${shard:regex}`,grpc_service=~"gitserver.v1.GitserverService"}[2m])))
gitserver: git_service_p90_response_time_per_method
90th percentile response time per method over 2m
The 90th percentile response time per method, aggregated across all instances.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=101421 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.90, sum by (le, name, grpc_method)(rate(grpc_server_handling_seconds_bucket{grpc_method=~`${git_service_method:regex}`,instance=~`${shard:regex}`,grpc_service=~"gitserver.v1.GitserverService"}[2m])))
gitserver: git_service_p75_response_time_per_method
75th percentile response time per method over 2m
The 75th percentile response time per method, aggregated across all instances.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=101422 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.75, sum by (le, name, grpc_method)(rate(grpc_server_handling_seconds_bucket{grpc_method=~`${git_service_method:regex}`,instance=~`${shard:regex}`,grpc_service=~"gitserver.v1.GitserverService"}[2m])))
gitserver: git_service_p99_9_response_size_per_method
99.9th percentile total response size per method over 2m
The 99.9th percentile total per-RPC response size per method, aggregated across all instances.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=101430 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.999, sum by (le, name, grpc_method)(rate(src_grpc_server_sent_bytes_per_rpc_bucket{grpc_method=~`${git_service_method:regex}`,instance=~`${shard:regex}`,grpc_service=~"gitserver.v1.GitserverService"}[2m])))
gitserver: git_service_p90_response_size_per_method
90th percentile total response size per method over 2m
The 90th percentile total per-RPC response size per method, aggregated across all instances.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=101431 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.90, sum by (le, name, grpc_method)(rate(src_grpc_server_sent_bytes_per_rpc_bucket{grpc_method=~`${git_service_method:regex}`,instance=~`${shard:regex}`,grpc_service=~"gitserver.v1.GitserverService"}[2m])))
gitserver: git_service_p75_response_size_per_method
75th percentile total response size per method over 2m
The 75th percentile total per-RPC response size per method, aggregated across all instances.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=101432 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.75, sum by (le, name, grpc_method)(rate(src_grpc_server_sent_bytes_per_rpc_bucket{grpc_method=~`${git_service_method:regex}`,instance=~`${shard:regex}`,grpc_service=~"gitserver.v1.GitserverService"}[2m])))
gitserver: git_service_p99_9_invididual_sent_message_size_per_method
99.9th percentile individual sent message size per method over 2m
The 99.9th percentile size of every individual protocol buffer message sent by the service per method, aggregated across all instances.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=101440 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.999, sum by (le, name, grpc_method)(rate(src_grpc_server_sent_individual_message_size_bytes_per_rpc_bucket{grpc_method=~`${git_service_method:regex}`,instance=~`${shard:regex}`,grpc_service=~"gitserver.v1.GitserverService"}[2m])))
gitserver: git_service_p90_invididual_sent_message_size_per_method
90th percentile individual sent message size per method over 2m
The 90th percentile size of every individual protocol buffer message sent by the service per method, aggregated across all instances.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=101441 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.90, sum by (le, name, grpc_method)(rate(src_grpc_server_sent_individual_message_size_bytes_per_rpc_bucket{grpc_method=~`${git_service_method:regex}`,instance=~`${shard:regex}`,grpc_service=~"gitserver.v1.GitserverService"}[2m])))
gitserver: git_service_p75_invididual_sent_message_size_per_method
75th percentile individual sent message size per method over 2m
The 75th percentile size of every individual protocol buffer message sent by the service per method, aggregated across all instances.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=101442 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.75, sum by (le, name, grpc_method)(rate(src_grpc_server_sent_individual_message_size_bytes_per_rpc_bucket{grpc_method=~`${git_service_method:regex}`,instance=~`${shard:regex}`,grpc_service=~"gitserver.v1.GitserverService"}[2m])))
gitserver: git_service_grpc_response_stream_message_count_per_method
Average streaming response message count per-method over 2m
The average number of response messages sent during a streaming RPC method, broken out per method, aggregated across all instances.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=101450 on your Sourcegraph instance.
Technical details
Query:
SHELL((sum(rate(grpc_server_msg_sent_total{grpc_type="server_stream",instance=~`${shard:regex}`,grpc_service=~"gitserver.v1.GitserverService"}[2m])) by (grpc_method))/(sum(rate(grpc_server_started_total{grpc_type="server_stream",instance=~`${shard:regex}`,grpc_service=~"gitserver.v1.GitserverService"}[2m])) by (grpc_method)))
gitserver: git_service_grpc_all_codes_per_method
Response codes rate per-method over 2m
The rate of all generated gRPC response codes per method, aggregated across all instances.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=101460 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(rate(grpc_server_handled_total{grpc_method=~`${git_service_method:regex}`,instance=~`${shard:regex}`,grpc_service=~"gitserver.v1.GitserverService"}[2m])) by (grpc_method, grpc_code)
Git Server: Git Service GRPC "internal error" metrics
gitserver: git_service_grpc_clients_error_percentage_all_methods
Client baseline error percentage across all methods over 2m
The percentage of gRPC requests that fail across all methods (regardless of whether or not there was an internal error), aggregated across all "git_service" clients.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=101500 on your Sourcegraph instance.
Technical details
Query:
SHELL(100.0 * ((((sum(rate(src_grpc_method_status{grpc_service=~"gitserver.v1.GitserverService",grpc_code!="OK"}[2m])))) / ((sum(rate(src_grpc_method_status{grpc_service=~"gitserver.v1.GitserverService"}[2m])))))))
gitserver: git_service_grpc_clients_error_percentage_per_method
Client baseline error percentage per-method over 2m
The percentage of gRPC requests that fail per method (regardless of whether or not there was an internal error), aggregated across all "git_service" clients.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=101501 on your Sourcegraph instance.
Technical details
Query:
SHELL(100.0 * ((((sum(rate(src_grpc_method_status{grpc_service=~"gitserver.v1.GitserverService",grpc_method=~"${git_service_method:regex}",grpc_code!="OK"}[2m])) by (grpc_method))) / ((sum(rate(src_grpc_method_status{grpc_service=~"gitserver.v1.GitserverService",grpc_method=~"${git_service_method:regex}"}[2m])) by (grpc_method))))))
gitserver: git_service_grpc_clients_all_codes_per_method
Client baseline response codes rate per-method over 2m
The rate of all generated gRPC response codes per method (regardless of whether or not there was an internal error), aggregated across all "git_service" clients.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=101502 on your Sourcegraph instance.
Technical details
Query:
SHELL(sum(rate(src_grpc_method_status{grpc_service=~"gitserver.v1.GitserverService",grpc_method=~"${git_service_method:regex}"}[2m])) by (grpc_method, grpc_code))
gitserver: git_service_grpc_clients_internal_error_percentage_all_methods
Client-observed gRPC internal error percentage across all methods over 2m
The percentage of gRPC requests that appear to fail due to gRPC internal errors across all methods, aggregated across all "git_service" clients.
Note: Internal errors are ones that appear to originate from the https://github.com/grpc/grpc-go library itself, rather than from any user-written application code. These errors can be caused by a variety of issues, and can originate from either the code-generated "git_service" gRPC client or gRPC server. These errors might be solvable by adjusting the gRPC configuration, or they might indicate a bug from Sourcegraph's use of gRPC.
When debugging, knowing that a particular error comes from the grpc-go library itself (an internal error) as opposed to normal application code can be helpful when trying to fix it.
Note: Internal errors are detected via a very coarse heuristic (seeing if the error starts with grpc:, etc.). Because of this, it's possible that some gRPC-specific issues might not be categorized as internal errors.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=101510 on your Sourcegraph instance.
Technical details
Query:
SHELL(100.0 * ((((sum(rate(src_grpc_method_status{grpc_service=~"gitserver.v1.GitserverService",grpc_code!="OK",is_internal_error="true"}[2m])))) / ((sum(rate(src_grpc_method_status{grpc_service=~"gitserver.v1.GitserverService"}[2m])))))))
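As a rough illustration of the kind of coarse check described in the note above, the Go sketch below classifies an error as a probable gRPC internal error based on its status message prefix. The function name and the exact prefixes checked are hypothetical; this is not Sourcegraph's actual implementation.
GO
package main

import (
	"fmt"
	"strings"

	"google.golang.org/grpc/codes"
	"google.golang.org/grpc/status"
)

// isProbableGRPCInternalError is a deliberately coarse heuristic, in the spirit of the
// note above: treat an error as "internal" to grpc-go if its status message looks like
// it was produced by the library itself (e.g. it starts with "grpc:") rather than by
// application code. Hypothetical sketch only.
func isProbableGRPCInternalError(err error) bool {
	if err == nil {
		return false
	}
	s, ok := status.FromError(err)
	if !ok || s.Code() == codes.OK {
		return false
	}
	msg := s.Message()
	return strings.HasPrefix(msg, "grpc:") || strings.HasPrefix(msg, "transport:")
}

func main() {
	err := status.Error(codes.Unavailable, "grpc: the connection is unavailable")
	fmt.Println(isProbableGRPCInternalError(err)) // true
}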
gitserver: git_service_grpc_clients_internal_error_percentage_per_method
Client-observed gRPC internal error percentage per-method over 2m
The percentage of gRPC requests that appear to fail due to gRPC internal errors per method, aggregated across all "git_service" clients.
Note: Internal errors are ones that appear to originate from the https://github.com/grpc/grpc-go library itself, rather than from any user-written application code. These errors can be caused by a variety of issues, and can originate from either the code-generated "git_service" gRPC client or gRPC server. These errors might be solvable by adjusting the gRPC configuration, or they might indicate a bug from Sourcegraph's use of gRPC.
When debugging, knowing that a particular error comes from the grpc-go library itself (an internal error) as opposed to normal application code can be helpful when trying to fix it.
Note: Internal errors are detected via a very coarse heuristic (seeing if the error starts with grpc:, etc.). Because of this, it's possible that some gRPC-specific issues might not be categorized as internal errors.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=101511 on your Sourcegraph instance.
Technical details
Query:
SHELL(100.0 * ((((sum(rate(src_grpc_method_status{grpc_service=~"gitserver.v1.GitserverService",grpc_method=~"${git_service_method:regex}",grpc_code!="OK",is_internal_error="true"}[2m])) by (grpc_method))) / ((sum(rate(src_grpc_method_status{grpc_service=~"gitserver.v1.GitserverService",grpc_method=~"${git_service_method:regex}"}[2m])) by (grpc_method))))))
gitserver: git_service_grpc_clients_internal_error_all_codes_per_method
Client-observed gRPC internal error response code rate per-method over 2m
The rate of gRPC internal-error response codes per method, aggregated across all "git_service" clients.
Note: Internal errors are ones that appear to originate from the https://github.com/grpc/grpc-go library itself, rather than from any user-written application code. These errors can be caused by a variety of issues, and can originate from either the code-generated "git_service" gRPC client or gRPC server. These errors might be solvable by adjusting the gRPC configuration, or they might indicate a bug from Sourcegraph's use of gRPC.
When debugging, knowing that a particular error comes from the grpc-go library itself (an internal error) as opposed to normal application code can be helpful when trying to fix it.
Note: Internal errors are detected via a very coarse heuristic (seeing if the error starts with grpc:, etc.). Because of this, it's possible that some gRPC-specific issues might not be categorized as internal errors.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=101512 on your Sourcegraph instance.
Technical details
Query:
SHELL(sum(rate(src_grpc_method_status{grpc_service=~"gitserver.v1.GitserverService",is_internal_error="true",grpc_method=~"${git_service_method:regex}"}[2m])) by (grpc_method, grpc_code))
Git Server: Git Service GRPC retry metrics
gitserver: git_service_grpc_clients_retry_percentage_across_all_methods
Client retry percentage across all methods over 2m
The percentage of gRPC requests that were retried across all methods, aggregated across all "git_service" clients.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=101600 on your Sourcegraph instance.
Technical details
Query:
SHELL(100.0 * ((((sum(rate(src_grpc_client_retry_attempts_total{grpc_service=~"gitserver.v1.GitserverService",is_retried="true"}[2m])))) / ((sum(rate(src_grpc_client_retry_attempts_total{grpc_service=~"gitserver.v1.GitserverService"}[2m])))))))
gitserver: git_service_grpc_clients_retry_percentage_per_method
Client retry percentage per-method over 2m
The percentage of gRPC requests that were retried, aggregated across all "git_service" clients, broken out per method.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=101601 on your Sourcegraph instance.
Technical details
Query:
SHELL(100.0 * ((((sum(rate(src_grpc_client_retry_attempts_total{grpc_service=~"gitserver.v1.GitserverService",is_retried="true",grpc_method=~"${git_service_method:regex}"}[2m])) by (grpc_method))) / ((sum(rate(src_grpc_client_retry_attempts_total{grpc_service=~"gitserver.v1.GitserverService",grpc_method=~"${git_service_method:regex}"}[2m])) by (grpc_method))))))
gitserver: git_service_grpc_clients_retry_count_per_method
Client retry count per-method over 2m
The count of gRPC requests that were retried, aggregated across all "git_service" clients, broken out per method.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=101602 on your Sourcegraph instance.
Technical details
Query:
SHELL(sum(rate(src_grpc_client_retry_attempts_total{grpc_service=~"gitserver.v1.GitserverService",grpc_method=~"${git_service_method:regex}",is_retried="true"}[2m])) by (grpc_method))
Git Server: Repository Service GRPC server metrics
gitserver: repository_service_grpc_request_rate_all_methods
Request rate across all methods over 2m
The number of gRPC requests received per second across all methods, aggregated across all instances.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=101700 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(rate(grpc_server_started_total{instance=~`${shard:regex}`,grpc_service=~"gitserver.v1.GitserverRepositoryService"}[2m]))
gitserver: repository_service_grpc_request_rate_per_method
Request rate per-method over 2m
The number of gRPC requests received per second broken out per method, aggregated across all instances.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=101701 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(rate(grpc_server_started_total{grpc_method=~`${repository_service_method:regex}`,instance=~`${shard:regex}`,grpc_service=~"gitserver.v1.GitserverRepositoryService"}[2m])) by (grpc_method)
gitserver: repository_service_error_percentage_all_methods
Error percentage across all methods over 2m
The percentage of gRPC requests that fail across all methods, aggregated across all instances.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=101710 on your Sourcegraph instance.
Technical details
Query:
SHELL(100.0 * ( (sum(rate(grpc_server_handled_total{grpc_code!="OK",instance=~`${shard:regex}`,grpc_service=~"gitserver.v1.GitserverRepositoryService"}[2m]))) / (sum(rate(grpc_server_handled_total{instance=~`${shard:regex}`,grpc_service=~"gitserver.v1.GitserverRepositoryService"}[2m]))) ))
gitserver: repository_service_grpc_error_percentage_per_method
Error percentage per-method over 2m
The percentage of gRPC requests that fail per method, aggregated across all instances.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=101711 on your Sourcegraph instance.
Technical details
Query:
SHELL(100.0 * ( (sum(rate(grpc_server_handled_total{grpc_method=~`${repository_service_method:regex}`,grpc_code!="OK",instance=~`${shard:regex}`,grpc_service=~"gitserver.v1.GitserverRepositoryService"}[2m])) by (grpc_method)) / (sum(rate(grpc_server_handled_total{grpc_method=~`${repository_service_method:regex}`,instance=~`${shard:regex}`,grpc_service=~"gitserver.v1.GitserverRepositoryService"}[2m])) by (grpc_method)) ))
gitserver: repository_service_p99_response_time_per_method
99th percentile response time per method over 2m
The 99th percentile response time per method, aggregated across all instances.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=101720 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.99, sum by (le, name, grpc_method)(rate(grpc_server_handling_seconds_bucket{grpc_method=~`${repository_service_method:regex}`,instance=~`${shard:regex}`,grpc_service=~"gitserver.v1.GitserverRepositoryService"}[2m])))
gitserver: repository_service_p90_response_time_per_method
90th percentile response time per method over 2m
The 90th percentile response time per method, aggregated across all instances.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=101721 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.90, sum by (le, name, grpc_method)(rate(grpc_server_handling_seconds_bucket{grpc_method=~`${repository_service_method:regex}`,instance=~`${shard:regex}`,grpc_service=~"gitserver.v1.GitserverRepositoryService"}[2m])))
gitserver: repository_service_p75_response_time_per_method
75th percentile response time per method over 2m
The 75th percentile response time per method, aggregated across all instances.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=101722 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.75, sum by (le, name, grpc_method)(rate(grpc_server_handling_seconds_bucket{grpc_method=~`${repository_service_method:regex}`,instance=~`${shard:regex}`,grpc_service=~"gitserver.v1.GitserverRepositoryService"}[2m])))
gitserver: repository_service_p99_9_response_size_per_method
99.9th percentile total response size per method over 2m
The 99.9th percentile total per-RPC response size per method, aggregated across all instances.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=101730 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.999, sum by (le, name, grpc_method)(rate(src_grpc_server_sent_bytes_per_rpc_bucket{grpc_method=~`${repository_service_method:regex}`,instance=~`${shard:regex}`,grpc_service=~"gitserver.v1.GitserverRepositoryService"}[2m])))
gitserver: repository_service_p90_response_size_per_method
90th percentile total response size per method over 2m
The 90th percentile total per-RPC response size per method, aggregated across all instances.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=101731 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.90, sum by (le, name, grpc_method)(rate(src_grpc_server_sent_bytes_per_rpc_bucket{grpc_method=~`${repository_service_method:regex}`,instance=~`${shard:regex}`,grpc_service=~"gitserver.v1.GitserverRepositoryService"}[2m])))
gitserver: repository_service_p75_response_size_per_method
75th percentile total response size per method over 2m
The 75th percentile total per-RPC response size per method, aggregated across all instances.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=101732 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.75, sum by (le, name, grpc_method)(rate(src_grpc_server_sent_bytes_per_rpc_bucket{grpc_method=~`${repository_service_method:regex}`,instance=~`${shard:regex}`,grpc_service=~"gitserver.v1.GitserverRepositoryService"}[2m])))
gitserver: repository_service_p99_9_invididual_sent_message_size_per_method
99.9th percentile individual sent message size per method over 2m
The 99.9th percentile size of every individual protocol buffer message sent by the service per method, aggregated across all instances.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=101740 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.999, sum by (le, name, grpc_method)(rate(src_grpc_server_sent_individual_message_size_bytes_per_rpc_bucket{grpc_method=~`${repository_service_method:regex}`,instance=~`${shard:regex}`,grpc_service=~"gitserver.v1.GitserverRepositoryService"}[2m])))
gitserver: repository_service_p90_invididual_sent_message_size_per_method
90th percentile individual sent message size per method over 2m
The 90th percentile size of each individual protocol buffer message sent by the service per method, aggregated across all instances.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=101741 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.90, sum by (le, name, grpc_method)(rate(src_grpc_server_sent_individual_message_size_bytes_per_rpc_bucket{grpc_method=~`${repository_service_method:regex}`,instance=~`${shard:regex}`,grpc_service=~"gitserver.v1.GitserverRepositoryService"}[2m])))
gitserver: repository_service_p75_invididual_sent_message_size_per_method
75th percentile individual sent message size per method over 2m
The 75th percentile size of each individual protocol buffer message sent by the service per method, aggregated across all instances.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=101742 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.75, sum by (le, name, grpc_method)(rate(src_grpc_server_sent_individual_message_size_bytes_per_rpc_bucket{grpc_method=~`${repository_service_method:regex}`,instance=~`${shard:regex}`,grpc_service=~"gitserver.v1.GitserverRepositoryService"}[2m])))
gitserver: repository_service_grpc_response_stream_message_count_per_method
Average streaming response message count per-method over 2m
The average number of response messages sent during a streaming RPC method, broken out per method, aggregated across all instances.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=101750 on your Sourcegraph instance.
Technical details
Query:
SHELL((sum(rate(grpc_server_msg_sent_total{grpc_type="server_stream",instance=~`${shard:regex}`,grpc_service=~"gitserver.v1.GitserverRepositoryService"}[2m])) by (grpc_method))/(sum(rate(grpc_server_started_total{grpc_type="server_stream",instance=~`${shard:regex}`,grpc_service=~"gitserver.v1.GitserverRepositoryService"}[2m])) by (grpc_method)))
gitserver: repository_service_grpc_all_codes_per_method
Response codes rate per-method over 2m
The rate of all generated gRPC response codes per method, aggregated across all instances.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=101760 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(rate(grpc_server_handled_total{grpc_method=~`${repository_service_method:regex}`,instance=~`${shard:regex}`,grpc_service=~"gitserver.v1.GitserverRepositoryService"}[2m])) by (grpc_method, grpc_code)
Git Server: Repository Service GRPC "internal error" metrics
gitserver: repository_service_grpc_clients_error_percentage_all_methods
Client baseline error percentage across all methods over 2m
The percentage of gRPC requests that fail across all methods (regardless of whether or not there was an internal error), aggregated across all "repository_service" clients.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=101800 on your Sourcegraph instance.
Technical details
Query:
SHELL(100.0 * ((((sum(rate(src_grpc_method_status{grpc_service=~"gitserver.v1.GitserverRepositoryService",grpc_code!="OK"}[2m])))) / ((sum(rate(src_grpc_method_status{grpc_service=~"gitserver.v1.GitserverRepositoryService"}[2m])))))))
gitserver: repository_service_grpc_clients_error_percentage_per_method
Client baseline error percentage per-method over 2m
The percentage of gRPC requests that fail per method (regardless of whether or not there was an internal error), aggregated across all "repository_service" clients.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=101801 on your Sourcegraph instance.
Technical details
Query:
SHELL(100.0 * ((((sum(rate(src_grpc_method_status{grpc_service=~"gitserver.v1.GitserverRepositoryService",grpc_method=~"${repository_service_method:regex}",grpc_code!="OK"}[2m])) by (grpc_method))) / ((sum(rate(src_grpc_method_status{grpc_service=~"gitserver.v1.GitserverRepositoryService",grpc_method=~"${repository_service_method:regex}"}[2m])) by (grpc_method))))))
gitserver: repository_service_grpc_clients_all_codes_per_method
Client baseline response codes rate per-method over 2m
The rate of all generated gRPC response codes per method (regardless of whether or not there was an internal error), aggregated across all "repository_service" clients.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=101802 on your Sourcegraph instance.
Technical details
Query:
SHELL(sum(rate(src_grpc_method_status{grpc_service=~"gitserver.v1.GitserverRepositoryService",grpc_method=~"${repository_service_method:regex}"}[2m])) by (grpc_method, grpc_code))
gitserver: repository_service_grpc_clients_internal_error_percentage_all_methods
Client-observed gRPC internal error percentage across all methods over 2m
The percentage of gRPC requests that appear to fail due to gRPC internal errors across all methods, aggregated across all "repository_service" clients.
Note: Internal errors are ones that appear to originate from the https://github.com/grpc/grpc-go library itself, rather than from any user-written application code. These errors can be caused by a variety of issues, and can originate from either the code-generated "repository_service" gRPC client or gRPC server. These errors might be solvable by adjusting the gRPC configuration, or they might indicate a bug from Sourcegraph's use of gRPC.
When debugging, knowing that a particular error comes from the grpc-go library itself (an internal error) as opposed to normal application code can be helpful when trying to fix it.
Note: Internal errors are detected via a very coarse heuristic (seeing if the error starts with grpc:, etc.). Because of this, it's possible that some gRPC-specific issues might not be categorized as internal errors. A minimal sketch of this style of check appears at the end of this subsection.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=101810 on your Sourcegraph instance.
Technical details
Query:
SHELL(100.0 * ((((sum(rate(src_grpc_method_status{grpc_service=~"gitserver.v1.GitserverRepositoryService",grpc_code!="OK",is_internal_error="true"}[2m])))) / ((sum(rate(src_grpc_method_status{grpc_service=~"gitserver.v1.GitserverRepositoryService"}[2m])))))))
gitserver: repository_service_grpc_clients_internal_error_percentage_per_method
Client-observed gRPC internal error percentage per-method over 2m
The percentage of gRPC requests that appear to fail due to gRPC internal errors per method, aggregated across all "repository_service" clients.
Note: Internal errors are ones that appear to originate from the https://github.com/grpc/grpc-go library itself, rather than from any user-written application code. These errors can be caused by a variety of issues, and can originate from either the code-generated "repository_service" gRPC client or gRPC server. These errors might be solvable by adjusting the gRPC configuration, or they might indicate a bug from Sourcegraph's use of gRPC.
When debugging, knowing that a particular error comes from the grpc-go library itself (an internal error) as opposed to normal application code can be helpful when trying to fix it.
Note: Internal errors are detected via a very coarse heuristic (seeing if the error starts with grpc:, etc.). Because of this, it's possible that some gRPC-specific issues might not be categorized as internal errors.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=101811 on your Sourcegraph instance.
Technical details
Query:
SHELL(100.0 * ((((sum(rate(src_grpc_method_status{grpc_service=~"gitserver.v1.GitserverRepositoryService",grpc_method=~"${repository_service_method:regex}",grpc_code!="OK",is_internal_error="true"}[2m])) by (grpc_method))) / ((sum(rate(src_grpc_method_status{grpc_service=~"gitserver.v1.GitserverRepositoryService",grpc_method=~"${repository_service_method:regex}"}[2m])) by (grpc_method))))))
gitserver: repository_service_grpc_clients_internal_error_all_codes_per_method
Client-observed gRPC internal error response code rate per-method over 2m
The rate of gRPC internal-error response codes per method, aggregated across all "repository_service" clients.
Note: Internal errors are ones that appear to originate from the https://github.com/grpc/grpc-go library itself, rather than from any user-written application code. These errors can be caused by a variety of issues, and can originate from either the code-generated "repository_service" gRPC client or gRPC server. These errors might be solvable by adjusting the gRPC configuration, or they might indicate a bug from Sourcegraph's use of gRPC.
When debugging, knowing that a particular error comes from the grpc-go library itself (an internal error) as opposed to normal application code can be helpful when trying to fix it.
Note: Internal errors are detected via a very coarse heuristic (seeing if the error starts with grpc:, etc.). Because of this, it's possible that some gRPC-specific issues might not be categorized as internal errors.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=101812 on your Sourcegraph instance.
Technical details
Query:
SHELL(sum(rate(src_grpc_method_status{grpc_service=~"gitserver.v1.GitserverRepositoryService",is_internal_error="true",grpc_method=~"${repository_service_method:regex}"}[2m])) by (grpc_method, grpc_code))
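The notes above describe internal-error detection as a coarse prefix check on the error string. As an illustration only (not Sourcegraph's actual implementation; the helper name and prefix list are assumptions), a check of that style might look like:
GO
package example

import "strings"

// probablyInternalGRPCError is an illustrative, coarse heuristic in the style
// described above: it treats an error as "internal" to grpc-go if its message
// starts with a known grpc-go prefix. The prefix list here is an assumption
// for demonstration, not the exact set Sourcegraph uses.
func probablyInternalGRPCError(err error) bool {
	if err == nil {
		return false
	}
	msg := err.Error()
	for _, prefix := range []string{"grpc: ", "transport: "} {
		if strings.HasPrefix(msg, prefix) {
			return true
		}
	}
	return false
}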
Git Server: Repository Service GRPC retry metrics
gitserver: repository_service_grpc_clients_retry_percentage_across_all_methods
Client retry percentage across all methods over 2m
The percentage of gRPC requests that were retried across all methods, aggregated across all "repository_service" clients.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=101900 on your Sourcegraph instance.
Technical details
Query:
SHELL(100.0 * ((((sum(rate(src_grpc_client_retry_attempts_total{grpc_service=~"gitserver.v1.GitserverRepositoryService",is_retried="true"}[2m])))) / ((sum(rate(src_grpc_client_retry_attempts_total{grpc_service=~"gitserver.v1.GitserverRepositoryService"}[2m])))))))
gitserver: repository_service_grpc_clients_retry_percentage_per_method
Client retry percentage per-method over 2m
The percentage of gRPC requests that were retried, aggregated across all "repository_service" clients, broken out per method.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=101901 on your Sourcegraph instance.
Technical details
Query:
SHELL(100.0 * ((((sum(rate(src_grpc_client_retry_attempts_total{grpc_service=~"gitserver.v1.GitserverRepositoryService",is_retried="true",grpc_method=~"${repository_service_method:regex}"}[2m])) by (grpc_method))) / ((sum(rate(src_grpc_client_retry_attempts_total{grpc_service=~"gitserver.v1.GitserverRepositoryService",grpc_method=~"${repository_service_method:regex}"}[2m])) by (grpc_method))))))
gitserver: repository_service_grpc_clients_retry_count_per_method
Client retry count per-method over 2m
The count of gRPC requests that were retried, aggregated across all "repository_service" clients, broken out per method.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=101902 on your Sourcegraph instance.
Technical details
Query:
SHELL(sum(rate(src_grpc_client_retry_attempts_total{grpc_service=~"gitserver.v1.GitserverRepositoryService",grpc_method=~"${repository_service_method:regex}",is_retried="true"}[2m])) by (grpc_method))
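For context on what the is_retried label distinguishes: a call counts as retried if at least one additional attempt was made after the first. The sketch below is purely illustrative, a hand-rolled retry loop in a client interceptor with an example metric name; it is not Sourcegraph's retry mechanism.
GO
package example

import (
	"context"
	"strconv"

	"github.com/prometheus/client_golang/prometheus"
	"google.golang.org/grpc"
	"google.golang.org/grpc/codes"
	"google.golang.org/grpc/status"
)

// retryCountingInterceptor is a hypothetical unary client interceptor that
// retries Unavailable errors up to maxAttempts times and records, per method,
// whether the call needed any retry at all.
func retryCountingInterceptor(reg prometheus.Registerer, maxAttempts int) grpc.UnaryClientInterceptor {
	attempts := prometheus.NewCounterVec(prometheus.CounterOpts{
		Name: "example_grpc_client_retry_attempts_total", // illustrative name, not the real src_ metric
		Help: "gRPC client calls, labelled by whether they were retried.",
	}, []string{"grpc_method", "is_retried"})
	reg.MustRegister(attempts)

	return func(ctx context.Context, method string, req, reply interface{},
		cc *grpc.ClientConn, invoker grpc.UnaryInvoker, opts ...grpc.CallOption) error {
		var err error
		retried := false
		for attempt := 0; attempt < maxAttempts; attempt++ {
			err = invoker(ctx, method, req, reply, cc, opts...)
			if status.Code(err) != codes.Unavailable {
				break // success, or an error we do not retry
			}
			retried = true
		}
		attempts.WithLabelValues(method, strconv.FormatBool(retried)).Inc()
		return err
	}
}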
Git Server: Site configuration client update latency
gitserver: gitserver_site_configuration_duration_since_last_successful_update_by_instance
Duration since last successful site configuration update (by instance)
The duration since the configuration client used by the "gitserver" service last successfully updated its site configuration. Long durations could indicate issues updating the site configuration.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=102000 on your Sourcegraph instance.
Technical details
Query:
SHELLsrc_conf_client_time_since_last_successful_update_seconds{job=~`.*gitserver`,instance=~`${shard:regex}`}
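As an illustration of how this kind of freshness metric can be produced, a client can export a gauge of seconds elapsed since its last successful configuration fetch. The sketch below is hypothetical (the metric and function names are assumptions, not Sourcegraph's code) and only shows the general pattern:
GO
package example

import (
	"sync/atomic"
	"time"

	"github.com/prometheus/client_golang/prometheus"
)

// lastSuccessfulUpdate holds the Unix time (in nanoseconds) of the most recent
// successful configuration fetch; the polling loop would update it.
var lastSuccessfulUpdate atomic.Int64

// registerFreshnessGauge exports a gauge reporting how many seconds have
// passed since the last successful update, mirroring the shape of
// src_conf_client_time_since_last_successful_update_seconds.
func registerFreshnessGauge(reg prometheus.Registerer) {
	reg.MustRegister(prometheus.NewGaugeFunc(prometheus.GaugeOpts{
		Name: "example_time_since_last_successful_update_seconds", // illustrative name
		Help: "Seconds since the configuration client last updated successfully.",
	}, func() float64 {
		// Before the first successful fetch this reports a very large value,
		// which is acceptable for a sketch.
		last := time.Unix(0, lastSuccessfulUpdate.Load())
		return time.Since(last).Seconds()
	}))
}

// markUpdateSuccess would be called after each successful fetch.
func markUpdateSuccess() { lastSuccessfulUpdate.Store(time.Now().UnixNano()) }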
gitserver: gitserver_site_configuration_duration_since_last_successful_update_by_instance
Maximum duration since last successful site configuration update (all "gitserver" instances)
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=102001 on your Sourcegraph instance.
Technical details
Query:
SHELLmax(max_over_time(src_conf_client_time_since_last_successful_update_seconds{job=~`.*gitserver`,instance=~`${shard:regex}`}[1m]))
Git Server: HTTP handlers
gitserver: healthy_request_rate
Requests per second, by route, when status code is 200
The number of healthy HTTP requests per second to the internal HTTP API.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=102100 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (route) (rate(src_http_request_duration_seconds_count{app="gitserver",code=~"2.."}[5m]))
gitserver: unhealthy_request_rate
Requests per second, by route, when status code is not 200
The number of unhealthy HTTP requests per second to the internal HTTP API.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=102101 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (route) (rate(src_http_request_duration_seconds_count{app="gitserver",code!~"2.."}[5m]))
gitserver: request_rate_by_code
Requests per second, by status code
The number of HTTP requests per second by code
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=102102 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (code) (rate(src_http_request_duration_seconds_count{app="gitserver"}[5m]))
gitserver: 95th_percentile_healthy_requests
95th percentile duration by route, when status code is 200
The 95th percentile duration by route when the status code is 200
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=102110 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.95, sum(rate(src_http_request_duration_seconds_bucket{app="gitserver",code=~"2.."}[5m])) by (le, route))
gitserver: 95th_percentile_unhealthy_requests
95th percentile duration by route, when status code is not 200
The 95th percentile duration by route when the status code is not 200
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=102111 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.95, sum(rate(src_http_request_duration_seconds_bucket{app="gitserver",code!~"2.."}[5m])) by (le, route))
Git Server: Database connections
gitserver: max_open_conns
Maximum open
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=102200 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (app_name, db_name) (src_pgsql_conns_max_open{app_name="gitserver"})
gitserver: open_conns
Established
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=102201 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (app_name, db_name) (src_pgsql_conns_open{app_name="gitserver"})
gitserver: in_use
Used
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=102210 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (app_name, db_name) (src_pgsql_conns_in_use{app_name="gitserver"})
gitserver: idle
Idle
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=102211 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (app_name, db_name) (src_pgsql_conns_idle{app_name="gitserver"})
gitserver: mean_blocked_seconds_per_conn_request
Mean blocked seconds per conn request
Refer to the alerts reference for 2 alerts related to this panel.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=102220 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (app_name, db_name) (increase(src_pgsql_conns_blocked_seconds{app_name="gitserver"}[5m])) / sum by (app_name, db_name) (increase(src_pgsql_conns_waited_for{app_name="gitserver"}[5m]))
gitserver: closed_max_idle
Closed by SetMaxIdleConns
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=102230 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (app_name, db_name) (increase(src_pgsql_conns_closed_max_idle{app_name="gitserver"}[5m]))
gitserver: closed_max_lifetime
Closed by SetConnMaxLifetime
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=102231 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (app_name, db_name) (increase(src_pgsql_conns_closed_max_lifetime{app_name="gitserver"}[5m]))
gitserver: closed_max_idle_time
Closed by SetConnMaxIdleTime
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=102232 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (app_name, db_name) (increase(src_pgsql_conns_closed_max_idle_time{app_name="gitserver"}[5m]))
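The "Closed by …" panels above correspond to Go's database/sql connection-pool settings: the counters increase when the pool closes connections because of the limits set via these calls. A minimal, generic sketch of configuring those limits follows; the values and driver choice are illustrative, not Sourcegraph's defaults.
GO
package example

import (
	"database/sql"
	"time"

	_ "github.com/lib/pq" // any Postgres driver would do; illustrative choice
)

// openPool opens a Postgres connection pool and sets the limits whose effects
// the panels above track: connections closed because of SetMaxIdleConns,
// SetConnMaxLifetime, and SetConnMaxIdleTime, plus the "maximum open" ceiling.
func openPool(dsn string) (*sql.DB, error) {
	db, err := sql.Open("postgres", dsn)
	if err != nil {
		return nil, err
	}
	db.SetMaxOpenConns(30)                  // upper bound on open connections (illustrative value)
	db.SetMaxIdleConns(10)                  // idle connections beyond this are closed
	db.SetConnMaxLifetime(time.Hour)        // connections older than this are closed
	db.SetConnMaxIdleTime(10 * time.Minute) // connections idle longer than this are closed
	return db, nil
}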
Git Server: Container monitoring (not available on server)
gitserver: container_missing
Container missing
This value is the number of times a container has not been seen for more than one minute. If you observe this value change independently of deployment events (such as an upgrade), it could indicate pods are being OOM killed or terminated for some other reason.
- Kubernetes:
  - Determine if the pod was OOM killed using kubectl describe pod gitserver (look for OOMKilled: true) and, if so, consider increasing the memory limit in the relevant Deployment.yaml.
  - Check the logs before the container restarted to see if there are panic: messages or similar using kubectl logs -p gitserver.
- Docker Compose:
  - Determine if the container was OOM killed using docker inspect -f '{{json .State}}' gitserver (look for "OOMKilled":true) and, if so, consider increasing the memory limit of the gitserver container in docker-compose.yml.
  - Check the logs before the container restarted to see if there are panic: messages or similar using docker logs gitserver (note this will include logs from the previous and currently running container).
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=102300 on your Sourcegraph instance.
Technical details
Query:
SHELLcount by(name) ((time() - container_last_seen{name=~"^gitserver.*"}) > 60)
gitserver: container_cpu_usage
Container cpu usage total (1m average) across all cores by instance
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=102301 on your Sourcegraph instance.
Technical details
Query:
SHELLcadvisor_container_cpu_usage_percentage_total{name=~"^gitserver.*"}
gitserver: container_memory_usage
Container memory usage by instance
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=102302 on your Sourcegraph instance.
Technical details
Query:
SHELLcadvisor_container_memory_usage_percentage_total{name=~"^gitserver.*"}
gitserver: fs_io_operations
Filesystem reads and writes rate by instance over 1h
This value indicates the number of filesystem read and write operations by containers of this service. When extremely high, this can indicate a resource usage problem, or can cause problems with the service itself, especially if high values or spikes correlate with gitserver issues.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=102303 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by(name) (rate(container_fs_reads_total{name=~"^gitserver.*"}[1h]) + rate(container_fs_writes_total{name=~"^gitserver.*"}[1h]))
Git Server: Provisioning indicators (not available on server)
gitserver: provisioning_container_cpu_usage_long_term
Container cpu usage total (90th percentile over 1d) across all cores by instance
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=102400 on your Sourcegraph instance.
Technical details
Query:
SHELLquantile_over_time(0.9, cadvisor_container_cpu_usage_percentage_total{name=~"^gitserver.*"}[1d])
gitserver: provisioning_container_memory_usage_long_term
Container memory usage (1d maximum) by instance
Git Server is expected to use up all the memory it is provided.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=102401 on your Sourcegraph instance.
Technical details
Query:
SHELLmax_over_time(cadvisor_container_memory_usage_percentage_total{name=~"^gitserver.*"}[1d])
gitserver: provisioning_container_cpu_usage_short_term
Container cpu usage total (5m maximum) across all cores by instance
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=102410 on your Sourcegraph instance.
Technical details
Query:
SHELLmax_over_time(cadvisor_container_cpu_usage_percentage_total{name=~"^gitserver.*"}[5m])
gitserver: provisioning_container_memory_usage_short_term
Container memory usage (5m maximum) by instance
Git Server is expected to use up all the memory it is provided.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=102411 on your Sourcegraph instance.
Technical details
Query:
SHELLmax_over_time(cadvisor_container_memory_usage_percentage_total{name=~"^gitserver.*"}[5m])
gitserver: container_oomkill_events_total
Container OOMKILL events total by instance
This value indicates the total number of times the container's main process or child processes were terminated by the OOM killer. When this occurs frequently, it is an indicator of underprovisioning.
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=102412 on your Sourcegraph instance.
Technical details
Query:
SHELLmax by (name) (container_oom_events_total{name=~"^gitserver.*"})
Git Server: Golang runtime monitoring
gitserver: go_goroutines
Maximum active goroutines
A high value here indicates a possible goroutine leak.
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=102500 on your Sourcegraph instance.
Technical details
Query:
SHELLmax by(instance) (go_goroutines{job=~".*gitserver"})
gitserver: go_gc_duration_seconds
Maximum go garbage collection duration
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=102501 on your Sourcegraph instance.
Technical details
Query:
SHELLmax by(instance) (go_gc_duration_seconds{job=~".*gitserver"})
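Both go_goroutines and go_gc_duration_seconds are standard metrics produced by the Prometheus Go runtime collector. The sketch below shows, in generic form, how a Go service exposes them; it is the common client_golang pattern, not Sourcegraph's specific wiring.
GO
package main

import (
	"log"
	"net/http"

	"github.com/prometheus/client_golang/prometheus"
	"github.com/prometheus/client_golang/prometheus/collectors"
	"github.com/prometheus/client_golang/prometheus/promhttp"
)

func main() {
	// A registry with the Go runtime collector attached exposes
	// go_goroutines, go_gc_duration_seconds, and related runtime metrics.
	reg := prometheus.NewRegistry()
	reg.MustRegister(collectors.NewGoCollector())

	http.Handle("/metrics", promhttp.HandlerFor(reg, promhttp.HandlerOpts{}))
	log.Fatal(http.ListenAndServe(":2112", nil))
}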
Git Server: Kubernetes monitoring (only available on Kubernetes)
gitserver: pods_available_percentage
Percentage pods available
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/gitserver/gitserver?viewPanel=102600 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by(app) (up{app=~".*gitserver"}) / count by (app) (up{app=~".*gitserver"}) * 100
Postgres
Postgres metrics, exported from postgres_exporter (not available on server).
To see this dashboard, visit /-/debug/grafana/d/postgres/postgres on your Sourcegraph instance.
postgres: connections
Active connections
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/postgres/postgres?viewPanel=100000 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (job) (pg_stat_activity_count{datname!~"template.*|postgres|cloudsqladmin"}) OR sum by (job) (pg_stat_activity_count{job="codeinsights-db", datname!~"template.*|cloudsqladmin"})
postgres: usage_connections_percentage
Connection in use
Refer to the alerts reference for 2 alerts related to this panel.
To see this panel, visit /-/debug/grafana/d/postgres/postgres?viewPanel=100001 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(pg_stat_activity_count) by (job) / (sum(pg_settings_max_connections) by (job) - sum(pg_settings_superuser_reserved_connections) by (job)) * 100
postgres: transaction_durations
Maximum transaction durations
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/postgres/postgres?viewPanel=100002 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (job) (pg_stat_activity_max_tx_duration{datname!~"template.*|postgres|cloudsqladmin",job!="codeintel-db"}) OR sum by (job) (pg_stat_activity_max_tx_duration{job="codeinsights-db", datname!~"template.*|cloudsqladmin"})
Postgres: Database and collector status
postgres: postgres_up
Database availability
A non-zero value indicates the database is online.
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/postgres/postgres?viewPanel=100100 on your Sourcegraph instance.
Technical details
Query:
SHELLpg_up
postgres: invalid_indexes
Invalid indexes (unusable by the query planner)
A non-zero value indicates that Postgres failed to build an index. Expect degraded performance until the index is manually rebuilt.
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/postgres/postgres?viewPanel=100101 on your Sourcegraph instance.
Technical details
Query:
SHELLmax by (relname)(pg_invalid_index_count)
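Invalid indexes can also be listed directly from the pg_index catalog (indisvalid = false), which is essentially what this check boils down to. A sketch using database/sql follows; the catalog query is standard Postgres SQL, but the helper itself is illustrative.
GO
package example

import (
	"context"
	"database/sql"
)

// invalidIndexes returns the names of indexes Postgres has marked as invalid,
// for example after a failed CREATE INDEX CONCURRENTLY. These are the indexes
// the panel above counts; they typically need to be dropped and rebuilt.
func invalidIndexes(ctx context.Context, db *sql.DB) ([]string, error) {
	rows, err := db.QueryContext(ctx, `
		SELECT c.relname
		FROM pg_index i
		JOIN pg_class c ON c.oid = i.indexrelid
		WHERE NOT i.indisvalid`)
	if err != nil {
		return nil, err
	}
	defer rows.Close()

	var names []string
	for rows.Next() {
		var name string
		if err := rows.Scan(&name); err != nil {
			return nil, err
		}
		names = append(names, name)
	}
	return names, rows.Err()
}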
postgres: pg_exporter_err
Errors scraping postgres exporter
This value indicates issues retrieving metrics from postgres_exporter.
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/postgres/postgres?viewPanel=100110 on your Sourcegraph instance.
Technical details
Query:
SHELLpg_exporter_last_scrape_error
postgres: migration_in_progress
Active schema migration
A 0 value indicates that no migration is in progress.
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/postgres/postgres?viewPanel=100111 on your Sourcegraph instance.
Technical details
Query:
SHELLpg_sg_migration_status
Postgres: Object size and bloat
postgres: pg_table_size
Table size
Total size of this table
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/postgres/postgres?viewPanel=100200 on your Sourcegraph instance.
Technical details
Query:
SHELLmax by (relname)(pg_table_bloat_size)
postgres: pg_table_bloat_ratio
Table bloat ratio
Estimated bloat ratio of this table (high bloat = high overhead)
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/postgres/postgres?viewPanel=100201 on your Sourcegraph instance.
Technical details
Query:
SHELLmax by (relname)(pg_table_bloat_ratio) * 100
postgres: pg_index_size
Index size
Total size of this index
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/postgres/postgres?viewPanel=100210 on your Sourcegraph instance.
Technical details
Query:
SHELLmax by (relname)(pg_index_bloat_size)
postgres: pg_index_bloat_ratio
Index bloat ratio
Estimated bloat ratio of this index (high bloat = high overhead)
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/postgres/postgres?viewPanel=100211 on your Sourcegraph instance.
Technical details
Query:
SHELLmax by (relname)(pg_index_bloat_ratio) * 100
Postgres: Provisioning indicators (not available on server)
postgres: provisioning_container_cpu_usage_long_term
Container cpu usage total (90th percentile over 1d) across all cores by instance
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/postgres/postgres?viewPanel=100300 on your Sourcegraph instance.
Technical details
Query:
SHELLquantile_over_time(0.9, cadvisor_container_cpu_usage_percentage_total{name=~"^(pgsql|codeintel-db|codeinsights).*"}[1d])
postgres: provisioning_container_memory_usage_long_term
Container memory usage (1d maximum) by instance
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/postgres/postgres?viewPanel=100301 on your Sourcegraph instance.
Technical details
Query:
SHELLmax_over_time(cadvisor_container_memory_usage_percentage_total{name=~"^(pgsql|codeintel-db|codeinsights).*"}[1d])
postgres: provisioning_container_cpu_usage_short_term
Container cpu usage total (5m maximum) across all cores by instance
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/postgres/postgres?viewPanel=100310 on your Sourcegraph instance.
Technical details
Query:
SHELLmax_over_time(cadvisor_container_cpu_usage_percentage_total{name=~"^(pgsql|codeintel-db|codeinsights).*"}[5m])
postgres: provisioning_container_memory_usage_short_term
Container memory usage (5m maximum) by instance
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/postgres/postgres?viewPanel=100311 on your Sourcegraph instance.
Technical details
Query:
SHELLmax_over_time(cadvisor_container_memory_usage_percentage_total{name=~"^(pgsql|codeintel-db|codeinsights).*"}[5m])
postgres: container_oomkill_events_total
Container OOMKILL events total by instance
This value indicates the total number of times the container's main process or child processes were terminated by the OOM killer. When this occurs frequently, it is an indicator of underprovisioning.
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/postgres/postgres?viewPanel=100312 on your Sourcegraph instance.
Technical details
Query:
SHELLmax by (name) (container_oom_events_total{name=~"^(pgsql|codeintel-db|codeinsights).*"})
Postgres: Kubernetes monitoring (only available on Kubernetes)
postgres: pods_available_percentage
Percentage pods available
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/postgres/postgres?viewPanel=100400 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by(app) (up{app=~".*(pgsql|codeintel-db|codeinsights)"}) / count by (app) (up{app=~".*(pgsql|codeintel-db|codeinsights)"}) * 100
Precise Code Intel Worker
Handles conversion of uploaded precise code intelligence bundles.
To see this dashboard, visit /-/debug/grafana/d/precise-code-intel-worker/precise-code-intel-worker on your Sourcegraph instance.
Precise Code Intel Worker: Codeintel: LSIF uploads
precise-code-intel-worker: codeintel_upload_handlers
Handler active handlers
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/precise-code-intel-worker/precise-code-intel-worker?viewPanel=100000 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(src_codeintel_upload_processor_handlers{job=~"^precise-code-intel-worker.*"})
precise-code-intel-worker: codeintel_upload_processor_upload_size
Sum of upload sizes in bytes being processed by each precise code-intel worker instance
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/precise-code-intel-worker/precise-code-intel-worker?viewPanel=100001 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by(instance) (src_codeintel_upload_processor_upload_size{job="precise-code-intel-worker"})
precise-code-intel-worker: codeintel_upload_processor_total
Handler operations every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/precise-code-intel-worker/precise-code-intel-worker?viewPanel=100010 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_codeintel_upload_processor_total{job=~"^precise-code-intel-worker.*"}[5m]))
precise-code-intel-worker: codeintel_upload_processor_99th_percentile_duration
Aggregate successful handler operation duration distribution over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/precise-code-intel-worker/precise-code-intel-worker?viewPanel=100011 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (le)(rate(src_codeintel_upload_processor_duration_seconds_bucket{job=~"^precise-code-intel-worker.*"}[5m]))
precise-code-intel-worker: codeintel_upload_processor_errors_total
Handler operation errors every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/precise-code-intel-worker/precise-code-intel-worker?viewPanel=100012 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_codeintel_upload_processor_errors_total{job=~"^precise-code-intel-worker.*"}[5m]))
precise-code-intel-worker: codeintel_upload_processor_error_rate
Handler operation error rate over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/precise-code-intel-worker/precise-code-intel-worker?viewPanel=100013 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_codeintel_upload_processor_errors_total{job=~"^precise-code-intel-worker.*"}[5m])) / (sum(increase(src_codeintel_upload_processor_total{job=~"^precise-code-intel-worker.*"}[5m])) + sum(increase(src_codeintel_upload_processor_errors_total{job=~"^precise-code-intel-worker.*"}[5m]))) * 100
Precise Code Intel Worker: Codeintel: dbstore stats
precise-code-intel-worker: codeintel_uploads_store_total
Aggregate store operations every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/precise-code-intel-worker/precise-code-intel-worker?viewPanel=100100 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_codeintel_uploads_store_total{job=~"^precise-code-intel-worker.*"}[5m]))
precise-code-intel-worker: codeintel_uploads_store_99th_percentile_duration
Aggregate successful store operation duration distribution over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/precise-code-intel-worker/precise-code-intel-worker?viewPanel=100101 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (le)(rate(src_codeintel_uploads_store_duration_seconds_bucket{job=~"^precise-code-intel-worker.*"}[5m]))
precise-code-intel-worker: codeintel_uploads_store_errors_total
Aggregate store operation errors every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/precise-code-intel-worker/precise-code-intel-worker?viewPanel=100102 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_codeintel_uploads_store_errors_total{job=~"^precise-code-intel-worker.*"}[5m]))
precise-code-intel-worker: codeintel_uploads_store_error_rate
Aggregate store operation error rate over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/precise-code-intel-worker/precise-code-intel-worker?viewPanel=100103 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_codeintel_uploads_store_errors_total{job=~"^precise-code-intel-worker.*"}[5m])) / (sum(increase(src_codeintel_uploads_store_total{job=~"^precise-code-intel-worker.*"}[5m])) + sum(increase(src_codeintel_uploads_store_errors_total{job=~"^precise-code-intel-worker.*"}[5m]))) * 100
precise-code-intel-worker: codeintel_uploads_store_total
Store operations every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/precise-code-intel-worker/precise-code-intel-worker?viewPanel=100110 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (op)(increase(src_codeintel_uploads_store_total{job=~"^precise-code-intel-worker.*"}[5m]))
precise-code-intel-worker: codeintel_uploads_store_99th_percentile_duration
99th percentile successful store operation duration over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/precise-code-intel-worker/precise-code-intel-worker?viewPanel=100111 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.99, sum by (le,op)(rate(src_codeintel_uploads_store_duration_seconds_bucket{job=~"^precise-code-intel-worker.*"}[5m])))
precise-code-intel-worker: codeintel_uploads_store_errors_total
Store operation errors every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/precise-code-intel-worker/precise-code-intel-worker?viewPanel=100112 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (op)(increase(src_codeintel_uploads_store_errors_total{job=~"^precise-code-intel-worker.*"}[5m]))
precise-code-intel-worker: codeintel_uploads_store_error_rate
Store operation error rate over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/precise-code-intel-worker/precise-code-intel-worker?viewPanel=100113 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (op)(increase(src_codeintel_uploads_store_errors_total{job=~"^precise-code-intel-worker.*"}[5m])) / (sum by (op)(increase(src_codeintel_uploads_store_total{job=~"^precise-code-intel-worker.*"}[5m])) + sum by (op)(increase(src_codeintel_uploads_store_errors_total{job=~"^precise-code-intel-worker.*"}[5m]))) * 100
Precise Code Intel Worker: Codeintel: lsifstore stats
precise-code-intel-worker: codeintel_uploads_lsifstore_total
Aggregate store operations every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/precise-code-intel-worker/precise-code-intel-worker?viewPanel=100200 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_codeintel_uploads_lsifstore_total{job=~"^precise-code-intel-worker.*"}[5m]))
precise-code-intel-worker: codeintel_uploads_lsifstore_99th_percentile_duration
Aggregate successful store operation duration distribution over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/precise-code-intel-worker/precise-code-intel-worker?viewPanel=100201 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (le)(rate(src_codeintel_uploads_lsifstore_duration_seconds_bucket{job=~"^precise-code-intel-worker.*"}[5m]))
precise-code-intel-worker: codeintel_uploads_lsifstore_errors_total
Aggregate store operation errors every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/precise-code-intel-worker/precise-code-intel-worker?viewPanel=100202 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_codeintel_uploads_lsifstore_errors_total{job=~"^precise-code-intel-worker.*"}[5m]))
precise-code-intel-worker: codeintel_uploads_lsifstore_error_rate
Aggregate store operation error rate over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/precise-code-intel-worker/precise-code-intel-worker?viewPanel=100203 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_codeintel_uploads_lsifstore_errors_total{job=~"^precise-code-intel-worker.*"}[5m])) / (sum(increase(src_codeintel_uploads_lsifstore_total{job=~"^precise-code-intel-worker.*"}[5m])) + sum(increase(src_codeintel_uploads_lsifstore_errors_total{job=~"^precise-code-intel-worker.*"}[5m]))) * 100
precise-code-intel-worker: codeintel_uploads_lsifstore_total
Store operations every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/precise-code-intel-worker/precise-code-intel-worker?viewPanel=100210 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (op)(increase(src_codeintel_uploads_lsifstore_total{job=~"^precise-code-intel-worker.*"}[5m]))
precise-code-intel-worker: codeintel_uploads_lsifstore_99th_percentile_duration
99th percentile successful store operation duration over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/precise-code-intel-worker/precise-code-intel-worker?viewPanel=100211 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.99, sum by (le,op)(rate(src_codeintel_uploads_lsifstore_duration_seconds_bucket{job=~"^precise-code-intel-worker.*"}[5m])))
precise-code-intel-worker: codeintel_uploads_lsifstore_errors_total
Store operation errors every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/precise-code-intel-worker/precise-code-intel-worker?viewPanel=100212 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (op)(increase(src_codeintel_uploads_lsifstore_errors_total{job=~"^precise-code-intel-worker.*"}[5m]))
precise-code-intel-worker: codeintel_uploads_lsifstore_error_rate
Store operation error rate over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/precise-code-intel-worker/precise-code-intel-worker?viewPanel=100213 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (op)(increase(src_codeintel_uploads_lsifstore_errors_total{job=~"^precise-code-intel-worker.*"}[5m])) / (sum by (op)(increase(src_codeintel_uploads_lsifstore_total{job=~"^precise-code-intel-worker.*"}[5m])) + sum by (op)(increase(src_codeintel_uploads_lsifstore_errors_total{job=~"^precise-code-intel-worker.*"}[5m]))) * 100
Precise Code Intel Worker: Workerutil: lsif_uploads dbworker/store stats
precise-code-intel-worker: workerutil_dbworker_store_total
Store operations every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/precise-code-intel-worker/precise-code-intel-worker?viewPanel=100300 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_workerutil_dbworker_store_total{domain='codeintel_upload',job=~"^precise-code-intel-worker.*"}[5m]))
precise-code-intel-worker: workerutil_dbworker_store_99th_percentile_duration
Aggregate successful store operation duration distribution over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/precise-code-intel-worker/precise-code-intel-worker?viewPanel=100301 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (le)(rate(src_workerutil_dbworker_store_duration_seconds_bucket{domain='codeintel_upload',job=~"^precise-code-intel-worker.*"}[5m]))
precise-code-intel-worker: workerutil_dbworker_store_errors_total
Store operation errors every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/precise-code-intel-worker/precise-code-intel-worker?viewPanel=100302 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_workerutil_dbworker_store_errors_total{domain='codeintel_upload',job=~"^precise-code-intel-worker.*"}[5m]))
precise-code-intel-worker: workerutil_dbworker_store_error_rate
Store operation error rate over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/precise-code-intel-worker/precise-code-intel-worker?viewPanel=100303 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_workerutil_dbworker_store_errors_total{domain='codeintel_upload',job=~"^precise-code-intel-worker.*"}[5m])) / (sum(increase(src_workerutil_dbworker_store_total{domain='codeintel_upload',job=~"^precise-code-intel-worker.*"}[5m])) + sum(increase(src_workerutil_dbworker_store_errors_total{domain='codeintel_upload',job=~"^precise-code-intel-worker.*"}[5m]))) * 100
Precise Code Intel Worker: Codeintel: gitserver client
precise-code-intel-worker: gitserver_client_total
Aggregate client operations every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/precise-code-intel-worker/precise-code-intel-worker?viewPanel=100400 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_gitserver_client_total{job=~"^precise-code-intel-worker.*"}[5m]))
precise-code-intel-worker: gitserver_client_99th_percentile_duration
Aggregate successful client operation duration distribution over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/precise-code-intel-worker/precise-code-intel-worker?viewPanel=100401 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (le)(rate(src_gitserver_client_duration_seconds_bucket{job=~"^precise-code-intel-worker.*"}[5m]))
precise-code-intel-worker: gitserver_client_errors_total
Aggregate client operation errors every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/precise-code-intel-worker/precise-code-intel-worker?viewPanel=100402 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_gitserver_client_errors_total{job=~"^precise-code-intel-worker.*"}[5m]))
precise-code-intel-worker: gitserver_client_error_rate
Aggregate client operation error rate over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/precise-code-intel-worker/precise-code-intel-worker?viewPanel=100403 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_gitserver_client_errors_total{job=~"^precise-code-intel-worker.*"}[5m])) / (sum(increase(src_gitserver_client_total{job=~"^precise-code-intel-worker.*"}[5m])) + sum(increase(src_gitserver_client_errors_total{job=~"^precise-code-intel-worker.*"}[5m]))) * 100
precise-code-intel-worker: gitserver_client_total
Client operations every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/precise-code-intel-worker/precise-code-intel-worker?viewPanel=100410 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (op)(increase(src_gitserver_client_total{job=~"^precise-code-intel-worker.*"}[5m]))
precise-code-intel-worker: gitserver_client_99th_percentile_duration
99th percentile successful client operation duration over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/precise-code-intel-worker/precise-code-intel-worker?viewPanel=100411 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.99, sum by (le,op)(rate(src_gitserver_client_duration_seconds_bucket{job=~"^precise-code-intel-worker.*"}[5m])))
precise-code-intel-worker: gitserver_client_errors_total
Client operation errors every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/precise-code-intel-worker/precise-code-intel-worker?viewPanel=100412 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (op)(increase(src_gitserver_client_errors_total{job=~"^precise-code-intel-worker.*"}[5m]))
precise-code-intel-worker: gitserver_client_error_rate
Client operation error rate over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/precise-code-intel-worker/precise-code-intel-worker?viewPanel=100413 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (op)(increase(src_gitserver_client_errors_total{job=~"^precise-code-intel-worker.*"}[5m])) / (sum by (op)(increase(src_gitserver_client_total{job=~"^precise-code-intel-worker.*"}[5m])) + sum by (op)(increase(src_gitserver_client_errors_total{job=~"^precise-code-intel-worker.*"}[5m]))) * 100
Precise Code Intel Worker: Codeintel: uploadstore stats
precise-code-intel-worker: codeintel_uploadstore_total
Aggregate store operations every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/precise-code-intel-worker/precise-code-intel-worker?viewPanel=100500 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_codeintel_uploadstore_total{job=~"^precise-code-intel-worker.*"}[5m]))
precise-code-intel-worker: codeintel_uploadstore_99th_percentile_duration
Aggregate successful store operation duration distribution over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/precise-code-intel-worker/precise-code-intel-worker?viewPanel=100501 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (le)(rate(src_codeintel_uploadstore_duration_seconds_bucket{job=~"^precise-code-intel-worker.*"}[5m]))
precise-code-intel-worker: codeintel_uploadstore_errors_total
Aggregate store operation errors every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/precise-code-intel-worker/precise-code-intel-worker?viewPanel=100502 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_codeintel_uploadstore_errors_total{job=~"^precise-code-intel-worker.*"}[5m]))
precise-code-intel-worker: codeintel_uploadstore_error_rate
Aggregate store operation error rate over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/precise-code-intel-worker/precise-code-intel-worker?viewPanel=100503 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_codeintel_uploadstore_errors_total{job=~"^precise-code-intel-worker.*"}[5m])) / (sum(increase(src_codeintel_uploadstore_total{job=~"^precise-code-intel-worker.*"}[5m])) + sum(increase(src_codeintel_uploadstore_errors_total{job=~"^precise-code-intel-worker.*"}[5m]))) * 100
precise-code-intel-worker: codeintel_uploadstore_total
Store operations every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/precise-code-intel-worker/precise-code-intel-worker?viewPanel=100510 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (op)(increase(src_codeintel_uploadstore_total{job=~"^precise-code-intel-worker.*"}[5m]))
precise-code-intel-worker: codeintel_uploadstore_99th_percentile_duration
99th percentile successful store operation duration over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/precise-code-intel-worker/precise-code-intel-worker?viewPanel=100511 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.99, sum by (le,op)(rate(src_codeintel_uploadstore_duration_seconds_bucket{job=~"^precise-code-intel-worker.*"}[5m])))
precise-code-intel-worker: codeintel_uploadstore_errors_total
Store operation errors every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/precise-code-intel-worker/precise-code-intel-worker?viewPanel=100512 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (op)(increase(src_codeintel_uploadstore_errors_total{job=~"^precise-code-intel-worker.*"}[5m]))
precise-code-intel-worker: codeintel_uploadstore_error_rate
Store operation error rate over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/precise-code-intel-worker/precise-code-intel-worker?viewPanel=100513 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (op)(increase(src_codeintel_uploadstore_errors_total{job=~"^precise-code-intel-worker.*"}[5m])) / (sum by (op)(increase(src_codeintel_uploadstore_total{job=~"^precise-code-intel-worker.*"}[5m])) + sum by (op)(increase(src_codeintel_uploadstore_errors_total{job=~"^precise-code-intel-worker.*"}[5m]))) * 100
Precise Code Intel Worker: Database connections
precise-code-intel-worker: max_open_conns
Maximum open
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/precise-code-intel-worker/precise-code-intel-worker?viewPanel=100600 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (app_name, db_name) (src_pgsql_conns_max_open{app_name="precise-code-intel-worker"})
precise-code-intel-worker: open_conns
Established
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/precise-code-intel-worker/precise-code-intel-worker?viewPanel=100601 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (app_name, db_name) (src_pgsql_conns_open{app_name="precise-code-intel-worker"})
precise-code-intel-worker: in_use
Used
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/precise-code-intel-worker/precise-code-intel-worker?viewPanel=100610 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (app_name, db_name) (src_pgsql_conns_in_use{app_name="precise-code-intel-worker"})
precise-code-intel-worker: idle
Idle
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/precise-code-intel-worker/precise-code-intel-worker?viewPanel=100611 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (app_name, db_name) (src_pgsql_conns_idle{app_name="precise-code-intel-worker"})
precise-code-intel-worker: mean_blocked_seconds_per_conn_request
Mean blocked seconds per conn request
Refer to the alerts reference for 2 alerts related to this panel.
To see this panel, visit /-/debug/grafana/d/precise-code-intel-worker/precise-code-intel-worker?viewPanel=100620 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (app_name, db_name) (increase(src_pgsql_conns_blocked_seconds{app_name="precise-code-intel-worker"}[5m])) / sum by (app_name, db_name) (increase(src_pgsql_conns_waited_for{app_name="precise-code-intel-worker"}[5m]))
precise-code-intel-worker: closed_max_idle
Closed by SetMaxIdleConns
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/precise-code-intel-worker/precise-code-intel-worker?viewPanel=100630 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (app_name, db_name) (increase(src_pgsql_conns_closed_max_idle{app_name="precise-code-intel-worker"}[5m]))
precise-code-intel-worker: closed_max_lifetime
Closed by SetConnMaxLifetime
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/precise-code-intel-worker/precise-code-intel-worker?viewPanel=100631 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (app_name, db_name) (increase(src_pgsql_conns_closed_max_lifetime{app_name="precise-code-intel-worker"}[5m]))
precise-code-intel-worker: closed_max_idle_time
Closed by SetConnMaxIdleTime
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/precise-code-intel-worker/precise-code-intel-worker?viewPanel=100632 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (app_name, db_name) (increase(src_pgsql_conns_closed_max_idle_time{app_name="precise-code-intel-worker"}[5m]))
Precise Code Intel Worker: Precise-code-intel-worker (CPU, Memory)
precise-code-intel-worker: cpu_usage_percentage
CPU usage
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/precise-code-intel-worker/precise-code-intel-worker?viewPanel=100700 on your Sourcegraph instance.
Technical details
Query:
SHELLcadvisor_container_cpu_usage_percentage_total{name=~"^precise-code-intel-worker.*"}
precise-code-intel-worker: memory_usage_percentage
Memory usage percentage (total)
An estimate for the active memory in use, which includes anonymous memory, file memory, and kernel memory. Some of this memory is reclaimable, so high usage does not necessarily indicate memory pressure.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/precise-code-intel-worker/precise-code-intel-worker?viewPanel=100701 on your Sourcegraph instance.
Technical details
Query:
SHELLcadvisor_container_memory_usage_percentage_total{name=~"^precise-code-intel-worker.*"}
precise-code-intel-worker: memory_working_set_bytes
Memory usage bytes (total)
An estimate for the active memory in use in bytes, which includes anonymous memory, file memory, and kernel memory. Some of this memory is reclaimable, so high usage does not necessarily indicate memory pressure.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/precise-code-intel-worker/precise-code-intel-worker?viewPanel=100702 on your Sourcegraph instance.
Technical details
Query:
SHELLmax by (name) (container_memory_working_set_bytes{name=~"^precise-code-intel-worker.*"})
precise-code-intel-worker: memory_rss
Memory (RSS)
The total anonymous memory in use by the application, which includes the Go stack and heap. This memory is non-reclaimable, and high usage may trigger OOM kills. Note: the metric is named RSS to match the cadvisor metric name, but "anonymous memory" is more accurate.
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/precise-code-intel-worker/precise-code-intel-worker?viewPanel=100710 on your Sourcegraph instance.
Technical details
Query:
SHELLmax(container_memory_rss{name=~"^precise-code-intel-worker.*"} / container_spec_memory_limit_bytes{name=~"^precise-code-intel-worker.*"}) by (name) * 100.0
precise-code-intel-worker: memory_total_active_file
Memory usage (active file)
This metric shows the total active file-backed memory currently in use by the application. Some of it may be reclaimable, so high usage does not necessarily indicate memory pressure.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/precise-code-intel-worker/precise-code-intel-worker?viewPanel=100711 on your Sourcegraph instance.
Technical details
Query:
SHELLmax(container_memory_total_active_file_bytes{name=~"^precise-code-intel-worker.*"} / container_spec_memory_limit_bytes{name=~"^precise-code-intel-worker.*"}) by (name) * 100.0
precise-code-intel-worker: memory_kernel_usage
Memory usage (kernel)
The kernel usage metric shows the amount of memory used by the kernel on behalf of the application. Some of it may be reclaimable, so high usage does not necessarily indicate memory pressure.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/precise-code-intel-worker/precise-code-intel-worker?viewPanel=100712 on your Sourcegraph instance.
Technical details
Query:
SHELLmax(container_memory_kernel_usage{name=~"^precise-code-intel-worker.*"} / container_spec_memory_limit_bytes{name=~"^precise-code-intel-worker.*"}) by (name) * 100.0
Precise Code Intel Worker: Container monitoring (not available on server)
precise-code-intel-worker: container_missing
Container missing
This value is the number of times a container has not been seen for more than one minute. If you observe this value change independent of deployment events (such as an upgrade), it could indicate pods are being OOM killed or terminated for some other reason.
- Kubernetes:
  - Determine if the pod was OOM killed using kubectl describe pod precise-code-intel-worker (look for OOMKilled: true) and, if so, consider increasing the memory limit in the relevant Deployment.yaml. A quick command sketch for this check follows this panel's query.
  - Check the logs before the container restarted to see if there are panic: messages or similar using kubectl logs -p precise-code-intel-worker.
- Docker Compose:
  - Determine if the container was OOM killed using docker inspect -f '{{json .State}}' precise-code-intel-worker (look for "OOMKilled":true) and, if so, consider increasing the memory limit of the precise-code-intel-worker container in docker-compose.yml.
  - Check the logs before the container restarted to see if there are panic: messages or similar using docker logs precise-code-intel-worker (note this will include logs from the previous and currently running container).
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/precise-code-intel-worker/precise-code-intel-worker?viewPanel=100800 on your Sourcegraph instance.
Technical details
Query:
SHELLcount by(name) ((time() - container_last_seen{name=~"^precise-code-intel-worker.*"}) > 60)
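If you suspect an OOM kill on Kubernetes, the two commands referenced in the list above can be combined into a quick check. This is only a sketch; it assumes the pod is named precise-code-intel-worker in your deployment, so adjust the name as needed:
SHELLkubectl describe pod precise-code-intel-worker | grep -i oomkilled   # look for OOMKilled: true
kubectl logs -p precise-code-intel-worker | grep -i "panic:"   # logs from the previous container instance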
precise-code-intel-worker: container_cpu_usage
Container cpu usage total (1m average) across all cores by instance
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/precise-code-intel-worker/precise-code-intel-worker?viewPanel=100801 on your Sourcegraph instance.
Technical details
Query:
SHELLcadvisor_container_cpu_usage_percentage_total{name=~"^precise-code-intel-worker.*"}
precise-code-intel-worker: container_memory_usage
Container memory usage by instance
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/precise-code-intel-worker/precise-code-intel-worker?viewPanel=100802 on your Sourcegraph instance.
Technical details
Query:
SHELLcadvisor_container_memory_usage_percentage_total{name=~"^precise-code-intel-worker.*"}
precise-code-intel-worker: fs_io_operations
Filesystem reads and writes rate by instance over 1h
This value indicates the number of filesystem read and write operations by containers of this service. When extremely high, this can indicate a resource usage problem, or can cause problems with the service itself, especially if high values or spikes correlate with precise-code-intel-worker issues.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/precise-code-intel-worker/precise-code-intel-worker?viewPanel=100803 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by(name) (rate(container_fs_reads_total{name=~"^precise-code-intel-worker.*"}[1h]) + rate(container_fs_writes_total{name=~"^precise-code-intel-worker.*"}[1h]))
Precise Code Intel Worker: Provisioning indicators (not available on server)
precise-code-intel-worker: provisioning_container_cpu_usage_long_term
Container cpu usage total (90th percentile over 1d) across all cores by instance
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/precise-code-intel-worker/precise-code-intel-worker?viewPanel=100900 on your Sourcegraph instance.
Technical details
Query:
SHELLquantile_over_time(0.9, cadvisor_container_cpu_usage_percentage_total{name=~"^precise-code-intel-worker.*"}[1d])
precise-code-intel-worker: provisioning_container_memory_usage_long_term
Container memory usage (1d maximum) by instance
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/precise-code-intel-worker/precise-code-intel-worker?viewPanel=100901 on your Sourcegraph instance.
Technical details
Query:
SHELLmax_over_time(cadvisor_container_memory_usage_percentage_total{name=~"^precise-code-intel-worker.*"}[1d])
precise-code-intel-worker: provisioning_container_cpu_usage_short_term
Container cpu usage total (5m maximum) across all cores by instance
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/precise-code-intel-worker/precise-code-intel-worker?viewPanel=100910 on your Sourcegraph instance.
Technical details
Query:
SHELLmax_over_time(cadvisor_container_cpu_usage_percentage_total{name=~"^precise-code-intel-worker.*"}[5m])
precise-code-intel-worker: provisioning_container_memory_usage_short_term
Container memory usage (5m maximum) by instance
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/precise-code-intel-worker/precise-code-intel-worker?viewPanel=100911 on your Sourcegraph instance.
Technical details
Query:
SHELLmax_over_time(cadvisor_container_memory_usage_percentage_total{name=~"^precise-code-intel-worker.*"}[5m])
precise-code-intel-worker: container_oomkill_events_total
Container OOMKILL events total by instance
This value indicates the total number of times the container main process or child processes were terminated by OOM killer. When it occurs frequently, it is an indicator of underprovisioning.
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/precise-code-intel-worker/precise-code-intel-worker?viewPanel=100912 on your Sourcegraph instance.
Technical details
Query:
SHELLmax by (name) (container_oom_events_total{name=~"^precise-code-intel-worker.*"})
Precise Code Intel Worker: Golang runtime monitoring
precise-code-intel-worker: go_goroutines
Maximum active goroutines
A high value here indicates a possible goroutine leak.
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/precise-code-intel-worker/precise-code-intel-worker?viewPanel=101000 on your Sourcegraph instance.
Technical details
Query:
SHELLmax by(instance) (go_goroutines{job=~".*precise-code-intel-worker"})
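To help distinguish a genuine leak from a stable plateau, it can be useful to look at how the goroutine count changes over a longer window. The query below is only a sketch that reuses this panel's metric and job selector; a sustained positive value suggests a leak:
SHELLdelta(go_goroutines{job=~".*precise-code-intel-worker"}[1h])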
precise-code-intel-worker: go_gc_duration_seconds
Maximum go garbage collection duration
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/precise-code-intel-worker/precise-code-intel-worker?viewPanel=101001 on your Sourcegraph instance.
Technical details
Query:
SHELLmax by(instance) (go_gc_duration_seconds{job=~".*precise-code-intel-worker"})
Precise Code Intel Worker: Kubernetes monitoring (only available on Kubernetes)
precise-code-intel-worker: pods_available_percentage
Percentage pods available
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/precise-code-intel-worker/precise-code-intel-worker?viewPanel=101100 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by(app) (up{app=~".*precise-code-intel-worker"}) / count by (app) (up{app=~".*precise-code-intel-worker"}) * 100
Syntactic Indexing
Handles syntactic indexing of repositories.
To see this dashboard, visit /-/debug/grafana/d/syntactic-indexing/syntactic-indexing on your Sourcegraph instance.
Syntactic Indexing: Syntactic indexing scheduling: summary
syntactic-indexing:
Syntactic indexing jobs proposed for insertion over 5m
Syntactic indexing jobs are proposed for insertion into the queue based on round-robin scheduling across recently modified repos.
This should be equal to the sum of inserted + updated + skipped, but is shown separately for clarity.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/syntactic-indexing/syntactic-indexing?viewPanel=100000 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_codeintel_syntactic_enqueuer_jobs_proposed[5m]))
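Because the proposed count should equal the sum of inserted, updated, and skipped jobs, a quick consistency check is to subtract the latter from the former. This is only a sketch built from the enqueuer metrics shown in the following panels; a persistently non-zero result would suggest a scheduling or metric discrepancy:
SHELLsum(increase(src_codeintel_syntactic_enqueuer_jobs_proposed[5m])) - (sum(increase(src_codeintel_syntactic_enqueuer_jobs_inserted[5m])) + sum(increase(src_codeintel_syntactic_enqueuer_jobs_updated[5m])) + sum(increase(src_codeintel_syntactic_enqueuer_jobs_skipped[5m])))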
syntactic-indexing:
Syntactic indexing jobs inserted over 5m
Syntactic indexing jobs are inserted into the queue if there is a proposed repo commit pair (R, X) such that there is no existing job for R in the queue.
If this number is close to the number of proposed jobs, it may indicate that the scheduler is not able to keep up with the rate of incoming commits.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/syntactic-indexing/syntactic-indexing?viewPanel=100001 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_codeintel_syntactic_enqueuer_jobs_inserted[5m]))
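To see how close insertions come to the number of proposed jobs, the ratio can be plotted directly. This is only a sketch that combines this panel's metric with the proposed-jobs metric above:
SHELLsum(increase(src_codeintel_syntactic_enqueuer_jobs_inserted[5m])) / sum(increase(src_codeintel_syntactic_enqueuer_jobs_proposed[5m])) * 100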
syntactic-indexing:
Syntactic indexing jobs updated in-place over 5m
Syntactic indexing jobs are updated in-place when the scheduler attempts to enqueue a repo commit pair (R, X) and discovers that the queue already had some other repo commit pair (R, Y) where Y is an ancestor of X. In that case, the job is updated in-place to point to X, to reflect the fact that users looking at the tip of the default branch of R are more likely to benefit from newer commits being indexed.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/syntactic-indexing/syntactic-indexing?viewPanel=100002 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_codeintel_syntactic_enqueuer_jobs_updated[5m]))
syntactic-indexing:
Syntactic indexing jobs skipped over 5m
Syntactic indexing jobs insertion is skipped when the scheduler attempts to enqueue a repo commit pair (R, X) and discovers that the queue already had the same job (most likely) or another job (R, Y) where Y is not an ancestor of X.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/syntactic-indexing/syntactic-indexing?viewPanel=100003 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_codeintel_syntactic_enqueuer_jobs_skipped[5m]))
Syntactic Indexing: Workerutil: syntactic_scip_indexing_jobs dbworker/store stats
syntactic-indexing: workerutil_dbworker_store_total
Store operations every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/syntactic-indexing/syntactic-indexing?viewPanel=100100 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_workerutil_dbworker_store_total{domain='syntactic_scip_indexing_jobs',job=~"^syntactic-code-intel-worker.*"}[5m]))
syntactic-indexing: workerutil_dbworker_store_99th_percentile_duration
Aggregate successful store operation duration distribution over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/syntactic-indexing/syntactic-indexing?viewPanel=100101 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (le)(rate(src_workerutil_dbworker_store_duration_seconds_bucket{domain='syntactic_scip_indexing_jobs',job=~"^syntactic-code-intel-worker.*"}[5m]))
syntactic-indexing: workerutil_dbworker_store_errors_total
Store operation errors every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/syntactic-indexing/syntactic-indexing?viewPanel=100102 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_workerutil_dbworker_store_errors_total{domain='syntactic_scip_indexing_jobs',job=~"^syntactic-code-intel-worker.*"}[5m]))
syntactic-indexing: workerutil_dbworker_store_error_rate
Store operation error rate over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/syntactic-indexing/syntactic-indexing?viewPanel=100103 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_workerutil_dbworker_store_errors_total{domain='syntactic_scip_indexing_jobs',job=~"^syntactic-code-intel-worker.*"}[5m])) / (sum(increase(src_workerutil_dbworker_store_total{domain='syntactic_scip_indexing_jobs',job=~"^syntactic-code-intel-worker.*"}[5m])) + sum(increase(src_workerutil_dbworker_store_errors_total{domain='syntactic_scip_indexing_jobs',job=~"^syntactic-code-intel-worker.*"}[5m]))) * 100
Syntactic Indexing: Codeintel: gitserver client
syntactic-indexing: gitserver_client_total
Aggregate client operations every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/syntactic-indexing/syntactic-indexing?viewPanel=100200 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_gitserver_client_total{job=~"^syntactic-code-intel-worker.*"}[5m]))
syntactic-indexing: gitserver_client_99th_percentile_duration
Aggregate successful client operation duration distribution over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/syntactic-indexing/syntactic-indexing?viewPanel=100201 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (le)(rate(src_gitserver_client_duration_seconds_bucket{job=~"^syntactic-code-intel-worker.*"}[5m]))
syntactic-indexing: gitserver_client_errors_total
Aggregate client operation errors every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/syntactic-indexing/syntactic-indexing?viewPanel=100202 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_gitserver_client_errors_total{job=~"^syntactic-code-intel-worker.*"}[5m]))
syntactic-indexing: gitserver_client_error_rate
Aggregate client operation error rate over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/syntactic-indexing/syntactic-indexing?viewPanel=100203 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_gitserver_client_errors_total{job=~"^syntactic-code-intel-worker.*"}[5m])) / (sum(increase(src_gitserver_client_total{job=~"^syntactic-code-intel-worker.*"}[5m])) + sum(increase(src_gitserver_client_errors_total{job=~"^syntactic-code-intel-worker.*"}[5m]))) * 100
syntactic-indexing: gitserver_client_total
Client operations every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/syntactic-indexing/syntactic-indexing?viewPanel=100210 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (op)(increase(src_gitserver_client_total{job=~"^syntactic-code-intel-worker.*"}[5m]))
syntactic-indexing: gitserver_client_99th_percentile_duration
99th percentile successful client operation duration over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/syntactic-indexing/syntactic-indexing?viewPanel=100211 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.99, sum by (le,op)(rate(src_gitserver_client_duration_seconds_bucket{job=~"^syntactic-code-intel-worker.*"}[5m])))
syntactic-indexing: gitserver_client_errors_total
Client operation errors every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/syntactic-indexing/syntactic-indexing?viewPanel=100212 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (op)(increase(src_gitserver_client_errors_total{job=~"^syntactic-code-intel-worker.*"}[5m]))
syntactic-indexing: gitserver_client_error_rate
Client operation error rate over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/syntactic-indexing/syntactic-indexing?viewPanel=100213 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (op)(increase(src_gitserver_client_errors_total{job=~"^syntactic-code-intel-worker.*"}[5m])) / (sum by (op)(increase(src_gitserver_client_total{job=~"^syntactic-code-intel-worker.*"}[5m])) + sum by (op)(increase(src_gitserver_client_errors_total{job=~"^syntactic-code-intel-worker.*"}[5m]))) * 100
Syntactic Indexing: Database connections
syntactic-indexing: max_open_conns
Maximum open
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/syntactic-indexing/syntactic-indexing?viewPanel=100300 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (app_name, db_name) (src_pgsql_conns_max_open{app_name="syntactic-code-intel-worker"})
syntactic-indexing: open_conns
Established
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/syntactic-indexing/syntactic-indexing?viewPanel=100301 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (app_name, db_name) (src_pgsql_conns_open{app_name="syntactic-code-intel-worker"})
syntactic-indexing: in_use
Used
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/syntactic-indexing/syntactic-indexing?viewPanel=100310 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (app_name, db_name) (src_pgsql_conns_in_use{app_name="syntactic-code-intel-worker"})
syntactic-indexing: idle
Idle
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/syntactic-indexing/syntactic-indexing?viewPanel=100311 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (app_name, db_name) (src_pgsql_conns_idle{app_name="syntactic-code-intel-worker"})
syntactic-indexing: mean_blocked_seconds_per_conn_request
Mean blocked seconds per conn request
Refer to the alerts reference for 2 alerts related to this panel.
To see this panel, visit /-/debug/grafana/d/syntactic-indexing/syntactic-indexing?viewPanel=100320 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (app_name, db_name) (increase(src_pgsql_conns_blocked_seconds{app_name="syntactic-code-intel-worker"}[5m])) / sum by (app_name, db_name) (increase(src_pgsql_conns_waited_for{app_name="syntactic-code-intel-worker"}[5m]))
syntactic-indexing: closed_max_idle
Closed by SetMaxIdleConns
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/syntactic-indexing/syntactic-indexing?viewPanel=100330 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (app_name, db_name) (increase(src_pgsql_conns_closed_max_idle{app_name="syntactic-code-intel-worker"}[5m]))
syntactic-indexing: closed_max_lifetime
Closed by SetConnMaxLifetime
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/syntactic-indexing/syntactic-indexing?viewPanel=100331 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (app_name, db_name) (increase(src_pgsql_conns_closed_max_lifetime{app_name="syntactic-code-intel-worker"}[5m]))
syntactic-indexing: closed_max_idle_time
Closed by SetConnMaxIdleTime
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/syntactic-indexing/syntactic-indexing?viewPanel=100332 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (app_name, db_name) (increase(src_pgsql_conns_closed_max_idle_time{app_name="syntactic-code-intel-worker"}[5m]))
Syntactic Indexing: Syntactic-code-intel-worker (CPU, Memory)
syntactic-indexing: cpu_usage_percentage
CPU usage
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/syntactic-indexing/syntactic-indexing?viewPanel=100400 on your Sourcegraph instance.
Technical details
Query:
SHELLcadvisor_container_cpu_usage_percentage_total{name=~"^syntactic-code-intel-worker.*"}
syntactic-indexing: memory_usage_percentage
Memory usage percentage (total)
An estimate for the active memory in use, which includes anonymous memory, file memory, and kernel memory. Some of this memory is reclaimable, so high usage does not necessarily indicate memory pressure.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/syntactic-indexing/syntactic-indexing?viewPanel=100401 on your Sourcegraph instance.
Technical details
Query:
SHELLcadvisor_container_memory_usage_percentage_total{name=~"^syntactic-code-intel-worker.*"}
syntactic-indexing: memory_working_set_bytes
Memory usage bytes (total)
An estimate for the active memory in use in bytes, which includes anonymous memory, file memory, and kernel memory. Some of this memory is reclaimable, so high usage does not necessarily indicate memory pressure.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/syntactic-indexing/syntactic-indexing?viewPanel=100402 on your Sourcegraph instance.
Technical details
Query:
SHELLmax by (name) (container_memory_working_set_bytes{name=~"^syntactic-code-intel-worker.*"})
syntactic-indexing: memory_rss
Memory (RSS)
The total anonymous memory in use by the application, which includes Go stack and heap. This memory is non-reclaimable, and high usage may trigger OOM kills. Note: the metric is named RSS to match the cadvisor name, but anonymous memory is more accurate.
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/syntactic-indexing/syntactic-indexing?viewPanel=100410 on your Sourcegraph instance.
Technical details
Query:
SHELLmax(container_memory_rss{name=~"^syntactic-code-intel-worker.*"} / container_spec_memory_limit_bytes{name=~"^syntactic-code-intel-worker.*"}) by (name) * 100.0
syntactic-indexing: memory_total_active_file
Memory usage (active file)
This metric shows the total active file-backed memory currently in use by the application. Some of it may be reclaimable, so high usage does not necessarily indicate memory pressure.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/syntactic-indexing/syntactic-indexing?viewPanel=100411 on your Sourcegraph instance.
Technical details
Query:
SHELLmax(container_memory_total_active_file_bytes{name=~"^syntactic-code-intel-worker.*"} / container_spec_memory_limit_bytes{name=~"^syntactic-code-intel-worker.*"}) by (name) * 100.0
syntactic-indexing: memory_kernel_usage
Memory usage (kernel)
The kernel usage metric shows the amount of memory used by the kernel on behalf of the application. Some of it may be reclaimable, so high usage does not necessarily indicate memory pressure.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/syntactic-indexing/syntactic-indexing?viewPanel=100412 on your Sourcegraph instance.
Technical details
Query:
SHELLmax(container_memory_kernel_usage{name=~"^syntactic-code-intel-worker.*"} / container_spec_memory_limit_bytes{name=~"^syntactic-code-intel-worker.*"}) by (name) * 100.0
Syntactic Indexing: Container monitoring (not available on server)
syntactic-indexing: container_missing
Container missing
This value is the number of times a container has not been seen for more than one minute. If you observe this value change independent of deployment events (such as an upgrade), it could indicate pods are being OOM killed or terminated for some other reason.
- Kubernetes:
  - Determine if the pod was OOM killed using kubectl describe pod syntactic-code-intel-worker (look for OOMKilled: true) and, if so, consider increasing the memory limit in the relevant Deployment.yaml.
  - Check the logs before the container restarted to see if there are panic: messages or similar using kubectl logs -p syntactic-code-intel-worker.
- Docker Compose:
  - Determine if the container was OOM killed using docker inspect -f '{{json .State}}' syntactic-code-intel-worker (look for "OOMKilled":true) and, if so, consider increasing the memory limit of the syntactic-code-intel-worker container in docker-compose.yml. A quick command sketch for this check follows this panel's query.
  - Check the logs before the container restarted to see if there are panic: messages or similar using docker logs syntactic-code-intel-worker (note this will include logs from the previous and currently running container).
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/syntactic-indexing/syntactic-indexing?viewPanel=100500 on your Sourcegraph instance.
Technical details
Query:
SHELLcount by(name) ((time() - container_last_seen{name=~"^syntactic-code-intel-worker.*"}) > 60)
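On Docker Compose deployments, the equivalent checks from the list above can be run directly against the container. This is only a sketch and assumes the container is named syntactic-code-intel-worker:
SHELLdocker inspect -f '{{json .State}}' syntactic-code-intel-worker | grep -i oomkilled   # look for "OOMKilled":true
docker logs syntactic-code-intel-worker 2>&1 | grep -i "panic:"   # includes previous and current container logs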
syntactic-indexing: container_cpu_usage
Container cpu usage total (1m average) across all cores by instance
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/syntactic-indexing/syntactic-indexing?viewPanel=100501 on your Sourcegraph instance.
Technical details
Query:
SHELLcadvisor_container_cpu_usage_percentage_total{name=~"^syntactic-code-intel-worker.*"}
syntactic-indexing: container_memory_usage
Container memory usage by instance
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/syntactic-indexing/syntactic-indexing?viewPanel=100502 on your Sourcegraph instance.
Technical details
Query:
SHELLcadvisor_container_memory_usage_percentage_total{name=~"^syntactic-code-intel-worker.*"}
syntactic-indexing: fs_io_operations
Filesystem reads and writes rate by instance over 1h
This value indicates the number of filesystem read and write operations by containers of this service. When extremely high, this can indicate a resource usage problem, or can cause problems with the service itself, especially if high values or spikes correlate with syntactic-code-intel-worker issues.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/syntactic-indexing/syntactic-indexing?viewPanel=100503 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by(name) (rate(container_fs_reads_total{name=~"^syntactic-code-intel-worker.*"}[1h]) + rate(container_fs_writes_total{name=~"^syntactic-code-intel-worker.*"}[1h]))
Syntactic Indexing: Provisioning indicators (not available on server)
syntactic-indexing: provisioning_container_cpu_usage_long_term
Container cpu usage total (90th percentile over 1d) across all cores by instance
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/syntactic-indexing/syntactic-indexing?viewPanel=100600 on your Sourcegraph instance.
Technical details
Query:
SHELLquantile_over_time(0.9, cadvisor_container_cpu_usage_percentage_total{name=~"^syntactic-code-intel-worker.*"}[1d])
syntactic-indexing: provisioning_container_memory_usage_long_term
Container memory usage (1d maximum) by instance
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/syntactic-indexing/syntactic-indexing?viewPanel=100601 on your Sourcegraph instance.
Technical details
Query:
SHELLmax_over_time(cadvisor_container_memory_usage_percentage_total{name=~"^syntactic-code-intel-worker.*"}[1d])
syntactic-indexing: provisioning_container_cpu_usage_short_term
Container cpu usage total (5m maximum) across all cores by instance
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/syntactic-indexing/syntactic-indexing?viewPanel=100610 on your Sourcegraph instance.
Technical details
Query:
SHELLmax_over_time(cadvisor_container_cpu_usage_percentage_total{name=~"^syntactic-code-intel-worker.*"}[5m])
syntactic-indexing: provisioning_container_memory_usage_short_term
Container memory usage (5m maximum) by instance
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/syntactic-indexing/syntactic-indexing?viewPanel=100611 on your Sourcegraph instance.
Technical details
Query:
SHELLmax_over_time(cadvisor_container_memory_usage_percentage_total{name=~"^syntactic-code-intel-worker.*"}[5m])
syntactic-indexing: container_oomkill_events_total
Container OOMKILL events total by instance
This value indicates the total number of times the container main process or child processes were terminated by OOM killer. When it occurs frequently, it is an indicator of underprovisioning.
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/syntactic-indexing/syntactic-indexing?viewPanel=100612 on your Sourcegraph instance.
Technical details
Query:
SHELLmax by (name) (container_oom_events_total{name=~"^syntactic-code-intel-worker.*"})
Syntactic Indexing: Golang runtime monitoring
syntactic-indexing: go_goroutines
Maximum active goroutines
A high value here indicates a possible goroutine leak.
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/syntactic-indexing/syntactic-indexing?viewPanel=100700 on your Sourcegraph instance.
Technical details
Query:
SHELLmax by(instance) (go_goroutines{job=~".*syntactic-code-intel-worker"})
syntactic-indexing: go_gc_duration_seconds
Maximum go garbage collection duration
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/syntactic-indexing/syntactic-indexing?viewPanel=100701 on your Sourcegraph instance.
Technical details
Query:
SHELLmax by(instance) (go_gc_duration_seconds{job=~".*syntactic-code-intel-worker"})
Syntactic Indexing: Kubernetes monitoring (only available on Kubernetes)
syntactic-indexing: pods_available_percentage
Percentage pods available
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/syntactic-indexing/syntactic-indexing?viewPanel=100800 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by(app) (up{app=~".*syntactic-code-intel-worker"}) / count by (app) (up{app=~".*syntactic-code-intel-worker"}) * 100
Redis
Metrics from both redis databases.
To see this dashboard, visit /-/debug/grafana/d/redis/redis on your Sourcegraph instance.
Redis: Redis Store
redis: redis-store_up
Redis-store availability
A value of 1 indicates the service is currently running.
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/redis/redis?viewPanel=100000 on your Sourcegraph instance.
Technical details
Query:
SHELLredis_up{app="redis-store"}
Redis: Redis Cache
redis: redis-cache_up
Redis-cache availability
A value of 1 indicates the service is currently running.
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/redis/redis?viewPanel=100100 on your Sourcegraph instance.
Technical details
Query:
SHELLredis_up{app="redis-cache"}
Redis: Provisioning indicators (not available on server)
redis: provisioning_container_cpu_usage_long_term
Container cpu usage total (90th percentile over 1d) across all cores by instance
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/redis/redis?viewPanel=100200 on your Sourcegraph instance.
Technical details
Query:
SHELLquantile_over_time(0.9, cadvisor_container_cpu_usage_percentage_total{name=~"^redis-cache.*"}[1d])
redis: provisioning_container_memory_usage_long_term
Container memory usage (1d maximum) by instance
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/redis/redis?viewPanel=100201 on your Sourcegraph instance.
Technical details
Query:
SHELLmax_over_time(cadvisor_container_memory_usage_percentage_total{name=~"^redis-cache.*"}[1d])
redis: provisioning_container_cpu_usage_short_term
Container cpu usage total (5m maximum) across all cores by instance
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/redis/redis?viewPanel=100210 on your Sourcegraph instance.
Technical details
Query:
SHELLmax_over_time(cadvisor_container_cpu_usage_percentage_total{name=~"^redis-cache.*"}[5m])
redis: provisioning_container_memory_usage_short_term
Container memory usage (5m maximum) by instance
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/redis/redis?viewPanel=100211 on your Sourcegraph instance.
Technical details
Query:
SHELLmax_over_time(cadvisor_container_memory_usage_percentage_total{name=~"^redis-cache.*"}[5m])
redis: container_oomkill_events_total
Container OOMKILL events total by instance
This value indicates the total number of times the container main process or child processes were terminated by OOM killer. When it occurs frequently, it is an indicator of underprovisioning.
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/redis/redis?viewPanel=100212 on your Sourcegraph instance.
Technical details
Query:
SHELLmax by (name) (container_oom_events_total{name=~"^redis-cache.*"})
Redis: Provisioning indicators (not available on server)
redis: provisioning_container_cpu_usage_long_term
Container cpu usage total (90th percentile over 1d) across all cores by instance
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/redis/redis?viewPanel=100300 on your Sourcegraph instance.
Technical details
Query:
SHELLquantile_over_time(0.9, cadvisor_container_cpu_usage_percentage_total{name=~"^redis-store.*"}[1d])
redis: provisioning_container_memory_usage_long_term
Container memory usage (1d maximum) by instance
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/redis/redis?viewPanel=100301 on your Sourcegraph instance.
Technical details
Query:
SHELLmax_over_time(cadvisor_container_memory_usage_percentage_total{name=~"^redis-store.*"}[1d])
redis: provisioning_container_cpu_usage_short_term
Container cpu usage total (5m maximum) across all cores by instance
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/redis/redis?viewPanel=100310 on your Sourcegraph instance.
Technical details
Query:
SHELLmax_over_time(cadvisor_container_cpu_usage_percentage_total{name=~"^redis-store.*"}[5m])
redis: provisioning_container_memory_usage_short_term
Container memory usage (5m maximum) by instance
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/redis/redis?viewPanel=100311 on your Sourcegraph instance.
Technical details
Query:
SHELLmax_over_time(cadvisor_container_memory_usage_percentage_total{name=~"^redis-store.*"}[5m])
redis: container_oomkill_events_total
Container OOMKILL events total by instance
This value indicates the total number of times the container main process or child processes were terminated by OOM killer. When it occurs frequently, it is an indicator of underprovisioning.
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/redis/redis?viewPanel=100312 on your Sourcegraph instance.
Technical details
Query:
SHELLmax by (name) (container_oom_events_total{name=~"^redis-store.*"})
Redis: Kubernetes monitoring (only available on Kubernetes)
redis: pods_available_percentage
Percentage pods available
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/redis/redis?viewPanel=100400 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by(app) (up{app=~".*redis-cache"}) / count by (app) (up{app=~".*redis-cache"}) * 100
Redis: Kubernetes monitoring (only available on Kubernetes)
redis: pods_available_percentage
Percentage pods available
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/redis/redis?viewPanel=100500 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by(app) (up{app=~".*redis-store"}) / count by (app) (up{app=~".*redis-store"}) * 100
Worker
Manages background processes.
To see this dashboard, visit /-/debug/grafana/d/worker/worker on your Sourcegraph instance.
Worker: Active jobs
worker: worker_job_count
Number of worker instances running each job
The number of worker instances running each job type. It is necessary for each job type to be managed by at least one worker instance.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=100000 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (job_name) (src_worker_jobs{job=~"^worker.*"})
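Since every job type must be managed by at least one worker instance, a quick way to surface job types that currently have no running instance is to filter this panel's query. This is only a sketch using the same metric and selector:
SHELLsum by (job_name) (src_worker_jobs{job=~"^worker.*"}) < 1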
worker: worker_job_codeintel-upload-janitor_count
Number of worker instances running the codeintel-upload-janitor job
Refer to the alerts reference for 2 alerts related to this panel.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=100010 on your Sourcegraph instance.
Technical details
Query:
SHELLsum (src_worker_jobs{job=~"^worker.*", job_name="codeintel-upload-janitor"})
worker: worker_job_codeintel-commitgraph-updater_count
Number of worker instances running the codeintel-commitgraph-updater job
Refer to the alerts reference for 2 alerts related to this panel.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=100011 on your Sourcegraph instance.
Technical details
Query:
SHELLsum (src_worker_jobs{job=~"^worker.*", job_name="codeintel-commitgraph-updater"})
worker: worker_job_codeintel-autoindexing-scheduler_count
Number of worker instances running the codeintel-autoindexing-scheduler job
Refer to the alerts reference for 2 alerts related to this panel.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=100012 on your Sourcegraph instance.
Technical details
Query:
SHELLsum (src_worker_jobs{job=~"^worker.*", job_name="codeintel-autoindexing-scheduler"})
Worker: Database record encrypter
worker: records_encrypted_at_rest_percentage
Percentage of database records encrypted at rest
Percentage of encrypted database records
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=100100 on your Sourcegraph instance.
Technical details
Query:
SHELL(max(src_records_encrypted_at_rest_total) by (tableName)) / ((max(src_records_encrypted_at_rest_total) by (tableName)) + (max(src_records_unencrypted_at_rest_total) by (tableName))) * 100
worker: records_encrypted_total
Database records encrypted every 5m
Number of encrypted database records every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=100101 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (tableName)(increase(src_records_encrypted_total{job=~"^worker.*"}[5m]))
worker: records_decrypted_total
Database records decrypted every 5m
Number of decrypted database records every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=100102 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (tableName)(increase(src_records_decrypted_total{job=~"^worker.*"}[5m]))
worker: record_encryption_errors_total
Encryption operation errors every 5m
Number of database record encryption/decryption errors every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=100103 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_record_encryption_errors_total{job=~"^worker.*"}[5m]))
Worker: Codeintel: Repository commit graph updates
worker: codeintel_commit_graph_processor_total
Update operations every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=100200 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_codeintel_commit_graph_processor_total{job=~"^worker.*"}[5m]))
worker: codeintel_commit_graph_processor_99th_percentile_duration
Aggregate successful update operation duration distribution over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=100201 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (le)(rate(src_codeintel_commit_graph_processor_duration_seconds_bucket{job=~"^worker.*"}[5m]))
worker: codeintel_commit_graph_processor_errors_total
Update operation errors every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=100202 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_codeintel_commit_graph_processor_errors_total{job=~"^worker.*"}[5m]))
worker: codeintel_commit_graph_processor_error_rate
Update operation error rate over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=100203 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_codeintel_commit_graph_processor_errors_total{job=~"^worker.*"}[5m])) / (sum(increase(src_codeintel_commit_graph_processor_total{job=~"^worker.*"}[5m])) + sum(increase(src_codeintel_commit_graph_processor_errors_total{job=~"^worker.*"}[5m]))) * 100
Worker: Codeintel: Auto-index scheduler
worker: codeintel_autoindexing_total
Auto-indexing job scheduler operations every 10m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=100300 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_codeintel_autoindexing_total{op='HandleIndexSchedule',job=~"^worker.*"}[10m]))
worker: codeintel_autoindexing_99th_percentile_duration
Aggregate successful auto-indexing job scheduler operation duration distribution over 10m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=100301 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (le)(rate(src_codeintel_autoindexing_duration_seconds_bucket{op='HandleIndexSchedule',job=~"^worker.*"}[10m]))
worker: codeintel_autoindexing_errors_total
Auto-indexing job scheduler operation errors every 10m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=100302 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_codeintel_autoindexing_errors_total{op='HandleIndexSchedule',job=~"^worker.*"}[10m]))
worker: codeintel_autoindexing_error_rate
Auto-indexing job scheduler operation error rate over 10m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=100303 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_codeintel_autoindexing_errors_total{op='HandleIndexSchedule',job=~"^worker.*"}[10m])) / (sum(increase(src_codeintel_autoindexing_total{op='HandleIndexSchedule',job=~"^worker.*"}[10m])) + sum(increase(src_codeintel_autoindexing_errors_total{op='HandleIndexSchedule',job=~"^worker.*"}[10m]))) * 100
Worker: Codeintel: dbstore stats
worker: codeintel_uploads_store_total
Aggregate store operations every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=100400 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_codeintel_uploads_store_total{job=~"^worker.*"}[5m]))
worker: codeintel_uploads_store_99th_percentile_duration
Aggregate successful store operation duration distribution over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=100401 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (le)(rate(src_codeintel_uploads_store_duration_seconds_bucket{job=~"^worker.*"}[5m]))
worker: codeintel_uploads_store_errors_total
Aggregate store operation errors every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=100402 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_codeintel_uploads_store_errors_total{job=~"^worker.*"}[5m]))
worker: codeintel_uploads_store_error_rate
Aggregate store operation error rate over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=100403 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_codeintel_uploads_store_errors_total{job=~"^worker.*"}[5m])) / (sum(increase(src_codeintel_uploads_store_total{job=~"^worker.*"}[5m])) + sum(increase(src_codeintel_uploads_store_errors_total{job=~"^worker.*"}[5m]))) * 100
worker: codeintel_uploads_store_total
Store operations every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=100410 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (op)(increase(src_codeintel_uploads_store_total{job=~"^worker.*"}[5m]))
worker: codeintel_uploads_store_99th_percentile_duration
99th percentile successful store operation duration over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=100411 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.99, sum by (le,op)(rate(src_codeintel_uploads_store_duration_seconds_bucket{job=~"^worker.*"}[5m])))
worker: codeintel_uploads_store_errors_total
Store operation errors every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=100412 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (op)(increase(src_codeintel_uploads_store_errors_total{job=~"^worker.*"}[5m]))
worker: codeintel_uploads_store_error_rate
Store operation error rate over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=100413 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (op)(increase(src_codeintel_uploads_store_errors_total{job=~"^worker.*"}[5m])) / (sum by (op)(increase(src_codeintel_uploads_store_total{job=~"^worker.*"}[5m])) + sum by (op)(increase(src_codeintel_uploads_store_errors_total{job=~"^worker.*"}[5m]))) * 100
Worker: Codeintel: lsifstore stats
worker: codeintel_uploads_lsifstore_total
Aggregate store operations every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=100500 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_codeintel_uploads_lsifstore_total{job=~"^worker.*"}[5m]))
worker: codeintel_uploads_lsifstore_99th_percentile_duration
Aggregate successful store operation duration distribution over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=100501 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (le)(rate(src_codeintel_uploads_lsifstore_duration_seconds_bucket{job=~"^worker.*"}[5m]))
worker: codeintel_uploads_lsifstore_errors_total
Aggregate store operation errors every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=100502 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_codeintel_uploads_lsifstore_errors_total{job=~"^worker.*"}[5m]))
worker: codeintel_uploads_lsifstore_error_rate
Aggregate store operation error rate over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=100503 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_codeintel_uploads_lsifstore_errors_total{job=~"^worker.*"}[5m])) / (sum(increase(src_codeintel_uploads_lsifstore_total{job=~"^worker.*"}[5m])) + sum(increase(src_codeintel_uploads_lsifstore_errors_total{job=~"^worker.*"}[5m]))) * 100
worker: codeintel_uploads_lsifstore_total
Store operations every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=100510 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (op)(increase(src_codeintel_uploads_lsifstore_total{job=~"^worker.*"}[5m]))
worker: codeintel_uploads_lsifstore_99th_percentile_duration
99th percentile successful store operation duration over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=100511 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.99, sum by (le,op)(rate(src_codeintel_uploads_lsifstore_duration_seconds_bucket{job=~"^worker.*"}[5m])))
worker: codeintel_uploads_lsifstore_errors_total
Store operation errors every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=100512 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (op)(increase(src_codeintel_uploads_lsifstore_errors_total{job=~"^worker.*"}[5m]))
worker: codeintel_uploads_lsifstore_error_rate
Store operation error rate over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=100513 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (op)(increase(src_codeintel_uploads_lsifstore_errors_total{job=~"^worker.*"}[5m])) / (sum by (op)(increase(src_codeintel_uploads_lsifstore_total{job=~"^worker.*"}[5m])) + sum by (op)(increase(src_codeintel_uploads_lsifstore_errors_total{job=~"^worker.*"}[5m]))) * 100
Worker: Codeintel: gitserver client
worker: gitserver_client_total
Aggregate client operations every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=100600 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_gitserver_client_total{job=~"^worker.*"}[5m]))
worker: gitserver_client_99th_percentile_duration
Aggregate successful client operation duration distribution over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=100601 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (le)(rate(src_gitserver_client_duration_seconds_bucket{job=~"^worker.*"}[5m]))
worker: gitserver_client_errors_total
Aggregate client operation errors every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=100602 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_gitserver_client_errors_total{job=~"^worker.*"}[5m]))
worker: gitserver_client_error_rate
Aggregate client operation error rate over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=100603 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_gitserver_client_errors_total{job=~"^worker.*"}[5m])) / (sum(increase(src_gitserver_client_total{job=~"^worker.*"}[5m])) + sum(increase(src_gitserver_client_errors_total{job=~"^worker.*"}[5m]))) * 100
worker: gitserver_client_total
Client operations every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=100610 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (op)(increase(src_gitserver_client_total{job=~"^worker.*"}[5m]))
worker: gitserver_client_99th_percentile_duration
99th percentile successful client operation duration over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=100611 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.99, sum by (le,op)(rate(src_gitserver_client_duration_seconds_bucket{job=~"^worker.*"}[5m])))
worker: gitserver_client_errors_total
Client operation errors every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=100612 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (op)(increase(src_gitserver_client_errors_total{job=~"^worker.*"}[5m]))
worker: gitserver_client_error_rate
Client operation error rate over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=100613 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (op)(increase(src_gitserver_client_errors_total{job=~"^worker.*"}[5m])) / (sum by (op)(increase(src_gitserver_client_total{job=~"^worker.*"}[5m])) + sum by (op)(increase(src_gitserver_client_errors_total{job=~"^worker.*"}[5m]))) * 100
Worker: Repositories
worker: syncer_sync_last_time
Time since last sync
A high value here indicates issues synchronizing repo metadata. If the value is persistently high, make sure all external services have valid tokens.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=100700 on your Sourcegraph instance.
Technical details
Query:
SHELLmax(timestamp(vector(time()))) - max(src_repoupdater_syncer_sync_last_time)
worker: src_repoupdater_max_sync_backoff
Time since oldest sync
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=100701 on your Sourcegraph instance.
Technical details
Query:
SHELLmax(src_repoupdater_max_sync_backoff)
worker: src_repoupdater_syncer_sync_errors_total
Site level external service sync error rate
Refer to the alerts reference for 2 alerts related to this panel.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=100702 on your Sourcegraph instance.
Technical details
Query:
SHELLmax by (family) (rate(src_repoupdater_syncer_sync_errors_total{owner!="user",reason!="invalid_npm_path",reason!="internal_rate_limit"}[5m]))
worker: syncer_sync_start
Repo metadata sync was started
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=100710 on your Sourcegraph instance.
Technical details
Query:
SHELLmax by (family) (rate(src_repoupdater_syncer_start_sync{family="Syncer.SyncExternalService"}[9h0m0s]))
worker: syncer_sync_duration
95th percentile repositories sync duration
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=100711 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.95, max by (le, family, success) (rate(src_repoupdater_syncer_sync_duration_seconds_bucket[1m])))
worker: source_duration
95th percentile repositories source duration
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=100712 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.95, max by (le) (rate(src_repoupdater_source_duration_seconds_bucket[1m])))
worker: syncer_synced_repos
Repositories synced
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=100720 on your Sourcegraph instance.
Technical details
Query:
SHELLmax(rate(src_repoupdater_syncer_synced_repos_total[1m]))
worker: sourced_repos
Repositories sourced
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=100721 on your Sourcegraph instance.
Technical details
Query:
SHELLmax(rate(src_repoupdater_source_repos_total[1m]))
worker: sched_auto_fetch
Repositories scheduled due to hitting a deadline
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=100730 on your Sourcegraph instance.
Technical details
Query:
SHELLmax(rate(src_repoupdater_sched_auto_fetch[1m]))
worker: sched_manual_fetch
Repositories scheduled due to user traffic
Check worker logs if this value is persistently high. This value is not meaningful if there are no user-added code hosts.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=100731 on your Sourcegraph instance.
Technical details
Query:
SHELLmax(rate(src_repoupdater_sched_manual_fetch[1m]))
worker: sched_loops
Scheduler loops
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=100740 on your Sourcegraph instance.
Technical details
Query:
SHELLmax(rate(src_repoupdater_sched_loops[1m]))
worker: src_repoupdater_stale_repos
Repos that haven't been fetched in more than 8 hours
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=100741 on your Sourcegraph instance.
Technical details
Query:
SHELLmax(src_repoupdater_stale_repos)
worker: sched_error
Repositories schedule error rate
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=100742 on your Sourcegraph instance.
Technical details
Query:
SHELLmax(rate(src_repoupdater_sched_error[1m]))
Worker: Repo state syncer
worker: state_syncer_running
State syncer is running
1, if the state syncer is currently running
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=100800 on your Sourcegraph instance.
Technical details
Query:
SHELLmax (src_repo_statesyncer_running)
worker: repos_deleted_total
Total number of repos deleted
The total number of repos deleted across all gitservers by the state syncer. A high number here is not necessarily an issue; dig deeper into the other charts in this section to determine whether those deletions were correct.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=100801 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(src_repo_statesyncer_repos_deleted)
worker: repos_deleted_from_primary_total
Total number of repos deleted from primary
The total number of repos deleted from the primary shard. Check the reasons for why they were deleted.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=100802 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (reason) (src_repo_statesyncer_repos_deleted{is_primary="true"})
worker: repos_deleted_from_secondary_total
Total number of repos deleted from secondary
The total number of repos deleted from secondary shards. Check the reasons for why they were deleted.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=100803 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (reason) (src_repo_statesyncer_repos_deleted{is_primary="false"})
Worker: External services
worker: src_repoupdater_external_services_total
The total number of external services
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=100900 on your Sourcegraph instance.
Technical details
Query:
SHELLmax(src_repoupdater_external_services_total)
worker: repoupdater_queued_sync_jobs_total
The total number of queued sync jobs
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=100910 on your Sourcegraph instance.
Technical details
Query:
SHELLmax(src_repoupdater_queued_sync_jobs_total)
worker: repoupdater_completed_sync_jobs_total
The total number of completed sync jobs
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=100911 on your Sourcegraph instance.
Technical details
Query:
SHELLmax(src_repoupdater_completed_sync_jobs_total)
worker: repoupdater_errored_sync_jobs_percentage
The percentage of external services that have failed their most recent sync
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=100912 on your Sourcegraph instance.
Technical details
Query:
SHELLmax(src_repoupdater_errored_sync_jobs_percentage)
worker: github_graphql_rate_limit_remaining
Remaining calls to GitHub graphql API before hitting the rate limit
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=100920 on your Sourcegraph instance.
Technical details
Query:
SHELLmax by (name) (src_github_rate_limit_remaining_v2{resource="graphql"})
worker: github_rest_rate_limit_remaining
Remaining calls to GitHub rest API before hitting the rate limit
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=100921 on your Sourcegraph instance.
Technical details
Query:
SHELLmax by (name) (src_github_rate_limit_remaining_v2{resource="rest"})
worker: github_search_rate_limit_remaining
Remaining calls to GitHub search API before hitting the rate limit
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=100922 on your Sourcegraph instance.
Technical details
Query:
SHELLmax by (name) (src_github_rate_limit_remaining_v2{resource="search"})
worker: github_graphql_rate_limit_wait_duration
Time spent waiting for the GitHub graphql API rate limiter
Indicates how long we're waiting on the rate limit once it has been exceeded
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=100930 on your Sourcegraph instance.
Technical details
Query:
SHELLmax by(name) (rate(src_github_rate_limit_wait_duration_seconds{resource="graphql"}[5m]))
worker: github_rest_rate_limit_wait_duration
Time spent waiting for the GitHub rest API rate limiter
Indicates how long we're waiting on the rate limit once it has been exceeded
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=100931 on your Sourcegraph instance.
Technical details
Query:
SHELLmax by(name) (rate(src_github_rate_limit_wait_duration_seconds{resource="rest"}[5m]))
worker: github_search_rate_limit_wait_duration
Time spent waiting for the GitHub search API rate limiter
Indicates how long we're waiting on the rate limit once it has been exceeded
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=100932 on your Sourcegraph instance.
Technical details
Query:
SHELLmax by(name) (rate(src_github_rate_limit_wait_duration_seconds{resource="search"}[5m]))
worker: gitlab_rest_rate_limit_remaining
Remaining calls to GitLab rest API before hitting the rate limit
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=100940 on your Sourcegraph instance.
Technical details
Query:
SHELLmax by (name) (src_gitlab_rate_limit_remaining{resource="rest"})
worker: gitlab_rest_rate_limit_wait_duration
Time spent waiting for the GitLab rest API rate limiter
Indicates how long we're waiting on the rate limit once it has been exceeded
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=100941 on your Sourcegraph instance.
Technical details
Query:
SHELLmax by (name) (rate(src_gitlab_rate_limit_wait_duration_seconds{resource="rest"}[5m]))
worker: src_internal_rate_limit_wait_duration_bucket
95th percentile time spent successfully waiting on our internal rate limiter
Indicates how long we're waiting on our internal rate limiter when communicating with a code host
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=100950 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.95, sum(rate(src_internal_rate_limit_wait_duration_bucket{failed="false"}[5m])) by (le, urn))
worker: src_internal_rate_limit_wait_error_count
Rate of failures waiting on our internal rate limiter
The rate at which requests fail while waiting on our internal rate limiter.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=100951 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (urn) (rate(src_internal_rate_limit_wait_duration_count{failed="true"}[5m]))
Worker: Permissions
worker: user_success_syncs_total
Total number of user permissions syncs
Indicates the total number of user permissions syncs completed.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=101000 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(src_repo_perms_syncer_success_syncs{type="user"})
worker: user_success_syncs
Number of user permissions syncs [5m]
Indicates the number of user permissions syncs completed.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=101001 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_repo_perms_syncer_success_syncs{type="user"}[5m]))
worker: user_initial_syncs
Number of first user permissions syncs [5m]
Indicates the number of permissions syncs done for the first time for the user.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=101002 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_repo_perms_syncer_initial_syncs{type="user"}[5m]))
worker: repo_success_syncs_total
Total number of repo permissions syncs
Indicates the total number of repo permissions syncs completed.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=101010 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(src_repo_perms_syncer_success_syncs{type="repo"})
worker: repo_success_syncs
Number of repo permissions syncs over 5m
Indicates the number of repo permissions syncs completed.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=101011 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_repo_perms_syncer_success_syncs{type="repo"}[5m]))
worker: repo_initial_syncs
Number of first repo permissions syncs over 5m
Indicates the number of permissions syncs done for the first time for the repo.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=101012 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_repo_perms_syncer_initial_syncs{type="repo"}[5m]))
worker: users_consecutive_sync_delay
Max duration between two consecutive permissions sync for user
Indicates the max delay between two consecutive permissions syncs for a user during the period.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=101020 on your Sourcegraph instance.
Technical details
Query:
SHELLmax(max_over_time (src_repo_perms_syncer_perms_consecutive_sync_delay{type="user"} [1m]))
worker: repos_consecutive_sync_delay
Max duration between two consecutive permissions sync for repo
Indicates the max delay between two consecutive permissions syncs for a repo during the period.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=101021 on your Sourcegraph instance.
Technical details
Query:
SHELLmax(max_over_time (src_repo_perms_syncer_perms_consecutive_sync_delay{type="repo"} [1m]))
worker: users_first_sync_delay
Max duration between user creation and first permissions sync
Indicates the max delay between user creation and their first permissions sync.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=101030 on your Sourcegraph instance.
Technical details
Query:
SHELLmax(max_over_time(src_repo_perms_syncer_perms_first_sync_delay{type="user"}[1m]))
worker: repos_first_sync_delay
Max duration between repo creation and first permissions sync over 1m
Indicates the max delay between repo creation and its first permissions sync.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=101031 on your Sourcegraph instance.
Technical details
Query:
SHELLmax(max_over_time(src_repo_perms_syncer_perms_first_sync_delay{type="repo"}[1m]))
worker: permissions_found_count
Number of permissions found during user/repo permissions sync
Indicates the number of permissions found during user/repo permissions syncs.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=101040 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (type) (src_repo_perms_syncer_perms_found)
worker: permissions_found_avg
Average number of permissions found during permissions sync per user/repo
Indicates the average number of permissions found per user/repo during permissions syncs.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=101041 on your Sourcegraph instance.
Technical details
Query:
SHELLavg by (type) (src_repo_perms_syncer_perms_found)
worker: perms_syncer_outdated_perms
Number of entities with outdated permissions
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=101050 on your Sourcegraph instance.
Technical details
Query:
SHELLmax by (type) (src_repo_perms_syncer_outdated_perms)
worker: perms_syncer_sync_duration
95th permissions sync duration
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=101060 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.95, max by (le, type) (rate(src_repo_perms_syncer_sync_duration_seconds_bucket[1m])))
worker: perms_syncer_sync_errors
Permissions sync error rate
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=101070 on your Sourcegraph instance.
Technical details
Query:
SHELLmax by (type) (ceil(rate(src_repo_perms_syncer_sync_errors_total[1m])))
worker: perms_syncer_scheduled_repos_total
Total number of repos scheduled for permissions sync
Indicates how many repositories have been scheduled for a permissions sync. See the repository permissions synchronization documentation for more details.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=101071 on your Sourcegraph instance.
Technical details
Query:
SHELLmax(rate(src_repo_perms_syncer_schedule_repos_total[1m]))
Worker: Gitserver: Gitserver Client
worker: gitserver_client_total
Aggregate client operations every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=101100 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_gitserver_client_total{job=~"^worker.*"}[5m]))
worker: gitserver_client_99th_percentile_duration
Aggregate successful client operation duration distribution over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=101101 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (le)(rate(src_gitserver_client_duration_seconds_bucket{job=~"^worker.*"}[5m]))
worker: gitserver_client_errors_total
Aggregate client operation errors every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=101102 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_gitserver_client_errors_total{job=~"^worker.*"}[5m]))
worker: gitserver_client_error_rate
Aggregate client operation error rate over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=101103 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_gitserver_client_errors_total{job=~"^worker.*"}[5m])) / (sum(increase(src_gitserver_client_total{job=~"^worker.*"}[5m])) + sum(increase(src_gitserver_client_errors_total{job=~"^worker.*"}[5m]))) * 100
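Most of the aggregate error-rate panels on this dashboard share the shape errors / (operations + errors) * 100. As a hedged illustration with made-up numbers: 2 in the errors series against 98 in the operations series over the window gives 2 / (98 + 2) * 100 = 2%.
SHELL# Illustrative arithmetic only; the counts are assumed numbers, not real metric values.
awk 'BEGIN { errors = 2; ops = 98; printf "%.1f%%\n", errors / (ops + errors) * 100 }'
# Prints: 2.0%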
worker: gitserver_client_total
Client operations every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=101110 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (op,scope)(increase(src_gitserver_client_total{job=~"^worker.*"}[5m]))
worker: gitserver_client_99th_percentile_duration
99th percentile successful client operation duration over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=101111 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.99, sum by (le,op,scope)(rate(src_gitserver_client_duration_seconds_bucket{job=~"^worker.*"}[5m])))
worker: gitserver_client_errors_total
Client operation errors every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=101112 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (op,scope)(increase(src_gitserver_client_errors_total{job=~"^worker.*"}[5m]))
worker: gitserver_client_error_rate
Client operation error rate over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=101113 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (op,scope)(increase(src_gitserver_client_errors_total{job=~"^worker.*"}[5m])) / (sum by (op,scope)(increase(src_gitserver_client_total{job=~"^worker.*"}[5m])) + sum by (op,scope)(increase(src_gitserver_client_errors_total{job=~"^worker.*"}[5m]))) * 100
Worker: Gitserver: Gitserver Repository Service Client
worker: gitserver_repositoryservice_client_total
Aggregate client operations every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=101200 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_gitserver_repositoryservice_client_total{job=~"^worker.*"}[5m]))
worker: gitserver_repositoryservice_client_99th_percentile_duration
Aggregate successful client operation duration distribution over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=101201 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (le)(rate(src_gitserver_repositoryservice_client_duration_seconds_bucket{job=~"^worker.*"}[5m]))
worker: gitserver_repositoryservice_client_errors_total
Aggregate client operation errors every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=101202 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_gitserver_repositoryservice_client_errors_total{job=~"^worker.*"}[5m]))
worker: gitserver_repositoryservice_client_error_rate
Aggregate client operation error rate over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=101203 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_gitserver_repositoryservice_client_errors_total{job=~"^worker.*"}[5m])) / (sum(increase(src_gitserver_repositoryservice_client_total{job=~"^worker.*"}[5m])) + sum(increase(src_gitserver_repositoryservice_client_errors_total{job=~"^worker.*"}[5m]))) * 100
worker: gitserver_repositoryservice_client_total
Client operations every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=101210 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (op,scope)(increase(src_gitserver_repositoryservice_client_total{job=~"^worker.*"}[5m]))
worker: gitserver_repositoryservice_client_99th_percentile_duration
99th percentile successful client operation duration over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=101211 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.99, sum by (le,op,scope)(rate(src_gitserver_repositoryservice_client_duration_seconds_bucket{job=~"^worker.*"}[5m])))
worker: gitserver_repositoryservice_client_errors_total
Client operation errors every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=101212 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (op,scope)(increase(src_gitserver_repositoryservice_client_errors_total{job=~"^worker.*"}[5m]))
worker: gitserver_repositoryservice_client_error_rate
Client operation error rate over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=101213 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (op,scope)(increase(src_gitserver_repositoryservice_client_errors_total{job=~"^worker.*"}[5m])) / (sum by (op,scope)(increase(src_gitserver_repositoryservice_client_total{job=~"^worker.*"}[5m])) + sum by (op,scope)(increase(src_gitserver_repositoryservice_client_errors_total{job=~"^worker.*"}[5m]))) * 100
Worker: Batches: dbstore stats
worker: batches_dbstore_total
Aggregate store operations every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=101300 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_batches_dbstore_total{job=~"^worker.*"}[5m]))
worker: batches_dbstore_99th_percentile_duration
Aggregate successful store operation duration distribution over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=101301 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (le)(rate(src_batches_dbstore_duration_seconds_bucket{job=~"^worker.*"}[5m]))
worker: batches_dbstore_errors_total
Aggregate store operation errors every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=101302 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_batches_dbstore_errors_total{job=~"^worker.*"}[5m]))
worker: batches_dbstore_error_rate
Aggregate store operation error rate over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=101303 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_batches_dbstore_errors_total{job=~"^worker.*"}[5m])) / (sum(increase(src_batches_dbstore_total{job=~"^worker.*"}[5m])) + sum(increase(src_batches_dbstore_errors_total{job=~"^worker.*"}[5m]))) * 100
worker: batches_dbstore_total
Store operations every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=101310 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (op)(increase(src_batches_dbstore_total{job=~"^worker.*"}[5m]))
worker: batches_dbstore_99th_percentile_duration
99th percentile successful store operation duration over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=101311 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.99, sum by (le,op)(rate(src_batches_dbstore_duration_seconds_bucket{job=~"^worker.*"}[5m])))
worker: batches_dbstore_errors_total
Store operation errors every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=101312 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (op)(increase(src_batches_dbstore_errors_total{job=~"^worker.*"}[5m]))
worker: batches_dbstore_error_rate
Store operation error rate over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=101313 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (op)(increase(src_batches_dbstore_errors_total{job=~"^worker.*"}[5m])) / (sum by (op)(increase(src_batches_dbstore_total{job=~"^worker.*"}[5m])) + sum by (op)(increase(src_batches_dbstore_errors_total{job=~"^worker.*"}[5m]))) * 100
Worker: Batches: service stats
worker: batches_service_total
Aggregate service operations every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=101400 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_batches_service_total{job=~"^worker.*"}[5m]))
worker: batches_service_99th_percentile_duration
Aggregate successful service operation duration distribution over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=101401 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (le)(rate(src_batches_service_duration_seconds_bucket{job=~"^worker.*"}[5m]))
worker: batches_service_errors_total
Aggregate service operation errors every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=101402 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_batches_service_errors_total{job=~"^worker.*"}[5m]))
worker: batches_service_error_rate
Aggregate service operation error rate over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=101403 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_batches_service_errors_total{job=~"^worker.*"}[5m])) / (sum(increase(src_batches_service_total{job=~"^worker.*"}[5m])) + sum(increase(src_batches_service_errors_total{job=~"^worker.*"}[5m]))) * 100
worker: batches_service_total
Service operations every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=101410 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (op)(increase(src_batches_service_total{job=~"^worker.*"}[5m]))
worker: batches_service_99th_percentile_duration
99th percentile successful service operation duration over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=101411 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.99, sum by (le,op)(rate(src_batches_service_duration_seconds_bucket{job=~"^worker.*"}[5m])))
worker: batches_service_errors_total
Service operation errors every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=101412 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (op)(increase(src_batches_service_errors_total{job=~"^worker.*"}[5m]))
worker: batches_service_error_rate
Service operation error rate over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=101413 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (op)(increase(src_batches_service_errors_total{job=~"^worker.*"}[5m])) / (sum by (op)(increase(src_batches_service_total{job=~"^worker.*"}[5m])) + sum by (op)(increase(src_batches_service_errors_total{job=~"^worker.*"}[5m]))) * 100
Worker: Codeinsights: insights queue processor
worker: query_runner_worker_handlers
Handler active handlers
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=101500 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(src_query_runner_worker_processor_handlers{job=~"^worker.*"})
worker: query_runner_worker_processor_total
Handler operations every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=101510 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_query_runner_worker_processor_total{job=~"^worker.*"}[5m]))
worker: query_runner_worker_processor_99th_percentile_duration
Aggregate successful handler operation duration distribution over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=101511 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (le)(rate(src_query_runner_worker_processor_duration_seconds_bucket{job=~"^worker.*"}[5m]))
worker: query_runner_worker_processor_errors_total
Handler operation errors every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=101512 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_query_runner_worker_processor_errors_total{job=~"^worker.*"}[5m]))
worker: query_runner_worker_processor_error_rate
Handler operation error rate over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=101513 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_query_runner_worker_processor_errors_total{job=~"^worker.*"}[5m])) / (sum(increase(src_query_runner_worker_processor_total{job=~"^worker.*"}[5m])) + sum(increase(src_query_runner_worker_processor_errors_total{job=~"^worker.*"}[5m]))) * 100
Worker: Codeinsights: dbstore stats
worker: workerutil_dbworker_store_total
Aggregate store operations every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=101600 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_workerutil_dbworker_store_total{domain='insights_query_runner_jobs',job=~"^worker.*"}[5m]))
worker: workerutil_dbworker_store_99th_percentile_duration
Aggregate successful store operation duration distribution over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=101601 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (le)(rate(src_workerutil_dbworker_store_duration_seconds_bucket{domain='insights_query_runner_jobs',job=~"^worker.*"}[5m]))
worker: workerutil_dbworker_store_errors_total
Aggregate store operation errors every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=101602 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_workerutil_dbworker_store_errors_total{domain='insights_query_runner_jobs',job=~"^worker.*"}[5m]))
worker: workerutil_dbworker_store_error_rate
Aggregate store operation error rate over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=101603 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_workerutil_dbworker_store_errors_total{domain='insights_query_runner_jobs',job=~"^worker.*"}[5m])) / (sum(increase(src_workerutil_dbworker_store_total{domain='insights_query_runner_jobs',job=~"^worker.*"}[5m])) + sum(increase(src_workerutil_dbworker_store_errors_total{domain='insights_query_runner_jobs',job=~"^worker.*"}[5m]))) * 100
worker: workerutil_dbworker_store_total
Store operations every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=101610 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (op)(increase(src_workerutil_dbworker_store_total{domain='insights_query_runner_jobs',job=~"^worker.*"}[5m]))
worker: workerutil_dbworker_store_99th_percentile_duration
99th percentile successful store operation duration over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=101611 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.99, sum by (le,op)(rate(src_workerutil_dbworker_store_duration_seconds_bucket{domain='insights_query_runner_jobs',job=~"^worker.*"}[5m])))
worker: workerutil_dbworker_store_errors_total
Store operation errors every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=101612 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (op)(increase(src_workerutil_dbworker_store_errors_total{domain='insights_query_runner_jobs',job=~"^worker.*"}[5m]))
worker: workerutil_dbworker_store_error_rate
Store operation error rate over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=101613 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (op)(increase(src_workerutil_dbworker_store_errors_total{domain='insights_query_runner_jobs',job=~"^worker.*"}[5m])) / (sum by (op)(increase(src_workerutil_dbworker_store_total{domain='insights_query_runner_jobs',job=~"^worker.*"}[5m])) + sum by (op)(increase(src_workerutil_dbworker_store_errors_total{domain='insights_query_runner_jobs',job=~"^worker.*"}[5m]))) * 100
Worker: Completion Credits Entitlement Usage Aggregator: Completion credits entitlement usage aggregations
worker: completioncredits_aggregator_total
Completion credits entitlement usage aggregator operations every 30m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=101700 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_completioncredits_aggregator_total{job=~"^worker.*"}[30m]))
worker: completioncredits_aggregator_99th_percentile_duration
Aggregate successful completion credits entitlement usage aggregator operation duration distribution over 30m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=101701 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (le)(rate(src_completioncredits_aggregator_duration_seconds_bucket{job=~"^worker.*"}[30m]))
worker: completioncredits_aggregator_errors_total
Completion credits entitlement usage aggregator operation errors every 30m
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=101702 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_completioncredits_aggregator_errors_total{job=~"^worker.*"}[30m]))
worker: completioncredits_aggregator_error_rate
Completion credits entitlement usage aggregator operation error rate over 30m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=101703 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_completioncredits_aggregator_errors_total{job=~"^worker.*"}[30m])) / (sum(increase(src_completioncredits_aggregator_total{job=~"^worker.*"}[30m])) + sum(increase(src_completioncredits_aggregator_errors_total{job=~"^worker.*"}[30m]))) * 100
Worker: Periodic Goroutines
worker: running_goroutines
Number of currently running periodic goroutines
The number of currently running periodic goroutines by name and job.
A value of 0 indicates the routine isn't currently running; it is awaiting its next scheduled run.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=101800 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (name, job_name) (src_periodic_goroutine_running{job=~".*worker.*"})
worker: goroutine_success_rate
Success rate for periodic goroutine executions
The rate of successful executions of each periodic goroutine. A low or zero value could indicate that a routine is stalled or encountering errors.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=101801 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (name, job_name) (rate(src_periodic_goroutine_total{job=~".*worker.*"}[5m]))
worker: goroutine_error_rate
Error rate for periodic goroutine executions
The rate of errors encountered by each periodic goroutine. A sustained high error rate may indicate a problem with the routine's configuration or dependencies.
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=101810 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (name, job_name) (rate(src_periodic_goroutine_errors_total{job=~".*worker.*"}[5m]))
worker: goroutine_error_percentage
Percentage of periodic goroutine executions that result in errors
The percentage of executions that result in errors for each periodic goroutine. A value above 5% indicates that a significant portion of routine executions are failing.
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=101811 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (name, job_name) (rate(src_periodic_goroutine_errors_total{job=~".*worker.*"}[5m])) / sum by (name, job_name) (rate(src_periodic_goroutine_total{job=~".*worker.*"}[5m]) > 0) * 100
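Note the > 0 guard in the denominator: it keeps only series whose execution rate is non-zero over the window, so routines that have not run produce no data point instead of a division-by-zero artifact. The same expression is shown below reformatted, with no behavioral change intended.
SHELL# Same query as above, reformatted; the "> 0" guard keeps only series that executed at least once.
sum by (name, job_name) (rate(src_periodic_goroutine_errors_total{job=~".*worker.*"}[5m]))
  /
sum by (name, job_name) (rate(src_periodic_goroutine_total{job=~".*worker.*"}[5m]) > 0)
  * 100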
worker: goroutine_handler_duration
95th percentile handler execution time
The 95th percentile execution time for each periodic goroutine handler. Longer durations might indicate increased load or processing time.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=101820 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.95, sum by (name, job_name, le) (rate(src_periodic_goroutine_duration_seconds_bucket{job=~".*worker.*"}[5m])))
worker: goroutine_loop_duration
95th percentile loop cycle time
The 95th percentile loop cycle time for each periodic goroutine (excluding sleep time). This represents how long a complete loop iteration takes before sleeping for the next interval.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=101821 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.95, sum by (name, job_name, le) (rate(src_periodic_goroutine_loop_duration_seconds_bucket{job=~".*worker.*"}[5m])))
worker: tenant_processing_duration
95th percentile tenant processing time
The 95th percentile processing time for individual tenants within periodic goroutines. Higher values indicate that tenant processing is taking longer and may affect overall performance.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=101830 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.95, sum by (name, job_name, le) (rate(src_periodic_goroutine_tenant_duration_seconds_bucket{job=~".*worker.*"}[5m])))
worker: tenant_processing_max
Maximum tenant processing time
The maximum processing time for individual tenants within periodic goroutines. Consistently high values might indicate problematic tenants or inefficient processing.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=101831 on your Sourcegraph instance.
Technical details
Query:
SHELLmax by (name, job_name) (rate(src_periodic_goroutine_tenant_duration_seconds_sum{job=~".*worker.*"}[5m]) / rate(src_periodic_goroutine_tenant_duration_seconds_count{job=~".*worker.*"}[5m]))
worker: tenant_count
Number of tenants processed per routine
The number of tenants processed by each periodic goroutine. Unexpected changes can indicate tenant configuration issues or scaling events.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=101840 on your Sourcegraph instance.
Technical details
Query:
SHELLmax by (name, job_name) (src_periodic_goroutine_tenant_count{job=~".*worker.*"})
worker: tenant_success_rate
Rate of successful tenant processing operations
The rate of successful tenant processing operations. A healthy routine should maintain a consistent processing rate.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=101841 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (name, job_name) (rate(src_periodic_goroutine_tenant_success_total{job=~".*worker.*"}[5m]))
worker: tenant_error_rate
Rate of tenant processing errors
The rate of tenant processing operations that result in errors. Consistent errors indicate problems with specific tenants.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=101850 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (name, job_name) (rate(src_periodic_goroutine_tenant_errors_total{job=~".*worker.*"}[5m]))
worker: tenant_error_percentage
Percentage of tenant operations resulting in errors
The percentage of tenant operations that result in errors. Values above 5% indicate significant tenant processing problems.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=101851 on your Sourcegraph instance.
Technical details
Query:
SHELL(sum by (name, job_name) (rate(src_periodic_goroutine_tenant_errors_total{job=~".*worker.*"}[5m])) / (sum by (name, job_name) (rate(src_periodic_goroutine_tenant_success_total{job=~".*worker.*"}[5m])) + sum by (name, job_name) (rate(src_periodic_goroutine_tenant_errors_total{job=~".*worker.*"}[5m])))) * 100
Worker: Database connections
worker: max_open_conns
Maximum open
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=101900 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (app_name, db_name) (src_pgsql_conns_max_open{app_name="worker"})
worker: open_conns
Established
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=101901 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (app_name, db_name) (src_pgsql_conns_open{app_name="worker"})
worker: in_use
Used
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=101910 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (app_name, db_name) (src_pgsql_conns_in_use{app_name="worker"})
worker: idle
Idle
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=101911 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (app_name, db_name) (src_pgsql_conns_idle{app_name="worker"})
worker: mean_blocked_seconds_per_conn_request
Mean blocked seconds per conn request
Refer to the alerts reference for 2 alerts related to this panel.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=101920 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (app_name, db_name) (increase(src_pgsql_conns_blocked_seconds{app_name="worker"}[5m])) / sum by (app_name, db_name) (increase(src_pgsql_conns_waited_for{app_name="worker"}[5m]))
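As a rough illustration of what this ratio reports (made-up numbers, and assuming src_pgsql_conns_waited_for counts connection requests that had to wait, as the panel title suggests): 0.6 blocked seconds spread across 30 waiting requests averages to 0.02s blocked per request.
SHELL# Illustrative arithmetic only; 0.6s blocked and 30 waiting requests are assumed numbers.
awk 'BEGIN { blocked = 0.6; waited = 30; printf "%.3fs blocked per request\n", blocked / waited }'
# Prints: 0.020s blocked per request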
worker: closed_max_idle
Closed by SetMaxIdleConns
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=101930 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (app_name, db_name) (increase(src_pgsql_conns_closed_max_idle{app_name="worker"}[5m]))
worker: closed_max_lifetime
Closed by SetConnMaxLifetime
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=101931 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (app_name, db_name) (increase(src_pgsql_conns_closed_max_lifetime{app_name="worker"}[5m]))
worker: closed_max_idle_time
Closed by SetConnMaxIdleTime
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=101932 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (app_name, db_name) (increase(src_pgsql_conns_closed_max_idle_time{app_name="worker"}[5m]))
Worker: Worker (CPU, Memory)
worker: cpu_usage_percentage
CPU usage
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=102000 on your Sourcegraph instance.
Technical details
Query:
SHELLcadvisor_container_cpu_usage_percentage_total{name=~"^worker.*"}
worker: memory_usage_percentage
Memory usage percentage (total)
An estimate for the active memory in use, which includes anonymous memory, file memory, and kernel memory. Some of this memory is reclaimable, so high usage does not necessarily indicate memory pressure.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=102001 on your Sourcegraph instance.
Technical details
Query:
SHELLcadvisor_container_memory_usage_percentage_total{name=~"^worker.*"}
worker: memory_working_set_bytes
Memory usage bytes (total)
An estimate for the active memory in use in bytes, which includes anonymous memory, file memory, and kernel memory. Some of this memory is reclaimable, so high usage does not necessarily indicate memory pressure.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=102002 on your Sourcegraph instance.
Technical details
Query:
SHELLmax by (name) (container_memory_working_set_bytes{name=~"^worker.*"})
worker: memory_rss
Memory (RSS)
The total anonymous memory in use by the application, which includes Go stack and heap. This memory is non-reclaimable, and high usage may trigger OOM kills. Note: the metric is named RSS to match the cadvisor metric name, but anonymous memory is more accurate.
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=102010 on your Sourcegraph instance.
Technical details
Query:
SHELLmax(container_memory_rss{name=~"^worker.*"} / container_spec_memory_limit_bytes{name=~"^worker.*"}) by (name) * 100.0
worker: memory_total_active_file
Memory usage (active file)
This metric shows the total active file-backed memory currently in use by the application. Some of it may be reclaimable, so high usage does not necessarily indicate memory pressure.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=102011 on your Sourcegraph instance.
Technical details
Query:
SHELLmax(container_memory_total_active_file_bytes{name=~"^worker.*"} / container_spec_memory_limit_bytes{name=~"^worker.*"}) by (name) * 100.0
worker: memory_kernel_usage
Memory usage (kernel)
The kernel usage metric shows the amount of memory used by the kernel on behalf of the application. Some of it may be reclaimable, so high usage does not necessarily indicate memory pressure.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=102012 on your Sourcegraph instance.
Technical details
Query:
SHELLmax(container_memory_kernel_usage{name=~"^worker.*"} / container_spec_memory_limit_bytes{name=~"^worker.*"}) by (name) * 100.0
Worker: Container monitoring (not available on server)
worker: container_missing
Container missing
This value is the number of times a container has not been seen for more than one minute. If you observe this value change independent of deployment events (such as an upgrade), it could indicate pods are being OOM killed or terminated for some other reason (a consolidated command sketch follows the list below).
- Kubernetes:
  - Determine if the pod was OOM killed using kubectl describe pod worker (look for OOMKilled: true) and, if so, consider increasing the memory limit in the relevant Deployment.yaml.
  - Check the logs before the container restarted to see if there are panic: messages or similar using kubectl logs -p worker.
- Docker Compose:
  - Determine if the pod was OOM killed using docker inspect -f '{{json .State}}' worker (look for "OOMKilled":true) and, if so, consider increasing the memory limit of the worker container in docker-compose.yml.
  - Check the logs before the container restarted to see if there are panic: messages or similar using docker logs worker (note this will include logs from the previous and currently running container).
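A minimal command sketch consolidating the checks above; it assumes shell access to the environment running the worker container and that the pod/container is named worker (adjust names to match your deployment).
SHELL# Kubernetes: check for OOM kills, then inspect logs from the previous container.
kubectl describe pod worker | grep -i oomkilled
kubectl logs -p worker | grep -i 'panic:'
# Docker Compose: check the container state, then inspect logs.
docker inspect -f '{{json .State}}' worker | grep -i oomkilled
docker logs worker 2>&1 | grep -i 'panic:'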
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=102100 on your Sourcegraph instance.
Technical details
Query:
SHELLcount by(name) ((time() - container_last_seen{name=~"^worker.*"}) > 60)
worker: container_cpu_usage
Container cpu usage total (1m average) across all cores by instance
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=102101 on your Sourcegraph instance.
Technical details
Query:
SHELLcadvisor_container_cpu_usage_percentage_total{name=~"^worker.*"}
worker: container_memory_usage
Container memory usage by instance
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=102102 on your Sourcegraph instance.
Technical details
Query:
SHELLcadvisor_container_memory_usage_percentage_total{name=~"^worker.*"}
worker: fs_io_operations
Filesystem reads and writes rate by instance over 1h
This value indicates the number of filesystem read and write operations by containers of this service. When extremely high, this can indicate a resource usage problem, or can cause problems with the service itself, especially if high values or spikes correlate with worker issues.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=102103 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by(name) (rate(container_fs_reads_total{name=~"^worker.*"}[1h]) + rate(container_fs_writes_total{name=~"^worker.*"}[1h]))
Worker: Provisioning indicators (not available on server)
worker: provisioning_container_cpu_usage_long_term
Container cpu usage total (90th percentile over 1d) across all cores by instance
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=102200 on your Sourcegraph instance.
Technical details
Query:
SHELLquantile_over_time(0.9, cadvisor_container_cpu_usage_percentage_total{name=~"^worker.*"}[1d])
worker: provisioning_container_memory_usage_long_term
Container memory usage (1d maximum) by instance
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=102201 on your Sourcegraph instance.
Technical details
Query:
SHELLmax_over_time(cadvisor_container_memory_usage_percentage_total{name=~"^worker.*"}[1d])
worker: provisioning_container_cpu_usage_short_term
Container cpu usage total (5m maximum) across all cores by instance
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=102210 on your Sourcegraph instance.
Technical details
Query:
SHELLmax_over_time(cadvisor_container_cpu_usage_percentage_total{name=~"^worker.*"}[5m])
worker: provisioning_container_memory_usage_short_term
Container memory usage (5m maximum) by instance
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=102211 on your Sourcegraph instance.
Technical details
Query:
SHELLmax_over_time(cadvisor_container_memory_usage_percentage_total{name=~"^worker.*"}[5m])
worker: container_oomkill_events_total
Container OOMKILL events total by instance
This value indicates the total number of times the container main process or child processes were terminated by OOM killer. When it occurs frequently, it is an indicator of underprovisioning.
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=102212 on your Sourcegraph instance.
Technical details
Query:
SHELLmax by (name) (container_oom_events_total{name=~"^worker.*"})
Worker: Golang runtime monitoring
worker: go_goroutines
Maximum active goroutines
A high value here indicates a possible goroutine leak.
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=102300 on your Sourcegraph instance.
Technical details
Query:
SHELLmax by(instance) (go_goroutines{job=~".*worker"})
worker: go_gc_duration_seconds
Maximum go garbage collection duration
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=102301 on your Sourcegraph instance.
Technical details
Query:
SHELLmax by(instance) (go_gc_duration_seconds{job=~".*worker"})
Worker: Kubernetes monitoring (only available on Kubernetes)
worker: pods_available_percentage
Percentage pods available
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=102400 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by(app) (up{app=~".*worker"}) / count by (app) (up{app=~".*worker"}) * 100
Worker: Own: repo indexer dbstore
worker: workerutil_dbworker_store_total
Aggregate store operations every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=102500 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_workerutil_dbworker_store_total{domain='own_background_worker_store',job=~"^worker.*"}[5m]))
worker: workerutil_dbworker_store_99th_percentile_duration
Aggregate successful store operation duration distribution over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=102501 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (le)(rate(src_workerutil_dbworker_store_duration_seconds_bucket{domain='own_background_worker_store',job=~"^worker.*"}[5m]))
worker: workerutil_dbworker_store_errors_total
Aggregate store operation errors every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=102502 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_workerutil_dbworker_store_errors_total{domain='own_background_worker_store',job=~"^worker.*"}[5m]))
worker: workerutil_dbworker_store_error_rate
Aggregate store operation error rate over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=102503 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_workerutil_dbworker_store_errors_total{domain='own_background_worker_store',job=~"^worker.*"}[5m])) / (sum(increase(src_workerutil_dbworker_store_total{domain='own_background_worker_store',job=~"^worker.*"}[5m])) + sum(increase(src_workerutil_dbworker_store_errors_total{domain='own_background_worker_store',job=~"^worker.*"}[5m]))) * 100
worker: workerutil_dbworker_store_total
Store operations every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=102510 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (op)(increase(src_workerutil_dbworker_store_total{domain='own_background_worker_store',job=~"^worker.*"}[5m]))
worker: workerutil_dbworker_store_99th_percentile_duration
99th percentile successful store operation duration over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=102511 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.99, sum by (le,op)(rate(src_workerutil_dbworker_store_duration_seconds_bucket{domain='own_background_worker_store',job=~"^worker.*"}[5m])))
worker: workerutil_dbworker_store_errors_total
Store operation errors every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=102512 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (op)(increase(src_workerutil_dbworker_store_errors_total{domain='own_background_worker_store',job=~"^worker.*"}[5m]))
worker: workerutil_dbworker_store_error_rate
Store operation error rate over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=102513 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (op)(increase(src_workerutil_dbworker_store_errors_total{domain='own_background_worker_store',job=~"^worker.*"}[5m])) / (sum by (op)(increase(src_workerutil_dbworker_store_total{domain='own_background_worker_store',job=~"^worker.*"}[5m])) + sum by (op)(increase(src_workerutil_dbworker_store_errors_total{domain='own_background_worker_store',job=~"^worker.*"}[5m]))) * 100
Worker: Own: repo indexer worker queue
worker: own_background_worker_handlers
Handler active handlers
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=102600 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(src_own_background_worker_processor_handlers{job=~"^worker.*"})
worker: own_background_worker_processor_total
Handler operations every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=102610 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_own_background_worker_processor_total{job=~"^worker.*"}[5m]))
worker: own_background_worker_processor_99th_percentile_duration
Aggregate successful handler operation duration distribution over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=102611 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (le)(rate(src_own_background_worker_processor_duration_seconds_bucket{job=~"^worker.*"}[5m]))
worker: own_background_worker_processor_errors_total
Handler operation errors every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=102612 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_own_background_worker_processor_errors_total{job=~"^worker.*"}[5m]))
worker: own_background_worker_processor_error_rate
Handler operation error rate over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=102613 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_own_background_worker_processor_errors_total{job=~"^worker.*"}[5m])) / (sum(increase(src_own_background_worker_processor_total{job=~"^worker.*"}[5m])) + sum(increase(src_own_background_worker_processor_errors_total{job=~"^worker.*"}[5m]))) * 100
Worker: Own: index job scheduler
worker: own_background_index_scheduler_total
Own index job scheduler operations every 10m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=102700 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (op)(increase(src_own_background_index_scheduler_total{job=~"^worker.*"}[10m]))
worker: own_background_index_scheduler_99th_percentile_duration
99th percentile successful own index job scheduler operation duration over 10m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=102701 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.99, sum by (le,op)(rate(src_own_background_index_scheduler_duration_seconds_bucket{job=~"^worker.*"}[10m])))
worker: own_background_index_scheduler_errors_total
Own index job scheduler operation errors every 10m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=102702 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (op)(increase(src_own_background_index_scheduler_errors_total{job=~"^worker.*"}[10m]))
worker: own_background_index_scheduler_error_rate
Own index job scheduler operation error rate over 10m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=102703 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (op)(increase(src_own_background_index_scheduler_errors_total{job=~"^worker.*"}[10m])) / (sum by (op)(increase(src_own_background_index_scheduler_total{job=~"^worker.*"}[10m])) + sum by (op)(increase(src_own_background_index_scheduler_errors_total{job=~"^worker.*"}[10m]))) * 100
Worker: Site configuration client update latency
worker: worker_site_configuration_duration_since_last_successful_update_by_instance
Duration since last successful site configuration update (by instance)
The duration since the configuration client used by the "worker" service last successfully updated its site configuration. Long durations could indicate issues updating the site configuration.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=102800 on your Sourcegraph instance.
Technical details
Query:
SHELLsrc_conf_client_time_since_last_successful_update_seconds{job=~`^worker.*`,instance=~`${instance:regex}`}
worker: worker_site_configuration_duration_since_last_successful_update_by_instance
Maximum duration since last successful site configuration update (all "worker" instances)
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/worker/worker?viewPanel=102801 on your Sourcegraph instance.
Technical details
Query:
SHELLmax(max_over_time(src_conf_client_time_since_last_successful_update_seconds{job=~`^worker.*`,instance=~`${instance:regex}`}[1m]))
Searcher
Performs unindexed searches (diff and commit search, text search for unindexed branches).
To see this dashboard, visit /-/debug/grafana/d/searcher/searcher on your Sourcegraph instance.
searcher: traffic
Requests per second by code over 10m
This graph is the average number of requests per second searcher is experiencing over the last 10 minutes.
The code is the HTTP status code; 200 is success. The special code "canceled" is common for large search requests where enough results are found before all possible repos have been searched.
Note: A search query is translated into an unindexed search query per unique (repo, commit). This means a single user query may result in thousands of requests to searcher.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/searcher/searcher?viewPanel=100000 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (code) (rate(searcher_service_request_total{instance=~`${instance:regex}`}[10m]))
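The dashboard query above filters by the Grafana ${instance:regex} variable; as a hedged ad-hoc sketch (not a dashboard panel, instance filter omitted for brevity), the following estimates what share of all responses are "canceled", which is normal for large searches that hit their result limit early:
SHELLsum(rate(searcher_service_request_total{code="canceled"}[10m])) / sum(rate(searcher_service_request_total[10m])) * 100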
searcher: replica_traffic
Requests per second per replica over 10m
This graph is the average number of requests per second searcher is experiencing over the last 10 minutes broken down per replica.
The code is the HTTP status code; 200 is success. The special code "canceled" is common for large search requests where enough results are found before all possible repos have been searched.
Note: A search query is translated into an unindexed search query per unique (repo, commit). This means a single user query may result in thousands of requests to searcher.
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/searcher/searcher?viewPanel=100001 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (instance) (rate(searcher_service_request_total{instance=~`${instance:regex}`}[10m]))
searcher: concurrent_requests
Amount of in-flight unindexed search requests (per instance)
This graph shows the number of in-flight unindexed search requests per instance. Consistently high numbers here indicate you may need to scale out searcher.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/searcher/searcher?viewPanel=100010 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (instance) (searcher_service_running{instance=~`${instance:regex}`})
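To judge whether this number is "consistently high", it can help to look at the peak rather than the instantaneous value. A minimal sketch, assuming your Prometheus version supports subqueries (instance filter omitted):
SHELLmax_over_time(sum by (instance)(searcher_service_running)[1h:1m])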
searcher: unindexed_search_request_errors
Unindexed search request errors every 5m by code
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/searcher/searcher?viewPanel=100011 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (code)(increase(searcher_service_request_total{code!="200",code!="canceled",instance=~`${instance:regex}`}[5m])) / ignoring(code) group_left sum(increase(searcher_service_request_total{instance=~`${instance:regex}`}[5m])) * 100
Searcher: Cache store
searcher: store_fetching
Amount of in-flight unindexed search requests fetching code from gitserver (per instance)
Before we can search a commit we fetch the code from gitserver then cache it for future search requests. This graph is the current number of search requests which are in the state of fetching code from gitserver.
Generally this number should remain low since fetching code is fast, but expect bursts. In the case of instances with a monorepo you would expect this number to stay low for the duration of fetching the code (which in some cases can take many minutes).
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/searcher/searcher?viewPanel=100100 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (instance) (searcher_store_fetching{instance=~`${instance:regex}`})
searcher: store_fetching_waiting
Amount of in-flight unindexed search requests waiting to fetch code from gitserver (per instance)
We limit the number of requests which can fetch code to prevent overwhelming gitserver. This gauge is the number of requests waiting to be allowed to speak to gitserver.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/searcher/searcher?viewPanel=100101 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (instance) (searcher_store_fetch_queue_size{instance=~`${instance:regex}`})
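Taken together with the previous panel, these two gauges approximate total fetch demand (actively fetching plus queued) per instance. A rough ad-hoc sketch, not a dashboard panel:
SHELLsum by (instance) (searcher_store_fetching) + sum by (instance) (searcher_store_fetch_queue_size)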
searcher: store_fetching_fail
Amount of unindexed search requests that failed while fetching code from gitserver over 10m (per instance)
This graph should be zero since fetching happens in the background and will not be influenced by user timeouts/etc. Expected upticks in this graph occur during gitserver rollouts. If this graph regularly has non-zero values, please reach out to support.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/searcher/searcher?viewPanel=100102 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (instance) (rate(searcher_store_fetch_failed{instance=~`${instance:regex}`}[10m]))
Searcher: Index use
searcher: searcher_hybrid_final_state_total
Hybrid search final state over 10m
This graph is about our interactions with the search index (zoekt) to help complete unindexed search requests. Searcher will use indexed search for the files that have not changed between the unindexed commit and the index.
This graph should mostly be "success". The next most common state should be "search-canceled" which happens when result limits are hit or the user starts a new search. Finally the next most common should be "diff-too-large", which happens if the commit is too far from the indexed commit. Otherwise other state should be rare and likely are a sign for further investigation.
Note: On sourcegraph.com, "zoekt-list-missing" is also common because sourcegraph.com indexes only a subset of repositories.
For a full list of possible state see recordHybridFinalState.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/searcher/searcher?viewPanel=100200 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (state)(increase(searcher_hybrid_final_state_total{instance=~`${instance:regex}`}[10m]))
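Because the useful signal here is the mix of states rather than the absolute counts, a hedged variant of the panel query that expresses each state as a percentage of all hybrid searches (following the same ignoring/group_left pattern used elsewhere in this dashboard) can be easier to read:
SHELLsum by (state)(increase(searcher_hybrid_final_state_total[10m])) / ignoring(state) group_left sum(increase(searcher_hybrid_final_state_total[10m])) * 100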
searcher: searcher_hybrid_retry_total
Hybrid search retrying over 10m
This graph should mostly be 0. A retry is triggered when the underlying index changes while a search is running, or when Zoekt goes down, so occasional bursts are expected. If this graph is regularly above 0, it is a sign for further investigation.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/searcher/searcher?viewPanel=100201 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (reason)(increase(searcher_hybrid_retry_total{instance=~`${instance:regex}`}[10m]))
Searcher: Cache disk I/O metrics
searcher: cache_disk_reads_sec
Read request rate over 1m (per instance)
The number of read requests that were issued to the device per second.
Note: Disk statistics are per device, not per service. In certain environments (such as common docker-compose setups), searcher could be one of many services using this disk. These statistics are best interpreted as the load experienced by the device searcher is using, not the load searcher is solely responsible for causing.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/searcher/searcher?viewPanel=100300 on your Sourcegraph instance.
Technical details
Query:
SHELL(max by (instance) (searcher_mount_point_info{mount_name="cacheDir",instance=~`${instance:regex}`} * on (device, nodename) group_left() (max by (device, nodename) (rate(node_disk_reads_completed_total{instance=~`node-exporter.*`}[1m])))))
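The cache disk panels in this section all share the same shape: searcher_mount_point_info is effectively a constant-valued info metric carrying the device and nodename labels of the cache directory's mount, and the * on (device, nodename) group_left() multiplication attributes node_exporter's per-device rates to the corresponding searcher instance. As a hedged sketch, the same join can be reused for other per-device metrics, for example device utilization, assuming your node_exporter exposes node_disk_io_time_seconds_total:
SHELL(max by (instance) (searcher_mount_point_info{mount_name="cacheDir",instance=~`${instance:regex}`} * on (device, nodename) group_left() (max by (device, nodename) (rate(node_disk_io_time_seconds_total{instance=~`node-exporter.*`}[1m])))))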
searcher: cache_disk_writes_sec
Write request rate over 1m (per instance)
The number of write requests that were issued to the device per second.
Note: Disk statistics are per device, not per service. In certain environments (such as common docker-compose setups), searcher could be one of many services using this disk. These statistics are best interpreted as the load experienced by the device searcher is using, not the load searcher is solely responsible for causing.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/searcher/searcher?viewPanel=100301 on your Sourcegraph instance.
Technical details
Query:
SHELL(max by (instance) (searcher_mount_point_info{mount_name="cacheDir",instance=~`${instance:regex}`} * on (device, nodename) group_left() (max by (device, nodename) (rate(node_disk_writes_completed_total{instance=~`node-exporter.*`}[1m])))))
searcher: cache_disk_read_throughput
Read throughput over 1m (per instance)
The amount of data that was read from the device per second.
Note: Disk statistics are per device, not per service. In certain environments (such as common docker-compose setups), searcher could be one of many services using this disk. These statistics are best interpreted as the load experienced by the device searcher is using, not the load searcher is solely responsible for causing.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/searcher/searcher?viewPanel=100310 on your Sourcegraph instance.
Technical details
Query:
SHELL(max by (instance) (searcher_mount_point_info{mount_name="cacheDir",instance=~`${instance:regex}`} * on (device, nodename) group_left() (max by (device, nodename) (rate(node_disk_read_bytes_total{instance=~`node-exporter.*`}[1m])))))
searcher: cache_disk_write_throughput
Write throughput over 1m (per instance)
The amount of data that was written to the device per second.
Note: Disk statistics are per device, not per service. In certain environments (such as common docker-compose setups), searcher could be one of many services using this disk. These statistics are best interpreted as the load experienced by the device searcher is using, not the load searcher is solely responsible for causing.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/searcher/searcher?viewPanel=100311 on your Sourcegraph instance.
Technical details
Query:
SHELL(max by (instance) (searcher_mount_point_info{mount_name="cacheDir",instance=~`${instance:regex}`} * on (device, nodename) group_left() (max by (device, nodename) (rate(node_disk_written_bytes_total{instance=~`node-exporter.*`}[1m])))))
searcher: cache_disk_read_duration
Average read duration over 1m (per instance)
The average time for read requests issued to the device to be served. This includes the time spent by the requests in queue and the time spent servicing them.
Note: Disk statistics are per device, not per service. In certain environments (such as common docker-compose setups), searcher could be one of many services using this disk. These statistics are best interpreted as the load experienced by the device searcher is using, not the load searcher is solely responsible for causing.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/searcher/searcher?viewPanel=100320 on your Sourcegraph instance.
Technical details
Query:
SHELL(((max by (instance) (searcher_mount_point_info{mount_name="cacheDir",instance=~`${instance:regex}`} * on (device, nodename) group_left() (max by (device, nodename) (rate(node_disk_read_time_seconds_total{instance=~`node-exporter.*`}[1m])))))) / ((max by (instance) (searcher_mount_point_info{mount_name="cacheDir",instance=~`${instance:regex}`} * on (device, nodename) group_left() (max by (device, nodename) (rate(node_disk_reads_completed_total{instance=~`node-exporter.*`}[1m])))))))
searcher: cache_disk_write_duration
Average write duration over 1m (per instance)
The average time for write requests issued to the device to be served. This includes the time spent by the requests in queue and the time spent servicing them.
Note: Disk statistics are per device, not per service. In certain environments (such as common docker-compose setups), searcher could be one of many services using this disk. These statistics are best interpreted as the load experienced by the device searcher is using, not the load searcher is solely responsible for causing.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/searcher/searcher?viewPanel=100321 on your Sourcegraph instance.
Technical details
Query:
SHELL(((max by (instance) (searcher_mount_point_info{mount_name="cacheDir",instance=~`${instance:regex}`} * on (device, nodename) group_left() (max by (device, nodename) (rate(node_disk_write_time_seconds_total{instance=~`node-exporter.*`}[1m])))))) / ((max by (instance) (searcher_mount_point_info{mount_name="cacheDir",instance=~`${instance:regex}`} * on (device, nodename) group_left() (max by (device, nodename) (rate(node_disk_writes_completed_total{instance=~`node-exporter.*`}[1m])))))))
searcher: cache_disk_read_request_size
Average read request size over 1m (per instance)
The average size of read requests that were issued to the device.
Note: Disk statistics are per device, not per service. In certain environments (such as common docker-compose setups), searcher could be one of many services using this disk. These statistics are best interpreted as the load experienced by the device searcher is using, not the load searcher is solely responsible for causing.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/searcher/searcher?viewPanel=100330 on your Sourcegraph instance.
Technical details
Query:
SHELL(((max by (instance) (searcher_mount_point_info{mount_name="cacheDir",instance=~`${instance:regex}`} * on (device, nodename) group_left() (max by (device, nodename) (rate(node_disk_read_bytes_total{instance=~`node-exporter.*`}[1m])))))) / ((max by (instance) (searcher_mount_point_info{mount_name="cacheDir",instance=~`${instance:regex}`} * on (device, nodename) group_left() (max by (device, nodename) (rate(node_disk_reads_completed_total{instance=~`node-exporter.*`}[1m])))))))
searcher: cache_disk_write_request_size
Average write request size over 1m (per instance)
The average size of write requests that were issued to the device.
Note: Disk statistics are per device, not per service. In certain environments (such as common docker-compose setups), searcher could be one of many services using this disk. These statistics are best interpreted as the load experienced by the device searcher is using, not the load searcher is solely responsible for causing.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/searcher/searcher?viewPanel=100331 on your Sourcegraph instance.
Technical details
Query:
SHELL(((max by (instance) (searcher_mount_point_info{mount_name="cacheDir",instance=~`${instance:regex}`} * on (device, nodename) group_left() (max by (device, nodename) (rate(node_disk_written_bytes_total{instance=~`node-exporter.*`}[1m])))))) / ((max by (instance) (searcher_mount_point_info{mount_name="cacheDir",instance=~`${instance:regex}`} * on (device, nodename) group_left() (max by (device, nodename) (rate(node_disk_writes_completed_total{instance=~`node-exporter.*`}[1m])))))))
searcher: cache_disk_reads_merged_sec
Merged read request rate over 1m (per instance)
The number of read requests merged per second that were queued to the device.
Note: Disk statistics are per device, not per service. In certain environments (such as common docker-compose setups), searcher could be one of many services using this disk. These statistics are best interpreted as the load experienced by the device searcher is using, not the load searcher is solely responsible for causing.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/searcher/searcher?viewPanel=100340 on your Sourcegraph instance.
Technical details
Query:
SHELL(max by (instance) (searcher_mount_point_info{mount_name="cacheDir",instance=~`${instance:regex}`} * on (device, nodename) group_left() (max by (device, nodename) (rate(node_disk_reads_merged_total{instance=~`node-exporter.*`}[1m])))))
searcher: cache_disk_writes_merged_sec
Merged write request rate over 1m (per instance)
The number of write requests merged per second that were queued to the device.
Note: Disk statistics are per device, not per service. In certain environments (such as common docker-compose setups), searcher could be one of many services using this disk. These statistics are best interpreted as the load experienced by the device searcher is using, not the load searcher is solely responsible for causing.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/searcher/searcher?viewPanel=100341 on your Sourcegraph instance.
Technical details
Query:
SHELL(max by (instance) (searcher_mount_point_info{mount_name="cacheDir",instance=~`${instance:regex}`} * on (device, nodename) group_left() (max by (device, nodename) (rate(node_disk_writes_merged_total{instance=~`node-exporter.*`}[1m])))))
searcher: cache_disk_average_queue_size
Average queue size over 1m (per instance)
The number of I/O operations that were being queued or being serviced. See https://blog.actorsfit.com/a?ID=00200-428fa2ac-e338-4540-848c-af9a3eb1ebd2 for background (avgqu-sz).
Note: Disk statistics are per device, not per service. In certain environments (such as common docker-compose setups), searcher could be one of many services using this disk. These statistics are best interpreted as the load experienced by the device searcher is using, not the load searcher is solely responsible for causing.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/searcher/searcher?viewPanel=100350 on your Sourcegraph instance.
Technical details
Query:
SHELL(max by (instance) (searcher_mount_point_info{mount_name="cacheDir",instance=~`${instance:regex}`} * on (device, nodename) group_left() (max by (device, nodename) (rate(node_disk_io_time_weighted_seconds_total{instance=~`node-exporter.*`}[1m])))))
Searcher: Searcher GRPC server metrics
searcher: searcher_grpc_request_rate_all_methods
Request rate across all methods over 2m
The number of gRPC requests received per second across all methods, aggregated across all instances.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/searcher/searcher?viewPanel=100400 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(rate(grpc_server_started_total{instance=~`${instance:regex}`,grpc_service=~"searcher.v1.SearcherService"}[2m]))
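If you need to check whether gRPC load is evenly balanced across replicas, a hedged per-instance variant of the same query (instance filter omitted) is:
SHELLsum by (instance)(rate(grpc_server_started_total{grpc_service=~"searcher.v1.SearcherService"}[2m]))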
searcher: searcher_grpc_request_rate_per_method
Request rate per-method over 2m
The number of gRPC requests received per second broken out per method, aggregated across all instances.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/searcher/searcher?viewPanel=100401 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(rate(grpc_server_started_total{grpc_method=~`${searcher_method:regex}`,instance=~`${instance:regex}`,grpc_service=~"searcher.v1.SearcherService"}[2m])) by (grpc_method)
searcher: searcher_error_percentage_all_methods
Error percentage across all methods over 2m
The percentage of gRPC requests that fail across all methods, aggregated across all instances.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/searcher/searcher?viewPanel=100410 on your Sourcegraph instance.
Technical details
Query:
SHELL(100.0 * ( (sum(rate(grpc_server_handled_total{grpc_code!="OK",instance=~`${instance:regex}`,grpc_service=~"searcher.v1.SearcherService"}[2m]))) / (sum(rate(grpc_server_handled_total{instance=~`${instance:regex}`,grpc_service=~"searcher.v1.SearcherService"}[2m]))) ))
searcher: searcher_grpc_error_percentage_per_method
Error percentage per-method over 2m
The percentage of gRPC requests that fail per method, aggregated across all instances.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/searcher/searcher?viewPanel=100411 on your Sourcegraph instance.
Technical details
Query:
SHELL(100.0 * ( (sum(rate(grpc_server_handled_total{grpc_method=~`${searcher_method:regex}`,grpc_code!="OK",instance=~`${instance:regex}`,grpc_service=~"searcher.v1.SearcherService"}[2m])) by (grpc_method)) / (sum(rate(grpc_server_handled_total{grpc_method=~`${searcher_method:regex}`,instance=~`${instance:regex}`,grpc_service=~"searcher.v1.SearcherService"}[2m])) by (grpc_method)) ))
searcher: searcher_p99_response_time_per_method
99th percentile response time per method over 2m
The 99th percentile response time per method, aggregated across all instances.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/searcher/searcher?viewPanel=100420 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.99, sum by (le, name, grpc_method)(rate(grpc_server_handling_seconds_bucket{grpc_method=~`${searcher_method:regex}`,instance=~`${instance:regex}`,grpc_service=~"searcher.v1.SearcherService"}[2m])))
searcher: searcher_p90_response_time_per_method
90th percentile response time per method over 2m
The 90th percentile response time per method, aggregated across all instances.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/searcher/searcher?viewPanel=100421 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.90, sum by (le, name, grpc_method)(rate(grpc_server_handling_seconds_bucket{grpc_method=~`${searcher_method:regex}`,instance=~`${instance:regex}`,grpc_service=~"searcher.v1.SearcherService"}[2m])))
searcher: searcher_p75_response_time_per_method
75th percentile response time per method over 2m
The 75th percentile response time per method, aggregated across all instances.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/searcher/searcher?viewPanel=100422 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.75, sum by (le, name, grpc_method)(rate(grpc_server_handling_seconds_bucket{grpc_method=~`${searcher_method:regex}`,instance=~`${instance:regex}`,grpc_service=~"searcher.v1.SearcherService"}[2m])))
searcher: searcher_p99_9_response_size_per_method
99.9th percentile total response size per method over 2m
The 99.9th percentile total per-RPC response size per method, aggregated across all instances.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/searcher/searcher?viewPanel=100430 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.999, sum by (le, name, grpc_method)(rate(src_grpc_server_sent_bytes_per_rpc_bucket{grpc_method=~`${searcher_method:regex}`,instance=~`${instance:regex}`,grpc_service=~"searcher.v1.SearcherService"}[2m])))
searcher: searcher_p90_response_size_per_method
90th percentile total response size per method over 2m
The 90th percentile total per-RPC response size per method, aggregated across all instances.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/searcher/searcher?viewPanel=100431 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.90, sum by (le, name, grpc_method)(rate(src_grpc_server_sent_bytes_per_rpc_bucket{grpc_method=~`${searcher_method:regex}`,instance=~`${instance:regex}`,grpc_service=~"searcher.v1.SearcherService"}[2m])))
searcher: searcher_p75_response_size_per_method
75th percentile total response size per method over 2m
The 75th percentile total per-RPC response size per method, aggregated across all instances.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/searcher/searcher?viewPanel=100432 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.75, sum by (le, name, grpc_method)(rate(src_grpc_server_sent_bytes_per_rpc_bucket{grpc_method=~`${searcher_method:regex}`,instance=~`${instance:regex}`,grpc_service=~"searcher.v1.SearcherService"}[2m])))
searcher: searcher_p99_9_invididual_sent_message_size_per_method
99.9th percentile individual sent message size per method over 2m
The 99.9th percentile size of each individual protocol buffer sent by the service per method, aggregated across all instances.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/searcher/searcher?viewPanel=100440 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.999, sum by (le, name, grpc_method)(rate(src_grpc_server_sent_individual_message_size_bytes_per_rpc_bucket{grpc_method=~`${searcher_method:regex}`,instance=~`${instance:regex}`,grpc_service=~"searcher.v1.SearcherService"}[2m])))
searcher: searcher_p90_invididual_sent_message_size_per_method
90th percentile individual sent message size per method over 2m
The 90th percentile size of each individual protocol buffer sent by the service per method, aggregated across all instances.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/searcher/searcher?viewPanel=100441 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.90, sum by (le, name, grpc_method)(rate(src_grpc_server_sent_individual_message_size_bytes_per_rpc_bucket{grpc_method=~`${searcher_method:regex}`,instance=~`${instance:regex}`,grpc_service=~"searcher.v1.SearcherService"}[2m])))
searcher: searcher_p75_invididual_sent_message_size_per_method
75th percentile individual sent message size per method over 2m
The 75th percentile size of each individual protocol buffer sent by the service per method, aggregated across all instances.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/searcher/searcher?viewPanel=100442 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.75, sum by (le, name, grpc_method)(rate(src_grpc_server_sent_individual_message_size_bytes_per_rpc_bucket{grpc_method=~`${searcher_method:regex}`,instance=~`${instance:regex}`,grpc_service=~"searcher.v1.SearcherService"}[2m])))
searcher: searcher_grpc_response_stream_message_count_per_method
Average streaming response message count per-method over 2m
The average number of response messages sent during a streaming RPC method, broken out per method, aggregated across all instances.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/searcher/searcher?viewPanel=100450 on your Sourcegraph instance.
Technical details
Query:
SHELL((sum(rate(grpc_server_msg_sent_total{grpc_type="server_stream",instance=~`${instance:regex}`,grpc_service=~"searcher.v1.SearcherService"}[2m])) by (grpc_method))/(sum(rate(grpc_server_started_total{grpc_type="server_stream",instance=~`${instance:regex}`,grpc_service=~"searcher.v1.SearcherService"}[2m])) by (grpc_method)))
searcher: searcher_grpc_all_codes_per_method
Response codes rate per-method over 2m
The rate of all generated gRPC response codes per method, aggregated across all instances.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/searcher/searcher?viewPanel=100460 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(rate(grpc_server_handled_total{grpc_method=~`${searcher_method:regex}`,instance=~`${instance:regex}`,grpc_service=~"searcher.v1.SearcherService"}[2m])) by (grpc_method, grpc_code)
Searcher: Searcher GRPC "internal error" metrics
searcher: searcher_grpc_clients_error_percentage_all_methods
Client baseline error percentage across all methods over 2m
The percentage of gRPC requests that fail across all methods (regardless of whether or not there was an internal error), aggregated across all "searcher" clients.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/searcher/searcher?viewPanel=100500 on your Sourcegraph instance.
Technical details
Query:
SHELL(100.0 * ((((sum(rate(src_grpc_method_status{grpc_service=~"searcher.v1.SearcherService",grpc_code!="OK"}[2m])))) / ((sum(rate(src_grpc_method_status{grpc_service=~"searcher.v1.SearcherService"}[2m])))))))
searcher: searcher_grpc_clients_error_percentage_per_method
Client baseline error percentage per-method over 2m
The percentage of gRPC requests that fail per method (regardless of whether or not there was an internal error), aggregated across all "searcher" clients.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/searcher/searcher?viewPanel=100501 on your Sourcegraph instance.
Technical details
Query:
SHELL(100.0 * ((((sum(rate(src_grpc_method_status{grpc_service=~"searcher.v1.SearcherService",grpc_method=~"${searcher_method:regex}",grpc_code!="OK"}[2m])) by (grpc_method))) / ((sum(rate(src_grpc_method_status{grpc_service=~"searcher.v1.SearcherService",grpc_method=~"${searcher_method:regex}"}[2m])) by (grpc_method))))))
searcher: searcher_grpc_clients_all_codes_per_method
Client baseline response codes rate per-method over 2m
The rate of all generated gRPC response codes per method (regardless of whether or not there was an internal error), aggregated across all "searcher" clients.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/searcher/searcher?viewPanel=100502 on your Sourcegraph instance.
Technical details
Query:
SHELL(sum(rate(src_grpc_method_status{grpc_service=~"searcher.v1.SearcherService",grpc_method=~"${searcher_method:regex}"}[2m])) by (grpc_method, grpc_code))
searcher: searcher_grpc_clients_internal_error_percentage_all_methods
Client-observed gRPC internal error percentage across all methods over 2m
The percentage of gRPC requests that appear to fail due to gRPC internal errors across all methods, aggregated across all "searcher" clients.
Note: Internal errors are ones that appear to originate from the https://github.com/grpc/grpc-go library itself, rather than from any user-written application code. These errors can be caused by a variety of issues, and can originate from either the code-generated "searcher" gRPC client or gRPC server. These errors might be solvable by adjusting the gRPC configuration, or they might indicate a bug in Sourcegraph's use of gRPC.
When debugging, knowing that a particular error comes from the grpc-go library itself (an internal error) as opposed to normal application code can be helpful when trying to fix it.
Note: Internal errors are detected via a very coarse heuristic (seeing if the error starts with grpc:, etc.). Because of this, it's possible that some gRPC-specific issues might not be categorized as internal errors.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/searcher/searcher?viewPanel=100510 on your Sourcegraph instance.
Technical details
Query:
SHELL(100.0 * ((((sum(rate(src_grpc_method_status{grpc_service=~"searcher.v1.SearcherService",grpc_code!="OK",is_internal_error="true"}[2m])))) / ((sum(rate(src_grpc_method_status{grpc_service=~"searcher.v1.SearcherService"}[2m])))))))
searcher: searcher_grpc_clients_internal_error_percentage_per_method
Client-observed gRPC internal error percentage per-method over 2m
The percentage of gRPC requests that appear to fail due to gRPC internal errors per method, aggregated across all "searcher" clients.
Note: Internal errors are ones that appear to originate from the https://github.com/grpc/grpc-go library itself, rather than from any user-written application code. These errors can be caused by a variety of issues, and can originate from either the code-generated "searcher" gRPC client or gRPC server. These errors might be solvable by adjusting the gRPC configuration, or they might indicate a bug in Sourcegraph's use of gRPC.
When debugging, knowing that a particular error comes from the grpc-go library itself (an internal error) as opposed to normal application code can be helpful when trying to fix it.
Note: Internal errors are detected via a very coarse heuristic (seeing if the error starts with grpc:, etc.). Because of this, it's possible that some gRPC-specific issues might not be categorized as internal errors.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/searcher/searcher?viewPanel=100511 on your Sourcegraph instance.
Technical details
Query:
SHELL(100.0 * ((((sum(rate(src_grpc_method_status{grpc_service=~"searcher.v1.SearcherService",grpc_method=~"${searcher_method:regex}",grpc_code!="OK",is_internal_error="true"}[2m])) by (grpc_method))) / ((sum(rate(src_grpc_method_status{grpc_service=~"searcher.v1.SearcherService",grpc_method=~"${searcher_method:regex}"}[2m])) by (grpc_method))))))
searcher: searcher_grpc_clients_internal_error_all_codes_per_method
Client-observed gRPC internal error response code rate per-method over 2m
The rate of gRPC internal-error response codes per method, aggregated across all "searcher" clients.
Note: Internal errors are ones that appear to originate from the https://github.com/grpc/grpc-go library itself, rather than from any user-written application code. These errors can be caused by a variety of issues, and can originate from either the code-generated "searcher" gRPC client or gRPC server. These errors might be solvable by adjusting the gRPC configuration, or they might indicate a bug in Sourcegraph's use of gRPC.
When debugging, knowing that a particular error comes from the grpc-go library itself (an internal error) as opposed to normal application code can be helpful when trying to fix it.
Note: Internal errors are detected via a very coarse heuristic (seeing if the error starts with grpc:, etc.). Because of this, it's possible that some gRPC-specific issues might not be categorized as internal errors.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/searcher/searcher?viewPanel=100512 on your Sourcegraph instance.
Technical details
Query:
SHELL(sum(rate(src_grpc_method_status{grpc_service=~"searcher.v1.SearcherService",is_internal_error="true",grpc_method=~"${searcher_method:regex}"}[2m])) by (grpc_method, grpc_code))
Searcher: Searcher GRPC retry metrics
searcher: searcher_grpc_clients_retry_percentage_across_all_methods
Client retry percentage across all methods over 2m
The percentage of gRPC requests that were retried across all methods, aggregated across all "searcher" clients.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/searcher/searcher?viewPanel=100600 on your Sourcegraph instance.
Technical details
Query:
SHELL(100.0 * ((((sum(rate(src_grpc_client_retry_attempts_total{grpc_service=~"searcher.v1.SearcherService",is_retried="true"}[2m])))) / ((sum(rate(src_grpc_client_retry_attempts_total{grpc_service=~"searcher.v1.SearcherService"}[2m])))))))
searcher: searcher_grpc_clients_retry_percentage_per_method
Client retry percentage per-method over 2m
The percentage of gRPC requests that were retried, aggregated across all "searcher" clients, broken out per method.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/searcher/searcher?viewPanel=100601 on your Sourcegraph instance.
Technical details
Query:
SHELL(100.0 * ((((sum(rate(src_grpc_client_retry_attempts_total{grpc_service=~"searcher.v1.SearcherService",is_retried="true",grpc_method=~"${searcher_method:regex}"}[2m])) by (grpc_method))) / ((sum(rate(src_grpc_client_retry_attempts_total{grpc_service=~"searcher.v1.SearcherService",grpc_method=~"${searcher_method:regex}"}[2m])) by (grpc_method))))))
searcher: searcher_grpc_clients_retry_count_per_method
Client retry count per-method over 2m
The count of gRPC requests that were retried, aggregated across all "searcher" clients, broken out per method.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/searcher/searcher?viewPanel=100602 on your Sourcegraph instance.
Technical details
Query:
SHELL(sum(rate(src_grpc_client_retry_attempts_total{grpc_service=~"searcher.v1.SearcherService",grpc_method=~"${searcher_method:regex}",is_retried="true"}[2m])) by (grpc_method))
Searcher: Codeintel: Symbols API
searcher: codeintel_symbols_api_total
Aggregate API operations every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/searcher/searcher?viewPanel=100700 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_codeintel_symbols_api_total{job=~"^searcher.*"}[5m]))
searcher: codeintel_symbols_api_99th_percentile_duration
Aggregate successful API operation duration distribution over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/searcher/searcher?viewPanel=100701 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (le)(rate(src_codeintel_symbols_api_duration_seconds_bucket{job=~"^searcher.*"}[5m]))
searcher: codeintel_symbols_api_errors_total
Aggregate API operation errors every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/searcher/searcher?viewPanel=100702 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_codeintel_symbols_api_errors_total{job=~"^searcher.*"}[5m]))
searcher: codeintel_symbols_api_error_rate
Aggregate API operation error rate over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/searcher/searcher?viewPanel=100703 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_codeintel_symbols_api_errors_total{job=~"^searcher.*"}[5m])) / (sum(increase(src_codeintel_symbols_api_total{job=~"^searcher.*"}[5m])) + sum(increase(src_codeintel_symbols_api_errors_total{job=~"^searcher.*"}[5m]))) * 100
searcher: codeintel_symbols_api_total
API operations every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/searcher/searcher?viewPanel=100710 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (op,parseAmount)(increase(src_codeintel_symbols_api_total{job=~"^searcher.*"}[5m]))
searcher: codeintel_symbols_api_99th_percentile_duration
99th percentile successful API operation duration over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/searcher/searcher?viewPanel=100711 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.99, sum by (le,op,parseAmount)(rate(src_codeintel_symbols_api_duration_seconds_bucket{job=~"^searcher.*"}[5m])))
searcher: codeintel_symbols_api_errors_total
API operation errors every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/searcher/searcher?viewPanel=100712 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (op,parseAmount)(increase(src_codeintel_symbols_api_errors_total{job=~"^searcher.*"}[5m]))
searcher: codeintel_symbols_api_error_rate
API operation error rate over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/searcher/searcher?viewPanel=100713 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (op,parseAmount)(increase(src_codeintel_symbols_api_errors_total{job=~"^searcher.*"}[5m])) / (sum by (op,parseAmount)(increase(src_codeintel_symbols_api_total{job=~"^searcher.*"}[5m])) + sum by (op,parseAmount)(increase(src_codeintel_symbols_api_errors_total{job=~"^searcher.*"}[5m]))) * 100
Searcher: Codeintel: Symbols parser
searcher: searcher
In-flight parse jobs
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/searcher/searcher?viewPanel=100800 on your Sourcegraph instance.
Technical details
Query:
SHELLmax(src_codeintel_symbols_parsing{job=~"^searcher.*"})
searcher: searcher
Parser queue size
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/searcher/searcher?viewPanel=100801 on your Sourcegraph instance.
Technical details
Query:
SHELLmax(src_codeintel_symbols_parse_queue_size{job=~"^searcher.*"})
searcher: searcher
Parse queue timeouts
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/searcher/searcher?viewPanel=100802 on your Sourcegraph instance.
Technical details
Query:
SHELLmax(src_codeintel_symbols_parse_queue_timeouts_total{job=~"^searcher.*"})
searcher: searcher
Parse failures every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/searcher/searcher?viewPanel=100803 on your Sourcegraph instance.
Technical details
Query:
SHELLrate(src_codeintel_symbols_parse_failed_total{job=~"^searcher.*"}[5m])
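The panel above plots a per-series rate; for a single aggregate failure rate across all searcher instances, a minimal sketch is:
SHELLsum(rate(src_codeintel_symbols_parse_failed_total{job=~"^searcher.*"}[5m]))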
searcher: codeintel_symbols_parser_total
Aggregate parser operations every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/searcher/searcher?viewPanel=100810 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_codeintel_symbols_parser_total{job=~"^searcher.*"}[5m]))
searcher: codeintel_symbols_parser_99th_percentile_duration
Aggregate successful parser operation duration distribution over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/searcher/searcher?viewPanel=100811 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (le)(rate(src_codeintel_symbols_parser_duration_seconds_bucket{job=~"^searcher.*"}[5m]))
searcher: codeintel_symbols_parser_errors_total
Aggregate parser operation errors every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/searcher/searcher?viewPanel=100812 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_codeintel_symbols_parser_errors_total{job=~"^searcher.*"}[5m]))
searcher: codeintel_symbols_parser_error_rate
Aggregate parser operation error rate over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/searcher/searcher?viewPanel=100813 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_codeintel_symbols_parser_errors_total{job=~"^searcher.*"}[5m])) / (sum(increase(src_codeintel_symbols_parser_total{job=~"^searcher.*"}[5m])) + sum(increase(src_codeintel_symbols_parser_errors_total{job=~"^searcher.*"}[5m]))) * 100
searcher: codeintel_symbols_parser_total
Parser operations every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/searcher/searcher?viewPanel=100820 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (op)(increase(src_codeintel_symbols_parser_total{job=~"^searcher.*"}[5m]))
searcher: codeintel_symbols_parser_99th_percentile_duration
99th percentile successful parser operation duration over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/searcher/searcher?viewPanel=100821 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.99, sum by (le,op)(rate(src_codeintel_symbols_parser_duration_seconds_bucket{job=~"^searcher.*"}[5m])))
searcher: codeintel_symbols_parser_errors_total
Parser operation errors every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/searcher/searcher?viewPanel=100822 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (op)(increase(src_codeintel_symbols_parser_errors_total{job=~"^searcher.*"}[5m]))
searcher: codeintel_symbols_parser_error_rate
Parser operation error rate over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/searcher/searcher?viewPanel=100823 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (op)(increase(src_codeintel_symbols_parser_errors_total{job=~"^searcher.*"}[5m])) / (sum by (op)(increase(src_codeintel_symbols_parser_total{job=~"^searcher.*"}[5m])) + sum by (op)(increase(src_codeintel_symbols_parser_errors_total{job=~"^searcher.*"}[5m]))) * 100
Searcher: Codeintel: Symbols cache janitor
searcher: searcher
Size in bytes of the on-disk cache
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/searcher/searcher?viewPanel=100900 on your Sourcegraph instance.
Technical details
Query:
SHELLsrc_diskcache_store_symbols_cache_size_bytes
searcher: searcher
Cache eviction operations every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/searcher/searcher?viewPanel=100901 on your Sourcegraph instance.
Technical details
Query:
SHELLrate(src_diskcache_store_symbols_evictions_total[5m])
searcher: searcher
Cache eviction operation errors every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/searcher/searcher?viewPanel=100902 on your Sourcegraph instance.
Technical details
Query:
SHELLrate(src_diskcache_store_symbols_errors_total[5m])
Searcher: Codeintel: Symbols repository fetcher
searcher: searcher
In-flight repository fetch operations
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/searcher/searcher?viewPanel=101000 on your Sourcegraph instance.
Technical details
Query:
SHELLsrc_codeintel_symbols_fetching
searcher: searcher
Repository fetch queue size
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/searcher/searcher?viewPanel=101001 on your Sourcegraph instance.
Technical details
Query:
SHELLmax(src_codeintel_symbols_fetch_queue_size{job=~"^searcher.*"})
searcher: codeintel_symbols_repository_fetcher_total
Aggregate fetcher operations every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/searcher/searcher?viewPanel=101010 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_codeintel_symbols_repository_fetcher_total{job=~"^searcher.*"}[5m]))
searcher: codeintel_symbols_repository_fetcher_99th_percentile_duration
Aggregate successful fetcher operation duration distribution over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/searcher/searcher?viewPanel=101011 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (le)(rate(src_codeintel_symbols_repository_fetcher_duration_seconds_bucket{job=~"^searcher.*"}[5m]))
searcher: codeintel_symbols_repository_fetcher_errors_total
Aggregate fetcher operation errors every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/searcher/searcher?viewPanel=101012 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_codeintel_symbols_repository_fetcher_errors_total{job=~"^searcher.*"}[5m]))
searcher: codeintel_symbols_repository_fetcher_error_rate
Aggregate fetcher operation error rate over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/searcher/searcher?viewPanel=101013 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_codeintel_symbols_repository_fetcher_errors_total{job=~"^searcher.*"}[5m])) / (sum(increase(src_codeintel_symbols_repository_fetcher_total{job=~"^searcher.*"}[5m])) + sum(increase(src_codeintel_symbols_repository_fetcher_errors_total{job=~"^searcher.*"}[5m]))) * 100
searcher: codeintel_symbols_repository_fetcher_total
Fetcher operations every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/searcher/searcher?viewPanel=101020 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (op)(increase(src_codeintel_symbols_repository_fetcher_total{job=~"^searcher.*"}[5m]))
searcher: codeintel_symbols_repository_fetcher_99th_percentile_duration
99th percentile successful fetcher operation duration over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/searcher/searcher?viewPanel=101021 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.99, sum by (le,op)(rate(src_codeintel_symbols_repository_fetcher_duration_seconds_bucket{job=~"^searcher.*"}[5m])))
searcher: codeintel_symbols_repository_fetcher_errors_total
Fetcher operation errors every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/searcher/searcher?viewPanel=101022 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (op)(increase(src_codeintel_symbols_repository_fetcher_errors_total{job=~"^searcher.*"}[5m]))
searcher: codeintel_symbols_repository_fetcher_error_rate
Fetcher operation error rate over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/searcher/searcher?viewPanel=101023 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (op)(increase(src_codeintel_symbols_repository_fetcher_errors_total{job=~"^searcher.*"}[5m])) / (sum by (op)(increase(src_codeintel_symbols_repository_fetcher_total{job=~"^searcher.*"}[5m])) + sum by (op)(increase(src_codeintel_symbols_repository_fetcher_errors_total{job=~"^searcher.*"}[5m]))) * 100
Searcher: Codeintel: Symbols gitserver client
searcher: codeintel_symbols_gitserver_total
Aggregate gitserver client operations every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/searcher/searcher?viewPanel=101100 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_codeintel_symbols_gitserver_total{job=~"^searcher.*"}[5m]))
searcher: codeintel_symbols_gitserver_99th_percentile_duration
Aggregate successful gitserver client operation duration distribution over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/searcher/searcher?viewPanel=101101 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (le)(rate(src_codeintel_symbols_gitserver_duration_seconds_bucket{job=~"^searcher.*"}[5m]))
searcher: codeintel_symbols_gitserver_errors_total
Aggregate gitserver client operation errors every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/searcher/searcher?viewPanel=101102 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_codeintel_symbols_gitserver_errors_total{job=~"^searcher.*"}[5m]))
searcher: codeintel_symbols_gitserver_error_rate
Aggregate gitserver client operation error rate over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/searcher/searcher?viewPanel=101103 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_codeintel_symbols_gitserver_errors_total{job=~"^searcher.*"}[5m])) / (sum(increase(src_codeintel_symbols_gitserver_total{job=~"^searcher.*"}[5m])) + sum(increase(src_codeintel_symbols_gitserver_errors_total{job=~"^searcher.*"}[5m]))) * 100
searcher: codeintel_symbols_gitserver_total
Gitserver client operations every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/searcher/searcher?viewPanel=101110 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (op)(increase(src_codeintel_symbols_gitserver_total{job=~"^searcher.*"}[5m]))
searcher: codeintel_symbols_gitserver_99th_percentile_duration
99th percentile successful gitserver client operation duration over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/searcher/searcher?viewPanel=101111 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.99, sum by (le,op)(rate(src_codeintel_symbols_gitserver_duration_seconds_bucket{job=~"^searcher.*"}[5m])))
searcher: codeintel_symbols_gitserver_errors_total
Gitserver client operation errors every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/searcher/searcher?viewPanel=101112 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (op)(increase(src_codeintel_symbols_gitserver_errors_total{job=~"^searcher.*"}[5m]))
searcher: codeintel_symbols_gitserver_error_rate
Gitserver client operation error rate over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/searcher/searcher?viewPanel=101113 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (op)(increase(src_codeintel_symbols_gitserver_errors_total{job=~"^searcher.*"}[5m])) / (sum by (op)(increase(src_codeintel_symbols_gitserver_total{job=~"^searcher.*"}[5m])) + sum by (op)(increase(src_codeintel_symbols_gitserver_errors_total{job=~"^searcher.*"}[5m]))) * 100
Searcher: Rockskip
searcher: p95_rockskip_search_request_duration
95th percentile search request duration over 5m
The 95th percentile duration of search requests to Rockskip in seconds. Lower is better.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/searcher/searcher?viewPanel=101200 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.95, sum(rate(src_rockskip_service_search_request_duration_seconds_bucket[5m])) by (le))
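If tail latency looks fine but you want a sense of typical latency, the same histogram can be queried at a lower quantile; a minimal sketch for the median:
SHELLhistogram_quantile(0.50, sum(rate(src_rockskip_service_search_request_duration_seconds_bucket[5m])) by (le))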
searcher: rockskip_in_flight_search_requests
Number of in-flight search requests
The number of search requests currently being processed by Rockskip. If there is not much traffic and requests are served very quickly relative to the Prometheus scrape interval, it is possible for this number to be 0 even while search requests are being processed.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/searcher/searcher?viewPanel=101201 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(src_rockskip_service_in_flight_search_requests)
searcher: rockskip_search_request_errors
Search request errors every 5m
The number of search requests that returned an error in the last 5 minutes. The errors tracked here are all application errors; gRPC errors are not included. We generally want this to be 0.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/searcher/searcher?viewPanel=101202 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_rockskip_service_search_request_errors[5m]))
searcher: p95_rockskip_index_job_duration
95th percentile index job duration over 5m
The 95th percentile duration of index jobs in seconds. The range of values is very large because the metric measures quick delta updates as well as full index jobs. Lower is better.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/searcher/searcher?viewPanel=101210 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.95, sum(rate(src_rockskip_service_index_job_duration_seconds_bucket[5m])) by (le))
searcher: rockskip_in_flight_index_jobs
Number of in-flight index jobs
The number of index jobs currently being processed by Rockskip. This includes delta updates as well as full index jobs.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/searcher/searcher?viewPanel=101211 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(src_rockskip_service_in_flight_index_jobs)
searcher: rockskip_index_job_errors
Index job errors every 5m
The number of index jobs that returned an error in the last 5 minutes. If the errors are persistent, users will see alerts in the UI. The service logs will contain more detailed information about the kind of errors. We generally want this to be 0.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/searcher/searcher?viewPanel=101212 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_rockskip_service_index_job_errors[5m]))
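To put this error count in context, it can be expressed as a percentage of all index jobs by dividing by the _count series of the index job duration histogram shown above. This is a sketch, not a shipped panel:
SHELLsum(increase(src_rockskip_service_index_job_errors[5m])) / sum(increase(src_rockskip_service_index_job_duration_seconds_count[5m])) * 100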
searcher: rockskip_number_of_repos_indexed
Number of repositories indexed by Rockskip
The number of repositories indexed by Rockskip. Apart from an initial transient phase in which many repos are being indexed, this number should be low, relatively stable, and only increase in small increments. To verify that this number makes sense, compare ROCKSKIP_MIN_REPO_SIZE_MB with the repository sizes reported in the gitserver_repos table.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/searcher/searcher?viewPanel=101220 on your Sourcegraph instance.
Technical details
Query:
SHELLmax(src_rockskip_service_repos_indexed)
searcher: p95_rockskip_index_queue_age
95th percentile index queue delay over 5m
The 95th percentile age of index jobs in seconds. A high delay might indicate a resource issue. Consider increasing indexing bandwidth by either increasing the number of queues or the number of symbol services.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/searcher/searcher?viewPanel=101221 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.95, sum(rate(src_rockskip_service_index_queue_age_seconds_bucket[5m])) by (le))
searcher: rockskip_file_parsing_requests
File parsing requests every 5m
The number of search requests in the last 5 minutes that were handled by parsing a single file, as opposed to searching the Rockskip index. This is an optimization to speed up symbol sidebar queries.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/searcher/searcher?viewPanel=101222 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_rockskip_service_file_parsing_requests[5m]))
Searcher: Site configuration client update latency
searcher: searcher_site_configuration_duration_since_last_successful_update_by_instance
Duration since last successful site configuration update (by instance)
The duration since the configuration client used by the "searcher" service last successfully updated its site configuration. Long durations could indicate issues updating the site configuration.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/searcher/searcher?viewPanel=101300 on your Sourcegraph instance.
Technical details
Query:
SHELLsrc_conf_client_time_since_last_successful_update_seconds{job=~`.*searcher`,instance=~`${instance:regex}`}
searcher: searcher_site_configuration_duration_since_last_successful_update_by_instance
Maximum duration since last successful site configuration update (all "searcher" instances)
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/searcher/searcher?viewPanel=101301 on your Sourcegraph instance.
Technical details
Query:
SHELLmax(max_over_time(src_conf_client_time_since_last_successful_update_seconds{job=~`.*searcher`,instance=~`${instance:regex}`}[1m]))
Searcher: Periodic Goroutines
searcher: running_goroutines
Number of currently running periodic goroutines
The number of currently running periodic goroutines by name and job.
A value of 0 indicates the routine isn't currently running and is awaiting its next scheduled run.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/searcher/searcher?viewPanel=101400 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (name, job_name) (src_periodic_goroutine_running{job=~".*searcher.*"})
searcher: goroutine_success_rate
Success rate for periodic goroutine executions
The rate of successful executions of each periodic goroutine. A low or zero value could indicate that a routine is stalled or encountering errors.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/searcher/searcher?viewPanel=101401 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (name, job_name) (rate(src_periodic_goroutine_total{job=~".*searcher.*"}[5m]))
searcher: goroutine_error_rate
Error rate for periodic goroutine executions
The rate of errors encountered by each periodic goroutine. A sustained high error rate may indicate a problem with the routine's configuration or dependencies.
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/searcher/searcher?viewPanel=101410 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (name, job_name) (rate(src_periodic_goroutine_errors_total{job=~".*searcher.*"}[5m]))
searcher: goroutine_error_percentage
Percentage of periodic goroutine executions that result in errors
The percentage of executions that result in errors for each periodic goroutine. A value above 5% indicates that a significant portion of routine executions are failing.
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/searcher/searcher?viewPanel=101411 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (name, job_name) (rate(src_periodic_goroutine_errors_total{job=~".*searcher.*"}[5m])) / sum by (name, job_name) (rate(src_periodic_goroutine_total{job=~".*searcher.*"}[5m]) > 0) * 100
searcher: goroutine_handler_duration
95th percentile handler execution time
The 95th percentile execution time for each periodic goroutine handler. Longer durations might indicate increased load or processing time.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/searcher/searcher?viewPanel=101420 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.95, sum by (name, job_name, le) (rate(src_periodic_goroutine_duration_seconds_bucket{job=~".*searcher.*"}[5m])))
searcher: goroutine_loop_duration
95th percentile loop cycle time
The 95th percentile loop cycle time for each periodic goroutine (excluding sleep time). This represents how long a complete loop iteration takes before sleeping for the next interval.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/searcher/searcher?viewPanel=101421 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.95, sum by (name, job_name, le) (rate(src_periodic_goroutine_loop_duration_seconds_bucket{job=~".*searcher.*"}[5m])))
searcher: tenant_processing_duration
95th percentile tenant processing time
The 95th percentile processing time for individual tenants within periodic goroutines. Higher values indicate that tenant processing is taking longer and may affect overall performance.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/searcher/searcher?viewPanel=101430 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.95, sum by (name, job_name, le) (rate(src_periodic_goroutine_tenant_duration_seconds_bucket{job=~".*searcher.*"}[5m])))
searcher: tenant_processing_max
Maximum tenant processing time
The maximum processing time for individual tenants within periodic goroutines. Consistently high values might indicate problematic tenants or inefficient processing.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/searcher/searcher?viewPanel=101431 on your Sourcegraph instance.
Technical details
Query:
SHELLmax by (name, job_name) (rate(src_periodic_goroutine_tenant_duration_seconds_sum{job=~".*searcher.*"}[5m]) / rate(src_periodic_goroutine_tenant_duration_seconds_count{job=~".*searcher.*"}[5m]))
searcher: tenant_count
Number of tenants processed per routine
The number of tenants processed by each periodic goroutine. Unexpected changes can indicate tenant configuration issues or scaling events.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/searcher/searcher?viewPanel=101440 on your Sourcegraph instance.
Technical details
Query:
SHELLmax by (name, job_name) (src_periodic_goroutine_tenant_count{job=~".*searcher.*"})
searcher: tenant_success_rate
Rate of successful tenant processing operations
The rate of successful tenant processing operations. A healthy routine should maintain a consistent processing rate.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/searcher/searcher?viewPanel=101441 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (name, job_name) (rate(src_periodic_goroutine_tenant_success_total{job=~".*searcher.*"}[5m]))
searcher: tenant_error_rate
Rate of tenant processing errors
The rate of tenant processing operations that result in errors. Consistent errors indicate problems with specific tenants.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/searcher/searcher?viewPanel=101450 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (name, job_name) (rate(src_periodic_goroutine_tenant_errors_total{job=~".*searcher.*"}[5m]))
searcher: tenant_error_percentage
Percentage of tenant operations resulting in errors
The percentage of tenant operations that result in errors. Values above 5% indicate significant tenant processing problems.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/searcher/searcher?viewPanel=101451 on your Sourcegraph instance.
Technical details
Query:
SHELL(sum by (name, job_name) (rate(src_periodic_goroutine_tenant_errors_total{job=~".*searcher.*"}[5m])) / (sum by (name, job_name) (rate(src_periodic_goroutine_tenant_success_total{job=~".*searcher.*"}[5m])) + sum by (name, job_name) (rate(src_periodic_goroutine_tenant_errors_total{job=~".*searcher.*"}[5m])))) * 100
Searcher: Database connections
searcher: max_open_conns
Maximum open
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/searcher/searcher?viewPanel=101500 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (app_name, db_name) (src_pgsql_conns_max_open{app_name="searcher"})
searcher: open_conns
Established
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/searcher/searcher?viewPanel=101501 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (app_name, db_name) (src_pgsql_conns_open{app_name="searcher"})
searcher: in_use
Used
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/searcher/searcher?viewPanel=101510 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (app_name, db_name) (src_pgsql_conns_in_use{app_name="searcher"})
searcher: idle
Idle
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/searcher/searcher?viewPanel=101511 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (app_name, db_name) (src_pgsql_conns_idle{app_name="searcher"})
searcher: mean_blocked_seconds_per_conn_request
Mean blocked seconds per conn request
Refer to the alerts reference for 2 alerts related to this panel.
To see this panel, visit /-/debug/grafana/d/searcher/searcher?viewPanel=101520 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (app_name, db_name) (increase(src_pgsql_conns_blocked_seconds{app_name="searcher"}[5m])) / sum by (app_name, db_name) (increase(src_pgsql_conns_waited_for{app_name="searcher"}[5m]))
searcher: closed_max_idle
Closed by SetMaxIdleConns
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/searcher/searcher?viewPanel=101530 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (app_name, db_name) (increase(src_pgsql_conns_closed_max_idle{app_name="searcher"}[5m]))
searcher: closed_max_lifetime
Closed by SetConnMaxLifetime
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/searcher/searcher?viewPanel=101531 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (app_name, db_name) (increase(src_pgsql_conns_closed_max_lifetime{app_name="searcher"}[5m]))
searcher: closed_max_idle_time
Closed by SetConnMaxIdleTime
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/searcher/searcher?viewPanel=101532 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (app_name, db_name) (increase(src_pgsql_conns_closed_max_idle_time{app_name="searcher"}[5m]))
Searcher: Searcher (CPU, Memory)
searcher: cpu_usage_percentage
CPU usage
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/searcher/searcher?viewPanel=101600 on your Sourcegraph instance.
Technical details
Query:
SHELLcadvisor_container_cpu_usage_percentage_total{name=~"^searcher.*"}
searcher: memory_usage_percentage
Memory usage percentage (total)
An estimate for the active memory in use, which includes anonymous memory, file memory, and kernel memory. Some of this memory is reclaimable, so high usage does not necessarily indicate memory pressure.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/searcher/searcher?viewPanel=101601 on your Sourcegraph instance.
Technical details
Query:
SHELLcadvisor_container_memory_usage_percentage_total{name=~"^searcher.*"}
searcher: memory_working_set_bytes
Memory usage bytes (total)
An estimate for the active memory in use in bytes, which includes anonymous memory, file memory, and kernel memory. Some of this memory is reclaimable, so high usage does not necessarily indicate memory pressure.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/searcher/searcher?viewPanel=101602 on your Sourcegraph instance.
Technical details
Query:
SHELLmax by (name) (container_memory_working_set_bytes{name=~"^searcher.*"})
searcher: memory_rss
Memory (RSS)
The total anonymous memory in use by the application, which includes the Go stack and heap. This memory is non-reclaimable, and high usage may trigger OOM kills. Note: the metric is named RSS to match the cadvisor name, but anonymous memory is more accurate.
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/searcher/searcher?viewPanel=101610 on your Sourcegraph instance.
Technical details
Query:
SHELLmax(container_memory_rss{name=~"^searcher.*"} / container_spec_memory_limit_bytes{name=~"^searcher.*"}) by (name) * 100.0
searcher: memory_total_active_file
Memory usage (active file)
This metric shows the total active file-backed memory currently in use by the application. Some of it may be reclaimable, so high usage does not necessarily indicate memory pressure.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/searcher/searcher?viewPanel=101611 on your Sourcegraph instance.
Technical details
Query:
SHELLmax(container_memory_total_active_file_bytes{name=~"^searcher.*"} / container_spec_memory_limit_bytes{name=~"^searcher.*"}) by (name) * 100.0
searcher: memory_kernel_usage
Memory usage (kernel)
The kernel usage metric shows the amount of memory used by the kernel on behalf of the application. Some of it may be reclaimable, so high usage does not necessarily indicate memory pressure.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/searcher/searcher?viewPanel=101612 on your Sourcegraph instance.
Technical details
Query:
SHELLmax(container_memory_kernel_usage{name=~"^searcher.*"} / container_spec_memory_limit_bytes{name=~"^searcher.*"}) by (name) * 100.0
Searcher: Container monitoring (not available on server)
searcher: container_missing
Container missing
This value is the number of times a container has not been seen for more than one minute. If you observe this value change independently of deployment events (such as an upgrade), it could indicate pods are being OOM killed or terminated for some other reason.
- Kubernetes:
  - Determine if the pod was OOM killed using kubectl describe pod searcher (look for OOMKilled: true) and, if so, consider increasing the memory limit in the relevant Deployment.yaml.
  - Check the logs before the container restarted to see if there are panic: messages or similar using kubectl logs -p searcher.
- Docker Compose:
  - Determine if the pod was OOM killed using docker inspect -f '{{json .State}}' searcher (look for "OOMKilled":true) and, if so, consider increasing the memory limit of the searcher container in docker-compose.yml.
  - Check the logs before the container restarted to see if there are panic: messages or similar using docker logs searcher (note this will include logs from the previous and currently running container).
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/searcher/searcher?viewPanel=101700 on your Sourcegraph instance.
Technical details
Query:
SHELLcount by(name) ((time() - container_last_seen{name=~"^searcher.*"}) > 60)
searcher: container_cpu_usage
Container cpu usage total (1m average) across all cores by instance
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/searcher/searcher?viewPanel=101701 on your Sourcegraph instance.
Technical details
Query:
SHELLcadvisor_container_cpu_usage_percentage_total{name=~"^searcher.*"}
searcher: container_memory_usage
Container memory usage by instance
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/searcher/searcher?viewPanel=101702 on your Sourcegraph instance.
Technical details
Query:
SHELLcadvisor_container_memory_usage_percentage_total{name=~"^searcher.*"}
searcher: fs_io_operations
Filesystem reads and writes rate by instance over 1h
This value indicates the number of filesystem read and write operations by containers of this service. When extremely high, this can indicate a resource usage problem, or can cause problems with the service itself, especially if high values or spikes correlate with searcher issues.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/searcher/searcher?viewPanel=101703 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by(name) (rate(container_fs_reads_total{name=~"^searcher.*"}[1h]) + rate(container_fs_writes_total{name=~"^searcher.*"}[1h]))
Searcher: Provisioning indicators (not available on server)
searcher: provisioning_container_cpu_usage_long_term
Container cpu usage total (90th percentile over 1d) across all cores by instance
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/searcher/searcher?viewPanel=101800 on your Sourcegraph instance.
Technical details
Query:
SHELLquantile_over_time(0.9, cadvisor_container_cpu_usage_percentage_total{name=~"^searcher.*"}[1d])
searcher: provisioning_container_memory_usage_long_term
Container memory usage (1d maximum) by instance
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/searcher/searcher?viewPanel=101801 on your Sourcegraph instance.
Technical details
Query:
SHELLmax_over_time(cadvisor_container_memory_usage_percentage_total{name=~"^searcher.*"}[1d])
searcher: provisioning_container_cpu_usage_short_term
Container cpu usage total (5m maximum) across all cores by instance
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/searcher/searcher?viewPanel=101810 on your Sourcegraph instance.
Technical details
Query:
SHELLmax_over_time(cadvisor_container_cpu_usage_percentage_total{name=~"^searcher.*"}[5m])
searcher: provisioning_container_memory_usage_short_term
Container memory usage (5m maximum) by instance
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/searcher/searcher?viewPanel=101811 on your Sourcegraph instance.
Technical details
Query:
SHELLmax_over_time(cadvisor_container_memory_usage_percentage_total{name=~"^searcher.*"}[5m])
searcher: container_oomkill_events_total
Container OOMKILL events total by instance
This value indicates the total number of times the container's main process or child processes were terminated by the OOM killer. When it occurs frequently, it is an indicator of underprovisioning.
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/searcher/searcher?viewPanel=101812 on your Sourcegraph instance.
Technical details
Query:
SHELLmax by (name) (container_oom_events_total{name=~"^searcher.*"})
Searcher: Golang runtime monitoring
searcher: go_goroutines
Maximum active goroutines
A high value here indicates a possible goroutine leak.
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/searcher/searcher?viewPanel=101900 on your Sourcegraph instance.
Technical details
Query:
SHELLmax by(instance) (go_goroutines{job=~".*searcher"})
searcher: go_gc_duration_seconds
Maximum go garbage collection duration
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/searcher/searcher?viewPanel=101901 on your Sourcegraph instance.
Technical details
Query:
SHELLmax by(instance) (go_gc_duration_seconds{job=~".*searcher"})
Searcher: Kubernetes monitoring (only available on Kubernetes)
searcher: pods_available_percentage
Percentage pods available
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/searcher/searcher?viewPanel=102000 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by(app) (up{app=~".*searcher"}) / count by (app) (up{app=~".*searcher"}) * 100
Syntect Server
Handles syntax highlighting for code files.
To see this dashboard, visit /-/debug/grafana/d/syntect-server/syntect-server on your Sourcegraph instance.
syntect-server: syntax_highlighting_errors
Syntax highlighting errors every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/syntect-server/syntect-server?viewPanel=100000 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_syntax_highlighting_requests{status="error"}[5m])) / sum(increase(src_syntax_highlighting_requests[5m])) * 100
syntect-server: syntax_highlighting_timeouts
Syntax highlighting timeouts every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/syntect-server/syntect-server?viewPanel=100001 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_syntax_highlighting_requests{status="timeout"}[5m])) / sum(increase(src_syntax_highlighting_requests[5m])) * 100
syntect-server: syntax_highlighting_panics
Syntax highlighting panics every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/syntect-server/syntect-server?viewPanel=100010 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_syntax_highlighting_requests{status="panic"}[5m]))
syntect-server: syntax_highlighting_worker_deaths
Syntax highlighter worker deaths every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/syntect-server/syntect-server?viewPanel=100011 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_syntax_highlighting_requests{status="hss_worker_timeout"}[5m]))
Syntect Server: Syntect-server (CPU, Memory)
syntect-server: cpu_usage_percentage
CPU usage
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/syntect-server/syntect-server?viewPanel=100100 on your Sourcegraph instance.
Technical details
Query:
SHELLcadvisor_container_cpu_usage_percentage_total{name=~"^syntect-server.*"}
syntect-server: memory_usage_percentage
Memory usage percentage (total)
An estimate for the active memory in use, which includes anonymous memory, file memory, and kernel memory. Some of this memory is reclaimable, so high usage does not necessarily indicate memory pressure.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/syntect-server/syntect-server?viewPanel=100101 on your Sourcegraph instance.
Technical details
Query:
SHELLcadvisor_container_memory_usage_percentage_total{name=~"^syntect-server.*"}
syntect-server: memory_working_set_bytes
Memory usage bytes (total)
An estimate for the active memory in use in bytes, which includes anonymous memory, file memory, and kernel memory. Some of this memory is reclaimable, so high usage does not necessarily indicate memory pressure.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/syntect-server/syntect-server?viewPanel=100102 on your Sourcegraph instance.
Technical details
Query:
SHELLmax by (name) (container_memory_working_set_bytes{name=~"^syntect-server.*"})
syntect-server: memory_rss
Memory (RSS)
The total anonymous memory in use by the application, which includes the Go stack and heap. This memory is non-reclaimable, and high usage may trigger OOM kills. Note: the metric is named RSS to match the cadvisor name, but anonymous memory is more accurate.
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/syntect-server/syntect-server?viewPanel=100110 on your Sourcegraph instance.
Technical details
Query:
SHELLmax(container_memory_rss{name=~"^syntect-server.*"} / container_spec_memory_limit_bytes{name=~"^syntect-server.*"}) by (name) * 100.0
syntect-server: memory_total_active_file
Memory usage (active file)
This metric shows the total active file-backed memory currently in use by the application. Some of it may be reclaimable, so high usage does not necessarily indicate memory pressure.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/syntect-server/syntect-server?viewPanel=100111 on your Sourcegraph instance.
Technical details
Query:
SHELLmax(container_memory_total_active_file_bytes{name=~"^syntect-server.*"} / container_spec_memory_limit_bytes{name=~"^syntect-server.*"}) by (name) * 100.0
syntect-server: memory_kernel_usage
Memory usage (kernel)
The kernel usage metric shows the amount of memory used by the kernel on behalf of the application. Some of it may be reclaimable, so high usage does not necessarily indicate memory pressure.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/syntect-server/syntect-server?viewPanel=100112 on your Sourcegraph instance.
Technical details
Query:
SHELLmax(container_memory_kernel_usage{name=~"^syntect-server.*"} / container_spec_memory_limit_bytes{name=~"^syntect-server.*"}) by (name) * 100.0
Syntect Server: Container monitoring (not available on server)
syntect-server: container_missing
Container missing
This value is the number of times a container has not been seen for more than one minute. If you observe this value change independently of deployment events (such as an upgrade), it could indicate pods are being OOM killed or terminated for some other reason.
- Kubernetes:
  - Determine if the pod was OOM killed using kubectl describe pod syntect-server (look for OOMKilled: true) and, if so, consider increasing the memory limit in the relevant Deployment.yaml.
  - Check the logs before the container restarted to see if there are panic: messages or similar using kubectl logs -p syntect-server.
- Docker Compose:
  - Determine if the pod was OOM killed using docker inspect -f '{{json .State}}' syntect-server (look for "OOMKilled":true) and, if so, consider increasing the memory limit of the syntect-server container in docker-compose.yml.
  - Check the logs before the container restarted to see if there are panic: messages or similar using docker logs syntect-server (note this will include logs from the previous and currently running container).
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/syntect-server/syntect-server?viewPanel=100200 on your Sourcegraph instance.
Technical details
Query:
SHELLcount by(name) ((time() - container_last_seen{name=~"^syntect-server.*"}) > 60)
syntect-server: container_cpu_usage
Container cpu usage total (1m average) across all cores by instance
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/syntect-server/syntect-server?viewPanel=100201 on your Sourcegraph instance.
Technical details
Query:
SHELLcadvisor_container_cpu_usage_percentage_total{name=~"^syntect-server.*"}
syntect-server: container_memory_usage
Container memory usage by instance
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/syntect-server/syntect-server?viewPanel=100202 on your Sourcegraph instance.
Technical details
Query:
SHELLcadvisor_container_memory_usage_percentage_total{name=~"^syntect-server.*"}
syntect-server: fs_io_operations
Filesystem reads and writes rate by instance over 1h
This value indicates the number of filesystem read and write operations by containers of this service. When extremely high, this can indicate a resource usage problem, or can cause problems with the service itself, especially if high values or spikes correlate with syntect-server issues.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/syntect-server/syntect-server?viewPanel=100203 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by(name) (rate(container_fs_reads_total{name=~"^syntect-server.*"}[1h]) + rate(container_fs_writes_total{name=~"^syntect-server.*"}[1h]))
Syntect Server: Provisioning indicators (not available on server)
syntect-server: provisioning_container_cpu_usage_long_term
Container cpu usage total (90th percentile over 1d) across all cores by instance
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/syntect-server/syntect-server?viewPanel=100300 on your Sourcegraph instance.
Technical details
Query:
SHELLquantile_over_time(0.9, cadvisor_container_cpu_usage_percentage_total{name=~"^syntect-server.*"}[1d])
syntect-server: provisioning_container_memory_usage_long_term
Container memory usage (1d maximum) by instance
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/syntect-server/syntect-server?viewPanel=100301 on your Sourcegraph instance.
Technical details
Query:
SHELLmax_over_time(cadvisor_container_memory_usage_percentage_total{name=~"^syntect-server.*"}[1d])
syntect-server: provisioning_container_cpu_usage_short_term
Container cpu usage total (5m maximum) across all cores by instance
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/syntect-server/syntect-server?viewPanel=100310 on your Sourcegraph instance.
Technical details
Query:
SHELLmax_over_time(cadvisor_container_cpu_usage_percentage_total{name=~"^syntect-server.*"}[5m])
syntect-server: provisioning_container_memory_usage_short_term
Container memory usage (5m maximum) by instance
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/syntect-server/syntect-server?viewPanel=100311 on your Sourcegraph instance.
Technical details
Query:
SHELLmax_over_time(cadvisor_container_memory_usage_percentage_total{name=~"^syntect-server.*"}[5m])
syntect-server: container_oomkill_events_total
Container OOMKILL events total by instance
This value indicates the total number of times the container's main process or child processes were terminated by the OOM killer. When it occurs frequently, it is an indicator of underprovisioning.
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/syntect-server/syntect-server?viewPanel=100312 on your Sourcegraph instance.
Technical details
Query:
SHELLmax by (name) (container_oom_events_total{name=~"^syntect-server.*"})
Syntect Server: Kubernetes monitoring (only available on Kubernetes)
syntect-server: pods_available_percentage
Percentage pods available
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/syntect-server/syntect-server?viewPanel=100400 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by(app) (up{app=~".*syntect-server"}) / count by (app) (up{app=~".*syntect-server"}) * 100
Zoekt
Indexes repositories, populates the search index, and responds to indexed search queries.
To see this dashboard, visit /-/debug/grafana/d/zoekt/zoekt on your Sourcegraph instance.
zoekt: total_repos_aggregate
Total number of repos (aggregate)
Sudden changes can be caused by indexing configuration changes.
Additionally, a discrepancy between "index_num_assigned" and "index_queue_cap" could indicate a bug.
Legend:
- index_num_assigned: # of repos assigned to Zoekt
- index_num_indexed: # of repos Zoekt has indexed
- index_queue_cap: # of repos Zoekt is aware of, including those that it has finished indexing
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/zoekt/zoekt?viewPanel=100000 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (__name__) ({__name__=~"index_num_assigned|index_num_indexed|index_queue_cap"})
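To chart the discrepancy mentioned above directly, the two gauges can be subtracted. A minimal sketch using the same metrics, not one of the shipped panels:
SHELLsum(index_queue_cap) - sum(index_num_assigned)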
zoekt: total_repos_per_instance
Total number of repos (per instance)
Sudden changes can be caused by indexing configuration changes.
Additionally, a discrepancy between "index_num_assigned" and "index_queue_cap" could indicate a bug.
Legend:
- index_num_assigned: # of repos assigned to Zoekt
- index_num_indexed: # of repos Zoekt has indexed
- index_queue_cap: # of repos Zoekt is aware of, including those that it has finished processing
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/zoekt/zoekt?viewPanel=100001 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (__name__, instance) ({__name__=~"index_num_assigned|index_num_indexed|index_queue_cap",instance=~"${instance:regex}"})
zoekt: repos_stopped_tracking_total_aggregate
The number of repositories we stopped tracking over 5m (aggregate)
Repositories we stop tracking are soft-deleted during the next cleanup job.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/zoekt/zoekt?viewPanel=100010 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(index_num_stopped_tracking_total[5m]))
zoekt: repos_stopped_tracking_total_per_instance
The number of repositories we stopped tracking over 5m (per instance)
Repositories we stop tracking are soft-deleted during the next cleanup job.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/zoekt/zoekt?viewPanel=100011 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (instance) (increase(index_num_stopped_tracking_total{instance=~`${instance:regex}`}[5m]))
zoekt: average_resolve_revision_duration
Average resolve revision duration over 5m
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/zoekt/zoekt?viewPanel=100020 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(rate(resolve_revision_seconds_sum[5m])) / sum(rate(resolve_revision_seconds_count[5m]))
zoekt: get_index_options_error_increase
The number of repositories we failed to get indexing options over 5m
When considering whether to index a repository, we ask the frontend for that repository's index configuration. The most likely reason this would fail is a failure to resolve branch names to git SHAs.
This value can spike during deployments and similar events. Only sustained periods of errors indicate an underlying issue; when sustained, repositories will not get updated indexes.
Refer to the alerts reference for 2 alerts related to this panel.
To see this panel, visit /-/debug/grafana/d/zoekt/zoekt?viewPanel=100021 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(get_index_options_error_total[5m]))
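Because short spikes around deployments are expected, it can help to evaluate the same counter over a longer window when judging whether errors are sustained. A sketch using a 1h window:
SHELLsum(increase(get_index_options_error_total[1h]))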
Zoekt: Zoekt-indexserver (CPU, Memory)
zoekt: cpu_usage_percentage
CPU usage
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/zoekt/zoekt?viewPanel=100100 on your Sourcegraph instance.
Technical details
Query:
SHELLcadvisor_container_cpu_usage_percentage_total{name=~"^zoekt-indexserver.*"}
zoekt: memory_usage_percentage
Memory usage percentage (total)
An estimate for the active memory in use, which includes anonymous memory, file memory, and kernel memory. Some of this memory is reclaimable, so high usage does not necessarily indicate memory pressure.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/zoekt/zoekt?viewPanel=100101 on your Sourcegraph instance.
Technical details
Query:
SHELLcadvisor_container_memory_usage_percentage_total{name=~"^zoekt-indexserver.*"}
zoekt: memory_working_set_bytes
Memory usage bytes (total)
An estimate for the active memory in use in bytes, which includes anonymous memory, file memory, and kernel memory. Some of this memory is reclaimable, so high usage does not necessarily indicate memory pressure.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/zoekt/zoekt?viewPanel=100102 on your Sourcegraph instance.
Technical details
Query:
SHELLmax by (name) (container_memory_working_set_bytes{name=~"^zoekt-indexserver.*"})
zoekt: memory_rss
Memory (RSS)
The total anonymous memory in use by the application, which includes the Go stack and heap. This memory is non-reclaimable, and high usage may trigger OOM kills. Note: the metric is named RSS to match the cadvisor name, but anonymous memory is more accurate.
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/zoekt/zoekt?viewPanel=100110 on your Sourcegraph instance.
Technical details
Query:
SHELLmax(container_memory_rss{name=~"^zoekt-indexserver.*"} / container_spec_memory_limit_bytes{name=~"^zoekt-indexserver.*"}) by (name) * 100.0
zoekt: memory_total_active_file
Memory usage (active file)
This metric shows the total active file-backed memory currently in use by the application. Some of it may be reclaimable, so high usage does not necessarily indicate memory pressure.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/zoekt/zoekt?viewPanel=100111 on your Sourcegraph instance.
Technical details
Query:
SHELLmax(container_memory_total_active_file_bytes{name=~"^zoekt-indexserver.*"} / container_spec_memory_limit_bytes{name=~"^zoekt-indexserver.*"}) by (name) * 100.0
zoekt: memory_kernel_usage
Memory usage (kernel)
The kernel usage metric shows the amount of memory used by the kernel on behalf of the application. Some of it may be reclaimable, so high usage does not necessarily indicate memory pressure.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/zoekt/zoekt?viewPanel=100112 on your Sourcegraph instance.
Technical details
Query:
SHELLmax(container_memory_kernel_usage{name=~"^zoekt-indexserver.*"} / container_spec_memory_limit_bytes{name=~"^zoekt-indexserver.*"}) by (name) * 100.0
Zoekt: Zoekt-webserver (CPU, Memory)
zoekt: cpu_usage_percentage
CPU usage
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/zoekt/zoekt?viewPanel=100200 on your Sourcegraph instance.
Technical details
Query:
SHELLcadvisor_container_cpu_usage_percentage_total{name=~"^zoekt-webserver.*"}
zoekt: memory_usage_percentage
Memory usage percentage (total)
An estimate for the active memory in use, which includes anonymous memory, file memory, and kernel memory. Some of this memory is reclaimable, so high usage does not necessarily indicate memory pressure.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/zoekt/zoekt?viewPanel=100201 on your Sourcegraph instance.
Technical details
Query:
SHELLcadvisor_container_memory_usage_percentage_total{name=~"^zoekt-webserver.*"}
zoekt: memory_working_set_bytes
Memory usage bytes (total)
An estimate for the active memory in use in bytes, which includes anonymous memory, file memory, and kernel memory. Some of this memory is reclaimable, so high usage does not necessarily indicate memory pressure.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/zoekt/zoekt?viewPanel=100202 on your Sourcegraph instance.
Technical details
Query:
SHELLmax by (name) (container_memory_working_set_bytes{name=~"^zoekt-webserver.*"})
zoekt: memory_rss
Memory (RSS)
The total anonymous memory in use by the application, which includes the Go stack and heap. This memory is non-reclaimable, and high usage may trigger OOM kills. Note: the metric is named RSS to match the cadvisor name, but anonymous memory is more accurate.
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/zoekt/zoekt?viewPanel=100210 on your Sourcegraph instance.
Technical details
Query:
SHELLmax(container_memory_rss{name=~"^zoekt-webserver.*"} / container_spec_memory_limit_bytes{name=~"^zoekt-webserver.*"}) by (name) * 100.0
zoekt: memory_total_active_file
Memory usage (active file)
This metric shows the total active file-backed memory currently in use by the application. Some of it may be reclaimable, so high usage does not necessarily indicate memory pressure.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/zoekt/zoekt?viewPanel=100211 on your Sourcegraph instance.
Technical details
Query:
SHELLmax(container_memory_total_active_file_bytes{name=~"^zoekt-webserver.*"} / container_spec_memory_limit_bytes{name=~"^zoekt-webserver.*"}) by (name) * 100.0
zoekt: memory_kernel_usage
Memory usage (kernel)
The kernel usage metric shows the amount of memory used by the kernel on behalf of the application. Some of it may be reclaimable, so high usage does not necessarily indicate memory pressure.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/zoekt/zoekt?viewPanel=100212 on your Sourcegraph instance.
Technical details
Query:
SHELLmax(container_memory_kernel_usage{name=~"^zoekt-webserver.*"} / container_spec_memory_limit_bytes{name=~"^zoekt-webserver.*"}) by (name) * 100.0
Zoekt: Memory mapping metrics
zoekt: memory_map_areas_percentage_used
Process memory map areas percentage used (per instance)
Processes have a limited number of memory map areas that they can use. In Zoekt, memory map areas are mainly used for loading shards into memory for queries (via mmap). However, memory map areas are also used for loading shared libraries, etc.
See https://en.wikipedia.org/wiki/Memory-mapped_file and the related articles for more information about memory maps.
Once the memory map limit is reached, the Linux kernel will prevent the process from creating any additional memory map areas. This could cause the process to crash.
Refer to the alerts reference for 2 alerts related to this panel.
To see this panel, visit /-/debug/grafana/d/zoekt/zoekt?viewPanel=100300 on your Sourcegraph instance.
Technical details
Query:
SHELL(proc_metrics_memory_map_current_count{instance=~`${instance:regex}`} / proc_metrics_memory_map_max_limit{instance=~`${instance:regex}`}) * 100
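The same gauges can also be charted as absolute remaining headroom, which shows how many memory map areas are left before the limit is reached. A sketch using the same metrics and instance filter:
SHELLproc_metrics_memory_map_max_limit{instance=~`${instance:regex}`} - proc_metrics_memory_map_current_count{instance=~`${instance:regex}`}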
zoekt: memory_major_page_faults
Webserver page faults
The number of major page faults in a 5 minute window for Zoekt webservers. If this number increases significantly, it indicates that more searches need to load data from disk. There may not be enough memory to efficiently support the amount of repo data being searched.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/zoekt/zoekt?viewPanel=100301 on your Sourcegraph instance.
Technical details
Query:
SHELLrate(container_memory_failures_total{failure_type="pgmajfault", name=~"^zoekt-webserver.*"}[5m])
Zoekt: Search requests
zoekt: indexed_search_request_duration_p99_aggregate
99th percentile indexed search duration over 1m (aggregate)
This dashboard shows the 99th percentile of search request durations over the last minute (aggregated across all instances).
Large duration spikes can be an indicator of saturation and / or a performance regression.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/zoekt/zoekt?viewPanel=100400 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.99, sum by (le, name)(rate(zoekt_search_duration_seconds_bucket[1m])))
zoekt: indexed_search_request_duration_p90_aggregate
90th percentile indexed search duration over 1m (aggregate)
This dashboard shows the 90th percentile of search request durations over the last minute (aggregated across all instances).
Large duration spikes can be an indicator of saturation and / or a performance regression.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/zoekt/zoekt?viewPanel=100401 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.90, sum by (le, name)(rate(zoekt_search_duration_seconds_bucket[1m])))
zoekt: indexed_search_request_duration_p75_aggregate
75th percentile indexed search duration over 1m (aggregate)
This dashboard shows the 75th percentile of search request durations over the last minute (aggregated across all instances).
Large duration spikes can be an indicator of saturation and / or a performance regression.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/zoekt/zoekt?viewPanel=100402 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.75, sum by (le, name)(rate(zoekt_search_duration_seconds_bucket[1m])))
zoekt: indexed_search_request_duration_p99_by_instance
99th percentile indexed search duration over 1m (per instance)
This dashboard shows the 99th percentile of search request durations over the last minute (broken out per instance).
Large duration spikes can be an indicator of saturation and / or a performance regression.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/zoekt/zoekt?viewPanel=100410 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.99, sum by (le, instance)(rate(zoekt_search_duration_seconds_bucket{instance=~`${instance:regex}`}[1m])))
zoekt: indexed_search_request_duration_p90_by_instance
90th percentile indexed search duration over 1m (per instance)
This dashboard shows the 90th percentile of search request durations over the last minute (broken out per instance).
Large duration spikes can be an indicator of saturation and / or a performance regression.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/zoekt/zoekt?viewPanel=100411 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.90, sum by (le, instance)(rate(zoekt_search_duration_seconds_bucket{instance=~`${instance:regex}`}[1m])))
zoekt: indexed_search_request_duration_p75_by_instance
75th percentile indexed search duration over 1m (per instance)
This dashboard shows the 75th percentile of search request durations over the last minute (broken out per instance).
Large duration spikes can be an indicator of saturation and / or a performance regression.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/zoekt/zoekt?viewPanel=100412 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.75, sum by (le, instance)(rate(zoekt_search_duration_seconds_bucket{instance=~`${instance:regex}`}[1m])))
zoekt: indexed_search_num_concurrent_requests_aggregate
Amount of in-flight indexed search requests (aggregate)
This dashboard shows the current number of indexed search requests that are in-flight, aggregated across all instances.
In-flight search requests include both running and queued requests.
The number of in-flight requests can serve as a proxy for the general load that webserver instances are under.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/zoekt/zoekt?viewPanel=100420 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (name) (zoekt_search_running)
zoekt: indexed_search_num_concurrent_requests_by_instance
Amount of in-flight indexed search requests (per instance)
This dashboard shows the current number of indexed search requests that are in-flight, broken out per instance.
In-flight search requests include both running and queued requests.
The number of in-flight requests can serve as a proxy for the general load that webserver instances are under.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/zoekt/zoekt?viewPanel=100421 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (instance, name) (zoekt_search_running{instance=~`${instance:regex}`})
zoekt: indexed_search_concurrent_request_growth_rate_1m_aggregate
Rate of growth of in-flight indexed search requests over 1m (aggregate)
This dashboard shows the rate of growth of in-flight requests, aggregated across all instances.
In-flight search requests include both running and queued requests.
This metric gives a notion of how quickly the indexed-search backend is working through its request load (taking into account the request arrival rate and processing time). A sustained high rate of growth can indicate that the indexed-search backend is saturated.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/zoekt/zoekt?viewPanel=100430 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (name) (deriv(zoekt_search_running[1m]))
zoekt: indexed_search_concurrent_request_growth_rate_1m_per_instance
Rate of growth of in-flight indexed search requests over 1m (per instance)
This dashboard shows the rate of growth of in-flight requests, broken out per instance.
In-flight search requests include both running and queued requests.
This metric gives a notion of how quickly the indexed-search backend is working through its request load (taking into account the request arrival rate and processing time). A sustained high rate of growth can indicate that the indexed-search backend is saturated.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/zoekt/zoekt?viewPanel=100431 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (instance) (deriv(zoekt_search_running[1m]))
zoekt: indexed_search_request_errors
Indexed search request errors every 5m by code
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/zoekt/zoekt?viewPanel=100440 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (code)(increase(src_zoekt_request_duration_seconds_count{code!~"2.."}[5m])) / ignoring(code) group_left sum(increase(src_zoekt_request_duration_seconds_count[5m])) * 100
zoekt: zoekt_shards_sched
Current number of zoekt scheduler processes in a state
Each ongoing search request starts its life as an interactive query. If it takes too long it becomes a batch query. Between state transitions it can be queued.
If you have a high number of batch queries, it is a sign there is a large load of slow queries; alternatively, your systems are underprovisioned and normal search queries are taking too long.
For a full explanation of the states see https://github.com/sourcegraph/zoekt/blob/930cd1c28917e64c87f0ce354a0fd040877cbba1/shards/sched.go#L311-L340
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/zoekt/zoekt?viewPanel=100450 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (type, state) (zoekt_shards_sched)
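To watch specifically for a growing share of batch queries, the gauge can be filtered by type and expressed as a percentage. This is a sketch: the label value "batch" is assumed from the panel description, so check the sched.go link above for the exact label values:
SHELLsum(zoekt_shards_sched{type="batch"}) / sum(zoekt_shards_sched) * 100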
zoekt: zoekt_shards_sched_total
Rate of zoekt scheduler process state transitions in the last 5m
Each ongoing search request starts its life as an interactive query. If it takes too long it becomes a batch query. Between state transitions it can be queued.
If you have a high number of batch queries, it is a sign there is a large load of slow queries; alternatively, your systems are underprovisioned and normal search queries are taking too long.
For a full explanation of the states see https://github.com/sourcegraph/zoekt/blob/930cd1c28917e64c87f0ce354a0fd040877cbba1/shards/sched.go#L311-L340
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/zoekt/zoekt?viewPanel=100451 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (type, state) (rate(zoekt_shards_sched[5m]))
Zoekt: Git fetch durations
zoekt: 90th_percentile_successful_git_fetch_durations_5m
90th percentile successful git fetch durations over 5m
Long git fetch times can be a leading indicator of saturation.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/zoekt/zoekt?viewPanel=100500 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.90, sum by (le, name)(rate(index_fetch_seconds_bucket{success="true"}[5m])))
zoekt: 90th_percentile_failed_git_fetch_durations_5m
90th percentile failed git fetch durations over 5m
Long git fetch times can be a leading indicator of saturation.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/zoekt/zoekt?viewPanel=100501 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.90, sum by (le, name)(rate(index_fetch_seconds_bucket{success="false"}[5m])))
Zoekt: Indexing results
zoekt: repo_index_state_aggregate
Index results state count over 5m (aggregate)
This dashboard shows the outcomes of recently completed indexing jobs across all index-server instances.
A persistent failing state indicates some repositories cannot be indexed, perhaps due to size and timeouts.
Legend:
- fail -> the indexing job failed
- success -> the indexing job succeeded and the index was updated
- success_meta -> the indexing job succeeded, but only metadata was updated
- noop -> the indexing job succeeded, but we didn't need to update anything
- empty -> the indexing job succeeded, but the index was empty (i.e. the repository is empty)
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/zoekt/zoekt?viewPanel=100600 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (state) (increase(index_repo_seconds_count[5m]))
zoekt: repo_index_state_per_instance
Index results state count over 5m (per instance)
This dashboard shows the outcomes of recently completed indexing jobs, split out across each index-server instance.
(You can use the "instance" filter at the top of the page to select a particular instance.)
A persistent failing state indicates some repositories cannot be indexed, perhaps due to size and timeouts.
Legend:
- fail -> the indexing job failed
- success -> the indexing job succeeded and the index was updated
- success_meta -> the indexing job succeeded, but only metadata was updated
- noop -> the indexing job succeeded, but we didn't need to update anything
- empty -> the indexing job succeeded, but the index was empty (i.e. the repository is empty)
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/zoekt/zoekt?viewPanel=100601 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (instance, state) (increase(index_repo_seconds_count{instance=~`${instance:regex}`}[5m]))
zoekt: repo_index_success_speed_heatmap
Successful indexing durations
Latency increases can indicate bottlenecks in the indexserver.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/zoekt/zoekt?viewPanel=100610 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (le, state) (increase(index_repo_seconds_bucket{state="success"}[$__rate_interval]))
zoekt: repo_index_fail_speed_heatmap
Failed indexing durations
Failures happening after a long time indicates timeouts.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/zoekt/zoekt?viewPanel=100611 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (le, state) (increase(index_repo_seconds_bucket{state="fail"}[$__rate_interval]))
zoekt: repo_index_success_speed_p99
99th percentile successful indexing durations over 5m (aggregate)
This dashboard shows the p99 duration of successful indexing jobs aggregated across all Zoekt instances.
Latency increases can indicate bottlenecks in the indexserver.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/zoekt/zoekt?viewPanel=100620 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.99, sum by (le, name)(rate(index_repo_seconds_bucket{state="success"}[5m])))
zoekt: repo_index_success_speed_p90
90th percentile successful indexing durations over 5m (aggregate)
This dashboard shows the p90 duration of successful indexing jobs aggregated across all Zoekt instances.
Latency increases can indicate bottlenecks in the indexserver.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/zoekt/zoekt?viewPanel=100621 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.90, sum by (le, name)(rate(index_repo_seconds_bucket{state="success"}[5m])))
zoekt: repo_index_success_speed_p75
75th percentile successful indexing durations over 5m (aggregate)
This dashboard shows the p75 duration of successful indexing jobs aggregated across all Zoekt instances.
Latency increases can indicate bottlenecks in the indexserver.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/zoekt/zoekt?viewPanel=100622 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.75, sum by (le, name)(rate(index_repo_seconds_bucket{state="success"}[5m])))
zoekt: repo_index_success_speed_p99_per_instance
99th percentile successful indexing durations over 5m (per instance)
This dashboard shows the p99 duration of successful indexing jobs broken out per Zoekt instance.
Latency increases can indicate bottlenecks in the indexserver.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/zoekt/zoekt?viewPanel=100630 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.99, sum by (le, instance)(rate(index_repo_seconds_bucket{state="success",instance=~`${instance:regex}`}[5m])))
zoekt: repo_index_success_speed_p90_per_instance
90th percentile successful indexing durations over 5m (per instance)
This dashboard shows the p90 duration of successful indexing jobs broken out per Zoekt instance.
Latency increases can indicate bottlenecks in the indexserver.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/zoekt/zoekt?viewPanel=100631 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.90, sum by (le, instance)(rate(index_repo_seconds_bucket{state="success",instance=~`${instance:regex}`}[5m])))
zoekt: repo_index_success_speed_p75_per_instance
75th percentile successful indexing durations over 5m (per instance)
This dashboard shows the p75 duration of successful indexing jobs broken out per Zoekt instance.
Latency increases can indicate bottlenecks in the indexserver.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/zoekt/zoekt?viewPanel=100632 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.75, sum by (le, instance)(rate(index_repo_seconds_bucket{state="success",instance=~`${instance:regex}`}[5m])))
zoekt: repo_index_failed_speed_p99
99th percentile failed indexing durations over 5m (aggregate)
This dashboard shows the p99 duration of failed indexing jobs aggregated across all Zoekt instances.
Failures happening after a long time indicate timeouts.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/zoekt/zoekt?viewPanel=100640 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.99, sum by (le, name)(rate(index_repo_seconds_bucket{state="fail"}[5m])))
zoekt: repo_index_failed_speed_p90
90th percentile failed indexing durations over 5m (aggregate)
This dashboard shows the p90 duration of failed indexing jobs aggregated across all Zoekt instances.
Failures happening after a long time indicate timeouts.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/zoekt/zoekt?viewPanel=100641 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.90, sum by (le, name)(rate(index_repo_seconds_bucket{state="fail"}[5m])))
zoekt: repo_index_failed_speed_p75
75th percentile failed indexing durations over 5m (aggregate)
This dashboard shows the p75 duration of failed indexing jobs aggregated across all Zoekt instances.
Failures happening after a long time indicate timeouts.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/zoekt/zoekt?viewPanel=100642 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.75, sum by (le, name)(rate(index_repo_seconds_bucket{state="fail"}[5m])))
zoekt: repo_index_failed_speed_p99_per_instance
99th percentile failed indexing durations over 5m (per instance)
This dashboard shows the p99 duration of failed indexing jobs broken out per Zoekt instance.
Failures happening after a long time indicate timeouts.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/zoekt/zoekt?viewPanel=100650 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.99, sum by (le, instance)(rate(index_repo_seconds_bucket{state="fail",instance=~`${instance:regex}`}[5m])))
zoekt: repo_index_failed_speed_p90_per_instance
90th percentile failed indexing durations over 5m (per instance)
This dashboard shows the p90 duration of failed indexing jobs broken out per Zoekt instance.
Failures happening after a long time indicate timeouts.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/zoekt/zoekt?viewPanel=100651 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.90, sum by (le, instance)(rate(index_repo_seconds_bucket{state="fail",instance=~`${instance:regex}`}[5m])))
zoekt: repo_index_failed_speed_p75_per_instance
75th percentile failed indexing durations over 5m (per instance)
This dashboard shows the p75 duration of failed indexing jobs broken out per Zoekt instance.
Failures happening after a long time indicate timeouts.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/zoekt/zoekt?viewPanel=100652 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.75, sum by (le, instance)(rate(index_repo_seconds_bucket{state="fail",instance=~`${instance:regex}`}[5m])))
Zoekt: Indexing queue statistics
zoekt: indexed_num_scheduled_jobs_aggregate
# scheduled index jobs (aggregate)
A queue that is constantly growing could be a leading indicator of a bottleneck or under-provisioning.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/zoekt/zoekt?viewPanel=100700 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(index_queue_len)
zoekt: indexed_num_scheduled_jobs_per_instance
# scheduled index jobs (per instance)
A queue that is constantly growing could be a leading indicator of a bottleneck or under-provisioning.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/zoekt/zoekt?viewPanel=100701 on your Sourcegraph instance.
Technical details
Query:
SHELLindex_queue_len{instance=~`${instance:regex}`}
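To detect a queue that is constantly growing (rather than merely large), you can look at the slope of the aggregate queue length. This is an illustrative sketch rather than a shipped panel; it assumes Prometheus subquery support (2.7+), and the 30m window is arbitrary:
SHELLderiv(sum(index_queue_len)[30m:1m]) > 0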
zoekt: indexed_indexing_delay_heatmap
Repo indexing delay heatmap
The indexing delay represents the amount of time from when Zoekt received a repo indexing job to when the repo was indexed. It includes the time the repo spent in the indexing queue, as well as the time it took to actually index the repo. This metric only includes successfully indexed repos.
Large indexing delays can be an indicator of:
- resource saturation
- each Zoekt replica has too many jobs for it to be able to process all of them promptly. In this scenario, consider adding additional Zoekt replicas to distribute the work better.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/zoekt/zoekt?viewPanel=100710 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (le) (increase(index_indexing_delay_seconds_bucket{state=~"success|success_meta"}[$__rate_interval]))
zoekt: indexed_indexing_delay_p90_aggregate
90th percentile indexing delay over 5m (aggregate)
This dashboard shows the p90 indexing delay aggregated across all Zoekt instances.
The indexing delay represents the amount of time from when Zoekt received a repo indexing job to when the repo was indexed. It includes the time the repo spent in the indexing queue, as well as the time it took to actually index the repo. This metric only includes successfully indexed repos.
Large indexing delays can be an indicator of:
- resource saturation
- each Zoekt replica has too many jobs for it to be able to process all of them promptly. In this scenario, consider adding additional Zoekt replicas to distribute the work better.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/zoekt/zoekt?viewPanel=100720 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.90, sum by (le, name)(rate(index_indexing_delay_seconds_bucket{state=~"success|success_meta"}[5m])))
zoekt: indexed_indexing_delay_p50_aggregate
50th percentile indexing delay over 5m (aggregate)
This dashboard shows the p50 indexing delay aggregated across all Zoekt instances.
The indexing delay represents the amount of time from when Zoekt received a repo indexing job to when the repo was indexed. It includes the time the repo spent in the indexing queue, as well as the time it took to actually index the repo. This metric only includes successfully indexed repos.
Large indexing delays can be an indicator of:
- resource saturation
- each Zoekt replica has too many jobs for it to be able to process all of them promptly. In this scenario, consider adding additional Zoekt replicas to distribute the work better.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/zoekt/zoekt?viewPanel=100721 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.50, sum by (le, name)(rate(index_indexing_delay_seconds_bucket{state=~"success|success_meta"}[5m])))
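If you need a percentile other than p90 or p50, the same histogram supports it. For example, a p99 variant of the aggregate query (illustrative, not a shipped panel):
SHELLhistogram_quantile(0.99, sum by (le)(rate(index_indexing_delay_seconds_bucket{state=~"success|success_meta"}[5m])))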
zoekt: indexed_indexing_delay_p90_per_instance
90th percentile indexing delay over 5m (per instance)
This dashboard shows the p90 indexing delay, broken out per Zoekt instance.
The indexing delay represents the amount of time from when Zoekt received a repo indexing job to when the repo was indexed. It includes the time the repo spent in the indexing queue, as well as the time it took to actually index the repo.
Large indexing delays can be an indicator of:
- resource saturation
- each Zoekt replica has too many jobs for it to be able to process all of them promptly. In this scenario, consider adding additional Zoekt replicas to distribute the work better.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/zoekt/zoekt?viewPanel=100730 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.90, sum by (le, instance)(rate(index_indexing_delay_seconds{instance=~`${instance:regex}`}[5m])))
zoekt: indexed_indexing_delay_p50_per_instance
50th percentile indexing delay over 5m (per instance)
This dashboard shows the p50 indexing delay, broken out per Zoekt instance.
The indexing delay represents the amount of time from when Zoekt received a repo indexing job to when the repo was indexed. It includes the time the repo spent in the indexing queue, as well as the time it took to actually index the repo.
Large indexing delays can be an indicator of:
- resource saturation
- each Zoekt replica has too many jobs for it to be able to process all of them promptly. In this scenario, consider adding additional Zoekt replicas to distribute the work better.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/zoekt/zoekt?viewPanel=100731 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.50, sum by (le, instance)(rate(index_indexing_delay_seconds{instance=~`${instance:regex}`}[5m])))
Zoekt: Compound shards
zoekt: compound_shards_aggregate
# of compound shards (aggregate)
The total number of compound shards aggregated over all instances.
This number should be consistent if the number of indexed repositories doesn't change.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/zoekt/zoekt?viewPanel=100800 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(index_number_compound_shards) by (app)
zoekt: compound_shards_per_instance
# of compound shards (per instance)
The total number of compound shards per instance.
This number should be consistent if the number of indexed repositories doesn't change.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/zoekt/zoekt?viewPanel=100801 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(index_number_compound_shards{instance=~`${instance:regex}`}) by (instance)
zoekt: average_shard_merging_duration_success
Average successful shard merging duration over 1 hour
Average duration of a successful merge over the last hour.
The duration depends on the target compound shard size. The larger the compound shard, the longer a merge will take. Since the target compound shard size is set when zoekt-indexserver starts, the average duration should be consistent.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/zoekt/zoekt?viewPanel=100810 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(rate(index_shard_merging_duration_seconds_sum{error="false"}[1h])) / sum(rate(index_shard_merging_duration_seconds_count{error="false"}[1h]))
zoekt: average_shard_merging_duration_error
Average failed shard merging duration over 1 hour
Average duration of a failed merge over the last hour.
This curve should be flat. Any deviation should be investigated.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/zoekt/zoekt?viewPanel=100811 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(rate(index_shard_merging_duration_seconds_sum{error="true"}[1h])) / sum(rate(index_shard_merging_duration_seconds_count{error="true"}[1h]))
zoekt: shard_merging_errors_aggregate
Number of errors during shard merging (aggregate)
Number of errors during shard merging aggregated over all instances.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/zoekt/zoekt?viewPanel=100820 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(index_shard_merging_duration_seconds_count{error="true"}) by (app)
zoekt: shard_merging_errors_per_instance
Number of errors during shard merging (per instance)
Number of errors during shard merging per instance.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/zoekt/zoekt?viewPanel=100821 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(index_shard_merging_duration_seconds_count{instance=~`${instance:regex}`, error="true"}) by (instance)
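Because these panels plot the raw (cumulative) error counter, new errors can be easier to spot as a windowed delta. An illustrative variant that shows merge errors per instance over the last hour:
SHELLsum by (instance) (increase(index_shard_merging_duration_seconds_count{error="true"}[1h]))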
zoekt: shard_merging_merge_running_per_instance
If shard merging is running (per instance)
Set to 1 if shard merging is running.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/zoekt/zoekt?viewPanel=100830 on your Sourcegraph instance.
Technical details
Query:
SHELLmax by (instance) (index_shard_merging_running{instance=~`${instance:regex}`})
zoekt: shard_merging_vacuum_running_per_instance
If vacuum is running (per instance)
Set to 1 if vacuum is running.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/zoekt/zoekt?viewPanel=100831 on your Sourcegraph instance.
Technical details
Query:
SHELLmax by (instance) (index_vacuum_running{instance=~`${instance:regex}`})
Zoekt: Network I/O pod metrics (only available on Kubernetes)
zoekt: network_sent_bytes_aggregate
Transmission rate over 5m (aggregate)
The rate of bytes sent over the network across all pods
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/zoekt/zoekt?viewPanel=100900 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(rate(container_network_transmit_bytes_total{container_label_io_kubernetes_pod_name=~`.*indexed-search.*`}[5m]))
zoekt: network_received_packets_per_instance
Transmission rate over 5m (per instance)
The rate of bytes sent over the network by individual pods
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/zoekt/zoekt?viewPanel=100901 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (container_label_io_kubernetes_pod_name) (rate(container_network_transmit_bytes_total{container_label_io_kubernetes_pod_name=~`${instance:regex}`}[5m]))
zoekt: network_received_bytes_aggregate
Receive rate over 5m (aggregate)
The rate of bytes received from the network across all pods
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/zoekt/zoekt?viewPanel=100910 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(rate(container_network_receive_bytes_total{container_label_io_kubernetes_pod_name=~`.*indexed-search.*`}[5m]))
zoekt: network_received_bytes_per_instance
Receive rate over 5m (per instance)
The rate of bytes received from the network by individual pods
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/zoekt/zoekt?viewPanel=100911 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (container_label_io_kubernetes_pod_name) (rate(container_network_receive_bytes_total{container_label_io_kubernetes_pod_name=~`${instance:regex}`}[5m]))
zoekt: network_transmitted_packets_dropped_by_instance
Transmit packet drop rate over 5m (by instance)
An increase in dropped packets could be a leading indicator of network saturation.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/zoekt/zoekt?viewPanel=100920 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (container_label_io_kubernetes_pod_name) (rate(container_network_transmit_packets_dropped_total{container_label_io_kubernetes_pod_name=~`${instance:regex}`}[5m]))
zoekt: network_transmitted_packets_errors_per_instance
Errors encountered while transmitting over 5m (per instance)
An increase in transmission errors could indicate a networking issue
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/zoekt/zoekt?viewPanel=100921 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (container_label_io_kubernetes_pod_name) (rate(container_network_transmit_errors_total{container_label_io_kubernetes_pod_name=~`${instance:regex}`}[5m]))
zoekt: network_received_packets_dropped_by_instance
Receive packet drop rate over 5m (by instance)
An increase in dropped packets could be a leading indicator of network saturation.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/zoekt/zoekt?viewPanel=100922 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (container_label_io_kubernetes_pod_name) (rate(container_network_receive_packets_dropped_total{container_label_io_kubernetes_pod_name=~`${instance:regex}`}[5m]))
zoekt: network_transmitted_packets_errors_by_instance
Errors encountered while receiving over 5m (per instance)
An increase in errors while receiving could indicate a networking issue.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/zoekt/zoekt?viewPanel=100923 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (container_label_io_kubernetes_pod_name) (rate(container_network_receive_errors_total{container_label_io_kubernetes_pod_name=~`${instance:regex}`}[5m]))
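If you want a single expression that surfaces any pod currently dropping received packets (for example, as the basis for a custom alert), a sketch along these lines works; the comparison against zero is illustrative and may need a tolerance in noisy environments:
SHELLsum by (container_label_io_kubernetes_pod_name) (rate(container_network_receive_packets_dropped_total{container_label_io_kubernetes_pod_name=~`.*indexed-search.*`}[5m])) > 0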
Zoekt: Zoekt Webserver GRPC server metrics
zoekt: zoekt_webserver_grpc_request_rate_all_methods
Request rate across all methods over 2m
The number of gRPC requests received per second across all methods, aggregated across all instances.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/zoekt/zoekt?viewPanel=101000 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(rate(grpc_server_started_total{instance=~`${webserver_instance:regex}`,grpc_service=~"zoekt.webserver.v1.WebserverService"}[2m]))
zoekt: zoekt_webserver_grpc_request_rate_per_method
Request rate per-method over 2m
The number of gRPC requests received per second broken out per method, aggregated across all instances.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/zoekt/zoekt?viewPanel=101001 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(rate(grpc_server_started_total{grpc_method=~`${zoekt_webserver_method:regex}`,instance=~`${webserver_instance:regex}`,grpc_service=~"zoekt.webserver.v1.WebserverService"}[2m])) by (grpc_method)
zoekt: zoekt_webserver_error_percentage_all_methods
Error percentage across all methods over 2m
The percentage of gRPC requests that fail across all methods, aggregated across all instances.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/zoekt/zoekt?viewPanel=101010 on your Sourcegraph instance.
Technical details
Query:
SHELL(100.0 * ( (sum(rate(grpc_server_handled_total{grpc_code!="OK",instance=~`${webserver_instance:regex}`,grpc_service=~"zoekt.webserver.v1.WebserverService"}[2m]))) / (sum(rate(grpc_server_handled_total{instance=~`${webserver_instance:regex}`,grpc_service=~"zoekt.webserver.v1.WebserverService"}[2m]))) ))
zoekt: zoekt_webserver_grpc_error_percentage_per_method
Error percentage per-method over 2m
The percentage of gRPC requests that fail per method, aggregated across all instances.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/zoekt/zoekt?viewPanel=101011 on your Sourcegraph instance.
Technical details
Query:
SHELL(100.0 * ( (sum(rate(grpc_server_handled_total{grpc_method=~`${zoekt_webserver_method:regex}`,grpc_code!="OK",instance=~`${webserver_instance:regex}`,grpc_service=~"zoekt.webserver.v1.WebserverService"}[2m])) by (grpc_method)) / (sum(rate(grpc_server_handled_total{grpc_method=~`${zoekt_webserver_method:regex}`,instance=~`${webserver_instance:regex}`,grpc_service=~"zoekt.webserver.v1.WebserverService"}[2m])) by (grpc_method)) ))
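If client-cancelled requests dominate this percentage, you may prefer a variant that ignores them. The sketch below assumes the standard grpc-go status label value Canceled; whether excluding it is appropriate depends on your workload, and this is not a shipped panel:
SHELL(100.0 * ( (sum(rate(grpc_server_handled_total{grpc_method=~`${zoekt_webserver_method:regex}`,grpc_code!~"OK|Canceled",instance=~`${webserver_instance:regex}`,grpc_service=~"zoekt.webserver.v1.WebserverService"}[2m])) by (grpc_method)) / (sum(rate(grpc_server_handled_total{grpc_method=~`${zoekt_webserver_method:regex}`,instance=~`${webserver_instance:regex}`,grpc_service=~"zoekt.webserver.v1.WebserverService"}[2m])) by (grpc_method)) ))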
zoekt: zoekt_webserver_p99_response_time_per_method
99th percentile response time per method over 2m
The 99th percentile response time per method, aggregated across all instances.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/zoekt/zoekt?viewPanel=101020 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.99, sum by (le, name, grpc_method)(rate(grpc_server_handling_seconds_bucket{grpc_method=~`${zoekt_webserver_method:regex}`,instance=~`${webserver_instance:regex}`,grpc_service=~"zoekt.webserver.v1.WebserverService"}[2m])))
zoekt: zoekt_webserver_p90_response_time_per_method
90th percentile response time per method over 2m
The 90th percentile response time per method, aggregated across all instances.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/zoekt/zoekt?viewPanel=101021 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.90, sum by (le, name, grpc_method)(rate(grpc_server_handling_seconds_bucket{grpc_method=~`${zoekt_webserver_method:regex}`,instance=~`${webserver_instance:regex}`,grpc_service=~"zoekt.webserver.v1.WebserverService"}[2m])))
zoekt: zoekt_webserver_p75_response_time_per_method
75th percentile response time per method over 2m
The 75th percentile response time per method, aggregated across all instances.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/zoekt/zoekt?viewPanel=101022 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.75, sum by (le, name, grpc_method)(rate(grpc_server_handling_seconds_bucket{grpc_method=~`${zoekt_webserver_method:regex}`,instance=~`${webserver_instance:regex}`,grpc_service=~"zoekt.webserver.v1.WebserverService"}[2m])))
zoekt: zoekt_webserver_p99_9_response_size_per_method
99.9th percentile total response size per method over 2m
The 99.9th percentile total per-RPC response size per method, aggregated across all instances.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/zoekt/zoekt?viewPanel=101030 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.999, sum by (le, name, grpc_method)(rate(grpc_server_sent_bytes_per_rpc_bucket{grpc_method=~`${zoekt_webserver_method:regex}`,instance=~`${webserver_instance:regex}`,grpc_service=~"zoekt.webserver.v1.WebserverService"}[2m])))
zoekt: zoekt_webserver_p90_response_size_per_method
90th percentile total response size per method over 2m
The 90th percentile total per-RPC response size per method, aggregated across all instances.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/zoekt/zoekt?viewPanel=101031 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.90, sum by (le, name, grpc_method)(rate(grpc_server_sent_bytes_per_rpc_bucket{grpc_method=~`${zoekt_webserver_method:regex}`,instance=~`${webserver_instance:regex}`,grpc_service=~"zoekt.webserver.v1.WebserverService"}[2m])))
zoekt: zoekt_webserver_p75_response_size_per_method
75th percentile total response size per method over 2m
The 75th percentile total per-RPC response size per method, aggregated across all instances.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/zoekt/zoekt?viewPanel=101032 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.75, sum by (le, name, grpc_method)(rate(grpc_server_sent_bytes_per_rpc_bucket{grpc_method=~`${zoekt_webserver_method:regex}`,instance=~`${webserver_instance:regex}`,grpc_service=~"zoekt.webserver.v1.WebserverService"}[2m])))
zoekt: zoekt_webserver_p99_9_invididual_sent_message_size_per_method
99.9th percentile individual sent message size per method over 2m
The 99.9th percentile size of every individual protocol buffer message sent by the service per method, aggregated across all instances.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/zoekt/zoekt?viewPanel=101040 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.999, sum by (le, name, grpc_method)(rate(grpc_server_sent_individual_message_size_bytes_per_rpc_bucket{grpc_method=~`${zoekt_webserver_method:regex}`,instance=~`${webserver_instance:regex}`,grpc_service=~"zoekt.webserver.v1.WebserverService"}[2m])))
zoekt: zoekt_webserver_p90_invididual_sent_message_size_per_method
90th percentile individual sent message size per method over 2m
The 90th percentile size of every individual protocol buffer message sent by the service per method, aggregated across all instances.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/zoekt/zoekt?viewPanel=101041 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.90, sum by (le, name, grpc_method)(rate(grpc_server_sent_individual_message_size_bytes_per_rpc_bucket{grpc_method=~`${zoekt_webserver_method:regex}`,instance=~`${webserver_instance:regex}`,grpc_service=~"zoekt.webserver.v1.WebserverService"}[2m])))
zoekt: zoekt_webserver_p75_invididual_sent_message_size_per_method
75th percentile individual sent message size per method over 2m
The 75th percentile size of every individual protocol buffer message sent by the service per method, aggregated across all instances.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/zoekt/zoekt?viewPanel=101042 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.75, sum by (le, name, grpc_method)(rate(grpc_server_sent_individual_message_size_bytes_per_rpc_bucket{grpc_method=~`${zoekt_webserver_method:regex}`,instance=~`${webserver_instance:regex}`,grpc_service=~"zoekt.webserver.v1.WebserverService"}[2m])))
zoekt: zoekt_webserver_grpc_response_stream_message_count_per_method
Average streaming response message count per-method over 2m
The average number of response messages sent during a streaming RPC method, broken out per method, aggregated across all instances.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/zoekt/zoekt?viewPanel=101050 on your Sourcegraph instance.
Technical details
Query:
SHELL((sum(rate(grpc_server_msg_sent_total{grpc_type="server_stream",instance=~`${webserver_instance:regex}`,grpc_service=~"zoekt.webserver.v1.WebserverService"}[2m])) by (grpc_method))/(sum(rate(grpc_server_started_total{grpc_type="server_stream",instance=~`${webserver_instance:regex}`,grpc_service=~"zoekt.webserver.v1.WebserverService"}[2m])) by (grpc_method)))
zoekt: zoekt_webserver_grpc_all_codes_per_method
Response codes rate per-method over 2m
The rate of all generated gRPC response codes per method, aggregated across all instances.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/zoekt/zoekt?viewPanel=101060 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(rate(grpc_server_handled_total{grpc_method=~`${zoekt_webserver_method:regex}`,instance=~`${webserver_instance:regex}`,grpc_service=~"zoekt.webserver.v1.WebserverService"}[2m])) by (grpc_method, grpc_code)
Zoekt: Zoekt Webserver GRPC "internal error" metrics
zoekt: zoekt_webserver_grpc_clients_error_percentage_all_methods
Client baseline error percentage across all methods over 2m
The percentage of gRPC requests that fail across all methods (regardless of whether or not there was an internal error), aggregated across all "zoekt_webserver" clients.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/zoekt/zoekt?viewPanel=101100 on your Sourcegraph instance.
Technical details
Query:
SHELL(100.0 * ((((sum(rate(src_grpc_method_status{grpc_service=~"zoekt.webserver.v1.WebserverService",grpc_code!="OK"}[2m])))) / ((sum(rate(src_grpc_method_status{grpc_service=~"zoekt.webserver.v1.WebserverService"}[2m])))))))
zoekt: zoekt_webserver_grpc_clients_error_percentage_per_method
Client baseline error percentage per-method over 2m
The percentage of gRPC requests that fail per method (regardless of whether or not there was an internal error), aggregated across all "zoekt_webserver" clients.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/zoekt/zoekt?viewPanel=101101 on your Sourcegraph instance.
Technical details
Query:
SHELL(100.0 * ((((sum(rate(src_grpc_method_status{grpc_service=~"zoekt.webserver.v1.WebserverService",grpc_method=~"${zoekt_webserver_method:regex}",grpc_code!="OK"}[2m])) by (grpc_method))) / ((sum(rate(src_grpc_method_status{grpc_service=~"zoekt.webserver.v1.WebserverService",grpc_method=~"${zoekt_webserver_method:regex}"}[2m])) by (grpc_method))))))
zoekt: zoekt_webserver_grpc_clients_all_codes_per_method
Client baseline response codes rate per-method over 2m
The rate of all generated gRPC response codes per method (regardless of whether or not there was an internal error), aggregated across all "zoekt_webserver" clients.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/zoekt/zoekt?viewPanel=101102 on your Sourcegraph instance.
Technical details
Query:
SHELL(sum(rate(src_grpc_method_status{grpc_service=~"zoekt.webserver.v1.WebserverService",grpc_method=~"${zoekt_webserver_method:regex}"}[2m])) by (grpc_method, grpc_code))
zoekt: zoekt_webserver_grpc_clients_internal_error_percentage_all_methods
Client-observed gRPC internal error percentage across all methods over 2m
The percentage of gRPC requests that appear to fail due to gRPC internal errors across all methods, aggregated across all "zoekt_webserver" clients.
Note: Internal errors are ones that appear to originate from the https://github.com/grpc/grpc-go library itself, rather than from any user-written application code. These errors can be caused by a variety of issues, and can originate from either the code-generated "zoekt_webserver" gRPC client or gRPC server. These errors might be solvable by adjusting the gRPC configuration, or they might indicate a bug from Sourcegraph's use of gRPC.
When debugging, knowing that a particular error comes from the grpc-go library itself (an internal error) as opposed to normal application code can be helpful when trying to fix it.
Note: Internal errors are detected via a very coarse heuristic (seeing if the error starts with grpc:, etc.). Because of this, it's possible that some gRPC-specific issues might not be categorized as internal errors.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/zoekt/zoekt?viewPanel=101110 on your Sourcegraph instance.
Technical details
Query:
SHELL(100.0 * ((((sum(rate(src_grpc_method_status{grpc_service=~"zoekt.webserver.v1.WebserverService",grpc_code!="OK",is_internal_error="true"}[2m])))) / ((sum(rate(src_grpc_method_status{grpc_service=~"zoekt.webserver.v1.WebserverService"}[2m])))))))
zoekt: zoekt_webserver_grpc_clients_internal_error_percentage_per_method
Client-observed gRPC internal error percentage per-method over 2m
The percentage of gRPC requests that appear to fail due to gRPC internal errors per method, aggregated across all "zoekt_webserver" clients.
Note: Internal errors are ones that appear to originate from the https://github.com/grpc/grpc-go library itself, rather than from any user-written application code. These errors can be caused by a variety of issues, and can originate from either the code-generated "zoekt_webserver" gRPC client or gRPC server. These errors might be solvable by adjusting the gRPC configuration, or they might indicate a bug from Sourcegraph's use of gRPC.
When debugging, knowing that a particular error comes from the grpc-go library itself (an internal error) as opposed to normal application code can be helpful when trying to fix it.
Note: Internal errors are detected via a very coarse heuristic (seeing if the error starts with grpc:, etc.). Because of this, it's possible that some gRPC-specific issues might not be categorized as internal errors.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/zoekt/zoekt?viewPanel=101111 on your Sourcegraph instance.
Technical details
Query:
SHELL(100.0 * ((((sum(rate(src_grpc_method_status{grpc_service=~"zoekt.webserver.v1.WebserverService",grpc_method=~"${zoekt_webserver_method:regex}",grpc_code!="OK",is_internal_error="true"}[2m])) by (grpc_method))) / ((sum(rate(src_grpc_method_status{grpc_service=~"zoekt.webserver.v1.WebserverService",grpc_method=~"${zoekt_webserver_method:regex}"}[2m])) by (grpc_method))))))
zoekt: zoekt_webserver_grpc_clients_internal_error_all_codes_per_method
Client-observed gRPC internal error response code rate per-method over 2m
The rate of gRPC internal-error response codes per method, aggregated across all "zoekt_webserver" clients.
Note: Internal errors are ones that appear to originate from the https://github.com/grpc/grpc-go library itself, rather than from any user-written application code. These errors can be caused by a variety of issues, and can originate from either the code-generated "zoekt_webserver" gRPC client or gRPC server. These errors might be solvable by adjusting the gRPC configuration, or they might indicate a bug from Sourcegraph's use of gRPC.
When debugging, knowing that a particular error comes from the grpc-go library itself (an internal error) as opposed to normal application code can be helpful when trying to fix it.
Note: Internal errors are detected via a very coarse heuristic (seeing if the error starts with grpc:, etc.). Because of this, it's possible that some gRPC-specific issues might not be categorized as internal errors.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/zoekt/zoekt?viewPanel=101112 on your Sourcegraph instance.
Technical details
Query:
SHELL(sum(rate(src_grpc_method_status{grpc_service=~"zoekt.webserver.v1.WebserverService",is_internal_error="true",grpc_method=~"${zoekt_webserver_method:regex}"}[2m])) by (grpc_method, grpc_code))
Zoekt: Zoekt Webserver GRPC retry metrics
zoekt: zoekt_webserver_grpc_clients_retry_percentage_across_all_methods
Client retry percentage across all methods over 2m
The percentage of gRPC requests that were retried across all methods, aggregated across all "zoekt_webserver" clients.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/zoekt/zoekt?viewPanel=101200 on your Sourcegraph instance.
Technical details
Query:
SHELL(100.0 * ((((sum(rate(src_grpc_client_retry_attempts_total{grpc_service=~"zoekt.webserver.v1.WebserverService",is_retried="true"}[2m])))) / ((sum(rate(src_grpc_client_retry_attempts_total{grpc_service=~"zoekt.webserver.v1.WebserverService"}[2m])))))))
zoekt: zoekt_webserver_grpc_clients_retry_percentage_per_method
Client retry percentage per-method over 2m
The percentage of gRPC requests that were retried aggregated across all "zoekt_webserver" clients, broken out per method.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/zoekt/zoekt?viewPanel=101201 on your Sourcegraph instance.
Technical details
Query:
SHELL(100.0 * ((((sum(rate(src_grpc_client_retry_attempts_total{grpc_service=~"zoekt.webserver.v1.WebserverService",is_retried="true",grpc_method=~"${zoekt_webserver_method:regex}"}[2m])) by (grpc_method))) / ((sum(rate(src_grpc_client_retry_attempts_total{grpc_service=~"zoekt.webserver.v1.WebserverService",grpc_method=~"${zoekt_webserver_method:regex}"}[2m])) by (grpc_method))))))
zoekt: zoekt_webserver_grpc_clients_retry_count_per_method
Client retry count per-method over 2m
The count of gRPC requests that were retried aggregated across all "zoekt_webserver" clients, broken out per method.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/zoekt/zoekt?viewPanel=101202 on your Sourcegraph instance.
Technical details
Query:
SHELL(sum(rate(src_grpc_client_retry_attempts_total{grpc_service=~"zoekt.webserver.v1.WebserverService",grpc_method=~"${zoekt_webserver_method:regex}",is_retried="true"}[2m])) by (grpc_method))
Zoekt: Data disk I/O metrics
zoekt: data_disk_reads_sec
Read request rate over 1m (per instance)
The number of read requests that were issued to the device per second.
Note: Disk statistics are per device, not per service. In certain environments (such as common docker-compose setups), zoekt could be one of many services using this disk. These statistics are best interpreted as the load experienced by the device zoekt is using, not the load zoekt is solely responsible for causing.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/zoekt/zoekt?viewPanel=101300 on your Sourcegraph instance.
Technical details
Query:
SHELL(max by (instance) (zoekt_indexserver_mount_point_info{mount_name="indexDir",instance=~`${instance:regex}`} * on (device, nodename) group_left() (max by (device, nodename) (rate(node_disk_reads_completed_total{instance=~`node-exporter.*`}[1m])))))
zoekt: data_disk_writes_sec
Write request rate over 1m (per instance)
The number of write requests that were issued to the device per second.
Note: Disk statistics are per device, not per service. In certain environments (such as common docker-compose setups), zoekt could be one of many services using this disk. These statistics are best interpreted as the load experienced by the device zoekt is using, not the load zoekt is solely responsible for causing.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/zoekt/zoekt?viewPanel=101301 on your Sourcegraph instance.
Technical details
Query:
SHELL(max by (instance) (zoekt_indexserver_mount_point_info{mount_name="indexDir",instance=~`${instance:regex}`} * on (device, nodename) group_left() (max by (device, nodename) (rate(node_disk_writes_completed_total{instance=~`node-exporter.*`}[1m])))))
zoekt: data_disk_read_throughput
Read throughput over 1m (per instance)
The amount of data that was read from the device per second.
Note: Disk statistics are per device, not per service. In certain environments (such as common docker-compose setups), zoekt could be one of many services using this disk. These statistics are best interpreted as the load experienced by the device zoekt is using, not the load zoekt is solely responsible for causing.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/zoekt/zoekt?viewPanel=101310 on your Sourcegraph instance.
Technical details
Query:
SHELL(max by (instance) (zoekt_indexserver_mount_point_info{mount_name="indexDir",instance=~`${instance:regex}`} * on (device, nodename) group_left() (max by (device, nodename) (rate(node_disk_read_bytes_total{instance=~`node-exporter.*`}[1m])))))
zoekt: data_disk_write_throughput
Write throughput over 1m (per instance)
The amount of data that was written to the device per second.
Note: Disk statistics are per device, not per service. In certain environments (such as common docker-compose setups), zoekt could be one of many services using this disk. These statistics are best interpreted as the load experienced by the device zoekt is using, not the load zoekt is solely responsible for causing.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/zoekt/zoekt?viewPanel=101311 on your Sourcegraph instance.
Technical details
Query:
SHELL(max by (instance) (zoekt_indexserver_mount_point_info{mount_name="indexDir",instance=~`${instance:regex}`} * on (device, nodename) group_left() (max by (device, nodename) (rate(node_disk_written_bytes_total{instance=~`node-exporter.*`}[1m])))))
zoekt: data_disk_read_duration
Average read duration over 1m (per instance)
The average time for read requests issued to the device to be served. This includes the time spent by the requests in queue and the time spent servicing them.
Note: Disk statistics are per device, not per service. In certain environments (such as common docker-compose setups), zoekt could be one of many services using this disk. These statistics are best interpreted as the load experienced by the device zoekt is using, not the load zoekt is solely responsible for causing.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/zoekt/zoekt?viewPanel=101320 on your Sourcegraph instance.
Technical details
Query:
SHELL(((max by (instance) (zoekt_indexserver_mount_point_info{mount_name="indexDir",instance=~`${instance:regex}`} * on (device, nodename) group_left() (max by (device, nodename) (rate(node_disk_read_time_seconds_total{instance=~`node-exporter.*`}[1m])))))) / ((max by (instance) (zoekt_indexserver_mount_point_info{mount_name="indexDir",instance=~`${instance:regex}`} * on (device, nodename) group_left() (max by (device, nodename) (rate(node_disk_reads_completed_total{instance=~`node-exporter.*`}[1m])))))))
zoekt: data_disk_write_duration
Average write duration over 1m (per instance)
The average time for write requests issued to the device to be served. This includes the time spent by the requests in queue and the time spent servicing them.
Note: Disk statistics are per device, not per service. In certain environments (such as common docker-compose setups), zoekt could be one of many services using this disk. These statistics are best interpreted as the load experienced by the device zoekt is using, not the load zoekt is solely responsible for causing.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/zoekt/zoekt?viewPanel=101321 on your Sourcegraph instance.
Technical details
Query:
SHELL(((max by (instance) (zoekt_indexserver_mount_point_info{mount_name="indexDir",instance=~`${instance:regex}`} * on (device, nodename) group_left() (max by (device, nodename) (rate(node_disk_write_time_seconds_total{instance=~`node-exporter.*`}[1m])))))) / ((max by (instance) (zoekt_indexserver_mount_point_info{mount_name="indexDir",instance=~`${instance:regex}`} * on (device, nodename) group_left() (max by (device, nodename) (rate(node_disk_writes_completed_total{instance=~`node-exporter.*`}[1m])))))))
zoekt: data_disk_read_request_size
Average read request size over 1m (per instance)
The average size of read requests that were issued to the device.
Note: Disk statistics are per device, not per service. In certain environments (such as common docker-compose setups), zoekt could be one of many services using this disk. These statistics are best interpreted as the load experienced by the device zoekt is using, not the load zoekt is solely responsible for causing.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/zoekt/zoekt?viewPanel=101330 on your Sourcegraph instance.
Technical details
Query:
SHELL(((max by (instance) (zoekt_indexserver_mount_point_info{mount_name="indexDir",instance=~`${instance:regex}`} * on (device, nodename) group_left() (max by (device, nodename) (rate(node_disk_read_bytes_total{instance=~`node-exporter.*`}[1m])))))) / ((max by (instance) (zoekt_indexserver_mount_point_info{mount_name="indexDir",instance=~`${instance:regex}`} * on (device, nodename) group_left() (max by (device, nodename) (rate(node_disk_reads_completed_total{instance=~`node-exporter.*`}[1m])))))))
zoekt: data_disk_write_request_size
Average write request size over 1m (per instance)
The average size of write requests that were issued to the device.
Note: Disk statistics are per device, not per service. In certain environments (such as common docker-compose setups), zoekt could be one of many services using this disk. These statistics are best interpreted as the load experienced by the device zoekt is using, not the load zoekt is solely responsible for causing.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/zoekt/zoekt?viewPanel=101331 on your Sourcegraph instance.
Technical details
Query:
SHELL(((max by (instance) (zoekt_indexserver_mount_point_info{mount_name="indexDir",instance=~`${instance:regex}`} * on (device, nodename) group_left() (max by (device, nodename) (rate(node_disk_written_bytes_total{instance=~`node-exporter.*`}[1m])))))) / ((max by (instance) (zoekt_indexserver_mount_point_info{mount_name="indexDir",instance=~`${instance:regex}`} * on (device, nodename) group_left() (max by (device, nodename) (rate(node_disk_writes_completed_total{instance=~`node-exporter.*`}[1m])))))))
zoekt: data_disk_reads_merged_sec
Merged read request rate over 1m (per instance)
The number of read requests merged per second that were queued to the device.
Note: Disk statistics are per device, not per service. In certain environments (such as common docker-compose setups), zoekt could be one of many services using this disk. These statistics are best interpreted as the load experienced by the device zoekt is using, not the load zoekt is solely responsible for causing.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/zoekt/zoekt?viewPanel=101340 on your Sourcegraph instance.
Technical details
Query:
SHELL(max by (instance) (zoekt_indexserver_mount_point_info{mount_name="indexDir",instance=~`${instance:regex}`} * on (device, nodename) group_left() (max by (device, nodename) (rate(node_disk_reads_merged_total{instance=~`node-exporter.*`}[1m])))))
zoekt: data_disk_writes_merged_sec
Merged write request rate over 1m (per instance)
The number of write requests merged per second that were queued to the device.
Note: Disk statistics are per device, not per service. In certain environments (such as common docker-compose setups), zoekt could be one of many services using this disk. These statistics are best interpreted as the load experienced by the device zoekt is using, not the load zoekt is solely responsible for causing.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/zoekt/zoekt?viewPanel=101341 on your Sourcegraph instance.
Technical details
Query:
SHELL(max by (instance) (zoekt_indexserver_mount_point_info{mount_name="indexDir",instance=~`${instance:regex}`} * on (device, nodename) group_left() (max by (device, nodename) (rate(node_disk_writes_merged_total{instance=~`node-exporter.*`}[1m])))))
zoekt: data_disk_average_queue_size
Average queue size over 1m (per instance)
The number of I/O operations that were being queued or being serviced. See https://blog.actorsfit.com/a?ID=00200-428fa2ac-e338-4540-848c-af9a3eb1ebd2 for background (avgqu-sz).
Note: Disk statistics are per device, not per service. In certain environments (such as common docker-compose setups), zoekt could be one of many services using this disk. These statistics are best interpreted as the load experienced by the device zoekt is using, not the load zoekt is solely responsible for causing.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/zoekt/zoekt?viewPanel=101350 on your Sourcegraph instance.
Technical details
Query:
SHELL(max by (instance) (zoekt_indexserver_mount_point_info{mount_name="indexDir",instance=~`${instance:regex}`} * on (device, nodename) group_left() (max by (device, nodename) (rate(node_disk_io_time_weighted_seconds_total{instance=~`node-exporter.*`}[1m])))))
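For overall disk utilization (the fraction of time the device was busy), the same join pattern can be applied to node_disk_io_time_seconds_total, assuming node-exporter's default diskstats collector is enabled. This is an illustrative query, not a shipped panel:
SHELL(max by (instance) (zoekt_indexserver_mount_point_info{mount_name="indexDir",instance=~`${instance:regex}`} * on (device, nodename) group_left() (max by (device, nodename) (rate(node_disk_io_time_seconds_total{instance=~`node-exporter.*`}[1m])))))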
Zoekt: [indexed-search-indexer] Golang runtime monitoring
zoekt: go_goroutines
Maximum active goroutines
A high value here indicates a possible goroutine leak.
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/zoekt/zoekt?viewPanel=101400 on your Sourcegraph instance.
Technical details
Query:
SHELLmax by(instance) (go_goroutines{job=~".*indexed-search-indexer"})
zoekt: go_gc_duration_seconds
Maximum go garbage collection duration
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/zoekt/zoekt?viewPanel=101401 on your Sourcegraph instance.
Technical details
Query:
SHELLmax by(instance) (go_gc_duration_seconds{job=~".*indexed-search-indexer"})
Zoekt: [indexed-search] Golang runtime monitoring
zoekt: go_goroutines
Maximum active goroutines
A high value here indicates a possible goroutine leak.
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/zoekt/zoekt?viewPanel=101500 on your Sourcegraph instance.
Technical details
Query:
SHELLmax by(instance) (go_goroutines{job=~".*indexed-search"})
zoekt: go_gc_duration_seconds
Maximum go garbage collection duration
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/zoekt/zoekt?viewPanel=101501 on your Sourcegraph instance.
Technical details
Query:
SHELLmax by(instance) (go_gc_duration_seconds{job=~".*indexed-search"})
Zoekt: Kubernetes monitoring (only available on Kubernetes)
zoekt: pods_available_percentage
Percentage pods available
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/zoekt/zoekt?viewPanel=101600 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by(app) (up{app=~".*indexed-search"}) / count by (app) (up{app=~".*indexed-search"}) * 100
Prometheus
Sourcegraph's all-in-one Prometheus and Alertmanager service.
To see this dashboard, visit /-/debug/grafana/d/prometheus/prometheus on your Sourcegraph instance.
Prometheus: Metrics
prometheus: metrics_cardinality
Metrics with highest cardinalities
The 10 highest-cardinality metrics collected by this Prometheus instance.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/prometheus/prometheus?viewPanel=100000 on your Sourcegraph instance.
Technical details
Query:
SHELLtopk(10, count by (__name__, job)({__name__!=""}))
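To drill into a subset of metrics, the same pattern can be narrowed with a name matcher; the src_ prefix below is illustrative:
SHELLtopk(10, count by (__name__, job)({__name__=~"src_.*"}))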
prometheus: samples_scraped
Samples scraped by job
The number of samples scraped after metric relabeling was applied by this Prometheus instance.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/prometheus/prometheus?viewPanel=100001 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by(job) (scrape_samples_post_metric_relabeling{job!=""})
prometheus: prometheus_rule_eval_duration
Average prometheus rule group evaluation duration over 10m by rule group
A high value here indicates Prometheus rule evaluation is taking longer than expected. It might indicate that certain rule groups are taking too long to evaluate, or Prometheus is underprovisioned.
Rules that Sourcegraph ships with are grouped under /sg_config_prometheus. Custom rules are grouped under /sg_prometheus_addons.
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/prometheus/prometheus?viewPanel=100010 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by(rule_group) (avg_over_time(prometheus_rule_group_last_duration_seconds[10m]))
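To pinpoint only the slowest rule groups rather than plotting all of them, an illustrative topk wrapper around the same expression can be used:
SHELLtopk(5, sum by(rule_group) (avg_over_time(prometheus_rule_group_last_duration_seconds[10m])))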
prometheus: prometheus_rule_eval_failures
Failed prometheus rule evaluations over 5m by rule group
Rules that Sourcegraph ships with are grouped under /sg_config_prometheus. Custom rules are grouped under /sg_prometheus_addons.
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/prometheus/prometheus?viewPanel=100011 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by(rule_group) (rate(prometheus_rule_evaluation_failures_total[5m]))
Prometheus: Alerts
prometheus: alertmanager_notification_latency
Alertmanager notification latency over 1m by integration
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/prometheus/prometheus?viewPanel=100100 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by(integration) (rate(alertmanager_notification_latency_seconds_sum[1m]))
prometheus: alertmanager_notification_failures
Failed alertmanager notifications over 1m by integration
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/prometheus/prometheus?viewPanel=100101 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by(integration) (rate(alertmanager_notifications_failed_total[1m]))
Prometheus: Internals
prometheus: prometheus_config_status
Prometheus configuration reload status
A 1 indicates Prometheus reloaded its configuration successfully.
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/prometheus/prometheus?viewPanel=100200 on your Sourcegraph instance.
Technical details
Query:
SHELLprometheus_config_last_reload_successful
prometheus: alertmanager_config_status
Alertmanager configuration reload status
A 1 indicates Alertmanager reloaded its configuration successfully.
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/prometheus/prometheus?viewPanel=100201 on your Sourcegraph instance.
Technical details
Query:
SHELLalertmanager_config_last_reload_successful
prometheus: prometheus_tsdb_op_failure
Prometheus tsdb failures by operation over 1m
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/prometheus/prometheus?viewPanel=100210 on your Sourcegraph instance.
Technical details
Query:
SHELLincrease(label_replace({__name__=~"prometheus_tsdb_(.*)_failed_total"}, "operation", "$1", "__name__", "(.+)s_failed_total")[5m:1m])
prometheus: prometheus_target_sample_exceeded
Prometheus scrapes that exceed the sample limit over 10m
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/prometheus/prometheus?viewPanel=100211 on your Sourcegraph instance.
Technical details
Query:
SHELLincrease(prometheus_target_scrapes_exceeded_sample_limit_total[10m])
prometheus: prometheus_target_sample_duplicate
Prometheus scrapes rejected due to duplicate timestamps over 10m
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/prometheus/prometheus?viewPanel=100212 on your Sourcegraph instance.
Technical details
Query:
SHELLincrease(prometheus_target_scrapes_sample_duplicate_timestamp_total[10m])
Prometheus: Container monitoring (not available on server)
prometheus: container_missing
Container missing
This value is the number of times a container has not been seen for more than one minute. If you observe this value change independent of deployment events (such as an upgrade), it could indicate pods are being OOM killed or terminated for some other reason.
- Kubernetes:
  - Determine if the pod was OOM killed using kubectl describe pod prometheus (look for OOMKilled: true) and, if so, consider increasing the memory limit in the relevant Deployment.yaml.
  - Check the logs before the container restarted to see if there are panic: messages or similar using kubectl logs -p prometheus.
- Docker Compose:
  - Determine if the container was OOM killed using docker inspect -f '{{json .State}}' prometheus (look for "OOMKilled":true) and, if so, consider increasing the memory limit of the prometheus container in docker-compose.yml.
  - Check the logs before the container restarted to see if there are panic: messages or similar using docker logs prometheus (note this will include logs from the previous and currently running container).
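Collected into commands, the checks above look roughly like the following. This is a minimal sketch that assumes the pod/container is simply named prometheus; adjust names for your deployment.
SHELL# Kubernetes: was the pod OOM killed, and did it panic before restarting?
kubectl describe pod prometheus | grep -i oomkilled
kubectl logs -p prometheus | grep -i 'panic:'
# Docker Compose: the same checks against the prometheus container
docker inspect -f '{{json .State}}' prometheus | grep -i oomkilled
docker logs prometheus 2>&1 | grep -i 'panic:'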
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/prometheus/prometheus?viewPanel=100300 on your Sourcegraph instance.
Technical details
Query:
SHELLcount by(name) ((time() - container_last_seen{name=~"^prometheus.*"}) > 60)
prometheus: container_cpu_usage
Container cpu usage total (1m average) across all cores by instance
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/prometheus/prometheus?viewPanel=100301 on your Sourcegraph instance.
Technical details
Query:
SHELLcadvisor_container_cpu_usage_percentage_total{name=~"^prometheus.*"}
prometheus: container_memory_usage
Container memory usage by instance
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/prometheus/prometheus?viewPanel=100302 on your Sourcegraph instance.
Technical details
Query:
SHELLcadvisor_container_memory_usage_percentage_total{name=~"^prometheus.*"}
prometheus: fs_io_operations
Filesystem reads and writes rate by instance over 1h
This value indicates the number of filesystem read and write operations by containers of this service. When extremely high, this can indicate a resource usage problem, or can cause problems with the service itself, especially if high values or spikes correlate with prometheus issues.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/prometheus/prometheus?viewPanel=100303 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by(name) (rate(container_fs_reads_total{name=~"^prometheus.*"}[1h]) + rate(container_fs_writes_total{name=~"^prometheus.*"}[1h]))
Prometheus: Provisioning indicators (not available on server)
prometheus: provisioning_container_cpu_usage_long_term
Container cpu usage total (90th percentile over 1d) across all cores by instance
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/prometheus/prometheus?viewPanel=100400 on your Sourcegraph instance.
Technical details
Query:
SHELLquantile_over_time(0.9, cadvisor_container_cpu_usage_percentage_total{name=~"^prometheus.*"}[1d])
prometheus: provisioning_container_memory_usage_long_term
Container memory usage (1d maximum) by instance
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/prometheus/prometheus?viewPanel=100401 on your Sourcegraph instance.
Technical details
Query:
SHELLmax_over_time(cadvisor_container_memory_usage_percentage_total{name=~"^prometheus.*"}[1d])
prometheus: provisioning_container_cpu_usage_short_term
Container cpu usage total (5m maximum) across all cores by instance
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/prometheus/prometheus?viewPanel=100410 on your Sourcegraph instance.
Technical details
Query:
SHELLmax_over_time(cadvisor_container_cpu_usage_percentage_total{name=~"^prometheus.*"}[5m])
prometheus: provisioning_container_memory_usage_short_term
Container memory usage (5m maximum) by instance
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/prometheus/prometheus?viewPanel=100411 on your Sourcegraph instance.
Technical details
Query:
SHELLmax_over_time(cadvisor_container_memory_usage_percentage_total{name=~"^prometheus.*"}[5m])
prometheus: container_oomkill_events_total
Container OOMKILL events total by instance
This value indicates the total number of times the container's main process or child processes were terminated by the OOM killer. When it occurs frequently, it is an indicator of underprovisioning.
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/prometheus/prometheus?viewPanel=100412 on your Sourcegraph instance.
Technical details
Query:
SHELLmax by (name) (container_oom_events_total{name=~"^prometheus.*"})
Prometheus: Kubernetes monitoring (only available on Kubernetes)
prometheus: pods_available_percentage
Percentage pods available
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/prometheus/prometheus?viewPanel=100500 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by(app) (up{app=~".*prometheus"}) / count by (app) (up{app=~".*prometheus"}) * 100
Executor
Executes jobs in an isolated environment.
To see this dashboard, visit /-/debug/grafana/d/executor/executor on your Sourcegraph instance.
Executor: Executor: Executor jobs
executor: multiqueue_executor_dequeue_cache_size
Unprocessed executor job dequeue cache size for multiqueue executors
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/executor/executor?viewPanel=100000 on your Sourcegraph instance.
Technical details
Query:
SHELLmultiqueue_executor_dequeue_cache_size{queue=~"$queue",job=~"^(executor|sourcegraph-code-intel-indexers|executor-batches|frontend|sourcegraph-frontend|worker|sourcegraph-executors).*"}
Executor: Executor: Executor jobs
executor: executor_handlers
Executor active handlers
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/executor/executor?viewPanel=100100 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(src_executor_processor_handlers{queue=~"${queue:regex}",sg_job=~"^sourcegraph-executors.*"})
executor: executor_processor_total
Executor operations every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/executor/executor?viewPanel=100110 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_executor_processor_total{queue=~"${queue:regex}",sg_job=~"^sourcegraph-executors.*"}[5m]))
executor: executor_processor_99th_percentile_duration
Aggregate successful executor operation duration distribution over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/executor/executor?viewPanel=100111 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (le)(rate(src_executor_processor_duration_seconds_bucket{queue=~"${queue:regex}",sg_job=~"^sourcegraph-executors.*"}[5m]))
executor: executor_processor_errors_total
Executor operation errors every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/executor/executor?viewPanel=100112 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_executor_processor_errors_total{queue=~"${queue:regex}",sg_job=~"^sourcegraph-executors.*"}[5m]))
executor: executor_processor_error_rate
Executor operation error rate over 5m
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/executor/executor?viewPanel=100113 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_executor_processor_errors_total{queue=~"${queue:regex}",sg_job=~"^sourcegraph-executors.*"}[5m])) / (sum(increase(src_executor_processor_total{queue=~"${queue:regex}",sg_job=~"^sourcegraph-executors.*"}[5m])) + sum(increase(src_executor_processor_errors_total{queue=~"${queue:regex}",sg_job=~"^sourcegraph-executors.*"}[5m]))) * 100
Executor: Executor: Queue API client
executor: apiworker_apiclient_queue_total
Aggregate client operations every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/executor/executor?viewPanel=100200 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_apiworker_apiclient_queue_total{sg_job=~"^sourcegraph-executors.*"}[5m]))
executor: apiworker_apiclient_queue_99th_percentile_duration
Aggregate successful client operation duration distribution over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/executor/executor?viewPanel=100201 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (le)(rate(src_apiworker_apiclient_queue_duration_seconds_bucket{sg_job=~"^sourcegraph-executors.*"}[5m]))
executor: apiworker_apiclient_queue_errors_total
Aggregate client operation errors every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/executor/executor?viewPanel=100202 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_apiworker_apiclient_queue_errors_total{sg_job=~"^sourcegraph-executors.*"}[5m]))
executor: apiworker_apiclient_queue_error_rate
Aggregate client operation error rate over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/executor/executor?viewPanel=100203 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_apiworker_apiclient_queue_errors_total{sg_job=~"^sourcegraph-executors.*"}[5m])) / (sum(increase(src_apiworker_apiclient_queue_total{sg_job=~"^sourcegraph-executors.*"}[5m])) + sum(increase(src_apiworker_apiclient_queue_errors_total{sg_job=~"^sourcegraph-executors.*"}[5m]))) * 100
executor: apiworker_apiclient_queue_total
Client operations every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/executor/executor?viewPanel=100210 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (op)(increase(src_apiworker_apiclient_queue_total{sg_job=~"^sourcegraph-executors.*"}[5m]))
executor: apiworker_apiclient_queue_99th_percentile_duration
99th percentile successful client operation duration over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/executor/executor?viewPanel=100211 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.99, sum by (le,op)(rate(src_apiworker_apiclient_queue_duration_seconds_bucket{sg_job=~"^sourcegraph-executors.*"}[5m])))
executor: apiworker_apiclient_queue_errors_total
Client operation errors every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/executor/executor?viewPanel=100212 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (op)(increase(src_apiworker_apiclient_queue_errors_total{sg_job=~"^sourcegraph-executors.*"}[5m]))
executor: apiworker_apiclient_queue_error_rate
Client operation error rate over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/executor/executor?viewPanel=100213 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (op)(increase(src_apiworker_apiclient_queue_errors_total{sg_job=~"^sourcegraph-executors.*"}[5m])) / (sum by (op)(increase(src_apiworker_apiclient_queue_total{sg_job=~"^sourcegraph-executors.*"}[5m])) + sum by (op)(increase(src_apiworker_apiclient_queue_errors_total{sg_job=~"^sourcegraph-executors.*"}[5m]))) * 100
Executor: Executor: Files API client
executor: apiworker_apiclient_files_total
Aggregate client operations every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/executor/executor?viewPanel=100300 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_apiworker_apiclient_files_total{sg_job=~"^sourcegraph-executors.*"}[5m]))
executor: apiworker_apiclient_files_99th_percentile_duration
Aggregate successful client operation duration distribution over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/executor/executor?viewPanel=100301 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (le)(rate(src_apiworker_apiclient_files_duration_seconds_bucket{sg_job=~"^sourcegraph-executors.*"}[5m]))
executor: apiworker_apiclient_files_errors_total
Aggregate client operation errors every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/executor/executor?viewPanel=100302 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_apiworker_apiclient_files_errors_total{sg_job=~"^sourcegraph-executors.*"}[5m]))
executor: apiworker_apiclient_files_error_rate
Aggregate client operation error rate over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/executor/executor?viewPanel=100303 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_apiworker_apiclient_files_errors_total{sg_job=~"^sourcegraph-executors.*"}[5m])) / (sum(increase(src_apiworker_apiclient_files_total{sg_job=~"^sourcegraph-executors.*"}[5m])) + sum(increase(src_apiworker_apiclient_files_errors_total{sg_job=~"^sourcegraph-executors.*"}[5m]))) * 100
executor: apiworker_apiclient_files_total
Client operations every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/executor/executor?viewPanel=100310 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (op)(increase(src_apiworker_apiclient_files_total{sg_job=~"^sourcegraph-executors.*"}[5m]))
executor: apiworker_apiclient_files_99th_percentile_duration
99th percentile successful client operation duration over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/executor/executor?viewPanel=100311 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.99, sum by (le,op)(rate(src_apiworker_apiclient_files_duration_seconds_bucket{sg_job=~"^sourcegraph-executors.*"}[5m])))
executor: apiworker_apiclient_files_errors_total
Client operation errors every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/executor/executor?viewPanel=100312 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (op)(increase(src_apiworker_apiclient_files_errors_total{sg_job=~"^sourcegraph-executors.*"}[5m]))
executor: apiworker_apiclient_files_error_rate
Client operation error rate over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/executor/executor?viewPanel=100313 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (op)(increase(src_apiworker_apiclient_files_errors_total{sg_job=~"^sourcegraph-executors.*"}[5m])) / (sum by (op)(increase(src_apiworker_apiclient_files_total{sg_job=~"^sourcegraph-executors.*"}[5m])) + sum by (op)(increase(src_apiworker_apiclient_files_errors_total{sg_job=~"^sourcegraph-executors.*"}[5m]))) * 100
Executor: Executor: Job setup
executor: apiworker_command_total
Aggregate command operations every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/executor/executor?viewPanel=100400 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_apiworker_command_total{op=~"setup.*",sg_job=~"^sourcegraph-executors.*"}[5m]))
executor: apiworker_command_99th_percentile_duration
Aggregate successful command operation duration distribution over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/executor/executor?viewPanel=100401 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (le)(rate(src_apiworker_command_duration_seconds_bucket{op=~"setup.*",sg_job=~"^sourcegraph-executors.*"}[5m]))
executor: apiworker_command_errors_total
Aggregate command operation errors every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/executor/executor?viewPanel=100402 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_apiworker_command_errors_total{op=~"setup.*",sg_job=~"^sourcegraph-executors.*"}[5m]))
executor: apiworker_command_error_rate
Aggregate command operation error rate over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/executor/executor?viewPanel=100403 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_apiworker_command_errors_total{op=~"setup.*",sg_job=~"^sourcegraph-executors.*"}[5m])) / (sum(increase(src_apiworker_command_total{op=~"setup.*",sg_job=~"^sourcegraph-executors.*"}[5m])) + sum(increase(src_apiworker_command_errors_total{op=~"setup.*",sg_job=~"^sourcegraph-executors.*"}[5m]))) * 100
executor: apiworker_command_total
Command operations every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/executor/executor?viewPanel=100410 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (op)(increase(src_apiworker_command_total{op=~"setup.*",sg_job=~"^sourcegraph-executors.*"}[5m]))
executor: apiworker_command_99th_percentile_duration
99th percentile successful command operation duration over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/executor/executor?viewPanel=100411 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.99, sum by (le,op)(rate(src_apiworker_command_duration_seconds_bucket{op=~"setup.*",sg_job=~"^sourcegraph-executors.*"}[5m])))
executor: apiworker_command_errors_total
Command operation errors every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/executor/executor?viewPanel=100412 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (op)(increase(src_apiworker_command_errors_total{op=~"setup.*",sg_job=~"^sourcegraph-executors.*"}[5m]))
executor: apiworker_command_error_rate
Command operation error rate over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/executor/executor?viewPanel=100413 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (op)(increase(src_apiworker_command_errors_total{op=~"setup.*",sg_job=~"^sourcegraph-executors.*"}[5m])) / (sum by (op)(increase(src_apiworker_command_total{op=~"setup.*",sg_job=~"^sourcegraph-executors.*"}[5m])) + sum by (op)(increase(src_apiworker_command_errors_total{op=~"setup.*",sg_job=~"^sourcegraph-executors.*"}[5m]))) * 100
Executor: Executor: Job execution
executor: apiworker_command_total
Aggregate command operations every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/executor/executor?viewPanel=100500 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_apiworker_command_total{op=~"exec.*",sg_job=~"^sourcegraph-executors.*"}[5m]))
executor: apiworker_command_99th_percentile_duration
Aggregate successful command operation duration distribution over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/executor/executor?viewPanel=100501 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (le)(rate(src_apiworker_command_duration_seconds_bucket{op=~"exec.*",sg_job=~"^sourcegraph-executors.*"}[5m]))
executor: apiworker_command_errors_total
Aggregate command operation errors every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/executor/executor?viewPanel=100502 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_apiworker_command_errors_total{op=~"exec.*",sg_job=~"^sourcegraph-executors.*"}[5m]))
executor: apiworker_command_error_rate
Aggregate command operation error rate over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/executor/executor?viewPanel=100503 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_apiworker_command_errors_total{op=~"exec.*",sg_job=~"^sourcegraph-executors.*"}[5m])) / (sum(increase(src_apiworker_command_total{op=~"exec.*",sg_job=~"^sourcegraph-executors.*"}[5m])) + sum(increase(src_apiworker_command_errors_total{op=~"exec.*",sg_job=~"^sourcegraph-executors.*"}[5m]))) * 100
executor: apiworker_command_total
Command operations every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/executor/executor?viewPanel=100510 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (op)(increase(src_apiworker_command_total{op=~"exec.*",sg_job=~"^sourcegraph-executors.*"}[5m]))
executor: apiworker_command_99th_percentile_duration
99th percentile successful command operation duration over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/executor/executor?viewPanel=100511 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.99, sum by (le,op)(rate(src_apiworker_command_duration_seconds_bucket{op=~"exec.*",sg_job=~"^sourcegraph-executors.*"}[5m])))
executor: apiworker_command_errors_total
Command operation errors every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/executor/executor?viewPanel=100512 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (op)(increase(src_apiworker_command_errors_total{op=~"exec.*",sg_job=~"^sourcegraph-executors.*"}[5m]))
executor: apiworker_command_error_rate
Command operation error rate over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/executor/executor?viewPanel=100513 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (op)(increase(src_apiworker_command_errors_total{op=~"exec.*",sg_job=~"^sourcegraph-executors.*"}[5m])) / (sum by (op)(increase(src_apiworker_command_total{op=~"exec.*",sg_job=~"^sourcegraph-executors.*"}[5m])) + sum by (op)(increase(src_apiworker_command_errors_total{op=~"exec.*",sg_job=~"^sourcegraph-executors.*"}[5m]))) * 100
Executor: Executor: Job teardown
executor: apiworker_command_total
Aggregate command operations every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/executor/executor?viewPanel=100600 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_apiworker_command_total{op=~"teardown.*",sg_job=~"^sourcegraph-executors.*"}[5m]))
executor: apiworker_command_99th_percentile_duration
Aggregate successful command operation duration distribution over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/executor/executor?viewPanel=100601 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (le)(rate(src_apiworker_command_duration_seconds_bucket{op=~"teardown.*",sg_job=~"^sourcegraph-executors.*"}[5m]))
executor: apiworker_command_errors_total
Aggregate command operation errors every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/executor/executor?viewPanel=100602 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_apiworker_command_errors_total{op=~"teardown.*",sg_job=~"^sourcegraph-executors.*"}[5m]))
executor: apiworker_command_error_rate
Aggregate command operation error rate over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/executor/executor?viewPanel=100603 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_apiworker_command_errors_total{op=~"teardown.*",sg_job=~"^sourcegraph-executors.*"}[5m])) / (sum(increase(src_apiworker_command_total{op=~"teardown.*",sg_job=~"^sourcegraph-executors.*"}[5m])) + sum(increase(src_apiworker_command_errors_total{op=~"teardown.*",sg_job=~"^sourcegraph-executors.*"}[5m]))) * 100
executor: apiworker_command_total
Command operations every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/executor/executor?viewPanel=100610 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (op)(increase(src_apiworker_command_total{op=~"teardown.*",sg_job=~"^sourcegraph-executors.*"}[5m]))
executor: apiworker_command_99th_percentile_duration
99th percentile successful command operation duration over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/executor/executor?viewPanel=100611 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.99, sum by (le,op)(rate(src_apiworker_command_duration_seconds_bucket{op=~"teardown.*",sg_job=~"^sourcegraph-executors.*"}[5m])))
executor: apiworker_command_errors_total
Command operation errors every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/executor/executor?viewPanel=100612 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (op)(increase(src_apiworker_command_errors_total{op=~"teardown.*",sg_job=~"^sourcegraph-executors.*"}[5m]))
executor: apiworker_command_error_rate
Command operation error rate over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/executor/executor?viewPanel=100613 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (op)(increase(src_apiworker_command_errors_total{op=~"teardown.*",sg_job=~"^sourcegraph-executors.*"}[5m])) / (sum by (op)(increase(src_apiworker_command_total{op=~"teardown.*",sg_job=~"^sourcegraph-executors.*"}[5m])) + sum by (op)(increase(src_apiworker_command_errors_total{op=~"teardown.*",sg_job=~"^sourcegraph-executors.*"}[5m]))) * 100
Executor: Executor: Compute instance metrics
executor: node_cpu_utilization
CPU utilization (minus idle/iowait)
Indicates the amount of CPU time excluding idle and iowait time, divided by the number of cores, as a percentage.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/executor/executor?viewPanel=100700 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(rate(node_cpu_seconds_total{sg_job=~"sourcegraph-executors",mode!~"(idle|iowait)",sg_instance=~"$instance"}[$__rate_interval])) by(sg_instance) / count(node_cpu_seconds_total{sg_job=~"sourcegraph-executors",mode="system",sg_instance=~"$instance"}) by (sg_instance) * 100
executor: node_cpu_saturation_cpu_wait
CPU saturation (time waiting)
Indicates the average summed time that some (but not all) non-idle processes spent waiting for CPU time. If this is higher than normal, then the CPU is underpowered for the workload and more powerful machines should be provisioned. This only represents a "less-than-all processes" time, because for processes to be waiting for CPU time there must be other process(es) consuming CPU time.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/executor/executor?viewPanel=100701 on your Sourcegraph instance.
Technical details
Query:
SHELLrate(node_pressure_cpu_waiting_seconds_total{sg_job=~"sourcegraph-executors",sg_instance=~"$instance"}[$__rate_interval])
executor: node_memory_utilization
Memory utilization
Indicates memory utilization as a percentage: total memory minus available memory (which includes cache and buffers), divided by total memory. Consistently high numbers are generally fine so long as memory saturation figures are within acceptable ranges; these figures may be more useful for informing executor provisioning decisions, such as increasing worker parallelism, down-sizing machines, etc.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/executor/executor?viewPanel=100710 on your Sourcegraph instance.
Technical details
Query:
SHELL(1 - sum(node_memory_MemAvailable_bytes{sg_job=~"sourcegraph-executors",sg_instance=~"$instance"}) by (sg_instance) / sum(node_memory_MemTotal_bytes{sg_job=~"sourcegraph-executors",sg_instance=~"$instance"}) by (sg_instance)) * 100
executor: node_memory_saturation_vmeff
Memory saturation (vmem efficiency)
Indicates the efficiency of page reclaim, calculated as pgsteal/pgscan. Optimal figures are short spikes of near 100% and above, indicating that a high ratio of scanned pages are actually being freed, or exactly 0%, indicating that pages aren't being scanned because there is no memory pressure. Sustained numbers >~100% may be a sign of imminent memory exhaustion, while sustained figures between 0% and ~100% are very serious.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/executor/executor?viewPanel=100711 on your Sourcegraph instance.
Technical details
Query:
SHELL(rate(node_vmstat_pgsteal_anon{sg_job=~"sourcegraph-executors",sg_instance=~"$instance"}[$__rate_interval]) + rate(node_vmstat_pgsteal_direct{sg_job=~"sourcegraph-executors",sg_instance=~"$instance"}[$__rate_interval]) + rate(node_vmstat_pgsteal_file{sg_job=~"sourcegraph-executors",sg_instance=~"$instance"}[$__rate_interval]) + rate(node_vmstat_pgsteal_kswapd{sg_job=~"sourcegraph-executors",sg_instance=~"$instance"}[$__rate_interval])) / (rate(node_vmstat_pgscan_anon{sg_job=~"sourcegraph-executors",sg_instance=~"$instance"}[$__rate_interval]) + rate(node_vmstat_pgscan_direct{sg_job=~"sourcegraph-executors",sg_instance=~"$instance"}[$__rate_interval]) + rate(node_vmstat_pgscan_file{sg_job=~"sourcegraph-executors",sg_instance=~"$instance"}[$__rate_interval]) + rate(node_vmstat_pgscan_kswapd{sg_job=~"sourcegraph-executors",sg_instance=~"$instance"}[$__rate_interval])) * 100
executor: node_memory_saturation_pressure_stalled
Memory saturation (fully stalled)
Indicates the amount of time all non-idle processes were stalled waiting on memory operations to complete. This is often correlated with the vmem efficiency ratio when pressure on available memory is high. If they're not correlated, this could indicate issues with the machine hardware and/or configuration.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/executor/executor?viewPanel=100712 on your Sourcegraph instance.
Technical details
Query:
SHELLrate(node_pressure_memory_stalled_seconds_total{sg_job=~"sourcegraph-executors",sg_instance=~"$instance"}[$__rate_interval])
executor: node_io_disk_utilization
Disk IO utilization (percentage time spent in IO)
Indicates the percentage of time a disk was busy. If this is less than 100%, then the disk has spare utilization capacity. However, a value of 100% does not necessarily indicate the disk is at max capacity. For single, serial request-serving devices, 100% may indicate maximum saturation, but for SSDs and RAID arrays this is less likely to be the case, as they are capable of serving multiple requests in parallel; other metrics such as throughput and request queue size should be factored in.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/executor/executor?viewPanel=100720 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(label_replace(label_replace(rate(node_disk_io_time_seconds_total{sg_job=~"sourcegraph-executors",sg_instance=~"$instance"}[$__rate_interval]), "disk", "$1", "device", "^([^d].+)"), "disk", "ignite", "device", "dm-.*")) by(sg_instance,disk) * 100
executor: node_io_disk_saturation
Disk IO saturation (avg IO queue size)
Indicates the number of outstanding/queued IO requests. High but short-lived queue sizes may not present an issue, but if they're consistently/often high and/or monotonically increasing, the disk may be failing or simply too slow for the amount of activity required. Consider replacing the drive(s) with SSDs if they are not already, and/or replacing the faulty drive(s), if any.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/executor/executor?viewPanel=100721 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(label_replace(label_replace(rate(node_disk_io_time_weighted_seconds_total{sg_job=~"sourcegraph-executors",sg_instance=~"$instance"}[$__rate_interval]), "disk", "$1", "device", "^([^d].+)"), "disk", "ignite", "device", "dm-.*")) by(sg_instance,disk)
executor: node_io_disk_saturation_pressure_full
Disk IO saturation (avg time of all processes stalled)
Indicates the average amount of time for which all non-idle processes were stalled waiting for IO to complete simultaneously, i.e. where no processes could make progress.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/executor/executor?viewPanel=100722 on your Sourcegraph instance.
Technical details
Query:
SHELLrate(node_pressure_io_stalled_seconds_total{sg_job=~"sourcegraph-executors",sg_instance=~"$instance"}[$__rate_interval])
executor: node_io_network_utilization
Network IO utilization (Rx)
Indicates the average summed receiving throughput of all network interfaces. This is often predominantly composed of the WAN/internet-connected interface, and knowing normal/good figures depends on knowing the bandwidth of the underlying hardware and the workloads.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/executor/executor?viewPanel=100730 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(rate(node_network_receive_bytes_total{sg_job=~"sourcegraph-executors",sg_instance=~"$instance"}[$__rate_interval])) by(sg_instance) * 8
executor: node_io_network_saturation
Network IO saturation (Rx packets dropped)
Number of dropped received packets. This can happen if the receive queues/buffers become full due to slow packet processing throughput. The queues/buffers could be configured to be larger as a stop-gap but the processing application should be investigated as soon as possible. https://www.kernel.org/doc/html/latest/networking/statistics.html#:~:text=not%20otherwise%20counted.-,rx_dropped,-Number%20of%20packets
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/executor/executor?viewPanel=100731 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(rate(node_network_receive_drop_total{sg_job=~"sourcegraph-executors",sg_instance=~"$instance"}[$__rate_interval])) by(sg_instance)
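If this panel shows sustained drops, it can help to corroborate the counters on the executor host itself. A minimal sketch, assuming SSH access to the instance; interface names and available tooling vary by machine image.
SHELL# Per-interface Rx/Tx drop and error counters (iproute2)
ip -s link show
# Raw kernel counters as a fallback
cat /proc/net/dev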
executor: node_io_network_saturation
Network IO errors (Rx)
Number of bad/malformed packets received. https://www.kernel.org/doc/html/latest/networking/statistics.html#:~:text=excluding%20the%20FCS.-,rx_errors,-Total%20number%20of
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/executor/executor?viewPanel=100732 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(rate(node_network_receive_errs_total{sg_job=~"sourcegraph-executors",sg_instance=~"$instance"}[$__rate_interval])) by(sg_instance)
executor: node_io_network_utilization
Network IO utilization (Tx)
Indicates the average summed transmitted throughput of all network interfaces. This is often predominantly composed of the WAN/internet-connected interface, and knowing normal/good figures depends on knowing the bandwidth of the underlying hardware and the workloads.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/executor/executor?viewPanel=100740 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(rate(node_network_transmit_bytes_total{sg_job=~"sourcegraph-executors",sg_instance=~"$instance"}[$__rate_interval])) by(sg_instance) * 8
executor: node_io_network_saturation
Network IO saturation (Tx packets dropped)
Number of dropped transmitted packets. This can happen if the receiving side's receive queues/buffers become full due to slow packet processing throughput, network link congestion, etc.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/executor/executor?viewPanel=100741 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(rate(node_network_transmit_drop_total{sg_job=~"sourcegraph-executors",sg_instance=~"$instance"}[$__rate_interval])) by(sg_instance)
executor: node_io_network_saturation
Network IO errors (Tx)
Number of packet transmission errors. This is distinct from tx packet dropping, and can indicate a failing NIC, improperly configured network options anywhere along the line, signal noise etc.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/executor/executor?viewPanel=100742 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(rate(node_network_transmit_errs_total{sg_job=~"sourcegraph-executors",sg_instance=~"$instance"}[$__rate_interval])) by(sg_instance)
Executor: Executor: Docker Registry Mirror instance metrics
executor: node_cpu_utilization
CPU utilization (minus idle/iowait)
Indicates the amount of CPU time excluding idle and iowait time, divided by the number of cores, as a percentage.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/executor/executor?viewPanel=100800 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(rate(node_cpu_seconds_total{sg_job=~"sourcegraph-executors-registry",mode!~"(idle|iowait)",sg_instance=~"docker-registry"}[$__rate_interval])) by(sg_instance) / count(node_cpu_seconds_total{sg_job=~"sourcegraph-executors-registry",mode="system",sg_instance=~"docker-registry"}) by (sg_instance) * 100
executor: node_cpu_saturation_cpu_wait
CPU saturation (time waiting)
Indicates the average summed time that some (but not all) non-idle processes spent waiting for CPU time. If this is higher than normal, then the CPU is underpowered for the workload and more powerful machines should be provisioned. This only represents a "less-than-all processes" time, because for processes to be waiting for CPU time there must be other process(es) consuming CPU time.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/executor/executor?viewPanel=100801 on your Sourcegraph instance.
Technical details
Query:
SHELLrate(node_pressure_cpu_waiting_seconds_total{sg_job=~"sourcegraph-executors-registry",sg_instance=~"docker-registry"}[$__rate_interval])
executor: node_memory_utilization
Memory utilization
Indicates memory utilization as a percentage: total memory minus available memory (which includes cache and buffers), divided by total memory. Consistently high numbers are generally fine so long as memory saturation figures are within acceptable ranges; these figures may be more useful for informing executor provisioning decisions, such as increasing worker parallelism, down-sizing machines, etc.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/executor/executor?viewPanel=100810 on your Sourcegraph instance.
Technical details
Query:
SHELL(1 - sum(node_memory_MemAvailable_bytes{sg_job=~"sourcegraph-executors-registry",sg_instance=~"docker-registry"}) by (sg_instance) / sum(node_memory_MemTotal_bytes{sg_job=~"sourcegraph-executors-registry",sg_instance=~"docker-registry"}) by (sg_instance)) * 100
executor: node_memory_saturation_vmeff
Memory saturation (vmem efficiency)
Indicates the efficiency of page reclaim, calculated as pgsteal/pgscan. Optimal figures are short spikes of near 100% and above, indicating that a high ratio of scanned pages are actually being freed, or exactly 0%, indicating that pages aren't being scanned because there is no memory pressure. Sustained numbers >~100% may be a sign of imminent memory exhaustion, while sustained figures between 0% and ~100% are very serious.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/executor/executor?viewPanel=100811 on your Sourcegraph instance.
Technical details
Query:
SHELL(rate(node_vmstat_pgsteal_anon{sg_job=~"sourcegraph-executors-registry",sg_instance=~"docker-registry"}[$__rate_interval]) + rate(node_vmstat_pgsteal_direct{sg_job=~"sourcegraph-executors-registry",sg_instance=~"docker-registry"}[$__rate_interval]) + rate(node_vmstat_pgsteal_file{sg_job=~"sourcegraph-executors-registry",sg_instance=~"docker-registry"}[$__rate_interval]) + rate(node_vmstat_pgsteal_kswapd{sg_job=~"sourcegraph-executors-registry",sg_instance=~"docker-registry"}[$__rate_interval])) / (rate(node_vmstat_pgscan_anon{sg_job=~"sourcegraph-executors-registry",sg_instance=~"docker-registry"}[$__rate_interval]) + rate(node_vmstat_pgscan_direct{sg_job=~"sourcegraph-executors-registry",sg_instance=~"docker-registry"}[$__rate_interval]) + rate(node_vmstat_pgscan_file{sg_job=~"sourcegraph-executors-registry",sg_instance=~"docker-registry"}[$__rate_interval]) + rate(node_vmstat_pgscan_kswapd{sg_job=~"sourcegraph-executors-registry",sg_instance=~"docker-registry"}[$__rate_interval])) * 100
executor: node_memory_saturation_pressure_stalled
Memory saturation (fully stalled)
Indicates the amount of time all non-idle processes were stalled waiting on memory operations to complete. This is often correlated with the vmem efficiency ratio when pressure on available memory is high. If they're not correlated, this could indicate issues with the machine hardware and/or configuration.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/executor/executor?viewPanel=100812 on your Sourcegraph instance.
Technical details
Query:
SHELLrate(node_pressure_memory_stalled_seconds_total{sg_job=~"sourcegraph-executors-registry",sg_instance=~"docker-registry"}[$__rate_interval])
executor: node_io_disk_utilization
Disk IO utilization (percentage time spent in IO)
Indicates the percentage of time a disk was busy. If this is less than 100%, then the disk has spare utilization capacity. However, a value of 100% does not necessarily indicate the disk is at max capacity. For single, serial request-serving devices, 100% may indicate maximum saturation, but for SSDs and RAID arrays this is less likely to be the case, as they are capable of serving multiple requests in parallel; other metrics such as throughput and request queue size should be factored in.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/executor/executor?viewPanel=100820 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(label_replace(label_replace(rate(node_disk_io_time_seconds_total{sg_job=~"sourcegraph-executors-registry",sg_instance=~"docker-registry"}[$__rate_interval]), "disk", "$1", "device", "^([^d].+)"), "disk", "ignite", "device", "dm-.*")) by(sg_instance,disk) * 100
executor: node_io_disk_saturation
Disk IO saturation (avg IO queue size)
Indicates the number of outstanding/queued IO requests. High but short-lived queue sizes may not present an issue, but if they're consistently/often high and/or monotonically increasing, the disk may be failing or simply too slow for the amount of activity required. Consider replacing the drive(s) with SSDs if they are not already, and/or replacing the faulty drive(s), if any.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/executor/executor?viewPanel=100821 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(label_replace(label_replace(rate(node_disk_io_time_weighted_seconds_total{sg_job=~"sourcegraph-executors-registry",sg_instance=~"docker-registry"}[$__rate_interval]), "disk", "$1", "device", "^([^d].+)"), "disk", "ignite", "device", "dm-.*")) by(sg_instance,disk)
executor: node_io_disk_saturation_pressure_full
Disk IO saturation (avg time of all processes stalled)
Indicates the average amount of time for which all non-idle processes were stalled waiting for IO to complete simultaneously, i.e. where no processes could make progress.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/executor/executor?viewPanel=100822 on your Sourcegraph instance.
Technical details
Query:
SHELLrate(node_pressure_io_stalled_seconds_total{sg_job=~"sourcegraph-executors-registry",sg_instance=~"docker-registry"}[$__rate_interval])
executor: node_io_network_utilization
Network IO utilization (Rx)
Indicates the average summed receiving throughput of all network interfaces. This is often predominantly composed of the WAN/internet-connected interface, and knowing normal/good figures depends on knowing the bandwidth of the underlying hardware and the workloads.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/executor/executor?viewPanel=100830 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(rate(node_network_receive_bytes_total{sg_job=~"sourcegraph-executors-registry",sg_instance=~"docker-registry"}[$__rate_interval])) by(sg_instance) * 8
executor: node_io_network_saturation
Network IO saturation (Rx packets dropped)
Number of dropped received packets. This can happen if the receive queues/buffers become full due to slow packet processing throughput. The queues/buffers could be configured to be larger as a stop-gap but the processing application should be investigated as soon as possible. https://www.kernel.org/doc/html/latest/networking/statistics.html#:~:text=not%20otherwise%20counted.-,rx_dropped,-Number%20of%20packets
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/executor/executor?viewPanel=100831 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(rate(node_network_receive_drop_total{sg_job=~"sourcegraph-executors-registry",sg_instance=~"docker-registry"}[$__rate_interval])) by(sg_instance)
executor: node_io_network_saturation
Network IO errors (Rx)
Number of bad/malformed packets received. https://www.kernel.org/doc/html/latest/networking/statistics.html#:~:text=excluding%20the%20FCS.-,rx_errors,-Total%20number%20of
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/executor/executor?viewPanel=100832 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(rate(node_network_receive_errs_total{sg_job=~"sourcegraph-executors-registry",sg_instance=~"docker-registry"}[$__rate_interval])) by(sg_instance)
executor: node_io_network_utilization
Network IO utilization (Tx)
Indicates the average summed transmitted throughput of all network interfaces. This is often predominantly composed of the WAN/internet-connected interface, and knowing normal/good figures depends on knowing the bandwidth of the underlying hardware and the workloads.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/executor/executor?viewPanel=100840 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(rate(node_network_transmit_bytes_total{sg_job=~"sourcegraph-executors-registry",sg_instance=~"docker-registry"}[$__rate_interval])) by(sg_instance) * 8
executor: node_io_network_saturation
Network IO saturation (Tx packets dropped)
Number of dropped transmitted packets. This can happen if the receiving side's receive queues/buffers become full due to slow packet processing throughput, network link congestion, etc.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/executor/executor?viewPanel=100841 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(rate(node_network_transmit_drop_total{sg_job=~"sourcegraph-executors-registry",sg_instance=~"docker-registry"}[$__rate_interval])) by(sg_instance)
executor: node_io_network_saturation
Network IO errors (Tx)
Number of packet transmission errors. This is distinct from tx packet dropping, and can indicate a failing NIC, improperly configured network options anywhere along the line, signal noise etc.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/executor/executor?viewPanel=100842 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(rate(node_network_transmit_errs_total{sg_job=~"sourcegraph-executors-registry",sg_instance=~"docker-registry"}[$__rate_interval])) by(sg_instance)
Executor: Golang runtime monitoring
executor: go_goroutines
Maximum active goroutines
A high value here indicates a possible goroutine leak.
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/executor/executor?viewPanel=100900 on your Sourcegraph instance.
Technical details
Query:
SHELLmax by(sg_instance) (go_goroutines{sg_job=~".*sourcegraph-executors"})
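When this value climbs steadily, a goroutine dump usually points at the leaking call site. A minimal sketch, assuming the executor exposes Go's standard net/http/pprof debug endpoints and that you can reach them from the host; EXECUTOR_DEBUG_ADDR is a placeholder, not a Sourcegraph default.
SHELL# Dump all goroutine stacks from the (assumed) pprof endpoint for offline inspection
curl -s "http://$EXECUTOR_DEBUG_ADDR/debug/pprof/goroutine?debug=2" -o goroutines.txt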
executor: go_gc_duration_seconds
Maximum go garbage collection duration
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/executor/executor?viewPanel=100901 on your Sourcegraph instance.
Technical details
Query:
SHELLmax by(sg_instance) (go_gc_duration_seconds{sg_job=~".*sourcegraph-executors"})
Global Containers Resource Usage
Container usage and provisioning indicators of all services.
To see this dashboard, visit /-/debug/grafana/d/containers/containers on your Sourcegraph instance.
Global Containers Resource Usage: Containers (not available on server)
containers: container_memory_usage
Container memory usage of all services
This value indicates the memory usage of all containers.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/containers/containers?viewPanel=100000 on your Sourcegraph instance.
Technical details
Query:
SHELLcadvisor_container_memory_usage_percentage_total{name=~"^(frontend|sourcegraph-frontend|gitserver|pgsql|codeintel-db|codeinsights|precise-code-intel-worker|prometheus|redis-cache|redis-store|redis-exporter|searcher|syntect-server|worker|zoekt-indexserver|zoekt-webserver|indexed-search|grafana|blobstore|jaeger).*"}
containers: container_cpu_usage
Container cpu usage total (1m average) across all cores by instance
This value indicates the CPU usage of all containers.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/containers/containers?viewPanel=100010 on your Sourcegraph instance.
Technical details
Query:
SHELLcadvisor_container_cpu_usage_percentage_total{name=~"^(frontend|sourcegraph-frontend|gitserver|pgsql|codeintel-db|codeinsights|precise-code-intel-worker|prometheus|redis-cache|redis-store|redis-exporter|searcher|syntect-server|worker|zoekt-indexserver|zoekt-webserver|indexed-search|grafana|blobstore|jaeger).*"}
Global Containers Resource Usage: Containers: Provisioning Indicators (not available on server)
containers: container_memory_usage_provisioning
Container memory usage (5m maximum) of services that exceed 80% memory limit
Containers that exceed 80% of their memory limit. The value indicates potentially underprovisioned resources.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/containers/containers?viewPanel=100100 on your Sourcegraph instance.
Technical details
Query:
SHELLmax_over_time(cadvisor_container_memory_usage_percentage_total{name=~"^(frontend|sourcegraph-frontend|gitserver|pgsql|codeintel-db|codeinsights|precise-code-intel-worker|prometheus|redis-cache|redis-store|redis-exporter|searcher|syntect-server|worker|zoekt-indexserver|zoekt-webserver|indexed-search|grafana|blobstore|jaeger).*"}[5m]) >= 80
containers: container_cpu_usage_provisioning
Container cpu usage total (5m maximum) across all cores of services that exceed 80% cpu limit
Containers that exceed 80% of their CPU limit. The value indicates potentially underprovisioned resources.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/containers/containers?viewPanel=100110 on your Sourcegraph instance.
Technical details
Query:
SHELLmax_over_time(cadvisor_container_cpu_usage_percentage_total{name=~"^(frontend|sourcegraph-frontend|gitserver|pgsql|codeintel-db|codeinsights|precise-code-intel-worker|prometheus|redis-cache|redis-store|redis-exporter|searcher|syntect-server|worker|zoekt-indexserver|zoekt-webserver|indexed-search|grafana|blobstore|jaeger).*"}[5m]) >= 80
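On Kubernetes deployments, one way to cross-check these two panels is to compare live usage against the configured limits for the affected pods. A minimal sketch, assuming metrics-server is installed and using app=sourcegraph-frontend as an example label selector; substitute the service flagged by the panel.
SHELL# Live usage (metrics-server required) vs. configured resource limits
kubectl top pod -l app=sourcegraph-frontend
kubectl get pod -l app=sourcegraph-frontend -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.containers[0].resources.limits}{"\n"}{end}'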
containers: container_oomkill_events_total
Container OOMKILL events total
This value indicates the total number of times the container's main process or child processes were terminated by the OOM killer. When it occurs frequently, it is an indicator of underprovisioning.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/containers/containers?viewPanel=100120 on your Sourcegraph instance.
Technical details
Query:
SHELLmax by (name) (container_oom_events_total{name=~"^(frontend|sourcegraph-frontend|gitserver|pgsql|codeintel-db|codeinsights|precise-code-intel-worker|prometheus|redis-cache|redis-store|redis-exporter|searcher|syntect-server|worker|zoekt-indexserver|zoekt-webserver|indexed-search|grafana|blobstore|jaeger).*"}) >= 1
containers: container_missing
Container missing
This value is the number of times a container has not been seen for more than one minute. If you observe this value change independent of deployment events (such as an upgrade), it could indicate pods are being OOM killed or terminated for some other reasons.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/containers/containers?viewPanel=100130 on your Sourcegraph instance.
Technical details
Query:
SHELLcount by(name) ((time() - container_last_seen{name=~"^(frontend|sourcegraph-frontend|gitserver|pgsql|codeintel-db|codeinsights|precise-code-intel-worker|prometheus|redis-cache|redis-store|redis-exporter|searcher|syntect-server|worker|zoekt-indexserver|zoekt-webserver|indexed-search|grafana|blobstore|jaeger).*"}) > 60)
Code Intelligence > Autoindexing
The service at internal/codeintel/autoindexing.
To see this dashboard, visit /-/debug/grafana/d/codeintel-autoindexing/codeintel-autoindexing on your Sourcegraph instance.
Code Intelligence > Autoindexing: Codeintel: Autoindexing > Summary
codeintel-autoindexing:
Auto-index jobs inserted over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-autoindexing/codeintel-autoindexing?viewPanel=100000 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_codeintel_dbstore_indexes_inserted[5m]))
codeintel-autoindexing: codeintel_autoindexing_error_rate
Auto-indexing job scheduler operation error rate over 10m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-autoindexing/codeintel-autoindexing?viewPanel=100001 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_codeintel_autoindexing_errors_total{op='HandleIndexSchedule',job=~"^${source:regex}.*"}[10m])) / (sum(increase(src_codeintel_autoindexing_total{op='HandleIndexSchedule',job=~"^${source:regex}.*"}[10m])) + sum(increase(src_codeintel_autoindexing_errors_total{op='HandleIndexSchedule',job=~"^${source:regex}.*"}[10m]))) * 100
Code Intelligence > Autoindexing: Codeintel: Autoindexing > Service
codeintel-autoindexing: codeintel_autoindexing_total
Aggregate service operations every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-autoindexing/codeintel-autoindexing?viewPanel=100100 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_codeintel_autoindexing_total{job=~"^${source:regex}.*"}[5m]))
codeintel-autoindexing: codeintel_autoindexing_99th_percentile_duration
Aggregate successful service operation duration distribution over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-autoindexing/codeintel-autoindexing?viewPanel=100101 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (le)(rate(src_codeintel_autoindexing_duration_seconds_bucket{job=~"^${source:regex}.*"}[5m]))
codeintel-autoindexing: codeintel_autoindexing_errors_total
Aggregate service operation errors every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-autoindexing/codeintel-autoindexing?viewPanel=100102 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_codeintel_autoindexing_errors_total{job=~"^${source:regex}.*"}[5m]))
codeintel-autoindexing: codeintel_autoindexing_error_rate
Aggregate service operation error rate over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-autoindexing/codeintel-autoindexing?viewPanel=100103 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_codeintel_autoindexing_errors_total{job=~"^${source:regex}.*"}[5m])) / (sum(increase(src_codeintel_autoindexing_total{job=~"^${source:regex}.*"}[5m])) + sum(increase(src_codeintel_autoindexing_errors_total{job=~"^${source:regex}.*"}[5m]))) * 100
codeintel-autoindexing: codeintel_autoindexing_total
Service operations every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-autoindexing/codeintel-autoindexing?viewPanel=100110 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (op)(increase(src_codeintel_autoindexing_total{job=~"^${source:regex}.*"}[5m]))
codeintel-autoindexing: codeintel_autoindexing_99th_percentile_duration
99th percentile successful service operation duration over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-autoindexing/codeintel-autoindexing?viewPanel=100111 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.99, sum by (le,op)(rate(src_codeintel_autoindexing_duration_seconds_bucket{job=~"^${source:regex}.*"}[5m])))
codeintel-autoindexing: codeintel_autoindexing_errors_total
Service operation errors every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-autoindexing/codeintel-autoindexing?viewPanel=100112 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (op)(increase(src_codeintel_autoindexing_errors_total{job=~"^${source:regex}.*"}[5m]))
codeintel-autoindexing: codeintel_autoindexing_error_rate
Service operation error rate over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-autoindexing/codeintel-autoindexing?viewPanel=100113 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (op)(increase(src_codeintel_autoindexing_errors_total{job=~"^${source:regex}.*"}[5m])) / (sum by (op)(increase(src_codeintel_autoindexing_total{job=~"^${source:regex}.*"}[5m])) + sum by (op)(increase(src_codeintel_autoindexing_errors_total{job=~"^${source:regex}.*"}[5m]))) * 100
Code Intelligence > Autoindexing: Codeintel: Autoindexing > GQL transport
codeintel-autoindexing: codeintel_autoindexing_transport_graphql_total
Aggregate resolver operations every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-autoindexing/codeintel-autoindexing?viewPanel=100200 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_codeintel_autoindexing_transport_graphql_total{job=~"^${source:regex}.*"}[5m]))
codeintel-autoindexing: codeintel_autoindexing_transport_graphql_99th_percentile_duration
Aggregate successful resolver operation duration distribution over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-autoindexing/codeintel-autoindexing?viewPanel=100201 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (le)(rate(src_codeintel_autoindexing_transport_graphql_duration_seconds_bucket{job=~"^${source:regex}.*"}[5m]))
codeintel-autoindexing: codeintel_autoindexing_transport_graphql_errors_total
Aggregate resolver operation errors every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-autoindexing/codeintel-autoindexing?viewPanel=100202 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_codeintel_autoindexing_transport_graphql_errors_total{job=~"^${source:regex}.*"}[5m]))
codeintel-autoindexing: codeintel_autoindexing_transport_graphql_error_rate
Aggregate resolver operation error rate over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-autoindexing/codeintel-autoindexing?viewPanel=100203 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_codeintel_autoindexing_transport_graphql_errors_total{job=~"^${source:regex}.*"}[5m])) / (sum(increase(src_codeintel_autoindexing_transport_graphql_total{job=~"^${source:regex}.*"}[5m])) + sum(increase(src_codeintel_autoindexing_transport_graphql_errors_total{job=~"^${source:regex}.*"}[5m]))) * 100
codeintel-autoindexing: codeintel_autoindexing_transport_graphql_total
Resolver operations every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-autoindexing/codeintel-autoindexing?viewPanel=100210 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (op)(increase(src_codeintel_autoindexing_transport_graphql_total{job=~"^${source:regex}.*"}[5m]))
codeintel-autoindexing: codeintel_autoindexing_transport_graphql_99th_percentile_duration
99th percentile successful resolver operation duration over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-autoindexing/codeintel-autoindexing?viewPanel=100211 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.99, sum by (le,op)(rate(src_codeintel_autoindexing_transport_graphql_duration_seconds_bucket{job=~"^${source:regex}.*"}[5m])))
codeintel-autoindexing: codeintel_autoindexing_transport_graphql_errors_total
Resolver operation errors every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-autoindexing/codeintel-autoindexing?viewPanel=100212 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (op)(increase(src_codeintel_autoindexing_transport_graphql_errors_total{job=~"^${source:regex}.*"}[5m]))
codeintel-autoindexing: codeintel_autoindexing_transport_graphql_error_rate
Resolver operation error rate over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-autoindexing/codeintel-autoindexing?viewPanel=100213 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (op)(increase(src_codeintel_autoindexing_transport_graphql_errors_total{job=~"^${source:regex}.*"}[5m])) / (sum by (op)(increase(src_codeintel_autoindexing_transport_graphql_total{job=~"^${source:regex}.*"}[5m])) + sum by (op)(increase(src_codeintel_autoindexing_transport_graphql_errors_total{job=~"^${source:regex}.*"}[5m]))) * 100
Code Intelligence > Autoindexing: Codeintel: Autoindexing > Store (internal)
codeintel-autoindexing: codeintel_autoindexing_store_total
Aggregate store operations every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-autoindexing/codeintel-autoindexing?viewPanel=100300 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_codeintel_autoindexing_store_total{job=~"^${source:regex}.*"}[5m]))
codeintel-autoindexing: codeintel_autoindexing_store_99th_percentile_duration
Aggregate successful store operation duration distribution over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-autoindexing/codeintel-autoindexing?viewPanel=100301 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (le)(rate(src_codeintel_autoindexing_store_duration_seconds_bucket{job=~"^${source:regex}.*"}[5m]))
codeintel-autoindexing: codeintel_autoindexing_store_errors_total
Aggregate store operation errors every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-autoindexing/codeintel-autoindexing?viewPanel=100302 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_codeintel_autoindexing_store_errors_total{job=~"^${source:regex}.*"}[5m]))
codeintel-autoindexing: codeintel_autoindexing_store_error_rate
Aggregate store operation error rate over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-autoindexing/codeintel-autoindexing?viewPanel=100303 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_codeintel_autoindexing_store_errors_total{job=~"^${source:regex}.*"}[5m])) / (sum(increase(src_codeintel_autoindexing_store_total{job=~"^${source:regex}.*"}[5m])) + sum(increase(src_codeintel_autoindexing_store_errors_total{job=~"^${source:regex}.*"}[5m]))) * 100
codeintel-autoindexing: codeintel_autoindexing_store_total
Store operations every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-autoindexing/codeintel-autoindexing?viewPanel=100310 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (op)(increase(src_codeintel_autoindexing_store_total{job=~"^${source:regex}.*"}[5m]))
codeintel-autoindexing: codeintel_autoindexing_store_99th_percentile_duration
99th percentile successful store operation duration over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-autoindexing/codeintel-autoindexing?viewPanel=100311 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.99, sum by (le,op)(rate(src_codeintel_autoindexing_store_duration_seconds_bucket{job=~"^${source:regex}.*"}[5m])))
codeintel-autoindexing: codeintel_autoindexing_store_errors_total
Store operation errors every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-autoindexing/codeintel-autoindexing?viewPanel=100312 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (op)(increase(src_codeintel_autoindexing_store_errors_total{job=~"^${source:regex}.*"}[5m]))
codeintel-autoindexing: codeintel_autoindexing_store_error_rate
Store operation error rate over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-autoindexing/codeintel-autoindexing?viewPanel=100313 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (op)(increase(src_codeintel_autoindexing_store_errors_total{job=~"^${source:regex}.*"}[5m])) / (sum by (op)(increase(src_codeintel_autoindexing_store_total{job=~"^${source:regex}.*"}[5m])) + sum by (op)(increase(src_codeintel_autoindexing_store_errors_total{job=~"^${source:regex}.*"}[5m]))) * 100
Code Intelligence > Autoindexing: Codeintel: Autoindexing > Background jobs (internal)
codeintel-autoindexing: codeintel_autoindexing_background_total
Aggregate background operations every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-autoindexing/codeintel-autoindexing?viewPanel=100400 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_codeintel_autoindexing_background_total{job=~"^${source:regex}.*"}[5m]))
codeintel-autoindexing: codeintel_autoindexing_background_99th_percentile_duration
Aggregate successful background operation duration distribution over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-autoindexing/codeintel-autoindexing?viewPanel=100401 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (le)(rate(src_codeintel_autoindexing_background_duration_seconds_bucket{job=~"^${source:regex}.*"}[5m]))
codeintel-autoindexing: codeintel_autoindexing_background_errors_total
Aggregate background operation errors every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-autoindexing/codeintel-autoindexing?viewPanel=100402 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_codeintel_autoindexing_background_errors_total{job=~"^${source:regex}.*"}[5m]))
codeintel-autoindexing: codeintel_autoindexing_background_error_rate
Aggregate background operation error rate over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-autoindexing/codeintel-autoindexing?viewPanel=100403 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_codeintel_autoindexing_background_errors_total{job=~"^${source:regex}.*"}[5m])) / (sum(increase(src_codeintel_autoindexing_background_total{job=~"^${source:regex}.*"}[5m])) + sum(increase(src_codeintel_autoindexing_background_errors_total{job=~"^${source:regex}.*"}[5m]))) * 100
codeintel-autoindexing: codeintel_autoindexing_background_total
Background operations every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-autoindexing/codeintel-autoindexing?viewPanel=100410 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (op)(increase(src_codeintel_autoindexing_background_total{job=~"^${source:regex}.*"}[5m]))
codeintel-autoindexing: codeintel_autoindexing_background_99th_percentile_duration
99th percentile successful background operation duration over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-autoindexing/codeintel-autoindexing?viewPanel=100411 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.99, sum by (le,op)(rate(src_codeintel_autoindexing_background_duration_seconds_bucket{job=~"^${source:regex}.*"}[5m])))
codeintel-autoindexing: codeintel_autoindexing_background_errors_total
Background operation errors every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-autoindexing/codeintel-autoindexing?viewPanel=100412 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (op)(increase(src_codeintel_autoindexing_background_errors_total{job=~"^${source:regex}.*"}[5m]))
codeintel-autoindexing: codeintel_autoindexing_background_error_rate
Background operation error rate over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-autoindexing/codeintel-autoindexing?viewPanel=100413 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (op)(increase(src_codeintel_autoindexing_background_errors_total{job=~"^${source:regex}.*"}[5m])) / (sum by (op)(increase(src_codeintel_autoindexing_background_total{job=~"^${source:regex}.*"}[5m])) + sum by (op)(increase(src_codeintel_autoindexing_background_errors_total{job=~"^${source:regex}.*"}[5m]))) * 100
Code Intelligence > Autoindexing: Codeintel: Autoindexing > Inference service (internal)
codeintel-autoindexing: codeintel_autoindexing_inference_total
Aggregate service operations every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-autoindexing/codeintel-autoindexing?viewPanel=100500 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_codeintel_autoindexing_inference_total{job=~"^${source:regex}.*"}[5m]))
codeintel-autoindexing: codeintel_autoindexing_inference_99th_percentile_duration
Aggregate successful service operation duration distribution over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-autoindexing/codeintel-autoindexing?viewPanel=100501 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (le)(rate(src_codeintel_autoindexing_inference_duration_seconds_bucket{job=~"^${source:regex}.*"}[5m]))
codeintel-autoindexing: codeintel_autoindexing_inference_errors_total
Aggregate service operation errors every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-autoindexing/codeintel-autoindexing?viewPanel=100502 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_codeintel_autoindexing_inference_errors_total{job=~"^${source:regex}.*"}[5m]))
codeintel-autoindexing: codeintel_autoindexing_inference_error_rate
Aggregate service operation error rate over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-autoindexing/codeintel-autoindexing?viewPanel=100503 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_codeintel_autoindexing_inference_errors_total{job=~"^${source:regex}.*"}[5m])) / (sum(increase(src_codeintel_autoindexing_inference_total{job=~"^${source:regex}.*"}[5m])) + sum(increase(src_codeintel_autoindexing_inference_errors_total{job=~"^${source:regex}.*"}[5m]))) * 100
codeintel-autoindexing: codeintel_autoindexing_inference_total
Service operations every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-autoindexing/codeintel-autoindexing?viewPanel=100510 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (op)(increase(src_codeintel_autoindexing_inference_total{job=~"^${source:regex}.*"}[5m]))
codeintel-autoindexing: codeintel_autoindexing_inference_99th_percentile_duration
99th percentile successful service operation duration over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-autoindexing/codeintel-autoindexing?viewPanel=100511 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.99, sum by (le,op)(rate(src_codeintel_autoindexing_inference_duration_seconds_bucket{job=~"^${source:regex}.*"}[5m])))
codeintel-autoindexing: codeintel_autoindexing_inference_errors_total
Service operation errors every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-autoindexing/codeintel-autoindexing?viewPanel=100512 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (op)(increase(src_codeintel_autoindexing_inference_errors_total{job=~"^${source:regex}.*"}[5m]))
codeintel-autoindexing: codeintel_autoindexing_inference_error_rate
Service operation error rate over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-autoindexing/codeintel-autoindexing?viewPanel=100513 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (op)(increase(src_codeintel_autoindexing_inference_errors_total{job=~"^${source:regex}.*"}[5m])) / (sum by (op)(increase(src_codeintel_autoindexing_inference_total{job=~"^${source:regex}.*"}[5m])) + sum by (op)(increase(src_codeintel_autoindexing_inference_errors_total{job=~"^${source:regex}.*"}[5m]))) * 100
Code Intelligence > Autoindexing: Codeintel: Luasandbox service
codeintel-autoindexing: luasandbox_total
Aggregate service operations every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-autoindexing/codeintel-autoindexing?viewPanel=100600 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_luasandbox_total{job=~"^${source:regex}.*"}[5m]))
codeintel-autoindexing: luasandbox_99th_percentile_duration
Aggregate successful service operation duration distribution over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-autoindexing/codeintel-autoindexing?viewPanel=100601 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (le)(rate(src_luasandbox_duration_seconds_bucket{job=~"^${source:regex}.*"}[5m]))
codeintel-autoindexing: luasandbox_errors_total
Aggregate service operation errors every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-autoindexing/codeintel-autoindexing?viewPanel=100602 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_luasandbox_errors_total{job=~"^${source:regex}.*"}[5m]))
codeintel-autoindexing: luasandbox_error_rate
Aggregate service operation error rate over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-autoindexing/codeintel-autoindexing?viewPanel=100603 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_luasandbox_errors_total{job=~"^${source:regex}.*"}[5m])) / (sum(increase(src_luasandbox_total{job=~"^${source:regex}.*"}[5m])) + sum(increase(src_luasandbox_errors_total{job=~"^${source:regex}.*"}[5m]))) * 100
codeintel-autoindexing: luasandbox_total
Service operations every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-autoindexing/codeintel-autoindexing?viewPanel=100610 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (op)(increase(src_luasandbox_total{job=~"^${source:regex}.*"}[5m]))
codeintel-autoindexing: luasandbox_99th_percentile_duration
99th percentile successful service operation duration over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-autoindexing/codeintel-autoindexing?viewPanel=100611 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.99, sum by (le,op)(rate(src_luasandbox_duration_seconds_bucket{job=~"^${source:regex}.*"}[5m])))
codeintel-autoindexing: luasandbox_errors_total
Service operation errors every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-autoindexing/codeintel-autoindexing?viewPanel=100612 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (op)(increase(src_luasandbox_errors_total{job=~"^${source:regex}.*"}[5m]))
codeintel-autoindexing: luasandbox_error_rate
Service operation error rate over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-autoindexing/codeintel-autoindexing?viewPanel=100613 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (op)(increase(src_luasandbox_errors_total{job=~"^${source:regex}.*"}[5m])) / (sum by (op)(increase(src_luasandbox_total{job=~"^${source:regex}.*"}[5m])) + sum by (op)(increase(src_luasandbox_errors_total{job=~"^${source:regex}.*"}[5m]))) * 100
Code Intelligence > Autoindexing: Codeintel: Autoindexing > Janitor task > Codeintel autoindexing janitor unknown repository
codeintel-autoindexing: codeintel_autoindexing_janitor_unknown_repository_records_scanned_total
Records scanned every 5m
The number of candidate records considered for cleanup.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-autoindexing/codeintel-autoindexing?viewPanel=100700 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_codeintel_autoindexing_janitor_unknown_repository_records_scanned_total{job=~"^${source:regex}.*"}[5m]))
codeintel-autoindexing: codeintel_autoindexing_janitor_unknown_repository_records_altered_total
Records altered every 5m
The number of candidate records altered as part of cleanup.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-autoindexing/codeintel-autoindexing?viewPanel=100701 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_codeintel_autoindexing_janitor_unknown_repository_records_altered_total{job=~"^${source:regex}.*"}[5m]))
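Read together with the records-scanned panel above, this shows how much of the scanned candidate set the janitor actually cleans up. As a rough sketch (not a panel on this dashboard), the fraction of scanned records that were altered could be computed as:
sum(increase(src_codeintel_autoindexing_janitor_unknown_repository_records_altered_total{job=~"^${source:regex}.*"}[5m])) / sum(increase(src_codeintel_autoindexing_janitor_unknown_repository_records_scanned_total{job=~"^${source:regex}.*"}[5m]))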
codeintel-autoindexing: codeintel_autoindexing_janitor_unknown_repository_total
Job invocation operations every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-autoindexing/codeintel-autoindexing?viewPanel=100710 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (op)(increase(src_codeintel_autoindexing_janitor_unknown_repository_total{job=~"^${source:regex}.*"}[5m]))
codeintel-autoindexing: codeintel_autoindexing_janitor_unknown_repository_99th_percentile_duration
99th percentile successful job invocation operation duration over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-autoindexing/codeintel-autoindexing?viewPanel=100711 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.99, sum by (le,op)(rate(src_codeintel_autoindexing_janitor_unknown_repository_duration_seconds_bucket{job=~"^${source:regex}.*"}[5m])))
codeintel-autoindexing: codeintel_autoindexing_janitor_unknown_repository_errors_total
Job invocation operation errors every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-autoindexing/codeintel-autoindexing?viewPanel=100712 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (op)(increase(src_codeintel_autoindexing_janitor_unknown_repository_errors_total{job=~"^${source:regex}.*"}[5m]))
codeintel-autoindexing: codeintel_autoindexing_janitor_unknown_repository_error_rate
Job invocation operation error rate over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-autoindexing/codeintel-autoindexing?viewPanel=100713 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (op)(increase(src_codeintel_autoindexing_janitor_unknown_repository_errors_total{job=~"^${source:regex}.*"}[5m])) / (sum by (op)(increase(src_codeintel_autoindexing_janitor_unknown_repository_total{job=~"^${source:regex}.*"}[5m])) + sum by (op)(increase(src_codeintel_autoindexing_janitor_unknown_repository_errors_total{job=~"^${source:regex}.*"}[5m]))) * 100
Code Intelligence > Autoindexing: Codeintel: Autoindexing > Janitor task > Codeintel autoindexing janitor unknown commit
codeintel-autoindexing: codeintel_autoindexing_janitor_unknown_commit_records_scanned_total
Records scanned every 5m
The number of candidate records considered for cleanup.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-autoindexing/codeintel-autoindexing?viewPanel=100800 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_codeintel_autoindexing_janitor_unknown_commit_records_scanned_total{job=~"^${source:regex}.*"}[5m]))
codeintel-autoindexing: codeintel_autoindexing_janitor_unknown_commit_records_altered_total
Records altered every 5m
The number of candidate records altered as part of cleanup.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-autoindexing/codeintel-autoindexing?viewPanel=100801 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_codeintel_autoindexing_janitor_unknown_commit_records_altered_total{job=~"^${source:regex}.*"}[5m]))
codeintel-autoindexing: codeintel_autoindexing_janitor_unknown_commit_total
Job invocation operations every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-autoindexing/codeintel-autoindexing?viewPanel=100810 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (op)(increase(src_codeintel_autoindexing_janitor_unknown_commit_total{job=~"^${source:regex}.*"}[5m]))
codeintel-autoindexing: codeintel_autoindexing_janitor_unknown_commit_99th_percentile_duration
99th percentile successful job invocation operation duration over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-autoindexing/codeintel-autoindexing?viewPanel=100811 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.99, sum by (le,op)(rate(src_codeintel_autoindexing_janitor_unknown_commit_duration_seconds_bucket{job=~"^${source:regex}.*"}[5m])))
codeintel-autoindexing: codeintel_autoindexing_janitor_unknown_commit_errors_total
Job invocation operation errors every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-autoindexing/codeintel-autoindexing?viewPanel=100812 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (op)(increase(src_codeintel_autoindexing_janitor_unknown_commit_errors_total{job=~"^${source:regex}.*"}[5m]))
codeintel-autoindexing: codeintel_autoindexing_janitor_unknown_commit_error_rate
Job invocation operation error rate over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-autoindexing/codeintel-autoindexing?viewPanel=100813 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (op)(increase(src_codeintel_autoindexing_janitor_unknown_commit_errors_total{job=~"^${source:regex}.*"}[5m])) / (sum by (op)(increase(src_codeintel_autoindexing_janitor_unknown_commit_total{job=~"^${source:regex}.*"}[5m])) + sum by (op)(increase(src_codeintel_autoindexing_janitor_unknown_commit_errors_total{job=~"^${source:regex}.*"}[5m]))) * 100
Code Intelligence > Autoindexing: Codeintel: Autoindexing > Janitor task > Codeintel autoindexing janitor expired
codeintel-autoindexing: codeintel_autoindexing_janitor_expired_records_scanned_total
Records scanned every 5m
The number of candidate records considered for cleanup.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-autoindexing/codeintel-autoindexing?viewPanel=100900 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_codeintel_autoindexing_janitor_expired_records_scanned_total{job=~"^${source:regex}.*"}[5m]))
codeintel-autoindexing: codeintel_autoindexing_janitor_expired_records_altered_total
Records altered every 5m
The number of candidate records altered as part of cleanup.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-autoindexing/codeintel-autoindexing?viewPanel=100901 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_codeintel_autoindexing_janitor_expired_records_altered_total{job=~"^${source:regex}.*"}[5m]))
codeintel-autoindexing: codeintel_autoindexing_janitor_expired_total
Job invocation operations every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-autoindexing/codeintel-autoindexing?viewPanel=100910 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (op)(increase(src_codeintel_autoindexing_janitor_expired_total{job=~"^${source:regex}.*"}[5m]))
codeintel-autoindexing: codeintel_autoindexing_janitor_expired_99th_percentile_duration
99th percentile successful job invocation operation duration over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-autoindexing/codeintel-autoindexing?viewPanel=100911 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.99, sum by (le,op)(rate(src_codeintel_autoindexing_janitor_expired_duration_seconds_bucket{job=~"^${source:regex}.*"}[5m])))
codeintel-autoindexing: codeintel_autoindexing_janitor_expired_errors_total
Job invocation operation errors every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-autoindexing/codeintel-autoindexing?viewPanel=100912 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (op)(increase(src_codeintel_autoindexing_janitor_expired_errors_total{job=~"^${source:regex}.*"}[5m]))
codeintel-autoindexing: codeintel_autoindexing_janitor_expired_error_rate
Job invocation operation error rate over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-autoindexing/codeintel-autoindexing?viewPanel=100913 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (op)(increase(src_codeintel_autoindexing_janitor_expired_errors_total{job=~"^${source:regex}.*"}[5m])) / (sum by (op)(increase(src_codeintel_autoindexing_janitor_expired_total{job=~"^${source:regex}.*"}[5m])) + sum by (op)(increase(src_codeintel_autoindexing_janitor_expired_errors_total{job=~"^${source:regex}.*"}[5m]))) * 100
Code Intelligence > Code Nav
The service at internal/codeintel/codenav.
To see this dashboard, visit /-/debug/grafana/d/codeintel-codenav/codeintel-codenav on your Sourcegraph instance.
Code Intelligence > Code Nav: Codeintel: CodeNav > Service
codeintel-codenav: codeintel_codenav_total
Aggregate service operations every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-codenav/codeintel-codenav?viewPanel=100000 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_codeintel_codenav_total{job=~"^${source:regex}.*"}[5m]))
codeintel-codenav: codeintel_codenav_99th_percentile_duration
Aggregate successful service operation duration distribution over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-codenav/codeintel-codenav?viewPanel=100001 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (le)(rate(src_codeintel_codenav_duration_seconds_bucket{job=~"^${source:regex}.*"}[5m]))
codeintel-codenav: codeintel_codenav_errors_total
Aggregate service operation errors every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-codenav/codeintel-codenav?viewPanel=100002 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_codeintel_codenav_errors_total{job=~"^${source:regex}.*"}[5m]))
codeintel-codenav: codeintel_codenav_error_rate
Aggregate service operation error rate over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-codenav/codeintel-codenav?viewPanel=100003 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_codeintel_codenav_errors_total{job=~"^${source:regex}.*"}[5m])) / (sum(increase(src_codeintel_codenav_total{job=~"^${source:regex}.*"}[5m])) + sum(increase(src_codeintel_codenav_errors_total{job=~"^${source:regex}.*"}[5m]))) * 100
codeintel-codenav: codeintel_codenav_total
Service operations every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-codenav/codeintel-codenav?viewPanel=100010 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (op)(increase(src_codeintel_codenav_total{job=~"^${source:regex}.*"}[5m]))
codeintel-codenav: codeintel_codenav_99th_percentile_duration
99th percentile successful service operation duration over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-codenav/codeintel-codenav?viewPanel=100011 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.99, sum by (le,op)(rate(src_codeintel_codenav_duration_seconds_bucket{job=~"^${source:regex}.*"}[5m])))
codeintel-codenav: codeintel_codenav_errors_total
Service operation errors every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-codenav/codeintel-codenav?viewPanel=100012 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (op)(increase(src_codeintel_codenav_errors_total{job=~"^${source:regex}.*"}[5m]))
codeintel-codenav: codeintel_codenav_error_rate
Service operation error rate over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-codenav/codeintel-codenav?viewPanel=100013 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (op)(increase(src_codeintel_codenav_errors_total{job=~"^${source:regex}.*"}[5m])) / (sum by (op)(increase(src_codeintel_codenav_total{job=~"^${source:regex}.*"}[5m])) + sum by (op)(increase(src_codeintel_codenav_errors_total{job=~"^${source:regex}.*"}[5m]))) * 100
Code Intelligence > Code Nav: Codeintel: CodeNav > LSIF store
codeintel-codenav: codeintel_codenav_lsifstore_total
Aggregate store operations every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-codenav/codeintel-codenav?viewPanel=100100 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_codeintel_codenav_lsifstore_total{job=~"^${source:regex}.*"}[5m]))
codeintel-codenav: codeintel_codenav_lsifstore_99th_percentile_duration
Aggregate successful store operation duration distribution over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-codenav/codeintel-codenav?viewPanel=100101 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (le)(rate(src_codeintel_codenav_lsifstore_duration_seconds_bucket{job=~"^${source:regex}.*"}[5m]))
codeintel-codenav: codeintel_codenav_lsifstore_errors_total
Aggregate store operation errors every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-codenav/codeintel-codenav?viewPanel=100102 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_codeintel_codenav_lsifstore_errors_total{job=~"^${source:regex}.*"}[5m]))
codeintel-codenav: codeintel_codenav_lsifstore_error_rate
Aggregate store operation error rate over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-codenav/codeintel-codenav?viewPanel=100103 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_codeintel_codenav_lsifstore_errors_total{job=~"^${source:regex}.*"}[5m])) / (sum(increase(src_codeintel_codenav_lsifstore_total{job=~"^${source:regex}.*"}[5m])) + sum(increase(src_codeintel_codenav_lsifstore_errors_total{job=~"^${source:regex}.*"}[5m]))) * 100
codeintel-codenav: codeintel_codenav_lsifstore_total
Store operations every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-codenav/codeintel-codenav?viewPanel=100110 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (op)(increase(src_codeintel_codenav_lsifstore_total{job=~"^${source:regex}.*"}[5m]))
codeintel-codenav: codeintel_codenav_lsifstore_99th_percentile_duration
99th percentile successful store operation duration over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-codenav/codeintel-codenav?viewPanel=100111 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.99, sum by (le,op)(rate(src_codeintel_codenav_lsifstore_duration_seconds_bucket{job=~"^${source:regex}.*"}[5m])))
codeintel-codenav: codeintel_codenav_lsifstore_errors_total
Store operation errors every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-codenav/codeintel-codenav?viewPanel=100112 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (op)(increase(src_codeintel_codenav_lsifstore_errors_total{job=~"^${source:regex}.*"}[5m]))
codeintel-codenav: codeintel_codenav_lsifstore_error_rate
Store operation error rate over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-codenav/codeintel-codenav?viewPanel=100113 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (op)(increase(src_codeintel_codenav_lsifstore_errors_total{job=~"^${source:regex}.*"}[5m])) / (sum by (op)(increase(src_codeintel_codenav_lsifstore_total{job=~"^${source:regex}.*"}[5m])) + sum by (op)(increase(src_codeintel_codenav_lsifstore_errors_total{job=~"^${source:regex}.*"}[5m]))) * 100
Code Intelligence > Code Nav: Codeintel: CodeNav > GQL Transport
codeintel-codenav: codeintel_codenav_transport_graphql_total
Aggregate resolver operations every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-codenav/codeintel-codenav?viewPanel=100200 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_codeintel_codenav_transport_graphql_total{job=~"^${source:regex}.*"}[5m]))
codeintel-codenav: codeintel_codenav_transport_graphql_99th_percentile_duration
Aggregate successful resolver operation duration distribution over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-codenav/codeintel-codenav?viewPanel=100201 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (le)(rate(src_codeintel_codenav_transport_graphql_duration_seconds_bucket{job=~"^${source:regex}.*"}[5m]))
codeintel-codenav: codeintel_codenav_transport_graphql_errors_total
Aggregate resolver operation errors every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-codenav/codeintel-codenav?viewPanel=100202 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_codeintel_codenav_transport_graphql_errors_total{job=~"^${source:regex}.*"}[5m]))
codeintel-codenav: codeintel_codenav_transport_graphql_error_rate
Aggregate resolver operation error rate over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-codenav/codeintel-codenav?viewPanel=100203 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_codeintel_codenav_transport_graphql_errors_total{job=~"^${source:regex}.*"}[5m])) / (sum(increase(src_codeintel_codenav_transport_graphql_total{job=~"^${source:regex}.*"}[5m])) + sum(increase(src_codeintel_codenav_transport_graphql_errors_total{job=~"^${source:regex}.*"}[5m]))) * 100
codeintel-codenav: codeintel_codenav_transport_graphql_total
Resolver operations every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-codenav/codeintel-codenav?viewPanel=100210 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (op)(increase(src_codeintel_codenav_transport_graphql_total{job=~"^${source:regex}.*"}[5m]))
codeintel-codenav: codeintel_codenav_transport_graphql_99th_percentile_duration
99th percentile successful resolver operation duration over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-codenav/codeintel-codenav?viewPanel=100211 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.99, sum by (le,op)(rate(src_codeintel_codenav_transport_graphql_duration_seconds_bucket{job=~"^${source:regex}.*"}[5m])))
codeintel-codenav: codeintel_codenav_transport_graphql_errors_total
Resolver operation errors every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-codenav/codeintel-codenav?viewPanel=100212 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (op)(increase(src_codeintel_codenav_transport_graphql_errors_total{job=~"^${source:regex}.*"}[5m]))
codeintel-codenav: codeintel_codenav_transport_graphql_error_rate
Resolver operation error rate over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-codenav/codeintel-codenav?viewPanel=100213 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (op)(increase(src_codeintel_codenav_transport_graphql_errors_total{job=~"^${source:regex}.*"}[5m])) / (sum by (op)(increase(src_codeintel_codenav_transport_graphql_total{job=~"^${source:regex}.*"}[5m])) + sum by (op)(increase(src_codeintel_codenav_transport_graphql_errors_total{job=~"^${source:regex}.*"}[5m]))) * 100
Code Intelligence > Code Nav: Codeintel: CodeNav > Store
codeintel-codenav: codeintel_codenav_store_total
Aggregate store operations every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-codenav/codeintel-codenav?viewPanel=100300 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_codeintel_codenav_store_total{job=~"^${source:regex}.*"}[5m]))
codeintel-codenav: codeintel_codenav_store_99th_percentile_duration
Aggregate successful store operation duration distribution over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-codenav/codeintel-codenav?viewPanel=100301 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (le)(rate(src_codeintel_codenav_store_duration_seconds_bucket{job=~"^${source:regex}.*"}[5m]))
codeintel-codenav: codeintel_codenav_store_errors_total
Aggregate store operation errors every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-codenav/codeintel-codenav?viewPanel=100302 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_codeintel_codenav_store_errors_total{job=~"^${source:regex}.*"}[5m]))
codeintel-codenav: codeintel_codenav_store_error_rate
Aggregate store operation error rate over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-codenav/codeintel-codenav?viewPanel=100303 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_codeintel_codenav_store_errors_total{job=~"^${source:regex}.*"}[5m])) / (sum(increase(src_codeintel_codenav_store_total{job=~"^${source:regex}.*"}[5m])) + sum(increase(src_codeintel_codenav_store_errors_total{job=~"^${source:regex}.*"}[5m]))) * 100
codeintel-codenav: codeintel_codenav_store_total
Store operations every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-codenav/codeintel-codenav?viewPanel=100310 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (op)(increase(src_codeintel_codenav_store_total{job=~"^${source:regex}.*"}[5m]))
codeintel-codenav: codeintel_codenav_store_99th_percentile_duration
99th percentile successful store operation duration over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-codenav/codeintel-codenav?viewPanel=100311 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.99, sum by (le,op)(rate(src_codeintel_codenav_store_duration_seconds_bucket{job=~"^${source:regex}.*"}[5m])))
codeintel-codenav: codeintel_codenav_store_errors_total
Store operation errors every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-codenav/codeintel-codenav?viewPanel=100312 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (op)(increase(src_codeintel_codenav_store_errors_total{job=~"^${source:regex}.*"}[5m]))
codeintel-codenav: codeintel_codenav_store_error_rate
Store operation error rate over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-codenav/codeintel-codenav?viewPanel=100313 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (op)(increase(src_codeintel_codenav_store_errors_total{job=~"^${source:regex}.*"}[5m])) / (sum by (op)(increase(src_codeintel_codenav_store_total{job=~"^${source:regex}.*"}[5m])) + sum by (op)(increase(src_codeintel_codenav_store_errors_total{job=~"^${source:regex}.*"}[5m]))) * 100
Code Intelligence > Policies
The service at internal/codeintel/policies.
To see this dashboard, visit /-/debug/grafana/d/codeintel-policies/codeintel-policies on your Sourcegraph instance.
Code Intelligence > Policies: Codeintel: Policies > Service
codeintel-policies: codeintel_policies_total
Aggregate service operations every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-policies/codeintel-policies?viewPanel=100000 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_codeintel_policies_total{job=~"^${source:regex}.*"}[5m]))
codeintel-policies: codeintel_policies_99th_percentile_duration
Aggregate successful service operation duration distribution over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-policies/codeintel-policies?viewPanel=100001 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (le)(rate(src_codeintel_policies_duration_seconds_bucket{job=~"^${source:regex}.*"}[5m]))
codeintel-policies: codeintel_policies_errors_total
Aggregate service operation errors every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-policies/codeintel-policies?viewPanel=100002 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_codeintel_policies_errors_total{job=~"^${source:regex}.*"}[5m]))
codeintel-policies: codeintel_policies_error_rate
Aggregate service operation error rate over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-policies/codeintel-policies?viewPanel=100003 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_codeintel_policies_errors_total{job=~"^${source:regex}.*"}[5m])) / (sum(increase(src_codeintel_policies_total{job=~"^${source:regex}.*"}[5m])) + sum(increase(src_codeintel_policies_errors_total{job=~"^${source:regex}.*"}[5m]))) * 100
codeintel-policies: codeintel_policies_total
Service operations every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-policies/codeintel-policies?viewPanel=100010 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (op)(increase(src_codeintel_policies_total{job=~"^${source:regex}.*"}[5m]))
codeintel-policies: codeintel_policies_99th_percentile_duration
99th percentile successful service operation duration over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-policies/codeintel-policies?viewPanel=100011 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.99, sum by (le,op)(rate(src_codeintel_policies_duration_seconds_bucket{job=~"^${source:regex}.*"}[5m])))
codeintel-policies: codeintel_policies_errors_total
Service operation errors every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-policies/codeintel-policies?viewPanel=100012 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (op)(increase(src_codeintel_policies_errors_total{job=~"^${source:regex}.*"}[5m]))
codeintel-policies: codeintel_policies_error_rate
Service operation error rate over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-policies/codeintel-policies?viewPanel=100013 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (op)(increase(src_codeintel_policies_errors_total{job=~"^${source:regex}.*"}[5m])) / (sum by (op)(increase(src_codeintel_policies_total{job=~"^${source:regex}.*"}[5m])) + sum by (op)(increase(src_codeintel_policies_errors_total{job=~"^${source:regex}.*"}[5m]))) * 100
Code Intelligence > Policies: Codeintel: Policies > Store
codeintel-policies: codeintel_policies_store_total
Aggregate store operations every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-policies/codeintel-policies?viewPanel=100100 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_codeintel_policies_store_total{job=~"^${source:regex}.*"}[5m]))
codeintel-policies: codeintel_policies_store_99th_percentile_duration
Aggregate successful store operation duration distribution over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-policies/codeintel-policies?viewPanel=100101 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (le)(rate(src_codeintel_policies_store_duration_seconds_bucket{job=~"^${source:regex}.*"}[5m]))
codeintel-policies: codeintel_policies_store_errors_total
Aggregate store operation errors every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-policies/codeintel-policies?viewPanel=100102 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_codeintel_policies_store_errors_total{job=~"^${source:regex}.*"}[5m]))
codeintel-policies: codeintel_policies_store_error_rate
Aggregate store operation error rate over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-policies/codeintel-policies?viewPanel=100103 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_codeintel_policies_store_errors_total{job=~"^${source:regex}.*"}[5m])) / (sum(increase(src_codeintel_policies_store_total{job=~"^${source:regex}.*"}[5m])) + sum(increase(src_codeintel_policies_store_errors_total{job=~"^${source:regex}.*"}[5m]))) * 100
codeintel-policies: codeintel_policies_store_total
Store operations every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-policies/codeintel-policies?viewPanel=100110 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (op)(increase(src_codeintel_policies_store_total{job=~"^${source:regex}.*"}[5m]))
codeintel-policies: codeintel_policies_store_99th_percentile_duration
99th percentile successful store operation duration over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-policies/codeintel-policies?viewPanel=100111 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.99, sum by (le,op)(rate(src_codeintel_policies_store_duration_seconds_bucket{job=~"^${source:regex}.*"}[5m])))
codeintel-policies: codeintel_policies_store_errors_total
Store operation errors every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-policies/codeintel-policies?viewPanel=100112 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (op)(increase(src_codeintel_policies_store_errors_total{job=~"^${source:regex}.*"}[5m]))
codeintel-policies: codeintel_policies_store_error_rate
Store operation error rate over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-policies/codeintel-policies?viewPanel=100113 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (op)(increase(src_codeintel_policies_store_errors_total{job=~"^${source:regex}.*"}[5m])) / (sum by (op)(increase(src_codeintel_policies_store_total{job=~"^${source:regex}.*"}[5m])) + sum by (op)(increase(src_codeintel_policies_store_errors_total{job=~"^${source:regex}.*"}[5m]))) * 100
Code Intelligence > Policies: Codeintel: Policies > GQL Transport
codeintel-policies: codeintel_policies_transport_graphql_total
Aggregate resolver operations every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-policies/codeintel-policies?viewPanel=100200 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_codeintel_policies_transport_graphql_total{job=~"^${source:regex}.*"}[5m]))
codeintel-policies: codeintel_policies_transport_graphql_99th_percentile_duration
Aggregate successful resolver operation duration distribution over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-policies/codeintel-policies?viewPanel=100201 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (le)(rate(src_codeintel_policies_transport_graphql_duration_seconds_bucket{job=~"^${source:regex}.*"}[5m]))
codeintel-policies: codeintel_policies_transport_graphql_errors_total
Aggregate resolver operation errors every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-policies/codeintel-policies?viewPanel=100202 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_codeintel_policies_transport_graphql_errors_total{job=~"^${source:regex}.*"}[5m]))
codeintel-policies: codeintel_policies_transport_graphql_error_rate
Aggregate resolver operation error rate over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-policies/codeintel-policies?viewPanel=100203 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_codeintel_policies_transport_graphql_errors_total{job=~"^${source:regex}.*"}[5m])) / (sum(increase(src_codeintel_policies_transport_graphql_total{job=~"^${source:regex}.*"}[5m])) + sum(increase(src_codeintel_policies_transport_graphql_errors_total{job=~"^${source:regex}.*"}[5m]))) * 100
codeintel-policies: codeintel_policies_transport_graphql_total
Resolver operations every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-policies/codeintel-policies?viewPanel=100210 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (op)(increase(src_codeintel_policies_transport_graphql_total{job=~"^${source:regex}.*"}[5m]))
codeintel-policies: codeintel_policies_transport_graphql_99th_percentile_duration
99th percentile successful resolver operation duration over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-policies/codeintel-policies?viewPanel=100211 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.99, sum by (le,op)(rate(src_codeintel_policies_transport_graphql_duration_seconds_bucket{job=~"^${source:regex}.*"}[5m])))
codeintel-policies: codeintel_policies_transport_graphql_errors_total
Resolver operation errors every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-policies/codeintel-policies?viewPanel=100212 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (op)(increase(src_codeintel_policies_transport_graphql_errors_total{job=~"^${source:regex}.*"}[5m]))
codeintel-policies: codeintel_policies_transport_graphql_error_rate
Resolver operation error rate over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-policies/codeintel-policies?viewPanel=100213 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (op)(increase(src_codeintel_policies_transport_graphql_errors_total{job=~"^${source:regex}.*"}[5m])) / (sum by (op)(increase(src_codeintel_policies_transport_graphql_total{job=~"^${source:regex}.*"}[5m])) + sum by (op)(increase(src_codeintel_policies_transport_graphql_errors_total{job=~"^${source:regex}.*"}[5m]))) * 100
Code Intelligence > Policies: Codeintel: Policies > Repository Pattern Matcher task
codeintel-policies: codeintel_background_policies_updated_total_total
Configuration policies with an updated repository membership list every 5m
Number of configuration policies whose repository membership list was updated.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-policies/codeintel-policies?viewPanel=100300 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_codeintel_background_policies_updated_total_total{job=~"^${source:regex}.*"}[5m]))
Code Intelligence > Uploads
The service at internal/codeintel/uploads.
To see this dashboard, visit /-/debug/grafana/d/codeintel-uploads/codeintel-uploads on your Sourcegraph instance.
Code Intelligence > Uploads: Codeintel: Uploads > Service
codeintel-uploads: codeintel_uploads_total
Aggregate service operations every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-uploads/codeintel-uploads?viewPanel=100000 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_codeintel_uploads_total{job=~"^${source:regex}.*"}[5m]))
codeintel-uploads: codeintel_uploads_99th_percentile_duration
Aggregate successful service operation duration distribution over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-uploads/codeintel-uploads?viewPanel=100001 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (le)(rate(src_codeintel_uploads_duration_seconds_bucket{job=~"^${source:regex}.*"}[5m]))
codeintel-uploads: codeintel_uploads_errors_total
Aggregate service operation errors every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-uploads/codeintel-uploads?viewPanel=100002 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_codeintel_uploads_errors_total{job=~"^${source:regex}.*"}[5m]))
codeintel-uploads: codeintel_uploads_error_rate
Aggregate service operation error rate over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-uploads/codeintel-uploads?viewPanel=100003 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_codeintel_uploads_errors_total{job=~"^${source:regex}.*"}[5m])) / (sum(increase(src_codeintel_uploads_total{job=~"^${source:regex}.*"}[5m])) + sum(increase(src_codeintel_uploads_errors_total{job=~"^${source:regex}.*"}[5m]))) * 100
codeintel-uploads: codeintel_uploads_total
Service operations every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-uploads/codeintel-uploads?viewPanel=100010 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (op)(increase(src_codeintel_uploads_total{job=~"^${source:regex}.*"}[5m]))
codeintel-uploads: codeintel_uploads_99th_percentile_duration
99th percentile successful service operation duration over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-uploads/codeintel-uploads?viewPanel=100011 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.99, sum by (le,op)(rate(src_codeintel_uploads_duration_seconds_bucket{job=~"^${source:regex}.*"}[5m])))
codeintel-uploads: codeintel_uploads_errors_total
Service operation errors every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-uploads/codeintel-uploads?viewPanel=100012 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (op)(increase(src_codeintel_uploads_errors_total{job=~"^${source:regex}.*"}[5m]))
codeintel-uploads: codeintel_uploads_error_rate
Service operation error rate over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-uploads/codeintel-uploads?viewPanel=100013 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (op)(increase(src_codeintel_uploads_errors_total{job=~"^${source:regex}.*"}[5m])) / (sum by (op)(increase(src_codeintel_uploads_total{job=~"^${source:regex}.*"}[5m])) + sum by (op)(increase(src_codeintel_uploads_errors_total{job=~"^${source:regex}.*"}[5m]))) * 100
Code Intelligence > Uploads: Codeintel: Uploads > Store (internal)
codeintel-uploads: codeintel_uploads_store_total
Aggregate store operations every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-uploads/codeintel-uploads?viewPanel=100100 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_codeintel_uploads_store_total{job=~"^${source:regex}.*"}[5m]))
codeintel-uploads: codeintel_uploads_store_99th_percentile_duration
Aggregate successful store operation duration distribution over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-uploads/codeintel-uploads?viewPanel=100101 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (le)(rate(src_codeintel_uploads_store_duration_seconds_bucket{job=~"^${source:regex}.*"}[5m]))
codeintel-uploads: codeintel_uploads_store_errors_total
Aggregate store operation errors every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-uploads/codeintel-uploads?viewPanel=100102 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_codeintel_uploads_store_errors_total{job=~"^${source:regex}.*"}[5m]))
codeintel-uploads: codeintel_uploads_store_error_rate
Aggregate store operation error rate over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-uploads/codeintel-uploads?viewPanel=100103 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_codeintel_uploads_store_errors_total{job=~"^${source:regex}.*"}[5m])) / (sum(increase(src_codeintel_uploads_store_total{job=~"^${source:regex}.*"}[5m])) + sum(increase(src_codeintel_uploads_store_errors_total{job=~"^${source:regex}.*"}[5m]))) * 100
codeintel-uploads: codeintel_uploads_store_total
Store operations every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-uploads/codeintel-uploads?viewPanel=100110 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (op)(increase(src_codeintel_uploads_store_total{job=~"^${source:regex}.*"}[5m]))
codeintel-uploads: codeintel_uploads_store_99th_percentile_duration
99th percentile successful store operation duration over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-uploads/codeintel-uploads?viewPanel=100111 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.99, sum by (le,op)(rate(src_codeintel_uploads_store_duration_seconds_bucket{job=~"^${source:regex}.*"}[5m])))
codeintel-uploads: codeintel_uploads_store_errors_total
Store operation errors every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-uploads/codeintel-uploads?viewPanel=100112 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (op)(increase(src_codeintel_uploads_store_errors_total{job=~"^${source:regex}.*"}[5m]))
codeintel-uploads: codeintel_uploads_store_error_rate
Store operation error rate over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-uploads/codeintel-uploads?viewPanel=100113 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (op)(increase(src_codeintel_uploads_store_errors_total{job=~"^${source:regex}.*"}[5m])) / (sum by (op)(increase(src_codeintel_uploads_store_total{job=~"^${source:regex}.*"}[5m])) + sum by (op)(increase(src_codeintel_uploads_store_errors_total{job=~"^${source:regex}.*"}[5m]))) * 100
Code Intelligence > Uploads: Codeintel: Uploads > GQL Transport
codeintel-uploads: codeintel_uploads_transport_graphql_total
Aggregate resolver operations every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-uploads/codeintel-uploads?viewPanel=100200 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_codeintel_uploads_transport_graphql_total{job=~"^${source:regex}.*"}[5m]))
codeintel-uploads: codeintel_uploads_transport_graphql_99th_percentile_duration
Aggregate successful resolver operation duration distribution over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-uploads/codeintel-uploads?viewPanel=100201 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (le)(rate(src_codeintel_uploads_transport_graphql_duration_seconds_bucket{job=~"^${source:regex}.*"}[5m]))
codeintel-uploads: codeintel_uploads_transport_graphql_errors_total
Aggregate resolver operation errors every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-uploads/codeintel-uploads?viewPanel=100202 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_codeintel_uploads_transport_graphql_errors_total{job=~"^${source:regex}.*"}[5m]))
codeintel-uploads: codeintel_uploads_transport_graphql_error_rate
Aggregate resolver operation error rate over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-uploads/codeintel-uploads?viewPanel=100203 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_codeintel_uploads_transport_graphql_errors_total{job=~"^${source:regex}.*"}[5m])) / (sum(increase(src_codeintel_uploads_transport_graphql_total{job=~"^${source:regex}.*"}[5m])) + sum(increase(src_codeintel_uploads_transport_graphql_errors_total{job=~"^${source:regex}.*"}[5m]))) * 100
codeintel-uploads: codeintel_uploads_transport_graphql_total
Resolver operations every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-uploads/codeintel-uploads?viewPanel=100210 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (op)(increase(src_codeintel_uploads_transport_graphql_total{job=~"^${source:regex}.*"}[5m]))
codeintel-uploads: codeintel_uploads_transport_graphql_99th_percentile_duration
99th percentile successful resolver operation duration over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-uploads/codeintel-uploads?viewPanel=100211 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.99, sum by (le,op)(rate(src_codeintel_uploads_transport_graphql_duration_seconds_bucket{job=~"^${source:regex}.*"}[5m])))
codeintel-uploads: codeintel_uploads_transport_graphql_errors_total
Resolver operation errors every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-uploads/codeintel-uploads?viewPanel=100212 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (op)(increase(src_codeintel_uploads_transport_graphql_errors_total{job=~"^${source:regex}.*"}[5m]))
codeintel-uploads: codeintel_uploads_transport_graphql_error_rate
Resolver operation error rate over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-uploads/codeintel-uploads?viewPanel=100213 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (op)(increase(src_codeintel_uploads_transport_graphql_errors_total{job=~"^${source:regex}.*"}[5m])) / (sum by (op)(increase(src_codeintel_uploads_transport_graphql_total{job=~"^${source:regex}.*"}[5m])) + sum by (op)(increase(src_codeintel_uploads_transport_graphql_errors_total{job=~"^${source:regex}.*"}[5m]))) * 100
Code Intelligence > Uploads: Codeintel: Uploads > HTTP Transport
codeintel-uploads: codeintel_uploads_transport_http_total
Aggregate http handler operations every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-uploads/codeintel-uploads?viewPanel=100300 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_codeintel_uploads_transport_http_total{job=~"^${source:regex}.*"}[5m]))
codeintel-uploads: codeintel_uploads_transport_http_99th_percentile_duration
Aggregate successful http handler operation duration distribution over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-uploads/codeintel-uploads?viewPanel=100301 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (le)(rate(src_codeintel_uploads_transport_http_duration_seconds_bucket{job=~"^${source:regex}.*"}[5m]))
codeintel-uploads: codeintel_uploads_transport_http_errors_total
Aggregate http handler operation errors every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-uploads/codeintel-uploads?viewPanel=100302 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_codeintel_uploads_transport_http_errors_total{job=~"^${source:regex}.*"}[5m]))
codeintel-uploads: codeintel_uploads_transport_http_error_rate
Aggregate http handler operation error rate over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-uploads/codeintel-uploads?viewPanel=100303 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_codeintel_uploads_transport_http_errors_total{job=~"^${source:regex}.*"}[5m])) / (sum(increase(src_codeintel_uploads_transport_http_total{job=~"^${source:regex}.*"}[5m])) + sum(increase(src_codeintel_uploads_transport_http_errors_total{job=~"^${source:regex}.*"}[5m]))) * 100
codeintel-uploads: codeintel_uploads_transport_http_total
Http handler operations every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-uploads/codeintel-uploads?viewPanel=100310 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (op)(increase(src_codeintel_uploads_transport_http_total{job=~"^${source:regex}.*"}[5m]))
codeintel-uploads: codeintel_uploads_transport_http_99th_percentile_duration
99th percentile successful http handler operation duration over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-uploads/codeintel-uploads?viewPanel=100311 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.99, sum by (le,op)(rate(src_codeintel_uploads_transport_http_duration_seconds_bucket{job=~"^${source:regex}.*"}[5m])))
codeintel-uploads: codeintel_uploads_transport_http_errors_total
Http handler operation errors every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-uploads/codeintel-uploads?viewPanel=100312 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (op)(increase(src_codeintel_uploads_transport_http_errors_total{job=~"^${source:regex}.*"}[5m]))
codeintel-uploads: codeintel_uploads_transport_http_error_rate
Http handler operation error rate over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-uploads/codeintel-uploads?viewPanel=100313 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (op)(increase(src_codeintel_uploads_transport_http_errors_total{job=~"^${source:regex}.*"}[5m])) / (sum by (op)(increase(src_codeintel_uploads_transport_http_total{job=~"^${source:regex}.*"}[5m])) + sum by (op)(increase(src_codeintel_uploads_transport_http_errors_total{job=~"^${source:regex}.*"}[5m]))) * 100
Code Intelligence > Uploads: Codeintel: Uploads > Expiration task
codeintel-uploads: codeintel_background_repositories_scanned_total
LSIF upload repository scan: repositories scanned every 5m
Number of repositories scanned for data retention
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-uploads/codeintel-uploads?viewPanel=100400 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_codeintel_background_repositories_scanned_total{job=~"^${source:regex}.*"}[5m]))
codeintel-uploads: codeintel_background_upload_records_scanned_total
LSIF upload records scan: records scanned every 5m
Number of codeintel upload records scanned for data retention
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-uploads/codeintel-uploads?viewPanel=100401 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_codeintel_background_upload_records_scanned_total{job=~"^${source:regex}.*"}[5m]))
codeintel-uploads: codeintel_background_commits_scanned_total
LSIF upload commits scan: commits scanned every 5m
Number of commits reachable from a codeintel upload record scanned for data retention
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-uploads/codeintel-uploads?viewPanel=100402 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_codeintel_background_commits_scanned_total{job=~"^${source:regex}.*"}[5m]))
codeintel-uploads: codeintel_background_upload_records_expired_total
LSIF upload records expired: uploads marked expired every 5m
Number of codeintel upload records marked as expired
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-uploads/codeintel-uploads?viewPanel=100403 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_codeintel_background_upload_records_expired_total{job=~"^${source:regex}.*"}[5m]))
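The four counters above describe the expiration pipeline: repositories are scanned, then their upload records and reachable commits, and qualifying records are marked expired. As a rough ad-hoc sketch (not a built-in panel), the share of scanned upload records that ended up marked expired over the last hour can be computed from the same counters:
SHELLsum(increase(src_codeintel_background_upload_records_expired_total{job=~"^${source:regex}.*"}[1h])) / sum(increase(src_codeintel_background_upload_records_scanned_total{job=~"^${source:regex}.*"}[1h])) * 100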
Code Intelligence > Uploads: Codeintel: Uploads > Janitor task > Codeintel uploads janitor unknown repository
codeintel-uploads: codeintel_uploads_janitor_unknown_repository_records_scanned_total
Records scanned every 5m
The number of candidate records considered for cleanup.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-uploads/codeintel-uploads?viewPanel=100500 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_codeintel_uploads_janitor_unknown_repository_records_scanned_total{job=~"^${source:regex}.*"}[5m]))
codeintel-uploads: codeintel_uploads_janitor_unknown_repository_records_altered_total
Records altered every 5m
The number of candidate records altered as part of cleanup.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-uploads/codeintel-uploads?viewPanel=100501 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_codeintel_uploads_janitor_unknown_repository_records_altered_total{job=~"^${source:regex}.*"}[5m]))
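Each janitor and reconciler task in the sections below exposes the same scanned/altered pair of counters, so the share of candidate records that were actually cleaned up can be estimated with a query of this shape (shown here for the unknown-repository janitor as an illustrative ad-hoc sketch, not a built-in panel):
SHELLsum(increase(src_codeintel_uploads_janitor_unknown_repository_records_altered_total{job=~"^${source:regex}.*"}[1h])) / sum(increase(src_codeintel_uploads_janitor_unknown_repository_records_scanned_total{job=~"^${source:regex}.*"}[1h])) * 100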
codeintel-uploads: codeintel_uploads_janitor_unknown_repository_total
Job invocation operations every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-uploads/codeintel-uploads?viewPanel=100510 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (op)(increase(src_codeintel_uploads_janitor_unknown_repository_total{job=~"^${source:regex}.*"}[5m]))
codeintel-uploads: codeintel_uploads_janitor_unknown_repository_99th_percentile_duration
99th percentile successful job invocation operation duration over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-uploads/codeintel-uploads?viewPanel=100511 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.99, sum by (le,op)(rate(src_codeintel_uploads_janitor_unknown_repository_duration_seconds_bucket{job=~"^${source:regex}.*"}[5m])))
codeintel-uploads: codeintel_uploads_janitor_unknown_repository_errors_total
Job invocation operation errors every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-uploads/codeintel-uploads?viewPanel=100512 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (op)(increase(src_codeintel_uploads_janitor_unknown_repository_errors_total{job=~"^${source:regex}.*"}[5m]))
codeintel-uploads: codeintel_uploads_janitor_unknown_repository_error_rate
Job invocation operation error rate over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-uploads/codeintel-uploads?viewPanel=100513 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (op)(increase(src_codeintel_uploads_janitor_unknown_repository_errors_total{job=~"^${source:regex}.*"}[5m])) / (sum by (op)(increase(src_codeintel_uploads_janitor_unknown_repository_total{job=~"^${source:regex}.*"}[5m])) + sum by (op)(increase(src_codeintel_uploads_janitor_unknown_repository_errors_total{job=~"^${source:regex}.*"}[5m]))) * 100
Code Intelligence > Uploads: Codeintel: Uploads > Janitor task > Codeintel uploads janitor unknown commit
codeintel-uploads: codeintel_uploads_janitor_unknown_commit_records_scanned_total
Records scanned every 5m
The number of candidate records considered for cleanup.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-uploads/codeintel-uploads?viewPanel=100600 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_codeintel_uploads_janitor_unknown_commit_records_scanned_total{job=~"^${source:regex}.*"}[5m]))
codeintel-uploads: codeintel_uploads_janitor_unknown_commit_records_altered_total
Records altered every 5m
The number of candidate records altered as part of cleanup.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-uploads/codeintel-uploads?viewPanel=100601 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_codeintel_uploads_janitor_unknown_commit_records_altered_total{job=~"^${source:regex}.*"}[5m]))
codeintel-uploads: codeintel_uploads_janitor_unknown_commit_total
Job invocation operations every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-uploads/codeintel-uploads?viewPanel=100610 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (op)(increase(src_codeintel_uploads_janitor_unknown_commit_total{job=~"^${source:regex}.*"}[5m]))
codeintel-uploads: codeintel_uploads_janitor_unknown_commit_99th_percentile_duration
99th percentile successful job invocation operation duration over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-uploads/codeintel-uploads?viewPanel=100611 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.99, sum by (le,op)(rate(src_codeintel_uploads_janitor_unknown_commit_duration_seconds_bucket{job=~"^${source:regex}.*"}[5m])))
codeintel-uploads: codeintel_uploads_janitor_unknown_commit_errors_total
Job invocation operation errors every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-uploads/codeintel-uploads?viewPanel=100612 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (op)(increase(src_codeintel_uploads_janitor_unknown_commit_errors_total{job=~"^${source:regex}.*"}[5m]))
codeintel-uploads: codeintel_uploads_janitor_unknown_commit_error_rate
Job invocation operation error rate over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-uploads/codeintel-uploads?viewPanel=100613 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (op)(increase(src_codeintel_uploads_janitor_unknown_commit_errors_total{job=~"^${source:regex}.*"}[5m])) / (sum by (op)(increase(src_codeintel_uploads_janitor_unknown_commit_total{job=~"^${source:regex}.*"}[5m])) + sum by (op)(increase(src_codeintel_uploads_janitor_unknown_commit_errors_total{job=~"^${source:regex}.*"}[5m]))) * 100
Code Intelligence > Uploads: Codeintel: Uploads > Janitor task > Codeintel uploads janitor abandoned
codeintel-uploads: codeintel_uploads_janitor_abandoned_records_scanned_total
Records scanned every 5m
The number of candidate records considered for cleanup.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-uploads/codeintel-uploads?viewPanel=100700 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_codeintel_uploads_janitor_abandoned_records_scanned_total{job=~"^${source:regex}.*"}[5m]))
codeintel-uploads: codeintel_uploads_janitor_abandoned_records_altered_total
Records altered every 5m
The number of candidate records altered as part of cleanup.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-uploads/codeintel-uploads?viewPanel=100701 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_codeintel_uploads_janitor_abandoned_records_altered_total{job=~"^${source:regex}.*"}[5m]))
codeintel-uploads: codeintel_uploads_janitor_abandoned_total
Job invocation operations every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-uploads/codeintel-uploads?viewPanel=100710 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (op)(increase(src_codeintel_uploads_janitor_abandoned_total{job=~"^${source:regex}.*"}[5m]))
codeintel-uploads: codeintel_uploads_janitor_abandoned_99th_percentile_duration
99th percentile successful job invocation operation duration over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-uploads/codeintel-uploads?viewPanel=100711 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.99, sum by (le,op)(rate(src_codeintel_uploads_janitor_abandoned_duration_seconds_bucket{job=~"^${source:regex}.*"}[5m])))
codeintel-uploads: codeintel_uploads_janitor_abandoned_errors_total
Job invocation operation errors every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-uploads/codeintel-uploads?viewPanel=100712 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (op)(increase(src_codeintel_uploads_janitor_abandoned_errors_total{job=~"^${source:regex}.*"}[5m]))
codeintel-uploads: codeintel_uploads_janitor_abandoned_error_rate
Job invocation operation error rate over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-uploads/codeintel-uploads?viewPanel=100713 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (op)(increase(src_codeintel_uploads_janitor_abandoned_errors_total{job=~"^${source:regex}.*"}[5m])) / (sum by (op)(increase(src_codeintel_uploads_janitor_abandoned_total{job=~"^${source:regex}.*"}[5m])) + sum by (op)(increase(src_codeintel_uploads_janitor_abandoned_errors_total{job=~"^${source:regex}.*"}[5m]))) * 100
Code Intelligence > Uploads: Codeintel: Uploads > Janitor task > Codeintel uploads expirer unreferenced
codeintel-uploads: codeintel_uploads_expirer_unreferenced_records_scanned_total
Records scanned every 5m
The number of candidate records considered for cleanup.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-uploads/codeintel-uploads?viewPanel=100800 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_codeintel_uploads_expirer_unreferenced_records_scanned_total{job=~"^${source:regex}.*"}[5m]))
codeintel-uploads: codeintel_uploads_expirer_unreferenced_records_altered_total
Records altered every 5m
The number of candidate records altered as part of cleanup.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-uploads/codeintel-uploads?viewPanel=100801 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_codeintel_uploads_expirer_unreferenced_records_altered_total{job=~"^${source:regex}.*"}[5m]))
codeintel-uploads: codeintel_uploads_expirer_unreferenced_total
Job invocation operations every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-uploads/codeintel-uploads?viewPanel=100810 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (op)(increase(src_codeintel_uploads_expirer_unreferenced_total{job=~"^${source:regex}.*"}[5m]))
codeintel-uploads: codeintel_uploads_expirer_unreferenced_99th_percentile_duration
99th percentile successful job invocation operation duration over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-uploads/codeintel-uploads?viewPanel=100811 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.99, sum by (le,op)(rate(src_codeintel_uploads_expirer_unreferenced_duration_seconds_bucket{job=~"^${source:regex}.*"}[5m])))
codeintel-uploads: codeintel_uploads_expirer_unreferenced_errors_total
Job invocation operation errors every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-uploads/codeintel-uploads?viewPanel=100812 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (op)(increase(src_codeintel_uploads_expirer_unreferenced_errors_total{job=~"^${source:regex}.*"}[5m]))
codeintel-uploads: codeintel_uploads_expirer_unreferenced_error_rate
Job invocation operation error rate over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-uploads/codeintel-uploads?viewPanel=100813 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (op)(increase(src_codeintel_uploads_expirer_unreferenced_errors_total{job=~"^${source:regex}.*"}[5m])) / (sum by (op)(increase(src_codeintel_uploads_expirer_unreferenced_total{job=~"^${source:regex}.*"}[5m])) + sum by (op)(increase(src_codeintel_uploads_expirer_unreferenced_errors_total{job=~"^${source:regex}.*"}[5m]))) * 100
Code Intelligence > Uploads: Codeintel: Uploads > Janitor task > Codeintel uploads expirer unreferenced graph
codeintel-uploads: codeintel_uploads_expirer_unreferenced_graph_records_scanned_total
Records scanned every 5m
The number of candidate records considered for cleanup.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-uploads/codeintel-uploads?viewPanel=100900 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_codeintel_uploads_expirer_unreferenced_graph_records_scanned_total{job=~"^${source:regex}.*"}[5m]))
codeintel-uploads: codeintel_uploads_expirer_unreferenced_graph_records_altered_total
Records altered every 5m
The number of candidate records altered as part of cleanup.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-uploads/codeintel-uploads?viewPanel=100901 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_codeintel_uploads_expirer_unreferenced_graph_records_altered_total{job=~"^${source:regex}.*"}[5m]))
codeintel-uploads: codeintel_uploads_expirer_unreferenced_graph_total
Job invocation operations every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-uploads/codeintel-uploads?viewPanel=100910 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (op)(increase(src_codeintel_uploads_expirer_unreferenced_graph_total{job=~"^${source:regex}.*"}[5m]))
codeintel-uploads: codeintel_uploads_expirer_unreferenced_graph_99th_percentile_duration
99th percentile successful job invocation operation duration over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-uploads/codeintel-uploads?viewPanel=100911 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.99, sum by (le,op)(rate(src_codeintel_uploads_expirer_unreferenced_graph_duration_seconds_bucket{job=~"^${source:regex}.*"}[5m])))
codeintel-uploads: codeintel_uploads_expirer_unreferenced_graph_errors_total
Job invocation operation errors every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-uploads/codeintel-uploads?viewPanel=100912 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (op)(increase(src_codeintel_uploads_expirer_unreferenced_graph_errors_total{job=~"^${source:regex}.*"}[5m]))
codeintel-uploads: codeintel_uploads_expirer_unreferenced_graph_error_rate
Job invocation operation error rate over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-uploads/codeintel-uploads?viewPanel=100913 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (op)(increase(src_codeintel_uploads_expirer_unreferenced_graph_errors_total{job=~"^${source:regex}.*"}[5m])) / (sum by (op)(increase(src_codeintel_uploads_expirer_unreferenced_graph_total{job=~"^${source:regex}.*"}[5m])) + sum by (op)(increase(src_codeintel_uploads_expirer_unreferenced_graph_errors_total{job=~"^${source:regex}.*"}[5m]))) * 100
Code Intelligence > Uploads: Codeintel: Uploads > Janitor task > Codeintel uploads hard deleter
codeintel-uploads: codeintel_uploads_hard_deleter_records_scanned_total
Records scanned every 5m
The number of candidate records considered for cleanup.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-uploads/codeintel-uploads?viewPanel=101000 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_codeintel_uploads_hard_deleter_records_scanned_total{job=~"^${source:regex}.*"}[5m]))
codeintel-uploads: codeintel_uploads_hard_deleter_records_altered_total
Records altered every 5m
The number of candidate records altered as part of cleanup.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-uploads/codeintel-uploads?viewPanel=101001 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_codeintel_uploads_hard_deleter_records_altered_total{job=~"^${source:regex}.*"}[5m]))
codeintel-uploads: codeintel_uploads_hard_deleter_total
Job invocation operations every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-uploads/codeintel-uploads?viewPanel=101010 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (op)(increase(src_codeintel_uploads_hard_deleter_total{job=~"^${source:regex}.*"}[5m]))
codeintel-uploads: codeintel_uploads_hard_deleter_99th_percentile_duration
99th percentile successful job invocation operation duration over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-uploads/codeintel-uploads?viewPanel=101011 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.99, sum by (le,op)(rate(src_codeintel_uploads_hard_deleter_duration_seconds_bucket{job=~"^${source:regex}.*"}[5m])))
codeintel-uploads: codeintel_uploads_hard_deleter_errors_total
Job invocation operation errors every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-uploads/codeintel-uploads?viewPanel=101012 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (op)(increase(src_codeintel_uploads_hard_deleter_errors_total{job=~"^${source:regex}.*"}[5m]))
codeintel-uploads: codeintel_uploads_hard_deleter_error_rate
Job invocation operation error rate over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-uploads/codeintel-uploads?viewPanel=101013 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (op)(increase(src_codeintel_uploads_hard_deleter_errors_total{job=~"^${source:regex}.*"}[5m])) / (sum by (op)(increase(src_codeintel_uploads_hard_deleter_total{job=~"^${source:regex}.*"}[5m])) + sum by (op)(increase(src_codeintel_uploads_hard_deleter_errors_total{job=~"^${source:regex}.*"}[5m]))) * 100
Code Intelligence > Uploads: Codeintel: Uploads > Janitor task > Codeintel uploads janitor audit logs
codeintel-uploads: codeintel_uploads_janitor_audit_logs_records_scanned_total
Records scanned every 5m
The number of candidate records considered for cleanup.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-uploads/codeintel-uploads?viewPanel=101100 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_codeintel_uploads_janitor_audit_logs_records_scanned_total{job=~"^${source:regex}.*"}[5m]))
codeintel-uploads: codeintel_uploads_janitor_audit_logs_records_altered_total
Records altered every 5m
The number of candidate records altered as part of cleanup.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-uploads/codeintel-uploads?viewPanel=101101 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_codeintel_uploads_janitor_audit_logs_records_altered_total{job=~"^${source:regex}.*"}[5m]))
codeintel-uploads: codeintel_uploads_janitor_audit_logs_total
Job invocation operations every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-uploads/codeintel-uploads?viewPanel=101110 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (op)(increase(src_codeintel_uploads_janitor_audit_logs_total{job=~"^${source:regex}.*"}[5m]))
codeintel-uploads: codeintel_uploads_janitor_audit_logs_99th_percentile_duration
99th percentile successful job invocation operation duration over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-uploads/codeintel-uploads?viewPanel=101111 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.99, sum by (le,op)(rate(src_codeintel_uploads_janitor_audit_logs_duration_seconds_bucket{job=~"^${source:regex}.*"}[5m])))
codeintel-uploads: codeintel_uploads_janitor_audit_logs_errors_total
Job invocation operation errors every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-uploads/codeintel-uploads?viewPanel=101112 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (op)(increase(src_codeintel_uploads_janitor_audit_logs_errors_total{job=~"^${source:regex}.*"}[5m]))
codeintel-uploads: codeintel_uploads_janitor_audit_logs_error_rate
Job invocation operation error rate over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-uploads/codeintel-uploads?viewPanel=101113 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (op)(increase(src_codeintel_uploads_janitor_audit_logs_errors_total{job=~"^${source:regex}.*"}[5m])) / (sum by (op)(increase(src_codeintel_uploads_janitor_audit_logs_total{job=~"^${source:regex}.*"}[5m])) + sum by (op)(increase(src_codeintel_uploads_janitor_audit_logs_errors_total{job=~"^${source:regex}.*"}[5m]))) * 100
Code Intelligence > Uploads: Codeintel: Uploads > Janitor task > Codeintel uploads janitor scip documents
codeintel-uploads: codeintel_uploads_janitor_scip_documents_records_scanned_total
Records scanned every 5m
The number of candidate records considered for cleanup.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-uploads/codeintel-uploads?viewPanel=101200 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_codeintel_uploads_janitor_scip_documents_records_scanned_total{job=~"^${source:regex}.*"}[5m]))
codeintel-uploads: codeintel_uploads_janitor_scip_documents_records_altered_total
Records altered every 5m
The number of candidate records altered as part of cleanup.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-uploads/codeintel-uploads?viewPanel=101201 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_codeintel_uploads_janitor_scip_documents_records_altered_total{job=~"^${source:regex}.*"}[5m]))
codeintel-uploads: codeintel_uploads_janitor_scip_documents_total
Job invocation operations every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-uploads/codeintel-uploads?viewPanel=101210 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (op)(increase(src_codeintel_uploads_janitor_scip_documents_total{job=~"^${source:regex}.*"}[5m]))
codeintel-uploads: codeintel_uploads_janitor_scip_documents_99th_percentile_duration
99th percentile successful job invocation operation duration over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-uploads/codeintel-uploads?viewPanel=101211 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.99, sum by (le,op)(rate(src_codeintel_uploads_janitor_scip_documents_duration_seconds_bucket{job=~"^${source:regex}.*"}[5m])))
codeintel-uploads: codeintel_uploads_janitor_scip_documents_errors_total
Job invocation operation errors every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-uploads/codeintel-uploads?viewPanel=101212 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (op)(increase(src_codeintel_uploads_janitor_scip_documents_errors_total{job=~"^${source:regex}.*"}[5m]))
codeintel-uploads: codeintel_uploads_janitor_scip_documents_error_rate
Job invocation operation error rate over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-uploads/codeintel-uploads?viewPanel=101213 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (op)(increase(src_codeintel_uploads_janitor_scip_documents_errors_total{job=~"^${source:regex}.*"}[5m])) / (sum by (op)(increase(src_codeintel_uploads_janitor_scip_documents_total{job=~"^${source:regex}.*"}[5m])) + sum by (op)(increase(src_codeintel_uploads_janitor_scip_documents_errors_total{job=~"^${source:regex}.*"}[5m]))) * 100
Code Intelligence > Uploads: Codeintel: Uploads > Reconciler task > Codeintel uploads reconciler scip metadata
codeintel-uploads: codeintel_uploads_reconciler_scip_metadata_records_scanned_total
Records scanned every 5m
The number of candidate records considered for cleanup.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-uploads/codeintel-uploads?viewPanel=101300 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_codeintel_uploads_reconciler_scip_metadata_records_scanned_total{job=~"^${source:regex}.*"}[5m]))
codeintel-uploads: codeintel_uploads_reconciler_scip_metadata_records_altered_total
Records altered every 5m
The number of candidate records altered as part of cleanup.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-uploads/codeintel-uploads?viewPanel=101301 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_codeintel_uploads_reconciler_scip_metadata_records_altered_total{job=~"^${source:regex}.*"}[5m]))
codeintel-uploads: codeintel_uploads_reconciler_scip_metadata_total
Job invocation operations every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-uploads/codeintel-uploads?viewPanel=101310 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (op)(increase(src_codeintel_uploads_reconciler_scip_metadata_total{job=~"^${source:regex}.*"}[5m]))
codeintel-uploads: codeintel_uploads_reconciler_scip_metadata_99th_percentile_duration
99th percentile successful job invocation operation duration over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-uploads/codeintel-uploads?viewPanel=101311 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.99, sum by (le,op)(rate(src_codeintel_uploads_reconciler_scip_metadata_duration_seconds_bucket{job=~"^${source:regex}.*"}[5m])))
codeintel-uploads: codeintel_uploads_reconciler_scip_metadata_errors_total
Job invocation operation errors every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-uploads/codeintel-uploads?viewPanel=101312 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (op)(increase(src_codeintel_uploads_reconciler_scip_metadata_errors_total{job=~"^${source:regex}.*"}[5m]))
codeintel-uploads: codeintel_uploads_reconciler_scip_metadata_error_rate
Job invocation operation error rate over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-uploads/codeintel-uploads?viewPanel=101313 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (op)(increase(src_codeintel_uploads_reconciler_scip_metadata_errors_total{job=~"^${source:regex}.*"}[5m])) / (sum by (op)(increase(src_codeintel_uploads_reconciler_scip_metadata_total{job=~"^${source:regex}.*"}[5m])) + sum by (op)(increase(src_codeintel_uploads_reconciler_scip_metadata_errors_total{job=~"^${source:regex}.*"}[5m]))) * 100
Code Intelligence > Uploads: Codeintel: Uploads > Reconciler task > Codeintel uploads reconciler scip data
codeintel-uploads: codeintel_uploads_reconciler_scip_data_records_scanned_total
Records scanned every 5m
The number of candidate records considered for cleanup.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-uploads/codeintel-uploads?viewPanel=101400 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_codeintel_uploads_reconciler_scip_data_records_scanned_total{job=~"^${source:regex}.*"}[5m]))
codeintel-uploads: codeintel_uploads_reconciler_scip_data_records_altered_total
Records altered every 5m
The number of candidate records altered as part of cleanup.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-uploads/codeintel-uploads?viewPanel=101401 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_codeintel_uploads_reconciler_scip_data_records_altered_total{job=~"^${source:regex}.*"}[5m]))
codeintel-uploads: codeintel_uploads_reconciler_scip_data_total
Job invocation operations every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-uploads/codeintel-uploads?viewPanel=101410 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (op)(increase(src_codeintel_uploads_reconciler_scip_data_total{job=~"^${source:regex}.*"}[5m]))
codeintel-uploads: codeintel_uploads_reconciler_scip_data_99th_percentile_duration
99th percentile successful job invocation operation duration over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-uploads/codeintel-uploads?viewPanel=101411 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.99, sum by (le,op)(rate(src_codeintel_uploads_reconciler_scip_data_duration_seconds_bucket{job=~"^${source:regex}.*"}[5m])))
codeintel-uploads: codeintel_uploads_reconciler_scip_data_errors_total
Job invocation operation errors every 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-uploads/codeintel-uploads?viewPanel=101412 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (op)(increase(src_codeintel_uploads_reconciler_scip_data_errors_total{job=~"^${source:regex}.*"}[5m]))
codeintel-uploads: codeintel_uploads_reconciler_scip_data_error_rate
Job invocation operation error rate over 5m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/codeintel-uploads/codeintel-uploads?viewPanel=101413 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (op)(increase(src_codeintel_uploads_reconciler_scip_data_errors_total{job=~"^${source:regex}.*"}[5m])) / (sum by (op)(increase(src_codeintel_uploads_reconciler_scip_data_total{job=~"^${source:regex}.*"}[5m])) + sum by (op)(increase(src_codeintel_uploads_reconciler_scip_data_errors_total{job=~"^${source:regex}.*"}[5m]))) * 100
Telemetry
Monitoring telemetry services in Sourcegraph.
To see this dashboard, visit /-/debug/grafana/d/telemetry/telemetry on your Sourcegraph instance.
Telemetry: Telemetry Gateway Exporter: Events export and queue metrics
telemetry: telemetry_gateway_exporter_queue_size
Telemetry event payloads pending export
The number of events queued to be exported.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/telemetry/telemetry?viewPanel=100000 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(src_telemetrygatewayexporter_queue_size)
telemetry: telemetry_gateway_exporter_queue_growth
Rate of growth of events export queue over 30m
A positive value indicates the queue is growing.
Refer to the alerts reference for 2 alerts related to this panel.
To see this panel, visit /-/debug/grafana/d/telemetry/telemetry?viewPanel=100001 on your Sourcegraph instance.
Technical details
Query:
SHELLmax(deriv(src_telemetrygatewayexporter_queue_size[30m]))
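If the queue is growing, it can help to estimate how long the current backlog would take to drain at the recent export rate. A rough ad-hoc estimate in hours (assuming the recent export rate holds) divides the queue size by the number of events exported over the last hour:
SHELLsum(src_telemetrygatewayexporter_queue_size) / sum(increase(src_telemetrygatewayexporter_exported_events[1h]))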
telemetry: src_telemetrygatewayexporter_exported_events
Events exported from queue per hour
The number of events being exported.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/telemetry/telemetry?viewPanel=100010 on your Sourcegraph instance.
Technical details
Query:
SHELLmax(increase(src_telemetrygatewayexporter_exported_events[1h]))
telemetry: telemetry_gateway_exporter_batch_size
Number of events exported per batch over 30m
The number of events exported in each batch. The largest bucket is the maximum number of events exported per batch.
If the distribution trends toward the maximum bucket, events export throughput is at or approaching saturation; try increasing TELEMETRY_GATEWAY_EXPORTER_EXPORT_BATCH_SIZE or decreasing TELEMETRY_GATEWAY_EXPORTER_EXPORT_INTERVAL.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/telemetry/telemetry?viewPanel=100011 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (le) (rate(src_telemetrygatewayexporter_batch_size_bucket[30m]))
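To check whether batches are trending toward the configured maximum, a percentile of the same histogram can be compared against TELEMETRY_GATEWAY_EXPORTER_EXPORT_BATCH_SIZE. For example, an ad-hoc query for the 95th percentile batch size over 30m:
SHELLhistogram_quantile(0.95, sum by (le) (rate(src_telemetrygatewayexporter_batch_size_bucket[30m])))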
Telemetry: Telemetry Gateway Exporter: Events export job operations
telemetry: telemetrygatewayexporter_exporter_total
Events exporter operations every 30m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/telemetry/telemetry?viewPanel=100100 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_telemetrygatewayexporter_exporter_total{job=~"^worker.*"}[30m]))
telemetry: telemetrygatewayexporter_exporter_99th_percentile_duration
Aggregate successful events exporter operation duration distribution over 30m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/telemetry/telemetry?viewPanel=100101 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (le)(rate(src_telemetrygatewayexporter_exporter_duration_seconds_bucket{job=~"^worker.*"}[30m]))
telemetry: telemetrygatewayexporter_exporter_errors_total
Events exporter operation errors every 30m
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/telemetry/telemetry?viewPanel=100102 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_telemetrygatewayexporter_exporter_errors_total{job=~"^worker.*"}[30m]))
telemetry: telemetrygatewayexporter_exporter_error_rate
Events exporter operation error rate over 30m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/telemetry/telemetry?viewPanel=100103 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_telemetrygatewayexporter_exporter_errors_total{job=~"^worker.*"}[30m])) / (sum(increase(src_telemetrygatewayexporter_exporter_total{job=~"^worker.*"}[30m])) + sum(increase(src_telemetrygatewayexporter_exporter_errors_total{job=~"^worker.*"}[30m]))) * 100
Telemetry: Telemetry Gateway Exporter: Events export queue cleanup job operations
telemetry: telemetrygatewayexporter_queue_cleanup_total
Events export queue cleanup operations every 30m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/telemetry/telemetry?viewPanel=100200 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_telemetrygatewayexporter_queue_cleanup_total{job=~"^worker.*"}[30m]))
telemetry: telemetrygatewayexporter_queue_cleanup_99th_percentile_duration
Aggregate successful events export queue cleanup operation duration distribution over 30m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/telemetry/telemetry?viewPanel=100201 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (le)(rate(src_telemetrygatewayexporter_queue_cleanup_duration_seconds_bucket{job=~"^worker.*"}[30m]))
telemetry: telemetrygatewayexporter_queue_cleanup_errors_total
Events export queue cleanup operation errors every 30m
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/telemetry/telemetry?viewPanel=100202 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_telemetrygatewayexporter_queue_cleanup_errors_total{job=~"^worker.*"}[30m]))
telemetry: telemetrygatewayexporter_queue_cleanup_error_rate
Events export queue cleanup operation error rate over 30m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/telemetry/telemetry?viewPanel=100203 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_telemetrygatewayexporter_queue_cleanup_errors_total{job=~"^worker.*"}[30m])) / (sum(increase(src_telemetrygatewayexporter_queue_cleanup_total{job=~"^worker.*"}[30m])) + sum(increase(src_telemetrygatewayexporter_queue_cleanup_errors_total{job=~"^worker.*"}[30m]))) * 100
Telemetry: Telemetry Gateway Exporter: Events export queue metrics reporting job operations
telemetry: telemetrygatewayexporter_queue_metrics_reporter_total
Events export backlog metrics reporting operations every 30m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/telemetry/telemetry?viewPanel=100300 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_telemetrygatewayexporter_queue_metrics_reporter_total{job=~"^worker.*"}[30m]))
telemetry: telemetrygatewayexporter_queue_metrics_reporter_99th_percentile_duration
Aggregate successful events export backlog metrics reporting operation duration distribution over 30m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/telemetry/telemetry?viewPanel=100301 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (le)(rate(src_telemetrygatewayexporter_queue_metrics_reporter_duration_seconds_bucket{job=~"^worker.*"}[30m]))
telemetry: telemetrygatewayexporter_queue_metrics_reporter_errors_total
Events export backlog metrics reporting operation errors every 30m
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/telemetry/telemetry?viewPanel=100302 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_telemetrygatewayexporter_queue_metrics_reporter_errors_total{job=~"^worker.*"}[30m]))
telemetry: telemetrygatewayexporter_queue_metrics_reporter_error_rate
Events export backlog metrics reporting operation error rate over 30m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/telemetry/telemetry?viewPanel=100303 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_telemetrygatewayexporter_queue_metrics_reporter_errors_total{job=~"^worker.*"}[30m])) / (sum(increase(src_telemetrygatewayexporter_queue_metrics_reporter_total{job=~"^worker.*"}[30m])) + sum(increase(src_telemetrygatewayexporter_queue_metrics_reporter_errors_total{job=~"^worker.*"}[30m]))) * 100
Telemetry: Telemetry persistence
telemetry: telemetry_v2_export_queue_write_failures
Failed writes to events export queue over 5m
Telemetry V2 writes send events into the telemetry_events_export_queue for the exporter to periodically export.
Refer to the alerts reference for 2 alerts related to this panel.
To see this panel, visit /-/debug/grafana/d/telemetry/telemetry?viewPanel=100400 on your Sourcegraph instance.
Technical details
Query:
SHELL(sum(increase(src_telemetry_export_store_queued_events{failed="true"}[5m])) / sum(increase(src_telemetry_export_store_queued_events[5m]))) * 100
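Because this panel shows a percentage, a low overall write volume can make a handful of failures look dramatic. The absolute number of failed export-queue writes over the same window can be checked with an ad-hoc query:
SHELLsum(increase(src_telemetry_export_store_queued_events{failed="true"}[5m]))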
telemetry: telemetry_v2_event_logs_write_failures
Failed writes of V2 events to the V1 'event_logs' table over 5m
Telemetry V2 writes also attempt to tee events into the legacy V1 events format in the event_logs database table for long-term local persistence.
Refer to the alerts reference for 2 alerts related to this panel.
To see this panel, visit /-/debug/grafana/d/telemetry/telemetry?viewPanel=100401 on your Sourcegraph instance.
Technical details
Query:
SHELL(sum(increase(src_telemetry_teestore_v1_events{failed="true"}[5m])) / sum(increase(src_telemetry_teestore_v1_events[5m]))) * 100
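As with the export queue, the percentage can be put in context by checking the absolute number of failed tee writes over the same window:
SHELLsum(increase(src_telemetry_teestore_v1_events{failed="true"}[5m]))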
Telemetry: Telemetry Gateway Exporter: (off by default) User metadata export job operations
telemetry: telemetrygatewayexporter_usermetadata_exporter_total
(off by default) user metadata exporter operations every 30m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/telemetry/telemetry?viewPanel=100500 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_telemetrygatewayexporter_usermetadata_exporter_total{job=~"^worker.*"}[30m]))
telemetry: telemetrygatewayexporter_usermetadata_exporter_99th_percentile_duration
Aggregate successful (off by default) user metadata exporter operation duration distribution over 30m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/telemetry/telemetry?viewPanel=100501 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (le)(rate(src_telemetrygatewayexporter_usermetadata_exporter_duration_seconds_bucket{job=~"^worker.*"}[30m]))
telemetry: telemetrygatewayexporter_usermetadata_exporter_errors_total
(off by default) user metadata exporter operation errors every 30m
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/telemetry/telemetry?viewPanel=100502 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_telemetrygatewayexporter_usermetadata_exporter_errors_total{job=~"^worker.*"}[30m]))
telemetry: telemetrygatewayexporter_usermetadata_exporter_error_rate
(off by default) user metadata exporter operation error rate over 30m
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/telemetry/telemetry?viewPanel=100503 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_telemetrygatewayexporter_usermetadata_exporter_errors_total{job=~"^worker.*"}[30m])) / (sum(increase(src_telemetrygatewayexporter_usermetadata_exporter_total{job=~"^worker.*"}[30m])) + sum(increase(src_telemetrygatewayexporter_usermetadata_exporter_errors_total{job=~"^worker.*"}[30m]))) * 100
OpenTelemetry Collector
The OpenTelemetry collector ingests OpenTelemetry data from Sourcegraph and exports it to the configured backends.
To see this dashboard, visit /-/debug/grafana/d/otel-collector/otel-collector on your Sourcegraph instance.
OpenTelemetry Collector: Receivers
otel-collector: otel_span_receive_rate
Spans received per receiver per minute
Shows the rate of spans accepted by the configured receiver.
A trace is a collection of spans, and a span represents a unit of work or operation; spans are the building blocks of traces. The spans counted here have only been accepted by the receiver, which means they still have to move through the configured pipeline to be exported. For more information on tracing and on configuring an OpenTelemetry receiver, see https://opentelemetry.io/docs/collector/configuration/#receivers.
See the Exporters section to see spans that have made it through the pipeline and been exported.
Depending on the configured processors, received spans might be dropped and not exported. For more information on configuring processors, see https://opentelemetry.io/docs/collector/configuration/#processors.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/otel-collector/otel-collector?viewPanel=100000 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (receiver) (rate(otelcol_receiver_accepted_spans[1m]))
otel-collector: otel_span_refused
Spans refused per receiver
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/otel-collector/otel-collector?viewPanel=100001 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (receiver) (rate(otelcol_receiver_refused_spans[1m]))
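Taken together, the accepted and refused counters describe how much incoming trace data the collector is turning away. A rough ad-hoc way (not a built-in panel) to express refusals as a share of all spans offered to each receiver:
SHELLsum by (receiver) (rate(otelcol_receiver_refused_spans[5m])) / (sum by (receiver) (rate(otelcol_receiver_accepted_spans[5m])) + sum by (receiver) (rate(otelcol_receiver_refused_spans[5m]))) * 100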
OpenTelemetry Collector: Exporters
otel-collector: otel_span_export_rate
Spans exported per exporter per minute
Shows the rate of spans being sent by the exporter
A Trace is a collection of spans. A Span represents a unit of work or operation. Spans are the building blocks of Traces. The rate of spans here indicates spans that have made it through the configured pipeline and have been sent to the configured export destination.
For more information on configuring an exporter for the OpenTelemetry collector, see https://opentelemetry.io/docs/collector/configuration/#exporters.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/otel-collector/otel-collector?viewPanel=100100 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (exporter) (rate(otelcol_exporter_sent_spans[1m]))
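To check whether the pipeline is dropping spans between the receivers and the exporters, the aggregate receive and export rates can be compared. This is a rough sketch for ad-hoc investigation rather than an official panel; fan-out to multiple exporters or batching delays can make the difference noisy or even negative:
SHELLsum(rate(otelcol_receiver_accepted_spans[1m])) - sum(rate(otelcol_exporter_sent_spans[1m]))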
otel-collector: otel_span_export_failures
Span export failures by exporter
Shows the rate of spans that failed to be sent by the configured exporter. A number higher than 0 for a long period can indicate a problem with the exporter configuration or with the service that is being exported to.
For more information on configuring an exporter for the OpenTelemetry collector, see https://opentelemetry.io/docs/collector/configuration/#exporters.
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/otel-collector/otel-collector?viewPanel=100101 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (exporter) (rate(otelcol_exporter_send_failed_spans[1m]))
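A failure percentage per exporter is sometimes easier to reason about than the raw failure rate. The following is an illustrative variant of the panel query above, following the error-rate convention used elsewhere in this document; it is not one of the dashboard's panels:
SHELLsum by (exporter) (rate(otelcol_exporter_send_failed_spans[1m])) / (sum by (exporter) (rate(otelcol_exporter_sent_spans[1m])) + sum by (exporter) (rate(otelcol_exporter_send_failed_spans[1m]))) * 100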
OpenTelemetry Collector: Queue Length
otel-collector: otelcol_exporter_queue_capacity
Exporter queue capacity
Shows the capacity of the retry queue (in batches).
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/otel-collector/otel-collector?viewPanel=100200 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (exporter) (rate(otelcol_exporter_queue_capacity{job=~"^.*"}[1m]))
otel-collector: otelcol_exporter_queue_size
Exporter queue size
Shows the current size of the retry queue
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/otel-collector/otel-collector?viewPanel=100201 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (exporter) (rate(otelcol_exporter_queue_size{job=~"^.*"}[1m]))
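To see how close each exporter's retry queue is to filling up, the size and capacity gauges from the two panels above can be combined into a utilization percentage. This is a sketch for ad-hoc use; it reads the gauges directly rather than the rate() form used by the panels:
SHELLmax by (exporter) (otelcol_exporter_queue_size{job=~"^.*"}) / max by (exporter) (otelcol_exporter_queue_capacity{job=~"^.*"}) * 100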
otel-collector: otelcol_exporter_enqueue_failed_spans
Exporter enqueue failed spans
Shows the rate of spans that failed to be enqueued by the configured exporter. A number higher than 0 for a long period can indicate a problem with the exporter configuration.
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/otel-collector/otel-collector?viewPanel=100202 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (exporter) (rate(otelcol_exporter_enqueue_failed_spans{job=~"^.*"}[1m]))
OpenTelemetry Collector: Processors
otel-collector: otelcol_processor_dropped_spans
Spans dropped per processor per minute
Shows the rate of spans dropped by the configured processor
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/otel-collector/otel-collector?viewPanel=100300 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (processor) (rate(otelcol_processor_dropped_spans[1m]))
OpenTelemetry Collector: Collector resource usage
otel-collector: otel_cpu_usage
Cpu usage of the collector
Shows CPU usage as reported by the OpenTelemetry collector.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/otel-collector/otel-collector?viewPanel=100400 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (job) (rate(otelcol_process_cpu_seconds{job=~"^.*"}[1m]))
otel-collector: otel_memory_resident_set_size
Memory allocated to the otel collector
Shows the allocated memory Resident Set Size (RSS) as reported by the OpenTelemetry collector.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/otel-collector/otel-collector?viewPanel=100401 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (job) (rate(otelcol_process_memory_rss{job=~"^.*"}[1m]))
otel-collector: otel_memory_usage
Memory used by the collector
Shows how much memory is being used by the otel collector. High memory usage might indicate that:
- the configured pipeline is keeping a lot of spans in memory for processing
- spans are failing to be sent and the exporter is configured to retry
- a batch processor is configured with a high batch count
For more information on configuring processors for the OpenTelemetry collector see https://opentelemetry.io/docs/collector/configuration/#processors.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/otel-collector/otel-collector?viewPanel=100402 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (job) (rate(otelcol_process_runtime_total_alloc_bytes{job=~"^.*"}[1m]))
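When investigating memory pressure, it can also help to read the resident set size gauge directly alongside the allocation rate above: steady RSS growth with a flat allocation rate points at retained spans rather than allocation churn. A hedged ad-hoc query, not a built-in panel:
SHELLmax by (job) (otelcol_process_memory_rss{job=~"^.*"})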
OpenTelemetry Collector: Container monitoring (not available on server)
otel-collector: container_missing
Container missing
This value is the number of times a container has not been seen for more than one minute. If you observe this value change independent of deployment events (such as an upgrade), it could indicate pods are being OOM killed or terminated for some other reason.
- Kubernetes:
  - Determine if the pod was OOM killed using kubectl describe pod otel-collector (look for OOMKilled: true) and, if so, consider increasing the memory limit in the relevant Deployment.yaml.
  - Check the logs before the container restarted to see if there are panic: messages or similar using kubectl logs -p otel-collector.
- Docker Compose:
  - Determine if the pod was OOM killed using docker inspect -f '{{json .State}}' otel-collector (look for "OOMKilled":true) and, if so, consider increasing the memory limit of the otel-collector container in docker-compose.yml.
  - Check the logs before the container restarted to see if there are panic: messages or similar using docker logs otel-collector (note this will include logs from the previous and currently running container).
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/otel-collector/otel-collector?viewPanel=100500 on your Sourcegraph instance.
Technical details
Query:
SHELLcount by(name) ((time() - container_last_seen{name=~"^otel-collector.*"}) > 60)
otel-collector: container_cpu_usage
Container cpu usage total (1m average) across all cores by instance
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/otel-collector/otel-collector?viewPanel=100501 on your Sourcegraph instance.
Technical details
Query:
SHELLcadvisor_container_cpu_usage_percentage_total{name=~"^otel-collector.*"}
otel-collector: container_memory_usage
Container memory usage by instance
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/otel-collector/otel-collector?viewPanel=100502 on your Sourcegraph instance.
Technical details
Query:
SHELLcadvisor_container_memory_usage_percentage_total{name=~"^otel-collector.*"}
otel-collector: fs_io_operations
Filesystem reads and writes rate by instance over 1h
This value indicates the number of filesystem read and write operations by containers of this service. When extremely high, this can indicate a resource usage problem, or can cause problems with the service itself, especially if high values or spikes correlate with otel-collector issues.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/otel-collector/otel-collector?viewPanel=100503 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by(name) (rate(container_fs_reads_total{name=~"^otel-collector.*"}[1h]) + rate(container_fs_writes_total{name=~"^otel-collector.*"}[1h]))
OpenTelemetry Collector: Kubernetes monitoring (only available on Kubernetes)
otel-collector: pods_available_percentage
Percentage pods available
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/otel-collector/otel-collector?viewPanel=100600 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by(app) (up{app=~".*otel-collector"}) / count by (app) (up{app=~".*otel-collector"}) * 100
Completions
Cody chat and code completions.
To see this dashboard, visit /-/debug/grafana/d/completions/completions on your Sourcegraph instance.
Completions: Completions requests
completions: api_request_rate
Rate of completions API requests
Rate (QPS) of requests to Cody chat and code completion endpoints.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/completions/completions?viewPanel=100000 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (code)(irate(src_http_request_duration_seconds_count{route=~"^cody.completions.*"}[5m]))
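Since the request-rate query above is already broken down by HTTP status code, the share of non-2xx responses can be derived from the same metric. This is an illustrative query for ad-hoc troubleshooting, not one of the dashboard's panels:
SHELLsum(irate(src_http_request_duration_seconds_count{route=~"^cody.completions.*",code!~"2.."}[5m])) / sum(irate(src_http_request_duration_seconds_count{route=~"^cody.completions.*"}[5m])) * 100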
Completions: Chat completions
completions: chat_completions_p99_stream_duration
Stream: total time (p99)
Time spent on the Stream() invocation, i.e. how long results take to connect, stream results, and finish streaming.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/completions/completions?viewPanel=100100 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.99, sum(rate(src_completions_stream_duration_seconds_bucket{feature="chat_completions",model=~'${model}'}[$sampling_duration])) by (le, model))
completions: chat_completions_p95_stream_duration
Stream: total time (p95)
Time spent on the Stream() invocation, i.e. how long results take to connect, stream results, and finish streaming.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/completions/completions?viewPanel=100101 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.95, sum(rate(src_completions_stream_duration_seconds_bucket{feature="chat_completions",model=~'${model}'}[$sampling_duration])) by (le, model))
completions: chat_completions_p75_stream_duration
Stream: total time (p75)
Time spent on the Stream() invocation, i.e. how long results take to connect, stream results, and finish streaming.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/completions/completions?viewPanel=100102 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.75, sum(rate(src_completions_stream_duration_seconds_bucket{feature="chat_completions",model=~'${model}'}[$sampling_duration])) by (le, model))
completions: chat_completions_p50_stream_duration
Stream: total time (p50)
Time spent on the Stream() invocation, i.e. how long results take to connect, stream results, and finish streaming.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/completions/completions?viewPanel=100103 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.50, sum(rate(src_completions_stream_duration_seconds_bucket{feature="chat_completions",model=~'${model}'}[$sampling_duration])) by (le, model))
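The four panels above differ only in the quantile passed to histogram_quantile(); the same bucket metric supports any other quantile, or an aggregate across all models by dropping the model grouping. For example, a hedged ad-hoc query for the 90th percentile across every model (replace $sampling_duration with a concrete window such as 5m when running outside Grafana):
SHELLhistogram_quantile(0.90, sum(rate(src_completions_stream_duration_seconds_bucket{feature="chat_completions"}[$sampling_duration])) by (le))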
completions: chat_completions_p99_non_stream_overhead_duration
Non-stream overhead (p99)
Time between Go HTTP handler invocation and Stream() invocation, overhead of e.g. request validation, routing to gateway/other, model resolution, error reporting/tracing, guardrails, etc.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/completions/completions?viewPanel=100110 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.99, sum(rate(src_completions_handler_overhead_duration_seconds_bucket{feature="chat_completions",model=~'${model}'}[$sampling_duration])) by (le,model))
completions: chat_completions_p95_non_stream_overhead_duration
Non-stream overhead (p95)
Time between Go HTTP handler invocation and Stream() invocation, overhead of e.g. request validation, routing to gateway/other, model resolution, error reporting/tracing, guardrails, etc.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/completions/completions?viewPanel=100111 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.95, sum(rate(src_completions_handler_overhead_duration_seconds_bucket{feature="chat_completions",model=~'${model}'}[$sampling_duration])) by (le,model))
completions: chat_completions_p75_non_stream_overhead_duration
Non-stream overhead (p75)
Time between Go HTTP handler invocation and Stream() invocation, overhead of e.g. request validation, routing to gateway/other, model resolution, error reporting/tracing, guardrails, etc.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/completions/completions?viewPanel=100112 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.75, sum(rate(src_completions_handler_overhead_duration_seconds_bucket{feature="chat_completions",model=~'${model}'}[$sampling_duration])) by (le,model))
completions: chat_completions_p50_non_stream_overhead_duration
Non-stream overhead (p50)
Time between Go HTTP handler invocation and Stream() invocation, overhead of e.g. request validation, routing to gateway/other, model resolution, error reporting/tracing, guardrails, etc.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/completions/completions?viewPanel=100113 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.50, sum(rate(src_completions_handler_overhead_duration_seconds_bucket{feature="chat_completions",model=~'${model}'}[$sampling_duration])) by (le,model))
completions: chat_completions_p99_stream_first_event_duration
Stream: time to first event (p99)
Time between calling Stream(), the client connecting to the server etc. and actually getting the first streaming event back.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/completions/completions?viewPanel=100120 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.99, sum(rate(src_completions_stream_first_event_duration_seconds_bucket{feature="chat_completions",model=~'${model}'}[$sampling_duration])) by (le, model))
completions: chat_completions_p95_stream_first_event_duration
Stream: time to first event (p95)
Time between calling Stream(), the client connecting to the server etc. and actually getting the first streaming event back.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/completions/completions?viewPanel=100121 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.95, sum(rate(src_completions_stream_first_event_duration_seconds_bucket{feature="chat_completions",model=~'${model}'}[$sampling_duration])) by (le, model))
completions: chat_completions_p75_stream_first_event_duration
Stream: time to first event (p75)
Time between calling Stream(), the client connecting to the server etc. and actually getting the first streaming event back.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/completions/completions?viewPanel=100122 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.75, sum(rate(src_completions_stream_first_event_duration_seconds_bucket{feature="chat_completions",model=~'${model}'}[$sampling_duration])) by (le, model))
completions: chat_completions_p50_stream_first_event_duration
Stream: time to first event (p50)
Time between calling Stream(), the client connecting to the server etc. and actually getting the first streaming event back.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/completions/completions?viewPanel=100123 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.50, sum(rate(src_completions_stream_first_event_duration_seconds_bucket{feature="chat_completions",model=~'${model}'}[$sampling_duration])) by (le, model))
completions: chat_completions_p99_upstream_roundtrip_duration
Stream: first byte sent -> received (p99)
Time between sending the first byte to the upstream, and then getting the first byte back from the upstream.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/completions/completions?viewPanel=100130 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.99, sum(rate(src_completions_upstream_roundtrip_duration_seconds_bucket{feature="chat_completions",model=~'${model}'}[$sampling_duration])) by (le, provider))
completions: chat_completions_p95_upstream_roundtrip_duration
Stream: first byte sent -> received (p95)
Time between sending the first byte to the upstream, and then getting the first byte back from the upstream.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/completions/completions?viewPanel=100131 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.95, sum(rate(src_completions_upstream_roundtrip_duration_seconds_bucket{feature="chat_completions",model=~'${model}'}[$sampling_duration])) by (le, provider))
completions: chat_completions_p75_upstream_roundtrip_duration
Stream: first byte sent -> received (p75)
Time between sending the first byte to the upstream, and then getting the first byte back from the upstream.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/completions/completions?viewPanel=100132 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.75, sum(rate(src_completions_upstream_roundtrip_duration_seconds_bucket{feature="chat_completions",model=~'${model}'}[$sampling_duration])) by (le, provider))
completions: chat_completions_p50_upstream_roundtrip_duration
Stream: first byte sent -> received (p50)
Time between sending the first byte to the upstream, and then getting the first byte back from the upstream.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/completions/completions?viewPanel=100133 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.50, sum(rate(src_completions_upstream_roundtrip_duration_seconds_bucket{feature="chat_completions",model=~'${model}'}[$sampling_duration])) by (le, provider))
completions: chat_completions_p99_http_connect_total
Stream: HTTP connect: total (p99)
Time spent acquiring an HTTP connection to the upstream, either from an existing pool OR by performing DNS resolution, TCP connection, etc.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/completions/completions?viewPanel=100140 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.99, sum(rate(src_completions_upstream_connection_total_duration_seconds_bucket[$sampling_duration])) by (le, connection_type, provider))
completions: chat_completions_p95_http_connect_total
Stream: HTTP connect: total (p95)
Time spent acquiring an HTTP connection to the upstream, either from an existing pool OR by performing DNS resolution, TCP connection, etc.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/completions/completions?viewPanel=100141 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.95, sum(rate(src_completions_upstream_connection_total_duration_seconds_bucket[$sampling_duration])) by (le, connection_type, provider))
completions: chat_completions_p75_http_connect_total
Stream: HTTP connect: total (p75)
Time spent acquiring an HTTP connection to the upstream, either from an existing pool OR by performing DNS resolution, TCP connection, etc.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/completions/completions?viewPanel=100142 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.75, sum(rate(src_completions_upstream_connection_total_duration_seconds_bucket[$sampling_duration])) by (le, connection_type, provider))
completions: chat_completions_p50_http_connect_total
Stream: HTTP connect: total (p50)
Time spent acquiring an HTTP connection to the upstream, either from an existing pool OR by performing DNS resolution, TCP connection, etc.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/completions/completions?viewPanel=100143 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.50, sum(rate(src_completions_upstream_connection_total_duration_seconds_bucket[$sampling_duration])) by (le, connection_type, provider))
completions: chat_completions_p99_http_connect_dns
Stream: HTTP connect: dns (p99)
Portion of time spent on DNS when acquiring an HTTP connection to the upstream.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/completions/completions?viewPanel=100150 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.99, sum(rate(src_completions_upstream_connection_dns_duration_seconds_bucket[$sampling_duration])) by (le, provider))
completions: chat_completions_p95_http_connect_dns
Stream: HTTP connect: dns (p95)
Portion of time spent on DNS when acquiring an HTTP connection to the upstream.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/completions/completions?viewPanel=100151 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.95, sum(rate(src_completions_upstream_connection_dns_duration_seconds_bucket[$sampling_duration])) by (le, provider))
completions: chat_completions_p75_http_connect_dns
Stream: HTTP connect: dns (p75)
Portion of time spent on DNS when acquiring an HTTP connection to the upstream.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/completions/completions?viewPanel=100152 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.75, sum(rate(src_completions_upstream_connection_dns_duration_seconds_bucket[$sampling_duration])) by (le, provider))
completions: chat_completions_p50_http_connect_dns
Stream: HTTP connect: dns (p50)
Portion of time spent on DNS when acquiring an HTTP connection to the upstream.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/completions/completions?viewPanel=100153 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.50, sum(rate(src_completions_upstream_connection_dns_duration_seconds_bucket[$sampling_duration])) by (le, provider))
completions: chat_completions_p99_http_connect_tls
Stream: HTTP connect: tls (p99)
Portion of time spent on TLS when acquiring an HTTP connection to the upstream.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/completions/completions?viewPanel=100160 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.99, sum(rate(src_completions_upstream_connection_tls_duration_seconds_bucket[$sampling_duration])) by (le, provider))
completions: chat_completions_p95_http_connect_tls
Stream: HTTP connect: tls (p95)
Portion of time spent on TLS when acquiring an HTTP connection to the upstream.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/completions/completions?viewPanel=100161 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.95, sum(rate(src_completions_upstream_connection_tls_duration_seconds_bucket[$sampling_duration])) by (le, provider))
completions: chat_completions_p75_http_connect_tls
Stream: HTTP connect: tls (p75)
Portion of time spent on TLS when acquiring an HTTP connection to the upstream.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/completions/completions?viewPanel=100162 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.75, sum(rate(src_completions_upstream_connection_tls_duration_seconds_bucket[$sampling_duration])) by (le, provider))
completions: chat_completions_p50_http_connect_tls
Stream: HTTP connect: tls (p50)
Portion of time spent on TLS when acquiring an HTTP connection to the upstream.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/completions/completions?viewPanel=100163 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.50, sum(rate(src_completions_upstream_connection_tls_duration_seconds_bucket[$sampling_duration])) by (le, provider))
completions: chat_completions_p99_http_connect_dial
Stream: HTTP connect: dial (p99)
Portion of time spent on golang Dial() when acquiring an HTTP connection to the upstream.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/completions/completions?viewPanel=100170 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.99, sum(rate(src_completions_upstream_connection_dial_duration_seconds_bucket[$sampling_duration])) by (le, provider))
completions: chat_completions_p95_http_connect_dial
Stream: HTTP connect: dial (p95)
Portion of time spent on golang Dial() when acquiring an HTTP connection to the upstream.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/completions/completions?viewPanel=100171 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.95, sum(rate(src_completions_upstream_connection_dial_duration_seconds_bucket[$sampling_duration])) by (le, provider))
completions: chat_completions_p75_http_connect_dial
Stream: HTTP connect: dial (p75)
Portion of time spent on golang Dial() when acquiring an HTTP connection to the upstream.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/completions/completions?viewPanel=100172 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.75, sum(rate(src_completions_upstream_connection_dial_duration_seconds_bucket[$sampling_duration])) by (le, provider))
completions: chat_completions_p50_http_connect_dial
Stream: HTTP connect: dial (p50)
Portion of time spent on golang Dial() when acquiring an HTTP connection to the upstream.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/completions/completions?viewPanel=100173 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.50, sum(rate(src_completions_upstream_connection_dial_duration_seconds_bucket[$sampling_duration])) by (le, provider))
Completions: Code completions
completions: code_completions_p99_stream_duration
Stream: total time (p99)
Time spent on the Stream() invocation, i.e. how long results take to connect, stream results, and finish streaming.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/completions/completions?viewPanel=100200 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.99, sum(rate(src_completions_stream_duration_seconds_bucket{feature="code_completions",model=~'${model}'}[$sampling_duration])) by (le, model))
completions: code_completions_p95_stream_duration
Stream: total time (p95)
Time spent on the Stream() invocation, i.e. how long results take to connect, stream results, and finish streaming.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/completions/completions?viewPanel=100201 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.95, sum(rate(src_completions_stream_duration_seconds_bucket{feature="code_completions",model=~'${model}'}[$sampling_duration])) by (le, model))
completions: code_completions_p75_stream_duration
Stream: total time (p75)
Time spent on the Stream() invocation, i.e. how long results take to connect, stream results, and finish streaming.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/completions/completions?viewPanel=100202 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.75, sum(rate(src_completions_stream_duration_seconds_bucket{feature="code_completions",model=~'${model}'}[$sampling_duration])) by (le, model))
completions: code_completions_p50_stream_duration
Stream: total time (p50)
Time spent on the Stream() invocation, i.e. how long results take to connect, stream results, and finish streaming.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/completions/completions?viewPanel=100203 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.50, sum(rate(src_completions_stream_duration_seconds_bucket{feature="code_completions",model=~'${model}'}[$sampling_duration])) by (le, model))
completions: code_completions_p99_non_stream_overhead_duration
Non-stream overhead (p99)
Time between Go HTTP handler invocation and Stream() invocation, overhead of e.g. request validation, routing to gateway/other, model resolution, error reporting/tracing, guardrails, etc.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/completions/completions?viewPanel=100210 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.99, sum(rate(src_completions_handler_overhead_duration_seconds_bucket{feature="code_completions",model=~'${model}'}[$sampling_duration])) by (le,model))
completions: code_completions_p95_non_stream_overhead_duration
Non-stream overhead (p95)
Time between Go HTTP handler invocation and Stream() invocation, overhead of e.g. request validation, routing to gateway/other, model resolution, error reporting/tracing, guardrails, etc.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/completions/completions?viewPanel=100211 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.95, sum(rate(src_completions_handler_overhead_duration_seconds_bucket{feature="code_completions",model=~'${model}'}[$sampling_duration])) by (le,model))
completions: code_completions_p75_non_stream_overhead_duration
Non-stream overhead (p75)
Time between Go HTTP handler invocation and Stream() invocation, overhead of e.g. request validation, routing to gateway/other, model resolution, error reporting/tracing, guardrails, etc.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/completions/completions?viewPanel=100212 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.75, sum(rate(src_completions_handler_overhead_duration_seconds_bucket{feature="code_completions",model=~'${model}'}[$sampling_duration])) by (le,model))
completions: code_completions_p50_non_stream_overhead_duration
Non-stream overhead (p50)
Time between Go HTTP handler invocation and Stream() invocation, overhead of e.g. request validation, routing to gateway/other, model resolution, error reporting/tracing, guardrails, etc.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/completions/completions?viewPanel=100213 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.50, sum(rate(src_completions_handler_overhead_duration_seconds_bucket{feature="code_completions",model=~'${model}'}[$sampling_duration])) by (le,model))
completions: code_completions_p99_stream_first_event_duration
Stream: time to first event (p99)
Time between calling Stream(), the client connecting to the server etc. and actually getting the first streaming event back.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/completions/completions?viewPanel=100220 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.99, sum(rate(src_completions_stream_first_event_duration_seconds_bucket{feature="code_completions",model=~'${model}'}[$sampling_duration])) by (le, model))
completions: code_completions_p95_stream_first_event_duration
Stream: time to first event (p95)
Time between calling Stream(), the client connecting to the server etc. and actually getting the first streaming event back.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/completions/completions?viewPanel=100221 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.95, sum(rate(src_completions_stream_first_event_duration_seconds_bucket{feature="code_completions",model=~'${model}'}[$sampling_duration])) by (le, model))
completions: code_completions_p75_stream_first_event_duration
Stream: time to first event (p75)
Time between calling Stream(), the client connecting to the server etc. and actually getting the first streaming event back.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/completions/completions?viewPanel=100222 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.75, sum(rate(src_completions_stream_first_event_duration_seconds_bucket{feature="code_completions",model=~'${model}'}[$sampling_duration])) by (le, model))
completions: code_completions_p50_stream_first_event_duration
Stream: time to first event (p50)
Time between calling Stream(), the client connecting to the server etc. and actually getting the first streaming event back.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/completions/completions?viewPanel=100223 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.50, sum(rate(src_completions_stream_first_event_duration_seconds_bucket{feature="code_completions",model=~'${model}'}[$sampling_duration])) by (le, model))
completions: code_completions_p99_upstream_roundtrip_duration
Stream: first byte sent -> received (p99)
Time between sending the first byte to the upstream, and then getting the first byte back from the upstream.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/completions/completions?viewPanel=100230 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.99, sum(rate(src_completions_upstream_roundtrip_duration_seconds_bucket{feature="code_completions",model=~'${model}'}[$sampling_duration])) by (le, provider))
completions: code_completions_p95_upstream_roundtrip_duration
Stream: first byte sent -> received (p95)
Time between sending the first byte to the upstream, and then getting the first byte back from the upstream.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/completions/completions?viewPanel=100231 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.95, sum(rate(src_completions_upstream_roundtrip_duration_seconds_bucket{feature="code_completions",model=~'${model}'}[$sampling_duration])) by (le, provider))
completions: code_completions_p75_upstream_roundtrip_duration
Stream: first byte sent -> received (p75)
Time between sending the first byte to the upstream, and then getting the first byte back from the upstream.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/completions/completions?viewPanel=100232 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.75, sum(rate(src_completions_upstream_roundtrip_duration_seconds_bucket{feature="code_completions",model=~'${model}'}[$sampling_duration])) by (le, provider))
completions: code_completions_p50_upstream_roundtrip_duration
Stream: first byte sent -> received (p50)
Time between sending the first byte to the upstream, and then getting the first byte back from the upstream.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/completions/completions?viewPanel=100233 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.50, sum(rate(src_completions_upstream_roundtrip_duration_seconds_bucket{feature="code_completions",model=~'${model}'}[$sampling_duration])) by (le, provider))
completions: code_completions_p99_http_connect_total
Stream: HTTP connect: total (p99)
Time spent acquiring an HTTP connection to the upstream, either from an existing pool OR by performing DNS resolution, TCP connection, etc.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/completions/completions?viewPanel=100240 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.99, sum(rate(src_completions_upstream_connection_total_duration_seconds_bucket[$sampling_duration])) by (le, connection_type, provider))
completions: code_completions_p95_http_connect_total
Stream: HTTP connect: total (p95)
Time spent acquiring an HTTP connection to the upstream, either from an existing pool OR by performing DNS resolution, TCP connection, etc.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/completions/completions?viewPanel=100241 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.95, sum(rate(src_completions_upstream_connection_total_duration_seconds_bucket[$sampling_duration])) by (le, connection_type, provider))
completions: code_completions_p75_http_connect_total
Stream: HTTP connect: total (p75)
Time spent acquiring an HTTP connection to the upstream, either from an existing pool OR by performing DNS resolution, TCP connection, etc.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/completions/completions?viewPanel=100242 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.75, sum(rate(src_completions_upstream_connection_total_duration_seconds_bucket[$sampling_duration])) by (le, connection_type, provider))
completions: code_completions_p50_http_connect_total
Stream: HTTP connect: total (p50)
Time spent acquiring an HTTP connection to the upstream, either from an existing pool OR by performing DNS resolution, TCP connection, etc.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/completions/completions?viewPanel=100243 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.50, sum(rate(src_completions_upstream_connection_total_duration_seconds_bucket[$sampling_duration])) by (le, connection_type, provider))
completions: code_completions_p99_http_connect_dns
Stream: HTTP connect: dns (p99)
Portion of time spent on DNS when acquiring an HTTP connection to the upstream.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/completions/completions?viewPanel=100250 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.99, sum(rate(src_completions_upstream_connection_dns_duration_seconds_bucket[$sampling_duration])) by (le, provider))
completions: code_completions_p95_http_connect_dns
Stream: HTTP connect: dns (p95)
Portion of time spent on DNS when acquiring an HTTP connection to the upstream.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/completions/completions?viewPanel=100251 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.95, sum(rate(src_completions_upstream_connection_dns_duration_seconds_bucket[$sampling_duration])) by (le, provider))
completions: code_completions_p75_http_connect_dns
Stream: HTTP connect: dns (p75)
Portion of time spent on DNS when acquiring an HTTP connection to the upstream.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/completions/completions?viewPanel=100252 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.75, sum(rate(src_completions_upstream_connection_dns_duration_seconds_bucket[$sampling_duration])) by (le, provider))
completions: code_completions_p50_http_connect_dns
Stream: HTTP connect: dns (p50)
Portion of time spent on DNS when acquiring an HTTP connection to the upstream.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/completions/completions?viewPanel=100253 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.50, sum(rate(src_completions_upstream_connection_dns_duration_seconds_bucket[$sampling_duration])) by (le, provider))
completions: code_completions_p99_http_connect_tls
Stream: HTTP connect: tls (p99)
Portion of time spent on TLS when acquiring an HTTP connection to the upstream.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/completions/completions?viewPanel=100260 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.99, sum(rate(src_completions_upstream_connection_tls_duration_seconds_bucket[$sampling_duration])) by (le, provider))
completions: code_completions_p95_http_connect_tls
Stream: HTTP connect: tls (p95)
Portion of time spent on TLS when acquiring an HTTP connection to the upstream.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/completions/completions?viewPanel=100261 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.95, sum(rate(src_completions_upstream_connection_tls_duration_seconds_bucket[$sampling_duration])) by (le, provider))
completions: code_completions_p75_http_connect_tls
Stream: HTTP connect: tls (p75)
Portion of time spent on TLS when acquiring an HTTP connection to the upstream.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/completions/completions?viewPanel=100262 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.75, sum(rate(src_completions_upstream_connection_tls_duration_seconds_bucket[$sampling_duration])) by (le, provider))
completions: code_completions_p50_http_connect_tls
Stream: HTTP connect: tls (p50)
Portion of time spent on TLS when acquiring an HTTP connection to the upstream.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/completions/completions?viewPanel=100263 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.50, sum(rate(src_completions_upstream_connection_tls_duration_seconds_bucket[$sampling_duration])) by (le, provider))
completions: code_completions_p99_http_connect_dial
Stream: HTTP connect: dial (p99)
Portion of time spent on golang Dial() when acquiring an HTTP connection to the upstream.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/completions/completions?viewPanel=100270 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.99, sum(rate(src_completions_upstream_connection_dial_duration_seconds_bucket[$sampling_duration])) by (le, provider))
completions: code_completions_p95_http_connect_dial
Stream: HTTP connect: dial (p95)
Portion of time spent on golang Dial() when acquiring an HTTP connection to the upstream.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/completions/completions?viewPanel=100271 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.95, sum(rate(src_completions_upstream_connection_dial_duration_seconds_bucket[$sampling_duration])) by (le, provider))
completions: code_completions_p75_http_connect_dial
Stream: HTTP connect: dial (p75)
Portion of time spent on golang Dial() when acquiring an HTTP connection to the upstream.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/completions/completions?viewPanel=100272 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.75, sum(rate(src_completions_upstream_connection_dial_duration_seconds_bucket[$sampling_duration])) by (le, provider))
completions: code_completions_p50_http_connect_dial
Stream: HTTP connect: dial (p50)
Portion of time spent on golang Dial() when acquiring an HTTP connection to the upstream.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/completions/completions?viewPanel=100273 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.50, sum(rate(src_completions_upstream_connection_dial_duration_seconds_bucket[$sampling_duration])) by (le, provider))
Completions: Completion credits entitlements
completions: completion_credits_check_entitlement_duration_p95
95th percentile completion credits entitlement check duration
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/completions/completions?viewPanel=100300 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.95, sum(rate(src_completion_credits_check_entitlement_duration_ms_bucket[5m])) by (le))
completions: completion_credits_consume_credits_duration_p95
95th percentile completion credits consume duration
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/completions/completions?viewPanel=100301 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.95, sum(rate(src_completion_credits_consume_duration_ms_bucket[5m])) by (le))
completions: completion_credits_check_entitlement_durations
Completion credits entitlement check duration over 5m
- This metric tracks pre-completion-request latency for checking if completion credits entitlement has been exceeded.
- If this value is high, this latency may be noticeable to users.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/completions/completions?viewPanel=100310 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (le) (rate(src_completion_credits_check_entitlement_duration_ms_bucket[5m]))
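If you need a tail-latency view beyond the p95 panel above, the same millisecond histogram supports other quantiles. An illustrative p99 query, not a built-in panel (note the metric is recorded in milliseconds):
SHELLhistogram_quantile(0.99, sum(rate(src_completion_credits_check_entitlement_duration_ms_bucket[5m])) by (le))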
Periodic Goroutines
Overview of all periodic background routines across Sourcegraph services.
To see this dashboard, visit /-/debug/grafana/d/periodic-goroutines/periodic-goroutines on your Sourcegraph instance.
Periodic Goroutines: Periodic Goroutines Overview
periodic-goroutines: total_running_goroutines
Total number of running periodic goroutines across all services
The total number of running periodic goroutines across all services. This provides a high-level overview of system activity.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/periodic-goroutines/periodic-goroutines?viewPanel=100000 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(src_periodic_goroutine_running)
periodic-goroutines: goroutines_by_service
Number of running periodic goroutines by service
The number of running periodic goroutines broken down by service. This helps identify which services are running the most background routines.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/periodic-goroutines/periodic-goroutines?viewPanel=100001 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (job) (src_periodic_goroutine_running)
periodic-goroutines: top_error_producers
Top 10 periodic goroutines by error rate
The top 10 periodic goroutines with the highest error rates. These routines may require immediate attention or investigation.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/periodic-goroutines/periodic-goroutines?viewPanel=100010 on your Sourcegraph instance.
Technical details
Query:
SHELLtopk(10, sum by (name, job) (rate(src_periodic_goroutine_errors_total[5m])))
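Because the drill-down section below also exposes a success counter (src_periodic_goroutine_total), an error percentage per routine can be derived, which gives a fairer comparison than the raw error rate (which favours busy routines). A sketch following the error-rate convention used elsewhere in this document, not a built-in panel:
SHELLtopk(10, sum by (name, job) (rate(src_periodic_goroutine_errors_total[5m])) / (sum by (name, job) (rate(src_periodic_goroutine_total[5m])) + sum by (name, job) (rate(src_periodic_goroutine_errors_total[5m]))) * 100)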
periodic-goroutines: top_time_consumers
Top 10 slowest periodic goroutines
The top 10 periodic goroutines with the longest average execution time. These routines may be candidates for optimization or load distribution.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/periodic-goroutines/periodic-goroutines?viewPanel=100011 on your Sourcegraph instance.
Technical details
Query:
SHELLtopk(10, max by (name, job) (rate(src_periodic_goroutine_duration_seconds_sum[5m]) / rate(src_periodic_goroutine_duration_seconds_count[5m])))
Periodic Goroutines: Drill down
periodic-goroutines: filtered_success_rate
Success rate for selected goroutines
The rate of successful executions for the filtered periodic goroutines.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/periodic-goroutines/periodic-goroutines?viewPanel=100100 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (name, job) (rate(src_periodic_goroutine_total{name=~'${routineName:regex}', job=~'${serviceName:regex}'}[5m]))
periodic-goroutines: filtered_error_rate
Error rate for selected goroutines
The rate of errors for the filtered periodic goroutines.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/periodic-goroutines/periodic-goroutines?viewPanel=100101 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (name, job) (rate(src_periodic_goroutine_errors_total{name=~'${routineName:regex}', job=~'${serviceName:regex}'}[5m]))
periodic-goroutines: filtered_duration
95th percentile execution time for selected goroutines
The 95th percentile execution time for the filtered periodic goroutines.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/periodic-goroutines/periodic-goroutines?viewPanel=100110 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.95, sum by (name, job, le) (rate(src_periodic_goroutine_duration_seconds_bucket{name=~'${routineName:regex}', job=~'${serviceName:regex}'}[5m])))
periodic-goroutines: filtered_loop_time
95th percentile loop time for selected goroutines
The 95th percentile loop time for the filtered periodic goroutines.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/periodic-goroutines/periodic-goroutines?viewPanel=100111 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.95, sum by (name, job, le) (rate(src_periodic_goroutine_loop_duration_seconds_bucket{name=~'${routineName:regex}', job=~'${serviceName:regex}'}[5m])))
periodic-goroutines: filtered_tenant_count
Number of tenants processed by selected goroutines
Number of tenants processed by each selected periodic goroutine.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/periodic-goroutines/periodic-goroutines?viewPanel=100120 on your Sourcegraph instance.
Technical details
Query:
SHELLmax by (name, job) (src_periodic_goroutine_tenant_count{name=~'${routineName:regex}', job=~'${serviceName:regex}'})
periodic-goroutines: filtered_tenant_duration
95th percentile tenant processing time for selected goroutines
The 95th percentile processing time for individual tenants.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/periodic-goroutines/periodic-goroutines?viewPanel=100121 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.95, sum by (name, job, le) (rate(src_periodic_goroutine_tenant_duration_seconds_bucket{name=~'${routineName:regex}', job=~'${serviceName:regex}'}[5m])))
periodic-goroutines: filtered_tenant_success_rate
Tenant success rate for selected goroutines
The rate of successful tenant processing operations.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/periodic-goroutines/periodic-goroutines?viewPanel=100130 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (name, job) (rate(src_periodic_goroutine_tenant_success_total{name=~'${routineName:regex}', job=~'${serviceName:regex}'}[5m]))
periodic-goroutines: filtered_tenant_error_rate
Tenant error rate for selected goroutines
The rate of tenant processing operations resulting in errors.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/periodic-goroutines/periodic-goroutines?viewPanel=100131 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (name, job) (rate(src_periodic_goroutine_tenant_errors_total{name=~'${routineName:regex}', job=~'${serviceName:regex}'}[5m]))
Background Jobs Dashboard
Overview of all background jobs in the system.
To see this dashboard, visit /-/debug/grafana/d/background-jobs/background-jobs on your Sourcegraph instance.
Background Jobs Dashboard: DBWorker Store Operations
background-jobs: operation_rates_by_method
Rate of operations by method (5m)
Shows the rate of different dbworker store operations
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/background-jobs/background-jobs?viewPanel=100000 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (op) (rate(src_workerutil_dbworker_store_total{domain=~"$dbworker_domain"}[5m]))
background-jobs: error_rates
Rate of errors by method (5m)
Rate of errors by operation type. Check specific operations with high error rates.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/background-jobs/background-jobs?viewPanel=100001 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (op) (rate(src_workerutil_dbworker_store_errors_total{domain=~"$dbworker_domain"}[5m]))
background-jobs: p90_duration_by_method
90th percentile duration by method
90th percentile latency for dbworker store operations.
Investigate database query performance and indexing for the affected operations. Look for slow queries in database logs.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/background-jobs/background-jobs?viewPanel=100010 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.9, sum by(le, op) (rate(src_workerutil_dbworker_store_duration_seconds_bucket{domain=~"$dbworker_domain"}[5m])))
background-jobs: p50_duration_by_method
Median duration by method
Median latency for dbworker store operations
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/background-jobs/background-jobs?viewPanel=100011 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.5, sum by(le, op) (rate(src_workerutil_dbworker_store_duration_seconds_bucket{domain=~"$dbworker_domain"}[5m])))
background-jobs: p90_duration_by_domain
90th percentile duration by domain
90th percentile latency for dbworker store operations.
Investigate database performance for the specific domain. May indicate issues with specific database tables or query patterns.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/background-jobs/background-jobs?viewPanel=100012 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.9, sum by(le, domain) (rate(src_workerutil_dbworker_store_duration_seconds_bucket{domain=~"$dbworker_domain"}[5m])))
background-jobs: p50_duration_by_method
Median operation duration by method
Median latency for dbworker store operations by method
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/background-jobs/background-jobs?viewPanel=100013 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.5, sum by(le, op) (rate(src_workerutil_dbworker_store_duration_seconds_bucket{domain=~"$dbworker_domain"}[5m])))
background-jobs: dequeue_performance
Dequeue operation metrics
Rate of dequeue operations by domain - critical for worker performance
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/background-jobs/background-jobs?viewPanel=100020 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (domain) (rate(src_workerutil_dbworker_store_total{op="Dequeue", domain=~"$dbworker_domain"}[5m]))
background-jobs: error_percentage_by_method
Percentage of operations resulting in error by method
Refer to the alerts reference for 2 alerts related to this panel.
To see this panel, visit /-/debug/grafana/d/background-jobs/background-jobs?viewPanel=100021 on your Sourcegraph instance.
Technical details
Query:
SHELL(sum by (op) (rate(src_workerutil_dbworker_store_errors_total{domain=~"$dbworker_domain"}[5m])) / sum by (op) (rate(src_workerutil_dbworker_store_total{domain=~"$dbworker_domain"}[5m]))) * 100
background-jobs: error_percentage_by_domain
Percentage of operations resulting in error by domain
Refer to the alerts reference for 2 alerts related to this panel.
To see this panel, visit /-/debug/grafana/d/background-jobs/background-jobs?viewPanel=100022 on your Sourcegraph instance.
Technical details
Query:
SHELL(sum by (domain) (rate(src_workerutil_dbworker_store_errors_total{domain=~"$dbworker_domain"}[5m])) / sum by (domain) (rate(src_workerutil_dbworker_store_total{domain=~"$dbworker_domain"}[5m]))) * 100
background-jobs: operation_latency_heatmap
Distribution of operation durations
Distribution of operation durations - shows the spread of latencies across all operations
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/background-jobs/background-jobs?viewPanel=100023 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (le) (rate(src_workerutil_dbworker_store_duration_seconds_bucket{domain=~"$dbworker_domain"}[5m]))
Background Jobs Dashboard: DBWorker Resetter
background-jobs: resetter_duration
Time spent running the resetter
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/background-jobs/background-jobs?viewPanel=100100 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.95, sum by(le, domain) (rate(src_dbworker_resetter_duration_seconds_bucket{domain=~"$resetter_domain"}[5m])))
background-jobs: resetter_runs
Number of times the resetter ran
The number of times the resetter ran in the last 5 minutes
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/background-jobs/background-jobs?viewPanel=100101 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (domain) (increase(src_dbworker_resetter_total{domain=~"$resetter_domain"}[5m]))
background-jobs: resetter_failures
Number of times the resetter failed to run
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/background-jobs/background-jobs?viewPanel=100102 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (domain) (increase(src_dbworker_resetter_errors_total{domain=~"$resetter_domain"}[5m]))
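Because both the run count and the failure count are exported, a rough failure ratio per domain can be computed ad hoc (illustrative only; domains with no runs in the window yield no meaningful result):
SHELLsum by (domain) (increase(src_dbworker_resetter_errors_total{domain=~"$resetter_domain"}[5m])) / sum by (domain) (increase(src_dbworker_resetter_total{domain=~"$resetter_domain"}[5m]))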
background-jobs: reset_records
Number of stalled records reset back to 'queued' state
The number of stalled records that were reset back to the queued state in the last 5 minutes
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/background-jobs/background-jobs?viewPanel=100110 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (domain) (increase(src_dbworker_resetter_record_resets_total{domain=~"$resetter_domain"}[5m]))
background-jobs: failed_records
Number of stalled records marked as 'failed'
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/background-jobs/background-jobs?viewPanel=100111 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (domain) (increase(src_dbworker_resetter_record_reset_failures_total{domain=~"$resetter_domain"}[5m]))
background-jobs: stall_duration
Duration jobs were stalled before being reset
Median time a job was stalled before being reset
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/background-jobs/background-jobs?viewPanel=100120 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (le) (rate(src_dbworker_resetter_stall_duration_seconds_bucket{domain=~"$resetter_domain"}[5m]))
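The panel query exports the raw stall-duration buckets, which suits a heatmap; if a single median figure is needed, it can be derived from the same series with histogram_quantile (ad-hoc variant, not a panel):
SHELLhistogram_quantile(0.5, sum by(le, domain) (rate(src_dbworker_resetter_stall_duration_seconds_bucket{domain=~"$resetter_domain"}[5m])))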
background-jobs: stall_duration_p90
90th percentile of stall duration
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/background-jobs/background-jobs?viewPanel=100121 on your Sourcegraph instance.
Technical details
Query:
SHELLhistogram_quantile(0.9, sum by(le, domain) (rate(src_dbworker_resetter_stall_duration_seconds_bucket{domain=~"$resetter_domain"}[5m])))
background-jobs: reset_vs_failure_ratio
Ratio of jobs reset to queued versus marked as failed
Ratio of reset jobs to failed jobs - higher values indicate healthier job processing
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/background-jobs/background-jobs?viewPanel=100122 on your Sourcegraph instance.
Technical details
Query:
SHELL(sum by (domain) (increase(src_dbworker_resetter_record_resets_total{domain=~"$resetter_domain"}[1h]))) / on(domain) (sum by (domain) (increase(src_dbworker_resetter_record_reset_failures_total{domain=~"$resetter_domain"}[1h]) > 0) or on(domain) sum by (domain) (increase(src_dbworker_resetter_record_resets_total{domain=~"$resetter_domain"}[1h]) * 0 + 1))
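The or on(domain) clause in the denominator guards against division by zero: when no resets failed during the hour, the filtered failure count returns no series, and the denominator falls back to a constant 1 per domain (the reset count multiplied by 0, plus 1), so the panel shows the raw reset count rather than no data. A simpler, guard-free form of the same ratio (which returns nothing for domains with zero failures) would look like this:
SHELLsum by (domain) (increase(src_dbworker_resetter_record_resets_total{domain=~"$resetter_domain"}[1h])) / on(domain) sum by (domain) (increase(src_dbworker_resetter_record_reset_failures_total{domain=~"$resetter_domain"}[1h]))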
Background Jobs Dashboard: Worker Queue Metrics
background-jobs: aggregate_queue_size
Total number of jobs queued across all domains
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/background-jobs/background-jobs?viewPanel=100200 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(max by (domain) (src_workerutil_queue_depth))
background-jobs: max_queue_duration
Maximum time a job has been in queue across all domains
Refer to the alerts reference for 1 alert related to this panel.
To see this panel, visit /-/debug/grafana/d/background-jobs/background-jobs?viewPanel=100201 on your Sourcegraph instance.
Technical details
Query:
SHELLmax(src_workerutil_queue_duration_seconds)
background-jobs: queue_growth_rate
Rate of queue growth/decrease
Rate at which the queue is growing. Positive values indicate more jobs are being added than processed.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/background-jobs/background-jobs?viewPanel=100202 on your Sourcegraph instance.
Technical details
Query:
SHELLsum(increase(src_workerutil_queue_depth[30m]))/1800
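The query approximates the average per-second change in total queue depth by taking the increase over 30 minutes and dividing by 1800 seconds. For a per-minute figure during an investigation, the same expression can be rescaled (ad-hoc variant, not a panel):
SHELLsum(increase(src_workerutil_queue_depth[30m])) / 30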
background-jobs: queue_depth_by_domain
Number of jobs in queue by domain
Number of queued jobs per domain. Large values may indicate workers are not keeping up with incoming jobs.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/background-jobs/background-jobs?viewPanel=100210 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (domain) (max by (domain) (src_workerutil_queue_depth))
background-jobs: queue_duration_by_domain
Maximum queue time by domain
Maximum time a job has been waiting in queue per domain. Long durations indicate potential worker stalls.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/background-jobs/background-jobs?viewPanel=100211 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (domain) (max by (domain) (src_workerutil_queue_duration_seconds))
background-jobs: queue_growth_by_domain
Rate of change in queue size by domain
Rate of change in queue size per domain. Consistently positive values indicate jobs are being queued faster than processed.
This panel has no related alerts.
To see this panel, visit /-/debug/grafana/d/background-jobs/background-jobs?viewPanel=100212 on your Sourcegraph instance.
Technical details
Query:
SHELLsum by (domain) (idelta(src_workerutil_queue_depth[10m])) / 600
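idelta takes the difference between the two most recent samples in the 10-minute window, so the panel reacts quickly to sudden changes, and dividing by 600 scales it to an approximate per-second rate. For a smoother trend across the whole window, a delta-based variant could be used ad hoc (illustrative, not the panel query):
SHELLsum by (domain) (delta(src_workerutil_queue_depth[10m])) / 600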