Metrics¶
This document lists all the monitoring metrics provided by Kube-OVN.
ovn-monitor¶
OVN status metrics:
Type | Metric | Description |
---|---|---|
Gauge | kube_ovn_ovn_status | OVN Health Status. The values are: (2) for standby or follower, (1) for active or leader, (0) for unhealthy. |
Gauge | kube_ovn_failed_req_count | The number of failed requests to OVN stack. |
Gauge | kube_ovn_log_file_size | The size of a log file associated with an OVN component. |
Gauge | kube_ovn_db_file_size | The size of a database file associated with an OVN component. |
Gauge | kube_ovn_chassis_info | Whether the OVN chassis is up (1) or down (0), together with additional information about the chassis. |
Gauge | kube_ovn_db_status | The status of OVN NB/SB DB, (1) for healthy, (0) for unhealthy. |
Gauge | kube_ovn_logical_switch_info | The information about OVN logical switch. This metric is always up (1). |
Gauge | kube_ovn_logical_switch_external_id | Provides the external IDs and values associated with OVN logical switches. This metric is always up (1). |
Gauge | kube_ovn_logical_switch_port_binding | Provides the association between a logical switch and a logical switch port. This metric is always up (1). |
Gauge | kube_ovn_logical_switch_tunnel_key | The value of the tunnel key associated with the logical switch. |
Gauge | kube_ovn_logical_switch_ports_num | The number of logical switch ports connected to the OVN logical switch. |
Gauge | kube_ovn_logical_switch_port_info | The information about OVN logical switch port. This metric is always up (1). |
Gauge | kube_ovn_logical_switch_port_tunnel_key | The value of the tunnel key associated with the logical switch port. |
Gauge | kube_ovn_cluster_enabled | Is OVN clustering enabled (1) or not (0). |
Gauge | kube_ovn_cluster_role | A metric with a constant '1' value labeled by server role. |
Gauge | kube_ovn_cluster_status | A metric with a constant '1' value labeled by server status. |
Gauge | kube_ovn_cluster_term | The current raft term known by this server. |
Gauge | kube_ovn_cluster_leader_self | Is this server consider itself a leader (1) or not (0). |
Gauge | kube_ovn_cluster_vote_self | Is this server voted itself as a leader (1) or not (0). |
Gauge | kube_ovn_cluster_election_timer | The current election timer value. |
Gauge | kube_ovn_cluster_log_not_committed | The number of log entries not yet committed by this server. |
Gauge | kube_ovn_cluster_log_not_applied | The number of log entries not yet applied by this server. |
Gauge | kube_ovn_cluster_log_index_start | The log entry index start value associated with this server. |
Gauge | kube_ovn_cluster_log_index_next | The log entry index next value associated with this server. |
Gauge | kube_ovn_cluster_inbound_connections_total | The total number of inbound connections to the server. |
Gauge | kube_ovn_cluster_outbound_connections_total | The total number of outbound connections from the server. |
Gauge | kube_ovn_cluster_inbound_connections_error_total | The total number of failed inbound connections to the server. |
Gauge | kube_ovn_cluster_outbound_connections_error_total | The total number of failed outbound connections from the server. |
ovs-monitor¶
ovsdb
and vswitchd
status metrics:
Type | Metric | Description |
---|---|---|
Gauge | ovs_status | OVS Health Status. The values are: health(1), unhealthy(0). |
Gauge | ovs_info | This metric provides basic information about OVS. It is always set to 1. |
Gauge | failed_req_count | The number of failed requests to OVS stack. |
Gauge | log_file_size | The size of a log file associated with an OVS component. |
Gauge | db_file_size | The size of a database file associated with an OVS component. |
Gauge | datapath | Represents an existing datapath. This metrics is always 1. |
Gauge | dp_total | Represents total number of datapaths on the system. |
Gauge | dp_if | Represents an existing datapath interface. This metrics is always 1. |
Gauge | dp_if_total | Represents the number of ports connected to the datapath. |
Gauge | dp_flows_total | The number of flows in a datapath. |
Gauge | dp_flows_lookup_hit | The number of incoming packets in a datapath matching existing flows in the datapath. |
Gauge | dp_flows_lookup_missed | The number of incoming packets in a datapath not matching any existing flow in the datapath. |
Gauge | dp_flows_lookup_lost | The number of incoming packets in a datapath destined for userspace process but subsequently dropped before reaching userspace. |
Gauge | dp_masks_hit | The total number of masks visited for matching incoming packets. |
Gauge | dp_masks_total | The number of masks in a datapath. |
Gauge | dp_masks_hit_ratio | The average number of masks visited per packet. It is the ration between hit and total number of packets processed by a datapath. |
Gauge | interface | Represents OVS interface. This is the primary metric for all other interface metrics. This metrics is always 1. |
Gauge | interface_admin_state | The administrative state of the physical network link of OVS interface. The values are: down(0), up(1), other(2). |
Gauge | interface_link_state | The state of the physical network link of OVS interface. The values are: down(0), up(1), other(2). |
Gauge | interface_mac_in_use | The MAC address in use by OVS interface. |
Gauge | interface_mtu | The currently configured MTU for OVS interface. |
Gauge | interface_of_port | Represents the OpenFlow port ID associated with OVS interface. |
Gauge | interface_if_index | Represents the interface index associated with OVS interface. |
Gauge | interface_tx_packets | Represents the number of transmitted packets by OVS interface. |
Gauge | interface_tx_bytes | Represents the number of transmitted bytes by OVS interface. |
Gauge | interface_rx_packets | Represents the number of received packets by OVS interface. |
Gauge | interface_rx_bytes | Represents the number of received bytes by OVS interface. |
Gauge | interface_rx_crc_err | Represents the number of CRC errors for the packets received by OVS interface. |
Gauge | interface_rx_dropped | Represents the number of input packets dropped by OVS interface. |
Gauge | interface_rx_errors | Represents the total number of packets with errors received by OVS interface. |
Gauge | interface_rx_frame_err | Represents the number of frame alignment errors on the packets received by OVS interface. |
Gauge | interface_rx_missed_err | Represents the number of packets with RX missed received by OVS interface. |
Gauge | interface_rx_over_err | Represents the number of packets with RX overrun received by OVS interface. |
Gauge | interface_tx_dropped | Represents the number of output packets dropped by OVS interface. |
Gauge | interface_tx_errors | Represents the total number of transmit errors by OVS interface. |
Gauge | interface_collisions | Represents the number of collisions on OVS interface. |
kube-ovn-pinger¶
Network quality related metrics:
Type | Metric | Description |
---|---|---|
Gauge | pinger_ovs_up | If the ovs on the node is up |
Gauge | pinger_ovs_down | If the ovs on the node is down |
Gauge | pinger_ovn_controller_up | If the ovn_controller on the node is up |
Gauge | pinger_ovn_controller_down | If the ovn_controller on the node is down |
Gauge | pinger_inconsistent_port_binding | The number of mismatch port bindings between ovs and ovn-sb |
Gauge | pinger_apiserver_healthy | If the apiserver request is healthy on this node |
Gauge | pinger_apiserver_unhealthy | If the apiserver request is unhealthy on this node |
Histogram | pinger_apiserver_latency_ms | The latency ms histogram the node request apiserver |
Gauge | pinger_internal_dns_healthy | If the internal dns request is unhealthy on this node |
Gauge | pinger_internal_dns_unhealthy | If the internal dns request is unhealthy on this node |
Histogram | pinger_internal_dns_latency_ms | The latency ms histogram the node request internal dns |
Gauge | pinger_external_dns_health | If the external dns request is healthy on this node |
Gauge | pinger_external_dns_unhealthy | If the external dns request is unhealthy on this node |
Histogram | pinger_external_dns_latency_ms | The latency ms histogram the node request external dns |
Histogram | pinger_pod_ping_latency_ms | The latency ms histogram for pod peer ping |
Gauge | pinger_pod_ping_lost_total | The lost count for pod peer ping |
Gauge | pinger_pod_ping_count_total | The total count for pod peer ping |
Histogram | pinger_node_ping_latency_ms | The latency ms histogram for pod ping node |
Gauge | pinger_node_ping_lost_total | The lost count for pod ping node |
Gauge | pinger_node_ping_count_total | The total count for pod ping node |
Histogram | pinger_external_ping_latency_ms | The latency ms histogram for pod ping external address |
Gauge | pinger_external_lost_total | The lost count for pod ping external address |
kube-ovn-controller¶
kube-ovn-controller
status metrics:
Type | Metric | Description |
---|---|---|
Histogram | rest_client_request_latency_seconds | Request latency in seconds. Broken down by verb and URL |
Counter | rest_client_requests_total | Number of HTTP requests, partitioned by status code, method, and host |
Counter | lists_total | Total number of API lists done by the reflectors |
Summary | list_duration_seconds | How long an API list takes to return and decode for the reflectors |
Summary | items_per_list | How many items an API list returns to the reflectors |
Counter | watches_total | Total number of API watches done by the reflectors |
Counter | short_watches_total | Total number of short API watches done by the reflectors |
Summary | watch_duration_seconds | How long an API watch takes to return and decode for the reflectors |
Summary | items_per_watch | How many items an API watch returns to the reflectors |
Gauge | last_resource_version | Last resource version seen for the reflectors |
Histogram | ovs_client_request_latency_milliseconds | The latency histogram for ovs request |
Gauge | subnet_available_ip_count | The available num of ip address in subnet |
Gauge | subnet_used_ip_count | The used num of ip address in subnet |
kube-ovn-cni¶
kube-ovn-cni
status metrics:
Type | Metric | Description |
---|---|---|
Histogram | cni_op_latency_seconds | The latency seconds for cni operations |
Counter | cni_wait_address_seconds_total | Latency that cni wait controller to assign an address |
Counter | cni_wait_connectivity_seconds_total | Latency that cni wait address ready in overlay network |
Counter | cni_wait_route_seconds_total | Latency that cni wait controller to add routed annotation to pod |
Histogram | rest_client_request_latency_seconds | Request latency in seconds. Broken down by verb and URL |
Counter | rest_client_requests_total | Number of HTTP requests, partitioned by status code, method, and host |
Counter | lists_total | Total number of API lists done by the reflectors |
Summary | list_duration_seconds | How long an API list takes to return and decode for the reflectors |
Summary | items_per_list | How many items an API list returns to the reflectors |
Counter | watches_total | Total number of API watches done by the reflectors |
Counter | short_watches_total | Total number of short API watches done by the reflectors |
Summary | watch_duration_seconds | How long an API watch takes to return and decode for the reflectors |
Summary | items_per_watch | How many items an API watch returns to the reflectors |
Gauge | last_resource_version | Last resource version seen for the reflectors |
Histogram | ovs_client_request_latency_milliseconds | The latency histogram for ovs request |
Last update: October 18, 2022
Created: June 30, 2022
Created: June 30, 2022