Data Push Destination Example Using Google BigQuery

Use BigQuery to Analyze UXI Data

Written by Josh Peters

Once you configure your UXI dashboard to send test results and issues to Google BigQuery, you can immediately start using the data to analyze your networks and services.

The UXI data in BigQuery is partitioned by UTC day, based on the time the data was written to BigQuery. In addition, issues and test results are kept in separate tables, and test results are further split into one table per test code, to optimize the queries. This allows you to extract and analyze only the data you are interested in.
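
For example, you can list the UXI tables the integration has created in your dataset before you start querying. This is a minimal sketch using BigQuery's INFORMATION_SCHEMA views; the <project>, <dataset>, and <customer_uid> placeholders are the same ones used in the queries below:

SELECT
table_name
FROM
`<project>.<dataset>.INFORMATION_SCHEMA.TABLES`
WHERE
table_name LIKE '%bigquery_<customer_uid>%'
ORDER BY table_name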

This article will show some example queries.

Let's say you have several sensors at each site, and you want to identify which sites have had the most issues over the past week, what those issues are, and which networks have them. The query would look like this:

SELECT
code,
count(code) as occurrences,
context.hierarchy_node_name as site,
context.network_name as network_name,
context.service_name as service_name
FROM
`<project>.<dataset>.issues_bigquery_<customer_uid>`
WHERE
DATE(_PARTITIONTIME) >= DATE_ADD(CURRENT_DATE(), INTERVAL -7 DAY)
AND status = 'CONFIRMED'
GROUP BY code,site,network_name,service_name
ORDER BY occurrences DESC
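
Once a site stands out, you can drill into it by filtering on the site name. This is a small variation of the query above; 'Head Office' is a hypothetical site name, so substitute one of your own group names:

SELECT
code,
count(code) as occurrences,
context.network_name as network_name,
context.service_name as service_name
FROM
`<project>.<dataset>.issues_bigquery_<customer_uid>`
WHERE
DATE(_PARTITIONTIME) >= DATE_ADD(CURRENT_DATE(), INTERVAL -7 DAY)
AND status = 'CONFIRMED'
AND context.hierarchy_node_name = 'Head Office'
GROUP BY code,network_name,service_name
ORDER BY occurrences DESC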

Let's say you want to know which issues are ongoing now. The query may look like this:

SELECT
T1.uid,
T1.sensor_name,
T1.network_name,
T1.service_name,
T1.code,
T1.status
FROM (
SELECT
uid,
context.sensor_name AS sensor_name,
context.network_name AS network_name,
context.service_name AS service_name,
code,
status
FROM
`<project>.<dataset>.issues_bigquery_<customer_uid>`
WHERE
DATE(_PARTITIONTIME) >= DATE_ADD(CURRENT_DATE(), INTERVAL -7 DAY)
AND status = 'CONFIRMED' ) AS T1
LEFT JOIN (
SELECT
uid,
context.sensor_name AS sensor_name,
context.network_name AS network_name,
context.service_name AS service_name,
code,
status
FROM
`<project>.<dataset>.issues_bigquery_<customer_uid>`
WHERE
DATE(_PARTITIONTIME) >= DATE_ADD(CURRENT_DATE(), INTERVAL -7 DAY)
AND status = 'RESOLVED' ) AS T2
ON
T1.uid=T2.uid
WHERE T2.uid is null
ORDER BY
sensor_name DESC
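
An equivalent, more compact way to express this is to exclude any issue uid that also has a RESOLVED row, for example with a NOT IN subquery. This is just an alternative sketch of the same idea:

SELECT
uid,
context.sensor_name AS sensor_name,
context.network_name AS network_name,
context.service_name AS service_name,
code,
status
FROM
`<project>.<dataset>.issues_bigquery_<customer_uid>`
WHERE
DATE(_PARTITIONTIME) >= DATE_ADD(CURRENT_DATE(), INTERVAL -7 DAY)
AND status = 'CONFIRMED'
AND uid NOT IN (
SELECT uid
FROM `<project>.<dataset>.issues_bigquery_<customer_uid>`
WHERE DATE(_PARTITIONTIME) >= DATE_ADD(CURRENT_DATE(), INTERVAL -7 DAY)
AND status = 'RESOLVED' )
ORDER BY sensor_name DESC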

Let's say you made a network change on May 8th and want to identify which sites have the worst DHCP response times since then. The query would look like this:

SELECT
avg(dora_time_milliseconds) as avg_dhcp_time,
context_sensor_name as sensor_name,
context_network_name as network_name
FROM
`<project>.<dataset>.test_results_bigquery_<customer_uid>_DHCP`
WHERE
DATE(_PARTITIONTIME) >= "2023-05-08"
GROUP BY sensor_name,network_name
ORDER BY avg_dhcp_time DESC
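
If you also want to check whether the change helped, you can compare the averages before and after the change date in one query. This sketch uses conditional aggregation and assumes you look back to April 8th for the "before" window; adjust the dates as needed:

SELECT
context_sensor_name as sensor_name,
context_network_name as network_name,
avg(IF(DATE(_PARTITIONTIME) < "2023-05-08", dora_time_milliseconds, NULL)) as avg_dhcp_time_before,
avg(IF(DATE(_PARTITIONTIME) >= "2023-05-08", dora_time_milliseconds, NULL)) as avg_dhcp_time_after
FROM
`<project>.<dataset>.test_results_bigquery_<customer_uid>_DHCP`
WHERE
DATE(_PARTITIONTIME) >= "2023-04-08"
GROUP BY sensor_name,network_name
ORDER BY avg_dhcp_time_after DESC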

Application (PING and HTTP_GET) query examples

If you want to know which groups and networks have the worst latency to an application, you can analyze the ping results.

Here's a query to find the 20 worst groups (change DESC to ASC for the 20 best groups), excluding results where the Wi-Fi signal level (RSSI) is low.

SELECT
context_hierarchy_node_name,
context_service_name,
context_network_name,
avg(latency_milliseconds) as avg_latency_ms,
avg(jitter_milliseconds) as avg_jitter_ms,
avg(packets_dropped/packets_sent) as avg_packet_loss
FROM
`<project>.<dataset>.test_results_bigquery_<customer_uid>_PING`
WHERE
DATE(_PARTITIONTIME) >= DATE_ADD(CURRENT_DATE(), INTERVAL -7 DAY) and context_wifi_signal_level > -75
GROUP BY context_hierarchy_node_name, context_service_name, context_network_name
ORDER BY avg_latency_ms DESC
LIMIT 20
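
Averages can hide tail latency, so you may prefer a percentile. Here's a sketch of the same query ranked by the approximate 95th-percentile latency, using APPROX_QUANTILES instead of the mean:

SELECT
context_hierarchy_node_name,
context_service_name,
context_network_name,
APPROX_QUANTILES(latency_milliseconds, 100)[OFFSET(95)] as p95_latency_ms
FROM
`<project>.<dataset>.test_results_bigquery_<customer_uid>_PING`
WHERE
DATE(_PARTITIONTIME) >= DATE_ADD(CURRENT_DATE(), INTERVAL -7 DAY) and context_wifi_signal_level > -75
GROUP BY context_hierarchy_node_name, context_service_name, context_network_name
ORDER BY p95_latency_ms DESC
LIMIT 20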

If you want to know which groups and networks have the worst HTTP GET response time, you can change the query as follows (again excluding results where the signal level is low):

SELECT
context_hierarchy_node_name,
context_service_name,
context_network_name,
avg(request_time_milliseconds) as avg_request_time_milliseconds
FROM
`<project>.<dataset>.test_results_bigquery_<customer_uid>_HTTP_GET`
WHERE
DATE(_PARTITIONTIME) >= DATE_ADD(CURRENT_DATE(), INTERVAL -7 DAY) and context_wifi_signal_level > -75
GROUP BY context_hierarchy_node_name, context_service_name, context_network_name
ORDER BY avg_request_time_milliseconds DESC
LIMIT 20
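
If you would rather see how response times trend across the week than a single average per group, you can group by the partition day instead. A minimal sketch across all groups:

SELECT
DATE(_PARTITIONTIME) as day,
avg(request_time_milliseconds) as avg_request_time_milliseconds
FROM
`<project>.<dataset>.test_results_bigquery_<customer_uid>_HTTP_GET`
WHERE
DATE(_PARTITIONTIME) >= DATE_ADD(CURRENT_DATE(), INTERVAL -7 DAY)
GROUP BY day
ORDER BY day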

Throughput query examples

Let's say you are running librespeed and want to know the average upload and download speeds across all parent groups over the last 7 days.

SELECT
avg(speed_download_bps) / 1000000 as avg_speed_download_mbps,
avg(speed_upload_bps) / 1000000 as avg_speed_upload_mbps
FROM
`<project>.<dataset>.test_results_bigquery_<customer_uid>_LIBRESPEED`
WHERE
DATE(_PARTITIONTIME) >= DATE_ADD(CURRENT_DATE(), INTERVAL -7 DAY)
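
To see how those averages move from day to day instead of one number for the week, you can break the same query out by partition day:

SELECT
DATE(_PARTITIONTIME) as day,
count(timestamp) as number_of_measurements,
avg(speed_download_bps) / 1000000 as avg_speed_download_mbps,
avg(speed_upload_bps) / 1000000 as avg_speed_upload_mbps
FROM
`<project>.<dataset>.test_results_bigquery_<customer_uid>_LIBRESPEED`
WHERE
DATE(_PARTITIONTIME) >= DATE_ADD(CURRENT_DATE(), INTERVAL -7 DAY)
GROUP BY day
ORDER BY day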

Here's a query to find the 20 worst groups by average download speed (change ASC to DESC for the 20 best groups):

SELECT
count(timestamp) as number_of_measurements,
avg(speed_download_bps) / 1000000 as avg_speed_download_mbps,
context_hierarchy_node_name as group_name,
context_network_name as network_name
FROM
`<project>.<dataset>.test_results_bigquery_<customer_uid>_LIBRESPEED`
WHERE
DATE(_PARTITIONTIME) >= DATE_ADD(CURRENT_DATE(), INTERVAL -7 DAY)
GROUP BY group_name,network_name
ORDER BY avg_speed_download_mbps ASC
LIMIT 20
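
Groups with only a handful of measurements can land at the top of a "worst" list by chance. One way to guard against that is a HAVING clause; the minimum of 10 measurements here is an assumption, so adjust it to your data volume:

SELECT
count(timestamp) as number_of_measurements,
avg(speed_download_bps) / 1000000 as avg_speed_download_mbps,
context_hierarchy_node_name as group_name,
context_network_name as network_name
FROM
`<project>.<dataset>.test_results_bigquery_<customer_uid>_LIBRESPEED`
WHERE
DATE(_PARTITIONTIME) >= DATE_ADD(CURRENT_DATE(), INTERVAL -7 DAY)
GROUP BY group_name,network_name
HAVING count(timestamp) >= 10
ORDER BY avg_speed_download_mbps ASC
LIMIT 20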
