Data Push Destination Example Analyzing UXI Data in S3 with Amazon Athena

Once your UXI sensor test result or issue data is in S3, you can use tools like Amazon Athena to analyze the data. Athena is a serverless application that uses the S3 data directly. You do not need to load your data into Athena, you simply describe the format of how it is represented in S3 and run SQL queries to analyze it.

You can refer to Amazon’s official documentation for getting started with Athena. Below are some examples of creating a database, setting up a table partitioned by date and running a query to analyze the data.

Create a Database:

The first step is creating a database. You can enter the following in the Query Editor and run the query.

CREATE DATABASE my_uxi_data;

Create a Table

In Athena, you may need to refresh your browser to select the newly created database. Once selected, the next step is to create a table that describes how the data is represented in S3.

A table for analyzing issue data will look like this:

CREATE EXTERNAL TABLE issues (
	code string,
	severity string,
	status string,
	timestamp string,
	uid string,
	context struct < category: string,
	    customer_uid: string,
	    hierarchy_node_name: string,
	    hierarchy_type: string,
	    mac_address: string,
	    network_name: string,
	    network_uid: string,
	    sensor_name: string,
	    sensor_serial: string,
	    sensor_uid: string,
	    service_name: string,
	    service_uid: string >
) PARTITIONED BY (year string, month string, day string) ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe' LOCATION 's3://<your bucket name>/Aruba-UXI/issues.s3.<customer uid>/'

A table for analyzing test results for DHCP will look like this:

Note for other test types you will need to refer to the test result schema and modify accordingly.

CREATE EXTERNAL TABLE test_results (
  code string,
  dora_time_milliseconds float,
  `timestamp` string,
  lease_time_milliseconds int,
  server string,
  context struct<category:string,
              customer_uid:string,
              default_gateway:string,
              device_uid:string,
              dhcp_server:string,
              hierarchy_node_name:string,
              hierarchy_node_path:string,
              ethernet_duplex_mode:string,
              ethernet_link_speed_bps:int,
              interface_name:string,
              interface_type:string,
              ip_address:string,
              mac_address:string,
              network_name:string,
              network_uid:string,
              operator:string,
              primary_dns:string,
              secondary_dns:string,
              service_name:string,
              service_uid:string,
              sensor_name:string,
              sensor_serial:string,
              sensor_uid:string,
              wifi_bssid:string,
              wifi_channel:int,
              wifi_frequency:float,
              wifi_signal_level:int,
              wifi_ssid:string
              > 
  )
PARTITIONED BY (year string, month string, day string) 
ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe'
LOCATION 's3://<your bucket name>/Aruba-UXI/test_results.s3.<customer uid>/'

Add a Partition

In the above example, to optimize the query size and cost the tables are partitioned by date. There are a few different ways of doing this using a command to repair the table, using a command to add the partition or using amazon glue to continuously update the table.

Here is an example of adding the partition.

ALTER TABLE issues 
ADD PARTITION (year='2022', month='06', day='05') LOCATION 's3://<your bucket name>/Aruba-UXI/issues.s3.<customer uid>/year=2022/month=06/day=03'

ALTER TABLE test_results 
ADD PARTITION (year='2022', month='06', day='03') LOCATION 's3://<your bucket name>/Aruba-UXI/test_results.s3.<customer uid>/year=2022/month=06/day=03'

Querying the Data

Now that you have described your data in Athena, you can perform some ad-hoc analysis on the data using SQL.

SELECT * FROM "my_uxi_data"."issues" 
WHERE year='2022' AND month='06' AND day='05'
limit 10

SELECT * FROM "my_uxi_data"."test_results" 
WHERE code = 'DHCP' AND year='2022' AND month='06' AND day='03'
limit 10