Data Export

The Data Export feature allows you to export raw data stored in Hackle to your own storage, such as AWS S3 or GCP GCS. Exported data is sent on a daily basis.

Supported Cloud Storage

Cloud   Storage    Supported?
AWS     S3         Yes
AWS     Redshift   Not yet
GCP     GCS        Yes
GCP     BigQuery   Not yet

Preparation

The following tasks are required before data can be extracted:

  • Create a storage to which you want to send the raw data (AWS S3, GCP GCS, etc.).
  • Create a key for the storage to which the raw data will be sent and grant the necessary permissions.

Generating keys and granting permissions: GCP GCS

For GCP GCS, you can generate a key by referring to GCP IAM > Generating and Managing Service Account Keys.

The following permissions are required on the key that will access GCS:

storage.buckets.get
storage.objects.get
storage.objects.create
storage.objects.delete
storage.objects.list
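
As a quick sanity check, a key with these permissions should be able to get the bucket and create, read, list, and delete objects in it. The following is a minimal Python sketch using the google-cloud-storage client; the key file name is a placeholder, and the bucket and prefix reuse the example path shown at the end of this document.

# Minimal sketch: verify a GCS service account key has the permissions
# listed above. The key file name is a placeholder.
from google.cloud import storage

client = storage.Client.from_service_account_json("hackle-export-key.json")

bucket = client.get_bucket("customer-data-hackle")   # storage.buckets.get

# storage.objects.create / storage.objects.get / storage.objects.delete
blob = bucket.blob("test/permission-check.txt")
blob.upload_from_string("permission check")
print(blob.download_as_text())
blob.delete()

# storage.objects.list
for b in client.list_blobs("customer-data-hackle", prefix="test/"):
    print(b.name)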

Generating keys and granting permissions: AWS S3

For AWS S3, you can refer to the following documents to create a key and grant the required permissions.

  1. Create an AWS IAM user by following the documentation in AWS Docs: Create an IAM User.
  2. Follow AWS Docs: Creating an IAM Policy to create a policy containing the statements shown as iam_policy.json below, then attach it to the IAM role created in the previous step.
  3. Follow the documentation in AWS Docs: Setting up IAM STS and add the trust policy shown as iam_sts.json below to the IAM role you created, entering the ARN value provided by Hackle. Also, pass the sts:ExternalId value you specified to Hackle. (A sketch of the resulting AssumeRole handshake follows the two policies below.)

iam_policy.json:
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
              "s3:GetObject",
              "s3:GetObjectVersion",
              "s3:DeleteObject",
              "s3:PutObject"
            ],
            "Resource": "arn:aws:s3:::<bucket>/<prefix>/*"
        },
        {
            "Effect": "Allow",
            "Action": [
                "s3:ListBucket",
                "s3:GetBucketLocation"
            ],
            "Resource": "arn:aws:s3:::<bucket>",
            "Condition": {
                "StringLike": {
                    "s3:prefix": [
                        "<prefix>/*",
                        "<prefix>/",
                        "<prefix>"                      
                    ]
                }
            }
        }
    ]
}

iam_sts.json:
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {
                "AWS": "[ARN provided by Hackle]"
            },
            "Action": "sts:AssumeRole",
            "Condition": {
                "StringEquals": {
                    "sts:ExternalId": "[Random String provided by client]"
                }
            }
        }
    ]
}
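
For reference, the trust policy in iam_sts.json is what allows the Hackle-side principal to call sts:AssumeRole on your role, with the external ID acting as a shared secret. Below is a minimal boto3 sketch of that handshake; the role ARN and external ID are placeholders, and the final S3 call simply exercises one of the permissions granted in iam_policy.json.

# Minimal sketch of the sts:AssumeRole handshake permitted by the trust
# policy above. The role ARN and external ID are placeholders.
import boto3

sts = boto3.client("sts")
resp = sts.assume_role(
    RoleArn="arn:aws:iam::123456789012:role/hackle-data-export",
    RoleSessionName="hackle-data-export",
    ExternalId="the-random-string-shared-with-hackle",
)
creds = resp["Credentials"]

# The temporary credentials can then be used for the S3 actions granted
# in iam_policy.json (PutObject, GetObject, ListBucket, ...).
s3 = boto3.client(
    "s3",
    aws_access_key_id=creds["AccessKeyId"],
    aws_secret_access_key=creds["SecretAccessKey"],
    aws_session_token=creds["SessionToken"],
)
s3.put_object(Bucket="<bucket>", Key="<prefix>/permission-check.txt", Body=b"ok")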

Data Export Sample

When the raw data collected by Hackle is exported, it is delivered in Apache Parquet format. Below is the schema of the delivered Parquet data.

root
 |-- server_dt: date (nullable = false)
 |-- ts: timestamp (nullable = false)
 |-- environment: string (nullable = false)
 |-- event_key: string (nullable = false)
 |-- identifiers: string (nullable = false)
 |-- insert_id: string (nullable = false)
 |-- metric_value: decimal(24,6) (nullable = false)
 |-- hackle_properties: string (nullable = false)
 |-- user_properties: string (nullable = false)
 |-- event_properties: string (nullable = false)
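
Exported files can be read with standard Parquet tooling. Below is a minimal Python sketch using pyarrow; the file path is a placeholder, and the assumption that identifiers and the *_properties columns hold JSON-encoded strings is not confirmed by this document and should be verified against an actual export.

# Minimal sketch: read an exported Parquet file and inspect its rows.
# The file path is a placeholder; JSON decoding of the string columns
# is an assumption to verify against a real export.
import json
import pyarrow.parquet as pq

table = pq.read_table("dt=2023-01-01/part-00000.parquet")
print(table.schema)  # should match the schema shown above

for row in table.to_pylist()[:5]:
    identifiers = json.loads(row["identifiers"])
    event_props = json.loads(row["event_properties"])
    print(row["event_key"], identifiers, event_props)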

Data extraction request

To request a data extraction, please contact the Hackle team. The following information is required:

  • Key that has been granted the access permissions described above
  • AWS S3 or GCP GCS bucket name and the partition path within the bucket where the data will be loaded (e.g. gs://customer-data-hackle/test/prefix-custom/dt=2023-01-01)