Data Export
The Data Export feature allows you to export the raw data stored in Hackle to your own storage, such as AWS S3 or GCP GCS. Data is delivered on a daily basis.
Supported Cloud Storage
Cloud | Storage | Supported?
---|---|---
AWS | S3 | Yes
AWS | Redshift | Not yet
GCP | GCS | Yes
GCP | BigQuery | Not yet
Preparation
The following tasks are required before data can be exported:
- Create a storage to which you want to send the raw data (AWS S3, GCP GCS, etc.).
- Create a key for the storage to which the raw data will be sent and grant the necessary permissions.
Generating keys and authorizing: GCP GCS
For GCP GCS, you can generate a key by referring to GCP IAM > Generating and Managing Service Account Keys.
The key used to access GCS requires the following permissions:
- storage.buckets.get
- storage.objects.get
- storage.objects.create
- storage.objects.delete
- storage.objects.list
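Before sharing the key with Hackle, you can sanity-check that it actually holds these permissions by calling the `testIamPermissions` API on your bucket. The sketch below is a minimal example using the `google-cloud-storage` Python client; the key file path and bucket name are placeholders for your own values.

```python
# Minimal sketch: check that a service account key holds the
# permissions listed above on the target GCS bucket.
# The key path and bucket name are placeholders -- substitute your own.
from google.cloud import storage

REQUIRED_PERMISSIONS = [
    "storage.buckets.get",
    "storage.objects.get",
    "storage.objects.create",
    "storage.objects.delete",
    "storage.objects.list",
]

client = storage.Client.from_service_account_json("hackle-export-key.json")
bucket = client.bucket("customer-data-hackle")  # placeholder bucket name

# testIamPermissions returns the subset of permissions the caller holds.
granted = bucket.test_iam_permissions(REQUIRED_PERMISSIONS)
missing = set(REQUIRED_PERMISSIONS) - set(granted)
if missing:
    print(f"Missing permissions: {sorted(missing)}")
else:
    print("All required permissions are granted.")
```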
Generating keys and authorizing: AWS S3
For AWS S3, you can refer to the following documents to create a key and grant the required permissions.
- Create an AWS IAM user by following the documentation in AWS Docs: Create an IAM User.
- Follow AWS Docs: Creating an IAM Policy to create a policy from the `iam_policy.json` attached below, then attach it to the IAM role created in the previous step.
- Follow the documentation in AWS Docs: Setting up IAM STS and add the `iam_sts.json` trust policy attached below to the IAM role you created. At this time, enter the ARN value provided by Hackle, and pass the `sts:ExternalId` value you specified to Hackle.
iam_policy.json:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:GetObject",
        "s3:GetObjectVersion",
        "s3:DeleteObject",
        "s3:PutObject"
      ],
      "Resource": "arn:aws:s3:::<bucket>/<prefix>/*"
    },
    {
      "Effect": "Allow",
      "Action": [
        "s3:ListBucket",
        "s3:GetBucketLocation"
      ],
      "Resource": "arn:aws:s3:::<bucket>",
      "Condition": {
        "StringLike": {
          "s3:prefix": [
            "<prefix>/*",
            "<prefix>/",
            "<prefix>"
          ]
        }
      }
    }
  ]
}
```
iam_sts.json:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "AWS": "[ARN provided by Hackle]"
      },
      "Action": "sts:AssumeRole",
      "Condition": {
        "StringEquals": {
          "sts:ExternalId": "[Random String provided by client]"
        }
      }
    }
  ]
}
```
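For context, the sketch below illustrates how this cross-account pattern works: the exporting party assumes your role by presenting the agreed `sts:ExternalId`, receives temporary credentials scoped by `iam_policy.json`, and can then write objects under the granted prefix. It is illustrative only; the role ARN, external ID, bucket, and prefix are placeholders, and Hackle's actual exporter is not published.

```python
# Illustrative sketch of the cross-account AssumeRole flow used for
# delivery to S3. All identifiers below are placeholders.
import boto3

# The exporting side assumes the customer's role, presenting the
# external ID agreed with the customer (see iam_sts.json above).
sts = boto3.client("sts")
assumed = sts.assume_role(
    RoleArn="arn:aws:iam::123456789012:role/hackle-data-export",  # placeholder
    RoleSessionName="hackle-data-export",
    ExternalId="the-random-string-you-shared",  # placeholder
)
creds = assumed["Credentials"]

# The temporary credentials are scoped by iam_policy.json on the role.
s3 = boto3.client(
    "s3",
    aws_access_key_id=creds["AccessKeyId"],
    aws_secret_access_key=creds["SecretAccessKey"],
    aws_session_token=creds["SessionToken"],
)

# For example, list what has been delivered under the granted prefix.
resp = s3.list_objects_v2(Bucket="<bucket>", Prefix="<prefix>/")
for obj in resp.get("Contents", []):
    print(obj["Key"])
```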
Data Export Sample
When raw data sent to Hackle is exported, it is delivered in Apache Parquet format. Below is the schema of the exported Parquet data.
```
root
 |-- server_dt: date (nullable = false)
 |-- ts: timestamp (nullable = false)
 |-- environment: string (nullable = false)
 |-- event_key: string (nullable = false)
 |-- identifiers: string (nullable = false)
 |-- insert_id: string (nullable = false)
 |-- metric_value: decimal(24,6) (nullable = false)
 |-- hackle_properties: string (nullable = false)
 |-- user_properties: string (nullable = false)
 |-- event_properties: string (nullable = false)
```
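A minimal sketch for loading one exported file with `pandas` (backed by `pyarrow`) is shown below. The file path is a placeholder, and note that treating the `identifiers` and `*_properties` string columns as JSON is an assumption based on the schema above, not something the schema itself guarantees.

```python
# Minimal sketch: read one exported Parquet file and inspect it.
# The local path is a placeholder; parsing the *_properties columns
# as JSON is an assumption (the schema only declares them as strings).
import json

import pandas as pd

df = pd.read_parquet("dt=2023-01-01/part-00000.parquet")  # placeholder path

print(df.dtypes)  # should match the schema above
print(df[["server_dt", "ts", "event_key", "metric_value"]].head())

# If the property columns hold JSON, they can be expanded like this:
event_props = df["event_properties"].map(json.loads)
print(event_props.head())
```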
Data extraction request
To request a data extraction, please contact the Hackle team. The following information is required:
- The key that has been granted access
- The AWS S3 or GCP GCS bucket name and the partition path within the bucket where the data will be loaded (e.g. `gcs://customer-data-hackle/test/prefix-custom/dt=2023-01-01`)