File Source Plugin
Premium
This is a premium plugin that you can buy here.
The CloudQuery File plugin syncs parquet
files to any of the supported CloudQuery destinations (e.g. PostgreSQL, BigQuery, Snowflake, and more).
Example
This example configures a File source with a directory to sync files from. The (top level) spec section is described in the Source Spec Reference.
kind: source
spec:
name: file
path: /path/to/downloaded/plugin # Buy from here: https://cloudquery.io/integrations/file
registry: local
version: "PREMIUM"
tables: ["*"]
destinations: ["postgresql"]
spec:
files_dir: "/path/to/files-to-sync" # required. Path to the directory with files to sync
# concurrency: 50 # optional. Number of files to sync in parallel. Default: 50
File spec
This is the (nested) spec used by the File source plugin.
-
files_dir
(string, required)Path to the directory with files to sync. Only files with
.parquet
extension will be synced. -
concurrency
(int, optional, default: 50)Number of files to sync in parallel. Negative values mean no limit.
Example with AWS Cost and Usage Reports
AWS Cost and Usage Reports are stored in S3 as parquet files. The following example shows how to sync these files and AWS infrastructure data to a PostgreSQL database. To learn more about visualizing AWS Cost and Usage Reports, visit our dashboards page.
kind: source
spec:
name: file
version: "PREMIUM"
destinations: [postgresql]
path: /path/to/downloaded/plugin
tables: ["*"]
spec:
files_dir: "/path/to/cost_and_usage_reports" # Update this value to the local directory with your AWS Cost and Usage Reports
---
kind: source
spec:
name: aws
version: "v22.15.2"
destinations: [postgresql]
path: cloudquery/aws
tables: ["*"]
skip_tables:
- aws_ec2_vpc_endpoint_services
- aws_cloudtrail_events
- aws_docdb_cluster_parameter_groups
- aws_docdb_engine_versions
- aws_ec2_instance_types
- aws_elasticache_engine_versions
- aws_elasticache_parameter_groups
- aws_elasticache_reserved_cache_nodes_offerings
- aws_elasticache_service_updates
- aws_iam_group_last_accessed_details
- aws_iam_policy_last_accessed_details
- aws_iam_role_last_accessed_details
- aws_iam_user_last_accessed_details
- aws_neptune_cluster_parameter_groups
- aws_neptune_db_parameter_groups
- aws_rds_cluster_parameter_groups
- aws_rds_db_parameter_groups
- aws_rds_engine_versions
- aws_servicequotas_services
---
kind: destination
spec:
name: postgresql
path: cloudquery/postgresql
version: "v6.1.2"
spec:
connection_string: postgresql://postgres:pass@localhost:5432/postgres