Data Pipeline
The Data Pipeline module creates the NiFi deployment and sets up the EKS cluster it runs on.

  • NiFi
  • EKS

General Flow
  1. Configured to pull the specified NiFi image from an ECR repo. This assumes an image has already been pushed to ECR; see ./push.sh for how an image might be pulled, built with Docker, and pushed to ECR.
  2. EKS cluster creation, including autoscaling groups and worker nodes.
  3. Chart deployment: configures the chart to pull the NiFi image that was pushed to ECR. The other processors we download are fetched when the initContainer starts, which first downloads the Cigna CA certs so that those downloads can succeed.
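The pull, build, and push sequence mentioned in step 1 could look roughly like the sketch below. It is not the contents of ./push.sh. The function names, the DRY_RUN switch, and the account ID, repo name, and tag passed in are all hypothetical; only the ./docker build context comes from the file structure of this repo.

```shell
#!/bin/sh
# Sketch only: not the actual ./push.sh. Function names and DRY_RUN are hypothetical.
set -eu

# Print the command when DRY_RUN=1, otherwise execute it.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Registry host for an AWS account and region (standard ECR naming scheme).
ecr_registry() {
  echo "$1.dkr.ecr.$2.amazonaws.com"
}

# Usage: push_nifi_image ACCOUNT_ID REGION REPO TAG
# Runs in a subshell so its variables do not leak into the caller.
push_nifi_image() (
  account_id="$1"; region="$2"; repo="$3"; tag="$4"
  registry="$(ecr_registry "$account_id" "$region")"

  # Authenticate Docker against ECR with a short-lived token.
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "aws ecr get-login-password --region $region | docker login --username AWS --password-stdin $registry"
  else
    aws ecr get-login-password --region "$region" |
      docker login --username AWS --password-stdin "$registry"
  fi

  # --pull refreshes the base image before building from ./docker/Dockerfile.
  run docker build --pull --tag "$repo:$tag" ./docker
  run docker tag "$repo:$tag" "$registry/$repo:$tag"
  run docker push "$registry/$repo:$tag"
)
```

A call might look like `push_nifi_image 123456789012 us-east-1 nifi "$(cat ./docker/VERSION)"`, assuming ./docker/VERSION holds the image tag. Setting DRY_RUN=1 prints the commands instead of running them.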
File Structure
./assets/test-flow.xml
./docker/Dockerfile
./docker/VERSION
./docker/lib/
./modules/charts/main.tf
./modules/charts/variables.tf
./modules/ecr/main.tf
./modules/ecr/variables.tf
./modules/eks/main.tf
./modules/eks/variables.tf
./yaml/oneup-nifi-chart-cigna.yaml
./terraform.tf
./variables.tf
./values.tfvars
./README.md
./destroy.sh
./push.sh
Running
terraform init
terraform validate
terraform apply -var-file=values.tfvars
Validation
Once all the pods are running, you can connect to NiFi, for example:
  • With kubectl (this should work if you added your IAM role to the values file)
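As a rough sketch of the kubectl route: update your kubeconfig for the cluster, check the pods, then forward a local port to the NiFi service. The namespace (default), service name (nifi), and port (8080) are assumptions rather than values taken from this repo, and the function name and DRY_RUN switch are hypothetical; check the chart values for the real ones.

```shell
#!/bin/sh
# Sketch only: namespace, service name, and port are assumptions.
set -eu

# Print the command when DRY_RUN=1, otherwise execute it.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Usage: connect_to_nifi AWS_PROFILE AWS_REGION CLUSTER_NAME
# Optional overrides: NIFI_NAMESPACE, NIFI_SERVICE, NIFI_PORT.
connect_to_nifi() (
  profile="$1"; region="$2"; cluster="$3"
  namespace="${NIFI_NAMESPACE:-default}"
  service="${NIFI_SERVICE:-nifi}"
  port="${NIFI_PORT:-8080}"

  # Write cluster credentials into the local kubeconfig.
  run aws eks update-kubeconfig --profile "$profile" --region "$region" --name "$cluster"
  # Confirm the pods are up before forwarding.
  run kubectl get pods --namespace "$namespace"
  # Forward a local port to the NiFi service; blocks until interrupted.
  run kubectl port-forward --namespace "$namespace" "service/$service" "$port:$port"
)
```

For example, `connect_to_nifi "$AWS_PROFILE" "$AWS_REGION" data-ingestion-eks-cluster` would, under these assumptions, make the NiFi UI reachable on the forwarded local port.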
Once connected, you can upload the template provided in ./assets/test-flow.xml to NiFi. It is currently configured to hit the FHIR R4 lambda endpoint.
EKS Monitoring
Through the AWS EKS console, you can get high-level information about the state of pods and other workloads. For more detail, provided your role has system:masters access to the EKS cluster, you can deploy the Kubernetes dashboard.
More information on the dashboard is available here: https://docs.aws.amazon.com/eks/latest/userguide/dashboard-tutorial.html
Destroy
Run the following commands in order. The destroy script should uninstall the NiFi chart, delete dynamically provisioned volumes, and then remove the chart from the Terraform state:
./destroy.sh $AWS_PROFILE $AWS_REGION data-ingestion-eks-cluster
terraform refresh -var-file=values.tfvars
terraform destroy -var-file=values.tfvars
The terraform refresh step refreshes the data source resources, which updates the EKS tokens so that Terraform can connect to the cluster and uninstall Helm charts.
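The three cleanup steps attributed to the destroy script could be sketched as follows. This is not the contents of ./destroy.sh. The release name (nifi), namespace (default), and Terraform state address (module.charts.helm_release.nifi) are hypothetical placeholders, as are the function name and the DRY_RUN switch; run `terraform state list` to find the real address.

```shell
#!/bin/sh
# Sketch only: not the actual ./destroy.sh. Release, namespace, and state
# address are hypothetical placeholders.
set -eu

# Print the command when DRY_RUN=1, otherwise execute it.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Usage: uninstall_nifi_chart AWS_PROFILE AWS_REGION CLUSTER_NAME
uninstall_nifi_chart() (
  profile="$1"; region="$2"; cluster="$3"
  release="${NIFI_RELEASE:-nifi}"
  namespace="${NIFI_NAMESPACE:-default}"
  state_address="${NIFI_STATE_ADDRESS:-module.charts.helm_release.nifi}"

  # Point kubectl and helm at the target cluster.
  run aws eks update-kubeconfig --profile "$profile" --region "$region" --name "$cluster"
  # 1. Uninstall the chart.
  run helm uninstall "$release" --namespace "$namespace"
  # 2. Delete the claims; volumes provisioned with a Delete reclaim policy
  #    are removed along with them.
  run kubectl delete pvc --all --namespace "$namespace"
  # 3. Drop the chart from the Terraform state so destroy does not try to
  #    uninstall it a second time.
  run terraform state rm "$state_address"
)
```

Running it with DRY_RUN=1 first prints the four commands so the placeholders can be checked before anything is deleted.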