The objectives of this task are as follows:
- Create a highly scalable Qdrant vector database hosted on AWS.
- Have automatic snapshotting and backup options available.
- Have a recovery mechanism from backup for the database.
- Develop an efficient mechanism to ingest around 1 million records into the database.
- Set up observability and performance monitoring with alerts on the system.
- Use Terraform to spin up the required resources.
- Go
- Go Fiber - Go framework
- Prometheus
- Grafana
- Qdrant - Vector database
- Terraform - Infrastructure as Code (IaC)
- AWS - Cloud provider
- Kube-Prometheus - Deploys the Prometheus Operator and already schedules a Prometheus instance called prometheus-k8s with alerts and rules by default
- Apache Kafka - Distributed event streaming platform, used for building the data streaming pipeline
- Trivy - Security scanner
- To start the application in the local environment, execute the following command:
# for local environment
docker-compose up -d
- If you want to deploy the application in a production environment, use the production-specific Docker Compose configuration file by executing the following command:
# for production environment
docker-compose -f docker-compose.prod.yaml up -d
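For orientation, the production Compose file looks roughly like the sketch below. This is a hypothetical minimal version only; the service names, image tag, and volume layout are assumptions, and the repository's docker-compose.prod.yaml is the source of truth (the PORT and QDRANT_ADDR variables mirror the docker run example later in this document).
# Hypothetical sketch; see docker-compose.prod.yaml in the repository for the real file.
version: "3.8"
services:
  qdrant:
    image: qdrant/qdrant
    ports:
      - "6333:6333"   # REST API
      - "6334:6334"   # gRPC
    volumes:
      - qdrant_storage:/qdrant/storage   # default Qdrant storage path
    restart: unless-stopped
  qdapi:
    image: qdapi                         # built with `docker build --tag qdapi .`
    environment:
      - PORT=8000
      - QDRANT_ADDR=qdrant:6334
    ports:
      - "8000:8000"
    depends_on:
      - qdrant
    restart: unless-stopped
volumes:
  qdrant_storage: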
git clone --recursive https://github.com/prometheus-operator/kube-prometheus
cd kube-prometheus
kubectl create -f manifests/setup
until kubectl get servicemonitors --all-namespaces ; do date; sleep 1; echo ""; done
kubectl create -f manifests/
# or
./kube_prom.sh
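kube_prom.sh is assumed to do nothing more than wrap the manual steps above; a minimal sketch under that assumption:
#!/usr/bin/env bash
# Hypothetical sketch of kube_prom.sh; assumes it only wraps the manual steps above.
set -euo pipefail

git clone --recursive https://github.com/prometheus-operator/kube-prometheus
cd kube-prometheus

# Install the CRDs and the monitoring namespace first.
kubectl create -f manifests/setup

# Wait until the ServiceMonitor CRD is being served before applying the rest.
until kubectl get servicemonitors --all-namespaces; do date; sleep 1; echo ""; done

kubectl create -f manifests/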
$ kubectl get all -n monitoring
NAME READY STATUS RESTARTS AGE
pod/alertmanager-main-0 2/2 Running 4 (17h ago) 29h
pod/alertmanager-main-1 2/2 Running 4 (17h ago) 29h
pod/alertmanager-main-2 2/2 Running 4 (17h ago) 29h
pod/blackbox-exporter-7d8c77d7b9-p4txc 3/3 Running 6 (17h ago) 29h
pod/grafana-79f47474f7-tsrpc 1/1 Running 2 (17h ago) 29h
pod/kube-state-metrics-8cc8f7df6-wslgq 3/3 Running 7 (86m ago) 29h
pod/node-exporter-bd97l 2/2 Running 4 (17h ago) 29h
pod/prometheus-adapter-6b88dfd544-4rr57 1/1 Running 3 (86m ago) 29h
pod/prometheus-adapter-6b88dfd544-vhb98 1/1 Running 2 (17h ago) 29h
pod/prometheus-k8s-0 2/2 Running 4 (17h ago) 29h
pod/prometheus-k8s-1 2/2 Running 4 (17h ago) 29h
pod/prometheus-operator-557b4f4977-q76cz 2/2 Running 6 (86m ago) 29h
pod/qdapi-5fdb7df48b-cjfrz 1/1 Running 0 31m
pod/qdapi-5fdb7df48b-s9l5x 1/1 Running 0 31m
pod/qdapi-5fdb7df48b-wl82n 1/1 Running 0 31m
pod/qdrant-db-0 1/1 Running 0 44m
pod/qdrant-db-1 1/1 Running 0 44m
pod/qdrant-db-2 1/1 Running 0 44m
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/alertmanager-main ClusterIP 10.108.137.87 <none> 9093/TCP,8080/TCP 29h
service/alertmanager-operated ClusterIP None <none> 9093/TCP,9094/TCP,9094/UDP 29h
service/blackbox-exporter ClusterIP 10.103.243.118 <none> 9115/TCP,19115/TCP 29h
service/grafana ClusterIP 10.96.214.152 <none> 3000/TCP 29h
service/kube-state-metrics ClusterIP None <none> 8443/TCP,9443/TCP 29h
service/node-exporter ClusterIP None <none> 9100/TCP 29h
service/prometheus-adapter ClusterIP 10.107.130.104 <none> 443/TCP 29h
service/prometheus-k8s ClusterIP 10.106.89.198 <none> 9090/TCP,8080/TCP 29h
service/prometheus-operated ClusterIP None <none> 9090/TCP 29h
service/prometheus-operator ClusterIP None <none> 8443/TCP 29h
service/qdapi ClusterIP 10.104.190.99 <none> 80/TCP 45m
service/qdrant-db ClusterIP 10.99.231.223 <none> 6333/TCP,6334/TCP 31m
NAME DESIRED CURRENT READY UP-TO-DATE AVAILABLE NODE SELECTOR AGE
daemonset.apps/node-exporter 1 1 1 1 1 kubernetes.io/os=linux 29h
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/blackbox-exporter 1/1 1 1 29h
deployment.apps/grafana 1/1 1 1 29h
deployment.apps/kube-state-metrics 1/1 1 1 29h
deployment.apps/prometheus-adapter 2/2 2 2 29h
deployment.apps/prometheus-operator 1/1 1 1 29h
deployment.apps/qdapi 3/3 3 3 45m
NAME DESIRED CURRENT READY AGE
replicaset.apps/blackbox-exporter-7d8c77d7b9 1 1 1 29h
replicaset.apps/grafana-79f47474f7 1 1 1 29h
replicaset.apps/kube-state-metrics-8cc8f7df6 1 1 1 29h
replicaset.apps/prometheus-adapter-6b88dfd544 2 2 2 29h
replicaset.apps/prometheus-operator-557b4f4977 1 1 1 29h
replicaset.apps/qdapi-5fdb7df48b 3 3 3 31m
replicaset.apps/qdapi-69d5bfcc99 0 0 0 45m
NAME READY AGE
statefulset.apps/alertmanager-main 3/3 29h
statefulset.apps/prometheus-k8s 2/2 29h
statefulset.apps/qdrant-db 3/3 44m
NAME SCHEDULE SUSPEND ACTIVE LAST SCHEDULE AGE
cronjob.batch/qdrant-cronjob 0 0 * * * False 0 <none> 4s
In the StatefulSet configuration, I have used the volumeClaimTemplates section to define the PVC template used by each replica of the StatefulSet. Each replica gets its own PersistentVolumeClaim (PVC) with its own identity, backed by the requested storage.
With this configuration, the Qdrant vector database instances will have their data persisted across restarts and rescheduling events, providing data durability and stability for your deployment.
- When using StatefulSets, you can request persistent storage using Persistent Volume Claims (PVCs). Each Pod in the StatefulSet can have its own PVC, which can be backed by a Persistent Volume (PV). The PVs are independent of the Pods and their lifecycle, so if a Pod fails, the PV and the data it holds remain intact.
- In the case of a StatefulSet with a master-slave configuration, the master Pod is responsible for handling write operations to the data storage, while the slave Pods can read from the storage. This configuration ensures data consistency, as only one Pod writes to the data at a time.
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: qdrant-db
spec:
  selector:
    matchLabels:
      app: qdrant-db
  serviceName: qdrant-db
  replicas: 3
  template:
    metadata:
      labels:
        app: qdrant-db
    spec:
      containers:
        - name: qdrant-db
          image: qdrant/qdrant
          ports:
            - containerPort: 6333
              name: web
            - containerPort: 6334
              name: grpc
          volumeMounts:
            - name: qdrant-data
              mountPath: /data
  volumeClaimTemplates:
    - metadata:
        name: qdrant-data
      spec:
        accessModes: [ "ReadWriteOnce" ]
        resources:
          requests:
            storage: 10Gi
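The serviceName field above points at a Service named qdrant-db (it shows up as service/qdrant-db in the kubectl output earlier). A minimal sketch of such a Service, assuming only the two ports exposed by the container:
apiVersion: v1
kind: Service
metadata:
  name: qdrant-db
spec:
  selector:
    app: qdrant-db
  ports:
    - name: web
      port: 6333
      targetPort: 6333
    - name: grpc
      port: 6334
      targetPort: 6334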
Kubernetes will generate a Job based on the schedule provided in the CronJob. The Job will run the container with the specified image at the scheduled time and take snapshots of the qdrant-db.
apiVersion: batch/v1
kind: CronJob
metadata:
  name: qdrant-cronjob
spec:
  schedule: "0 0 * * *" # Run once a day at midnight
  jobTemplate:
    spec:
      template:
        spec:
          containers:
            - name: qdrant-db
              image: qdrant/qdrant
              imagePullPolicy: IfNotPresent
              ports:
                - containerPort: 6333
                  name: web
                - containerPort: 6334
                  name: grpc
              volumeMounts:
                - name: qdrant-dump
                  mountPath: /data
          restartPolicy: OnFailure
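The exact snapshot command the Job runs is defined in the repository's manifest; purely as an illustration, Qdrant exposes a snapshot API on its REST port (6333) that a job container could call:
# Illustrative only; the collection name is an assumption.
# Snapshot a single collection:
curl -X POST "http://qdrant-db:6333/collections/test_collection/snapshots"
# Or snapshot the entire storage:
curl -X POST "http://qdrant-db:6333/snapshots"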
- @yearly (or @annually): Run once a year at midnight of 1 January (0 0 1 1 *)
- @monthly: Run once a month at midnight of the first day of the month (0 0 1 * *)
- @weekly: Run once a week at midnight on Sunday morning (0 0 * * 0)
- @daily (or @midnight): Run once a day at midnight (0 0 * * *)
- @hourly: Run once an hour at the beginning of the hour (0 * * * *)
docker build --tag qdapi .
docker run -p 8000:8000 -e PORT=8000 -e QDRANT_ADDR=qdrant:6334 -d qdapi
# production
docker-compose -f docker-compose.prod.yaml up -d
# development
docker-compose up -d
kubectl --namespace monitoring port-forward svc/prometheus-k8s 10000:9090 >/dev/null &
kubectl --namespace monitoring port-forward svc/grafana 20000:3000 >/dev/null &
kubectl --namespace monitoring port-forward svc/alertmanager-main 30000:9093 >/dev/null &
kubectl --namespace monitoring port-forward svc/qdapi 8080:80 >/dev/null &
kubectl --namespace monitoring port-forward svc/qdrant-db 6334:6334 >/dev/null &
# or
# Use the provided script to automate port forwarding
./portforwarding.sh
Once the port forwarding is set up, you can access the following services on your local machine:
- Grafana Dashboard: http://localhost:20000 - view various monitoring and analytics visualizations.
- Application API: http://localhost:8080 - interact with the application programmatically.
- Prometheus Dashboard: http://localhost:10000 - explore and monitor the metrics collected by Prometheus.
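A quick way to verify the forwards are working (not part of the repository, just standard checks) is to hit the Prometheus HTTP API and the application's /all route documented in the API section below:
# Scrape-target health via the standard Prometheus HTTP API
curl 'http://localhost:10000/api/v1/query?query=up'
# Application API sanity check
curl http://localhost:8080/all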
- Create IAM User:
- Log in to the AWS Management Console using an account with administrative privileges.
- Navigate to the IAM service.
- Click on "Users" in the left navigation pane and create a new user.
- Add the user to a group with access to EC2. You can use an existing group with the AmazonEC2FullAccess policy attached, or create a custom group with the necessary EC2 permissions.
- Take note of the Access Key ID and Secret Access Key provided during the user creation process. You will need these to configure AWS CLI access.
- Configure AWS CLI:
- Open a terminal or command prompt on your local machine.
- Run the following command and provide the Access Key ID and Secret Access Key when prompted:
aws configure
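The prompts look like this; the region and output format below are just example values:
$ aws configure
AWS Access Key ID [None]: <your-access-key-id>
AWS Secret Access Key [None]: <your-secret-access-key>
Default region name [None]: us-east-1
Default output format [None]: json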
- Clone the Repository:
  - Clone the repository containing the Terraform code to your local machine using Git, or download the code as a ZIP archive and extract it.
- Navigate to the Terraform Configuration Folder:
  - Using the terminal or command prompt, navigate to the folder that contains the Terraform configuration files (e.g., cd ./.terraform).
- Initialize Terraform:
  - Run the following command to initialize Terraform and download the necessary providers:
terraform init
- Plan the Terraform Deployment (Optional):
  - It's recommended to create a Terraform plan to preview the changes before applying them. Run the following command to generate a plan:
terraform plan
- Apply the Terraform Configuration:
  - If the plan looks good, apply the Terraform configuration to create the AWS EC2 instances. Run the following command and confirm the action:
terraform apply
- Verify the EC2 Instances:
  - Once the Terraform apply process is complete, log in to your AWS Management Console and navigate to the EC2 service. You should see the newly created EC2 instances.
If you want to remove the resources created by Terraform, you can use the following command:
terraform destroy
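For orientation only, the kind of resource this configuration provisions looks roughly like the sketch below; the provider region, AMI ID, instance type, and resource names are placeholders, and the repository's Terraform files are authoritative.
# Hypothetical sketch; see the repository's Terraform configuration for the real resources.
provider "aws" {
  region = "us-east-1" # placeholder region
}

resource "aws_instance" "qdrant_node" {
  ami           = "ami-0123456789abcdef0" # placeholder AMI ID
  instance_type = "t3.medium"             # placeholder instance type

  tags = {
    Name = "qdrant-node"
  }
}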
- Screenshot of the Grafana visualization dashboard.
- The dashboard.yaml file is present in the grafana directory.
- Screenshot of Minikube dashboard visualization.
- Get All the Collections Created
  GET {{ baseURL }}/all
- Create Collection
  POST {{ baseURL }}/all
  { "collectionName": "test_collection" }
- Create Field Inside Collection
  POST {{ baseURL }}/field/create
  { "collectionName": "test_collection", "fieldName": "location" }
- Insert Data into Collection
  POST {{ baseURL }}/upsert
  { "id": 2, "city": "New York", "location": "Washington DC", "collectionName": "test_collection" }
- Get Data by Id
  POST {{ baseURL }}/data/id
  { "id": 2, "collectionName": "test_collection" }
- Delete Collection
  DELETE {{ baseURL }}/collection/delete
  { "collectionName": "test_collection" }
To set up the Kafka consumer, follow these steps:
- Open a terminal or command prompt.
- Navigate to the kafka_consumer directory using the cd command:
cd kafka_consumer
- Build the consumer using the following command:
go build -o out/consumer utils.go consumer.go
This command will compile the code and generate the executable file named consumer inside the out directory.
To set up the Kafka producer, follow these steps:
- Open a terminal or command prompt.
- Navigate to the kafka_producer directory using the cd command:
cd kafka_producer
- Build the producer using the following command:
go build -o out/producer utils.go producer.go
This command will compile the code and generate the executable file named producer inside the out directory.
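Once both builds succeed, the binaries can be started from their respective directories (this assumes they need no extra flags; check the repository if they read configuration such as the broker address from the environment):
# From kafka_consumer/
./out/consumer
# In another terminal, from kafka_producer/
./out/producer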
By default, the data consumption value is set to 10,000 (10K) records in the kafka_producer/producer.go file. You can modify this value to any desired number, such as 1 million (1M), by following these steps:
- Open the kafka_producer/producer.go file in a text editor or code editor of your choice.
- Locate the following section of code within the producer.go file:
for n := 0; n < 10000; n++ {
    key := users[rand.Intn(len(users))]
    payload := Payload{
        ID:             n + 1,
        City:           key,
        Location:       "Spain",
        CollectionName: "test_collection",
    }
    .
    .
    .
}
- Update the loop condition 10000 to your desired value. For example, to generate 1 million records, change it to 1000000.
- Save the changes to the file.
With this modification, the Kafka producer will now generate the specified number of records when executed.
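The repository's producer is authoritative; as a hedged illustration of the overall pattern (JSON-encoding each Payload and publishing it to Kafka), here is a minimal standalone sketch using the segmentio/kafka-go client. The broker address, topic name, JSON field tags, and users list are assumptions.
package main

import (
    "context"
    "encoding/json"
    "log"
    "math/rand"

    "github.com/segmentio/kafka-go"
)

// Payload mirrors the struct used in kafka_producer/producer.go; the JSON tags
// are assumptions chosen to match the API request bodies shown earlier.
type Payload struct {
    ID             int    `json:"id"`
    City           string `json:"city"`
    Location       string `json:"location"`
    CollectionName string `json:"collectionName"`
}

func main() {
    // Assumed broker address and topic; the real values live in the repository.
    w := &kafka.Writer{
        Addr:  kafka.TCP("localhost:9092"),
        Topic: "qdrant-ingest",
    }
    defer w.Close()

    users := []string{"Madrid", "Barcelona", "Valencia"} // assumed sample keys

    const total = 10000 // raise to 1000000 for the 1M-record run
    for n := 0; n < total; n++ {
        key := users[rand.Intn(len(users))]
        payload := Payload{
            ID:             n + 1,
            City:           key,
            Location:       "Spain",
            CollectionName: "test_collection",
        }
        value, err := json.Marshal(payload)
        if err != nil {
            log.Fatal(err)
        }
        // One message per call keeps the sketch simple; batching several
        // kafka.Message values into a single WriteMessages call is more
        // efficient for large runs.
        if err := w.WriteMessages(context.Background(), kafka.Message{
            Key:   []byte(key),
            Value: value,
        }); err != nil {
            log.Fatal(err)
        }
    }
}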