The objectives of this task are as follows:
- Create a highly scalable Qdrant vector database hosted on AWS.
- Have automatic snapshotting and backup options available.
- Have a recovery mechanism from backup for the database.
- Develop an efficient mechanism to ingest around 1 million records into the database.
- Set up observability and performance monitoring with alerts on the system.
- Use Terraform to spin up the required resources.
- Go
- Go Fiber - Go framework
- Prometheus
- Grafana
- Qdrant - Vector database
- Terraform - Infrastructure as Code (IaC)
- AWS - Cloud provider
- Kube-Prometheus - Deploys the Prometheus Operator and already schedules a Prometheus instance called prometheus-k8s with alerts and rules by default
- Apache Kafka - Distributed event streaming platform, used for building the data streaming pipeline
- Trivy - Security scanner
- To start the application in the local environment, execute the following command:
# for local environment
docker-compose up -d
- If you want to deploy the application in a production environment, use the production-specific Docker Compose configuration file by executing the following command:
# for production environment
docker-compose -f docker-compose.prod.yaml up -d
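For orientation, the production Compose file looks roughly like the sketch below. This is a hypothetical minimal version only; the service names, image tag, and volume layout are assumptions, and the repository's docker-compose.prod.yaml is the source of truth (the PORT and QDRANT_ADDR variables mirror the docker run example later in this document).
# Hypothetical sketch; see docker-compose.prod.yaml in the repository for the real file.
version: "3.8"
services:
  qdrant:
    image: qdrant/qdrant
    ports:
      - "6333:6333"   # REST API
      - "6334:6334"   # gRPC
    volumes:
      - qdrant_storage:/qdrant/storage   # default Qdrant storage path
    restart: unless-stopped
  qdapi:
    image: qdapi                         # built with `docker build --tag qdapi .`
    environment:
      - PORT=8000
      - QDRANT_ADDR=qdrant:6334
    ports:
      - "8000:8000"
    depends_on:
      - qdrant
    restart: unless-stopped
volumes:
  qdrant_storage: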
git clone --recursive https://github.com/prometheus-operator/kube-prometheus
cd kube-prometheus
kubectl create -f manifests/setup
until kubectl get servicemonitors --all-namespaces ; do date; sleep 1; echo ""; done
kubectl create -f manifests/
# or
./kube_prom.sh
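kube_prom.sh is assumed to do nothing more than wrap the manual steps above; a minimal sketch under that assumption:
#!/usr/bin/env bash
# Hypothetical sketch of kube_prom.sh; assumes it only wraps the manual steps above.
set -euo pipefail

git clone --recursive https://github.com/prometheus-operator/kube-prometheus
cd kube-prometheus

# Install the CRDs and the monitoring namespace first.
kubectl create -f manifests/setup

# Wait until the ServiceMonitor CRD is being served before applying the rest.
until kubectl get servicemonitors --all-namespaces; do date; sleep 1; echo ""; done

kubectl create -f manifests/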
$ kubectl get all -n monitoring
NAME READY STATUS RESTARTS AGE
pod/alertmanager-main-0 2/2 Running 4 (17h ago) 29h
pod/alertmanager-main-1 2/2 Running 4 (17h ago) 29h
pod/alertmanager-main-2 2/2 Running 4 (17h ago) 29h
pod/blackbox-exporter-7d8c77d7b9-p4txc 3/3 Running 6 (17h ago) 29h
pod/grafana-79f47474f7-tsrpc 1/1 Running 2 (17h ago) 29h
pod/kube-state-metrics-8cc8f7df6-wslgq 3/3 Running 7 (86m ago) 29h
pod/node-exporter-bd97l 2/2 Running 4 (17h ago) 29h
pod/prometheus-adapter-6b88dfd544-4rr57 1/1 Running 3 (86m ago) 29h
pod/prometheus-adapter-6b88dfd544-vhb98 1/1 Running 2 (17h ago) 29h
pod/prometheus-k8s-0 2/2 Running 4 (17h ago) 29h
pod/prometheus-k8s-1 2/2 Running 4 (17h ago) 29h
pod/prometheus-operator-557b4f4977-q76cz 2/2 Running 6 (86m ago) 29h
pod/qdapi-5fdb7df48b-cjfrz 1/1 Running 0 31m
pod/qdapi-5fdb7df48b-s9l5x 1/1 Running 0 31m
pod/qdapi-5fdb7df48b-wl82n 1/1 Running 0 31m
pod/qdrant-db-0 1/1 Running 0 44m
pod/qdrant-db-1 1/1 Running 0 44m
pod/qdrant-db-2 1/1 Running 0 44m
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/alertmanager-main ClusterIP 10.108.137.87 <none> 9093/TCP,8080/TCP 29h
service/alertmanager-operated ClusterIP None <none> 9093/TCP,9094/TCP,9094/UDP 29h
service/blackbox-exporter ClusterIP 10.103.243.118 <none> 9115/TCP,19115/TCP 29h
service/grafana ClusterIP 10.96.214.152 <none> 3000/TCP 29h
service/kube-state-metrics ClusterIP None <none> 8443/TCP,9443/TCP 29h
service/node-exporter ClusterIP None <none> 9100/TCP 29h
service/prometheus-adapter ClusterIP 10.107.130.104 <none> 443/TCP 29h
service/prometheus-k8s ClusterIP 10.106.89.198 <none> 9090/TCP,8080/TCP 29h
service/prometheus-operated ClusterIP None <none> 9090/TCP 29h
service/prometheus-operator ClusterIP None <none> 8443/TCP 29h
service/qdapi ClusterIP 10.104.190.99 <none> 80/TCP 45m
service/qdrant-db ClusterIP 10.99.231.223 <none> 6333/TCP,6334/TCP 31m
NAME DESIRED CURRENT READY UP-TO-DATE AVAILABLE NODE SELECTOR AGE
daemonset.apps/node-exporter 1 1 1 1 1 kubernetes.io/os=linux 29h
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/blackbox-exporter 1/1 1 1 29h
deployment.apps/grafana 1/1 1 1 29h
deployment.apps/kube-state-metrics 1/1 1 1 29h
deployment.apps/prometheus-adapter 2/2 2 2 29h
deployment.apps/prometheus-operator 1/1 1 1 29h
deployment.apps/qdapi 3/3 3 3 45m
NAME DESIRED CURRENT READY AGE
replicaset.apps/blackbox-exporter-7d8c77d7b9 1 1 1 29h
replicaset.apps/grafana-79f47474f7 1 1 1 29h
replicaset.apps/kube-state-metrics-8cc8f7df6 1 1 1 29h
replicaset.apps/prometheus-adapter-6b88dfd544 2 2 2 29h
replicaset.apps/prometheus-operator-557b4f4977 1 1 1 29h
replicaset.apps/qdapi-5fdb7df48b 3 3 3 31m
replicaset.apps/qdapi-69d5bfcc99 0 0 0 45m
NAME READY AGE
statefulset.apps/alertmanager-main 3/3 29h
statefulset.apps/prometheus-k8s 2/2 29h
statefulset.apps/qdrant-db 3/3 44m
NAME SCHEDULE SUSPEND ACTIVE LAST SCHEDULE AGE
cronjob.batch/qdrant-cronjob 0 0 * * * False 0 <none> 4s
In the StatefulSet configuration, I have used the volumeClaimTemplates section to define the PVC template used by each replica of the StatefulSet. Each replica gets its own PersistentVolumeClaim (PVC) with its own identity, backed by the requested storage.
With this configuration, the Qdrant vector database instances will have their data persisted across restarts and rescheduling events, providing data durability and stability for your deployment.
- When using StatefulSets, you can request persistent storage using Persistent Volume Claims (PVCs). Each Pod in the StatefulSet can have its own PVC, which can be backed by a Persistent Volume (PV). The PVs are independent of the Pods and their lifecycle, so if a Pod fails, the PV and the data it holds remain intact.
- In the case of a StatefulSet with a master-slave configuration, the master Pod is responsible for handling write operations to the data storage, while the slave Pods can read from the storage. This configuration ensures data consistency, as only one Pod writes to the data at a time.
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: qdrant-db
spec:
  selector:
    matchLabels:
      app: qdrant-db
  serviceName: qdrant-db
  replicas: 3
  template:
    metadata:
      labels:
        app: qdrant-db
    spec:
      containers:
        - name: qdrant-db
          image: qdrant/qdrant
          ports:
            - containerPort: 6333
              name: web
            - containerPort: 6334
              name: grpc
          volumeMounts:
            - name: qdrant-data
              mountPath: /data
  volumeClaimTemplates:
    - metadata:
        name: qdrant-data
      spec:
        accessModes: [ "ReadWriteOnce" ]
        resources:
          requests:
            storage: 10Gi
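The serviceName field above points at a Service named qdrant-db (it shows up as service/qdrant-db in the kubectl output earlier). A minimal sketch of such a Service, assuming only the two ports exposed by the container:
apiVersion: v1
kind: Service
metadata:
  name: qdrant-db
spec:
  selector:
    app: qdrant-db
  ports:
    - name: web
      port: 6333
      targetPort: 6333
    - name: grpc
      port: 6334
      targetPort: 6334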
Kubernetes will generate a Job based on the schedule provided in the CronJob. The Job will run the container with the specified image at the scheduled time and take snapshots of the qdrant-db.
apiVersion: batch/v1
kind: CronJob
metadata:
  name: qdrant-cronjob
spec:
  schedule: "0 0 * * *" # Run once a day at midnight
  jobTemplate:
    spec:
      template:
        spec:
          containers:
            - name: qdrant-db
              image: qdrant/qdrant
              imagePullPolicy: IfNotPresent
              ports:
                - containerPort: 6333
                  name: web
                - containerPort: 6334
                  name: grpc
              volumeMounts:
                - name: qdrant-dump
                  mountPath: /data
          restartPolicy: OnFailure
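The exact snapshot command the Job runs is defined in the repository's manifest; purely as an illustration, Qdrant exposes a snapshot API on its REST port (6333) that a job container could call:
# Illustrative only; the collection name is an assumption.
# Snapshot a single collection:
curl -X POST "http://qdrant-db:6333/collections/test_collection/snapshots"
# Or snapshot the entire storage:
curl -X POST "http://qdrant-db:6333/snapshots"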
- @yearly (or @annually): Run once a year at midnight of 1 January (0 0 1 1 *)
- @monthly: Run once a month at midnight of the first day of the month (0 0 1 * *)
- @weekly: Run once a week at midnight on Sunday morning (0 0 * * 0)
- @daily (or @midnight): Run once a day at midnight (0 0 * * *)
- @hourly: Run once an hour at the beginning of the hour (0 * * * *)
docker build --tag qdapi .
docker run -p 8000:8000 -e PORT=8000 -e QDRANT_ADDR=qdrant:6334 -d qdapi
# production
docker-compose -f docker-compose.prod.yaml up -d
# development
docker-compose up -d
kubectl --namespace monitoring port-forward svc/prometheus-k8s 10000:9090 >/dev/null &
kubectl --namespace monitoring port-forward svc/grafana 20000:3000 >/dev/null &
kubectl --namespace monitoring port-forward svc/alertmanager-main 30000:9093 >/dev/null &
kubectl --namespace monitoring port-forward svc/qdapi 8080:80 >/dev/null &
kubectl --namespace monitoring port-forward svc/qdrant-db 6334:6334 >/dev/null &
# or
# Use the provided script to automate port forwarding
./portforwarding.sh
Once the port forwarding is set up, you can access the following services on your local machine:
- Grafana Dashboard: http://localhost:20000 - view various monitoring and analytics visualizations.
- Application API: http://localhost:8080 - interact with the application programmatically.
- Prometheus Dashboard: http://localhost:10000 - explore and monitor the metrics collected by Prometheus.
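A quick way to verify the forwards are working (not part of the repository, just standard checks) is to hit the Prometheus HTTP API and the application's /all route documented in the API section below:
# Scrape-target health via the standard Prometheus HTTP API
curl 'http://localhost:10000/api/v1/query?query=up'
# Application API sanity check
curl http://localhost:8080/all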
- Create IAM User:
- Log in to the AWS Management Console using an account with administrative privileges.
- Navigate to the IAM service.
- Click on "Users" in the left navigation pane and create a new user.
- Add the user to a group with access to EC2. You can use an existing group with the AmazonEC2FullAccess policy attached, or create a custom group with the necessary EC2 permissions.
- Take note of the Access Key ID and Secret Access Key provided during the user creation process. You will need these to configure AWS CLI access.
- Configure AWS CLI:
- Open a terminal or command prompt on your local machine.
- Run the following command and provide the Access Key ID and Secret Access Key when prompted:
aws configure
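The prompts look like this; the region and output format below are just example values:
$ aws configure
AWS Access Key ID [None]: <your-access-key-id>
AWS Secret Access Key [None]: <your-secret-access-key>
Default region name [None]: us-east-1
Default output format [None]: json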
- Clone the Repository:
  - Clone the repository containing the Terraform code to your local machine using Git, or download the code as a ZIP archive and extract it.
- Navigate to the Terraform Configuration Folder:
  - Using the terminal or command prompt, navigate to the folder that contains the Terraform configuration files (e.g., cd ./.terraform).
- Initialize Terraform:
  - Run the following command to initialize Terraform and download the necessary providers:
terraform init
- Plan the Terraform Deployment (Optional):
  - It's recommended to create a Terraform plan to preview the changes before applying them. Run the following command to generate a plan:
terraform plan
- Apply the Terraform Configuration:
  - If the plan looks good, apply the Terraform configuration to create the AWS EC2 instances. Run the following command and confirm the action:
terraform apply
- Verify the EC2 Instances:
  - Once the Terraform apply process is complete, log in to your AWS Management Console and navigate to the EC2 service. You should see the newly created EC2 instances.
If you want to remove the resources created by Terraform, you can use the following command:
terraform destroy
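For orientation only, the kind of resource this configuration provisions looks roughly like the sketch below; the provider region, AMI ID, instance type, and resource names are placeholders, and the repository's Terraform files are authoritative.
# Hypothetical sketch; see the repository's Terraform configuration for the real resources.
provider "aws" {
  region = "us-east-1" # placeholder region
}

resource "aws_instance" "qdrant_node" {
  ami           = "ami-0123456789abcdef0" # placeholder AMI ID
  instance_type = "t3.medium"             # placeholder instance type

  tags = {
    Name = "qdrant-node"
  }
}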
- Screenshot of the Grafana visualization dashboard.
- The dashboard.yaml file is present in the grafana directory.
- Screenshot of Minikube dashboard visualization.
- Get All the Collections Created
  GET {{ baseURL }}/all
- Create Collection
  POST {{ baseURL }}/all
  { "collectionName": "test_collection" }
- Create Field Inside Collection
  POST {{ baseURL }}/field/create
  { "collectionName": "test_collection", "fieldName": "location" }
- Insert Data into Collection
  POST {{ baseURL }}/upsert
  { "id": 2, "city": "New York", "location": "Washington DC", "collectionName": "test_collection" }
- Get Data by Id
  POST {{ baseURL }}/data/id
  { "id": 2, "collectionName": "test_collection" }
- Delete Collection
  DELETE {{ baseURL }}/collection/delete
  { "collectionName": "test_collection" }
To set up the Kafka consumer, follow these steps:
- Open a terminal or command prompt.
- Navigate to the kafka_consumer directory using the cd command:
cd kafka_consumer
- Build the consumer using the following command:
go build -o out/consumer utils.go consumer.go
This command will compile the code and generate the executable file named consumer inside the out directory.
To set up the Kafka producer, follow these steps:
- Open a terminal or command prompt.
- Navigate to the kafka_producer directory using the cd command:
cd kafka_producer
- Build the producer using the following command:
go build -o out/producer utils.go producer.go
This command will compile the code and generate the executable file named producer inside the out directory.
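Once both builds succeed, the binaries can be started from their respective directories (this assumes they need no extra flags; check the repository if they read configuration such as the broker address from the environment):
# From kafka_consumer/
./out/consumer
# In another terminal, from kafka_producer/
./out/producer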
By default, the data consumption value is set to 10,000 (10K) records in the kafka_producer/producer.go file. You can modify this value to any desired number, such as 1 million (1M), by following these steps:
- Open the kafka_producer/producer.go file in a text editor or code editor of your choice.
- Locate the following section of code within the producer.go file:
for n := 0; n < 10000; n++ {
    key := users[rand.Intn(len(users))]
    payload := Payload{
        ID:             n + 1,
        City:           key,
        Location:       "Spain",
        CollectionName: "test_collection",
    }
    .
    .
    .
}
- Update the loop condition 10000 to your desired value. For example, to generate 1 million records, change it to 1000000.
- Save the changes to the file.
With this modification, the Kafka producer will now generate the specified number of records when executed.
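The repository's producer is authoritative; as a hedged illustration of the overall pattern (JSON-encoding each Payload and publishing it to Kafka), here is a minimal standalone sketch using the segmentio/kafka-go client. The broker address, topic name, JSON field tags, and users list are assumptions.
package main

import (
    "context"
    "encoding/json"
    "log"
    "math/rand"

    "github.com/segmentio/kafka-go"
)

// Payload mirrors the struct used in kafka_producer/producer.go; the JSON tags
// are assumptions chosen to match the API request bodies shown earlier.
type Payload struct {
    ID             int    `json:"id"`
    City           string `json:"city"`
    Location       string `json:"location"`
    CollectionName string `json:"collectionName"`
}

func main() {
    // Assumed broker address and topic; the real values live in the repository.
    w := &kafka.Writer{
        Addr:  kafka.TCP("localhost:9092"),
        Topic: "qdrant-ingest",
    }
    defer w.Close()

    users := []string{"Madrid", "Barcelona", "Valencia"} // assumed sample keys

    const total = 10000 // raise to 1000000 for the 1M-record run
    for n := 0; n < total; n++ {
        key := users[rand.Intn(len(users))]
        payload := Payload{
            ID:             n + 1,
            City:           key,
            Location:       "Spain",
            CollectionName: "test_collection",
        }
        value, err := json.Marshal(payload)
        if err != nil {
            log.Fatal(err)
        }
        // One message per call keeps the sketch simple; batching several
        // kafka.Message values into a single WriteMessages call is more
        // efficient for large runs.
        if err := w.WriteMessages(context.Background(), kafka.Message{
            Key:   []byte(key),
            Value: value,
        }); err != nil {
            log.Fatal(err)
        }
    }
}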