Checks a website and sends the results as events through Kafka into a PostgreSQL database.
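A minimal sketch of the producer side of that flow, assuming the kafka-python and requests libraries; the `check_website` helper, the `website-checks` topic and the event fields are illustrative, not necessarily this repo's actual identifiers:

```python
import json
import time
from datetime import datetime, timezone

import requests
from kafka import KafkaProducer

TOPIC = "website-checks"  # illustrative topic name


def check_website(url: str) -> dict:
    """Fetch the URL and build a small result event."""
    started = time.monotonic()
    try:
        status_code = requests.get(url, timeout=5).status_code
    except requests.RequestException:
        status_code = None  # site unreachable / timed out
    return {
        "url": url,
        "status_code": status_code,
        "response_time_ms": int((time.monotonic() - started) * 1000),
        "checked_at": datetime.now(timezone.utc).isoformat(),
    }


if __name__ == "__main__":
    producer = KafkaProducer(
        bootstrap_servers="localhost:9092",
        value_serializer=lambda event: json.dumps(event).encode("utf-8"),
    )
    producer.send(TOPIC, check_website("https://example.com"))
    producer.flush()
```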
# sets up a local virtualenv from requirements.txt.freeze
make setup
# runs the (local) infra in the background via docker-compose and the two apps in the foreground
make run-local
To stop the local infrastructure (Kafka, ZooKeeper, Postgres):
make stop-infra-local
Runs the unit tests with pytest, and integration_tests.py against the local infrastructure:
make tests
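The unit tests mock out external I/O. A hedged sketch of what such a test could look like with pytest and unittest.mock, assuming a hypothetical `checker` module that contains the `check_website` function from the sketch above:

```python
from unittest import mock

import requests

# hypothetical module and function names, mirroring the producer sketch above
from checker import check_website


def test_unreachable_website_yields_event_without_status_code():
    # patch requests.get as used inside the checker module so no network I/O happens
    with mock.patch("checker.requests.get", side_effect=requests.RequestException):
        event = check_website("https://example.com")
    assert event["status_code"] is None
    assert event["url"] == "https://example.com"
```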
The Makefile targets expect that avn is set up with a default project:
source .venv/bin/activate # avn is installed inside the virtualenv
avn user login
# create a new project or reuse your current default one
avn project create <project-name>
avn project switch <project-name>
# maybe: switch the current project to a different cloud provider
avn project update --cloud do-fra
Creates and configures the needed services, downloads the needed credential files, and generates the aiven.env file, which needs to be adjusted afterwards:
# read through the errors; it might be necessary to run this twice to add the Kafka topic
make setup-infra-aiven
Afterwards, edit aiven.env to adjust the KAFKA_*/CONSUMER_POSTGRES_* settings. The needed values are available in the Aiven UI of the respective service.
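For orientation, the Kafka settings typically point at the service URI plus the downloaded SSL credential files, and the consumer settings at the Postgres connection. A hedged sketch of how the apps might read them, assuming kafka-python; the exact variable names (here following the KAFKA_*/CONSUMER_POSTGRES_* prefixes) and file names are assumptions:

```python
import os

from kafka import KafkaProducer

# assumed variable names following the KAFKA_* prefix; this repo's names may differ
producer = KafkaProducer(
    bootstrap_servers=os.environ["KAFKA_SERVICE_URI"],  # host:port from the Aiven UI
    security_protocol="SSL",
    ssl_cafile=os.environ.get("KAFKA_CA_FILE", "ca.pem"),
    ssl_certfile=os.environ.get("KAFKA_CERT_FILE", "service.cert"),
    ssl_keyfile=os.environ.get("KAFKA_KEY_FILE", "service.key"),
)

# the consumer reads its Postgres connection analogously, e.g. a single DSN/URI
postgres_uri = os.environ["CONSUMER_POSTGRES_URI"]  # assumed CONSUMER_POSTGRES_* name
```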
Then you can run the local producer/consumer against the Aiven infrastructure:
# this also automatically starts the Aiven infra if it was stopped
make run-aiven
To run the integration tests against the Aiven infrastructure:
# starts up the Aiven infra if it was stopped
make integration-tests-aiven
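The integration tests essentially verify that events produced during a few loops end up as rows in Postgres. A rough sketch of such a check, assuming psycopg2; the `website_checks` table and its columns are hypothetical:

```python
import os

import psycopg2


def count_check_events(url: str) -> int:
    # connect with the same settings the consumer uses (assumed variable name)
    conn = psycopg2.connect(os.environ["CONSUMER_POSTGRES_URI"])
    try:
        with conn.cursor() as cur:
            cur.execute("SELECT count(*) FROM website_checks WHERE url = %s", (url,))
            return cur.fetchone()[0]
    finally:
        conn.close()


def test_events_end_up_in_the_database():
    # after the producer/consumer ran a few loops, the check results must be persisted
    assert count_check_events("https://example.com") > 0
```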
To stop the Aiven infrastructure:
make stop-infra-aiven
Destroying the infrastructure is only possible in the Aiven UI.
Build the image:
make build-docker
Run the Docker containers in detached mode and then follow the logs of both containers. This expects a configured aiven.env file.
make run-docker-aiven
Ctrl+C will stop following the logs but not the containers!
To shut down both containers:
make stop-docker
Add all directly used packages to requirements.txt and run:
make update-packages
This installs/updates all packages from requirements.txt and puts the currently installed packages into requirements.txt.freeze.
Ensure that everything still works by running the tests:
make tests
Afterwards, commit requirements.txt and requirements.txt.freeze.
- Set up a local Kafka and PostgreSQL and add a Makefile to run them in the background
- Build producer: sending a simple message through Kafka
- Build consumer: move events into a table (see the consumer sketch after this list)
- Implement the requirements on the producer and consumer side
- Implement unit tests with mocks
- Implement an integration test which spins up new local infra, runs a few loops, and then checks that the expected data is in the DB, plus the same check with failing websites
- Polish the code
- Proper packaging: Dockerfile? setup.py? -> just a Dockerfile for now
- Figure out how to spin up the infra via avn and document it
- Run the integration tests against the Aiven infra
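A minimal sketch of the consumer side mentioned in the list above, assuming kafka-python and psycopg2; topic, table, and column names are illustrative:

```python
import json
import os

import psycopg2
from kafka import KafkaConsumer

# illustrative topic/table names; the real ones live in this repo's config
consumer = KafkaConsumer(
    "website-checks",
    bootstrap_servers=os.environ.get("KAFKA_SERVICE_URI", "localhost:9092"),
    value_deserializer=lambda raw: json.loads(raw.decode("utf-8")),
)
conn = psycopg2.connect(os.environ["CONSUMER_POSTGRES_URI"])  # assumed variable name

for message in consumer:
    event = message.value
    with conn.cursor() as cur:
        cur.execute(
            "INSERT INTO website_checks (url, status_code, response_time_ms, checked_at)"
            " VALUES (%s, %s, %s, %s)",
            (event["url"], event["status_code"],
             event["response_time_ms"], event["checked_at"]),
        )
    conn.commit()
```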
The initial Docker setup to use Kafka locally + Python example:
- https://towardsdatascience.com/kafka-docker-python-408baf0e1088
- https://medium.com/big-data-engineering/hello-kafka-world-the-complete-guide-to-kafka-with-docker-and-python-f788e2588cfc
- https://medium.com/better-programming/a-simple-apache-kafka-cluster-with-docker-kafdrop-and-python-cf45ab99e2b9
- https://stackoverflow.com/questions/52438822/docker-kafka-w-python-consumer/52440056
- https://help.aiven.io/en/articles/489572-getting-started-with-aiven-kafka
Makefile / virtualenv stuff: