This is an efficient implementation of a log ingestion & analysis solution. It uses Golang & Elastic DB to ingest, store & analyse logs. It accepts logs from two sources: a Kafka queue or direct HTTP API requests. It also provides a web UI for analysing logs using different filters & search queries.
- Text-search across all fields
- Regular expression search on all fields
- Filters on specific fields
- Search in given time range
- Combination of all filters & search
- HTTP endpoint for posting logs
- Kafka queue for streamlined log processing
- Ingestion Buffer & Batch processing
- Efficient search queries leveraging Elastic DB
- Scalable & Efficient processing using sharding provided by Elastic DB
- Python
- Node.js & npm
- Golang
- Kafka
- Elasticsearch
Clone the repo
git clone https://github.com/dyte-submissions/november-2023-hiring-OmkarPh
cd november-2023-hiring-OmkarPh/
Setup Log ingestion server
Go to the log-server directory
cd log-server/
Install golang dependencies
go mod download
Install Python dependencies
pip install -r requirements.txt
Start the ingestion server
cd cmd
GIN_MODE=release go run .
The server should now be running on http://localhost:3000.
Simulate a large number of sample logs sent simultaneously (configurable; default value: 3000)
python tests/performance_test.py
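As a rough illustration of what such a load simulation involves, here is a minimal Python sketch. It is not the contents of tests/performance_test.py: `make_sample_log`, `simulate_load`, and the `send` callable are hypothetical names, and treating 3000 as the number of logs to send is an assumption about what the default value above refers to.

```python
from concurrent.futures import ThreadPoolExecutor
from datetime import datetime, timezone


def make_sample_log(i):
    """Build one sample entry using the field names from the ingestion request example."""
    return {
        "level": "error" if i % 2 else "info",
        "message": f"Sample log {i}",
        "resourceId": f"server-{i % 10}",
        "timestamp": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
        "traceId": f"trace-{i}",
        "spanId": f"span-{i}",
        "commit": "5e5342f",
        "metadata": {"parentResourceId": "server-0987"},
    }


def simulate_load(send, total_logs=3000, max_workers=50):
    """Call send(log) for total_logs entries from a thread pool.

    total_logs=3000 is an assumed meaning of the documented default.
    Returns the number of sends that completed without raising.
    """
    def attempt(i):
        try:
            send(make_sample_log(i))
            return True
        except Exception:
            return False

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return sum(pool.map(attempt, range(total_logs)))
```

Passing the sender in as a callable keeps the sketch independent of any particular HTTP client; in practice `send` would post each entry to the ingestion server.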
Setup Web UI
- Go to the frontend directory
cd frontend/
- Install NPM dependencies
npm install
- Start the React app
npm start
- View the app here - http://localhost:3006
Simulate Log Publishers (Optional)
Start ZooKeeper
zookeeper-server-start /path-to-kafka-config/zoo.cfg
Start Kafka
kafka-server-start /path-to-kafka-config/server.properties
Start publisher script
Go to the log-server/log-producers directory
cd log-server/log-producers
Start a producer to simulate a service using the --topic option in different shells. Configured topics include auth and payment.
# Example:
python producer.py --topic payment
python producer.py --topic auth
Description: Ingests a new log entry into the system.
Request Example:
{
  "level": "error",
  "message": "Failed to connect to DB",
  "resourceId": "server-1234",
  "timestamp": "2023-09-15T08:00:00Z",
  "traceId": "abc-xyz-123",
  "spanId": "span-456",
  "commit": "5e5342f",
  "metadata": {
    "parentResourceId": "server-0987"
  }
}
Response Example:
{ "status": "success" }
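A minimal Python sketch of posting such an entry, using only the standard library. The function names are hypothetical, and the endpoint URL is left as a parameter because the exact path is not stated here.

```python
import json
import urllib.request


def build_ingest_request(url, log_entry):
    """Prepare a JSON POST request for one log entry (url is caller-supplied)."""
    body = json.dumps(log_entry).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def post_log(url, log_entry, timeout=5):
    """Send the entry and return the decoded JSON response body."""
    request = build_ingest_request(url, log_entry)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Splitting request construction from sending makes the payload easy to inspect before anything goes over the network.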
GET /logs-count
Description: Retrieves the count of logs stored in Elasticsearch.
Response Example:
{ "count": 5286 }
POST /search-logs
Description: Searches for logs based on specified parameters. All the filter params, search text & time range are optional.
Request Example:
{
  "text": "email",
  "regexText": "jkl-*",
  "filters": [
    { "columnName": "level", "filterValues": ["error", "warning"] },
    { "columnName": "resourceId", "filterValues": ["user-123"] },
    { "columnName": "metadata.parentResourceId", "filterValues": ["9876", "1234"] },
    ... Other columns
  ],
  "timeRange": {
    "startTime": "2023-11-19T00:00:00Z",
    "endTime": "2023-11-19T23:59:59Z"
  }
}
Response Example:
{
  "hits": {
    "total": 5,
    "hits": [
      {
        "_id": "1",
        "_source": {
          "level": "error",
          "message": "Database connection error",
          "resourceId": "user-123"
          // ... (other log fields)
        }
      }
      // Additional log entries
    ]
  }
}
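Since every search parameter is optional, a client can assemble the request body from whichever pieces it has. The helper below is a hypothetical Python sketch; omitting absent fields entirely (rather than sending empty values) is an assumption about what the server accepts.

```python
def build_search_body(text=None, regex_text=None, filters=None,
                      start_time=None, end_time=None):
    """Assemble a /search-logs request body, leaving out unset parameters.

    filters maps a column name to a list of accepted values, e.g.
    {"level": ["error", "warning"]}.
    """
    body = {}
    if text:
        body["text"] = text
    if regex_text:
        body["regexText"] = regex_text
    if filters:
        body["filters"] = [
            {"columnName": column, "filterValues": list(values)}
            for column, values in filters.items()
        ]
    # Only include the time range when both bounds are given (assumption).
    if start_time and end_time:
        body["timeRange"] = {"startTime": start_time, "endTime": end_time}
    return body
```

For example, `build_search_body(text="email", filters={"level": ["error"]})` yields a body with only the `text` and `filters` keys.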
- Elasticsearch Index Name:
- Server Port:
- CORS Configuration: Allows all origins and supports HTTP methods: GET, POST, PUT, DELETE.
- Concurrency Configuration:
  - Buffered log channel with a default capacity of 5000 logs (configurable).
  - Maximum concurrent log processing workers: 20.
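The buffer-and-workers arrangement can be pictured with the following Python analogue. The server itself is written in Go, so this is a sketch using `queue.Queue` and threads rather than the actual implementation; `start_workers`, `stop_workers`, `flush`, `batch_size`, and `idle_timeout` are hypothetical, while the capacity of 5000 and the 20 workers come from the configuration above.

```python
import queue
import threading


def start_workers(log_queue, flush, workers=20, batch_size=100, idle_timeout=0.2):
    """Start worker threads that drain log_queue and hand batches to flush().

    batch_size and idle_timeout are illustrative values, not documented settings.
    """
    def worker():
        batch = []
        while True:
            try:
                item = log_queue.get(timeout=idle_timeout)
            except queue.Empty:
                # Queue went quiet: flush a partial batch instead of holding it.
                if batch:
                    flush(batch)
                    batch = []
                continue
            if item is None:  # sentinel: flush what is left and stop
                if batch:
                    flush(batch)
                return
            batch.append(item)
            if len(batch) >= batch_size:
                flush(batch)
                batch = []

    threads = [threading.Thread(target=worker, daemon=True) for _ in range(workers)]
    for thread in threads:
        thread.start()
    return threads


def stop_workers(log_queue, threads):
    """Send one sentinel per worker and wait for all of them to finish."""
    for _ in threads:
        log_queue.put(None)
    for thread in threads:
        thread.join()
```

A bounded queue such as `queue.Queue(maxsize=5000)` makes producers block once the buffer is full, which gives simple backpressure; `flush` would typically write the batch to the store in a single bulk request.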
- Kibana integration for better visual analysis
- Persistent TCP or WebSocket connection between servers (log producers) and the log ingestor for lower latency
- Redundant disk buffer for reliability
- Alert system to notify administrators or users about critical log events in real-time.
