Efficiently manage and query vast log data volumes with a scalable Log Ingestor and Query Interface, featuring real-time ingestion, advanced filtering, and a user-friendly interface.

OmkarPh/log-ingestor

Log ingestion & analysis

Table of Contents
  1. About The Project
  2. Getting Started
  3. API Documentation
  4. Additional Information
  5. Contributing
  6. License
  7. Contact

About The Project

This is a log ingestion & analysis solution built with Golang & Elasticsearch to ingest, store & analyse logs efficiently. It accepts logs from two sources: a Kafka queue or direct HTTP API requests. It also provides a web UI for analysing logs using different filters & search queries.

Features

  • Text-search across all fields
  • Regular expression search on all fields
  • Filters on specific fields
  • Search in given time range
  • Combination of all filters & search
  • HTTP endpoint for posting logs
  • Kafka queue for streamlined Log processing
  • Ingestion Buffer & Batch processing
  • Efficient search queries leveraging Elasticsearch
  • Scalable & efficient processing using the sharding provided by Elasticsearch

Built With

  • Golang
  • Python
  • Elasticsearch
  • Kafka
  • TypeScript
  • React
  • Gin


Getting Started

Prerequisites

  • Python
  • Node.js & npm
  • Golang
  • Kafka
  • Elasticsearch

Installation & usage

  • Clone the repo

    git clone https://github.com/dyte-submissions/november-2023-hiring-OmkarPh
    cd november-2023-hiring-OmkarPh/
  • Setup Log ingestion server

    1. Go to log-server directory

        cd log-server/
    2. Install golang dependencies

      go mod download
    3. Install Python dependencies

      pip install -r requirements.txt
    4. Start the ingestion server

      cd cmd
      GIN_MODE=release go run .

      The server should now be running on http://localhost:3000.

    5. Simulate a large number of sample logs being sent simultaneously. Configure LOGS_LENGTH in log-server/tests/performance_test.py (default value: 3000).

      python tests/performance_test.py
  • Setup Web UI

    1. Go to frontend directory
        cd frontend/
    2. Install NPM dependencies
      npm install
    3. Start the React app
      npm start
    4. View the app here - http://localhost:3006
  • Simulate Log Publishers (Optional)

    1. Start ZooKeeper

      zookeeper-server-start /path-to-kafka-config/zoo.cfg
    2. Start Kafka

      kafka-server-start /path-to-kafka-config/server.properties
    3. Start the publisher script. Go to the log-producers directory:

      cd log-server/log-producers

      Start a producer for each service you want to simulate by passing the --topic option, one producer per shell. Configured topics: auth, database, email, payment, server, services.

      # Example:
      python producer.py --topic payment
      python producer.py --topic auth
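
This README does not show the contents of the logs that the performance test and the producers send. As a rough illustration only, the sketch below generates random entries in the log schema documented under API Documentation, one per simulated topic. The function names, the `info` level, the message text and the ID formats are assumptions made for this example; they are not taken from performance_test.py or producer.py.

```python
import random
from datetime import datetime, timezone

# Topics listed in this README; the level list is partly an assumption
# ("error" and "warning" appear in the README, "info" is hypothetical).
TOPICS = ["auth", "database", "email", "payment", "server", "services"]
LEVELS = ["info", "warning", "error"]


def make_sample_log(topic, rng):
    """Build one hypothetical log entry for the given topic."""
    return {
        "level": rng.choice(LEVELS),
        "message": f"Sample {topic} event",
        "resourceId": f"{topic}-{rng.randint(1000, 9999)}",
        "timestamp": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
        "traceId": f"trace-{rng.randint(0, 10**6)}",
        "spanId": f"span-{rng.randint(0, 999)}",
        "commit": f"{rng.getrandbits(28):07x}",
        "metadata": {"parentResourceId": f"{topic}-{rng.randint(1000, 9999)}"},
    }


def make_sample_logs(count, seed=None):
    """Return `count` sample logs spread randomly across the topics."""
    rng = random.Random(seed)
    return [make_sample_log(rng.choice(TOPICS), rng) for _ in range(count)]
```

For instance, `make_sample_logs(3000)` would produce as many entries as the default LOGS_LENGTH.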


API Documentation

Ingestion Routes

1. New Log Ingestion

  • Endpoint: POST /

  • Description: Ingests a new log entry into the system.

    Request Example:

    {
      "level": "error",
      "message": "Failed to connect to DB",
      "resourceId": "server-1234",
      "timestamp": "2023-09-15T08:00:00Z",
      "traceId": "abc-xyz-123",
      "spanId": "span-456",
      "commit": "5e5342f",
      "metadata": {
        "parentResourceId": "server-0987"
      }
    }

    Response Example:

    {
      "status": "success"
    }
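
A minimal sketch of a client for this route, using only the Python standard library. It assumes the server is reachable at the default http://localhost:3000 and that it expects a JSON body with a `Content-Type: application/json` header; the helper names are hypothetical and not part of this project.

```python
import json
import urllib.request

# Assumption: the ingestion server runs at the default address from this README.
INGEST_URL = "http://localhost:3000/"


def build_log_entry(level, message, resource_id, timestamp,
                    trace_id, span_id, commit, parent_resource_id):
    """Build a log entry with the same fields as the request example."""
    return {
        "level": level,
        "message": message,
        "resourceId": resource_id,
        "timestamp": timestamp,
        "traceId": trace_id,
        "spanId": span_id,
        "commit": commit,
        "metadata": {"parentResourceId": parent_resource_id},
    }


def build_ingest_request(entry, url=INGEST_URL):
    """Prepare (but do not send) a POST request carrying the entry as JSON."""
    body = json.dumps(entry).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_log(entry, url=INGEST_URL, timeout=5.0):
    """Send the entry and return the decoded JSON response."""
    request = build_ingest_request(entry, url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the server running, calling `send_log(build_log_entry(...))` should return a dictionary like the response example.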

2. Count Logs

  • Endpoint: GET /logs-count

  • Description: Retrieves the count of logs stored in Elasticsearch.

    Response Example:

    {
      "count": 5286
    }
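
As a sketch, the count can be fetched and validated as shown here. The URL assumes the default port, and the function names are hypothetical.

```python
import json
import urllib.request

# Assumption: default server address from this README.
COUNT_URL = "http://localhost:3000/logs-count"


def parse_count(body):
    """Extract the integer count from a /logs-count response body."""
    count = json.loads(body)["count"]
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(count, bool) or not isinstance(count, int) or count < 0:
        raise ValueError(f"unexpected count value: {count!r}")
    return count


def fetch_log_count(url=COUNT_URL, timeout=5.0):
    """Call GET /logs-count and return the number of stored logs."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return parse_count(response.read())
```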

Query Routes

1. Search Logs

  • Endpoint: POST /search-logs

  • Description: Searches for logs based on specified parameters. All the filter params, search text & time range are optional.

    Request Example:

    {
      "text": "email",
      "regexText": "jkl-*",
      "filters": [
        {
          "columnName": "level",
          "filterValues": ["error", "warning"]
        },
        {
          "columnName": "resourceId",
          "filterValues": ["user-123"]
        },
        {
          "columnName": "metadata.parentResourceId",
          "filterValues": ["9876", "1234"]
        },
        ... Other columns
      ],
      "timeRange": {
        "startTime": "2023-11-19T00:00:00Z",
        "endTime": "2023-11-19T23:59:59Z"
      }
    }

    Response Example:

    {
      "hits": {
        "total": 5,
        "hits": [
          {
            "_id": "1",
            "_source": {
              "level": "error",
              "message": "Database connection error",
              "resourceId": "user-123"
              // ... (other log fields)
            }
          }
          // Additional log entries
        ]
      }
    }
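
Because every part of the search request is optional, a client can build the body incrementally. The sketch below is one way to do that. It assumes that an omitted key means "no constraint" and that a time range is only sent when both bounds are known; neither behaviour is confirmed by this README, and `build_search_request` is a hypothetical helper.

```python
def build_search_request(text=None, regex_text=None, filters=None,
                         start_time=None, end_time=None):
    """Build a /search-logs request body, leaving out unset parts.

    `filters` maps a column name to the values accepted for that column,
    e.g. {"level": ["error", "warning"]}.
    """
    body = {}
    if text:
        body["text"] = text
    if regex_text:
        body["regexText"] = regex_text
    if filters:
        body["filters"] = [
            {"columnName": column, "filterValues": list(values)}
            for column, values in filters.items()
        ]
    # Assumption: only send a time range when both bounds are given.
    if start_time and end_time:
        body["timeRange"] = {"startTime": start_time, "endTime": end_time}
    return body
```

For example, `build_search_request(text="email", filters={"level": ["error", "warning"]})` yields a body with only the `text` and `filters` keys, which can then be serialised with `json.dumps` and posted to /search-logs.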

Additional Information

  • Elasticsearch Index Name: log-ingestor
  • Server Port: :3000
  • CORS Configuration: Allows all origins (*) and supports HTTP methods: GET, POST, PUT, DELETE.
  • Concurrency Configuration:
    • Buffered log channel with a default capacity of 5000 logs. (Can be changed via logsBufferSize)
    • Maximum concurrent log processing workers: 20.
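
The server itself is written in Go, so the following is only a Python analogue of the buffered-channel and worker-pool pattern described above, not the project's implementation. A bounded queue stands in for the buffered log channel and a fixed set of threads stands in for the workers. `run_pipeline` and its parameters are hypothetical names.

```python
import queue
import threading

LOGS_BUFFER_SIZE = 5000  # mirrors the documented default buffer capacity
MAX_WORKERS = 20         # mirrors the documented worker limit

_STOP = object()  # sentinel telling a worker to exit


def run_pipeline(logs, handle, buffer_size=LOGS_BUFFER_SIZE, workers=MAX_WORKERS):
    """Push logs through a bounded buffer drained by a fixed pool of workers."""
    buffer = queue.Queue(maxsize=buffer_size)

    def worker():
        while True:
            item = buffer.get()
            if item is _STOP:
                return
            handle(item)  # e.g. index the log; this sketch does not catch errors

    threads = [threading.Thread(target=worker) for _ in range(workers)]
    for thread in threads:
        thread.start()

    for log in logs:
        buffer.put(log)  # blocks while the buffer is full (back-pressure)

    for _ in threads:
        buffer.put(_STOP)  # one sentinel per worker
    for thread in threads:
        thread.join()
```

The bounded buffer absorbs short bursts, while the fixed worker count caps how many logs are processed at once. When the buffer fills, `put` blocks the producer instead of letting memory grow without limit.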


Future improvements / Flaws

  • Kibana integration for better visual analysis
  • Persistent TCP or WebSocket connection between servers (log producers) and the log ingestor for lower latency
  • Redundant disk buffer for reliability
  • Alert system to notify administrators or users about critical log events in real-time.

Contributing

Contributions are what make the open source community such an amazing place to learn, inspire, and create. Any contributions you make are greatly appreciated.

If you have a suggestion that would make this better, please fork the repo and create a pull request. You can also simply open an issue with the tag "enhancement". Don't forget to give the project a star! Thanks again!

  1. Fork the Project
  2. Create your Feature Branch (git checkout -b feature/AmazingFeature)
  3. Commit your Changes (git commit -m 'Add some AmazingFeature')
  4. Push to the Branch (git push origin feature/AmazingFeature)
  5. Open a Pull Request


License

Distributed under the MIT License. See LICENSE.txt for more information.


Contact

Omkar Phansopkar - @omkarphansopkar - [email protected]

