NGWAF – First Iteration Of ML Based Feedback WAF



Inspired by actual pain points from operating WAFs, NGWAF intends to simplify and reimagine WAF operations through the following processes:

To make deployment simple and portable, we have containerised the different components in the architecture using docker and configured them in a docker-compose file. This allows running it on a fresh install to be quick and easy as the dependencies are handled by docker automatically. The deployment can be expanded to be deployed into a local or cloud provider based kubernetes cluster, making scalabe as users can increase the number of nodes/pods to handle large amounts of traffic.

The deployment have been tested on macOS (Docker desktop), linux (ubuntu).

NGWAF runs out-of-the-box with three key components, these components as mentioned above are all containerised and are scalable according to desired usage. The protected resource can be customised by making a deployment change within the setup.

NGWAF was engineered with the following key user benefits in mind:

NGWAF adopts a novel architecture consisting an interactive and quarantine environment built to isolate potential hostile attackers. Unlike conventional WAFs which blocks upon detection, NGWAF diverts threat actors to emulated systems, trapping them to soften the impact of their malicious actions. The environment also act as a sinkhole to gather current attack methods, enabling the observation and collection of malicious data. These data can be used to further improve NGWAF’s detection capability.

NGWAF in action: Upon detection of SQL injection, NGWAF redirects to our quarantine environment, instead of dropping or blocking the attempt.

Training data and compliance checks for NGWAF are collected and conducted based on this requirement.

Instead of traditional rulesets which require analysts to manually identify and add rules as time goes by, NGWAF leverages end-to-end machine learning pipelines for the detection mechanism, greatly reducing the complexity in WAF rule management, especially for detecting complex payloads.

To do so, we needed to first create a base model and architecture that users can start off with, before they later use data collected from their own applications for retraining and fine-tuning:

Our model was able to achieve 99.6% accuracy on our training dataset.

Although we have included logs from various applications in order to improve the generalizability of the base model, further maintenance and retraining of the model will be important to:

Contrary to traditional WAFs where malicious traffic are blocked or dropped right away. NGWAF is going with a more flexible approach. Whereby, it redirects and detains malicious actors within a quarantine environment. This environment consists of various interactive emulated honeypots to try and gather more attack methods/data, these data will be utilised to potentially enhance NGWAF’s detection rate of more modern and complex attacks.

Currently, NGWAF’s quarantine environment forwards all data submitted by the trapped attacker to our ELK stack for analysis and visualisation. The data are auto-scrubbed into different components of the HTTP request, then packaged internally on the environment’s backend in JSON format before forwarding. This helps to lower the manpower cost required to clean and index the data when we kickstart the retraining process.

NGWAF currently provides users to make changes to the look and feel of the front-end aspect of our honeypots within the quarantine environment (based off a customised version of drupot). Users simply have to replace the assets folder within the docker volume with their front-end assets of choice.

NGWAF is also accommodating to users who would like to link their own honeypots as part of the quarantine environment. Users just have to forward the honeypot’s HTTP requests to the environment’s backend server (backend processes will automatically scrub and forward data to the analysis dashboard – ELK stack).

As new payloads and attack vectors emerge, it is important to upgrade detection capabilities in order to ensure security. Hence, a retraining function is built into NGWAF to ensure defenders are able to train the machine learning model to detect those newer payloads.

Retraining of datasets is one of the main features in NGWAF. On our dashboard, users can insert new dataset for retraining, to strengthen and improve the quality of NGWAF detection of malicious payloads.

This can be achieved in the following steps:

Create a new dataset (.csv) for upload in the following format (empty column, training data, label). You can refer to patch_sqli.csv as an example.

Navigate to http://localhost:8088 to view NGWAF admin panel.

Select the “Import Dataset” tab and upload the training set you have created

 

NGWAF uses ELK stack to capture logs of network data that passes through NGWAF, allowing users to monitor the traffic that passes through the NGWAF for further analysis.

 

NGWAF also comes with live Telegram notification, to inform owners about live malicious threats that is detected by NGWAF.

 

Tested Operating Systems

With Docker running, run the following file using the command below:

./run.sh

To replace the targets, point the dest_server and honey_pot_server variable to the correct targets in the /waf/WafApp/waf.py file

Once the Docker container is up, you can visit your localhost, in which these ports are running these services:

To allow for Telegram live notifications, do replace the following variables in /waf/WafApp/waf.py with a valid TELEGRAM tokens.

NGWAF is a W.I.P, Open source project, functions and features may change from patch to patch. If you are interested to contribute, please feel free to create an issue or pull request!

source


CyberTelugu

Leave a Reply

Your email address will not be published. Required fields are marked *

Back to top

Adblock Detected

Please consider supporting us by disabling your ad blocker

Refresh Page