Customized and Configurable Scan Discovery

Sam Volin | Mar 4, 2024

HawkScan provides multiple mechanisms to discover running web applications. Security and software development teams can combine forces and accomplish more in their software development pipeline by using the Spidering, HAR file, Seed Path, or Custom Scan Discovery mechanisms.

Discovering an application by spidering pages

During the Active Scan, HawkScan attacks every path your application exposes, attempting to reproduce known software vulnerabilities against each one. Knowing which endpoints your web application exposes is therefore fundamental to how HawkScan operates.

After HawkScan has started and configured its behavior, but before the Active Scan, it begins the Scan Discovery phase, finding the paths of your web application by “spidering” them. The spider follows the URLs and relative paths found on each HTML page of the application in a breadth-first search (BFS) pattern. Starting from the scanned `app.host`, it looks for URLs within the same origin as your host (read: the application you’re scanning). On a separate thread, HawkScan performs a Passive Scan on each response, checking for direct evidence of known vulnerabilities, before adding the path to the site tree for reuse during the Active Scan.
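To make the breadth-first, same-origin idea concrete, here is a minimal Python sketch. It is not HawkScan's implementation: the `spider` and `same_origin` functions, the `fetch_links` callback, and the in-memory `PAGES` site are all hypothetical, and the Passive Scan step is left out entirely.

```python
from collections import deque
from urllib.parse import urljoin, urlsplit


def same_origin(url_a: str, url_b: str) -> bool:
    """True when both URLs share a scheme and a host:port."""
    a, b = urlsplit(url_a), urlsplit(url_b)
    return (a.scheme, a.netloc) == (b.scheme, b.netloc)


def spider(host: str, fetch_links) -> list[str]:
    """Breadth-first crawl from host.

    fetch_links(url) is a hypothetical callback returning the hrefs
    found on that page. Returns paths in the order they were visited.
    """
    start = urljoin(host, "/")
    seen = {start}
    queue = deque([start])
    site_tree = []
    while queue:
        url = queue.popleft()  # FIFO queue gives breadth-first order
        site_tree.append(urlsplit(url).path)
        for href in fetch_links(url):
            target = urljoin(url, href)       # resolve relative paths
            target = target.split("#", 1)[0]  # ignore fragments
            if same_origin(start, target) and target not in seen:
                seen.add(target)
                queue.append(target)
    return site_tree


# Hypothetical in-memory site standing in for real HTTP responses.
PAGES = {
    "http://localhost:9000/": ["/about", "/login", "https://example.com/elsewhere"],
    "http://localhost:9000/about": ["/", "team"],
    "http://localhost:9000/login": ["/about"],
    "http://localhost:9000/team": [],
}

paths = spider("http://localhost:9000", lambda url: PAGES.get(url, []))
print(paths)
```

The external `example.com` link is skipped because it is not on the same origin, and `/team` is only found because `/about` links to it. A page that nothing links to would never appear in the result.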

This is what happens by default when you run HawkScan. Spidering is capped at 2 minutes by default, so this part of the scan won’t take forever like some other AppSec tools. HawkScan also supports scanning from an OpenAPI specification file, a GraphQL introspection endpoint, or even a SOAP WSDL file to find attackable paths into your API.

All of these aspects of the scan are entirely configurable within the stackhawk.yml file as well, by the way:

hawk:
 spider:
   maxDurationMinutes: 2 # maximum allowed time in minutes for spiders to crawl your application.
   base: true # the basic spider utility that looks at html source files and follows urls it finds. Enabled by default.
   ajax: false # a more complex spider operation that follows dynamic links and buttons in an application.

Discovering an application with HAR files

A HAR (HTTP Archive) file is a log of a web browser’s interaction with a website. The format is designed to store and share collected data about network requests, responses, and other performance-related information. HAR files capture details such as URLs, headers, cookies, timings, and content for each HTTP request made by the browser, all of which can be leveraged to discover your application.
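As a rough illustration of why a HAR file is useful for discovery, the Python sketch below pulls the unique method and path pairs for one host out of a HAR document. It relies only on the standard HAR layout (`log.entries[].request.method` and `.url`); the `paths_from_har` helper and the sample data are hypothetical and say nothing about how HawkScan itself reads HAR files.

```python
import json
from urllib.parse import urlsplit


def paths_from_har(har_text: str, host: str) -> list[tuple[str, str]]:
    """Return sorted unique (method, path) pairs recorded for the given host."""
    har = json.loads(har_text)
    target = urlsplit(host)
    found = set()
    for entry in har["log"]["entries"]:
        request = entry["request"]
        url = urlsplit(request["url"])
        # Keep only requests sent to the application being scanned.
        if (url.scheme, url.netloc) == (target.scheme, target.netloc):
            found.add((request["method"], url.path or "/"))
    return sorted(found)


# Hypothetical, heavily trimmed HAR content; real files carry far more detail.
SAMPLE_HAR = json.dumps({
    "log": {
        "version": "1.2",
        "entries": [
            {"request": {"method": "GET", "url": "http://localhost:9000/login"}},
            {"request": {"method": "POST", "url": "http://localhost:9000/login"}},
            {"request": {"method": "GET", "url": "http://localhost:9000/api/items?page=2"}},
            {"request": {"method": "GET", "url": "https://cdn.example.com/app.js"}},
        ],
    }
})

routes = paths_from_har(SAMPLE_HAR, "http://localhost:9000")
print(routes)
```

Because the recording reflects what the browser actually sent, routes reached through JavaScript or behind a login form show up here even when no plain link points at them.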

With HawkScan, you can identify and map the paths of your web application using HAR files. Although we prefer API specifications, HAR files provide a high level of control and precision in how HawkScan navigates and analyzes your web application, making them a better alternative than spidering for scan discovery.

We’ve made the scan discovery process for single-page apps even easier, by allowing you to record HAR files directly from your local machine, providing even greater accuracy. This is extremely helpful for recording authentication to ensure you are testing password-protected routes.

To record a session, use `hawk perch start --with-chrome` and `--with-proxy-info` to begin the recording, and `hawk perch stop --har-file=<file name>` to save your session. You can learn more with `hawk perch start --help` and `hawk perch stop --help`.

Discovering an application by telling HawkScan what’s what

The web-crawling mechanism to discover a web application is not a silver bullet. It requires every page in your web application to be reachable through a chain of links that starts from your root `app.host`, so it won’t work for pages that are unlinked or hidden. If you know exactly which paths in your web application you want HawkScan to visit, tell it explicitly by specifying `hawk.spider.seedPaths`. Providing HawkScan with seedPaths adds these application routes to the internal site tree to be visited later during the Active Scan.

hawk:
 spider:
   seedPaths:
     - /hidden
     - /secret-path
     - /unlinked-endpoint-no-spider-will-ever-find
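One way to decide what belongs in seedPaths is to compare the routes you know your application serves against the paths a crawl actually reached. The Python sketch below does that with two hypothetical helpers, `missing_seed_paths` and `render_seed_paths`, and made-up route lists; it only prints a fragment in the same layout as the example above.

```python
def missing_seed_paths(known_routes, spidered_paths):
    """Routes the application serves that the spider never reached."""
    return sorted(set(known_routes) - set(spidered_paths))


def render_seed_paths(paths):
    """Format paths as a seedPaths fragment in the layout shown above."""
    lines = ["hawk:", " spider:", "   seedPaths:"]
    lines += [f"     - {path}" for path in paths]
    return "\n".join(lines)


# Hypothetical inputs: routes from your router, and paths a crawl found.
known = ["/", "/about", "/hidden", "/secret-path"]
spidered = ["/", "/about"]

unlinked = missing_seed_paths(known, spidered)
print(render_seed_paths(unlinked))
```

Anything in the difference is a page the spider cannot find on its own, which is exactly the case seedPaths is meant to cover.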

You can read more about HawkScan scan discovery and spidering mechanisms in our sweet documentation.

Customizing HawkScan with your favorite DevTools

This brings us to a cool new feature in HawkScan 2.8.0: Custom Scan Discovery. This feature allows HawkScan to be configured with a specified process command that will run in an environment designed for HawkScan to intercept the web traffic the command generates.

This feature is highly flexible to different environments or build systems, so that advanced developer resources can be reused.

By using this feature, security teams can leverage the Postman Collections developers write for testing their API endpoints:

hawk:
 spider:
   base: false
   custom:
     command: "newman run postman_collection.json"

Or they can run their Cypress test suites and feed HawkScan the requests those tests make to your web application:

hawk:
 spider:
   base: false
   custom:
     command:  "./node_modules/.bin/cypress run -s path/to/cypress-specs"
     environment:
       NO_PROXY: "<-loopback>"

The configuration support for Custom Discovery even works with HawkScan’s smart ability to interpolate and safely handle secrets from configuration at runtime. It’s so flexible that you can invoke a shell and call any arbitrary commands you need, with access to more terminal resources, so security researchers can get into all kinds of shenanigans with HawkScan:

# security researchers can try this, but not recommended for the pipeline!
app:
  host: ${APP_HOST:http://localhost:9000}

hawk:
  spider:
    base: false
    custom:
      command: bash
      credentials:
      arguments:
        - -c
        - "echo KAAKAWW!! && curl -x $HTTP_PROXY -X DELETE ${APP_HOST:http://localhost:9000}/admin/records/indices"

And more! These tools are just the tip of the iceberg. Any devtools that support proxying their web traffic through a separate host, either by the `HTTP_PROXY` environment variable or a configuration file, can be used for customized Scan Discovery.
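To picture what "supporting `HTTP_PROXY`" means for a tool, here is a small Python sketch of a launcher that starts a command with proxy variables set in its environment. It is an assumption-laden illustration, not HawkScan's mechanism: the `run_through_proxy` helper and the proxy address `http://127.0.0.1:8080` are hypothetical, and the child process simply echoes the variable it was given.

```python
import os
import subprocess
import sys


def run_through_proxy(command: list[str], proxy_url: str) -> str:
    """Run command with proxy variables set and return its trimmed stdout."""
    env = dict(os.environ)          # start from the current environment
    env["HTTP_PROXY"] = proxy_url   # read by many HTTP clients and CLIs
    env["HTTPS_PROXY"] = proxy_url
    result = subprocess.run(
        command, env=env, capture_output=True, text=True, check=True
    )
    return result.stdout.strip()


# Stand-in for a real dev tool: a child process that reports its proxy setting.
child = [sys.executable, "-c", "import os; print(os.environ['HTTP_PROXY'])"]
seen_by_child = run_through_proxy(child, "http://127.0.0.1:8080")
print(seen_by_child)
```

A tool that honors the variable sends its requests through that address, which is what lets an intercepting scanner observe them. A tool that ignores it needs its own proxy setting in a configuration file instead.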

You can learn more about how to succeed with Custom Scan Discovery in our documentation.

Combine them all and discover your whole application

Part of the power of HawkScan is that you can enable any combination of the spidering, seed path, and custom discovery mechanisms. And HawkScan works even better when configured to scan specific API protocols, such as OpenAPI, GraphQL, or SOAP. HawkScan has the flexibility and capabilities to adapt to any software environment and support engineering teams in finding vulnerabilities anywhere in a running web application. By giving smarter, straightforward resources to developers and software teams, we hope users can maintain a stronger application security posture as they develop and defend their awesome software.

📺 Watch a Quick Demo

https://fast.wistia.net/embed/iframe/ncbe0mjyb
