Writing Secure Code with OpenAI Codex: Scan, Fix, and Verify with StackHawk

Matt Tanner | Jun 19, 2026

Codex CLI lives in your terminal and works alongside the tools that developers already use. As with other terminal-based agents, you point it at a directory, and OpenAI’s coding agent reads, changes, and runs the code inside it. The features it finishes pile up quickly. The security check on those features is still the slow, manual part of your day.

This guide shows you how to add the runtime security testing layer. By the end, Codex will run StackHawk scans against your live app, repair what turns up, and back every fix with a clean rescan.

What Are StackHawk Agent Skills for Codex?

A StackHawk agent skill is a set of instructions that teaches a coding agent the entire runtime security job: running scans, reviewing findings, fixing vulnerable code, and verifying the results. The skill compresses that job into a five-step loop: configure a stackhawk.yml for your app type, host, and auth pattern; scan the running app with HawkScan; parse the structured findings; fix the code; and verify with a rescan.

We ship two skills. HawkScan covers scanning and fixing, and it’s the one this tutorial focuses on. StackHawk API answers the reporting questions: security posture, findings reports, scan history, and triage status. With the HawkScan skill in place, Codex will configure HawkScan, run the scan, parse the findings, and help you fix them.

The skills are structured markdown, with no runtime dependencies to install and no code running in the background. That’s the Agentic StackHawk position on AI coding agent security: the agent that built the feature also tests it, so “done” means “done and secure”.

Prerequisites

Here are a few prerequisites to check off before the steps below:

Codex CLI: npm i -g @openai/codex installs it (Homebrew works too), and it’s included with ChatGPT Plus, Pro, Business, Edu, and Enterprise plans. New to the tool? OpenAI’s Codex CLI docs cover installation and setup.
A StackHawk account; agent skills need the Secure, Scale, or Wingman plan
A Java 17+ JDK if you’re on Linux; the installers for macOS and Windows already include Java
Your app running locally, source code included, listening on a port from 1024 to 65535
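
Part of this checklist can be verified from a script before you start. The following is a minimal Python sketch, not something StackHawk or Codex ships: the helper names (missing_tools, port_in_range, is_listening) are invented for illustration, and localhost:8080 is only an example address.

```python
import shutil
import socket


def missing_tools(tools):
    """Return the names in `tools` that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


def port_in_range(port):
    """True when the port is in the 1024-65535 range listed in the prerequisites."""
    return 1024 <= port <= 65535


def is_listening(host, port, timeout=2.0):
    """True when a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # "java" only matters on Linux; the macOS and Windows installers bundle it.
    print("Missing tools:", missing_tools(["codex", "java"]) or "none")
    print("Port 8080 in range:", port_in_range(8080))
    print("App listening on localhost:8080:", is_listening("localhost", 8080))
```

Once Step 2 is done, hawk and hawkop can be added to the list passed to missing_tools.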

Set Up Codex CLI Security Scanning with StackHawk

Step 1: Get a StackHawk API Key

To get an API key, log in to the StackHawk console in the browser and click Settings in the left-side menu, then click API Keys in the menu that appears. On the API Keys screen, click the Create API Key button in the top right corner.

Give your API key a descriptive name like “Codex Agent” and click Continue.

Screenshot: the New API Key dialog asks “What is this key for?” with “Codex Agent” entered in the input, and Cancel and Continue buttons at the bottom right.

The API key has now been created. Leave this screen open or temporarily copy the key somewhere secure so it’s ready for the next step. If you exit before copying it, you’ll need to delete the key and create a new one.

Screenshot: the API Keys screen, with a notice to save the new key because it won’t be shown again, one key named Codex Agent, and the Create API Key button.

Step 2: Install the hawk and hawkop CLIs

One Homebrew line covers both tools on macOS or Linux, followed by an init for each:

brew trust stackhawk/cli && brew tap stackhawk/cli && brew install hawk hawkop
hawk init
hawkop init

hawk init asks for the API key from Step 1, checks that it’s valid, and stores it at $HOME/.hawk/hawk.properties.

Screenshot: a terminal prompting for a StackHawk API key, including a URL for obtaining one, followed by a successful authentication message.

hawkop init reads the stored key on its own; the only thing it wants from you is a default organization.

For Windows users: the StackHawk downloads page has MSI installers for both CLIs, with Java baked into the hawk one. Note that OpenAI calls Codex CLI’s Windows support experimental and recommends running Codex inside WSL.

Three command-line tools are now in play, so keep the roles straight: codex is the agent, hawk runs scans, and hawkop operates on the results.

Step 3: Install the StackHawk agent skill in Codex

From your shell, run three commands to install the StackHawk skills in Codex:

codex plugin marketplace add stackhawk/agent-skills
codex plugin add hawkscan@stackhawk
codex plugin add stackhawk-api@stackhawk

The first command registers StackHawk’s marketplace; the second and third pull in the HawkScan and StackHawk API skills.

Screenshot: a dark-themed terminal showing the commands for adding the StackHawk marketplace and installing plugins, with success messages for stackhawk/agent-skills and stackhawk/hawkscan.

Step 4: Verify the skill is active

Ask Codex directly:

What StackHawk skills do you have?

A response describing the HawkScan skill means the install landed.

Screenshot: terminal output listing installed plugins, notably the hawkscan plugin with its related files and listed capabilities such as SAST and SCA, alongside some plugins marked as not installed.

Step 5: Ask Codex to scan your app

With your app up, give Codex a scan prompt, swapping the port for the one your app uses:

Scan my app running on localhost:8080 for security vulnerabilities

Codex starts by checking that the app is reachable, and it prompts you to start the app if it isn’t. Then it configures HawkScan, which, in practice, means writing a stackhawk.yml file. The file needs exactly three fields: app.applicationId, app.env, and app.host. The first of those comes from outside your repository: it points to an application record in the StackHawk platform, not to anything in your source tree. Setting up that record is part of the configuration work that the skill walks Codex through.
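
To make the shape of that file concrete, here is a minimal Python sketch that renders a stackhawk.yml containing only the three required fields. Treat it as an illustration rather than what the skill actually generates: the render_stackhawk_yml helper is invented, the all-zero application ID is a placeholder for the value from your StackHawk application record, the env name Development is an example, and the check that host carries an http:// or https:// scheme is an assumption about how the base URL is written.

```python
STACKHAWK_YML = """\
app:
  applicationId: {application_id}
  env: {env}
  host: {host}
"""


def render_stackhawk_yml(application_id, env, host):
    """Fill in the three required fields: app.applicationId, app.env, app.host."""
    # Assumption: host is written as a base URL that includes the scheme.
    if not host.startswith(("http://", "https://")):
        raise ValueError("host should include the scheme, e.g. http://localhost:8080")
    return STACKHAWK_YML.format(application_id=application_id, env=env, host=host)


if __name__ == "__main__":
    # Placeholder ID: the real value comes from the application record in StackHawk.
    print(render_stackhawk_yml(
        application_id="00000000-0000-0000-0000-000000000000",
        env="Development",
        host="http://localhost:8080",
    ))
```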

Screenshot: a Visual Studio Code window showing a YAML file with environment variables, and a terminal below with output from the sam build command and vulnerability scan suggestions.

When the scan finishes, the results are printed to the terminal. The exact format is up to Codex, but it usually leads with a count of findings by severity, followed by the details for each one: risk, confidence, and the affected paths and methods.
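
If you want the same severity-first summary outside the agent, the grouping is simple to reproduce. The sketch below is a hedged Python illustration: the record layout (risk, confidence, path, method, title) mirrors the fields named above but is not HawkScan’s actual output schema, and the sample findings are made up.

```python
from collections import Counter

SEVERITY_ORDER = ["High", "Medium", "Low"]


def count_by_risk(findings):
    """Count findings per risk level, listed from most to least severe."""
    counts = Counter(f["risk"] for f in findings)
    return [(level, counts[level]) for level in SEVERITY_ORDER if counts[level]]


# Made-up sample records, shaped after the fields the summary mentions.
sample_findings = [
    {"risk": "High", "confidence": "Medium", "path": "/search", "method": "GET",
     "title": "Cross Site Scripting"},
    {"risk": "Medium", "confidence": "High", "path": "/", "method": "GET",
     "title": "CSP Header Not Set"},
    {"risk": "Medium", "confidence": "Medium", "path": "/api/items", "method": "POST",
     "title": "Cross-Domain Misconfiguration"},
    {"risk": "Low", "confidence": "Medium", "path": "/", "method": "GET",
     "title": "X-Content-Type-Options Header Missing"},
]

if __name__ == "__main__":
    for level, count in count_by_risk(sample_findings):
        print(f"{level}: {count}")
```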

Screenshot: terminal scan results showing 7 findings (2 high-risk XSS, 4 medium, 1 low) that mention headers, CSRF, CORS, and server input, followed by the files Codex explored and an update note on HTTP server security.

The same results also land in the StackHawk platform.

Screenshot: the StackHawk dashboard for an app called react-js-app, showing 8 findings (1 high, 6 medium, 1 low), a table of security issues with their criticality and status, and optimization tips and scan info on the right.

Step 6: Let Codex fix the findings and verify

As the terminal scan result screenshot above shows, Codex sometimes starts fixing findings on its own. Other times it stops at the findings list, and you’ll need to prompt it to fix them, like so:

Fix all of these security findings

Codex reads the code around each finding and fixes it the way your codebase would expect. Think parameterized queries where SQL was built by string concatenation, output encoding where user input came back untouched, and security headers where none existed.
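
For a sense of what those three kinds of fix look like in code, here is a small, self-contained Python sketch. It is illustrative only and not the output Codex produces: the users table, the find_user and render_greeting helpers, and the specific header values are all assumptions, and the right headers depend on your app.

```python
import html
import sqlite3


def find_user(conn, username):
    """Look up a user with a parameterized query instead of string concatenation."""
    # The driver binds `username` as data, so it is never parsed as SQL.
    cur = conn.execute("SELECT id, name FROM users WHERE name = ?", (username,))
    return cur.fetchall()


def render_greeting(name):
    """Escape user input before placing it into HTML (output encoding)."""
    return "<p>Hello, {}</p>".format(html.escape(name))


# An example baseline of security headers; tune the values for your app.
SECURITY_HEADERS = {
    "Content-Security-Policy": "default-src 'self'",
    "X-Content-Type-Options": "nosniff",
    "X-Frame-Options": "DENY",
}

# In-memory database with one made-up row, so the sketch runs on its own.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO users (name) VALUES ('alice')")

if __name__ == "__main__":
    print(find_user(conn, "alice"))
    print(find_user(conn, "alice' OR '1'='1"))  # injection attempt matches no rows
    print(render_greeting("<script>alert(1)</script>"))
```

With concatenation, the second lookup would have returned every row; with the bound parameter, the whole string is compared as a literal name.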

Screenshot: a code editor showing a Python script that sets a secret key, configures a server port, and specifies security headers before serving files, with a deleted line highlighted in red.

After the fixes go in, it rescans and confirms that the findings no longer reproduce.
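
Conceptually, that verification is a comparison between two scans. The Python sketch below shows one way to express it, under stated assumptions: findings are identified by a (title, path, method) tuple, which is an invented key rather than how StackHawk tracks findings, and both scan lists are made-up samples.

```python
def finding_key(finding):
    """Identify a finding by what was flagged and where."""
    return (finding["title"], finding["path"], finding["method"])


def still_open(before, after):
    """Return keys of findings from the first scan that reappear in the rescan."""
    after_keys = {finding_key(f) for f in after}
    return sorted({finding_key(f) for f in before} & after_keys)


# Made-up sample data: two findings before the fix, one left after the rescan.
first_scan = [
    {"title": "Cross Site Scripting", "path": "/search", "method": "GET"},
    {"title": "CSP Header Not Set", "path": "/", "method": "GET"},
]
rescan = [
    {"title": "CSP Header Not Set", "path": "/", "method": "GET"},
]

if __name__ == "__main__":
    remaining = still_open(first_scan, rescan)
    print("Still reproducing:", remaining or "nothing")
```

An empty result from still_open is the "clean rescan" condition: nothing flagged in the first scan reproduces in the second.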

Screenshot: terminal rescan results showing the fixed vulnerabilities (XSS, CORS, CSP, CSRF), zero actionable issues, a link to the StackHawk rescan, a note that no commit was created, and a scan duration of 9m 52s.

Reviewing and Triaging Findings in the StackHawk Platform

One of the best parts of using StackHawk skills with a coding agent is that the skill can automatically review and triage findings. The agent will then decide whether something should be fixed and add a note. All of this happens without any intervention (as shown above).

If you still want to review scans and triage manually, that’s also possible in the StackHawk console in the browser.

Screenshot: the StackHawk console showing scan results for “react-xss-app”, with 33 findings, 10 open issues, and a table of vulnerabilities with severity, path counts, and assignment status.

In the console, unprocessed findings are marked New, and the Finding Details page provides each one with three triage paths: Assigned, Risk Accepted, or False Positive. Whichever you pick, the platform asks for a comment, which is how a triage decision survives team turnover.

Screenshot: the details page for the CSP: Wildcard Directive vulnerability in dark mode, including an overview, remediation steps, and a table with actions and HTTP method statuses.

When a finding looks questionable, the Validate action generates a ready-to-run curl command that reproduces the attack, including the correct verb, headers, and data. Fire it at your local app and trace exactly what the scanner saw.

In most cases, though, you can rely on the agent to handle this whole workflow without intervention. Below is a screenshot showing Codex automatically triaging an issue.

Screenshot: a terminal showing the commands and output for fetching and remediating alerts, including GET and POST requests, JSON-formatted data, and alert IDs highlighted in blue.

Wrap-Up

Codex already runs your code all day; the skill just widens its job description. Now the same agent attacks the running app, patches what gives way, and hands you a clean rescan as proof. That’s what secure code with Codex looks like when none of it requires a separate workflow. Start a free StackHawk trial and run the loop against your own app today. For the long-form version of this setup, the Agentic StackHawk Setup Guide has all the steps in one place.