COMS4995: AI for Software Security

This is a project-first course about building AI-assisted software security auditors that can work on real codebases. The course is centered on AuditZoo, an agent framework with built-in program abstractions such as control-flow and call graphs. See the Course GitHub org for the shared repos and updates. Instead of one-off class projects that disappear at the end of the semester, we will collaborate around a shared infrastructure so that work can accumulate across student cohorts and, if students want, be merged into a public open-source version.

Software security is at a turning point. AI can help with reasoning over code, triage, explanation, and workflow automation, but it also fails in systematic ways. The most promising direction is combining AI with strong program representations and measurable evaluation, so that AI reasons from evidence instead of guessing. AuditZoo is actively developed, and I am building it together with students in this course.

Course at a glance

  • Meeting time: Tue/Thu 5:40-6:55 PM
  • Location: 601B Sherman Fairchild Life Sciences Building
  • Zhuo's Office: CSB 457
  • Course Assistant: Sungjun Lee

Why this course

  • Modern systems are too large and fast-moving for purely manual auditing.
  • Traditional static and dynamic analysis can be powerful, but often hits hard limits in precision, scalability, and engineering cost.
  • AI can help, but it can also hallucinate, lose grounding, or generalize poorly.
  • The most promising direction is combining AI with strong program representations and measurable evaluation.

This course is designed to teach two things simultaneously:

  1. AI for software security (what works, what does not, and how to make it work better)
  2. Real-world engineering collaboration (how teams build tools together in a shared repo)

Components

  • Early instructor-led lectures (first two weeks): Shared foundations on traditional security analysis challenges, AI challenges, and how to combine them.
  • AuditZoo architecture session (end of week 2): A guided tour so everyone builds on a common substrate.
  • Student paper presentations (starting week 3): Each presentation is 20 minutes plus 10 minutes Q&A.
  • Semester-long project (teams of 1-3): Build an auditor (recommended) or extend AuditZoo itself.
  • AuditZoo updates + Q&A (weekly): We track infrastructure progress and unblock contributors.
  • Industry talks (up to 4 total): Practitioners share how AI4Sec works in production and where the hard problems are.

Guest talks may shift; three sessions are reserved as "Guest talk / flexible slot" dates.

Prerequisites

  • Python programming.
  • Basic understanding of program analysis: control flow graph, data flow graph, taint analysis, and related concepts.
  • Git and GitHub.

We use GitHub as the system of record for coordination, collaboration, and communication. Please read the GitHub guide and check the private repo updates and discussions.

In short: Issues for tracking, Pull Requests for integration, Discussions for Q&A and monthly updates.

What you will learn

  • Strengths and limitations of static, dynamic, and symbolic analysis, and where they break down in practice.
  • How to design AI-assisted auditors that use program structure to ground decisions in evidence.
  • How to turn a research idea into an implementable approach with scope, threat model, and failure modes.
  • How to evaluate a security tool rigorously with metrics, test cases, and honest limitations.
  • How to work like an engineering team in a shared repository with PRs and integration discipline.
  • How to communicate technical work through paper talks, proposals, progress updates, and final presentations.

All projects live in a shared private AuditZoo repository during the semester. Teams choose one of two tracks.

Track A (strongly recommended): Auditor projects

Build an AI auditor agent specialized for one vulnerability class or defect pattern.

Examples:

  • Find SQL injection in a Python web app backed by PostgreSQL or MySQL.
  • Find authorization bypass or insecure direct object reference in a React-based admin dashboard.
  • Find command injection in CI/CD scripts or deployment pipelines.
  • Find path traversal and unsafe file handling in a document processing service.
  • Find SSRF patterns in a cloud-integrated service (metadata or internal API access).
  • Find unsafe deserialization in Java or Kotlin microservices.
  • Find access-control or reentrancy bugs in Ethereum smart contracts.
  • Find inconsistent specification-to-code mappings in go-ethereum (geth) or other Ethereum clients.

Expectations:

  • A working auditor integrated into AuditZoo so others can run it
  • Clear output format with findings and evidence
  • An evaluation section in the final report
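
As an illustration of what a "clear output format with findings and evidence" could look like, here is a minimal Python sketch. The course materials above do not describe AuditZoo's actual schema, so every class and field name here (`Finding`, `Evidence`, `rule`, `severity`) is a hypothetical placeholder, and the file paths and line numbers are made-up sample values.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class Evidence:
    """One supporting fact for a finding (hypothetical schema, not AuditZoo's)."""
    file: str
    line: int
    note: str


@dataclass
class Finding:
    """A single reported issue; all field names are illustrative assumptions."""
    rule: str
    severity: str
    message: str
    evidence: list[Evidence] = field(default_factory=list)

    def to_json(self) -> str:
        # asdict recurses into nested dataclasses, so evidence is serialized too.
        return json.dumps(asdict(self), indent=2)


# Sample values below are invented purely to show the shape of the output.
finding = Finding(
    rule="sql-injection",
    severity="high",
    message="User input reaches a SQL query without parameterization.",
    evidence=[
        Evidence("app/views.py", 42, "source: request parameter 'id'"),
        Evidence("app/db.py", 17, "sink: cursor.execute(query)"),
    ],
)
print(finding.to_json())
```

Pairing each finding with an explicit source-to-sink evidence list makes reports easier to triage and gives the evaluation section something concrete to score against.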

You are also welcome to re-implement a published paper or existing method and integrate it into AuditZoo with a clean evaluation.

Track B: AuditZoo infrastructure projects

Extend the framework itself.

Examples:

  • Add a CodeQL backend or strengthen existing integrations
  • Add tree-sitter-based parsing to support more languages (e.g., Ada)
  • Extend program abstraction layers with new graph queries or IR adapters
  • Improve scalability and automation (documentation and unit tests) that enable Track A auditors

Expectations:

  • A working infrastructure feature integrated into AuditZoo
  • A small demonstration auditor or example showing why the feature matters
  • An evaluation of what capability it enables and what constraints remain
  • More frequent PR merges to keep in sync with main (Track B touches core infrastructure)

Students are welcome to open issues on the current framework in the corresponding private repo. We will maintain a small set of issue templates to keep triage fast:

  • Feature Request
  • Bug Report
  • Integration or Build Help

To motivate real impact, the course includes a bug bounty program and an all-time leaderboard. Each unique vulnerability that is confirmed by the project developer or maintainer earns +1 extra course point.

  • The target must be a well-known project or a repository with at least 1000 GitHub stars.
  • Points are shared across team members.
  • If multiple groups report the same bug independently, the point is split evenly.
  • No cap on extra points; we will maintain a course leaderboard.
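
The point rules above can be sketched as a short Python function. This is an illustrative reading, not the official scoring script: the input format and the function name `bounty_points` are assumptions, and the sketch computes points per team only (how a team's points are shared among its members is left to the rule above).

```python
from collections import defaultdict
from fractions import Fraction


def bounty_points(confirmed: dict[str, list[str]]) -> dict[str, Fraction]:
    """Return extra points per team.

    `confirmed` maps a unique, maintainer-confirmed vulnerability ID to the
    teams that independently reported it (hypothetical input format).
    """
    points: dict[str, Fraction] = defaultdict(Fraction)
    for teams in confirmed.values():
        unique_teams = set(teams)
        if not unique_teams:
            continue
        # Each unique bug is worth +1, split evenly among reporting teams.
        share = Fraction(1, len(unique_teams))
        for team in unique_teams:
            points[team] += share
    return dict(points)


# Invented sample data: three bugs, one of them reported by all three teams.
example = {
    "bug-1": ["team-a"],
    "bug-2": ["team-a", "team-b"],
    "bug-3": ["team-a", "team-b", "team-c"],
}
totals = bounty_points(example)
for team in sorted(totals):
    print(team, totals[team])
```

Using `Fraction` keeps split points exact (for example 1/3) instead of accumulating floating-point rounding on the leaderboard.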

Because extra points are uncapped, a student who reaches 100 maintainer-confirmed vulnerabilities may skip all presentation and proposal requirements. Students should follow each project's security policy and responsible disclosure norms.

See the bug bounty leaderboard for current standings.

Paper presentations

  • 20 minutes presentation plus 10 minutes Q&A.
  • Students choose papers within scope and sign up for dates in GitHub Discussions (first come, first confirmed, first in).
  • We will also provide a list of papers that do not require instructor confirmation.

Project presentations

  • Midterm progress presentation: 5-10 minutes.
  • Final presentation: 20-30 minutes, demo encouraged.

Anonymous rating (1-10)

For every presentation, the audience submits an anonymous 1-10 rating with optional written feedback. These ratings provide structured feedback and contribute to presentation scoring in a controlled way, with normalization to reduce popularity bias.

Grading

  • Attendance: 5% (light-touch; at most one attendance check if needed).
  • Paper presentation (individual): 15%.
  • Project proposal (team): 10% (1-2 pages, IEEE S&P format; see IEEE S&P author guidelines).
  • Midterm progress presentation (team): 10%.
  • Final project presentation (team): 20%.
  • Final report + evaluation (team): 40%.
  • Bug bounty: uncapped extra points (maintainer-confirmed only).

Late policy

  • Each team has 2 guaranteed late days for written deliverables.
  • Late days do not apply to scheduled presentations.
  • Beyond that, we will be flexible as long as it does not disrupt scheduling and coordination.

All deadlines are 11:59 PM unless noted.

Date | Item | Type
Tue Jan 20 | Classes begin | Academic date
Thu Jan 29 (end of class) | Paper sign-up deadline | Deadline
Fri Jan 30 | Last day to add Spring courses (end of Change of Program) | Academic date
Thu Feb 5 (end of class) | Team formation deadline (1-3 students) | Deadline
Thu Feb 19, 11:59 PM | Project proposal due (PDF + GitHub Discussion) | Deadline
Tue Feb 24 | Last day to drop courses via SSOL | Academic date
Fri Feb 27, 11:59 PM | Monthly project update (GitHub Discussion) | Deadline
Mon Mar 9 | Midterm date (university) | Academic date
Mar 16-20 | Spring recess (no classes) | Academic date
Tue Mar 31, 11:59 PM | Monthly project update (GitHub Discussion) | Deadline
Tue Apr 14, 11:59 PM | Monthly project update (GitHub Discussion) | Deadline
Mon Apr 27 | Last day to withdraw with W | Academic date
Mon May 4 | Last day of classes | Academic date
Mon May 4, 11:59 PM | Final report + final submission; bug bounty leaderboard cutoff | Deadline
May 8-15 | Final exams window | Academic date

Guest talks may shift; flexible slots are used for paper presentations or project Q&A.

Tags: Lecture, Paper, Project, Guest, Flexible, Q&A, Deadline (DDL)

# | Date | Focus | Tags / notes
1 | Tue Jan 20 | Lecture: security analysis challenges + course overview | Lecture
2 | Thu Jan 22 | Guest talk by Hari Mulackal | Guest
3 | Tue Jan 27 | Lecture: AI for software security - opportunities and limitations | Lecture
4 | Thu Jan 29 | Lecture: AI for software security - opportunities and limitations (cont) | Lecture, Q&A, DDL: paper sign-up
5 | Tue Feb 3 | Student paper presentations (2) | Paper
6 | Thu Feb 5 | Student paper presentation (1) + AuditZoo update/Q&A | Paper, Q&A, DDL: team formation
7 | Tue Feb 10 | Student paper presentations (2) | Paper
8 | Thu Feb 12 | Guest talk / flexible slot or paper + AuditZoo update/Q&A | Guest, Flexible, Paper, Q&A
9 | Tue Feb 17 | Student paper presentations (2) | Paper
10 | Thu Feb 19 | Student paper presentation (1) + AuditZoo update/Q&A | Paper, Q&A, DDL: proposal due
11 | Tue Feb 24 | Student paper presentations (2) | Paper
12 | Thu Feb 26 | Student paper presentation (1) + AuditZoo update/Q&A | Paper, Q&A
13 | Tue Mar 3 | Student paper presentations (2) | Paper
14 | Thu Mar 5 | Student paper presentation (1) + AuditZoo update/Q&A | Paper, Q&A
15 | Tue Mar 10 | Midterm project progress presentations (part 1) | Project
16 | Thu Mar 12 | Midterm project progress presentations (part 2) | Project
17 | Tue Mar 24 | Student paper presentations (2) or midterm overflow | Paper
18 | Thu Mar 26 | Student paper presentation (1) + AuditZoo update/Q&A | Paper, Q&A
19 | Tue Mar 31 | Student paper presentations (2): Samarth Kumbla (@Samarth2709), paper not chosen yet; TBD | Paper, DDL: monthly update
20 | Thu Apr 2 | Student paper presentation (1) + AuditZoo update/Q&A | Paper, Q&A
21 | Tue Apr 7 | Student paper presentations (2) or project Q&A | Paper, Q&A
22 | Thu Apr 9 | Guest talk / flexible slot or project Q&A + AuditZoo update | Guest, Flexible, Q&A
23 | Tue Apr 14 | Project Q&A / bug bounty Q&A / make-up paper presentations | Q&A, DDL: monthly update
24 | Thu Apr 16 | AuditZoo update/Q&A + project Q&A | Q&A
25 | Tue Apr 21 | Final project presentations (3) | Project
26 | Thu Apr 23 | Final project presentations (3) | Project
27 | Tue Apr 28 | Final project presentations (3) | Project
28 | Thu Apr 30 | Final project presentations (3) + closing notes | Project