PNNL

PermitAI

Models and Benchmarks Thrust

PermitAI Models and Benchmarks

The Models and Benchmarks team at PermitAI develops and rigorously evaluates AI systems for environmental review and permitting, with a particular focus on National Environmental Policy Act (NEPA) workflows. Systematic evaluation of off-the-shelf large language models revealed key limitations in applying general-purpose models to regulatory domains such as NEPA. In response, the team uses the NEPATEC data lakehouse to develop domain-adapted models and, critically, a growing suite of benchmarks that measure model performance on real-world permitting tasks.

Custom Model Development

The team's approach prioritizes smaller language models, ranging from 1 to 7 billion parameters, which balance task performance against resource consumption and keep inference costs and energy usage low. These models are tailored for tasks including comment processing evaluation, GIS analytics, and NEPA document drafting.
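
As a rough illustration of why the 1 to 7 billion parameter range keeps inference inexpensive, the sketch below estimates the memory needed just to hold model weights at common numeric precisions. The function name and the precision table are illustrative assumptions and are not part of PermitAI's tooling; activation memory and the key-value cache are ignored, so real requirements will be higher.

```python
# Bytes needed to store one parameter at common precisions. These are general
# facts about numeric formats, not PermitAI-specific settings.
BYTES_PER_PARAM = {"fp32": 4.0, "fp16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_memory_gb(num_params: float, precision: str = "fp16") -> float:
    """Approximate memory (GB, with 1 GB = 1e9 bytes) to hold the weights only."""
    return num_params * BYTES_PER_PARAM[precision] / 1e9


if __name__ == "__main__":
    # Compare the two ends of the 1B to 7B range mentioned in the text.
    for billions in (1, 7):
        for precision in ("fp16", "int4"):
            gb = weight_memory_gb(billions * 1e9, precision)
            print(f"{billions}B params @ {precision}: ~{gb:.1f} GB")
```

Under these assumptions, a 7-billion-parameter model needs roughly 14 GB for half-precision weights, which is within reach of a single accelerator.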

Benchmarking and Evaluation

A core initiative of this thrust area is the creation of NEPABench, a comprehensive set of benchmarks that assess AI model performance on real-world NEPA tasks.

NEPABench

NEPABench is PermitAI’s comprehensive benchmark suite for environmental permitting. Rather than focusing on a single task, NEPABench evaluates AI systems across the full permitting lifecycle, including question answering, document drafting, information extraction, and public comment processing.

The suite integrates a range of task-specific benchmarks, including:

  • NEPAQuAD (question answering and regulatory reasoning)
  • DraftNEPABench (document drafting)
  • EIS-Bench (metadata extraction from EIS documents)
  • EA-Bench (metadata extraction from EA documents)
  • Tribe-Bench (tribal entity identification and consultation analysis)
  • FedReg-Bench (structured extraction from Federal Register notices)
  • Comment-Bench (public comment delineation, categorization, and summarization) 

By unifying these capabilities into a single framework, NEPABench enables more realistic, end-to-end evaluation of AI systems operating in regulatory environments. The suite currently encompasses more than 10,000 evaluation instances across diverse task types and document sources.
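
One way to picture the suite is as a small registry that maps each benchmark to the task it covers. The sketch below is a hypothetical data structure for illustration only: the benchmark names and task descriptions come from the list above, while the `Benchmark` class, the `NEPABENCH_SUITE` list, and the `find_benchmarks` helper are assumptions and do not reflect how NEPABench is actually packaged.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Benchmark:
    name: str
    task: str


# Names and task descriptions are copied from the list above; the structure
# itself is a hypothetical illustration.
NEPABENCH_SUITE = [
    Benchmark("NEPAQuAD", "question answering and regulatory reasoning"),
    Benchmark("DraftNEPABench", "document drafting"),
    Benchmark("EIS-Bench", "metadata extraction from EIS documents"),
    Benchmark("EA-Bench", "metadata extraction from EA documents"),
    Benchmark("Tribe-Bench", "tribal entity identification and consultation analysis"),
    Benchmark("FedReg-Bench", "structured extraction from Federal Register notices"),
    Benchmark("Comment-Bench", "public comment delineation, categorization, and summarization"),
]


def find_benchmarks(keyword: str) -> list[str]:
    """Return the names of benchmarks whose task description mentions keyword."""
    keyword = keyword.lower()
    return [b.name for b in NEPABENCH_SUITE if keyword in b.task.lower()]


if __name__ == "__main__":
    print(find_benchmarks("extraction"))
```

Calling `find_benchmarks("extraction")` selects the three extraction-oriented benchmarks (EIS-Bench, EA-Bench, and FedReg-Bench).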

Access NEPABench Data

DraftNEPABench

DraftNEPABench evaluates the ability of large language models and agent-based systems to draft sections of Environmental Impact Statements (EIS). The benchmark consists of expert-curated drafting tasks derived from real-world NEPA documents, requiring models to synthesize information from multiple technical, regulatory, and scientific sources into coherent, structured text. Results demonstrate that agent-based approaches significantly improve drafting performance compared to standard methods, while also highlighting the continued need for human oversight in high-stakes regulatory contexts.

Access DraftNEPABench

Innovative Tools

In addition to developing benchmarks, the team has established automated and human evaluation procedures to rigorously examine the effectiveness and safety of models and applications prior to public release. They have also introduced MAPLE, a cloud API-friendly assessment pipeline designed for seamless evaluation of large language models against benchmarks like NEPABench. 

MAPLE

MAPLE (Multi-context Assessment Pipeline for Language Model Evaluation) is PermitAI’s modular evaluation framework for benchmarking AI models across NEPA tasks. Initially released as MAPLE v1.0, the framework provided a standardized pipeline for evaluating models on question answering and document retrieval tasks across multiple context settings, including no-context, document-level, retrieval-augmented, and gold-context evaluation.
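
The four context settings differ only in what supporting text accompanies the question. The sketch below shows one plausible way to assemble a prompt for each setting. It is not MAPLE's actual API: `ContextSetting`, `build_prompt`, and the prompt template are hypothetical names chosen for illustration.

```python
from enum import Enum
from typing import Optional, Sequence


class ContextSetting(Enum):
    NO_CONTEXT = "no-context"
    DOCUMENT = "document-level"
    RETRIEVAL = "retrieval-augmented"
    GOLD = "gold-context"


def build_prompt(
    question: str,
    setting: ContextSetting,
    *,
    document: Optional[str] = None,       # full source document text
    retrieved: Sequence[str] = (),        # passages returned by a retriever
    gold_passage: Optional[str] = None,   # passage known to contain the answer
) -> str:
    """Assemble a prompt whose context depends on the evaluation setting."""
    if setting is ContextSetting.NO_CONTEXT:
        context = ""
    elif setting is ContextSetting.DOCUMENT:
        context = document or ""
    elif setting is ContextSetting.RETRIEVAL:
        context = "\n\n".join(retrieved)
    else:  # ContextSetting.GOLD
        context = gold_passage or ""

    if not context:
        return f"Question: {question}\nAnswer:"
    return f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"


if __name__ == "__main__":
    q = "Which agency prepared the EIS?"
    print(build_prompt(q, ContextSetting.NO_CONTEXT))
    print(build_prompt(q, ContextSetting.GOLD, gold_passage="(gold passage text)"))
```

Holding the question fixed while varying only the context makes it possible to separate what a model knows on its own from what it can extract when given the right passage.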

Building on this foundation, MAPLE v2 expands support to a broader range of permitting tasks, including information extraction, structured data processing, and public comment analysis. It introduces task-specific evaluators and enhanced scoring modules, enabling consistent and reproducible evaluation across the full NEPABench suite. Together, these versions establish MAPLE as the core infrastructure for assessing model performance in real-world environmental permitting workflows.
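
The idea of task-specific evaluators can be sketched as a registry that maps each task type to its own scoring function. The code below is a minimal illustration under assumed names (`EVALUATORS`, `register`, `evaluate`) and assumed metrics (exact match and token-level F1); the source does not specify which metrics or interfaces MAPLE v2 actually uses.

```python
from collections import Counter
from typing import Callable, Dict

Evaluator = Callable[[str, str], float]
EVALUATORS: Dict[str, Evaluator] = {}  # hypothetical task -> scorer registry


def register(task: str) -> Callable[[Evaluator], Evaluator]:
    """Decorator that registers a scoring function for a task type."""
    def decorator(fn: Evaluator) -> Evaluator:
        EVALUATORS[task] = fn
        return fn
    return decorator


@register("question_answering")
def exact_match(prediction: str, reference: str) -> float:
    """1.0 if the normalized strings match exactly, otherwise 0.0."""
    return float(prediction.strip().lower() == reference.strip().lower())


@register("information_extraction")
def token_f1(prediction: str, reference: str) -> float:
    """Harmonic mean of token-level precision and recall."""
    pred, ref = prediction.lower().split(), reference.lower().split()
    overlap = sum((Counter(pred) & Counter(ref)).values())
    if overlap == 0:
        return 0.0
    precision, recall = overlap / len(pred), overlap / len(ref)
    return 2 * precision * recall / (precision + recall)


def evaluate(task: str, prediction: str, reference: str) -> float:
    """Dispatch to the evaluator registered for the given task."""
    return EVALUATORS[task](prediction, reference)


if __name__ == "__main__":
    print(evaluate("question_answering", " Yes ", "yes"))
    print(evaluate("information_extraction", "land management", "bureau of land management"))
```

Keeping scorers behind a common signature means a new task type only needs a new registered function, which is one way a pipeline can stay consistent across a growing benchmark suite.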

Access MAPLE v2
Access MAPLE v1

PermitAI is committed to rapid AI model prototyping so that researchers can experiment with diverse model architectures, algorithms, and preprocessing techniques, fostering innovation and efficiency in environmental review and permitting processes.

PermitAI Models and Benchmarks Team

  • Anurag Acharya (Model and Benchmark, Technical Lead)
  • Sadie Montgomery (Model and Benchmark, Domain Lead)
  • Rounak Meyur
  • Koby Hayashi
  • Bishal Lakha
  • Anusha Devulapally
  • Henry Warmerdam
     

Pacific Northwest National Laboratory (PNNL) is managed and operated by Battelle for the Department of Energy