PNNL

PermitAI

Data Lakehouse

PermitAI is developing an enriched, large-scale database that provides seamless access to thousands of historical environmental reviews and permitting documents. Data collection initially focused on NEPA documents, including categorical exclusions, environmental assessments, environmental impact statements, and other associated documents. This ongoing effort to create a centralized and standardized repository of NEPA documents from across the federal government is known as the NEPA Text Corpus (NEPATEC).

From its inception, the NEPATEC database has been designed to be extensible to other permitting review documents, which now include adjudication records and other interconnected permitting artifacts. As PermitAI expands, the data lakehouse is designed to integrate information across regulatory processes and decision stages, while beginning to incorporate select state-level permitting data. PermitAI is starting with geothermal energy and critical mineral permitting use cases in Alaska and Nevada, enabling a more complete, end-to-end view of the permitting lifecycle and supporting more transparent, efficient, and data-driven decision-making.

NEPATEC

Released to the public in June 2024, NEPA Text Corpus (NEPATEC) 1.0 is the first iteration of PermitAI’s expansive database of federal agency NEPA documents, consisting of more than 28,000 documents from nearly 3,000 projects across more than 100 agencies. The public release of NEPATEC 2.0 delivered an expanded corpus of NEPA documents, consisting of more than 120,000 documents from 60,000 projects prepared by more than 60 different agencies. By using large language models to extract metadata aligned with the Council on Environmental Quality (CEQ)’s NEPA and Permitting Data and Technology Standard, NEPATEC 2.0 promotes consistency in environmental reviews and supports the ongoing effort to modernize permitting technologies by facilitating more transparent, efficient, and data-driven decision-making. Targeted for a Q3 2026 release, NEPATEC 3.0 will add an estimated 60,000 new documents to the database, along with Geographic Information System (GIS) elements and refined textual metadata.

Access NEPATEC Database

Beyond NEPA: PermitTEC v0.1

PermitTEC v0.1 expands PermitAI’s data lakehouse to include the legal side of permitting, not just environmental review documents. This public dataset provides clean, standardized, machine‑readable metadata on federal court cases involving NEPA and related environmental laws. It complements NEPATEC by showing how environmental review decisions are challenged and resolved in court, linking cases to projects when possible. By combining litigation data with review documents, PermitTEC gives a more comprehensive picture of the permitting process and helps reveal how project features, environmental factors, and procedural choices relate to legal outcomes. Built to match CEQ data standards, it supports more transparent and data‑driven decision‑making.
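As a rough illustration of what "linking cases to projects when possible" could look like in practice, the sketch below joins litigation records to project records on a shared identifier and leaves unmatched cases unlinked. The field names (`case_id`, `project_id`, `title`) and the function `link_cases_to_projects` are hypothetical, chosen for illustration only; they are not taken from the PermitTEC or NEPATEC schemas.

```python
from typing import Any, Dict, List


def link_cases_to_projects(
    cases: List[Dict[str, Any]],
    projects: List[Dict[str, Any]],
) -> List[Dict[str, Any]]:
    """Attach a project title to each case when a matching project exists.

    All field names here are hypothetical, not the real dataset schema.
    """
    # Index projects by identifier for constant-time lookup.
    projects_by_id = {p["project_id"]: p for p in projects}

    linked = []
    for case in cases:
        # A case may have no project_id at all, or one with no match.
        project = projects_by_id.get(case.get("project_id"))
        linked.append(
            {**case, "project_title": project["title"] if project else None}
        )
    return linked


# Made-up example records.
projects = [{"project_id": "P-1", "title": "Example Geothermal Project"}]
cases = [
    {"case_id": "C-1", "project_id": "P-1"},
    {"case_id": "C-2"},  # no known project link
]
linked_cases = link_cases_to_projects(cases, projects)
```

Keeping unlinked cases in the output, with an explicit `None`, preserves the full litigation record while making it clear which cases could not be tied to a project.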

Access PermitTEC v0.1 Database

Data Innovations

PermitAI uses advanced data engineering and data science to build the data lakehouse that serves as the foundation for NEPATEC and several PermitAI tools.

The team works closely with federal agencies and CEQ to gather NEPA and permitting documents, including automatically pulling environmental impact statements from the EPA’s CDX API. All documents are stored in a secure cloud system designed for fast search and long‑term preservation.

PermitAI automates many routine data tasks to improve efficiency and protect data quality. Beyond storing documents, it is also developing standardized, enriched metadata using large language models. This metadata captures key details about projects, processes, documents, public comments, and geographic information, creating a stronger foundation for analysis.
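To make the idea of standardized metadata more concrete, here is a minimal sketch of a document-level record with a simple validation step. It is an assumption-laden illustration: the class `DocumentRecord`, its field names, and the short document-type codes are hypothetical and do not reproduce the CEQ standard or PermitAI's actual schema. The document types themselves mirror the categories named earlier on this page.

```python
from dataclasses import dataclass
from typing import List, Optional

# Illustrative codes for the document types named in the text:
# categorical exclusion, environmental assessment,
# environmental impact statement, and other associated documents.
DOCUMENT_TYPES = {"CE", "EA", "EIS", "OTHER"}


@dataclass
class DocumentRecord:
    """Hypothetical metadata record; not the CEQ standard's field names."""

    document_id: str
    project_id: str
    agency: str
    document_type: str
    title: str
    state: Optional[str] = None  # coarse geographic information
    comment_count: int = 0       # number of public comments, if known


def validate(record: DocumentRecord) -> List[str]:
    """Return a list of human-readable problems; empty means valid."""
    problems = []
    if record.document_type not in DOCUMENT_TYPES:
        problems.append(f"unknown document_type: {record.document_type!r}")
    if not record.project_id:
        problems.append("missing project_id")
    if record.comment_count < 0:
        problems.append("comment_count must be non-negative")
    return problems


# Made-up example records.
good = DocumentRecord("D-1", "P-1", "Example Agency", "EIS", "Example EIS", state="NV")
bad = DocumentRecord("D-2", "", "Example Agency", "XYZ", "Untitled", comment_count=-1)
good_problems = validate(good)
bad_problems = validate(bad)
```

Returning a list of problems rather than raising on the first one suits batch pipelines, where automatically extracted records are typically reviewed and corrected in bulk.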

PermitAI Data Lakehouse Team

  • Dan Nally (Data Thrust Lead)
  • Tim Vega (Data Engineering Lead)
  • Kaustav Bhattacharjee (Data Science Lead)
  • Anurag Acharya (Data Thrust PM)
  • Jim Jackson (Domain Lead)
  • Beau Morton (Data Engineering Lead)
  • Rounak Meyur (Data Science Lead)
  • Aaron Moreno
  • Alex Buchko
  • James Bandy
  • Kathy Nwe
  • Nathan Butschli
  • Matthew Raffel
  • Sai Koneru
  • Siddhartha Das
  • Sridevi Wagle
  • Ben Chauhan
  • Heng Wan
  • Johnny Chen
  • Micah Taylor
  • Rizwan Ashraf
  • Paul Rigor
  • Greg Wint
  • Joshua Wassing
  • Thomas Serrano
  • William Zhang
  • Michael Kieburtz
  • Brian Chen
  • Cole Man
  • Meenu Mohankumar
  • Ellyn Ayton
  • Renuka Chintalapati
  • Milan Jain
  • Julianna Puccio
  • Sam Donald

Pacific Northwest National Laboratory (PNNL) is managed and operated by Battelle for the Department of Energy