uSMART/enhanced-oil-spill-data


Enhanced Oil Spill Dataset

Description

This project presents an enhanced version of the IncidentNews oil spill database maintained by NOAA's Office of Response and Restoration (ORR). As of December 7, 2023, the IncidentNews platform recorded 4,473 oil and chemical release incidents. Each incident entry comprises two categories of data: incident-level and post-level.

The incident-level data includes attributes such as location, date, cause, potential maximum release amount, and incident description. All incident-level data can be downloaded as a CSV file from the Raw Incident Data page on the IncidentNews platform. The post-level data provides a series of textual updates following the incident. Post-level data can be accessed separately for each incident via its homepage. For example, the post data for incident #1275 can be found on that incident's IncidentNews page.

The primary limitation of the original dataset is that it lacks actual release amounts (RA): it provides only potential maximum estimates, which may not reflect what was actually released. Our enhanced database addresses this gap by adding structured data on the actual RA, extracted from incident descriptions and related posts using Natural Language Processing (NLP) tools.

Key Features

The enhanced dataset includes 3,550 oil spill incidents from 1967 to 2023. For each incident, we added three new columns:

  • actual RA (gals): The actual oil spill amount identified from incident texts.
  • RA source: Whether the actual RA was extracted from the description, posts, or both.
  • update label: The relationship between the actual RA and the original potential maximum release value. The labels include:
    • "RA confirmed": The actual RA is identical to the potential maximum RA.
    • "RA updated": The actual RA differs from the potential maximum RA.
    • "RA newly acquired": The potential maximum RA is unavailable, but text information provides an actual RA.
    • "No information better than potential maximum RA": The potential maximum RA is available, but no actual RA was identified from the description or posts to confirm or update it.
    • "RA still unavailable": Both actual RA and potential maximum release amount information are absent for the incident.
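
The five labels above form a decision rule over two values: the actual RA and the potential maximum RA. A minimal sketch of that rule follows. It is an illustration, not the authors' code: the function name `update_label` is hypothetical, missing values are assumed to be represented as `None`, and "identical" is assumed to mean exact numeric equality in gallons.

```python
from typing import Optional


def update_label(actual_ra: Optional[float],
                 potential_max_ra: Optional[float]) -> str:
    """Return the update label for one incident.

    Both amounts are in gallons. None means the value is unavailable
    (an assumption about how missing data is represented).
    """
    if actual_ra is None and potential_max_ra is None:
        return "RA still unavailable"
    if actual_ra is None:
        # Only the potential maximum is known; nothing confirms or updates it.
        return "No information better than potential maximum RA"
    if potential_max_ra is None:
        # The text supplies an amount the original record lacked.
        return "RA newly acquired"
    if actual_ra == potential_max_ra:
        return "RA confirmed"
    return "RA updated"
```

The checks for missing values come first so that the equality comparison is only reached when both amounts are present.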

Usage

This dataset can be used for environmental research, risk assessment, and policy-making to better understand oil spill impacts. Analyze the data using your preferred data analysis tools.
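
As a starting point, the sketch below counts incidents per update label using only the Python standard library. It assumes the enhanced dataset is available as a CSV file whose headers match the column names listed under Key Features; the file format, the helper name `count_update_labels`, and the sample rows are all assumptions made for illustration, and the rows are placeholders rather than real records.

```python
import csv
import io
from collections import Counter


def count_update_labels(csv_file) -> Counter:
    """Count incidents per 'update label' value in an open CSV file."""
    reader = csv.DictReader(csv_file)
    return Counter(row["update label"] for row in reader)


# Placeholder rows for illustration only; these are not real records.
sample = io.StringIO(
    "actual RA (gals),RA source,update label\n"
    "100,description,RA confirmed\n"
    "250,posts,RA updated\n"
    ",,RA still unavailable\n"
)
label_counts = count_update_labels(sample)
```

To run this on the real data, pass a file opened with `open(path, newline="", encoding="utf-8")` in place of `sample`, where `path` points to your local copy of the dataset.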

Credits

This database was enhanced by Yiming Liu under the supervision of Hua Cai, with special acknowledgments to NOAA's Office of Response and Restoration (ORR) for their assistance in clarifying questions related to the original dataset.

Contact Information

For further inquiries, feel free to contact us via email at liu3285@purdue.edu or huacai@purdue.edu.
