
Cyber Events Database Enhanced by GDELT's Global News Monitoring


Cyber events are growing in scale and impact, yet reliable information on threat actors, motives, targeted industries, and consequences remains scarce, fragmented, or locked behind costly private sources. This lack of accessible data leaves both public and private organizations without the tools they need to make informed decisions about prevention and response. To fill this gap, the University of Maryland’s Center for the Governance of Technology and Systems (GoTech) previously launched the Cyber Events Database, providing structured, open-source information on cyber-attacks from 2014 to the present.

In its next iteration, GoTech has improved its cyber event identification process for the Cyber Events Database by leveraging the Global Database of Events, Language, and Tone (GDELT) Project’s Web News NGrams 3.0 and Article List datasets. GDELT’s real-time, multilingual monitoring of millions of global news articles will enable GoTech to:

  • Expand its overall sample of global cyber events
  • Identify news items across 60 languages to enhance coverage of non-English sources
  • Improve timeliness with monthly updates containing the previous month’s reported cyber events

Additionally, the new dataset will include measures of severity, when available, for both disruptive and exploitative cyber events. These measures include the magnitude, duration, and scope of a disruption, as well as the type and amount of data stolen and the number of people affected, among others.

“Our new approach will significantly enhance our ability to deliver cyber event data from more non-English speaking sources, thereby reducing bias. We also aim to speed up the delivery of new data and to expand the scope of the data we capture from these events, incorporating qualitative measures of severity to assist researchers and policy analysts in better contextualizing the threat landscape.”
Dr. Charles Harry, Director, GoTech

The Cyber Events Database contains structured information on threat actors, motives, victims, industries, and end effects, and is publicly available. The dataset enables researchers and industry partners to distill analytical insights on cyber threats to specific industries, countries, or regions, trends over time, and threat actor behaviors.

The record identification process uses automated techniques: a Python script that scrapes data from the open internet and specialty cyber sources, and, newly, GDELT's global news monitoring capability. Data is processed daily into two .csv files, one for scraped web data and one for GDELT-derived data. The records are then deduplicated and manually reviewed to confirm that each event meets the cyber event definition and to classify attributes such as threat actor type, motive, and industry. The database also distinguishes events collected through the original web scraping method (pre-2025) from those collected with GDELT included (2025 onward), to account for the methodological shift.
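The automated part of the process described above (two daily .csv files, deduplication, and a flag for the collection method) can be sketched roughly as follows. This is a minimal illustration, not GoTech's actual script: the column names (`event_date`, `organization`, `description`), the deduplication key, the label values, and the sample rows are all made up for the example, and the manual review step is not modeled.

```python
import csv
import io
from datetime import date

# Hypothetical key columns; the real files' schema is not described in the article.
KEY_FIELDS = ("event_date", "organization")


def load_events(csv_text, source):
    """Parse CSV text and tag each row with the feed it came from."""
    rows = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        row["source"] = source
        rows.append(row)
    return rows


def dedup_key(row):
    """Normalize the key fields so trivial formatting differences still match."""
    return tuple(row[field].strip().lower() for field in KEY_FIELDS)


def merge_and_deduplicate(scraped, gdelt):
    """Combine both feeds, keeping the first record seen for each key."""
    merged = {}
    for row in scraped + gdelt:
        key = dedup_key(row)
        if key not in merged:
            merged[key] = dict(row)
        elif row["source"] not in merged[key]["source"].split("+"):
            # Note that a second feed reported the same event.
            merged[key]["source"] += "+" + row["source"]
    return list(merged.values())


def method_era(row):
    """Flag the collection method by event year (2025 is the cutover in the article)."""
    year = date.fromisoformat(row["event_date"].strip()).year
    return "scrape_only" if year < 2025 else "scrape_plus_gdelt"


# Made-up sample rows standing in for the two daily files.
SCRAPED_CSV = """event_date,organization,description
2024-11-03,Example Hospital,Ransomware disrupts patient scheduling
2025-08-12,Example Bank,Customer records exposed
"""

GDELT_CSV = """event_date,organization,description
2025-08-12, example bank ,Data breach reported at bank
2025-08-20,Example Utility,Website taken offline
"""

events = merge_and_deduplicate(
    load_events(SCRAPED_CSV, "web_scrape"),
    load_events(GDELT_CSV, "gdelt"),
)
for event in events:
    event["method_era"] = method_era(event)
    print(event["event_date"], event["organization"], event["source"], event["method_era"])
```

In this sketch the Example Bank event appears in both feeds and collapses into one record whose `source` lists both. A real pipeline would likely need fuzzier matching than an exact date and name key, which is one reason a manual review step follows deduplication.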

The latest database update adds records of cyber events reported in January and August 2025. The GoTech team is backfilling the remaining 2025 records using the enhanced method and will add February through July 2025 records incrementally as they become available.

September cyber records will be released on Wednesday, October 8th. Monthly updates will occur on the second Wednesday of every month going forward.
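For readers who want to anticipate future release dates, the "second Wednesday of every month" rule can be computed with Python's standard library. This is only a convenience sketch; the function name `second_wednesday` is our own, and actual release dates may differ from the rule.

```python
import calendar
from datetime import date


def second_wednesday(year, month):
    """Return the date of the second Wednesday of the given month."""
    first_weekday = date(year, month, 1).weekday()  # Monday is 0
    # Days from the 1st of the month to the first Wednesday (0 if the 1st is a Wednesday).
    offset = (calendar.WEDNESDAY - first_weekday) % 7
    return date(year, month, 1 + offset + 7)


print(second_wednesday(2025, 10))  # 2025-10-08, the September release date given above
```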


For Media Inquiries:
Megan Campbell
Senior Director of Strategic Communications