All about Dataset



.nl stats and data - SIDN Labs

DNS | DNSSEC | Dataset | IP | Network

Historic datasets (from 2014 onwards) for the .nl TLD. Datasets are available in JSON format.

Datasets cover information about:

  • DNS
    • Domain Names
    • Query Type
    • Response Codes
    • IPv6 Support
  • Resolvers
    • Location
    • Number of IP addresses
    • Validating Resolvers
    • Popular Networks
    • Port Randomness
    • Validating Queries
    • DANE
    • Used Algorithms
  • Mail
    • Mail Resource Records (RRs)
    • SPF Information

APNIC Labs Stats

Autonomous System | BGP | DNS | DNSSEC | Dataset | IP

APNIC gathers many statistics and offers them on their website. However, they provide far more data than is initially apparent, since many of the datasets are not linked from their main page.

Binary Hardening in IoT Products


Detailed analysis of a 15-year dataset of IoT binaries and their security features. The Cyber ITL (CITL) focused on which compiler and toolchain hardenings the vendors use.

CITL identified a number of important takeaways from this study:

  • On average, updates were more likely to remove hardening features than add them.
  • Within our 15-year data set, there have been no positive trends from any one vendor.
  • MIPS is both the most common CPU architecture and the least hardened on average.
  • There are numerous duplicate binaries across multiple vendors, indicating a common build system or toolchain.

CIRCL hashlookup

Dataset | Malware

Look up files by their MD5 or SHA-1 hashes. The response contains information such as the filename, the size, or where the file was found, for example in a Linux package. The website hosts the API documentation, which can be tried out directly from the browser.
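
A minimal sketch of preparing such a lookup. The base URL and the `/lookup/<algorithm>/<digest>` path layout are assumptions that should be checked against the API documentation; the helper names `file_digest` and `lookup_url` are hypothetical. The sketch only builds the URL and does not send a request.

```python
import hashlib

# Assumed base URL of the public service; verify against the API documentation.
BASE_URL = "https://hashlookup.circl.lu"

# Expected hex digest lengths for the two hash types named in the text.
DIGEST_LENGTHS = {"md5": 32, "sha1": 40}


def file_digest(data: bytes, algorithm: str) -> str:
    """Return the lowercase hex digest of data for 'md5' or 'sha1'."""
    if algorithm not in DIGEST_LENGTHS:
        raise ValueError(f"unsupported algorithm: {algorithm}")
    return hashlib.new(algorithm, data).hexdigest()


def lookup_url(digest: str, algorithm: str) -> str:
    """Build a lookup URL for a hex digest (the path layout is an assumption)."""
    if algorithm not in DIGEST_LENGTHS:
        raise ValueError(f"unsupported algorithm: {algorithm}")
    if len(digest) != DIGEST_LENGTHS[algorithm]:
        raise ValueError("digest length does not match the algorithm")
    return f"{BASE_URL}/lookup/{algorithm}/{digest}"
```

The resulting URL can then be fetched with any HTTP client, for example `urllib.request.urlopen`.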

CZ.NIC Statistics

DNS | Dataset

The website contains information about the .cz TLD operated by CZ.NIC. It covers the query volume, query type, round-trip time (RTT), and geographic location of the traffic sources. It also has information about the registry functions, such as registrar information, domain transfers, or whois requests. Lastly, information about the mojeID accounts, a login provider operated by CZ.NIC, is also available.

Censored Planet

Censorship | Dataset

Censored Planet is a censorship measurement platform that collects data using multiple remote measurement techniques in more than 200 countries.

The website provides access to many recent scans. The scans are performed using different techniques to find different censors.


Censys

Certificate | DNS | Dataset | IP | Network

Censys performs regular scans for common protocols (e.g., DNS, HTTP(S), SSH). It also provides a search for TLS certificates.

Access is free, but requires registration. The website no longer provides free bulk access. Bulk access requires a commercial or a research license. The free access is limited to 1000 API calls per day.

@inproceedings{censys15,
    author = {Zakir Durumeric and David Adrian and Ariana Mirian and Michael Bailey and J. Alex Halderman},
    title = {A Search Engine Backed by {I}nternet-Wide Scanning},
    booktitle = {Proceedings of the 22nd {ACM} Conference on Computer and Communications Security},
    month = oct,
    year = 2015
}

Citizenlab Censorship Test Lists

Censorship | Dataset

The GitHub repository contains multiple lists for finding website censorship. The lists are organized by country and contain URLs specific to each of them. The URLs are also categorized and cover four broad themes:

  • Political, e.g., governmental views or human rights
  • Social, e.g., sexuality or gambling
  • Conflicts, e.g., armed conflicts or border disputes
  • Internet tools, e.g., hosting providers or circumvention methods.

Cloudflare Radar

BGP | DDoS | DNS | Dataset | IP | Network

Cloudflare Radar is Cloudflare's reporting website about internet trends and general traffic statistics. The website shows information about observed attacks and attack types and links to the DDoS report. It also reports general traffic statistics, such as browser share, the fraction of human traffic, and the IP, HTTP, and TLS versions in use.

The website also provides more detailed information on domains and IP addresses. Domains have information about age, popularity, and visitors. IP addresses have ASN and geolocation information.

More information about Cloudflare Radar is available in the introduction blog post.

The Radar data is also available via API, for example the attack data.

Corona Dashboards for Germany and Europe


Robert Koch-Institut: Official German dashboard.

COVID Trends Germany: Daily updated dashboard with many graphs for Germany.

Berliner Morgenpost: Shows sub-country numbers for Europe and worldwide.

WHO European Region: Country-level information for Europe.

WHO European Region Subnational Explorer: Subnational information for Europe with incidence rates over the last 7/14 days.

Johns Hopkins University: Contains worldwide information.

ECDC COVID-19 Country Overviews: Very detailed breakdown for countries worldwide.

Reuters: Provides per-country and regionally aggregated information.

DNS Coffee

DNS | Dataset | IP | Network | Search

DNS Coffee collects and archives stats from DNS Zone files in order to provide insights into the growth and changes in DNS over time.

The website includes information such as the size of different zones. It tracks over 1200 zone files.

It allows searching through the zone files based on domain names, name servers, or IP addresses. It can also visualize the relationship between a domain, the parent zones, and the name servers in what they call a "Trust Tree".

DNS Core Census

DNS | Dataset | IP | Network

The DNS Core Census is an ICANN project to gather information about top-level domains (TLDs). This covers ccTLDs, gTLDs, effective TLDs, and entries in arpa. The census contains information about the zone, like metadata and contractual information, about the name servers, about the addresses of the name servers, and about the route origins. The data is kept for a 35-day rolling window.

Further information about the project can be found in this presentation and in OCTO-019 from ICANN's Chief Technology Officer.

DNS Quality/Overview Tools

DNS | DNSSEC | Dataset | Network | Tool

Check My DNS

Browser-based DNS resolver quality measurement tool. Uses the browser to generate many resolver queries and tests for features they should have, such as EDNS support, IPv6, QNAME Minimization, etc.

This test is also available as a CLI tool.

DNSSEC Debugger

Analyze DNSSEC deployment for a zone and show errors in the configuration.


Gives an overview of DNSSEC delegations, response sizes, and name servers.



The website has an online test, which performs DNS lookups. These DNS lookups test if certain resource records are overwritten in the cache. The tool can then determine what DNS software is used, where the server is located, how many caches there are, etc.

EDNS Compliance Tester

Tests the name servers of a zone for correct EDNS support.

The Transitive Trust and DNS Dependency Graph Portal

Shows the trust dependencies in DNS. Given a domain name, it can show how zones delegate to each other and why. The delegation is done between IP addresses and zones.

Root Canary Project

The project was used to monitor the first root KSK rollover. Now it contains the paper "Roll, Roll, Roll your Root: A Comprehensive Analysis of the First Ever DNSSEC Root KSK Rollover", which describes the experiences of the first root KSK rollover.

Additionally, it includes a tester for DNSSEC algorithm support, which shows the algorithms supported by the currently used recursive resolver. It provides statistics about support for DNSSEC algorithms, a web-based test for your own resolver, and live monitoring using RIPE Atlas.

DNSSEC algorithms resolver test

DNS Queries to Authoritative DNS Server at SURFnet by Google's Public DNS Resolver

DNS | Dataset | Network

This dataset covers approximately 3.5 billion DNS queries that were received at one of SURFnet's authoritative DNS servers from Google's Public DNS Resolver. The queries were collected over 2.5 years. The dataset contains only those queries that contained an EDNS Client Subnet option.

The dataset covers data from 2015-06 through 2018-01.

DOI Identifier


DNSDB

DNS | Dataset | Network

Historical DNS database. It contains information recorded at recursive resolvers about domain names, such as first/last seen timestamps and the current bailiwick. It shows the lifetime of resource records and can be used as a large database.

Distributed Randomness Beacon

Dataset | Tool

The distributed randomness beacon provides verifiable, unpredictable, and unbiased random numbers as a service. A network of multiple entities computes the random numbers. They are a good source of true entropy. Another use is in verifiable lotteries, where these random numbers are used to pick a winner at random.
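
The lottery use case can be sketched as follows. This is an illustration under stated assumptions, not the beacon's own protocol: the helper `pick_winner` is hypothetical, the beacon output is assumed to be available as a hex string, and SHA-256 is one possible way to map that output to an index. Sorting the participants first means everyone who knows the public randomness and the participant list derives the same winner.

```python
import hashlib


def pick_winner(randomness_hex: str, participants: list[str]) -> str:
    """Deterministically pick one participant from public beacon randomness.

    randomness_hex is assumed to be the hex-encoded output of one beacon round.
    """
    if not participants:
        raise ValueError("need at least one participant")
    # Fix the order so the result does not depend on how the list was collected.
    ordered = sorted(participants)
    digest = hashlib.sha256(bytes.fromhex(randomness_hex)).digest()
    # A 256-bit value modulo a small list length has negligible bias.
    index = int.from_bytes(digest, "big") % len(ordered)
    return ordered[index]
```

Anyone can rerun `pick_winner` with the published round value and the participant list to verify the draw.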

Domain Crawling Lists

DNS | Dataset

Domain popularity lists provide a starting point for crawling domains with the most users. The most commonly used list for security research is the Alexa list.

  • Alexa
    The list is updated daily and contains one million websites. The ranking is based on page views, but it is very volatile.
  • CISCO Umbrella
    The list is updated daily and contains one million websites. The ranking is based on traffic seen on the OpenDNS resolvers.
  • Majestic
    The list is updated daily and contains one million websites. The ranking is based on backlinks from other websites.
  • Tranco
    A Research-Oriented Top Sites Ranking Hardened Against Manipulation
    The Tranco list aims to provide a better list for security research. The authors explain on their website and in their paper what the flaws in the existing lists are.
  • Quantcast
    The list is updated daily and contains around 500,000 websites. It is based on users visiting the site within the previous month and highly US focussed.
  • Cloudflare Radar
    Cloudflare uses their DNS resolver to create a top 1 million list. The lists are also available on a per-country level. More details are available in their announcement blog post.
  • CrUX Chrome
    Google Chrome collects the top 1 million visited websites and publishes them as part of the Chrome UX Report. The repository captures the monthly data and provides access to older versions. In an Internet Measurement Conference (IMC) paper, this list was shown to correlate best with the HTTP requests as seen by Cloudflare.

DuckDuckGo Tracker Radar

Dataset | Network

Tracker Radar collects common third-party domains and rich metadata about them. The data is collected from the DuckDuckGo crawler. More details are in this blog post.

This is not a block list, but a data set of the most common third-party domains on the web with information about their behavior, classification and ownership. It allows for easy custom solutions with the significant metadata it has for each domain: parent entity, prevalence, use of fingerprinting, cookies, privacy policy, and performance. The data on individual domains can be found in the domains directory.

File Format Explanations by Ange Albertini


The repository contains explanations of many file formats. The graphics usually consist of a hex view on the left side and the file structure on the right side. The values are color matched. The file structure explains the meaning of the values, how they are decoded, and the relationship between different values. Explanations exist for many compression, executable, and image formats.

Forward DNS Rapid7

DNS | Dataset | IP | Network

This dataset contains the responses to DNS requests for all forward DNS names known by Rapid7's Project Sonar. Until early November 2017, all of these were for the 'ANY' record with a fallback A and AAAA request if necessary. After that, the ANY study represents only the responses to ANY requests, and dedicated studies were created for the A, AAAA, CNAME and TXT record lookups with appropriately named files.

The data is updated every month. Historic data can be downloaded after creating a free account.

Hyper-Specific Prefixes on the Internet

BGP | Dataset

The common wisdom is that the most specific prefixes carried in BGP are /24 for IPv4 and /48 for IPv6. However, this is more of a convention than a hard rule. More specific (longer) prefixes are observed in BGP routing tables.

This website summarizes a paper about hyper-specific BGP prefixes. It shows how common hyper-specifics are over time for IPv4 and IPv6.
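
As an illustration of the convention, the check below flags a prefix as hyper-specific when it is longer than /24 (IPv4) or /48 (IPv6). The function name `is_hyper_specific` is hypothetical, and the thresholds are simply the conventional limits mentioned in this entry.

```python
import ipaddress

# Conventional most-specific prefix lengths per IP version.
MAX_CONVENTIONAL_LENGTH = {4: 24, 6: 48}


def is_hyper_specific(prefix: str) -> bool:
    """Return True if the prefix is more specific than the conventional limit."""
    network = ipaddress.ip_network(prefix, strict=True)
    return network.prefixlen > MAX_CONVENTIONAL_LENGTH[network.version]
```

For example, `is_hyper_specific("192.0.2.128/25")` is true, while `is_hyper_specific("2001:db8::/48")` is false.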

ICANN Identifier Technologies Health Indicators

DNS | DNSSEC | Dataset | Network

ICANN tracks the general health of the DNS ecosystem and related ecosystems. The data is updated irregularly, but historic data is available. The collected data covers eight major topics:

  1. M1: Inaccuracy of Whois Data
  2. M2: Domain Name Abuse
  3. M3: DNS Root Traffic Analysis
  4. M4: DNS Recursive Server Analysis
  5. M5: Recursive Resolver Integrity
  6. M6: IANA registries for DNS parameters
  7. M7: DNSSEC Deployment
  8. M8: DNS Authoritative Servers Analysis

Each topic has too many subcategories to list here.

IETF Official RFC BibTeX Downloads

Dataset | Paper Writing | TeX

The IETF provides official BibTeX entries for download. They work for RFCs, BCPs, and drafts.

The BibTeX entries for BCPs only work if the BCP consists of a single RFC. If the BCP consists of multiple RFCs, the BibTeX entry will only show the first one.

For drafts, the draft version number, the last two digits, have to be removed from the URL.
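
The rule for drafts can be sketched as a small helper. The URL template below is an assumption about the IETF Datatracker layout and is not taken from this entry; the function names are hypothetical. Only the step of stripping the two-digit version suffix comes from the text above.

```python
import re

# Assumed URL layout; verify against the IETF Datatracker before relying on it.
BIBTEX_URL = "https://datatracker.ietf.org/doc/{name}/bibtex/"


def strip_draft_version(draft_name: str) -> str:
    """Remove a trailing two-digit version such as '-07' from a draft name."""
    return re.sub(r"-\d{2}$", "", draft_name)


def bibtex_url(document: str) -> str:
    """Build the BibTeX URL for an RFC, BCP, or draft name."""
    name = document.lower()
    if name.startswith("draft-"):
        name = strip_draft_version(name)
    return BIBTEX_URL.format(name=name)
```

With these helpers, a hypothetical `draft-ietf-example-protocol-07` maps to the same URL as `draft-ietf-example-protocol`.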


Available entries can be found in the RFC Index and the BCP Index.


IPmap RIPE

BGP | Dataset | Map | Network | Tool

An IP geolocation service that draws on geolocation databases, user-provided locations, and, most importantly, active RTT measurements based on the RIPE Atlas system. It also provides an API to query the location. It gives a breakdown of where the results stem from and how much each source contributes to the overall result.

IPv6 Hitlist Collection

Dataset | IP | Network

A curated list of IPv6 hosts, gathered by crawling different lists. Includes:

  • Alexa domains
  • Cisco Umbrella
  • CAIDA DNS names
  • Rapid7 DNS ANY and rDNS
  • Various zone files

Access to the full list requires registration by email.

Based on the paper "Scanning the IPv6 Internet: Towards a Comprehensive Hitlist".

The website contains the additional material of the IMC paper "Clusters in the Expanse: Understanding and Unbiasing IPv6 Hitlists". The IPv6 addresses can be downloaded from the website. The website has three lists: responsive IPv6 addresses, aliased prefixes, and non-aliased prefixes. Additionally, the website has a list of tools used during the data creation.

IPv6 RIPEness

Dataset | IPv6 | Network

RIPE gathers data about the IPv6 deployments worldwide and publishes the information on their IPv6 RIPEness website. The deployments are judged on four points:

  • Having an IPv6 address space allocation or assignment from the RIPE NCC
  • Visibility in the Routing Information Service (RIS)
  • Having a route6 object in the RIPE Database
  • Having a reverse DNS delegation set up

Internet Health Report

Dataset | Network

The Internet Health Report reports on significant disruption events between networks. It uses BGP and traceroutes as its data sources. The report contains information about the connectivity of ASes, such as the most common upstream networks and the RPKI status of announcements. Link quality information is included, like historic network delay, forwarding anomalies, or network disconnects.

Internet Society Pulse

Autonomous System | BGP | Dataset | Network | Tool

The Internet Society gathers data to show the general health and availability of the internet. They measure four categories: internet shutdowns, technology use, resilience, and concentration. Under internet shutdowns, they show which countries are performing what kind of disruption, e.g., regional or national. The technology section lists basic statistics about HTTPS, IPv6, TLS, and DNSSEC.

Is BGP safe yet?

BGP | Dataset | Network | RPKI

"Is BGP safe yet?" is an effort by Cloudflare to track the deployment of RPKI filtering across different ISPs. They provide a tester on the website with which each user can test if the current ISP is filtering RPKI invalid announcements. The website includes a list of networks and if and how they use RPKI (signing and/or filtering).

More details for this project can be found in Cloudflare's blog or on the GitHub project.

Linux System Call Table

CTF | Cheatsheet | Dataset | x86

These websites provide an overview of the Linux system call interface by listing the syscall numbers, their meanings, and their arguments.

List of BGP Routing Datasets

BGP | Dataset | Network

Packet Clearing House (PCH)

The Packet Clearing House (PCH) publishes BGP data collected at more than 100 internet exchange points (IXPs). The snapshot dataset contains the state of the routing tables at daily intervals.

PCH also provides raw routing data in MRT format. These contain all the update information sorted by time.

Routing Information Service (RIS)

The RIS is the main resource from RIPE featuring all kinds of datasets about AS assignments and connectivity.


Routeviews

Routeviews is a project by the University of Oregon to provide live and historical BGP routing data.

Lists of DNS Blocklists

DNS | Dataset | IP | Network | Spam | Tool

These projects either operate DNS-based Real-time Blackhole Lists (RBLs) or allow checking whether an IP address is listed. The Multi-RBL websites are helpful in finding a large number of RBLs.
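
DNS-based lists are usually queried by reversing the IP address and appending the list's zone, following the convention described in RFC 5782. The sketch below only builds that query name; the function name `dnsbl_query_name` and the zone `dnsbl.example` are hypothetical placeholders, not a real blocklist.

```python
import ipaddress


def dnsbl_query_name(ip: str, zone: str) -> str:
    """Build the DNS name to look up for an IP address in a DNS-based list."""
    address = ipaddress.ip_address(ip)
    if address.version == 4:
        # IPv4: reverse the four octets.
        labels = reversed(str(address).split("."))
    else:
        # IPv6: reverse the 32 hex nibbles of the fully expanded address.
        labels = reversed(address.exploded.replace(":", ""))
    return ".".join(labels) + "." + zone.strip(".")
```

For example, `dnsbl_query_name("192.0.2.99", "dnsbl.example")` yields `99.2.0.192.dnsbl.example`. Resolving that name for an A record with a DNS client then tells whether the address is listed; the exact meaning of the returned values depends on the individual list.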

MANRS Observatory

BGP | Dataset | Network

Mutually Agreed Norms for Routing Security (MANRS) is an initiative to improve the state of routing security. The observatory shows what kind of incidents occurred and how prepared networks are, e.g., with filtering and coordination efforts. The data is available globally and comparisons between regions are available. Historic data is accessible on the website.

Malware Bazaar

Dataset | Malware

The Malware Bazaar is a project to create an open repository of malware samples. The repository is small, but anyone can freely download from it and contribute to it. It only contains malicious files, which contrasts with common malware feeds like Virustotal.

Manchester Academic Phrasebank

Dataset | Paper Writing

The Academic Phrasebank is a general resource for academic writers. It aims to provide you with examples of some of the phraseological ‘nuts and bolts’ of writing, organized according to the main sections of a research paper or dissertation.

The data bank contains the categories “Introducing Work”, “Referring to Sources”, “Describing Methods”, “Reporting Results”, “Discussing Findings”, and “Writing Conclusions”.


NIST RPKI Monitor

BGP | Dataset | RPKI

The NIST RPKI Monitor shows different statistics about RPKI adoption and about the validation status. It shows the number of validating prefixes, their history, the autonomous systems with the most VALID and INVALID prefixes and how validation changes over time.
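
The VALID and INVALID states come from route origin validation as defined in RFC 6811. The sketch below is a simplified illustration of that classification, not the monitor's implementation: the `Roa` type and the `validate_origin` function are hypothetical names, and the ROAs are passed in as plain data rather than fetched from the RPKI.

```python
import ipaddress
from typing import NamedTuple


class Roa(NamedTuple):
    """A simplified Route Origin Authorization entry (illustrative only)."""
    prefix: str
    max_length: int
    origin_asn: int


def validate_origin(prefix: str, origin_asn: int, roas: list[Roa]) -> str:
    """Classify an announcement as VALID, INVALID, or NOT_FOUND."""
    announced = ipaddress.ip_network(prefix)
    covered = False
    for roa in roas:
        roa_net = ipaddress.ip_network(roa.prefix)
        # Skip ROAs of the other IP version or ones that do not cover the prefix.
        if roa_net.version != announced.version or not announced.subnet_of(roa_net):
            continue
        covered = True
        # A covering ROA matches if the length and the origin AS both fit.
        if announced.prefixlen <= roa.max_length and roa.origin_asn == origin_asn:
            return "VALID"
    # Covered by at least one ROA but matched by none: INVALID.
    return "INVALID" if covered else "NOT_FOUND"
```

An announcement that no ROA covers is NOT_FOUND rather than INVALID, which is why filtering invalid routes does not drop prefixes without any ROA.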


Open INTEL

DNS | Dataset | IP | Network

Open INTEL is an active DNS database. It gathers information from public zone files, domain lists (Alexa, Umbrella), and reverse DNS entries. Once every 24 hours, data is collected for a set of DNS RRsets (SOA, NS, A, AAAA, MX, TXT, DNSKEY, DS, NSEC3, CAA, CDS, CDNSKEY). The data is openly available as AVRO files and dates back to 2016.

The data can be freely downloaded. There is documentation on the layout of the AVRO files.

The project is similar to Active DNS but seems to be larger in scope.

RIPEstat: Providing open data and insights for Internet resources

Autonomous System | BGP | DNS | Dataset | Network | Tool

RIPEstat is a network statistics platform by RIPE. The platform shows data for IP addresses, networks, ASNs, and DNS names. This includes information such as the registration information, abuse contacts, blocklist status, BGP information, geolocation lookups, or reverse DNS names. Additionally, the website links to many other useful tools, such as an address space hierarchy viewer, historical whois information, and routing consistency checks.

Root Servers

DNS | Dataset | Tool

Overview page for the DNS root servers. It contains links to general news and all the supporting organizations.

The website features a map with all geographic locations. It contains information about locations, IPv4/IPv6 reachability and IP addresses.

Each root server has its own subdomain. It provides access to historical performance data like:

  • Size and time of zone updates
  • RCODE volume
  • Query and response sizes for UDP and TCP
  • Traffic volume (packets per time)
  • Unique sources

Shadowserver Dashboard

DNS | Dataset | Malware | Network

The Shadowserver Scanning project performs regular Internet-wide scans for many protocols. The dashboard shows the gathered data about botnet sinkholes, Internet scans, honeypots, DDoS, and IoT data. This includes information about the size of botnets, the number of IP addresses with open ports like MySQL, the botnets as seen by honeypots, or the protocols used for DDoS attacks.

The blog post provides an introduction to the new dashboard.

Shadowserver Scanning Project

DNS | Dataset | Malware | Network

The Shadowserver Scanning project performs regular Internet-wide scans for many protocols. They scan for four main types of protocols:

  1. Amplification protocols, e.g., DNS or NTP
  2. Botnet protocols, e.g., Gameover Zeus or Sality
  3. Protocols that should not be exposed, e.g., Elasticsearch, LDAP, or RDP
  4. Vulnerable Protocols, e.g., SSLv3

The website is a great resource to get general statistics about the protocols, like the number of hosts speaking the protocol, their geographic distribution, associated ASNs, and the historic information.

Transient Execution Attacks

Dataset | Security

The website lists all known speculation side channel attacks. Each attack contains information about the attacked buffer, the affected vendors, and working state. They are sorted into a hierarchy. Each attack is also linked to proof-of-concepts and the academic papers.

W³Techs Surveys


W³Techs crawls a large part of the web, with over 10 million sites (Alexa). It focuses on the technologies used to implement the websites. The website offers various statistics, such as the most used languages, frameworks, web servers, and hosting information.


whats.new

The website lists all the existing *.new domains. They generally allow you to open a new document or start working on something.