Historic datasets (from 2014 onwards) for the .nl TLD. Datasets are available in JSON format.
Datasets cover information about:
https://www.activednsproject.org/
Datasets | DNS | IP
Historical DNS database. Access can be requested for academic use.
Actively queries many DNS records, e.g., the .com zone.
It can contain information not in DNSDB if the information was never seen by a resolver.
It does not contain all information, as some domains may be unknown to the project and thus cannot be crawled.
It uses popular zones, domain lists (e.g., Alexa, blacklists) and other domain feeds.
They normally maintain a rolling 14-day window.
Copy files (for date 2017-10-05) (ddos@gladbeck):
sftp -B1024000 -C -rp "activedns@kokino.gtisc.gatech.edu:active-dns/20171005/" .
The data is encoded in AVRO format, which can also be parsed as JSONL. Python has an AVRO library. AVRO schema:
{
"namespace": "astrolavos.avro",
"type": "record",
"name": "ActiveDns",
"fields": [
{"name": "date", "type": "string"},
{"name": "qname", "type": "string"},
{"name": "qtype", "type": "int"},
{"name": "rdata", "type": ["string", "null"]},
{"name": "ttl", "type": ["int", "null"]},
{"name": "authority_ips", "type": "string"},
{"name": "count", "type": "long"},
{"name": "hours", "type": "int"},
{"name": "source", "type": "string"},
{"name": "sensor", "type": "string"}
]
}
Some more information about the fields that are unique to this schema. The IPs in authority_ips are the collection of authoritative name server IPs that replied to the query. All IPs that gave the same answer over an entire day are concatenated into the same field, mostly to reduce the number of records that have to be kept. The only field that might be slightly confusing is the "hours" field. This is a 24-bit integer that encodes the times of day this RR was seen on that date (for example, 000000000000000001000010 = 18:00 and 23:00). Another important thing to keep in mind is NXDOMAINs. A resolved QNAME does not exist when both the rdata and ttl fields are null. If rdata exists but ttl is null, the record was part of the glue of the DNS packet and not in the answer section.
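A minimal sketch of processing these records in Python, assuming the fastavro package and a placeholder file name for one of the daily AVRO files; the bit mapping used to decode the hours field is inferred from the example above and should be verified against the project documentation:
from fastavro import reader

def decode_hours(hours):
    # Decode the 24-bit "hours" field following the example above: the p-th bit
    # from the left (1-indexed) marks hour p of the day. This mapping is inferred
    # from the example and should be verified against the project documentation.
    bits = format(hours, "024b")
    return [p for p, bit in enumerate(bits, start=1) if bit == "1"]

# "part-00000.avro" is a placeholder name for one file of a downloaded day.
with open("part-00000.avro", "rb") as fo:
    for record in reader(fo):
        if record["rdata"] is None and record["ttl"] is None:
            continue  # NXDOMAIN: the queried name does not exist
        if record["ttl"] is None:
            continue  # glue record, not part of the answer section
        print(record["qname"], record["qtype"], record["rdata"],
              decode_hours(record["hours"]))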
A similar active DNS project is Open INTEL which seems to be larger in scope and the data is publicly available.
https://github.com/Phenomite/AMP-Research
Amplification | Datasets | Denial-of-Service | Tools
The AMP-Research project collects information about amplification vectors in protocols, including how to reproduce them. For each vector the port and protocol are listed, as well as the amplification factor. A scanning script or payload for scanning with zmap is included too.
https://stat.ripe.net/special/bgplay
BGP | Datasets | Tools
BGPlay shows a graph of the observed BGP routes. It allows replaying historical BGP announcements and displays route changes.
Downloadable dataset of historic BGP information from different vantage points.
An open-source software framework for live and historical BGP data analysis, supporting scientific research, operational monitoring, and post-event analysis.
BGP streams are freely accessible and provided by RouteViews, RIPE, and BGPmon.
BGP Stream is a free resource for receiving alerts about hijacks, leaks, and outages in the Border Gateway Protocol.
BGP Stream provides real-time information about BGP events. It includes information about affected IPs and ASNs, and even a replay feature showing how the BGP announcements changed.
A live alert bot also exists on Twitter.
https://dev.hicube.caida.org/feeds/hijacks/events
BGP | Datasets
The BGP hijacking observatory lists potential BGP hijacks. It can observe different kinds of hijacks, e.g., shorter path or more specific prefix. It lists the hijacking time, potential victims and attackers, and the affected prefix.
More details about the different hijacking methods are in the AIMS-KISMET presentation.
https://www.caida.org/catalog/datasets/overview/
Autonomous Systems | BGP | Datasets | IP
Overview of datasets, monitors, and reports produced and organized by CAIDA. Also contains links to other datasets.
Censys performs regular scans for common protocols (e.g., DNS, HTTP(S), SSH). Provides a search for TLS certificates.
Access is free, but requires registration. The website no longer provides free bulk access. Bulk access requires a commercial or a research license. The free access is limited to 1000 API calls per day.
@InProceedings{censys15,
author = {Zakir Durumeric and David Adrian and Ariana Mirian and Michael Bailey and J. Alex Halderman},
title = {A Search Engine Backed by {I}nternet-Wide Scanning},
booktitle = {Proceedings of the 22nd {ACM} Conference on Computer and Communications Security},
month = oct,
year = 2015
}
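A rough sketch of using the registered free-tier API from Python; the v2 host-search endpoint, query syntax, and response layout used here are assumptions and should be checked against the current Censys API documentation:
import requests

API_ID = "..."      # API credentials from the Censys account page
API_SECRET = "..."
resp = requests.get(
    "https://search.censys.io/api/v2/hosts/search",   # assumed endpoint
    params={"q": "services.service_name: HTTP", "per_page": 10},
    auth=(API_ID, API_SECRET),
    timeout=30,
)
resp.raise_for_status()
for hit in resp.json()["result"]["hits"]:  # assumed response layout
    print(hit["ip"])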
Cloudflare Radar is Cloudflare's reporting website about internet trends and general traffic statistics. The website shows information about observed attacks and attack types and links to the DDoS report. General traffic statistics are reported, such as the browsers used, the fraction of human traffic, and the IP, HTTP, and TLS versions.
The website also provides more detailed information for domains and IP addresses. Domains have information about age, popularity, and visitors. IP addresses have ASN and geolocation information.
More information about Cloudflare Radar is available in the introduction blogpost.
https://github.com/DNS-OARC/bad-packets
Datasets | DNS | IP | PCAPs
Collection of "bad" packets in PCAPs that can be used for testing software.
The Common Crawl project builds an openly accessible database of crawled websites. The index can be searched.
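A small sketch of querying the index from Python, assuming the CDX-style endpoint at index.commoncrawl.org and a placeholder crawl label; both should be checked against the list of available crawls:
import json
import requests

CRAWL = "CC-MAIN-2023-50"  # placeholder crawl label; pick a real one from index.commoncrawl.org
resp = requests.get(
    f"https://index.commoncrawl.org/{CRAWL}-index",
    params={"url": "example.com/*", "output": "json"},
    timeout=30,
)
resp.raise_for_status()
for line in resp.text.splitlines():  # one JSON object per capture
    capture = json.loads(line)
    print(capture.get("timestamp"), capture.get("status"), capture.get("url"))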
https://github.com/TW-NCERT/ctifeeds
DNS | IP | Spam
Provides an outdated list of Cyber Threat Intelligence feeds from other organizations.
Provides a search interface to search for domain names and IP addresses under attack. Shows results for the last 30 days. Provides an API, which requires special authorization.
DMAP is a scalable web scanning suite which supports DNS, HTTPS, TLS, and SMTP. It works based on domain names and crawls each domain for all supported protocols. The advantage over other tools is the unified SQL data model with 166 features and the easy scalability across many crawling machines.
dn42 is a big dynamic VPN. It employs various Internet technologies, such as BGP, whois, DNS, etc.
Users can experiment with technology they normally would not use, in a separate environment.
Mostly different hackerspaces participate in the dn42 network, such as different locations of the CCC.
https://dnscensus2013.neocities.org/index.html
Datasets | DNS | IP
The DNS Census 2013 consists of about 2.5 billion DNS records collected in 2012/2013. The data was gathered from some available zone files and passive and active DNS collection. The DNS records are written into CSV files containing one DNS record per line.
DNS Coffee collects and archives stats from DNS Zone files in order to provide insights into the growth and changes in DNS over time.
The website includes information such as the size of different zones. It tracks over 1,200 zone files.
It provides searching through the zone files based on domain names, name servers, or IP addresses. It can also visualize the relationship between a domain, its parent zones, and the name servers in what they call a "Trust Tree".
Browser-based DNS resolver quality measurement tool. Uses the browser to generate many resolver queries and tests for features they should have, such as EDNS support, IPv6, QNAME Minimisation, etc.
This test is also available as a CLI tool: https://github.com/DNS-OARC/cmdns-cli
Analyze DNSSEC deployment for a zone and show errors in the configuration.
Gives an overview over DNSSEC delegations, response sizes, and name servers.
GitHub: https://github.com/dnsviz/dnsviz
The website has an online test, which performs DNS lookups. These DNS lookups test if certain resource records are overwritten in the cache. The tool can then determine what DNS software is used, where the server is located, how many caches there are, etc.
Tests the name servers of zones for correct EDNS support.
Shows the trust dependencies in DNS. Given a domain name it can show how zones delegate to each other and why. The delegation is done between IP addresses and zones.
The project was used to monitor the first root KSK rollover. Now it contains the paper "Roll, Roll, Roll your Root: A Comprehensive Analysis of the First-Ever DNSSEC Root KSK Rollover", describing the experiences of the first root KSK rollover.
Additionally, it includes a tester for DNSSEC algorithm support, which shows the algorithms supported by the currently used recursive resolver. It provides statistics about support for DNSSEC algorithms, has a web-based test for your own resolver, and provides live monitoring using the RIPE Atlas.
This dataset covers approximately 3.5 billion DNS queries that were received at one of SURFnet's authoritative DNS servers from Google's Public DNS Resolver. The queries were collected during 2.5 years. The dataset contains only those queries that contained an EDNS Client Subnet.
The dataset covers data from 2015-06 through 2018-01.
https://www.dns-oarc.net/tools/drool
DNS | IP | PCAPs | Tools
Tool to replay DNS queries captured in a PCAP file with accurate timing between queries. Allows modifying the replay, for example changing IP addresses or speeding up or slowing down the queries.
https://www.dns-oarc.net/tools/dnscap
DNS | IP | PCAPs | Tools
DNS network capture utility. Similar in concept to tcpdump, but with specialized options for DNS.
Historical DNS database. Contains information recorded at recursive resolvers about domain names, first/last seen timestamps, and the current bailiwick. It shows the lifetime of resource records and can be used as a large lookup database.
https://atlas.ripe.net/dnsmon/
Datasets | DNS
Historical information about the reachability of root and some TLD name servers.
https://www.dns-oarc.net/tools/dnsperf
DNS | IP | PCAPs | Tools
DNS performance measurement tools.
https://rick.eng.br/dnssecstat/
Datasets | DNS | DNSSEC
Regularly updated reports about current DNSSEC deployment. Contains information per TLD and global distribution.
@dnsstream is a Twitter bot, which sends out notifications for important DNS changes of domains.
https://dnsthought.nlnetlabs.nl/
Datasets | DNS | DNSSEC
Dnsthought lists many statistics about the resolvers visible to the .nl-authoritative name servers. The data is gathered from the RIPE Atlas probes. There is a dashboard, which only works partially.
Raw data access is also available.
http://dns.measurement-factory.com/tools/dnstop/
DNS | Tools
Top-like utility showing information about captured DNS requests. It shows information about the queried domains, the query types, and the responses.
https://github.com/deiv/driftnet
CTF | Tools
Driftnet watches network traffic and picks out JPEG and GIF images for display.
https://dublin-traceroute.net/README.md
IP | Tools
This is an improvement on Paris traceroute and the classical traceroute. It can detect changing routes and detect NATs along the path.
https://github.com/duckduckgo/tracker-radar
Datasets
Tracker Radar collects common third party domains and rich metadata about them. The data is collected from the DuckDuckGo crawler. More details are in this blogpost.
This is not a block list, but a data set of the most common third party domains on the web with information about their behavior, classification and ownership. It allows for easy custom solutions with the significant metadata it has for each domain: parent entity, prevalence, use of fingerprinting, cookies, privacy policy, and performance. The data on individual domains can be found in the domains directory.
https://fd.io/
Tools
FD.io is a very fast userspace networking library which allows creating programs for packet processing. While DPDK allows fast read and write access to the NICs, FD.io is focused on processing the packets. Possible use cases are a packet forwarder, implementing a NAT, or a VPN.
More details also in this APNIC blogpost: https://blog.apnic.net/2020/04/17/kernel-bypass-networking-with-fd-io-and-vpp/
https://github.com/DNS-OARC/flamethrower
DNS | IP | Tools
Flamethrower is a small, fast, configurable tool for functional testing, benchmarking, and stress testing DNS servers and networks. It supports IPv4, IPv6, UDP, TCP, DoT, and DoH and has a modular system for generating queries used in the tests.
https://opendata.rapid7.com/sonar.fdns_v2/
Datasets | DNS | IP
This dataset contains the responses to DNS requests for all forward DNS names known by Rapid7's Project Sonar. Until early November 2017, all of these were for the 'ANY' record, with a fallback A and AAAA request if necessary. After that, the ANY study represents only the responses to ANY requests, and dedicated studies were created for the A, AAAA, CNAME, and TXT record lookups, with appropriately named files.
The data is updated every month. Historic data can be downloaded after creating a free account.
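A sketch of filtering one of the (large, gzip-compressed) study files in Python, assuming one JSON object per line with timestamp/name/type/value fields as described by Project Sonar; the file name is a placeholder:
import gzip
import json

PATH = "fdns_a.json.gz"  # placeholder name for a downloaded A-record study file
with gzip.open(PATH, "rt", encoding="utf-8") as fh:
    for line in fh:
        record = json.loads(line)
        # Assumed fields: timestamp, name, type, value.
        if record.get("type") == "a" and record.get("name", "").endswith(".example.com"):
            print(record["name"], record["value"])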
3D map showing submarine cables and the backbone network of Hurricane Electric.
https://atlas.ripe.net/results/maps/
Datasets | DNS | Maps
Maps of measurements done with the RIPE Atlas.
IODA is a project by CAIDA to use different data sources to detect macroscopic internet outages in realtime. It measures the internet activity using BGP, darknets, and active probing. The website provides a realtime feed and a historical view of outages.
These websites have lists of abusive IP addresses. They can be checked with a web form or some websites also provide a feed.
The cheatsheet describes in a few words what the different subcommands of ip do.
It includes some other helpful networking commands such as arping, ethtool, and ss, and provides a comparison with the older net-tools commands.
https://www.circl.lu/services/ip-asn-history/
Autonomous Systems | Datasets | IP
Historical dataset about IP to ASN mappings.
https://team-cymru.com/community-services/ip-asn-mapping/
Autonomous Systems | Datasets | IP
Historical dataset about IP to ASN mappings.
IP geolocation service fed by geolocation databases, user-provided locations, and, most importantly, active RTT measurements based on the RIPE Atlas system. It also provides a nice API to query the location. It provides a breakdown of where the results stem from and how much they contribute to the overall result.
Per continent, region, or country measurements of IPv6 deployment and preference. Allows accessing historical data.
Per continent, region, or country measurements of IPv6 deployment and preference.
A curated list of IPv6 hosts, gathered by crawling different lists. Includes:
Access to the full list requires registration by email.
Based on the paper "Scanning the IPv6 Internet: Towards a Comprehensive Hitlist".
The website contains the additional material of the IMC paper Clusters in the Expanse: Understanding and Unbiasing IPv6 Hitlists. The IPv6 addresses can be downloaded from the website. The website has three lists, responsive IPv6 addresses, aliased prefixes, and non-aliased prefixes. Additionally, the website also has a list of tools used during the data creation.
"Is BGP safe yet?" is an effort by Cloudflare to track the deployment of RPKI filtering accross different ISPs. They provide a tester on the website with which each user can test if the current ISP is filtering RPKI invalid announcements. The website includes a list of networks and if and how they use RPKI (signing and/or filtering).
More details for this project can be found in Cloudflare's blog or on the GitHub project.
Contains a list of pricing information for different IXPs.
https://www.us-cert.gov/ncas/alerts/TA14-017A
Amplification | Datasets | Denial-of-Service
Contains a list of UDP-based protocols, which can be used for amplification attacks.
Isolario also provides historical routing data in MRT format for their route collectors. The data contains snapshots every two hours and updates with a granularity of five minutes.
The Packet Clearing House (PCH) publishes BGP data collected at more than 100 internet exchange points (IXP). The snapshot dataset contains the state of the routing tables in daily intervals.
PCH also provides raw routing data in MRT format. These files contain all update information, sorted by time.
The RIS is the main resource from RIPE featuring all kinds of datasets about AS assignments and connectivity.
Routeviews is a project by the University of Oregon to provide live and historical BGP routing data.
https://powerdns.org/dns-camel/
DNS
Contains information about the state of the DNS RFCs and what kind of information they contain.
http://www.traceroute.org/#Looking%20Glass
Datasets
The website links to different looking glasses, which provide either traceroute information or are usable as route servers.
These projects either operate DNS based Real-time Blackhole Lists (RBL) or allow checking if an IP is contained. The Multi-RBL websites are helpful in finding a large quantity of RBLs.
https://github.com/nsg-ethz/mini_internet_project
BGP | IP | Tools
The mini internet project is part of the curriculum of the Networked Systems Group at ETH Zurich. It teaches students the basic steps of creating a mini internet. It starts with the basics of intra-network routing, by setting up multiple L2 switches. Then the students have to configure L3 routers to connect multiple L2 sites together. Lastly, in a big hackathon-style event, the students need to connect their local network with the networks of the other students by properly configuring BGP routers and setting up routing policies.
The code and the tasks are all available in the GitHub repository.
The APNIC Blog has a nice introduction to the project too.
https://gitlab.planet-lab.eu/cartography/
IP | Tools
Multi-level MDA-Lite Paris Traceroute is a traceroute tool which understands and learns more complex network topologies. Often the network is not just a single path; multiple paths are possible and chosen at random.
A good description of the tool can be found in the RIPE Labs post or in the IMC 2018 paper.
This website measures support for NAT64 in other websites.
The Netlab of 360.com provides some open data streams.
One dataset concerns the number of abused reflectors per protocol.
The Internet Observatory is a project by the RWTH Aachen University. It combines different scanning projects.
As of writing it contains information about:
Overview of IP addresses scanning the internet and which ports are scanned.
https://github.com/honze-net/nmap-bootstrap-xsl/
The nmap stylesheet converts the nmap XML output into a nice website. A sample report can be found under this link.
The nPrint project is a collection of open source software and benchmarks for network traffic analysis that aim to replace the built-to-task approach currently taken when examining traffic analysis tasks.
Open INTEL is an active DNS database.
It gathers information from public zone files, domain lists (Alexa, Umbrella), and reverse DNS entries.
Once every 24 hours, data is collected about a set of DNS RRsets (SOA, NS, A, AAAA, MX, TXT, DNSKEY, DS, NSEC3, CAA, CDS, CDNSKEY).
The data is openly available as AVRO files and dates back to 2016.
The data can be freely downloaded. There is documentation on the layout of the AVRO files.
The project is similar to Active DNS but seems to be larger in scope.
This is an improvement on the traditional traceroute program. It is able to detect multiple distinct routes and display them accordingly. The classical traceroute would produce weird results on changing network routes.
Another similar program is Dublin traceroute.
https://www.circl.lu/services/passive-dns/
Datasets | DNS | IP
Passive DNS dataset from circl.lu.
https://peering.ee.columbia.edu/
BGP | Tools
PEERING is an environment where researchers and educators can play with BGP announcements in a real but sandboxed environment.
Description from the website:
The long-term goal of the PEERING system is to enable on-demand, safe, and controlled access to the Internet routing ecosystem for researchers and educators:
Contains peering information for some networks. This includes peering partners, transfer speeds, peering requirements, and similar.
The public suffix list gives a way to easily determine the effective second level domain, i.e., the domain which a domain owner registered and which can be under different owners.
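For example, the tldextract Python package (which bundles a copy of the public suffix list) splits a host name into its subdomain, registrable domain, and public suffix; a minimal sketch:
import tldextract

ext = tldextract.extract("forums.bbc.co.uk")
print(ext.subdomain)                 # "forums"
print(ext.domain)                    # "bbc"
print(ext.suffix)                    # "co.uk"
print(f"{ext.domain}.{ext.suffix}")  # effective second level domain: "bbc.co.uk"
Note that tldextract may fetch an updated copy of the list on first use unless configured otherwise.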
https://github.com/tim-fiola/network_traffic_modeler_py3
Tools
pyNTM allows creating a network model with circuits between layer-3 nodes. This model then allows simulating and evaluating how traffic will traverse the topology. This can be used to test different network topologies and failover scenarios.
https://catalog.caida.org/details/paper/2019_learning_regexes_extract_router
https://catalog.caida.org/details/paper/2020_learning_extract_use_asns
These two papers focus on how to extract information from the hostnames of routers. These hostnames occur when performing traceroutes. The regexes can be used to extract identifiers and AS numbers. The datasets generated by the papers are openly accessible.
https://emaillab.jp/dns/dns-rfc/
DNS
The site contains two PDFs showing the relationship between the DNS RFCs. There is a simplified overview and a full overview. The graphs are regularly updated. The RFCs are organized by time, by topic area (e.g., DNSSEC, RR), and the RFC status (e.g., Standard, Best Current Practice).
https://gitlab.labs.nic.cz/knot/respdiff
DNS | Tools
A toolchain for gathering DNS responses and analyzing the differences between them.
RIPE operates a set of probes which can be used to send pings or perform similar measurements. The probes are mainly placed in Europe, but some are also on other continents.
All the collected measurements can be found in the RIPE Atlas Daily Archives. The blog post gives some more details.
RIPEstat is a network statistics platform by RIPE. The platform shows data for IP addresses, networks, ASNs, and DNS names. This includes information such as the registration information, abuse contacts, blocklist status, BGP information, geolocation lookups, or reverse DNS names. Additionally, the website links to many other useful tools, such as an address space hierarchy viewer, historical whois information, and routing consistency checks.
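Much of this data is also reachable programmatically via the RIPEstat Data API; a minimal sketch using the network-info endpoint (the endpoint name and response fields used here are assumptions to check against the API documentation):
import requests

resp = requests.get(
    "https://stat.ripe.net/data/network-info/data.json",
    params={"resource": "193.0.6.139"},
    timeout=30,
)
resp.raise_for_status()
data = resp.json()["data"]
print(data.get("prefix"), data.get("asns"))  # covering prefix and announcing ASN(s)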
https://www.ripe.net/analyse/internet-measurements/routing-information-service-ris
BGP | Datasets | DNS | Tools
Different information regarding the reachability and connectivity of ASes.
The Route Origin Validation (ROV) Deployment Monitor measures how many ASes have deployed ROV. It uses PEERING for BGP announcements and uses BGP monitors to see in which ASes the wrong announcements are filtered. A blogpost at APNIC describes it in more detail.
These websites allow you to browse the valid RPKI announcements. They show which address ranges are covered by RPKI and who the issuing authority is.
https://www.ripe.net/s/rpki-test
RPKI | Tools
Website that tests whether your provider filters invalid announcements using RPKI.
The website contains no usable data anymore and only links to empty pages on Censys.
The website used to host many free Internet scans of different kinds. It included historical data and actively maintained datasets.
https://scan.shadowserver.org/
Datasets | DNS
The Shadowserver Scanning project performs regular Internet-wide scans for many protocols. They scan for four main types of protocols:
The website is a great resource for general statistics about the protocols, like the number of hosts speaking the protocol, their geographic distribution, associated ASNs, and historic information.
Shodan performs regular scans on common ports.
Access is free, but requires registration. More results can be gained with a paid account.
https://github.com/kontaxis/snidump
CTF | Tools
This is a tcpdump-like program for printing TLS SNI and HTTP/1.1 Host fields in live or captured traffic.
https://baturin.org/docs/iproute2/
Cheatsheet | Tutorials
User guide for the newer ip command under Linux.
The guide consists of different tasks one might want to perform and their corresponding ip commands.
https://blog.wains.be/2007/2007-10-01-tcpdump-advanced-filters/
Cheatsheet | Tools | Tutorials
The website contains different tcpdump filters.
It starts with basic filters and then builds up ever more complex ones.
This is a good source for looking up complicated filters if one does not want to write them from scratch.
TeleGeography provides different maps about the Internet. They contain information about submarine cables, global traffic volume, latency, internet exchange points. The data for the Submarine Map and the Internet Exchange Map can also be found on GitHub in text format.
http://www.inspire.edu.gr/traIXroute/
IP | Tools
A traceroute-like tool that detects where a path crosses an IXP.
The website shows ongoing DDoS attacks in real time. Attacks are shown with source and destination country. Further information is available, such as the used protocols and the attack bandwidth.
https://stats.apnic.net/vizas/
Autonomous Systems | BGP | Tools | Datasets
vizAS by APNIC shows the connectivity between different ASes, split by country. It is useful for finding the ASes which are most central in the graph.
AMP is a system designed to continuously perform active network measurements between a mesh of specialist monitor machines, as well as to other targets of interest. These measurements are used to provide both a view of long-term network performance as well as to detect notable network events when they happen.
The project is run with a custom client and server software. The measurement results can be viewed on the website. It includes traceroutes, latencies (DNS, HTTP, ICMP, TCP), HTTP page sizes, and packet loss. The software is available as open source.
These services allow you to create a domain name for any IP address. The IP address is encoded into the domain name. An overview of different services can be found here.
https://nip.io/ provides IPv4 only, with . and - as separators.
10.0.0.1.nip.io resolves to 10.0.0.1
192-168-1-250.nip.io resolves to 192.168.1.250
customer1.app.10.0.0.1.nip.io resolves to 10.0.0.1
magic-127-0-0-1.nip.io resolves to 127.0.0.1
https://sslip.io/ provides IPv4 and IPv6, with . and - as separators.
192.168.0.1.sslip.io resolves to 192.168.0.1
192-168-1-250.sslip.io resolves to 192.168.1.250
www.192-168-0-1.sslip.io resolves to 192.168.0.1
--1.sslip.io resolves to ::1
2a01-4f8-c17-b8f--2.sslip.io resolves to 2a01:4f8:c17:b8f::2
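A quick way to check how such names resolve from Python, using only the standard library resolver:
import socket

# The IP address is encoded directly in the queried name.
for name in ("10.0.0.1.nip.io", "192-168-1-250.sslip.io"):
    print(name, "->", socket.gethostbyname(name))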
https://github.com/aaptel/qtwirediff
PCAPs | Tools
WireDiff is a debugging tool to diff network traffic leveraging Wireshark.
Wirediff lets you open two network traces side by side. You can select a packet from each trace and diff their content at the protocol level you want.
A more thorough introduction is available in the APNIC blog: https://blog.apnic.net/2020/07/01/wirediff-a-new-tool-to-diff-network-captures/.
Yarrp is an active network topology discovery tool. Its goal is to identify router interfaces and interconnections at internet scale. Conceptually this is similar to running many traceroutes and stitching them together into one view. However, traceroutes are designed to understand the connection between two hosts and do not scale easily.
https://github.com/NLnetLabs/ziggy
RPKI | Tools
Ziggy is a tool to inspect the RPKI ecosystem at arbitrary points in the past. It is developed by NLnet Labs. More details about the Ziggy tool can be found in the announcement blogpost.
Different utilities for network scanning. Most importantly the zmap component, which is a packet scanner for different protocols. It also contains other tools, like ways to iterate over the IPv4 address space and blacklist/whitelist management.
https://zonefiles.io/detailed-domain-lists/
Datasets
The website provides download access to domains in many TLDs. Most lists are updated daily. However, not all of the lists seem complete. For example, DENIC reports that they manage over 17 million domains, whereas zonefiles.io only reports over 6 million domains.