Dataset Creation
All files relating to the creation of graph datasets. Check each box once the item has been read, and add a one-line summary in italics.
Papers are added to the summer2023.bib file and stored as SurnameYear.pdf here: Reference Papers
Benchmark Datasets
Synthetic Graph Generation
- [ ] Graph500 v3.0.0 Generator. Written in C to meet the standards of the Graph500 benchmark.
- [ ] R-MAT (Chakrabarti et al., 2004).
- [ ] plantri and fullgen planar graph generators. A planar graph is one that can be drawn in the plane so that its edges intersect only at their endpoints, that is, no two edges cross. May be of use for subgraph matching? The paper is here.
- [ ] Synthetic Data and Graph Generation for Modeling Adversarial Activity. Looks useful for generating graphs that resemble knowledge graphs.
- [ ] C++ Implementation of Louvain. Also appears to include some graph generation code that might be useful.
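
For orientation while reading the R-MAT paper, here is a minimal sketch of the recursive quadrant idea behind R-MAT. It is not the Graph500 generator. The function name `rmat_edges` and its parameters are our own, and the default probabilities (0.57, 0.19, 0.19, with 0.05 as the remainder) are the values commonly quoted for Graph500, so check them against the specification before relying on them. The sketch omits noise, vertex permutation, and duplicate removal.

```python
import random


def rmat_edges(scale, num_edges, a=0.57, b=0.19, c=0.19, seed=0):
    """Return num_edges directed edges over 2**scale vertices.

    Hypothetical helper: each edge is placed by choosing one quadrant of the
    adjacency matrix per bit of the vertex id, with probabilities
    a (top-left), b (top-right), c (bottom-left) and 1 - a - b - c (bottom-right).
    """
    rng = random.Random(seed)
    edges = []
    for _ in range(num_edges):
        src = 0
        dst = 0
        for _ in range(scale):
            # Make room for the next bit of each endpoint.
            src <<= 1
            dst <<= 1
            r = rng.random()
            if r < a:
                pass  # top-left: both bits stay 0
            elif r < a + b:
                dst |= 1  # top-right: column bit set
            elif r < a + b + c:
                src |= 1  # bottom-left: row bit set
            else:
                src |= 1  # bottom-right: both bits set
                dst |= 1
        edges.append((src, dst))
    return edges
```

Calling `rmat_edges(4, 100)` gives 100 edges over 16 vertex ids. Duplicates and self-loops are possible, so a real pipeline would likely filter them.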
Existing Graph Datasets
- [ ] Stanford Large Network Dataset Collection - Large collection of datasets, including several appropriate for community detection applications.
- [ ] SuiteSparse Matrix Collection - Large collection of sparse matrix datasets covering many applications. Because these are matrices, they may require some work to convert into adjacency lists if we choose to use them.
- [ ] Stanford Open Graph Benchmark - Datasets to support Machine Learning on Graphs. Mainly focused on Node, Link, and Graph property prediction (2023).
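
To gauge the conversion work noted for SuiteSparse, here is a rough sketch that turns a sparse matrix into an adjacency list. It assumes the matrix has been downloaded in Matrix Market coordinate format (`.mtx`). The function name `mtx_to_adjacency` is our own, and the sketch ignores edge weights and handles only the sparse coordinate layout.

```python
from collections import defaultdict


def mtx_to_adjacency(lines):
    """Convert Matrix Market coordinate lines to {vertex: [neighbours]}.

    Hypothetical helper. Vertex ids are shifted from 1-based to 0-based.
    Non-"general" matrices store only one triangle, so each of their
    off-diagonal entries is mirrored to make the edge undirected.
    """
    it = iter(lines)
    header = next(it).strip().lower().split()
    if header[:3] != ["%%matrixmarket", "matrix", "coordinate"]:
        raise ValueError("only the coordinate (sparse) format is handled")
    mirrored = len(header) > 4 and header[4] != "general"

    adjacency = defaultdict(list)
    size_line_seen = False
    for line in it:
        line = line.strip()
        if not line or line.startswith("%"):
            continue  # skip blank lines and comments
        if not size_line_seen:
            size_line_seen = True  # "rows cols nnz" line, not needed here
            continue
        fields = line.split()
        row = int(fields[0]) - 1
        col = int(fields[1]) - 1
        adjacency[row].append(col)
        if mirrored and row != col:
            adjacency[col].append(row)
    return dict(adjacency)
```

An open file object can be passed directly as `lines`. If SciPy is available, `scipy.io.mmread` is likely a more robust way to load the same files.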
Graph Sampling
- [ ] Network Sampling via Edge-based Node Selection with Graph Induction - Paper on how to sample graphs. Note: Ben Johnson commented that sampling a graph while maintaining its topological properties is a hard problem.
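
As a reading aid for the paper above, here is a minimal sketch of the two steps its title describes, as we currently understand them. First, nodes are selected by drawing edges uniformly at random and keeping both endpoints. Second, the sample is completed by graph induction, which keeps every original edge whose endpoints were both selected. The function name `ties_sample` and its parameters are our own, and the paper should be checked for the exact procedure.

```python
import random


def ties_sample(edges, fraction, seed=0):
    """Return (sampled_nodes, induced_edges) for an edge list.

    Hypothetical helper: `fraction` is the share of nodes to keep. The node
    set can overshoot the target by one because edges add two endpoints.
    """
    rng = random.Random(seed)
    edges = list(edges)
    if not edges:
        return set(), []
    all_nodes = {v for edge in edges for v in edge}
    target = min(len(all_nodes), max(1, int(fraction * len(all_nodes))))

    # Step 1: edge-based node selection.
    sampled_nodes = set()
    while len(sampled_nodes) < target:
        u, v = rng.choice(edges)
        sampled_nodes.update((u, v))

    # Step 2: graph induction over the selected nodes.
    induced_edges = [
        (u, v) for u, v in edges if u in sampled_nodes and v in sampled_nodes
    ]
    return sampled_nodes, induced_edges
```

This sketch makes no claim about preserving topological properties. It only shows the mechanics, so any sample it produces would still need to be compared against the original graph.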