Benchmark evaluation is essential in predictive toxicology because the approaches and methodologies used for qualitative or quantitative toxicity prediction must be validated. The EGSB team has developed a series of benchmark datasets and benchmarking metrics for use in our research projects. Our benchmark datasets are either taken directly from the public domain, such as the Tox21 Challenge dataset (https://tripod.nih.gov/tox21/challenge/data.jsp), or curated from publicly accessible databases, such as the FTP site of NCBI’s PubChem (ftp://ftp.ncbi.nlm.nih.gov/pubchem/).