ClickBench: a Benchmark For Analytical Databases

ClickBench represents a typical workload in the following areas: clickstream and traffic analysis, web analytics, machine-generated data, structured logs, and events data. It covers the typical queries of ad-hoc analytics and real-time dashboards.

The dataset from this benchmark was obtained from the actual traffic recording of one of the world's largest web analytics platforms. It is anonymized while keeping all the essential distributions of the data. The set of queries was improvised to reflect realistic workloads; the queries are not taken directly from production.

The main goal of this benchmark is reproducibility. The benchmark process is simple enough to cover a wide range of systems. The dataset has been filtered to avoid difficulties with parsing and loading. The tables and queries use mostly standard SQL and require minimal or no adaptation for most SQL DBMSs. The dataset is published and made available for download in multiple formats. The test process is documented in the form of a shell script covering the installation of every system, loading of the data, running the workload, and collecting the result numbers. The test setup is documented and uses inexpensive cloud VMs, so every test can be reproduced in as little as 20 minutes (though some systems may take several hours) in a semi-automated way.
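To make the "shell script that runs the workload and collects result numbers" concrete, here is a minimal sketch of such a benchmark loop. It is an illustration only, not ClickBench's actual script: `run_query`, `queries.sql`, and `results.txt` are placeholder names, the stub client should be replaced with a real DBMS client (e.g. `clickhouse-client --query "$1"`), and the repeated tries mimic the common practice of running each query several times to separate cold and warm runs.

```shell
#!/bin/sh
# Hedged sketch of a per-system benchmark loop (illustrative names throughout).

TRIES=3                    # run each query several times (cold + warm runs)
QUERIES_FILE=queries.sql   # one SQL query per line (assumption for this sketch)
RESULTS_FILE=results.txt

run_query() {
    # Stub so the sketch is self-contained; swap in a real client here,
    # e.g.:  clickhouse-client --query "$1"
    echo "stub: $1" > /dev/null
}

# Tiny stand-in workload so the sketch runs end to end.
printf 'SELECT 1;\nSELECT 2;\n' > "$QUERIES_FILE"
: > "$RESULTS_FILE"

while read -r query; do
    for i in $(seq 1 "$TRIES"); do
        start=$(date +%s%N)
        run_query "$query"
        end=$(date +%s%N)
        # One result line per (query, try): this is what gets collected.
        echo "try=$i elapsed_ns=$((end - start)) query=$query" >> "$RESULTS_FILE"
    done
done < "$QUERIES_FILE"
```

With 2 queries and 3 tries each, `results.txt` ends up with 6 timing lines; a real harness would then aggregate these (e.g. take the minimum or median per query) into the published numbers.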