March 27, 2023

I’ve been gradually improving my data wrangling tool, Easy Data Transform, putting out 70 public releases since 2019. While the product’s emphasis is on ease of use, rather than pure performance, I’ve been trying to make it fast as well, so it can cope with the multi-million row datasets customers like to throw at it. To see how I was doing, I did a simple benchmark of the most recent version of Easy Data Transform (v1.37.0) against several other desktop data wrangling tools. The benchmark did a read, sort, join and write of a 1 million row CSV file. I did the benchmarking on my Windows development PC and my Mac M1 laptop.
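To give a feel for what a read, sort, join and write benchmark involves, here is a minimal sketch that uses only Python's standard library. It is not the benchmark used for the results in this post: the column names (`id`, `value`), the synthetic data, the inner join on `id` and the helper names `make_csv`, `run_benchmark` and `demo` are all assumptions made for illustration.

```python
import csv
import os
import random
import tempfile
import time


def make_csv(path, n_rows, seed=0):
    """Write a synthetic CSV with hypothetical columns 'id' and 'value'."""
    rng = random.Random(seed)
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["id", "value"])
        for i in range(n_rows):
            writer.writerow([i, rng.randint(0, 1_000_000)])


def run_benchmark(left_path, right_path, out_path):
    """Time each task separately; return (timings in seconds, joined row count)."""
    timings = {}

    # Read: load both CSV files into lists of dicts.
    start = time.perf_counter()
    with open(left_path, newline="") as f:
        left = list(csv.DictReader(f))
    with open(right_path, newline="") as f:
        right = list(csv.DictReader(f))
    timings["read"] = time.perf_counter() - start

    # Sort: order the left table by its numeric 'value' column.
    start = time.perf_counter()
    left.sort(key=lambda row: int(row["value"]))
    timings["sort"] = time.perf_counter() - start

    # Join: inner join on 'id' using a dictionary lookup (an assumed join type).
    start = time.perf_counter()
    lookup = {row["id"]: row["value"] for row in right}
    joined = [
        {"id": row["id"], "value": row["value"], "value_right": lookup[row["id"]]}
        for row in left
        if row["id"] in lookup
    ]
    timings["join"] = time.perf_counter() - start

    # Write: save the joined rows to a new CSV file.
    start = time.perf_counter()
    with open(out_path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=["id", "value", "value_right"])
        writer.writeheader()
        writer.writerows(joined)
    timings["write"] = time.perf_counter() - start

    return timings, len(joined)


def demo(n_rows=10_000):
    """Generate two synthetic files in a temp directory and benchmark them."""
    with tempfile.TemporaryDirectory() as tmp:
        left_path = os.path.join(tmp, "left.csv")
        right_path = os.path.join(tmp, "right.csv")
        out_path = os.path.join(tmp, "joined.csv")
        make_csv(left_path, n_rows, seed=1)
        make_csv(right_path, n_rows, seed=2)
        return run_benchmark(left_path, right_path, out_path)


if __name__ == "__main__":
    timings, n_joined = demo()
    for task, seconds in timings.items():
        print(f"{task}: {seconds:.3f}s")
    print(f"joined rows: {n_joined}")
```

Timing each task separately, rather than the whole run, is what makes a per-task comparison between tools possible. Calling `demo(1_000_000)` would match the row count mentioned above, although plain Python loops like these are likely to be much slower than a dedicated library.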

Easy Data Transform screenshot

Here is an overview of the results:

Time by task (seconds), on Windows without Power Query (smaller is better):

data wrangling/ETL benchmark Windows

I’ve left Excel Power Query off this graph, as it is so slow you can hardly see the other bars when it is included!

Time by task (seconds) on Mac (smaller is better):

data wrangling/ETL benchmark M1 Mac

Memory usage (MB), Windows vs Mac (smaller is better):

data wrangling/ETL benchmark memory Windows vs Mac

So Easy Data Transform is nearly as fast as its nearest competitor, Knime, on Windows and a fair bit faster on an M1 Mac. It also uses a lot less memory than Knime. However, we have got some way to go to catch up with the Pandas library for Python and the data.table package for R, when it comes to raw performance. Hopefully I can get closer to their performance in time. I was forbidden from including benchmarks for Tableau Prep and Alteryx by their licensing terms, which seems unnecessarily restrictive.

Looking at just the Easy Data Transform results, it is interesting to notice that a newish Macbook Air M1 laptop is significantly faster than an AMD Ryzen 7 desktop PC from a few years ago.

Windows vs Mac M1 benchmark

See the full comparison:

Comparison of data wrangling/ETL tools : R, Pandas, Knime, Power Query, Tableau Prep, Alteryx and Easy Data Transform, with benchmarks

