One of the benefits of “going to the Cloud” is cloud bursting: with managed services, extra compute capacity can be added on demand.
A Cloud Frontier has worked on many different Cloud architectures, and Dataiku and AWS EMR make a great combination. A Cloud Frontier is currently helping an Enterprise customer to analyse and anonymize billions of records from their core network. Dataiku handles the end-to-end process: it uses Scala and Spark SQL for the anonymization and provisions AWS EMR clusters on demand.
From start to finish, the whole process takes less than one hour and produces output that is GDPR-compliant and fully anonymized. It is fully automated and runs every 24 hours, once new core network data has been collected.
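To give a feel for what masking identifiers in a record can look like, here is a minimal Python sketch that replaces identifier fields with a keyed hash (HMAC-SHA256). It is an illustration only and not the customer pipeline, which runs in Scala and Spark on AWS EMR. The field names, the secret key and the choice of keyed hashing are all assumptions made for this example. Keyed hashing on its own is usually regarded as pseudonymization, so a real GDPR workflow would typically combine it with further measures.

```python
import hashlib
import hmac

# Hypothetical secret key; in practice it would be loaded from a secrets manager.
SECRET_KEY = b"replace-with-a-real-secret"

# Hypothetical field names; the real record schema is not described in this post.
SENSITIVE_FIELDS = ("subscriber_id", "device_id")


def mask_value(value: str, key: bytes = SECRET_KEY) -> str:
    """Return the hex HMAC-SHA256 digest of a single identifier."""
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()


def mask_record(record: dict, fields=SENSITIVE_FIELDS, key: bytes = SECRET_KEY) -> dict:
    """Return a copy of the record with the sensitive fields replaced by digests."""
    masked = dict(record)  # shallow copy, so the input record is left untouched
    for field in fields:
        if masked.get(field) is not None:
            masked[field] = mask_value(str(masked[field]), key)
    return masked


# Example record with made-up values.
record = {"subscriber_id": "123456789012345", "device_id": "abc", "bytes_up": 1024}
masked = mask_record(record)
```

The same identifier always maps to the same digest under a given key, so masked records can still be joined and counted, while non-sensitive fields such as `bytes_up` pass through unchanged.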
Contact us for more information regarding the solution using Dataiku or AWS EMR.