Big Data frameworks on an Apple M1/Silicon for educational purposes

Dear Raúl,

It’s been a while since the last time you stopped by this blog; hectic times lately but no worries, all good 🙂

This time two of your top students, Michelle Michalovski and Göktuğ Aşcı, helped you with a topic that has been hot in your classes over the last few months: the new Apple computer is giving very good results, and more and more students are buying it.

Long story short, the three of us were able to make our beloved course environment work on this fancy and, apparently, very powerful new Apple computer; as usual, we’re going to walk you through the whole process so you can reproduce it whenever you need to. Of course, this will also be useful for those who want to give it a try on their own.

About my guests

Michelle introduces herself briefly in the following words:

“During my bachelor’s in economics, I gained some consulting experience working part time at KPMG. After finishing my studies, I worked for half a year at a start-up (SME financing) based in Manila, in their data science department (mostly on the analytics side in risk). Besides that, I also used the time during the pandemic, as well as the Venture Lab at IE, to found and develop from scratch a platform that helps students better assess their admission chances.”

These are some words from Göktuğ about himself:

“I am a medical doctor and a data scientist with 4 years of full-stack, cross-platform software development experience.

During my medical training, I witnessed that hospital processes were carried out without efficiency and realized the great potential of tech-enabled health care services. Starting from my medical school education, I have been practicing Python (Django, Django Rest Framework), SQL, Apache Cordova, Vue.js, and AWS components such as RDS, S3, EC2, Comprehend Medical for four years on production-level applications. To gain more competence in the data science area, I started my master’s degree in Business Analytics and Big Data at IE University in Spain with a full scholarship that covers 100% of the fee and life expenses given to one person each year by the Turkish Education Foundation. During the master’s degree, we built production-level machine learning projects.

Currently, I am working on a passion project called Mealdoc in parallel with my education and my job search. Mealdoc is a patient data management platform for dietitians struggling to manage patients online. It provides machine learning-powered meal plan suggestions and an appointment management infrastructure. Mealdoc has 10000+ registered users; we aim at automating the meal plan creation process currently supervised by dietitians.”

What we did

I won’t elaborate much on what we did because the video at the end of the post is self-explanatory and easy to follow 🙂

In summary, we accomplished the following steps:

  1. I introduce what OSBDET, Open Source Big Data Educational Toolkit, is all about and where the motivation comes from: Building an analytics and multi data-set OVA for learners.
  2. Göktuğ walks us through the process of building an Ubuntu Server 20.04 image for arm64 processors, using UTM to run Virtual Machines on an Apple M1/Silicon-based computer.
  3. Michelle shows how to execute one of the labs from the “Stream Processing and Real-time Analytics” course.
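
For anyone who wants a quick sanity check related to step 2, here is a minimal Python sketch (not taken from the session itself) that reports whether the interpreter is running on a 64-bit ARM machine. It only relies on the standard library's `platform.machine()`, which typically reports `aarch64` on arm64 Linux and `arm64` on macOS on Apple Silicon; the helper name `is_arm64` is ours and purely illustrative.

```python
import platform

# Names commonly reported by platform.machine() on 64-bit ARM systems:
# "aarch64" on Linux guests, "arm64" on macOS hosts.
ARM64_NAMES = {"aarch64", "arm64"}


def is_arm64(machine=None):
    """Return True if the given (or current) machine name denotes 64-bit ARM.

    `machine` is an optional override, mainly useful for testing;
    by default the value comes from platform.machine().
    """
    if machine is None:
        machine = platform.machine()
    return machine.lower() in ARM64_NAMES


if __name__ == "__main__":
    # Print what the interpreter sees, then the verdict.
    print("machine:", platform.machine())
    print("arm64:", is_arm64())
```

Running it inside the Ubuntu guest and on the macOS host should, under the assumptions above, report an arm64 machine in both places, which is a cheap way to confirm the VM is virtualized rather than emulating another architecture.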

If you’re curious about this, feel free to check out the recording of the session; as you can see, we had a lot of fun making it:

Let’s see how much time it’ll take me to stop by with a new entry again 🙂

