Virtualization makes it possible to run multiple virtual machines (VMs) on a single piece of physical hardware. These VMs behave like independent computers while sharing the same physical computing power. A computer within a computer, so to speak.
Many cloud services rely on virtualization. But other technologies, such as containerization and serverless computing, have become increasingly important.
Without virtualization, many of the digital services we use every day would not be possible. Of course, this is a simplification, as some cloud services also run on bare-metal infrastructure.
In this article, you will learn how to set up your own virtual machine on your laptop in just a few minutes — even if you have never heard of cloud computing or containers before.
Table of Contents
1 — The Origins of Cloud Computing: From Mainframes to Serverless Architecture
2 — Understanding Virtualization: Why It Is the Basis of Cloud Computing
3 — What Data Scientists Should Know About Containers and VMs
4 — Create a Virtual Machine with VirtualBox
Final Thoughts
Where Can You Continue Learning?
1 — The Origins of Cloud Computing: From Mainframes to Serverless Architecture
Cloud computing has fundamentally changed the IT landscape — but its roots go back much further than many people think. In fact, the history of the cloud began back in the 1950s with huge mainframes and so-called dumb terminals.
- The era of mainframes in the 1950s: Companies used mainframes so that multiple users could access them simultaneously via dumb terminals. The central mainframes were designed for high-volume, business-critical data processing. Large corporations still use them today, even though cloud services have reduced their relevance.
- Time-sharing and virtualization: In the following decade (the 1960s), time-sharing made it possible for multiple users to access the same computing power simultaneously — an early model of today's cloud. Around the same time, IBM pioneered virtualization, allowing multiple virtual machines to run on a single piece of hardware.
- The birth of the internet and web-based applications in the 1990s: Six years before I was born, Tim Berners-Lee developed the World Wide Web, which revolutionized online communication and our entire working and living environment. Can you imagine our lives today without the internet? At the same time, PCs were becoming increasingly popular. In 1999, Salesforce revolutionized the software industry with Software as a Service (SaaS), allowing businesses to use CRM solutions over the internet without local installations.
- The big breakthrough of cloud computing in the 2010s: The modern cloud era began in 2006 with Amazon Web Services (AWS). Companies could flexibly rent infrastructure with S3 (storage) and EC2 (virtual servers) instead of buying their own servers. Microsoft Azure and Google Cloud followed with PaaS and IaaS offerings.
- The modern cloud-native era: The next innovation was containerization. Docker made containers popular in 2013, followed by Kubernetes in 2014, which simplified the orchestration of containers. Next came serverless computing with AWS Lambda and Google Cloud Functions, which enabled developers to write code that automatically responds to events. The infrastructure is completely managed by the cloud provider.
Cloud computing is the result of decades of innovation rather than a single new technology. From time-sharing to virtualization to serverless architectures, the IT landscape has continuously evolved. Today, cloud computing is the foundation for streaming services like Netflix, AI applications like ChatGPT and global platforms like Salesforce.
2 — Understanding Virtualization: Why It Is the Basis of Cloud Computing
Virtualization means abstracting physical hardware, such as servers, storage or networks, into multiple virtual instances.
Multiple independent systems can be operated on the same physical infrastructure. Instead of dedicating an entire server to a single application, virtualization enables multiple workloads to share resources efficiently. For example, Windows, Linux or another environment can run simultaneously on a single laptop — each in an isolated virtual machine.
This saves costs and resources.
Even more important, however, is scalability: infrastructure can be flexibly adapted to changing requirements.
Before cloud computing became widely available, companies often had to maintain dedicated servers for different applications, leading to high infrastructure costs and limited scalability. If more performance was suddenly required, for example because webshop traffic increased, new hardware was needed. The company had to add more servers (horizontal scaling) or upgrade existing ones (vertical scaling).
This is different with virtualization: for example, I can simply upgrade my virtual Linux machine from 8 GB to 16 GB of RAM, or assign 4 cores instead of 2. Of course, only if the underlying infrastructure supports this. More on this later.
And this is exactly what cloud computing makes possible: the cloud consists of large data centers that use virtualization to provide flexible computing power — exactly when it is needed. Virtualization is therefore a fundamental technology behind cloud computing.
How does serverless computing work?
What if you didn't even have to manage virtual machines anymore?
Serverless computing goes one step further than virtualization and containerization. The cloud provider handles most infrastructure tasks — including scaling, maintenance and resource allocation. Developers can focus on writing and deploying code.
But does serverless really mean that there are no servers at all?
Of course not. The servers are still there, but they are invisible to the user. Developers no longer have to worry about them. Instead of manually provisioning a virtual machine or container, you simply deploy your code, and the cloud automatically executes it in a managed environment. Resources are only provisioned while the code is running. Examples include AWS Lambda, Google Cloud Functions and Azure Functions.
What are the advantages of serverless?
As a developer, you don't have to worry about scaling or maintenance. This means that if there is significantly more traffic at a particular event, the resources are adjusted automatically. Serverless computing can be cost-efficient, especially in Function-as-a-Service (FaaS) models: if nothing is running, you pay nothing. However, some serverless services have baseline costs (e.g. Firestore).
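To make the pay-per-use idea concrete, here is a small back-of-the-envelope calculation in Python. The per-GB-second and per-request rates below are illustrative placeholders, not current pricing from any provider:

```python
# Back-of-the-envelope FaaS cost model (illustrative rates, not real pricing).

RATE_PER_GB_SECOND = 0.0000167  # hypothetical price per GB-second of compute
RATE_PER_REQUEST = 0.0000002    # hypothetical price per invocation

def monthly_faas_cost(requests: int, avg_duration_s: float, memory_gb: float) -> float:
    """Cost = compute time billed in GB-seconds, plus a small per-request fee."""
    gb_seconds = requests * avg_duration_s * memory_gb
    return gb_seconds * RATE_PER_GB_SECOND + requests * RATE_PER_REQUEST

# 1 million requests per month, 200 ms each, 512 MB of memory:
print(f"{monthly_faas_cost(1_000_000, 0.2, 0.5):.2f} USD")
# Zero traffic really does mean zero compute cost:
print(f"{monthly_faas_cost(0, 0.2, 0.5):.2f} USD")
```

The point of the model is the second call: with no invocations, the bill for the function itself is zero, which is exactly what a permanently running server cannot offer.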
Are there any disadvantages?
You have much less control over the infrastructure and no direct access to the servers. There is also the risk of vendor lock-in, as the applications become strongly tied to one cloud provider.
A concrete example of serverless: an API without your own server
Imagine you have a website with an API that provides users with the current weather. Normally, a server would run around the clock — even at times when nobody is using the API.
With AWS Lambda, things work differently: a user enters 'Mexico City' on your website and clicks 'Get weather'. This request triggers a Lambda function in the background, which retrieves the weather data and sends it back. The function then stops automatically. This means you don't have a permanently running server and no unnecessary costs — you only pay while the code is executing.
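A minimal sketch of such a Lambda function might look like this. Note that `fetch_weather` is a stand-in for a call to a real weather API, and the event shape assumes the function sits behind an API Gateway style HTTP trigger:

```python
import json

def fetch_weather(city: str) -> dict:
    """Stand-in for a real weather API call; returns canned data here."""
    return {"city": city, "temperature_c": 22, "condition": "sunny"}

def lambda_handler(event, context):
    """Entry point that AWS Lambda would invoke for each incoming request."""
    # API Gateway puts URL query parameters under "queryStringParameters".
    params = event.get("queryStringParameters") or {}
    city = params.get("city", "Mexico City")
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps(fetch_weather(city)),
    }

# Local smoke test -- no AWS account needed to try out the handler logic:
response = lambda_handler({"queryStringParameters": {"city": "Mexico City"}}, None)
print(response["statusCode"])  # 200
```

Between requests, no process is running at all; Lambda spins the handler up on demand and tears it down afterwards, which is where the cost savings come from.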
3 — What Data Scientists Should Know About Containers and VMs: What's the Difference?
You have probably heard of containers. But what is the difference from virtual machines — and what is particularly relevant for a data scientist?
Both containers and virtual machines are virtualization technologies.
Both make it possible to run applications in isolation.
Both offer advantages depending on the use case: while VMs provide strong security, containers excel in speed and efficiency.
The main difference lies in the architecture:
- Virtual machines virtualize the entire hardware — including the operating system. Each VM has its own operating system (OS), which in turn requires more memory and resources.
- Containers, on the other hand, share the host operating system and only virtualize the application layer. This makes them considerably lighter and faster.
Put simply, virtual machines simulate entire computers, while containers only encapsulate applications.
Why is that this necessary for knowledge scientists?
Since as an information scientist you’ll come into contact with machine studying, knowledge engineering or knowledge pipelines, it is usually necessary to know one thing about containers and digital machines. Certain, you don’t have to have in-depth data of it like a DevOps Engineer or a Website Reliability Engineer (SRE).
Digital machines are utilized in knowledge science, for instance, when a whole working system atmosphere is required — similar to a Home windows VM on a Linux host. Knowledge science initiatives usually want particular environments. With a VM, it’s attainable to offer precisely the identical atmosphere — no matter which host system is out there.
A VM can also be wanted when coaching deep studying fashions with GPUs within the cloud. With cloud VMs similar to AWS EC2 or Azure Digital Machines, you will have the choice of coaching the fashions with GPUs. VMs additionally fully separate completely different workloads from one another to make sure efficiency and safety.
Containers are utilized in knowledge science for knowledge pipelines, for instance, the place instruments similar to Apache Airflow run particular person processing steps in Docker containers. Because of this every step might be executed in isolation and independently of one another — no matter whether or not it includes loading, remodeling or saving knowledge. Even if you wish to deploy machine studying fashions by way of Flask / FastAPI, a container ensures that all the things your mannequin wants (e.g. Python libraries, framework variations) runs precisely because it ought to. This makes it tremendous straightforward to deploy the mannequin on a server or within the cloud.
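As a sketch of that idea, a minimal Dockerfile for a hypothetical FastAPI model service could look like this. The file names `requirements.txt` and `app/main.py` are assumptions for the example, not a fixed convention:

```dockerfile
# Minimal image for a hypothetical FastAPI model service
FROM python:3.12-slim

WORKDIR /app

# Pin the model's dependencies (e.g. fastapi, uvicorn, scikit-learn)
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Copy the application code (assumed to live in app/main.py)
COPY app/ ./app

# Serve the model with uvicorn on port 8000
CMD ["uvicorn", "app.main:app", "--host", "0.0.0.0", "--port", "8000"]
```

Because the base image and the pinned dependencies are baked into the image, the container behaves the same on your laptop, a server or in the cloud.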
4 — Create a Virtual Machine with VirtualBox
Let's make this a little more concrete and create an Ubuntu VM. 🚀
I use VirtualBox on my Windows Lenovo laptop. The virtual machine runs in isolation from your main operating system, so no changes are made to your actual system. If you have the Windows Pro edition, you can also enable Hyper-V (pre-installed by default, but disabled). On an Intel Mac, you should also be able to use VirtualBox. On Apple Silicon, Parallels Desktop or UTM is apparently the better alternative (I haven't tested this myself).
1) Install VirtualBox
The first step is to download the installer from the official VirtualBox website and install VirtualBox. It is installed together with all the necessary drivers.
You can ignore the notice about the missing Python Core / win32api dependencies as long as you do not want to automate VirtualBox with Python scripts.
Then we start the Oracle VirtualBox Manager:

2) Download the Ubuntu ISO file
Next, we download the Ubuntu ISO file from the Ubuntu website. An Ubuntu ISO file is a compressed image file of the Ubuntu operating system. This means it contains a complete copy of the installation data. I download the LTS version, because it receives security and maintenance updates for 5 years (Long Term Support). Note the location of the .iso file, as we will use it later in VirtualBox.
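It is good practice to verify the downloaded ISO against the SHA-256 checksum that Ubuntu publishes on its download page. A small Python sketch; the ISO file name and the expected hash below are placeholders you would replace with your own values:

```python
import hashlib
from pathlib import Path

def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Compute the SHA-256 hex digest of a file, reading it in 1 MB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        while chunk := f.read(chunk_size):
            digest.update(chunk)
    return digest.hexdigest()

# Placeholders -- use your actual ISO path and the checksum
# published on the Ubuntu release page:
iso_path = "ubuntu-24.04-desktop-amd64.iso"
expected = "<checksum from the Ubuntu website>"

if Path(iso_path).exists():
    print("OK" if sha256_of(iso_path) == expected else "Checksum mismatch!")
```

If the digests match, the download was neither corrupted nor tampered with.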

3) Create a virtual machine in VirtualBox
Next, we create a new virtual machine in the VirtualBox Manager and give it the name Ubuntu VM 2025. Here we select Linux as the type and Ubuntu (64-bit) as the version. We also select the previously downloaded Ubuntu ISO file as the ISO image. It would also be possible to add the ISO file later in the mass storage menu.

Next, we choose a username, vboxuser2025, and a password for access to the Ubuntu system. The hostname is the name of the virtual machine within the network or system; it must not contain any spaces. The domain name is optional and would be used if the network has multiple devices.
We then assign the appropriate resources to the virtual machine. I choose 8 GB (8192 MB) of RAM, as my host system has 64 GB; I recommend at least 4 GB (4096 MB). I assign 2 processors, as my host system has 8 cores and 16 logical processors. It would also be possible to assign 4 cores, but this way I have enough resources left for my host system. You can find out how many cores your host system has by opening the Task Manager in Windows and looking at the number of cores under CPU in the Performance tab.
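If you prefer not to dig through the Task Manager, a Python one-liner works on any host OS. Note that `os.cpu_count()` reports logical processors, not physical cores, so the number may be higher than the core count:

```python
import os

# Number of logical processors the host exposes (e.g. 16 on an
# 8-core CPU with hyper-threading); the physical core count may be lower.
print(os.cpu_count())
```

A common rule of thumb is to leave at least half of these for the host when assigning CPUs to a VM.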

Next, we click on 'Create a virtual hard disk now' to create a virtual hard disk. A VM requires its own virtual hard disk to install the OS (e.g. Ubuntu, Windows). All programs, files and configurations of the VM are stored on it — just like on a physical hard disk. The default value is 25 GB. If you want to use the VM for machine learning or data science, more storage space (e.g. 50–100 GB) would be useful to leave room for large data sets and models. I keep the default setting.
We can then see that the virtual machine has been created and can be used:

4) Use the Ubuntu VM
We can now use the newly created virtual machine like a normal, separate operating system. The VM is completely isolated from the host system, which means you can experiment in it without changing or jeopardizing your main system.
If you are new to Linux, you can try out basic commands like ls, cd, mkdir or sudo to get to know the terminal. As a data scientist, you can set up your own development environments and install Python with Pandas and scikit-learn to develop data analysis and machine learning models. Or you can install PostgreSQL and run SQL queries without having to set up a local database on your main system. You can also use Docker to create containerized applications.
Final Thoughts
Since the VM is isolated, we can install programs, experiment and even break the system without affecting the host system.
Let's see whether virtual machines remain as relevant in the coming years. As companies increasingly adopt microservice architectures (instead of monoliths), containers with Docker and Kubernetes will certainly become even more important. But knowing how to set up a virtual machine and what it is used for is definitely useful.
I simplify tech for curious minds. If you enjoy my tech insights on Python, data science, data engineering, machine learning and AI, consider subscribing to my Substack.