For businesses that want to make the most of their data, moving to the cloud can be a daunting undertaking. But the cloud is the basis for the modern data warehouse, which lets you bring together all your data in one easily accessible place to gain insights through analytical dashboards, operational reports and advanced analytics. It’s not a matter of if you’ll make the move, but when.
As Microsoft® moves away from upgrading on-premises software and databases, we’re seeing more businesses make the leap to Azure, Microsoft’s public cloud computing platform. This version of the cloud is already replacing on-premises software because it’s fast and easy to use, offers scalable and reliable cloud storage, and gives businesses of all sizes the backup and recovery solutions they need.
Azure provides a range of cloud solutions that can be used for services such as analytics, virtual computing, storage, networking, and much more. Users can pick and choose from these services to develop and scale new applications, or run existing applications.
Easy to use – Microsoft configures everything, including the servers, for your business.
Scalable – You can scale up or down quickly and easily as the needs of your business evolve and change.
Robust – Azure offers more tools and features than on-premises software systems.
Freedom – Because Microsoft manages the underlying IT environment, Azure frees up your employees to focus on their core jobs.
Cost-effective – Only pay for what you use, with fewer costs related to the infrastructure, maintenance and management of your data.
Secure – Microsoft’s advanced security technology protects your data and helps you monitor threats in real time, identify and react to suspicious user and device activity on your network, and provide the highest levels of access security.
Reliable – With data centers located all over the world, Azure offers 99.95% uptime – less than five hours of downtime per year.
How can my business use Azure?
We often see businesses using Azure to manage high volumes of data, especially unstructured data. Azure is flexible to meet a wide variety of business data needs:
- Cluster data and transform that data for a SQL data warehouse or database
- Set up Power BI analytics and reports
- Gain insights from your data
- Connect devices to the Internet and harness the power of your IoT solutions
- Host, develop or manage web or mobile apps
- Back up your data and recover from disasters
The major components of Azure
In Azure, data collection is separated from data preparation and delivery. The source data is stored in archive files (Blob Storage, Data Lake), and that file data supports many delivery methods and analysis scenarios, including data warehousing, Power BI, and machine learning via Azure ML or Databricks.
There are still similarities to on-prem systems, but Azure continuously rolls out new features and capabilities.
The Azure modern data warehouse comprises six products that flow information through five phases, indicated by the green circles in the image below. We break this down into two phases to simplify and clarify the process:
Phase I: Data Collection
This phase covers how we pull data into Azure.
- Ingest – Extracting data (structured, unstructured, and semi-structured) out of your sources using Data Factory and Azure Logic Apps to Azure Blob Storage
Data Factory is Azure’s data movement and data transformation ETL/ELT tool, similar to on-prem SSIS. It is used to combine all your structured (relational), unstructured (NoSQL databases) and semi-structured data (logs, files, and media). It has built-in connectors to Salesforce, ServiceNow, Dynamics, and more.
This phase also uses Logic Apps, an Azure tool that lets developers visually create workflows and connect disparate systems with very little code, easing support.
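To make the ingest step concrete, here is a minimal sketch of what a Data Factory copy pipeline definition might look like, written as the Python equivalent of the JSON Azure Data Factory uses. The pipeline, activity, and dataset names are invented for illustration, not real resources.

```python
# Hypothetical sketch of a Data Factory copy pipeline, expressed as the
# Python equivalent of the JSON definition Data Factory works with.
# All names here are illustrative, not real Azure resources.

def build_copy_pipeline(source_dataset: str, sink_dataset: str) -> dict:
    """Return a pipeline definition that copies a source dataset to Blob Storage."""
    return {
        "name": "IngestToBlob",
        "properties": {
            "activities": [
                {
                    "name": "CopySourceToRawZone",
                    "type": "Copy",
                    "inputs": [{"referenceName": source_dataset}],
                    "outputs": [{"referenceName": sink_dataset}],
                }
            ]
        },
    }

pipeline = build_copy_pipeline("SalesforceAccounts", "RawZoneBlob")
print(pipeline["properties"]["activities"][0]["type"])  # Copy
```

In practice you would author this in the Data Factory visual designer or deploy it as JSON; the point is simply that each copy activity names a source dataset and a sink dataset.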
- Store – Use Azure Blob Storage (media files, pictures, movies, images, etc.) or Data Lake Store to store the data untransformed in its original format, keeping all the fields, so you can do future analysis. By storing the data in its raw format, you maintain the option to support future analytical cases and support different audiences.
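One practical way to keep raw data organised for future analysis is a date-partitioned naming convention for the landing zone. The sketch below shows one such convention in plain Python; the container layout and source names are assumptions, not an Azure standard.

```python
# Illustrative sketch of a date-partitioned naming convention for landing
# raw, untransformed files in Blob Storage or Data Lake Store.
# The "raw/" zone name and source/entity names are hypothetical.
from datetime import date

def raw_zone_path(source: str, entity: str, load_date: date, ext: str = "json") -> str:
    """Build a raw-zone path like raw/salesforce/account/2024/01/15/account.json."""
    return (
        f"raw/{source}/{entity}/"
        f"{load_date.year:04d}/{load_date.month:02d}/{load_date.day:02d}/"
        f"{entity}.{ext}"
    )

print(raw_zone_path("salesforce", "account", date(2024, 1, 15)))
# raw/salesforce/account/2024/01/15/account.json
```

Partitioning by load date keeps every day’s extract intact in its original format, so later consumers can replay or re-transform history without touching the source systems.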
In the past, the “classic” approach was a top-down approach where we start with the business case and then build out the solution, extract the data, transform it and keep only what’s required.
This new “modern” approach is a bottom-up approach keeping all the raw data for future storytelling and experimenting. The requirements are less defined, enabling the data to support many different scenarios (data warehouse, Power BI, machine learning, predictive analytics, etc.) and fulfill the needs and support of different audiences.
Phase II: Analysis, Preparation, Delivery
In this stage, you analyze the data and decide how you want to present it to consumers.
- Prep and train – Data Science: At this level, data scientists can use the data for analysis – the key feature of a modern data warehouse. This allows you to perform analysis directly on the data files or use the Azure data warehouse. This is where the applications come together, and data scientists can use many different tools and technology options – such as Azure Databricks, an Apache Spark-based analytics platform, HIVE, Azure ML, or Python – for deep learning and streaming applications.
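As a toy illustration of the prep step, the plain-Python function below stands in for the kind of cleaning and shaping you would typically run at scale in Azure Databricks with Spark. The field names and rules are invented for the example.

```python
# A toy prep step in plain Python, standing in for the cleaning and shaping
# you would normally run at scale in Azure Databricks (Spark).
# Field names and cleaning rules are invented for illustration.

def prep_records(raw_records):
    """Drop rows missing a customer id and normalise the amount field to float."""
    cleaned = []
    for rec in raw_records:
        if not rec.get("customer_id"):
            continue  # unusable for training or reporting
        cleaned.append({
            "customer_id": rec["customer_id"],
            "amount": float(rec.get("amount", 0) or 0),
        })
    return cleaned

raw = [
    {"customer_id": "C1", "amount": "19.99"},
    {"customer_id": None, "amount": "5.00"},   # dropped: no id
    {"customer_id": "C2"},                     # missing amount -> 0.0
]
print(prep_records(raw))
```

Because the raw files were stored untransformed, a data scientist can rerun a different version of this prep logic against the same source data for a new scenario.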
- Model and serve – Enterprise Data Storage: This is where you store the data. It’s still necessary to have a data warehouse, which is a key component for setting up the star schema. You can do this in either Azure SQL Data Warehouse or Azure SQL Database, depending on cost and workload volume.
Azure SQL Database is the easiest way to create a “Database-as-a-Service” and is usually the preferred method, unless you’re dealing with large volumes of data.
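To show what a star schema looks like, here is a minimal sketch using SQLite in place of Azure SQL Database or Azure SQL Data Warehouse: one fact table joined to two dimension tables. Table and column names are illustrative.

```python
# Minimal star-schema sketch: SQLite stands in for Azure SQL Database /
# Azure SQL Data Warehouse. One fact table references two dimension tables.
# All table and column names are illustrative.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dim_customer (customer_key INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE dim_date     (date_key INTEGER PRIMARY KEY, iso_date TEXT);
    CREATE TABLE fact_sales (
        customer_key INTEGER REFERENCES dim_customer(customer_key),
        date_key     INTEGER REFERENCES dim_date(date_key),
        amount       REAL
    );
    INSERT INTO dim_customer VALUES (1, 'Contoso'), (2, 'Fabrikam');
    INSERT INTO dim_date VALUES (20240115, '2024-01-15');
    INSERT INTO fact_sales VALUES (1, 20240115, 100.0), (2, 20240115, 250.0);
""")

# A typical dashboard query: total sales per customer via the dimension table.
rows = conn.execute("""
    SELECT c.name, SUM(f.amount)
    FROM fact_sales f
    JOIN dim_customer c ON c.customer_key = f.customer_key
    GROUP BY c.name
    ORDER BY c.name
""").fetchall()
print(rows)  # [('Contoso', 100.0), ('Fabrikam', 250.0)]
```

The same fact/dimension layout is what the semantic layer and Power BI reports in the next steps are built on top of.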
- Semantic layer: This uses Azure Analysis Services to build operational reports and analytical dashboards on top of the Azure SQL Database or Azure SQL Data Warehouse.
- Data delivery: This is your data delivery method, where you can run ad hoc queries directly on the data within Azure Databricks; you can even use Power BI to source data from Databricks.
In this phase, you can also use PolyBase, which allows you to combine relational and non-relational data. Using T-SQL queries, you can query external data in Hadoop or Azure Blob Storage, and you can also import and export data from Azure Blob Storage, Data Lake Storage and Hadoop.
Two things to remember as you migrate to Azure
- A good foundation still matters. That is, you still need a solid presentation layer and semantic layer foundation for everything.
- Clean data is king. Meaningful, cleansed data is critical for performing statistical analysis or predictive analytics.
The bottom line: The sky’s the limit with Azure.