I am testing out Azure Databricks for some data science research. So what is Azure Databricks? According to Microsoft's documentation:
Azure Databricks is an Apache Spark-based analytics platform optimized for the Microsoft Azure cloud services platform. Designed with the founders of Apache Spark, Databricks is integrated with Azure to provide one-click setup, streamlined workflows, and an interactive workspace that enables collaboration between data scientists, data engineers, and business analysts.
I am always a bit confused by Azure accounts. I believe that if you have a private account, you can:
- set up a 14-day trial, where you get unlimited usage of Azure Databricks
- get $200 of credit that can be used for 30 days (this matters mostly from days 15 to 30, once the 14-day Databricks trial has ended)
- get one year of free services (not clear what this will cover)
I will update this and pricing as I get a better sense of how this works.
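To make the timeline concrete, here is a minimal sketch of how the two offers would overlap. It assumes that both the 14-day trial and the 30-day credit start on the same sign-up date and that the sign-up day counts as day 1; I have not confirmed either assumption with Azure. The function name `trial_windows` and the example date are made up for illustration.

```python
from datetime import date, timedelta


def trial_windows(signup: date) -> dict:
    """Return assumed (start, end) dates for each offer.

    Assumption: both offers start on the sign-up date, which counts as day 1.
    """
    trial_end = signup + timedelta(days=13)   # day 14 of the Databricks trial
    credit_end = signup + timedelta(days=29)  # day 30 of the $200 credit
    return {
        "databricks_trial": (signup, trial_end),
        "credit": (signup, credit_end),
        # Days 15 to 30: the trial is over, so usage would draw on the credit.
        "credit_only": (trial_end + timedelta(days=1), credit_end),
    }


# Hypothetical sign-up date, used only to illustrate the arithmetic.
windows = trial_windows(date(2019, 1, 1))
for name, (start, end) in windows.items():
    length = (end - start).days + 1
    print(f"{name}: {start} to {end} ({length} days)")
```

Under these assumptions, the period in which the credit is the only thing paying for Databricks usage is 16 days long.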
To be completed
Once you have set up your free trial:
- Navigate to the Azure Databricks blade: https://portal.azure.com/#blade/HubsExtension/Resources/resourceType/Microsoft.Databricks%2Fworkspaces
- Create/add a Databricks workspace in a new resource group.
- Launch Workspace (the Databricks workspace that you just created)
- If your sign-in hangs while trying to launch the Databricks workspace, you can either switch to an incognito browser window or use Internet Explorer.
- Download a sample notebook from https://databricks.com/resources/type/example-notebook
- Import this notebook into your workspace
- Select “create table in notebook”
- Attach your notebook to a cluster (you need to create one)
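I created my cluster through the workspace UI, but the last step can probably be scripted as well. Below is a rough sketch that only builds (and does not send) a request for the Databricks Clusters REST API endpoint `/api/2.0/clusters/create`. The helper name `build_create_cluster_request`, the workspace URL, the token, and the `<spark-version>` and `<node-type-id>` values are all placeholders that I made up for illustration; you would need to substitute the values that your own workspace offers.

```python
import json
import urllib.request


def build_create_cluster_request(workspace_url, token, cluster_name,
                                 spark_version, node_type_id,
                                 num_workers=1, autotermination_minutes=30):
    """Build (but do not send) a POST request that asks Databricks for a cluster."""
    payload = {
        "cluster_name": cluster_name,
        "spark_version": spark_version,   # placeholder: pick a runtime your workspace lists
        "node_type_id": node_type_id,     # placeholder: pick a VM size your workspace lists
        "num_workers": num_workers,
        # Shut the cluster down when idle so it stops consuming trial credit.
        "autotermination_minutes": autotermination_minutes,
    }
    return urllib.request.Request(
        url=workspace_url.rstrip("/") + "/api/2.0/clusters/create",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",  # personal access token
            "Content-Type": "application/json",
        },
        method="POST",
    )


# All values below are hypothetical placeholders, not real settings.
request = build_create_cluster_request(
    workspace_url="https://example.azuredatabricks.net",
    token="<personal-access-token>",
    cluster_name="research-cluster",
    spark_version="<spark-version>",
    node_type_id="<node-type-id>",
)
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

Passing the request to `urllib.request.urlopen(request)` would actually submit it, which I have left out here because it needs a real workspace URL and token. A small worker count and a short auto-termination time seem like sensible defaults while on the trial.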