Learn Machine Learning with Datacamp (on the Side)

I have worked with Machine Learning off and on for the last couple of years. Unfortunately, it is not part of my day job, so I need to find ways to integrate machine learning into an already busy day and week.

I stumbled across Datacamp, and so far I love what I see. After creating a free account in about 10 seconds, I was watching my first sub-10 minute video. Yes, no biggie there, but that is where it gets great: Datacamp starts providing an interactive R session, along with detailed instructions and readily available datasets, where you can actually start doing your very own work – no software required. Pretty impressive.


I would love to know more about Datacamp: who are the founders, what is their business model, and how did they get so awesome? If you know the founders, please drop a comment below.

Thanks, and if you end up testing it out let me know what you think!


Mongo v Cassandra

Here is a post from my good friend, Dharshan Rangegowda, Founder and CEO at Scalegrid. Check it out. And if you are looking to host Mongo on AWS, give him a shout. He will take care of you!


Deploying TimeXtender on Azure

When I was an IT contractor for the Marine Corps in Washington DC, the summers could get pretty hot. The only respite from the heat was, you guessed it, the server room. Sometimes, I would just fire up one of the servers and do some research on a web browser to bask in that high-powered AC (please do not tell anybody). The biggest problem from my IT worker perspective was that when the AC went down or some other piece of hardware decided to die over the weekend, one of us would have to drive in and fix it.

Fast forward 10 years, and the days of the boutique server room seem to be drawing to a close. Companies are moving to public and private cloud at an accelerated rate. There are a number of benefits to moving to the cloud: shifting capex to opex, just-in-time capacity, and not having to come in on the weekend when the air conditioner goes down.

But what does that mean for on-premise software – software that used to operate inside that soon-to-be-shut-down boutique server room? One trend I am seeing is that companies are shifting on-premise software to the cloud with a Virtual Machine (VM) architecture. Granted, there are differences between a purpose-built cloud offering and a VM-based cloud offering, but it seems that the VM architecture is the optimal approach for companies that have invested time and money creating valuable on-premise software.

With a little bit of creativity, companies can realize all of the key benefits of the of the cloud (e.g., no on-premise footprint, opex, just-in-time capacity, subscription-/consumption-based pricing, and not having to come in on the weekend when the air conditioner goes down), and these benefits are realized without having to throw away a valuable piece of software that is already delivering value to customers.

Deploying TimeXtender on an Azure VM

Let’s get to the Technical Part. I assume that you already have an Azure account. The first step is to create a new SQL Server VM (not a SQL Database…that is for Azure SQL, Microsoft’s SQL as a Service offering). I chose SQL Server 2014 SP1 Standard on Windows Server 2012 R2.


Accept the defaults when you create this VM, but make sure that you write down the username and password that you use to create the VM, because you will log into the VM with this information. I did use a resource group instead of the Classic VM, because I think that is the direction Microsoft is heading.

For the VM sizing, I chose DS1_V2 mostly from a cost perspective. It costs me $104.16 a month, so I can leave it spinning on the cloud for demos without burning through too much budget.  This is not going to handle monster workloads, but particularly if you are just exploring TimeXtender and building super fast data warehouses/data discovery hubs, it will have enough muscle for your needs.

Azure VM Sizing for TimeXtender

For the SQL Settings, select Public – assuming that you are going to want to consume data from your data discovery hub from other machines (e.g., a Qlik application). For the port, you can stick with 1433 or you can be sneaky (just be sure to remember your sneakiness when trying to connect to your database).

Install TimeXtender Software

The easiest way to get the software installed on the VM is to download it to your desktop and then drag it onto the VM: https://support.timextender.com/hc/en-us/articles/210439583-Current-Release-of-TX-DWA (chose 64-bit server version).

For full details on the installation process, http://go.timextender.com/tx-dwa-trial-dl

Connect to a Data Source + Create Data Structures

Now let’s get started. My preferred approach is to grab a copy of the Data Discovery Hub template, Import the Template as a new project, and Run the Connection Wizard.

Let’s start by importing the project. First, download the Data Discovery Hub template at https://dl.dropboxusercontent.com/u/19691948/DDH_Demo.XML. Place it on the desktop of your VM. Then open TimeXtender and select Import/Export, and then Import Project, like this:

Import TimeXtender Solution Template


As soon as you import the project, run the connection wizard. You will see that there are 4 different connections: ODS, Stage, MDW, and Dynamics. Dyanamics represents an existing Dynamics database. The other 3 (ODS, Stage, MDW) are part of the data discovery hub. This is the data repository that we are going to create to store our Dyanamics data.

Run TimeXtender Connection Wizard

In short, the data discovery hub includes the ODS (Operational Data Store) where we land non-transformed data right out of the source, the Stage Data Warehouse where we do our transformations and denormalizations, and the MDW (or Modern Data Warehouse) where we house the “business ready” data. Power Users can be granted access to the ODS so that they have access to all data across the enterprise and can conduct unfettered data discovery. The MDW is used to populate Qlik (or other data visualization/front-end systems). The data discovery hub is the most effective way that I have ever seen to get world class data to all employees within a company.

TimeXtender Data Discovery Hub

But I digress…let’s get back to setting up our system. When we run the wizard, it will first ask us where we want to store our ODS. Keep the defaults, but you have to click the “Create” button before hitting OK to actually create this database on your machine. Once you hit “Create” then you can hit OK.

Next, the wizard prompts us to connect to a Dynamics database. You can either connect to your own Dynamics database, or you can connect to a sample database we are hosting on Azure. To connect to our Dynamics database, the server name is txvm.westus.cloudapp.azure.com, the SQL Username is TxSQL, and the SQL Password is TimeXtenderRocks! (with an exclamation point!). Your wizard should look something like the image below.

Connect to Remote Data Source

Notice that there is no option to create this database, because we are pulling data from an existing data source, not creating a repository where we can store data.

Run through MDW and Stage (do not forget to click Create for ODS, MDW, and Stage!), and then click the Test Connections button on the wizard. Everything should be green, like shown below.

TimeXtender Connection Wizard Green

For starters, I have just created the Data Discovery Hub architecture, and I will leave it to you to pull tables into your ODS, structure tables from your ODS into business ready data in your Stage DW, and then present business ready data to your business users in the MDW.

If you have any problems with any of these, let me know!



Creating a Map of Your Travels with Power BI

I have been hearing a lot of great things about Power BI recently, so I decided to take it for a spin. We use Hubspot at work for our CRM, and Hubspot does not have meaningful reporting (sorry Hubspot). For example, you cannot create a report for all of your active deals in a particular location (e.g., to figure out who you should visit during a trip), because deals do not have a location – they have an associated company that has a location, but there is no way to “chain” (or denormalize) data in Hubspot.

Download Power BI

Instead of lamenting, I decided to get to work with Power BI. Simply head to this URL: https://powerbi.microsoft.com/en-us/get-started/ and select “Download” on the left.


A minute or so later, and I was ready to roll. Now, what I did was export my Deals data and then my Companies data from Hubspot, import this into Power BI, associate the Company>Name and Deal>Associated Company, and then add these to a map. Super easy, but too much for an intro to Power BI, so let’s just create some interesting data that we can visualize with Power BI.

Enter Your Travel Data

After my download completed, I opened Power BI and clicked on the Enter Data icon on the Home tab at the top.


This allowed me to enter the countries I visited, and the years I visited those countries, into an Excel like field.


Click on the Load button and I am ready to go.

Add Data to a Map

There are 2 types of maps in Power BI: bubble maps (or point maps) and fill maps (and shape maps). Microsoft has provided an excellent primer for how to add these maps to your Power BI application. I went with a bubble map, and simply dragged the Country column from Table 1 to the Location field and ended up with a map that looks like this:


How Did I Add Smarter Labels to Power BI Maps?

For all that I love about Power BI, this seemed to be a weakness: you cannot add content to the text of the pin. For example, when I hover over Ecuador, all I see is Ecuador. I cannot add the year that I visited Ecuador in any logical way. What I did to overcome this was to drag the year into the “color saturation” column, and this adds the year to the dropped pin hover text (it also changes the color intensity, but not in a way that is noticeable). You should end up with a view like this when you hover over Ecuador:


Finally, it is always good to have a listing of your data outside of your map. I did this by clicking on the table field and then adding the Year and the Country (note: I do not think that you can chose the order of these fields, so if you chose Country as the first column you will end up with an alphabetic listing of countries, and there is nothing you can do to change this…and that is why I put year first so that Power BI defaults to order by year…which kind of makes sense here)


And that is a pretty fun map that you can go ahead and share on social media, like I did here:


How to Change Data in Power BI

Everything is easy once you know how to do it, but when I first tried to change the data that I entered into Power BI’s “Enter Data” interface, I was a bit stuck. Here is what I did: click on the elipses next to Table1 on the right side under Fields, select Edit Query, and on the right side of the modal window that pops up, I selected the gear under Source, which is under Applied steps. Seems a bit tricky, so if you have a better way, lemme know!

Elipses   Source

Initial Impression of Power BI

Overall, I am impressed with Power BI. It is powerful, and it is affordable, and Microsoft has made it very easy to experiment with. If you end up creating a map of the places you have been, let me know so I can check it out!