Up and running with TensorFlow on Windows

I was having some problems installing TensorFlow the way it is described on the website: https://www.tensorflow.org/install/install_windows

I tried both pip and Anaconda, and failed.

The pip installation gave me this error:

Importing the multiarray numpy extension module failed. Most
likely you are trying to import a failed build of numpy.
If you’re working with a numpy git repo, try `git clean -xdf` (removes all
files not under version control). Otherwise reinstall numpy.

The Anaconda installation gave me this error message:

tensorflow-1.2.1-cp35-cp35m-win_amd64.whl is not a supported wheel on this platform

and later

ImportError: No module named 'tensorflow'

I then tried pip again, using this:

pip install tensorflow

As long as I used an Anaconda command prompt, everything worked like a (sort of) charm. You can test it out by running:

activate tensorflow

python

and then running this in the Python console:

>>> import tensorflow as tf
>>> hello = tf.constant('Hello, TensorFlow!')
>>> sess = tf.Session()
>>> print(sess.run(hello))
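
Since the errors above came from TensorFlow being installed into one environment and imported from another, it can help to check which interpreter is running and whether it can see the package before importing it. Here is a minimal sketch using only the standard library; the helper names `is_installed` and `describe_environment` are mine, not part of TensorFlow or Anaconda.

```python
import importlib.util
import sys


def is_installed(module_name):
    """Return True if the current interpreter can locate a top-level module."""
    return importlib.util.find_spec(module_name) is not None


def describe_environment(module_name="tensorflow"):
    """Report whether the module is visible and which python.exe was asked."""
    status = "found" if is_installed(module_name) else "NOT found"
    return "{} {} by {}".format(module_name, status, sys.executable)


if __name__ == "__main__":
    # Run this from the same prompt (and activated environment) you plan to use.
    print(describe_environment())
```

If the path printed is not inside the environment you activated, pip most likely installed TensorFlow somewhere else.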

Not sure if one of the original 15 attempts set this one up for success, but if you are having a hard time getting TensorFlow installed on your Windows machine, this might be the trick for you as well.

Regards,

Matt

New Product Development Process

There is no one-size-fits-all process for new product development, but I liked this one:
[Image: New Product Development process]

I saw it while reading CTL.SC1x Key Concepts, MITx MicroMasters in Supply Chain Management (link), and I believe that it is extracted from “Cooper, Robert (2001) Winning at New Products.”

I am hoping to do a write-up on how this conventional product development pipeline struggles to add value to an enterprise. The primary challenge is connecting an unpredictable, non-linear invention process with a corporate framework that requires predictability. More to follow soon.

Setting up an Azure Data Lake and Azure Data Factory using PowerShell

Login-AzureRmAccount
#first ensure that you have an Azure Data Lake that you want to use for ODX
#$resourceGroups = Get-AzureRmResourceGroup
#$azureDataLakeNames = "";
# foreach ($resourceGroup in $resourceGroups) {
# $azureDataLake = Get-AzureRmDataLakeStoreAccount -ResourceGroupName $resourceGroup.ResourceGroupName
#$azureDataLake
# $azureDataLakeName = $azureDataLake.Name
# $azureDataLakeNameLength = $azureDataLakeName.Length
# $azureDataLakeNameLength -gt 0
# if ($azureDataLakeNameLength -gt 0) {
# $azureDataLakeNames += " " + $azureDataLake.Name + " (resource group: " + $resourceGroup.ResourceGroupName + " & location: " + $resourceGroup.Location + ")"
# }
# }
# "-----------"
#"DataLakeNames: " + $azureDataLakeNames
#-------------------------------------------------------------------------------
#-------------------------------------------------------------------------------
#REQUIRED: you must enter a unique appname which will be used as the security principal
$appname = "inventose"
#OPTIONAL: change the password for the security principal password
$password = "Xyzpdq"
#run the above script, and replace DATALAKESTORENAME with the appropriate name/rg/location from your existing data lake store; or enter a new name to have a data lake created
$dataLakeStoreName = $appname
$odxResourceGroup = $appname
$dataLakeLocation = "Central US" #Central US, East US 2, North Europe
#recommended to use the same resource group as the data factory for simplicity, but you can use any resource group or enter a new name to create
$dataFactoryResourceGroup = $odxResourceGroup
#specify where you want your data factory - current options are East US, North Europe, West Central US, and West US
$dataFactoryLocation = "West US"
#-------------------------------------------------------------------------------
#-------------------------------------------------------------------------------
#create odxResourceGroup, if it does not exist
Get-AzureRmResourceGroup -Name $odxResourceGroup -ErrorVariable notPresent1 -ErrorAction 0
if ($notPresent1)
{
New-AzureRmResourceGroup -Location $dataLakeLocation -Name $odxResourceGroup
}
#create data lake, if it does not exist
Get-AzureRmDataLakeStoreAccount -Name $dataLakeStoreName -ErrorVariable notPresent2 -ErrorAction 0
if ($notPresent2)
{
New-AzureRmDataLakeStoreAccount -Location $dataLakeLocation -Name $dataLakeStoreName -ResourceGroupName $odxResourceGroup
}
$homepage = "https://ODXPS.com/" + $appname
#create security principal, if it does not exist
$app = New-AzureRmADApplication -DisplayName $appname -HomePage $homepage -IdentifierUris $homepage -Password $password
$app = Get-AzureRmADApplication -DisplayName $appname
$servicePrincipal = New-AzureRmADServicePrincipal -ApplicationId $app.ApplicationId
Start-Sleep 10
New-AzureRmRoleAssignment -RoleDefinitionName "Contributor" -Id $servicePrincipal.Id -ResourceGroupName $odxResourceGroup
New-AzureRmRoleAssignment -RoleDefinitionName "Data Factory Contributor" -Id $servicePrincipal.Id -ResourceGroupName $odxResourceGroup
New-AzureRmRoleAssignment -RoleDefinitionName "Reader" -Id $servicePrincipal.Id -ResourceGroupName $odxResourceGroup
#Set-AzureRmDataLakeStoreItemAclEntry -AccountName $dataLakeStoreName -Path / -AceType User -Id $app.ApplicationId -Permissions All
Set-AzureRmDataLakeStoreItemAclEntry -AccountName $dataLakeStoreName -Path / -AceType User -Id $servicePrincipal.Id -Permissions All
Get-AzureRmDataLakeStoreItem -Account $dataLakeStoreName -Path /ODX -ErrorVariable notPresent3 -ErrorAction 0
if ($notPresent3)
{
New-AzureRmDataLakeStoreItem -Folder -AccountName $dataLakeStoreName -Path /ODX
}
Set-AzureRmDataLakeStoreItemAclEntry -AccountName $dataLakeStoreName -Path /ODX -AceType User -Id $servicePrincipal.Id -Permissions All
#Start-Sleep 60 #there seems to be a lag between when these permissions are added and when they are applied...trying 1 minute to start
$subscription = Get-AzureRmSubscription
$subscriptionId= ($subscription).Id
$tenantId = ($subscription).TenantId
#ensure there are permissions
#Get-AzureRmDataLakeStoreItemAclEntry -Account $dataLakeStoreName -Path /
#get information on datalake
$dataLake = Get-AzureRmDataLakeStoreAccount -Name $dataLakeStoreName
#here is a printout
"---------------------------------------------------------------"
"---------------------------------------------------------------"
$text1= "Azure Data Lake Name: " + $dataLakeStoreName + "`r`n" +
"Tenant ID: " + $tenantId + "`r`n" +
"Client ID: " + $app.ApplicationId + "`r`n" +
"Client Secret: " + $password + "`r`n" +
"Subscription ID: " + $subscriptionId + "`r`n" +
"Resource Group Name: " + $odxResourceGroup + "`r`n" +
"Data Lake URL: adl://" + $dataLake.Endpoint + "`r`n" +
"Location: " + $dataFactoryLocation
"---------------------------------------------------------------"
"---------------------------------------------------------------"
Out-File C:\Users\MattDyor\Desktop\DataLake.ps1 -InputObject $text1

This is the Azure PowerShell You Are Looking For

https://github.com/Azure/azure-powershell/releases/tag/v3.7.0-March2017

I was getting an error telling me to log in to Azure even after I had just logged in:

Run Login-AzureRmAccount to login

Something was out of alignment between the versions of PowerShell and the Azure modules I had installed. After installing this version of Azure PowerShell, I was up and running in no time. I read a number of other articles that told me to do things like Update-Module, and that did not work for me…but your mileage may vary.

Good luck!

Matt

Deleting Multiple Items on Azure

Probably for good reason, there is no easy way to delete a bunch of items from the Azure portal. This means drilling into each of the different items, clicking on delete, possibly confirming something by typing its name into a confirmation screen, waiting a minute for the operation to complete, and then advancing to the next. Totally acceptable for production assets, where you may seriously regret deleting the wrong database (and there is no undo!).

But, when you are developing on Azure, you end up creating a ton of assets, and deleting them one by one is slow (and you actually may go on auto-pilot and end up deleting something that DOES matter). Even worse, some assets do not have a delete option in the portal (hello ADF V2…still preview, so I get it).

The trick to deleting a bunch of items is to use a resource group. Start by navigating to one of the items that you want to delete, click “change” next to resource group, and then you have the option to move a number of items into a new resource group called…trash or something like that. Once all of the items have finished moving to the new resource group, you can delete everything in one fell swoop by deleting the resource group. Pretty clever, huh:).

Have any tips and tricks to share on the Azure portal? I would love to hear them.

Matt

Add New Related Table for Entity Framework

If you are building an ASP.NET MVC application (or Rails, or CakePHP) that leverages migrations, the development speed is pretty amazing…until you do something that is not on the critical path. One area that I THINK falls in this category is adding a NEW table that an EXISTING table will use as a foreign key (e.g., the EXISTING table needs to add a reference to the NEW table).

Here is how I did it.

  • Create the NEW model / class / table
  • Update the model for the EXISTING table so that it has a foreign key field for the NEW table
  • NOTE: do NOT add the full reference from the EXISTING table to the NEW table yet, because this will break referential integrity checks
  • Create a migration for that NEW table and updated EXISTING table
    • Update the migration to specify the default value for the foreign key (e.g., typically it is 0 for ASP.NET MVC, and my first record will have a value of 1, so I specified a default value of 1)
  • Run the migration so that your NEW table will appear in your database
  • Create the CRUD scaffolding for the NEW table (so that you can create a proper entity)
    • You could just insert a record directly into the database, but that may be tricky if you have relationships to user accounts or other logic
  • Using the web interface, create your first entry for your NEW table
    • You will need to do this for each environment…dev/test/production…where you want it to apply; if you just push the final solution to production, you will get an error until you manually create the first record and populate the reference to the first record
    • NOTE: my favorite approach for updating different environments is to point my local dev instance at the production database, run the migrations, and even run the web app – not good for significant implementations, but super fast for smaller projects
    • If needed, manually update your database for the EXISTING table so that all of its rows point to that newly created record
  • Now add the reference from the EXISTING table to the NEW table, create another migration, and update the database again
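
To make the ordering concrete, here is a minimal sketch of the same sequence against an in-memory SQLite database. It uses Python's built-in sqlite3 module rather than Entity Framework migrations, and the `product` (EXISTING) and `category` (NEW) tables are hypothetical names chosen for illustration. The point is only that a default of 1 on the new foreign key column, plus a first record with id 1, leaves no existing row pointing at nothing.

```python
import sqlite3


def add_related_table(conn):
    """Mimic the two data steps: create NEW, then add a defaulted FK column to EXISTING."""
    # Create the NEW table.
    conn.execute("CREATE TABLE category (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
    # Add the foreign key column to the EXISTING table, defaulting to 1
    # (the id the first NEW record will get).
    conn.execute("ALTER TABLE product ADD COLUMN category_id INTEGER NOT NULL DEFAULT 1")
    # Create the first NEW record so that the default value points at a real row.
    conn.execute("INSERT INTO category (id, name) VALUES (1, 'Uncategorized')")
    conn.commit()


def orphaned_rows(conn):
    """Count EXISTING rows whose foreign key does not match any NEW row."""
    query = (
        "SELECT COUNT(*) FROM product p "
        "LEFT JOIN category c ON c.id = p.category_id "
        "WHERE c.id IS NULL"
    )
    return conn.execute(query).fetchone()[0]


# Set up an EXISTING table with data, as it would look before the change.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE product (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.executemany("INSERT INTO product (name) VALUES (?)", [("widget",), ("gadget",)])

add_related_table(conn)
print("orphaned rows:", orphaned_rows(conn))
```

Only once the orphan count is zero is it safe to enforce the relationship itself, which is why the reference is added in a second migration.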

It seems a bit hacky, particularly because you cannot just deploy. But, given that it took me longer to write it up than to do it, I went with the approach.

If you have a better approach, I would love to hear it.

Regards,

Matt

What is a Product Manager?

There are a lot of great posts about what a product manager is (like here and here), but there is only one problem: none of them say the same thing, because there is no globally accepted definition of the role. In some organizations, a product manager keeps the trains running on time, in others the product manager is “CEO of the product”, and in others the product manager is the voice of the customer.

One thing that most product managers (and organizations that employ product managers) agree on is that data beats opinion, so I decided to put together a highly scientific test to figure out what a product manager really is, as defined by the “customer” that is buying product managers. In other words, I grabbed a couple dozen product manager job postings from LinkedIn, selected 9 high-quality postings (well written, specific requirements, clear scope), and then identified the common features present in those postings. This experiment was far from perfect, but I did see some patterns that I found interesting.

So without further ado, here is a visualization of the data:

Product Manager Features

The most interesting point to me: no job feature was required by more than 2/3 of these product management jobs. Other interesting points:

  • Most companies want experience managing the product roadmap, from “Define, develop and communicate the product roadmap” to “You have experience in creating product roadmaps that are focused on achieving business goals” to “Own the roadmap, influence product and technology strategy and direction (roadmap).” Fun fact: I bet no 2 companies agree on what a good product roadmap is:)
  • Product management experience is typically expected, and 5 years of experience seems to be a sweet spot.
  • Continuous customer discovery / voice of the customer was common, but not universally explicitly required, which I find crazy. If I were to pick one attribute that a solid product manager needs, it is an obsession with the customer. Product management experience you can acquire (everybody starts somewhere), and skills like building product roadmaps can be learned, but a lack of focus on the customer seems like it should be a non-starter.

One job feature that I found interesting in its absence was DevOps. I have a hunch that as customer / product feedback loops get tighter, the product manager is going to need to get closer to the DevOps process. Knowing how to get an experiment/feature into the wild and ready to gather data on whether it is working, and gathering this data as quickly as possible (and then either doubling down or revising), seems like a natural experience to demand of your product manager. I am not saying that product managers will drive DevOps, but they should be familiar with it…where a feature is in the release process, when it is live (and for whom), etc. But, alas, I am on the outside here, and not one job description mentioned anything about DevOps. Five of the postings did mention Agile, which in combination with DevOps forms the foundation of the feedback loops, so perhaps it is just assumed that some familiarity with how features get operationalized and released to the wild is part of the product manager’s bag of tricks.

Let me know if anything jumps out at you, or if you have opinions that diverge from this highly scientific study. 🙂

Thanks.

Matt

PS: here is the text associated with the graphic above:

Roadmap Development 6
Continuous Customer Discovery 6
Product Management Experience 6
Excellent Written & Oral Communications 6
Analytics/Data Driven 6
Bachelors Degree ++ 5
Agile Expertise 5
Strategy/Market Requirements 5
Project Management 4
Dev/Sales/Marketing Bridge 4
Business Planning 4
Technical Expertise 3
Proven PM Capabilities 3
Detailed product/business requirements 3
User Acceptance Testing 3
Backlog Management 3
Deliver V1 (version one) 3
Trendspotter 2
Public Speaking 2
User Experience 2
Roadmap Execution 2
Go to Market 2
Creative 2
Influence, Relationships and Teamwork 2
Stakeholder Management 1
Product Marketing 1
AB Testing 1
Team Management 1
New Product Pitches 1
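
As a quick sanity check on the "no more than 2/3" observation, here is a small sketch that turns a few of the counts above into shares of the 9 selected postings. Only a subset of the features is included to keep it short, and the function names are mine.

```python
TOTAL_POSTINGS = 9  # number of postings selected for the comparison

# A subset of the counts listed above (feature -> number of postings mentioning it).
feature_counts = {
    "Roadmap Development": 6,
    "Continuous Customer Discovery": 6,
    "Product Management Experience": 6,
    "Agile Expertise": 5,
    "Technical Expertise": 3,
    "AB Testing": 1,
}


def share(count, total=TOTAL_POSTINGS):
    """Fraction of postings that mention a feature."""
    return count / total


def top_share(counts, total=TOTAL_POSTINGS):
    """Largest fraction of postings that any single feature reaches."""
    return max(counts.values()) / total


for name, count in feature_counts.items():
    print("{:<32} {}/{} = {:.0%}".format(name, count, TOTAL_POSTINGS, share(count)))
```

The most common features appear in 6 of 9 postings, which is exactly two thirds.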

 

Product Roadmaps (and Product Managers) Discussion

Just listened to a good podcast from Janna Bastow on building product roadmaps. In 24 minutes, here is what you will hear:

  • Start with what problem you are solving and for whom.
  • 2 sources of input for roadmap
    1. Top down product management: vision + objectives + big steps
    2. Bottom up approach: conversations with customers, team, competitors – in the trenches.
  • Challenge – dealing with tons of data from customers, prioritizing, and communicating
  • How to address getting input from the team into the product plan: product tree game
    • Innovation Games was the inspiration
    • Put people from multiple disciplines in a room and put a huge tree on the white board
      • Trunk represents the core – foundation
      • Infrastructure is represented by the roots.
      • Ideas are the branches.
      • Have a bunch of post it notes and put them on the tree
  • Separate release plan (2-4 weeks out) from roadmap to give you agility – roadmaps do not have dates.
  • 3-6 months out for a roadmap is reasonable for an immature product; their sprints are 2 weeks.
  • Roadmap should not be a set of features; distill out the higher level meaning of the features into more meaningful themes.
  • Every product should have objectives and metrics – but how do you measure the effectiveness of a product manager? Who knows:)
  • Is the HIPPO hijacking the roadmap? Stakeholder management and saying no are critical skills for a product manager.
  • Who else is doing interesting work in the product management thought space? Nate Walkingshaw, CXO at Pluralsight https://www.pluralsight.com/blog/career/product-development-directed-discovery

Getting NYC Taxi Data into Azure Data Lake

I wanted to get a meaningful dataset into Azure Data Lake so that I could test it out. I came across this article, which walks through using the NYC Taxi Dataset with Azure Data Lake:

https://docs.microsoft.com/en-us/azure/machine-learning/machine-learning-data-science-process-data-lake-walkthrough

The article kind of skips over the whole part of getting the dataset into Azure. Here is how I did it:

  • Spin up a VM on Azure
  • In Server Manager, click Local Server, click the On link next to IE Enhanced Security Configuration, and set at least Admin to Off (otherwise you will have to click OK a dozen times per web page)
  • Download the files from the NYC Taxi Trip website to your VM http://www.andresmh.com/nyctaxitrips/
  • Install 7-Zip so that you can unzip the 7z files.
    • Once you install it from http://www.7-zip.org/download.html, go to the install folder (probably C:\Program Files\7-Zip) and right-click the 7z.exe file. Select the 7-Zip > Open archive option, then click the + sign and browse to your downloads folder
  • Because the files in the trip_data.7z file are larger than 2GB, you cannot upload them using the portal, and you need to use PowerShell.
  • You need to install the Azure PowerShell cmdlets. Look for the Windows Install link a bit down this page: https://azure.microsoft.com/en-us/downloads/
  • You will probably need to restart the VM for the Azure commands to be available in PowerShell
  • Go wild on Azure Data Lake Store using this doc https://github.com/Microsoft/azure-docs/blob/master/articles/data-lake-store/data-lake-store-get-started-powershell.md – here are the key steps:

# Log in to your Azure account
Login-AzureRmAccount

# List all the subscriptions associated with your account
Get-AzureRmSubscription

# Select a subscription
Set-AzureRmContext -SubscriptionId "xxx-xxx-xxx"

# Register for Azure Data Lake Store
Register-AzureRmResourceProvider -ProviderNamespace "Microsoft.DataLakeStore"

# Verify your ADL account name
Get-AzureRmDataLakeStoreAccount

# Figure out which folder to put the files in
Get-AzureRmDataLakeStoreChildItem -AccountName mlspike -Path "/"

NOTE: if you do not want to copy the files one-by-one, you can just copy the whole folder using this format: Import-AzureRmDataLakeStoreItem -AccountName mlspike -Path "C:\Users\Taxi\Desktop\files2\trip_data" -Destination $myrootdir\TaxiDataFiles
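
Before uploading, it can be handy to see which of the extracted files are actually over the 2GB portal limit mentioned above. Here is a minimal sketch that lists them; it assumes "2GB" means 2 × 1024³ bytes, and the function name and default folder are mine.

```python
import os
import sys

# Assumption: the portal's "2GB" limit is treated as 2 * 1024**3 bytes.
PORTAL_LIMIT_BYTES = 2 * 1024 ** 3


def files_over_limit(folder, limit_bytes=PORTAL_LIMIT_BYTES):
    """Return sorted (name, size) pairs for files directly in `folder` above the limit."""
    oversized = []
    for entry in os.scandir(folder):
        if entry.is_file():
            size = entry.stat().st_size
            if size > limit_bytes:
                oversized.append((entry.name, size))
    return sorted(oversized)


if __name__ == "__main__":
    # Pass the extracted trip_data folder as the first argument; defaults to the current folder.
    folder = sys.argv[1] if len(sys.argv) > 1 else "."
    for name, size in files_over_limit(folder):
        print("{} is {:.1f} GiB".format(name, size / 1024 ** 3))
```

Anything this prints has to go up through PowerShell rather than the portal.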

Once you have the files uploaded to Azure Data Lake, you can delete the VM.

If you know of a faster way of getting them there (without downloading them to your local machine), I would love to hear it!

Thanks.

Matt