Value of SMEs (Subject Matter Experts) to Data Scientists


SMEs (Subject Matter Experts) are critical to the success of Data Science projects, yet they rarely get credit for passing their knowledge along to a Data Scientist. This post is to give them the credit they deserve and to give an example of how much time they might put into a successful Data Science project.

Unless the Data Scientist is an expert in the domain (which is rare), they need an SME to work hand-in-hand with, especially at the beginning of a project. SMEs bring the following assets:

  • where to get the data & the tools they use today
  • nuances in the data
  • domain (business) knowledge, i.e. how & why things are done a certain way
  • who the other key players in the process are

Examples of SMEs include the Marketing Analyst, DBA, Director of Sales, Engineer, and the list goes on. Data Scientists benefit greatly from their help, and whether or not the project is successful, they should be recognized as partners and rewarded for it.

If you're looking to build an ML model, SMEs are much more involved in the initial stages of model development; after the initial POC, the workload typically levels off to roughly 90/10 (Data Scientist to SME). They can help Data Scientists think through scenarios critical for model training and for defining target variables.

However, SMEs are most likely very busy, and the time they can carve out to help Data Scientists is limited, so it's up to the Data Scientist to be prepared, do research, take notes, and communicate clearly. How much time is needed from SMEs? Every project is different, so it really depends. In my experience, if there were such a thing as an ideal project (I wish there was), the time commitment could look something like this:

Again, this is an example of an ideal project, but it's at least a starting point. Thank you, SMEs!

This is a personal blog. Any views or opinions represented in this blog are personal and belong solely to the blog owner and do not represent those of people, institutions or organizations that the owner may or may not be associated with in professional or personal capacity, unless explicitly stated. Read entire disclaimer here.

ML vs AutoML


I’ve been working on building AutoML apps and explaining to others the difference between traditional ML and AutoML. I created the visualization below to help tell the story:
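To complement the visualization, here is a rough code-level sketch of the same classification task done both ways. It assumes scikit-learn for the traditional path and TPOT as one example AutoML library; these are my choices for illustration, not necessarily what any given AutoML app uses:

from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# traditional ML: you pick the preprocessing, the algorithm, and the grid to tune
pipe = Pipeline([('scale', StandardScaler()), ('clf', LogisticRegression(max_iter=1000))])
grid = GridSearchCV(pipe, param_grid={'clf__C': [0.1, 1, 10]}, cv=5)
grid.fit(X_train, y_train)
print('hand-built pipeline:', grid.score(X_test, y_test))

# AutoML: the library searches over preprocessing, algorithms, and hyperparameters for you
from tpot import TPOTClassifier
automl = TPOTClassifier(generations=5, population_size=20, cv=5, random_state=42)
automl.fit(X_train, y_train)
print('AutoML pipeline:', automl.score(X_test, y_test))

The point of the contrast is who does the searching: in the first block the human chooses the pipeline and the search space, while in the second the AutoML library explores pipelines and hyperparameters on its own within a compute budget.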


MinneAnalytics Presentation


Wanted to thank the attendees of last week’s MinneAnalytics conference for attending my presentation on leveraging cloud platforms for ETL and creating ML/data products, and for providing great feedback. The SlideShare deck is available.


Adding Machine Learning to iOS Apps


Use case: using machine learning and the iPhone’s camera, identify certain types of objects in real time.

Steps:

  1. Create a CNN (convolutional neural network) in Python with an ML (machine learning) package called Keras, using the TensorFlow backend (see the sketch after this list).
  2. Convert the newly created CNN to a format that iPhones can use by leveraging the Core ML iOS framework.
  3. Run the converted Core ML model on the iPhone to make predictions on what the phone’s camera is viewing.
  4. Save the distribution of predicted probabilities from the Core ML model and send it to an API created with Firebase, which is a BaaS (backend as a service).
  5. Return the custom output from the API back to the iPhone.
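As a rough sketch of steps 1 and 2 (not the exact model from this project), a small Keras CNN and its Core ML conversion could look like the following. The architecture, class labels, input size, and file name are placeholders, and the conversion assumes the older coremltools Keras converter:

from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Flatten, Dense
import coremltools

# minimal CNN: two conv blocks and a small dense head (placeholder architecture)
model = Sequential([
    Conv2D(32, (3, 3), activation='relu', input_shape=(224, 224, 3)),
    MaxPooling2D((2, 2)),
    Conv2D(64, (3, 3), activation='relu'),
    MaxPooling2D((2, 2)),
    Flatten(),
    Dense(128, activation='relu'),
    Dense(2, activation='softmax'),
])
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
# model.fit(train_images, train_labels, epochs=..., validation_data=...)

# step 2: convert the trained Keras model so the iPhone can run it via Core ML
coreml_model = coremltools.converters.keras.convert(
    model,
    input_names='image',
    image_input_names='image',  # treat the input as an image so camera frames can be fed directly
    class_labels=['class_0', 'class_1'],
)
coreml_model.save('ObjectClassifier.mlmodel')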

Technologies/languages used: Keras, Python, iOS Swift, Core ML, Firebase SDK, JavaScript

How the models and algorithms worked together:

I ended up creating a combination of Core ML models, loosely based on OOD (object-oriented design). How this works: the first model identifies the domain; if its prediction is greater than 50% probability, it calls the next model, and if that one is also greater than 50% probability, the app calls the API to get a returned result.

High-level diagram of how iOS can use the models to make useful predictions:

Here is the iOS Swift function that calls both models.

func captureOutput(_ output: AVCaptureOutput, didOutput sampleBuffer: CMSampleBuffer, from connection: AVCaptureConnection) {

    // initial model, always called for domain detection
    guard let model_one = try? VNCoreMLModel(for: imagenet_ut().model) else { return }
    let request = VNCoreMLRequest(model: model_one) { (finishedRequest, error) in
        guard let results = finishedRequest.results as? [VNClassificationObservation] else { return }
        guard let observation = results.first else { return }
        DispatchQueue.main.async {
            confidence_one = Int(observation.confidence * 100)
        }
    }

    // second model, chosen based on the current context, runs on the same frame as the first
    let chosen_all_model = getAllModel(cise: self.Model.cise)
    guard let model_two = try? VNCoreMLModel(for: chosen_all_model) else { return }
    let request_two = VNCoreMLRequest(model: model_two) { (finishedRequest, error) in
        guard let results = finishedRequest.results as? [VNClassificationObservation] else { return }
        guard let observation = results.first else { return }
        DispatchQueue.main.async {
            _confidence_two = Int(observation.confidence * 100)
        }
    }

    // grab the current camera frame and execute both requests against it
    guard let pixelBuffer: CVPixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer) else { return }
    try? VNImageRequestHandler(cvPixelBuffer: pixelBuffer, options: [:]).perform([request, request_two])
}

In the complete iOS project, I have 12 models available to run based on the various contexts of what the user is doing. The API created on Firebase can handle various combinations of calls based on these contexts and send back a useful prediction to the user.


CNN: Transfer Learning vs Building from Scratch


When building a CNN (convolutional neural network), there are some things you’ll need and some things you should consider. What you’ll need is access to a GPU, and you’ll also need a lot of labeled images. And when I say a lot, it could be a minimum of 1,000 per class. However, using transfer learning you may be able to get away with less. TensorFlow- and Theano-backed packages such as Keras provide the ability to use a pre-trained model’s learning as the input to your newly created model, and without a doubt, this helps model performance metrics. It works especially well if your training images are somewhat closely related to the ImageNet dataset. The main thing to consider is whether to build the CNN from a transfer model or give building it from scratch a shot.

Regarding transfer learning, the reality is, however, that most real-world applications of CNNs for image recognition are not going to be that similar to the ImageNet base of images. Not all is lost, as you can still use those pre-trained models to help you achieve higher model accuracy. But what’s the cost? I ran a test on an image recognition project I’m working on. Here are the considerations with using transfer learning:

  1. training time – this could substantially increase your processing time, depending on your model architecture
  2. size of model – instead of a model that is 50 MB, now how about 300 MB. For some people in academia this is no big deal, but if I’m talking about a web service or having this model work locally on a phone or a simple CPU, smaller is better
  3. RGB only – you can only use RGB images when using an ImageNet pre-trained model. Bummer, because many times grayscale is all that is needed to perform well, and RGB requires more processing power and increases the size of the final model

To understand the trade-offs between a CNN backed by transfer learning and a CNN built from scratch, I tested both on a small dataset I’m working on. Details on my dataset:

  • 2 classes; class 0: 250 labeled images, class 1: 1,000 labeled images (notice the classes are unbalanced? It’s a real-world problem)
  • the images do not closely resemble ImageNet (again, this is more real-world)

I’m running two models: one a CNN from scratch, and the other leveraging transfer learning, in which I’ll freeze the top 7 layers.

Both will use image augmentation, edge detection, and cross-validation to help get the most out of the limited images in my training set. I’ll be running up to 300 epochs, with a patience of 10, and callbacks to minimize log loss. I’m sure I could spend more time trying to make marginal improvements on both models, but in this case I wanted to time-box this initial model building to help me decide which path to go down.
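For context, here is a rough sketch of what the two setups could look like in Keras. It assumes VGG16 as the pre-trained base and a single sigmoid output for the two classes, neither of which this post specifies, and it omits the edge detection and cross-validation wiring:

from keras.applications import VGG16
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Flatten, Dense
from keras.preprocessing.image import ImageDataGenerator
from keras.callbacks import EarlyStopping

# shared pieces: image augmentation and early stopping on validation log loss (patience of 10)
augment = ImageDataGenerator(rotation_range=20, horizontal_flip=True, rescale=1. / 255)
stop_early = EarlyStopping(monitor='val_loss', patience=10)

# option 1: small CNN from scratch
scratch = Sequential([
    Conv2D(32, (3, 3), activation='relu', input_shape=(224, 224, 3)),
    MaxPooling2D((2, 2)),
    Conv2D(64, (3, 3), activation='relu'),
    MaxPooling2D((2, 2)),
    Flatten(),
    Dense(64, activation='relu'),
    Dense(1, activation='sigmoid'),
])

# option 2: transfer learning on a pre-trained base, with 7 of its layers frozen
base = VGG16(weights='imagenet', include_top=False, input_shape=(224, 224, 3))
for layer in base.layers[:7]:
    layer.trainable = False
transfer = Sequential([base, Flatten(), Dense(64, activation='relu'), Dense(1, activation='sigmoid')])

for model in (scratch, transfer):
    model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
    # model.fit_generator(augment.flow(X_train, y_train, batch_size=32), epochs=300,
    #                     validation_data=(X_val, y_val), callbacks=[stop_early])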

Results of CNN from scratch (on the smaller, more difficult class: class 1)

Results of CNN with transfer learning (on the smaller, more difficult class: class 1)

No surprise, the F1 score is better on the model with transfer learning, at 0.93 vs 0.91. But add to that the expense of a model that is 10x as large. You make the call on which path you choose.