The terms data science and data engineering get thrown around a lot, but what is the difference? What are the similarities? Both deal with vast amounts of data, but where do they diverge, and where do they overlap?

Data Engineering

Data engineering involves preparing the data infrastructure for analysis. This…

Python and R both have their strengths and weaknesses when it comes to data science. Neither language is necessarily better than the other; the choice comes down to the application and the questions you’re trying to answer. Data scientists should know both languages to some degree as…

Clustering is an important unsupervised learning technique. It refers to grouping similar data points by their attributes. In this post, I will go over Gaussian Mixture Models for clustering.

Gaussian Mixture Models (GMMs) differ from other clustering models in that they assume a certain number of Gaussian distributions…

Mark Subra

I am a data scientist and a recent graduate of the Flatiron School Immersive Data Science Bootcamp.

