Devin's Portfolio

About Me

I am a graduate student currently enrolled in the Data Science and Analytics program at the University of Calgary. My undergraduate degree is also from the University of Calgary, where I graduated in 2019 with a Bachelor of Science in Mathematics. I am fascinated by the way data science and machine learning apply mathematical and statistical principles to the real world in order to solve complex problems. In my personal life, I am an avid record collector and music and film enthusiast, and I have played drums both in the studio and live on stage around Calgary.

Feel free to reach out! For a quick response, you can contact me by email at devinnorris@live.ca. My resume with more information can also be found here, or at the bottom right side of this page. Thanks for checking out my site!

Technical Aptitudes

Programming Languages and Tools
  • Python
  • Jupyter Notebooks
  • R
  • SQL
  • Tableau
  • GitHub

Data Science Packages and Libraries
  • NumPy
  • Pandas
  • Matplotlib
  • Seaborn
  • Plotly
  • Scikit-learn
  • TensorFlow
  • Keras

Project Experience

A collection of personal and academic projects demonstrating my proficiency and interest in areas including machine learning, statistical modelling, and more. Click the photo to access the GitHub repository for each project.

Personal Projects

Exploratory Data Analysis and Classification Algorithms on Insurance Claim Data

  • Cleaned, formatted, and stored car insurance claim data using NumPy and Pandas. Performed feature engineering to extract new information.
  • Used Matplotlib and Seaborn to create visualizations and explore trends in the data. Fit multiple classification models and tuned parameters to improve performance.
  • The final product is an ML model that takes the number of Demerit Points, Age, Car Crash Status, and Urbanicity as inputs and predicts, with 75.35% accuracy, whether a customer will make more than one insurance claim.

A Data Driven Exploration of Video Games — Sales and Scores

  • A data story in which I explore global sales of video games and their Metacritic ratings using data scraped from the web.
  • Generated insights into video game sales over time, the most popular video games and consoles, and how critic and user scores compare to a game's popularity.
  • Published in Analytics Vidhya on Medium.

Academic Projects

Multiple Regression Analysis of Canadian COVID-19 Data

  • Created a statistical model in R using COVID-19 and provincial health measures datasets.
  • The model predicts per capita COVID-19 infection rates using predictor variables such as rural population percentage and the prevalence of COPD and mood disorders.
  • Worked collaboratively with three teammates to formulate the project and present our findings.

Exploratory Data Analysis of Calgary's Traffic Data

  • Analyzed traffic trends in Calgary using data visualizations including bar charts, treemaps, and geospatial maps, built with Pandas, Matplotlib, GeoPandas, and Plotly.
  • Results could be meaningful for traffic control, policing, insurance, urban planning, municipal budgeting, driver education, and medical and emergency services.
  • Worked collaboratively with two teammates to formulate the project and present our findings.

Data Science Coursework

Fall 2020

DATA 601 - Working with Data and Visualization (Grade: A+)
Fundamental data science concepts including data organization, data collection, and data cleaning in Python. Includes a review of programming concepts in Python, as well as an introduction to the fundamentals of data visualization and critical thinking with data. Also provides an introduction to data ethics, security, and privacy.
DATA 602 - Statistical Data Analysis (Grade: A)
The foundations of statistical inference including the application of probability models to data, as well as an introduction to simulation-based and classical statistical inference, and the creation of statistical models with R.
DATA 603 - Statistical Modelling with Data (Grade: A+)
The creation of complex statistical models, including exposure to multivariate model selection, prediction, the statistical design of experiments and analysis of data in R.
DATA 604 - Big Data Management (Grade: A+)
Data storage and manipulation at both desktop and cloud scales. Introduces core database concepts and provides a practical introduction to both SQL and NoSQL systems. Also introduces parallel and distributed computing concepts including distributed storage and large scale parallel data processing using MapReduce. Design and implementation of new data visualizations to aid analysis, with emphasis on the practical and ethical implications of design and analysis decisions.

Winter 2021

DATA 605 - Actionable Visualization and Analytics (Grade: A+)
Deeper tools, skills, and techniques for collecting, manipulating, visualizing, analyzing, and presenting a number of different common types of data. With a data life-cycle perspective, looks into data elicitation and preparation as well as the actual usage of data in a decision-making context. Introduces techniques for visualizing and supporting the interactive analysis and decision making on large complex datasets. Focus on critical thinking and good analysis practices to avoid cognitive biases when designing, thinking, analyzing, and making decisions based on data.
DATA 606 - Statistical Methods in Data Science (In Progress)
Design of surveys and data collection, bias and efficiency of surveys. Sampling weights and variance estimation. Multi-way contingency tables and introduction to generalized linear models with emphasis on applications.
DATA 607 - Statistical and Machine Learning (In Progress)
Advancement of the linear statistical model including introduction to data transformation methods, classification, model assessment and selection. Exposure to both supervised learning and unsupervised learning.
DATA 608 - Developing Big Data Applications (In Progress)
Provides advanced coverage of tools and techniques for big data management and for processing, mining, and building applications that leverage large datasets. Addresses database and distributed storage design for both SQL and NoSQL systems, and focuses on the application of distributed computing tools to perform data integration, apply machine learning, and build applications that leverage big data.