Total Parenteral Nutrition Formula Algorithm 

Beyond the data analysis and modeling, this 6-week project taught me how many factors can influence the success of an analytics or decision support system in the Neonatal ICU. The project touched on IRB approval, organizational and legal issues, pilot testing, presenting to stakeholders, managing expectations, provider-level concerns, IT department constraints, and several other factors that can affect the success of deploying an algorithm in a real-world production setting.

There is incredible potential to improve the quality of clinical care through data-driven decision making.

Echocardiogram data

In this project we used echocardiogram data such as the heart wall motion score, left ventricular end-diastolic dimension, and E-point septal separation to predict whether a patient was alive or dead 1 year after suffering a heart attack.

Parkinson’s Disease

We used numeric biomedical voice measurements to discriminate healthy people from those with Parkinson’s Disease. This was a classification problem in which Support Vector Machines and Naive Bayes classifiers were trained on the data.
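As an illustration of the Naive Bayes side of this kind of classification, here is a minimal sketch of a Gaussian Naive Bayes classifier in plain Python. The function names, the two-feature layout, and the sample values are all assumptions made for the example; they are not the project's actual code or data.

```python
import math
from collections import defaultdict


def fit_gaussian_nb(X, y):
    """Estimate per-class priors, feature means, and feature variances."""
    by_class = defaultdict(list)
    for row, label in zip(X, y):
        by_class[label].append(row)

    model = {}
    for label, rows in by_class.items():
        n = len(rows)
        columns = list(zip(*rows))
        means = [sum(col) / n for col in columns]
        # A small constant keeps every variance strictly positive.
        variances = [
            sum((v - m) ** 2 for v in col) / n + 1e-9
            for col, m in zip(columns, means)
        ]
        model[label] = (n / len(X), means, variances)
    return model


def predict_gaussian_nb(model, row):
    """Return the class with the highest log posterior for one sample."""
    best_label, best_score = None, -math.inf
    for label, (prior, means, variances) in model.items():
        score = math.log(prior)
        for v, m, var in zip(row, means, variances):
            # Log of the Gaussian density for this feature value.
            score += -0.5 * math.log(2 * math.pi * var) - (v - m) ** 2 / (2 * var)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Made-up voice-measurement samples (hypothetical, not from the real dataset).
X = [[0.20, 1.1], [0.30, 0.9], [0.25, 1.0],
     [0.80, 2.1], [0.90, 1.9], [0.85, 2.0]]
y = ["healthy", "healthy", "healthy",
     "parkinsons", "parkinsons", "parkinsons"]

model = fit_gaussian_nb(X, y)
print(predict_gaussian_nb(model, [0.28, 1.05]))
print(predict_gaussian_nb(model, [0.82, 2.05]))
```

The classifier treats each feature as independent within a class, which is the "naive" assumption; in practice a library implementation would normally be used instead of hand-written code.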

FDA Foods and Nutrition data

This 4-week project focused on using SQL within R to query the FDA 2011 foods database for key insights into the nutritional values of common US meals and their contribution to health. ggplot2 was used for visualization, while SVMs and Random Forests were used to classify foods as either fruits or vegetables based on their nutritional content.

Lung disease

This was a multiple regression problem that used patient data such as age, years smoked, and several lung function test metrics to estimate the risk of lung cancer.
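To make the idea of multiple regression concrete, the sketch below fits an ordinary least squares model with two predictors by solving the normal equations in plain Python. The function name `fit_ols`, the choice of predictors (age and years smoked), and the synthetic risk scores are assumptions for illustration only; the target is generated from known coefficients so the fit can be checked, and none of it comes from the project's data.

```python
def fit_ols(X, y):
    """Fit y ~ b0 + b1*x1 + ... by solving the normal equations (X'X) b = X'y."""
    rows = [[1.0] + list(r) for r in X]  # prepend an intercept column
    p = len(rows[0])

    # Augmented matrix [X'X | X'y].
    A = [
        [sum(r[i] * r[j] for r in rows) for j in range(p)]
        + [sum(r[i] * t for r, t in zip(rows, y))]
        for i in range(p)
    ]

    # Gauss-Jordan elimination with partial pivoting.
    for c in range(p):
        pivot = max(range(c, p), key=lambda k: abs(A[k][c]))
        A[c], A[pivot] = A[pivot], A[c]
        for k in range(p):
            if k != c:
                factor = A[k][c] / A[c][c]
                A[k] = [a - factor * b for a, b in zip(A[k], A[c])]

    return [A[i][p] / A[i][i] for i in range(p)]


# Hypothetical predictors: (age in years, years smoked).
X = [[40, 10], [50, 5], [55, 30], [60, 20], [70, 40]]
# Synthetic risk score built from known coefficients, so the fit is verifiable.
y = [0.5 + 0.01 * age + 0.02 * smoked for age, smoked in X]

intercept, b_age, b_smoked = fit_ols(X, y)
print(round(intercept, 4), round(b_age, 4), round(b_smoked, 4))
```

Each fitted coefficient is read as the change in the outcome per unit change in that predictor, holding the other predictors fixed.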

Liver disease

This was a classification problem that used markers like mean corpuscular volume, alkaline phosphatase, and gamma-glutamyl transpeptidase, along with patient data like the number of half-pint equivalents of alcoholic beverages drunk per day, as independent predictors to determine whether a patient will develop liver disease.


Diabetes readmission

In this classification project, predictors like HbA1c levels, highest serum glucose levels, change in diabetes medication, presence or absence of insulin in the treatment regimen, number of procedures, and number of medications were used to predict the likelihood of readmission in patients with a primary or secondary diagnosis of diabetes.

Phenylketonuria mother-baby data

Thirteen attributes of mother-baby pairs in which the child had been diagnosed with the inherited disorder PKU were analyzed. The dataset was explored for possible relationships between the predictor variables (such as mother’s age, gestation, verbal and performance IQs, weight gain, and phenylalanine exposure) and the cognitive development of the 1000 babies at age one.

