From the Professor-
This question is often asked: “Why Excel and Word?” As I have stated in class before, individuals that will deal with data, whether analysts, accountants, product managers, etc…, need to be able to find patterns in the data and report on them. This is a brief introduction to that world. Given that this is an economics class, the project will focus on economic data.
It has been heavily studied that inflation-adjusted GDP (aka Real GDP) is affected by a myriad of variables, such as incomes, education rates, unemployment levels, and inflation. Multiple proxies and models have been put together to try and find better measurements of real GDP. Well, it is our turn. For this project, you will investigate which of the following variables has the greatest explanatory ability on real GDP: (1) employed individuals, (2) a human capital index (think of this as a proxy for training and education), and/or (3) the inflation of popular foods.
The data set is based on the 2019 real GDP levels for over 150 nations.
You have until December 18, 2020 11:59:59pm PST to turn in an EXCEL/WORD document via Canvas. The subject header must be E3060 – PROJECT – LAST NAME, FIRST NAME. Any deviation from this will result in a 10 point penalty.
You will download the file called ECON3060PROJECT.XLS from the Class Project Module found on Canvas.
There are two sections (see QUESTIONS FOR THE PROJECT):
- Data Exploration and Preliminary Analysis
- Multiple Regression/Model Building
You will perform some analyses in Excel, save it and then provide a summary in Word. There is no required length. You are going to be graded on completing the EXCEL portion and the summarizing of the data in WORD/PDF document. I will not grade on grammar. You are encouraged to use the class notes for the exercise.
COMMON EXCEL FORMULA
=AVERAGE() =MEDIAN() =STDEV.S() – Std Dev for a sample
=STDEV.P() – Std Dev for a population =VAR.S() – variance for a sample
=VAR.P() – variance for a population
QUESTIONS FOR THE PROJECT
PART 1: Data Exploration and Preliminary Analysis
QUESTIONS, PART I
How many data points are there in each data set?
What is the average, median, variance and standard deviation of each variable?
Graph each variable using a bar chart
Are the data skewed?
Why? (Use a Z-score to prove your reasoning)
Summarize the nation and the dependent variables that are possible outliers.
PART 2: Multiple Regression
QUESTIONS, PART II
Create scatter plots between the dependent and each independent variable? Show the individual R2 value. What are the correlations? Which variable seems to be the strongest and weakest?
Determine the formula for this regression.
What is the coefficient of determination? What is the adjusted coefficient of determination? What does each mean?
At what point is each independent variable significant? What does a 1 unit increase in each independent variable do to the dependent variable?
What is the mean squared error for this model? What does it mean?
Test the overall significance of the model.
UPLOAD YOUR RAW DATA FILE AND A SUMMARY IN WORD/PDF FORMAT