STA1004 Fundamental Statistics for Accountants
Assessment item 2: Problem Solving (including Problem Solving assignment and three Module Reviews)
This solution contains the work in an Excel file and a Word document.
Dataset: for Question 2, Question 3, and Question 4 you will need to use Excel to
answer the questions. The dataset RealEstate.xlsx (Excel) can be found under the
Assessment tab on the StudyDesk.
Dataset Information – Real Estate Dataset
The real estate file (RealEstate.xlsx) looks at 513 dwellings sold in Melbourne, Australia
(it is a random set of dwellings sold in 2020 in Melbourne). The following variables are
included in this dataset (including their units and an explanation of what they are):
• Suburb – the suburb in which the dwelling is located
• Rooms – The number of rooms (bedrooms and study areas) in the dwelling
• Type – the type of dwelling (1 = House, 2 = Apartment, 3 = Townhouse)
• Price – the selling price ($) of the dwelling
• Distance – the distance the suburb is from the City Centre (Melbourne) in km
• Bedroom – the number of specific bedrooms in the house
• Bathroom – the number of bathrooms in the house (1 = 1 bathroom, 2 = 2
bathrooms, 3 = 3 or more bathrooms)
• Car – the number of car spaces at the dwelling (0 = no car spaces, 1 = 1 car space,
2 = 2 car spaces, 3 = 3 or more car spaces)
• Landsize – size of the land the dwelling is situated on in square metres
• BuildingArea – The size of the building (dwelling) in square metres
• Year – the year the dwelling was built (1 = Built prior to 1920, 2 = built between
1920 and 1970, 3 = Built after 1970)
Question 1 (5 marks) (Module 1)
Google ‘bad graph’. From any of the numerous examples that will be shown in the search
results, select one that relates in some way to Accounting (or Business). Include a
picture of the graph (1 mark) and the URL (1 mark) in your answer. Based on the
criteria discussed in the lecture and tutorial, identify, and explain three things you think
make it a ‘bad graph’ and which could lead to the graph being misinterpreted (3 marks).
Question 2 (18 marks) (Module 1 and 2)
a) (8 marks total)
Data screening is an important procedure prior to starting any analysis to ensure there
are no mistakes in the data entry or unusual values that you need to be aware of.
• For categorical and ordinal variables we use frequency tables (Pivot tables) to
determine if all values are accounted for.
• For quantitative variables we use summary statistics (mean, minimum,
maximum, and standard deviation) to check for any discrepancies.
Use Excel to screen the dataset RealEstate.xlsx.
In your answer provide only the relevant Excel output for the variables Distance
and Type (2 marks). Describe any potential problems or mistakes you can identify
for these variables from this screening process (4 marks). For Distance, how would
you resolve the issue (explain how only – no need to edit the data set)? (2 marks).
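Outside Excel, the same screening idea could be sketched in Python as a rough cross-check. This is only an illustration: the rows below are invented (they are not taken from RealEstate.xlsx), and the helper names frequency_table and summary_statistics are hypothetical. The sketch builds a frequency table for a coded variable, flags codes outside the documented set, and reports the summary statistics listed above for a quantitative variable.

```python
from collections import Counter
from statistics import mean, stdev

# Invented rows for illustration only; they are NOT from RealEstate.xlsx.
rows = [
    {"Type": 1, "Distance": 5.2},
    {"Type": 2, "Distance": 11.0},
    {"Type": 3, "Distance": 7.4},
    {"Type": 4, "Distance": -3.0},  # deliberately bad: unknown code, negative km
    {"Type": 2, "Distance": 9.1},
]

VALID_TYPES = {1, 2, 3}  # 1 = House, 2 = Apartment, 3 = Townhouse


def frequency_table(rows, column):
    """Count how often each value occurs (what a Pivot table would show)."""
    return Counter(row[column] for row in rows)


def summary_statistics(rows, column):
    """Mean, minimum, maximum and sample standard deviation of a column."""
    values = [row[column] for row in rows]
    return {
        "mean": mean(values),
        "min": min(values),
        "max": max(values),
        "sd": stdev(values),
    }


type_counts = frequency_table(rows, "Type")
unexpected_types = sorted(code for code in type_counts if code not in VALID_TYPES)
distance_stats = summary_statistics(rows, "Distance")

print(dict(type_counts))
print(unexpected_types)
print(distance_stats)
```

A code that is not in the documented list, or a minimum or maximum that is physically impossible, is the kind of discrepancy the screening step is meant to surface.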
b) (10 marks total)
Creating plots is also an important part of screening data before any analyses are
performed.
i. Use Excel to produce boxplots comparing Price ($) across the YearBuilt
categories (1 = Prior to 1920, 2 = 1920 – 1970, 3 = After 1970), ensuring you
include a title with your name (4 marks).
ii. Compare the distributions of Price across the YearBuilt categories by commenting
on the shape, centre, spread and outliers of the distributions (4 marks). What
pattern do you observe for this data? (2 marks).
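The numbers behind a boxplot (centre, spread and outliers) can be sketched in Python as follows. The prices are invented for a single hypothetical YearBuilt category, the function name boxplot_summary is an assumption, and the common 1.5 × IQR rule is assumed for flagging outliers. Quartile conventions differ between tools, so the values may not match Excel's chart exactly.

```python
from statistics import quantiles


def boxplot_summary(values):
    """Five-number summary plus values outside the 1.5 * IQR fences."""
    q1, median, q3 = quantiles(values, n=4)  # default method is "exclusive"
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    return {
        "min": min(values),
        "q1": q1,
        "median": median,
        "q3": q3,
        "max": max(values),
        "iqr": iqr,
        "outliers": [v for v in values if v < lower_fence or v > upper_fence],
    }


# Invented prices ($) for one YearBuilt category; not from RealEstate.xlsx.
prices = [450_000, 500_000, 520_000, 560_000, 600_000, 640_000, 700_000, 2_500_000]
result = boxplot_summary(prices)
print(result)
```

Running the same function on each category gives comparable medians (centre), IQRs (spread) and outlier lists to support the written comparison.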
Question 3 (40 marks) (Module 3)
A researcher would like to determine whether the Building Area (m²) of a dwelling can be
used to predict its Price ($), for those dwellings that are APARTMENTS only.
NOTE: Only APARTMENTS (i.e. Type = 2) are to be analysed in this question. (Filter and
select the ‘Apartment’ (type = 2) for all analyses in this question).
a) (2 marks)
Select the apartments (Type = 2) by filtering on the variable Type. Provide a
screenshot of this selection.
b) (10 marks total)
Use Excel to construct an appropriate graph to display the relationship between
‘Price’ and ‘BuildingArea’ (2 marks) for all Apartments. Label the axes correctly,
include units of measure and provide a meaningful title with your name (2
marks).
Explain why you have chosen this type of graph (1 mark). Explain clearly why
you have chosen your dependent (y-axis) and independent (x-axis) variables (1
mark). Comment on the features of the graph, including whether there appears
to be a linear relationship (4 marks).
c) (3 marks total)
Use Excel analysis to find the correlation coefficient to measure the strength and
direction of the relationship between the two variables. Provide and interpret this
statistic (do not include Excel output).
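For reference, the correlation coefficient can be sketched in Python from its definition. The building areas and prices below are invented for illustration (not taken from RealEstate.xlsx), and pearson_r is a hypothetical helper name.

```python
from math import sqrt
from statistics import mean


def pearson_r(x, y):
    """Pearson correlation: Sxy / sqrt(Sxx * Syy)."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    return sxy / sqrt(sxx * syy)


# Invented apartment data: BuildingArea (m²) and Price ($).
building_area = [50, 60, 80, 100, 120]
price = [310_000, 330_000, 430_000, 490_000, 590_000]

r = pearson_r(building_area, price)
print(round(r, 4))
```

The sign of r gives the direction of the relationship, and its distance from 0 (towards -1 or 1) gives the strength of the linear association.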
d) (5 marks total)
Use Excel to perform regression analysis and find the equation of the regression
line which could be used to make predictions. Provide the regression output (1
mark), state the equation (3 marks) and include this line on the graph produced
in part (b) (1 mark).
e) (2 marks total)
What is the predicted Price ($) for an apartment which has a Building Area of
150 m²? Show all working.
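A minimal sketch of how a least-squares line is fitted and then used for a prediction is given below. The data are invented, so the coefficients and the predicted value are not the answer for RealEstate.xlsx, and fit_line is a hypothetical helper name.

```python
from statistics import mean


def fit_line(x, y):
    """Least-squares intercept and slope for y = intercept + slope * x."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


# Invented apartment data: BuildingArea (m²) and Price ($).
building_area = [50, 60, 80, 100, 120]
price = [310_000, 330_000, 430_000, 490_000, 590_000]

intercept, slope = fit_line(building_area, price)

# Prediction: substitute the Building Area into the fitted equation.
predicted_price = intercept + slope * 150
print(f"Price = {intercept:.2f} + {slope:.2f} * BuildingArea")
print(f"Predicted price at 150 m2: {predicted_price:.2f}")
```

In this invented sample 150 m² lies above the largest observed Building Area, so the prediction is an extrapolation; whether that applies to the real data depends on the range of BuildingArea among the apartments.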
f) (5 marks total) *
Consider the equation of the regression line in part (d) of this question. Interpret
the slope (3 marks) and intercept (2 marks) of this regression line in context.
g) (2 marks total) *
From Excel output, state the value of R² (1 mark) and interpret what it means in
context (1 mark).
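As an illustration of what R² measures, the sketch below computes it as 1 - SSE/SST, the proportion of the variation in Price explained by the fitted line. The data are invented and r_squared is a hypothetical helper name.

```python
from statistics import mean


def r_squared(x, y):
    """R² = 1 - SSE / SST for a simple least-squares line."""
    x_bar, y_bar = mean(x), mean(y)
    slope = (
        sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
        / sum((xi - x_bar) ** 2 for xi in x)
    )
    intercept = y_bar - slope * x_bar
    fitted = [intercept + slope * xi for xi in x]
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))  # unexplained variation
    sst = sum((yi - y_bar) ** 2 for yi in y)                # total variation
    return 1 - sse / sst


# Invented apartment data: BuildingArea (m²) and Price ($).
building_area = [50, 60, 80, 100, 120]
price = [310_000, 330_000, 430_000, 490_000, 590_000]

r2 = r_squared(building_area, price)
print(round(r2, 4))
```

For a regression with a single predictor, this value equals the square of the correlation coefficient.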
h) (11 marks total)*
Is the regression line produced in part (d) a good prediction model (1 mark)?
Discuss in 200 words or less, with reference to both correlation analysis (4 marks)
and residual analysis (4 marks). An appropriate Excel plot (include your name in
the title) based on the residuals from the regression analysis should be included in
your response (2 marks).
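The residuals used in a residual plot are simply observed minus fitted values. The sketch below, on invented data and with the hypothetical helper name residuals_from_fit, produces the (x, residual) pairs that such a plot would display.

```python
from statistics import mean


def residuals_from_fit(x, y):
    """Return residuals (observed - fitted) from a least-squares line."""
    x_bar, y_bar = mean(x), mean(y)
    slope = (
        sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
        / sum((xi - x_bar) ** 2 for xi in x)
    )
    intercept = y_bar - slope * x_bar
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]


# Invented apartment data: BuildingArea (m²) and Price ($).
building_area = [50, 60, 80, 100, 120]
price = [310_000, 330_000, 430_000, 490_000, 590_000]

residuals = residuals_from_fit(building_area, price)

# Points for a residual plot: x-value against residual.
for area, res in zip(building_area, residuals):
    print(area, round(res, 2))
```

A residual plot with no curve, no funnel shape and no extreme points supports a linear model; a visible pattern suggests the line is missing some structure in the data.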
Question 4 (37 marks) (Module 5-6)
According to the Melbourne Property Market Update from December 2020, the average
price for apartments in the Melbourne CBD was $580,000. A researcher suggests that the
sample provided in this study is from 2021, not 2020, and wishes to perform a hypothesis
test to check this.
Perform a hypothesis test to determine if the mean Price of Apartments is greater than
$580,000.
Use the dataset RealEstate.xlsx to answer this question. Use the same sample as per
Question 3 (i.e. only apartments).
NOTE: Only APARTMENTS (i.e. Type = 2) are to be analysed in this question. (Filter and
select the ‘Apartment’ (type = 2) for all analyses in this question).
a) (3 marks total)
Use Excel to produce descriptive statistics for the Price of Apartments. Report the
number of observations, the mean, median, standard deviation, minimum and
maximum. Do NOT copy and paste the output table – you should extract the relevant
values.
b) (4 marks total)
Researchers wish to perform a Hypothesis Test to determine if the mean Price of
Apartments is greater than $580,000.
i. State the appropriate hypotheses for this test. (2 marks)
ii. What conditions and assumptions are required when performing this test?
You do not need to check the assumptions at this point, just list them. (2
marks)
c) (7 marks total)
Use Excel to perform the Hypothesis Test for the hypotheses in part b). Include the
following in your answer and make sure all tables have relevant titles:
i. The Excel output needed to find the test statistic and P-value for this test.
(2 marks)
ii. State the value of the test statistic for this test. (1 mark)
iii. State the P-value for this test. (1 mark)
iv. Interpret the P-value and write a meaningful conclusion in the context of
the researcher’s question. (3 marks)
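The test statistic for a one-sample t-test of a mean can be sketched as below. The five prices are invented and one_sample_t is a hypothetical helper name. Only the statistic and degrees of freedom are computed here, because the Python standard library has no t-distribution; a right-tailed P-value could then be read from t-tables or obtained in Excel, for example with T.DIST.RT.

```python
from math import sqrt
from statistics import mean, stdev


def one_sample_t(values, mu0):
    """t = (sample mean - mu0) / (s / sqrt(n)), with n - 1 degrees of freedom."""
    n = len(values)
    standard_error = stdev(values) / sqrt(n)
    t_stat = (mean(values) - mu0) / standard_error
    return t_stat, n - 1


# Invented apartment prices ($); not from RealEstate.xlsx.
prices = [560_000, 600_000, 620_000, 640_000, 680_000]

t_stat, df = one_sample_t(prices, 580_000)
print(round(t_stat, 3), df)
```

For the "greater than" alternative, only large positive values of the statistic count as evidence against the null hypothesis.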
d) (13 marks total) *
Check the assumptions related to this Hypothesis Test (4 marks), including producing
an appropriate graph (3 marks).
State in particular if any of the assumptions need to be carefully considered and why
(2 marks). As a statistician leading this project what suggestion would you make after
you had considered the assumptions? (4 marks)
e) (10 marks total)*
Using the data produced in part (a) find a 98% confidence interval for the mean of
Price for Apartments. Complete this by hand showing:
• Formula used to calculate this confidence interval (1 mark)
• Critical value and degrees of freedom required for this confidence interval,
found using the tables used in this course (2 marks)
• Margin of Error for this confidence interval (2 marks)
• The final 98% confidence interval for the mean of the Price of Apartments (2
marks), including interpretation of what this result means (3 marks)
• Full workings must be shown to get full marks. Answers can be checked using
Excel, but every step must be shown here. Summary statistics from part (a)
can be used.
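The hand calculation above follows the usual pattern: sample mean ± critical value × s/√n. A minimal Python sketch for checking such working is given below. The prices are invented, t_confidence_interval is a hypothetical helper name, and the critical value 3.747 is the standard t-table value for 4 degrees of freedom with 0.01 in each tail. It applies only to this five-value example, not to the real sample.

```python
from math import sqrt
from statistics import mean, stdev


def t_confidence_interval(values, critical_value):
    """Return (lower, upper, margin) for mean ± critical_value * s / sqrt(n)."""
    n = len(values)
    x_bar = mean(values)
    margin = critical_value * stdev(values) / sqrt(n)
    return x_bar - margin, x_bar + margin, margin


# Invented apartment prices ($); not from RealEstate.xlsx.
prices = [560_000, 600_000, 620_000, 640_000, 680_000]

# 98% confidence, df = n - 1 = 4: t-table value with 0.01 in each tail.
T_CRIT_98_DF4 = 3.747

lower, upper, margin = t_confidence_interval(prices, T_CRIT_98_DF4)
print(round(lower), round(upper), round(margin))
```

With the real data, the degrees of freedom and hence the critical value change with the number of apartments, so the table lookup must be repeated for that sample size.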