The COVID-19 pandemic had a devastating impact on the airline industry. As the travel restrictions imposed by countries are gradually lifted, the industry is seeking ways to restore customers’ confidence in air travel. The marketing manager of Birdy Airline (BA) is planning to design promotion strategies for customers who flew with BA in the past 3 years. He wonders whether data mining techniques can be applied to divide the customer base into several segments so that a strategy can be designed for each segment independently.
2(a) As a data analyst at BA, you are asked to plan a data mining project for market segmentation. With reference to the CRISP-DM framework, discuss how you would plan the project for BA. (30 marks)
2(b) In view of the impact of COVID-19 on air travel demand, BA decided to create a data visualization dashboard that monitors the number of COVID-19 cases in the countries to which BA flies. If cases in a country increase rapidly over a short period of time, that country may impose new entry requirements. This, in turn, introduces changes to BA's operational procedures. For example, BA ground staff may need to ensure that passengers present certain documents (e.g. negative pre-departure test results) before they can board.
To better prepare for any changes that BA may need to incorporate during the pandemic, BA would like to develop the dashboard on a cloud platform (e.g. IBM Cloud, Google Colab) to visualize the global COVID-19 situation and update it daily for close monitoring. Apart from being better prepared for changes in operational procedures, identify 2 additional advantages of using a cloud platform in the given context. (10 marks)
1(a) Before model construction, it is noticed that the donor's age is missing for 1 patient. Suggest how to resolve this data quality issue before model construction. (10 marks)
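One common way to handle a single missing numeric value is to fill it with the median of the observed values. The sketch below illustrates that idea only; the donor ages are made-up placeholders, not values from the dataset, and median imputation is just one of several reasonable options (dropping the record or using a model-based estimate are others).

```python
from statistics import median

# Hypothetical donor ages (illustrative only); None marks the missing value.
donor_ages = [35, 42, None, 29, 51, 38]

def impute_median(values):
    """Replace None entries with the median of the observed values."""
    observed = [v for v in values if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in values]

imputed = impute_median(donor_ages)
print(imputed)
```

The median is used here rather than the mean because it is less sensitive to unusually young or old donors.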
1(b) The dataset is split into 2 subsets: 70% of the data is used for training a classification tree, while 30% is used for testing. The classification tree obtained is shown below. The testing accuracy is 70%. Based on the given classification tree, compute the training accuracy. Then, evaluate the performance of the classification tree and comment on whether you would recommend deploying it to predict the survival of patients. (15 marks)
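Training accuracy can be read off a classification tree from the class counts in its leaves: each leaf predicts its majority class, so the majority counts are the correctly classified training records. The sketch below shows the arithmetic under assumed leaf counts; the numbers in `leaves` are hypothetical and do not come from the tree in the question. Only the 70% testing accuracy is taken from the text.

```python
# Hypothetical leaf counts: (survived, died) training records in each leaf.
leaves = [(30, 5), (4, 20), (10, 2)]

def training_accuracy(leaf_counts):
    """Accuracy on the training set when every leaf predicts its majority class."""
    correct = sum(max(counts) for counts in leaf_counts)
    total = sum(sum(counts) for counts in leaf_counts)
    return correct / total

train_acc = training_accuracy(leaves)
test_acc = 0.70  # testing accuracy stated in the question

# A large positive gap suggests the tree fits the training data better
# than it generalizes to unseen data.
gap = train_acc - test_acc
print(f"training accuracy: {train_acc:.3f}, gap: {gap:.3f}")
```

Comparing the two accuracies in this way is one input to the deployment decision; the cost of a wrong survival prediction also matters.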
1(c) A 20-year-old patient suffered a relapse of a malignant disease. To treat the disease, she received a bone marrow transplant from a 40-year-old donor with the same blood type. After the transplant, she developed extensive chronic GVHD. Use the classification tree given in 1(b) to predict her chance of survival. In your answer, explain how you arrive at the final prediction based on the classification tree. (10 marks)
1(d) Propose a data mining objective for which association analysis can be used on the dataset. Then, discuss whether any data preparation steps are required in order to mine association rules from the dataset. (15 marks)
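Association rule mining works on sets of categorical items, so numeric attributes such as age typically need to be discretized and each record converted into a transaction of attribute=value items. The sketch below illustrates that preparation and a support calculation; the records, attribute names, and the age cut-off of 35 are all hypothetical assumptions rather than details of the actual dataset.

```python
# Hypothetical records; attribute names and values are illustrative only.
records = [
    {"donor_age": 40, "relapse": True, "survived": False},
    {"donor_age": 25, "relapse": False, "survived": True},
    {"donor_age": 33, "relapse": False, "survived": True},
    {"donor_age": 52, "relapse": True, "survived": False},
]

def age_band(age):
    """Discretize a numeric age into a band (the cut-off of 35 is arbitrary)."""
    return "under_35" if age < 35 else "35_plus"

def to_transaction(record):
    """Turn one record into a set of attribute=value items."""
    return frozenset({
        f"donor_age={age_band(record['donor_age'])}",
        f"relapse={'yes' if record['relapse'] else 'no'}",
        f"survived={'yes' if record['survived'] else 'no'}",
    })

transactions = [to_transaction(r) for r in records]

def support(itemset, transactions):
    """Fraction of transactions that contain every item in the itemset."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)

s = support({"relapse=no", "survived=yes"}, transactions)
print(s)
```

Once the data is in this form, a standard algorithm such as Apriori can search for frequent itemsets and rules; the choice of bands directly affects which rules can be found.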