Foundations of Data Analysis for Business
1. Because customers value flexibility in their commuting plans, CVE allows customers to cancel a booking without penalty up until the van they booked arrives at their chosen stop. As a result, not all ride bookings result in a ride actually taking place. Estimate a simple linear regression model to understand the relationship between daily bookings and daily completed rides. Report the estimated regression equation and R² value and interpret them in words.
2. CVE would like to know if ride bookings through the mobile app can be predicted using the actions that an app user may perform prior to booking: namely, starting a session, tapping on a stop, tapping on the sidebar, and viewing van ETAs. Estimate a multiple regression model that uses the relevant variables to predict ride bookings. Multiple models involving these variables are possible; select the best model and explain your choice, citing specific numerical evidence from the regression output. Report the estimated regression equation and R² value and interpret them in words.
Forecasting
3. Create a well-formatted and labeled scatter plot to visually inspect the ‘rides’ variable. Describe any trend and seasonality that appear to be present.
4. Construct a k-period simple moving average for the rides variable, where k is chosen based on your assessment of the seasonality patterns in the data. Explain your choice of k and report MSE, MAD, and MAPE for this forecasting model.
5. Estimate a linear trend model for the ‘rides’ variable. Report the estimated linear trend equation and the R² of the model, and interpret both the equation and the R² in words.
6. Estimate a linear trend model with day-of-week dummy variables for the ‘rides’ variable. Interpret both the estimated regression equation and the R² in words, and comment on the magnitude of the adjusted R² relative to the adjusted R² from the regression you performed in (5).
7. (a) Use the estimated regression equation from (6) to calculate a forecast of ‘rides’ for each day in your dataset. Calculate MSE, MAD, and MAPE for this forecast. Comment on which of the two forecasts you have calculated in this problem set (from Q4 and this question) performs the best and why that method is best-suited to this data.
(b) Use the estimated regression equation from (7a) to forecast daily completed rides for each weekday in the next month (April 1-April 29). Optional: Also forecast revenues for each day.
8. Write a concise but thorough 1-2 paragraph summary of the forecasting analysis you performed in this problem set, focusing on the most important findings. In other words, think about the work you did for Q3-Q7 and summarize what you would communicate to CVE to help them better understand their ridership data.
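
As a reference for the calculations in Q4 and Q7(a), the following is a minimal sketch of a k-period simple moving average forecast and the three error measures, written in Python. It assumes the common definitions: MSE is the mean of squared forecast errors, MAD is the mean of absolute forecast errors, and MAPE is the mean of absolute errors as a percentage of the actual values. The function names, the choice of k = 5, and the sample numbers are all hypothetical and chosen for illustration; they are not the CVE data, and the course may expect a spreadsheet rather than code.

```python
from statistics import mean


def moving_average_forecast(series, k):
    """Forecast each period t >= k as the mean of the k preceding observations."""
    return [mean(series[t - k:t]) for t in range(k, len(series))]


def forecast_errors(actual, forecast):
    """Return (MSE, MAD, MAPE in percent) for paired actual and forecast values."""
    errors = [a - f for a, f in zip(actual, forecast)]
    mse = mean(e ** 2 for e in errors)
    mad = mean(abs(e) for e in errors)
    # MAPE is undefined if any actual value is zero.
    mape = 100 * mean(abs(e) / a for e, a in zip(errors, actual))
    return mse, mad, mape


# Made-up illustrative values, NOT the CVE ridership data.
rides = [100, 120, 110, 130, 90, 105, 125, 115, 135, 95]
k = 5  # hypothetical choice, e.g. one five-day work week

forecast = moving_average_forecast(rides, k)
actual = rides[k:]  # the first k periods have no forecast
mse, mad, mape = forecast_errors(actual, forecast)

print(f"MSE={mse:.2f}  MAD={mad:.2f}  MAPE={mape:.2f}%")
```

The same forecast_errors function can be applied to the fitted values from the regression in (6), which makes the comparison in Q7(a) a like-for-like one, provided both forecasts are scored over the same set of days.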