---
title: "Multiple sources of data"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Multiple sources of data}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

Price indexes are usually made from several sources of data. An important benefit of the usual two-step workflow for making price indexes is that the elemental indexes can be built piecemeal, using different sources of data and different index-number formulas, and then aggregated with a consistent structure.

Let's extend the example in `vignette("piar")` by adding an alternate source of data for business `B5`, which is always missing in the `ms_prices` dataset.

```{r}
library(piar)

# Make an aggregation structure.
ms_weights[c("level1", "level2")] <-
  expand_classification(ms_weights$classification)

pias <- ms_weights[c("level1", "level2", "business", "weight")] |>
  as_aggregation_structure()

# Make elemental index.
elementals <- ms_prices |>
  transform(
    relative = price_relative(price, period = period, product = product)
  ) |>
  elemental_index(relative ~ period + business, na.rm = TRUE)

elementals
```

Unlike the other businesses, which use survey-like data, the index for `B5` is made from scanner-like data with many price and quantity observations at each point in time.

```{r}
set.seed(12345)
scanner_prices <- data.frame(
  period = rep(c("201904", time(elementals)), each = 200),
  product = 1:200,
  price = round(rlnorm(5 * 200) * 10, 1),
  quantity = round(runif(5 * 200, 100, 1000))
)

head(scanner_prices)
```

These types of data often require the use of a multilateral index like the GEKS. For the sake of illustration, we'll make a Fisher GEKS index over a 3-quarter rolling window and use a mean splice to make a single time series.

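The GEKS is built from bilateral indexes between pairs of periods in the window. As a rough sketch of that building block, a bilateral Fisher index is the geometric mean of the Laspeyres and Paasche indexes. The prices, quantities, and the helper `fisher()` below are made up for illustration and written in base R; this is not how gpindex implements the calculation.

```{r}
# Hypothetical prices and quantities for three products in two periods.
p0 <- c(1.0, 2.0, 4.0)
q0 <- c(10, 5, 2)
p1 <- c(1.2, 1.8, 4.4)
q1 <- c(8, 6, 2)

# Bilateral Fisher index (hypothetical helper): the geometric mean of the
# Laspeyres index (base-period quantities) and the Paasche index
# (current-period quantities).
fisher <- function(p1, p0, q1, q0) {
  laspeyres <- sum(p1 * q0) / sum(p0 * q0)
  paasche <- sum(p1 * q1) / sum(p0 * q1)
  sqrt(laspeyres * paasche)
}

fisher(p1, p0, q1, q0)
```

The full multilateral calculation over the rolling window, along with the splicing, is left to gpindex.
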
```{r}
library(gpindex)

geks_elementals <- with(
  scanner_prices,
  fisher_geks(price, quantity, period, product, window = 3)
) |>
  splice_index() |>
  t() |>
  as_index(chainable = FALSE) |>
  set_levels("B5")

geks_elementals
```

These values can now be merged with the other elemental indexes, getting turned into a period-over-period index in the process, and then aggregated.

```{r}
merge(elementals, geks_elementals) |>
  aggregate(pias, na.rm = TRUE)
```
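
The conversion from a fixed-base index to a period-over-period index that takes place during the merge can be illustrated with plain vectors. This is a minimal sketch in base R with made-up index values; the names `fixed_base` and `period_over_period` are illustrative and the numbers are not output from the code above.

```{r}
# Hypothetical fixed-base index values for four periods.
fixed_base <- c(
  "202001" = 1.00, "202002" = 1.10, "202003" = 1.21, "202004" = 1.089
)

# Divide each value by the one before it; the first period is kept as-is.
period_over_period <- fixed_base / c(1, head(fixed_base, -1))

# Chaining the period-over-period values recovers the fixed-base index.
all.equal(cumprod(period_over_period), fixed_base)
```

Because chaining undoes the conversion, no information is lost by storing the spliced index in period-over-period form alongside the other elemental indexes.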